The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003, Nucleic Acids Research, vol.31, issue.1, pp.365-370, 2003. ,
DOI : 10.1093/nar/gkg095
Swiss-Prot: Juggling between evolution and stability, Briefings in Bioinformatics, vol.5, issue.1, pp.39-55, 2004. ,
DOI : 10.1093/bib/5.1.39
The protein data bank, Nucleic Acids Research, vol.28, pp.235-242, 2000. ,
Computational approaches for protein function prediction: A survey, 2006. ,
Structural descriptor database: a new tool for sequence-based functional site prediction, BMC Bioinformatics, vol.9, issue.1, p.492, 2008. ,
DOI : 10.1186/1471-2105-9-492
URL : https://hal.archives-ouvertes.fr/hal-00684135
Computational approaches for protein function prediction: A combined strategy from multiple sequence alignment to molecular docking-based virtual screening, Biochimica et Biophysica Acta (BBA) - Proteins and Proteomics, vol.1804, issue.9 ,
DOI : 10.1016/j.bbapap.2010.04.008
SCOP database in 2004: refinements integrate structure and sequence family data, Nucleic Acids Research, vol.32, issue.90001, pp.226-229, 2004. ,
DOI : 10.1093/nar/gkh039
Basic local alignment search tool, Journal of Molecular Biology, vol.215, issue.3, pp.403-410, 1990. ,
DOI : 10.1016/S0022-2836(05)80360-2
Rapid and sensitive sequence comparison with FASTP and FASTA, Methods Enzymol, vol.183, pp.63-98, 1990. ,
DOI : 10.1016/0076-6879(90)83007-V
Hidden Markov models for sequence analysis: extension and analysis of the basic method, Bioinformatics, vol.12, issue.2, pp.95-107, 1996. ,
DOI : 10.1093/bioinformatics/12.2.95
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Research, vol.25, issue.17, pp.3389-3402, 1997. ,
DOI : 10.1093/nar/25.17.3389
Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure, Journal of Molecular Biology, vol.313, issue.4, pp.903-919, 2001. ,
DOI : 10.1006/jmbi.2001.5080
Within the twilight zone: a sensitive profile-profile comparison tool based on information theory, Journal of Molecular Biology, vol.315, issue.5, pp.1257-1275, 2002. ,
DOI : 10.1006/jmbi.2001.5293
Profile-profile comparisons by COMPASS predict intricate homologies between protein families, Protein Science, vol.12, issue.10, pp.2262-2272, 2003. ,
DOI : 10.1110/ps.03197403
Protein homology detection by HMM-HMM comparison, Bioinformatics, vol.21, issue.7, pp.951-960, 2005. ,
DOI : 10.1093/bioinformatics/bti125
Performance of an iterated T-HMM for homology detection, Bioinformatics, vol.20, issue.14, pp.2175-2180, 2004. ,
DOI : 10.1093/bioinformatics/bth181
Using 3D hidden Markov models that explicitly represent spatial coordinates to model and compare protein structures, BMC Bioinformatics, vol.5, pp.1-10, 2004. ,
Improving model construction of profile HMMs for remote homology detection through structural alignment, BMC Bioinformatics, vol.8, issue.1, pp.1-12, 2007. ,
DOI : 10.1186/1471-2105-8-435
URL : https://hal.archives-ouvertes.fr/hal-00684130
Supra-domains: Evolutionary Units Larger than Single Protein Domains, Journal of Molecular Biology, vol.336, issue.3, pp.809-823, 2004. ,
DOI : 10.1016/j.jmb.2003.12.026
On the detection of functionally coherent groups of protein domains with an extension to protein annotation, BMC Bioinformatics, vol.8, issue.1, p.390, 2007. ,
DOI : 10.1186/1471-2105-8-390
Predicting Subcellular Localization via Protein Motif Co-Occurrence, Genome Research, vol.14, issue.10a, pp.1957-1966, 2004. ,
DOI : 10.1101/gr.2650004
CDART: Protein Homology by Domain Architecture, Genome Research, vol.12, issue.10, pp.1619-1623, 2002. ,
DOI : 10.1101/gr.278202
Detection of new protein domains using co-occurrence: application to Plasmodium falciparum, Bioinformatics, vol.25, issue.23, pp.3077-3083, 2009. ,
DOI : 10.1093/bioinformatics/btp560
URL : https://hal.archives-ouvertes.fr/lirmm-00431171
Using context to improve protein domain identification, BMC Bioinformatics, vol.12, issue.1, p.90, 2011. ,
DOI : 10.1186/1471-2105-12-90
A Discriminative Framework for Detecting Remote Protein Homologies, Journal of Computational Biology, vol.7, issue.1-2, pp.95-114, 2000. ,
DOI : 10.1089/10665270050081405
Remote homology detection: a motif based approach, Bioinformatics, vol.19, issue.Suppl 1, pp.26-33, 2003. ,
DOI : 10.1093/bioinformatics/btg1002
Efficient remote homology detection using local structure, Bioinformatics, vol.19, issue.17, pp.2294-2301, 2003. ,
DOI : 10.1093/bioinformatics/btg317
Mismatch string kernels for discriminative protein classification, Bioinformatics, vol.20, issue.4, pp.467-476, 2004. ,
DOI : 10.1093/bioinformatics/btg431
Combining Pairwise Sequence Similarity and Support Vector Machines for Detecting Remote Protein Evolutionary and Structural Relationships, Journal of Computational Biology, vol.10, issue.6, pp.857-868, 2003. ,
DOI : 10.1089/106652703322756113
Remote homolog detection using local sequence-structure correlations, Proteins: Structure, Function, and Bioinformatics, vol.57, issue.3, pp.518-530, 2004. ,
DOI : 10.1002/prot.20221
Protein homology detection using string alignment kernels, Bioinformatics, vol.20, issue.11, pp.1682-1689, 2004. ,
DOI : 10.1093/bioinformatics/bth141
URL : https://hal.archives-ouvertes.fr/hal-00433587
eBLOCKs: enumerating conserved protein blocks to achieve maximal sensitivity and specificity, Nucleic Acids Research, vol.33, issue.Database issue, pp.178-182, 2005. ,
DOI : 10.1093/nar/gki060
Implicit motif distribution based hybrid computational kernel for sequence classification, Bioinformatics, vol.21, issue.8, pp.1429-1436, 2005. ,
DOI : 10.1093/bioinformatics/bti212
Profile-based string kernels for remote homology detection and motif extraction, Journal of Bioinformatics and Computational Biology, vol.03, issue.03, pp.527-550, 2005. ,
DOI : 10.1142/S021972000500120X
Profile-based direct kernels for remote homology detection and fold recognition, Bioinformatics, vol.21, issue.23, pp.4239-4247, 2005. ,
DOI : 10.1093/bioinformatics/bti687
Remote homology detection based on oligomer distances, Bioinformatics, vol.22, issue.18, pp.2224-2231, 2006. ,
DOI : 10.1093/bioinformatics/btl376
Application of latent semantic analysis to protein remote homology detection, Bioinformatics, vol.22, issue.3, pp.285-290, 2006. ,
DOI : 10.1093/bioinformatics/bti801
Motif kernel generated by genetic programming improves remote homology and fold detection, BMC Bioinformatics, vol.8, issue.1, p.23, 2007. ,
DOI : 10.1186/1471-2105-8-23
A discriminative method for protein remote homology detection and fold recognition combining Top-n-grams and latent semantic analysis, BMC Bioinformatics, vol.9, issue.1, p.510, 2008. ,
DOI : 10.1186/1471-2105-9-510
SVM-HUSTLE--an iterative semi-supervised machine learning approach for pairwise protein remote homology detection, Bioinformatics, vol.24, issue.6, pp.783-790, 2008. ,
DOI : 10.1093/bioinformatics/btn028
Physicochemical property distributions for accurate and rapid pairwise protein homology detection, BMC Bioinformatics, vol.11, issue.1, p.145, 2010. ,
DOI : 10.1186/1471-2105-11-145
Inductive Logic Programming: Theory and methods, The Journal of Logic Programming, vol.19-20, pp.629-679, 1994. ,
DOI : 10.1016/0743-1066(94)90035-3
Homology induction: the use of machine learning to improve sequence similarity searches, BMC Bioinformatics, vol.3, issue.1, p.11, 2002. ,
DOI : 10.1186/1471-2105-3-11
An Automated ILP Server in the Field of Bioinformatics, Proceedings of the Eleventh International Conference on Inductive Logic Programming, pp.91-103, 2001. ,
DOI : 10.1007/3-540-44797-0_8
Applying inductive logic programming to predicting gene function, AI Magazine, vol.25, pp.57-58, 2004. ,
A data-mining tool for chemical data, Journal of Computer-Aided Molecular Design, vol.15, issue.2, pp.173-181, 2001. ,
DOI : 10.1023/A:1008171016861
A discriminative method for family-based protein remote homology detection that combines inductive logic programming and propositional models, BMC Bioinformatics, vol.12, issue.1, p.83, 2011. ,
DOI : 10.1186/1471-2105-12-83
URL : https://hal.archives-ouvertes.fr/hal-00684137
Combining evolution and machine learning for functional annotation in Plasmodium falciparum annotation ,
The Pfam protein families database, Nucleic Acids Research, vol.38, issue.Database, pp.211-222, 2010. ,
DOI : 10.1093/nar/gkp985
URL : https://hal.archives-ouvertes.fr/hal-01294685
Prediction of the general transcription factors associated with RNA polymerase II in Plasmodium falciparum: conserved features and differences relative to other eukaryotes, BMC Genomics, vol.6, issue.1, p.100, 2005. ,
DOI : 10.1186/1471-2164-6-100
URL : https://hal.archives-ouvertes.fr/hal-00021609
Cross-disciplinary perspectives on meta-learning for algorithm selection, ACM Computing Surveys, vol.41, issue.1, pp.1-25, 2009. ,
DOI : 10.1145/1456650.1456656
Hausman in The Cell: A Molecular Approach, ch. 7: RNA Synthesis and Processing, Sinauer Associates, 2009. ,
Hausman in The Cell: A Molecular Approach, ch. 8: Protein Synthesis, Processing, and Regulation, 2009. ,
The Structure of Proteins, Journal of the American Chemical Society, vol.61, issue.7, pp.205-211, 1951. ,
DOI : 10.1021/ja01876a065
Possible Significance of Duplication in Evolution, Advances in Genetics, vol.4, pp.247-265, 1951. ,
DOI : 10.1016/S0065-2660(08)60237-0
The roles of mutation, inbreeding, crossbreeding and selection in evolution, Proceedings of the VI International Congress of Genetics, pp.356-366, 1932. ,
Models of Speciation: New concepts suggest that the classical sympatric and allopatric models are not the only alternatives, Science, vol.159, issue.3819, pp.1065-1070, 1968. ,
DOI : 10.1126/science.159.3819.1065
Galperin in Sequence - Evolution - Function: Computational Approaches in Comparative Genomics, ch. 2: Evolutionary Concept in Genetics and Genomics, 2009. ,
Hydrophobic cluster analysis: An efficient new way to compare and analyse amino acid sequences, FEBS Letters, vol.224, issue.1, pp.149-155, 1987. ,
DOI : 10.1016/0014-5793(87)80439-8
Phylogenetic trees made easy: a how-to manual, 2004. ,
PIRSF: family classification system at the Protein Information Resource, Nucleic Acids Research, vol.32, issue.90001, pp.112-114, 2004. ,
DOI : 10.1093/nar/gkh097
ProtoMap: automatic classification of protein sequences and hierarchy of protein families, Nucleic Acids Research, vol.28, issue.1, pp.49-55, 2000. ,
DOI : 10.1093/nar/28.1.49
The TIGRFAMs database of protein families, Nucleic Acids Research, vol.31, issue.1, pp.371-373, 2003. ,
DOI : 10.1093/nar/gkg128
ProDom and ProDom-CG: tools for protein domain analysis and whole genome comparisons, Nucleic Acids Research, vol.28, issue.1, pp.267-269, 2000. ,
DOI : 10.1093/nar/28.1.267
URL : https://hal.archives-ouvertes.fr/hal-00427044
PROSITE, a protein domain database for functional characterization and annotation, Nucleic Acids Research, vol.38, issue.Database, pp.161-166, 2010. ,
DOI : 10.1093/nar/gkp885
Progress with the PRINTS protein fingerprint database, Nucleic Acids Research, vol.24, issue.1, pp.182-188, 1996. ,
DOI : 10.1093/nar/24.1.182
The CATH database: an extended protein family resource for structural and functional genomics, Nucleic Acids Research, vol.31, issue.1, pp.452-455, 2003. ,
DOI : 10.1093/nar/gkg062
iProClass: an integrated, comprehensive and annotated protein classification database, Nucleic Acids Research, vol.29, issue.1, pp.52-54, 2001. ,
DOI : 10.1093/nar/29.1.52
InterPro--an integrated documentation resource for protein families, domains and functional sites, Bioinformatics, vol.16, issue.12, pp.1145-1150, 2000. ,
DOI : 10.1093/bioinformatics/16.12.1145
URL : https://hal.archives-ouvertes.fr/hal-01213734
Amino acid substitution matrices from protein blocks., Proceedings of the National Academy of Sciences, vol.89, issue.22, pp.10915-10919, 1992. ,
DOI : 10.1073/pnas.89.22.10915
Identification of common molecular subsequences, Journal of Molecular Biology, vol.147, issue.1, pp.195-197, 1981. ,
DOI : 10.1016/0022-2836(81)90087-5
Profile analysis: detection of distantly related proteins., Proceedings of the National Academy of Sciences, vol.84, issue.13, pp.4355-4358, 1987. ,
DOI : 10.1073/pnas.84.13.4355
Determinants of a protein fold, Journal of Molecular Biology, vol.196, issue.1, pp.199-216, 1987. ,
DOI : 10.1016/0022-2836(87)90521-3
Profile hidden Markov models, Bioinformatics, vol.14, issue.9, pp.755-763, 1998. ,
DOI : 10.1093/bioinformatics/14.9.755
Using Dirichlet mixture priors to derive hidden Markov models for protein families, Proc. of First Int. Conf. on Intelligent Systems for Molecular Biology, pp.47-55, 1993. ,
Improved sensitivity of profile searches through the use of sequence weights and gap excision, Computer Applications in the Biosciences (CABIOS), vol.10, issue.1, pp.19-29, 1994. ,
Maximum entropy weighting of aligned sequences of proteins or DNA, Proceedings of the Third International Conference on Intelligent Systems for Molecular Biology, pp.215-221, 1995. ,
A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE, pp.257-286, 1989. ,
UniProt: the universal protein knowledgebase, Nucleic Acids Research, vol.32, pp.115-119, 2004. ,
Advances in kernel methods: support vector learning, 1999. ,
Protein ranking: From local to global structure in the protein similarity network, Proceedings of the National Academy of Sciences, vol.101, issue.17, pp.6559-6563, 2004. ,
DOI : 10.1073/pnas.0308067101
Mining association rules in multiple relations, Proceedings of the 7th International Workshop on Inductive Logic Programming, pp.125-132, 1997. ,
DOI : 10.1007/3540635149_40
C4.5: Programs for machine learning, Machine Learning, pp.235-240, 1994. ,
Using a mixture of probabilistic decision trees for direct prediction of protein function, Proceedings of the seventh annual international conference on Computational molecular biology , RECOMB '03, pp.289-300, 2003. ,
DOI : 10.1145/640075.640114
RUSE-WARMR: Rule Selection for Classifier Induction in Multi-relational Data-Sets, 2008 20th IEEE International Conference on Tools with Artificial Intelligence, pp.379-386, 2008. ,
DOI : 10.1109/ICTAI.2008.73
The ASTRAL compendium for protein structure and sequence analysis, Nucleic Acids Research, vol.28, issue.1, pp.254-256, 2000. ,
DOI : 10.1093/nar/28.1.254
The relationship between Precision-Recall and ROC curves, Proceedings of the 23rd international conference on Machine learning , ICML '06, pp.233-240, 2006. ,
DOI : 10.1145/1143844.1143874
Association rules between sets of items in large databases, Proceedings of the ACM SIGMOD Intl. Conf. on Management of Data, pp.207-216, 1993. ,
CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice, Nucleic Acids Research, vol.22, pp.4673-4680, 1994. ,
Individual Comparisons by Ranking Methods, Biometrics Bulletin, vol.1, issue.6, pp.80-83, 1945. ,
DOI : 10.2307/3001968
Constraint based mining of first order sequences in SeqLog, Database Support for Data Mining Applications, pp.155-176, 2004. ,
PlasmoDB: the Plasmodium genome resource. A database integrating experimental and computational data, Nucleic Acids Research, vol.31, issue.1, pp.212-215, 2003. ,
DOI : 10.1093/nar/gkg081
PlasmoDB: a functional genomic database for malaria parasites, Nucleic Acids Research, vol.37, issue.Database, pp.539-543, 2009. ,
DOI : 10.1093/nar/gkn814
Computational modeling of the Plasmodium falciparum interactome reveals protein function on a genome-wide scale, Genome Research, vol.16, issue.4, pp.542-549, 2006. ,
DOI : 10.1101/gr.4573206
cDNA sequences reveal considerable gene prediction inaccuracy in the Plasmodium falciparum genome, BMC Genomics, vol.8, issue.1, p.255, 2007. ,
DOI : 10.1186/1471-2164-8-255
A structural annotation resource for the selection of putative target proteins in the malaria parasite, Malaria Journal, vol.7, issue.1, p.90, 2008. ,
DOI : 10.1186/1475-2875-7-90
SMART 5: domains in the context of genomes and networks, Nucleic Acids Research, vol.34, issue.90001, pp.257-260, 2005. ,
DOI : 10.1093/nar/gkj079
Gene3D: modelling protein structure, function and evolution, Nucleic Acids Research, vol.34, issue.90001, pp.281-284, 2005. ,
DOI : 10.1093/nar/gkj057
The PANTHER database of protein families, subfamilies, functions and pathways, Nucleic Acids Research, vol.33, issue.Database issue, pp.284-288, 2005. ,
DOI : 10.1093/nar/gki078
A fast and automated solution for accurately resolving protein domain architectures, Bioinformatics, vol.26, issue.6, pp.745-751, 2010. ,
DOI : 10.1093/bioinformatics/btq034
In silico and biological survey of transcription-associated proteins implicated in the transcriptional machinery during the erythrocytic development of Plasmodium falciparum, BMC Genomics, vol.11, issue.1, p.34, 2010. ,
DOI : 10.1186/1471-2164-11-34
URL : https://hal.archives-ouvertes.fr/pasteur-00663529
Ensemble Methods in Machine Learning, Multiple Classifier Systems, pp.1-15, 2000. ,
DOI : 10.1007/3-540-45014-9_1
Bagging predictors, Machine Learning, pp.123-140, 1996. ,
DOI : 10.1007/BF00058655
Experiments with a new boosting algorithm, International Conference on Machine Learning, pp.148-156, 1996. ,
Stacked generalization, Neural Networks, vol.5, issue.2, pp.241-259, 1992. ,
DOI : 10.1016/S0893-6080(05)80023-1
Is Combining Classifiers with Stacking Better than Selecting the Best One?, Machine Learning, pp.255-273, 2004. ,
DOI : 10.1023/B:MACH.0000015881.36452.6e
Application of majority voting to pattern recognition: an analysis of its behavior and performance, IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, vol.27, pp.553-568, 1997. ,
Large margin DAGs for multiclass classification, Advances in Neural Information Processing Systems 12, pp.547-553, 2000. ,
Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods, Advances in Large Margin Classifiers, pp.61-74, 1999. ,
Predicting protein structural class by SVM with class-wise optimized features and decision probabilities, Journal of Theoretical Biology, vol.253, issue.2, pp.375-380, 2008. ,
DOI : 10.1016/j.jtbi.2008.02.031
Survey of multi-objective optimization methods for engineering, Structural and Multidisciplinary Optimization, pp.369-395, 2004. ,
An engineering approach: Hierarchical optimization criteria, IEEE Transactions on Automatic Control, vol.12, issue.2, pp.179-180, 1967. ,
DOI : 10.1109/TAC.1967.1098537
LIBSVM, ACM Transactions on Intelligent Systems and Technology, vol.2, issue.3, pp.1-27, 2011. ,
DOI : 10.1145/1961189.1961199
Combining Trees as a Way of Combining Data Sets for Phylogenetic Inference, and the Desirability of Combining Gene Trees, Taxon, vol.41, issue.1, pp.3-10, 1992. ,
DOI : 10.2307/1222480
The neighbor-joining method: a new method for reconstructing phylogenetic trees, Molecular Biology and Evolution, vol.4, pp.406-425, 1987. ,
Modified Mincut Supertrees, Proceedings of the Second International Workshop on Algorithms in Bioinformatics, pp.537-552, 2002. ,
DOI : 10.1007/3-540-45784-4_41
Phylogenetic relationship of organisms obtained by ribosomal protein comparison, Cellular and Molecular Life Sciences, vol.53, issue.1, pp.34-50, 1997. ,
DOI : 10.1007/PL00000578
The tree of eukaryotes, Trends in Ecology & Evolution, vol.20, issue.12, pp.670-676, 2005. ,
DOI : 10.1016/j.tree.2005.09.005
An Evolutionary Trace Method Defines Binding Surfaces Common to Protein Families, Journal of Molecular Biology, vol.257, issue.2, pp.342-358, 1996. ,
DOI : 10.1006/jmbi.1996.0167
Evolutionarily Conserved Pathways of Energetic Connectivity in Protein Families, Science, vol.286, issue.5438, pp.295-299, 1999. ,
DOI : 10.1126/science.286.5438.295
Evolutionarily conserved networks of residues mediate allosteric communication in proteins, Nature Structural Biology, vol.10, issue.1, pp.59-69, 2003. ,
DOI : 10.1038/nsb881
A Combinatorial Approach to Detect Coevolved Amino Acid Networks in Protein Families of Variable Divergence, PLoS Computational Biology, vol.5, issue.9, p.1000488, 2009. ,
DOI : 10.1371/journal.pcbi.1000488
Co-evolution and information signals in biological sequences, Theoretical Computer Science, vol.412, issue.23, pp.2486-2495, 2011. ,
DOI : 10.1016/j.tcs.2010.10.040
Predicting protein function from domain content, Bioinformatics, vol.24, issue.15, pp.1681-1687, 2008. ,
DOI : 10.1093/bioinformatics/btn312
Using model trees for classification, Machine Learning, pp.63-76, 1998. ,
HMMER-STRUCT: Adding structural properties to profile HMMs, 2007. ,