B. Boeckmann, A. Bairoch, R. Apweiler, M. Blatter, A. Estreicher et al., The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003, Nucleic Acids Research, vol.31, issue.1, pp.365-370, 2003.
DOI : 10.1093/nar/gkg095

A. Bairoch, B. Boeckmann, S. Ferro, and E. Gasteiger, Swiss-Prot: Juggling between evolution and stability, Briefings in Bioinformatics, vol.5, issue.1, pp.39-55, 2004.
DOI : 10.1093/bib/5.1.39

H. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. Bhat et al., The Protein Data Bank, Nucleic Acids Research, vol.28, pp.235-242, 2000.

G. Pandey, V. Kumar, and M. Steinbach, Computational approaches for protein function prediction: A survey, 2006.

J. Bernardes, J. Fernandez, and A. Vasconcelos, Structural descriptor database: a new tool for sequence-based functional site prediction, BMC Bioinformatics, vol.9, issue.1, p.492, 2008.
DOI : 10.1186/1471-2105-9-492

URL : https://hal.archives-ouvertes.fr/hal-00684135

C. Pierri, G. Parisi, and V. Porcelli, Computational approaches for protein function prediction: A combined strategy from multiple sequence alignment to molecular docking-based virtual screening, Biochimica et Biophysica Acta (BBA) - Proteins and Proteomics, vol.1804, issue.9
DOI : 10.1016/j.bbapap.2010.04.008

A. Andreeva, D. Howorth, S. Brenner, T. Hubbard, C. Chothia et al., SCOP database in 2004: refinements integrate structure and sequence family data, Nucleic Acids Research, vol.32, issue.90001, pp.226-229, 2004.
DOI : 10.1093/nar/gkh039

S. Altschul, W. Gish, W. Miller, E. Myers, and D. Lipman, Basic local alignment search tool, Journal of Molecular Biology, vol.215, issue.3, pp.403-410, 1990.
DOI : 10.1016/S0022-2836(05)80360-2

W. Pearson, Rapid and sensitive sequence comparison with FASTP and FASTA, Methods in Enzymology, vol.183, pp.63-98, 1990.
DOI : 10.1016/0076-6879(90)83007-V

R. Hughey and A. Krogh, Hidden Markov models for sequence analysis: extension and analysis of the basic method, Bioinformatics, vol.12, issue.2, pp.95-107, 1996.
DOI : 10.1093/bioinformatics/12.2.95

S. Altschul, T. Madden, A. Schaffer, J. Zhang, Z. Zhang et al., Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Research, vol.25, issue.17, pp.3389-3402, 1997.
DOI : 10.1093/nar/25.17.3389

J. Gough, K. Karplus, R. Hughey, and C. Chothia, Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure, Journal of Molecular Biology, vol.313, issue.4, pp.903-919, 2001.
DOI : 10.1006/jmbi.2001.5080

G. Yona and M. Levitt, Within the twilight zone: a sensitive profile-profile comparison tool based on information theory, Journal of Molecular Biology, vol.315, issue.5, pp.1257-1275, 2002.
DOI : 10.1006/jmbi.2001.5293

R. Sadreyev, D. Baker, and N. Grishin, Profile-profile comparisons by COMPASS predict intricate homologies between protein families, Protein Science, vol.12, issue.10, pp.2262-2272, 2003.
DOI : 10.1110/ps.03197403

J. Soeding, Protein homology detection by HMM-HMM comparison, Bioinformatics, vol.21, issue.7, pp.951-960, 2005.
DOI : 10.1093/bioinformatics/bti125

B. Qian and R. Goldstein, Performance of an iterated T-HMM for homology detection, Bioinformatics, vol.20, issue.14, pp.2175-2180, 2004.
DOI : 10.1093/bioinformatics/bth181

V. Alexandrov and M. Gerstein, Using 3D hidden Markov models that explicitly represent spatial coordinates to model and compare protein structures, BMC Bioinformatics, vol.5, pp.1-10, 2004.

J. Bernardes, A. Davila, V. Costa, and G. Zaverucha, Improving model construction of profile HMMs for remote homology detection through structural alignment, BMC Bioinformatics, vol.8, issue.1, pp.1-12, 2007.
DOI : 10.1186/1471-2105-8-435

URL : https://hal.archives-ouvertes.fr/hal-00684130

C. Vogel, C. Berzuini, M. Bashton, J. Gough, and S. Teichmann, Supra-domains: Evolutionary Units Larger than Single Protein Domains, Journal of Molecular Biology, vol.336, issue.3, pp.809-823, 2004.
DOI : 10.1016/j.jmb.2003.12.026

W. Mclaughlin, K. Chen, T. Hou, and W. Wang, On the detection of functionally coherent groups of protein domains with an extension to protein annotation, BMC Bioinformatics, vol.8, issue.1, p.390, 2007.
DOI : 10.1186/1471-2105-8-390

M. Scott, D. Thomas, and M. Hallett, Predicting Subcellular Localization via Protein Motif Co-Occurrence, Genome Research, vol.14, issue.10a, pp.1957-1966, 2004.
DOI : 10.1101/gr.2650004

L. Geer, M. Domrachev, D. Lipman, and S. Bryant, CDART: Protein Homology by Domain Architecture, Genome Research, vol.12, issue.10, pp.1619-1623, 2002.
DOI : 10.1101/gr.278202

N. Terrapon, O. Gascuel, E. Marechal, and L. Bréhélin, Detection of new protein domains using co-occurrence: application to Plasmodium falciparum, Bioinformatics, vol.25, issue.23, pp.3077-3083, 2009.
DOI : 10.1093/bioinformatics/btp560

URL : https://hal.archives-ouvertes.fr/lirmm-00431171

A. Ochoa, M. Llinas, and M. Singh, Using context to improve protein domain identification, BMC Bioinformatics, vol.12, issue.1, p.90, 2011.
DOI : 10.1186/1471-2105-12-90

T. Jaakkola, M. Diekhans, and D. Haussler, A Discriminative Framework for Detecting Remote Protein Homologies, Journal of Computational Biology, vol.7, issue.1-2, pp.95-114, 2000.
DOI : 10.1089/10665270050081405

A. Ben-hur and D. Brutlag, Remote homology detection: a motif based approach, Bioinformatics, vol.19, issue.Suppl 1, pp.26-33, 2003.
DOI : 10.1093/bioinformatics/btg1002

Y. Hou, W. Hsu, M. Lee, and C. Bystroff, Efficient remote homology detection using local structure, Bioinformatics, vol.19, issue.17, pp.2294-2301, 2003.
DOI : 10.1093/bioinformatics/btg317

C. Leslie, E. Eskin, A. Cohen, J. Weston, and W. Noble, Mismatch string kernels for discriminative protein classification, Bioinformatics, vol.20, issue.4, pp.467-476, 2004.
DOI : 10.1093/bioinformatics/btg431

L. Liao and W. Noble, Combining Pairwise Sequence Similarity and Support Vector Machines for Detecting Remote Protein Evolutionary and Structural Relationships, Journal of Computational Biology, vol.10, issue.6, pp.857-868, 2003.
DOI : 10.1089/106652703322756113

Y. Hou, W. Hsu, M. Lee, and C. Bystroff, Remote homolog detection using local sequence-structure correlations, Proteins: Structure, Function, and Bioinformatics, vol.57, issue.3, pp.518-530, 2004.
DOI : 10.1002/prot.20221

H. Saigo, J. Vert, N. Ueda, and T. Akutsu, Protein homology detection using string alignment kernels, Bioinformatics, vol.20, issue.11, pp.1682-1689, 2004.
DOI : 10.1093/bioinformatics/bth141

URL : https://hal.archives-ouvertes.fr/hal-00433587

Q. Su, L. Lu, S. Saxonov, and D. Brutlag, eBLOCKs: enumerating conserved protein blocks to achieve maximal sensitivity and specificity, Nucleic Acids Research, vol.33, issue.Database issue, pp.178-182, 2005.
DOI : 10.1093/nar/gki060

V. Atalay and R. Cetin-atalay, Implicit motif distribution based hybrid computational kernel for sequence classification, Bioinformatics, vol.21, issue.8, pp.1429-1436, 2005.
DOI : 10.1093/bioinformatics/bti212

R. Kuang, E. Ie, K. Wang, K. Wang, M. Siddiqi et al., Profile-based string kernels for remote homology detection and motif extraction, Journal of Bioinformatics and Computational Biology, vol.03, issue.03, pp.527-550, 2005.
DOI : 10.1142/S021972000500120X

H. Rangwala and G. Karypis, Profile-based direct kernels for remote homology detection and fold recognition, Bioinformatics, vol.21, issue.23, pp.4239-4247, 2005.
DOI : 10.1093/bioinformatics/bti687

T. Lingner and P. Meinicke, Remote homology detection based on oligomer distances, Bioinformatics, vol.22, issue.18, pp.2224-2231, 2006.
DOI : 10.1093/bioinformatics/btl376

Q. Dong, X. Wang, and L. Lin, Application of latent semantic analysis to protein remote homology detection, Bioinformatics, vol.22, issue.3, pp.285-290, 2006.
DOI : 10.1093/bioinformatics/bti801

T. Handstad, A. Hestnes, and P. Saetrom, Motif kernel generated by genetic programming improves remote homology and fold detection, BMC Bioinformatics, vol.8, issue.1, p.23, 2007.
DOI : 10.1186/1471-2105-8-23

B. Liu, X. Wang, L. Lin, Q. Dong, and X. Wang, A discriminative method for protein remote homology detection and fold recognition combining Top-n-grams and latent semantic analysis, BMC Bioinformatics, vol.9, issue.1, p.510, 2008.
DOI : 10.1186/1471-2105-9-510

A. Shah, C. Oehmen, and B. Webb-robertson, SVM-HUSTLE--an iterative semi-supervised machine learning approach for pairwise protein remote homology detection, Bioinformatics, vol.24, issue.6, pp.783-790, 2008.
DOI : 10.1093/bioinformatics/btn028

B. Webb-robertson, K. Ratuiste, and C. Oehmen, Physicochemical property distributions for accurate and rapid pairwise protein homology detection, BMC Bioinformatics, vol.11, issue.1, p.145, 2010.
DOI : 10.1186/1471-2105-11-145

S. Muggleton and L. De Raedt, Inductive Logic Programming: Theory and methods, The Journal of Logic Programming, vol.19-20, pp.629-679, 1994.
DOI : 10.1016/0743-1066(94)90035-3

A. Karwath and R. King, Homology induction: the use of machine learning to improve sequence similarity searches, BMC Bioinformatics, vol.3, issue.1, p.11, 2002.
DOI : 10.1186/1471-2105-3-11

A. Karwath and R. King, An Automated ILP Server in the Field of Bioinformatics, Proceedings of the Eleventh International Conference on Inductive Logic Programming, pp.91-103, 2001.
DOI : 10.1007/3-540-44797-0_8

R. King, Applying inductive logic programming to predicting gene function, AI Magazine, vol.25, pp.57-58, 2004.

R. King, A. Srinivasan, and L. Dehaspe, Warmr: a data mining tool for chemical data, Journal of Computer-Aided Molecular Design, vol.15, issue.2, pp.173-181, 2001.
DOI : 10.1023/A:1008171016861

J. Bernardes, A. Carbone, and G. Zaverucha, A discriminative method for family-based protein remote homology detection that combines inductive logic programming and propositional models, BMC Bioinformatics, vol.12, issue.1, p.83, 2011.
DOI : 10.1186/1471-2105-12-83

URL : https://hal.archives-ouvertes.fr/hal-00684137

J. Bernardes, G. Zaverucha, C. Vaquero, and A. Carbone, Combining evolution and machine learning for functional annotation in plasmodium falciparum annotation

R. Finn, J. Mistry, J. Tate, P. Coggill, A. Heger et al., The Pfam protein families database, Nucleic Acids Research, vol.38, issue.Database, pp.211-222, 2010.
DOI : 10.1093/nar/gkp985

URL : https://hal.archives-ouvertes.fr/hal-01294685

I. Callebaut, K. Prat, E. Meurice, J. Mornon, and S. Tomavo, Prediction of the general transcription factors associated with RNA polymerase II in Plasmodium falciparum: conserved features and differences relative to other eukaryotes, BMC Genomics, vol.6, issue.1, p.100, 2005.
DOI : 10.1186/1471-2164-6-100

URL : https://hal.archives-ouvertes.fr/hal-00021609

K. Smith-miles, Cross-disciplinary perspectives on meta-learning for algorithm selection, ACM Computing Surveys, vol.41, issue.1, pp.1-25, 2009.
DOI : 10.1145/1456650.1456656

G. Cooper and R. Hausman, The Cell: A Molecular Approach, ch. 7: RNA Synthesis and Processing, Sinauer Associates, 2009.

G. Cooper and R. Hausman, The Cell: A Molecular Approach, ch. 8: Protein Synthesis, Processing, and Regulation, 2009.

L. Pauling, R. Corey, and H. Branson, The structure of proteins: Two hydrogen-bonded helical configurations of the polypeptide chain, Proceedings of the National Academy of Sciences, vol.37, issue.4, pp.205-211, 1951.
DOI : 10.1073/pnas.37.4.205

S. Stephens, Possible Significance of Duplication in Evolution, Advances in Genetics, vol.4, pp.247-265, 1951.
DOI : 10.1016/S0065-2660(08)60237-0

S. Wright, The roles of mutation, inbreeding, crossbreeding and selection in evolution, Proceedings of the VI International Congress of Genetics, pp.356-366, 1932.

M. White, Models of Speciation: New concepts suggest that the classical sympatric and allopatric models are not the only alternatives, Science, vol.159, issue.3819, pp.1065-1070, 1968.
DOI : 10.1126/science.159.3819.1065

E. Koonin and M. Galperin, Sequence - Evolution - Function: Computational Approaches in Comparative Genomics, ch. 2: Evolutionary Concept in Genetics and Genomics, 2009.

C. Gaboriaud, V. Bissery, T. Benchetrit, and J. Mornon, Hydrophobic cluster analysis: An efficient new way to compare and analyse amino acid sequences, FEBS Letters, vol.224, issue.1, pp.149-155, 1987.
DOI : 10.1016/0014-5793(87)80439-8

B. Hall, Phylogenetic trees made easy: a how-to manual, 2004.

C. Wu, A. Nikolskaya, H. Huang, L. Yeh, D. Natale et al., PIRSF: family classification system at the Protein Information Resource, Nucleic Acids Research, vol.32, issue.90001, pp.112-114, 2004.
DOI : 10.1093/nar/gkh097

G. Yona, N. Linial, and M. Linial, ProtoMap: automatic classification of protein sequences and hierarchy of protein families, Nucleic Acids Research, vol.28, issue.1, pp.49-55, 2000.
DOI : 10.1093/nar/28.1.49

D. Haft, J. Selengut, and O. White, The TIGRFAMs database of protein families, Nucleic Acids Research, vol.31, issue.1, pp.371-373, 2003.
DOI : 10.1093/nar/gkg128

F. Corpet, F. Servant, J. Gouzy, and D. Kahn, ProDom and ProDom-CG: tools for protein domain analysis and whole genome comparisons, Nucleic Acids Research, vol.28, issue.1, pp.267-269, 2000.
DOI : 10.1093/nar/28.1.267

URL : https://hal.archives-ouvertes.fr/hal-00427044

C. Sigrist, L. Cerutti, E. Castro, P. Langendijk-genevaux, V. Bulliard et al., PROSITE, a protein domain database for functional characterization and annotation, Nucleic Acids Research, vol.38, issue.Database, pp.161-166, 2010.
DOI : 10.1093/nar/gkp885

T. Attwood, M. Beck, A. Bleasby, K. Degtyarenko, and D. P. Smith, Progress with the PRINTS protein fingerprint database, Nucleic Acids Research, vol.24, issue.1, pp.182-188, 1996.
DOI : 10.1093/nar/24.1.182

F. Pearl, C. Bennett, J. Bray, A. Harrison, N. Martin et al., The CATH database: an extended protein family resource for structural and functional genomics, Nucleic Acids Research, vol.31, issue.1, pp.452-455, 2003.
DOI : 10.1093/nar/gkg062

C. Wu, C. Xiao, Z. Hou, H. Huang, and W. Barker, iProClass: an integrated, comprehensive and annotated protein classification database, Nucleic Acids Research, vol.29, issue.1, pp.52-54, 2001.
DOI : 10.1093/nar/29.1.52

R. Apweiler, T. Attwood, A. Bairoch, A. Bateman, E. Birney et al., InterPro--an integrated documentation resource for protein families, domains and functional sites, Bioinformatics, vol.16, issue.12, pp.1145-1150, 2000.
DOI : 10.1093/bioinformatics/16.12.1145

URL : https://hal.archives-ouvertes.fr/hal-01213734

S. Henikoff and J. Henikoff, Amino acid substitution matrices from protein blocks., Proceedings of the National Academy of Sciences, vol.89, issue.22, pp.10915-10919, 1992.
DOI : 10.1073/pnas.89.22.10915

T. Smith and M. Waterman, Identification of common molecular subsequences, Journal of Molecular Biology, vol.147, issue.1, pp.195-197, 1981.
DOI : 10.1016/0022-2836(81)90087-5

M. Gribskov, A. McLachlan, and D. Eisenberg, Profile analysis: detection of distantly related proteins., Proceedings of the National Academy of Sciences, vol.84, issue.13, pp.4355-4358, 1987.
DOI : 10.1073/pnas.84.13.4355

D. Bashford, C. Chothia, and A. Lesk, Determinants of a protein fold, Journal of Molecular Biology, vol.196, issue.1, pp.199-216, 1987.
DOI : 10.1016/0022-2836(87)90521-3

S. Eddy, Profile hidden Markov models, Bioinformatics, vol.14, issue.9, pp.755-763, 1998.
DOI : 10.1093/bioinformatics/14.9.755

M. Brown, R. Hughey, A. Krogh, I. Mian, K. Sjölander et al., Using Dirichlet mixture priors to derive hidden Markov models for protein families, Proc. of First Int. Conf. on Intelligent Systems for Molecular Biology, pp.47-55, 1993.

J. Thompson, D. Higgins, and T. Gibson, Improved sensitivity of profile searches through the use of sequence weights and gap excision, Computer Applications in the Biosciences (CABIOS), vol.10, issue.1, pp.19-29, 1994.

A. Krogh and G. Mitchison, Maximum entropy weighting of aligned sequences of proteins or DNA, Proceedings of the Third International Conference on Intelligent Systems for Molecular Biology, pp.215-221, 1995.

L. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE, pp.257-286, 1989.

R. Apweiler, A. Bairoch, C. Wu, W. Barker, B. Boeckmann et al., UniProt: the universal protein knowledgebase, Nucleic Acids Research, vol.32, pp.115-119, 2004.

B. Scholkopf, C. Burges, and A. Smola, Advances in kernel methods: support vector learning, 1999.

J. Weston, A. Elisseeff, D. Zhou, C. Leslie, and W. Noble, Protein ranking: From local to global structure in the protein similarity network, Proceedings of the National Academy of Sciences, vol.101, issue.17, pp.6559-6563, 2004.
DOI : 10.1073/pnas.0308067101

L. Dehaspe and L. De Raedt, Mining association rules in multiple relations, Proceedings of the 7th International Workshop on Inductive Logic Programming, pp.125-132, 1997.
DOI : 10.1007/3540635149_40

J. Quinlan, C4.5: Programs for machine learning, Machine Learning, pp.235-240, 1994.

U. Syed and G. Yona, Using a mixture of probabilistic decision trees for direct prediction of protein function, Proceedings of the seventh annual international conference on Computational molecular biology , RECOMB '03, pp.289-300, 2003.
DOI : 10.1145/640075.640114

C. Ferreira, J. Gama, and V. Costa, RUSE-WARMR: Rule Selection for Classifier Induction in Multi-relational Data-Sets, 2008 20th IEEE International Conference on Tools with Artificial Intelligence, pp.379-386, 2008.
DOI : 10.1109/ICTAI.2008.73

S. Brenner, P. Koehl, and M. Levitt, The ASTRAL compendium for protein structure and sequence analysis, Nucleic Acids Research, vol.28, issue.1, pp.254-256, 2000.
DOI : 10.1093/nar/28.1.254

J. Davis and M. Goadrich, The relationship between Precision-Recall and ROC curves, Proceedings of the 23rd international conference on Machine learning , ICML '06, pp.233-240, 2006.
DOI : 10.1145/1143844.1143874

R. Agrawal, T. Imielinski, and R. Srikant, Association rules between sets of items in large databases, Proceedings of the ACM SIGMOD Intl. Conf. on Management of Data, pp.207-216, 1993.

J. Thompson, D. Higgins, and T. Gibson, CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice, Nucleic Acids Research, vol.22, pp.4673-4680, 1994.

F. Wilcoxon, Individual Comparisons by Ranking Methods, Biometrics Bulletin, vol.1, issue.6, pp.80-83, 1945.
DOI : 10.2307/3001968

S. Lee and L. De Raedt, Constraint based mining of first order sequences in SeqLog, Database Support for Data Mining Applications, pp.155-176, 2004.

A. Bahl, B. Brunk, J. Crabtree, M. Fraunholz, B. Gajria et al., PlasmoDB: the Plasmodium genome resource. A database integrating experimental and computational data, Nucleic Acids Research, vol.31, issue.1, pp.212-215, 2003.
DOI : 10.1093/nar/gkg081

C. Aurrecoechea, J. Brestelli, B. Brunk, J. Dommer, S. Fischer et al., PlasmoDB: a functional genomic database for malaria parasites, Nucleic Acids Research, vol.37, issue.Database, pp.539-543, 2009.
DOI : 10.1093/nar/gkn814

S. Date and C. Stoeckert, Computational modeling of the Plasmodium falciparum interactome reveals protein function on a genome-wide scale, Genome Research, vol.16, issue.4, pp.542-549, 2006.
DOI : 10.1101/gr.4573206

F. Lu, H. Jiang, J. Ding, J. Mu, J. Valenzuela et al., cDNA sequences reveal considerable gene prediction inaccuracy in the Plasmodium falciparum genome, BMC Genomics, vol.8, issue.1, p.255, 2007.
DOI : 10.1186/1471-2164-8-255

Y. Joubert and F. Joubert, A structural annotation resource for the selection of putative target proteins in the malaria parasite, Malaria Journal, vol.7, issue.1, p.90, 2008.
DOI : 10.1186/1475-2875-7-90

I. Letunic, R. Copley, B. Pils, S. Pinkert, J. Schultz et al., SMART 5: domains in the context of genomes and networks, Nucleic Acids Research, vol.34, issue.90001, pp.257-260, 2005.
DOI : 10.1093/nar/gkj079

C. Yeats, M. Maibaum, R. Marsden, M. Dibley, D. Lee et al., Gene3D: modelling protein structure, function and evolution, Nucleic Acids Research, vol.34, issue.90001, pp.281-284, 2005.
DOI : 10.1093/nar/gkj057

H. Mi, B. Lazareva-ulitsky, R. Loo, A. Kejariwal, J. Vandergriff et al., The PANTHER database of protein families, subfamilies, functions and pathways, Nucleic Acids Research, vol.33, issue.Database issue, pp.284-288, 2005.
DOI : 10.1093/nar/gki078

C. Yeats, O. C. Redfern, and C. Orengo, A fast and automated solution for accurately resolving protein domain architectures, Bioinformatics, vol.26, issue.6, pp.745-751, 2010.
DOI : 10.1093/bioinformatics/btq034

E. Bischoff and C. Vaquero, In silico and biological survey of transcription-associated proteins implicated in the transcriptional machinery during the erythrocytic development of Plasmodium falciparum, BMC Genomics, vol.11, issue.1, p.34, 2010.
DOI : 10.1186/1471-2164-11-34

URL : https://hal.archives-ouvertes.fr/pasteur-00663529

T. Dietterich, Ensemble Methods in Machine Learning, Multiple Classifier Systems, pp.1-15, 2000.
DOI : 10.1007/3-540-45014-9_1

L. Breiman, Bagging predictors, Machine Learning, pp.123-140, 1996.
DOI : 10.1007/BF00058655

Y. Freund and R. Schapire, Experiments with a new boosting algorithm, International Conference on Machine Learning, pp.148-156, 1996.

D. Wolpert, Stacked generalization, Neural Networks, vol.5, issue.2, pp.241-259, 1992.
DOI : 10.1016/S0893-6080(05)80023-1

S. Dzeroski and B. Zenko, Is Combining Classifiers with Stacking Better than Selecting the Best One?, Machine Learning, pp.255-273, 2004.
DOI : 10.1023/B:MACH.0000015881.36452.6e

L. Lam and S. Suen, Application of majority voting to pattern recognition: an analysis of its behavior and performance, IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, vol.27, pp.553-568, 1997.

J. Platt, N. Cristianini, and J. Shawe-taylor, Large margin dags for multiclass classification, Advances in Neural Information Processing Systems 12, pp.547-553, 2000.

J. Platt, Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods, Advances in Large Margin Classifiers, pp.61-74, 1999.

A. Anand, G. Pugalenthi, and P. Suganthan, Predicting protein structural class by SVM with class-wise optimized features and decision probabilities, Journal of Theoretical Biology, vol.253, issue.2, pp.375-380, 2008.
DOI : 10.1016/j.jtbi.2008.02.031

R. Marler and J. Arora, Survey of multi-objective optimization methods for engineering, Structural and Multidisciplinary Optimization, pp.369-395, 2004.

F. Waltz, An engineering approach: Hierarchical optimization criteria, IEEE Transactions on Automatic Control, vol.12, issue.2, pp.179-180, 1967.
DOI : 10.1109/TAC.1967.1098537

C. Chang and C. Lin, LIBSVM, ACM Transactions on Intelligent Systems and Technology, vol.2, issue.3, pp.1-27, 2011.
DOI : 10.1145/1961189.1961199

B. Baum, Combining Trees as a Way of Combining Data Sets for Phylogenetic Inference, and the Desirability of Combining Gene Trees, Taxon, vol.41, issue.1, pp.3-10, 1992.
DOI : 10.2307/1222480

N. Saitou and M. Nei, The neighbor-joining method: a new method for reconstructing phylogenetic trees, Molecular Biology and Evolution, vol.4, pp.406-425, 1987.

R. Page, Modified Mincut Supertrees, Proceedings of the Second International Workshop on Algorithms in Bioinformatics, pp.537-552, 2002.
DOI : 10.1007/3-540-45784-4_41

E. Muller and B. Wittmann-liebold, Phylogenetic relationship of organisms obtained by ribosomal protein comparison, Cellular and Molecular Life Sciences, vol.53, issue.1, pp.34-50, 1997.
DOI : 10.1007/PL00000578

P. Keeling, G. Burger, D. Durnford, B. Lang, R. Lee et al., The tree of eukaryotes, Trends in Ecology & Evolution, vol.20, issue.12, pp.670-676, 2005.
DOI : 10.1016/j.tree.2005.09.005

O. Lichtarge, H. Bourne, and F. Cohen, An Evolutionary Trace Method Defines Binding Surfaces Common to Protein Families, Journal of Molecular Biology, vol.257, issue.2, pp.342-358, 1996.
DOI : 10.1006/jmbi.1996.0167

S. Lockless and R. Ranganathan, Evolutionarily Conserved Pathways of Energetic Connectivity in Protein Families, Science, vol.286, issue.5438, pp.295-299, 1999.
DOI : 10.1126/science.286.5438.295

G. Suel, S. Lockless, M. Wall, and R. Ranganathan, Evolutionarily conserved networks of residues mediate allosteric communication in proteins, Nature Structural Biology, vol.10, issue.1, pp.59-69, 2003.
DOI : 10.1038/nsb881

J. Baussand and A. Carbone, A Combinatorial Approach to Detect Coevolved Amino Acid Networks in Protein Families of Variable Divergence, PLoS Computational Biology, vol.5, issue.9, p.e1000488, 2009.
DOI : 10.1371/journal.pcbi.1000488

A. Carbone and L. Dib, Co-evolution and information signals in biological sequences, Theoretical Computer Science, vol.412, issue.23, pp.2486-2495, 2011.
DOI : 10.1016/j.tcs.2010.10.040

K. Forslund and E. Sonnhammer, Predicting protein function from domain content, Bioinformatics, vol.24, issue.15, pp.1681-1687, 2008.
DOI : 10.1093/bioinformatics/btn312

E. Frank, Y. Wang, S. Inglis, G. Holmes, and I. Witten, Using model trees for classification, Machine Learning, pp.63-76, 1998.

J. Bernardes, A. Davila, V. Costa, and G. Zaverucha, Hmmer-struct: Adding structural properties to profile hmms, 2007.