S. Abney, Parsing By Chunks, Principle-Based Parsing, pp.257-278, 1991.
DOI : 10.1007/978-94-011-3474-3_10

E. Agichtein and V. Ganti, Mining reference tables for automatic text segmentation, Proceedings of the 2004 ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '04, pp.20-29, 2004.
DOI : 10.1145/1014052.1014058

A. Airola, S. Pyysalo, J. Björne, T. Pahikkala, F. Ginter et al., A graph kernel for protein-protein interaction extraction, Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing, BioNLP '08, pp.1-9, 2008.
DOI : 10.3115/1572306.1572308

S. A. Al-haddad, S. A. Samad, A. Hussain, K. A. Ishak, and A. O. Noor, Robust Speech Recognition Using Fusion Techniques and Adaptive Filtering, American Journal of Applied Sciences, vol.6, issue.2, pp.290-295, 2009.
DOI : 10.3844/ajassp.2009.290.295

URL : http://doi.org/10.3844/ajassp.2009.290.295

E. Amigó, J. Gonzalo, J. Artiles, and F. Verdejo, A comparison of extrinsic clustering evaluation metrics based on formal constraints, Information Retrieval, vol.12, issue.4, pp.461-486, 2009.
DOI : 10.1007/s10791-008-9066-8

P. M. Andersen, P. J. Hayes, A. K. Huettner, L. M. Schmandt, I. B. Nirenburg et al., Automatic extraction of facts from press releases to generate news stories, Proceedings of the third conference on Applied natural language processing, pp.170-177, 1992.
DOI : 10.3115/974499.974531

A. Bagga and B. Baldwin, Entity-based cross-document coreferencing using the vector space model, Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics - Volume 1, ACL '98, pp.79-85, 1998.

C. Bahlmann, B. Haasdonk, and H. Burkhardt, On-line handwriting recognition with support vector machines - A kernel approach, Proceedings of the 8th International Workshop on Frontiers in Handwriting Recognition, pp.49-54, 2002.

M. Baroni, S. Bernardini, A. Ferraresi, and E. Zanchetta, The WaCky wide web: a collection of very large linguistically processed web-crawled corpora, Language Resources and Evaluation, vol.43, issue.3, pp.209-226, 2009.
DOI : 10.1007/s10579-009-9081-4

D. M. Blei, A. Y. Ng, and M. I. Jordan, Latent Dirichlet Allocation, The Journal of Machine Learning Research, vol.3, pp.993-1022, 2003.

D. T. Bollegala, Y. Matsuo, and M. Ishizuka, Relational duality, Proceedings of the 19th international conference on World wide web, WWW '10, pp.151-160, 2010.
DOI : 10.1145/1772690.1772707

L. Breiman, Random forests, Machine Learning, pp.5-32, 2001.

E. Brill, A simple rule-based part of speech tagger, Proceedings of the 3rd Applied Natural Language Processing Conference (ANLP), pp.152-155, 1992.

P. Buitelaar, T. Declerck, J. Nemrava, and D. Sadlier, Cross-media semantic indexing in the soccer domain, 2008 International Workshop on Content-Based Multimedia Indexing, pp.296-301, 2008.
DOI : 10.1109/CBMI.2008.4564960

R. Bunescu and R. Mooney, Subsequence kernels for relation extraction, Advances in Neural Information Processing Systems, pp.171-178, 2006.

R. C. Bunescu and R. J. Mooney, A shortest path dependency kernel for relation extraction, Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, HLT '05, pp.724-731, 2005.
DOI : 10.3115/1220575.1220666

H. Bunke, Recent developments in graph matching, Proceedings 15th International Conference on Pattern Recognition. ICPR-2000, pp.2117-2124, 2000.
DOI : 10.1109/ICPR.2000.906030

V. T. Chakaravarthy, H. Gupta, P. Roy, and M. Mohania, Efficiently linking text documents with relevant structured information, Proceedings of the 32nd international conference on Very large data bases, VLDB '06, pp.667-678, 2006.

C. C. Chang and C. J. Lin, LIBSVM, ACM Transactions on Intelligent Systems and Technology, vol.2, issue.3, 2001.
DOI : 10.1145/1961189.1961199

O. Chapelle, V. Vapnik, O. Bousquet, and S. Mukherjee, Choosing multiple parameters for support vector machines, Machine Learning, vol.46, issue.1-3, pp.131-159, 2002.

H. Chen, E. Benson, T. Naseem, and R. Barzilay, In-domain relation discovery with meta-constraints via posterior regularization, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp.530-540, 2011.

S. F. Chen and J. Goodman, An empirical study of smoothing techniques for language modeling, Proceedings of the 34th annual meeting on Association for Computational Linguistics, ACL '96, pp.310-318, 1996.

H. L. Chieu and H. T. Ng, Named entity recognition, Proceedings of the 19th international conference on Computational linguistics, pp.1-7, 2002.
DOI : 10.3115/1072228.1072253

M. Choi, H. Kim, and B. W. Croft, Dependency trigram model for social relation extraction from news articles, Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval, SIGIR '12, pp.1047-1048, 2012.
DOI : 10.1145/2348283.2348462

N. Chomsky, Three models for the description of language, IEEE Transactions on Information Theory, vol.2, issue.3, pp.113-124, 1956.
DOI : 10.1109/TIT.1956.1056813

N. Chomsky, Syntactic Structures. Mouton classic, 2002.

J. H. Clear, The British national corpus, The digital word, pp.163-187, 1993.

M. Collins and J. Brooks, Prepositional Phrase Attachment Through a Backed-off Model, Proceedings of the Third Workshop on Very Large Corpora, pp.27-38, 1995.
DOI : 10.1007/978-94-017-2390-9_11

M. Collins and Y. Singer, Unsupervised models for named entity classification, Proceedings of the Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, pp.100-111, 1999.

D. Coppersmith and S. Winograd, Matrix multiplication via arithmetic progressions, Proceedings of the nineteenth annual ACM conference on Theory of computing, STOC '87, pp.251-280, 1990.
DOI : 10.1145/28395.28396

URL : http://doi.org/10.1016/s0747-7171(08)80013-2

C. Cortes and V. Vapnik, Support-vector networks, Machine Learning, pp.273-297, 1995.
DOI : 10.1007/BF00994018

T. Cover and P. Hart, Nearest neighbor pattern classification, IEEE Transactions on Information Theory, vol.13, issue.1, pp.21-27, 1967.
DOI : 10.1109/TIT.1967.1053964

M. A. Covington, A fundamental algorithm for dependency parsing, Proceedings of the 39th Annual ACM Southeast Conference, pp.95-102, 2001.

R. E. Cullingford, Script Application: Computer Understanding of Newspaper Stories, 1978.

A. Culotta and J. Sorensen, Dependency tree kernels for relation extraction, Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, ACL '04, 2004.
DOI : 10.3115/1218955.1219009

W. Daelemans, A. van den Bosch, and T. Weijters, IGTree: Using Trees for Compression and Classification in Lazy Learning Algorithms, Artificial Intelligence Review, vol.11, issue.1-5, pp.407-423, 1997.
DOI : 10.1007/978-94-017-2053-3_15

F. Debole and F. Sebastiani, Supervised term weighting for automated text categorization, Proceedings of SAC-03, 18th ACM Symposium on Applied Computing, pp.784-788, 2003.

X. Dong and A. Y. Halevy, A platform for personal information management and integration, Proceedings of the Second Biennial Conference on Innovative Data Systems Research (CIDR), pp.119-130, 2005.

R. B. Doorenbos, O. Etzioni, and D. S. Weld, A scalable comparison-shopping agent for the World-Wide Web, Proceedings of the First International Conference on Autonomous Agents, pp.39-48, 1997.

A. R. Ebadat, Extracting protein-protein interactions with language modelling, Proceedings of the Second Student Research Workshop associated with RANLP 2011, pp.60-66, 2011.

A. R. Ebadat, V. Claveau, and P. Sébillot, Proper noun semantic clustering using bag-of-vectors, Proceedings of the Twenty-Fifth International Florida Artificial Intelligence Research Society Conference, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00760105

A. Ekbal, E. Sourjikova, A. Frank, and S. P. Ponzetto, Assessing the challenge of fine-grained named entity recognition and classification, Proceedings of the 2010 Named Entities Workshop, NEWS '10, pp.93-101, 2010.

K. Elagouni, C. Garcia, F. Mamalet, and P. Sébillot, Combining multi-scale character recognition and linguistic knowledge for natural scene text OCR, Document Analysis Systems, IAPR International Workshop on, pp.120-124, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00753908

D. Embley, M. Hurst, D. Lopresti, and G. Nagy, Table-processing paradigms: a research survey, International Journal of Document Analysis and Recognition (IJDAR), vol.7, issue.1, pp.66-86, 2006.
DOI : 10.1007/s10032-006-0017-x

O. Etzioni, M. Cafarella, D. Downey, S. Kok, A. Popescu et al., Web-scale information extraction in KnowItAll, Proceedings of the 13th conference on World Wide Web, WWW '04, pp.100-110, 2004.
DOI : 10.1145/988672.988687

O. Etzioni, M. Cafarella, D. Downey, A. Popescu, T. Shaked et al., Unsupervised named-entity extraction from the Web: An experimental study, Artificial Intelligence, vol.165, issue.1, pp.91-134, 2005.
DOI : 10.1016/j.artint.2005.03.001

T. Fayruzov, M. De Cock, C. Cornelis, and V. Hoste, DEEPER: A Full Parsing Based Approach to Protein Relation Extraction, Proceedings of the 6th European conference on Evolutionary computation, machine learning and data mining in bioinformatics, EvoBIO'08, pp.36-47, 2008.
DOI : 10.1007/978-3-540-78757-0_4

T. Fayruzov, M. De Cock, C. Cornelis, and V. Hoste, The role of syntactic features in protein interaction extraction, Proceedings of the 2nd international workshop on Data and text mining in bioinformatics, DTMBIO '08, pp.61-68, 2008.
DOI : 10.1145/1458449.1458463

T. Fayruzov, M. De Cock, C. Cornelis, and V. Hoste, Linguistic feature analysis for protein interaction extraction, BMC Bioinformatics, vol.10, issue.1, 2009.
DOI : 10.1186/1471-2105-10-374

J. R. Finkel, T. Grenager, and C. Manning, Incorporating non-local information into information extraction systems by Gibbs sampling, Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, ACL '05, pp.363-370, 2005.
DOI : 10.3115/1219840.1219885

M. Fleischman and E. Hovy, Fine grained classification of named entities, Proceedings of the 19th international conference on Computational linguistics, pp.1-7, 2002.
DOI : 10.3115/1072228.1072358

K. Fort and V. Claveau, Annotating football matches: Influence of the source medium on manual annotation, Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), European Language Resources Association (ELRA), 2012.
URL : https://hal.archives-ouvertes.fr/hal-00709170

K. Fundel, R. Kuffner, and R. Zimmer, RelEx--Relation extraction using dependency parse trees, Bioinformatics, vol.23, issue.3, pp.365-371, 2007.
DOI : 10.1093/bioinformatics/btl616

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.98.3299

H. Gaifman, Dependency systems and phrase-structure systems, Information and Control, vol.8, issue.3, pp.304-337, 1965.
DOI : 10.1016/S0019-9958(65)90232-9

URL : http://doi.org/10.1016/s0019-9958(65)90232-9

D. Gildea and D. Jurafsky, Automatic Labeling of Semantic Roles, Computational Linguistics, vol.19, issue.2, pp.245-288, 2001.
DOI : 10.1016/0010-0285(72)90022-9

C. Giuliano, A. Lavelli, and L. Romano, Exploiting shallow linguistic information for relation extraction from biomedical literature, Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL-2006), pp.401-408, 2006.

M. Goadrich, L. Oliphant, and J. Shavlik, Learning to extract genic interactions using Gleaner, Proceedings of the 4th Learning Language in Logic Workshop (LLL05), pp.62-68, 2005.

P. Gosselin, M. Cord, and S. Philipp-Foliguet, Kernels on Bags of Fuzzy Regions for Fast Object Retrieval, 2007 IEEE International Conference on Image Processing, pp.177-180, 2007.
DOI : 10.1109/ICIP.2007.4378920

Y. Gotoh and S. Renals, Information extraction from broadcast news, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, vol.358, issue.1769, pp.1295-1310, 2000.
DOI : 10.1098/rsta.2000.0587

URL : http://arxiv.org/abs/cs/0003084

G. Gravier, C. Guinaudeau, G. Lecorvé, and P. Sébillot, Exploiting Speech for Automatic TV Delinearization: From Streams to Cross-Media Semantic Navigation, EURASIP Journal on Image and Video Processing, vol.23, issue.2, p.2011, 2011.
DOI : 10.1016/j.csl.2009.10.001

URL : https://hal.archives-ouvertes.fr/hal-00645216

M. A. Greenwood, M. Stevenson, Y. Guo, H. Harkema, and A. Roberts, Automatically acquiring a linguistically motivated genic interaction extraction system, Proceedings of the 4th Learning Language in Logic Workshop (LLL05), pp.46-52, 2005.

R. Grishman and B. Sundheim, Message Understanding Conference-6, Proceedings of the 16th conference on Computational linguistics, pp.466-471, 1996.
DOI : 10.3115/992628.992709

C. Guinaudeau, G. Gravier, and P. Sébillot, Can Automatic Speech Transcripts Be Used for Large Scale TV Stream Description and Structuring?, 2009 11th IEEE International Symposium on Multimedia, pp.489-494, 2009.
DOI : 10.1109/ISM.2009.80

URL : https://hal.archives-ouvertes.fr/hal-00762125

J. Hakenberg, C. Plake, U. Leser, H. Kirsch, and D. Rebholz-Schuhmann, LLL'05 challenge: Genic interaction extraction - Identification of language patterns based on alignment and finite state automata, Proceedings of the 4th Learning Language in Logic Workshop (LLL05), pp.38-45, 2005.

M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann et al., The WEKA data mining software, ACM SIGKDD Explorations Newsletter, vol.11, issue.1, pp.10-18, 2009.
DOI : 10.1145/1656274.1656278

D. Harel and Y. Koren, On Clustering Using Random Walks, FST TCS 2001: Foundations of Software Technology and Theoretical Computer Science, pp.18-41, 2001.
DOI : 10.1007/3-540-45294-X_3

T. Hasegawa, S. Sekine, and R. Grishman, Discovering relations among named entities from large corpora, Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, ACL '04, 2004.
DOI : 10.3115/1218955.1219008

M. A. Hearst, Automatic acquisition of hyponyms from large text corpora, Proceedings of the 14th conference on Computational linguistics, pp.539-545, 1992.
DOI : 10.3115/992133.992154

W. Hu, N. Xie, L. Li, X. Zeng, and S. Maybank, A survey on visual content-based video indexing and retrieval, IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol.41, issue.6, pp.797-819, 2011.

L. Hubert and P. Arabie, Comparing partitions, Journal of Classification, vol.2, issue.1, pp.193-218, 1985.
DOI : 10.1007/BF01908075

H. Isozaki and H. Kazawa, Efficient support vector classifiers for named entity recognition, Proceedings of the 19th international conference on Computational linguistics, pp.1-7, 2002.
DOI : 10.3115/1072228.1072282

P. S. Jacobs and L. F. Rau, SCISOR: extracting information from on-line news, Communications of the ACM, vol.33, issue.11, pp.88-97, 1990.
DOI : 10.1145/92755.92769

T. Joachims, Text categorization with support vector machines: Learning with many relevant features, Proceedings of the 10th European Conference on Machine Learning, ECML '98, pp.137-142, 1998.
DOI : 10.1007/bfb0026683

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.11.6124

K. Jung, Text information extraction in images and video: a survey, Pattern Recognition, vol.37, issue.5, pp.977-997, 2004.
DOI : 10.1016/j.patcog.2003.10.012

S. Katrenko, M. S. Marshall, M. Roos, and P. Adriaans, Learning biological interactions from Medline abstracts, Proceedings of the 4th Learning Language in Logic Workshop (LLL05), pp.53-58, 2005.

J. Kazama and K. Torisawa, Exploiting Wikipedia as external knowledge for named entity recognition, Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp.698-707, 2007.

S. Kim, J. Yoon, J. Yang, and S. Park, Walk-Weighted Subsequence Kernels for Protein-Protein interaction extraction, BMC Bioinformatics, vol.11, issue.1, p.107, 2010.
DOI : 10.1186/1471-2105-11-107

E. F. Tjong Kim Sang and F. De Meulder, Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition, Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003, pp.142-147, 2003.

A. Kiryakov, B. Popov, I. Terziev, D. Manov, and D. Ognyanoff, Semantic annotation, indexing, and retrieval, Web Semantics: Science, Services and Agents on the World Wide Web, pp.49-79, 2004.
DOI : 10.1007/978-3-540-39718-2_31

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.122.2996

R. Kohavi, A study of cross-validation and bootstrap for accuracy estimation and model selection, Proceedings of the 14th international joint conference on Artificial intelligence, pp.1137-1143, 1995.

D. Kolbe, Q. Zhu, and S. Pramanik, Efficient k-nearest neighbor searching in nonordered discrete data spaces, ACM Transactions on Information Systems, vol.28, issue.2, pp.1-7, 2010.
DOI : 10.1145/1740592.1740595

R. Kondor and T. Jebara, A kernel between sets of vectors, Proceedings of the Twentieth International Conference on Machine Learning (ICML), 2003.

Z. Kozareva, Bootstrapping named entity recognition with automatically generated gazetteer lists, Proceedings of the Eleventh Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop on, EACL '06, pp.15-21, 2006.
DOI : 10.3115/1609039.1609041

F. Kubala, R. Schwartz, R. Stone, and R. Weischedel, Named entity extraction from speech, Proceedings of DARPA Broadcast News Transcription and Understanding Workshop, pp.287-292, 1998.

J. Kupiec, Robust part-of-speech tagging using a hidden Markov model, Computer Speech & Language, vol.6, issue.3, pp.225-242, 1992.
DOI : 10.1016/0885-2308(92)90019-Z

H. Kučera and W. N. Francis, Computational Analysis of Present-Day American English, 1967.

S. Lawrence, C. L. Giles, and K. Bollacker, Digital libraries and autonomous citation indexing, Computer, vol.32, issue.6, pp.67-71, 1999.
DOI : 10.1109/2.769447

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.17.1607

J. Lawto, J. Gauvain, L. Lamel, G. Grefenstette, G. Gravier et al., A scalable video search engine based on audio content indexing and topic segmentation, Proceedings of 2011 NEM Summit, p.160, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00645228

G. Lecorvé, G. Gravier, and P. Sébillot, On the use of web resources and natural language processing techniques to improve automatic speech recognition systems, Proceedings of the Conference on Language Resources and Evaluation (LREC), 2008.

D. Lee, O. Jeong, and S. Lee, Opinion mining of customer feedback data on the web, Proceedings of the 2nd international conference on Ubiquitous information management and communication, ICUIMC '08, pp.230-235, 2008.
DOI : 10.1145/1352793.1352842

M. Li, L. Lin, X. Wang, and T. Liu, Protein protein interaction site prediction based on conditional random fields, Bioinformatics, vol.23, issue.5, pp.597-604, 2007.
DOI : 10.1093/bioinformatics/btl660

W. Liao and S. Veeramachaneni, A simple semi-supervised algorithm for named entity recognition, Proceedings of the NAACL HLT 2009 Workshop on Semi-Supervised Learning for Natural Language Processing, SemiSupLearn '09, pp.58-65, 2009.
DOI : 10.3115/1621829.1621837

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.164.6209

X. Liu, S. Zhang, F. Wei, and M. Zhou, Recognizing named entities in tweets, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp.359-367, 2011.

H. Lodhi, C. Saunders, J. Shawe-taylor, N. Cristianini, and C. Watkins, Text classification using string kernels, The Journal of Machine Learning Research, vol.2, pp.419-444, 2002.

S. L. Lytinen and A. Gershman, ATRANS automatic processing of money transfer messages, Proceedings of the 5th National Conference on Artificial Intelligence (AAAI-86), pp.1089-1095, 1986.

S. Lyu, Mercer kernels for object recognition with local features, IEEE Computer Vision and Pattern Recognition (CVPR 2005), pp.223-229, 2005.

C. Manning, P. Raghavan, and H. Schütze, Introduction to information retrieval, 2008.
DOI : 10.1017/CBO9780511809071

C. D. Manning and H. Schütze, Foundations of statistical natural language processing, 1999.

M. P. Marcus, M. A. Marcinkiewicz, and B. Santorini, Building a large annotated corpus of English: The Penn Treebank, Computational Linguistics - Special issue on using large corpora: II, pp.313-330, 1993.

E. Marsh and D. Perzanowski, MUC-7 evaluation of IE technology: Overview of results, Proceedings of the Seventh Message Understanding Conference (MUC-7), 1998.

A. McCallum and W. Li, Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons, Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003, pp.188-191, 2003.
DOI : 10.3115/1119176.1119206

A. McCallum, K. Nigam, J. Rennie, and K. Seymore, A machine learning approach to building domain-specific search engines, Proceedings of the 16th International Joint Conference on Artificial Intelligence, pp.662-667, 1999.

L. Medvés, L. Szilágyi, and S. Szilágyi, A Modified Markov Clustering Approach for Protein Sequence Clustering, In Pattern Recognition in Bioinformatics Lecture Notes in Computer Science, vol.5265, pp.110-120, 2008.
DOI : 10.1007/978-3-540-88436-1_10

Y. Miyao, K. Sagae, R. Saetre, T. Matsuzaki, and J. Tsujii, Evaluating contributions of natural language parsers to protein-protein interaction extraction, Bioinformatics, vol.25, issue.3, pp.394-400, 2009.
DOI : 10.1093/bioinformatics/btn631

S. Momtazi, M. Lease, and D. Klakow, Effective Term Weighting for Sentence Retrieval, Proceedings of the 14th European conference on Research and advanced technology for digital libraries, pp.482-485, 2010.
DOI : 10.1007/978-3-642-15464-5_62

T. Mondary and H. Zargayouna, Quaero program, evaluation report: Results for task 3.3 on ontology acquisition, period 4, 2011.

D. Nadeau and S. Sekine, A survey of named entity recognition and classification, Lingvisticae Investigationes, pp.3-26, 2007.
DOI : 10.1075/bct.19.03nad

D. Nadeau, P. Turney, and S. Matwin, Unsupervised Named-Entity Recognition: Generating Gazetteers and Resolving Ambiguity, In Advances in Artificial Intelligence Lecture Notes in Computer Science, vol.4013, pp.266-277, 2006.
DOI : 10.1007/11766247_23

H. Ney, U. Essen, and R. Kneser, On structuring probabilistic dependences in stochastic language modelling, Computer Speech & Language, vol.8, issue.1, pp.1-38, 1994.
DOI : 10.1006/csla.1994.1001

C. Nédellec, Learning language in logic - Genic interaction extraction challenge, Proceedings of the 4th Learning Language in Logic Workshop (LLL05), pp.31-37, 2005.

A. Panchenko and O. Morozova, A study of hybrid similarity measures for semantic relation extraction, Proceedings of the Workshop on Innovative Hybrid Approaches to the Processing of Textual Data, pp.10-18, 2012.

B. Pang and L. Lee, Opinion Mining and Sentiment Analysis, Foundations and Trends® in Information Retrieval, vol.2, issue.1-2, pp.1-135, 2008.
DOI : 10.1561/1500000011

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.147.1344

A. Pauls and D. Klein, Faster and smaller n-gram language models, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp.258-267, 2011.

T. M. Phuong, D. Lee, and K. H. Lee, Learning Rules to Extract Protein Interactions from Biomedical Text, Advances in Knowledge Discovery and Data Mining, pp.148-158, 2003.
DOI : 10.1007/3-540-36175-8_15

C. Pollard and I. Sag, Head-Driven Phrase Structure Grammar, 1994.

L. Popelínský and J. Blaťák, Learning genic interactions without expert domain knowledge: Comparison of different ILP algorithms, Proceedings of the 4th Learning Language in Logic Workshop (LLL05), pp.59-61, 2005.

W. M. Rand, Objective Criteria for the Evaluation of Clustering Methods, Journal of the American Statistical Association, vol.66, issue.336, pp.846-850, 1971.
DOI : 10.1080/01621459.1963.10500845

L. Ratinov and D. Roth, Design challenges and misconceptions in named entity recognition, Proceedings of the Thirteenth Conference on Computational Natural Language Learning, CoNLL '09, pp.147-155, 2009.
DOI : 10.3115/1596374.1596399

A. Ratnaparkhi, A Maximum Entropy Model for Part-Of-Speech Tagging, Proceedings of the Empirical Methods in Natural Language Processing, pp.133-142, 1996.

S. Ravi, K. Knight, and R. Soricut, Automatic prediction of parser accuracy, Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP '08, pp.887-896, 2008.
DOI : 10.3115/1613715.1613829

J. W. Reed, Y. Jiao, T. E. Potok, B. A. Klump, M. T. Elmore et al., TF-ICF: A New Term Weighting Scheme for Clustering Dynamic Data Streams, 2006 5th International Conference on Machine Learning and Applications (ICMLA'06), pp.258-263, 2006.
DOI : 10.1109/ICMLA.2006.50

S. Riedel and E. Klein, Genic interaction extraction with semantic and syntactic chains, Proceedings of the 4th Learning Language in Logic Workshop (LLL05), pp.69-74, 2005.

B. Rink and S. Harabagiu, A generative model for unsupervised discovery of relations and argument classes from clinical texts, Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp.519-528, 2011.

B. Rosenfeld and R. Feldman, Clustering for unsupervised relation identification, Proceedings of the sixteenth ACM conference on Information and knowledge management, CIKM '07, pp.411-418, 2007.
DOI : 10.1145/1321440.1321499

B. C. Russell, A. Torralba, K. P. Murphy, and W. T. Freeman, LabelMe: A Database and Web-Based Tool for Image Annotation, International Journal of Computer Vision, vol.77, issue.1-3, pp.157-173, 2008.
DOI : 10.1007/s11263-007-0090-8

R. Saetre, K. Sagae, and J. Tsujii, Syntactic features for protein-protein interaction extraction, Short Paper Proceedings of the 2nd International Symposium on Languages in Biology and Medicine, Singapore, CEUR Workshop Proceedings (CEUR-WS.org), pp.6-7, 2007.

G. Salton and C. Buckley, Term-weighting approaches in automatic text retrieval, Information Processing and Management, pp.513-523, 1988.

J. M. Santos and M. Embrechts, On the Use of the Adjusted Rand Index as a Metric for Evaluating Supervised Classification, Proceedings of the 19th International Conference on Artificial Neural Networks: Part II, ICANN '09, pp.175-184, 2009.
DOI : 10.1162/neco.1989.1.1.151

S. Sarawagi, Information Extraction, Foundations and Trends® in Databases, vol.1, issue.3, pp.261-377, 2008.
DOI : 10.1561/1900000003

L. Sarmento, P. Carvalho, M. J. Silva, and E. de Oliveira, Automatic creation of a reference corpus for political opinion mining in user-generated content, Proceedings of the 1st international CIKM workshop on Topic-sentiment analysis for mass opinion, TSA '09, pp.29-36, 2009.
DOI : 10.1145/1651461.1651468

URL : https://hal.archives-ouvertes.fr/hal-01109751

H. Schmid, Probabilistic part-of-speech tagging using decision trees, Proceedings of the International Conference on New Methods in Language Processing, pp.44-49, 1994.

Y. Shinyama and S. Sekine, Preemptive information extraction using unrestricted relation discovery, Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, pp.304-311, 2006.
DOI : 10.3115/1220835.1220874

URL : http://acl.ldc.upenn.edu/N/N06/N06-1039.pdf

J. Sivic and A. Zisserman, Video Google: a text retrieval approach to object matching in videos, Proceedings Ninth IEEE International Conference on Computer Vision, pp.1470-1477, 2003.
DOI : 10.1109/ICCV.2003.1238663

C. G. Snoek and M. Worring, Time interval maximum entropy based event indexing in soccer video, 2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698), pp.481-484, 2003.
DOI : 10.1109/ICME.2003.1221353

N. Sobhana, M. Pabitra, and S. Ghosh, Conditional Random Field Based Named Entity Recognition in Geological text, International Journal of Computer Applications, vol.1, issue.3, 2010.
DOI : 10.5120/72-166

P. Soucy, Beyond TFIDF weighting for text categorization in the vector space model, Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI 2005), pp.1130-1135, 2005.

G. Stamou, J. van Ossenbruggen, J. Z. Pan, G. Schreiber, and R. Smith, Multimedia annotations on the semantic Web, IEEE Multimedia, vol.13, issue.1, pp.86-90, 2006.
DOI : 10.1109/MMUL.2006.15

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.81.8741

A. J. Stothers, On the Complexity of Matrix Multiplication, 2010.

V. Strassen, The asymptotic spectrum of tensors and the exponent of matrix multiplication, 27th Annual Symposium on Foundations of Computer Science (sfcs 1986), pp.49-54, 1986.
DOI : 10.1109/SFCS.1986.52

J. Sturm, J. M. Kessens, M. Wester, F. de Wet, E. Sanders et al., Automatic transcription of football commentaries in the MUMIS project, Proceedings of the 8th European Conference on Speech Communication and Technology, 2003.

C. Sun, L. Lin, X. Wang, and Y. Guan, Using Maximum Entropy Model to Extract Protein-Protein Interaction Information from Biomedical Literature, Proceedings of the intelligent computing 3rd international conference on Advanced intelligent computing theories and applications, ICIC'07, pp.730-737, 2007.
DOI : 10.1007/978-3-540-74171-8_72

K. Takeuchi and N. Collier, Named entity recognition, Proceedings of the 6th conference on Natural language learning, COLING-02, pp.2-3, 2002.
DOI : 10.3115/1118853.1118882

P. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, 2006.

A. Toral and R. Munoz, A proposal to automatically build and maintain gazetteers for named entity recognition by using Wikipedia, Proceedings of the EACL-2006 Workshop on NEW TEXT-Wikis and blogs and other dynamic text sources, 2006.

A. M. Turing, Computing machinery and intelligence, pp.433-460, 1950.
DOI : 10.1007/978-1-4020-6710-5_3

S. van Dongen, Graph Clustering by Flow Simulation, 2000.

S. van Dongen, Performance criteria for graph clustering and Markov cluster experiments, National Research Institute for Mathematics and Computer Science in the Netherlands, 2000.

C. J. van Rijsbergen, Foundation of evaluation, Journal of Documentation, vol.30, issue.4, pp.365-373, 1974.
DOI : 10.1108/eb026584

S. Verma, S. Vieweg, W. Corvey, L. Palen, J. H. Martin et al., Natural language processing to the rescue? Extracting "situational awareness" tweets during mass emergency, Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media (ICWSM), 2011.

N. Vinh, J. Epps, and J. Bailey, Information theoretic measures for clusterings comparison, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, 2010.
DOI : 10.1145/1553374.1553511

C. Wang, F. Jing, L. Zhang, and H. Zhang, Image annotation refinement using random walk with restarts, Proceedings of the 14th annual ACM international conference on Multimedia, MULTIMEDIA '06, pp.647-650, 2006.
DOI : 10.1145/1180639.1180774

W. Wang, R. Besançon, O. Ferret, and B. Grau, Filtering and clustering relations for unsupervised information extraction in open domain, Proceedings of the 20th ACM international conference on Information and knowledge management, CIKM '11, pp.1405-1414, 2011.
DOI : 10.1145/2063576.2063780

C. Whitelaw, A. Kehlenbeck, N. Petrovic, and L. Ungar, Web-scale named entity recognition, Proceedings of the 17th ACM conference on Information and knowledge management, CIKM '08, pp.123-132, 2008.
DOI : 10.1145/1458082.1458102

J. Xiao, J. Su, G. Zhou, and C. Tan, Protein-protein interaction extraction: A supervised learning approach, Proceedings of the First International Symposium on Semantic Mining in Biomedicine (SMBM), pp.51-59, 2005.

L. Yao, A. Haghighi, S. Riedel, and A. McCallum, Structured relation discovery using generative models, Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp.1456-1466, 2011.

D. Yarowsky, Unsupervised word sense disambiguation rivaling supervised methods, Proceedings of the 33rd annual meeting on Association for Computational Linguistics, pp.189-196, 1995.
DOI : 10.3115/981658.981684

URL : http://acl.ldc.upenn.edu/P/P95/P95-1026.pdf

H. Zargayouna, Evaluation report: Results for task 3.3 on ontology acquisition period 3, LIPN. Quaero project, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00838313

H. Zhang, A. C. Berg, M. Maire, and J. Malik, SVM-KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 2 (CVPR'06), pp.2126-2136, 2006.
DOI : 10.1109/CVPR.2006.301

M. Zhang, J. Su, D. Wang, G. Zhou, and C. L. Tan, Discovering relations between named entities from a large raw corpus using tree similarity-based clustering, Proceedings of the Second international joint conference on Natural Language Processing, IJCNLP'05, pp.378-389, 2005.

Y. Zhao and G. Karypis, Criterion functions for document clustering: Experiments and analysis, 2001.

G. Zhou, L. Qian, and J. Fan, Tree kernel-based semantic relation extraction with rich syntactic and semantic information, Information Sciences, vol.180, issue.8, pp.1313-1325, 2010.
DOI : 10.1016/j.ins.2009.12.006

G. Zhou and J. Su, Named entity recognition using an HMM-based chunk tagger, Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, ACL '02, pp.473-480, 2002.

List of Figures

2.1 Sample sentence for protein-protein interaction

A bag of words (n-grams) to represent ...

Possible hyperplanes for classifying a linearly separable data set. B1 is a better hyperplane than B2 because it maximizes the separation of the classes (Tan, 2006)

(a) Single link: similarity of the most similar members; (b) Complete link: similarity of the most dissimilar members (Manning, 2008)

A constituency tree: an example of constituency grammar, where non-terminal symbols build intermediate nodes and terminal symbols are leaf nodes (2001)