Des moteurs de recherche efficaces pour des systèmes hypertextes grâce aux contextes des noeuds, Colloque International : Technologies de l'Information et de la Communication dans les Enseignements d'ingénieurs et dans l'industrie, 2000. ,
MITRE, Proceedings of the 6th conference on Message understanding, MUC6 '95, 1995. ,
DOI : 10.3115/1072399.1072413
Language identification from text using n-gram based cumulative frequency addition, CSIS Research Day, 2004. ,
NoDoSE - a tool for semi-automatically extracting semi-structured data from text documents, SIGMOD Conference, pp.283-294, 1998. ,
Ambiguity problem in multilingual information retrieval, CLEF, pp.156-165, 2000. ,
SRI international FASTUS system : MUC-6 test results and analysis, Proceedings of the Sixth Message Understanding Conference, 1995. ,
Using a language independent domain model for multilingual information extraction, Special Issue on Multilinguality in the Software Industry : the AI Contribution (MULSAIC-97), 1999. ,
DOI : 10.1080/088395199117252
The diameter of the World Wide Web. CoRR, cond-mat, 1999. ,
WebOQL : Restructuring documents, databases, and Webs, TAPOS, vol.5, issue.3, pp.127-141, 1999. ,
Semistructured and structured data in the Web: going back and forth, ACM SIGMOD Record, vol.26, issue.4, pp.16-23, 1997. ,
DOI : 10.1145/271074.271080
An introduction to information extraction, Artificial Intelligence Communications, vol.12, issue.3, pp.161-172, 1999. ,
Emergence of scaling in random networks, Science, vol.286, pp.509-512, 1999. ,
The Situation in Logic, volume 17 of CSLI Lecture Notes. Center for the Study of Language and Information Publications, 1989. ,
A learning experience : Training an artificial neural network to discriminate languages, 1992. ,
Convergence properties of the K-means algorithm, Advances in Neural Information Processing Systems 7, 1995. ,
Language identifier : A computer program for automatic natural-language identification of on-line text, the 29th Annual Conference of the American Translators Association, pp.47-54, 1988. ,
ITC-irst at CLEF 2003: Monolingual, Bilingual, and Multilingual Information Retrieval, CLEF, pp.140-151, 2003. ,
DOI : 10.1007/978-3-540-30222-3_13
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.501.1222
Integrating New Languages in a Multilingual Search System Based on a Deep Linguistic Analysis, CLEF, pp.83-89, 2004. ,
DOI : 10.1007/11519645_8
Supervised wrapper generation with Lixto, VLDB, pp.715-716, 2001. ,
A machine learning architecture for optimizing Web search engine, Workshop on Internet-Based Information Systems (W- AAAI'96), 1996. ,
Improved algorithms for topic distillation in a hyperlinked environment, Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '98, pp.104-111, 1998. ,
DOI : 10.1145/290941.290972
Graph structure in the Web, Computer Networks, vol.33, issue.1-6, 2000. ,
DOI : 10.1016/S1389-1286(00)00083-9
Sesame : A generic architecture for storing and querying RDF and RDF Schema, International Semantic Web Conference, pp.54-68, 2002. ,
Random projection in dimensionality reduction, Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '01, 2001. ,
DOI : 10.1145/502512.502546
Measuring the Web, Computer Networks and ISDN Systems, vol.28, issue.7-11, pp.993-1005, 1996. ,
DOI : 10.1016/0169-7552(96)00061-X
Histoire de l'informatique. La découverte, 1993. ,
Extracting Patterns and Relations from the World Wide Web, WebDB Workshop at 6th International Conference on Extending Database Technology, EDBT'98, 1998. ,
DOI : 10.1007/10704656_11
Structural analysis of hypertexts: identifying hierarchies and useful metrics, ACM Transactions on Information Systems, vol.10, issue.2, pp.142-180, 1992. ,
DOI : 10.1145/146802.146826
Multilingual Information Retrieval Based on Document Alignment Techniques, ECDL, pp.183-197, 1998. ,
DOI : 10.1007/3-540-49653-X_12
As we may think. The Atlantic Monthly, pp.101-108, 1945. ,
Automatic Report Generation from Ontologies: The MIAKT Approach, Ninth International Conference on Applications of Natural Language to Information Systems, 2004. ,
DOI : 10.1007/978-3-540-27779-8_28
Empirical methods in information extraction, AI Magazine. ,
Mining the Web's link structure. ,
Bigraph and trigraph models for language identification and character recognition, AISB Workshop on Computational Linguistics for Speech and Handwriting Recognition. ,
Towards the self-annotating Web, WWW '04 : Proceedings of the 13th international conference on World Wide Web. ,
WebQuery : Searching and visualizing the Web through connectivity. ,
IEPAD : information extraction based on pattern discovery, WWW. ,
GATE : A framework and graphical development environment for robust NLP tools and applications, Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics. ,
Automatic web information extraction in the ROADRUNNER system. ,
Califf. Relational Learning Techniques for Natural Language Information Extraction. ,
[CL96] J. Cowie and W. Lehnert. Information extraction. Communications of the ACM. ,
[CLM00] Vincenza Carchiolo, Alessandro Longheu, and Michele Malgeri. Extracting logical schema from the web. In PRICAI Workshop on Text and Web Mining. ,
[CM98] Valter Crescenzi and Giansalvatore Mecca. Grammars have exceptions. ,
[CMM01] Valter Crescenzi ER (Workshops), pp.60-67, 1994. ,
Querying structured documents with hypertext links using OODBMS, Proceedings of the 1994 ACM European conference on Hypermedia technology, ECHT '94, pp.186-197, 1994. ,
DOI : 10.1145/192757.192799
Introduction to the special issue on software architecture for language engineering, Natural Language Engineering, 2004. ,
N-gram-based text categorization, Proceedings of SDAIR-94, 3rd Annual Symposium on Document Analysis and Information Retrieval, pp.161-175, 1994. ,
An empirical study of smoothing techniques for language modeling, 1998. ,
Information extraction, automatic. Encyclopedia of Language and Linguistics, 2005. ,
Gauging similarity with n-grams : Language-independent categorization of text, Science, vol.267, issue.10, pp.843-848, 1995. ,
Multilingual linguistic modules for IE systems, Proceedings of Workshop on Information Extraction for Slavonic and other Central and Eastern European Languages (IESL'03), 2003. ,
Automatic semantic annotation using unsupervised information extraction and integration, Workshop on Knowledge Markup and Semantic Annotation, 2003. ,
Magpie, Proceedings of the 9th international conference on Intelligent user interface, IUI '04, pp.191-197, 2004. ,
DOI : 10.1145/964442.964479
SemTag and Seeker, Proceedings of the twelfth international conference on World Wide Web, WWW '03, pp.178-186, 2003. ,
DOI : 10.1145/775152.775178
An overview of the FRUMP system, Strategies for Natural Language Processing, pp.149-176, 1982. ,
Wim Peters, Erich Peters, and Wim Voermans. Cross-lingual legal information retrieval using a WordNet architecture, ICAIL, pp.163-167, 2005. ,
Structure of Growing Networks with Preferential Linking, Physical Review Letters, vol.85, issue.21, pp.4633-4636, 2000. ,
DOI : 10.1103/PhysRevLett.85.4633
Statistical identification of language, 1994. ,
Conceptual-model-based data extraction from multiple-record Web pages, Data Knowl. Eng, vol.31, issue.3, pp.227-251, 1999. ,
Information extraction from World Wide Web - a survey, 1999. ,
La dialectique de l'écrit et du document. Un effort de synthèse. Schéma et schématisation, pp.82-91, 1981. ,
On random graphs, Publicationes Mathematicae, vol.6, pp.290-297, 1959. ,
The shape of the Web and its implications for searching the Web, 2000. ,
The World-Wide Web: quagmire or gold mine?, Communications of the ACM, vol.39, issue.11, pp.65-68, 1996. ,
DOI : 10.1145/240455.240473
WordNet - An Electronic Lexical Database, 1998. ,
Self-organization and identification of Web communities, Computer, vol.35, issue.3, pp.66-71, 2002. ,
DOI : 10.1109/2.989932
Systèmes multilingue recherche interlingue, Conférence Internationale sur le Document Electronique, 2005. ,
Structured answers for a large structured document collection, 16th ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'93), pp.204-213, 1993. ,
[Für99] Johannes Fürnkranz. Exploiting structural information for text classification on the WWW, IDA. ,
[Fre98] D. Freitag. Information extraction from HTML : Application of a general machine learning approach, Proceedings of Fifteenth National Conference on Artificial Intelligence (AAAI-98), 1998. ,
[Gig95] Emmanuel Giguet. Multilingual sentence categorization according to language. CoRR, cmp-lg/9502039. ,
[Gig98] Emmanuel Giguet. Méthode pour l'analyse automatique de structures formelles sur documents multilingues. ,
[GKR98] David Gibson, Jon M. Kleinberg, and Prabhakar Raghavan. Inferring Web communities from link topology. In Hypertext. ,
The Web graph : an overview, In Quatrièmes Rencontres francophones sur les aspects algorithmiques des télécommunications (ALGOTEL'02), 2002. ,
URL : https://hal.archives-ouvertes.fr/hal-00016817
Topologie d'Internet et du Web : mesure et modélisation, Premier colloque Mesures de l'Internet, 2003. ,
Bipartite graphs as models of complex networks, CAAN, pp.127-139, 2004. ,
[Gér02] Mathias Géry. Indexation et interrogation de chemins de lecture en contexte pour la Recherche d'Information Structurée sur le Web, 2002. ,
Bipartite structure of all complex networks, pp.215-221. ,
DOI : 10.1016/j.ipl.2004.03.007
URL : https://hal.archives-ouvertes.fr/hal-00016855
Adaptive information extraction and sublanguage analysis, Proceedings of Workshop on Adaptive Text Extraction and Mining at Seventeenth International Joint Conference on Artificial Intelligence. ,
A translation approach to portable ontologies, Knowledge Acquisition, pp.199-220. ,
Description of the Proteus system as used for MUC-5, Proceedings of the Fifth Message Understanding Conference (MUC-5). ,
Message Understanding Conference-6 : A brief history, Proceedings of the 16th International Conference on Computational Linguistics. ,
Autowrapper : automatic wrapper generation for multiple online services, The Asia Pacific Web Conference. ,
Information extraction : Beyond document retrieval. ,
Description of the LaSIE system as used for MUC-6, Proceedings of the Sixth Message Understanding Conference. ,
Hyweb : Un système d'interrogation orienté objet pour le web, BDA. ,
Description of the FASTUS system as used for MUC-4, Proceedings of the Fourth Message Understanding Conference MUC-4, 1992. ,
TIPSTER architecture design document version 2.0 (tinman architecture). ,
Language recognition using two-and three-letter clusters, 1993. ,
Generating finite-state transducers for semistructured data extraction from the Web, Special Issue on Semistructured Data, 1998. ,
Template-based wrappers in the TSIMMIS system, SIGMOD Conference, pp.532-535, 1997. ,
Multilingual document alignment - a study with Chinese and Japanese, NLPRS, pp.617-623, 2001. ,
Description of the TACITUS system as used for MUC-3, Proceedings of the Third Message Understanding Conference MUC-3, pp.200-206, 1991. ,
S-CREAM - semi-automatic creation of metadata, 13th International Conference on Knowledge Engineering and Knowledge Management (EKAW02), pp.358-372, 2002. ,
Learning information extraction patterns from examples. Workshop on new approaches to learning for natural language processing (IJCAI-95), pp.127-142, 1995. ,
Essais de linguistique générale, 1963. ,
Template-based information mining from HTML documents, AAAI/IAAI, pp.256-262, 1997. ,
SCISOR: extracting information from on-line news, Communications of the ACM, vol.33, issue.11, pp.88-97, 1990. ,
DOI : 10.1145/92755.92769
Dimensionality reduction by random mapping : Fast similarity computation for clustering, International Joint Conference on Neural Networks (IJCNN'98), 1998. ,
Web mining research, SIGKDD Explorations, pp.1-15, 2000. ,
DOI : 10.1145/360402.360406
Bibliographic coupling between scientific papers, American Documentation, vol.14, pp.10-25, 1963. ,
AeroDAML : Applying information extraction to generate DAML Annotations from Web pages, In First International Conference on Knowledge Capture. ,
The Web as a graph : Measurements, models, and methods, pp.1-17, 1999. ,
The structure of the Web, p.1849. ,
Authoritative sources in a hyperlinked environment, Journal of the ACM, vol.46, issue.5, pp.604-632, 1999. ,
DOI : 10.1145/324133.324140
Acquisition of linguistic patterns for knowledge-based information extraction, pp.31713-724, 1995. ,
Self-organizing maps. ,
Random graph models for the Web graph, FOCS. ,
The Web as a graph, Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, PODS '00, pp.1-10, 2000. ,
DOI : 10.1145/335168.335170
Crosslingual information extraction system evaluation, Proceedings of COLING. ,
Referral Web : combining social networks and collaborative filtering. ,
[Kus97] Nicholas Kushmerick. Wrapper induction for information extraction. ,
Proceedings of the Sixth Message Understanding Conference, pp.221-23663, 1995. ,
Atrans : Automatic processing of money transfer messages, Proceedings of the Fifth National Conference on Artificial Intelligence (AAAI-86), pp.1089-1093, 1986. ,
Language identification using phone-based acoustic likelihoods, the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICA94), 1994. ,
Description of the PIE system as used for MUC-6, Proceedings of the Sixth Message Understanding Conference (MUC-6), pp.113-126, 1995. ,
How "World Wide" is the Web ? Trend in internationalization of Web sites, Annual Review of OCLC Research, 1999. ,
XWRAP: an XML-enabled wrapper construction system for Web information sources, Proceedings of 16th International Conference on Data Engineering, pp.611-621, 2000. ,
DOI : 10.1109/ICDE.2000.839475
DEByE - data extraction by example, Data Knowl. Eng, vol.40, issue.2, pp.121-154, 2002. ,
A brief survey of Web data extraction tools, SIGMOD Record, vol.31, issue.2, pp.84-93, 2002. ,
Some methods for classification and analysis of multivariate observations, Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, pp.281-297, 1967. ,
Multilingual information extraction, Master's thesis, 2004. ,
Multi-source and multilingual information extraction, 2003. ,
Topology of the conceptual network of language, Physical Review E, vol.65, 065102, 2002. ,
Stalker : Learning extraction rules for semistructured, Proceedings of AAAI-98 Workshop on AI and Information Integration, 1998. ,
Active learning for hierarchical wrapper induction, AAAI/IAAI, p.975, 1999. ,
Querying the World Wide Web, Fourth International Conference on Parallel and Distributed Information Systems, pp.80-91, 1996. ,
DOI : 10.1109/PDIS.1996.568671
MUSE : a multi-source entity recognition system, 2003. ,
Multiple discriminant analysis in linguistic problems, Statistical Methods in Linguistics, vol.4, 1965. ,
How We Think., Proceedings of Online 72 ,
DOI : 10.2307/2179725
Foreign language identification : First step in the translation process, the 28th Annual Conference of the American Translators Association, pp.509-516, 1987. ,
Nouvelle méthode syntagmatique de vectorisation appliquée au Self-organizing map des textes vietnamiens, POSTER, RECITAL (Rencontre des Etudiants Chercheurs en Informatique pour le Traitement Automatique des Langues), 2004. ,
Multilingual hyperdocument recognition: a document mining approach, Proceedings of the 2004 International Conference on Information and Communication Technologies: From Theory to Applications, 2004. ,
DOI : 10.1109/ICTTA.2004.1307822
Hyperling : Système de reconnaissance et de classification des hyperdocuments multilingues, International Conference in Computer Science « Research, Innovation and Vision of the Future» (RIVF'05), 2005. ,
Multilingual Web Documents: the system Hyperling, 2006 2nd International Conference on Information & Communication Technologies, 2006. ,
DOI : 10.1109/ICTTA.2006.1684435
Dimitar Manov, Damyan Ognyanoff, and Miroslav Goranov. KIM - semantic annotation platform, International Semantic Web Conference, pp.834-849, 2003. ,
KIM - semantic annotation platform, Natural Language Engineering. ,
[Poi99] Thierry Poibeau. Mixing technologies for intelligent information extraction, Actes du workshop Intelligent Information Integration (III), 16th International Joint Conference on Artificial Intelligence (IJCAI'99), 1999. ,
[PP04] Muntsa Padro and Lluis Padro. Comparing methods for language identification, Procesamiento del Lenguaje Natural, pp.155-162. ,
[PPR96] Peter Pirolli, James Pitkow, and Ramana Rao. Silk from a Sow's Ear : Extracting usable structures from the Web. Proc. ,
Survey of semantic annotation platforms, Proceedings of the 2005 ACM symposium on Applied computing, SAC '05, pp.1634-1638, 2005. ,
DOI : 10.1145/1066677.1067049
Elaboration automatique d'une base de données à partir d'informations semi-structurées issues du Web, INFORSID, pp.327-341. ,
Automatically constructing a dictionary for information extraction tasks, Proceedings of the Eleventh Annual Conference on Artificial Intelligence, 1993. ,
What is this page known for ? Computing web page reputations. ,
Extracting semi-structured data through examples, CIKM, pp.94-101, 1999. ,
Inducing information extraction systems for new languages via cross-language projection, COLING, 2002. ,
Experimental results on the alignment of multilingual web sites, Proceedings of the Eighth European Conference on Software Maintenance and Reengineering (CSMR 2004), pp.288-295, 2004. ,
DOI : 10.1109/CSMR.2004.1281431
Probabilistic reasoning for entity & relation recognition, Proceedings of the 19th international conference on Computational linguistics, pp.1-7, 2002. ,
DOI : 10.3115/1072228.1072379
Web ecology : Recycling HTML pages as XML documents using W4F, WebDB (Informal Proceedings), pp.31-36, 1999. ,
Automatic text decomposition and structuring, Information Processing & Management, vol.32, issue.2, pp.127-138, 1996. ,
DOI : 10.1016/S0306-4573(96)85001-1
Breaking through the foreign language barrier : Resources on the web, Online Journal of Issues in Nursing, 2000. ,
Issues in inductive learning of domain-specific text extraction rules, Learning for Natural Language Processing, pp.290-301, 1995. ,
Hierarchical directed acyclic graph kernel, Proceedings of the 41st Annual Meeting on Association for Computational Linguistics, ACL '03, pp.32-39, 2003. ,
DOI : 10.3115/1075096.1075101
Co-citation in the scientific literature : A new measure of the relationship between two documents. Essays of an Information Scientist, pp.28-31, 1974. ,
Structured translation for cross-language information retrieval, Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '00, pp.120-127, 2000. ,
DOI : 10.1145/345508.345562
Learning to extract text-based information from the World Wide Web, KDD, pp.251-254, 1997. ,
Learning information extraction rules for semi-structured and free text, Feb, 1999. ,
Mining structural information on the Web, pp.8-131205, 1997. ,
Exploring complex networks. ,
Knowledge-based wrapper generation by using XML, IJCAI-2001 Workshop on Adaptive Text Extraction and Mining. ,
ESOM-Maps : tools for clustering, visualization, and classification with Emergent SOM. ,
[UN90] Yoshio Ueda and Seiichi Nakagawa. Prediction for phoneme/syllable/word-category and identification of language using HMM, the 1990 International Conference on Spoken Language Processing. ,
[VA00] J. Vesanto and E. Alhoniemi. Clustering of the Self-Organizing Map. In Student Member, pp.521157-1168, 1990. ,
MnM: Ontology Driven Semi-automatic and Automatic Support for Semantic Markup, EKAW '02 : Proceedings of the 13th International Conference on Knowledge Engineering and Knowledge Management. Ontologies and the Semantic Web, pp.379-391, 2002. ,
DOI : 10.1007/3-540-45810-7_34
Description of the PLUM system as used for MUC-6, Proceedings of the Sixth Message Understanding Conference (MUC-6), pp.55-70119, 1989. ,
Social Network Analysis : Methods and Applications. ,
Collective dynamics of 'small-world' networks, Nature, vol.393, pp.440-442, 1998. ,
MIETTA - a framework for uniform and multilingual access to structured database and Web information, Proceedings of the fifth international workshop on Information retrieval with Asian languages, IRAL '00, 2000. ,
DOI : 10.1145/355214.355220
NYU : Description of the Proteus/PET system as used for MUC-7, Proceedings of the Seventh Message Understanding Conference, 1998. ,
The automatic identification of languages using linguistic recognition signals, 1991. ,
Catégorisation des hyperdocuments multilingues : système Hyperling, Conférence Internationale sur le Document Electronique (CiDE.8), 2005. ,
Automatic language identification of telephone speech messages using phoneme recognition and n-gram modeling, the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICA94), 1994. ,