, European Language Resources Association (ELRA), Portorož, Slovenia, vol.56, p.103, 2016.

F. Vanden Berghen and H. Bersini, CONDOR, a new parallel, constrained extension of Powell's UOBYQA algorithm: Experimental results and comparison with the DFO algorithm, Journal of Computational and Applied Mathematics, vol.181, pp.157-175, 2005.

M. W. Berry and P. G. Young, Using latent semantic indexing for multilanguage information retrieval, Computers and the Humanities, vol.29, pp.47-49, 1995.

C. Best, E. Van-der-goot, K. Blackler, T. Garcia, and D. Horby, Europe Media Monitor-System Description, p.70, 2005.

B. , Bureau International de l'Association littéraire et artistique, Actes de la Conférence réunie à Berlin, p.20, 1910.

W. Blacoe and M. Lapata, A Comparison of Vector-based Representations for Semantic Composition, Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Association for Computational Linguistics, vol.55, p.98, 2012.

J. Blitzer, M. Dredze, and F. Pereira, Biographies, bollywood, boomboxes and blenders: Domain adaptation for sentiment classification, Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. Association of Computational Linguistics, pp.440-447, 2007.

P. Bojanowski, E. Grave, A. Joulin, and T. Mikolov, Enriching Word Vectors with Subword Information, Hinrich Schütze, editor, Transactions of the Association for Computational Linguistics, vol.5, pp.135-146, 2017.

O. Bojar, C. Buck, C. Callison-burch, C. Federmann, B. Haddow et al., Findings of the 2013 Workshop on Statistical Machine Translation. In Proceedings of the Eighth Workshop on Statistical Machine Translation (WMT13). Association for Computational Linguistics, pp.1-44, 2013.

F. Boudin, TALN Archives : une archive numérique francophone des articles de recherche en Traitement Automatique de la Langue, TALN Archives: a digital archive of French research articles in Natural Language Processing, vol.2, pp.507-514, 2013.

S. R. Bowman, G. Angeli, C. Potts, and C. D. Manning, A Large Annotated Corpus for Learning Natural Language Inference, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, p.121, 2015.

L. Breiman, Bias, Variance, and Arcing Classifiers. Technical report, p.121, 1996.

L. Breiman, Random Forests, Machine Learning, vol.45, pp.5-32, 2001.

P. F. Brown, V. J. Della Pietra, S. A. Della Pietra, and R. L. Mercer, The Mathematics of Statistical Machine Translation: Parameter Estimation, Computational Linguistics, vol.19, issue.2, pp.46-87, 1993.

S. V. Bruton, Self-Plagiarism and Textual Recycling: Legitimate Forms of Research Misconduct, Accountability in Research, vol.21, issue.3, pp.176-197, 2014.

T. Brychcin and L. Svoboda, UWB at SemEval-2016 Task 1: Semantic textual similarity using lexical, syntactic, and semantic information, Proceedings of the 10th International Workshop on Semantic Evaluation, vol.53, p.112, 2016.

P. Businger and G. H. Golub, Linear least squares solutions by Householder transformations, Numerische Mathematik, vol.7, issue.3, pp.269-276, 1965.

C. Callison-burch, P. Koehn, C. Monz, M. Post, R. Soricut et al., Findings of the 2012 Workshop on Statistical Machine Translation. In Proceedings of WMT 2012, 2012.

W. B. Cavnar and J. M. Trenkle, N-Gram-Based Text Categorization, Proceedings of 3rd Annual Symposium on Document Analysis and Information Retrieval (SDAIR'94), pp.161-175, 1994.

D. Cer, M. Diab, E. Agirre, I. Lopez-gazpio, and L. Specia, SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017). Association for Computational Linguistics, pp.1-14, 2017.
DOI : 10.18653/v1/s17-2001

URL : https://hal.archives-ouvertes.fr/hal-01560674

M. Cettolo, C. Girardi, and M. Federico, WIT3: Web inventory of transcribed and translated talks, Proceedings of the 16th Conference of the European Association for Machine Translation (EAMT), pp.261-268, 2012.

D. Chen and C. D. Manning, A Fast and Accurate Dependency Parser using Neural Networks, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, pp.740-750, 2014.
DOI : 10.3115/v1/d14-1082

URL : https://doi.org/10.3115/v1/d14-1082

M. Chen, Efficient Vector Representation for Documents through Corruption, Proceedings of the 5th International Conference on Learning Representations. Toulon, France, April 2017.

P. Cimiano, A. Schultz, S. Sizov, P. Sorg, and S. Staab, Explicit Versus Latent Concept Models for Cross-Language Information Retrieval, Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI-09), pp.1513-1518, 2009.

J. G. Cleary and L. E. Trigg, K*: An Instance-based Learner Using an Entropic Distance Measure, Proceedings of the 12th International Conference on Machine Learning, vol.2, pp.114-116, 1995.

P. Clough, Old and new challenges in automatic plagiarism detection, National Plagiarism Advisory Service, p.61, 2003.

P. Clough and M. Stevenson, Developing a Corpus of Plagiarised Short Answers, Language Resources and Evaluation, vol.45, issue.1, pp.5-24, 2011.

C. Collberg and S. Kobourov, Self-Plagiarism in Computer Science, Communications of the ACM, vol.48, issue.4, pp.88-94, 2005.

. Collins, Cobuild English Dictionary, p.40, 1988.

A. Conneau, D. Kiela, H. Schwenk, L. Barrault, and A. Bordes, Supervised Learning of Universal Sentence Representations from Natural Language Inference Data, Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp.681-691, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01897968

N. Cristianini and J. Shawe-taylor, An introduction to Support Vector Machines and other kernel-based learning methods, p.49, 2000.

F. J. Damerau, A technique for computer detection and correction of spelling errors, Communications of the ACM, vol.7, issue.3, pp.86-101, 1964.

V. Danilova, Cross-Language Plagiarism Detection Methods. In Galia Angelova, Kalina Bontcheva, and Ruslan Mitkov, editors, Proceedings of the Student Research Workshop associated with RANLP 2013. Hisarya, Bulgaria, Recent Advances in Natural Language Processing, vol.37, p.58, 2013.

S. Deerwester, S. T. Dumais, G. W. Furnas, T. K. Landauer, and R. Harshman, Indexing by latent semantic analysis, Journal of the American Society for Information Science, vol.41, issue.6, pp.47-49, 1990.

M. Dehouck and P. Denis, Delexicalized Word Embeddings for Cross-lingual Dependency Parsing, Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, vol.1, pp.240-249, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01590639

J. Denero and D. Klein, Tailoring Word Alignments to Syntactic Machine Translation. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, pp.17-24, 2007.

B. Diedenhofen and J. Musch, cocor: A Comprehensive Solution for the Statistical Comparison of Correlations, PLoS ONE, vol.10, issue.6, 2015.

. Bibliographie,

F. Duclaye, Apprentissage automatique de relations d'équivalence sémantique à partir du Web, 2003.

S. T. Dumais, T. A. Letsche, M. L. Littman, and T. Landauer, Automatic Cross-language Retrieval Using Latent Semantic Indexing, AAAI-97 Spring Symposium Series: Cross-Language Text and Speech Retrieval, pp.18-24, 1997.

L. Duong, H. Kanayama, T. Ma, S. Bird, and T. Cohn, Multilingual Training of Crosslingual Word Embeddings, Proceedings of the 15th Conference of the European Chapter, vol.1, pp.893-903, 2017.
DOI : 10.18653/v1/e17-1084

URL : https://doi.org/10.18653/v1/e17-1084

P. Edmonds and G. Hirst, Near-synonymy and lexical choice, Computational Linguistics, vol.28, issue.2, pp.105-144, 2002.
DOI : 10.1162/089120102760173625

S. Meyer-zu-eissen and B. Stein, Intrinsic Plagiarism Detection, Advances in Information Retrieval, editor, 28th European Conference on IR Research, vol.31, p.34, 2006.

J. Enright and G. Kondrak, A Fast Method for Parallel Document Identification, The Conference of the North American Chapter of the Association for Computational Linguistics (NAACL'07), pp.29-32, 2007.
DOI : 10.3115/1614108.1614116

URL : http://dl.acm.org/ft_gateway.cfm?id=1614116&type=pdf

C. España-bonet, Á. C. Varga, A. Barrón-cedeño, and J. Van-genabith, An Empirical Analysis of NMT-Derived Interlingual Embeddings and their Use in Parallel Sentence Identification, 2017.

. Eurovoc, Subject-Oriented Version, vol.2, p.41, 1995.

F. C. Fang, R. G. Steen, and A. Casadevall, Misconduct accounts for the majority of retracted scientific publications, Proceedings of the National Academy of Sciences, vol.109, pp.17028-17033, 2012.

J. Ferrero, F. Agnès, L. Besacier, and D. Schwab, A Multilingual, Multi-style and Multi-granularity Dataset for Cross-language Textual Similarity Detection, Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16). European Language Resources Association (ELRA), Portorož, Slovenia, vol.76, pp.4162-4169, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01303135

J. Ferrero, L. Besacier, D. Schwab, and F. Agnès, CompiLIG at SemEval-2017 Task 1: Cross-Language Plagiarism Detection Methods for Semantic Textual Similarity, Proceedings of the 11th International Workshop on Semantic Evaluation, pp.109-114, 2017.
DOI : 10.18653/v1/s17-2012

URL : https://hal.archives-ouvertes.fr/hal-01531330

J. Ferrero, L. Besacier, D. Schwab, and F. Agnès, Deep Investigation of Cross-Language Plagiarism Detection Methods, Proceedings of the 10th Workshop on Building and Using Comparable Corpora (BUCC). Association for Computational Linguistics, pp.6-15, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01531346

J. Ferrero, L. Besacier, D. Schwab, and F. Agnès, Using Word Embedding for Cross-Language Plagiarism Detection, Proceedings of the 15th Conference of the European Chapter, vol.2, pp.415-421, 2017.
DOI : 10.18653/v1/e17-2066

URL : https://hal.archives-ouvertes.fr/hal-01502146

J. Ferrero and A. Simac-lejeune, Conférence internationale sur l'extraction et la gestion des connaissances (EGC), January, vol.1, pp.23-28, 2015.

M. Franco-salvador, I. Bensalem, E. Flores, P. Gupta, and P. Rosso, PAN 2015 Shared Task on Plagiarism Detection: Evaluation of Corpora for Text Alignment, Notebook Papers for PAN at CLEF 2015 LABs and Workshops, 2015.

M. Franco-salvador, P. Gupta, and P. Rosso, Cross-Language Plagiarism Detection using a Multilingual Semantic Network, 5th European Conference on Information Retrieval (ECIR'13), vol.3, pp.710-713, 2013.
DOI : 10.1007/978-3-642-36973-5_66

URL : https://riunet.upv.es/bitstream/10251/38819/11/poster_ECIR_13.pdf

M. Franco-salvador, P. Gupta, and P. Rosso, Graph-based similarity analysis: a new approach to cross-language plagiarism detection. Journal of the Spanish Society of Natural Language Processing (Sociedad Española de Procesamiento del Lenguaje Natural) 50, p.45, 2013.

M. Franco-salvador, P. Gupta, and P. Rosso, Knowledge Graphs as Context Models: Improving the Detection of Cross-Language Plagiarism with Paraphrasing. In Bridging Between Information Retrieval and Databases. PROMISE Winter School, vol.8173, p.45, 2013.
DOI : 10.1007/978-3-642-54798-0_12

URL : https://riunet.upv.es/bitstream/10251/49758/1/PROMISE_13.pdf

M. Franco-salvador, P. Rosso, and R. Navigli, A Knowledge-based Representation for Cross-Language Document Retrieval and Categorization, Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics. Gothenburg, Sweden, pp.414-423, 2014.
DOI : 10.3115/v1/e14-1044

URL : http://wwwusers.di.uniroma1.it/~navigli/pubs/EACL_2014_FrancoSalvadoretal.pdf

M. Franco-salvador, P. Rosso, and M. Montes-y-gómez, A systematic study of knowledge graph analysis for cross-language plagiarism detection, Information Processing & Management, July, vol.52, pp.550-570, 2016.
DOI : 10.1016/j.ipm.2015.12.004

M. Franco-salvador, P. Gupta, P. Rosso, and R. E. Banchs, Cross-language plagiarism detection over continuous-space- and knowledge graph-based representations of language. Knowledge-Based Systems, vol.111, pp.87-99, 2016.

G. Francopoulo, J. Mariani, and P. Paroubek, A Study of Reuse and Plagiarism in LREC papers, Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16). European Language Resources Association (ELRA), pp.1890-1897, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01840811

E. Frank, M. Hall, and B. Pfahringer, Locally Weighted Naive Bayes, Proceedings of the 19th Conference in Uncertainty in Artificial Intelligence, vol.2, pp.114-116, 2003.

E. Gabrilovich and S. Markovitch, Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis, Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI'07), pp.1606-1611, 2007.

W. A. Gale and K. W. Church, A Program for Aligning Sentences in Bilingual Corpora, Computational Linguistics, vol.19, issue.1, pp.75-102, 1993.

F. Galton, Regression towards mediocrity in hereditary stature, Journal of the Anthropological Institute of Great Britain and Ireland, vol.15, p.116, 1886.
DOI : 10.2307/2841583

S. Gella, R. Sennrich, F. Keller, and M. Lapata, Image Pivoting for Learning Multilingual Multimodal Representations, Natural Language Processing, pp.2829-2835, 2017.
DOI : 10.18653/v1/d17-1303

URL : https://doi.org/10.18653/v1/d17-1303

U. Germann, Aligned Hansards of the 36th Parliament of Canada. Release 2001-1a, 2001.

S. Ghannay, B. Favre, Y. Estève, and N. Camelin, Word Embedding Evaluation and Combination, Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16). European Language Resources Association (ELRA), pp.300-305, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01433185

E. Gibney, I'm No Plagiarist, I Moved a Comma, The Times Higher Education Supplement: THE No. 2104, 2006.

G. H. Golub and C. Reinsch, Singular value decomposition and least squares solutions, Numerische Mathematik, vol.14, issue.5, pp.403-420, 1970.
DOI : 10.1007/978-3-662-39778-7_10

S. Gouws, Y. Bengio, and G. Corrado, BilBOWA: Fast Bilingual Distributed Representations without Word Alignments, Proceedings of the 32nd International Conference on Machine Learning (ICML'15), pp.748-756, 2015.

S. Gouws and A. Søgaard, Simple task-specific bilingual word embeddings. In Human Language Technologies: The 2015 Annual Conference of the North American Chapter of the ACL (HLT-NAACL). Association for Computational Linguistics, pp.1386-1390, 2015.
DOI : 10.3115/v1/n15-1157

URL : https://doi.org/10.3115/v1/n15-1157

S. Green, M. De-marneffe, J. Bauer, and C. D. Manning, Multiword Expression Identification with Tree Substitution Grammars: A Parsing tour de force with French, Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pp.725-735, 2011.
URL : https://hal.archives-ouvertes.fr/hal-01111383

J. Grman and R. Ravas, Improved implementation for finding text similarities in large collections of data-Notebook for PAN at CLEF, Notebook Papers for PAN at CLEF 2011 LABs and Workshops. Amsterdam, Netherlands. September, p.32, 2011.

J. Grover and P. Mitra, Bilingual Word Embeddings with Bucketed CNN for Parallel Sentence Extraction, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics-Student Research Workshop, pp.11-16, 2017.
DOI : 10.18653/v1/p17-3003

URL : https://doi.org/10.18653/v1/p17-3003

P. Guibert and C. Michaut, Le plagiat étudiant. Education et sociétés 28:214, 2011.
DOI : 10.3917/es.028.0149

P. Gupta, A. Barrón-cedeño, and P. Rosso, Cross-language High Similarity Search using a Conceptual Thesaurus, Information Access Evaluation. Multilinguality, Multimodality, and Visual Analytics, pp.67-75, 2012.
DOI : 10.1007/978-3-642-33247-0_8

URL : https://riunet.upv.es/bitstream/10251/36280/3/Cross-language%20High%20Similarity%20Search%20using%20a%20Conceptual%20Thesaurus.pdf

M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann et al., The WEKA Data Mining Software: An Update, SIGKDD Explorations, vol.11, issue.1, p.115, 2009.
DOI : 10.1145/1656274.1656278

A. Hardie, Part-of-speech ratios in English corpora, International Journal of Corpus Linguistics, vol.12, issue.1, pp.55-81, 2007.
DOI : 10.1075/ijcl.12.1.05har

F. Hill, K. Cho, and A. Korhonen, Learning Distributed Representations of Sentences from Unlabelled Data, Proceedings of the 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2016), pp.1367-1377, 2016.
DOI : 10.18653/v1/n16-1162

URL : https://doi.org/10.18653/v1/n16-1162

S. Hochreiter and J. Schmidhuber, Long Short-Term Memory, Neural Computation, vol.9, issue.8, pp.1735-1780, 1997.

G. Holmes, M. Hall, and E. Frank, Generating Rule Sets from Model Trees, Proceedings of the 12th Australian Joint Conference on Artificial Intelligence. Sydney, Australia, pp.1-12, 1999.
DOI : 10.1007/3-540-46695-9_1

URL : http://www.cs.waikato.ac.nz/~ml/publications/1999/ajc.pdf

M. Iyyer, V. Manjunatha, J. Boyd-graber, and H. Daumé III, Deep Unordered Composition Rivals Syntactic Methods for Text Classification, Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, vol.1, July, pp.1681-1691, 2015.
DOI : 10.3115/v1/p15-1162

URL : https://doi.org/10.3115/v1/p15-1162

P. Jaccard, The Distribution of the Flora in the Alpine Zone, New Phytologist, vol.11, issue.2, pp.37-50, 1912.

A. Goswami, Vector Space Model and Overlap Metric for Author Identification, Notebook Papers for PAN at CLEF 2013 LABs and Workshops, 2013.

Y. Ji and J. Eisenstein, Discriminative Improvements to Distributional Sentence Similarity, Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp.891-896, 2013.

M. Sheridan, Back translation: an emerging sophisticated cyber strategy to subvert advances in 'digital age' plagiarism detection and prevention, Assessment & Evaluation in Higher Education, vol.40, issue.5, pp.1-7, 2015.

J. Institute, WHAT WOULD HONEST ABE LINCOLN SAY? In Installment 2: Honesty and Integrity-The Ethics of American Youth: 2010, study by Josephson Institute of Ethics' Report Card on American Youth's Values and Actions, p.26, 2011.

C. T. Kelley, Iterative Methods for Linear and Nonlinear Equations, Frontiers in Applied Mathematics. Society for Industrial and Applied Mathematics, 1995.
DOI : 10.1137/1.9781611970944

C. T. Kelley, Iterative Methods for Optimization, Frontiers in Applied Mathematics. Society for Industrial and Applied Mathematics, 1999.

K. Chow-kok and N. Salim, Web Based Cross Language Plagiarism Detection, Journal of Computing, vol.1, issue.1, p.52, 2009.

K. Chow-kok and N. Salim, Features Based Text Similarity Detection, Journal of Computing, vol.2, issue.1, pp.53-57, 2010.

K. Chow-kok and N. Salim, Web Based Cross Language Plagiarism Detection, Second International Conference on Computational Intelligence, Modelling and Simulation (CIMSiM). IEEE, pp.199-204, 2010.

M. Kestemont, K. Luyckx, and W. Daelemans, Intrinsic plagiarism detection using character trigram distance scores-Notebook for PAN at CLEF, Notebook Papers for PAN at CLEF 2011 LABs and Workshops. Amsterdam, Netherlands, p.35, 2011.

K. Khoshnavataher, V. Zarrabi, S. Mohtaj, and H. Asghari, Developing Monolingual Persian Corpus for Extrinsic Plagiarism Detection Using Artificial Obfuscation, Working Notes Papers of the CLEF 2015 Evaluation Labs, 2015.

R. Kiros, Y. Zhu, R. Salakhutdinov, R. S. Zemel, A. Torralba, R. Urtasun, and S. Fidler, Skip-Thought Vectors. In Proceedings of the 28th International Conference on Neural Information Processing Systems, pp.3294-3302, 2015.

D. Klein and C. D. Manning, Fast Exact Inference with a Factored Model for Natural Language Parsing, Proceedings of the 15th Annual Conference on Advances in Neural Information Processing Systems, vol.15, pp.3-10, 2002.

D. Klein and C. D. Manning. In Proceedings of the 41st Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, pp.423-430, 2003.

G. Klyne and J. J. Carroll, Resource Description Framework (RDF): Concepts and Abstract Syntax, 2004.

P. Koehn, Europarl: A Parallel Corpus for Statistical Machine Translation, Conference Proceedings: the tenth Machine Translation Summit. Phuket, Thailand, vol.5, p.121, 2005.

R. Kohavi, The Power of Decision Tables, Proceedings of the 8th European Conference on Machine Learning. Heraklion, Greece, pp.174-189, 1995.

T. Kudo and Y. Matsumoto, Chunking with Support Vector Machines, Proceedings of the Second Meeting of the North American Chapter of the Association for Computational Linguistics on Language Technologies (NAACL'01). Association for Computational Linguistics, pp.1-8, 2001.

T. Kudo and Y. Matsumoto, Fast Methods for Kernel-based Text Analysis, Proceedings of the 41st Annual Meeting on Association for Computational Linguistics, vol.1, pp.24-31, 2003.

M. Kuznetsov, A. Motrenko, R. Kuznetsova, and V. Strijov, Methods for intrinsic plagiarism detection and author diarization-Notebook for PAN at CLEF. In Notebook Papers for PAN at CLEF 2016 LABs and Workshops, 2016.

R. Layton, P. A. Watters, and R. Dazeley, Local n-grams for Author Identification-Notebook for PAN at CLEF, Notebook Papers for PAN at CLEF 2013 LABs and Workshops. Valencia, 2013.

Q. V. Le and T. Mikolov, Distributed Representations of Sentences and Documents, Proceedings of the 31st International Conference on Machine Learning (ICML'14). JMLR Proceedings, Beijing, China, vol.32, p.103, 2014.

M. Lesk, Automatic Sense Disambiguation Using Machine Readable Dictionaries: How to Tell a Pine Cone from an Ice Cream Cone, Proceedings of the 5th Annual International Conference on Systems Documentation (SIGDOC'86), pp.24-26, 1986.

V. I. Levenshtein, Binary codes capable of correcting deletions, insertions, and reversals, Soviet Physics Doklady, vol.10, issue.8, pp.86-101, 1966.

G. Levow, D. W. Oard, and P. Resnik, Dictionary-based Techniques for Cross-language Information Retrieval, Information Processing and Management, vol.41, pp.523-547, 2005.

P. Liang, B. Taskar, and D. Klein, Alignment by Agreement, Proceedings of HLT-NAACL, pp.104-111, 2006.
DOI : 10.3115/1220835.1220849

URL : http://dl.acm.org/ft_gateway.cfm?id=1220849&type=pdf

Z. Lin, M. Feng, C. Nogueira dos Santos, M. Yu, B. Xiang et al., A Structured Self-Attentive Sentence Embedding, Proceedings of the 5th International Conference on Learning Representations, 2017.

A. Linard, B. Daille, and E. Morin, Attempting to Bypass Alignment from Comparable Corpora via Pivot Language, Proceedings of the 8th workshop on Building and Using Comparable Corpora (BUCC), pp.32-37, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01188570

C. Lioma and R. Blanco, Part of Speech Based Term Weighting for Information Retrieval, Proceedings of the 31st European Conference on IR Research on Advances in Information Retrieval (ECIR 2009), pp.412-423, 2017.
DOI : 10.1007/978-3-642-00958-7_37

URL : http://arxiv.org/pdf/1704.01617

M. L. Littman, S. T. Dumais, and T. K. Landauer, Automatic Cross-language Information Retrieval Using Latent Semantic Indexing, pp.51-62, 1998.
DOI : 10.1007/978-1-4615-5661-9_5

Y. Liu and M. Lapata, , 2017.

M. Luong, H. Pham, and C. D. Manning, Bilingual Word Representations with Monolingual Quality in Mind, Proceedings of the 1st NAACL Workshop on Vector Space Modeling for Natural Language Processing. Denver, Colorado, United States, vol.56, p.103, 2015.

C. Lyon, J. Malcolm, and B. Dickerson, Detecting short passages of similar text in large document collections, Proceedings of the 2001 Conference on Empirical Methods in Natural Language Processing, pp.118-125, 2001.

X. Ma, Champollion: A Robust Parallel Text Sentence Aligner, Proceedings of the fifth International Conference on Language Resources and Evaluation (LREC'06). Genoa, Italy, May 2006.

J. Mallinson, R. Sennrich, and M. Lapata, Paraphrasing Revisited with Neural Machine Translation, Association for Computational Linguistics, editor, Proceedings of the 15th Conference of the European Chapter, vol.1, pp.880-892, 2017.
DOI : 10.18653/v1/e17-1083

URL : https://doi.org/10.18653/v1/e17-1083

M. Mancini, J. Camacho-collados, I. Iacobacci, and R. Navigli, Embedding Words and Senses Together via Joint Knowledge-Enhanced Training, Proceedings of the 21st Conference on Computational Natural Language Learning, pp.100-111, 2017.
DOI : 10.18653/v1/k17-1012

URL : https://doi.org/10.18653/v1/k17-1012

C. D. Manning, P. Raghavan, and H. Schütze, United States, chapter 6, "Scoring, term weighting, and the vector space model", ISBN 9780511809071, 2008.
DOI : 10.1017/cbo9780511809071.007

C. D. Manning and H. Schütze, Foundations of Statistical Natural Language Processing, 1999.

I. Masic, Plagiarism in Scientific Publishing, Acta Informatica Medica, pp.208-213, 2012.
DOI : 10.5455/aim.2012.20.208-213

URL : http://europepmc.org/articles/pmc3558294?pdf=render

D. Mccabe, Students' cheating takes a high-tech turn, vol.2, pp.26-29, 2010.

J. P. Mccrae, P. Cimiano, and R. Klinger, Orthonormal Explicit Topic Analysis for Cross-lingual Document Matching, Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp.1732-1740, 2013.

J. P. Mccrae, D. Spohr, and P. Cimiano, Linking Lexical Resources and Ontologies on the Semantic Web with Lemon, chapter of The Semantic Web: Research and Applications: 8th Extended Semantic Web Conference, Information Systems and Applications, incl. Internet/Web, and HCI, vol.6643, pp.245-259, 2011.

P. Mcnamee and J. Mayfield, Character N-Gram Tokenization for European Language Text Retrieval, Information Retrieval Proceedings, vol.7, issue.1-2, pp.73-97, 2004.

I. Mel'čuk, Dependency Syntax: Theory and Practice, p.99, 1988.

T. C. Mendenhall, The Characteristic Curves of Composition, Science, vol.9, pp.237-246, 1887.

C. E. Metz, Basic principles of ROC analysis, Seminars in Nuclear Medicine, vol.8, issue.4, pp.283-298, 1978.

T. Mikolov, K. Chen, G. Corrado, and J. Dean, Efficient Estimation of Word Representations in Vector Space, The Workshop Proceedings of the International Conference on Learning Representations, vol.7, p.103, 2013.

T. Mikolov, M. Karafiát, L. Burget, J. Cernocký, and S. Khudanpur, Recurrent neural network based language model, Conference of the International Speech Communication Association, INTERSPEECH 2010, vol.1, pp.1045-1048, 2010.

T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, Distributed Representations of Words and Phrases and their Compositionality. In Proceedings of the 27th Annual Conference on Neural Information Processing Systems (NIPS'13). Lake Tahoe, United States, vol.54, p.103, 2013.

T. Mikolov, W. Yih, and G. Zweig, Linguistic Regularities in Continuous Space Word Representations, Proceedings of NAACL-HLT 2013, pp.746-751, 2013.

G. A. Miller, Wordnet: A Dictionary Browser, Proceedings of the First Conference of the UW Centre for the New Oxford Dictionary. Information in Data, p.40, 1985.

G. A. Miller, C. Fellbaum, J. Kegl, and K. J. Miller, WordNet: an electronic lexical reference system based on theories of lexical memory, vol.17, pp.181-213, 1988.

S. Mohtaj, H. Asghari, and V. Zarrabi, Developing Monolingual English Corpus for Plagiarism Detection using Human Annotated Paraphrase Corpus, Working Notes Papers of the CLEF 2015 Evaluation Labs, 2015.

M. Montes-y-gómez, A. Gelbukh, A. Lopez-lopez, and R. Baeza-yates, Flexible Comparison of Conceptual Graphs, Lecture Notes in Computer Science, pp.102-111, 2001.

E. Morin, A. Hazem, E. Loginova-clouet, and F. Boudin, LINA: Identifying Comparable Documents from Wikipedia, Proceedings of the 8th workshop on Building and Using Comparable Corpora (BUCC). Association for Computational Linguistics, Beijing, China, pp.88-91, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01185670

N. Mrkšić, I. Vulić, D. Ó Séaghdha, I. Leviant, R. Reichart et al., Semantic Specialisation of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints, Transactions of the Association for Computational Linguistics, 2017.

M. Muhr, R. Kern, M. Zechner, and M. Granitzer, External and Intrinsic Plagiarism Detection Using a Cross-Lingual Retrieval and Segmentation System-Lab Report for PAN at CLEF 2010. In Martin Braschler, Donna Harman, and Emanuele Pianta, editors, Notebook Papers for PAN at CLEF 2010 LABs and Workshops, Padua, Italy. September, vol.1176, pp.52-87, 2010.

E. M. B. Nagoudi, J. Ferrero, and D. Schwab, Amélioration de la similarité sémantique vectorielle par méthodes non-supervisées, ème conférence sur le Traitement Automatique des Langues Naturelles, vol.2, pp.110-117, 2017.

E. M. B. Nagoudi, J. Ferrero, and D. Schwab, LIM-LIG at SemEval-2017 Task1: Enhancing the Semantic Similarity for Arabic Sentences with Vectors Weighting, August, vol.111, p.123, 2017.

R. Navigli, Babelplagiarism: What can BabelNet do for Cross-language Plagiarism Detection, Online Working Notes. Rome, Italy. September 2012.

R. Navigli and M. Lapata, An experimental study of graph connectivity for unsupervised word sense disambiguation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.32, pp.678-692, 2010.

R. Navigli and S. P. Ponzetto, BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network, Artificial Intelligence Proceedings, vol.193, p.59, 2012.

J. Nivre, J. Hall, and J. Nilsson, MaltParser: A Data-Driven Parser-Generator for Dependency Parsing, Proceedings of the fifth International Conference on Language Resources and Evaluation (LREC'06). European Language Resources Association (ELRA), Genoa, Italy, pp.2216-2219, 2006.

G. Oberreuter and J. D. Velásquez, Text Mining Applied to Plagiarism Detection: The Use of Words for Detecting Deviations in the Writing Style, Expert Systems with Applications, vol.40, issue.9, pp.3756-3763, 2013.

M. Pagliardini, P. Gupta, and M. Jaggi, Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features, 2017.

M. Pataki, A New Approach for Searching Translated Plagiarism, Proceedings of the 5th International Plagiarism Conference, vol.4, p.101, 2012.

K. Pearson, Notes on regression and inheritance in the case of two parents, Proceedings of the Royal Society of London, vol.58, p.116, 1895.

J. Pennington, R. Socher, and C. D. Manning, GloVe: Global Vectors for Word Representation, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, pp.1532-1543, 2014.

R. Pereira, Cross-Language Plagiarism Detection, p.67, 2010.

R. C. Pereira, V. P. Moreira, and R. Galante, A New Approach for Cross-language Plagiarism Analysis. In Proceedings of the 2010 International Conference on Multilingual and Multimodal Information Access Evaluation: Cross-language Evaluation Forum. Padua, Italy, CLEF'10, pp.15-26, 2010.

R. C. Pereira, V. P. Moreira, and R. Galante, UFRGS@PAN2010: Detecting External Plagiarism-Lab Report for PAN at CLEF 2010. In Braschler and Harman, editors, Notebook Papers for PAN at CLEF 2010 LABs and Workshops. Padua, Italy, 2010.

S. Petrov, L. Barrett, R. Thibaux, and D. Klein, Learning Accurate, Compact, and Interpretable Tree Annotation, Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, pp.433-440, 2006.

S. Petrov, D. Das, and R. McDonald, A Universal Part-of-Speech Tagset, Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12). European Language Resources Association (ELRA), vol.99, p.163, 2012.

S. Petrov and D. Klein, Improved Inference for Unlexicalized Parsing, The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference. Association for Computational Linguistics, pp.404-411, 2007.

H. Pham, M. Luong, and C. D. Manning, Learning Distributed Representations for Multilingual Text Sequences, Proceedings of the 1st NAACL Workshop on Vector Space Modeling for Natural Language Processing, 2015.

D. Pinto, J. Civera, A. Juan, P. Rosso et al., A Statistical Approach to Crosslingual Natural Language Tasks, Journal of Algorithms, vol.64, pp.51-60, 2009.

D. Pinto, A. Juan, and P. Rosso, Using Query-Relevant Documents Pairs for Cross-Lingual Information Retrieval, Lecture Notes in Computer Science, vol.4629, pp.630-637, 2007.

M. Potthast, A. Barrón-Cedeño, A. Eiselt, B. Stein, and P. Rosso, Overview of the 2nd International Competition on Plagiarism Detection, Notebook Papers for PAN at CLEF 2010 Labs and Workshops. Volume 1176 of CEUR Workshop Proceedings, p.64, 2010.

M. Potthast, A. Barrón-Cedeño, B. Stein, and P. Rosso, Cross-Language Plagiarism Detection, Language Resources and Evaluation, vol.45, issue.1, pp.45-62, 2011.

M. Potthast, A. Eiselt, A. Barrón-Cedeño, B. Stein, and P. Rosso, Overview of the 3rd International Competition on Plagiarism Detection, Petras and P. Forner, editors, Notebook Papers for PAN at CLEF 2011 Labs and Workshops. Volume 1177 of CEUR Workshop Proceedings, vol.64, p.83, 2011.

M. Potthast, M. Hagen, A. Beyer, M. Busse, M. Tippmann et al., Overview of the 6th International Competition on Plagiarism Detection, Notebook Papers for PAN at CLEF 2014 Labs and Workshops. Sheffield, England, pp.845-876, 2014.

M. Potthast, B. Stein, and M. Anderka, A Wikipedia-Based Multilingual Retrieval Model, 30th European Conference on IR Research (ECIR'08), vol.4956, pp.522-530, 2008.

M. Potthast, B. Stein, A. Barrón-Cedeño, and P. Rosso, An Evaluation Framework for Plagiarism Detection, Huang and Jurafsky, editors, Proceedings of the 23rd International Conference on Computational Linguistics (COLING), vol.64, p.122, 2010.

M. Potthast, B. Stein, A. Eiselt, A. Barrón-Cedeño, and P. Rosso, Overview of the 1st International Competition on Plagiarism Detection, 3rd PAN Workshop. Uncovering Plagiarism, Authorship and Social Software Misuse (PAN'09), vol.502, p.67, 2009.

B. Pouliquen, R. Steinberger, and C. Ignat, 'Ontologies and Information Extraction' at the Summer School 'The Semantic Web and Language Technology-Its Potential and Practicalities', July, p.42, 2003.

B. Pouliquen, R. Steinberger, and C. Ignat, Automatic Identification of Document Translations in Large Multilingual Document Collections, Proceedings of the International Conference Recent Advances in Natural Language Processing (RANLP'03). Borovets, Bulgaria, vol.6, p.85, 2003.

P. Prettenhofer and B. Stein, Cross-language Text Classification Using Structural Correspondence Learning, Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. Uppsala, Sweden, ACL'10, pp.1118-1127, 2010.
DOI : 10.1145/2036264.2036277

URL : http://arxiv.org/pdf/1008.0716.pdf

J. R. Quinlan, Learning with continuous classes, Proceedings of the Fifth Australian Joint Conference on Artificial Intelligence, vol.3, p.116, 1992.

J. R. Quinlan, Induction of decision trees, Machine Learning, p.103, 1986.

J. R. Quinlan, C4.5: Programs for Machine Learning. The Morgan Kaufmann Series in Machine Learning, 1993.

A. M. Rogerson and G. McCarthy, Using Internet based paraphrasing tools: Original work, patchwriting or facilitated plagiarism?, International Journal for Educational Integrity, vol.13, issue.2, 2017.
DOI : 10.1007/s40979-016-0013-y

URL : https://edintegrity.biomedcentral.com/track/pdf/10.1007/s40979-016-0013-y

S. Ruder, A survey of cross-lingual embedding models, p.103, 2017.

M. Saad, D. Langlois, and K. Smaili, Cross-Lingual Semantic Similarity Measure for Comparable Articles, Advances in Natural Language Processing-9th International Conference on NLP, pp.105-115, 2014.
DOI : 10.1007/978-3-319-10888-9_11

URL : https://hal.archives-ouvertes.fr/hal-01067687

G. Salton, Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer, United States, vol.42, p.100, 1989.

G. Salton and C. Buckley, Term-weighting Approaches in Automatic Text Retrieval, Information Processing and Management, vol.24, issue.5, pp.513-523, 1988.
DOI : 10.1016/0306-4573(88)90021-0

URL : https://ecommons.cornell.edu/bitstream/1813/6721/1/87-881.pdf

M. A. Sanchez-Perez, G. Sidorov, and A. Gelbukh, The Winning Approach to Text Alignment for Text Reuse Detection at PAN 2014-Notebook for PAN at CLEF 2014, Cappellato, editor, Notebook Papers for PAN at CLEF 2014 Labs and Workshops. Sheffield, England, September, 2014.

Y. Sari and M. Stevenson, Exploring Word Embeddings and Character N-Grams for Author Clustering-Notebook for PAN at CLEF, Notebook Papers for PAN at CLEF 2016 Labs and Workshops, 2016.

Y. Sari, A. Vlachos, and M. Stevenson, Continuous N-gram Representations for Authorship Attribution, Association for Computational Linguistics, editor, Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, vol.2, pp.267-273, 2017.
DOI : 10.18653/v1/e17-2043

URL : https://doi.org/10.18653/v1/e17-2043

H. Schmid, Probabilistic Part-of-Speech Tagging Using Decision Trees, Proceedings of the International Conference on New Methods in Language Processing. Manchester, England, vol.4, p.163, 1994.

D. Schwab, Approche hybride-lexicale et thématique-pour la modélisation, la détection et l'exploitation des fonctions lexicales en vue de l'analyse sémantique de texte, 2005.

H. Schwenk and M. Douze, Learning Joint Multilingual Sentence Representations with Neural Machine Translation, Proceedings of the 2nd Workshop on Representation Learning for NLP, pp.157-167, 2017.
DOI : 10.18653/v1/w17-2619

URL : https://doi.org/10.18653/v1/w17-2619

R. Sennrich, O. Firat, K. Cho, A. Birch, B. Haddow et al., Nematus: a Toolkit for Neural Machine Translation, Proceedings of the Software Demonstrations of the 15th Conference of the European Chapter of the Association for Computational Linguistics, pp.65-68, 2017.

G. Sérasset, DBnary: Wiktionary as a Lemon-Based Multilingual Lexical Resource in RDF, Semantic Web Journal (special issue on Multilingual Linked Open Data), vol.6, issue.4, p.101, 2015.

C. Servan, Z. Elloumi, H. Blanchon, and L. Besacier, Word2Vec vs DBnary ou comment (ré)concilier représentations distribuées et réseaux lexico-sémantiques ? Le cas de l'évaluation en traduction automatique, Actes de la conférence conjointe JEP-TALN-RECITAL 2016, vol.2, 2016.

S. K. Shevade, S. S. Keerthi, C. Bhattacharyya, and K. R. Murthy, Improvements to the SMO Algorithm for SVM Regression, IEEE Transactions on Neural Networks, vol.11, issue.5, pp.1188-1193, 2000.

T. Shi, Z. Liu, Y. Liu, and M. Sun, Learning cross-lingual word embeddings via matrix co-factorization, Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, pp.567-572, 2015.

P. Shrestha, S. Sierra, F. A. González, P. Rosso, M. Montes-y-Gómez et al., Convolutional Neural Networks for Authorship Attribution of Short Texts, Association for Computational Linguistics, editor, Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, vol.2, pp.669-674, 2017.

P. Shrestha and T. Solorio, Using a Variety of n-Grams for the Detection of Different Kinds of Plagiarism-Notebook for PAN at CLEF, Notebook Papers for PAN at CLEF 2013 Labs and Workshops, 2013.

M. Simard, G. F. Foster, and P. Isabelle, Using Cognates to Align Sentences in Bilingual Corpora, Proceedings of the 1993 conference of the Centre for Advanced Studies on Collaborative research: distributed computing (CASCON'93), vol.2, p.39, 1993.

A. J. Smola and B. Schölkopf, A tutorial on support vector regression, Statistics and Computing, vol.14, pp.199-222, 2004.

R. Socher, J. Bauer, C. D. Manning, and A. Y. Ng, Parsing with Compositional Vector Grammars, Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, pp.455-465, 2013.

R. Socher, E. H. Huang, J. Pennington, A. Y. Ng, and C. D. Manning, Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection, Proceedings of the 24th International Conference on Neural Information Processing Systems (NIPS'11). Granada, Spain, pp.801-809, 2011.

R. Socher, A. Perelygin, J. Y. Wu, J. Chuang, C. D. Manning et al., Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp.1631-1642, 2013.

S. Stajner, M. Franco-Salvador, S. P. Ponzetto, P. Rosso et al., Sentence Alignment Methods for Improving Text Simplification Systems, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, pp.97-102, 2017.

E. Stamatatos, Intrinsic Plagiarism Detection Using Character n-gram Profiles, Stein, Rosso, and Stamatatos, editors, Workshop on Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN'09), p.34, 2009.

E. Stamatatos, Authorship Attribution Using Text Distortion, Association for Computational Linguistics, editor, Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, vol.1, pp.1137-1148, 2017.

B. Stein and S. Meyer zu Eissen, Near Similarity Search and Plagiarism Analysis, From Data and Information Analysis to Knowledge Engineering, pp.430-437, 2006.

B. Stein and S. Meyer zu Eissen, Fingerprint-based Similarity Search and its Applications, Forschung und wissenschaftliches Rechnen, p.33, 2007.

B. Stein and S. Meyer zu Eissen, Intrinsic Plagiarism Analysis with Meta Learning, Moshe Koppel and Efstathios Stamatatos, editors, PAN. Volume 276 of CEUR Workshop Proceedings, p.34, 2007.

B. Stein, N. Lipka, and P. Prettenhofer, Intrinsic Plagiarism Analysis, Language Resources and Evaluation, vol.45, issue.1, pp.63-82, 2011.
DOI : 10.1007/s10579-010-9115-y

R. Steinberger, Cross-lingual Keyword Assignment, Conference of the Spanish Society for Natural Language Processing (SEPLN'2001). Jaén, Spain, number 27 in Procesamiento del Lenguaje Natural, p.42, 2001.

R. Steinberger, B. Pouliquen, and J. Hagman, Cross-lingual Document Similarity Calculation Using the Multilingual Thesaurus EUROVOC, LNCS, vol.2276, pp.415-424, 2002.
DOI : 10.1007/3-540-45715-1_44

URL : http://hosting.jrc.cec.eu.int/langtech/Documents/CICLing-02_Steinberger.pdf

R. Steinberger, B. Pouliquen, and C. Ignat, Exploiting Multilingual Nomenclatures and Language-Independent Text Features as an Interlingua for Cross-lingual Text Analysis Applications, Proceedings of the 4th Slovenian Language Technology Conference. Information Society, p.42, 2004.

R. Steinberger, B. Pouliquen, A. Widiger, C. Ignat, and T. Erjavec, The JRC-Acquis: A multilingual aligned parallel corpus with 20+ languages, Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC'2006). Genoa, Italy, vol.61, p.121, 2006.

G. W. Stewart, On the Early History of the Singular Value Decomposition, Society for Industrial and Applied Mathematics (SIAM Review), vol.35, issue.4, pp.551-566, 1993.
DOI : 10.1137/1035134

URL : http://www.cs.umd.edu/Library/TRs/CS-TR-2855/CS-TR-2855.ps.Z

K. Sugathadasa, B. Ayesha, N. de Silva, A. S. Perera, and V. Jayawardana, Synergistic Union of Word2Vec and Lexicon for Domain Specific Semantic Similarity, Proceedings of the Seventh International Conference on Innovative Computing Technology, p.98, 2017.
DOI : 10.1109/iciinfs.2017.8300343

URL : http://arxiv.org/pdf/1706.01967

M. A. Sultan, S. Bethard, and T. Sumner, DLS@CU: Sentence similarity from word alignment and semantic vector composition, Proceedings of the 9th International Workshop on Semantic Evaluation, vol.59, p.112, 2015.

J. Tian, Z. Zhou, M. Lan, and Y. Wu, ECNU at SemEval-2017 Task 1: Leverage Kernel-based Traditional NLP features and Neural Networks to Build a Universal Model for Multilingual and Cross-lingual Semantic Textual Similarity, August, pp.191-197, 2017.
DOI : 10.18653/v1/s17-2028

URL : https://doi.org/10.18653/v1/s17-2028

J. Tiedemann, Parallel Data, Tools and Interfaces in OPUS, Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012). Istanbul, Turkey, pp.62-76, 2012.

D. A. Rodríguez-Torrejón and J. M. Martín-Ramos, Crosslingual CoReMo System-Notebook for PAN at CLEF, Notebook Papers for PAN at CLEF 2011 Labs and Workshops. Amsterdam, Netherlands. September, p.43, 2011.

J. Turian, L. Ratinov, and Y. Bengio, Word Representations: A Simple and General Method for Semi-Supervised Learning, Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp.384-394, 2010.

A. Tversky, Features of similarity, Psychological Review, vol.84, issue.4, p.43, 1977.

S. Upadhyay, M. Faruqui, C. Dyer, and D. Roth, Cross-lingual Models of Word Embeddings: An Empirical Comparison, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL'16), pp.1661-1670, 2016.

H. van Halteren, Linguistic Profiling for Author Recognition and Verification, Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics. Association for Computational Linguistics, Stroudsburg, Pennsylvania, United States, ACL'04, 2004.

D. Varga, P. Halácsy, V. Nagy, L. Németh, A. Kornai, and V. Trón, Parallel corpora for Medium Density Languages, Recent Advances in Natural Language Processing (RANLP'05), pp.590-596, 2005.

Z. Češka, M. Toman, and K. Ježek, Multilingual Plagiarism Detection, Artificial Intelligence: Methodology, Systems, and Applications, vol.5253, pp.83-92, 2008.

A. Vinokourov and M. Girolami, A Probabilistic Framework for the Hierarchic Organisation and Classification of Document Collections, Journal of Intelligent Information Systems, vol.18, issue.2/3, pp.153-172, 2002.

A. Vinokourov, J. Shawe-Taylor, and N. Cristianini, Inferring a Semantic Representation of Text via Cross-Language Correlation Analysis, Proceedings of the 15th Annual Conference on Advances in Neural Information Processing Systems 15 (NIPS 2002), vol.58, p.60, 2002.

I. Vulić, Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, vol.2, pp.408-414, 2017.

I. Vulić and A. Korhonen, On the role of seed lexicons in learning bilingual word embeddings, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pp.247-257, 2016.

I. Vulić and M.-F. Moens, Probabilistic Models of Cross-Lingual Semantic Similarity in Context Based on Latent Cross-Lingual Concepts Induced from Comparable Data, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, pp.349-362, 2014.

Y. Wang and I. H. Witten, Induction of model trees for predicting continuous classes, Proceedings of the poster papers of the European Conference on Machine Learning. Prague, Czech Republic, vol.4, p.121, 1997.

J. Wieting and K. Gimpel, Revisiting Recurrent Networks for Paraphrastic Sentence Embeddings, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, pp.2078-2088, 2017.

J. Wieting, J. Mallinson, and K. Gimpel, Learning Paraphrastic Sentence Embeddings from Back-Translated Bitext, Natural Language Processing, pp.274-285, 2017.

A. Williams, N. Nangia, and S. R. Bowman, A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference, 2017.

E. J. Williams, The Comparison of Regression Variables, Journal of the Royal Statistical Society. Series B (Methodological), vol.21, issue.2, pp.396-399, 1959.

V. Wiwanitkit, Plagiarism: word, idea, figure, etc., Croatian Medical Journal, vol.52, issue.5, p.657, 2011.

Y. Yang, J. G. Carbonell, R. D. Brown, and R. E. Frederking, Translingual Information Retrieval: Learning from Bilingual Corpora, Artificial Intelligence, vol.103, issue.1-2, pp.323-345, 1998.

W. Yin and H. Schütze, Discriminative Phrase Embedding for Paraphrase Identification, Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics, pp.1368-1373, 2015.

H. Zamani, H. Nasr, P. Babaie, and S. Abnar, Authorship Identification Using Dynamic Selection of Features from Probabilistic Feature Set. In Methods for intrinsic plagiarism detection and author diarization-Notebook for PAN at CLEF, pp.128-140, 2014.

W. Y. Zou, R. Socher, D. Cer, and C. D. Manning, Bilingual Word Embeddings for Phrase-Based Machine Translation, Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, p.56, 2013.

P. Zweigenbaum, S. Sharoff, and R. Rapp, Towards Preparation of the Second BUCC Shared Task: Detecting Parallel Sentences in Comparable Corpora, Proceedings of the Ninth Workshop on Building and Using Comparable Corpora (BUCC). European Language Resources Association (ELRA), pp.38-43, May, vol.83, p.121, 2016.

P. Zweigenbaum, S. Sharoff, and R. Rapp, Overview of the Second BUCC Shared Task: Spotting Parallel Sentences in Comparable Corpora, Proceedings of the 10th Workshop on Building and Using Comparable Corpora (BUCC), pp.60-67, 2017.

Personal Bibliography

J. Ferrero, F. Agnès, L. Besacier, and D. Schwab, A Multilingual, Multi-style and Multi-granularity Dataset for Cross-language Textual Similarity Detection, Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC'16), pp.4162-4169, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01303135

J. Ferrero, L. Besacier, D. Schwab, and F. Agnès, Using Word Embedding for Cross-Language Plagiarism Detection, Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, vol.2, pp.415-421, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01502146

J. Ferrero, L. Besacier, D. Schwab, and F. Agnès, Deep Investigation of Cross-Language Plagiarism Detection Methods, Proceedings of the 10th Workshop on Building and Using Comparable Corpora, pp.6-15, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01531346

J. Ferrero, L. Besacier, D. Schwab, and F. Agnès, CompiLIG at SemEval-2017 Task 1: Cross-Language Plagiarism Detection Methods for Semantic Textual Similarity, Proceedings of the 11th International Workshop on Semantic Evaluation, pp.109-114, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01531330

E. M. B. Nagoudi, J. Ferrero, and D. Schwab, Amélioration de la similarité sémantique vectorielle par méthodes non-supervisées, 24e conférence sur le Traitement Automatique des Langues Naturelles (TALN 2017), pp.110-117, 2017.

E. M. B. Nagoudi and J. Ferrero, LIM-LIG at SemEval-2017 Task1: Enhancing the Semantic Similarity for Arabic Sentences with Vectors Weighting, Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pp.134-138.

E. M. B. Nagoudi, J. Ferrero, and D. Schwab, Word Embedding-Based Approaches for Measuring Semantic Similarity of Arabic-English Sentences, to appear in Proceedings of the 6th International Conference on Arabic Language Processing. Fès, 2017.