Y. Adi, E. Kermany, Y. Belinkov, O. Lavi, and Y. Goldberg, Fine-grained Analysis of Sentence Embeddings using Auxiliary Prediction Tasks, Proceedings of the 5th International Conference on Learning Representations, 2017.

Ž. Agić and N. Schluter, Baselines and test data for cross-lingual inference, pp.68-71, 2017.

E. Agirre, C. Banea, D. Cer, M. Diab, A. Gonzalez-agirre et al., SemEval-2016 Task 1: Semantic Textual Similarity, Monolingual and Cross-Lingual Evaluation, Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp.497-511, 2016.
DOI : 10.18653/v1/S16-1081

D. W. Aha, D. Kibler, and M. K. Albert, Instance-based learning algorithms, Machine Learning, vol.6, issue.1, pp.37-66, 1991.
DOI : 10.1007/BF00153759

A. Alcázar, Towards Linguistically Searchable Text. In Proceedings of the BIDE 2005, p.58, 2006.

W. Ammar, G. Mulcaire, Y. Tsvetkov, G. Lample, C. Dyer et al., Massively Multilingual Word Embeddings, p.53, 2016.

G. Aragione, La transmission du savoir entre « tradition » et « plagiat » dans l'Antiquité classique et chrétienne. In Études de lettres, pp.117-138, 2010.
DOI : 10.4000/edl.388

H. Asghari, K. Khoshnava, O. Fatemi, and H. Faili, Developing Bilingual Plagiarism Detection Corpus Using Sentence Aligned Parallel Corpus. In Working Notes Papers of the CLEF 2015 Evaluation Labs, Workshop Proceedings, pp.8-11148, 2015.

F. R. Bach and M. I. Jordan, Kernel Independent Component Analysis, Journal of Machine Learning Research, vol.3, pp.1-48, 2002.

R. Baeza-yates and G. Navarro, A Faster Algorithm for Approximate String Matching. In Dan Hirschberg and Gene Myers, editors, Combinatorial Pattern Matching (CPM'96), LNCS, vol.1075, pp.1-23, June 1996.

F. Vanden-berghen and H. Bersini, CONDOR, a new parallel, constrained extension of Powell's UOBYQA algorithm: Experimental results and comparison with the DFO algorithm, Journal of Computational and Applied Mathematics, vol.181, issue.1, pp.157-175, 2005.
DOI : 10.1016/j.cam.2004.11.029

M. W. Berry and P. G. Young, Using latent semantic indexing for multilanguage information retrieval, Computers and the Humanities, vol.29, issue.6, pp.413-429, 1995.

C. Best, E. Van-der-goot, K. Blackler, T. Garcia, and D. Horby, Europe Media Monitor - System Description, p.70, 2005.

W. Blacoe and M. Lapata, A Comparison of Vector-based Representations for Semantic Composition, Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp.546-556, 2012.

J. Blitzer, M. Dredze, and F. Pereira, Biographies, bollywood, boomboxes and blenders: Domain adaptation for sentiment classification. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. Association of Computational Linguistics, June, pp.440-447, 2007.

P. Bojanowski, E. Grave, A. Joulin, and T. Mikolov, Enriching Word Vectors with Subword Information. In Hinrich Schütze, editor, Transactions of the Association for Computational Linguistics, pp.135-146, 2017.

O. Bojar, C. Buck, C. Callison-burch, C. Federmann, B. Haddow et al., Findings of the 2013 Workshop on Statistical Machine Translation. In Proceedings of the Eighth Workshop on Statistical Machine Translation (WMT13), pp.1-44, 2013.

F. Boudin, TALN Archives: a digital archive of French research articles in Natural Language Processing (TALN Archives : une archive numérique francophone des articles de recherche en Traitement Automatique de la Langue) Short Papers) Les Sables d, Proceedings of TALN 2013, pp.507-514, 2013.

S. R. Bowman, G. Angeli, C. Potts, and C. D. Manning, A Large Annotated Corpus for Learning Natural Language Inference, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp.15-1075, 2015.

P. F. Brown, V. J. Della Pietra, S. A. Della Pietra, and R. L. Mercer, The Mathematics of Statistical Machine Translation: Parameter Estimation, Computational Linguistics, vol.19, issue.2, pp.263-311, 1993.

S. V. Bruton, Self-Plagiarism and Textual Recycling: Legitimate Forms of Research Misconduct, Accountability in Research, vol.21, issue.3, pp.176-197, 2014.

T. Brychcin and L. Svoboda, UWB at SemEval-2016 Task 1: Semantic Textual Similarity using Lexical, Syntactic, and Semantic Information, Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp.588-594, 2016.
DOI : 10.18653/v1/S16-1089

P. Businger and G. H. Golub, Linear least squares solutions by householder transformations, Numerische Mathematik, vol.4, issue.3, pp.269-276, 1965.
DOI : 10.1007/BF01436084

C. Callison-burch, P. Koehn, C. Monz, M. Post, R. Soricut et al., Findings of the 2012 Workshop on Statistical Machine Translation. In Proceedings of WMT 2012, pp.12-3102, 2012.

W. B. Cavnar and J. M. Trenkle, N-Gram-Based Text Categorization, Proceedings of 3rd Annual Symposium on Document Analysis and Information Retrieval (SDAIR'94), pp.161-175, 1994.

M. Cettolo, C. Girardi, and M. Federico, WIT3: Web inventory of transcribed and translated talks, Proceedings of the 16th Conference of the European Association for Machine Translation (EAMT), pp.261-268, 2012.

M. Chen, Efficient Vector Representation for Documents through Corruption, Proceedings of the 5th International Conference on Learning Representations, 2017.

P. Cimiano, A. Schultz, S. Sizov, P. Sorg, and S. Staab, Explicit Versus Latent Concept Models for Cross-Language Information Retrieval, Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI-09), pp.1513-1518, 2009.

F. Duclaye, Apprentissage automatique de relations d'équivalence sémantique à partir du Web, Doctoral thesis, p.40, 2003.

S. T. Dumais, T. A. Letsche, M. L. Littman, and T. Landauer, Automatic Cross-language Retrieval Using Latent Semantic Indexing. In AAAI-97 Spring Symposium Series: Cross-Language Text and Speech Retrieval, www.aaai.org/Papers/Symposia/Spring, pp.18-24, 1997.

L. Duong, H. Kanayama, T. Ma, S. Bird, and T. Cohn, Multilingual Training of Crosslingual Word Embeddings. In Proceedings of the 15th Conference of the European Chapter, pp.893-903, 2017.

P. Edmonds and G. Hirst, Near-Synonymy and Lexical Choice, Computational Linguistics, vol.24, issue.1, pp.105-144, 2002.
DOI : 10.1162/089120102760173625

S. Meyer zu Eissen and B. Stein, Intrinsic Plagiarism Detection. In Advances in Information Retrieval, 8th European Conference on IR Research, pp.29-31, 2006.

J. Enright and G. Kondrak, A Fast Method for Parallel Document Identification. In Human Language Technologies 2007: The Conference of the North American Chapter, pp.29-32, 2007.

C. España-bonet, Á. C. Varga, A. Barrón-cedeño, and J. Van-genabith, An Empirical Analysis of NMT-Derived Interlingual Embeddings and Their Use in Parallel Sentence Identification, IEEE Journal of Selected Topics in Signal Processing, vol.11, issue.8, 2017.
DOI : 10.1109/JSTSP.2017.2764273

Eurovoc, Thesaurus Eurovoc Subject-Oriented Version, p.41, 1995.

F. C. Fang, R. G. Steen, and A. Casadevall, Misconduct accounts for the majority of retracted scientific publications, Proceedings of the National Academy of Sciences of the United States of America (PNAS), vol.109, issue.42, pp.17028-17033, 2012.

J. Ferrero, F. Agnès, L. Besacier, and D. Schwab, A Multilingual, Multi-style and Multi-granularity Dataset for Cross-language Textual Similarity Detection. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16). European Language Resources Association (ELRA), pp.4162-4169, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01303135

J. Ferrero, L. Besacier, D. Schwab, and F. Agnès, CompiLIG at SemEval-2017 Task 1: Cross-Language Plagiarism Detection Methods for Semantic Textual Similarity, Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), p.109, 2017.
DOI : 10.18653/v1/S17-2012

URL : https://hal.archives-ouvertes.fr/hal-01531330

J. Ferrero, L. Besacier, D. Schwab, and F. Agnès, Deep Investigation of Cross-Language Plagiarism Detection Methods, Proceedings of the 10th Workshop on Building and Using Comparable Corpora, pp.6-15, 2017.
DOI : 10.18653/v1/W17-2502

URL : https://hal.archives-ouvertes.fr/hal-01531346

J. Ferrero, L. Besacier, D. Schwab, and F. Agnès, Using Word Embedding for Cross-Language Plagiarism Detection. In Proceedings of the 15th Conference of the European Chapter, pp.415-421, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01502146

J. Ferrero and A. Simac-lejeune, Détection et regroupement automatique de style d'écriture dans un texte. In 15ème conférence internationale sur l'extraction et la gestion des connaissances, pp.23-28, 2015.

M. Franco-salvador, I. Bensalem, E. Flores, P. Gupta, and P. Rosso, PAN 2015 Shared Task on Plagiarism Detection: Evaluation of Corpora for Text Alignment. In Notebook Papers for PAN at CLEF 2015 LABs and Workshops, Workshop Proceedings, September 2015.

M. Franco-salvador, P. Gupta, and P. Rosso, Cross-Language Plagiarism Detection Using a Multilingual Semantic Network, LNCS, vol.3, issue.7814, pp.710-713, 2013.
DOI : 10.1007/978-3-642-36973-5_66

M. Franco-salvador, P. Gupta, and P. Rosso, Graph-based similarity analysis: a new approach to cross-language plagiarism detection, Journal of the Spanish Society of Natural Language Processing, pp.50-95, 2013.

M. Franco-salvador, P. Gupta, and P. Rosso, Knowledge Graphs as Context Models: Improving the Detection of Cross-Language Plagiarism with Paraphrasing. In Bridging Between Information Retrieval and Databases. PROMISE Winter School 2013, pp.8173-227, 2013.

M. Franco-salvador, P. Rosso, and R. Navigli, A Knowledge-based Representation for Cross-Language Document Retrieval and Categorization, Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, pp.414-423, 2014.
DOI : 10.3115/v1/E14-1044

M. Franco-salvador, P. Rosso, and M. Montes-y-gómez, A systematic study of knowledge graph analysis for cross-language plagiarism detection. In Information Processing and Management, pp.550-570, 2016.

M. Franco-salvador, P. Gupta, P. Rosso, and R. E. Banchs, Cross-language plagiarism detection over continuous-space- and knowledge graph-based representations of language. Knowledge-Based Systems, pp.87-99, 2016.

G. Francopoulo, J. Mariani, and P. Paroubek, A Study of Reuse and Plagiarism in LREC papers. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16). European Language Resources Association (ELRA), Portorož, Slovenia, pp.1890-1897, 2016.

E. Frank, M. Hall, and B. Pfahringer, Locally Weighted Naive Bayes. In Proceedings of the 19th Conference in Uncertainty in Artificial Intelligence, pp.249-256, 2003.

E. Gabrilovich and S. Markovitch, Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis, Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI'07), pp.1606-1611, 2007.

W. A. Gale and K. W. Church, A Program for Aligning Sentences in Bilingual Corpora, Computational Linguistics, vol.19, issue.1, pp.75-102, 1993.

F. Galton, Regression Towards Mediocrity in Hereditary Stature, The Journal of the Anthropological Institute of Great Britain and Ireland, vol.15, issue.88, pp.246-263, 1886.
DOI : 10.2307/2841583

U. Germann, Aligned Hansards of the 36th Parliament of Canada. Release 2001-1a. https://www.isi, p.58, 2001.

S. Ghannay, B. Favre, Y. Estève, and N. Camelin, Word Embedding Evaluation and Combination. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16). European Language Resources Association (ELRA), Portorož, Slovenia, pp.300-305, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01433185

E. Gibney, I'm No Plagiarist, I Moved a Comma, The Times Higher Education Supplement: THE, No. 2104. http://www.questia.com/magazine/1P3-3034399841/i-m-no-plagiarist-i-moved-a-comma-news#, pp.26-28, 2006.

G. H. Golub and C. Reinsch, Singular value decomposition and least squares solutions, Numerische Mathematik, vol.11, issue.5, pp.403-420, 1970.
DOI : 10.1090/S0002-9947-1960-0109825-2

S. Gouws, Y. Bengio, and G. Corrado, BilBOWA: Fast Bilingual Distributed Representations without Word Alignments, Proceedings of the 32nd International Conference on Machine Learning (ICML'15), pp.748-756, 2015.

S. Gouws and A. Søgaard, Simple task-specific bilingual word embeddings. In Human Language Technologies: The 2015 Annual Conference of the North American Chapter, pp.1386-1390, 2015.
DOI : 10.3115/v1/n15-1157

S. Green, M. De-marneffe, J. Bauer, and C. D. Manning, Multiword Expression Identification with Tree Substitution Grammars: A Parsing tour de force with French, Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pp.725-735, 2011.
URL : https://hal.archives-ouvertes.fr/hal-01111383

J. Grman and R. Ravas, Improved implementation for finding text similarities in large collections of data - Notebook for PAN at CLEF 2011. In Notebook Papers for PAN at CLEF 2011 LABs and Workshops, p.32, 2011.

J. Grover and P. Mitra, Bilingual Word Embeddings with Bucketed CNN for Parallel Sentence Extraction, Proceedings of ACL 2017, Student Research Workshop, pp.11-16, 2017.
DOI : 10.18653/v1/P17-3003

P. Guibert and C. Michaut, Le plagiat étudiant. Education et sociétés 28, pp.214-240, 2011.
DOI : 10.3917/es.028.0149

P. Gupta, A. Barrón-cedeño, and P. Rosso, Cross-Language High Similarity Search Using a Conceptual Thesaurus, Information Access Evaluation. Multilinguality, Multimodality, and Visual Analytics, pp.67-75, 2012.
DOI : 10.1007/978-3-642-33247-0_8

M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann et al., The WEKA data mining software, ACM SIGKDD Explorations Newsletter, vol.11, issue.1, pp.10-18, 2009.
DOI : 10.1145/1656274.1656278

A. Hardie, Part-of-speech ratios in English corpora, International Journal of Corpus Linguistics, vol.12, issue.1, pp.55-81, 2007.
DOI : 10.1075/ijcl.12.1.05har

F. Hill, K. Cho, and A. Korhonen, Learning Distributed Representations of Sentences from Unlabelled Data, Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp.1367-1377, 2016.
DOI : 10.18653/v1/N16-1162

S. Hochreiter and J. Schmidhuber, Long Short-Term Memory, Neural Computation, vol.9, issue.8, pp.1735-1780, 1997.
DOI : 10.1162/neco.1997.9.8.1735

G. Holmes, M. Hall, and E. Frank, Generating Rule Sets from Model Trees. In Proceedings of the 12th Australian Joint Conference on Artificial Intelligence, pp.1-12, 1999.
DOI : 10.1007/3-540-46695-9_1

M. Iyyer, V. Manjunatha, J. Boyd-graber, and H. Daumé III, Deep Unordered Composition Rivals Syntactic Methods for Text Classification, Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2015.
DOI : 10.3115/v1/P15-1162

P. Jaccard, The Distribution of the Flora in the Alpine Zone, New Phytologist, vol.11, issue.2, pp.37-50, 1912.
DOI : 10.1111/j.1469-8137.1912.tb05611.x

A. K. Jayapal and B. Goswami, Vector Space Model and Overlap Metric for Author Identification. In Notebook Papers for PAN at CLEF 2013 LABs and Workshops, 2013.

Y. Ji and J. Eisenstein, Discriminative Improvements to Distributional Sentence Similarity, Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp.891-896, 2013.

M. Jones and L. Sheridan, Back translation: an emerging sophisticated cyber strategy to subvert advances in 'digital age' plagiarism detection and prevention, Assessment & Evaluation in Higher Education, vol.14, issue.1, pp.1-7, 2015.
DOI : 10.5406/radicalteacher.90.0047

Josephson Institute, What Would Honest Abe Lincoln Say? Honesty and Integrity - The Ethics of American Youth: 2010, study by Josephson Institute of Ethics' Report Card on American Youth's Values and Actions, p.26

C. T. Kelley, Iterative Methods for Linear and Nonlinear Equations, Frontiers in Applied Mathematics. Society for Industrial and Applied Mathematics, pp.104-107, 1995.
DOI : 10.1137/1.9781611970944

C. T. Kelley, Iterative Methods for Optimization, Frontiers in Applied Mathematics. Society for Industrial and Applied Mathematics, issue.2, pp.104-107, 1999.
DOI : 10.1137/1.9781611970920

C. K. Kent and N. Salim, Web Based Cross Language Plagiarism Detection, Journal of Computing, vol.1, issue.1, pp.39-43, 2009.

C. K. Kent and N. Salim, Features Based Text Similarity Detection, Journal of Computing, vol.2, issue.1, pp.53-57, 2010.

C. K. Kent and N. Salim, Web Based Cross Language Plagiarism Detection, Second International Conference on Computational Intelligence, Modelling and Simulation (CIMSiM), pp.199-204, 2010.

M. Kestemont, K. Luyckx, and W. Daelemans, Intrinsic plagiarism detection using character trigram distance scores - Notebook for PAN at CLEF 2011. In Notebook Papers for PAN at CLEF 2011 LABs and Workshops, p.35, 2011.

K. Khoshnavataher, V. Zarrabi, S. Mohtaj, and H. Asghari, Developing Monolingual Persian Corpus for Extrinsic Plagiarism Detection Using Artificial Obfuscation. In Working Notes Papers of the CLEF 2015 Evaluation Labs, Workshop Proceedings, September 2015, pp.146-83, 2015.

R. Kiros, Y. Zhu, R. Salakhutdinov, R. S. Zemel, A. Torralba et al., Skip-Thought Vectors, Proceedings of the 28th International Conference on Neural Information Processing Systems (NIPS 2015), pp.3294-3302, 2015.

D. Klein and C. D. Manning, Fast Exact Inference with a Factored Model for Natural Language Parsing, Proceedings of the 15th Annual Conference on Advances in Neural Information Processing Systems, pp.3-10, 2002.

G. Klyne and J. J. Carroll, Resource Description Framework (RDF): Concepts and Abstract Syntax. https://www.w3.org/TR, p.41, 2004.

P. Koehn, Europarl: A Parallel Corpus for Statistical Machine Translation. In Conference Proceedings: the tenth Machine Translation Summit, pp.79-86, 2005.

R. Kohavi, The power of decision tables, Proceedings of the 8th European Conference on Machine Learning, pp.174-189, 1995.
DOI : 10.1007/3-540-59286-5_57

T. Kudo and Y. Matsumoto, Chunking with Support Vector Machines. In Proceedings of the Second Meeting of the North American Chapter of the Association for Computational Linguistics on Language Technologies (NAACL'01). Association for Computational Linguistics, pp.1-8, 2001.

T. Kudo and Y. Matsumoto, Fast methods for kernel-based text analysis, Proceedings of the 41st Annual Meeting on Association for Computational Linguistics, ACL '03, pp.24-31, 2003.
DOI : 10.3115/1075096.1075100

M. Kuznetsov, A. Motrenko, R. Kuznetsova, and V. Strijov, Methods for intrinsic plagiarism detection and author diarization - Notebook for PAN at CLEF 2016. In Notebook Papers for PAN at CLEF 2016 LABs and Workshops, 2016.

R. Layton, P. A. Watters, and R. Dazeley, Local n-grams for Author Identification - Notebook for PAN at CLEF 2013. In Notebook Papers for PAN at CLEF 2013 LABs and Workshops, pp.34-35, 2013.

Q. V. Le and T. Mikolov, Distributed Representations of Sentences and Documents, Proceedings of the 31st International Conference on Machine Learning (ICML'14). JMLR Proceedings, pp.1188-1196, 2014.

M. Lesk, Automatic sense disambiguation using machine readable dictionaries, Proceedings of the 5th annual international conference on Systems documentation, SIGDOC '86, pp.24-26, 1986.
DOI : 10.1145/318723.318728

V. I. Levenshtein, Binary codes capable of correcting deletions, insertions, and reversals, Soviet Physics Doklady, vol.10, issue.8, pp.707-710, 1966.

G. Levow, D. W. Oard, and P. Resnik, Dictionary-based techniques for cross-language information retrieval, Information Processing & Management, vol.41, issue.3, pp.523-547, 2005.
DOI : 10.1016/j.ipm.2004.06.012

P. Liang, B. Taskar, and D. Klein, Alignment by Agreement. In Proceedings of HLT-NAACL, June, pp.104-111, 2006.
DOI : 10.3115/1220835.1220849

Z. Lin, M. Feng, C. Nogueira-dos-santos, M. Yu, B. Xiang et al., A Structured Self-Attentive Sentence Embedding, Proceedings of the 5th International Conference on Learning Representations, 2017.

A. Linard, B. Daille, and E. Morin, Attempting to Bypass Alignment from Comparable Corpora via Pivot Language, Proceedings of the Eighth Workshop on Building and Using Comparable Corpora, pp.32-37, 2015.
DOI : 10.18653/v1/W15-3405

URL : https://hal.archives-ouvertes.fr/hal-01188570

C. Lioma and R. Blanco, Part of Speech Based Term Weighting for Information Retrieval. In Proceedings of the 31st European Conference on IR Research on Advances in Information Retrieval, pp.412-423, 2009.
DOI : 10.1108/eb026526

M. Littman, S. T. Dumais, and T. K. Landauer, Automatic Cross-Language Information Retrieval Using Latent Semantic Indexing, pp.51-62, 1998.
DOI : 10.1007/978-1-4615-5661-9_5

Y. Liu and M. Lapata, Learning Structured Text Representations, 2017.

M. Luong, H. Pham, and C. D. Manning, Bilingual Word Representations with Monolingual Quality in Mind, Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing, pp.151-159, 2015.
DOI : 10.3115/v1/W15-1521

C. Lyon, J. Malcolm, and B. Dickerson, Detecting short passages of similar text in large document collections, Conference on Empirical Methods in Natural Language Processing, pp.118-125, 2001.

X. Ma, Champollion: A Robust Parallel Text Sentence Aligner, Proceedings of the fifth International Conference on Language Resources and Evaluation (LREC'06), 2006.

J. Mallinson, R. Sennrich, and M. Lapata, Paraphrasing Revisited with Neural Machine Translation. In Proceedings of the 15th Conference of the European Chapter, Association for Computational Linguistics, pp.880-892, 2017.

M. Mancini, J. Camacho-collados, I. Iacobacci, and R. Navigli, Embedding Words and Senses Together via Joint Knowledge-Enhanced Training, Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017), pp.100-111, 2017.
DOI : 10.18653/v1/K17-1012

C. D. Manning, P. Raghavan, and H. Schütze, Introduction to Information Retrieval, chapter 6 - "Scoring, term weighting, and the vector space model", United States, pp.109-133, 2008.

C. D. Manning and H. Schütze, Foundations of Statistical Natural Language Processing, pp.32-63, 1999.

I. Masic, Plagiarism in Scientific Publishing, Acta Informatica Medica, pp.208-213, 2012.
DOI : 10.5455/aim.2012.20.208-213

D. Mccabe, Students' cheating takes a high-tech turn, pp.26-29, 2010.

J. P. McCrae, P. Cimiano, and R. Klinger, Orthonormal Explicit Topic Analysis for Cross-lingual Document Matching, Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp.1732-1740, 2013.

J. P. McCrae, D. Spohr, and P. Cimiano, Linking Lexical Resources and Ontologies on the Semantic Web with Lemon. In The Semantic Web: Research and Applications: 8th Extended Semantic Web Conference, Information Systems and Applications, incl. Internet/Web, and HCI, pp.245-259, 2011.

P. Mcnamee and J. Mayfield, Character N-Gram Tokenization for European Language Text Retrieval, Information Retrieval, vol.7, issue.1/2, pp.73-97, 2004.
DOI : 10.1023/B:INRT.0000009441.78971.be

I. Mel'čuk, Dependency Syntax: Theory and Practice, p.99, 1988.

T. C. Mendenhall, The Characteristic Curves of Composition, Science, vol.9, issue.214S, pp.237-246, 1887.
DOI : 10.1126/science.ns-9.214S.237

C. E. Metz, Basic principles of ROC analysis, Seminars in Nuclear Medicine, pp.283-298, 1978.
DOI : 10.1016/S0001-2998(78)80014-2

T. Mikolov, K. Chen, G. Corrado, and J. Dean, Efficient Estimation of Word Representations in Vector Space, Proceedings of the International Conference on Learning Representations, pp.33-42, 2013.

T. Mikolov, M. Karafiát, L. Burget, J. Černocký, and S. Khudanpur, Recurrent neural network based language model. In 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010, September, pp.1045-1048, 2010.

T. Mikolov, W. Yih, and G. Zweig, Linguistic Regularities in Continuous Space Word Representations. In Proceedings of NAACL-HLT 2013, pp.746-751, 2013.

G. A. Miller, Wordnet: A Dictionary Browser, Proceedings of the First Conference of the UW Centre for the New Oxford Dictionary. Information in Data, p.40, 1985.

G. A. Miller, C. Fellbaum, J. Kegl, and K. J. Miller, WordNet: An Electronic Lexical Reference System Based on Theories of Lexical Memory, Revue québécoise de linguistique, vol.17, issue.2, pp.181-213, 1988.
DOI : 10.1016/S0022-5371(78)90110-X

S. Mohtaj, H. Asghari, and V. Zarrabi, Developing Monolingual English Corpus for Plagiarism Detection using Human Annotated Paraphrase Corpus. In Working Notes Papers of the CLEF 2015 Evaluation Labs, Workshop Proceedings, September 2015, pp.144-83, 2015.

M. Montes-y-gómez, A. Gelbukh, A. Lopez-lopez, and R. Baeza-yates, Flexible Comparison of Conceptual Graphs. In Lecture Notes in Computer Science, January, pp.102-111, 2001.

E. Morin, A. Hazem, E. Loginova-clouet, and F. Boudin, LINA: Identifying Comparable Documents from Wikipedia, Proceedings of the Eighth Workshop on Building and Using Comparable Corpora, pp.88-91, 2015.
DOI : 10.18653/v1/W15-3413

URL : https://hal.archives-ouvertes.fr/hal-01185670

N. Mrkšić, I. Vulić, D. Ó Séaghdha, I. Leviant, R. Reichart et al., Semantic Specialisation of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints, Transactions of the Association for Computational Linguistics, 2017.

M. Muhr, R. Kern, M. Zechner, and M. Granitzer, External and Intrinsic Plagiarism Detection Using a Cross-Lingual Retrieval and Segmentation System - Lab Report for PAN at CLEF 2010. In Notebook Papers for PAN at CLEF 2010 LABs and Workshops, pp.52-87, 2010.

E. M. B. Nagoudi, J. Ferrero, and D. Schwab, Amélioration de la similarité sémantique vectorielle par méthodes non-supervisées, pp.110-117, 2017.

R. Navigli, Babelplagiarism: What can BabelNet do for Cross-language Plagiarism Detection. In CLEF 2012, Evaluation Labs and Workshop, Online Working Notes, p.45, 2012.

R. Navigli and M. Lapata, An Experimental Study of Graph Connectivity for Unsupervised Word Sense Disambiguation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.32, issue.4, pp.678-692, 2010.
DOI : 10.1109/TPAMI.2009.36

J. Nivre, J. Hall, and J. Nilsson, MaltParser: A Data-Driven Parser-Generator for Dependency Parsing. In Proceedings of the fifth International Conference on Language Resources and Evaluation (LREC'06). European Language Resources Association (ELRA), pp.2216-2219, 2006.

G. Oberreuter and J. D. Velásquez, Text mining applied to plagiarism detection: The use of words for detecting deviations in the writing style, Expert Systems with Applications, vol.40, issue.9, pp.3756-3763, 2013.
DOI : 10.1016/j.eswa.2012.12.082

M. Pagliardini, P. Gupta, and M. Jaggi, Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features, pp.56-98, 2017.

M. Pataki, A New Approach for Searching Translated Plagiarism, Proceedings of the 5th International Plagiarism Conference, pp.49-64, 2012.

K. Pearson, Note on Regression and Inheritance in the Case of Two Parents, Proceedings of the Royal Society of London (1854-1905), vol.58, issue.-1, pp.240-242, 1895.
DOI : 10.1098/rspl.1895.0041

J. Pennington, R. Socher, and C. D. Manning, Glove: Global Vectors for Word Representation, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp.1532-1543, 2014.
DOI : 10.3115/v1/D14-1162

R. Pereira, Cross-Language Plagiarism Detection, Master's thesis, p.67, 2010.

R. Pereira, V. P. Moreira, and R. Galante, A New Approach for Cross-language Plagiarism Analysis. In Proceedings of the 2010 International Conference on Multilingual and Multimodal Information Access Evaluation: Cross-language Evaluation Forum, pp.15-26, 2010.

R. Pereira, V. P. Moreira, and R. Galante, UFRGS@PAN2010: Detecting External Plagiarism - Lab Report for PAN at CLEF 2010. In Braschler and Harman, editors, Notebook Papers for PAN at CLEF 2010 LABs and Workshops, 2010.

S. Petrov, L. Barrett, R. Thibaux, and D. Klein, Learning accurate, compact, and interpretable tree annotation, Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the ACL, ACL '06, pp.433-440, 2006.
DOI : 10.3115/1220175.1220230

S. Petrov, D. Das, and R. Mcdonald, A Universal Part-of-Speech Tagset, Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12). European Language Resources Association (ELRA), pp.2089-2096, 2012.

S. Petrov and D. Klein, Improved Inference for Unlexicalized Parsing. In Human Language Technologies 2007: The Conference of the North American Chapter, Proceedings of the Main Conference. Association for Computational Linguistics, pp.404-411, 2007.

H. Pham, M. Luong, and C. D. Manning, Learning Distributed Representations for Multilingual Text Sequences, Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing, pp.15-1512, 2015.
DOI : 10.3115/v1/W15-1512

D. Pinto, J. Civera, A. Juan, P. Rosso, and A. Barrón-cedeño, A statistical approach to crosslingual natural language tasks, Journal of Algorithms, vol.64, issue.1, pp.51-60, 2009.
DOI : 10.1016/j.jalgor.2009.02.005

D. Pinto, A. Juan, and P. Rosso, Using Query-Relevant Documents Pairs for Cross-Lingual Information Retrieval, Lecture Notes in Computer Science, vol.4629, pp.630-637, 2007.
DOI : 10.1007/978-3-540-74628-7_81

M. Potthast, A. Barrón-cedeño, A. Eiselt, B. Stein, and P. Rosso, Overview of the 2nd International Competition on Plagiarism Detection. In Notebook Papers for PAN at CLEF 2010 LABs and Workshops, CEUR Workshop Proceedings, pp.45-62, 2010.

M. Potthast, A. Eiselt, A. Barrón-cedeño, B. Stein, and P. Rosso, Overview of the 3rd International Competition on Plagiarism Detection. In Petras and P. Forner, editors, Notebook Papers for PAN at CLEF 2011 LABs and Workshops, volume 1177 of CEUR Workshop Proceedings, pp.58-71, 2011.

M. Potthast, M. Hagen, A. Beyer, M. Busse, M. Tippmann et al., Overview of the 6th International Competition on Plagiarism Detection. In Notebook Papers for PAN at CLEF 2014 LABs and Workshops, pp.845-876, 2014.

M. Potthast, B. Stein, and M. Anderka, A Wikipedia-Based Multilingual Retrieval Model, 30th European Conference on IR Research, pp.522-530, 2008.
DOI : 10.1007/978-3-540-78646-7_51

M. Potthast, B. Stein, A. Barrón-cedeño, and P. Rosso, An Evaluation Framework for Plagiarism Detection. In Huang and Jurafsky, editors, Proceedings of the 23rd International Conference on Computational Linguistics (COLING), pp.997-1005, 2010.

M. Potthast, B. Stein, A. Eiselt, A. Barrón-cedeño, and P. Rosso, Overview of the 1st International Competition on Plagiarism Detection. In 3rd PAN Workshop. Uncovering Plagiarism, Authorship and Social Software Misuse (PAN'09), pp.1-9, 2009.

B. Pouliquen, R. Steinberger, and C. Ignat, Automatic Annotation of Multilingual Text Collections with a Conceptual Thesaurus. In Workshop 'Ontologies and Information Extraction' at the Summer School 'The Semantic Web and Language Technology - Its Potential and Practicalities', July, pp.9-28, 2003.

B. Pouliquen, R. Steinberger, and C. Ignat, Automatic Identification of Document Translations in Large Multilingual Document Collections, Proceedings of the International Conference Recent Advances in Natural Language Processing (RANLP'03). Borovets, Bulgaria, pp.401-408, 2003.

P. Prettenhofer and B. Stein, Cross-language Text Classification Using Structural Correspondence Learning, Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL'10), Uppsala, Sweden, pp.1118-1127, 2010.

J. R. Quinlan, Learning with continuous classes, Proceedings of the Fifth Australian Joint Conference on Artificial Intelligence, pp.343-348, 1992.

J. R. Quinlan, Induction of decision trees, Machine Learning, pp.81-106, 1986.

J. R. Quinlan, C4.5: Programs for Machine Learning, The Morgan Kaufmann Series in Machine Learning, pp.103-121, 1993.

A. M. Rogerson and G. McCarthy, Using Internet based paraphrasing tools: Original work, patchwriting or facilitated plagiarism?, International Journal for Educational Integrity, vol.13, issue.2, 2017.

S. Ruder, A survey of cross-lingual embedding models, p.103, 2017.

M. Saad, D. Langlois, and K. Smaili, Cross-Lingual Semantic Similarity Measure for Comparable Articles, Advances in Natural Language Processing - 9th International Conference on NLP, pp.105-115, 2014.
DOI : 10.1007/978-3-319-10888-9_11

URL : https://hal.archives-ouvertes.fr/hal-01067687

G. Salton, Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer, 1989.

G. Salton and C. Buckley, Term-weighting Approaches in Automatic Text Retrieval, Information Processing and Management, pp.513-523, 1988.

M. A. Sanchez-perez, G. Sidorov, and A. Gelbukh, The Winning Approach to Text Alignment for Text Reuse Detection at PAN 2014 - Notebook for PAN at CLEF 2014. In Cappellato, editor, Notebook Papers for PAN at CLEF 2014 LABs and Workshops, 2014.

Y. Sari and M. Stevenson, Exploring Word Embeddings and Character N-Grams for Author Clustering - Notebook for PAN at CLEF 2016. In Notebook Papers for PAN at CLEF 2016 LABs and Workshops, 2016.

Y. Sari, A. Vlachos, and M. Stevenson, Continuous N-gram Representations for Authorship Attribution. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, pp.267-273, 2017.

H. Schmid, Probabilistic Part-of-Speech Tagging Using Decision Trees, Proceedings of the International Conference on New Methods in Language Processing, pp.44-49, 1994.

D. Schwab, Approche hybride - lexicale et thématique - pour la modélisation, la détection et l'exploitation des fonctions lexicales en vue de l'analyse sémantique de texte. PhD thesis, Université Montpellier II, 2005. Citations refer to section 1.1.4.2 on page 24 and section 2.2.3.1 on page 71.

H. Schwenk and M. Douze, Learning Joint Multilingual Sentence Representations with Neural Machine Translation. In Proceedings of the 2nd Workshop on Representation Learning for NLP, pp.157-167, 2017.

R. Sennrich, O. Firat, K. Cho, A. Birch, B. Haddow et al., Nematus: a Toolkit for Neural Machine Translation, Proceedings of the Software Demonstrations of the 15th Conference of the European Chapter of the Association for Computational Linguistics, pp.65-68, 2017.
DOI : 10.18653/v1/E17-3017

G. Sérasset, DBnary: Wiktionary as a Lemon-based multilingual lexical resource in RDF, Semantic Web, vol.46, issue.4, pp.355-361, 2015.
DOI : 10.1007/s10579-012-9182-3

C. Servan, Z. Elloumi, H. Blanchon, and L. Besacier, Word2Vec vs DBnary ou comment (ré)concilier représentations distribuées et réseaux lexico-sémantiques ? Le cas de l'évaluation en traduction automatique, 2016.

K. Shevade, S. S. Keerthi, C. Bhattacharyya, and K. R. Murthy, Improvements to the SMO algorithm for SVM regression, IEEE Transactions on Neural Networks, vol.11, issue.5, pp.1188-1193, 2000.
DOI : 10.1109/72.870050

T. Shi, Z. Liu, Y. Liu, and M. Sun, Learning Cross-lingual Word Embeddings via Matrix Co-factorization, Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pp.567-572, 2015.
DOI : 10.3115/v1/P15-2093

P. Shrestha, S. Sierra, F. A. González, P. Rosso, M. Montes-y-gómez et al., Convolutional Neural Networks for Authorship Attribution of Short Texts, Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, pp.669-674, 2017.
DOI : 10.18653/v1/E17-2106

P. Shrestha and T. Solorio, Using a Variety of n-Grams for the Detection of Different Kinds of Plagiarism - Notebook for PAN at CLEF 2013. In Notebook Papers for PAN at CLEF 2013 LABs and Workshops, 2013.

M. Simard, G. F. Foster, and P. Isabelle, Using Cognates to Align Sentences in Bilingual Corpora, Proceedings of the 1993 conference of the Centre for Advanced Studies on Collaborative research: distributed computing (CASCON'93), pp.1071-1082, 1993.

A. J. Smola and B. Schölkopf, A tutorial on support vector regression, Statistics and Computing, pp.199-222, 2004.

R. Socher, J. Bauer, C. D. Manning, and A. Y. Ng, Parsing with Compositional Vector Grammars, Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, pp.455-465, 2013.

R. Socher, A. Perelygin, J. Y. Wu, J. Chuang, C. D. Manning et al., Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp.1631-1642, 2013.

S. Stajner, M. Franco-salvador, S. P. Ponzetto, P. Rosso, and H. Stuckenschmidt, Sentence Alignment Methods for Improving Text Simplification Systems, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp.97-102, 2017.
DOI : 10.18653/v1/P17-2016

E. Stamatatos, Intrinsic Plagiarism Detection Using Character n-gram Profiles, Workshop on Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN'09), pp.38-46, 2009.

E. Stamatatos, Authorship Attribution Using Text Distortion, Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, pp.1137-1148, 2017.
DOI : 10.18653/v1/E17-1107

URL : https://doi.org/10.18653/v1/e17-1107

B. Stein and S. Meyer-zu-eissen, Near Similarity Search and Plagiarism Analysis. In From Data and Information Analysis to Knowledge Engineering, pp.430-437, 2006.
DOI : 10.1007/3-540-31314-1_52

B. Stein and S. Meyer-zu-eissen, Fingerprint-based Similarity Search and its Applications. In Forschung und wissenschaftliches Rechnen, pp.85-98, 2007.

B. Stein and S. Meyer-zu-eissen, Intrinsic Plagiarism Analysis with Meta Learning, CEUR Workshop Proceedings, 2007.

B. Stein, N. Lipka, and P. Prettenhofer, Intrinsic plagiarism analysis, Language Resources and Evaluation, vol.32, issue.5, pp.63-82, 2011.
DOI : 10.1023/A:1001749303137

R. Steinberger, Cross-lingual Keyword Assignment. In Conference of the Spanish Society for Natural Language Processing (SEPLN'2001), number 27 in Procesamiento del Lenguaje Natural, pp.273-280, 2001.

R. Steinberger, B. Pouliquen, and J. Hagman, Cross-Lingual Document Similarity Calculation Using the Multilingual Thesaurus EUROVOC, LNCS, vol.2276, pp.415-424, 2002.
DOI : 10.1007/3-540-45715-1_44

R. Steinberger, B. Pouliquen, and C. Ignat, Exploiting Multilingual Nomenclatures and Language-Independent Text Features as an Interlingua for Cross-lingual Text Analysis Applications, Proceedings of the 4th Slovenian Language Technology Conference, Information Society, p.42, 2004.

R. Steinberger, B. Pouliquen, A. Widiger, C. Ignat, T. Erjavec et al., The JRC-Acquis: A multilingual aligned parallel corpus with 20+ languages, Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC'2006), Genoa, Italy, pp.2142-2147, 2006.

G. W. Stewart, On the Early History of the Singular Value Decomposition, SIAM Review, vol.35, issue.4, pp.551-566, 1993.
DOI : 10.1137/1035134

K. Sugathadasa, B. Ayesha, N. de Silva, A. S. Perera, V. Jayawardana, D. Lakmal, and M. Perera, Synergistic Union of Word2Vec and Lexicon for Domain Specific Semantic Similarity, Proceedings of the Seventh International Conference on Innovative Computing Technology, p.98, 2017.

M. A. Sultan, S. Bethard, and T. Sumner, DLS@CU: Sentence similarity from word alignment and semantic vector composition, Proceedings of the 9th International Workshop on Semantic Evaluation, pp.148-153, 2015.

J. Tian, Z. Zhou, M. Lan, and Y. Wu, ECNU at SemEval-2017 Task 1: Leverage Kernel-based Traditional NLP features and Neural Networks to Build a Universal Model for Multilingual and Cross-lingual Semantic Textual Similarity, Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), 2017.
DOI : 10.18653/v1/S17-2028

J. Tiedemann, Parallel Data, Tools and Interfaces in OPUS, 2012.

D. A. Rodríguez Torrejón and J. M. Martín Ramos, Crosslingual CoReMo System - Notebook for PAN at CLEF 2011. In Notebook Papers for PAN at CLEF 2011 LABs and Workshops, p.43, 2011.

J. Turian, L. Ratinov, and Y. Bengio, Word Representations: A Simple and General Method for Semi-Supervised Learning, Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, pp.384-394, 2010.

A. Tversky, Features of similarity., Psychological Review, vol.84, issue.4, pp.327-352, 1977.
DOI : 10.1037/0033-295X.84.4.327

S. Upadhyay, M. Faruqui, C. Dyer, and D. Roth, Cross-lingual Models of Word Embeddings: An Empirical Comparison, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp.1661-1670, 2016.
DOI : 10.18653/v1/P16-1157

H. Van-halteren, Linguistic profiling for author recognition and verification, Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, ACL '04, 2004.
DOI : 10.3115/1218955.1218981

D. Varga, P. Hálacsy, V. Nagy, L. Németh, A. Kornai et al., Parallel corpora for medium density languages, Recent Advances in Natural Language Processing (RANLP'05), Borovets, Bulgaria, pp.590-596, 2005.
DOI : 10.1075/cilt.292.32var

Z. Češka and M. Toman, Multilingual Plagiarism Detection, Lecture Notes in Computer Science, vol.5253, pp.83-92, 2008.
DOI : 10.1007/978-3-540-85776-1_8

A. Vinokourov and M. Girolami, A Probabilistic Framework for the Hierarchic Organisation and Classification of Document Collections, Journal of Intelligent Information Systems, vol.18, issue.2/3, pp.153-172, 2002.
DOI : 10.1023/A:1013677411002

A. Vinokourov, J. Shawe-taylor, and N. Cristianini, Inferring a Semantic Representation of Text via Cross-Language Correlation Analysis, Proceedings of the 15th Annual Conference on Advances in Neural Information Processing Systems, pp.1473-1480, 2002.

URL : https://papers.nips.cc/paper/2324-inferring-a-semantic-representation-of-text-via-cross-language-correlation-analysis.pdf

I. Vulić, Cross-Lingual Syntactically Informed Distributed Word Representations, Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, 2017.
DOI : 10.18653/v1/E17-2065

I. Vulić and A. Korhonen, On the Role of Seed Lexicons in Learning Bilingual Word Embeddings, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp.247-257, 2016.
DOI : 10.18653/v1/P16-1024

I. Vulić and M. Moens, Probabilistic Models of Cross-Lingual Semantic Similarity in Context Based on Latent Cross-Lingual Concepts Induced from Comparable Data, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, pp.349-362, 2014.

Y. Wang and I. H. Witten, Induction of model trees for predicting continuous classes, Proceedings of the poster papers of the European Conference on Machine Learning, Prague, Czech Republic, pp.128-137, 1997.

J. Wieting and K. Gimpel, Revisiting Recurrent Networks for Paraphrastic Sentence Embeddings, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp.2078-2088, 2017.
DOI : 10.18653/v1/P17-1190

J. Wieting, J. Mallinson, and K. Gimpel, Learning Paraphrastic Sentence Embeddings from Back-Translated Bitext. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, September, pp.274-285, 2017.

A. Williams, N. Nangia, and S. R. Bowman, A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference, 2017.

J. Williams, The Comparison of Regression Variables, Journal of the Royal Statistical Society. Series B (Methodological), vol.21, issue.2, pp.396-399, 1959.

V. Wiwanitkit, Plagiarism: word, idea, figure, etc., Croatian Medical Journal, vol.52, issue.5, p.657, 2011.

Y. Yang, J. G. Carbonell, R. D. Brown, and R. E. Frederking, Translingual information retrieval: learning from bilingual corpora, Artificial Intelligence, vol.103, issue.1-2, pp.323-345, 1998.
DOI : 10.1016/S0004-3702(98)00063-0

W. Yin and H. Schütze, Discriminative Phrase Embedding for Paraphrase Identification, Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp.1368-1373, 2015.
DOI : 10.3115/v1/N15-1154

H. Zamani, H. Nasr, P. Babaie, S. Abnar, M. Dehghani et al., Authorship Identification Using Dynamic Selection of Features from Probabilistic Feature Set. In Methods for intrinsic plagiarism detection and author diarization - Notebook for PAN at CLEF 2014, pp.128-140, 2014.

W. Y. Zou, R. Socher, D. Cer, and C. D. Manning, Bilingual Word Embeddings for Phrase-Based Machine Translation, Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp.1393-1398, 2013.

P. Zweigenbaum, S. Sharoff, and R. Rapp, Towards Preparation of the Second BUCC Shared Task: Detecting Parallel Sentences in Comparable Corpora, Proceedings of the Ninth Workshop on Building and Using Comparable Corpora (BUCC). European Language Resources Association (ELRA), pp.38-43, 2016.

P. Zweigenbaum, S. Sharoff, and R. Rapp, Overview of the Second BUCC Shared Task: Spotting Parallel Sentences in Comparable Corpora, Proceedings of the 10th Workshop on Building and Using Comparable Corpora, pp.60-67, 2017.
DOI : 10.18653/v1/W17-2512

A. Personal bibliography

J. Ferrero, F. Agnès, L. Besacier, and D. Schwab, A Multilingual, Multi-style and Multi-granularity Dataset for Cross-language Textual Similarity Detection, Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC'16), European Language Resources Association (ELRA), Portorož, Slovenia, 23-28 May 2016, pp.4162-4169, 2016. ISLRN : 723-785-513-738-2

J. Ferrero et al., Using Word Embedding for Cross-Language Plagiarism Detection, Proceedings of the 15th Conference of the European Chapter, 2017.

J. Ferrero, L. Besacier, D. Schwab, and F. Agnès, Deep Investigation of Cross-Language Plagiarism Detection Methods. In Proceedings of the 10th Workshop on Building and Using Comparable Corpora, 2017.

J. Ferrero, L. Besacier, D. Schwab, and F. Agnès, CompiLIG at SemEval-2017 Task 1: Cross-Language Plagiarism Detection Methods for Semantic Textual Similarity, Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pp.109-114, 2017.
DOI : 10.18653/v1/S17-2012

URL : https://hal.archives-ouvertes.fr/hal-01531330

E. M. B. Nagoudi, J. Ferrero, and D. Schwab, Amélioration de la similarité sémantique vectorielle par méthodes non-supervisées. In 24e conférence sur le Traitement Automatique des Langues Naturelles, pp.110-117, 2017.

URL : http://taln2017.cnrs.fr/wp-content/uploads

E. M. B. Nagoudi, J. Ferrero, and D. Schwab, Enhancing the Semantic Similarity for Arabic Sentences with Vectors Weighting, Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), 2017.

E. M. B. Nagoudi et al., Word Embedding-Based Approaches for Measuring Semantic Similarity of Arabic-English Sentences. To appear in Proceedings of the 6th International Conference on Arabic Language Processing, 2017.