M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, et al. TensorFlow: A system for large-scale machine learning. In OSDI, vol. 16, 2016.

H. Adel and H. Schütze. Global Normalization of Convolutional Neural Networks for Joint Entity and Relation Classification. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017.

P. K. Agarwal, S. Har-Peled, and K. R. Varadarajan. Geometric approximation via coresets. Combinatorial and Computational Geometry, vol. 52, pp. 1-30, 2005.

M. Agueh and G. Carlier. Barycenters in the Wasserstein space. SIAM Journal on Mathematical Analysis, vol. 43, issue 2, 2011.

A. Ali, R. Caruana, and A. Kapoor. Active Learning with Model Selection. In Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, AAAI'14, 2014.

A. Andoni, A. Naor, and O. Neiman. Impossibility of Sketching of the 3D Transportation Metric with Quadratic Cost. In 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016), vol. 55, 2016.

S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, et al. VQA: Visual question answering. In Proceedings of the IEEE International Conference on Computer Vision, pp. 2425-2433, 2015.

M. Arjovsky, S. Chintala, and L. Bottou. Wasserstein Generative Adversarial Networks. In Proceedings of the 34th International Conference on Machine Learning, vol. 70, 2017.

M. Arjovsky and L. Bottou. Towards principled methods for training generative adversarial networks, 2017.

N. Asghar, P. Poupart, X. Jiang, and H. Li. Deep active learning for dialogue generation. In Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017), 2017.

M.-F. Balcan, A. Broder, and T. Zhang. Margin based active learning. In Learning Theory (COLT), 2007.

M.-F. Balcan, A. Blum, and N. Srebro. A theory of learning with similarity functions. Machine Learning, vol. 72, 2008.

G. Balikas, C. Laclau, I. Redko, and M.-R. Amini. Cross-Lingual Document Retrieval Using Regularized Wasserstein Distance. In European Conference on Information Retrieval, pp. 398-410. Springer, 2018.

P. L. Bartlett and W. Maass. Vapnik-Chervonenkis dimension of neural nets. In The Handbook of Brain Theory and Neural Networks, pp. 1188-1192, 2003.

P. L. Bartlett, N. Harvey, C. Liaw, and A. Mehrabian. Nearly-tight VC-dimension and pseudodimension bounds for piecewise linear neural networks. arXiv preprint, 2017.

E. B. Baum and K. Lang. Query learning can work poorly when a human oracle is used. In International Joint Conference on Neural Networks, vol. 8, 1992.

J.-D. Benamou, G. Carlier, M. Cuturi, L. Nenna, and G. Peyré. Iterative Bregman projections for regularized transportation problems. SIAM Journal on Scientific Computing, vol. 37, issue 2, pp. 1111-1138, 2015.

Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle. Greedy layer-wise training of deep networks. In Advances in Neural Information Processing Systems, 2007.

Y. Bengio, J. Louradour, R. Collobert, and J. Weston. Curriculum learning. In ICML, 2009.

Y. Bengio and O. Delalleau. On the expressive power of deep architectures. In International Conference on Algorithmic Learning Theory, 2011.

J. Bigot, R. Gouet, T. Klein, A. López, et al. Geodesic PCA in the Wasserstein space by convex PCA. Annales de l'Institut Henri Poincaré, Probabilités et Statistiques, vol. 53, pp. 1-26, 2017.

A. Blumer, A. Ehrenfeucht, D. Haussler, and M. K. Warmuth. Learnability and the Vapnik-Chervonenkis dimension. Journal of the ACM (JACM), vol. 36, issue 4, pp. 929-965, 1989.

C. Blundell, J. Cornebise, K. Kavukcuoglu, and D. Wierstra. Weight uncertainty in neural networks, 2015.

N. Bonneel, M. van de Panne, S. Paris, and W. Heidrich. Displacement Interpolation Using Lagrangian Mass Transport. ACM Transactions on Graphics, vol. 30, issue 6, 2011.

N. Bonneel, J. Rabin, G. Peyré, and H. Pfister. Sliced and Radon Wasserstein Barycenters of Measures. Journal of Mathematical Imaging and Vision, vol. 51, issue 1, 2015.

N. Bonneel, G. Peyré, and M. Cuturi. Wasserstein Barycentric Coordinates: Histogram Regression Using Optimal Transport. ACM Trans. Graph., vol. 35, issue 4, 2016.

L. Breiman. Random Forests. Mach. Learn., vol. 45, issue 1, pp. 5-32, 2001.

K. Brinker. Incorporating diversity in active learning with support vector machines. In Proceedings of the 20th International Conference on Machine Learning (ICML-03), 2003.

J. Bromley, I. Guyon, Y. LeCun, E. Säckinger, and R. Shah. Signature verification using a "Siamese" time delay neural network. In Advances in Neural Information Processing Systems, 1994.

N. Carlini and D. Wagner. Defensive distillation is not robust to adversarial examples, 2016.

N. Carlini and D. Wagner. Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, AISec '17, 2017.

N. Carlini and D. Wagner. Towards evaluating the robustness of neural networks. In Security and Privacy (SP), 2017 IEEE Symposium on, 2017.

R. M. Castro and R. D. Nowak. Minimax bounds for active learning. In International Conference on Computational Learning Theory, 2007.

K. Chaloner and I. Verdinelli. Bayesian experimental design: A review. Statistical Science, pp. 273-304, 1995.

N. N. Chan. A-optimality for regression designs. Journal of Mathematical Analysis and Applications, vol. 87, issue 1, pp. 45-50, 1982.

M. Charikar. Similarity estimation techniques from rounding algorithms. In Proceedings of the Thirty-fourth Annual ACM Symposium on Theory of Computing, STOC '02, 2002.

S. Chopra, R. Hadsell, and Y. LeCun. Learning a similarity metric discriminatively, with application to face verification. In Computer Vision and Pattern Recognition (CVPR 2005), IEEE Computer Society Conference on, vol. 1, pp. 539-546, 2005.

A. Choromanska, Y. LeCun, and G. Ben Arous. Open problem: The landscape of the loss surfaces of multilayer networks. In Conference on Learning Theory, 2015.

S. Claici and J. Solomon. Wasserstein Coresets for Lipschitz Costs, 2018.

D. A. Cohn, Z. Ghahramani, and M. I. Jordan. Active learning with statistical models. Journal of Artificial Intelligence Research, 1996.

R. Collobert and J. Weston. A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning. In Proceedings of the 25th International Conference on Machine Learning, ICML '08, pp. 160-167, 2008.

G. Contardo, L. Denoyer, and T. Artières. A Meta-Learning Approach to One-Step Active Learning, 2017.

N. Courty, R. Flamary, D. Tuia, and A. Rakotomamonjy. Optimal transport for domain adaptation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017.

N. Courty, R. Flamary, and M. Ducoffe. Learning Wasserstein Embeddings, 2017.

N. Courty, R. Flamary, D. Tuia, and A. Rakotomamonjy. Optimal transport for domain adaptation. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, issue 9, 2017.

N. Courty, R. Flamary, and M. Ducoffe. Learning Wasserstein Embeddings. In International Conference on Learning Representations, 2018.

M. Cuturi. Sinkhorn Distances: Lightspeed Computation of Optimal Transportation. In Advances in Neural Information Processing Systems (NIPS), 2013.

M. Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. In Advances in Neural Information Processing Systems, pp. 2292-2300, 2013.

M. Cuturi and A. Doucet. Fast Computation of Wasserstein Barycenters. In ICML, 2014.

M. Cuturi and G. Peyré. A smoothed dual approach for variational Wasserstein problems. SIAM Journal on Imaging Sciences, vol. 9, issue 1, pp. 320-343, 2016.

M. Chirac, . Giscard, D. Pompidou, and . Gaulle, , p.130, 2012.

S. Dasgupta. Analysis of a greedy active learning strategy. In Advances in Neural Information Processing Systems, pp. 337-344, 2005.

S. Dasgupta. Analysis of a greedy active learning strategy. In Advances in Neural Information Processing Systems 17, 2005.

Y. N. Dauphin, A. Fan, M. Auli, and D. Grangier. Language Modeling with Gated Convolutional Networks. In International Conference on Machine Learning, 2017.

F. de Goes, K. Breeden, V. Ostromoukhov, and M. Desbrun. Blue Noise through Optimal Transport. ACM Trans. Graph., vol. 31, issue 6, 2012.

M. Ducoffe and F. Precioso. QBDC: Query by dropout committee for training deep supervised architecture, 2015.

M. Ducoffe, D. Mayaffre, F. Precioso, F. Lavigne, L. Vanni, and A. Tre-Hardy. Machine Learning under the light of Phraseology expertise: use case of presidential speeches, De Gaulle - Hollande (1958-2016). In JADT 2016 - Statistical Analysis of Textual Data, vol. 1, pp. 157-168, 2016.

M. Ducoffe, G. Portelli, and F. Precioso. Scalable batch mode Optimal Experimental Design for Deep Networks. In 29th Conference on Neural Information Processing Systems, 2016.

M. Ducoffe and F. Precioso. Introducing Active Learning for CNN under the light of Variational Inference, 2016.

M. Ducoffe and F. Precioso. Adversarial Active Learning for Deep Networks: a Margin Based Approach, 2018.

K. Dvijotham, R. Stanforth, S. Gowal, T. Mann, and P. Kohli. A dual approach to scalable verification of deep networks, 2018.

J. Ebrahimi, A. Rao, D. Lowd, and D. Dou. HotFlip: White-Box Adversarial Examples for NLP, 2017.

A. Fawzi, S.-M. Moosavi-Dezfooli, P. Frossard, and S. Soatto. Classification regions of deep neural networks, 2017.

R. Feldman and J. Sanger. The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data, 2007.

R. P. Feynman. Lectures on Statistical Mechanics, 1972.

P. Flaherty, A. Arkin, and M. I. Jordan. Robust design of biological experiments. In Advances in Neural Information Processing Systems, pp. 363-370, 2005.

R. Flamary and N. Courty. POT: Python Optimal Transport library, 2017.

P. T. Fletcher, C. Lu, S. M. Pizer, and S. Joshi. Principal Geodesic Analysis for the Study of Nonlinear Statistics of Shape. IEEE Trans. Medical Imaging, vol. 23, issue 8, pp. 995-1005, 2004.

Y. Freund, H. S. Seung, E. Shamir, and N. Tishby. Selective Sampling Using the Query by Committee Algorithm. Mach. Learn., vol. 28, issue 2-3, 1997.

Y. Gal and Z. Ghahramani. Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. In International Conference on Machine Learning, 2016.

Y. Gal, R. Islam, and Z. Ghahramani. Deep Bayesian Active Learning with Image Data. In Bayesian Deep Learning Workshop, NIPS, 2016.

W. Gao and Z.-H. Zhou. Dropout Rademacher complexity of deep neural networks. Science China Information Sciences, vol. 59, issue 7, 2016.

G. Gasso, A. Rakotomamonjy, and S. Canu. Recovering sparse signals with a certain family of nonconvex penalties and DC programming. IEEE Transactions on Signal Processing, vol. 57, issue 12, 2009.

A. Genevay, M. Cuturi, G. Peyré, and F. Bach. Stochastic optimization for large-scale optimal transport. In Advances in Neural Information Processing Systems, 2016.

X. Glorot, A. Bordes, and Y. Bengio. Domain adaptation for large-scale sentiment classification: A deep learning approach. In Proceedings of the 28th International Conference on Machine Learning (ICML-11), 2011.

A. Gogna and A. Majumdar. Semi Supervised Autoencoder. In International Conference on Neural Information Processing, 2016.

I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and Harnessing Adversarial Examples. ICLR, 2015.

B. Graham, J. Reizenstein, and L. Robinson. Efficient batchwise dropout training using submatrices, 2015.

A. Graves. Practical Variational Inference for Neural Networks. In Proceedings of the 24th International Conference on Neural Information Processing Systems, NIPS'11, pp. 2348-2356, 2011.

R. Grosse and J. Martens. A Kronecker-factored approximate Fisher matrix for convolution layers, 2016.

I. Guyon, B. Boser, and V. Vapnik. Automatic capacity tuning of very large VC-dimension classifiers. In Advances in Neural Information Processing Systems, pp. 147-155, 1993.

S. Hanneke. Rates of convergence in active learning. ArXiv e-prints, 2011.

M. Hardt, B. Recht, and Y. Singer. Train Faster, Generalize Better: Stability of stochastic gradient descent, 2015.

W. He, J. Wei, X. Chen, N. Carlini, and D. Song. Adversarial Example Defense: Ensembles of Weak Defenses are not Strong. In 11th USENIX Workshop on Offensive Technologies (WOOT 17), Vancouver, BC. USENIX Association, 2017.

G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov. Improving neural networks by preventing co-adaptation of feature detectors, 2012.

S. C. H. Hoi, R. Jin, J. Zhu, and M. R. Lyu. Batch Mode Active Learning and Its Application to Medical Image Classification. In ICML '06, pp. 417-424, 2006.

G. Huang, C. Guo, M. J. Kusner, Y. Sun, F. Sha, et al. Supervised Word Mover's Distance. In Advances in Neural Information Processing Systems, 2016.

G. Huang, C. Guo, M. J. Kusner, Y. Sun, et al. Supervised Word Mover's Distance. In Advances in Neural Information Processing Systems, pp. 4862-4870, 2016.

M. W. Huijser and J. C. van Gemert. Active Decision Boundary Annotation with Deep Generative Models, 2017.

F. Huszár and D. Duvenaud. Optimally-Weighted Herding is Bayesian Quadrature, 2012.

P. Indyk and N. Thaper. Fast image retrieval via embeddings. In 3rd International Workshop on Statistical and Computational Theories of Vision, 2003.

S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift, 2015.

A. Kagan. Another look at the Cramér-Rao inequality. The American Statistician, vol. 55, pp. 211-212, 2001.

N. Kalchbrenner, E. Grefenstette, and P. Blunsom. A Convolutional Neural Network for Modelling Sentences. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, vol. 1, 2014.

A. Kapoor, K. Grauman, R. Urtasun, and T. Darrell. Active learning with Gaussian processes for object categorization. In IEEE 11th International Conference on Computer Vision, 2007.

A. Karpathy, J. Johnson, and L. Fei-Fei. Visualizing and understanding recurrent networks, 2015.

S. M. Kay. Fundamentals of Statistical Signal Processing: Practical Algorithm Development, vol. 3. Pearson Education, 2013.

S. Khot and A. Naor. Nonembeddability theorems via Fourier analysis. Mathematische Annalen, vol. 334, issue 4, 2006.

Y. Kim. Convolutional Neural Networks for Sentence Classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014.

B. Kim, R. Khanna, and O. O. Koyejo. Examples are not enough, learn to criticize! Criticism for interpretability. In Advances in Neural Information Processing Systems, pp. 2280-2288, 2016.

D. P. Kingma, S. Mohamed, D. Jimenez Rezende, and M. Welling. Semi-supervised learning with deep generative models. In Advances in Neural Information Processing Systems, 2014.

G. Koch, R. Zemel, and R. Salakhutdinov. Siamese neural networks for one-shot image recognition. In ICML Deep Learning Workshop, vol. 2, 2015.

S. Kolouri, S. R. Park, and G. K. Rohde. The Radon Cumulative Distribution Transform and Its Application to Image Classification. IEEE Transactions on Image Processing, vol. 25, issue 2, pp. 920-934, 2016.

S. Kolouri, A. B. Tosun, J. A. Ozolek, and G. K. Rohde. A continuous linear optimal transport approach for pattern analysis in image datasets. Pattern Recognition, vol. 51, 2016.

S. Kolouri, Y. Zou, and G. K. Rohde. Sliced Wasserstein Kernels for Probability Distributions. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.

S. Kolouri, S. R. Park, M. Thorpe, D. Slepcev, and G. K. Rohde. Optimal Mass Transport: Signal processing and machine-learning applications. IEEE Signal Processing Magazine, vol. 34, issue 4, 2017.

V. Koltchinskii. Rademacher complexities and bounding the excess risk in active learning. Journal of Machine Learning Research, vol. 11, 2010.

A. Krause and V. Cevher. Submodular dictionary selection for sparse representation. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010.

A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. In Proceedings of Neural Information Processing Systems (NIPS), pp. 1106-1114, 2012.

D. Krueger, N. Ballas, S. Jastrzebski, D. Arpit, M. S. Kanwal, A. Fischer, A. Courville, et al. Deep Nets Don't Learn via Memorization, 2017.

V. Kuleshov, S. Thakoor, T. Lau, and S. Ermon. Adversarial Examples for Natural Language Classification Problems, 2018.

S. Kumar, S. Chakrabarti, and S. Roy. Earth Mover's Distance Pooling over Siamese LSTMs for Automatic Short Answer Grading. In Proceedings of the International Joint Conference on Artificial Intelligence, pp. 2046-2052, 2017.

M. Kusner, Y. Sun, N. Kolkin, and K. Weinberger. From word embeddings to document distances. In International Conference on Machine Learning, pp. 957-966, 2015.

L. Lebart, A. Salem, and L. Berry. Exploring Textual Data, 1998.

M. Langberg and L. J. Schulman. Universal ε-approximators for integrals. In Proceedings of the Twenty-first Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 598-607, 2010.

J. Lee and M. Raginsky. Minimax statistical learning and domain adaptation with Wasserstein distances, 2017.

D. D. Lewis and W. A. Gale. A sequential algorithm for training text classifiers. In Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 3-12, 1994.

J. Li, X. Chen, E. Hovy, and D. Jurafsky. Visualizing and understanding neural models in NLP, 2015.

T. Liang, T. Poggio, A. Rakhlin, and J. Stokes. Fisher-Rao metric, geometry, and complexity of neural networks, 2017.

X. Lin and D. Parikh. Active Learning for Visual Question Answering: An Empirical Study, 2017.

X. Lin and D. Parikh. Active Learning for Visual Question Answering: An Empirical Study, 2017.

Y. Liu. Active learning with support vector machine applied to gene expression data for cancer classification. Journal of Chemical Information and Computer Sciences, vol. 44, issue 6, 2004.

W. Liu, Y. Wen, Z. Yu, and M. Yang. Large-Margin Softmax Loss for Convolutional Neural Networks. In ICML, pp. 507-516, 2016.

L. van der Maaten and G. Hinton. Visualizing data using t-SNE. Journal of Machine Learning Research, vol. 9, pp. 2579-2605, 2008.

D. J. C. MacKay. Bayesian interpolation. Neural Computation, vol. 4, issue 3, pp. 415-447, 1992.

J. Martens and R. Grosse. Optimizing neural networks with Kronecker-factored approximate curvature, 2015.

J. Martens, J. Ba, and M. Johnson. Kronecker-factored Curvature Approximations for Recurrent Neural Networks. In International Conference on Learning Representations, 2018.

A. M. Mathai and S. B. Provost. Quadratic Forms in Random Variables: Theory and Applications. M. Dekker, 1992.

J. Matoušek and A. Naor. Open problems on embeddings of finite metric spaces, 2011.

J. Matoušek. Lecture notes on metric embeddings, 2013.

D. Mayaffre, C. Bouzereau, M. Ducoffe, M. Guaresi, F. Precioso, and L. Vanni. Les mots des candidats, de « allons » à « vertu », 2017.

S. Mellet and D. Longrée. Belgian Journal of Linguistics, vol. 23, 2009.

S. Mellet and J.-F. Barthélemy. La topologie textuelle : légitimation d'une notion émergente. Lexicometrica, vol. 7, 2009.

T. Miyato, S. Maeda, M. Koyama, and S. Ishii. Virtual adversarial training: a regularization method for supervised and semi-supervised learning, 2017.

S.-M. Moosavi-Dezfooli, A. Fawzi, and P. Frossard. DeepFool: a simple and accurate method to fool deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.

K. Muandet, K. Fukumizu, B. Sriperumbudur, B. Schölkopf, et al. Kernel mean embedding of distributions: A review and beyond. Foundations and Trends in Machine Learning, vol. 10, pp. 1-141, 2017.

R. M. Neal and G. E. Hinton. A view of the EM algorithm that justifies incremental, sparse, and other variants, 1998.

G. L. Nemhauser, L. A. Wolsey, and M. L. Fisher. An analysis of approximations for maximizing submodular set functions. Mathematical Programming, vol. 14, issue 1, 1978.

B. Neyshabur, R. Tomioka, and N. Srebro. Norm-based capacity control in neural networks. In Conference on Learning Theory, pp. 1376-1401, 2015.

K. Nigam and A. McCallum. Employing EM and pool-based active learning for text classification, 1998.

H. Noh, S. Hong, and B. Han. Learning deconvolution network for semantic segmentation. In Proceedings of the IEEE International Conference on Computer Vision, 2015.

A. Oliver, A. Odena, C. Raffel, E. D. Cubuk, and I. J. Goodfellow. Realistic Evaluation of Semi-Supervised Learning Algorithms, 2018.

O. Sener and S. Savarese. Active Learning for Convolutional Neural Networks: A Core-Set Approach. In International Conference on Learning Representations, 2018.

A. Rakotomamonjy, A. Traore, M. Berar, R. Flamary, and N. Courty. Wasserstein Distance Measure Machines, 2018.

M. T. Ribeiro, S. Singh, and C. Guestrin. Model-agnostic interpretability of machine learning, 2016.

M. T. Ribeiro, S. Singh, and C. Guestrin. Anchors: High-precision model-agnostic explanations. In AAAI Conference on Artificial Intelligence, 2018.

M. T. Ribeiro, S. Singh, and C. Guestrin. Semantically Equivalent Adversarial Rules for Debugging NLP Models. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, vol. 1, pp. 856-865, 2018.

H. Ritter, A. Botev, and D. Barber. A Scalable Laplace Approximation for Neural Networks. In International Conference on Learning Representations, 2018.

A. Rolet, M. Cuturi, and G. Peyré. Fast dictionary learning with a smoothed Wasserstein loss. In AISTATS, pp. 630-638, 2016.

O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, et al. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV), vol. 115, issue 3, pp. 211-252, 2015.

N. J. Sanders. Sanders-Twitter Sentiment Corpus. Sanders Analytics LLC, 2011.

F. Santambrogio. Introduction to optimal transport theory. Notes, 2014.

Y. Sawada, T. Sato, T. Nakada, K. Ujimoto, and N. Hayashi. All-transfer learning for deep neural networks and its application to sepsis classification, 2017.

R. E. Schapire, Y. Freund, P. Bartlett, and W. S. Lee. Boosting the margin: A new explanation for the effectiveness of voting methods. Annals of Statistics, 1998.

A. Schrijver. A combinatorial algorithm minimizing submodular functions in strongly polynomial time. Journal of Combinatorial Theory, Series B, vol. 80, issue 2, 2000.

D. Sculley. Online active learning methods for fast label-efficient spam filtering. In CEAS, vol. 7, 2007.

V. Seguy and M. Cuturi. Principal geodesic analysis for probability measures under the optimal transport metric. In Advances in Neural Information Processing Systems, 2015.

B. Settles, M. Craven, and L. Friedland. Active learning with real annotation costs. In Proceedings of the NIPS Workshop on Cost-Sensitive Learning, 2008.

B. Settles. From theories to queries: Active learning in practice. In Active Learning and Experimental Design Workshop, in conjunction with AISTATS 2010, 2011.

H. S. Seung, M. Opper, and H. Sompolinsky. Query by Committee. In COLT '92, 1992.

J. Shen, Y. Qu, W. Zhang, and Y. Yu. Wasserstein Distance Guided Representation Learning for Domain Adaptation. In AAAI, 2018.

S. Shirdhonkar and D. W. Jacobs. Approximate earth mover's distance in linear time. In CVPR, 2008.

K. Simonyan and A. Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. CoRR, 2014.

B. C. Smith, B. Settles, W. C. Hallows, et al. SIRT3 substrate specificity determined by peptide arrays and machine learning. ACS Chemical Biology, vol. 6, issue 2, 2010.

J. S. Smith, B. Nebgen, N. Lubbers, O. Isayev, and A. E. Roitberg. Less is more: sampling chemical space with active learning, 2018.

J. Sokolić, R. Giryes, G. Sapiro, and M. R. D. Rodrigues. Generalization error of deep neural networks: Role of classification margin and data structure. In Sampling Theory and Applications (SampTA), 2017.

J. Solomon, F. de Goes, G. Peyré, M. Cuturi, A. Butscher, et al. Convolutional Wasserstein Distances: Efficient Optimal Transportation on Geometric Domains. ACM Trans. Graph., vol. 34, issue 4, 2015.

J. Solomon, F. de Goes, G. Peyré, M. Cuturi, A. Butscher, A. Nguyen, T. Du, and L. Guibas. Convolutional Wasserstein distances: Efficient optimal transportation on geometric domains. ACM Transactions on Graphics (TOG), vol. 34, 2015. URL: https://hal.archives-ouvertes.fr/halshs-00955539

M. Staib, S. Claici, J. Solomon, and S. Jegelka. Parallel Streaming Wasserstein Barycenters. CoRR, 2017.

M. Sugiyama and N. Rubens. A batch ensemble approach to active learning with model selection. Neural Networks, vol. 21, issue 9, 2008.

C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks, 2013.

T. Tanay and L. Griffin. A boundary tilting perspective on the phenomenon of adversarial examples, 2016.

A. Taylor, M. Marcus, and B. Santorini. The Penn treebank: an overview. In Treebanks, pp. 5-22, 2003.

T. Zahavy, B. Kang, A. Sivak, J. Feng, H. Xu, and S. Mannor. Ensemble Robustness and Generalization of Stochastic Deep Learning Algorithms, 2018.

S. Tong and D. Koller. Support vector machine active learning with applications to text classification. Journal of Machine Learning Research, vol. 2, 2001.

L. Vanni, M. Ducoffe, C. Aguilar, F. Precioso, and D. Mayaffre. Textual Deconvolution Saliency (TDS): a deep tool box for linguistic analysis. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, vol. 1, pp. 548-557, 2018.

T. Vayer, L. Chapel, R. Flamary, R. Tavenard, and N. Courty. Optimal Transport for structured data, 2018.

C. Villani. Optimal Transport: Old and New. Grundlehren der mathematischen Wissenschaften, 2009.

W. Wang, D. Slepčev, S. Basu, J. A. Ozolek, and G. K. Rohde. A Linear Optimal Transportation Framework for Quantifying and Visualizing Variations in Sets of Images. International Journal of Computer Vision, vol. 101, issue 2, 2013.

Z. Wang and J. Ye. Querying discriminative and representative samples for batch mode active learning, 2015.

K. Wang, D. Zhang, Y. Li, R. Zhang, and L. Lin. Cost-effective active learning for deep image classification. IEEE Transactions on Circuits and Systems for Video Technology, 2016.

K. Wei, R. Iyer, and J. Bilmes. Submodularity in Data Subset Selection and Active Learning. In International Conference on Machine Learning, 2015.

T.-H. Wen, D. Vandyke, N. Mrkšić, M. Gasic, L. M. Rojas-Barahona, P.-H. Su, S. Ultes, and S. Young. A Network-based End-to-End Trainable Task-oriented Dialogue System. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, vol. 1, 2017.

J. Weston, F. Ratle, H. Mobahi, and R. Collobert. Deep learning via semi-supervised embedding. In Neural Networks: Tricks of the Trade, 2012.

R. Willett, R. Nowak, and R. M. Castro. Faster rates in regression via active learning. In Advances in Neural Information Processing Systems, 2006.

C. Wu and E. Tabak, 2017.

Y. Xie, X. Wang, R. Wang, and H. Zha. A Fast Proximal Point Method for Wasserstein Distance, 2018.

H. Xu and S. Mannor. Robustness and generalization. Machine Learning, vol. 86, 2012.

Y. Shen, H. Yun, Z. C. Lipton, Y. Kronrod, and A. Anandkumar. Deep Active Learning for Named Entity Recognition. In International Conference on Learning Representations, 2018.

W. Yin, K. Kann, M. Yu, and H. Schütze. Comparative study of CNN and RNN for natural language processing, 2017.

K. Yu, J. Bi, and V. Tresp. Active learning via transductive experimental design. In Proceedings of the 23rd International Conference on Machine Learning, 2006.

W. Yu, G. Zeng, P. Luo, F. Zhuang, Q. He, et al. Embedding with autoencoder regularization. In ECML/PKDD, 2013.

A. Yu and K. Grauman. Fine-grained visual comparisons with local learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014.

A. W. Yu, D. Dohan, M.-T. Luong, R. Zhao, K. Chen, M. Norouzi, and Q. V. Le. QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension, 2018.

A. R. Zamir, A. Sax, W. Shen, L. Guibas, J. Malik, and S. Savarese. Taskonomy: Disentangling Task Transfer Learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.

M. D. Zeiler and R. Fergus. Visualizing and understanding convolutional networks. In European Conference on Computer Vision, pp. 818-833, 2014.

K. Zhai and H. Wang. Adaptive Dropout with Rademacher Complexity Regularization. In International Conference on Learning Representations, 2018.

T. Zhang and F. Oles. The value of unlabeled data for classification problems. In Proceedings of the Seventeenth International Conference on Machine Learning, 2000.

C. Zhang, S. Bengio, M. Hardt, B. Recht, and O. Vinyals. Understanding deep learning requires rethinking generalization, 2016.

Y. Zhang, M. Lease, and B. C. Wallace. Active Discriminative Text Representation Learning, 2017.

S. Zhou, Q. Chen, and X. Wang. Active deep networks for semi-supervised sentiment classification. ACL ICCL, 2010.

J.-Y. Zhu, P. Krähenbühl, E. Shechtman, and A. A. Efros. Generative visual manipulation on the natural image manifold. In European Conference on Computer Vision, 2016.

J.-J. Zhu and J. Bento. Generative Adversarial Active Learning, 2017.