R. A. Adams, Sobolev Spaces, Academic Press, 1975.

J. Adebayo and L. Kagal, Iterative orthogonal feature projection for diagnosing bias in black-box models, Conference on Fairness, Accountability, and Transparency in Machine Learning, 2016.

A. Agarwal, A. Beygelzimer, M. Dudík, J. Langford, and H. Wallach, A reductions approach to fair classification, vol.56, p.57, 2018.

R. Agrawal, A. Gupta, Y. Prabhu, and M. Varma, Multi-label learning with millions of labels: Recommending advertiser bid phrases for web pages, Proceedings of the International World Wide Web Conference, p.27, 2013.

D. Anbar, A modified Robbins-Monro procedure approximating the zero of a regression function from below, Ann. Statist, vol.5, issue.1, pp.229-234, 1977.

S. Arlot and R. Genuer, Analysis of purely random forests bias, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01023596

J. Audibert, Aggregated estimators and empirical complexity for least square regression, Ann. Inst. H. Poincaré Probab. Statist, vol.40, issue.6, p.53, 2004.

J. Audibert, Fast learning rates in statistical inference through aggregation, Ann. Statist, vol.37, issue.4, pp.1591-1646, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00139030

J. Audibert and A. Tsybakov, Fast learning rates for plug-in classifiers, Ann. Statist, vol.35, issue.2, pp.608-633, 2007.
URL : https://hal.archives-ouvertes.fr/hal-00160849

R. Babbar and B. Schölkopf, DiSMEC: Distributed sparse machines for extreme multi-label classification, Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, pp.721-729, 2017.

S. Barocas and A. Selbst, Big data's disparate impact, Calif. L. Rev, vol.104, p.671, 2016.

S. Barocas, M. Hardt, and A. Narayanan, Fairness and Machine Learning. fairmlbook.org, p.55, 2018.

P. Bartlett and S. Mendelson, Rademacher and Gaussian complexities: risk bounds and structural results, J. Mach. Learn. Res, vol.3, pp.463-482, 2002.

P. Bartlett and M. Wegkamp, Classification with a reject option using a hinge loss, J. Mach. Learn. Res, vol.9, pp.1823-1840, 2008.

P. Bartlett, O. Bousquet, and S. Mendelson, Local Rademacher complexities, The Annals of Statistics, vol.33, issue.4, pp.1497-1537, 2005.

P. Bartlett, M. Jordan, and J. McAuliffe, Convexity, classification, and risk bounds, Journal of the American Statistical Association, vol.101, issue.473, pp.138-156, 2006.

Z. Barutcuoglu, R. E. Schapire, and O. G. Troyanskaya, Hierarchical multi-label prediction of gene function, Bioinformatics, vol.22, issue.7, p.27, 2006.

P. Bellec, A. Dalalyan, E. Grappin, and Q. Paris, On the prediction loss of the lasso in the partially labeled setting, Electron. J. Statist, vol.12, issue.2, pp.3443-3472, 2018.

O. Besov, V. Ilin, and S. Nikolskii, Integral Representations of Functions and Embedding Theorems (Integralnye predstavleniya funktsii i teoremy vlozheniya), Fizmatlit "Nauka", 1996.

A. Beutel, J. Chen, Z. Zhao, and E. H. Chi, Data decisions and theoretical implications when adversarially learning fair representations, Conference on Fairness, Accountability, and Transparency in Machine Learning, 2017.

A. Beygelzimer, J. Langford, Y. Lifshits, G. Sorkin, and A. Strehl, Conditional probability tree estimation analysis and algorithms, Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, pp.51-58, 2009.

K. Bhatia, H. Jain, P. Kar, M. Varma, and P. Jain, Sparse local embeddings for extreme multi-label classification, NIPS, p.136, 2015.

L. Birgé, A new lower bound for multiple hypothesis testing, IEEE Transactions on Information Theory, vol.51, issue.4, p.157, 2005.

A. Blumer, A. Ehrenfeucht, D. Haussler, and M. Warmuth, Learnability and the Vapnik-Chervonenkis dimension, Journal of the ACM (JACM), vol.36, issue.4, p.11, 1989.

S. Bobkov and M. Ledoux, One-dimensional empirical measures, order statistics and Kantorovich transport distances, vol.48, p.113, 2016.

O. Bousquet and A. Elisseeff, Stability and generalization, Journal of machine learning research, vol.2, pp.499-526, 2002.

L. Breiman, Consistency for a simple model of random forests, 2004.

L. Brown and M. Low, A constrained risk inequality with applications to nonparametric functional estimation, The Annals of Statistics, vol.24, issue.6, pp.2524-2535, 1996.

T. Calders, F. Kamiran, and M. Pechenizkiy, Building classifiers with independency constraints, IEEE international conference on Data mining, vol.25, p.55, 2009.

F. Calmon, D. Wei, B. Vinzamuri, K. N. Ramamurthy, and K. R. Varshney, Optimized pre-processing for discrimination prevention, Neural Information Processing Systems, vol.55, p.56, 2017.

F. Chierichetti, R. Kumar, S. Lattanzi, and S. Vassilvitskii, Fair clustering through fairlets, Neural Information Processing Systems, 2017.

C. Chow, An optimum character recognition system using decision functions, IRE Transactions on Electronic Computers, vol.8, issue.4, p.94, 1957.

C. Chow, On optimum error and reject trade-off, IEEE Transactions on Information Theory, vol.16, issue.5, p.94, 1970.

E. Chzhen, Optimal rates for F-score binary classification, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02123314

E. Chzhen, Classification of sparse binary vectors, J. Stat. Plan. Inference, 2019.

E. Chzhen, C. Denis, M. Hebiri, and J. Salmon, On the benefits of output sparsity for multi-label classification, vol.29, p.143, 2017.

E. Chzhen, C. Denis, and M. Hebiri, Minimax semi-supervised confidence sets for multi-class classification, Ann. Stat, vol.27, p.58, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02112918

E. Chzhen, C. Denis, M. Hebiri, L. Oneto, and M. Pontil, Leveraging Labeled and Unlabeled Data for Consistent Fair Binary Classification, Neural Information Processing Systems, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02150662

E. Chzhen, M. Hebiri, and J. Salmon, On lasso refitting strategies, Bernoulli, pp.2019-2048.
URL : https://hal.archives-ouvertes.fr/hal-01593888

S. D. Conte and C. de Boor, Elementary Numerical Analysis: An Algorithmic Approach. McGraw-Hill Higher Education, p.39, 1980.

A. Cotter, M. Gupta, H. Jiang, N. Srebro, K. Sridharan et al., Training well-generalizing classifiers for fairness metrics and other data-dependent constraints, 2018.

F. Cribari-Neto, N. Garcia, and K. Vasconcellos, A note on inverse moments of binomial variates, Brazilian Review of Econometrics, vol.20, issue.2, pp.269-277, 2000.

B. de Finetti, Probability, induction and statistics. The art of guessing, Wiley Series in Probability and Mathematical Statistics, 1972.

B. de Finetti, Theory of probability: a critical introductory treatment, vol.1, 1974.

O. Dekel and O. Shamir, Multiclass-multilabel classification with more classes than examples, Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp.137-144, 2010.

K. Dembczyński, W. Waegeman, W. Cheng, and E. Hüllermeier, On label dependence and loss minimization in multi-label classification, Machine Learning, vol.88, pp.5-45, 2012.

K. Dembczynski, A. Jachnik, W. Kotlowski, W. Waegeman, and E. Hüllermeier, Optimizing the f-measure in multi-label classification: Plug-in rule approach versus structured loss minimization, ICML, vol.136, p.139, 2013.

K. Dembczyński, W. Kotłowski, O. Koyejo, and N. Natarajan, Consistency analysis for binary classification revisited, ICML, vol.35, p.38, 2017.

C. Denis and M. Hebiri, Confidence sets for classification, International Symposium on Statistical Learning and Data Sciences, pp.301-312, 2015.

C. Denis and M. Hebiri, Confidence sets with expected sizes for multiclass classification, JMLR, vol.18, issue.1, pp.3571-3598, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01357850

L. Devroye, The uniform convergence of nearest neighbor regression function estimators and their application in optimization, IEEE Transactions on Information Theory, vol.24, issue.2, p.62, 1978.

L. Devroye, Any discrimination rule can have an arbitrarily bad probability of error for finite sample size, IEEE Transactions on Pattern Analysis and Machine Intelligence, issue.2, pp.154-157, 1982.

L. Devroye, L. Györfi, and G. Lugosi, A probabilistic theory of pattern recognition, Applications of Mathematics, vol.31, issue.4, p.17, 1996.

V. Dinh, L. S. T. Ho, N. V. Cuong, D. Nguyen, and B. T. Nguyen, Learning from non-iid data: Fast rates for the one-vs-all multiclass plug-in classifiers, Theory and Applications of Models of Computation, pp.375-387, 2015.

M. Donini, L. Oneto, S. Ben-David, J. Shawe-Taylor, and M. Pontil, Empirical risk minimization under fairness constraints, Neural Information Processing Systems, vol.55, p.67, 2018.

R. Dudley, The sizes of compact subsets of Hilbert space and continuity of Gaussian processes, Journal of Functional Analysis, vol.1, issue.3, pp.290-330, 1967.

A. Dvoretzky, J. Kiefer, and J. Wolfowitz, Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator, Ann. Math. Statist, vol.27, issue.3, pp.642-669, 1956.

C. Dwork, N. Immorlica, A. T. Kalai, and M. D. Leiserson, Decoupled classifiers for group-fair and efficient machine learning, Conference on Fairness, Accountability and Transparency, vol.55, p.65, 2018.

D. Eisenstat and D. Angluin, The VC dimension of k-fold union, Information Processing Letters, vol.101, issue.5, pp.181-184, 2007.

M. Feldman, S. A. Friedler, J. Moeller, C. Scheidegger, and S. Venkatasubramanian, Certifying and removing disparate impact, International Conference on Knowledge Discovery and Data Mining, 2015.

S. Gao, W. Wu, C. H. Lee, and T. S. Chua, A MFoM learning approach to robust multiclass multi-label text categorization, ICML, p.27, 2004.

W. Gao and Z. Zhou, On the consistency of multi-label learning, COLT, p.137, 2011.

R. Genuer, Variance reduction in purely random forests, Journal of Nonparametric Statistics, vol.24, issue.3, pp.543-562, 2012.
URL : https://hal.archives-ouvertes.fr/hal-01590513

D. Gil, J. Girela, J. Juan, J. Gomez-torres, and M. Johnsson, Predicting seminal quality with artificial intelligence methods, Expert Systems with Applications, vol.39, issue.16, pp.12564-12573, 2012.

E. Gilbert, A comparison of signalling alphabets. The Bell system technical journal, vol.31, pp.504-522, 1952.

E. Giné and R. Nickl, Mathematical Foundations of Infinite-Dimensional Statistical Models. Cambridge Series in Statistical and Probabilistic Mathematics, vol.14, p.15, 2015.

H. A. Güvenir, G. Demiröz, and N. Ilter, Learning differential diagnosis of erythemato-squamous diseases using voting feature intervals, Artificial Intelligence in Medicine, vol.13, p.147, 1998.

M. Hardt, E. Price, and N. Srebro, Equality of opportunity in supervised learning, Neural Information Processing Systems, vol.23, p.67, 2016.

J. A. Hartigan, Estimation of a convex density contour in two dimensions, J. Amer. Statist. Assoc, vol.82, issue.397, p.95, 1987.

D. Haussler, Sphere packing numbers for subsets of the Boolean n-cube with bounded Vapnik-Chervonenkis dimension, Journal of Combinatorial Theory, Series A, vol.69, issue.2, pp.217-232, 1995.

R. Herbei and M. Wegkamp, Classification with reject option, Canad. J. Statist, vol.34, issue.4, p.96, 2006.

W. Hoeffding, Probability inequalities for sums of bounded random variables, J. Amer. Statist. Assoc, vol.58, issue.301, pp.13-30, 1963.

M. Hristache, A. Juditsky, J. Polzehl, and V. Spokoiny, Structure adaptive approach for dimension reduction, The Annals of Statistics, vol.29, issue.6, pp.1537-1566, 2001.

M. Hristache, A. Juditsky, and V. Spokoiny, Direct estimation of the index coefficient in a single-index model, Annals of Statistics, pp.595-623, 2001.

I. Ibragimov and R. Khasminskii, Statistical Estimation: Asymptotic Theory. Applications of Mathematics Series, 1981.

S. Jabbari, M. Joseph, M. Kearns, J. Morgenstern, and A. Roth, Fair learning in markovian environments, Conference on Fairness, Accountability, and Transparency in Machine Learning, 2016.

H. Jain, Y. Prabhu, and M. Varma, Extreme multi-label loss functions for recommendation, tagging, ranking & other missing label applications, KDD, vol.136, p.137, 2016.

H. Jiang, Uniform convergence rates for kernel density estimation, Proceedings of the 34th International Conference on Machine Learning, vol.70, pp.1694-1703, 2017.

M. Joseph, M. Kearns, J. H. Morgenstern, and A. Roth, Fairness in learning: Classic and contextual bandits, Neural Information Processing Systems, 2016.

A. Juditsky, O. Lepski, and A. Tsybakov, Nonparametric estimation of composite functions, The Annals of Statistics, vol.37, issue.3, p.16, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00148063

F. Kamiran and T. Calders, Classifying without discriminating, International Conference on Computer, Control and Communication, 2009.

F. Kamiran and T. Calders, Classification with no discrimination by preferential sampling, Machine Learning Conference, 2010.

F. Kamiran and T. Calders, Data preprocessing techniques for classification without discrimination, Knowledge and Information Systems, vol.33, issue.1, p.56, 2012.

S. S. Keerthi, V. Sindhwani, and O. Chapelle, An efficient method for gradient-based adaptation of hyperparameters in svm models, NIPS, p.38, 2007.

N. Kilbertus, M. Rojas-Carulla, G. Parascandolo, M. Hardt, D. Janzing et al., Avoiding discrimination through causal reasoning, Neural Information Processing Systems, 2017.

A. N. Kolmogorov and V. M. Tikhomirov, ε-entropy and ε-capacity of sets in functional spaces, Amer. Math. Soc. Transl. Ser. 2, vol.17, pp.277-364, 1961.

V. Koltchinskii, Oracle inequalities in empirical risk minimization and sparse recovery problems, Lecture Notes in Mathematics, vol.2033, p.157, 2011.

A. Korostelëv and A. Tsybakov, Minimax theory of image reconstruction, Lecture Notes in Statistics, vol.82, p.16, 1993.

O. Koyejo, N. Natarajan, P. Ravikumar, and I. Dhillon, Consistent binary classification with generalized performance metrics, NIPS, p.38, 2014.

O. Koyejo, N. Natarajan, P. Ravikumar, and I. Dhillon, Consistent multilabel classification, NIPS, vol.57, p.137, 2015.

M. J. Kusner, J. Loftus, C. Russell, and R. Silva, Counterfactual fairness, Neural Information Processing Systems, 2017.

M. Lapin, M. Hein, and B. Schiele, Top-k multiclass SVM, NIPS, p.137, 2015.

Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE, vol.86, issue.11, pp.2278-2324, 1998.

M. Ledoux and M. Talagrand, Probability in Banach Spaces, Ergebnisse der Mathematik und ihrer Grenzgebiete (3), vol.23, Springer-Verlag, 1991.

J. Lei, Classification with confidence, Biometrika, vol.101, issue.4, pp.755-769, 2014.

O. Lepskii, Asymptotic minimax estimation with prescribed properties, Theory of Probability & Its Applications, vol.34, p.137, 1990.

D. Lewis, Evaluating and optimizing autonomous text classification systems, ACM, vol.19, p.35, 1995.

T. Li, A. Prasad, and P. Ravikumar, Fast classification rates for high-dimensional Gaussian generative models, NIPS, pp.1054-1062, 2015.

X. Li, F. Zhao, and Y. Guo, Multi-label image classification with a probabilistic label enhancement model, p.27, 2014.

Y. Li, Y. Song, and J. Luo, Improving pairwise ranking for multi-label image classification, p.137, 2017.

F. Louzada, A. Ara, and G. Fernandes, Classification methods applied to credit scoring: Systematic review and overall comparison, Surveys in Operations Research and Management Science, vol.21, pp.117-134, 2016.

K. Lum and J. Johndrow, A statistical framework for fair predictive algorithms, 2016.

E. Mammen and A. Tsybakov, Smooth discrimination analysis, Ann. Statist, vol.27, issue.6, p.17, 1999.

P. Massart, The tight constant in the Dvoretzky-Kiefer-Wolfowitz inequality, Ann. Probab, vol.18, issue.3, p.86, 1990.

P. Massart and É. Nédélec, Risk bounds for statistical learning, Ann. Statist, vol.34, issue.5, p.37, 2006.

J. Matoušek, Lectures on discrete geometry, vol.108, 2002.

C. McDiarmid, On the method of bounded differences, Surveys in Combinatorics, pp.148-188, 1989.

A. Menon and R. C. Williamson, The cost of fairness in binary classification, Conference on Fairness, Accountability and Transparency, vol.56, p.58, 2018.

A. Menon, H. Narasimhan, S. Agarwal, and S. Chawla, On the statistical consistency of algorithms for binary classification under class imbalance, ICML, vol.28, pp.17-19, 2013.

H. Narasimhan, R. Vaish, and S. Agarwal, On the statistical consistency of plug-in classifiers for non-decomposable performance measures, NIPS, p.38, 2014.

L. Oneto, M. Donini, A. Elders, and M. Pontil, Taking advantage of multitask learning for fair classification, AAAI/ACM Conference on AI, Ethics, and Society, 2019.

L. Oneto, M. Donini, and M. Pontil, General fair empirical risk minimization, p.56, 2019.

I. Partalas, A. Kosmopoulos, M. Baskiotis, T. Artières, G. Paliouras et al., LSHTC: A benchmark for large-scale text classification, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01691460

G. Pleiss, M. Raghavan, F. Wu, J. Kleinberg, and K. Weinberger, On fairness and calibration, Neural Information Processing Systems, 2017.

W. Polonik, Measuring mass concentrations and estimating density contour clusters-an excess mass approach, Ann. Statist, vol.23, issue.3, p.95, 1995.

Y. Prabhu and M. Varma, FastXML: A fast, accurate and stable tree-classifier for extreme multi-label learning, Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, vol.28, p.32, 2014.

Y. Prabhu, A. Kag, S. Gopinath, K. Dahiya, S. Harsola et al., Extreme multi-label learning with label features for warm-start tagging, ranking and recommendation, Proceedings of the ACM International Conference on Web Search and Data Mining, p.27, 2018.

H. Ramaswamy, A. Tewari, and S. Agarwal, Consistent algorithms for multiclass classification with an abstain option, Electronic Journal of Statistics, vol.12, issue.1, pp.530-554, 2018.

P. Rigollet, Generalization error bounds in semi-supervised classification under the cluster assumption, Journal of Machine Learning Research, vol.8, issue.14, p.96, 2007.
URL : https://hal.archives-ouvertes.fr/hal-00022528

P. Rigollet and R. Vert, Optimal rates for plug-in estimators of density level sets, Bernoulli, vol.16, issue.10, 2009.

J. E. Roemer and A. Trannoy, Equality of opportunity, Handbook of income distribution, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01446191

W. Rudin, Real and complex analysis, 1987.

M. Sadinle, J. Lei, and L. Wasserman, Least ambiguous set-valued classifiers with bounded error levels, Journal of the American Statistical Association, vol.57, pp.1-12, 2018.

M. Sahami, S. Dumais, D. Heckerman, and E. Horvitz, A Bayesian approach to filtering junk E-mail, Learning for Text Categorization: Papers from the 1998 Workshop, 1998.

J. Schmidt-Hieber, Nonparametric regression using deep neural networks with ReLU activation function, p.16, 2017.

E. Scornet, G. Biau, and J. Vert, Consistency of random forests, Ann. Statist, vol.43, issue.4, pp.1716-1741, 2015.
URL : https://hal.archives-ouvertes.fr/hal-00990008

S. Shalev-Shwartz, O. Shamir, N. Srebro, and K. Sridharan, Learnability, stability and uniform convergence, Journal of Machine Learning Research, vol.11, pp.2635-2670, 2010.

J. Simon, Sobolev, Besov and Nikolskii fractional spaces: imbeddings and comparisons for vector valued spaces on an interval, Annali di Matematica Pura ed Applicata, vol.157, pp.117-148, 1990.

A. Singh, R. Nowak, and J. Zhu, Unlabeled data: Now it helps, now it doesn't, NIPS, vol.38, p.96, 2009.

S. Sobolev, Some Applications of Functional Analysis in Mathematical Physics. Translations of mathematical monographs, 1991.

C. Stone, Consistent nonparametric regression, Ann. Statist, vol.5, pp.595-620, 1977.

C. Stone, Optimal global rates of convergence for nonparametric regression, The Annals of Statistics, vol.10, pp.1040-1053, 1982.

V. Sudakov, Geometric problems of the theory of infinite-dimensional probability distributions, Trudy Mat. Inst. Steklov, vol.141, p.191, 1976.

G. Tsoumakas, I. Katakis, and I. Vlahavas, Mining multi-label data, Data mining and knowledge discovery handbook, p.27, 2009.

A. Tsybakov, Robust reconstruction of functions by the local-approximation method, Problems of Information Transmission, vol.22, pp.69-84, 1986.

A. Tsybakov, On nonparametric estimation of density level sets, Ann. Statist, vol.25, issue.3, p.95, 1997.

A. Tsybakov, Optimal aggregation of classifiers in statistical learning, Ann. Statist, vol.32, issue.1, p.142, 2004.
URL : https://hal.archives-ouvertes.fr/hal-00102142

A. Tsybakov, Introduction to nonparametric estimation, Springer Series in Statistics, vol.15, issue.14, p.105, 2009.

A. Tsybakov and S. Van-de-geer, Square root penalty: adaptation to the margin in classification and in edge estimation, The Annals of Statistics, vol.33, issue.3, pp.1203-1224, 2005.
URL : https://hal.archives-ouvertes.fr/hal-00101837

S. Vallender, Calculation of the Wasserstein distance between probability distributions on the line, Theory of Probability & Its Applications, vol.18, pp.784-786, 1974.

S. Van-de-geer, High-dimensional generalized linear models and the lasso, The Annals of Statistics, vol.36, issue.2, pp.614-645, 2008.

A. van der Vaart, Asymptotic statistics, vol.3, 1998.

C. J. van Rijsbergen, Foundation of evaluation, Journal of Documentation, vol.30, issue.4, p.35, 1974.

V. Vapnik, Statistical learning theory, vol.6, p.95, 1998.

V. Vapnik and A. Chervonenkis, On the uniform convergence of relative frequencies of events to their probabilities, Theory of Probability & Its Applications, vol.16, issue.2, pp.264-280, 1971.

R. Varshamov, Estimate of the number of signals in error correcting codes, Dokl. Akad. Nauk SSSR, vol.117, pp.739-741, 1957.

V. Vovk, On-line confidence machines are well-calibrated, Proceedings of the Forty-Third Annual Symposium on Foundations of Computer Science, pp.187-196, 2002.

V. Vovk, Asymptotic optimality of transductive confidence machine, Algorithmic learning theory, vol.2533, pp.336-350, 2002.

V. Vovk, A. Gammerman, and G. Shafer, Algorithmic learning in a random world, vol.93, p.94, 2005.

M. Wegkamp and M. Yuan, Support vector machines with a reject option, Bernoulli, vol.17, issue.4, pp.1368-1385, 2011.

J. Wellner, Empirical processes: Theory and applications, 2005.

B. Yan, S. Koyejo, K. Zhong, and P. Ravikumar, Binary classification with karmic, threshold-quasi-concave metrics, ICML, vol.80, p.62, 2018.

Y. Yang, Minimax nonparametric classification: Rates of convergence, IEEE Transactions on Information Theory, vol.45, issue.7, p.95, 1999.

S. Yao and B. Huang, Beyond parity: Fairness objectives for collaborative filtering, Neural Information Processing Systems, 2017.

N. Ye, K. Chai, W. Lee, and H. Chieu, Optimizing f-measures: A tale of two approaches, ICML, vol.35, p.38, 2012.

H. F. Yu, P. Jain, P. Kar, and I. S. Dhillon, Large-scale multi-label learning with missing labels, ICML, pp.593-601, 2014.

M. B. Zafar, I. Valera, M. G. Rodriguez, and K. P. Gummadi, Fairness beyond disparate treatment & disparate impact: Learning classification without disparate mistreatment, International Conference on World Wide Web, vol.25, p.63, 2017.

M. B. Zafar, I. Valera, M. Gomez-Rodriguez, and K. P. Gummadi, Fairness constraints: A flexible approach for fair classification, Journal of Machine Learning Research, vol.20, issue.75, pp.1-42, 2019.

R. Zemel, Y. Wu, K. Swersky, T. Pitassi, and C. Dwork, Learning fair representations, International Conference on Machine Learning, vol.55, p.56, 2013.

T. Zhang, Statistical behavior and consistency of classification methods based on convex risk minimization, The Annals of Statistics, vol.32, issue.1, pp.56-85, 2004.

M. Zhao, N. Edakunni, A. Pocock, and G. Brown, Beyond Fano's inequality: bounds on the optimal F-score, BER, and cost-sensitive risk and their implications, JMLR, vol.14, issue.13, pp.1033-1090, 2013.

I. Zliobaite, On the relation between accuracy and fairness in binary classification, 2015.