3GPP, Technical Report: 3GPP TR 38.913 v15.0.0: Study on Scenarios and Requirements for Next Generation Access Technologies, 2018.

P. Schulz, M. Matthe, H. Klessig, M. Simsek, G. Fettweis et al., Latency critical IoT applications in 5G: Perspective on the design of radio interface and network architecture, IEEE Communications Magazine, vol.55, issue.2, pp.70-78, 2017.

N. A. Johansson, Y. E. Wang, E. Eriksson, and M. Hessler, Radio access for ultra-reliable and low-latency 5G communications, 2015 IEEE International Conference on Communication Workshop (ICCW), pp.1184-1189, 2015.

J. Choi, V. Va, N. Gonzalez-Prelcic, R. Daniels, C. R. Bhat et al., Millimeter-wave vehicular communication to support massive automotive sensing, IEEE Communications Magazine, vol.54, issue.12, pp.160-167, 2016.

M. A. Lema, K. Antonakoglou, F. Sardis, N. Sornkarn, M. Condoluci et al., 5G case study of Internet of Skills: Slicing the human senses, 2017 European Conference on Networks and Communications (EuCNC), pp.1-6, 2017.

A. Ghosh, 5G mmWave Revolution and New Radio, 2017.

ITU-T, The Tactile Internet, ITU-T Technology Watch Report, 2014.

M. Simsek, A. Aijaz, M. Dohler, J. Sachs, and G. Fettweis, The 5G-enabled tactile internet: Applications, requirements, and architecture, pp.1-6, 2016.

U.S. Dept. of Energy, Communications requirements of smart grid technologies, 2010.

I. Parvez, A. Rahmati, I. Guvenc, A. I. Sarwat, and H. Dai, A survey on low latency towards 5G: RAN, core network and caching solutions, IEEE Communications Surveys & Tutorials, vol.20, issue.4, pp.3098-3130, 2018.

G. P. Fettweis, The tactile internet: Applications and challenges, IEEE Vehicular Technology Magazine, vol.9, issue.1, pp.64-70, 2014.

M. Maier, M. Chowdhury, B. P. Rimal, and D. P. Van, The tactile internet: Vision, recent progress, and open challenges, IEEE Communications Magazine, vol.54, issue.5, pp.138-145, 2016.

P. Popovski, J. J. Nielsen, C. Stefanovic, E. de Carvalho, E. Ström et al., Wireless access for ultra-reliable low-latency communication: Principles and building blocks, IEEE Network, vol.32, issue.2, pp.16-23, 2018.

ITU-R, Minimum requirements related to technical performance for IMT-2020 radio interface(s), 2017.

Y. Polyanskiy, Channel coding: Non-asymptotic fundamental limits, 2010.

M. Hayashi, Information spectrum approach to second-order coding rate in channel coding, IEEE Trans. on Inf. Theory, vol.55, issue.11, pp.4947-4966, 2009.

B. Makki, T. Svensson, and M. Zorzi, Finite block-length analysis of the incremental redundancy HARQ, IEEE Wireless Commun. Lett, vol.3, issue.5, pp.529-532, 2014.

P. Wu and N. Jindal, Coding versus ARQ in fading channels: How reliable should the PHY be, IEEE Trans. on Commun, vol.59, issue.12, pp.3363-3374, 2011.

S. H. Kim, D. K. Sung, and T. Le-Ngoc, Performance analysis of incremental redundancy type hybrid ARQ for finite-length packets in AWGN channel, Proc. IEEE Global Commun. Conf. (Globecom), 2013.

A. R. Williamson, T. Chen, and R. D. Wesel, A rate-compatible sphere-packing analysis of feedback coding with limited retransmissions, Proc. IEEE ISIT, 2012.

S. Xu, T. Chang, S. Lin, C. Shen, and G. Zhu, Energy-efficient packet scheduling with finite blocklength codes: Convexity analysis and efficient algorithms, IEEE Trans. on Wireless Commun, vol.15, issue.8, pp.5527-5540, 2016.

O. L. López, H. Alves, and M. Latva-aho, Joint power control and rate allocation enabling ultra-reliability and energy efficiency in SIMO wireless networks, IEEE Trans. on Commun, vol.67, issue.8, pp.5768-5782, 2019.

L. Szczecinski, S. R. Khosravirad, P. Duhamel, and M. Rahman, Rate allocation and adaptation for incremental redundancy truncated HARQ, IEEE Trans. on Commun, vol.61, issue.6, pp.2580-2590, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00827950

R. Devassy, G. Durisi, G. C. Ferrante, O. Simeone, and E. Uysal, Reliable transmission of short packets through queues and noisy channels under latency and peak-age violation guarantees, 2019.

Y. Hu, M. Ozmen, M. C. Gursoy, and A. Schmeink, Optimal power allocation for QoS-constrained downlink multi-user networks in the finite blocklength regime, IEEE Trans. on Wireless Commun, vol.17, issue.9, pp.5827-5840, 2018.

H. Wang, N. Wong, A. M. Baldauf, C. K. Bachelor, S. V. Ranganathan et al., An information density approach to analyzing and optimizing incremental redundancy with feedback, Proc. IEEE Int. Symp. Inf. Theory, 2017.

K. F. Trillingsgaard and P. Popovski, Generalized HARQ protocols with delayed channel state information and average latency constraints, IEEE Trans. on Inf. Theory, vol.64, issue.2, pp.1262-1280, 2017.

D. Djonin, A. Karmokar, and V. Bhargava, Joint rate and power adaptation for type-I hybrid ARQ systems over correlated fading channels under different buffer-cost constraints, IEEE Trans. Veh. Technol, vol.57, issue.1, pp.421-435, 2008.

E. Visotsky, V. Tripathi, and M. Honig, Optimum ARQ design: A dynamic programming approach, Proc. IEEE Int. Symp. Inf. Theory, 2003.

M. Jabi, M. Benjillali, L. Szczecinski, and F. Labeau, Energy efficiency of adaptive HARQ, IEEE Trans. on Commun, vol.64, issue.2, pp.818-831, 2016.

J. Gibson, The Communications Handbook, 2002.

G. Caire and D. Tuninetti, The throughput of hybrid ARQ protocols for the Gaussian collision channel, IEEE Trans. on Inf. Theory, vol.47, issue.5, pp.1971-1988, 2001.

C. L. Martret, A. Leduc, S. Marcille, and P. Ciblat, Analytical performance derivation of hybrid ARQ schemes at IP layer, IEEE Trans. on Commun, vol.60, issue.5, pp.1305-1314, 2012.
URL : https://hal.archives-ouvertes.fr/hal-02286368

J. Park and D. Park, A new power allocation method for parallel AWGN channels in the finite block length regime, IEEE Commun. Lett, vol.16, issue.9, pp.1392-1395, 2012.

Y. Polyanskiy, H. V. Poor, and S. Verdú, Feedback in the non-asymptotic regime, IEEE Trans. on Inf. Theory, vol.57, issue.8, pp.4903-4925, 2011.

K. Vakilinia, S. V. Ranganathan, D. Divsalar, and R. D. Wesel, Optimizing transmission lengths for limited feedback with nonbinary LDPC examples, IEEE Trans. on Commun, vol.64, issue.6, pp.2245-2257, 2016.

A. Martinez and A. G. Fàbregas, Saddlepoint approximation of random-coding bounds, Proc. Inf. Theory Applicat. Workshop (ITA), 2011.

S. Khosravirad and H. Viswanathan, Analysis of feedback error in automatic repeat request, 2017.

R. Wolff, Stochastic modeling and the theory of queues, 1989.

C. C. Tan and N. C. Beaulieu, On first-order Markov modeling for the Rayleigh fading channel, IEEE Transactions on Communications, vol.48, issue.12, pp.2032-2040, 2000.

M. Frank and P. Wolfe, An algorithm for quadratic programming, Naval Research Logistics Quarterly, vol.3, issue.1-2, pp.95-110, 1956.

A. H. Nuttall, Some integrals involving the Q_M function, IEEE Transactions on Information Theory, vol.21, issue.1, pp.95-96, 1975.

J. E. Mitchell, Branch-and-cut algorithms for combinatorial optimization problems, vol.1, pp.65-77, 2002.

R. Bellman, A Markovian decision process, Journal of Mathematics and Mechanics, pp.679-684, 1957.

K. J. Åström, Optimal control of Markov processes with incomplete state information I, Journal of Mathematical Analysis and Applications, vol.10, pp.174-205, 1965.

L. P. Kaelbling, M. L. Littman, and A. R. Cassandra, Planning and acting in partially observable stochastic domains, Artificial Intelligence, vol.101, issue.1-2, pp.99-134, 1998.

T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez et al., Continuous control with deep reinforcement learning, International Conference on Learning Representations, ICLR, 2016.

D. Silver, G. Lever, N. Heess, T. Degris, D. Wierstra et al., Deterministic policy gradient algorithms, Proceedings of the 31st International Conference on Machine Learning, PMLR, vol.32, pp.387-395, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00938992

V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness et al., Human-level control through deep reinforcement learning, Nature, vol.518, issue.7540, pp.529-533, 2015.

S. C. Jaquette, Markov decision processes with a new optimality criterion: Discrete time, The Annals of Statistics, vol.1, issue.3, pp.496-505, 1973.

M. G. Bellemare, W. Dabney, and R. Munos, A distributional perspective on reinforcement learning, International Conference on Machine Learning, ICML, 2017.

W. Dabney, M. Rowland, M. G. Bellemare, and R. Munos, Distributional reinforcement learning with quantile regression, Thirty-Second AAAI Conference on Artificial Intelligence, 2018.

R. Koenker and K. F. Hallock, Quantile regression, Journal of Economic Perspectives, vol.15, issue.4, pp.143-156, 2001.

R. S. Sutton, Learning to predict by the methods of temporal differences, Machine Learning, vol.3, issue.1, pp.9-44, 1988.

V. François-Lavet, P. Henderson, R. Islam, M. G. Bellemare, and J. Pineau, An introduction to deep reinforcement learning, Foundations and Trends® in Machine Learning, vol.11, pp.219-354, 2018.

G. Barth-Maron, M. W. Hoffman, D. Budden, W. Dabney, D. Horgan et al., Distributed distributional deterministic policy gradients, International Conference on Learning Representations, ICLR, 2018.

Z. Wang, T. Schaul, M. Hessel, H. van Hasselt, M. Lanctot et al., Dueling network architectures for deep reinforcement learning, International Conference on Machine Learning, ICML, 2016.

M. Zaheer, S. Kottur, S. Ravanbakhsh, B. Poczos, R. R. Salakhutdinov et al., Deep sets, Advances in Neural Information Processing Systems, vol.30, pp.3391-3401, 2017.

N. Keriven and G. Peyré, Universal invariant and equivariant graph neural networks, Advances in Neural Information Processing Systems, pp.7090-7099, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02484980

I. Liu, R. A. Yeh, and A. G. Schwing, PIC: Permutation invariant critic for multi-agent deep reinforcement learning, 2019.

R. Lowe, Y. Wu, A. Tamar, J. Harb, P. Abbeel et al., Multi-agent actor-critic for mixed cooperative-competitive environments, Advances in Neural Information Processing Systems, 2017.

J. K. Gupta, M. Egorov, and M. Kochenderfer, Cooperative multi-agent control using deep reinforcement learning, International Conference on Autonomous Agents and Multiagent Systems, 2017.

J. N. Foerster, G. Farquhar, T. Afouras, N. Nardelli, and S. Whiteson, Counterfactual multi-agent policy gradients, Thirty-Second AAAI Conference on Artificial Intelligence, 2018.

3GPP, 3GPP TR 36.913 v15.0.0: Requirements for further advancements for E-UTRA (LTE-Advanced), 2018.

S. Ioffe and C. Szegedy, Batch normalization: Accelerating deep network training by reducing internal covariate shift, 2015.

G. Hinton, Neural networks for machine learning, lecture 6a: Overview of mini-batch gradient descent, 2012.

S. Omidshafiei, J. Pazis, C. Amato, J. P. How, and J. Vian, Deep decentralized multi-task multi-agent reinforcement learning under partial observability, International Conference on Machine Learning, ICML, Sydney, 2017.