How Bell invented the telephone, American Institute of Electrical Engineers, vol.34, issue.8, pp.1503-1513, 1915. ,
Alexander Graham Bell and the invention of the telephone, Electronics & Power, vol.22, issue.3, pp.159-162, 1976. ,
The evolution of untethered communications, 1998. ,
Transmission features of the new telephone sets, Bell System Technical Journal, vol.17, issue.3, pp.358-380, 1938. ,
Telephony by pulse code modulation, The Bell System Technical Journal, vol.26, issue.3, pp.395-409, 1947. ,
History of pulse code modulation, Proc. Institution of Electrical Engineers, pp.889-892, 1979. ,
, ITU-T Recommendation P.341: Transmission characteristics of national networks, 1998.
The essential guide to telecommunications, 2002. ,
Discrete-Time Speech Signal Processing: Principles and Practice, 2002. ,
, , 1987.
Acoustic correlates of some phonetic categories, The Journal of the Acoustical Society of America, vol.68, issue.3, pp.836-842, 1980. ,
Artificial bandwidth extension of narrowband speech: enhanced speech quality and intelligibility in mobile devices, 2013. ,
Digital speech transmission: Enhancement, coding and error concealment, 2006. ,
Quantifying and exploiting speech memory for the improvement of narrowband speech bandwidth extension, Canada, 2013. ,
On artificial bandwidth extension of telephone speech, Signal Processing, vol.83, issue.8, p.133, 2003. ,
Listener ratings of speech passbands, Proc. IEEE Workshop on Speech Coding For Telecommunications, pp.81-82, 1997. ,
Speech coding: A tutorial review, Proc. of the IEEE, vol.82, pp.1541-1582, 1994. ,
Development and evaluation of artificial bandwidth extension methods for narrowband telephone speech, 2013. ,
The philosophy of PCM, Proc. of the IRE, vol.36, issue.11, pp.1324-1331, 1948. ,
, ITU-T Recommendation G.711: Pulse code modulation (PCM) of voice frequencies, 2001.
, ITU-T Recommendation G.712: Transmission performance characteristics of pulse code modulation channels, 1988.
, ITU-T Recommendation G.726: 40, 32, 24, 16 kbit/s Adaptive Differential Pulse Code Modulation (ADPCM), 1990.
Foundation and evolution of standardized coders, 2003. ,
, ETSI Recommendation GSM 06.10: GSM full rate speech transcoding, 1992.
ITU-T G.729 Annex A: reduced complexity 8 kb/s CS-ACELP codec for digital simultaneous voice and data, IEEE Communications Magazine, vol.35, issue.9, pp.56-63, 1997. ,
Description of ITU-T Rec. G.729 Annex A: Reduced complexity 8 kbit/s CS-ACELP coding, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, 1997. ,
, ETSI Recommendation GSM 06.60: Digital Cellular Telecommunications System; Enhanced Full Rate (EFR) Speech Transcoding, 1996.
, ITU-T Recommendation G.729: Coding of speech at 8 kbit/s using Conjugate Structure Algebraic Code-Excited Linear-Prediction (CS-ACELP), 1996.
, ETSI Recommendation GSM 06.90: Digital Cellular Telecommunications System; Adaptive Multi-Rate (AMR) Speech Transcoding, 1998.
, 3GPP TS 26.090: Mandatory Speech Codec speech processing functions; Adaptive Multi-Rate (AMR) Speech Codec; Transcoding Functions, 2000.
, Mobile HD voice: Global Update report, 2016.
, ITU-T Recommendation G.722: 7 kHz audio-coding within 64 kbit/s, 1988.
, ITU-T Recommendation G.722.1: Low-complexity coding at 24 and 32 kbit/s for hands-free operation in systems with low frame loss, 2005.
, Speech Codec Speech Processing Functions; Adaptive Multi-Rate -Wideband (AMR-WB) speech codec; Transcoding functions, 3GPP TS, vol.26, 2002.
, ITU-T Recommendation G.722.2: Wideband Coding of Speech at Around 16 kbit/s using Adaptive Multi-Rate Wideband (AMR-WB), 2002.
The adaptive multirate wideband speech codec: system characteristics, quality advances, and deployment strategies, IEEE Communications Magazine, vol.44, issue.5, pp.59-65, 2006. ,
, ITU-T Recommendation G.729.1: G.729 Based embedded variable bit-rate coder: An 8-32 kb/s scalable wideband coder bitstream interoperable with G.729, 2006.
Recent speech coding technologies and standards, Speech and Audio Processing for Coding, Enhancement and Recognition, pp.75-109, 2015. ,
, ITU-T Recommendation G.729.1: Annex E: Superwideband scalable extension for G.729.1, 2010.
Superwideband bandwidth extension using normalized MDCT coefficients for scalable speech and audio coding, Advances in Multimedia, vol.2013, 2013. ,
, 3GPP TS 26.290: Audio Codec Processing Functions; Extended Adaptive Multi-Rate - Wideband (AMR-WB+) Codec; Transcoding functions, 2005.
AMR-WB+: a new audio coding standard for 3rd generation mobile audio services, Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, vol.2, p.1109, 2005. ,
, ITU-T Recommendation G.719: Low-complexity full-band audio coding for high-quality conversational applications, 2008.
Bandwidth extension of audio signals by spectral band replication, Proc. IEEE Benelux Workshop on Model Based Processing and Coding of Audio (MPCA), pp.53-58, 2002. ,
Spectral Band Replication, a novel approach in audio coding, 2002. ,
MPEG-4 high-efficiency AAC coding, IEEE Signal Processing Magazine, vol.25, issue.3, pp.137-142, 2008. ,
Huawei white paper, 2014. ,
, Study of use cases and requirements for enhanced voice codecs for the Evolved Packet System (EPS), 2010.
Subjective quality evaluation of the 3GPP EVS codec, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.5157-5161, 2015. ,
, 3GPP TS 26.445: Codec for Enhanced Voice Services; Detailed algorithmic description, 2014.
, 3GPP TS 26.441: Codec for Enhanced Voice Services; General overview, 2014.
, , vol.13, 2016.
Standardization of the new 3GPP EVS codec, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.5703-5707, 2015. ,
Overview of the EVS codec architecture, Proc. Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.5698-5702, 2015. ,
, Enhanced Voice Services (EVS): Market Update, Global mobile Suppliers Association (GSA), 2018.
Blind bandwidth extension of audio signals based on non-linear prediction and hidden Markov model, APSIPA Transactions on Signal and Information Processing, vol.3, p.8, 2014. ,
eAMR: Wideband speech over legacy narrowband networks, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.5110-5114, 2017. ,
Challenges of 16 kHz in acoustic pre- and post-processing for terminals, IEEE Communications Magazine, vol.44, issue.5, pp.98-104, 2006. ,
An objective evaluation methodology for blind bandwidth extension, Proc. INTERSPEECH, pp.2548-2552, 2016. ,
Development, evaluation and implementation of an artificial bandwidth extension method of telephone speech in mobile terminal, IEEE Transactions on Consumer Electronics, vol.55, issue.2, pp.780-787, 2009. ,
High-definition telephony over heterogeneous networks, 2012. ,
Speech quality while roaming in next generation networks, Proc. IEEE Int. Conf. on Communications, pp.1-5, 2009. ,
Subjective ratings of instantaneous and gradual transitions from narrowband to wideband active speech, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.4674-4677, 2010. ,
, 976: Performance characterization of the Adaptive Multi-Rate Wideband (AMR-WB) speech codec, 2003.
Investigation on neural bandwidth extension of telephone speech for improved speaker recognition, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.6111-6115, 2019. ,
DNN-based speech bandwidth expansion and its application to adding high-frequency missing features for automatic speech recognition of narrowband speech, Annual Conf. of the Int. Speech Comm. Association, 2015. ,
CycleGAN bandwidth extension acoustic modeling for automatic speech recognition, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.6780-6784, 2019. ,
Large context end-to-end automatic speech recognition via extension of hierarchical recurrent encoder-decoder models, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.5661-5665, 2019. ,
On the relevance of bandwidth extension for speaker identification, Proc. IEEE European Signal Processing Conference (EUSIPCO), pp.1-4, 2002. ,
Investigation on blind bandwidth extension with a non-linear function and its evaluation of x-vector-based speaker verification, pp.4055-4059, 2019. ,
Effect of bandwidth extension to telephone speech recognition in cochlear implant users, The Journal of the Acoustical Society of America, vol.125, issue.2, pp.77-83, 2009. ,
Artificial speech bandwidth extension improves telephone speech intelligibility and quality in cochlear implant users, The Journal of the Acoustical Society of America, vol.145, issue.3, pp.1640-1649, 2019. ,
Sound-quality improvement of broadcast telephone calls, vol.38, p.31 ,
Enhancement of band-limited speech signals, 1983. ,
Signal restoration of broad band speech using nonlinear processing, Proc. IEEE European Signal Processing Conference, pp.1-4, 1996. ,
Performance and implementation of a robust ADPCM algorithm for wideband speech coding with 64 kbit/s, Proc. Int. Zürich Seminar on Digital Communications, 1984. ,
Speech bandwidth extension, 2002. ,
Enhancement of bandlimited speech signals: Algorithms and theoretical bounds, vol.32, p.69, 2002. ,
Bandwidth extension of telephony speech, Speech and Audio Processing in Adverse Environments, pp.135-184, 2008. ,
Speech analysis synthesis and perception, 1972. ,
Narrowband to wideband conversion of speech using GMM based transformation, Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp.1843-1846, 2000. ,
Statistical recovery of wideband speech from narrowband speech, IEEE Transactions on Speech and Audio Processing, vol.2, issue.4, pp.544-548, 1994. ,
Dual-mode wideband speech recovery from narrowband speech, Proc. INTERSPEECH, 2003. ,
HMM-based frequency bandwidth extension for speech enhancement using line spectral frequencies, Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, vol.1, p.709, 2004. ,
Mel-frequency cepstral coefficient-based bandwidth extension of narrowband speech, Proc. INTERSPEECH, pp.53-56, 2008. ,
Artificial bandwidth extension using deep neural networks for spectral envelope estimation, Proc. of Int. Workshop on Acoustic Signal Enhancement (IWAENC), pp.1-5, 2016. ,
Artificial bandwidth extension of telephone speech signals using phonetic a priori knowledge, 2017. ,
Bandwidth enhancement of narrow-band speech signals, Proc. IEEE European signal processing conference (EUSIPCO), pp.1178-1181, 1994. ,
Speech bandwidth extension method and apparatus, vol.455, 1995. ,
An algorithm to reconstruct wideband speech from narrowband speech based on codebook mapping, Proc. Int. Conf. on Spoken Language Processing, 1994. ,
Spectrum broadening of telephone band signals using multirate processing for speech quality enhancement, IEICE Transactions on Fundamentals of Electronics, vol.78, issue.8, pp.996-998, 1995. ,
Wideband extension of narrowband speech for enhancement and coding, 2000. ,
An algorithm for vector quantizer design, IEEE Transactions on communications, vol.28, issue.1, pp.84-95, 1980. ,
A new technique for wideband enhancement of coded narrowband speech, IEEE Workshop on Speech Coding Proceedings. Model, Coders, and Error Criteria (Cat. No. 99EX351), vol.34, p.33, 1999. ,
Audio bandwidth extension: application of psychoacoustics, signal processing and loudspeaker design, 2005. ,
Low-band extension of telephone-band speech, Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, pp.1851-1854, 2000. ,
Beyond Nyquist: Towards the recovery of broad-bandwidth speech from narrow-bandwidth speech, Proc. Fourth European Conference on Speech Communication and Technology, 1995. ,
Generation of broadband speech from narrowband speech using piecewise linear mapping, 1997. ,
Robust text-independent speaker identification using Gaussian mixture speaker models, IEEE Transactions on speech and audio processing, vol.3, issue.1, pp.72-83, 1995. ,
Spectral voice conversion for text-to-speech synthesis, Proc. Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.285-288, 1998. ,
Statistical methods for voice quality transformation, Fourth European Conference on Speech Communication and Technology, 1995. ,
Combining equalization and estimation for bandwidth extension of narrowband speech, Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, vol.1, p.713, 2004. ,
The effect of memory inclusion on mutual information between speech frequency bands, Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, vol.3, 2006. ,
Objective analysis of the effect of memory inclusion on bandwidth extension of narrowband speech, Proc. INTERSPEECH, pp.2489-2492, 2007. ,
Combining frontend-based memory with MFCC features for bandwidth extension of narrowband speech, Proc. Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.4001-4004, 2009. ,
Speech bandwidth extension using Gaussian mixture model-based estimation of the highband mel spectrum, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.5100-5103, 2011. ,
Wideband extension of telephone speech using a hidden Markov model, IEEE Workshop on Speech Coding Proceedings. Meeting the Challenges of the New Millennium, pp.133-135, 2000. ,
Artificial bandwidth extension of speech signals using MMSE estimation based on a hidden Markov model, Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. I-I, 2003. ,
A statistical framework for artificial bandwidth extension exploiting speech waveform and phonetic transcription, IEEE European Signal Processing Conference (EUSIPCO), pp.1839-1843, 2009. ,
On improving speech intelligibility in automotive hands-free systems, Proc. IEEE Int. Symposium on Consumer Electronics (ISCE 2010), pp.1-5, 2010. ,
On improving telephone speech intelligibility for hearing impaired persons, pp.1-4, 2012. ,
Impact of hearing impairment on fricative intelligibility for artificially bandwidth-extended telephone speech in noise, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, pp.7039-7043, 2013. ,
HMM-based artificial bandwidth extension supported by neural networks, Proc. IEEE Int. Workshop on Acoustic Signal Enhancement (IWAENC), pp.1-5, 2014. ,
Evaluation of a speech bandwidth extension algorithm based on vocal tract shape estimation, Proc. IEEE Int. Workshop on Acoustic Signal Enhancement (IWAENC), pp.1-4, 2012. ,
Speech signal band width extension and noise removal using subband HMM, Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. I-245, 2002. ,
A codebook design method for fricative enhancement in artificial bandwidth extension, Proc. Int. Mobile Multimedia Communications Conf., p. 39, ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering), 2009. ,
Vocal tract area based artificial bandwidth extension, IEEE Workshop on Machine Learning for Signal Processing, pp.480-485, 2008. ,
Sparse probabilistic state mapping and its application to speech bandwidth expansion, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.4005-4008, 2009. ,
Artificial bandwidth extension of spectral envelope along a Viterbi path, Speech Communication, vol.55, issue.1, pp.111-118, 2013. ,
Artificial bandwidth extension of spectral envelope with temporal clustering, Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp.5096-5099, 2011. ,
Language informed bandwidth expansion, Proc. IEEE Int. Workshop on Machine Learning for Signal Processing, pp.1-6, 2012. ,
Neural networks versus codebooks in an application for bandwidth extension of speech signals, Eighth European Conference on Speech Communication and Technology, 2003. ,
Deep neural networks for acoustic modeling in speech recognition, IEEE Signal processing magazine, vol.29, 2012. ,
Artificial speech bandwidth extension using deep neural networks for wideband spectral envelope estimation, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.26, pp.71-83, 2018. ,
Artificial bandwidth extension using deep neural network-based spectral envelope estimation and enhanced excitation estimation, IET Signal Processing, vol.10, issue.4, pp.422-427, 2016. ,
Long short-term memory recurrent-neural-network-based bandwidth extension for automatic speech recognition, Acoustical Science and Technology, vol.37, issue.6, pp.319-321, 2016. ,
Recurrent neural network for spectral mapping in speech bandwidth extension, Proc. IEEE Global Conf. on Signal and Information Processing (GlobalSIP), pp.242-246, 2016. ,
Blind bandwidth extension based on convolutional and recurrent deep neural networks, International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.5444-5448, 2018. ,
Using conditional restricted Boltzmann machines for spectral envelope modeling in speech bandwidth extension, Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, pp.5930-5934, 2016. ,
Speech bandwidth extension using recurrent temporal restricted Boltzmann machines, IEEE Signal Processing Letters, vol.23, issue.12, pp.1877-1881, 2016. ,
High-frequency regeneration in speech coding systems, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, vol.4, pp.428-431, 1979. ,
Techniques for the regeneration of wideband speech from narrowband speech, EURASIP Journal on Applied Signal Processing, vol.2001, issue.1, pp.266-274, 2001. ,
Enhancement of band-limited speech signals, Proc. Aachen Symposium on Signal Theory, pp.331-336, 2001. ,
Spectral widening of the excitation signal for telephone-band speech enhancement, Proc. Int. Workshop on Acoustic Echo and Noise Control, pp.215-218, 2001. ,
Synchronous overlap and add of spectra for enhancement of excitation in artificial bandwidth extension of speech, Proc. INTERSPEECH, pp.2588-2592, 2015. ,
Bandwidth extension of speech signals, 2008. ,
The effect of highband harmonic structure in the artificial bandwidth expansion of telephone speech, Proc. Annual Conf. of the Int. Speech Communication Association, 2007. ,
A novel implementation of the spectral shaping approach for artificial bandwidth extension, Int. Conf. on Communications and Electronics, pp.262-267, 2010. ,
Neural network-based artificial bandwidth expansion of speech, IEEE Transactions on Audio, Speech, and language processing, vol.15, issue.3, pp.873-881, 2007. ,
Bandwidth extension of telephone speech using a neural network and a filter bank implementation for highband mel spectrum, IEEE Transactions on Audio, Speech, and Language Processing, vol.19, issue.7, pp.2170-2183, 2011. ,
Frequency recovery of narrow-band speech using adaptive spline neural networks, Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, pp.997-1000, 1999. ,
Artificial bandwidth expansion method to improve intelligibility and quality of AMR-coded narrowband speech, Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, vol.1, p.809, 2005. ,
Bandwidth extension of telephone speech using a filter bank implementation for highband mel spectrum, 18th European Signal Processing Conference, pp.979-983, 2010. ,
A deep neural network approach to speech bandwidth expansion, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.4395-4399, 2015. ,
A novel method of artificial bandwidth extension using deep architecture, Proc Annual Conf. of Int. Speech Communication Association, 2015. ,
A novel research to artificial bandwidth extension based on deep BLSTM recurrent neural networks and exemplar-based sparse representation, 2016. ,
Speech bandwidth extension using bottleneck features and deep recurrent neural networks, Proc. INTERSPEECH, pp.297-301, 2016. ,
Improved bottleneck features using pretrained deep neural networks, Proc. Annual Conf. of the Int. Speech Communication Association, 2011. ,
Modeling speech with sum-product networks: Application to bandwidth extension, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, pp.3699-3703, 2014. ,
On representation learning for artificial bandwidth extension, Proc. Annual Conf. of the Int. Speech Communication Association, 2015. ,
Joint dictionary training for bandwidth extension of speech signals, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, pp.5925-5929, 2016. ,
Signal estimation from modified short-time Fourier transform, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.32, issue.2, pp.236-243, 1984. ,
WaveNet: A generative model for raw audio, vol.125, 2016. ,
Multi-scale context aggregation by dilated convolutions, 2015. ,
Semantic image segmentation with deep convolutional nets and fully connected CRFs, 2014. ,
Waveform modeling using stacked dilated convolutional neural networks for speech bandwidth extension, Proc. INTERSPEECH, pp.1123-1127, 2017. ,
SampleRNN: An unconditional end-to-end neural audio generation model, 2016. ,
Waveform modeling and generation using hierarchical recurrent neural networks for speech bandwidth extension, IEEE Transactions on Audio, Speech, and Language Processing, vol.26, issue.5, pp.883-894, 2018. ,
Audio super resolution using neural networks, 2017. ,
Time-frequency networks for audio super-resolution, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.646-650, 2018. ,
FFTNet: A real-time speaker-dependent neural vocoder, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.2251-2255, 2018. ,
Learning bandwidth expansion using perceptually-motivated loss, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.606-610, 2019. ,
Generative adversarial nets, Proc. Advanced Neural Information Processing Systems (NIPS), pp.2672-2680, 2014. ,
Artificial bandwidth extension using a conditional generative adversarial network with discriminative training, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.7005-7009, 2019. ,
Image-to-image translation with conditional adversarial networks, Proc. IEEE Conf. on computer vision and pattern recognition, pp.1125-1134, 2017. ,
Speech super resolution generative adversarial network, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.3717-3721, 2019. ,
Stabilizing training of generative adversarial networks through regularization, Proc. Advances in Neural Information Processing Systems (NIPS), 2017. ,
Unpaired image-to-image translation using cycle-consistent adversarial networks, Proc. IEEE Int. Conf. on computer vision, pp.2223-2232, 2017. ,
Avoiding over-estimation in bandwidth extension of telephony speech, Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp.869-872, 2001. ,
Discriminative training of deep regression networks for artificial bandwidth extension, IEEE Int. Workshop on Acoustic Signal Enhancement (IWAENC), pp.540-544, 2018. ,
Gaussian mixture model based mutual information estimation between frequency bands in speech, Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, vol.1, p.525, 2002. ,
Feature selection for improved bandwidth extension of speech signals, Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), p.697, 2004. ,
Wrappers for feature subset selection, Artificial intelligence, vol.97, issue.1-2, pp.273-324, 1997. ,
Feature selection for DNN-based bandwidth extension, Proc. Jahrestagung für Akustik (DAGA), 2018. ,
Memory-based approximation of the Gaussian mixture model framework for bandwidth extension of narrowband speech, Annual Conf. of the Int. Speech Communication Association, 2011. ,
Assessment and prediction of speech quality in telecommunications, vol.46, p.44, 2000. ,
Speech quality assessment, pp.623-654, 2011. ,
Speech Coding: with Code-Excited Linear Prediction, 2017. ,
, ITU-T Recommendation P.800: Methods for subjective determination of transmission quality, 1996.
, ITU-T Recommendation P.805: Subjective evaluation of conversational quality, 2007.
Conversational quality evaluation of artificial bandwidth extension of telephone speech, The Journal of the Acoustical Society of America, vol.132, issue.2, p.45, 2012. ,
, ITU-T Recommendation P.800.1: Mean opinion score (MOS) terminology, 2016.
Integral and diagnostic intrusive prediction of speech quality, 2011. ,
On the evaluation of the conversational speech quality in telecommunications, EURASIP Journal on Advances in Signal Processing, p.93, 2008. ,
Speech quality of VoIP: assessment and prediction, 2007. ,
Speech and audio processing in adverse environments, 2008. ,
Correlation analysis of subjective and objective measures for speech quality, Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, vol.5, pp.706-709, 1980. ,
, ITU-T Recommendation P.862: Perceptual evaluation of speech quality (PESQ): An objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs, 2001.
, ITU-T Recommendation P.862.2: Wideband extension to Recommendation P.862 for the assessment of wideband telephone networks and speech codecs, 2005.
, ITU-T Recommendation P.863: Perceptual objective listening quality assessment, 2011.
An instrumental quality measure for artificially bandwidth-extended speech signals, IEEE Transactions on Audio, Speech and Language Processing, vol.25, issue.2, pp.384-396, 2017. ,
Quality improvement of telephone speech by artificial bandwidth expansion-listening tests in three languages, Ninth Int. Conf. on Spoken Language Processing, 2006. ,
Evaluation of an artificial speech bandwidth extension method in three languages, IEEE Transactions on Audio, Speech, and Language Processing, vol.16, issue.6, pp.1124-1137, 2008. ,
Speech quality prediction for artificial bandwidth extension algorithms, Proc. INTERSPEECH, pp.3439-3443, 2013. ,
On speech quality assessment of artificial bandwidth extension, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.6082-6086, 2014. ,
Speech quality evaluation of artificial bandwidth extension: Comparing subjective judgments and instrumental predictions, Proc. Annual Conf. of the Int. Speech Communication Association, 2015. ,
A subjective listening test of six different artificial bandwidth extension approaches in English, Chinese, German, and Korean, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.5915-5919, 2016. ,
High frequency reconstruction of audio signal based on chaotic prediction theory, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.381-384, 2010. ,
Audio bandwidth extension based on RBF neural network, IEEE Int. Symposium on Signal Processing and Information Technology (ISSPIT), pp.150-154, 2011. ,
Spectral envelope estimation used for audio bandwidth extension based on RBF neural network, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.543-547, 2013. ,
A blind bandwidth extension method of audio signals based on Volterra series, Proc. Asia Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference, pp.1-4, 2012. ,
Nonlinear time series analysis, 2004. ,
A blind bandwidth extension method for audio signals based on phase space reconstruction, EURASIP Journal on Audio, Speech, and Music Processing, vol.2014, pp.1-9, 2014. ,
Nonlinear bandwidth extension based on nearest-neighbor matching, Proc. Asia Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference, pp.169-172, 2010. ,
Nonlinear bandwidth extension of audio signals based on hidden Markov model, IEEE Int. Symposium on Signal Processing and Information Technology (ISSPIT), pp.144-149, 2011. ,
Audio bandwidth extension based on ensemble echo state networks with temporal evolution, IEEE Transactions on Audio, Speech and Language Processing, vol.24, issue.3, pp.594-607, 2016. ,
Audio bandwidth extension based on temporal smoothing cepstral coefficients, EURASIP Journal on Audio, Speech, and Music Processing, vol.2014, issue.1, pp.1-16, 2014. ,
Efficient high-frequency bandwidth extension of music and speech, Audio Engineering Society Convention, 2002. ,
Enhancing the EVS Codec in Wideband Mode by Blind Artificial Bandwidth Extension to Superwideband, Proc. IEEE Int. Workshop on Acoustic Signal Enhancement (IWAENC), pp.281-285, 2018. ,
Artificial bandwidth extension of wideband speech by pitch-scaling of higher frequencies, INFORMATIK 2013-Informatik angepasst an Mensch, 2013. ,
Superwideband extension for AMR-WB using conditional codebooks, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.3695-3698, 2014. ,
Spectral linear prediction: Properties and applications, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.23, issue.3, pp.283-296, 1975. ,
Linear prediction of speech, 2013. ,
Pattern recognition and machine learning, 2006. ,
Fundamentals of statistical signal processing, 1993. ,
The STFT, Sinusoidal Models, and Speech Modification, Springer handbook of speech processing, pp.229-258, 2007. ,
Applied Signal Processing: A MATLAB-Based Proof of Concept, 2010. ,
DARPA TIMIT acoustic-phonetic continuous speech corpus CD-ROM. NIST speech disc 1-1.1, NASA STI/Recon technical report N, vol.93, 1993. ,
Database Version : 1.0, pp.2-10, 2002. ,
IEEE recommended practice for speech quality measurements, IEEE Transactions on Audio and Electroacoustics, vol.17, pp.225-246, 1969. ,
CMU ARCTIC databases for speech synthesis, 2003. ,
The Blizzard challenge 2005: Evaluating corpus-based speech synthesis on common databases, Proc. INTERSPEECH, 2005. ,
, ITU-T Recommendation P.501: Test signals for use in telephonometry, 2012.
The AFsp package, 2002. ,
, ITU-T Recommendation G.191: Software tools for speech and audio coding standardization, 2005.
, ITU-T Recommendation G.191: Software Tool Library, 2009.
, ITU-T Recommendation P.56: Objective measurement of active speech level, 2011.
Fundamentals of speech recognition, 1993. ,
Distance measures for speech processing, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.24, issue.5, pp.380-391, 1976. ,
, ITU-T Recommendation P.862.1: Mapping function for transforming P.862 raw result scores to MOS-LQO, 2003.
Elements of information theory, 2012. ,
An overview of text-independent speaker recognition: From features to supervectors, Speech communication, vol.52, issue.1, pp.12-40, 2010. ,
URL : https://hal.archives-ouvertes.fr/hal-00587602
Principal component analysis, 2002. ,
Extracting deep bottleneck features using stacked auto-encoders, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.3377-3381, 2013. ,
Auto-encoder bottleneck features using deep belief networks, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.4153-4156, 2012. ,
A deep auto-encoder based low-dimensional feature extraction from FFT spectral envelopes for statistical parametric speech synthesis, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.5535-5539, 2016. ,
Novel subband autoencoder features for detection of spoofed speech, pp.1820-1824, 2016. ,
Binary coding of speech spectrograms using a deep auto-encoder, Proc. Annual Conf. of Int. Speech Communication Association, 2010. ,
Semi-supervised training of a voice conversion mapping function using a joint-autoencoder, Proc. Annual Conf. of Int. Speech Communication Association, 2015. ,
A voice conversion mapping function based on a stacked joint-autoencoder, Proc. INTERSPEECH, pp.1647-1651, 2016. ,
Deep Learning, 2016. ,
Neural networks and principal component analysis: Learning from examples without local minima, Neural networks, vol.2, issue.1, pp.53-58, 1989. ,
Auto-association by multilayer perceptrons and Singular Value Decomposition, Biological cybernetics, vol.59, issue.4-5, pp.291-294, 1988. ,
Nonlinear autoassociation is not equivalent to PCA, Neural computation, vol.12, issue.3, pp.531-545, 2000. ,
Scaling learning algorithms towards AI, vol.34, pp.1-41, 2007. ,
Learning deep architectures for AI, Foundations and Trends in Machine Learning, vol.2, pp.1-127, 2009. ,
Reducing the dimensionality of data with neural networks, Science, vol.313, issue.5786, pp.504-507, 2006. ,
Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion, Journal of Machine Learning Research, vol.11, pp.3371-3408, 2010. ,
Greedy layer-wise training of deep networks, Proc. Advances in Neural Information Processing Systems (NIPS), pp.153-160, 2007. ,
The difficulty of training deep architectures and the effect of unsupervised pretraining, Artificial Intelligence and Statistics, pp.153-160, 2009. ,
Why does unsupervised pre-training help deep learning?, Journal of Machine Learning Research, vol.11, pp.625-660, 2010. ,
Understanding the difficulty of training deep feedforward neural networks, Proc. Int. Conf. on Artificial Intelligence and Statistics, pp.249-256, 2010. ,
Quadratic polynomials learn better image features, Technical Report 1337, 2009. ,
Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification, Proc. IEEE Int. Conf. on Computer Vision, pp.1026-1034, 2015. ,
Deep sparse rectifier neural networks, Proc. Int. Conf. on Artificial Intelligence and Statistics, pp.315-323, 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00752497
Rectifier nonlinearities improve neural network acoustic models, ICML Workshop on Deep Learning for Audio, Speech, and Language Processing, 2013. ,
Fast and accurate deep network learning by exponential linear units (ELUs), 2015. ,
ImageNet classification with deep convolutional neural networks, Proc. Advances in Neural Information Processing Systems (NIPS), pp.1097-1105, 2012. ,
On rectified linear units for speech processing, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.3517-3521, 2013. ,
Improving neural networks by preventing co-adaptation of feature detectors, 2012. ,
Dropout: A simple way to prevent neural networks from overfitting, The Journal of Machine Learning Research, vol.15, issue.1, pp.1929-1958, 2014. ,
Batch normalization: Accelerating deep network training by reducing internal covariate shift, Proc. Int. Conf. on Machine Learning (ICML), pp.448-456, 2015. ,
CNN-based joint mapping of short and long utterance i-vectors for speaker verification using short utterances, Proc. INTERSPEECH, pp.3712-3716, 2017. ,
Keras, 2015. ,
Adam: A method for stochastic optimization, 2014. ,
Auto-encoding variational Bayes, 2013. ,
, Tutorial on variational autoencoders, 2016.
Attribute2image: Conditional image generation from visual attributes, Proc. European Conference on Computer Vision, pp.776-791, 2016. ,
Learning structured output representation using deep conditional generative models, Proc. Advances in Neural Information Processing Systems (NIPS), pp.3483-3491, 2015. ,
Semi-supervised learning with deep generative models, Proc. Advances in Neural Information Processing Systems (NIPS), pp.3581-3589, 2014. ,
Learning latent representations for speech generation and transformation, pp.1273-1277, 2017. ,
Modeling and transforming speech using variational autoencoders, Proc. INTERSPEECH, pp.1770-1774, 2016. ,
Voice conversion from non-parallel corpora using variational auto-encoder, Proc. Signal and Information Processing Association Annual Summit and Conference (APSIPA), pp.1-6, 2016. ,
Expressive speech synthesis via modeling expressions with variational autoencoder, pp.3067-3071, 2018. ,
Joint learning using denoising variational autoencoders for voice activity detection, pp.1210-1214, 2018. ,
Variational autoencoders for learning latent representations of speech emotion, pp.3107-3111, 2018. ,
Monoaural audio source separation using variational autoencoders, pp.3489-3493, 2018. ,
Discrete-time signal processing, 1989. ,
Digital signal processing: Principles, algorithms and applications, Pearson Education India, 2001. ,
, 3GPP TS 26.442: Codec for Enhanced Voice Services, 2015.
, ITU-T Recommendation P.341: Transmission characteristics for wideband digital loudspeaking and hands-free telephony terminals, 2011.
, 3GPP TS 26.173: ANSI-C code for the Adaptive Multi-Rate - Wideband (AMR-WB) speech codec, 2002.
Voice quality evaluation of various codecs, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp.4662-4665, 2010. ,
Level discrimination as a function of level for tones from 0.25 to 16 kHz, The Journal of the Acoustical Society of America, vol.81, issue.5, pp.1528-1541, 1987. ,