Étude de transformées temps-fréquence pour le codage audio faible retard en haute qualité (Study of time-frequency transforms for high-quality low-delay audio coding)

by David Virette (Virette, David)

Doctoral thesis in Signal Processing and Telecommunications

Supervised by Pascal Scalart.

Defended in 2012

at Rennes 1.

  • Abstract

    In recent years there has been a phenomenal increase in the number of products and applications that make use of audio coding formats. Among the most successful audio coding schemes are MPEG-1 Layer III (mp3), MPEG-2 Advanced Audio Coding (AAC), and its evolution, MPEG-4 High Efficiency Advanced Audio Coding (HE-AAC). More recently, perceptual audio coding has been adapted to achieve low delay and so become suitable for conversational applications. A filter bank such as the Modified Discrete Cosine Transform (MDCT) is traditionally a central component of perceptual audio coding, and its adaptation to low delay audio coding has become a very popular research topic. Low delay transforms have been developed in order to maintain the performance of this main component while dramatically reducing the associated algorithmic delay. This work presents a low delay block switching tool that allows a direct transition between a long transform and a short transform without inserting a transition window. The same principle has been extended to define new perfect reconstruction conditions for the MDCT, with relaxed constraints compared to the original definition. A seamless reconstruction method has been derived that increases the flexibility of transform coding schemes by making it possible to select a transform window independently of the previous and following frames. Additionally, based on this new approach, a new low delay window design procedure has been derived that yields an analytic definition of the window. These new approaches have been successfully applied to the newly developed MPEG low delay audio coding schemes (LD-AAC and ELD-AAC), significantly improving the quality for transient signals. Moreover, the low delay window design has been adopted in G.718, a scalable speech and audio codec standardized by ITU-T, where it has demonstrated its benefit in terms of delay reduction while maintaining the audio quality of a traditional MDCT.

Consult in a library

The defense version exists in paper form


  • Details: 1 vol. (194 p.)
  • Notes: Publication authorized by the jury
  • Appendices: Bibliography (91 references). Appendices

Where can this thesis be found?

  • Library: Ecole nationale supérieure des sciences appliquées et de technologie. Bibliothèque.
  • Available for interlibrary loan (PEB)
  • Call number: M 3. VIR.
See in Sudoc, the union catalog of higher education and research libraries.