This repository provides the official implementation of our paper “Generative Urdu Speech Synthesis,” published in the IEEE Conference Proceedings, 2024.
IEEE Xplore: https://ieeexplore.ieee.org/document/10795832
DOI: 10.1109/ICCS62594.2024.10795832
For suggestions, feel free to email me at: ahanzala[dot]cs[at]gmail[dot]com
In recent years, Natural Language Processing (NLP) and speech synthesis have made significant progress, leading to advanced Text-to-Speech (TTS) systems for a variety of applications. While many TTS models excel at synthesizing English speech, their adaptability to new languages and diverse accents remains a challenging area of exploration. Urdu is spoken by millions of people around the globe, especially in South Asia. Existing TTS models focus mainly on English and Chinese, with minimal attention to Urdu and other low-resource languages. In this paper, we propose a generative Urdu TTS system. We also investigate the challenges associated with Urdu speech synthesis and evaluate the capabilities of Tortoise-TTS, a TTS model inspired by the DALL-E architecture, when applied to non-English languages, with a primary focus on Urdu.