Bengali Text-to-IPA Conversion
Bengali is among the most widely spoken native languages in the world, yet Bengali text-to-IPA (International Phonetic Alphabet) transcription remains underdeveloped compared to other languages. In this project, we designed a model to convert Bengali text into its corresponding IPA representation.
We fine-tuned the ByT5 model, leveraging its byte-level tokenization, which operates directly on UTF-8 bytes rather than a language-specific vocabulary and is therefore well suited to Bengali script. Our model achieved a Word Error Rate (WER) of 0.01420, securing first place on the competition leaderboard.
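To make the two ideas above concrete, here is a minimal sketch in plain Python, not the project's actual code. The helper names `byte_level_ids` and `word_error_rate` are hypothetical. The first function mimics byte-level tokenization by mapping each UTF-8 byte to an integer ID, assuming an offset of 3 for reserved special tokens (the convention we understand ByT5 to use, with no end-of-sequence token appended here). The second computes WER as word-level edit distance divided by the number of reference words, which is the common definition; the competition's exact scoring procedure may differ.

```python
def byte_level_ids(text: str, offset: int = 3) -> list[int]:
    """Map text to byte-level token IDs.

    Assumption: IDs 0..offset-1 are reserved for special tokens,
    so each UTF-8 byte value is shifted by `offset`.
    """
    return [b + offset for b in text.encode("utf-8")]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level Levenshtein distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Each Bengali code point here occupies three UTF-8 bytes,
    # so no Bengali-specific vocabulary is needed.
    print(byte_level_ids("বাংলা"))
    # Placeholder tokens stand in for space-separated IPA words.
    print(word_error_rate("a b c", "a x c"))
```

Because the tokenizer sees only bytes, any Bengali character, conjunct, or diacritic is representable without out-of-vocabulary tokens, at the cost of longer input sequences.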
