Discrete Audio Tokens: More Than a Survey! in Transactions on Machine Learning Research
Project Description:
I was part of the core team on a large academic and industry collaboration, led by Pooneh Mousavi, that conducted a systematic review and benchmark of discrete audio tokenizers across speech, music, and general audio. We proposed a taxonomy of tokenization approaches, evaluated the tokenizers on reconstruction quality, downstream task performance, and acoustic language modeling, and released a database of tokenizers to support future research. It was a fun, massive collaboration, and I'm excited to share that the paper has been accepted to Transactions on Machine Learning Research (TMLR).