CV
Education
Ph.D. Candidate in Intelligent Systems Engineering and Computer Science
Luddy School of Informatics, Computing, and Engineering, Indiana University (GPA: 4.0 / 4.0)
2021 — 2024 (Expected)
- Selected Coursework: Machine Learning for Signal Processing, Deep Learning, Computer Vision, Applied Machine Learning
- Research Group: Signals and AI Group in Engineering (SAIGE)
- Advisor: Prof. Minje Kim
M.Sc. in Information and Communication Engineering
Dept. of Information and Communication Technologies, Universitat Pompeu Fabra (GPA: 9.5 / 10.0)
2019 — 2020
- Selected Coursework: Music Information Retrieval, System Design, Audio Signal Processing, ML for Audio, Research Methods, Reinforcement Learning
- Thesis: “SATB Voice Segregation for Monaural Recordings”
- Advisor: Dr. Pritish Chandna
B.M. in Electronic Production & Design
Electronic Production & Design Dept., Berklee College of Music (GPA: 3.8 / 4.0)
2013 — 2016
- Selected Coursework: Digital Signal Processing, Physical Computing, Audio Programming in C, Principles of Audio Electronics, Music Acoustics
- Thesis: “A Deep Look at Spectral Synthesis Techniques Through csConvolve”
- Advisor: Dr. Richard Boulanger
Experience
Netflix
Machine Learning Research Intern
Jun 2024 — Nov 2024, Los Gatos, CA
- Supervised by Dr. Mahdi Kalayeh
- Worked on multimodal (audio-visual) generative AI problems
Google Research
Student Researcher
May 2023 — Oct 2023, Cambridge, MA
- Supervised by Dr. Hakan Erdogan, Dr. John Hershey, and Dr. Scott Wisdom
- Worked on unsupervised audio source separation problems
Mitsubishi Electric Research Labs (MERL)
Research Intern
Summers 2021 and 2022, Cambridge, MA
- Received a $15,000 research gift from MERL to work on the “Cocktail Fork Problem”
- Hosted by Dr. Gordon Wichern and Dr. Jonathan Le Roux
- Derived and implemented new machine learning models and optimization methods for audio analysis, with applications to source separation in challenging multi-source environments
Signals and AI Group in Engineering (SAIGE), Indiana University
Research Assistant
Jan 2021 — Present, Bloomington, IN
- Conducting research on neural audio coding and audio source separation
- Supervised by Prof. Minje Kim
Senseable Intelligence Group, Massachusetts Institute of Technology (MIT)
Technical Lab Assistant
Nov 2020 — Apr 2021, Cambridge, MA
- Contractor for the Senseable Intelligence Group, McGovern Institute for Brain Research, led by Dr. Satrajit Ghosh
Apple Inc.
Content Engineer
Jun 2016 — Jul 2019, Cupertino, CA
- Software engineer for Apple’s pro audio and music apps (Logic Pro, GarageBand)
- Designed real-time MIDI processing systems in C++ for Apple’s virtual musical instruments
Electronic Production & Design Dept., Berklee College of Music
Programming Tutor
Sep 2015 — May 2016, Boston, MA
- Tutored and mentored EPD students in technical classes: “Audio Programming in C”, “Digital Signal Processing”, “Csound”, “Max/MSP”
Publications
Conference Papers
- D. Petermann and M. Kim, “Hyperbolic distance-based speech separation,” in Proc. of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024 (to appear).
- D. Petermann, I. Jang, and M. Kim, “Native multi-band audio coding within hyper-autoencoded reconstruction propagation networks,” in Proc. of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023, pp. 1-5.
- D. Petermann, G. Wichern, A. Subramanian, and J. Le Roux, “Hyperbolic audio source separation,” in Proc. of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023, pp. 1-5. [Best Student Paper Award & Top 3% Papers]
- D. Petermann and M. Kim, “SpaIn-Net: Spatially-informed stereophonic music source separation,” in Proc. of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022, pp. 106-110.
- D. Petermann, G. Wichern, Z.-Q. Wang, and J. Le Roux, “The cocktail fork problem: Three-stem audio separation for real-world soundtracks,” in Proc. of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022, pp. 526-530.
- D. Petermann, S. Beack, and M. Kim, “HARP-Net: Hyper-autoencoded reconstruction propagation for scalable neural audio coding,” in Proc. of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2021, pp. 316-320.
- D. Petermann, P. Chandna, H. Cuesta, J. Bonada, and E. Gómez, “Deep learning based source separation applied to choir ensembles,” in Proc. of the International Society for Music Information Retrieval Conference (ISMIR), 2020, pp. 733-739.
Journal Articles
- D. Petermann, G. Wichern, A. S. Subramanian, Z.-Q. Wang, and J. Le Roux, “Tackling the cocktail fork problem for separation and transcription of real-world soundtracks,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 2592-2605, 2023.
- P. Chandna, H. Cuesta, D. Petermann, and E. Gómez, “A deep-learning based framework for source separation, analysis, and synthesis of choral ensembles,” Frontiers in Signal Processing, vol. 2, 2022.
Patents
- S. K. Beack, W. Lim, I. Jang, et al., Audio signal encoding/decoding methods and apparatus for performing the same, US Patent App. 63/420,405, 2023.
- D. Petermann, G. Wichern, A. Subramanian, and J. Le Roux, Audio source separation using hyperbolic embeddings, US Patent App. 18/191,417, 2023.