Uncertainty as a Predictor: Leveraging Self-Supervised Learning for Zero-Shot MOS Prediction. Aditya Ravuri, Erica Cooper, Junichi Yamagishi. IEEE ICASSP 2024 workshop on Self-supervision in Audio, Speech and Beyond, Apr, 2024
Joint speaker encoder and neural back-end model for fully end-to-end automatic speaker verification with multiple enrollment utterances. Chang Zeng, Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi. Computer Speech & Language, 86 101619-101619, Jun, 2024
The VoiceMOS Challenge 2023: Zero-shot Subjective Speech Quality Prediction for Multiple Domains. Erica Cooper, Wen-Chin Huang, Yu Tsao, Hsin-Min Wang, Tomoki Toda, Junichi Yamagishi. ASRU 2023, Dec, 2023
Partial Rank Similarity Minimization Method for Quality MOS Prediction of Unseen Speech Synthesis Systems in Zero-Shot and Semi-supervised setting. Hemant Yadav, Erica Cooper, Junichi Yamagishi, Sunayana Sitaram, Rajiv Ratn Shah. ASRU 2023, Dec, 2023
Exploring Isolated Musical Notes as Pre-training Data for Predominant Instrument Recognition in Polyphonic Music. Lifan Zhong, Erica Cooper, Junichi Yamagishi, Nobuaki Minematsu. APSIPA ASC 2023, Oct, 2023
Investigating Range-Equalizing Bias in Mean Opinion Score Ratings of Synthesized Speech. Erica Cooper, Junichi Yamagishi. Interspeech 2023, Aug, 2023
SASPEECH: A Hebrew Single Speaker Dataset for Text to Speech and Voice Conversion. Orian Sharoni, Roee Shenberg, Erica Cooper. Interspeech 2023, Aug, 2023
Range-Based Equal Error Rate for Spoof Localization. Lin Zhang, Xin Wang, Erica Cooper, Nicholas Evans, Junichi Yamagishi. Interspeech 2023, Aug, 2023
Improving Generalization Ability of Countermeasures for New Mismatch Scenario by Combining Multiple Advanced Regularization Terms. Chang Zeng, Xin Wang, Xiaoxiao Miao, Erica Cooper, Junichi Yamagishi. Interspeech 2023, Aug, 2023
Can Knowledge of End-to-End Text-to-Speech Models Improve Neural MIDI-to-Audio Synthesis Systems? Xuan Shi, Erica Cooper, Xin Wang, Junichi Yamagishi, Shrikanth Narayanan. Submitted to ICASSP 2023, Jun, 2023
Speaker Anonymization using Orthogonal Householder Neural Network. Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi, Natalia Tomashenko. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 1-15, 2023
The PartialSpoof Database and Countermeasures for the Detection of Short Fake Speech Segments Embedded in an Utterance. Lin Zhang, Xin Wang, Erica Cooper, Nicholas Evans, Junichi Yamagishi. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 31 813-825, 2023
Analyzing Language-Independent Speaker Anonymization Framework under Unseen Conditions. Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi, Natalia Tomashenko. Interspeech 2022, Sep, 2022
The VoiceMOS Challenge 2022. Wen-Chin Huang, Erica Cooper, Yu Tsao, Hsin-Min Wang, Tomoki Toda, Junichi Yamagishi. Interspeech 2022, Sep, 2022
Language-Independent Speaker Anonymization Approach using Self-Supervised Pre-Trained Models. Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi, Natalia Tomashenko. Odyssey 2022: The Speaker and Language Recognition Workshop, Jun, 2022
Attention Back-End for Automatic Speaker Verification with Multiple Enrollment Utterances. Chang Zeng, Xin Wang, Erica Cooper, Xiaoxiao Miao, Junichi Yamagishi. ICASSP 2022, May, 2022
On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis. Cheng-I Lai, Erica Cooper, Yang Zhang, Shiyu Chang, Kaizhi Qian, Yi-Lun Liao, Yung-Sung Chuang, Alexander Liu, Junichi Yamagishi, David Cox … ICASSP 2022, May, 2022
LDNet: Unified Listener Dependent Modeling in MOS Prediction for Synthetic Speech. Wen-Chin Huang, Erica Cooper, Junichi Yamagishi, Tomoki Toda. ICASSP 2022, May, 2022
Generalization Ability of MOS Prediction Networks. Erica Cooper, Wen-Chin Huang, Tomoki Toda, Junichi Yamagishi. ICASSP 2022, May, 2022
Use of Speaker Recognition Approaches for Learning and Evaluating Embedding Representations of Musical Instrument Sounds. Xuan Shi, Erica Cooper, Junichi Yamagishi. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 30 367-377, Jan, 2022
Multi-task learning in utterance-level and segmental-level spoof detection. Lin Zhang, Xin Wang, Erica Cooper, Junichi Yamagishi. ASVspoof 2021, Sep, 2021
An Initial Investigation for Detecting Partially Spoofed Audio. Lin Zhang, Xin Wang, Erica Cooper, Junichi Yamagishi, Jose Patino, Nicholas Evans. Interspeech 2021, Sep, 2021
How do Voices from Past Speech Synthesis Challenges Compare Today? Erica Cooper, Junichi Yamagishi. 11th ISCA Speech Synthesis Workshop, Aug, 2021
Text-to-Speech Synthesis Techniques for MIDI-to-Audio Synthesis. Erica Cooper, Xin Wang, Junichi Yamagishi. 11th ISCA Speech Synthesis Workshop, Aug, 2021
Exploring Disentanglement with Multilingual and Monolingual VQ-VAE. Jennifer Williams, Jason Fong, Erica Cooper, Junichi Yamagishi. 11th ISCA Speech Synthesis Workshop, Aug, 2021
Learning Disentangled Phone and Speaker Representations in a Semi-Supervised VQ-VAE Paradigm. Jennifer Williams, Yi Zhao, Erica Cooper, Junichi Yamagishi. ICASSP 2021, Jun, 2021
How Similar or Different Is Rakugo Speech Synthesizer to Professional Performers? Shuhei Kato, Yusuke Yasuda, Xin Wang, Erica Cooper, Junichi Yamagishi. ICASSP 2021, Jun, 2021
Improved Prosody from Learned F0 Codebook Representations for VQ-VAE Speech Waveform Reconstruction. Yi Zhao, Haoyu Li, Cheng-I Lai, Jennifer Williams, Erica Cooper, Junichi Yamagishi. Interspeech 2020, Oct, 2020
Can Speaker Augmentation Improve Multi-Speaker End-to-End TTS? Erica Cooper, Cheng-I Lai, Yusuke Yasuda, Junichi Yamagishi. Interspeech 2020, Oct, 2020
Zero-Shot Multi-Speaker Text-To-Speech with State-of-the-art Neural Speaker Embeddings. Erica Cooper, Cheng-I Lai, Yusuke Yasuda, Fuming Fang, Xin Wang, Nanxin Chen, Junichi Yamagishi. ICASSP 2020, May, 2020
Modeling of Rakugo Speech and Its Limitations: Toward Speech Synthesis That Entertains Audiences. Shuhei Kato, Yusuke Yasuda, Xin Wang, Erica Cooper, Shinji Takaki, Junichi Yamagishi. IEEE Access, 8 138149-138161, 2020
Subset Selection, Adaptation and Gemination for Amharic Text-to-Speech Synthesis. Elshadai Tesfaye Biru, Yishak Tofik Mohammed, David Tofu, Erica Cooper, Julia Hirschberg. 10th ISCA Speech Synthesis Workshop (SSW10), Sep, 2019
Rakugo speech synthesis using segment-to-segment neural transduction and style tokens — toward speech synthesis for entertaining audiences. Shuhei Kato, Yusuke Yasuda, Xin Wang, Erica Cooper, Shinji Takaki, Junichi Yamagishi. 10th ISCA Speech Synthesis Workshop (SSW10), Sep, 2019
A Comparison of Speaker-based and Utterance-based Data Selection for Text-to-Speech Synthesis. Kai-Zhan Lee, Erica Cooper, Julia Hirschberg. Interspeech, September 2018, Hyderabad, India.
Adaptation and Frontend Features to Improve Naturalness in Found-Data Synthesis. Erica Cooper, Julia Hirschberg. Speech Prosody, June 2018, Poznań, Poland.
Utterance Selection for Optimizing Intelligibility of TTS Voices Trained on ASR Data. Erica Cooper, Xinyue Wang, Alison Chang, Yocheved Levitan, Julia Hirschberg. Interspeech, August 2017, Stockholm, Sweden.
Data Selection and Adaptation for Naturalness in HMM-based Speech Synthesis. Erica Cooper, Alison Chang, Yocheved Levitan, Julia Hirschberg. Interspeech, September 2016, San Francisco, California.