Tacotron: Towards end-to-end speech synthesis Y Wang, RJ Skerry-Ryan, D Stanton, Y Wu, RJ Weiss, N Jaitly, Z Yang, ... arXiv preprint arXiv:1703.10135, 2017 | 2450 | 2017 |
Style tokens: Unsupervised style modeling, control and transfer in end-to-end speech synthesis Y Wang, D Stanton, Y Zhang, RJ Skerry-Ryan, E Battenberg, J Shor, Y Xiao, ... International conference on machine learning, 5180-5189, 2018 | 962 | 2018 |
Towards end-to-end prosody transfer for expressive speech synthesis with Tacotron RJ Skerry-Ryan, E Battenberg, Y Xiao, Y Wang, D Stanton, J Shor, ... International conference on machine learning, 4693-4702, 2018 | 683 | 2018 |
Predicting expressive speaking style from text in end-to-end speech synthesis D Stanton, Y Wang, RJ Skerry-Ryan 2018 IEEE Spoken Language Technology Workshop (SLT), 595-602, 2018 | 144 | 2018 |
Location-relative attention mechanisms for robust long-form speech synthesis E Battenberg, RJ Skerry-Ryan, S Mariooryad, D Stanton, D Kao, ... ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 130 | 2020 |
Uncovering latent style factors for expressive speech synthesis Y Wang, RJ Skerry-Ryan, Y Xiao, D Stanton, J Shor, E Battenberg, ... arXiv preprint arXiv:1711.00520, 2017 | 88 | 2017 |
Semi-supervised generative modeling for controllable speech synthesis R Habib, S Mariooryad, M Shannon, E Battenberg, RJ Skerry-Ryan, ... arXiv preprint arXiv:1910.01709, 2019 | 59 | 2019 |
Effective use of variational embedding capacity in expressive end-to-end speech synthesis E Battenberg, S Mariooryad, D Stanton, RJ Skerry-Ryan, M Shannon, ... arXiv preprint arXiv:1906.03402, 2019 | 58 | 2019 |
Temporal ranking scheme for desktop searching S Raub, A Dingle, D Stanton US Patent 7,529,739, 2009 | 57 | 2009 |
A systematic comparison of phrase table pruning techniques R Zens, D Stanton, P Xu Proceedings of the 2012 Joint Conference on Empirical Methods in Natural …, 2012 | 49 | 2012 |
Speaker generation D Stanton, M Shannon, S Mariooryad, RJ Skerry-Ryan, E Battenberg, ... ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 34 | 2022 |
Combined title prefix and full-word content searching D Stanton, S Raub, A Dingle US Patent 7,617,197, 2009 | 26 | 2009 |
Non-saturating GAN training as divergence minimization M Shannon, B Poole, S Mariooryad, T Bagby, E Battenberg, D Kao, ... arXiv preprint arXiv:2010.08029, 2020 | 21 | 2020 |
Variational embedding capacity in expressive end-to-end speech synthesis ED Battenberg, D Stanton, RJW Skerry-Ryan, S Mariooryad, DT Kao, ... US Patent 11,222,621, 2022 | 17 | 2022 |
Fix it where it fails: Pronunciation learning by mining error corrections from speech logs Z Kou, D Stanton, F Peng, F Beaufays, T Strohman 2015 IEEE International Conference on Acoustics, Speech and Signal …, 2015 | 14 | 2015 |
Controlling expressivity in end-to-end speech synthesis systems D Stanton, ED Battenberg, RJW Skerry-Ryan, S Mariooryad, DT Kao, ... US Patent 11,676,573, 2023 | 6 | 2023 |
Learning the joint distribution of two sequences using little or no paired data S Mariooryad, M Shannon, S Ma, T Bagby, D Kao, D Stanton, ... arXiv preprint arXiv:2212.03232, 2022 | 3 | 2022 |
Document translation including pre-defined term translator and translation model JJ Chin, D Stanton, VS Thadkal, J Yin US Patent 9,116,886, 2015 | 3 | 2015 |
Neural-network-based text-to-speech model for novel speaker generation DA Stanton, SM Shannon, S Mariooryad, RJW Skerry-Ryan, ... US Patent 12,087,275, 2024 | | 2024 |
Variational embedding capacity in expressive end-to-end speech synthesis ED Battenberg, D Stanton, RJW Skerry-Ryan, S Mariooryad, DT Kao, ... US Patent 12,067,969, 2024 | | 2024 |