Details of Research Outputs

Title: UFANS: U-Shaped Fully-Parallel Acoustic Neural Structure for Statistical Parametric Speech Synthesis
Author (Name in English or Pinyin)
Ma, D.1; Su, Z.1; Wang, W.2; Lu, Y.1; Li, Z.2
Date Issued: 2019
Source Publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ISSN: 0302-9743
DOI: 10.1007/978-3-030-29894-4_22
Indexed By: SCOPUS
First-level Discipline: Information Science and Systems Science
Education Discipline: Science and Technology
Published Range: International academic journal
Volume Issue Pages: Volume: 11672 LNAI, Pages: 273-278
References
[1] Bi, M., Lu, H., Zhang, S., Lei, M., Yan, Z.: Deep feed-forward sequential memory networks for speech synthesis. In: ICASSP (2018). http://arxiv.org/abs/1802.09194
[2] Blaauw, M., Bonada, J.: A neural parametric singing synthesizer. arXiv preprint arXiv:1704.03809 (2017)
[3] Ciresan, D.C., Meier, U., Schmidhuber, J.: Multi-column deep neural networks for image classification. In: CVPR, pp. 3642–3649 (2012). http://arxiv.org/abs/1202.2745
[4] Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: 3rd International Conference for Learning Representations (2014). http://arxiv.org/abs/1412.6980
[5] Liu, B., Nie, S., Zhang, Y., Ke, D., Liang, S., Liu, W.: Boosting noise robustness of acoustic model via deep adversarial training. arXiv preprint arXiv:1805.01357 (2018)
[6] Morise, M., Yokomori, F., Ozawa, K.: World: a vocoder-based high-quality speech synthesis system for real-time applications. IEICE Trans. Inf. Syst. 99(7), 1877–1884 (2016)
[7] van den Oord, A., Kalchbrenner, N., Vinyals, O., Espeholt, L., Graves, A., Kavukcuoglu, K.: Conditional image generation with PixelCNN decoders. In: Neural Information Processing Systems (2016). http://arxiv.org/abs/1606.05328
[8] Ping, W., et al.: Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning (2018)
[9] Protasio Ribeiro, F., Florencio, D., Zhang, C., Seltzer, M.: CROWDMOS: an approach for crowdsourcing mean opinion score studies. In: ICASSP. IEEE, May 2011. https://www.microsoft.com/en-us/research/publication/crowdmos-an-approach-for-crowdsourcing-mean-opinion-score-studies/
[10] Qian, Y., Fan, Y., Hu, W., Soong, F.K.: On the training aspects of deep neural network (DNN) for parametric TTS synthesis. In: 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 3829–3833. IEEE (2014)
[11] Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedical image segmentation. In: Medical Image Computing and Computer Assisted Intervention, pp. 234–241 (2015). http://arxiv.org/abs/1505.04597
[12] Srivastava, N., Hinton, G.E., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014). http://www.cs.toronto.edu/rsalakhu/papers/srivastava14a.pdf
[13] Stoller, D., Ewert, S., Dixon, S.: Wave-u-net: a multi-scale neural network for end-to-end audio source separation. arXiv preprint arXiv:1806.03185 (2018)
[14] Tokuda, K., Yoshimura, T., Masuko, T., Kobayashi, T., Kitamura, T.: Speech parameter generation algorithms for HMM-based speech synthesis. In: ICASSP (2000)
[15] Wu, Z., Watts, O., King, S.: Merlin: an open source neural network speech synthesis system. In: 9th ISCA Speech Synthesis Workshop 2016, pp. 202–207, September 2016. https://doi.org/10.21437/SSW.2016-33
Citation statistics
Cited Times [WOS]: 0
Document Type: Journal article
Identifier: https://irepository.cuhk.edu.cn/handle/3EPUXD0A/1201
Collection: School of Science and Engineering
Corresponding Author: Li, Z.
Affiliation
1. Turing Robot Ltd. Multi-modal Group, Beijing, China
2. School of Science and Engineering, Chinese University of Hong Kong, Shenzhen, Shenzhen, Guangdong, China
Corresponding Author Affiliation: School of Science and Engineering
Recommended Citation
GB/T 7714
Ma, D., Su, Z., Wang, W., et al. UFANS: U-Shaped Fully-Parallel Acoustic Neural Structure for Statistical Parametric Speech Synthesis[J]. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2019.
APA
Ma, D., Su, Z., Wang, W., Lu, Y., & Li, Z. (2019). UFANS: U-Shaped Fully-Parallel Acoustic Neural Structure for Statistical Parametric Speech Synthesis. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics).
MLA
Ma, D., et al. "UFANS: U-Shaped Fully-Parallel Acoustic Neural Structure for Statistical Parametric Speech Synthesis." Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2019).

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.