MAGICDATA Kid Voice TTS Corpus in Mandarin Chinese 爱数智慧中文童声语音合成数据集

The MAGICDATA Kid Voice TTS Corpus in Mandarin Chinese was recorded by a four-year-old girl born in Beijing, China. This release makes 15 minutes of speech data from the corpus available for non-commercial use.


Contents and description of the corpus:
  (1) The corpus contains 15 minutes of speech data, recorded in an NC-20 acoustic studio.
  (2) The speaker is a 4-year-old girl born in Beijing.
  (3) Detailed information, such as speech data coding and speaker information, is preserved in the metadata file.
  (4) The speech is in a natural child style.
  (5) Annotation consists of four parts: pronunciation proofreading, prosody labeling, phone boundary labeling, and POS tagging.
  (6) Annotation accuracy is higher than 99%.
  (7) For phone labeling, the database annotates not only phoneme boundaries but also the boundaries of silent segments.
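
Because silence is annotated alongside phones, speech time and pause time can be separated per utterance. The following is only a minimal sketch: the annotation file format is not described here, so the `Segment` record, the `"sil"` label, and the example timings are all hypothetical and would need to be adapted to the actual files.

```python
from dataclasses import dataclass

# Hypothetical silence label; the corpus's real label set is not given here.
SILENCE = "sil"


@dataclass(frozen=True)
class Segment:
    start: float  # segment start time in seconds
    end: float    # segment end time in seconds
    label: str    # phone label, or SILENCE for a silent segment


def split_durations(segments):
    """Return (speech_seconds, silence_seconds) for one utterance."""
    speech = sum(s.end - s.start for s in segments if s.label != SILENCE)
    silence = sum(s.end - s.start for s in segments if s.label == SILENCE)
    return speech, silence


# Made-up utterance: leading silence, two phones, a pause,
# two more phones, and trailing silence.
utterance = [
    Segment(0.00, 0.25, SILENCE),
    Segment(0.25, 0.35, "n"),
    Segment(0.35, 0.60, "i"),
    Segment(0.60, 0.75, SILENCE),
    Segment(0.75, 0.85, "h"),
    Segment(0.85, 1.10, "ao"),
    Segment(1.10, 1.50, SILENCE),
]

speech, silence = split_durations(utterance)
print(f"speech: {speech:.2f} s, silence: {silence:.2f} s")
```

Keeping silence as ordinary labeled segments means the same list covers the whole utterance from start to end, so speech and silence durations always add up to the utterance length.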

This is the first time this voice has been published!


The corpus aims to help researchers in the TTS field. It is part of a much larger dataset (the 2.3-hour MAGICDATA Kid Voice TTS Corpus in Mandarin Chinese), which was recorded in the same environment.


Speaker intro:
The speaker, NiuNiu, is lively and cheerful. When she first came to the studio, she couldn't wait to introduce herself: "My name is NiuNiu, I am 4 years old." An outgoing child can always get along with others quickly. NiuNiu's favorite cartoons are “Frozen” and “My Little Pony”.


Please note that the speaker and her parents have authorized the release of this corpus.

For more details or for commercial use, please contact us:
Fax: +86-10-82527250
E-mail: business@magicdatatech.com

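The speaker of the full MAGICDATA Kid Voice TTS Corpus in Mandarin Chinese is a 4-year-old girl from Beijing. The full corpus contains 2,235 sentences with a total duration of more than 2 hours. The text consists of everyday expressions.
Dataset details:
 (1) This open-source release contains 15 minutes of speech.
 (2) Recording took place in a studio whose noise floor meets the NC-20 standard.
 (3) The speech rate is relatively slow, in keeping with how children speak.
 (4) Annotation consists of four stages: pronunciation proofreading, prosodic hierarchy labeling, phone boundary segmentation, and word segmentation with POS tagging.
 (5) Annotation accuracy: 99.9% for pronunciation-to-text verification, 99% for prosodic hierarchy, 99% for word segmentation and POS tagging, and 99% for phone boundary segmentation.
 (6) Besides the boundaries between initials and finals, phone boundary annotation also includes precise segmentation of silent segments at the beginning, in the middle, and at the end of sentences.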


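About the speaker, NiuNiu:
NiuNiu is a lively, cheerful child. The first time she came to the recording studio, she couldn't wait to introduce herself to the staff: "My name is NiuNiu, and I'm 4 years old!" A child who loves to laugh always settles in quickly. During the recording sessions, the staff all grew fond of this "clingy" little darling. NiuNiu's favorite cartoons are “Frozen” and “My Little Pony”.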


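This is the first time this speaker's voice has been used for TTS recording. The audio is authorized by the speaker and her guardians.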


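The MAGICDATA Kid Voice TTS Corpus in Mandarin Chinese was developed by Beijing MAGIC DATA Co., Ltd. (北京爱数智慧科技有限公司) and is released free of charge for non-commercial use.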

For more voices, please call +86-10-82527250 or e-mail business@magicdatatech.com.



This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

imagicdatatech.com Beijing MAGIC DATA Co., Ltd.