| ID |
| King-TTS-017 |
|
| Name |
| European Portuguese Speech Corpus for TTS (Female) |
|
| Author |
| SpeechOcean |
|
| Language |
| Portuguese (Portugal) |
|
| Type |
| TTS |
|
| Sub-type |
| Desktop |
|
| Environment |
| Studio |
|
| Parameters |
| Sampling rate: 44.1 kHz, 16-bit; Channels: two (speech + EGG); Pure recording time: 11 hours; Phoneme set: SAMPA |
|
| Labeling |
| 1. Reading prompts and corresponding phonetic sequences, revised according to the real speech;
2. ToBI-based prosody annotations made according to the real speech;
3. Manually revised phone-level speech segmentation;
4. [Optional] Pronunciation lexicon;
5. [Optional] Pitchmarks/pitch extracted from the EGG signal.
|
|
| Resources purpose |
| R&D of European Portuguese TTS applications |
|
The European Portuguese (pt-PT) Speech Corpus consists of recordings of one native Portuguese professional broadcaster (female, 28 years old), made in a studio with a high SNR (>35 dB) over two channels: an AKG C4000B microphone and an electroglottography (EGG) sensor.
The Corpus includes the following sub-corpora:
1. Sentence sub-corpus: 3000 short sentences (7 to 12 words) and 2000 normal-length sentences (13 to 20 words). To cover a wide range of linguistic phenomena, all sentences are extracted from everyday texts published in Portugal, such as national and international news and articles on daily life, travel, and so on. Sentences containing political, religious, obscene, or pornographic words that might provoke negative emotions are carefully excluded;
2. Emotional sub-corpus: 100 exclamatory sentences and 100 interrogative sentences, which can be used for emotional TTS research;
3. Digit sub-corpus: various kinds of digit data, such as isolated digits, connected digits in blocks, and natural and ordinal number readings;
4. Expression sub-corpus: general expressions, such as dates, times, monetary amounts, and measurements;
5. Spell sub-corpus: letters of the alphabet, Greek letters, and common abbreviations.
All reading prompts are manually revised, and the prosody annotations are made according to the real speech. All speech data are segmented and labeled at the phone level. A pronunciation lexicon and pitch extracted from the EGG signal can also be provided on request.
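
Because the recordings carry speech and EGG on two channels, a common first step is to separate them. The following is a minimal sketch using only the Python standard library. It rests on assumptions that the corpus description does not confirm: that the audio is delivered as PCM WAV files, and that the first channel holds the microphone signal and the second the EGG signal. The function names `split_speech_and_egg` and `make_demo_wav` are hypothetical and not part of the corpus.

```python
import io
import struct
import wave


def split_speech_and_egg(wav_file):
    """Return (sample_rate, speech, egg) from a two-channel, 16-bit WAV.

    Assumption (not stated by the corpus): channel 0 is the microphone
    signal and channel 1 is the EGG signal. Swap them if the delivery
    documentation says otherwise.
    """
    with wave.open(wav_file, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("expected a two-channel, 16-bit WAV file")
        sample_rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())

    # WAV PCM samples are little-endian signed 16-bit integers,
    # interleaved as ch0, ch1, ch0, ch1, ...
    samples = struct.unpack("<%dh" % (len(frames) // 2), frames)
    return sample_rate, samples[0::2], samples[1::2]


def make_demo_wav():
    """Build a tiny synthetic two-channel WAV in memory for illustration."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(44100)
        # Three frames: positive values on channel 0, negative on channel 1.
        wav.writeframes(struct.pack("<6h", 1, -1, 2, -2, 3, -3))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    rate, speech, egg = split_speech_and_egg(make_demo_wav())
    print(rate, speech, egg)
```

`split_speech_and_egg` also accepts a file path in place of the in-memory buffer, since `wave.open` takes either. For an 11-hour corpus it would be more practical to read the frames in chunks rather than load each file whole.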