
Turkish In-Car Speech Recognition Database (316 Speakers) Released

This Turkish in-car speech recognition database was collected by Speechocean’s project team in Turkey. It is one of the databases in our Speech Data - Car (SDC) Project, which currently includes database collections in more than 30 languages.
It contains the voices of 316 native speakers, balanced by age (mainly 16-30, 31-45 and 46-60), gender (156 males, 160 females) and regional accent (for details, please see the technical document).
The script was specially designed to provide material for both training and testing many classes of speech recognizers. For each speaker it contains 320 utterances covering 15 categories and 35 sub-categories (for the detailed script structure, please see the technical document).
Each speaker was recorded in two of three environments (parked, city driving and highway driving) under a variety of recording conditions, such as motor running, fan on/off and window up/down. In total, 320 utterances were recorded for each speaker across the two environments (160 utterances and spontaneous sentences per environment).
Four high-quality audio channels (C1: SHURE SM10A, C2: SENNHEISER ME104, F1: AKG Q400, F2: AKG Q400) were recorded in the car, and at least three popular cars were used. The speech data are stored as uncompressed 16 kHz, 16-bit samples.
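As a rough illustration of what the 16 kHz, 16-bit uncompressed format implies for users, the sketch below decodes a buffer of raw samples and computes its duration. It assumes headerless, signed, little-endian PCM with one channel per file; the announcement does not state the byte order or file layout, so please check the technical document. The function names are hypothetical.

```python
import array
import sys

SAMPLE_RATE_HZ = 16_000  # stated in the announcement
BYTES_PER_SAMPLE = 2     # 16-bit samples


def decode_pcm16(raw: bytes, little_endian: bool = True) -> array.array:
    """Decode headerless 16-bit signed PCM bytes into an array of ints.

    Assumption: the byte order is little-endian unless told otherwise.
    """
    if len(raw) % BYTES_PER_SAMPLE:
        raise ValueError("byte count is not a multiple of 2")
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(raw)
    # array.array uses the machine's byte order; swap if the data differs.
    if little_endian != (sys.byteorder == "little"):
        samples.byteswap()
    return samples


def duration_seconds(samples: array.array) -> float:
    """Length of the signal in seconds at 16 kHz."""
    return len(samples) / SAMPLE_RATE_HZ
```

At this rate, one second of a single channel occupies 32,000 bytes, which is a quick sanity check when inspecting file sizes.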
Each utterance is stored in a separate file and each signal file is accompanied by an ASCII SAM label file which contains the relevant descriptive information.
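SAM label files are plain ASCII and are generally organized as one `MNEMONIC: value` entry per line. The minimal sketch below reads such a file into a dictionary. The sample field names and values are illustrative assumptions and are not taken from this database; the actual fields are described in the technical document.

```python
def parse_sam_label(text: str) -> dict[str, list[str]]:
    """Parse 'KEY: value' lines into a dict of key -> list of values.

    A list is used because a key may appear on more than one line.
    Lines without a colon are skipped.
    """
    fields: dict[str, list[str]] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        key = key.strip()
        if not sep or not key:
            continue
        fields.setdefault(key, []).append(value.strip())
    return fields


# Hypothetical label content, for illustration only.
sample = "SAM: 16000\nSSB: 16\nSEX: F\n"
label = parse_sam_label(sample)
print(label["SAM"][0])
```

Keeping the values as strings avoids guessing at field types; callers can convert the fields they need.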
A pronunciation lexicon with a phonemic transcription in SAMPA is also included.
All the data was transcribed and labeled. The serial number of this database in the catalogue is King-ASR-134; for detailed information and samples, please click here.