- Abstract:
The goal of this project was to build a unit selection voice that could portray emotions with varying intensities. A suitable definition of an emotion was developed, along with a descriptive framework that supported the work carried out. A single speaker was recorded portraying happy and angry speaking styles, and a neutral database was also recorded. A target cost function was implemented that chose units according to the emotion mark-up in the database. The Dictionary of Affect supported the emotional target cost function by providing an emotion rating for words in the target utterance. If a word was particularly 'emotional', units from that emotion were favoured. In addition, intensity could be varied, which resulted in a bias towards selecting a greater number of emotional units. A perceptual evaluation was carried out, and subjects were able to reliably recognise emotions with varying amounts of emotional units present in the target utterance.
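The emotional target cost described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the `Unit` type, the function names, the [0, 1] scaling of the word rating and intensity, and the way the two are combined are all assumptions made for illustration. A real unit selection system would also combine this term with the other target costs and a join cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A candidate unit with its emotion mark-up (hypothetical structure)."""
    name: str
    emotion: str  # "neutral", "happy" or "angry"


def emotion_target_cost(unit: Unit, target_emotion: str,
                        word_affect: float, intensity: float) -> float:
    """Return a cost in [0, 1]; lower means the unit is preferred.

    word_affect: assumed rating in [0, 1] of how 'emotional' the word is.
    intensity:   assumed global bias in [0, 1] towards emotional units.
    """
    # Assumed combination: demand for emotional units grows with both
    # the word rating and the requested intensity, clamped to [0, 1].
    demand = min(1.0, max(0.0, word_affect + intensity))
    if unit.emotion == target_emotion:
        return 1.0 - demand   # cheap when emotional units are wanted
    if unit.emotion == "neutral":
        return demand         # cheap when the word is not emotional
    return 1.0                # a different emotion is always penalised


def select_unit(candidates, target_emotion, word_affect, intensity):
    """Pick the candidate with the lowest emotional target cost."""
    return min(
        candidates,
        key=lambda u: emotion_target_cost(u, target_emotion,
                                          word_affect, intensity),
    )


candidates = [Unit("u_neutral", "neutral"),
              Unit("u_happy", "happy"),
              Unit("u_angry", "angry")]

# A strongly 'emotional' word favours the matching emotional unit.
emotional_word = select_unit(candidates, "happy", word_affect=0.9, intensity=0.0)
# A weakly 'emotional' word falls back to the neutral unit.
plain_word = select_unit(candidates, "happy", word_affect=0.1, intensity=0.0)
# Raising the intensity biases even plain words towards emotional units.
plain_word_intense = select_unit(candidates, "happy", word_affect=0.1, intensity=0.6)
```

Under these assumptions, raising `intensity` increases the share of emotional units chosen across an utterance, which matches the bias the abstract describes.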
- Links To Paper
- 1st link
- Bibtex format
@InProceedings{EDI-INF-RR-0294,
  author = {Gregor Hofer and Korin Richmond and Rob Clark},
  title = {Informed blending of databases for emotional speech synthesis.},
  booktitle = {Proceedings of Interspeech'05},
  year = 2005,
  month = {Sep},
  pages = {501-504},
  url = {http://www.cstr.ed.ac.uk/downloads/publications/2005/hofer_emosyn.pdf},
}