- Abstract:
We propose the asynchronous-transition HMM (AT-HMM), which is based on asynchronous transition structures among the individual features of acoustic feature vector sequences. A conventional HMM represents vector sequences with a chain of states, where each state holds a multi-dimensional output distribution; it therefore assumes that all features change synchronously. This assumption appears over-simplified for modeling the temporal behavior of acoustic features, since the cepstrum and its time derivative cannot change synchronously with each other. In a speaker-dependent continuous phoneme recognition task, the AT-HMMs reduced errors by 10% to 40%. In a speaker-independent task, the performance of the AT-HMMs was comparable to that of conventional HMMs.
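To make the synchrony assumption concrete, the sketch below contrasts the conventional observation model with one illustrative reading of the asynchronous-transition idea. The per-feature state sequences q^{(d)} and the diagonal Gaussian outputs are assumptions made here for illustration only, not the paper's exact formulation of the AT-HMM.

    % Conventional HMM: a single state sequence q_{1:T} is shared by all D
    % feature dimensions, so every dimension switches its output distribution
    % at the same time instants (synchronous transitions).
    P(\mathbf{o}_{1:T}) = \sum_{q_{1:T}} \pi_{q_1} \prod_{t=2}^{T} a_{q_{t-1} q_t}
        \prod_{t=1}^{T} \prod_{d=1}^{D}
        \mathcal{N}\bigl(o_{t,d};\, \mu_{q_t,d},\, \sigma^2_{q_t,d}\bigr)

    % Illustrative asynchronous-transition reading (an assumption, not the
    % authors' definition): each feature dimension d follows its own state
    % sequence q^{(d)}_{1:T}, so state changes need not coincide across
    % dimensions (e.g. the cepstrum and its time derivative).
    P(\mathbf{o}_{1:T}) = \prod_{d=1}^{D} \sum_{q^{(d)}_{1:T}}
        \pi^{(d)}_{q^{(d)}_1} \prod_{t=2}^{T} a^{(d)}_{q^{(d)}_{t-1} q^{(d)}_{t}}
        \prod_{t=1}^{T}
        \mathcal{N}\bigl(o_{t,d};\, \mu^{(d)}_{q^{(d)}_{t}},\, \sigma^{2\,(d)}_{q^{(d)}_{t}}\bigr)

In this reading, the gain comes from letting different feature streams change their output distributions at different frames, which the shared-state form above cannot express.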
- Links To Paper
- This URL is for the conference paper containing a subset of the full paper in Japanese.
- BibTeX format
@Article{EDI-INF-RR-0673,
  author  = {Shigeki Matsuda and Mitsuru Nakai and Hiroshi Shimodaira and Shigeki Sagayama},
  title   = {Speech Recognition Using Asynchronous Transition {HMM}},
  journal = {IEICE Trans. (D-II)},
  year    = 2003,
  month   = {Jun},
  volume  = {J86-D-II},
  number  = {6},
  pages   = {741--754},
  url     = {http://intl.ieeexplore.ieee.org/xpl/abs_free.jsp?arNumber=859132},
}