Informatics Report Series


Report EDI-INF-RR-1202


Title: Contextual Dependencies in Unsupervised Word Segmentation
Authors: Sharon Goldwater ; T.L. Griffiths ; Mark Johnson
Date: 2006
Publication Title: Proceedings of Coling/ACL, Sydney, 2006
Publisher: ACL
Publication Type: Conference Paper
Publication Status: Published
Page Nos: 673-680
Abstract:
Developing better methods for segmenting continuous text into words is important for improving the processing of Asian languages, and may shed light on how humans learn to segment speech. We propose two new Bayesian word segmentation methods that assume unigram and bigram models of word dependencies respectively. The bigram model greatly out-performs the unigram model (and previous probabilistic models), demonstrating the importance of such dependencies for word segmentation. We also show that previous probabilistic models rely crucially on sub-optimal search procedures.
Link to paper: http://acl.ldc.upenn.edu/P/P06/P06-1085.pdf
Bibtex format
@InProceedings{EDI-INF-RR-1202,
author = { Sharon Goldwater and T.L. Griffiths and Mark Johnson },
title = {Contextual Dependencies in Unsupervised Word Segmentation},
booktitle = {Proceedings of Coling/ACL, Sydney, 2006},
publisher = {ACL},
year = 2006,
pages = {673-680},
url = {http://acl.ldc.upenn.edu/P/P06/P06-1085.pdf},
}



Please mail <reports@inf.ed.ac.uk> with any changes or corrections.
Unless explicitly stated otherwise, all material is copyright The University of Edinburgh