Informatics Report Series



Title: Interpolating between Types and Tokens by Estimating Power-Law Generators
Authors: Sharon Goldwater; T.L. Griffiths; Mark Johnson
Date: 2006
Publication Title: Advances in Neural Information Processing Systems 18
Publication Type: Conference Paper
Publication Status: Published
Standard statistical models of language fail to capture one of the most striking properties of natural languages: the power-law distribution in the frequencies of word tokens. We present a framework for developing statistical models that generically produce power laws, augmenting standard generative models with an adaptor that produces the appropriate pattern of token frequencies. We show that taking a particular stochastic process, the Pitman-Yor process, as an adaptor justifies the appearance of type frequencies in formal analyses of natural language, and improves the performance of a model for unsupervised learning of morphology.
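For readers unfamiliar with the Pitman-Yor process named in the abstract, the following is a minimal sketch of its standard two-parameter Chinese restaurant construction, which yields heavy-tailed frequency counts. It is not the authors' implementation. The function name pitman_yor_crp and the parameter names discount and concentration are illustrative assumptions, and the sketch omits the generator entirely by treating each table as a distinct word type.

```python
import random


def pitman_yor_crp(n_tokens, discount, concentration, rng):
    """Seat n_tokens customers with the two-parameter Chinese restaurant process.

    Returns a list of table sizes. In this simplified sketch (an assumption,
    not the paper's full model) each table is read as one word type and its
    size as that type's token count.
    """
    if not (0.0 <= discount < 1.0) or concentration <= -discount:
        raise ValueError("need 0 <= discount < 1 and concentration > -discount")

    tables = []  # tables[j] = number of customers seated at table j
    for n in range(n_tokens):
        if not tables:
            # The first customer always opens the first table.
            tables.append(1)
            continue
        k = len(tables)
        # Probability of opening a new table after n customers are seated.
        p_new = (concentration + discount * k) / (n + concentration)
        if rng.random() < p_new:
            tables.append(1)
        else:
            # Join an existing table with weight (size - discount), so
            # large tables attract more customers (rich get richer).
            weights = [size - discount for size in tables]
            j = rng.choices(range(k), weights=weights)[0]
            tables[j] += 1
    return tables


if __name__ == "__main__":
    counts = sorted(pitman_yor_crp(10000, 0.8, 1.0, random.Random(0)), reverse=True)
    print("types:", len(counts), "largest counts:", counts[:5])
```

Setting discount to 0 reduces the sketch to the one-parameter Chinese restaurant process. Larger discounts open new tables more often, which produces more low-frequency types and a heavier tail.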
Bibtex format
author = {Sharon Goldwater and T.L. Griffiths and Mark Johnson},
title = {Interpolating between Types and Tokens by Estimating Power-Law Generators},
booktitle = {Advances in Neural Information Processing Systems 18},
year = 2006,
url = {},

Unless explicitly stated otherwise, all material is copyright The University of Edinburgh