- Abstract:
Toponym resolution is the task of linking place name instances in a text with spatial footprints, given the context in which they occur. Whereas a lot of work on the evaluation of temporal resolution is ongoing (e.g., Setzer & Gaizauskas, 2000), to date no reference resource is available to evaluate competing algorithms for toponym resolution. It is thus argued that a shareable, reusable evaluation resource is necessary. To this end, a new proposal for the markup of toponyms in text corpora with their referents and an associated tool and data methodology are presented: the Toponym Resolution Markup Language (TRML) is an XML-based markup language, and TAME, the toponym annotation markup editor, is a tool that implements it. A novel evaluation resource is described which comprises a large-scale reference gazetteer server and a human-annotated news corpus in which toponyms are associated with latitude/longitude coordinates of the location they refer to. The reliability of the annotation task is established by determining inter-annotator agreement of the human annotators.
[Elsevier CEUS 30(4) 400-417. Special Issue on Geographic Information Retrieval.]
- Links To Paper
- 1st Link
- Bibtex format
@Article{EDI-INF-RR-0838,
  author    = {Jochen Leidner},
  title     = {An Evaluation Dataset for the Toponym Resolution Task},
  journal   = {Computers, Environment and Urban Systems},
  publisher = {Elsevier Science},
  year      = 2006,
  month     = {Jul},
  volume    = {30},
  pages     = {400-417},
  doi       = {http://dx.doi.org/10.1016/j.compenvurbsys.2005.07.},
  url       = {http://dx.doi.org/10.1016/j.compenvurbsys.2005.07.003},
}