Informatics Report Series


Report   

EDI-INF-RR-1194


Title: Cross-Lingual Relevance Models
Authors: Victor Lavrenko ; M. Choquette ; W.B. Croft
Date: Aug 2002
Publication Title: Proceedings of the 25th ACM Conference in Information Retrieval (SIGIR) 2002
Publication Type: Conference Paper
Publication Status: Published
Page Nos: 175-182
DOI: 10.1145/564376.564408
ISBN/ISSN: 1-58113-561-0
Abstract:
We propose a formal model of Cross-Language Information Retrieval that does not rely on either query translation or document translation. Our approach leverages recent advances in language modeling to directly estimate an accurate topic model in the target language, starting with a query in the source language. The model integrates popular techniques of disambiguation and query expansion in a unified formal framework. We describe how the topic model can be estimated with either a parallel corpus or a dictionary. We test the framework by constructing Chinese topic models from English queries and using them in the CLIR task of TREC9. The model achieves performance around 95% of the strong mono-lingual baseline in terms of average precision. In initial precision, our model outperforms the mono-lingual baseline by 20%. The main contribution of this work is the unified formal model which integrates techniques that are essential for effective Cross-Language Retrieval.
Bibtex format
@InProceedings{EDI-INF-RR-1194,
author = { Victor Lavrenko and M. Choquette and W.B. Croft },
title = {Cross-Lingual Relevance Models},
booktitle = {Proceedings of the 25th ACM Conference in Information Retrieval (SIGIR) 2002},
year = 2002,
month = {Aug},
pages = {175-182},
doi = {10.1145/564376.564408},
}



Please mail <reports@inf.ed.ac.uk> with any changes or corrections.
Unless explicitly stated otherwise, all material is copyright The University of Edinburgh