Informatics Report Series


Report   

EDI-INF-RR-0318


Title: An Expectation Maximisation Algorithm for One-to-Many Record Linkage, Illustrated on the Problem of Matching Far Infra-Red Astronomical Sources to Optical Counterparts
Authors: Amos Storkey ; Chris Williams ; Emma Taylor ; Robert G Mann
Date: Aug 2005
Publication Title: Informatics Research Report
Abstract:
The problem of record linkage is often seen simply in terms of making links between data points that might be generated from the same source. However, in many cases the grounds for linking items are themselves uncertain. In fact, it is often desirable to learn, in an unsupervised manner, what form linked objects take in different databases. One simple case of this is the "one-to-many" linkage problem, where each object in one dataset is potentially linked to one of many objects in another dataset, and where the candidate matches are mutually exclusive. We show how the Expectation Maximisation algorithm can be used for this matching problem, both to calculate the probability of a match and to learn something about the characteristics that matched objects have. The approach is derived for the specific astronomical problem of linking far infra-red observations to optical counterparts, but is generally applicable. This report outlines the theory of this record linkage procedure, but does not discuss its application or any implementation details.
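
To make the idea in the abstract concrete, the following is a minimal sketch of an EM loop for one-to-many matching. It is not the model derived in the report: the single scalar feature per candidate, the Gaussian form of both the matched and background densities, the known background parameters, the uniform prior over candidates, and the names `gauss_pdf` and `em_one_to_many` are all assumptions made purely for illustration. The E-step computes, for each source, the probability that each candidate is the true counterpart (the candidates being mutually exclusive, with "no counterpart present" as a further alternative); the M-step re-estimates the characteristics of matched objects from those probabilities.

```python
import numpy as np


def gauss_pdf(x, mean, std):
    """Density of a 1-D Gaussian evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def em_one_to_many(candidates, bg_mean, bg_std, n_iter=100):
    """Illustrative EM for one-to-many linkage (assumed toy model).

    candidates: list with one non-empty 1-D array per source, holding a
                scalar feature for each of that source's candidate matches.
    bg_mean, bg_std: assumed-known Gaussian density of unrelated objects.
    Returns (posteriors from the final E-step, match mean, match std,
             estimated fraction of sources that have a counterpart).
    """
    all_x = np.concatenate(candidates)
    n_sources = len(candidates)
    # Start the matched-object density away from the background density
    # so that the two components can separate.
    mu = all_x.mean() + all_x.std()
    sd = all_x.std()
    pi = 0.5
    for _ in range(n_iter):
        # E-step: hypothesis "candidate j is the match" has weight
        # (pi / K) * p_match(x_j) / p_background(x_j); the hypothesis
        # "no candidate is the match" has weight (1 - pi).
        post = []
        for x in candidates:
            ratio = gauss_pdf(x, mu, sd) / gauss_pdf(x, bg_mean, bg_std)
            w = (pi / len(x)) * ratio
            post.append(w / ((1.0 - pi) + w.sum()))
        r = np.concatenate(post)
        # M-step: posterior-weighted estimates of the matched-object
        # density and of the fraction of sources with a counterpart.
        total = r.sum()
        mu = (r * all_x).sum() / total
        sd = max(np.sqrt((r * (all_x - mu) ** 2).sum() / total), 1e-3)
        pi = min(total / n_sources, 1.0 - 1e-9)
    return post, mu, sd, pi
```

In this sketch the posteriors for a given source sum to at most one, and the remainder is the probability that none of its candidates is the true counterpart. A realistic treatment would use multivariate features (for example positional offset and optical magnitude) and would estimate the background density from the data.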
Bibtex format
@Misc{EDI-INF-RR-0318,
author = { Amos Storkey and Chris Williams and Emma Taylor and Robert G Mann },
title = {An Expectation Maximisation Algorithm for One-to-Many Record Linkage, Illustrated on the Problem of Matching Far Infra-Red Astronomical Sources to Optical Counterparts},
year = 2005,
month = {Aug},
url = {http://www.anc.ed.ac.uk/~amos/publications/StorkeyEtAl2005AnEMAlgorithmForRecordLinkage.pdf},
}


Unless explicitly stated otherwise, all material is copyright The University of Edinburgh