- Abstract:
We propose a class of constraints, referred to as conditional functional dependencies (CFDs), and study their applications in data cleaning. In contrast to traditional functional dependencies (FDs) that were developed mainly for schema design, CFDs aim at capturing the consistency of data by incorporating bindings of semantically related values. For CFDs we provide an inference system analogous to Armstrong's axioms for FDs, as well as consistency analysis. Since CFDs allow data bindings, a large number of individual constraints may hold on a table, complicating detection of constraint violations. We develop techniques for detecting CFD violations in SQL as well as novel techniques for checking multiple constraints in a single query. We experimentally evaluate the performance of our CFD-based methods for inconsistency detection. This not only yields a constraint theory for CFDs but is also a step toward a practical constraint-based method for improving data quality.
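As a rough illustration of what detecting CFD violations in SQL can look like, the sketch below checks two conditional dependencies against a small in-memory SQLite table. The table `cust`, its columns, the sample rows, and the two rules are all assumptions made for this example and are not taken from the abstract. The queries are a simple per-constraint check, not the paper's single-query technique for multiple constraints.

```python
import sqlite3

# Hypothetical customer table; the schema and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cust (cc TEXT, ac TEXT, zip TEXT, street TEXT, city TEXT)"
)
rows = [
    ("44", "131", "EH4 1DT", "Mayfield", "EDI"),
    ("44", "131", "EH4 1DT", "Crichton", "EDI"),   # same UK zip, different street
    ("01", "908", "07974", "Mtn Ave", "MH"),
    ("01", "908", "07974", "Main St", "NYC"),      # breaks the constant city pattern
    ("01", "212", "10012", "Broadway", "NYC"),
    ("01", "212", "10012", "Spring St", "NYC"),    # allowed: the zip rule binds only cc = '44'
]
conn.executemany("INSERT INTO cust VALUES (?, ?, ?, ?, ?)", rows)

# Assumed CFD 1 (variable pattern): when cc = '44', zip determines street.
# A violation is a zip value that maps to more than one street among those rows.
variable_violations = conn.execute(
    """
    SELECT zip FROM cust
    WHERE cc = '44'
    GROUP BY zip
    HAVING COUNT(DISTINCT street) > 1
    """
).fetchall()

# Assumed CFD 2 (constant pattern): when cc = '01' and ac = '908', city must be 'MH'.
# A single row can violate this rule on its own.
constant_violations = conn.execute(
    "SELECT street, city FROM cust "
    "WHERE cc = '01' AND ac = '908' AND city <> 'MH'"
).fetchall()

print(variable_violations)   # [('EH4 1DT',)]
print(constant_violations)   # [('Main St', 'NYC')]
```

Note that a plain FD "zip determines street" would also flag the US rows, whereas the conditional version applies only where the binding `cc = '44'` holds.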
- Copyright:
- 2007 by The University of Edinburgh. All Rights Reserved
- Bibtex format
@InProceedings{EDI-INF-RR-0949,
  author = {Wenfei Fan and Philip Bohannon and Floris Geerts and Xibei Jia and Anastasios Kementsietsidis},
  title = {Conditional Functional Dependencies for Data Cleaning},
  booktitle = {Data Engineering, 2007, IEEE 23rd International Conference on},
  publisher = {IEEE},
  year = 2007,
  month = {Apr},
  volume = {2007},
  pages = {746-755},
  doi = {10.1109/ICDE.2007.367920},
  url = {http://ieeexplore.ieee.org/search/wrapper.jsp?arnumber=4221723},
}