- Abstract:
-
Normalization as a way of producing good database designs is a well-understood topic. However, the same problem of distinguishing well-designed databases from poorly designed ones arises in other data models, in particular, XML. While in the relational world the criteria for being well-designed are usually very intuitive and clear to state, they become more obscure when one moves to more complex data models.
Our goal is to provide a set of tools for testing when a condition on a database design, specified by a {\em normal form}, corresponds to a good design. We use techniques of information theory, and define a measure of information content of elements in a database with respect to a set of constraints. We first test this measure in the relational context, providing information-theoretic justification for familiar normal forms such as BCNF, 4NF, PJ/NF, 5NFR, DK/NF. We then show that the same measure applies in the XML context, which gives us a characterization of a recently introduced XML normal form called XNF. Finally, we look at information-theoretic criteria for justifying normalization algorithms.
- Links To Paper
- 1st Link
- Bibtex format
- @Article{EDI-INF-RR-0833,
- author = {
Leonid Libkin
and Marcelo Arenas
},
- title = {An information-theoretic approach to normal forms for relational and XML data.},
- journal = {Journal of the ACM},
- publisher = {ACM},
- year = 2005,
- volume = {52},
- pages = {246-283},
- doi = {10.1145/1059513.1059519},
- url = {http://homepages.inf.ed.ac.uk/libkin/papers/jacm-pods03.ps.gz},
- }
|