| 
      
    
    Abstract:
 Curated databases in bioinformatics and other disciplines are the result of a great deal of manual annotation, correction and transfer of data from other sources. Provenance information concerning the creation, attribution, or version history of such data is crucial for assessing its integrity and scientific value. General-purpose database systems provide little support for tracking provenance, especially when data moves among databases. This paper investigates general-purpose techniques for recording provenance for data that is copied among databases. We describe an approach in which we track the user's actions while browsing source databases and copying data into a curated database, in order to record the user's actions in a convenient, queryable form. We present an implementation of this technique and use it to evaluate the feasibility of database support for provenance management. Our experiments show that although the overhead of a naive approach is fairly high, it can be decreased to an acceptable level using simple optimizations.
    Links To Paper: Submitted, anonymized version (to be replaced with conference final version); 2nd Link
    Bibtex format
@InProceedings{EDI-INF-RR-0769,
  author    = {Peter Buneman and Adriane Chapman and James Cheney},
  title     = {Provenance Management in Curated Databases},
  booktitle = {Proceedings of SIGMOD 2006 (International Conference on Management of Data)},
  publisher = {ACM},
  year      = 2006,
  month     = {Jun},
  pages     = {539--550},
  doi       = {10.1145/1142473.1142534},
  url       = {http://homepages.inf.ed.ac.uk/jcheney/publications/copypaste.pdf},
} |