Title: Catalyzer: a novel tool for integrating, managing and publishing heterogeneous bioscience data
Authors: Fred Howell; Robert Cannon; Nigel Goddard
Date: 2005
Publication Title: Concurrency
Publisher: Wiley & Sons
Publication Type: Journal Article
Publication Status: Pre-print
The integrative ambitions of systems biology and neuroinformatics, to construct working models of the machinery of living cells and brains, will founder unless researchers have access to the huge amounts of diverse experimental data being collected. Yet the vast majority of bioscience research data is never made available to other researchers, partly for want of adequate software for annotating experimental data, and partly for social reasons (researchers are rarely rewarded for publishing the actual data sets, only for journal articles summarising findings). We have developed a novel software solution aimed at making it simpler for researchers to annotate and publish their research data. The first part of this solution is a desktop application ("Catalyzer") which lets researchers structure their data at source. It complements the ad hoc solutions already in use in labs (including cryptic filenames, Word, Excel and paper lab books) while being simpler and more flexible than relational databases, which are too complex for most bioscience researchers to set up. The catalogs produced by Catalyzer are stored in XML with a user-defined schema, which will simplify future data mining efforts across large numbers of distributed data sets. We therefore term the approach "Structure At Source, Integrate As Required": the initial focus is on enabling researchers to structure their own research data, since only then will other researchers be able to integrate across data sets.
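
To make the "Structure At Source, Integrate As Required" idea concrete, the following is a minimal sketch of how XML catalogs from different labs might be queried together. The abstract does not describe Catalyzer's actual file format, so the element name `recording`, the attributes `lab`, `id`, `species`, `region` and `file`, and the helper `find_records` are all hypothetical and chosen purely for illustration.

```python
import xml.etree.ElementTree as ET

# Two hypothetical catalogs, one per lab. The element and attribute names
# are illustrative assumptions, not Catalyzer's real schema.
CATALOG_A = """<catalog lab="lab-a">
  <recording id="r1" species="rat" region="hippocampus" file="cell01.dat"/>
  <recording id="r2" species="mouse" region="cortex" file="cell02.dat"/>
</catalog>"""

CATALOG_B = """<catalog lab="lab-b">
  <recording id="x9" species="rat" region="cortex" file="2004-03-11_a.abf"/>
</catalog>"""


def find_records(catalog_docs, **criteria):
    """Return (lab, id, file) for every recording matching all criteria."""
    matches = []
    for doc in catalog_docs:
        root = ET.fromstring(doc)
        for rec in root.iter("recording"):
            # A record matches when every requested attribute has the
            # requested value; missing attributes never match.
            if all(rec.get(key) == value for key, value in criteria.items()):
                matches.append((root.get("lab"), rec.get("id"), rec.get("file")))
    return matches


# Integrate as required: one query spanning both labs' catalogs.
rat_records = find_records([CATALOG_A, CATALOG_B], species="rat")
```

Because each record carries its annotations as structured fields rather than in a cryptic filename, a query across independently produced catalogs reduces to a simple attribute match, provided the labs use compatible field names.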
