Bertrand Henri Rihm, Solveig Vidal, Claude Nemurat, Sébastien Vachenc, Steve Mohr, Florian Mazur, Philippe Houdry, Francoise Grandjean, Sophie Visvikis, Jacques Ducloy
Med Sci Monit 2003; 9(8): MT89-95
Background: Current biological investigations tend to operate with genomes, instead of genes as during the last century. It is possible to compare entire genomes, transcriptomes or proteomes, using alphanumeric data corresponding to the differential expression levels of thousands of genes. What remains difficult is to link array results to factual or bibliographical data and retrieve information that is highly structured and - in Shannon’s sense - rare.
Material/Methods: We have developed a tool, Documentation and Information LIBrary (DILIB), that enables us to retrieve, organize and analyze huge amounts of data available on the Internet and related to microarray experiments. DILIB can link hundreds of differentially expressed genes – through their Single Identifier or GenBank accession number – to hundreds of Medline records, and can retrieve, analyze, and compare automatically thousands of non-trivial descriptors related to gene clusters.
Results: As exemplified with frequency comparison of MEdical Subject Headings and Registry Number descriptors, we reanalyzed the involvement of ‘integrin’, ‘interleukin’ and ‘CD Antigens’ in mesotheliomas. Thus, DILIB allowed us to: (i) associate literature to expressed genes, (ii) link functional transcriptomes in various experiments, (iii) associate specific descriptors to experiments, (iv) define new research areas, and eventually (v) find new functions for co-expressed genes.
Conclusions: We propose a new concept, ‘bibliomics’, representing a subset of high quality and rare information, retrieved and organized by systematic literature-searching tools from existing databases, and related to a subset of genes functioning together in ‘-omic’ sciences.
Keywords: Databases, Genetic; Databases, Nucleic Acid; Gene Expression Profiling; Genomics; Humans; Information Storage and Retrieval - methods; MEDLINE; Oligonucleotide Array Sequence Analysis; Software
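The frequency comparison of descriptors mentioned in the Results can be illustrated with a minimal Python sketch. This is not DILIB's implementation, which the abstract does not describe; the function names, the data layout (one list of descriptors per Medline record) and the toy records are all assumptions made for illustration only.

```python
from collections import Counter


def descriptor_frequencies(records):
    """Return, for each descriptor, the fraction of records that carry it.

    `records` is a list of descriptor lists, one per bibliographic record
    (hypothetical layout, not DILIB's actual format).
    """
    counts = Counter()
    for descriptors in records:
        # Count each descriptor at most once per record.
        counts.update(set(descriptors))
    total = len(records)
    return {d: n / total for d, n in counts.items()} if total else {}


def compare_frequencies(cluster_records, reference_records):
    """Rank descriptors by how much more frequent they are in the cluster."""
    cluster = descriptor_frequencies(cluster_records)
    reference = descriptor_frequencies(reference_records)
    rows = []
    for d in set(cluster) | set(reference):
        c = cluster.get(d, 0.0)
        r = reference.get(d, 0.0)
        rows.append((d, c, r, c - r))
    # Largest enrichment first; ties broken alphabetically.
    rows.sort(key=lambda row: (-row[3], row[0]))
    return rows


# Toy, invented records: descriptors attached to papers linked to a gene
# cluster versus papers from a reference set.
cluster_records = [
    ["Integrins", "Mesothelioma"],
    ["Integrins", "Antigens, CD"],
    ["Interleukins"],
]
reference_records = [
    ["Mesothelioma"],
    ["Interleukins", "Mesothelioma"],
]

rows = compare_frequencies(cluster_records, reference_records)
for descriptor, in_cluster, in_reference, difference in rows:
    print(f"{descriptor}: {in_cluster:.2f} vs {in_reference:.2f} ({difference:+.2f})")
```

A plain difference of proportions is used here for brevity; a real analysis would more likely apply a statistical test suited to count data and correct for the large number of descriptors compared.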