
May/June 2006: Issue 130

Introducing PubChem:
An Entrez Database of Small Molecules

The NCBI has added three new databases to the Entrez search and retrieval system: PubChem Substance, PubChem Compound, and PubChem Bioassay. Together they link small organic molecules to bioactivity assays, PubMed abstracts, and protein sequences and structures. [Entrez is the integrated, text-based search and retrieval system used at NCBI for the major databases, including PubMed, Nucleotide and Protein Sequences, Protein Structures, Complete Genomes, Taxonomy, and others.] PubChem also provides a fast chemical structure similarity search tool.
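As a rough illustration of how the three databases might be reached through Entrez from a script, the sketch below builds a search URL for NCBI's E-utilities `esearch` service. The endpoint and the database names `pcsubstance`, `pccompound`, and `pcassay` are not taken from this article; they are assumptions based on NCBI's public E-utilities documentation, and `build_search_url` is a hypothetical helper name. The code only constructs the URL and sends no request.

```python
from urllib.parse import urlencode

# Assumed E-utilities search endpoint (not stated in this article).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Assumed Entrez database names for the three PubChem databases.
PUBCHEM_DBS = {
    "substance": "pcsubstance",
    "compound": "pccompound",
    "bioassay": "pcassay",
}


def build_search_url(database, term, max_results=20):
    """Return an esearch URL for one PubChem database; nothing is sent."""
    if database not in PUBCHEM_DBS:
        raise ValueError(f"unknown PubChem database: {database}")
    params = {"db": PUBCHEM_DBS[database], "term": term, "retmax": max_results}
    return ESEARCH_URL + "?" + urlencode(params)


# Example: a text search of PubChem Compound for "aspirin".
print(build_search_url("compound", "aspirin"))
```

Opening the printed URL in a browser should return a list of matching record identifiers, provided the assumed endpoint and database names are still current.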

PubChem Substance currently contains over 800,000 chemical samples imported from 14 public sources, including ChemIDplus, the Developmental Therapeutics Program at NCI, KEGG, NCBI MMDB, and the NIST Chemistry WebBook. Chemical entities in PubChem Substance records that have known structures are validated, converted to a standardized form, and imported into PubChem Compound.

This standardization allows NCBI to compute chemical parameters and similarity relationships between compounds. The compounds are grouped into levels of chemical similarity, from most general to most specific: same bonding connectivity and any tautomer; same bonding connectivity; same stereochemistry; same isotopes; and same stereochemistry and isotopes. These groups are provided as Entrez links that allow similar compounds to be retrieved quickly. PubChem Compound also indexes these chemicals using 34 fields, many of which represent computed chemical properties such as the number of chiral centers, the number of hydrogen bond donors/acceptors, molecular formula and weight, total formal charge, and octanol-water partition coefficients (XLogP).

The third database, PubChem Bioassay, currently includes 173 bioactivity studies from the Developmental Therapeutics Program at NCI, and each of these studies is linked to records in PubChem Substance. The PubChem Bioassay interface allows users to view substances that meet certain activity and/or chemical criteria, and the matching records can either be viewed in PubChem Substance or downloaded in several formats.
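To make the idea of selecting records by computed chemical properties more concrete, here is a minimal local sketch. It does not query PubChem. The `CompoundRecord` class, the `filter_compounds` function, and every value in the sample list are hypothetical placeholders invented for illustration; only the property names come from the description above.

```python
from dataclasses import dataclass


@dataclass
class CompoundRecord:
    """Hypothetical stand-in for a record with computed properties."""
    name: str
    molecular_weight: float
    h_bond_donors: int
    h_bond_acceptors: int
    chiral_centers: int
    xlogp: float


def filter_compounds(records, max_weight, max_donors, max_xlogp):
    """Keep the records whose properties fall at or below each limit."""
    return [
        r for r in records
        if r.molecular_weight <= max_weight
        and r.h_bond_donors <= max_donors
        and r.xlogp <= max_xlogp
    ]


# Placeholder values for illustration only; these are not PubChem data.
records = [
    CompoundRecord("example-1", 180.2, 1, 4, 0, 1.2),
    CompoundRecord("example-2", 720.9, 6, 12, 3, 5.8),
    CompoundRecord("example-3", 46.1, 1, 1, 0, -0.1),
]

matches = filter_compounds(records, max_weight=500.0, max_donors=5, max_xlogp=5.0)
print([r.name for r in matches])  # prints ['example-1', 'example-3']
```

The thresholds here are arbitrary. In practice the equivalent selection would be done through the indexed search fields of PubChem Compound or the criteria offered by the PubChem Bioassay interface.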

As part of the Entrez system, the three PubChem databases are linked to several related Entrez databases, including PubMed, Protein, and Structure. The Protein and Structure links reveal proteins known to interact with a compound and protein structures that contain the compound as a bound ligand. The reverse links add new functionality as well: ligands within structures can now be identified instantly through the link to PubChem Compound, as can chemicals described in PubMed abstracts.
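The forward and reverse links described above can also be followed from a script. The sketch below builds URLs for NCBI's E-utilities `elink` service, which is not mentioned in this article; the endpoint, the database names `pccompound` and `pubmed`, and the helper `build_link_url` are assumptions based on NCBI's public documentation, and the record identifiers are placeholders. No request is sent.

```python
from urllib.parse import urlencode

# Assumed E-utilities link endpoint (not stated in this article).
ELINK_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"


def build_link_url(source_db, target_db, record_id):
    """Return an elink URL from one Entrez database to another."""
    params = {"dbfrom": source_db, "db": target_db, "id": record_id}
    return ELINK_URL + "?" + urlencode(params)


# Forward link: from a compound record to related PubMed abstracts.
# "2244" is a placeholder compound identifier.
print(build_link_url("pccompound", "pubmed", "2244"))

# Reverse link: from a PubMed abstract to the compounds it describes.
# "12345" is a placeholder PubMed identifier.
print(build_link_url("pubmed", "pccompound", "12345"))
```

Swapping the source and target databases is all it takes to reverse the direction of a link, which mirrors how the links work in the Entrez web interface.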

PubChem Bioassay supports searching by bioactivity. The database contains bioactivity screens of the chemical substances described in PubChem Substance. It provides searchable descriptions of each bioassay, including the conditions and readouts specific to that screening procedure. Links to PubChem Substance and PubChem Compound are provided for the screened chemicals so that they can be explored further.


This information was compiled from NCBI News and the PubChem Database http://www.ncbi.nlm.nih.gov/Web/Newsltr/SummerFall04/pubchem.html