PubChem Glossary
A #
AID
PubChem's BioAssay (protocol) identifier, a non-zero integer.
Accession
NCBI's Protein database identifier
B #
BioActivity Types
- IC50: the concentration of a compound at which 50% of its maximal inhibitory effect is observed (See https://en.wikipedia.org/wiki/IC50)
- EC50: the concentration of a compound at which 50% of its maximal effect is observed (See https://en.wikipedia.org/wiki/EC50)
- Kd: the equilibrium dissociation constant for the ligand, determined directly in a binding assay using a labelled ligand (See http://www.guidetopharmacology.org/helpPage.jsp and https://en.wikipedia.org/wiki/Dissociation_constant)
- Ki: the equilibrium dissociation constant for the ligand, determined in inhibition studies (See http://www.guidetopharmacology.org/helpPage.jsp and https://en.wikipedia.org/wiki/Competitive_inhibition)
- AC50/Potency: the concentration of a compound at which 50% of the activity is observed. AC50 and Potency are often used interchangeably among PubChem BioAssay submissions, and may represent IC50, EC50, CC50, etc. Please refer to the specific BioAssay record for details.
C #
CID
PubChem's compound identifier, a non-zero integer for a unique chemical structure.
Complexity
The complexity rating of a compound is a rough estimate of how complicated the structure is, seen from the point of view of both the elements contained and the displayed structural features, including symmetry. However, neither stereochemistry nor isotope labelling is used as an auxiliary criterion. The value is computed using the Bertz/Hendrickson/Ihlenfeldt formula, described in these papers:
- The first general index of molecular complexity. S.H. Bertz, J. Am. Chem. Soc., 1981, 103 (12), pp 3599–3601.
- Molecular complexity: a simplified formula adapted to individual atoms. Hendrickson et al., J. Chem. Inf. Comput. Sci., 1987, 27 (2), pp 63–67.
A scaling factor for aromaticity is used so that the complexity of benzene is the same as that of cyclohexane. It is a floating point value, ranging from 0 (simple ions) to several thousand (complex natural products). Generally, larger compounds are more complex than smaller ones, but highly symmetrical compounds, or compounds with few distinct atom types or elements, are downgraded. Complexity is only loosely correlated with synthetic accessibility.
Comments
Lists all of the depositor's comments and additional information for this substance.
Component
For a mixture substance or compound, a component is one of its individual molecules.
Compound
Compounds are the chemical structures represented in substances. The chemical structure presented in a compound is standardized through PubChem's data pipeline. A mixture substance may have several standardized compounds. A compound record is structurally unique in the PubChem Compound database.
Computed Descriptors
Information that describes the compound in different formats, including SMILES, InChI, and IUPAC names.
Computed Properties
Properties that can be calculated for each compound, including molecular weight, molecular formula, XLogP, etc.
Covalently-Bonded Unit
A group of atoms connected by covalent bonds, ignoring other bond types (or a single atom without covalent bonds). The "covalently-bonded unit count" property is the number of such units in a compound.
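The counting rule above can be sketched as a connected-components problem. The following is a minimal illustration, not PubChem's implementation: the function name `covalent_unit_count` and the input format (an atom count plus a list of covalent bonds as index pairs) are assumptions made for this example, and the caller is assumed to have already left out non-covalent bonds.

```python
def covalent_unit_count(num_atoms, covalent_bonds):
    """Count groups of atoms connected by covalent bonds.

    num_atoms: total number of atoms, indexed 0..num_atoms-1.
    covalent_bonds: iterable of (i, j) atom-index pairs. Other bond
    types (ionic, complex, ...) are assumed to be excluded already.
    """
    parent = list(range(num_atoms))

    def find(i):
        # Follow parent links to the representative, shortening the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in covalent_bonds:
        # Merge the two groups joined by this bond.
        parent[find(i)] = find(j)

    # Each distinct representative is one covalently-bonded unit;
    # an atom without covalent bonds is a unit on its own.
    return len({find(i) for i in range(num_atoms)})


# Sodium acetate, heavy atoms only: C0-C1, C1=O2, C1-O3, plus Na4.
# The ionic Na...O interaction is not listed as a covalent bond.
print(covalent_unit_count(5, [(0, 1), (1, 2), (1, 3)]))  # 2
```

A union-find structure is used here only because it keeps the sketch short; a breadth-first search over the bond graph would give the same count.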
D #
Deprecated Compound
A Compound CID which has no links to any substance. This may occur as PubChem modifies processing. A deprecated compound will not be available within Entrez.
E #
Emergency Response Guidebook (ERG)
The Emergency Response Guidebook (ERG) is designed for use by first responders (fire fighters, police, and other emergency services personnel) who may be the first to arrive at the scene of a transportation incident involving dangerous goods. ERG is a guide to aid first responders in quickly identifying the specific or generic hazards of the material(s) involved in the incident and protecting themselves and the general public during the initial response phase of the incident. It was developed jointly by Transport Canada (TC), the U.S. Department of Transportation (DOT), the Secretariat of Communications and Transport of Mexico (SCT) and with the collaboration of CIQUIME (Centro de Información Química para Emergencias) of Argentina.
Entrez
Entrez is NCBI’s primary text search and retrieval system that integrates the PubMed database of biomedical literature with 38 other literature and molecular databases including PubChem's BioAssay, Compound and Substance databases as well as DNA and protein sequence, structure, gene, genome, genetic variation and gene expression.
F #
G #
GeneID
NCBI's Gene database identifier
H #
HBA
Number of hydrogen bond acceptors in the structure. The classification follows [J. Chem. Inf. Comput. Sci. 1997, 37, 615-621].
HBD
Number of hydrogen bond donors in the structure. The classification follows [J. Chem. Inf. Comput. Sci. 1997, 37, 615-621].
I #
InChI
IUPAC International Chemical Identifier. An InChI string can be searched through the Entrez PubChem databases.
J #
K #
L #
M #
Molecular Formula
A chemical formula that indicates the kinds of atoms and the number of each kind in a molecule. It is a way of expressing information about the atoms that constitute a particular chemical molecule.
Molecular Weight
The molecular weight is the sum of all atomic weights of the constituent atoms in a compound, measured in g/mol. In the absence of explicit isotope labeling, averaged natural abundance (which may, for example in the case of Li and U compounds, not be identical to purchasable material) is assumed. If an atom bears an explicit isotope label, 100% isotopic purity is assumed at this location, even for short-lived radioactive isotopes where this is often physically unrealistic. At present, it is not possible to deposit more detailed isotope composition information into the PubChem database. Pseudo-atoms which are not an element have an atomic weight of 0 g/mol.
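The three rules above (natural-abundance averages by default, a single isotope mass when an atom is explicitly labeled, and 0 g/mol for pseudo-atoms) can be sketched as follows. This is an illustration under stated assumptions rather than PubChem's code: the function name, the `(element, mass_number)` input format, and the pseudo-atom symbols `"*"` and `"R"` are hypothetical, and the weight tables hold only a few approximate values.

```python
# Approximate standard atomic weights (natural abundance), in g/mol.
# Only a few elements are listed for illustration.
AVERAGE_WEIGHT = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}

# Approximate masses of specific isotopes, keyed by (element, mass number).
ISOTOPE_MASS = {("H", 2): 2.014, ("C", 13): 13.003}

# Hypothetical symbols standing for pseudo-atoms that are not elements.
PSEUDO_ATOMS = {"*", "R"}


def molecular_weight(atoms):
    """Sum atomic weights over (element, mass_number_or_None) pairs."""
    total = 0.0
    for element, mass_number in atoms:
        if element in PSEUDO_ATOMS:
            # Pseudo-atoms contribute 0 g/mol.
            continue
        if mass_number is not None:
            # Explicit isotope label: 100% isotopic purity is assumed.
            total += ISOTOPE_MASS[(element, mass_number)]
        else:
            # No label: averaged natural abundance is assumed.
            total += AVERAGE_WEIGHT[element]
    return total


water = [("H", None), ("H", None), ("O", None)]
heavy_water = [("H", 2), ("H", 2), ("O", None)]
print(f"{molecular_weight(water):.3f}")        # 18.015
print(f"{molecular_weight(heavy_water):.3f}")  # 20.027
```

An element missing from the tables raises a `KeyError` instead of silently counting as zero, so that only declared pseudo-atoms are treated as weightless.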
N #
O #
Old Version Substance
Substance versions are considered to be "old" when a more recent update is provided by the depositor.
P #
Parent Compound
A parent compound is conceptually the "important" part of the molecule when the molecule has more than one covalent component. Specifically, a parent component must have at least one carbon and contain at least 70% of the heavy (non-hydrogen) atoms of all the unique covalent units (ignoring stoichiometry). Note that this is a very empirical definition and is subject to change. For example, the "parent" compound in tetracycline hydrochloride (CID 54704426) and tetracycline metaphosphate (CID 54729668) is tetracycline (CID 54675776).
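The selection rule above can be sketched in a few lines. This is only an illustration of the stated criteria, which the glossary itself describes as empirical and subject to change. The function name `find_parent`, the element-count dictionaries, and the use of identical dictionaries as a stand-in for "the same covalent unit" are assumptions of this example; a real implementation would compare structures, not formulas.

```python
def find_parent(units, threshold=0.70):
    """Return the parent unit among covalent units, or None.

    units: list of dicts mapping element symbol -> atom count, one per
    covalently-bonded unit. Identical dicts are treated as the same unit,
    a simplification standing in for real structure comparison.
    """
    def heavy_atoms(unit):
        # Heavy atoms are all atoms other than hydrogen.
        return sum(n for element, n in unit.items() if element != "H")

    # Ignore stoichiometry: keep each distinct unit once.
    unique = []
    for unit in units:
        if unit not in unique:
            unique.append(unit)

    total_heavy = sum(heavy_atoms(u) for u in unique)
    if total_heavy == 0:
        return None

    for unit in unique:
        has_carbon = unit.get("C", 0) > 0
        if has_carbon and heavy_atoms(unit) >= threshold * total_heavy:
            return unit
    return None


# Tetracycline hydrochloride, represented here as two covalent units.
tetracycline = {"C": 22, "H": 24, "N": 2, "O": 8}
hydrogen_chloride = {"H": 1, "Cl": 1}
print(find_parent([tetracycline, hydrogen_chloride]) == tetracycline)  # True
```

In this example tetracycline holds 32 of the 33 heavy atoms across the unique units, well above the 70% threshold, and it contains carbon, so it is returned as the parent.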
PMID
NCBI's PubMed database identifier
Q #
R #
Revoked BioAssay
When a depositor removes an assay that the depositor previously deposited into PubChem, the assay is considered revoked. A revoked assay will not be available within Entrez.
Revoked Substance
When a depositor removes a substance from their substance collection, the substance is considered revoked. A revoked substance will not be available within Entrez.
S #
SID
PubChem's substance identifier, a non-zero integer for a deposited substance.
SMILES
Simplified Molecular Input Line Entry System, a line notation (a typographical method using printable characters) for entering and representing molecules. PubChem computes two kinds of SMILES strings:
- Canonical SMILES: a unique SMILES string of a compound, generated by a “canonicalization” algorithm.
- Isomeric SMILES: a SMILES string with stereochemical and isotopic specifications.
In nearly all situations, one should use the Isomeric SMILES, unless stereochemical and isotopic information is not desired.
SMARTS
A language that allows you to specify substructures using rules that are straightforward extensions of SMILES.
Source Category
Source category, such as chemical vendors or governmental organizations, is a general-purpose grouping that describes the contributing organization.
Stereochemistry
Relative spatial arrangement of atoms within molecules, such as chirality.
Substance
An individual record collected from depositors, representing a sample used in a BioAssay.
Suppressed Compound
A Compound CID that links only to an old version substance. A suppressed compound will not be available within Entrez.
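Taken together with the Deprecated Compound entry, this definition suggests a simple classification by substance links. The sketch below is a hypothetical illustration, not a PubChem API: the function name, the boolean input format, and the label `"live"` for every other case are assumptions of this example.

```python
def compound_status(linked_substances):
    """Classify a compound by its substance links.

    linked_substances: list of booleans, one per linked substance,
    True if that substance is an old version.
    """
    if not linked_substances:
        # No links to any substance.
        return "deprecated"
    if all(linked_substances):
        # Every linked substance is an old version.
        return "suppressed"
    # At least one current substance links to the compound.
    return "live"


print(compound_status([]))             # deprecated
print(compound_status([True, True]))   # suppressed
print(compound_status([True, False]))  # live
```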
Synonyms
All names, trivial names, synonyms, frequently used IDs, and other names collected from depositors. On the compound summary page, the synonyms shown are the distinct synonyms from all corresponding substances.
T #
TPSA
Topological Polar Surface Area. This is an estimate of the polar surface area (in Å²). The implementation follows the paper by Ertl et al. [J. Med. Chem. 2000, 43, 3714-3717]. It is a simple method: only N and O are considered, 3D coordinates are not used, and there are various precomputed factors for different hybridizations, charges and participation in aromatic systems.
U #
V #
Version
The PubChem substance version number is incremented when an update is provided by the depositor.
W #
X #
Xref
The external references/links to PubChem database records.
XLogP
A computationally predicted octanol-water partition coefficient (or distribution coefficient). It is used as a measure of the hydrophilicity or hydrophobicity of a molecule. Since 2009, PubChem has used version 3 of the algorithm to generate the XLogP value, which is described in the paper by Cheng et al.
Y #
Z #
National Library of Medicine, 8600 Rockville Pike, Bethesda, MD 20894