Upload Chemicals
Once you have opened an account, log in to PubChem Upload and click New Submission > Substances.
Format a File Or Fill Out a Web Form
If you have only a few records to upload, use our web form to fill in your information.
To upload many chemical substances, such as a product catalog, it is most convenient to prepare a file. You can use either a standard spreadsheet file (such as CSV) or a chemistry-specific SDF file. In either case, the main objectives are the same, as outlined below.
You Can Upload a Spreadsheet
- First row contains PubChem standard tags defining each column
- Each subsequent row defines one substance record
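As a rough illustration of this layout, the following Python sketch writes a small CSV file with a header row of tags followed by one row per substance. The record values (such as CAT-0001) are made up for the example, and the choice of columns is an assumption, not a required set.

```python
import csv
import io

# Hypothetical catalog records; the identifiers and names are illustrative only.
records = [
    {
        "PUBCHEM_EXT_DATASOURCE_REGID": "CAT-0001",
        "PUBCHEM_EXT_DATASOURCE_SMILES": "CC(=O)C",
        "PUBCHEM_SUBSTANCE_SYNONYM": "acetone",
    },
    {
        "PUBCHEM_EXT_DATASOURCE_REGID": "CAT-0002",
        "PUBCHEM_EXT_DATASOURCE_SMILES": "",  # no structure available
        "PUBCHEM_SUBSTANCE_SYNONYM": "tea extract",
    },
]


def to_csv(rows):
    """Return CSV text: a header row of tags, then one row per record."""
    fieldnames = list(rows[0].keys())
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv(records)
print(csv_text)
```

Using csv.DictWriter keeps each value under its tag even if the column order changes later.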
Or You Can Upload a Chemical SDF File
Most chemical software packages output chemical structures in SDF
An SDF file can be used to define your structures in PubChem.
While SDF is more complicated than a spreadsheet, it offers the following advantages:
- Annotation tags more easily support multi-line values
- Chemical structures can be specified with detailed MOL format
- Chemical software packages typically output data in SDF
Map Your Tags
The primary preparation required is to map your existing tags, e.g. Catalog-Number, to our PubChem standard tags, e.g. PUBCHEM_EXT_DATASOURCE_REGID, so that our parser can understand your content. In some cases, such as synonyms, the same tag can be used for multiple fields; their values will be listed line by line.
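One way to do this mapping before upload is a simple rename step. The sketch below assumes your records are Python dictionaries; the local names Name and Product-Page, the TAG_MAP table, and the map_tags helper are hypothetical and only illustrate the idea.

```python
# Hypothetical mapping from local column names to PubChem standard tags.
TAG_MAP = {
    "Catalog-Number": "PUBCHEM_EXT_DATASOURCE_REGID",
    "Name": "PUBCHEM_SUBSTANCE_SYNONYM",
    "Product-Page": "PUBCHEM_EXT_SUBSTANCE_URL",
}


def map_tags(record, tag_map=TAG_MAP):
    """Rename known keys to PubChem tags; report keys that have no mapping."""
    mapped = {}
    unmapped = []
    for key, value in record.items():
        if key in tag_map:
            mapped[tag_map[key]] = value
        else:
            unmapped.append(key)
    return mapped, unmapped


mapped, unmapped = map_tags(
    {"Catalog-Number": "CAT-0001", "Name": "acetone", "Shelf": "B2"}
)
print(mapped)
print(unmapped)
```

Returning the unmapped keys makes it easy to spot local fields that still need a decision.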
Give Each Record a Unique Identifier
Provide a unique identifier to track your record for future updates and revocations
Use the following tag in your file:
- PUBCHEM_EXT_DATASOURCE_REGID
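Before uploading, it can help to confirm that every record has an identifier and that none is repeated. The following is a minimal sketch, assuming records are held as Python dictionaries keyed by tag; find_duplicate_regids is a hypothetical helper, not part of any PubChem tool.

```python
from collections import Counter


def find_duplicate_regids(records, tag="PUBCHEM_EXT_DATASOURCE_REGID"):
    """Return (sorted duplicate identifiers, count of records with no identifier)."""
    counts = Counter(str(r.get(tag, "")).strip() for r in records)
    missing = counts.pop("", 0)
    duplicates = sorted(regid for regid, n in counts.items() if n > 1)
    return duplicates, missing


# Illustrative records only.
sample = [
    {"PUBCHEM_EXT_DATASOURCE_REGID": "CAT-0001"},
    {"PUBCHEM_EXT_DATASOURCE_REGID": "CAT-0002"},
    {"PUBCHEM_EXT_DATASOURCE_REGID": "CAT-0001"},
    {},
]
duplicates, missing = find_duplicate_regids(sample)
print(duplicates, missing)
```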
Add a Structure If You Have One
Expose your information on our highly-visible Compound page
If you have a chemical structure that we can standardize, we will link your information to a highly-visible Compound summary page for that structure.
If you provide something like 'tea extract', there is no structure to give, which is fine. We do not accept molecular formulas as structure definitions. We accept three structure tags:
- PUBCHEM_EXT_DATASOURCE_SMILES
CC(=O)C
- PUBCHEM_EXT_DATASOURCE_INCHI
InChI=1S/C3H6O/c1-3(2)4/h1-2H3
- PUBCHEM_EXT_DATASOURCE_CID
180
In addition, with an SDF file, you can use the MOL section to define your structure instead of using a tag.
List Common Synonyms
Synonyms, listed one per line, associate your structure with common names for text search
Use the following tag in your file:
- PUBCHEM_SUBSTANCE_SYNONYM
Synonyms include common names, registry IDs (like CAS numbers), chemical names, and trade names. This tag can hold multiple synonyms, either by putting each on a separate line or by repeating the tag with a different synonym for each occurrence.
Example entry (one synonym per line):
50-78-2
2-acetoxybenzoic acid
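In an SDF file, a multi-line value like this is usually written as a data item: a header line of the form "> <TAG>", the value lines, and a blank line to close the item. The sketch below builds only that data item and assumes the general SDF data-item layout; sdf_data_item is a hypothetical helper, and the MOL section and the record terminator are deliberately left out.

```python
def sdf_data_item(tag, values):
    """Format one SDF data item: header line, one value per line, blank line."""
    lines = [f"> <{tag}>"]
    lines.extend(values)
    lines.append("")  # a blank line ends the data item
    return "\n".join(lines) + "\n"


item = sdf_data_item(
    "PUBCHEM_SUBSTANCE_SYNONYM",
    ["50-78-2", "2-acetoxybenzoic acid"],
)
print(item)
```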
Synonyms are the primary keywords by which a substance is known and found via text search. They are collected to name our aggregated PubChem Compound records. Your records are most frequently discovered through your high-quality synonyms and chemical structure. Synonyms must be ASCII text; as with other input, all special characters and HTML tags will be converted to text or stripped out as appropriate.
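To see in advance which synonyms might be altered, you could flag any that contain non-ASCII characters. This sketch only reports them; it does not reproduce how PubChem converts or strips characters, which is not described here. The helper name non_ascii_synonyms is hypothetical.

```python
def non_ascii_synonyms(synonyms):
    """Return the synonyms that contain at least one non-ASCII character."""
    return [s for s in synonyms if not s.isascii()]


flagged = non_ascii_synonyms(["aspirin", "2-acetoxybenzoic acid", "α-tocopherol"])
print(flagged)
```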
Provide a URL To a Specific Page On Your Site
Drive web traffic back to your site
If your website has a specific page for each molecule, use the following tag to provide the URLs:
- PUBCHEM_EXT_SUBSTANCE_URL
Here are some guidelines for this type of URL:
- Make it specific to the record, not a general location like a home page.
- Keep the page outside of a login or paywall mechanism.
- Point to a page on your website, not a third party.
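Some of these guidelines can be roughly pre-checked in a script. The sketch below is a heuristic only, assuming that a record-specific URL has a non-empty path or query string and that your site is identified by a single domain; looks_record_specific and example.com are hypothetical. It cannot tell whether a page sits behind a login or paywall.

```python
from urllib.parse import urlparse


def looks_record_specific(url, own_domain):
    """Heuristic: URL is on own_domain and points past the home page."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = parts.hostname or ""
    on_own_site = host == own_domain or host.endswith("." + own_domain)
    # A bare "/" path with no query string is treated as a home page.
    has_specific_path = parts.path.strip("/") != "" or parts.query != ""
    return on_own_site and has_specific_path


print(looks_record_specific("https://www.example.com/catalog/CAT-0001", "example.com"))
print(looks_record_specific("https://www.example.com/", "example.com"))
```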