BioAssay Tag Names


Mandatory Tags


A required identifier you create that is unique amongst your bioassay records in PubChem. This identifier never changes; if we see it again in a second submission, we treat it as a replacement update for the record and create a new version. You can also use it to revoke records with the PUBCHEM_REVOKE_SUBSTANCE tag.


Please use a fairly short string of ASCII text letters and numbers that have meaning to you so it is easy to reference for updates in the future.
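As a rough illustration of the advice above, a submitter could check candidate identifiers locally before depositing. This is only a sketch: the 32-character limit and the allowance for underscores and hyphens are assumptions made for this example, not rules stated by PubChem, and `looks_like_good_assay_id` is a hypothetical helper name.

```python
import re

# Assumed local convention (not a PubChem rule): 1 to 32 ASCII letters,
# digits, underscores, or hyphens.
_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,32}")


def looks_like_good_assay_id(candidate: str) -> bool:
    """Return True if the candidate is a short, plain-ASCII identifier."""
    return _ID_PATTERN.fullmatch(candidate) is not None


print(looks_like_good_assay_id("kinase_panel_2024"))       # True
print(looks_like_good_assay_id("assay name with spaces"))  # False
```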



A short, informative name of the assay for display purposes.


Optional Tags


A definition of the assay purpose and parameters.


The protocol used to generate the assay. This might include an explanation of how the Activity Outcome and Score values in the Assay Data were determined.


Additional information that might not fit in the Description or Protocol sections.


These are Tag-Value pairs which provide a convenient place-holder for the definition of submitter-defined ontologies or other definitions outside the scope of the PubChem specification. All such comments will be searchable in PubChem.


For any assay designed to identify chemicals that interact with a target, such as an enzyme inhibition assay, please specify the sequence identifier here. For a chemical assay, the target is typically a protein, but it can also be a gene, nucleotide, or pathway id. Cell-based assays can skip this field.

Note that only one or a few targets should be identified here. If you have something like an RNAi assay, target definitions that change for each tested result should be specified in a column within the Assay Data.


Cross-references (XRefs) can be made to many NCBI database records related to your assay. This includes other PubChem BioAssays either by AID number or RegId, but it also includes PubMed Ids, Taxonomy Ids, and many other databases.

Note: Please do not duplicate the protein identifier here if it is already used in the Target section.


For most assay submissions, the assay data contains the actual data reported for each tested substance. The submitter may define as many columns as desired reporting numerical values, such as IC50s and Percent Inhibition, but also labels and database identifiers.

One column of activity outcome values must be reported to give a submitter-defined judgment call on whether each row should be considered inactive (1) or active (2).



The PubChem bioassay data model supports the presentation and annotation of profiling screening results.

Panel assays are very complex in nature and we have tried to make the interface as user-friendly as possible. Please remember, however, that extra attention should be paid to panel assay definitions and data to ensure their accuracy.

To see a panel assay example, please take a look at this kinase profiling assay.


Please indicate the type of substances tested in your assay. This helps categorize assays into types such as chemical and RNAi.


Classify your assay by how the activity outcome was defined. Choices include:

  • Screening assay - Single concentration activity observed.
  • Confirmatory assay - Concentration-response relationship observed (EC50, IC50, etc.).
  • Summary assay - Overview of and links to multiple, related assays.
  • Other - Assay does not fall into the above categories.


Classify your assay if a specialized project was used for its creation.

  • Other - Select if no other category is applicable.
  • Literature, Extracted - Select if assay data were extracted from literature by a third party (not by author or publisher).
  • Literature, Author / Publisher - Select if assay data were extracted from an article by its author or publisher.
  • RNAi Global Initiative - Select if work is from a member of the RNAi Global Initiative.
  • Assay Vendor - Select if contributed by an assay service provider.
  • NIH Molecular Libraries - Select if funded by an NIH Molecular Libraries program.


A grant number can be specified. Note that this string is not validated.


A label to be added to multiple assays for the purpose of logically grouping them.


Optional hold-until date to delay public access of assay data in PubChem. This may be useful, for example, to coordinate release of data with a journal publication.

Until the day of your hold-until date, public access is restricted but you can share access by creating a private access URL to your assay.
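The hold-until rule can be read as a simple date comparison. The sketch below assumes that data become public on the hold-until date itself (one reading of "until the day of your hold-until date"); `is_publicly_visible` is a hypothetical helper, not part of any PubChem interface.

```python
from datetime import date
from typing import Optional


def is_publicly_visible(hold_until: Optional[date], today: date) -> bool:
    """Return True if the assay would be public on the given day.

    Assumption: with no hold-until date the assay is public immediately,
    and otherwise it becomes public on the hold-until date itself.
    """
    return hold_until is None or today >= hold_until


release = date(2025, 6, 1)
print(is_publicly_visible(release, date(2025, 5, 31)))  # False
print(is_publicly_visible(release, date(2025, 6, 1)))   # True
```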



If you have previously deposited your Substance description into PubChem, you may use your Substance identifier (SID) assigned by PubChem. This must be an unsigned integer value and, in nearly all cases, your organization must have deposited the Substance associated with this SID.


You may use your own identifier for Substance descriptions previously loaded into PubChem.


This field allows the submitter to make an expert judgment call about the activity of each test result. Using a number, the value is set to 1 (inactive) or 2 (active) based on whatever means appropriate. An explanation of that determination should be provided in the Protocol or Comments section of the Assay Description.

In addition to active/inactive, this field can also be set to 3 (inconclusive), 4 (unspecified) or 5 (probe). The 'probe' designation indicates that the activity of the test result has been tested and confirmed through multiple rounds of experimental inquiry.
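The five outcome codes described here map naturally onto an enumeration. The following is a minimal sketch for client-side checking of a data file before submission; the class and function names are illustrative and are not part of any PubChem library.

```python
from enum import IntEnum


class ActivityOutcome(IntEnum):
    """Outcome codes as listed in this section."""
    INACTIVE = 1
    ACTIVE = 2
    INCONCLUSIVE = 3
    UNSPECIFIED = 4
    PROBE = 5


def parse_outcome(cell: str) -> ActivityOutcome:
    """Convert a spreadsheet cell to an outcome; raises ValueError if invalid."""
    return ActivityOutcome(int(cell.strip()))


print(parse_outcome(" 2 ").name)  # ACTIVE
```

Any value outside 1 to 5, or any non-numeric cell, raises `ValueError`, which makes bad rows easy to catch before upload.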


The activity of a test result may be assigned a normalized score between 0 and 100, where the most active result rows have scores closer to 100 and inactive rows have scores closer to 0, so that one can rank the results by this score and prioritize hits.
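How the score is computed is left to the submitter. One possible scheme, shown purely as an assumption for illustration, is to clip a percent-inhibition reading to the 0 to 100 range and round it; the function name and the sample SIDs below are hypothetical.

```python
def score_from_percent_inhibition(percent_inhibition: float) -> int:
    """Clip a percent-inhibition value to [0, 100] and round to an integer.

    This is one possible scoring scheme, not one prescribed by PubChem.
    """
    clipped = max(0.0, min(100.0, percent_inhibition))
    return int(round(clipped))


# Hypothetical readings keyed by SID, ranked from most to least active.
readings = {"101": 87.4, "102": -5.0, "103": 130.0}
scores = {sid: score_from_percent_inhibition(v) for sid, v in readings.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores)  # {'101': 87, '102': 0, '103': 100}
print(ranked)  # ['103', '101', '102']
```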


A URL may optionally be provided for Assay Data reported for this Substance in this column. This URL will be provided within PubChem displays to allow a PubChem user to link to your website, where you may choose to provide additional information or interfaces to your Assay Data, for example, dose-response curves, replicate data, etc.


Your textual annotation and comments may optionally be provided for Assay Data reported for this Substance in this column.


When you submit data, you must leave this column blank or enter a value of "0". You may optionally suppress Assay Data for this Substance by entering a value of "1" in this column. In this case, leave all other columns blank except for Column 1: PUBCHEM_SID. Suppressing Assay Data does not delete data from PubChem; rather, it eliminates all references and links to this information. However, all pre-existing links to this information will still function, and a disclaimer will be displayed specifying that this data is revoked.

You may un-revoke Assay Data for a Substance by depositing either the same or new data for this Substance. Do not revoke and submit the same substance in the same file.
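The revoke rules above (flag is blank, 0, or 1; a revoked row carries only its SID; a substance is not both revoked and submitted in one file) can be checked before upload. This is a sketch under assumptions: rows are dictionaries of column name to string, and the key `"revoke"` is a placeholder for the actual revoke column, whose tag name is not shown in this section.

```python
def check_revoke_rows(rows):
    """Return a list of problems found in rows (dicts of column -> str)."""
    problems = []
    revoked, submitted = set(), set()
    for index, row in enumerate(rows, start=1):
        sid = row.get("PUBCHEM_SID", "").strip()
        flag = row.get("revoke", "").strip()  # placeholder column name
        if flag not in ("", "0", "1"):
            problems.append(f"row {index}: revoke flag must be blank, 0, or 1")
            continue
        if flag == "1":
            revoked.add(sid)
            # A revoked row should carry nothing but its SID and the flag.
            extra = [key for key, value in row.items()
                     if key not in ("PUBCHEM_SID", "revoke") and value.strip()]
            if extra:
                problems.append(f"row {index}: revoked row has data in {extra}")
        else:
            submitted.add(sid)
    for sid in sorted(revoked & submitted):
        problems.append(f"SID {sid} is both revoked and submitted in the same file")
    return problems


rows = [
    {"PUBCHEM_SID": "101", "revoke": "1", "outcome": ""},
    {"PUBCHEM_SID": "101", "revoke": "", "outcome": "2"},
]
print(check_revoke_rows(rows))
# ['SID 101 is both revoked and submitted in the same file']
```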


Define your own result definition here, one per column. You must give it a name, and you can also specify parameters such as the data type and unit. For example, if you want to report an EC50, you can name it "EC50", set the data type to "FLOAT" and the unit to "MICROMOLAR".
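A result definition is essentially a small record of name, data type, and unit. The sketch below models the EC50 example from the text; the `ResultDefinition` class, its default values, and the "INTEGER" and "STRING" type keywords are assumptions for illustration, with only "FLOAT" and "MICROMOLAR" taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResultDefinition:
    """Hypothetical local model of one result column definition."""
    name: str
    data_type: str = "STRING"  # assumed default for this sketch
    unit: str = "NONE"         # assumed default for this sketch

    def parse(self, cell: str):
        """Convert a raw spreadsheet cell according to the data type."""
        if self.data_type == "FLOAT":
            return float(cell)
        if self.data_type == "INTEGER":
            return int(cell)
        return cell


ec50 = ResultDefinition(name="EC50", data_type="FLOAT", unit="MICROMOLAR")
print(ec50.parse("0.35"))  # 0.35
```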

General Description Items


A table column header for general description tags.


A table column header for general description values.

Result Definitions Items


This header goes in the first row, first column of the spreadsheet. Immediately under it are optional tags to define properties of result definitions, such as RESULT_UNIT. In all data rows below that, this column contains an increasing number starting from one.


  • String
  • Float
  • Integer
  • Boolean
  • PubMed Id (PMID)
  • Protein Structure Id (MMDB)
  • URL
  • Taxonomy Organism Id
  • OMIM Id
  • Gene Id
  • Probe Id
  • PubChem BioAssay Id
  • PubChem Substance Id (SID)
  • PubChem Compound Id (CID)
  • Target Name
  • Target Description
  • Target Tax-Id
  • Target Gene Id
  • Protein Target Accession
  • Nucleotide Target Accession

The result type is typically Float, Integer, Boolean, or String.

Optionally, the type can be used to specify an identifier, such as one coming from another NCBI Entrez database. For example, if PubMed Id is chosen as the type, then all data values in this column will be checked to ensure that they are valid PubMed identifiers.


  • None
  • Other
  • Percent
  • Molar
  • milliM
  • microM
  • nanoM
  • picoM
  • femtoM
  • Parts per Thousand
  • Parts per Million
  • Parts per Billion
  • milligrams per mL
  • micrograms per mL
  • nanograms per mL
  • picograms per mL
  • femtograms per mL
  • Ratio
  • Seconds
  • Reciprocal Seconds
  • Minutes
  • Reciprocal Minutes
  • Days
  • Reciprocal Days
  • milliliter / minute / kilogram
  • liter / kilogram
  • hour * nanogram / milliliter
  • centimeter / second
  • milligram / kilogram
  • Unspecified

Various units are available to better define the measurement of a given result column.


An optional description to explain what is being measured for a given result column.


An optional micromolar concentration at which this result was tested. This attribute implies that the result is biological concentration-response data.


For confirmatory assays, an optional id starting from 1 to group columns into series for defining dose-response curves. If one series is defined, all columns in that series will have a '1' in this field. A second series would use a '2' and so forth.
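To illustrate how series ids tie columns together, the sketch below groups result columns by series id and orders each series by test concentration. The tuple layout and the column names are hypothetical; they only stand in for the per-column attributes described above.

```python
from collections import defaultdict


def group_into_series(columns):
    """Group (name, concentration_uM, series_id) tuples into ordered series.

    Returns a dict mapping series id -> column names sorted by concentration.
    """
    series = defaultdict(list)
    for name, concentration_um, series_id in columns:
        series[series_id].append((concentration_um, name))
    return {
        series_id: [name for _, name in sorted(points)]
        for series_id, points in sorted(series.items())
    }


columns = [
    ("Inh_10uM", 10.0, 1),
    ("Inh_1uM", 1.0, 1),
    ("Rep2_1uM", 1.0, 2),
]
print(group_into_series(columns))
# {1: ['Inh_1uM', 'Inh_10uM'], 2: ['Rep2_1uM']}
```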


  • Is Active Concentration : 1

For confirmatory assays, this column allows an optional "1" for the one result column that summarizes the active concentration. This is typically reported as an IC50, EC50, AC50, GI50, etc., or by reporting constant parameters such as Ki.
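Because only one result column should carry this flag, a quick local check can catch duplicates. This is a sketch only; the dictionary of column names to flag cells is an assumed representation, and the function name is hypothetical.

```python
def find_active_concentration_column(flags):
    """Return the single flagged column name, or None if none is flagged.

    flags: dict mapping column name -> flag cell ("" or "1").
    Raises ValueError if more than one column is flagged.
    """
    flagged = [name for name, flag in flags.items() if flag.strip() == "1"]
    if len(flagged) > 1:
        raise ValueError(f"more than one active concentration column: {flagged}")
    return flagged[0] if flagged else None


print(find_active_concentration_column({"IC50": "1", "Hill slope": ""}))  # IC50
```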

Categorized Comments Items


A submitter-defined tag to define a categorized comment. This tag column must appear as the first column of the spreadsheet.


The value of a submitter-defined categorized comment.

Target Items


  • Protein Accession
  • Gene Id
  • Nucleotide Accession
  • Taxonomy Id

The database type of target identifier supplied.


The required, first table column header for target data. The values in this column are the actual primary identifiers from one of the accepted NCBI databases.


The optional name of the target. If left blank, a standard name from the sequence database will be used where possible.


Any additional description of the target beyond its name.


Any additional comments or annotations for the target.

XREF Items


  • PubChem BioAssay Id (AID)
  • External Assay RegId
  • PubMed Id (PMID)
  • Primary PubMed Id (about this assay)
  • MESH Index Term
  • Protein Accession
  • Nucleotide Accession
  • Gene Id
  • Protein Structure Id (MMDB)
  • PubChem Substance Id (SID)
  • PubChem Compound Id (CID)
  • Data Object Id (DOI)
  • Citation
  • Assay Homepage
  • Taxonomy Organism Id
  • OMIM Id
  • PubChem Pathway Id (BioSystems)
  • Patent Id (USPTO, EPO, WPO, etc.)
  • Substance Registry # (e.g. CAS,EC)

The database type of XRef identifier supplied. This type column must appear as the first column of the spreadsheet.


The actual identifier value from the cross-referenced NCBI database.


An explanatory text describing the relevance of this cross-referenced item to the assay.



National Library of Medicine
8600 Rockville Pike
Bethesda, MD 20894

