Publisher Springer Nature contributes millions of chemical-article links
Posted on October 26, 2017
PubChem added more than 26 million links to scientific articles, thanks to contributions from the publisher Springer Nature. Of these, 1.6 million links point to open access or free-to-read documents! (Read Springer Nature’s press release and presentation about it.)
Springer Nature includes the SpringerLink, SpringerOpen, and BioMed Central research platforms as well as the nature.com website. Combined, they include more than 10 million scientific documents spanning the primary literature, book chapters, and reference works. InfoChem, a subsidiary of Springer Nature, identified the chemicals mentioned in these scientific articles using a proprietary approach.
The Springer Nature data collection in PubChem covers over 600 thousand chemical substance records, and contains nearly 4 million scientific article descriptions (of which almost 300 thousand are open- or free-access) and 26.8 million links between chemicals and articles. The document descriptions include information such as a digital object identifier (DOI), publication title, name of the journal or book, document type, subject matter classification, language, open/free access availability, and publication year.
This contribution, which doubles the number of chemical structures in PubChem with links to the scientific literature, improves the accessibility and discoverability of information about chemicals. Nearly all link content provided by Springer Nature is novel to PubChem, with only 10% of the provided chemical structures having a previous link to the scientific literature.
Integration of the Springer Nature links and data within PubChem has opened new possibilities for organizations and researchers. As a result of the contribution, PubChem added the capability to handle DOI-based annotation content. Additional appropriate DOI-based linked content (articles, data sets, and more) can now be added to PubChem.
Springer Nature is a leading global research, educational, and professional publisher formed through the merger of Nature Publishing Group, Palgrave Macmillan, Macmillan Education, and Springer Science+Business Media.
The SpringerLink research platform provides access to more than 6 million journal articles, 3.7 million book chapters, and more than 480,000 reference works primarily in the areas of science, technology, and medicine.
Each chemical record that has a “Springer Nature References” section under Literature includes a table containing document links from Springer Nature. As an example, below are the links to the Springer Nature References section for aspirin (Compound ID 2244 and Substance ID 341138876). (Read this blog if you are not familiar with how Compounds and Substances in PubChem are different from each other.)
Click an article title to access the document on a Springer Nature website. To download all the contributed document data for a chemical record in CSV format, click the “Download” button at the top right of the table. A full table data view, opened by clicking its icon, shows additional data columns such as the DOI. By default, the articles are ordered by degree of “relevance” to the chemical as provided by Springer Nature, but both the sorting field and the sort direction can be changed through a pulldown menu.
There are multiple ways to get a complete list of PubChem Substance or Compound records with “Springer Nature References”. One can:
- use the PubChem Classification Browser
- access the PubChem Data Sources page
- directly query the Substance or Compound database
The PubChem Classification Browser lets you navigate PubChem contents using various hierarchical classification trees. The PubChem Compound TOC (Table of Contents) classification tree allows you to find all chemicals with a given annotation section. In this case, click ‘Literature’ to expand its subsections and find the ‘Springer Nature References’ section. Clicking the number next to that section then shows the compound records that have it.
The entire list of chemical substances provided by Springer Nature is available using the PubChem Data Sources page. (Read this blog to learn more about the PubChem Data Sources page.) Searching for “Springer Nature” in the list of data sources shown on the page leads to the Springer Nature data source page, which has a link to the PubChem records provided by Springer Nature. Alternatively, you can search the PubChem Compound or PubChem Substance database using the query: “Springer Nature”[sourcename].
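For readers who prefer to run that query programmatically, the sketch below builds a search URL for it. It assumes the search goes through the NCBI Entrez E-utilities `esearch` endpoint and that the Substance and Compound databases are addressed there as `pcsubstance` and `pccompound`; neither detail is stated in this post. The helper name `build_esearch_url` is hypothetical, and the sketch only constructs the URL rather than sending the request.

```python
from urllib.parse import urlencode

# Assumption: NCBI Entrez E-utilities search endpoint (not named in this post).
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# The query string given in the post, using plain ASCII quotes.
SPRINGER_NATURE_QUERY = '"Springer Nature"[sourcename]'


def build_esearch_url(db: str, retmax: int = 20) -> str:
    """Return an esearch URL for the Springer Nature source-name query.

    db     -- Entrez database name; assumed to be "pcsubstance" or "pccompound"
    retmax -- maximum number of record IDs to request
    """
    params = {
        "db": db,
        "term": SPRINGER_NATURE_QUERY,
        "retmode": "json",
        "retmax": retmax,
    }
    # urlencode percent-escapes the quotes and brackets in the query term.
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    for database in ("pcsubstance", "pccompound"):
        print(build_esearch_url(database))
```

Opening one of the printed URLs in a browser, or fetching it with any HTTP client, should return a list of matching record identifiers, provided the assumptions above hold.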