The database snapshot, Simple Query Tool, REST API, and Data Feed products all return JSON-formatted data. For simplicity, that data is organized under the same schema in all cases; that schema is informally described on this page.
Regardless of the source, each record returned consists of one DOI Object, containing resource metadata. Each DOI Object in turn contains a list of zero or more OA Location Objects.
New fields may be added at any time. In most cases this won't be a problem for existing code, since the new fields will simply go unused, but you shouldn't rely on the number of fields being fixed.
Fields marked (beta) may have their behavior changed without warning. Changes to other fields will be announced on the Unpaywall mailing list.
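One way to stay robust against newly added fields is to read only the keys your code actually needs. The sketch below is a minimal illustration, not an official client: the `WANTED_FIELDS` tuple, the `extract_known_fields` helper, and the sample record (including `some_future_field`) are all hypothetical.

```python
import json
from typing import Any

# Hypothetical selection of fields this client cares about; everything else is ignored.
WANTED_FIELDS = ("doi", "is_oa", "oa_status", "best_oa_location")


def extract_known_fields(record: dict[str, Any]) -> dict[str, Any]:
    """Return only the fields we use, tolerating extra or missing keys."""
    return {name: record.get(name) for name in WANTED_FIELDS}


# Made-up record; "some_future_field" stands in for a field added after this code was written.
raw = (
    '{"doi": "10.1234/example", "is_oa": false, "oa_status": "closed", '
    '"best_oa_location": null, "some_future_field": 1}'
)
record = extract_known_fields(json.loads(raw))
```

Because the helper looks fields up by name rather than by position or count, an extra key in the response does not change its result.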
The DOI object is more or less a row in our main database: it's everything we know about a given DOI-assigned resource, including metadata about the resource itself and information about its OA status. It includes a list of zero or more OA Location Objects, as well as a best_oa_location property that's probably the OA Location you'll want to use.
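As a rough illustration of using best_oa_location, the sketch below pulls a single URL out of a DOI Object and copes with the case where no OA location exists. The `best_oa_url` helper and both sample records are hypothetical; only the field names come from this page.

```python
from typing import Any, Optional


def best_oa_url(doi_object: dict[str, Any]) -> Optional[str]:
    """Return the URL of the best OA location, or None if there is none."""
    best = doi_object.get("best_oa_location")
    if best is None:
        return None
    # "url" holds url_for_pdf when one exists, otherwise the landing page URL.
    return best.get("url")


# Made-up records for illustration only.
open_record = {
    "doi": "10.1234/open",
    "best_oa_location": {"url": "https://example.org/paper.pdf"},
}
closed_record = {"doi": "10.1234/closed", "best_oa_location": None}

open_url = best_oa_url(open_record)
closed_url = best_oa_url(closed_record)
```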
best_oa_location | Object|null | The best OA Location Object we could find for this DOI. |
The "best" location is determined using an algorithm that prioritizes publisher-hosted content first (eg Hybrid or Gold), then prioritizes versions closer to the version of record (
Returns |
data_standard | Integer | Indicates the data collection approaches used for this resource. | Possible values
|
doi | String | The DOI of this resource. | This is always lowercase. |
doi_url | String | The DOI in hyperlink form. |
This field simply contains "https://doi.org/" prepended to the doi field. It expresses the DOI in its correct format according to the Crossref DOI display guidelines. |
genre | String|null | The type of resource. |
Currently the genre is identical to the Crossref-reported type of a given resource. The "journal-article" type is most common, but there are many others.
|
is_paratext | Boolean | Is the item an ancillary part of a journal, like a table of contents? | See here for more information on how we determine whether an article is paratext. |
is_oa | Boolean | Is there an OA copy of this resource? |
Convenience attribute; returns true when best_oa_location is not null.
|
journal_is_in_doaj | Boolean | Is this resource published in a DOAJ-indexed journal? |
Useful for defining whether a resource is Gold OA (depending on your definition; see also journal_is_oa).
|
journal_is_oa | Boolean | Is this resource published in a completely OA journal? | Useful for defining whether a resource is Gold OA. Includes any fully-OA journal, regardless of inclusion in DOAJ. This includes journals by all-OA publishers and journals that would otherwise be all Hybrid or Bronze OA. See here for more information on OA journals. |
journal_issns | String|null | Any ISSNs assigned to the journal publishing this resource. |
Separate ISSNs are sometimes assigned to print and electronic versions of the same journal. If there are multiple ISSNs, they are separated by commas. Example: 1232-1203,1532-6203 |
journal_issn_l | String|null | A single ISSN for the journal publishing this resource. |
An ISSN-L can be used as a primary key for a journal when more than one ISSN is assigned to it. Resources' journal_issns are mapped to ISSN-Ls using the issn.org table, with some manual corrections.
|
journal_name | String|null | The name of the journal publishing this resource. | The same journal may have multiple name strings (e.g. "J. Foo", "Journal of Foo", "JOURNAL OF FOO"). These have not been fully normalized within our database, so use with care. |
oa_locations | List | List of all the OA Location objects associated with this resource. |
This list is unnecessary for the vast majority of use cases, since you probably just want the best_oa_location. It's included primarily for research purposes.
|
oa_locations_embargoed (beta) | List | List of OA Location objects associated with this resource that are not yet available. | This list includes locations that we expect to be available in the future based on information like license metadata and journals' delayed OA policies. They do not affect the resource's oa_status and cannot be the best_oa_location or first_oa_location. |
first_oa_location | Object|null | The OA Location Object with the earliest oa_date. |
Returns |
oa_status | String | The OA status, or color, of this resource. |
Classifies OA resources by location and license terms as one of: gold, hybrid, bronze, green, or closed.
See here for more information on how we assign an oa_status.
|
has_repository_copy | Boolean | Whether there is a copy of this resource in a repository. | True if this resource has at least one OA Location with host_type = "repository"; false otherwise.
|
published_date | String|null | The date this resource was published. | As reported by the publishers, who unfortunately have inconsistent definitions of what counts as officially "published." Returned as an ISO 8601-formatted timestamp, generally with only year-month-day. |
publisher | String|null | The name of this resource's publisher. | Keep in mind that publisher name strings change over time, particularly as publishers are acquired or split up. |
title | String|null | The title of this resource. | It's the title. Pretty straightforward. |
updated | String | Time when the data for this resource was last updated. |
Returned as an ISO 8601-formatted timestamp. Example: 2017-08-17T23:43:27.753663 |
year | Integer|null | The year this resource was published. |
Just the year part of the published_date. |
z_authors | List of Crossref Contributor objects, or null | The authors of this resource. |
These are formatted as a list of Crossref Contributor objects, which are described in the Crossref API docs here. Contributor objects may also contain sequence elements, which at the time of writing are not included in the Crossref API docs.
|
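Several of the DOI Object fields above are plain strings with a simple internal format, so a little post-processing can make them easier to work with. The following is a minimal sketch under that assumption. The helper names `doi_to_url`, `split_issns`, and `parse_updated` are hypothetical, and the inputs are the example values quoted in the field descriptions.

```python
from datetime import datetime
from typing import Optional

DOI_URL_PREFIX = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Rebuild doi_url by prepending the resolver prefix to the doi field."""
    return DOI_URL_PREFIX + doi


def split_issns(journal_issns: Optional[str]) -> list[str]:
    """Split the comma-separated journal_issns string; null becomes an empty list."""
    if not journal_issns:
        return []
    return [part.strip() for part in journal_issns.split(",") if part.strip()]


def parse_updated(updated: str) -> datetime:
    """Parse the ISO 8601 'updated' timestamp into a naive datetime."""
    return datetime.fromisoformat(updated)


issns = split_issns("1232-1203,1532-6203")
updated_at = parse_updated("2017-08-17T23:43:27.753663")
```

The example timestamp carries no timezone offset, so the parsed value is a naive datetime; how to interpret its timezone is left to the caller.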
The OA Location object describes a particular place where we found a given OA article. The same article is often available from multiple locations, and there may be differences in format, version, and license depending on the location; the OA Location object describes these key attributes. An OA Location Object is always a child of a DOI Object.
evidence | String | How we found this OA location. | Used for debugging. Don’t depend on the exact contents of this for anything, because values are subject to change without warning. Example values:
|
host_type | String | The type of host that serves this OA location. | There are two possible values:
|
is_best | Boolean |
Is this location the best_oa_location for its resource?
|
See the DOI object's best_oa_location description for more on how we select which location is "best."
|
license | String|null | The license under which this copy is published. | We return several types of licenses:
|
oa_date | String|null | When this document first became available at this location. | oa_date is calculated differently for different host types and is not available for all oa_locations. See https://support.unpaywall.org/a/solutions/articles/44002063719 for details. |
pmh_id | String|null | OAI-PMH endpoint where we found this location. |
This is primarily for internal debugging. It's null for locations that weren't found using OAI-PMH.
|
updated | String | Time when the data for this location was last updated. |
Returned as an ISO 8601-formatted timestamp. Example: 2017-08-17T23:43:27.753663 |
url | String |
The url_for_pdf if there is one; otherwise the landing page URL.
|
When we can't find a |
url_for_landing_page | String | The URL for a landing page describing this OA copy. |
When the |
url_for_pdf | String|null | The URL with a PDF version of this OA copy. | Pretty much what it says. |
version | String | The content version accessible at this location. | We use the DRIVER Guidelines v2.0 VERSION standard to define versions of a given article; see those docs for complete definitions of terms. Here's the basic idea, though, for the three version types we support:
|
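To show how the OA Location fields might be used together, here is a small sketch that filters a DOI Object's oa_locations by host_type and finds the location flagged is_best. The helper names and the sample record are hypothetical. The string "not-a-repository" is a placeholder for illustration, not a real host_type value.

```python
from typing import Any, Optional


def repository_locations(doi_object: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the OA locations whose host_type is "repository"."""
    locations = doi_object.get("oa_locations") or []
    return [loc for loc in locations if loc.get("host_type") == "repository"]


def find_best(doi_object: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return the location flagged is_best, or None if no location is flagged."""
    for loc in doi_object.get("oa_locations") or []:
        if loc.get("is_best"):
            return loc
    return None


# Made-up record; URLs and the non-repository host_type are placeholders.
sample = {
    "oa_locations": [
        {"host_type": "repository", "is_best": False, "url": "https://repo.example.org/1"},
        {"host_type": "not-a-repository", "is_best": True, "url": "https://example.org/2"},
    ]
}

repo_copies = repository_locations(sample)
best = find_best(sample)
```

In practice the DOI Object's has_repository_copy and best_oa_location fields already expose these answers directly; the sketch is mainly useful when working from the raw oa_locations list.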