@datafire/core_ac_uk

Client library for CORE API v2

Installation and Usage

npm install --save @datafire/core_ac_uk

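let core_ac_uk = require('@datafire/core_ac_uk').create({
  apiKey: ""
});

// Call any action listed under "Actions"; getJournalByIssn is used here as an example.
core_ac_uk.getJournalByIssn({
  "issn": ""
}).then(data => {
  console.log(data);
});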

Description

You can use the CORE API to access the resources harvested and enriched by CORE. If you encounter any problems with the API, please report them to us.

Overview

The API is organised by resource type. The resources are articles, journals and repositories and are represented using JSON data format. Furthermore, each resource has a list of methods. The API also provides two global methods for accessing all resources at once.

Response format

The response for each query contains two fields: status and data. If the status indicates an error, the data field is empty. The data field contains a single object when the request is for a specific identifier (e.g. CORE ID, CORE repository ID, etc.), or a list of objects, for example for search queries. For batch requests, the response is an array of objects, each of which contains its own status and data fields. For search queries, the response contains an additional field, totalHits, which is the total number of items that match the search criteria.
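The response shapes described above (single object, list, or batch array) can be handled uniformly with a small helper. The following is a minimal sketch, not part of @datafire/core_ac_uk: the function name normalizeResponse and the returned field names ok, data and totalHits are hypothetical, and it assumes a successful status is always the string OK.

```javascript
// Hypothetical helper: turns a single response or a batch response into
// a list of { ok, status, data, totalHits } entries.
function normalizeResponse(response) {
  // Batch requests return an array; everything else returns one object.
  const items = Array.isArray(response) ? response : [response];
  return items.map(item => {
    const ok = item.status === 'OK'; // assumption: success is always "OK"
    return {
      ok,
      status: item.status,
      // On an error status the data field is empty, so fall back to null.
      data: ok && item.data !== undefined ? item.data : null,
      // Only search responses carry totalHits.
      totalHits: typeof item.totalHits === 'number' ? item.totalHits : null
    };
  });
}
```

Treating every response as a list means calling code can loop over the result without first checking whether the request was a batch request.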

Search query syntax

Complex search queries can be used in all of the API search methods. The query can be a simple string or it can be built using terms and operators described in Elasticsearch documentation. The usable field names are title, description, fullText, authors, publisher, repositories.id, repositories.name, doi, oai, identifiers (which is a list of article identifiers including OAI, URL, etc.), language.name and year. Some example queries:

title:psychology AND language.name:English
repositories.id:86 AND year:2014
identifiers:"oai:aura.abdn.ac.uk:2164/3837" OR identifiers:"oai:aura.abdn.ac.uk:2164/3843"
doi:"10.1186/1471-2458-6-309"
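Queries like the ones above can also be assembled programmatically. This is a minimal sketch with hypothetical helper names (term, allOf, anyOf); it only quotes and escapes values in the simple way shown here and does not cover the full Elasticsearch query syntax.

```javascript
// Hypothetical helpers for building field:value query strings.
function term(field, value) {
  const text = String(value);
  // Values containing anything other than letters, digits or underscores
  // are wrapped in double quotes, as in the DOI and OAI examples.
  const needsQuotes = !/^[A-Za-z0-9_]+$/.test(text);
  // Escape embedded double quotes and backslashes inside quoted values.
  const escaped = text.replace(/(["\\])/g, '\\$1');
  return field + ':' + (needsQuotes ? '"' + escaped + '"' : escaped);
}

// Join terms with the uppercase boolean operators.
function allOf(...terms) {
  return terms.join(' AND ');
}

function anyOf(...terms) {
  return terms.join(' OR ');
}

const byRepositoryAndYear = allOf(term('repositories.id', 86), term('year', 2014));
const byDoi = term('doi', '10.1186/1471-2458-6-309');
```

The resulting strings can be passed as the query input of the search actions.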

Retrieving the latest Articles

You can retrieve the harvested items since specific dates using the following queries:

repositoryDocument.metadataUpdated:>2017-02-10
repositoryDocument.metadataUpdated:>2017-03-01 AND repositoryDocument.metadataUpdated:<2017-03-31
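The date filters above can be generated from Date objects. This is a minimal sketch; the helper names toIsoDate and updatedBetween are hypothetical, and dates are formatted in UTC, which is an assumption about how the API interprets them.

```javascript
// Format a Date as YYYY-MM-DD (UTC).
function toIsoDate(date) {
  return date.toISOString().slice(0, 10);
}

// Build a query for items whose metadata was updated after `after` and,
// optionally, before `before`.
function updatedBetween(after, before) {
  const field = 'repositoryDocument.metadataUpdated';
  const parts = [field + ':>' + toIsoDate(after)];
  if (before) {
    parts.push(field + ':<' + toIsoDate(before));
  }
  return parts.join(' AND ');
}

const sinceFebruary = updatedBetween(new Date(Date.UTC(2017, 1, 10)));
const duringMarch = updatedBetween(new Date(Date.UTC(2017, 2, 1)), new Date(Date.UTC(2017, 2, 31)));
```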

Sort order

For search queries, the results are ordered by relevance score. For batch requests, the results are retrieved in the order of the requests.

Parameters

The API methods allow different parameters to be passed. Additionally, there is an API key parameter which is common to all API methods. For all API methods, the API key can be provided either as a query parameter or in the request header. If the API key is not provided, the API will return an HTTP 401 error. You can register for an API key here.
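When using this client, the key passed to create() is attached for you. For raw HTTP requests, the two placements described above might look like the sketch below. The parameter and header name apiKey is an assumption taken from the create() option, the host api.example.invalid is a placeholder, and withApiKey is a hypothetical helper.

```javascript
// Hypothetical helper: attach an API key either as a query parameter
// or as a request header. The name "apiKey" is an assumption.
function withApiKey(url, apiKey, placement) {
  const target = new URL(url);
  const headers = {};
  if (placement === 'header') {
    headers.apiKey = apiKey;
  } else {
    target.searchParams.set('apiKey', apiKey);
  }
  return { url: target.toString(), headers };
}

// Placeholder URL, not a real endpoint.
const viaQuery = withApiKey('https://api.example.invalid/search/x', 'abc', 'query');
const viaHeader = withApiKey('https://api.example.invalid/search/x', 'abc', 'header');
```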

API methods

Actions

nearDuplicateArticles

Method accepts values for several parameters and retrieves a JSON array of articles which have near duplicate content matching the input parameters' values. The response array contains ids of the near duplicate articles along with their relevance scores.

core_ac_uk.nearDuplicateArticles({}, context)

Input

input object
- doi string: The DOI for which the duplicates will be identified
- title string: Title to match when looking for duplicate articles. Only useful when either value for @year or @description is supplied.
- year string: Year the article was published. Only useful when value for @title is supplied.
- description string: Abstract for an article based on which its duplicates will be found. Only useful when value for @title is supplied.
- fulltext string: Full text for an article based on which its duplicates will be found.
- identifier string: Article identifier for which the duplicates will be identified. Only useful when values for @doi, or (@title and @year), or (@title and @description), or @fulltext are supplied.
- repositoryId string: Limit the duplicates search to particular repository id.

Output

output ArticleDedupResponse
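The "only useful when" notes on the input parameters can be checked before sending a request. This is a minimal sketch of that reading of the parameter descriptions; isUsefulDedupInput is a hypothetical helper and the API itself may accept or reject other combinations.

```javascript
// True when a value was actually supplied.
function hasValue(value) {
  return value !== undefined && value !== null && value !== '';
}

// Hypothetical check: does the input contain a combination that the
// parameter descriptions call useful?
function isUsefulDedupInput(input) {
  const has = name => hasValue(input[name]);
  // title only helps together with year or description.
  const titleCombo = has('title') && (has('year') || has('description'));
  // doi and fulltext carry no such restriction; identifier and
  // repositoryId only refine one of these combinations.
  return has('doi') || has('fulltext') || titleCombo;
}
```

For example, an input with only a title, or only an identifier, would be flagged as not useful, while a title plus year would pass.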

getArticleByCoreIdBatch

Method accepts a JSON array of CORE IDs and retrieves a list of articles. The response array is ordered based on the order of the IDs in the request array.

core_ac_uk.getArticleByCoreIdBatch({
  "body": []
}, context)

Input

input object
- body required array
  - items integer
- metadata boolean: Whether to retrieve the full article metadata or only the IDs. The default value is true
- fulltext boolean: Whether to retrieve fulltexts of the articles. The default value is false
- citations boolean: Whether to retrieve citations found in the articles. The default value is false
- similar boolean: Whether to retrieve lists of similar articles. The default value is false. Because the similar articles are calculated on demand, setting this parameter to true might slightly slow down the response time
- duplicate boolean: Whether to retrieve CORE IDs of different versions of the articles. The default value is false
- urls boolean: Whether to retrieve lists of URLs of the article fulltexts. The default value is false
- faithfulMetadata boolean: Whether to retrieve the record's raw XML metadata from the original repository. The default value is false

Output

output array
- items ArticleResponse
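Because the response array follows the order of the requested IDs, each entry can be matched back to its ID by position. This is a minimal sketch; pairByOrder is a hypothetical helper, and it assumes the response contains exactly one entry per requested ID.

```javascript
// Hypothetical helper: pair each requested CORE ID with its response entry.
function pairByOrder(ids, responses) {
  if (ids.length !== responses.length) {
    // Assumption: one response entry per requested ID.
    throw new Error('Expected ' + ids.length + ' responses, got ' + responses.length);
  }
  return ids.map((id, index) => ({
    id,
    status: responses[index].status,
    article: responses[index].data || null
  }));
}
```

A call such as core_ac_uk.getArticleByCoreIdBatch({ body: ids }).then(responses => pairByOrder(ids, responses)) would then yield one record per ID, including the IDs that were not found.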

getArticleByCoreId

Method will retrieve an article based on given CORE ID.

core_ac_uk.getArticleByCoreId({
  "coreId": 0
}, context)

Input

input object
- coreId required integer: CORE ID of the article that needs to be fetched.
- metadata boolean: Whether to retrieve the full article metadata or only the ID. The default value is true.
- fulltext boolean: Whether to retrieve full text of the article. The default value is false
- citations boolean: Whether to retrieve citations found in the article. The default value is false
- similar boolean: Whether to retrieve a list of similar articles. The default value is false. Because the similar articles are calculated on demand, setting this parameter to true might slightly slow down the response time
- duplicate boolean: Whether to retrieve a list of CORE IDs of different versions of the article. The default value is false
- urls boolean: Whether to retrieve a list of URLs from which the article can be downloaded. This can include links to PDFs as well as HTML pages. The default value is false
- faithfulMetadata boolean: Whether to retrieve the record's raw XML metadata from the original repository. The default value is false

Output

output ArticleResponse

getArticlePdfByCoreId

Method will retrieve the PDF of an article based on given CORE ID.

core_ac_uk.getArticlePdfByCoreId({
  "coreId": ""
}, context)

Input

input object
- coreId required string: CORE ID of the article whose PDF needs to be fetched

Output

Output schema unknown

getArticleHistoryByCoreId

Method accepts a single CORE ID and returns a list of all historical versions of the article, which are stored in CORE database. The results are ordered from the newest one to the oldest one.

core_ac_uk.getArticleHistoryByCoreId({
  "coreId": ""
}, context)

Input

input object
- coreId required string: CORE ID of the article whose history should be fetched
- page integer: Which page of the history results should be retrieved. Can be any number between 1 and 100, default is 1 (first page).
- pageSize integer: The number of results to return per page. Can be any number between 10 and 100, default is 10.

Output

output ArticleHistoryResponse

searchArticlesBatch

Method accepts a JSON array of search queries and parameters. It then searches through all articles and returns a JSON array of search results for each of the queries. Method searches through all article fields (title, authors, subjects, identifiers, etc.).

core_ac_uk.searchArticlesBatch({
  "body": []
}, context)

Input

input object
- body required array
  - items SearchRequest
- metadata boolean: Whether to retrieve the full article metadata or only the ID. The default value is true.
- fulltext boolean: Whether to retrieve full text of the article. The default value is false
- citations boolean: Whether to retrieve citations found in the article. The default value is false
- similar boolean: Whether to retrieve a list of similar articles. The default value is false. Because the similar articles are calculated on demand, setting this parameter to true might slightly slow down the response time
- duplicate boolean: Whether to retrieve a list of CORE IDs of different versions of the article. The default value is false
- urls boolean: Whether to retrieve a list of URLs from which the article can be downloaded. This can include links to PDFs as well as HTML pages. The default value is false
- faithfulMetadata boolean: Whether to retrieve the raw XML metadata of the article. The default value is false

Output

output array
- items ArticleSearchResponse

searchArticles

Searches through all articles and returns a JSON array with search results. Method searches through all article fields.

core_ac_uk.searchArticles({
  "query": 0
}, context)

Input

input object
- query required integer: The search query
- page integer: Which page of the search results should be retrieved. Can be any number between 1 and 100, default is 1 (first page).
- pageSize integer: The number of results to return per page. Can be any number between 10 and 100, default is 10.
- metadata boolean: Whether to retrieve the full article metadata or only the ID. The default value is true.
- fulltext boolean: Whether to retrieve full text of the article. The default value is false
- citations boolean: Whether to retrieve citations found in the article. The default value is false
- similar boolean: Whether to retrieve a list of similar articles. The default value is false. Because the similar articles are calculated on demand, setting this parameter to true might slightly slow down the response time
- duplicate boolean: Whether to retrieve a list of CORE IDs of different versions of the article. The default value is false
- urls boolean: Whether to retrieve a list of URLs from which the article can be downloaded. This can include links to PDFs as well as HTML pages. The default value is false
- faithfulMetadata boolean: Whether to retrieve the record's raw XML metadata from the original repository. The default value is false

Output

output ArticleSearchResponse
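The page and pageSize limits, together with totalHits from the response, determine how many requests are needed to walk through a result set. This is a minimal sketch; planPages is a hypothetical helper that only computes the page parameters and does not call the API.

```javascript
// Hypothetical helper: list the page/pageSize pairs needed to cover
// totalHits results within the documented limits.
function planPages(totalHits, pageSize) {
  // pageSize must be between 10 and 100.
  const size = Math.min(100, Math.max(10, pageSize));
  // page must be between 1 and 100, so at most 100 pages are reachable.
  const pages = Math.min(100, Math.ceil(totalHits / size));
  return Array.from({ length: pages }, (_, index) => ({
    page: index + 1,
    pageSize: size
  }));
}
```

With these limits, at most 100 pages of 100 results (10,000 items) can be reached for a single query; larger result sets would need narrower queries, for example by year.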

similarArticles

Method accepts a text and retrieves a JSON array of articles which are similar to the given text. The response array is ordered based on similarity score, starting from the most similar.

core_ac_uk.similarArticles({
  "body": null
}, context)

Input

input object
- body required SimilarRequest
- limit integer: How many similar articles to retrieve at most. Can be any number between 1 and 100, default is 10
- metadata boolean: Whether to retrieve the full article metadata or only the IDs of the similar articles. The default value is true
- fulltext boolean: Whether to retrieve fulltexts of the similar articles. The default value is false
- citations boolean: Whether to retrieve citations found in the articles. The default value is false
- similar boolean: Whether to retrieve lists of similar articles. The default value is false. Because the similar articles are calculated on demand, setting this parameter to true might slightly slow down the response time
- duplicate boolean: Whether to retrieve CORE IDs of different versions of the articles. The default value is false
- urls boolean: Whether to retrieve lists of URLs of the article fulltexts. The default value is false
- faithfulMetadata boolean: Whether to retrieve the raw XML metadata of the articles. The default value is false

Output

output ArticleSimilarResponse

getJournalByIssnBatch

Method accepts a JSON array of ISSNs and retrieves a list of journals.

core_ac_uk.getJournalByIssnBatch({
  "body": []
}, context)

Input

input object
- body required array
  - items string

Output

output array
- items JournalResponse

getJournalByIssn

Returns a journal with given ISSN identifier.

core_ac_uk.getJournalByIssn({
  "issn": ""
}, context)

Input

input object
- issn required string: ISSN identifier of journal that needs to be fetched.

Output

output JournalResponse

journals.search.post

Method accepts a JSON array of search queries and parameters. It then searches through all journals and returns a JSON array of search results for each of the queries. Method searches through all journal fields (title, identifiers, subjects, language, rights and publisher).

core_ac_uk.journals.search.post({
  "body": []
}, context)

Input

input object
- body required array
  - items SearchRequest

Output

output array
- items JournalResponse

journals.search.query.get

Searches through all journals and returns a JSON array of search results. Method searches through all journal fields (title, identifiers, subjects, language, rights and publisher).

core_ac_uk.journals.search.query.get({
  "query": ""
}, context)

Input

input object
- query required string: Search query
- page integer: Which page of the search results should be retrieved. Can be any number between 1 and 100, default is 1 (first page).
- pageSize integer: The number of results to return per page. Can be any number between 10 and 100, default is 10.

Output

output JournalSearchResponse

getRepositoryByIdBatch

Method accepts a JSON array of CORE repository IDs and retrieves a list of repositories. The response array is ordered based on the order of the IDs in the request array. The maximum number of IDs in request is 100.

core_ac_uk.getRepositoryByIdBatch({
  "body": []
}, context)

Input

input object
- body required array
  - items integer
- stats boolean: Whether to retrieve statistics about the repository. The default value is false
- depositHistory boolean: Returns deposit history over time
- depositHistoryCumulative boolean: Returns cumulative deposit history over time

Output

output array
- items RepositoryResponse
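Since a request may contain at most 100 IDs, longer lists need to be split into several requests. This is a minimal sketch; chunk is a hypothetical helper.

```javascript
// Hypothetical helper: split a list of IDs into groups of at most `size`.
function chunk(ids, size = 100) {
  const groups = [];
  for (let start = 0; start < ids.length; start += size) {
    groups.push(ids.slice(start, start + size));
  }
  return groups;
}
```

Each group can then be sent as the body of its own getRepositoryByIdBatch call, and the results concatenated in the same order.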

getRepositoryById

Method will retrieve a repository based on given CORE repository ID.

core_ac_uk.getRepositoryById({
  "repositoryId": 0
}, context)

Input

input object
- repositoryId required integer: CORE repository ID of the repository that needs to be fetched.
- stats boolean: Whether to retrieve statistics about the repository. The default value is false
- depositHistory boolean: Returns deposit history over time
- depositHistoryCumulative boolean: Returns cumulative deposit history over time

Output

output RepositoryResponse

repositories.search.post

Method accepts a JSON array of search queries and parameters. It then searches through all repositories and returns a JSON array of search results for each of the queries. Method searches through all repository fields.

core_ac_uk.repositories.search.post({
  "body": []
}, context)

Input

input object
- body required array
  - items SearchRequest
- stats boolean: Whether to retrieve statistics about the repository. The default value is false
- depositHistory boolean: Returns deposit history over time
- depositHistoryCumulative boolean: Returns cumulative deposit history over time

Output

output RepositorySearchResponse

repositories.search.query.get

Searches through all repositories and returns a JSON array with search results. Method searches through all repository fields.

core_ac_uk.repositories.search.query.get({
  "query": ""
}, context)

Input

input object
- query required string: The search query
- page integer: Which page of the search results should be retrieved. Can be any number between 1 and 100, default is 1 (first page).
- pageSize integer: The number of results to return per page. Can be any number between 10 and 100, default is 10.
- stats boolean: Whether to retrieve statistics about the repository. The default value is false
- depositHistory boolean: Returns deposit history over time
- depositHistoryCumulative boolean: Returns cumulative deposit history over time

Output

output RepositorySearchResponse

search.post

Method accepts a JSON array of search queries. It searches through all resources and all fields and returns a JSON array with search results for each of the queries. The results are ordered by relevance score and contain the type of the relevant resource and its ID. Furthermore, the responses are ordered based on the order of the request items. The metadata of each resource needs to be obtained through an appropriate method.

core_ac_uk.search.post({
  "body": []
}, context)

Input

input object
- body required array
  - items SearchRequest

Output

output array
- items SearchAllResponse

search.query.get

Searches through all resources and all fields and returns a JSON array with search results. The results are ordered by relevance score and contain the type of the relevant resource and its ID. The metadata of each resource needs to be obtained through an appropriate method.

core_ac_uk.search.query.get({
  "query": ""
}, context)

Input

input object
- query required string: The search query
- page integer: Which page of the search results should be retrieved. Can be any number between 1 and 100, default is 1 (first page).
- pageSize integer: The number of results to return per page. Can be any number between 10 and 100, default is 10.

Output

output SearchAllResponse
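Because the global search only returns the type and ID of each resource, a follow-up request per type is needed to get the metadata. This is a minimal sketch of the grouping step; groupByType is a hypothetical helper and works on the data array of a SearchAllResponse.

```javascript
// Hypothetical helper: collect resource IDs per resource type.
function groupByType(resources) {
  const groups = { article: [], journal: [], repository: [] };
  for (const resource of resources) {
    // Ignore any type that is not one of the three documented ones.
    if (Object.prototype.hasOwnProperty.call(groups, resource.type)) {
      groups[resource.type].push(resource.id);
    }
  }
  return groups;
}
```

The article and repository IDs could then be converted to integers and passed to getArticleByCoreIdBatch and getRepositoryByIdBatch. Whether the journal IDs returned here are ISSNs suitable for getJournalByIssnBatch is not stated in this document, so that step should be verified first.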

Definitions

Article

Article object
- authors array: List of article authors
  - items string
- citations array: Citations found in the article
  - items Citation
- contributors array: List of article contributors
  - items string
- datePublished string: Date article published
- description string: The abstract
- doi string: The DOI of the article
- downloadUrl string: The download PDF URL which is displayed on our /display/[ArticleID] page
- fulltext string: Article full text
- fulltextIdentifier string: The URL to the fulltext
- fulltextUrls array: URLs of the fulltext version of this article
  - items string
- id required integer: Article ID
- identifiers array: List of document identifiers
  - items string
- journals array: List of journals this article belongs to
  - items ArticleJournal
- language Language
- oai string: The OAI of the article
- publisher string: Publisher of the article
- rawRecordXml RawRecordXml
- relations array: URLs of related articles, etc.
  - items string
- repositories array: List of repositories this article belongs to
  - items Repository
- repositoryDocument RepositoryDocument
- similarities array: Similar articles
  - items Similar
- subjects array: Article subjects
  - items string
- title string: Article title
- topics array: Article topics
  - items string
- types array: Types, e.g. conference paper, journal paper, etc.
  - items string
- year integer: Year the article was published

ArticleDedupResponse

ArticleHistoryResponse

ArticleHistoryResponse object
- data array: List of article versions
  - items RawRecordXml
- status required string (values: OK, Not found, Too many queries, Missing parameter, Invalid parameter, Parameter out of bounds): Operation status

ArticleJournal

ArticleJournal object
- identifiers array: List of journal identifiers
  - items string
- title string: Title of the journal

ArticleResponse

ArticleResponse object
- data required Article
- status required string (values: OK, Not found, Too many queries, Missing parameter, Invalid parameter, Parameter out of bounds): Operation status

ArticleSearchResponse

ArticleSearchResponse object
- data required array: Search results
  - items Article
- status required string (values: OK, Not found, Too many queries, Missing parameter, Invalid parameter, Parameter out of bounds): Operation status
- totalHits required integer: Total number of articles matching the search criteria

ArticleSimilarResponse

ArticleSimilarResponse object
- data required array: Similar articles
  - items Article
- status required string (values: OK, Not found, Too many queries, Missing parameter, Invalid parameter, Parameter out of bounds): Operation status

Citation

Citation object
- authors string: Authors of the article
- date string: Date the cited article was published
- doi string: Digital Object Identifier
- raw string: Citation as raw string
- title string: Title of the cited article

Journal

Journal object
- identifiers required array: List of journal identifiers (e.g. URL, OAI or ISSN). The type is prepended to the identifier string (e.g. 'issn:2296-0597')
  - items string
- language string: Language of the journal
- publisher string: Publisher of the journal
- rights string: Copyright license of the journal
- subjects array: List of journal subjects
  - items string
- title string: Journal title
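Since the identifier type is prepended to each identifier string, the two parts can be separated at the first colon. This is a minimal sketch; parseIdentifier and findIssn are hypothetical helpers, and an identifier without a type prefix that happens to contain a colon (such as a bare URL) would be split incorrectly.

```javascript
// Hypothetical helper: split "type:value" at the first colon.
function parseIdentifier(identifier) {
  const separator = identifier.indexOf(':');
  if (separator === -1) {
    return { type: null, value: identifier };
  }
  return {
    type: identifier.slice(0, separator),
    value: identifier.slice(separator + 1)
  };
}

// Hypothetical helper: return the first ISSN of a Journal object, or null.
function findIssn(journal) {
  const match = (journal.identifiers || [])
    .map(parseIdentifier)
    .find(parsed => parsed.type === 'issn');
  return match ? match.value : null;
}
```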

JournalResponse

JournalResponse object
- data Journal
- status required string (values: OK, NOT_FOUND, TOO_MANY_QUERIES, MISSING_PARAMETER, INVALID_PARAMETER): Operation status
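The status values listed for journal responses (NOT_FOUND, TOO_MANY_QUERIES, ...) are spelled differently from those listed for article and repository responses (Not found, Too many queries, ...). Code that handles both kinds of response may want to compare them in one form. This is a minimal sketch; canonicalStatus and isOk are hypothetical helpers.

```javascript
// Hypothetical helper: map both spellings to the upper-case form,
// e.g. "Not found" and "NOT_FOUND" both become "NOT_FOUND".
function canonicalStatus(status) {
  return String(status).trim().toUpperCase().replace(/\s+/g, '_');
}

// True when a response object reports success.
function isOk(response) {
  return canonicalStatus(response.status) === 'OK';
}
```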

JournalSearchResponse

JournalSearchResponse object
- data array: Search results
  - items Journal
- status required string (values: OK, NOT_FOUND, TOO_MANY_QUERIES, MISSING_PARAMETER, INVALID_PARAMETER): Operation status
- totalHits required integer: Total number of journals matching the search criteria

Language

Language object
- deletedStatus integer: The deleted status of the document: 0 for allowed, 1 for deleted, 2 for disabled
- depositedDate string: The date the item was deposited in the Data Provider (repository/Journal)
- indexed integer: The index status of the document: 0 for not indexed, 1 for indexed
- metadataUpdated string: The last date metadata of this article were updated
- pdfOrigin string: The remote URL where we acquired the PDF
- pdfSize integer: The size of pdf in bytes
- pdfStatus integer: The pdf status flag of article: 0 no pdf, 1 pdf exists
- tdmOnly boolean: The tdmOnly flag of the article: 0 normal, 1 tdm only
- textStatus integer: The text status flag of article: 0 does not exist, 1 exists
- timestamp string: The date of article as given by the repository

RawRecordXml

RawRecordXml object
- datetime string: Timestamp when CORE harvested the metadata
- metadata string: The raw XML metadata

Repository

Repository object
- dataProviderSourceStats array: Statistics based on the Data Provider/repository rather than data processed and filtered by CORE. This array is in beta and may change in the future
- history array: The number of deposits in the repository per date. This field is in beta and may change in the future
- historyCumulative array: The number of deposits in the repository per date over time (cumulative). This field is in beta and may change in the future
- id integer: CORE repository ID
- lastSeen string: The time the repository was last harvested by CORE. This field is in beta and may change in the future
- name string: Repository name
- openDoarId integer: ID of the repository in Open DOAR
- repositoryLocation RepositoryLocation
- repositoryStats RepositoryStats
- uri string: Repository URI
- urlHomepage string: Repository homepage
- urlOaipmh string: Repository OAI-PMH endpoint

RepositoryDocument

RepositoryLocation

RepositoryLocation object
- country string: Country name
- countryCode string: Two letter country code
- id integer: CORE repository ID
- latitude integer
- longitude integer
- repositoryName string: Repository name

RepositoryResponse

RepositoryResponse object
- data Repository
- status required string (values: OK, Not found, Too many queries, Missing parameter, Invalid parameter, Parameter out of bounds): Operation status

RepositorySearchResponse

RepositorySearchResponse object
- data array: Search results
  - items Repository
- status required string (values: OK, Not found, Too many queries, Missing parameter, Invalid parameter, Parameter out of bounds): Operation status
- totalHits required integer: Total number of repositories matching the search criteria

RepositoryStats

RepositoryStats object
- countFulltext integer: Repository fulltext count
- countMetadata integer: Repository metadata count
- dateLastProcessed string: Last repository processing date

Resource

Resource object
- id required string: Identifier of the resource
- type required string (values: journal, article, repository): Type of the resource

SearchAllResponse

SearchAllResponse object
- data required array: List of relevant resources
  - items Resource
- status required string (values: OK, Not found, Too many queries, Missing parameter, Invalid parameter, Parameter out of bounds): Operation status
- totalHits required integer: Total number of items matching the search criteria

SearchRequest

SearchRequest object
- page integer: Which page of the search results should be retrieved. Can be any number from 1 to 100, default is 1 (first page)
- pageSize integer: The number of results to return per page. Can be any number from 10 to 100, default is 10
- query required string: Search query

Similar

Similar object
- id required integer: CORE ID of the similar article
- score required number: Similarity score
- title string: Title of the similar article

SimilarRequest

SimilarRequest object
- text required string: Find similar articles based on this string