The GraphQL schema, data sources, and business logic consumed by the cp-content-pipeline-api and cp-content-pipeline-client. It is not intended for use outside this project.
This package's exports are shown below.
import {
typeDefs,
articleDocumentQuery,
QueryContext,
resolvers,
scalars,
initDataSources,
DataSources,
} from '@financial-times/cp-content-pipeline-schema'
They are consumed by a GraphQL server framework, for example:
import { ApolloServer } from 'apollo-server'
import {
resolvers,
typeDefs,
dataSources,
} from '@financial-times/cp-content-pipeline-schema'
const server = new ApolloServer({
typeDefs,
resolvers,
dataSources,
})
See the API package entry point for a more complex example.
See GraphQL and Apollo Server Basics for more context on the below.
A collection of type definitions that is exported as typeDefs. It defines the shape of queries that are executed against our data. These are defined as .graphql files in the typedefs folder.
articleDocumentQuery is a GraphQL query for a complete FT article, consumed by the cp-content-pipeline-api.
QueryContext is an interface that defines the context object passed to each resolver function.
Data sources are used to fetch data from external sources, such as the content-api (CAPI). These are defined in the data sources directory. Their factory function is initDataSources and their type is exported as DataSources.
We use Zod for our CAPI schemas to run runtime checks on the incoming data from data sources. We want to ensure that it matches the types defined in our data source schema.
CAPI provides no agreed schema, so there is a chance our schema will occasionally be out of date or incorrect.
Manual updates will be required to keep it in sync with the responses we receive.
We monitor these errors in Grafana and log them to Splunk without blocking the response, for example:
{
"event": "RECOVERABLE_ERROR",
"error": {
"code": "CAPI_SCHEMA_VALIDATION_FAILURE",
"data": {
"contentId": "http://www.ft.com/thing/31884c4d-2da7-4b43-917e-3180e9eafa3d",
"contentType": "Article",
"schemaError": [
"mainImage.members.[].format": "Invalid literal value, received 'promo', expected one of 'standardInline','mobile','desktop'..."
]
},
"isOperational": true,
"message": "The data received from the CAPI data source does not match our data source schema. It is likely that our schema will require updating to handle all possible responses from CAPI."
}
}
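The non-blocking behaviour described above can be sketched roughly as follows. This is a dependency-free illustration, not the package's actual code: the real checks use Zod, whereas validateImageMember, checkWithoutBlocking and the in-memory logs array are hypothetical stand-ins. It shows a validation failure being recorded as a RECOVERABLE_ERROR while the data is still returned to the caller.

```typescript
// Hypothetical sketch: names and shapes are assumptions, not the package's API.
type ValidationResult =
  | { success: true }
  | { success: false; issues: Record<string, string> }

interface RecoverableErrorLog {
  event: 'RECOVERABLE_ERROR'
  error: {
    code: string
    data: { contentId: string; schemaError: Record<string, string> }
    isOperational: boolean
  }
}

// Stand-in for a log transport; a real service would ship these to Splunk.
const logs: RecoverableErrorLog[] = []

const allowedFormats = ['standardInline', 'mobile', 'desktop']

// Hand-rolled check standing in for a Zod schema's runtime validation.
function validateImageMember(member: { format?: unknown }): ValidationResult {
  if (typeof member.format === 'string' && allowedFormats.indexOf(member.format) !== -1) {
    return { success: true }
  }
  return {
    success: false,
    issues: { format: `Invalid value, received '${String(member.format)}'` },
  }
}

// Validate, log on failure, and return the data either way (non-blocking).
function checkWithoutBlocking<T extends { format?: unknown }>(contentId: string, member: T): T {
  const result = validateImageMember(member)
  if (result.success === false) {
    logs.push({
      event: 'RECOVERABLE_ERROR',
      error: {
        code: 'CAPI_SCHEMA_VALIDATION_FAILURE',
        data: { contentId, schemaError: result.issues },
        isOperational: true,
      },
    })
  }
  return member
}

const member = checkWithoutBlocking('example-id', { format: 'promo' })
```

The design choice illustrated here is that an unexpected value degrades to a log entry rather than a failed request, which is why the schema can lag behind CAPI without breaking responses.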
The GraphQL resolvers define how the types defined in the schema are fetched.
We use GraphQL Code Generator with the TypeScript Resolvers plugin to output TypeScript types for the expected function signatures for the resolvers. This prevents runtime GraphQL errors caused by unexpected data being returned.
The resolvers accept and return data objects in the form of models instead of plain GraphQL objects. The model classes encapsulate business logic and interact with the data sources.
For example:
- The Teaser.metaLink field is specified in the schema as the Concept GraphQL type
- We use the Concept model for it internally
- The Teaser.metaLink resolver accepts an instance of the Concept model as its parent argument. It returns an instance of the Concept model too
- The Concept model is able to access the dataSources via its context

The mappers option in the config file is how we let the GraphQL Code Generator know how to map the models to our resolvers. The mappers option is an object whose keys are the names of GraphQL types and whose values are the paths of the models.
For the Concept example above, this looks like:
{
Concept: '../model/Concept#Concept as ConceptModel'
}
The as ConceptModel aliasing makes sure the name doesn't collide with the Concept GraphQL type in the generated code.
GraphQL Code Generator is also used to generate the client library.
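As a rough illustration of the model pattern described above, the sketch below shows a resolver on the Concept type receiving a model instance as its parent, with the model reaching a data source through its context. The prefLabel field, the getConceptName method and the simplified DataSources and QueryContext shapes are assumptions made for this example, not the package's real definitions.

```typescript
// Hypothetical sketch: the shapes below are assumptions for illustration only.
interface DataSources {
  // Stand-in for a real data source such as CAPI.
  capi: { getConceptName(id: string): Promise<string> }
}

interface QueryContext {
  dataSources: DataSources
}

// A model wraps an identifier and reaches the data sources through its context.
class ConceptModel {
  private id: string
  private context: QueryContext

  constructor(id: string, context: QueryContext) {
    this.id = id
    this.context = context
  }

  // Business logic lives on the model, not in the resolver.
  prefLabel(): Promise<string> {
    return this.context.dataSources.capi.getConceptName(this.id)
  }
}

// The resolver receives the model as `parent` and simply delegates to it.
const resolvers = {
  Concept: {
    prefLabel: (parent: ConceptModel): Promise<string> => parent.prefLabel(),
  },
}
```

Because the generated resolver types are told (via mappers) that the Concept type is backed by ConceptModel, returning anything else from a resolver that yields a Concept would be a compile-time error rather than a runtime one.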
The most complex resolver is the one for Financial Times article content: the body() resolver. It resolves content in the format of a content-tree, a specification for representing content as an abstract syntax tree. The content-tree spec implements the unist spec.
As of January 2023, the content-tree spec defines everything that can appear in the body of an article. In the future, this may extend to other fields (e.g. Topper).
The spec is shared across Spark, Content & Metadata and Customer Products.
Each node in the content-tree has a type property, which corresponds to the data available to that node, e.g. paragraph, link, image-set.
As of January 2023, the bodyTree property is not yet being published to the Content API, so we convert the bodyXML field to a valid content tree within cp-content-pipeline.
This is done by the bodyXMLtoTree function. This function uses cheerio to parse the XML, and then traverses the nodes, converting each one to a content-tree node.
We provide the bodyXMLtoTree function with some tagMappings to map XML nodes to content-tree nodes.
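The idea of mapping XML nodes to content-tree nodes can be sketched as below. This is a simplified, dependency-free illustration: the real function parses XML with cheerio, whereas here the XmlNode shape, the toTree function and the three example mappings are hypothetical and operate on an already-parsed structure.

```typescript
// Hypothetical sketch: node shapes and mapping names are assumptions.
interface XmlNode {
  tag?: string // element name, absent for text nodes
  text?: string // text content, present only on text nodes
  children?: XmlNode[]
}

interface TreeNode {
  type: string
  value?: string
  children?: TreeNode[]
}

// Each mapping turns one kind of XML element into a content-tree node.
type TagMapping = (children: TreeNode[]) => TreeNode

const tagMappings: Record<string, TagMapping> = {
  body: (children) => ({ type: 'body', children }),
  p: (children) => ({ type: 'paragraph', children }),
  a: (children) => ({ type: 'link', children }),
}

// Recursively convert an XML node; unmapped elements are flattened into their children.
function toTree(node: XmlNode): TreeNode[] {
  if (node.text !== undefined) {
    return [{ type: 'text', value: node.text }]
  }
  const children: TreeNode[] = []
  for (const child of node.children || []) {
    children.push(...toTree(child))
  }
  const mapping = node.tag ? tagMappings[node.tag] : undefined
  return mapping ? [mapping(children)] : children
}

const xml: XmlNode = {
  tag: 'body',
  children: [{ tag: 'p', children: [{ text: 'Some text' }] }],
}
const [tree] = toTree(xml)
```

Keeping the mappings in a lookup table separate from the traversal means supporting a new XML element only requires adding an entry, not changing the walk itself.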
For some content-tree nodes, the CMS will not be able to provide all of the information required to render. For example, to render a tweet, we need to fetch the embed code from the Twitter API.
In the content-tree spec, this additional information will be marked as optional, as it may not be there when the tree is produced.
We provide this additional data to the tree by using an array of references, which are objects containing the extra information needed. These references can be queried using GraphQL, so we can make use of:
- dataSources to fetch data
- other resolvers (e.g. Picture) to share transformation logic
For nodes that require additional information, a file should be created in the content-tree/references folder, containing a GraphQL typeDef and resolver object.
A very simplified example of what the body() resolver returns (including the references and the tree) is:
{
  "structured": {
    "references": {
      ...
    },
    "tree": {
      "type": "body",
      "version": 1,
      "children": [
        {
          "type": "paragraph",
          "children": [
            {
              "type": "text",
              "value": "Some text"
            }
          ]
        },
        {
          "type": "paragraph",
          "children": [
            {
              "type": "text",
              "value": "Some text"
            }
          ]
        },
        ...
      ]
    }
  }
}
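Because the tree follows the unist shape of typed nodes with optional children, a consumer can process it with a simple recursive walk. The sketch below is illustrative only: the TreeNode interface is a minimal assumption rather than the full content-tree spec, and collectText is a hypothetical helper that gathers the value of every text node in a tree like the one above.

```typescript
// Hypothetical sketch: a minimal unist-style node shape, not the full content-tree spec.
interface TreeNode {
  type: string
  value?: string
  children?: TreeNode[]
}

// Depth-first walk that gathers the value of every text node.
function collectText(node: TreeNode): string[] {
  if (node.type === 'text' && node.value !== undefined) {
    return [node.value]
  }
  const values: string[] = []
  for (const child of node.children || []) {
    values.push(...collectText(child))
  }
  return values
}

// A tree shaped like the simplified example above.
const tree: TreeNode = {
  type: 'body',
  children: [
    { type: 'paragraph', children: [{ type: 'text', value: 'Some text' }] },
    { type: 'paragraph', children: [{ type: 'text', value: 'Some text' }] },
  ],
}

const text = collectText(tree)
```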
scalars is the collection of GraphQL types for resolvers which return scalars.
flowchart TB
subgraph typeDefs[typeDefs]
end
subgraph articleDocumentQuery[articleDocumentQuery]
end
subgraph articleJsonData[article content JSON data]
end
subgraph resolvers[resolvers]
subgraph contentResolvers[content.ts]
subgraph idResolver["id()"]
end
subgraph bodyResolver["body()"]
end
subgraph etcResolvers["..."]
end
end
end
subgraph bodyXMLToTree["bodyXMLToTree()"]
end
subgraph tagMappings["tagMappings()"]
end
subgraph models[models]
end
subgraph dataSources[dataSources]
end
typeDefs -- structures query --> articleDocumentQuery
typeDefs -- structures resolvers --> resolvers
articleDocumentQuery -- once executed (via client or api), the query returns article data--> articleJsonData
dataSources --> models
models --> resolvers
resolvers <-- query initiates resolvers --> articleDocumentQuery
bodyXMLToTree -- traverses XML and returns content-tree structure --> bodyResolver
tagMappings -- mappings provided to differentiate nodes --> bodyXMLToTree