CWRC-Writer-Base
================
The Canadian Writing Research Collaboratory (CWRC) has developed an in-browser text markup editor (CWRC-Writer) for use by individual scholars and collaborative scholarly editing projects that require a light-weight online editing environment. This package is the base code that builds on the TinyMCE editor, and is meant to be bundled together with two other packages that provide document storage and entity lookup. A default version of the CWRC-Writer that uses GitHub for storage is available for anyone's use at https://cwrc-writer.cwrc.ca/.
Overview
CWRC-Writer is a WYSIWYG text editor for in-browser XML editing and stand-off RDF annotation.
It is built around a heavily customized version of the TinyMCE editor, and includes a CWRC-hosted XML validation service.
A CWRC-Writer installation is a bundling of the main CWRC-WriterBase (the code in this repository) with
a few other NPM packages that handle interaction with server-side services for document storage and named entity lookup.
The default implementation of the CWRC-Writer is the CWRC-GitWriter. It uses GitHub to store documents via the cwrc-git-dialogs package. Entity lookups for VIAF, Wikidata, DBpedia, Getty and GeoNames are provided via CWRC-PublicEntityDialogs and related lookup packages.
Storage and Entity Lookup
If you choose not to use either the default GitHub storage or named entity lookups, then most of the work in setting up CWRC-Writer for your project will be in implementing the dialogs to interact with your backend storage and/or named entity lookups. We have split these pieces off into their own packages in large part to make it easier to substitute your own dialogs and supporting services.
A good example to follow when creating a new CWRC-Writer project is our public implementation CWRC-GitWriter. You might also choose to use either the GitHub storage dialogs or the named entity lookups, both of which are used by the CWRC-GitWriter, and replace just one of the two. To help understand how we've developed the CWRC-Writer, you could also look at our development docs.
To replace either of the storage and entity dialogs, you'll need to create modules with the following APIs:
Storage Object API
To see the methods that need to be provided by your own storage implementation, you can view the cwrc-git-dialogs API.
Note that because the load(writer) and save(writer) methods are passed an instance of the CWRC-WriterBase, they can call any of the methods defined in the API section below to get and set the XML in the editor.
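As a rough illustration of that shape, the sketch below shows a storage object exposing load(writer) and save(writer), backed by an in-memory Map. The createMemoryStorage helper, the docId name, and the stub writer are hypothetical stand-ins for a real backend and a real CWRC-WriterBase instance. The actual cwrc-git-dialogs API may require more methods than the two named here, so treat this as a starting point rather than a complete implementation.

```javascript
// Hypothetical in-memory storage object. A real implementation would show
// dialogs and talk to a server instead of using a Map.
function createMemoryStorage(docId) {
  const store = new Map();
  return {
    store,
    // Receives the writer instance and pushes the stored XML into the editor.
    load(writer) {
      if (!store.has(docId)) return false;
      writer.loadDocumentXML(store.get(docId));
      return true;
    },
    // Receives the writer instance and pulls the XML out as a string.
    save(writer) {
      store.set(docId, writer.getDocument(true));
      return true;
    }
  };
}

// Stub standing in for a CWRC-WriterBase instance, only so the sketch runs
// outside the browser. It mimics loadDocumentXML and getDocument(asString).
function createStubWriter(initialXml) {
  let xml = initialXml;
  return {
    loadDocumentXML(docXml) { xml = docXml; },
    getDocument(asString) { return asString ? xml : { xml }; }
  };
}

const storage = createMemoryStorage('letter-1');
const writer = createStubWriter('<TEI><text>draft</text></TEI>');
storage.save(writer);              // store the current document
writer.loadDocumentXML('<TEI/>');  // simulate the editor content changing
storage.load(writer);              // restore the stored document
console.log(writer.getDocument(true));
```

Because both methods receive the writer, the storage object itself does not need to hold a reference to the editor.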
Entity Lookup API
You have at least two choices here:
- You can implement your own dialog for entity lookup entirely, following the model in CWRC-PublicEntityDialogs.
- You can use CWRC-PublicEntityDialogs and configure it with different sources. We provide five sources: VIAF, Wikidata, Getty, DBpedia, and GeoNames.
You can use any of these sources, and supplement them with your own sources. CWRC-PublicEntityDialogs fully explains how to add your own sources.
API
Constructor
The CWRC-WriterBase exports a single constructor function that takes one argument, a configuration object.
See CWRC-GitWriter/src/js/config.js for an example of a base configuration file, and
CWRC-GitWriter/src/js/app.js to see the configuration file loaded, extended, and passed into the constructor.
Configuration Object
Options that can be set on the configuration object:
Required Options
- config.container: String. The ID of the element that should contain the CWRC-Writer.
- config.storageDialogs: Object. Storage dialogs; see cwrc-git-dialogs for an example and API definition.
- config.entityLookupDialogs: Object. Entity lookup dialogs; see cwrc-public-entity-dialogs for an example and API definition.
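The following sketch checks a configuration object for the three required options before it is handed to the constructor. The missingRequiredOptions helper, the container ID, and the placeholder dialog objects are assumptions made for illustration. The constructor call is left in a comment because it needs a browser environment, and its import name follows the pattern used in CWRC-GitWriter rather than anything stated here.

```javascript
// Names of the three options listed as required.
const REQUIRED_OPTIONS = ['container', 'storageDialogs', 'entityLookupDialogs'];

// Hypothetical helper: returns the required options missing from a config.
function missingRequiredOptions(config) {
  return REQUIRED_OPTIONS.filter(
    (name) => config[name] === undefined || config[name] === null
  );
}

// Placeholder objects. Real ones would come from packages such as
// cwrc-git-dialogs and cwrc-public-entity-dialogs.
const config = {
  container: 'cwrcWriterContainer', // assumed element ID
  storageDialogs: { load(writer) {}, save(writer) {} },
  entityLookupDialogs: {}
};

const missing = missingRequiredOptions(config);
if (missing.length > 0) {
  throw new Error('Missing required options: ' + missing.join(', '));
}

// In a browser bundle the config would then be passed to the constructor,
// for example (assumed import name):
//   const CWRCWriter = require('cwrc-writer-base');
//   const writer = new CWRCWriter(config);
console.log('required options present');
```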
Other Options
- config.cwrcRootUrl: String. An absolute URL that should point to the root of the CWRC-Writer directory. If blank, the browser URL will be used.
- config.modules: Object. The internal CWRC-WriterBase modules to load, grouped by their locations relative to the CWRC-Writer. A module ID must be provided. An optional display title can be specified. Certain modules require additional configuration. For example:

    config.modules = {
      west: [
        {id: 'structure', title: 'Markup'},
        {id: 'entities'}
      ],
      east: [
        {id: 'selection'}
      ],
      south: [
        {id: 'validation', config: {'validationUrl': 'https://validator.services.cwrc.ca/validator/validate.html'}}
      ]
    }
- config.annotator: Boolean. If true, the user may only add annotations to the document.
- config.readonly: Boolean. If true, the user may neither edit nor annotate the document.
- config.mode: String. The mode in which to start the CWRC-Writer: xml or xmlrdf.
- config.allowOverlap: Boolean. Whether overlapping entities are allowed initially.
- config.schemas: Object. A map of schema objects that can be used in the CWRC-Writer. Each entry should contain the following:
  - id: String. The schema id.
  - name: String. The schema title.
  - schemaMappingsId: String. The directory name in the schema directory from which to load mapping and dialogs files for the schema.
  - xmlUrl: Array. A list of URLs that link to the schema (RELAX NG) file. CWRC-Writer will load the first in the list. Alternative URLs can be used in the rare case that you want to match against a particular schema URL but load the schema from another location (e.g. to avoid CORS errors).
  - cssUrl: Array. A list of URLs that link to the CSS associated with this schema.
- config.buttons1, config.buttons2, config.buttons3: String. A comma-separated list of buttons to display in the CWRC-Writer toolbars. Possible values: schematags, addperson, addplace, adddate, addorg, addcitation, addnote, addtitle, addcorrection, addkeyword, addlink, editTag, removeTag, addtriple, toggletags, viewmarkup, editsource, validate, savebutton, loadbutton, logoutbutton, fullscreen.
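To make the optional settings more concrete, here is a sketch of a partial configuration with one schema entry and two toolbars. The example.org URLs, the schema values, and the toButtonString helper are placeholders invented for illustration. Only the option names and the list of button values come from the descriptions above.

```javascript
// Button names listed as possible values for config.buttons1..3.
const KNOWN_BUTTONS = new Set([
  'schematags', 'addperson', 'addplace', 'adddate', 'addorg', 'addcitation',
  'addnote', 'addtitle', 'addcorrection', 'addkeyword', 'addlink', 'editTag',
  'removeTag', 'addtriple', 'toggletags', 'viewmarkup', 'editsource',
  'validate', 'savebutton', 'loadbutton', 'logoutbutton', 'fullscreen'
]);

// Hypothetical helper: joins button names into a comma-separated string,
// rejecting any name that is not in the list above.
function toButtonString(names) {
  const unknown = names.filter((name) => !KNOWN_BUTTONS.has(name));
  if (unknown.length > 0) {
    throw new Error('Unknown buttons: ' + unknown.join(', '));
  }
  return names.join(',');
}

const partialConfig = {
  mode: 'xmlrdf',
  allowOverlap: false,
  schemas: {
    tei: {
      id: 'tei',
      name: 'Example TEI schema',                      // placeholder title
      schemaMappingsId: 'tei',                         // assumed directory name
      xmlUrl: ['https://example.org/schemas/tei.rng'], // placeholder URL
      cssUrl: ['https://example.org/schemas/tei.css']  // placeholder URL
    }
  },
  buttons1: toButtonString(['schematags', 'addperson', 'addplace', 'editTag', 'removeTag']),
  buttons2: toButtonString(['validate', 'savebutton', 'loadbutton'])
};

console.log(partialConfig.buttons1);
```

Building the button strings from arrays catches misspelled button names early instead of leaving a silent gap in the toolbar.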
Writer object
The object returned by the constructor is defined here: writer.js. The typical properties and methods you'd want to use when implementing your own storage and/or entity dialogs are:
Properties
isInitialized
boolean
Whether the editor has been initialized.
isDocLoaded
boolean
Whether a document is loaded in the editor.
isReadOnly
boolean
Whether the editor is in readonly mode.
isAnnotator
boolean
Whether the editor is in annotate-only (entities) mode.
Methods
loadDocumentURL(docUrl)
Loads an XML document from a URL into the editor.
loadDocumentXML(docXml)
Loads an XML document (either an XML Document or a stringified version of one) into the editor.
setDocument(docUrl|docXml)
A convenience method which calls either loadDocumentURL or loadDocumentXML based on the parameter provided.
getDocument(asString)
Returns the parsed XML document from the editor. If asString is true, then a stringified version of the document is returned.
getDocRawContent()
Returns the raw, un-parsed HTML content from the editor.
showLoadDialog()
Convenience method to call the load method of the object set in the storageDialogs property of the config object passed to the writer.
showSaveDialog()
Convenience method to call the save method of the object set in the storageDialogs property of the config object passed to the writer.
validate(callback)
Validates the current document.
callback(w, valid): function where w is the writer and valid is true/false.
Fires a documentValidated event if validation is successful.
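As an example of combining these methods, the sketch below opens the save dialog only when validate reports a valid document. The saveIfValid helper and the stub writer are hypothetical. The stub calls its callback synchronously and treats any document containing "<TEI" as valid, which is only a stand-in for the real validation service.

```javascript
// Hypothetical helper: validates the document and opens the save dialog
// only when it is valid. onDone receives the validation result.
function saveIfValid(writer, onDone) {
  writer.validate((w, valid) => {
    if (valid) {
      w.showSaveDialog();
    }
    onDone(valid);
  });
}

// Stub writer so the sketch runs outside a browser. Its validity rule is
// invented and is NOT how the real validator works.
function createStubWriter(xml) {
  const calls = [];
  return {
    calls,
    getDocument(asString) { return asString ? xml : { xml }; },
    validate(callback) { callback(this, xml.includes('<TEI')); },
    showSaveDialog() { calls.push('save'); }
  };
}

const validWriter = createStubWriter('<TEI><text/></TEI>');
saveIfValid(validWriter, (valid) => console.log('valid:', valid));

const invalidWriter = createStubWriter('<broken>');
saveIfValid(invalidWriter, (valid) => console.log('valid:', valid));
```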
Managers
Tasks within CWRC-Writer are handled by specific managers.
AnnotationsManager
Handles conversion of entities to annotations and vice-versa.
SchemaManager
Handles schema loading and schema CSS processing. Stores the list of available schemas, as well as the current schema. Handles the creation of schema-appropriate entities, via the Mapper.
EntitiesManager
Handles the creation and modification of entities. Stores the list of entities in the current document.
EventManager
Handles the dissemination of events through the CWRC-Writer using a publish-subscribe pattern. See the code for the full list of events.
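For readers unfamiliar with the pattern, here is a generic, minimal publish-subscribe hub. It is not the EventManager implementation and does not reproduce its API. The createEventHub helper and its method signatures are invented for illustration, and only the documentValidated event name is taken from this document.

```javascript
// Minimal publish-subscribe hub. A generic sketch of the pattern only.
function createEventHub() {
  const subscribers = new Map(); // event name -> Set of callbacks
  return {
    // Registers a callback and returns a function that removes it again.
    subscribe(name, callback) {
      if (!subscribers.has(name)) subscribers.set(name, new Set());
      subscribers.get(name).add(callback);
      return () => subscribers.get(name).delete(callback);
    },
    // Calls every callback registered for the event and returns how many
    // callbacks were registered for it.
    publish(name, ...args) {
      const callbacks = subscribers.get(name);
      if (!callbacks) return 0;
      callbacks.forEach((callback) => callback(...args));
      return callbacks.size;
    }
  };
}

const hub = createEventHub();
const seen = [];
const unsubscribe = hub.subscribe('documentValidated', (valid) => seen.push(valid));
hub.publish('documentValidated', true);
unsubscribe();
hub.publish('documentValidated', false); // no longer delivered
console.log(seen);
```

The publisher never needs to know who is listening, which is what lets independent modules react to the same editor event.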
LayoutManager
Handles the initialization and display of the modules specified in the modules property of the config object. Also handles browser resizing and fullscreen functionality.
DialogManager
Handles the initialization and display of dialogs.
Modules
Modules are self-contained components that add extra functionality to CWRC-Writer. These can be specified in the configuration object using the proper module ID.
StructureTree
Module ID: structure
Displays the markup of the current document in a tree format. Useful for navigating and modifying the document.
EntitiesList
Module ID: entities
Displays the list of entities in the current document. Allows for modifying, copying, scraping, and deleting of entities.
Selection
Module ID: selection
Displays the markup of the text that's selected in the current document.
Validation
Module ID: validation
Configuration:
- validationUrl: The URL for the validation service endpoint. The CWRC-hosted service is at https://validator.services.cwrc.ca/validator/validate.html.
Requests and displays the results of document validation. See validate.
NERVE
Module ID: nerve
Configuration:
- nerveUrl: The URL for the NERVE service endpoint. The CWRC-hosted service is at https://nerve.services.cwrc.ca/ner.
Sends the document for named entity recognition and adds the results as entities to the document.
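A modules configuration that enables both the Validation and NERVE modules might look like the sketch below. Placing nerve in the west region is an assumption, as is the modulesMissingConfig helper. The module IDs, configuration keys, and service URLs are the ones given in the module descriptions.

```javascript
// Configuration key named for each module that documents one.
const REQUIRED_MODULE_CONFIG = { validation: 'validationUrl', nerve: 'nerveUrl' };

const modules = {
  west: [
    { id: 'structure', title: 'Markup' },
    { id: 'entities' },
    // Region chosen arbitrarily for this sketch.
    { id: 'nerve', config: { nerveUrl: 'https://nerve.services.cwrc.ca/ner' } }
  ],
  south: [
    {
      id: 'validation',
      config: { validationUrl: 'https://validator.services.cwrc.ca/validator/validate.html' }
    }
  ]
};

// Hypothetical helper: lists module IDs that lack their configuration key.
function modulesMissingConfig(moduleMap) {
  const missing = [];
  for (const region of Object.keys(moduleMap)) {
    for (const mod of moduleMap[region]) {
      const key = REQUIRED_MODULE_CONFIG[mod.id];
      if (key && !(mod.config && mod.config[key])) missing.push(mod.id);
    }
  }
  return missing;
}

console.log(modulesMissingConfig(modules));
```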
ImageViewer
Module ID: imageViewer
Displays images linked from within the current document. Useful for OCR'd documents.
Relations
Module ID: relations
Displays the list of entity relationships (i.e. RDF triples) in the current document. Uses triple to add new relationships.
Development
CWRC-Writer-Dev-Docs explains how to work with CWRC-Writer GitHub repositories, including this one.