@giancosta86/cervantes

2.0.0 • Public • Published

CervantesJS

Extract and classify Spanish terms from wiki pages, with TypeScript

GitHub CI npm version MIT License

Overview

CervantesJS is a TypeScript library for extracting Spanish terms from wiki pages; more precisely, it is a plugin for JardineroJS that creates a SQLite dictionary of Spanish terms by parsing Wikcionario.

Installation

To install the package as a plugin, please refer to the documentation of JardineroJS.

The current version of the plugin requires Jardinero 2.x.

Otherwise, to install it as a library dependency within a project:

npm install @giancosta86/cervantes

or

yarn add @giancosta86/cervantes

The public API entirely resides in the root package index, so you shouldn't reference specific modules.

Usage

CervantesJS is first and foremost a plugin for JardineroJS: please consult its documentation for details.

However, you can also reference the package as a standalone library for extracting Spanish terms from wiki pages!

In this case, you can just import names directly from its root:

import {...} from "@giancosta86/cervantes"

In particular, you may want to consider:

  • the SpanishTerm union type - and the related types like Noun, Article, ...

  • extractTerms() - to extract Spanish terms from a given wiki page

  • SpanishTransform - a transform stream applying extractTerms() to a flow of wiki pages

  • SPANISH_SQLITE_SCHEMA - a string containing the DDL code for SQLite

  • createSpanishWritableBuilder() - creates a WritableBuilder (from the sqlite-writable library) with the required type registrations and a suitable transaction capacity
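
To give a rough idea of how the pieces listed above relate to each other, here is a minimal, self-contained sketch of the general pattern: a union type for terms, a function extracting terms from a page, and an object-mode transform stream applying that function to a flow of pages. Everything in it is an assumption made for illustration: WikiPageSketch, NounSketch, VerbSketch, TermSketch, extractTermsSketch() and createTermTransformSketch() are hypothetical stand-ins, and the toy markup they recognize is a simplification. The actual SpanishTerm types, the extractTerms() signature and SpanishTransform may well differ, so please check the package's type declarations.

```typescript
import { Transform } from "node:stream";

// Hypothetical page shape: the real library may model wiki pages differently.
type WikiPageSketch = { title: string; text: string };

// Hypothetical union type, loosely mirroring the idea behind SpanishTerm.
type NounSketch = {
  kind: "noun";
  entry: string;
  gender: "masculine" | "feminine";
};
type VerbSketch = { kind: "verb"; entry: string };
type TermSketch = NounSketch | VerbSketch;

// Stand-in for extractTerms(): classifies a page by looking for toy
// templates such as "{{sustantivo femenino|es}}" or "{{verbo|es}}".
// The markup convention is an assumption, not the library's real parser.
function extractTermsSketch(page: WikiPageSketch): TermSketch[] {
  const terms: TermSketch[] = [];

  const nounMatch = /\{\{sustantivo (masculino|femenino)\|es\}\}/.exec(page.text);
  if (nounMatch) {
    terms.push({
      kind: "noun",
      entry: page.title,
      gender: nounMatch[1] === "masculino" ? "masculine" : "feminine"
    });
  }

  if (/\{\{verbo[^}|]*\|es\}\}/.test(page.text)) {
    terms.push({ kind: "verb", entry: page.title });
  }

  return terms;
}

// Stand-in for SpanishTransform: an object-mode stream that receives pages
// and emits every term extracted from each of them.
function createTermTransformSketch(): Transform {
  return new Transform({
    objectMode: true,
    transform(page: WikiPageSketch, _encoding, callback) {
      for (const term of extractTermsSketch(page)) {
        this.push(term);
      }
      callback();
    }
  });
}
```

With such a sketch, an array of pages could be fed through Readable.from(pages).pipe(createTermTransformSketch()), and the resulting terms consumed by any writable stream operating in object mode.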

Further reference

Please feel free to explore:

  • JardineroJS - the web stack itself, designed for extensible linguistic analysis

  • JardineroJS - SDK - the development kit for creating your own plugins
