Search results
30 packages found
SpamAssassin public mail corpus.
List of ~636,000 Spanish words
The text of Moby Dick by Herman Melville.
List of ~336,000 French words
A Standard Corpus of Present-Day Edited American English, for use with Digital Computers.
State of the Union addresses by U.S. Presidents.
- stdlib
- datasets
- dataset
- data
- speeches
- politics
- usa
- us
- president
- sotu
- state of the union
- addresses
- text
- corpus
A wrapper for CETEMPúblico, a European Portuguese corpus of news extracts from the newspaper Público, with 180 million words tagged automatically using PALAVRAS.
A CJK text tokenizer
A corpus of technical books written in Japanese
Text mining library
Translate languages using a statistical model
A core type to handle CoNLL-U format
Corpus CRUD API wrapper
Corpus representation stored in JSON and wrapped in the Corpus CRUD API
Merge multiple sentiment libraries for better sentiment analysis
Text corpus calculation in JavaScript.
A Node.js module for generating usernames based on a specified corpus.
A Node.js library for concordancing a corpus formatted according to the Data Format for Digital Linguistics (DaFoDiL)
Transform a directory of CoNLL files (treebank) into a directory of SVG files.