compromise-penn-tags

0.0.2 • Public • Published
a plugin for compromise
npm install compromise-penn-tags
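const nlp = require('compromise')
// register the plugin (assumes compromise's usual nlp.extend() plugin loader)
nlp.extend(require('compromise-penn-tags'))

nlp("pour through a book").pennTags()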
/*
[{
  text: 'pour through a book',
  terms: [
    { text: 'pour', penn: 'VBP', tags: [Array] },
    { text: 'through', penn: 'IN', tags: [Array] },
    { text: 'a', penn: 'WDT', tags: [Array] },
    { text: 'book', penn: 'NN', tags: [Array] }
  ]
}]
*/

This plugin is meant to supply a mapping between the standard Penn Treebank tagset and the custom tagset in compromise.

This lets users evaluate the compromise POS-tagger by comparing it to other libraries or testing data.
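
As a rough illustration of that kind of evaluation, the sketch below scores the `penn` values from a `.pennTags()` result against a list of reference tags. The `tagAccuracy` helper and the `gold` array are hypothetical names introduced here, not part of the plugin, and the sketch assumes both sides were tokenized identically.

```javascript
// Hypothetical helper (not part of compromise-penn-tags): fraction of terms
// whose `penn` value equals the reference tag at the same position.
function tagAccuracy(sentences, gold) {
  // flatten the [{ text, terms: [...] }] shape into a single list of tags
  const predicted = sentences.flatMap((s) => s.terms.map((t) => t.penn))
  if (predicted.length !== gold.length) {
    throw new Error('token counts differ; align tokenization first')
  }
  if (gold.length === 0) return 0
  const correct = predicted.filter((tag, i) => tag === gold[i]).length
  return correct / gold.length
}

// input copied from the example output above (tags arrays omitted)
const result = [{
  text: 'pour through a book',
  terms: [
    { text: 'pour', penn: 'VBP' },
    { text: 'through', penn: 'IN' },
    { text: 'a', penn: 'WDT' },
    { text: 'book', penn: 'NN' }
  ]
}]

// made-up reference tags, for illustration only
const gold = ['VB', 'IN', 'DT', 'NN']

const accuracy = tagAccuracy(result, gold)
console.log(accuracy)
```

Throwing on a length mismatch is a deliberate choice in this sketch: a position-by-position score is meaningless once the two token lists drift apart.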

Please note that tokenization choices vary considerably between POS-tagger libraries, which makes this comparison more difficult.

Compromise makes some unique decisions tokenizing punctuation and contractions.
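
One cautious way to handle this, sketched below, is to score only the sentences where both sides produced the same tokens and to set the rest aside. The `sameTokens` helper is hypothetical, and the term lists are hand-written stand-ins rather than real output from compromise or any other library; the `do` + `n't` split follows the Penn Treebank convention for contractions.

```javascript
// Hypothetical helper: true when a list of terms and a list of reference
// tokens split the sentence into the same pieces (case-insensitive).
function sameTokens(terms, referenceTokens) {
  return (
    terms.length === referenceTokens.length &&
    terms.every((t, i) => t.text.toLowerCase() === referenceTokens[i].toLowerCase())
  )
}

// stand-in data: one tokenizer keeps the contraction whole, the other splits it
const contractionTerms = [{ text: "don't" }, { text: 'go' }]
const contractionReference = ['do', "n't", 'go']

// stand-in data: both sides agree
const plainTerms = [{ text: 'Go' }, { text: 'home' }]
const plainReference = ['go', 'home']

const contractionAligned = sameTokens(contractionTerms, contractionReference)
const plainAligned = sameTokens(plainTerms, plainReference)
console.log(contractionAligned, plainAligned)
```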

Unlike most POS-taggers, compromise gives each term many tags, including descendant, or assumed, tags.

Compromise is also less confident than most libraries about declaring whether a noun is singular or plural: where the Penn tag is NNPS, compromise may return NNP instead.
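
When scoring against reference data, one option is to treat NNPS and NNP as the same tag so this difference is not counted as an error. The sketch below shows that idea; `collapseNumber` and `lenientMatch` are hypothetical helpers, not part of the plugin, and whether to collapse other singular/plural pairs is left to the evaluator.

```javascript
// Hypothetical helper: fold the plural proper-noun tag into the singular one.
function collapseNumber(tag) {
  return tag === 'NNPS' ? 'NNP' : tag
}

// Hypothetical helper: compare two Penn tags, ignoring the NNP/NNPS distinction.
function lenientMatch(predictedTag, referenceTag) {
  return collapseNumber(predictedTag) === collapseNumber(referenceTag)
}

const properNounMatch = lenientMatch('NNP', 'NNPS') // differ only in number
const commonNounMatch = lenientMatch('NN', 'NNP') // genuinely different tags
console.log(properNounMatch, commonNounMatch)
```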

The .pennTags() method accepts the same options as the .json() method.

nlp('in the town where I was born').pennTags({offset:true})
/*
[{
  text: 'in the town where I was born',
  terms: [
    { text: 'in', penn: 'IN', tags: [Array] },
    { text: 'the', penn: 'WDT', tags: [Array] },
    { text: 'town', penn: 'NN', tags: [Array] },
    { text: 'where', penn: 'CC', tags: [Array] },
    { text: 'I', penn: 'PRP', tags: [Array] },
    { text: 'was', penn: 'VB', tags: [Array] },
    { text: 'born', penn: 'VB', tags: [Array] }
  ],
  offset: { index: 0, start: 0, length: 28 }
}]
*/
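
For diffing against tagged test data, it can help to flatten a result like the one above into `word/TAG` pairs. The sketch below does that; `toSlashFormat` is a hypothetical helper, and the slash-separated layout is only an assumption about how the reference data might be stored.

```javascript
// Hypothetical helper (not part of the plugin): render one sentence of
// .pennTags() output as space-separated word/TAG pairs.
function toSlashFormat(sentence) {
  return sentence.terms.map((t) => `${t.text}/${t.penn}`).join(' ')
}

// terms copied from the example output above (tags arrays omitted)
const sentence = {
  text: 'in the town where I was born',
  terms: [
    { text: 'in', penn: 'IN' },
    { text: 'the', penn: 'WDT' },
    { text: 'town', penn: 'NN' },
    { text: 'where', penn: 'CC' },
    { text: 'I', penn: 'PRP' },
    { text: 'was', penn: 'VB' },
    { text: 'born', penn: 'VB' }
  ],
  offset: { index: 0, start: 0, length: 28 }
}

const line = toSlashFormat(sentence)
console.log(line)
```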

work-in-progress

MIT

Collaborators

  • spencermountain