`@sugarcube/plugin-tika`

Use the Apache Tika toolkit to detect and extract metadata and text from over a thousand different file types.

Installation

npm install --save @sugarcube/plugin-tika

This plugin also requires Java to be installed.

Plugins

`tika_parse`

Parse a list of files specified by the glob_pattern query type.

sugarcube -Q glob_pattern:files/**/*.pdf -p tika_parse

`tika_links`

This plugin iterates over all links in _sc_media and fetches the text and metadata for each link. Any errors thrown by a fetch are ignored.

`tika_location`

This plugin parses the location found in the unit field named by the tika_location_field query type, for example a URL inside the unit, and fetches its text and metadata.

sugarcube -Q google_search:Keith\ Johnstone \
          -Q tika_location_field:href \
          -p google_search,tika_location

The text and metadata are added to the _sc_media collection and also placed directly on the unit. For example, if the location field is href, the fields href_text and href_meta are added to the unit.
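
The field naming can be illustrated with a small TypeScript sketch. This is not the plugin's implementation: the `Unit` and `ParsedLocation` types and the `attachParsed` helper are hypothetical, and the only detail taken from the description above is that a location field named `href` yields `href_text` and `href_meta` on the unit.

```typescript
// Hypothetical shapes for illustration; the plugin's real types may differ.
interface ParsedLocation {
  text: string;
  meta: Record<string, unknown>;
}

type Unit = Record<string, unknown>;

// Return a copy of the unit with `<field>_text` and `<field>_meta` added.
// The original unit is left untouched.
function attachParsed(unit: Unit, field: string, parsed: ParsedLocation): Unit {
  return {
    ...unit,
    [`${field}_text`]: parsed.text,
    [`${field}_meta`]: parsed.meta,
  };
}

// Example: a unit whose location field is `href`.
const unit: Unit = { href: "https://example.org/report.pdf" };
const enriched = attachParsed(unit, "href", {
  text: "Extracted body text",
  meta: { "Content-Type": "application/pdf" },
});

// The enriched unit now carries href, href_text and href_meta.
console.log(Object.keys(enriched));
```

With a different location field, say `source`, the same convention would produce `source_text` and `source_meta`.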

`tika_export`

Export the text and metadata parsed by tika_location to a file.

sugarcube -Q google_search:Keith\ Johnstone \
          -p google_search,tika_location,tika_export \
          --tika.location_field href

Configuration Options:

tika.data_dir: The target directory in which all files are stored. Defaults to ./data/tika_location.

License

GPL3 @ Christo
