Search results
148 packages found
Extract HTML from PDFs using Poppler's pdftohtml
Get all of a DOM element's cleaned and standardized text content.
Tokenizing strings of text. Extracting arrays of words and optionally numbers, emojis, tags, usernames and email addresses from strings. For Node.js and the browser. When you need more than just [a-z] regular expressions.
A tool to extract key-value pairs by comparing a template with an input string.
Module for creating a keyword array from a string and excluding stop words.
Extracts data from text and generates text with data included.
Seize is a light Node or browser web-page content extractor inspired by Arc90 Readability and Safari Reader
Simple module that recognizes IBANs in text
Extract the text from PDF files, plus other utilities
Extract inline citations from a stream
A set of functions for working with regular expressions, such as finding and replacing text patterns and validating user input.
Get all URLs in a string
Extract the text from PDF files
json-miner extracts one or more JSON values out of a text dump
Extensible extractor of text from documents
A Node.js module to extract keywords from text.
Extracts links from Markdown text
JavaScript text parser to extract code blocks delimited by character pairs (brackets, braces, quotes, etc).
Extract text from PDFs that contain searchable text