seologs-parser-kit
Utility classes for categorizing URLs, excluding entries that match given rules, and detecting paginated URLs.
Usage
Initialize
const { Category, Exclusion, Pagination } = require("seologs-parser-kit");
Categorize
Find the category of a URL based on the rules you define.
Example
const rules = {
homepage: "http://website/$",
assets: "/assets.*\\.(css|js)$"
};
const categorizer = new Category(rules);
let result = categorizer.findCategory("http://website/");
console.log(result); // homepage
result = categorizer.findCategory("http://website/assets/style.css");
console.log(result); // assets
result = categorizer.findCategory("http://website/contact/");
console.log(result); // undefined
Rules
The categorization rules are defined in an object as below:
- The key of each property is your label
- The value of each property is the pattern to search for, as a regular expression string
Example:
{
homepage: "http://website/$",
assets: "/assets.*\\.(css|js)$"
}
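To make the rule format concrete, here is a minimal sketch of how such rules could be evaluated. It assumes each pattern is compiled with `new RegExp` and that the first matching label wins. `findCategorySketch` and `sketchRules` are hypothetical names used only for illustration; this is not the library's actual implementation.

```javascript
// Hypothetical helper, not part of seologs-parser-kit: returns the label of
// the first rule whose pattern matches the URL, or undefined if none match.
function findCategorySketch(rules, url) {
  for (const [label, pattern] of Object.entries(rules)) {
    // Each rule value is a regular expression written as a string.
    if (new RegExp(pattern).test(url)) {
      return label;
    }
  }
  return undefined;
}

const sketchRules = {
  homepage: "http://website/$",
  assets: "/assets.*\\.(css|js)$"
};

console.log(findCategorySketch(sketchRules, "http://website/")); // homepage
console.log(findCategorySketch(sketchRules, "http://website/assets/style.css")); // assets
console.log(findCategorySketch(sketchRules, "http://website/contact/")); // undefined
```

Because the patterns are JavaScript string literals, a literal dot has to be written as `\\.` so that the regular expression receives `\.`.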
Exclude
Check whether an object should be excluded based on the rules you define.
Example
const rules = {
url: "\\.(css|js)"
};
const excluder = new Exclusion(rules);
let result = excluder.check({
url: "http://website/assets/style.css"
});
console.log(result); // true
result = excluder.check({
url: "http://website/contact/"
});
console.log(result); // false
Rules
The exclusion rules are defined in an object that accepts the following properties:
- domain
- ip
- url
- userAgent
The value of each property is the pattern to search for, given as a regular expression string or an array of regular expression strings.
Example:
{
url: "\\.(css|js)",
userAgent: "^Mozilla/5.0 \\(compatible; bingbot/2.0; \\+http://www.bing.com/bingbot.htm\\)$"
}
Or:
{
url: [
"\\.css",
"\\.js"
],
userAgent: "^Mozilla/5.0 \\(compatible; bingbot/2.0; \\+http://www.bing.com/bingbot.htm\\)$"
}
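The following sketch shows one way rules that mix single patterns and arrays of patterns could be checked. It rests on two assumptions the text above does not confirm: an object is excluded as soon as any pattern of any property matches, and properties missing from the object are ignored. `shouldExcludeSketch` and `exclusionRulesSketch` are hypothetical names, not part of the library.

```javascript
// Hypothetical helper, not part of seologs-parser-kit.
// Assumption: one matching pattern on any property is enough to exclude.
function shouldExcludeSketch(rules, entry) {
  return Object.entries(rules).some(([field, patterns]) => {
    const value = entry[field];
    // Skip rules for properties the entry does not carry.
    if (typeof value !== "string") {
      return false;
    }
    // Normalize a single pattern to a one-element array.
    const list = Array.isArray(patterns) ? patterns : [patterns];
    return list.some((pattern) => new RegExp(pattern).test(value));
  });
}

// Backslashes are doubled because the patterns are JavaScript string literals.
const exclusionRulesSketch = {
  url: ["\\.css", "\\.js"],
  userAgent: "^Mozilla/5.0 \\(compatible; bingbot/2.0; \\+http://www.bing.com/bingbot.htm\\)$"
};

console.log(shouldExcludeSketch(exclusionRulesSketch, {
  url: "http://website/assets/style.css"
})); // true
console.log(shouldExcludeSketch(exclusionRulesSketch, {
  url: "http://website/contact/",
  userAgent: "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
})); // true
console.log(shouldExcludeSketch(exclusionRulesSketch, {
  url: "http://website/contact/"
})); // false
```

With a single backslash, `"\."` in a JavaScript string collapses to `.`, which matches any character, and `"\("` collapses to `(`, which opens a group instead of matching a literal parenthesis.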
Pagination
Determine whether a URL is paginated based on the rules you define.
Example
const rules = {
page_with_number: ".*\/(page-([0-9]+)).html$"
};
const pagination = new Pagination(rules);
let result = pagination.isPaginated("http://website/cars/page-2.html");
console.log(result); // true
result = pagination.isPaginated("http://website/cars/page.html");
console.log(result); // false
Rules
The pagination rules are defined in an object as below:
- The key of each property is your label
- The value of each property is the pattern to search for, as a regular expression string
Example:
{
page_with_number: ".*\/(page-([0-9]+)).html$"
}
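As an illustration of what a pagination pattern captures, the sketch below runs the rule with a plain `RegExp`. It assumes a URL counts as paginated when any rule matches, and it reads the page number from the second capture group of the example pattern. `matchPaginationSketch` and `paginationRulesSketch` are hypothetical names, and returning the page number is an addition for illustration, not something the `Pagination` class is documented to do.

```javascript
// Hypothetical helper, not part of seologs-parser-kit: returns the label of
// the first matching rule and the captured page number, or null if no rule
// matches.
function matchPaginationSketch(rules, url) {
  for (const [label, pattern] of Object.entries(rules)) {
    const match = new RegExp(pattern).exec(url);
    if (match) {
      // In the example pattern, group 1 is "page-N" and group 2 is "N".
      const page = match[2] !== undefined ? Number(match[2]) : null;
      return { label, page };
    }
  }
  return null;
}

const paginationRulesSketch = {
  page_with_number: ".*\/(page-([0-9]+)).html$"
};

console.log(matchPaginationSketch(paginationRulesSketch, "http://website/cars/page-2.html"));
// { label: 'page_with_number', page: 2 }
console.log(matchPaginationSketch(paginationRulesSketch, "http://website/cars/page.html"));
// null
```

The second URL does not match because the pattern requires `page-` followed by at least one digit.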
Test
Run the following command:
npm test