nex-scraper

Simple proxy scraper by NeutronX (yaxeldragon)

Terms

By downloading this module/script, you accept that:

The user/company uses it under their own responsibility.

How it Works

Once you provide the URLs you want to scrape, nex-scraper sends a request to each one and uses a regex to check that each candidate is actually a proxy. It then gives you access to all the proxies as an array, returned from an async function.
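
The extract-and-validate step described above can be sketched roughly as follows. This is only an illustration, not nex-scraper's actual implementation: the helper name `extractProxies`, the regular expression, and the range checks are all assumptions made for this example.

```javascript
// Hypothetical helper (not part of nex-scraper): pull "ip:port" strings out of raw text.
function extractProxies(text) {
    // Assumed pattern: four groups of 1-3 digits, a colon, then a 2-5 digit port.
    const pattern = /\b(?:\d{1,3}\.){3}\d{1,3}:\d{2,5}\b/g;
    const matches = text.match(pattern) || [];
    // Keep only candidates whose octets and port fall in a valid numeric range.
    return matches.filter((candidate) => {
        const [host, port] = candidate.split(':');
        const octetsOk = host.split('.').every((octet) => Number(octet) <= 255);
        return octetsOk && Number(port) >= 1 && Number(port) <= 65535;
    });
}

const sample = 'up: 10.0.0.1:8080, bad: 999.1.1.1:80, also 192.168.1.20:1080';
console.log(extractProxies(sample));
```

The regex only checks the shape of each candidate, so the extra range filter is what rejects entries such as 999.1.1.1:80.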

Features

  • [x] Custom URL Scrape.
  • [x] Filter from an HTML document.
  • [x] Check proxies with Regex.
  • [x] Remove all Duplicates.
  • [x] Easy access to all Proxies.

Area            Quick Description
Get Started     How to start using nex-scraper.
How to Scrape   How to get the Proxies.
Configuration   Customize nex-scraper.
How to Check    Check your proxies with nex-scraper.
Versions        See what's new.

Get Started

To install nex-scraper in your project, run the following in your console:

npm install nex-scraper

Once nex-scraper is installed, you can import it in your project as follows:

const proxy = require('nex-scraper');
const proxies = new proxy([
    '(input URL here)',
    '(another URL here...)'
]);
// Add more URLs later on using addURL
proxies.addURL([
    '(input URL here)',
    '(another URL here...)'
]);

You can create as many scraper objects as you want, each with the URLs you want.

How to Scrape

Once the URLs are set, you can scrape the proxies in your project as follows:

const proxy = require('nex-scraper');
const proxies = new proxy([
    'http://www.live-socks.net/2020/05/28-05-20-socks-5-servers.html',
    // ^^^^^^^ This Is HTML (It will get filtered)
    'https://api.proxyscrape.com/?request=getproxies&proxytype=socks5'
    // ^^^^^^^ This Is a normal API (RAW text)
]);
proxies.scrape().then(list => {
    // Here you can access the proxies as an array (named `list` to avoid shadowing the scraper)
});
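
To illustrate why an HTML page and a raw-text API can end up producing the same list, here is a rough sketch. It is not taken from nex-scraper's source: the helpers `stripTags` and `proxiesFromBody`, the sample bodies, and the pattern are assumptions for this example.

```javascript
// Hypothetical sketch: HTML and raw-text responses reduced to the same proxy list.
function stripTags(html) {
    // Replace every tag with a space so that adjacent elements do not merge.
    return html.replace(/<[^>]*>/g, ' ');
}

function proxiesFromBody(body) {
    const text = stripTags(body); // raw text has no tags, so it passes through unchanged
    return text.match(/\b(?:\d{1,3}\.){3}\d{1,3}:\d{2,5}\b/g) || [];
}

const htmlBody = '<ul><li>1.2.3.4:1080</li><li>5.6.7.8:4145</li></ul>';
const rawBody = '1.2.3.4:1080\n5.6.7.8:4145\n';

console.log(proxiesFromBody(htmlBody));
console.log(proxiesFromBody(rawBody));
```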

Configuration

Here you can decide whether to remove duplicates, whether to filter for a specific format, and so on.

// All configurations:
{
    removeDuplicates: true,         // bool
    filter: 'XXX.XXX.XXX.XXX:XXXX', // string
    forEach: () => {}               // function
}

  • How to enable/disable removing Duplicates.
  • Customise what to Filter.
  • forEach proxy.

Remove Duplicates

const proxy = require('nex-scraper');
const proxies = new proxy([
    'URLs...'
]);
proxies.setConfig({
    removeDuplicates: false, // Default: true
});

With removeDuplicates set to false, this will NOT remove ANY duplicates, so you may get the same proxy more than once.
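
For reference, duplicate removal on a scraped list can be sketched as below. The sample array is made up, and using a `Set` is an assumption for illustration, not necessarily how nex-scraper does it internally.

```javascript
// Hypothetical illustration of what duplicate removal amounts to.
const scraped = ['1.2.3.4:1080', '5.6.7.8:4145', '1.2.3.4:1080'];

// A Set keeps one copy of each value and preserves insertion order.
const unique = [...new Set(scraped)];

console.log(unique);
```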

Custom Filter

const proxy = require('nex-scraper');
const proxies = new proxy([
    'URLs...'
]);
proxies.setConfig({
    filter: "XXX.XXX.XXX.XXX:XXXX", // Default: true
});

P.S.: The default filter covers a wider range of formats than any custom filter.

This will ONLY return proxies that look like XXX.XXX.XXX.XXX:XXXX. For example, the filter XXX.XXX.XXX.XXX:XXXX matches proxies that look like 123.456.789.123:4567.

More examples:

// -------------------------------------
proxies.setConfig({
    filter:  "XXX.XXX.XXX.XXX:XXXX",
    // Look: "123.456.789.123:4567"
});
// -------------------------------------
proxies.setConfig({
    filter:  "XX.XXX.XXX.XX:XXXX",
    // Look: "12.345.678.91:2345"
});
// -------------------------------------
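
One way to picture how such a filter string could be applied is to translate it into a regular expression. The sketch below assumes that each `X` stands for exactly one digit and that every other character is literal. The helper `filterToRegExp` is hypothetical and is not part of nex-scraper.

```javascript
// Hypothetical helper: turn a filter such as "XX.XXX.XXX.XX:XXXX" into a RegExp.
// Assumption: each "X" is exactly one digit; all other characters match literally.
function filterToRegExp(filter) {
    const source = filter
        .split('')
        .map((ch) => (ch === 'X' ? '\\d' : ch.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')))
        .join('');
    // Anchor both ends so the whole proxy string must have this shape.
    return new RegExp(`^${source}$`);
}

const re = filterToRegExp('XX.XXX.XXX.XX:XXXX');
console.log(re.test('12.345.678.91:2345'));   // same shape as the filter
console.log(re.test('123.456.789.123:4567')); // different shape
```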

forEach Callback

const proxy = require('nex-scraper');
const proxies = new proxy([
    'URLs...'
]);
proxies.setConfig({
    forEach: (proxy) => {
        console.log(`Gotten ${proxy}`);
    }
});

It will run the given callback for each proxy it gets.

Checking

This part of the module lets you test the proxies you have just scraped. It supports SOCKS and HTTP proxies.

Preview:

const nex = require('nex-scraper');
const proxies = new nex([
    'URLs...'
]);
const checker = new nex.checker();

#constructor: new <checker_instance>(optional_URL)

Takes a WebSocket URL to test against. This is optional because the checker already includes a default URL.

#setTimeout: <checker_object>.setTimeout(time)

Sets a limit on how long the check of each proxy may take.
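
The general idea of a per-check time limit can be sketched with `Promise.race`. This is not nex-scraper's internal code: `withTimeout` and `fakeCheck` are hypothetical names, and the use of milliseconds is an assumption for this example.

```javascript
// Hypothetical sketch of a per-check time limit (not nex-scraper's internals).
function withTimeout(promise, ms) {
    let timer;
    const timeout = new Promise((_, reject) => {
        timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
    });
    // Whichever settles first wins; the timer is cleared either way.
    return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Stand-in for a proxy check that takes `delay` ms to answer.
const fakeCheck = (delay) =>
    new Promise((resolve) => setTimeout(() => resolve('alive'), delay));

withTimeout(fakeCheck(10), 200).then((result) => console.log(result));
withTimeout(fakeCheck(300), 50).catch((err) => console.log(err.message));
```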

#setHeaders: <checker_object>.setHeaders(headers)

Sets the headers for the WebSocket connections.

#checkOne: <checker_object>.checkOne(type, proxy)

Checks one proxy using the given agent type (http or socks).

#checkMany: <checker_object>.checkMany(type, proxies_array)

Checks multiple proxies using the given agent type (http or socks).
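
Conceptually, checking many proxies is the single check applied to every entry of an array. The sketch below shows that pattern with a stand-in check. `fakeCheckOne` and `checkAll` are hypothetical and do not reflect the return values of the real checker, which are not documented here.

```javascript
// Stand-in for a single check: "alive" if the port is even (purely for demonstration).
async function fakeCheckOne(type, proxy) {
    const port = Number(proxy.split(':')[1]);
    if (port % 2 !== 0) throw new Error(`${type} check failed for ${proxy}`);
    return proxy;
}

// Run every check in parallel and keep only the proxies whose check resolved.
async function checkAll(type, proxies) {
    const results = await Promise.allSettled(proxies.map((p) => fakeCheckOne(type, p)));
    return results.filter((r) => r.status === 'fulfilled').map((r) => r.value);
}

checkAll('socks', ['1.2.3.4:1080', '5.6.7.8:4145']).then((alive) => console.log(alive));
```

`Promise.allSettled` is used so that one failing proxy does not reject the whole batch.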

Versions

1.0.6 - Log:

  • Documentation Fix

1.0.5 - Log:

  • Checker Prototype

1.0.4 - Log:

  • Crash Fix

1.0.3 - Log:

1.0.2 - Log:

1.0.1 - Log:

  • Patched crashes when Scraping.

Happy Coding!

Version

1.0.6

License

ISC

Collaborators

  • yaxeldragon