    @s524797336/urllib

    1.0.0 • Public • Published

    urllib

    Request HTTP URLs in a complex world — basic and digest authentication, redirections, cookies, timeout and more.

    Install

    $ npm install urllib --save

    Usage

    callback

    var urllib = require('urllib');
    
    urllib.request('http://cnodejs.org/', function (err, data, res) {
      if (err) {
        throw err; // you need to handle error
      }
      console.log(res.statusCode);
      console.log(res.headers);
      // data is Buffer instance
      console.log(data.toString());
    });

    Promise

    If you've installed bluebird, bluebird will be used. urllib does not install bluebird for you.

    Otherwise, if your Node.js version has native V8 Promises (v0.11.13+), those will be used.

    Otherwise, this library will crash the process and exit, so you might as well install bluebird as a dependency!

    var urllib = require('urllib');
    
    urllib.request('http://nodejs.org').then(function (result) {
      // result: {data: buffer, res: response object}
      console.log('status: %s, body size: %d, headers: %j', result.res.statusCode, result.data.length, result.res.headers);
    }).catch(function (err) {
      console.error(err);
    });
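
    The selection order described above can be sketched as follows. This is only an illustration of the fallback logic, not urllib's actual code; the pickPromise helper is a hypothetical name.

```javascript
// Hypothetical helper (not part of urllib): picks a Promise implementation
// in the order described above.
function pickPromise() {
  try {
    // Prefer bluebird when the application has installed it.
    return require('bluebird');
  } catch (err) {
    // Anything other than "module not found" is a real failure.
    if (err.code !== 'MODULE_NOT_FOUND') {
      throw err;
    }
  }
  if (typeof Promise === 'function') {
    // Fall back to the native V8 Promise.
    return Promise;
  }
  throw new Error('No Promise implementation found; install bluebird');
}

var P = pickPromise();
P.resolve('ok').then(function (value) {
  console.log(value);
});
```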

    co & generator

    If you are using co or koa:

    var co = require('co');
    var urllib = require('urllib');
    
    co(function* () {
      var result = yield urllib.requestThunk('http://nodejs.org');
      console.log('status: %s, body size: %d, headers: %j',
        result.status, result.data.length, result.headers);
    })();

    Global response event

    You should create a urllib instance first.

    var httpclient = require('urllib').create();
    
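    httpclient.on('response', function (info) {
      // info has the following shape:
      // {
      //   error: err,
      //   ctx: args.ctx,
      //   req: { url: url, options: options, size: requestSize },
      //   res: res
      // }
      console.log('%s responded with status %s', info.req.url, info.res.status);
    });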
    
    httpclient.request('http://nodejs.org', function (err, body) {
      console.log('body size: %d', body.length);
    });

    API Doc

    Method: urllib.request(url[, options][, callback])

    Arguments

    • url String | Object - The URL to request, either a String or an Object returned by url.parse.
    • options Object - Optional
      • method String - Request method, defaults to GET. Could be GET, POST, DELETE or PUT. Alias 'type'.
      • data Object - Data to be sent. Will be stringified automatically.
      • dataAsQueryString Boolean - Force data to be converted to a query string.
      • content String | Buffer - Manually set the payload content. If set, data will be ignored.
      • stream stream.Readable - Stream to be piped to the remote server. If set, data and content will be ignored.
      • writeStream stream.Writable - A writable stream that the response stream is piped into. Response data is written to this stream, and the callback is called with data set to null once writing has finished.
      • consumeWriteStream [true] - Consume the writeStream and invoke the callback after the writeStream closes.
      • contentType String - Type of request data. Could be json. If it's json, the Content-Type: application/json header is set automatically.
      • nestedQuerystring Boolean - By default urllib uses querystring to stringify form data, which does not support nested objects. Set this option to true to use qs instead, which does.
      • dataType String - Type of response data. Could be text or json. If it's text, the data passed to the callback is a String. If it's json, the data is a parsed JSON Object and the Accept: application/json header is set automatically. By default the data is a Buffer.
      • fixJSONCtlChars Boolean - Fix the control characters (U+0000 through U+001F) before JSON parse response. Default is false.
      • headers Object - Request headers.
      • timeout Number | Array - Request timeout in milliseconds for the connecting phase and the response receiving phase. Defaults to exports.TIMEOUT, which is 5s for both. Use timeout: 5000 to apply the same timeout to both phases, or set them separately, e.g. timeout: [3000, 5000] sets the connecting timeout to 3s and the response timeout to 5s.
      • auth String - username:password used in HTTP Basic Authorization.
      • digestAuth String - username:password used in HTTP Digest Authorization.
      • agent http.Agent - HTTP Agent object. Set to false if you do not use an agent.
      • httpsAgent https.Agent - HTTPS Agent object. Set to false if you do not use an agent.
      • ca String | Buffer | Array - An array of strings or Buffers of trusted certificates. If this is omitted, several well-known "root" CAs will be used, like VeriSign. These are used to authorize connections. Note: This is necessary only if the server uses a self-signed certificate.
      • rejectUnauthorized Boolean - If true, the server certificate is verified against the list of supplied CAs. An 'error' event is emitted if verification fails. Default: true.
      • pfx String | Buffer - A string or Buffer containing the private key, certificate and CA certs of the server in PFX or PKCS12 format.
      • key String | Buffer - A string or Buffer containing the private key of the client in PEM format. Note: This is necessary only when using client certificate authentication.
      • cert String | Buffer - A string or Buffer containing the certificate key of the client in PEM format. Note: This is necessary only when using client certificate authentication.
      • passphrase String - A string of passphrase for the private key or pfx.
      • ciphers String - A string describing the ciphers to use or exclude.
      • secureProtocol String - The SSL method to use, e.g. SSLv3_method to force SSL version 3.
      • followRedirect Boolean - Follow HTTP 3xx responses as redirects. Defaults to false.
      • maxRedirects Number - The maximum number of redirects to follow, defaults to 10.
      • formatRedirectUrl Function - Format the redirect URL yourself. Default is url.resolve(from, to).
      • beforeRequest Function - Before-request hook. You can change everything here.
      • streaming Boolean - Lets you get the res object as soon as the request is connected. Default is false. Alias: customResponse.
      • gzip Boolean - Accept gzip response content and auto decode it, default is false.
      • timing Boolean - Enable timing or not, default is false.
      • enableProxy Boolean - Enable proxy request, default is false.
      • proxy String | Object - proxy agent uri or options, default is null.
    • callback(err, data, res) Function - Optional callback.
      • err Error - Would be null if no error occurred.
      • data Buffer | String | Object - The response data. A Buffer by default, a String if dataType is set to text, or a parsed JSON Object if it's set to json.
      • res http.IncomingMessage - The response.
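
    As an illustration of the two accepted forms of the timeout option, the sketch below expands either form into separate connecting and response values. normalizeTimeout and DEFAULT_TIMEOUT are hypothetical names used only for this example; the 5000 ms default is taken from the description above.

```javascript
// Assumed default, taken from the "5s for both phases" note above.
var DEFAULT_TIMEOUT = 5000;

// Hypothetical helper: expands the `timeout` option into two phases.
function normalizeTimeout(timeout) {
  if (timeout === undefined || timeout === null) {
    return { connect: DEFAULT_TIMEOUT, response: DEFAULT_TIMEOUT };
  }
  if (Array.isArray(timeout)) {
    // [connectingTimeout, responseTimeout]
    return { connect: timeout[0], response: timeout[1] };
  }
  // A single number applies to both phases.
  return { connect: timeout, response: timeout };
}

console.log(normalizeTimeout(5000));         // { connect: 5000, response: 5000 }
console.log(normalizeTimeout([3000, 5000])); // { connect: 3000, response: 5000 }
```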

    Returns

    http.ClientRequest - The request.

    Calling .abort() method of the request stream can cancel the request.

    Options: options.data

    When making a request:

    urllib.request('http://example.com', {
      method: 'GET',
      data: {
        'a': 'hello',
        'b': 'world'
      }
    });

    For a GET request, data will be stringified into a query string, e.g. http://example.com/?a=hello&b=world.

    For other methods such as POST, PATCH or PUT, data will by default be stringified into application/x-www-form-urlencoded format if the Content-Type header is not set.

    If Content-Type is application/json, data will be serialized with JSON.stringify.
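
    The three cases above can be sketched with Node's built-in querystring module and URL class. serializeData is a hypothetical helper written for this illustration; it mirrors the rules as described here and is not urllib's internal code.

```javascript
var querystring = require('querystring');

// Hypothetical helper: shows where `data` ends up for a given request.
// Returns the final URL and the request body (null when there is none).
function serializeData(url, method, data, contentType) {
  if (method === 'GET') {
    // GET: data is appended to the URL as a query string.
    var target = new URL(url);
    Object.keys(data).forEach(function (key) {
      target.searchParams.append(key, data[key]);
    });
    return { url: target.href, body: null };
  }
  if (contentType === 'application/json') {
    // JSON content type: data is passed through JSON.stringify.
    return { url: url, body: JSON.stringify(data) };
  }
  // Default for POST, PATCH, PUT: form-urlencoded body.
  return { url: url, body: querystring.stringify(data) };
}

var data = { a: 'hello', b: 'world' };
console.log(serializeData('http://example.com', 'GET', data).url);
// http://example.com/?a=hello&b=world
console.log(serializeData('http://example.com', 'POST', data).body);
// a=hello&b=world
console.log(serializeData('http://example.com', 'POST', data, 'application/json').body);
// {"a":"hello","b":"world"}
```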

    Options: options.content

    options.content is useful when you wish to construct the request body by yourself, for example making a Content-Type: application/json request.

    Note that if you want to send a JSON body, you should stringify it yourself:

    urllib.request('http://example.com', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json'
      },
      content: JSON.stringify({
        a: 'hello',
        b: 'world'
      })
    });

    It would make an HTTP request like:

    POST / HTTP/1.1
    Host: example.com
    Content-Type: application/json
    
    {
      "a": "hello",
      "b": "world"
    }

    The same request can also be made with options.data and the application/json content type:

    urllib.request('http://example.com', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json'
      },
      data: {
        a: 'hello',
        b: 'world'
      }
    });

    Options: options.stream

    Uploads a file with formstream:

    var urllib = require('urllib');
    var formstream = require('formstream');
    
    var form = formstream();
    form.file('file', __filename);
    form.field('hello', '你好urllib');
    
    var req = urllib.request('http://my.server.com/upload', {
      method: 'POST',
      headers: form.headers(),
      stream: form
    }, function (err, data, res) {
      // upload finished
    });

    Response Object

    The response is a plain object containing:

    • status or statusCode: response status code.
      • -1 means a network error such as ENOTFOUND
      • -2 means ConnectionTimeoutError
    • headers: response http headers, default is {}
    • size: response size
    • aborted: response was aborted or not
    • rt: total request and response time in ms.
    • timing: timing object if timing enable.
    • remoteAddress: HTTP server IP address
    • remotePort: HTTP server port
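
    The special negative status values can be told apart from real HTTP statuses with a small check like the one below. describeStatus is a hypothetical helper, and the objects passed to it are hand-written stand-ins for a urllib response.

```javascript
// Hypothetical helper: maps the special status values listed above to labels.
function describeStatus(res) {
  if (res.status === -1) {
    return 'network error';
  }
  if (res.status === -2) {
    return 'connection timeout';
  }
  return 'HTTP ' + res.status;
}

console.log(describeStatus({ status: -2, headers: {} }));  // connection timeout
console.log(describeStatus({ status: 200, headers: {} })); // HTTP 200
```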

    Response: res.aborted

    If the underlying connection was terminated before response.end() was called, res.aborted should be true.

    var urllib = require('urllib');

    // The server writes two chunks, then closes the socket without calling res.end().
    require('http').createServer(function (req, res) {
      req.resume();
      req.on('end', function () {
        res.write('foo haha\n');
        setTimeout(function () {
          res.write('foo haha 2');
          setTimeout(function () {
            res.socket.end();
          }, 300);
        }, 200);
      });
    }).listen(1984);
    
    urllib.request('http://127.0.0.1:1984/socket.end', function (err, data, res) {
      console.log(data.toString()); // expected: 'foo haha\nfoo haha 2'
      console.log(res.aborted); // expected: true
    });

    HttpClient2

    HttpClient2 is a new client intended for future use. Its request method only returns a promise, which is compatible with async/await and with generators in co.

    Options

    Options extend those of urllib, with the following additions:

    • retry Number - The retry count. When an error occurs, the request is sent again until the retry count is reached.
    • retryDelay Number - Delay in milliseconds to wait between retries.
    • isRetry Function - Determines whether to retry. It receives the response object as its first argument. By default it retries when status >= 500. Request errors are not included.
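
    The following sketch shows one reading of how these three options could interact. It is not HttpClient2's implementation: requestWithRetry, defaultIsRetry and sleep are hypothetical names, doRequest stands for any function that returns a promise of a response object, and retrying request errors without consulting isRetry is an assumption.

```javascript
// Default rule from the list above: retry on status >= 500.
function defaultIsRetry(res) {
  return res.status >= 500;
}

function sleep(ms) {
  return new Promise(function (resolve) {
    setTimeout(resolve, ms);
  });
}

// Hypothetical helper: `doRequest` is any function returning a promise of a
// response object with a `status` field.
function requestWithRetry(doRequest, options) {
  var retriesLeft = options.retry || 0;
  var isRetry = options.isRetry || defaultIsRetry;

  function retryAfterDelay() {
    retriesLeft--;
    return sleep(options.retryDelay || 0).then(attempt);
  }

  function attempt() {
    return doRequest().then(function (res) {
      if (retriesLeft > 0 && isRetry(res)) {
        return retryAfterDelay();
      }
      return res;
    }, function (err) {
      // Assumption: request errors are retried without consulting isRetry.
      if (retriesLeft > 0) {
        return retryAfterDelay();
      }
      throw err;
    });
  }

  return attempt();
}

// Usage with a fake request that fails twice with 500, then succeeds.
var calls = 0;
function fakeRequest() {
  calls++;
  return Promise.resolve({ status: calls < 3 ? 500 : 200 });
}

requestWithRetry(fakeRequest, { retry: 3, retryDelay: 10 }).then(function (res) {
  console.log(res.status, calls); // 200 3
});
```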

    Proxy

    Supports both the http and https protocols.

    Note: Only supported on Node.js >= 4.0.0.

    Programming

    urllib.request('https://twitter.com/', {
      enableProxy: true,
      proxy: 'http://localhost:8008',
    }, (err, data, res) => {
      console.log(res.status, res.headers);
    });

    System environment variable

    • http
    HTTP_PROXY=http://localhost:8008
    http_proxy=http://localhost:8008
    • https
    HTTP_PROXY=http://localhost:8008
    http_proxy=http://localhost:8008
    HTTPS_PROXY=https://localhost:8008
    https_proxy=https://localhost:8008
    $ http_proxy=http://localhost:8008 node index.js
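
    A lookup over the variables listed above might look like the sketch below. proxyFromEnv is a hypothetical helper, and the precedence order (upper-case before lower-case, HTTPS-specific before HTTP) is an assumption; the list above does not state which variable wins when several are set.

```javascript
// Hypothetical helper: picks a proxy URL for the given protocol from an
// environment object such as process.env.
function proxyFromEnv(protocol, env) {
  if (protocol === 'https:') {
    // Assumption: HTTPS-specific variables win over the HTTP ones.
    return env.HTTPS_PROXY || env.https_proxy ||
      env.HTTP_PROXY || env.http_proxy || null;
  }
  return env.HTTP_PROXY || env.http_proxy || null;
}

console.log(proxyFromEnv('http:', { http_proxy: 'http://localhost:8008' }));
// http://localhost:8008
console.log(proxyFromEnv('https:', {}));
// null
```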

    TODO

    • [ ] Support component
    • [ ] Browser env use Ajax
    • [√] Support Proxy
    • [√] Upload file like form upload
    • [√] Auto redirect handle
    • [√] https & self-signed certificate
    • [√] Connection timeout & Response timeout
    • [√] Support Accept-Encoding=gzip by options.gzip = true
    • [√] Support Digest access authentication

    License

    MIT

    Install

    npm i @s524797336/urllib

    Collaborators

    • s524797336