A proxy library that provides easy hooks to manipulate http and https traffic consistently.
filternet implements a feature-packed proxy server that allows the coder to intercept and manipulate requests and responses in a consistent manner. filternet is designed to behave consistently with:
Simplest example, everything is blinking!
var filternet = ;var myProxy = filternet;myProxy;
Run this and it will automatically listen at port 8128.
Simple https example:
var filternet = ;var sslCerts ='*.github.com': 'stargithub.key' 'stargithub.crt';var myProxy = filternet;myProxy;
This example will work as both a regular HTTPS proxy (via CONNECT) as well as a transparent HTTPS proxy (via SNI). The proxy will log bodies for all HTTP responses, and only HTTPS responses that fit '*.github.com' (note that the asterisk only works one level deep, see the SSL Certificates section).
See https://github.com/axiak/filternet/blob/master/example/skeletontest.js for a simple showing of available hooks, and https://github.com/axiak/filternet/blob/master/example/skeletontest_ssl.js for the same file with SSL intercept support.
The main function available is createProxyServer(opts) where the options available are:
This gets called first on every http request or intercepted https request. If you call callback(true), the proxy server will return a 407 response and complete.
This is used to disable intercepting. If you run callback(false), the proxy server will run as a normal proxy server would. callback(true) will enable your other listeners.
The default behavior is callback(true)
request_options is a map of data to be sent to http.request. callback expects request_options to continue the request.
The default behavior is callback(request_options);
callback expects (response_status_code, response_headers). You can use this method if you want to manipulate the response headers before they get sent.
The default behavior is callback(response_status_code, response_headers);
Given the response from the remote server, this listener enabled you to decide if the interception should happen or not. Run callback(true) if you intend on intercepting the response content.
The default behavior is callback(isHtml), where isHtml is true if the content-type is something like text/html.
(The use of the method prevents the proxy server from having to buffer images, etc.)
callback expects the content buffer or string to send to the client.
is_ssl is true if this interception was performed on an https request.
charset is a convenience string which is either the charset from the Content-Type header, or null if none was defined.
Generally, if charset is not null it's safer to run buffer.toString('utf8') to get a string. Otherwise you're probably dealing with binary data.
The default behavior is callback(buffer);
Called on any error that is not a clientError
If not defined errors actually break the proxy server.
Called on any clientError
If not defined errors actually break the proxy server.
HTTP defines compression with the Accept-Encoding header. Intercepting and analysing responses are somewhat incompatible with compression, so this library tries to get around this limitation in two ways:
If enableCompression is false, then the client's Accept-Encoding header will be rewritten to 'identity' if interception is enabled (see the enabledCheck event). Note that you can't turn off compression for just HTML, as we don't know until the response from the remote server whether or not the document is HTML.
If enabledCompression is true, then the Accept-Encoding is mangled to ensure it contains nothing more than 'gzip','deflate', or 'identity'. Then the remote response headers will indicate if the response is compressed. The response is then decompressed before they are sent to any listeners.
If recompress is enabled, then the potentially manipulated response content is then compressed again, using whatever method (gzip or deflate) was used to decompress from the server. Note that the recompress flag determines if the Content-Encoding header is mangled before it's sent to the client.
This module doesn't break SSL, and as such can't be used maliciously in conjunction with https. By default, the proxy server will serve https documents transparently without eavesdropping or calling the provided hooks to manipulate requests/responses. To alter this behavior, you need to specify the sslCerts mapping:
The keyFileName and the certificateFileName are the paths to the SSL key file and certificate file. The hostDescription is one of three things:
It's important to note that '*.example.com' will match neither 'example.com' nor 'b.a.example.com'.
To create your own certificate authority and sign your own certificates for anything, I found Zach Miller's HOW-TO easy to follow: http://pages.cs.wisc.edu/~zmiller/ca-howto/
For each distinct key file provided, the proxy server will launch a separate node HTTPS server bound to a socket file named based on the key file. (The directory is determined by the sslSockDir option.)
For a standard, non-transparent HTTPS proxy the server intercepts a CONNECT statement (with node's upgrade support) and get's the desired host name. If the host name matches one of the sslCerts, the proxy server will tunnel the client with the local HTTPS server using that certificate. Otherwise, the proxy server will tunnel with the desired remote server.
Transparent HTTPS proxy gets a little bit tricky. When the client sends a TLS handshake (the first packet), the client can specify extensions which are allows to contain the server name field. This is referred to as Server Name Indication or SNI. There's a very simple TLS handshake parser (https://github.com/axiak/filternet/blob/master/lib/sniparse.js) that tries to get the server name from the TLS handshake packet. If successful, the proxy server uses the same rules as it does for the CONNECT promotion: look for a certificate matching the host name pattern and intercept if it matches.
Note that not every HTTPS client supports SNI: notably absent from the list of support is any IE on Windows XP (see http://en.wikipedia.org/wiki/Server_Name_Indication for more information).
I'm sure there are some at the moment. If you encounter any, or if you have any improvements, feel free to file an issue or send a merge request my way.
The module is licensed under BSD 3-clause and is authored by email@example.com