Helper package that simplifies development of Apify acts. See https://www.apifier.com for details. This is still a work in progress; things might change and break.
npm install apify --save
This package requires Node.js 6 or higher. It might work with lower versions too, but they are neither tested nor supported.
Import the package into your act:
const Apify = require('apify');
To simplify development of acts, the runtime provides the Apify.main(func) function, which does the following:
1. Invokes the user function.
2. If the function returned a promise, waits for it to resolve.
3. Exits the process.
If the user function throws an exception or some other error is encountered, Apify.main() prints the details to the console so that they are stored in the log file.
Apify.main() accepts a single argument - the user function that performs the operation of the act.
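The behaviour described above can be approximated in plain Node.js. This is only a sketch, not the package's implementation: `runMain` and its injectable `exit` parameter are hypothetical names, and the exit codes 0 and 1 are assumptions that the text above does not specify.

```javascript
// Hypothetical stand-in for Apify.main(); not the package's actual code.
// `exit` is injectable so the sketch can be exercised without ending the process.
function runMain(userFunc, exit = (code) => process.exit(code)) {
    return Promise.resolve()
        // Invoke the user function; a thrown exception becomes a rejection.
        .then(() => userFunc())
        // If a promise was returned, the chain waits for it before exiting.
        .then(() => exit(0)) // assumed success code
        .catch((err) => {
            // Print the details so that they end up in the log.
            console.error('User function failed:', err);
            exit(1); // assumed failure code
        });
}
```

Passing a recording function as `exit` makes it possible to observe how synchronous, promise-returning and throwing user functions are handled without terminating the process.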
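In the simplest case, the user function is synchronous:
Apify.main(() => {
    // Synchronous user function that returns immediately
    console.log('Hello world from an Apify act!');
});
If the user function returns a promise, it is considered asynchronous:
// request-promise is used only as an example; any HTTP client that returns a promise works
const request = require('request-promise');
Apify.main(() => {
    // Asynchronous user function that returns a promise
    return request('http://www.example.com').then((html) => {
        console.log(html);
    });
});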
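To simplify your code, you can take advantage of the async/await keywords:
// request-promise is used only as an example; any HTTP client that returns a promise works
const request = require('request-promise');
Apify.main(async () => {
    const html = await request('http://www.example.com');
    console.log(html);
});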
Note that the Apify.main() function does not need to be used at all; it is provided merely for user convenience.
When running on the Apify platform, the act process is executed with several environment variables.
To simplify access to these variables, you can use the Apify.getEnv() function, which returns an object with the following properties:
// ID of the act.
// Environment variable: APIFY_ACT_ID
actId: String

// ID of the act run.
// Environment variable: APIFY_ACT_RUN_ID
actRunId: String

// ID of the user who started the act (might be different than the owner of the act).
// Environment variable: APIFY_USER_ID
userId: String

// Authentication token representing privileges given to the act run,
// it can be passed to various Apify APIs.
// Environment variable: APIFY_TOKEN
token: String

// Date when the act was started.
// Environment variable: APIFY_STARTED_AT
startedAt: Date

// Date when the act will time out.
// Environment variable: APIFY_TIMEOUT_AT
timeoutAt: Date

// ID of the key-value store where input and output data of this act is stored.
// Environment variable: APIFY_DEFAULT_KEY_VALUE_STORE_ID
defaultKeyValueStoreId: String

// Port on which the act's internal web server is listening.
// This is still work in progress, stay tuned.
// Environment variable: APIFY_INTERNAL_PORT
internalPort: Number
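The mapping from environment variables to properties listed above can be sketched as a small function. `readActEnv` is a hypothetical name, not part of the package. The sketch assumes that missing variables map to null and that the date variables hold strings the `Date` constructor can parse.

```javascript
// Hypothetical helper mirroring the properties listed above; not the package's code.
function readActEnv(env = process.env) {
    // Assumption: a missing variable is reported as null.
    const str = (name) => (env[name] !== undefined ? env[name] : null);
    // Assumption: dates are passed as strings that the Date constructor can parse.
    const date = (name) => (env[name] ? new Date(env[name]) : null);
    const int = (name) => (env[name] ? parseInt(env[name], 10) : null);
    return {
        actId: str('APIFY_ACT_ID'),
        actRunId: str('APIFY_ACT_RUN_ID'),
        userId: str('APIFY_USER_ID'),
        token: str('APIFY_TOKEN'),
        startedAt: date('APIFY_STARTED_AT'),
        timeoutAt: date('APIFY_TIMEOUT_AT'),
        defaultKeyValueStoreId: str('APIFY_DEFAULT_KEY_VALUE_STORE_ID'),
        internalPort: int('APIFY_INTERNAL_PORT'),
    };
}
```

Taking the environment as a parameter keeps the helper easy to try locally, where the APIFY_* variables are usually not set.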
Each act can have an input and output data record, which is raw data with a specific MIME content type.
Both input and output are stored in the Apify key-value store created specifically for the act run, under keys named INPUT and OUTPUT, respectively.
The ID of the key-value store is provided by the Actor runtime as the APIFY_DEFAULT_KEY_VALUE_STORE_ID environment variable.
Use the Apify.getValue(key [, callback]) function to obtain the input of your act:
const input = await Apify.getValue('INPUT');
console.log('My input:');
console.dir(input);
If the input data has the text/plain content type, the result is a string.
For other content types, the result is a raw Buffer.
Similarly, the output can be stored using the Apify.setValue(key, value [, options] [, callback]) function as follows:
const output = { someValue: 123 };
await Apify.setValue('OUTPUT', output);
By default, the value is converted to JSON and stored with the application/json content type.
If you want to store your data with another content type, pass it in the options as follows:
await Apify.setValue('OUTPUT', 'my text data', { contentType: 'text/plain' });
In this case, the value must be a string or Buffer.
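The two storage rules above (JSON by default, raw data when a content type is given) can be illustrated with a small sketch. `prepareRecord` is a hypothetical name and the option is assumed to be called `contentType`. The code only shows the decision logic, not how the package talks to the key-value store.

```javascript
// Hypothetical helper; illustrates the rules above, not the package's internals.
function prepareRecord(value, options = {}) {
    if (!options.contentType) {
        // Default: convert to JSON and use the application/json content type.
        return { body: JSON.stringify(value), contentType: 'application/json' };
    }
    // With an explicit content type, only raw data (string or Buffer) is accepted.
    if (typeof value !== 'string' && !Buffer.isBuffer(value)) {
        throw new Error('The value must be a string or Buffer when contentType is set');
    }
    return { body: value, contentType: options.contentType };
}
```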
IMPORTANT: Do not forget to use the await keyword when calling Apify.setValue(),
otherwise the act process might finish before the output is stored and/or storage errors will not be reported!
Besides the key OUTPUT, you can use arbitrary keys to store any data from your act, such as its state or larger results.
The Apify runtime optionally depends on the selenium-webdriver package, which enables automation of a web browser.
The simplest way to launch a new web browser is to use the Apify.browse([url,] [options,] [callback]) function. For example:
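const browser = await Apify.browse('https://www.example.com/');
or, with an options object:
const browser = await Apify.browse({ url: 'https://www.example.com/', headless: true });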
The options parameter controls the settings of the web browser and has the following properties:
// Initial URL to open. Note that the url argument in Apify.browse() overrides this value.
// The default value is 'about:blank'.
url: String

// The type of the web browser to use.
// See the selenium-webdriver documentation for possible options.
// The default value is 'chrome', which is currently the only fully-supported browser.
browserName: String

// Indicates whether the browser should be opened in headless mode (i.e. without windows).
// By default, this value is based on the APIFY_HEADLESS environment variable.
headless: Boolean

// URL of the proxy server.
// Currently only the 'http' proxy type is supported.
// By default it is null, which means no proxy server is used.
proxyUrl: String

// Overrides the User-Agent HTTP header of the web browser.
// By default it is null, which means the browser uses its default User-Agent.
userAgent: String
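How the documented defaults and the url argument might combine can be sketched as follows. `resolveBrowseOptions` is a hypothetical helper, not part of the package. The text does not say how APIFY_HEADLESS is interpreted, so treating the value '1' as true is purely an assumption.

```javascript
// Hypothetical helper; shows how the documented defaults could be applied.
function resolveBrowseOptions(url, options = {}, env = process.env) {
    const defaults = {
        url: 'about:blank',
        browserName: 'chrome',
        // Assumption: APIFY_HEADLESS set to '1' means headless mode.
        headless: env.APIFY_HEADLESS === '1',
        proxyUrl: null,
        userAgent: null,
    };
    // Explicit options take precedence over the defaults.
    const resolved = Object.assign({}, defaults, options);
    // The url argument overrides options.url.
    if (url) resolved.url = url;
    return resolved;
}
```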
The result of Apify.browse() is a new instance of the Browser class, which represents a web browser instance (possibly with multiple windows or tabs).
If you pass a Node.js-style callback, the Browser instance is passed to it; otherwise, the Apify.browse() function returns a promise that resolves to the Browser instance.
The Browser class has the following properties:
// An instance of Selenium's WebDriver class.
webDriver: Object

// A method that closes the web browser and releases associated resources.
// The method has no arguments and returns a promise that resolves when the browser is closed.
close: Function

The webDriver property can be used to manipulate the web browser:
const url = await browser.webDriver.getCurrentUrl();
For more information, see WebDriver documentation.
When the web browser is no longer needed, it should be closed:
await browser.close();
By default, the getValue(), setValue() and browse() functions return a promise.
However, they also accept a Node.js-style callback parameter.
If the callback is provided, the return value of the functions is not defined and the functions only invoke the callback upon completion or error.
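One common way to support both calling styles is a small adapter like the one below. It is a sketch of the general pattern, not the package's code, and `promiseOrCallback` is a hypothetical name.

```javascript
// Hypothetical adapter: returns the promise, or routes its outcome to a callback.
function promiseOrCallback(promise, callback) {
    // No callback: the caller works with the promise directly.
    if (typeof callback !== 'function') return promise;
    // Callback given: report the result or the error in Node.js style.
    promise.then(
        (result) => callback(null, result),
        (err) => callback(err)
    );
    // With a callback, the return value is intentionally left undefined.
    return undefined;
}
```

A function built on this adapter can be awaited when no callback is passed, or used in callback style by passing `(err, result) => { ... }` as the last argument.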
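To set a promise dependency from an external library, use code such as:
// bluebird is used only as an example; any promise library can be passed
const Promise = require('bluebird');
Apify.setPromisesDependency(Promise);
If Apify.setPromisesDependency() is not called, the runtime defaults to native promises if they are available, or it throws an error.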
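The fallback rule can be sketched as follows. The function names and the module-level variable are hypothetical and only illustrate the rule (custom dependency first, then native promises, otherwise an error). They are not the package's internals.

```javascript
// Hypothetical module-level state; names are illustrative only.
let customPromise = null;

// Stores a promise implementation supplied by the user.
function setPromisesDependencySketch(PromiseLib) {
    customPromise = PromiseLib;
}

// Resolves which promise implementation to use.
function getPromisesDependencySketch() {
    if (customPromise) return customPromise;
    // Fall back to native promises when the runtime has them.
    if (typeof Promise === 'function') return Promise;
    throw new Error('Native promises are not available; set a promise dependency first');
}
```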
The Apify.client property contains a reference to the ApifyClient instance (from the apify-client NPM package) that is used for all underlying calls to the Apify API.
The instance is created when the apify package is first imported, and it is configured using environment variables such as APIFY_USER_ID and APIFY_TOKEN.
The default settings of the instance can be overridden by calling Apify.client.setOptions().
The Apify.events property contains a reference to an EventEmitter instance that is used by Actor runtime to notify your process about various events.
This will be used in the future.
TODO: this is still not finished
You can run a web server inside the act and handle the requests all by yourself.
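const http = require('http');
const server = http.createServer((req, res) => {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('Hello World\n');
});
server.listen(process.env.APIFY_INTERNAL_PORT, () => {
    console.log('Server is listening');
    // Tell the runtime that the server is ready to receive requests
    Apify.readyFreddy();
});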
Note that by calling Apify.readyFreddy() you tell the Actor runtime that your server is ready to start receiving HTTP requests over the port specified by the APIFY_INTERNAL_PORT environment variable.
npm run test to run tests
npm run test-cov to generate test coverage
npm run build to transform ES6/ES7 to ES5 by Babel
npm run clean to clean
npm publish to run Babel, run tests and publish the package to NPM