scrap-dans-tes-mains
All the scrapers we need to get content from the worldwide internet.
Install
- Yarn is a promising Node package manager, and we recommend using it. First make sure it is installed globally (https://yarnpkg.com/en/docs/install).
- Install the node_modules with
yarn
- Symlink the 'scrap' binary command
yarn symlink
Run locally
- Launch the watcher once. It transpiles your Babel src files into vanilla JavaScript lib files every time you change a scraper file in src.
yarn dev
- You can execute a scrap run with the following command, for example
scrap --url='https://www.washingtonpost.com/world/national-security/american-strikes-against-syria-prompt-both-praise-and-condemnation/2017/04/07/df58e194-1bb1-11e7-855e-4824bbb5d748_story.html?hpid=hp_hp-top-table-main_ussyria-820pm%3Ahomepage%2Fstory&utm_term=.1d9f998edc55'
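Alternatively, run the scripts/scrap.js file directly with node. Quote the URL in either case, since most shells treat an unquoted & or ? specially.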
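
The command above passes a full article URL, query string included. The example file src/www_washingtonpost_com.js suggests that scrapers are named after the hostname, with dots replaced by underscores. The following is a minimal sketch of that lookup under this assumption; `scraperNameFromUrl` is a hypothetical helper, not a function from this repository, and the URL is a shortened placeholder.

```javascript
// Hypothetical helper, not part of this repository: derive a scraper
// module name from a URL by replacing the dots in its hostname.
function scraperNameFromUrl(rawUrl) {
  // The WHATWG URL class is a global in Node.js 10+ and throws on invalid input.
  const { hostname } = new URL(rawUrl);
  return hostname.replace(/\./g, '_');
}

// Shortened placeholder URL; the query string does not affect the result.
const name = scraperNameFromUrl(
  'https://www.washingtonpost.com/world/some-story.html?hpid=hp&utm_term=.1d9'
);
console.log(name); // www_washingtonpost_com
```

Only the hostname is used here, so tracking parameters such as utm_term would not change which scraper is selected.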
Add a new scraper
- Follow the example in src/www_washingtonpost_com.js and create your own scraper.
- Don't forget to add a test file in the test folder, and run the tests!
yarn test