Webster
Overview
Webster is a powerful and extensible web crawling framework for Node.js applications. You can use Webster to crawl websites and extract structured data from their pages.
What sets Webster apart from other crawling frameworks is that it can scrape content rendered in the browser by client-side JavaScript and Ajax requests.
Docker quick start
Pull the example Docker image and start a container:
docker pull zhuyingda/webster-demo
docker run -it zhuyingda/webster-demo
Here is a simple demo that crawls this sample site (which was also used as a demo by the Scrapy framework):
node demo_producer.js
env MOD=debug node demo_consumer.js
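The demo is split into a producer script and a consumer script. As a rough illustration of the general producer/consumer pattern that such a split suggests, here is a minimal sketch. It does not use Webster's API: the `TaskQueue` class, the `produce` and `consume` functions, and the example URLs are all hypothetical, and a plain in-memory array stands in for a Redis-backed queue.

```javascript
// Illustrative only: a plain array stands in for a Redis-backed queue,
// and none of these names come from Webster's API.
class TaskQueue {
  constructor() {
    this.items = [];
  }
  push(task) {
    this.items.push(task);
  }
  pop() {
    // Returns undefined when the queue is empty.
    return this.items.shift();
  }
  get size() {
    return this.items.length;
  }
}

// Producer side: enqueue one task per URL to crawl.
function produce(queue, urls) {
  for (const url of urls) {
    queue.push({ url });
  }
}

// Consumer side: drain the queue and hand each task to a handler.
// A real consumer would load and render the page in a browser here.
function consume(queue, handler) {
  const results = [];
  let task;
  while ((task = queue.pop()) !== undefined) {
    results.push(handler(task));
  }
  return results;
}

const queue = new TaskQueue();
produce(queue, ['http://example.com/page/1', 'http://example.com/page/2']);
const results = consume(queue, (task) => `crawled ${task.url}`);
console.log(results);
```

Keeping the two sides separate means that URLs can be queued once and then processed by one or more consumers at their own pace.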
Requirements
- Node.js 10.x+ and Redis
- Works on Linux and Mac OS X
Alternatively, you can deploy with Docker.
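To confirm that the local Node.js version meets the 10.x minimum before installing, a small preflight check along these lines may help. This is a sketch, not part of Webster: `meetsNodeRequirement` is a hypothetical helper, and it only checks the major version number.

```javascript
// Hypothetical helper, not part of Webster: checks a Node.js version string
// against the 10.x minimum stated in the requirements.
function meetsNodeRequirement(version, minMajor = 10) {
  // Accept both "v10.16.0" (process.version) and "10.16.0" (process.versions.node).
  const major = parseInt(version.replace(/^v/, '').split('.')[0], 10);
  return Number.isInteger(major) && major >= minMajor;
}

const ok = meetsNodeRequirement(process.versions.node);
console.log(ok ? 'Node.js version is supported' : 'Node.js 10.x or newer is required');
```

The check does not cover Redis, which has to be installed and running separately.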
Install
npm install webster
Usage on Raspbian Platform
sudo apt install chromium-browser chromium-codecs-ffmpeg
env MOD=debug EXE_PATH=/usr/bin/chromium-browser node demo_consumer.js
Architecture overview
Documentation
More details are available here.
Contributors
Code Contributors
This project exists thanks to all the people who contribute. [Contribute].
Financial Contributors
Become a financial contributor and help us sustain our community. [Contribute]
Individuals
Organizations
Support this project with your organization. Your logo will show up here with a link to your website. [Contribute]
License
Copyright (c) 2017-present, Yingda (Sugar) Zhu