s3-bucket-stream
Readable stream of the Body of every object in an S3 bucket.
Updated by @sunnypurewal
turbocrawl
A simple, fast crawling framework, so you can focus on scraping.
Updated by @sunnypurewal
hittp
HTTP library designed specifically for crawling the web, with built-in caching and per-domain queueing.
Updated by @sunnypurewal
getsitemap
Node.js module that recursively crawls a website's sitemap and returns a stream of URLs.
Updated by @sunnypurewal