robots-parser
NodeJS robots.txt parser with support for wildcard (*) matching.
Updated by @samclarke
simplecrawler
Very straightforward, event-driven web crawler. Features a flexible queue interface and a basic cache mechanism with an extensible backend.
Updated by @cgiffard
robots-txt-parser
A lightweight robots.txt parser for Node.js with support for wildcards, caching and promises.
Updated by @cakroyd
webhead
An easy-to-use Node web crawler storing cookies, following redirects, traversing pages and submitting forms.
Updated by @pme-legend
quick-scraper
An easy, lightweight scraper for humans with many inbuilt features.
Updated by @unbuttun-spark
es6-crawler-detect
An ES6 adaptation of the original PHP library CrawlerDetect. This library helps you detect bots/crawlers/spiders via the user agent.
Updated by @jefferyhus
@zachleat/spider-pig
Get a list of local URL links from a root URL. Works with JavaScript-generated content. Can also act as a live-DOM CSS search across multiple files (find all the templates that …
Updated by @zachleat
js-spider
JSpider 3 is a Chrome DevTools crawler framework that includes full crawler support.
Updated by @konghayao
aliexpress-product-scraper
Get AliExpress product details as a JSON response, including feedback, variants, description, images, etc.
Updated by @sudheerranga