4chan-crawler

Crawl 4chan's archives from most recent to oldest, saving the contents to disk.

Usage (no npm install needed!)

<script type="module">
  import chanCrawler from 'https://cdn.skypack.dev/4chan-crawler';
</script>

README

4chan-crawlerJS

Preamble

tomcat-bit/4chan-crawler provides an easy-to-use crawler for the live site. I wanted to rework it to crawl the archive and collect text as well as media. The DDoS protection on archive.4plebs.org blocks requests from python-requests but not from Node's https module, so I built a JS version.
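As a rough illustration of the approach, here is a minimal sketch of fetching a page with Node's built-in https module. It is not this package's actual code: the names buildRequestOptions and fetchText and the User-Agent value are assumptions made for the example, and nothing here guarantees that a given request will pass the archive's protection.

```javascript
// Minimal sketch: fetch one page of HTML with Node's built-in https module.
// The function names and the User-Agent value are illustrative assumptions,
// not taken from this package's source.
const https = require('node:https');

// Build the options object for https.get from a full URL string.
function buildRequestOptions(pageUrl) {
  const { hostname, pathname, search } = new URL(pageUrl);
  return {
    hostname,
    path: pathname + search,
    headers: { 'User-Agent': 'example-archive-crawler' }, // hypothetical value
  };
}

// Resolve with the response body as a string; reject on network errors
// or on any non-2xx status code.
function fetchText(pageUrl) {
  return new Promise((resolve, reject) => {
    https
      .get(buildRequestOptions(pageUrl), (res) => {
        if (res.statusCode < 200 || res.statusCode >= 300) {
          res.resume(); // discard the body so the socket is released
          reject(new Error(`Unexpected status ${res.statusCode}`));
          return;
        }
        res.setEncoding('utf8');
        let body = '';
        res.on('data', (chunk) => { body += chunk; });
        res.on('end', () => resolve(body));
      })
      .on('error', reject);
  });
}
```

A caller could then run something like fetchText('https://archive.4plebs.org/').then((html) => console.log(html.length)) and parse the returned HTML in a separate step.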

Installation

npm i 4chan-crawler

Setup

npm i

Usage

Set the desired boards and the output directory in config.js, then start the crawler:

npm start
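For orientation, a config.js along these lines might look like the sketch below. The field names boards and outputDir and the example values are assumptions for illustration only; check the config.js shipped with the package for the real layout.

```javascript
// config.js (hypothetical layout; the real file's field names may differ)
const config = {
  // Boards to crawl, by short name (assumed field name and example values).
  boards: ['pol', 'tv'],
  // Directory where crawled text and media are written (assumed field name).
  outputDir: './output',
};

module.exports = config;
```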