nft-scrape

Scrape NFTs from the Ethereum blockchain

Usage (no npm install needed)

<script type="module">
  import nftScrape from 'https://cdn.skypack.dev/nft-scrape';
</script>

README

NFT Scrape

This repository contains code to scrape all the NFTs from the Ethereum blockchain. I decided to publish it after a few people asked for it following my writeup at YourNfts.

How do you scrape all the NFTs?

NFTs generally implement EIP-721 or EIP-1155.

Discovering NFTs can be done by looking at the events they emit in the logs on the blockchain.
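As a rough illustration of event-based discovery, the sketch below classifies a raw log entry by its topics. It is not the code from this repository: the names RawLog, DiscoveredNft, and classifyLog are made up for the example, only Transfer (EIP-721) and TransferSingle (EIP-1155) are handled, and TransferBatch is left out for brevity. Returning the token id as a decimal string is also an assumption.

```typescript
// topic0 values: keccak256 hashes of the standard event signatures.
const TRANSFER_TOPIC =
  "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"; // Transfer(address,address,uint256)
const TRANSFER_SINGLE_TOPIC =
  "0xc3d58168c5ae7397731d063d5bbf3d657854427343f4c083240f7aacaa2d0f62"; // TransferSingle(address,address,address,uint256,uint256)

// Minimal shape of a log entry as returned by eth_getLogs.
interface RawLog {
  address: string;
  topics: string[];
  data: string;
}

// Hypothetical result type for this sketch (a subset of the NFT schema).
interface DiscoveredNft {
  type: "eip721" | "eip1155";
  contractAddress: string;
  tokenId: string;
  from: string;
  to: string;
}

// An indexed address occupies a 32-byte topic; the address is its last 20 bytes.
const topicToAddress = (topic: string): string => "0x" + topic.slice(-40);

// Converts a 0x-prefixed 32-byte hex word to a decimal string.
const wordToDecimal = (hexWord: string): string => BigInt(hexWord).toString();

function classifyLog(log: RawLog): DiscoveredNft | null {
  const t = log.topics;
  // Both events handled here carry exactly 4 topics. An ERC-20 Transfer has
  // the same signature hash as EIP-721 but only 3 topics, so it is dropped.
  if (t.length !== 4) return null;

  if (t[0] === TRANSFER_TOPIC) {
    // EIP-721: topics are [signature, from, to, tokenId].
    return {
      type: "eip721",
      contractAddress: log.address,
      tokenId: wordToDecimal(t[3]),
      from: topicToAddress(t[1]),
      to: topicToAddress(t[2]),
    };
  }
  if (t[0] === TRANSFER_SINGLE_TOPIC) {
    // EIP-1155: topics are [signature, operator, from, to];
    // data holds id followed by value, 32 bytes each.
    return {
      type: "eip1155",
      contractAddress: log.address,
      tokenId: wordToDecimal("0x" + log.data.slice(2, 66)),
      from: topicToAddress(t[2]),
      to: topicToAddress(t[3]),
    };
  }
  return null;
}
```

Counting topics is a common heuristic for telling EIP-721 transfers apart from ERC-20 ones, but a non-compliant contract can still slip through, so treat the result as a candidate rather than a guarantee.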

Once you have discovered a [contractAddress, tokenId] tuple, which uniquely identifies an NFT, you need to grab its metadata URI by calling contract.uri(tokenId) for EIP-1155 contracts or contract.tokenURI(tokenId) for EIP-721 contracts.
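To make the metadata lookup concrete, here is a minimal sketch of the raw eth_call request behind such a call. The package itself goes through web3, so this is only an illustration: buildUriCall and expandIdTemplate are hypothetical helper names, and the token id is assumed to be a decimal (or 0x-prefixed hex) string. EIP-721 names the function tokenURI(uint256) and EIP-1155 names it uri(uint256); EIP-1155 also lets the returned URI contain an {id} placeholder that the client substitutes.

```typescript
// 4-byte function selectors for the two metadata getters.
const SELECTORS = {
  eip721: "0xc87b56dd", // tokenURI(uint256)
  eip1155: "0x0e89341c", // uri(uint256)
} as const;

interface EthCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "eth_call";
  params: [{ to: string; data: string }, string];
}

// Encodes a token id as a 32-byte, zero-padded, lowercase hex word (no 0x prefix).
const toWord = (tokenId: string): string =>
  BigInt(tokenId).toString(16).padStart(64, "0");

// Builds the JSON-RPC payload; sending it to the node is left to the caller.
function buildUriCall(
  type: "eip721" | "eip1155",
  contractAddress: string,
  tokenId: string
): EthCallRequest {
  return {
    jsonrpc: "2.0",
    id: 1,
    method: "eth_call",
    params: [
      { to: contractAddress, data: SELECTORS[type] + toWord(tokenId) },
      "latest",
    ],
  };
}

// EIP-1155 clients replace "{id}" with the 64-character lowercase hex token id.
function expandIdTemplate(uri: string, tokenId: string): string {
  return uri.split("{id}").join(toWord(tokenId));
}
```

The node answers with an ABI-encoded string, which still has to be decoded before the URI can be fetched; that step is omitted here.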

At this point, getting the JSON metadata is a pretty straightforward web scraping task and outside the scope of this repository.

If you want more nitty-gritty details, I suggest just reading the code. It is only ~200 lines long.

Using CLI

  • You need access to an Ethereum JSON-RPC API with very high rate limits. Etherscan's rate limits are way too low. Running your own node is probably best.
  • Install this package with npm install -g nft-scrape
  • Discover the NFTs mentioned in a block using nft-scrape --rpc 'http://127.0.0.1:8545' 14000000. Note: replace 127.0.0.1 with your RPC host.
  • The command line interface writes JSONL to standard out. Schema below:
interface NFT {
    /** deterministically generated UUID for NFT */
    id: string,
    /** the contract type of NFT */
    type: 'eip721' | 'eip1155',
    /** the address where the NFTs contract can be found */
    contractAddress: string,
    /** the token id of the NFT. This and contractAddress uniquely identify an NFT */
    tokenId: string,
    /** transaction hash where NFT was discovered */
    tnxHash: string,
    /** block where NFT was discovered */
    blkNum: number,
    /** from address iff NFT was discovered via a Transfer */
    from?: string,
    /** to address iff NFT was discovered via a Transfer */
    to?: string,
    /** operator iff NFT was discovered via an EIP-1155 Transfer */
    operator?: string,
    /** value iff NFT was discovered via an EIP-1155 Transfer */
    value?: string,
    /** URI for NFT metadata */
    uri: string,
}

How to scrape all the blocks

To achieve any sort of speed you will need to use multiple processes. The CLI program GNU parallel is an easy way to do this. The command below scrapes blocks 14000000 to 14000500 using 8 processes.

seq 14000000 14000500 | parallel -j8 --ungroup nft-scrape --rpc 'http://127.0.0.1:8545' '{}'

Deduplication

The same NFT can be discovered in multiple blocks. Deduplication can be done by using the id field in the JSONL.
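One way to deduplicate, sketched under the assumption that the JSONL output has already been read into an array of lines, is to keep the first line seen for each id. The function name dedupeJsonl is made up for this example.

```typescript
// Keeps the first JSONL line for each distinct `id`, preserving input order.
function dedupeJsonl(lines: string[]): string[] {
  const seen = new Set<string>();
  const unique: string[] = [];
  for (const line of lines) {
    // Skip blank lines, e.g. a trailing newline at the end of the file.
    if (line.trim() === "") continue;
    const { id } = JSON.parse(line) as { id: string };
    if (seen.has(id)) continue;
    seen.add(id);
    unique.push(line);
  }
  return unique;
}
```

This holds every id in memory, which is fine for a test run; a scrape of the whole chain would more likely want a database with a unique key on id, or an external sort.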

Using Node.js

  • Install the package with npm install --save nft-scrape
  • Import it with import getNftsInBlock from "nft-scrape"
  • Call getNftsInBlock(web3: Web3, blockNum: number): AsyncGenerator<Nft, void, void>
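Because getNftsInBlock returns an async generator, it is consumed with for await. The sketch below shows one way to walk a range of blocks. To stay free of dependencies, it takes the scraper function as a parameter instead of importing it; BlockScraper and scrapeRange are hypothetical names, and the Nft type is cut down to the id field.

```typescript
// Only the field this sketch needs; the full schema has more fields.
interface Nft {
  id: string;
}

// Mirrors the documented signature getNftsInBlock(web3, blockNum), with the
// client type left generic so the sketch does not depend on the web3 package.
type BlockScraper<C> = (client: C, blockNum: number) => AsyncIterable<Nft>;

// Scrapes blocks firstBlock..lastBlock (inclusive) one after another and
// returns everything that was yielded, duplicates included.
async function scrapeRange<C>(
  scrape: BlockScraper<C>,
  client: C,
  firstBlock: number,
  lastBlock: number
): Promise<Nft[]> {
  const found: Nft[] = [];
  for (let blockNum = firstBlock; blockNum <= lastBlock; blockNum++) {
    for await (const nft of scrape(client, blockNum)) {
      found.push(nft);
    }
  }
  return found;
}
```

Assuming the default export behaves as documented, a call might look like scrapeRange(getNftsInBlock, new Web3('http://127.0.0.1:8545'), 14000000, 14000010). The loop is sequential, so it will be slow for large ranges.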

Notes

  • The metadata extension is optional, so not every NFT has a uri.
  • Some calls to get the metadata uri need a higher gas cap than the default. If you are using geth, --rpc.gascap 100000000 should fix the issue.