it-tar

it-tar is a streaming tar parser and generator and nothing else. It operates purely using async iterables which means you can easily extract/parse tarballs without ever hitting the file system. Note that you still need to gunzip your data if you have a .tar.gz.
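For a .tar.gz, one option is to decompress with Node's built-in zlib module before the bytes reach the extractor. The following is a minimal sketch, assuming a Node.js environment; gunzipSource is a hypothetical helper name and not part of it-tar.

```javascript
const zlib = require('zlib')
const { Readable } = require('stream')

// Hypothetical helper: takes any (async) iterable of gzipped chunks and
// returns a stream of decompressed chunks. Node.js streams are async
// iterables, so the result can be consumed with `for await`.
// Error forwarding from the source is left out to keep the sketch short.
function gunzipSource (source) {
  return Readable.from(source).pipe(zlib.createGunzip())
}
```

The returned stream could then be used as the source passed to Tar.extract(), in place of the raw compressed data.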

Install

npm install it-tar

Usage

it-tar packs and extracts tarballs.

It implements USTAR with additional support for pax extended headers. It should be compatible with all popular tar distributions (gnutar, bsdtar, etc.).
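To give a feel for what a USTAR header looks like on the wire, here is a rough sketch that decodes a few fields from one 512-byte header block. It is not it-tar's own parser: the function names are hypothetical, and it ignores the prefix field, pax extended headers and other details that a real implementation has to handle.

```javascript
// Read a NUL-terminated ASCII field from a header block.
function readString (block, offset, length) {
  const field = block.subarray(offset, offset + length)
  const end = field.indexOf(0)
  return field.toString('ascii', 0, end === -1 ? field.length : end)
}

// Numeric fields are stored as octal ASCII digits.
function readOctal (block, offset, length) {
  const text = readString(block, offset, length).trim()
  return text === '' ? 0 : parseInt(text, 8)
}

// Hypothetical sketch: decode a few fields of a USTAR header block.
function parseUstarHeader (block) {
  if (block.length < 512) throw new Error('header block must be 512 bytes')
  return {
    name: readString(block, 0, 100),   // entry name
    mode: readOctal(block, 100, 8),    // permission bits
    size: readOctal(block, 124, 12),   // body size in bytes
    isUstar: readString(block, 257, 5) === 'ustar' // magic value
  }
}
```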

Packing

To create a pack stream, use Tar.pack() and pipe entries to it.

const fs = require('fs')
const Tar = require('it-tar')
const pipe = require('it-pipe')
const toIterable = require('stream-to-it')

await pipe(
  [
    // add a file called my-test.txt with the content "Hello World!"
    {
      header: { name: 'my-test.txt' },
      body: 'Hello World!'
    },
    // add a file called my-stream-test.txt from a stream
    // (the size in the header must match the stream's byte length)
    {
      header: { name: 'my-stream-test.txt', size: 11 },
      body: fs.createReadStream('./my-stream-test.txt')
    }
  ],
  Tar.pack(),
  // pipe the pack stream somewhere
  toIterable.sink(process.stdout)
)
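The entries passed to the pack stream do not have to be a fixed array. As a sketch, the entries could come from an async generator that looks up each file's size with stat instead of hard-coding it. entriesFromFiles is a hypothetical helper name, and using the file's base name as the entry name is an arbitrary choice made for this illustration.

```javascript
const fs = require('fs')
const path = require('path')

// Hypothetical helper: yields one { header, body } entry per file path.
// The size is read with stat so it does not need to be hard-coded when
// the body is a stream.
async function * entriesFromFiles (filePaths) {
  for (const filePath of filePaths) {
    const { size } = await fs.promises.stat(filePath)
    yield {
      header: { name: path.basename(filePath), size },
      body: fs.createReadStream(filePath)
    }
  }
}
```

The generator's result could take the place of the array literal as the first argument to pipe().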

Extracting

To extract a stream, use Tar.extract() and pipe a source iterable to it.

const Tar = require('it-tar')
const pipe = require('it-pipe')

await pipe(
  source, // An async iterable (for example a Node.js readable stream)
  Tar.extract(),
  async source => {
    for await (const entry of source) {
      // entry.header is the tar header (see below)
      // entry.body is the content body (might be an empty async iterable)
      for await (const data of entry.body) {
        // do something with the data
      }
    }
    // all entries read
  }
)

The tar archive is streamed sequentially, so you must drain each entry's body as you receive it; otherwise the main extract stream will receive backpressure and stop reading.

Note that the body stream yields BufferList objects, not Buffers.
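One way to drain a body and get a single Buffer out of it is sketched below. collectBody is a hypothetical helper, and it assumes every chunk has a slice() method that returns its bytes, which holds for Buffer, Uint8Array and BufferList.

```javascript
// Hypothetical helper: drains an entry body and returns its content as
// one Buffer. Calling slice() with no arguments turns a BufferList chunk
// into a plain Buffer; for a Buffer or Uint8Array it returns the same bytes.
async function collectBody (body) {
  const chunks = []
  for await (const chunk of body) {
    chunks.push(chunk.slice())
  }
  return Buffer.concat(chunks)
}
```

Inside the extract loop this could be called as `const content = await collectBody(entry.body)`. It buffers the whole entry in memory, so it only suits entries that are known to be small.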

Headers

The header object used in an entry should contain the following properties. Most of these values can be found by stat'ing a file.

{
  name: 'path/to/this/entry.txt',
  size: 1314,        // entry size. defaults to 0
  mode: 0o644,       // entry mode. defaults to 0755 for dirs and 0644 otherwise
  mtime: new Date(), // last modified date for entry. defaults to now.
  type: 'file',      // type of entry. defaults to file. can be:
                     // file | link | symlink | directory | block-device
                     // character-device | fifo | contiguous-file
  linkname: 'path',  // linked file name
  uid: 0,            // uid of entry owner. defaults to 0
  gid: 0,            // gid of entry owner. defaults to 0
  uname: 'maf',      // uname of entry owner. defaults to null
  gname: 'staff',    // gname of entry owner. defaults to null
  devmajor: 0,       // device major version. defaults to 0
  devminor: 0        // device minor version. defaults to 0
}
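Since most header values come from stat, a header could be derived from an fs.Stats object roughly as follows. headerFromStat is a hypothetical helper, not part of it-tar, and it only distinguishes files, directories and symlinks.

```javascript
// Hypothetical helper: maps an fs.Stats object to header fields.
function headerFromStat (name, stat) {
  const isDirectory = stat.isDirectory()
  return {
    name,
    size: isDirectory ? 0 : stat.size,
    mode: stat.mode & 0o7777, // keep the permission bits, drop the file type bits
    mtime: stat.mtime,
    type: isDirectory ? 'directory' : stat.isSymbolicLink() ? 'symlink' : 'file',
    uid: stat.uid,
    gid: stat.gid
  }
}
```

For example, `headerFromStat('my-test.txt', fs.lstatSync('./my-test.txt'))` would describe a local file; lstat is needed rather than stat if symlinks should be reported as such.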

Modifying existing tarballs

Using it-tar it is easy to rewrite paths, change modes, etc. in an existing tarball.

const fs = require('fs')
const path = require('path')
const Tar = require('it-tar')
const pipe = require('it-pipe')
const toIterable = require('stream-to-it')

await pipe(
  fs.createReadStream('./old-tarball.tar'),
  Tar.extract(),
  async function * (source) {
    for await (const entry of source) {
      // let's prefix all names with 'tmp'
      entry.header.name = path.join('tmp', entry.header.name)
      // write the new entry to the pack stream
      yield entry
    }
  },
  Tar.pack(),
  toIterable.sink(fs.createWriteStream('./new-tarball.tar'))
)
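The same pattern extends to other header changes, such as modes. As a sketch, a small transform factory could apply any function to each header; mapHeaders is a hypothetical name, and it assumes entries have the { header, body } shape described above.

```javascript
// Hypothetical transform factory: returns an async generator function
// that could sit between Tar.extract() and Tar.pack() in a pipeline.
function mapHeaders (fn) {
  return async function * (source) {
    for await (const entry of source) {
      // replace the header, leave the body untouched
      entry.header = fn(entry.header)
      yield entry
    }
  }
}
```

For example, `mapHeaders(header => ({ ...header, mode: 0o644 }))` would set every entry's mode to 0644 while keeping the other header fields.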

Related

  • it-pipe Utility to "pipe" async iterables together
  • it-reader Read an exact number of bytes from a binary (async) iterable
  • stream-to-it Convert Node.js streams to streaming iterables

Contribute

Feel free to dive in! Open an issue or submit PRs.

License

MIT