@wholebuzz/fs

File system abstraction with implementations for GCP GCS, AWS S3, Azure, SMB, HTTP, and Local file systems. Provides atomic primitives enabling multiple readers and writers.

Provides file format implementations for:

Additionally provides streaming & sharding utilities.

Dependencies

The FileSystem implementations require peer dependencies:

  • AnyFileSystem: None. URL resolution as a FileSystem. Files have URLs and HTTP is a file system.
  • AzureBlobStorageFileSystem: @azure/storage-blob and @azure/identity
  • AzureFileShareFileSystem: @azure/storage-file-share
  • GoogleCloudFileSystem: @google-cloud/storage
  • HTTPFileSystem: axios
  • LocalFileSystem: fs-ext, glob, and glob-stream
  • S3FileSystem: aws-sdk, s3-stream-upload, and athena-express
  • SMBFileSystem: @marsaud/smb2

Credits

Built with the tree-stream primitives ReadableStreamTree and WritableStreamTree.

Project history

The project started to support @wholebuzz/archive, a terabyte-scale archive for GCS. The focus has since expanded to include powering dbcp with a collection of file system implementations under a common interface. The atomic primitives are only available for the Google Cloud Storage and local file systems.

Example

import {
  AnyFileSystem,
  GoogleCloudFileSystem,
  HTTPFileSystem,
  LocalFileSystem,
  S3FileSystem
} from '@wholebuzz/fs'
import { readJSON, writeJSON } from '@wholebuzz/fs/lib/json'

const httpFileSystem = new HTTPFileSystem()
const fs = new AnyFileSystem([
  { urlPrefix: 'gs://', fs: new GoogleCloudFileSystem() },
  { urlPrefix: 's3://', fs: new S3FileSystem() },
  { urlPrefix: 'http://', fs: httpFileSystem },
  { urlPrefix: 'https://', fs: httpFileSystem },
  { urlPrefix: '', fs: new LocalFileSystem() },
])

await writeJSON(fs, 's3://bucket/file', { foo: 'bar' })
const foobar = await readJSON(fs, 's3://bucket/file')
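
The AnyFileSystem constructor above receives an ordered list of { urlPrefix, fs } entries. One plausible reading is that each URL is routed to the first entry whose prefix matches, which would explain why the empty-prefix local entry comes last. The sketch below only illustrates that idea: PrefixEntry and resolveByPrefix are hypothetical names, not part of @wholebuzz/fs, and the library's actual resolution logic may differ.

```typescript
// Hypothetical illustration of prefix-based routing; not the library's code.
interface PrefixEntry<T> {
  urlPrefix: string
  fs: T
}

// Returns the file system of the first entry whose urlPrefix starts the URL.
function resolveByPrefix<T>(entries: PrefixEntry<T>[], url: string): T {
  for (const entry of entries) {
    if (url.startsWith(entry.urlPrefix)) return entry.fs
  }
  throw new Error(`No file system registered for ${url}`)
}

// Strings stand in for real file system instances.
const entries: PrefixEntry<string>[] = [
  { urlPrefix: 'gs://', fs: 'gcs' },
  { urlPrefix: 's3://', fs: 's3' },
  { urlPrefix: '', fs: 'local' }, // the empty prefix matches every URL, so it goes last
]

const s3Target = resolveByPrefix(entries, 's3://bucket/file')
const localTarget = resolveByPrefix(entries, '/tmp/file.json')
```

Under this reading, moving the empty-prefix entry to the front would send every URL to the local file system.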

CLI

node lib/cli.js ls .
node lib/cli.js --help

API Reference

Constructors

constructor

+ new FileSystem(): FileSystem

Returns: FileSystem

Methods

appendToFile

Abstract appendToFile(urlText: string, writeCallback: (stream: WritableStreamTree) => Promise<boolean>, createCallback?: (stream: WritableStreamTree) => Promise<boolean>, createOptions?: CreateOptions, appendOptions?: AppendOptions): Promise<null | FileStatus>

Appends to the file, safely. Either writeCallback or createCallback is called. For simple appends, the same parameter can be supplied for both writeCallback and createCallback.

Parameters

Name Type Description
urlText string The URL of the file to append to.
writeCallback (stream: WritableStreamTree) => Promise<boolean> Stream callback for appending to the file.
createCallback? (stream: WritableStreamTree) => Promise<boolean> Stream callback for initializing the file, if necessary.
createOptions? CreateOptions Initial metadata for initializing the file, if necessary.
appendOptions? AppendOptions -

Returns: Promise<null | FileStatus>

Defined in: src/fs.ts:203
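
The append-or-create contract described above can be modeled in memory. This is a sketch under assumptions, not the library's implementation: a plain string array stands in for WritableStreamTree, appendOrCreate and StreamCallback are hypothetical names, and returning true when no createCallback is given is a guess.

```typescript
// A string array stands in for the writable stream in this model.
type StreamCallback = (chunks: string[]) => Promise<boolean>

// Hypothetical in-memory "file system": URL -> chunks written so far.
const files: Record<string, string[]> = {}

// Calls exactly one of the callbacks, depending on whether the file exists.
async function appendOrCreate(
  url: string,
  writeCallback: StreamCallback,
  createCallback?: StreamCallback
): Promise<boolean> {
  const existing = files[url]
  if (existing) return writeCallback(existing)
  const created: string[] = []
  files[url] = created
  return createCallback ? createCallback(created) : true
}

// For simple appends the same callback serves both roles.
const appendLine: StreamCallback = async (chunks) => {
  chunks.push('entry\n')
  return true
}

async function demo(): Promise<string> {
  await appendOrCreate('log.txt', appendLine, appendLine) // creates the file
  await appendOrCreate('log.txt', appendLine, appendLine) // appends to it
  return files['log.txt'].join('')
}
```

Passing distinct callbacks is useful when a new file needs a header that later appends must not repeat.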


copyFile

Abstract copyFile(sourceUrlText: string, destUrlText: string): Promise<boolean>

Copies the file.

Parameters

Name Type Description
sourceUrlText string The URL of the source file to copy.
destUrlText string The destination URL to copy the file to.

Returns: Promise<boolean>

Defined in: src/fs.ts:172


createFile

Abstract createFile(urlText: string, createCallback?: (stream: WritableStreamTree) => Promise<boolean>, options?: CreateOptions): Promise<boolean>

Creates file, failing if the file already exists.

Parameters

Name Type Description
urlText string The URL of the file to create.
createCallback? (stream: WritableStreamTree) => Promise<boolean> Stream callback for initializing the file.
options? CreateOptions -

Returns: Promise<boolean>

Defined in: src/fs.ts:149


ensureDirectory

Abstract ensureDirectory(urlText: string, options?: EnsureDirectoryOptions): Promise<boolean>

Ensures the directory exists.

Parameters

Name Type Description
urlText string The URL of the directory.
options? EnsureDirectoryOptions -

Returns: Promise<boolean>

Defined in: src/fs.ts:103


fileExists

Abstract fileExists(urlText: string): Promise<boolean>

Returns true if the file exists.

Parameters

Name Type Description
urlText string The URL of the file whose existence is checked.

Returns: Promise<boolean>

Defined in: src/fs.ts:115


getFileStatus

Abstract getFileStatus(urlText: string, options?: GetFileStatusOptions): Promise<FileStatus>

Determines the file status. The file version is used to implement atomic mutations.

Parameters

Name Type Description
urlText string The URL of the file to retrieve the status for.
options? GetFileStatusOptions -

Returns: Promise<FileStatus>

Defined in: src/fs.ts:121
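
The role of the file version in atomic mutations can be illustrated with a compare-and-swap over an in-memory store. This is a minimal sketch of the general technique (optimistic concurrency), not the library's code: VersionedStore and its methods are hypothetical, and the integer version counter is an assumption.

```typescript
// Hypothetical in-memory stand-in for a versioned object store.
interface VersionedFile {
  version: number
  body: string
}

class VersionedStore {
  private files: Record<string, VersionedFile> = {}

  // Reports the current version, or undefined when the file does not exist.
  getVersion(url: string): number | undefined {
    const file = this.files[url]
    return file ? file.version : undefined
  }

  // Writes only if the caller saw the latest version; returns false on a lost race.
  replaceIfVersion(url: string, expected: number | undefined, body: string): boolean {
    if (this.getVersion(url) !== expected) return false
    this.files[url] = { version: (expected ?? 0) + 1, body }
    return true
  }

  read(url: string): string | undefined {
    const file = this.files[url]
    return file ? file.body : undefined
  }
}

const store = new VersionedStore()
const created = store.replaceIfVersion('gs://bucket/counter', undefined, '1')
const seen = store.getVersion('gs://bucket/counter')
// Two writers read the same version; only the first write succeeds.
const firstWriter = store.replaceIfVersion('gs://bucket/counter', seen, '2')
const staleWriter = store.replaceIfVersion('gs://bucket/counter', seen, '3')
```

A writer that loses the race would typically re-read the status and retry with the new version.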


moveFile

Abstract moveFile(sourceUrlText: string, destUrlText: string): Promise<boolean>

Moves the file.

Parameters

Name Type Description
sourceUrlText string The URL of the source file to move.
destUrlText string The destination URL to move the file to.

Returns: Promise<boolean>

Defined in: src/fs.ts:179


openReadableFile

Abstract openReadableFile(url: string, options?: OpenReadableFileOptions): Promise<ReadableStreamTree>

Opens a file for reading.

Optional version: fails if the version doesn't match, for GCS URLs.

Parameters

Name Type Description
url string The URL of the file to read from.
options? OpenReadableFileOptions -

Returns: Promise<ReadableStreamTree>

Defined in: src/fs.ts:128


openWritableFile

Abstract openWritableFile(url: string, options?: OpenWritableFileOptions): Promise<WritableStreamTree>

Opens a file for writing.

Optional version: fails if the version doesn't match, for GCS URLs.

Parameters

Name Type Description
url string The URL of the file to write to.
options? OpenWritableFileOptions -

Returns: Promise<WritableStreamTree>

Defined in: src/fs.ts:138


queueRemoveFile

Abstract queueRemoveFile(urlText: string): Promise<boolean>

Queues deletion, e.g. after DaysSinceCustomTime.

Parameters

Name Type Description
urlText string The URL of the file to remove.

Returns: Promise<boolean>

Defined in: src/fs.ts:165


readDirectory

Abstract readDirectory(urlText: string, options?: ReadDirectoryOptions): Promise<DirectoryEntry[]>

Returns the URLs of the files in a directory.

Parameters

Name Type Description
urlText string The URL of the directory to list files in.
options? ReadDirectoryOptions -

Returns: Promise<DirectoryEntry[]>

Defined in: src/fs.ts:88


readDirectoryStream

Abstract readDirectoryStream(urlText: string, options?: ReadDirectoryOptions): Promise<ReadableStreamTree>

Returns a stream of the URLs of the files in a directory.

Parameters

Name Type Description
urlText string The URL of the directory to list files in.
options? ReadDirectoryOptions -

Returns: Promise<ReadableStreamTree>

Defined in: src/fs.ts:94


removeDirectory

Abstract removeDirectory(urlText: string): Promise<boolean>

Removes the directory.

Parameters

Name Type Description
urlText string The URL of the directory.

Returns: Promise<boolean>

Defined in: src/fs.ts:109


removeFile

Abstract removeFile(urlText: string): Promise<boolean>

Deletes the file.

Parameters

Name Type Description
urlText string The URL of the file to remove.

Returns: Promise<boolean>

Defined in: src/fs.ts:159


replaceFile

Abstract replaceFile(urlText: string, writeCallback: (stream: WritableStreamTree) => Promise<boolean>, options?: ReplaceFileOptions): Promise<boolean>

Replaces the file, failing if the file version doesn't match.

Parameters

Name Type Description
urlText string The URL of the file to replace.
writeCallback (stream: WritableStreamTree) => Promise<boolean> Stream callback for replacing the file.
options? ReplaceFileOptions -

Returns: Promise<boolean>

Defined in: src/fs.ts:188

Module: json

Variables

JSONStream

Const JSONStream: any

Defined in: src/json.ts:11

Functions

newJSONLinesFormatter

Const newJSONLinesFormatter(): Transform

Returns: Transform

Defined in: src/json.ts:146


newJSONLinesParser

Const newJSONLinesParser(): ThroughStream

Returns: ThroughStream

Defined in: src/json.ts:147


parseJSON

parseJSON(stream: ReadableStreamTree): Promise<unknown>

Parses a JSON object from the stream. Used to implement readJSON.

Parameters

Name Type Description
stream ReadableStreamTree The stream to read a JSON object from.

Returns: Promise<unknown>

Defined in: src/json.ts:72


parseJSONLines

parseJSONLines(stream: ReadableStreamTree): Promise<unknown[]>

Parses a JSON Lines array from the stream. Used to implement readJSONLines.

Parameters

Name Type Description
stream ReadableStreamTree The stream to read JSON Lines from.

Returns: Promise<unknown[]>

Defined in: src/json.ts:80
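
For readers unfamiliar with the format, JSON Lines stores one JSON value per line. The sketch below shows the format on plain strings; parseJSONLinesText and formatJSONLinesText are hypothetical helpers written for illustration, whereas the library's functions operate on streams.

```typescript
// Parses JSON Lines text: one JSON value per non-empty line.
function parseJSONLinesText(text: string): unknown[] {
  return text
    .split('\n')
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line))
}

// Formats an array as JSON Lines: one compact JSON value per line.
function formatJSONLinesText(values: unknown[]): string {
  return values.map((value) => JSON.stringify(value) + '\n').join('')
}

const linesText = formatJSONLinesText([{ id: 1 }, { id: 2 }])
const parsedLines = parseJSONLinesText(linesText)
```

Because every line is a complete JSON value, such a file can be appended to or split into shards without re-parsing the whole document.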


pipeJSONFormatter

pipeJSONFormatter(stream: WritableStreamTree, isArray: boolean): WritableStreamTree

Create JSON formatter stream.

Parameters

Name Type Description
stream WritableStreamTree -
isArray boolean Accept array objects or property tuples.

Returns: WritableStreamTree

Defined in: src/json.ts:127


pipeJSONLinesFormatter

pipeJSONLinesFormatter(stream: WritableStreamTree): WritableStreamTree

Create JSON-lines formatter stream.

Parameters

Name Type
stream WritableStreamTree

Returns: WritableStreamTree

Defined in: src/json.ts:142


pipeJSONLinesParser

pipeJSONLinesParser(stream: ReadableStreamTree): ReadableStreamTree

Create JSON-lines parser stream.

Parameters

Name Type
stream ReadableStreamTree

Returns: ReadableStreamTree

Defined in: src/json.ts:119


pipeJSONParser

pipeJSONParser(stream: ReadableStreamTree, isArray: boolean): ReadableStreamTree

Create JSON parser stream.

Parameters

Name Type
stream ReadableStreamTree
isArray boolean

Returns: ReadableStreamTree

Defined in: src/json.ts:110


readJSON

readJSON(fileSystem: FileSystem, url: string): Promise<unknown>

Reads a serialized JSON object or array from a file.

Parameters

Name Type Description
fileSystem FileSystem -
url string The URL of the file to parse a JSON object or array from.

Returns: Promise<unknown>

Defined in: src/json.ts:17
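
Since readJSON resolves to unknown, callers need to narrow the result before using it. One common TypeScript approach is a type guard, sketched here; FooBar and isFooBar are hypothetical, and JSON.parse stands in for a value returned by readJSON.

```typescript
// The shape this caller expects to find in the file (hypothetical).
interface FooBar {
  foo: string
}

// Type guard: checks at runtime that the parsed value has the expected shape.
function isFooBar(value: unknown): value is FooBar {
  return (
    typeof value === 'object' &&
    value !== null &&
    typeof (value as { foo?: unknown }).foo === 'string'
  )
}

// Stand-in for: const loaded = await readJSON(fs, 's3://bucket/file')
const loaded: unknown = JSON.parse('{"foo":"bar"}')
const fooValue = isFooBar(loaded) ? loaded.foo : undefined
```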


readJSONHashed

readJSONHashed(fileSystem: FileSystem, url: string): Promise<[unknown, null | string]>

Reads a serialized JSON object from a file, and also hashes the file.

Parameters

Name Type Description
fileSystem FileSystem -
url string The URL of the file to parse a JSON object from.

Returns: Promise<[unknown, null | string]>

Defined in: src/json.ts:25


readJSONLines

readJSONLines(fileSystem: FileSystem, url: string): Promise<unknown[]>

Reads a serialized JSON-lines array from a file.

Parameters

Name Type Description
fileSystem FileSystem -
url string The URL of the file to parse a JSON-lines array from.

Returns: Promise<unknown[]>

Defined in: src/json.ts:35


serializeJSON

serializeJSON(stream: WritableStreamTree, obj: object | any[]): Promise<boolean>

Serializes a JSON object to the stream. Used to implement writeJSON.

Parameters

Name Type Description
stream WritableStreamTree The stream to write a JSON object to.
obj object | any[] -

Returns: Promise<boolean>

Defined in: src/json.ts:88


serializeJSONLines

serializeJSONLines(stream: WritableStreamTree, obj: any[]): Promise<boolean>

Serializes a JSON Lines array to the stream. Used to implement writeJSONLines.

Parameters

Name Type Description
stream WritableStreamTree The stream to write JSON Lines to.
obj any[] -

Returns: Promise<boolean>

Defined in: src/json.ts:103


writeJSON

writeJSON(fileSystem: FileSystem, url: string, value: object | any[]): Promise<boolean>

Serializes object or array to a JSON file.

Parameters

Name Type Description
fileSystem FileSystem -
url string The URL of the file to serialize a JSON object or array to.
value object | any[] The object or array to serialize.

Returns: Promise<boolean>

Defined in: src/json.ts:44


writeJSONLines

writeJSONLines(fileSystem: FileSystem, url: string, obj: object[]): Promise<boolean>

Serializes array to a JSON Lines file.

Parameters

Name Type Description
fileSystem FileSystem -
url string The URL of the file to serialize a JSON array to.
obj object[] -

Returns: Promise<boolean>

Defined in: src/json.ts:53


writeShardedJSONLines

writeShardedJSONLines(fileSystem: FileSystem, url: string, obj: object[], shards: number, shardFunction?: (x: object, modulus: number) => number): Promise<boolean>

Parameters

Name Type
fileSystem FileSystem
url string
obj object[]
shards number
shardFunction (x: object, modulus: number) => number

Returns: Promise<boolean>

Defined in: src/json.ts:57
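
The reference gives no description for writeShardedJSONLines, but the shardFunction signature suggests that each record is mapped to a shard index in the range 0 to modulus - 1. The sketch below shows that partitioning step only, under that assumption. partition and hashShard are hypothetical names, and how the library names or writes the shard files is not covered here.

```typescript
// Same shape as the documented shardFunction parameter.
type ShardFunction = (x: object, modulus: number) => number

// Hypothetical default: a simple string hash of the serialized record.
const hashShard: ShardFunction = (x, modulus) => {
  const serialized = JSON.stringify(x)
  let hash = 0
  for (let i = 0; i < serialized.length; i++) {
    hash = (hash * 31 + serialized.charCodeAt(i)) >>> 0 // keep within 32 bits
  }
  return hash % modulus
}

// Groups records into `shards` buckets using the shard function.
function partition(records: object[], shards: number, shardFunction: ShardFunction): object[][] {
  const buckets: object[][] = []
  for (let i = 0; i < shards; i++) buckets.push([])
  for (const record of records) {
    buckets[shardFunction(record, shards)].push(record)
  }
  return buckets
}

const records = [{ id: 1 }, { id: 2 }, { id: 3 }, { id: 4 }]
// Shard by a key so that records with the same id always land together.
const buckets = partition(records, 2, (x, modulus) => (x as { id: number }).id % modulus)
```

A key-based shard function like the one in the usage line keeps related records in the same shard, while a content hash such as hashShard only guarantees a deterministic spread.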