README
cloudblob-store
A Node.js document store built on cloud object storage. Currently only AWS S3 is supported; we hope to add Azure Blob Storage and Google Cloud Storage soon.
Overview
Use cloudblob-store as a hobbyist, for prototyping, or even at scale (scaling requires a proper caching strategy to keep request times low).
It offers indexing and search capabilities out of the box with the help of libraries such as FlexSearch and Elasticlunr.
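As an illustration of the caching strategy mentioned above, here is a minimal read-through cache sketch. It is not part of cloudblob-store: `withReadCache`, `ttlMs`, `invalidate` and the in-memory `Map` are hypothetical names and choices, and the only assumption about the store is that it exposes `get(namespace, key)` returning a promise, as in the usage example in this README.

```javascript
// Hypothetical helper, not part of @cloudblob/store.
// Wraps any object exposing get(namespace, key) -> Promise with an in-memory TTL cache.
function withReadCache(store, ttlMs = 60000, now = Date.now) {
  const cache = new Map(); // "namespace/key" -> { value, expires }

  return {
    async get(namespace, key) {
      const cacheKey = `${namespace}/${key}`;
      const hit = cache.get(cacheKey);
      if (hit && hit.expires > now()) {
        return hit.value; // served from memory, no storage request
      }
      const value = await store.get(namespace, key);
      cache.set(cacheKey, { value, expires: now() + ttlMs });
      return value;
    },

    // Drop a cached entry, e.g. after the document has been rewritten.
    invalidate(namespace, key) {
      cache.delete(`${namespace}/${key}`);
    },
  };
}
```

Because the store is high read and low write, a short TTL is often enough; reads within the TTL may return a stale document, which fits the eventually consistent model described in this README.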
Why
Sometimes you need a storage backend for data that is rarely updated, frequently read, and able to scale when required. Combining the persistence of cloud object/blob storage with a serverless architecture provides that versatility, scalability and ease of development.
The cloudblob stack was developed to provide a lightweight datastore for high-read, low-write applications that is easy to implement and extremely cost effective. Combine it with caching and indexing workers to get a scalable, eventually consistent data store.
One of the main aims is to avoid vendor lock-in. If you want to host your own stack, have a look at cloudblob-server.
The interface of the datastore client is deliberately simple. If the latency of cloud storage doesn't work for you even after adding caching, you could always wrap a MongoDB client with the same interface.
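To make the "same interface" idea concrete, here is a rough sketch of an alternative backend. It keeps documents in memory rather than in MongoDB so that it runs without extra dependencies; a real adapter would replace the `Map` operations with database calls. `MemoryStore` is a hypothetical class, and the method names (`put`, `get`, `list`, `filter`) are taken from the usage example in this README, while the return shapes and the naive substring search are assumptions.

```javascript
const { randomUUID } = require('crypto');

// Hypothetical adapter, not part of @cloudblob/store.
// Mirrors the put/get/list/filter calls used in this README.
class MemoryStore {
  constructor(ref = '_id') {
    this.ref = ref; // unique field used as the document key
    this.namespaces = new Map(); // namespace -> Map(key -> document)
  }

  _ns(namespace) {
    if (!this.namespaces.has(namespace)) {
      this.namespaces.set(namespace, new Map());
    }
    return this.namespaces.get(namespace);
  }

  async put(namespace, doc) {
    // Generate a hex reference when the document has none (assumed behaviour).
    const key = doc[this.ref] || randomUUID().replace(/-/g, '');
    const saved = { ...doc, [this.ref]: key };
    this._ns(namespace).set(key, saved);
    return saved;
  }

  async get(namespace, key) {
    return this._ns(namespace).get(key) || null;
  }

  async list(namespace) {
    return Array.from(this._ns(namespace).values());
  }

  async filter(namespace, query) {
    // Naive case-insensitive substring match over string fields; returns keys only.
    const q = String(query).toLowerCase();
    const keys = [];
    for (const [key, doc] of this._ns(namespace)) {
      const hit = Object.entries(doc).some(
        ([field, value]) =>
          field !== this.ref &&
          typeof value === 'string' &&
          value.toLowerCase().includes(q)
      );
      if (hit) keys.push(key);
    }
    return keys;
  }
}
```

Application code that only calls these methods could then switch between backends without changes, which is the point of keeping the client interface small.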
Getting started
Install the package
npm install @cloudblob/store
Example Usage
const {Datastore, AWS, Flexsearch} = require('@cloudblob/store');
var awsConfig = {
  // AWS SDK S3 client parameters
  accessKeyId: "xxx...",
  secretAccessKey: "xxx..."
}
const store = new Datastore({
  // the db name here is the bucket name
  db: 'example-database',
  storage: new AWS(awsConfig),
  // specify the namespaces and their indexer class; each namespace can use a different indexer,
  // so you can optimise for different types of data
  namespaces: {
    user: {
      // the parameters for the indexer are (fields array, unique ref)
      // the 'indexer' field is optional; use it if you want the namespace to be searchable
      indexer: new Flexsearch(['name', 'about', 'age'], '_id'),
      // unique field to use for constructing the storage key/path
      ref: "_id"
    }
  }
});
var doc = {
  name: "John Doe",
  about: "Lorem ipsum dolor sit amet...",
  age: '30'
}
// save a document; it returns a promise that resolves to the saved document (including its autogenerated unique reference)
store.put('user', doc).then(console.log)
// Prints
// {
//   _id: "<auto_generated_uuid4_hex>",
//   name: "John Doe",
//   about: "Lorem ipsum dolor sit amet...",
//   age: '30'
// }
// index the document; the namespace index file is lazy-loaded
store.index('user', doc).then(console.log)
// at this stage a manual index flush/dump is required.
store.dumpIndex('user').then(console.log)
// prints
// 'true' or 'false'
// read the document
store.get('user', 'some_doc_key').then(console.log)
// search namespace index (returns key only by default)
store.filter('user', 'John Doe').then(console.log)
// list namespace documents as a paginated response
store.list('user').then(console.log)
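The calls above are shown independently, so nothing guarantees that one finishes before the next starts. A minimal sketch of running them in order with async/await follows. `saveAndIndex` is a hypothetical helper, not part of the package; it assumes only the `put`, `index`, `dumpIndex` and `get` methods used above and that the reference field is `_id`.

```javascript
// Hypothetical helper: runs save -> index -> flush -> read in sequence.
// `store` is any object exposing put/index/dumpIndex/get as used in this README.
async function saveAndIndex(store, namespace, doc, refField = '_id') {
  // put resolves to the saved document, including its generated reference
  const saved = await store.put(namespace, doc);

  // index the saved copy so the index entry carries the reference
  await store.index(namespace, saved);

  // persist the index; the result is passed through to the caller unchanged
  const flushed = await store.dumpIndex(namespace);

  // read the document back using its reference
  const fetched = await store.get(namespace, saved[refField]);

  return { saved, flushed, fetched };
}
```

It could be called as `saveAndIndex(store, 'user', doc).then(console.log)` with the `store` and `doc` defined earlier.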
Improvements
- Move the storage backend code to separate repositories to reduce unnecessary SDK bloat
- Move the indexers to a separate package
- Implement a shardable indexer, since index files are lazy-loaded