README
Pipez
Pipez stands for purely functional pipelines. A pipeline is a function composed of other functions, like a sequence. It takes some data as input and pushes it down through, transforming it on each stage until the final result is achieved. Each function's output is an input for the next function in a sequence, and so on.
This tiny (~100 lines of code) library implements a novel way for describing it in modern JavaScript, proposing a framework that focuses on easy ad-hoc parameterization of a constructed utility, so you can build incredibly configurable tools with less pain.
A case study (toy logging)
Take a logging function as an example, that behaves like console.log
in general, but has also some fancy additional features, like timestamping and indentation. As the proof of concept, one may come up with the following code (omiting the screen output part):
indent = 0 // Configuration
timestamp = false
log = (...args) => // Implementation
(timestamp ? [new Date (), ...args] : args) // Insert timestamp (if needed)
.map (arg => String (arg)) // Stringify arguments
.join (' ') // Concatenate results
.split ('\n') // Split with linebreaks
.map (line => '\t'.repeat (indent) + line), // Apply indentation
... // ...
It kinda "works", but these global configuration params are not looking good. One common way is to modify a function signature, introducing a special configuration parameter to it:
log = ({ indent = 0, timestamp = false }, ...args) =>
But that's somewhat intrusive, invading the original call semantics. With closures and first-class functions, JS offers a better way of separating these concerns:
configure = ({ indent = 0, timestamp = false }) => // Configuration
(...args) => // Implementation
...
log = configure ({ indent: 2, timestamp: true })
log ('hello', 'world')
Some better languages even got this feature implemented on the syntatic level (currying)! But speaking of JS, it is also very convenient to have that configure
as a regular method of a constructed function instance. This way you can stack up multiple configure
calls, thus being able to incrementally update an existing configuration, in an ad hoc way:
mylog = log.configure ({ timestamp: true })
mylog ('hello')
mylog.configure ({ indent: 2 }) ('world') // ad-hoc configuration
I recently had coded a couple of tiny libraries (as-table, String.ify) embracing that API design principle, and found it immensely useful in practice.
But as of now, we had only scratched the surface of the hidden complexity landscape that Pipez successfully tackles. When you start thinking about configuration — i.e. what and how can be parameterized externally — once-simple things can quickly start getting really complicated...
log = ({
indentLevel = 0,
indentCharacters = '\t' // Many prefer spaces over tabs
timestamp = false,
stringify = x => String (x) // Custom argument stringifiers are more than useful
stringifyDate = date => date.toISOString (), // So that are custom date formatters, too
when = new Date (), // Sometimes you need to set a date other than the current date
linebreak = '\n', // Think about outputting HTML (may want <br>'s instead)
wordSeparator = ' ', // In HTML we may want use the instead...
}) => (...args) =>
timestamp ? [stringifyDate (when), ...args] : args
.map (arg => stringify (arg))
.join (wordSeparator)
.split (linebreak)
.map (line => indentCharacters.repeat (indentLevel) + line),
...
And your beautiful tiny several-lines-of-code-proof-of-concept-thing start turning into 500-pound nightmare in production! The new ES6 destructuring/defaults syntax is amazing and helping alot, though.
Again, there exist better ways to deal with such a high degree extensibility. Can you, by the way, tell the biggest problem with the code above? To me, paradoxically, this is the very thing that we considered good until quite recently — the separation of concerns. The actual logic now is starting to split between the externalized and the intrinsic part, and it's becoming harder to grasp the full thing, as you need to constantly switch your attention back and forth while trying to understand what's the code does. As the codebase grows and you extract more features into configurable parameters, the problem arise.
This is not really better as the global parameters... Wouldn't it be nice, if we could somehow modularize the thing, finding a way of specifying the external parameters and their default values just along with the code that uses it?
Function sequencing
Think of it as a sequence of functions. Each step is essentially a function, taking input from the previous step and outputting result to the next one in the chain:
args → timestamp → stringify → concat → linebreaks → indent
Or (in terms of function application):
indent (linebreaks (concat (stringify (timestamp (args)))))
As for somewhat unexpected feature, we can specify the sequence using the object initializer syntax, thus giving each step a meaningful name. Order matters, so it's really an ordered list, not a random dictonary — and with the new Reflect.ownKeys
API we can consistently capture the order declared:
log = pipez ({
timestamp: args => ...,
stringify: args => ...,
concat: args => ...,
linebreaks: text => ...,
indent: lines => ...,
...
})
log ('hello', 'world')
Each routine can receive the externally configurable parameters (coming as the second formal parameter of a routine). These parameters are local, so no name conflicts with other steps' stuff — both routines can declare a print
thing here, as an example:
log = pipez ({
timestamp: (args, { print = x => x.toISOString (), when = new Date () }) => [print (when), ...args],
stringify: (args, { print = x => String (x) }) => args.map (print),
...
indent: (lines, { level = 0, characters = '\t' }) => lines.map (line => characters.repeat (level) + line),
...
})
Binding to parameters
These parameters can be bound via the configure
calls.
Pre-configuring
Given a log
function, this creates a derived mylog
function configured in some special way:
mylog = log.configure ({ indent: { characters: ' ' }, timestamp: { print: x => x.getDate () } })
Ad-hoc configuration
Given the previously defined mylog
function, it prints hello world
message with the indentation level set to 2
:
mylog.configure ({ indent: { level: 2 } }) ('hello world')
Turning arbitrary steps on and off
Instead of manually coding an on/off switch:
log = pipez ({
timestamp: (args, { yes = true, when = new Date () }) => yes ? [when, ...args] : args,
...
})
You can just use this semantics, as it's already recognized by the framework as built-in:
mylog = log.configure ({ timestamp: { yes: false } })
A shortcut notation:
mylog = log.configure ({ timestamp: false }) // timestamp step will be skipped from evaluation
Replacing the code
You may override a step behavior completely, rather just changing it parameters. Pass a function instead of an object, and it will become a new step implementation. You can also declare and use the new external params as well.
Creates a derived mylog
function that draws ANSI-colored timestamps in the end of messages (using the ansicolor library):
mylog = log.configure ({ timestamp: (args, { color = 'red' }) => [...args, ansicolor[color] (new Date ())]
Prints 'hello world' followed with a cyan timestamp:
mylog.configure ({ timestamp: { color: 'cyan' } }) ('hello world')
Injecting code before and after steps
If you don't want to replace the original behavior, you may bind to the before and the after execution of steps, giving your function a special name, with +
symbol placed before or after the target step name, respectively. Following code will be chained in just after the concat
step:
log.magenta = log.configure ({ 'concat+': text => ansicolor.magenta (text) })
And this schedules to execute just before the linebreaks
step:
log.magenta = log.configure ({ '+linebreaks': text => ansicolor.magenta (text) })
Executing just a part of a sequence:
Executing all steps before a step (not including it):
let concatenated = log.before ('linebreaks') (...)
Executing all steps after a step, including it:
log.from ('linebreaks') (concatenated)
Adding inherited methods
This adds magenta
property accessor to the log
:
log = pipez ({ ... })
log.methods ({
get magenta () { return this.configure ({ 'concat+': text => ansicolor.magenta (text) }) }
})
log.magenta ('this is magenta colored')
...and to it's every derived object:
mylog = log.configure ({ ... })
mylog.magenta ('this is magenta colored too')
Accessing initial arguments
Every step can access it from its configuration parameters, as the initialArguments
property:
const logThatReturnsFirstArgument = log.configure ({
'output+': (_, { initialArguments: [first] }) => first // adds a step after the 'output' step
})
logThatReturnsFirstArgument ('foo', 'bar', 42) // returns 'foo'
Applications
- Ololog! — a platform-agnostic logging with blackjack and hookers