README
sax-super-stream
A transform stream that converts XML into objects by applying a hierarchy of element parsers. It is built on the sax parser, which lets it process large XML files in a memory-efficient manner. It is also flexible: by configuring element parsers only for the elements you need to extract data from, you avoid creating an intermediate representation of the entire XML structure.
Install
$ npm install --save sax-super-stream
Usage
The example below prints the titles of the articles in an RSS feed.
var getlet = require('getlet');
var stream = require('sax-super-stream');

var PARSERS = {
  'rss': {
    'channel': {
      'item': {
        $: stream.object,
        'title': {
          $text: function(text, o) { o.title = text; }
        }
      }
    }
  }
};

getlet('http://blog.npmjs.org/rss')
  .pipe(stream(PARSERS))
  .on('data', function(item) {
    console.log(item.title);
  });
More examples can be found in the Furkot GPX and KML importers.
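Because each entry in the configuration is an ordinary function, individual handlers can be checked without opening a stream. The sketch below reuses the `title` handler from the usage example and adds a hypothetical `link` handler (an assumption for illustration; it relies on RSS items carrying a `link` element). The `$: stream.object` entry is left out so the snippet runs without the module installed, and the sample strings are made up.

```javascript
// Element parsers for the children of an RSS item.
// `link` is a hypothetical addition next to the `title` handler shown above.
var ITEM_PARSERS = {
  'title': {
    $text: function(text, o) { o.title = text; }
  },
  'link': {
    $text: function(text, o) { o.link = text; }
  }
};

// Simulate the text of each element being reported:
// every handler receives the text and the object being built.
var item = {};
ITEM_PARSERS.title.$text('Hello, npm', item);
ITEM_PARSERS.link.$text('http://example.com/hello', item);

console.log(item.title + ' -> ' + item.link);
```

In a real configuration these two entries would sit next to `$: stream.object` inside `'item'`, exactly where `'title'` is in the usage example.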
API
stream(parserConfig[, options])
Creates a transform stream that reads XML and writes objects.
parserConfig - hierarchical configuration of element parsers; the entries mirror the XML element tree, and each value describes the action performed when that element is encountered during XML parsing
options - optional set of options passed to the sax parser; the defaults are as follows:
    trim - true
    normalize - true
    lowercase - false
    xmlns - true
    position - false
    strictEntities - true
    noscript - true
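As an illustration of the `options` argument, the sketch below writes the listed defaults as a plain object and shows one way to picture the effective settings when a caller overrides a single value. The `effectiveOptions` helper is hypothetical and not part of the module; it assumes caller-supplied options are layered over the defaults, which this section implies but does not spell out.

```javascript
// Defaults as listed in this section.
var DEFAULTS = {
  trim: true,
  normalize: true,
  lowercase: false,
  xmlns: true,
  position: false,
  strictEntities: true,
  noscript: true
};

// Hypothetical helper (assumption): copy the defaults, then apply overrides.
function effectiveOptions(overrides) {
  return Object.assign({}, DEFAULTS, overrides);
}

// Ask for position tracking, keep everything else at its default.
var options = effectiveOptions({ position: true });

console.log(options);
```

An object like `{ position: true }` would be passed as the second argument, as in `stream(PARSERS, { position: true })`.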
parserConfig
parserConfig is a hierarchical object that contains references to either parse functions or other parserConfig objects.
parse function - function(xmlnode, object, context)
    xmlnode - sax node with attributes
    object - reference to the currently constructed object, if any
    context - provided for use by parser functions; it can be used to store intermediate data
    this is bound to the current parsed object stack
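A parse function matching the signature above might look like the following sketch. The `parsePoint` name, the `lat`/`lon` attributes, and the mock node are assumptions for illustration. The attribute shape (an object with a `value` property) follows what sax reports when `xmlns` is enabled, which is the default listed earlier. How the return value of a parse function is used is not covered in this section, so the sketch only mutates its arguments.

```javascript
// Hypothetical parse function: copies coordinates from the element's
// attributes onto the object under construction and keeps a running
// count of parsed points in the shared context.
function parsePoint(xmlnode, object, context) {
  var attrs = xmlnode.attributes;
  object.lat = parseFloat(attrs.lat.value);
  object.lon = parseFloat(attrs.lon.value);
  context.points = (context.points || 0) + 1;
}

// Mock of a sax node in xmlns mode, where each attribute is an object
// carrying (among other fields) its string `value`.
var node = {
  name: 'wpt',
  attributes: {
    lat: { name: 'lat', value: '52.23' },
    lon: { name: 'lon', value: '21.01' }
  }
};

var point = {};
var context = {};
parsePoint(node, point, context);

console.log(point.lat, point.lon, context.points);
```

Keeping intermediate values such as the counter in `context` rather than in module-level variables lets the same parse function serve several streams at once.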
parse config reference - object
each property of the object represents a direct child element of the parsed node in the XML hierarchy; the special $ property is a self reference
'item': parseItemFunction
is the same as:
'item': {
'