domlike

A better DomHandler for fb55's htmlparser2

domlike is meant for use with htmlparser2. htmlparser2 is a great parser, but its DomHandler implementation went off track: it is split across two repositories, fb55/domhandler and fb55/DomUtils, and uses arbitrary names.

This repository, domlike, replaces both DomHandler and DomUtils (as well as the strange and premature domelementtype) and seeks to implement most of the DOM Level 2 and Level 3 standards for Node.

Quickstart

Install:

npm install --save domlike htmlparser2

Use:

var request = require('request');
var htmlparser2 = require('htmlparser2');
var domlike = require('domlike');

request.get('http://henrian.com', function(err, res, body) {
  if (err) throw err;

  var handler = new domlike.Handler(function(err, document) {
    if (err) throw err;

    // collect all anchors (<a> elements) and print out their text and url
    // in Markdown syntax
    document.queryPredicateAll(function(node) {
      return node.tagName === 'a';
    }).forEach(function(node) {
      console.log('[%s](%s)', node.textContent, node.attributes.href);
    });

    console.log(document.textContent);
  });

  var parser = new htmlparser2.Parser(handler, {decodeEntities: true});
  parser.write(body);
  parser.end();
});
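
The predicate query used in the quickstart can be pictured as a depth-first walk that keeps every node for which the predicate returns true. The following is only a minimal sketch of that idea, not domlike's actual source: the standalone queryPredicateAll function, the hand-built tree, and the assumption that each node is a plain object with an optional childNodes array are all hypothetical and chosen for illustration.

```javascript
// Hypothetical stand-in for a predicate query; NOT domlike's implementation.
// Assumes each node is a plain object with an optional `childNodes` array.
function queryPredicateAll(root, predicate) {
  var matches = [];
  (function visit(node) {
    // Test the node itself, then descend into its children in document order.
    if (predicate(node)) matches.push(node);
    (node.childNodes || []).forEach(visit);
  })(root);
  return matches;
}

// A tiny hand-built tree standing in for a parsed document.
var tree = {
  tagName: 'body',
  childNodes: [
    {tagName: 'a', attributes: {href: '/one'}, childNodes: []},
    {tagName: 'p', childNodes: [
      {tagName: 'a', attributes: {href: '/two'}, childNodes: []}
    ]}
  ]
};

// Collect the href of every <a> node, as the quickstart does for anchors.
var hrefs = queryPredicateAll(tree, function(node) {
  return node.tagName === 'a';
}).map(function(node) {
  return node.attributes.href;
});

console.log(hrefs);
```

Taking an arbitrary function rather than a selector string keeps the traversal trivial and lets the caller match on any node property, at the cost of being more verbose than a CSS selector.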

TODO: Compare htmlfmt output with xmlformat's.

License

Copyright 2014-2015 Christopher Brown. MIT Licensed.