Purell: The Hand Sanitizer for HTML
Purell extracts a manuscript from any kind of file format and prepares it for conversion into a raw Superbook. It is
a pure CLI utility that converts and sanitizes low-quality markup into clean, Markdown-compatible markup, the lowest common denominator between the web and other proprietary formats.
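To illustrate what "sanitizing low-quality markup" can mean in practice, here is a minimal Python sketch. It is not Purell's actual implementation: the `ALLOWED_TAGS` and `ALLOWED_ATTRS` whitelists and the `sanitize` function are hypothetical names chosen for this example, assuming that "Markdown-compatible" means keeping only tags that have a direct Markdown equivalent and dropping presentational attributes.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: tags that map directly onto Markdown constructs.
ALLOWED_TAGS = {
    "h1", "h2", "h3", "p", "em", "strong", "ul", "ol", "li",
    "a", "blockquote", "code", "pre",
}
# Hypothetical per-tag attribute whitelist; style, class, etc. are dropped.
ALLOWED_ATTRS = {"a": {"href"}}
# Tags whose text content is discarded along with the tag itself.
SKIPPED_TAGS = {"script", "style"}


class MarkupSanitizer(HTMLParser):
    """Re-emits only whitelisted tags and attributes; keeps text content."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
            return
        if tag not in ALLOWED_TAGS:
            return  # drop the tag, but its inner text is still emitted
        allowed = ALLOWED_ATTRS.get(tag, set())
        kept = [(k, v) for k, v in attrs if k in allowed and v is not None]
        attr_text = "".join(f' {k}="{escape(v, quote=True)}"' for k, v in kept)
        self.parts.append(f"<{tag}{attr_text}>")

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS:
            self._skip_depth = max(0, self._skip_depth - 1)
            return
        if tag in ALLOWED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(escape(data, quote=False))


def sanitize(html: str) -> str:
    """Return a copy of `html` reduced to the whitelisted markup."""
    parser = MarkupSanitizer()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return "".join(parser.parts)
```

For example, a paragraph exported from a word processor with `class` and `style` attributes and a nested `span` would come back as a plain `<p>` element containing only its text.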
How to use
Purell is designed primarily for use at the root of a Bookiza app, but it can also serve as a translator between file formats, like so:
- MS Word ⭌ Ugly HTML extract ⭌ Markdown Compatible HTML (Sanitized) ⭌ Superbook
- ePub ⭌ Markdown Compatible HTML (Sanitized) ⭌ Superbook
- PDF ⭌ Markdown Compatible HTML (Sanitized) ⭌ Superbook
- Webpage (Scroll) ⭌ Markdown Compatible HTML (Sanitized) ⭌ Superbook
Documentation on these parts is sparse. Copious amounts of patience are advised.
$ pure --help
$ pure fetch <url>             // Will fetch original.html from source URI
$ pure defile <path to file>   // Will extract original.html from source file
$ pure sanitize                // Markdown Compatible HTML (Sanitized)
The responsibility for paginating the Markdown-compatible HTML (the sanitized.html file) into a Superbook is left to its h2 elements.
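As a rough illustration of h2-driven pagination, the sketch below splits a sanitized HTML string into pages. It assumes that every `<h2>` opening tag starts a new page, which is an interpretation of the sentence above and not a documented rule; the `paginate` function is a hypothetical helper, not part of Purell or Bookiza.

```python
import re
from typing import List

# Assumption: a new page begins at every <h2 ...> opening tag.
# The zero-width lookahead splits *before* the tag, so each heading
# stays attached to the content that follows it.
H2_OPEN = re.compile(r"(?=<h2\b)", re.IGNORECASE)


def paginate(sanitized_html: str) -> List[str]:
    """Split sanitized HTML into page-sized chunks at h2 boundaries."""
    chunks = H2_OPEN.split(sanitized_html)
    # Drop empty chunks, e.g. when the document starts with an <h2>.
    return [chunk.strip() for chunk in chunks if chunk.strip()]


pages = paginate("<p>intro</p><h2>One</h2><p>a</p><h2>Two</h2><p>b</p>")
for number, page in enumerate(pages, start=1):
    print(number, page)
```

Any content before the first `<h2>` ends up as a page of its own in this sketch; whether a real Superbook treats such front matter that way is left open here.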
Blue Oak Model License 1.0.0