wordsoap-regex

Regular expressions for cleaning up dirty HTML output from Microsoft Word.

Usage (no npm install needed):

<script type="module">
  import wordsoapRegex from 'https://cdn.skypack.dev/wordsoap-regex';
</script>

README

module.exports = {
    // from http://tim.mackey.ie/CleanWordHTMLUsingRegularExpressions.aspx
    msoTags: /<[\/]?(font|span|xml|del|ins|[ovwxp]:\w+)[^>]*?>/,
    msoAttributes: /<([^>]*)(?:class|lang|style|size|face|[ovwxp]:\w+)=(?:'[^']*'|""[^""]*""|[^\s>]+)([^>]*)>/,
}
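
The two patterns above carry no flags, so a single `replace` call removes only the first match. The following is a minimal sketch of how they might be applied to a whole fragment. The helper name `cleanWordHtml`, the `gi` flags, the repeat-until-stable loop, and the final whitespace tidy-up are assumptions made for illustration and are not part of the package. Note also that the double-quoted branch `""[^""]*""` expects doubled quote characters as written, which appears to be carried over from a C# verbatim string in the linked article. Ordinary double-quoted values therefore fall through to the unquoted branch `[^\s>]+`, and values containing spaces may be only partly removed. For that reason the sample input uses single-quoted and unquoted attribute values.

```javascript
// Patterns copied from the module source above.
const wordsoapRegex = {
  msoTags: /<[\/]?(font|span|xml|del|ins|[ovwxp]:\w+)[^>]*?>/,
  msoAttributes: /<([^>]*)(?:class|lang|style|size|face|[ovwxp]:\w+)=(?:'[^']*'|""[^""]*""|[^\s>]+)([^>]*)>/,
};

// Hypothetical helper (not exported by wordsoap-regex).
function cleanWordHtml(html) {
  // Rebuild the patterns as global, case-insensitive copies (an assumption:
  // the exported regexes themselves have no flags).
  const tags = new RegExp(wordsoapRegex.msoTags.source, 'gi');
  const attrs = new RegExp(wordsoapRegex.msoAttributes.source, 'gi');

  // Drop Word-specific tags such as <span>, <font> and <o:p>.
  let out = html.replace(tags, '');

  // Each pass strips at most one attribute per tag, so repeat until
  // the string stops changing.
  let previous;
  do {
    previous = out;
    out = out.replace(attrs, '<$1$2>');
  } while (out !== previous);

  // Extra tidy-up, not part of wordsoap-regex: collapse the whitespace
  // left behind inside tags whose attributes were all removed.
  return out.replace(/<(\w+)\s+>/g, '<$1>');
}

const dirty = "<p class=MsoNormal style='margin: 0cm'>Hello <span lang=EN-US>world</span><o:p></o:p></p>";
console.log(cleanWordHtml(dirty)); // <p>Hello world</p>
```

The loop terminates because every successful replacement makes the string strictly shorter.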

License

ISC © Raine Lourie