email-regex-safe

Regular expression matching for email addresses. Maintained, configurable, more accurate, and browser-friendly alternative to email-regex. Works in Node v10.12.0+ and browsers. Made for Spam Scanner and Forward Email.

Usage (no npm install needed!)

<script type="module">
  import emailRegexSafe from 'https://cdn.skypack.dev/email-regex-safe';
</script>

Foreword

Previously I was using email-regex through my work on Spam Scanner and Forward Email. However, that package has too many issues and false positives.

This package aims to match the real-world intended usage of an email regular expression more closely, and it also lets you configure several Options. Please check out Forward Email if this package helped you, and explore our source code on GitHub, which shows how we use this package.

It does not perform strict email validation; instead, it returns complete matches that resemble an email address. We recommend using validator.isEmail for validation (e.g. validator.isEmail(match)).
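The match-then-validate flow described above could be sketched roughly as follows. The helper name extractValidEmails is hypothetical and not part of this package; it takes the regular expression and the validation function as arguments so the flow itself stays visible. The sketch assumes the regular expression is global, as in the usage examples in this README.

```javascript
// Hypothetical helper (not part of email-regex-safe): extract candidate
// matches first, then keep only those the supplied validator accepts.
function extractValidEmails(text, regex, isEmail) {
  // String#match returns null when nothing matches, so fall back to [].
  const candidates = text.match(regex) || [];
  // Wrap isEmail so that Array#filter's index argument is never passed
  // along as a second (options) argument.
  return candidates.filter((candidate) => isEmail(candidate));
}

// Intended usage, assuming both packages are installed:
//   const emailRegexSafe = require('email-regex-safe');
//   const validator = require('validator');
//   extractValidEmails(text, emailRegexSafe(), (s) => validator.isEmail(s));
```

Passing the two dependencies in keeps the helper easy to test with stand-ins; in an application you would most likely close over emailRegexSafe() and validator.isEmail directly.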

Install

npm:

npm install email-regex-safe

yarn:

yarn add email-regex-safe

Usage

Node

This package automatically includes RE2 for regex optimization and vulnerability protection. With email-regex-safe, you no longer have to manually wrap your email regular expressions with new RE2(emailRegex()); we do it automatically for you.

const emailRegexSafe = require('email-regex-safe');

const str = 'some long string with foo@bar.com in it';
// String#match returns null when there are no matches, so fall back to an empty array
const matches = str.match(emailRegexSafe()) || [];

for (const match of matches) {
  console.log('match', match);
}

console.log(emailRegexSafe({ exact: true }).test('hello@example.com'));

Browser

Since RE2 is not made for the browser, it will not be used. If there were to be any regex vulnerabilities, they would only crash the user's browser tab, and not your server (as they would on the Node.js side without the use of RE2).

VanillaJS

This is the solution for you if you're just using <script> tags everywhere!

<script src="https://unpkg.com/email-regex-safe"></script>
<script type="text/javascript">
  (function() {
    var str = 'some long string with foo@bar.com in it';
    // str.match returns null when there are no matches, so fall back to an empty array
    var matches = str.match(emailRegexSafe()) || [];

    for (var i=0; i<matches.length; i++) {
      console.log('match', matches[i]);
    }

    console.log(emailRegexSafe({ exact: true }).test('hello@example.com'));
  })();
</script>

Bundler

Assuming you are using browserify, webpack, rollup, or another bundler, you can simply follow the Node usage above.

Options

| Property | Type | Default Value | Description |
| --- | --- | --- | --- |
| exact | Boolean | false | Only match an exact String. Useful with regex.test(str) to check if a String is an email address. We set this to false by default, as the most common use case for a RegExp parser is to parse out emails, as opposed to checking strict validity; we feel this more closely resembles real-world intended usage of this package. |
| strict | Boolean | false | If true, then it will allow any TLD as long as it is a minimum of 2 valid characters. If it is false, then it will match the TLD against the list of valid TLDs using tlds. |
| gmail | Boolean | true | Whether or not to abide by Gmail's rules for email usernames (see Gmail's Create a username article for more insight). Note that since RE2 supports neither negative lookahead nor negative lookbehind, we leave it up to you to filter out a select few invalid matches while using gmail: true. Invalid matches would be those that end with a "." (period) or "+" (plus symbol), or have two or more consecutive periods ("..") anywhere in the username portion. We recommend using str.match(emailRegexSafe()) to get an Array of all matches, and then keeping those that pass validator.isEmail after stripping end period(s) and/or plus symbol(s) from them, as well as filtering out matches with repeated periods. |
| utf8 | Boolean | true | Whether or not to allow UTF-8 characters for email usernames. This Boolean is only applicable if the gmail option is set to false. |
| localhost | Boolean | true | Allows localhost in the hostname portion. See test/test.js for more insight into the localhost test and how it will return a value which may be unwanted. A pull request would be considered to resolve the "pic.jp" vs. "pic.jpg" issue. |
| ipv4 | Boolean | true | Match against IPv4 addresses in the hostname portion. |
| ipv6 | Boolean | false | Match against IPv6 addresses in the hostname portion. This is set to false by default, since IPv6 is not really supported anywhere for email addresses, and it's not even included in validator.isEmail's logic. |
| tlds | Array | tlds | Match against a specific list of tlds, or the default list provided by tlds. |
| returnString | Boolean | false | Return the RegExp as a String instead of a RegExp (useful for custom logic, such as we did with Spam Scanner). |
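The cleanup recommended for the gmail option could look something like the sketch below. The function name cleanGmailMatches is hypothetical and not part of this package. It also rests on an assumption: it reads "end with a period or plus symbol" as referring to the username portion (the part before the "@"), so adjust it if your matches show the stray characters elsewhere.

```javascript
// Hypothetical helper (not part of email-regex-safe).
// Assumption: trailing "." and "+" are stripped from the username portion.
function cleanGmailMatches(matches) {
  const cleaned = [];
  for (const match of matches) {
    const at = match.lastIndexOf('@');
    if (at === -1) continue; // not shaped like an address, skip it

    // Strip any run of trailing periods and plus symbols from the username.
    const username = match.slice(0, at).replace(/[.+]+$/, '');
    const domain = match.slice(at + 1);

    // Drop empty usernames and usernames with two or more consecutive periods.
    if (username === '' || username.includes('..')) continue;

    cleaned.push(`${username}@${domain}`);
  }
  return cleaned;
}

const sample = ['foo.@gmail.com', 'a..b@gmail.com', 'ok+tag@gmail.com', 'bar+@gmail.com'];
console.log(cleanGmailMatches(sample));
// → [ 'foo@gmail.com', 'ok+tag@gmail.com', 'bar@gmail.com' ]
```

The cleaned list would then still be passed through validator.isEmail, as the table recommends.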

How to validate an email address

If you would like to validate the email addresses found, use the validator.isEmail method. This further enforces the email RFC specification limits of 64 characters for the username/local part of the email address, 254 for the domain/hostname portion, and 255 in total, including the "@" (at symbol).
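To make the length limits concrete, here is a small sketch that checks only those three figures, exactly as quoted above. It is not validator.isEmail's implementation and does not replace it; the function name withinLengthLimits and the constant names are hypothetical.

```javascript
// Limits as quoted in this section (not read from validator's source).
const MAX_LOCAL = 64; // username/local part
const MAX_DOMAIN = 254; // domain/hostname portion
const MAX_TOTAL = 255; // whole address, including the "@"

// Hypothetical helper: true when every part fits within the quoted limits.
function withinLengthLimits(address) {
  const at = address.lastIndexOf('@');
  // Require a non-empty local part and a non-empty domain.
  if (at < 1 || at === address.length - 1) return false;

  const local = address.slice(0, at);
  const domain = address.slice(at + 1);

  return (
    local.length <= MAX_LOCAL &&
    domain.length <= MAX_DOMAIN &&
    address.length <= MAX_TOTAL
  );
}
```

Note that this counts JavaScript string length, which differs from byte length for non-ASCII input, so treat it as an illustration of the arithmetic only.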

Limitations

Since we cannot use regular expression "negative lookbehind" functionality (due to RE2 limitations), we could not merge the logic from this pull request. That logic would have made example.jp match only when it stands on its own. As it is, if you pass example.jpeg, it will extract example.jp from it (since .jp is a TLD). An alternative solution may exist, and we welcome community contributions regarding this issue.
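One possible workaround, offered only as a sketch and not as the package's own approach, is to post-filter the matches in application code: look at the character that follows each match in the original string and drop the match if that character is an ASCII letter or digit. The function name dropTruncatedMatches is hypothetical, and the sketch assumes the matches array is in the order returned by str.match.

```javascript
// Hypothetical post-filter (not part of email-regex-safe): drop matches that
// were cut short, e.g. "foo@example.jp" extracted from "foo@example.jpeg".
function dropTruncatedMatches(str, matches) {
  const kept = [];
  let from = 0; // search position, so repeated matches are located in order

  for (const match of matches) {
    const index = str.indexOf(match, from);
    if (index === -1) continue; // match not found in the source string

    const end = index + match.length;
    from = end;

    // Keep the match only if it ends the string or is followed by a
    // character that is not an ASCII letter or digit.
    const next = str[end];
    if (next === undefined || !/[a-z0-9]/i.test(next)) kept.push(match);
  }

  return kept;
}

const text = 'see foo@example.jpeg and bar@example.jp.';
console.log(dropTruncatedMatches(text, ['foo@example.jp', 'bar@example.jp']));
// → [ 'bar@example.jp' ]
```

This only inspects ASCII characters after the match; hostnames followed by other scripts would need a broader check.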

Contributors

| Name | Website |
| --- | --- |
| Nick Baugh | http://niftylettuce.com/ |

License

MIT © Nick Baugh