utterance-expander

Text macro expansion tailored for Alexa/Lex utterance syntax.

Utterance Expander

Simple function to expand text macros, tailored for the input syntax of Amazon Lex/Alexa utterances.

There are a few of these already; we just wanted one that has:

  • Normal parentheses for the macros (instead of curly or square brackets)
  • The more common stroke / character instead of the less common pipe |
  • Common "string cleaning" replacements built in (e.g. typographic/curly apostrophes replaced with Amazon-friendly ASCII ones)
  • A filter to remove duplicate output utterances

Note: requires node v8.9.x

Macro syntax

The following input…

how (can/do) we generate (lots of/many/more) utterances
we (can/could/should) (expand macros/use macro expansion)
sometimes we (need/want) to reference Amazon {SlotNames}
we don’t like curly apostrophes

…produces this output:

how can we generate lots of utterances
how do we generate lots of utterances
how can we generate many utterances
how do we generate many utterances
how can we generate more utterances
how do we generate more utterances
we can expand macros
we could expand macros
we should expand macros
we can use macro expansion
we could use macro expansion
we should use macro expansion
sometimes we need to reference Amazon {SlotNames}
sometimes we want to reference Amazon {SlotNames}
we don't like curly apostrophes
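
The expansion shown above can be pictured as a cartesian product over the parenthesised groups. The following is only a minimal sketch of that idea, not the library's actual code: the function name expandLine is hypothetical, and it assumes groups are never nested and that {SlotNames} in curly braces are left alone.

```javascript
// Hypothetical sketch: NOT the library's actual implementation.
// Expands every "(a/b/c)" group in a line into all combinations.
// Assumption: groups are not nested.
function expandLine(line) {
    // Find the first parenthesised group without inner parentheses.
    const match = /\(([^()]*)\)/.exec(line);
    if (!match) {
        return [line]; // nothing left to expand
    }
    const before = line.slice(0, match.index);
    const after = line.slice(match.index + match[0].length);
    const results = [];
    // Expand the remainder of the line first, then pair each result
    // with every alternative of the current group.
    for (const rest of expandLine(after)) {
        for (const option of match[1].split("/")) {
            results.push(before + option + rest);
        }
    }
    return results;
}

console.log(expandLine("we (can/could/should) (expand macros/use macro expansion)"));
```

Looping over the remainder in the outer loop is a choice made here so that the sketch yields the same ordering as the sample output above; the real module may order its results differently.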

Usage

In a browser

Web version for easy copy/pasting

(Hosted on GitLab Pages; look in the docs directory if you want to edit it)

As a CLI tool

Install globally for use as a CLI tool with:

# npm install -g --production utterance-expander

Disclaimer: it's very limited; it just reads stdin and writes to stdout:

$ echo "(hello/nihao) world" | utterance-expander
hello world
nihao world
$ utterance-expander < input_file.txt | sort -u > utterances.txt

Please note that the CLI script operates on each input line in isolation. It cannot detect expanded duplicates in the output if they're caused by different input lines. Always pass CLI output through sort -u.
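
If you post-process the CLI output in Node rather than in a shell, the same cross-line de-duplication can be done with a Set. This is just an illustrative sketch; the dedupe helper is a hypothetical name and not part of the package.

```javascript
// Hypothetical helper, not part of utterance-expander.
// Removes duplicate utterances across all lines, keeping the first
// occurrence of each (unlike sort -u, the original order is preserved).
function dedupe(utterances) {
    return [...new Set(utterances)];
}

// Example: the same utterance produced by two different input lines.
const expandedLines = ["hello world", "nihao world", "hello world"];
console.log(dedupe(expandedLines));
```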

As a module in something else

Install in your project with:

$ npm install --save utterance-expander

It exposes a single method: expand(origin_text, replacements)

  • origin_text Array: Original input lines with macros
  • (optional) replacements Array: String replacement objects; Objects have replace and with properties

Used like so:

const utter = require("utterance-expander");
const sample_input = [
    "(hello/nihao) world",
    "this exclamation & curly apostrophe won’t survive!"
];
const swaps = [
    {replace: "’", with: "'"},
    {replace: "&", with: "and"},
    {replace: "!", with: ""}
];
console.log(utter.expand(sample_input, swaps));

Which outputs:

[ 'hello world',
  'nihao world',
  'this exclamation and curly apostrophe won\'t survive' ]

Note that the replacements param is optional; there's a default array default_chars defined in utterance-expander.js.
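
To make the shape of the replacement objects concrete, here is a rough sketch of how a list of {replace, with} pairs could be applied to a string. It is an assumption about the behaviour, not the module's real code: applyReplacements is a hypothetical name, and the sketch treats each replace value as a literal string and swaps every occurrence, in list order.

```javascript
// Hypothetical sketch: NOT the library's actual implementation.
// Applies each {replace, with} object in order to the given text.
function applyReplacements(text, replacements) {
    return replacements.reduce(
        // split/join swaps every literal occurrence, no regex escaping needed
        (result, swap) => result.split(swap.replace).join(swap.with),
        text
    );
}

const exampleSwaps = [
    {replace: "’", with: "'"},
    {replace: "&", with: "and"},
    {replace: "!", with: ""}
];
console.log(applyReplacements("this exclamation & curly apostrophe won’t survive!", exampleSwaps));
```

Because the swaps run in order, a later replacement sees the output of the earlier ones, which matters if one replacement introduces a character that another one removes.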

Developing & publishing

CI config is in .gitlab-ci.yml; it runs browserify for the web version, and semantic-release to automate npm releases (configured in the release property in package.json). Releases are based on tags and the format of commit messages, so if you'd like to contribute a pull request, please stick to the default fix/feat/breaking pattern.