Utterance Expander
Simple function to expand text macros, tailored for the input syntax of Amazon Lex/Alexa utterances.
There are a few of these already; we just wanted one that uses:
- Normal parentheses for the macros (instead of curly or square brackets)
- The more common stroke (/) character instead of the less common pipe (|)
- Common "string cleaning" replacements built in (e.g. typographic/curly apostrophes replaced with Amazon-friendly ASCII ones)
- A filter to remove duplicate output utterances
Note: requires node v8.9.x
Macro syntax
The following input…
how (can/do) we generate (lots of/many/more) utterances
we (can/could/should) (expand macros/use macro expansion)
sometimes we (need/want) to reference Amazon {SlotNames}
we don’t like curly apostrophes
…produces this output:
how can we generate lots of utterances
how do we generate lots of utterances
how can we generate many utterances
how do we generate many utterances
how can we generate more utterances
how do we generate more utterances
we can expand macros
we could expand macros
we should expand macros
we can use macro expansion
we could use macro expansion
we should use macro expansion
sometimes we need to reference Amazon {SlotNames}
sometimes we want to reference Amazon {SlotNames}
we don't like curly apostrophes
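The expansion above can be sketched in a few lines of JavaScript. This is only an illustration of the macro syntax, not the package's actual implementation; expandLine is a hypothetical name, and the sketch assumes that groups are never nested and that the last group on a line varies slowest, which matches the ordering of the sample output.

```javascript
// Hypothetical helper (not part of utterance-expander): expands every
// "(a/b/c)" group in a single line into all of its combinations.
function expandLine(line) {
  // Work on the last group first so that it varies slowest in the output.
  const open = line.lastIndexOf("(");
  if (open === -1) {
    return [line]; // no macros left
  }
  const close = line.indexOf(")", open);
  if (close === -1) {
    return [line]; // unbalanced parenthesis: leave the line untouched
  }
  const before = line.slice(0, open);
  const after = line.slice(close + 1);
  const results = [];
  line.slice(open + 1, close).split("/").forEach(function (option) {
    // Substitute one option, then expand whatever groups remain.
    expandLine(before + option + after).forEach(function (expanded) {
      results.push(expanded);
    });
  });
  return results;
}

console.log(expandLine("we (can/could/should) (expand macros/use macro expansion)"));
```

Curly-brace slot names such as {SlotNames} pass through unchanged because only parentheses are treated as macro delimiters.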
In a browser
Web version for easy copy/pasting
(Hosted on GitLab Pages; look in the docs directory if you want to edit it)
As a CLI tool
Install globally for use as a CLI tool with:
# npm install -g --production utterance-expander
Disclaimer: it's very limited; it just reads stdin and writes to stdout:
$ echo "(hello/nihao) world" | utterance-expander
hello world
nihao world
$ utterance-expander < input_file.txt | sort -u > utterances.txt
Please note that the CLI script operates on each input line in isolation. It
cannot detect expanded duplicates in the output if they're caused by different
input lines. Always pass CLI output through sort -u.
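If the CLI output is post-processed in Node rather than in a shell, the same cross-line de-duplication can be sketched as below. uniqueUtterances is a hypothetical helper, not something the package exports; unlike sort -u it keeps the first-seen order instead of sorting.

```javascript
// Hypothetical helper (not part of utterance-expander): removes repeated
// utterances while keeping the order in which they first appeared.
function uniqueUtterances(utterances) {
  // A Set keeps insertion order and silently ignores repeated values.
  return Array.from(new Set(utterances));
}

// "(hi/hello) there" and "(hello/hey) there" both expand to "hello there".
console.log(uniqueUtterances(["hi there", "hello there", "hello there", "hey there"]));
```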
As a module in something else
Install in your project with:
$ npm install --save utterance-expander
It exposes a single method: expand(origin_text, replacements)
- origin_text: Array. Original input lines with macros
- replacements (optional): Array. String replacement objects; objects have replace and with properties
Used like so:
const utter = require("utterance-expander");
// Input lines; the first one contains a macro
const sample_input = [
  "(hello/nihao) world",
  "this exclamation & curly apostrophe won’t survive!"
];
// Custom string replacements applied to the output
const swaps = [
  {replace: "’", with: "'"},
  {replace: "&", with: "and"},
  {replace: "!", with: ""}
];
console.log(utter.expand(sample_input, swaps));
Which outputs:
[ 'hello world',
'nihao world',
'this exclamation and curly apostrophe won\'t survive' ]
Note that the replacements param is optional; there's a default array defined in utterance-expander.js.
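To make the shape of the replacement objects concrete, here is a minimal sketch of how such a list could be applied to one string. applyReplacements is a hypothetical helper, not the package's code, and it assumes each replace value is a literal string rather than a regular expression; the package itself may behave differently.

```javascript
// Hypothetical helper (not part of utterance-expander): applies each
// {replace, with} pair in order, treating "replace" as a literal string.
function applyReplacements(text, replacements) {
  return replacements.reduce(function (result, swap) {
    // split/join replaces every occurrence without any regex escaping.
    return result.split(swap.replace).join(swap.with);
  }, text);
}

const exampleSwaps = [
  {replace: "’", with: "'"},
  {replace: "&", with: "and"},
  {replace: "!", with: ""}
];
console.log(applyReplacements("this exclamation & curly apostrophe won’t survive!", exampleSwaps));
```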
Developing & publishing
CI config is in .gitlab-ci.yml; it runs browserify for the web version, and
semantic-release to automate npm releases (configured in the release
property in package.json).
Releases are based on tags and the format of commit messages, so if you'd like
to contribute a pull request please stick to the
default fix/feat/breaking pattern.