grammar-graph

Takes a context-free grammar and converts it into a decision-making graph. Can produce interactive Guides which generate valid sentences from the grammar.

Usage no npm install needed!

<script type="module">
  import grammarGraph from 'https://cdn.skypack.dev/grammar-graph';
</script>

README

Interactively construct a sentence from a context-free grammar, or check if some text is a valid sentence in a grammar.

Create a GrammarGraph

Install the npm module.

npm install grammar-graph

Require GrammarGraph, input a grammar, and construct a new graph. See grammar format for details on how a grammar works.

var GrammarGraph = require('grammar-graph')

var grammar = {
        Sentence: ['NounPhrase VerbPhrase'],
      NounPhrase: ['the Noun',
                   'the Noun RelativeClause'],
      VerbPhrase: ['Verb',
                   'Verb NounPhrase'],
  RelativeClause: ['that VerbPhrase'],
            Noun: ['dog',
                   'cat',
                   'bird',
                   'squirrel'],
            Verb: ['befriended',
                   'loved',
                   'ate',
                   'attacked']
}

var graph = new GrammarGraph(grammar)

Check out the vertices in your graph. The constructor creates a vertex for every terminal and non-terminal symbol in the grammar.

graph.vertices()       =>
[ 'Sentence', 'NounPhrase', 'VerbPhrase', 'RelativeClause', 'Noun',
  'Verb',  '_NounPhrase_1', '_NounPhrase_2', '_VerbPhrase_1', 'that',
  'dog', 'cat', 'bird', 'squirrel', 'befriended', 'loved', 'ate',
  'attacked', 'the' ]

Where did '_NounPhrase_1', '_NounPhrase_2', and '_VerbPhrase_1' come from? Look at the definition of NounPhrase in the original grammar declaration. Both options contained multiple symbols, and the constructor has automatically created a name for each combination. In the case of VerbPhrase, only the second option contained multiple symbols, so only one extra name is needed. The automatic expansion of the original NounPhrase and VerbPhrase definitions result in the following equivalent definitions:

{
     NounPhrase: ['_NounPhrase_1',
                  '_NounPhrase_2'],
     VerbPhrase: ['Verb',
                  '_VerbPhrase_1'],
  _NounPhrase_1: ['the Noun'],
  _NounPhrase_2: ['the Noun RelativeClause'],
  _VerbPhrase_1: ['Verb NounPhrase']
}

GrammarGraph.createGuide

Let's create a new guide for constructing sentences from the langauge. Just indicate a starting point in the grammar, in this case Sentence. The guide will help you construct a complete Sentence.

var guide = graph.createGuide('Sentence')

The guide gives choices for the next terminal in your construction. Behind the scenes, it is doing a breadth-first search for terminals from the current position. In our grammar, the only possible first terminal is 'the':

guide.choices()        =>  ['the']

You can also check all the possible constructs at any point in time. In this case, we will see that even though 'the' is the only possible first terminal, there are actually two possible paths for the construction.

guide.constructs()     =>
[ 'the Noun RelativeClause VerbPhrase', 'the Noun VerbPhrase' ]

To get to this point, the Guide has expanded 'Sentence' => 'NounPhrase VerbPhrase' => 'the Noun RelativeClause VerbPhrase' or 'the Noun VerbPhrase'. It stops at this point because it has reached terminal symbol 'the' in both possible paths.

'the' is the only choice, so let's choose it. We can then check our construction and possible constructs.

guide.choose('the')
guide.construction()   =>  ['the']
guide.constructs()     =>
[ 'the bird RelativeClause VerbPhrase',
  'the bird VerbPhrase',
  'the cat RelativeClause VerbPhrase',
  'the cat VerbPhrase',
  'the dog RelativeClause VerbPhrase',
  'the dog VerbPhrase',
  'the squirrel RelativeClause VerbPhrase',
  'the squirrel VerbPhrase' ]
guide.choices()        =>  ['squirrel', 'bird', 'cat', 'dog' ]

Let's continue the construction.

guide.choices()        =>  ['dog', 'cat', 'squirrel', 'bird']
guide.choose('dog')

guide.choices()        =>  ['that', 'befriended', 'loved', 'ate', 'attacked']
guide.choose('ate')

guide.choices()        => ['the']

We could choose 'the' from this last set of choices, but it just so happens that the current construction could be considered a complete Sentence (our starting point):

guide.construction()   => ['the', 'dog', 'ate']
guide.isComplete()     => true
guide.constructs()     =>
[ 'the dog ate',
  'the dog ate the Noun',
  'the dog ate the Noun RelativeClause' ]

If we go ahead and choose 'the', we no longer have a complete sentence.

guide.choose('the')
guide.construction()   => ['the', 'dog', 'ate', 'the']
guide.isComplete()     => false

At any point, you can move back a step by popping off the last choice.

guide.pop()            => 'the'
guide.construction()   => ['the', 'dog', 'ate']
guide.isComplete()     => true

You can optionally provide guide.choices() with a number indicating the depth of choices you want. If you request a depth greater than 1, instead of an array of strings, it will return an array of TreeNodes which are each at most nDeep (or less if a path ends in a terminal).

guide.choose('the')
guide.construction()    => ['the', 'dog', 'ate', 'the']
guide.choices()         => ['squirrel', 'bird', 'cat', 'dog']
guide.choices(3)        =>
[ { val: 'squirrel',                              // squirrel
    next: [ { val: 'that',                        // squirrel that
            next:
             [ { val: 'attacked',   next: [] },   // squirrel that attacked
               { val: 'ate',        next: [] },   // squirrel that ate
               { val: 'loved',      next: [] },   // squirrel that loved
               { val: 'befriended', next: [] }    // squirrel that befriended
             ]
           },

  { val: 'bird',                                  // bird
    next: [ { val: 'that',                        // bird that
            next:
             [ { val: 'attacked',   next: [] },   // bird that attacked
               { val: 'ate',        next: [] },   // bird that ate
               { val: 'loved',      next: [] },   // bird that loved
               { val: 'befriended', next: [] }    // bird that befriended
             ]
           },

  { val: 'cat',
    next: [ etc... ] },

  { val: 'dog',
    next: [ etc... ] }
]

In addition to a single terminal string, guide.choose() can also accept an array of terminal strings.

guide.choose(['squirrel', 'that', 'attacked'])
guide.construction()    => ['the', 'dog', 'ate', 'the', 'squirrel', 'that', 'attacked']
guide.complete()        => true

GrammarGraph.createRecognizer

A Recognizer can be used to check if some text is a valid sentence in a grammar. Just like when creating a guide, you need to indicate a starting terminal:

// (using graph declared before) var graph = new GrammarGraph(grammar)
var sentence = graph.createRecognizer('Sentence')

A recognizer can check whether or not a text is a valid and complete construction:

sentence.isComplete('the dog ate the cat')                   => true
sentence.isComplete('the dog ate the cat that')              => false
sentence.isComplete('the dog ate the cat that orange juice') => false
sentence.isComplete('the dog ate the cat that attacked')     => true

or whether the text is valid so far (though it may not be complete):

sentence.isValid('the dog ate the cat')                      => true
sentence.isValid('the dog ate the cat that')                 => true
sentence.isValid('the dog ate the cat that orange juice')    => false
sentence.isValid('the dog ate the cat that attacked')        => true

Grammar

A context-free grammar is a list of rules. Here is a grammar with eight rules that builds creatures like this:

~~(^__^)~~ or ~~(-______-)~~ or ~~(*_*)~~.

{
  Creature: ['Arm Head Arm'],
      Head: ['( Face )'],
      Face: ['HappyFace',
             'ZenFace',
             'SleepyFace'],
 HappyFace: ['^ Mouth ^'],
   ZenFace: ['- Mouth -'],
SleepyFace: ['* Mouth *'],
     Mouth: ['_',
             '_ Mouth'],
       Arm: ['~~']
}

Rules

A rule simply means to replace a word like Creature with its definition. If we are constructing a sentence with the Creature grammar and come across the word Head, we will replace it with its definition: ( Face ). Some rules have multiple options, such as a Face which can be rewritten as HappyFace, ZenFace, or SleepyFace.

More formally, a grammar is an object consisting of key-value pairs, with each non-terminal symbol pointing to an array of one or more symbol chains choices for this non-terminal.

Symbol Chains

Arm Head Arm and ( Face ) are symbol chains. By default, each symbol is seperated by white-space, so both of these symbol chains are made up of three symbols: Arm, Head, Arm and (, Face, ).

Terminal Symbols

If a symbol has no definition in the grammar, it is a terminal. The six terminal symbols in the creature grammar are: (, ), ^, *, _, ~~. These are the actual building blocks of the language, and are the only symbols that will make it into a final construction.

Non-terminal Symbols

If a symbol has a definition in the grammar, it is non-terminal and can be broken down further. A non-terminal's definition is an array of one or more symbol chains indicating possible choices for this rule.

{
  RuleName: ['I am this', 'or this', 'or could be this'],
 RuleName2: ['I mean only one thing']
}

Recursive definitions

Recursive definitions are what make a grammar interesting and powerful. The creature grammar has only one recursive definition: Mouth: ['_', '_ Mouth']. This allows creatures to have mouths of one character if we immediately choose the first option, or up to infinite characters if we always choose the second option.

Do not define a non-terminal to equal only itself. This will not work: Mouth: ['Mouth'].

Building a Creature

When constructing from a grammar, you need to indicate a starting point. In this case it only makes sense to start from Creature. Let's break down Creature until we are left with only terminal symbols.

// construction     // replacement on this step

Creature            // Creature => Arm Head Arm
Arm Head Arm        // Arm      => ~~
~~Head Arm          // Head     => ( Face )
~~(Face) Arm        // Face     => ZenFace
~~(ZenFace) Arm     // Mouth    => _ Mouth
~~(-Mouth-) Arm     // Mouth    => _ Mouth
~~(-_Mouth-) Arm    // Mouth    => _ Mouth
~~(-__Mouth-) Arm   // Mouth    => _
~~(-___-) Arm       // Arm      => ~~
~~(-___-)~~

Docs

View the api documentation here.

Development

View the internal api documentation here.

Clone the git repository and install development dependencies:

git clone https://github.com/jrleszcz/grammar-graph.git
cd grammar-graph
npm install

To run eslint and tape tests:

npm test

To generate api documentation for api.md:

npm run docs

Credit

This module is based on Alex Shkotin's Graph representation of context-free grammars.