
Mallet Topics

A JavaScript wrapper for the MALLET command-line tool for topic modelling. Really? Yeah.


Requirements

MALLET 2.0.8



Installation

yarn add mallet-topics


Usage

const { importData, trainTopics } = require('mallet-topics')

const malletExecutable = '/path/to/mallet-2.0.8/bin/mallet'
const dataDir = '/path/to/dir/containing/textfiles'

// Import the text files, then train topics on the resulting .mallet file
importData(malletExecutable, dataDir)
  .then(({ malletDataFile }) => {
    console.log(`Successfully imported data into ${malletDataFile}`)
    return malletDataFile
  })
  .then(malletDataFile => trainTopics(malletExecutable, malletDataFile))
  .then(({ topicKeysFile, docTopicsFile }) => {
    console.log(`Successfully trained topics. Have a look at ${topicKeysFile} and ${docTopicsFile}`)
  })
  .catch(err => {
    // Either step failed
    console.error(err)
  })

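The same flow can also be written with async/await. The sketch below is not part of mallet-topics: runPipeline is a hypothetical helper that receives importData and trainTopics as an argument (so it can be tried with stand-ins), and the option values are arbitrary examples.

```javascript
// Hypothetical helper, not part of mallet-topics: import first, then train.
// `api` is expected to be the object returned by require('mallet-topics').
async function runPipeline (api, malletExecutable, dataDir) {
  // importData resolves with { malletDataFile }
  const { malletDataFile } = await api.importData(malletExecutable, dataDir)

  // trainTopics resolves with { topicKeysFile, docTopicsFile }
  const { topicKeysFile, docTopicsFile } = await api.trainTopics(
    malletExecutable,
    malletDataFile,
    { numTopics: 20, numIterations: 500 } // example values only
  )

  return { malletDataFile, topicKeysFile, docTopicsFile }
}

// Usage (needs MALLET and mallet-topics installed):
// runPipeline(require('mallet-topics'), '/path/to/mallet-2.0.8/bin/mallet', '/path/to/dir/containing/textfiles')
//   .then(files => console.log(files))
//   .catch(err => console.error(err))
```

Because a rejected promise from either step propagates out of runPipeline, a single catch at the call site covers both import and training failures.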

importData(mallet, dataDir, options)

Returns a promise that resolves once the data has been imported into MALLET format. The resolved value is an object whose malletDataFile property is the path to the newly created .mallet file.

  • mallet - absolute path to executable e.g. /path/to/mallet-2.0.8/bin/mallet
  • dataDir - absolute path to directory of text files to import (one file per document)
  • options
    • malletDataFile - filepath to write data in MALLET format (default ./${}_data.mallet)
    • stopFile - path to file containing newline-separated stopwords to exclude from the imported data
    • onStdData(stdType, msg) - function to handle data sent to stdout or stderr from MALLET child process (default (stdType, msg) => console.log(msg.toString()))
    • singleFile - boolean selecting the MALLET import command: true uses import-file, false uses import-dir (default false, i.e. one instance per file)
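
By default MALLET's output is echoed to the console. As a sketch of a custom onStdData handler, the hypothetical helper below (createOutputCollector is not part of mallet-topics) stores the output per stream instead of printing it. It assumes stdType is a string such as 'stdout' or 'stderr' and that msg is a Buffer or string, which the default handler's msg.toString() call suggests but the option description does not state.

```javascript
// Hypothetical helper, not part of mallet-topics.
// Returns an onStdData-compatible handler plus the object it fills.
function createOutputCollector () {
  const output = {} // e.g. { stdout: [...], stderr: [...] }

  const onStdData = (stdType, msg) => {
    if (!output[stdType]) output[stdType] = []
    // msg is assumed to be a Buffer or string; toString() works for both
    output[stdType].push(msg.toString())
  }

  return { onStdData, output }
}

// Usage sketch (stopword path is a placeholder):
// const { onStdData, output } = createOutputCollector()
// importData(malletExecutable, dataDir, { stopFile: '/path/to/stopwords.txt', onStdData })
//   .then(() => console.log(output))
```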

trainTopics(mallet, malletDataFile, options)

Returns a promise that resolves once the topics have been generated. The resolved value is an object with properties topicKeysFile and docTopicsFile, the paths to the files containing the generated topics and the per-document topic scores respectively.

  • mallet - absolute path to executable e.g. /path/to/mallet-2.0.8/bin/mallet
  • malletDataFile - filepath to data file created by importData
  • options
    • numTopics - number of topics to generate (default 10)
    • numIterations - number of sampling iterations (default 100)
    • topicKeysFile - filepath to write topics in tab-separated format (default ./${}_topics.tsv)
    • docTopicsFile - filepath to write topic scores for each document in tab-separated format (default ./${}_doc_topics.tsv)
    • optimizeInterval - number of iterations between hyperparameter optimizations (default undefined)
    • optimizeBurnIn - number of iterations before hyperparameter optimization begins (default 2*optimizeInterval)
    • onStdData(stdType, msg) - function to handle data sent to stdout or stderr from MALLET child process (default (stdType, msg) => console.log(msg.toString()))
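
Once training has finished, the topic keys file can be read back into JavaScript objects. The sketch below is not part of mallet-topics: parseTopicKeys is a hypothetical helper, and it assumes each line of the file has the form topic id, tab, topic weight, tab, space-separated top words, which is how MALLET 2.0.8 usually writes its topic keys. Check a generated file before relying on it.

```javascript
// Hypothetical helper, not part of mallet-topics.
// Assumes each line looks like: "<topic id>\t<weight>\t<space-separated words>"
function parseTopicKeys (tsvText) {
  return tsvText
    .split(/\r?\n/)
    .filter(line => line.trim() !== '') // skip blank lines
    .map(line => {
      const [id, weight, words = ''] = line.split('\t')
      return {
        id: Number(id),
        weight: Number(weight),
        words: words.trim().split(/\s+/).filter(Boolean)
      }
    })
}

// Usage sketch:
// const fs = require('fs')
// const topics = parseTopicKeys(fs.readFileSync(topicKeysFile, 'utf8'))
```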