@ladjs/naivebayes

Naive Bayes Classifier for JavaScript.

Usage without npm install (loaded from the Skypack CDN):

<script type="module">
  import ladjsNaivebayes from 'https://cdn.skypack.dev/@ladjs/naivebayes';
</script>

A ladjs naivebayes package forked from surmon-china/naivebayes

What can I use this for

Naive-Bayes classifier for JavaScript.

naivebayes takes a document (piece of text), and tells you what category that document belongs to.

You can use this for categorizing any text content into any arbitrary set of categories. For example:

  • Is an email spam, or not spam?
  • Is a news article about technology, politics, or sports?
  • Is a piece of text expressing positive emotions, or negative emotions?
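
To make the idea concrete, here is a minimal, self-contained sketch of a multinomial Naive Bayes text classifier with add-one (Laplace) smoothing. It is an illustration of the general technique only: the class name TinyNaiveBayes, its tokenizer, and the smoothing choice are assumptions made for this example and are not taken from this package's source.

```javascript
// Minimal multinomial Naive Bayes with add-one (Laplace) smoothing.
// Illustrative sketch only; this is NOT the package's implementation.
class TinyNaiveBayes {
  constructor() {
    this.totalDocs = 0;
    this.docCounts = new Map();   // category -> number of documents learned
    this.wordCounts = new Map();  // category -> Map(word -> count)
    this.totalWords = new Map();  // category -> total words learned
    this.vocabulary = new Set();  // every distinct word seen so far
  }

  // Lowercase, then split on anything that is not an ASCII letter or digit.
  tokenize(text) {
    return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  }

  learn(text, category) {
    if (!this.docCounts.has(category)) {
      this.docCounts.set(category, 0);
      this.wordCounts.set(category, new Map());
      this.totalWords.set(category, 0);
    }
    this.totalDocs += 1;
    this.docCounts.set(category, this.docCounts.get(category) + 1);
    const counts = this.wordCounts.get(category);
    for (const token of this.tokenize(text)) {
      counts.set(token, (counts.get(token) || 0) + 1);
      this.totalWords.set(category, this.totalWords.get(category) + 1);
      this.vocabulary.add(token);
    }
  }

  categorize(text) {
    const tokens = this.tokenize(text);
    let best = null;
    let bestScore = -Infinity;
    for (const [category, docs] of this.docCounts) {
      // Work in log space: log prior plus the sum of log likelihoods.
      let score = Math.log(docs / this.totalDocs);
      const counts = this.wordCounts.get(category);
      const denominator = this.totalWords.get(category) + this.vocabulary.size;
      for (const token of tokens) {
        score += Math.log(((counts.get(token) || 0) + 1) / denominator);
      }
      if (score > bestScore) {
        bestScore = score;
        best = category;
      }
    }
    return best; // null when nothing has been learned yet
  }
}

const tiny = new TinyNaiveBayes();
tiny.learn('amazing, awesome movie!! Yeah!! Oh boy.', 'positive');
tiny.learn('Sweet, this is incredibly, amazing, perfect, great!!', 'positive');
tiny.learn('terrible, cruddy thing. Damn. Sucks!!', 'negative');

const guess = tiny.categorize('awesome, cool, amazing!! Yay.');
console.log(guess); // 'positive'
```

Scores are summed in log space so that multiplying many small probabilities does not underflow to zero.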

Install

npm

npm install @ladjs/naivebayes

yarn

yarn add @ladjs/naivebayes

Usage

const NaiveBayes = require('@ladjs/naivebayes')

const classifier = new NaiveBayes()

// teach it positive phrases
classifier.learn('amazing, awesome movie!! Yeah!! Oh boy.', 'positive')
classifier.learn('Sweet, this is incredibly, amazing, perfect, great!!', 'positive')

// teach it a negative phrase
classifier.learn('terrible, cruddy thing. Damn. Sucks!!', 'negative')

// now ask it to categorize a document it has never seen before
classifier.categorize('awesome, cool, amazing!! Yay.')
// => 'positive'

// serialize the classifier's state as a JSON string.
const stateJson = classifier.toJson()

// load the classifier back from its JSON representation.
const revivedClassifier = NaiveBayes.fromJson(stateJson)

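A custom tokenizer can also be supplied, for example one built on the segment package for text that contains Chinese characters:

const NaiveBayes = require('@ladjs/naivebayes')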

const Segment = require('segment')
const segment = new Segment()

segment.useDefault()

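const classifier = new NaiveBayes({
    tokenizer(sentence) {
        // replace everything except letters, digits, underscores,
        // CJK characters, and whitespace with a space
        const sanitized = sentence.replace(/[^a-zA-Z\u4e00-\u9fa50-9_\s]/g, ' ')

        // split the sanitized sentence into an array of words
        return segment.doSegment(sanitized, { simple: true })
    }
})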

API

Class

const classifier = new NaiveBayes([options])

Returns an instance of a Naive-Bayes Classifier.

Options

  • tokenizer(text) - (type: function) - Custom tokenizer that takes the text and returns its tokens.
  • vocabularyLimit - (type: number, default: 0) - Maximum number of words in the vocabulary. The default of 0 means no limit.
  • stopwords - (type: boolean, default: false) - Whether to remove stopwords from the text.

Example:

const classifier = new NaiveBayes({
    tokenizer(text) {
        return text.split(' ')
    }
})
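
As a slightly fuller illustration of the tokenizer option, the sketch below lowercases the text, strips punctuation, and drops a few common words. The function name simpleTokenizer and the STOPWORDS list are hypothetical, made up for this example; they are not part of the package and do not reflect how its stopwords option works.

```javascript
// Hypothetical stopword list for illustration only;
// it is not the list used by the package's `stopwords` option.
const STOPWORDS = new Set(['a', 'an', 'the', 'is', 'this', 'of', 'and']);

// Lowercase, replace non-alphanumeric characters with spaces,
// split on whitespace, and drop empty tokens and stopwords.
function simpleTokenizer(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, ' ')
    .split(/\s+/)
    .filter((token) => token.length > 0 && !STOPWORDS.has(token));
}

console.log(simpleTokenizer('This is the BEST movie!!')); // [ 'best', 'movie' ]

// With the package installed, it could be passed as the tokenizer option:
// const classifier = new NaiveBayes({ tokenizer: simpleTokenizer })
```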

Learn

classifier.learn(text, category)

Teach your classifier what category the text belongs to. The more you teach your classifier, the more reliable it becomes. It will use what it has learned to identify new documents that it hasn't seen before.

Probabilities

classifier.probabilities(text)

Returns an array of { category, probability } objects with probability calculated for each category. Its judgement is based on what you have taught it with .learn().
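
One possible way to consume such an array is to pick the highest-scoring entry and fall back to a default label when the classifier is not confident enough. The helper pickCategory, its threshold, and the sample values below are all hypothetical; the sketch also assumes the probabilities lie on a 0 to 1 scale, which this README does not state.

```javascript
// Return the most probable category, or `fallback` when the best
// probability is below `threshold` (or when there are no results).
function pickCategory(results, threshold = 0.6, fallback = 'unknown') {
  if (results.length === 0) return fallback;
  const top = results.reduce((best, current) =>
    current.probability > best.probability ? current : best
  );
  return top.probability >= threshold ? top.category : fallback;
}

// Made-up values shaped like the documented { category, probability } objects;
// these are not real output from the classifier.
const sample = [
  { category: 'positive', probability: 0.82 },
  { category: 'negative', probability: 0.18 }
];

console.log(pickCategory(sample));      // 'positive'
console.log(pickCategory(sample, 0.9)); // 'unknown'

// With the package installed:
// pickCategory(classifier.probabilities('awesome, cool, amazing!! Yay.'))
```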

Categorize

classifier.categorize(text[, probability])

Returns the category it thinks text belongs to. Its judgement is based on what you have taught it with .learn().

ToJson

classifier.toJson()

Returns the JSON representation of a classifier. This is the same as JSON.stringify(classifier.toJsonObject()).

ToJsonObject

classifier.toJsonObject()

Returns a JSON-friendly representation of the classifier as an object.

FromJson

const classifier = NaiveBayes.fromJson(jsonObject)

Returns a classifier instance from the JSON representation. Use this with the JSON representation obtained from classifier.toJson().
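
A common reason to serialize a classifier is to keep its learned state between runs. The sketch below shows one way to write the JSON string to disk and read it back with Node's built-in fs module. The helper names saveState and loadState and the file name are assumptions for this example; a placeholder string stands in for the output of classifier.toJson() so the snippet runs on its own.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write a serialized classifier state (a JSON string) to disk.
function saveState(filePath, stateJson) {
  fs.writeFileSync(filePath, stateJson, 'utf8');
}

// Read a previously saved state back as a string.
function loadState(filePath) {
  return fs.readFileSync(filePath, 'utf8');
}

// Placeholder standing in for classifier.toJson(); not real classifier state.
const demoPath = path.join(os.tmpdir(), 'naivebayes-state-demo.json');
saveState(demoPath, '{"placeholder":true}');
console.log(loadState(demoPath));

// With the package installed:
// saveState('classifier.json', classifier.toJson())
// const revived = NaiveBayes.fromJson(loadState('classifier.json'))
```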

Debug

To run naivebayes in debug mode, set the environment variable DEBUG=naivebayes when running your script.

Contributors

Name Website
Surmon http://surmon.me/
Shaun Warman https://shaunwarman.com/