Class: NaiveBayesClassifier

new NaiveBayesClassifier(options) → {Object}

The NaiveBayesClassifier object holds all the properties and methods used by the classifier.

Parameters:

Name	Type	Argument	Description
`options`	Object		Options that can be used for initialisation

Properties

Name	Type	Description
`tokenizer`	function	Custom tokenization function

Properties:

Name	Type	Description
`VERSION`	String	Library version number

Returns:

NaiveBayesClassifier

Type: Object
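
As a hedged illustration of the `tokenizer` option, the sketch below defines a custom tokenization function that takes a string and returns an array of tokens. The name `bigramTokenizer` is hypothetical, and the constructor call appears only as a comment because how the library is loaded depends on your environment.

```javascript
// Hypothetical custom tokenizer: lower-cases the text and emits word bigrams.
function bigramTokenizer(text) {
  const words = text.toLowerCase().split(/\s+/).filter(Boolean);
  const bigrams = [];
  for (let i = 0; i + 1 < words.length; i++) {
    bigrams.push(words[i] + ' ' + words[i + 1]);
  }
  return bigrams;
}

// Options object as described above: a single `tokenizer` property.
const options = { tokenizer: bigramTokenizer };

// Assumed usage, once the library is loaded:
// const classifier = new NaiveBayesClassifier(options);
```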

Members

<static, constant> VERSION

Properties:

Type	Description
String	Library version number

categories :Object

Hashmap holding all category names

Type:

Object

docFrequencyCount :Object

Document frequency table for each of our categories. For each category, how many documents were mapped to it.

Type:

Object

options :Object

Options defined at initialisation

Type:

Object

Properties:

Name	Type	Description
`tokenizer`	function	Tokenization function (can be custom provided or default).

totalNumberOfDocuments :Number

A counter that holds the total number of documents we have learnt from.

Type:

Number

<constant> VERSION

Properties:

Type	Description
String	Instance version number

vocabulary :Object

Hashmap holding all words that have been learnt

Type:

Object

vocabularySize :Number

A counter that holds the size of the NaiveBayesClassifier#vocabulary hashmap

Type:

Number

wordCount :Object

Word count table for each of our categories. For each category, how many words in total were mapped to it.

Type:

Object

wordFrequencyCount :Object

Word frequency table for each of our categories. For each category, how frequently did a given word appear.

Type:

Object
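
To make the bookkeeping concrete, here is a hand-built sketch of what these members might hold after learning two documents: the tokens `fun fun day` as `positive` and `dull day` as `negative`. The exact value shapes are assumptions, for instance whether `categories` and `vocabulary` map names to `true`; this page does not specify them.

```javascript
// Assumed state after learning two documents (value shapes are illustrative).
const state = {
  categories: { positive: true, negative: true },
  docFrequencyCount: { positive: 1, negative: 1 },   // documents per category
  totalNumberOfDocuments: 2,
  vocabulary: { fun: true, day: true, dull: true },  // every distinct word seen
  vocabularySize: 3,
  wordCount: { positive: 3, negative: 2 },           // total words per category
  wordFrequencyCount: {
    positive: { fun: 2, day: 1 },
    negative: { dull: 1, day: 1 }
  }
};
```

Note the invariants the counters are meant to keep: `vocabularySize` matches the number of keys in `vocabulary`, and each `wordCount` entry is the sum of that category's `wordFrequencyCount` values.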

Methods

<static> withClassifier(classifier) → {Object}

Initialise a new classifier from an existing NaiveBayesClassifier object. For example, the existing object may have been retrieved from a database or localStorage.

Parameters:

Name	Type	Description
`classifier`	NaiveBayesClassifier	An existing NaiveBayesClassifier

Returns:

NaiveBayesClassifier

Type: Object
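
This page does not show how `withClassifier` copies state. The sketch below assumes the learnt members are plain JSON-serialisable data, and uses a hypothetical `restoreState` helper to illustrate the round trip through storage. The real call is shown only as a comment.

```javascript
// Member names taken from the Members section of this page.
const STATE_KEYS = [
  'categories', 'docFrequencyCount', 'totalNumberOfDocuments',
  'vocabulary', 'vocabularySize', 'wordCount', 'wordFrequencyCount'
];

// Hypothetical helper: copy the learnt state from a parsed object onto a fresh one.
function restoreState(saved) {
  const restored = {};
  for (const key of STATE_KEYS) {
    if (!(key in saved)) {
      throw new Error('Missing classifier property: ' + key);
    }
    restored[key] = saved[key];
  }
  return restored;
}

// A hand-built stand-in for a trained classifier's state.
const trained = {
  categories: { positive: true },
  docFrequencyCount: { positive: 1 },
  totalNumberOfDocuments: 1,
  vocabulary: { fun: true },
  vocabularySize: 1,
  wordCount: { positive: 1 },
  wordFrequencyCount: { positive: { fun: 1 } }
};

const json = JSON.stringify(trained);              // what would be stored
const restored = restoreState(JSON.parse(json));   // what would be loaded back

// Assumed real-library equivalent:
// const classifier = NaiveBayesClassifier.withClassifier(JSON.parse(json));
```

Functions are dropped by `JSON.stringify`, so a custom tokenizer would presumably have to be supplied again after loading.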

addWordToVocabulary(word) → {undefined}

Add a word to our vocabulary and increment the NaiveBayesClassifier#vocabularySize counter.

Parameters:

Name	Type	Description
`word`	String	Word to be added to the vocabulary

Returns:

Type: undefined

categorize(text) → {String}

Determine the category some `text` most likely belongs to. Use Laplace (add-1) smoothing to adjust for words that do not appear in our vocabulary (i.e. unknown words).

Parameters:

Name	Type	Description
`text`	String	Raw text that needs to be tokenized and categorised.

Returns:

category - Category of “maximum a posteriori” (i.e. most likely category), or 'unclassified'

Type: String

probability - The probability for the category specified

Type: Number

categories - Hashmap of probabilities for each category

Type: Object
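
The "maximum a posteriori" choice with add-1 smoothing can be sketched as follows. This is not the library's code: `sketchCategorize` is a hypothetical function that works on pre-tokenized input and returns only the category name. It assumes the prior is each category's share of learnt documents, and it sums logarithms to avoid underflow.

```javascript
// Hand-built state: "fun fun day" learnt as positive, "dull day" as negative.
const state = {
  categories: { positive: true, negative: true },
  docFrequencyCount: { positive: 1, negative: 1 },
  totalNumberOfDocuments: 2,
  vocabularySize: 3,
  wordCount: { positive: 3, negative: 2 },
  wordFrequencyCount: {
    positive: { fun: 2, day: 1 },
    negative: { dull: 1, day: 1 }
  }
};

// Hypothetical sketch: pick the category with the highest log posterior.
function sketchCategorize(tokens, s) {
  let best = 'unclassified';
  let bestScore = -Infinity;
  for (const category of Object.keys(s.categories)) {
    // Log prior: share of learnt documents that belong to this category.
    let score = Math.log(s.docFrequencyCount[category] / s.totalNumberOfDocuments);
    for (const token of tokens) {
      const count = s.wordFrequencyCount[category][token] || 0;
      // Add-1 smoothing: an unseen token lowers the score but never zeroes it.
      score += Math.log((count + 1) / (s.wordCount[category] + s.vocabularySize));
    }
    if (score > bestScore) {
      bestScore = score;
      best = category;
    }
  }
  return best;
}

const guess = sketchCategorize(['fun'], state);
```

With no categories learnt, the loop never runs and the sketch falls back to 'unclassified', mirroring the fallback named in the return description.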

frequencyTable(tokens) → {Object}

Build a frequency hashmap where the keys are the entries in `tokens` and the values are the frequency of each entry (`token`).

Parameters:

Name	Type	Description
`tokens`	Array	Normalized word array

Returns:

FrequencyTable

Type: Object
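
A minimal sketch of such a frequency table, written independently of the library (the name `sketchFrequencyTable` is hypothetical):

```javascript
// Count how often each token occurs in the array.
function sketchFrequencyTable(tokens) {
  // A prototype-less object avoids clashes with keys such as "constructor".
  const table = Object.create(null);
  for (const token of tokens) {
    table[token] = (table[token] || 0) + 1;
  }
  return table;
}

const table = sketchFrequencyTable(['fun', 'fun', 'day']);
// table.fun is 2 and table.day is 1
```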

getOrCreateCategory(categoryName) → {String}

Retrieve a category. If it does not exist, then initialize the necessary data structures for a new category.

Parameters:

Name	Type	Description
`categoryName`	String	Name of the category you want to get or create

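Returns:

Type: String

learn(text, category) → {Object}

Train the classifier by telling it which `category` some `text` belongs to.

Parameters: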

Name	Type	Description
`text`	String	Some text that should be learnt
`category`	String	The category to which the provided text belongs

Returns:

NaiveBayesClassifier

Type: Object
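
How learning might update the members listed on this page can be sketched as below. The functions `createEmptyState` and `sketchLearn` are hypothetical, operate on already-tokenized input, and only illustrate the counting. The library's actual implementation may differ.

```javascript
// Hypothetical empty state using the member names from this page.
function createEmptyState() {
  return {
    categories: {}, docFrequencyCount: {}, totalNumberOfDocuments: 0,
    vocabulary: {}, vocabularySize: 0, wordCount: {}, wordFrequencyCount: {}
  };
}

// Hypothetical sketch: record one tokenized document under a category.
function sketchLearn(s, tokens, category) {
  // Initialise per-category structures on first sight (cf. getOrCreateCategory).
  if (!s.categories[category]) {
    s.categories[category] = true;
    s.docFrequencyCount[category] = 0;
    s.wordCount[category] = 0;
    s.wordFrequencyCount[category] = {};
  }
  s.docFrequencyCount[category] += 1;
  s.totalNumberOfDocuments += 1;

  for (const token of tokens) {
    // New words grow the vocabulary (cf. addWordToVocabulary).
    if (!s.vocabulary[token]) {
      s.vocabulary[token] = true;
      s.vocabularySize += 1;
    }
    const perCategory = s.wordFrequencyCount[category];
    perCategory[token] = (perCategory[token] || 0) + 1;
    s.wordCount[category] += 1;
  }
  return s; // returning the state allows calls to be chained
}

const learnt = createEmptyState();
sketchLearn(learnt, ['fun', 'fun', 'day'], 'positive');
sketchLearn(learnt, ['dull', 'day'], 'negative');
```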

tokenProbability(token, category) → {Number}

Calculate the probability that a `token` belongs to a `category`.

Parameters:

Name	Type	Description
`token`	String	The token (usually a word) for which we want to calculate a probability
`category`	String	The category we want to calculate for

Returns:

probability

Type: Number
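
This page does not state the formula. Assuming the add-1 (Laplace) smoothing mentioned under `categorize` is applied at this step, the standard estimate in terms of the members documented above would be:

```latex
\[
P(\text{token} \mid \text{category}) =
  \frac{\mathrm{wordFrequencyCount}[\text{category}][\text{token}] + 1}
       {\mathrm{wordCount}[\text{category}] + \mathrm{vocabularySize}}
\]
```

For example, a token seen twice in a category holding 3 words, with a vocabulary of 3 words, would get (2 + 1) / (3 + 3) = 0.5, while a token never seen in that category would get 1 / 6 rather than zero.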

<inner> defaultTokenizer(text) → {Array}

Given an input string, tokenize it into an array of word tokens. This tokenizer adopts a naive "independent bag of words" assumption. This is the default tokenization function used if the user does not provide one in NaiveBayesClassifier#options.

Parameters:

Name	Type	Description
`text`	String	Text to be tokenized

Returns:

String tokens

Type: Array
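
The exact splitting rules of the default tokenizer are not given on this page. As an assumption-laden sketch, a bag-of-words tokenizer could replace punctuation with spaces and split on whitespace (the name `sketchTokenizer` is hypothetical):

```javascript
// Hypothetical bag-of-words tokenizer; not necessarily the library's rules.
function sketchTokenizer(text) {
  return text
    .replace(/[^\w\s]/g, ' ') // punctuation becomes a space (\w is ASCII-only)
    .split(/\s+/)             // split on runs of whitespace
    .filter(Boolean);         // drop empty strings left at the edges
}

const tokens = sketchTokenizer('Fun times, fun day!');
// tokens is ['Fun', 'times', 'fun', 'day']
```

Word order is preserved in the array but plays no role afterwards, since only per-word counts are kept. That is the sense of the "bag of words" assumption.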
