---
title: "Markov"
language: "en"
type: "Method"
summary: "Markov (Machine Learning Method) Method for Classify. Model class probabilities using the n-gram frequencies of the given sequence. In a Markov model, at training time, an n-gram language model is computed for each class. At test time, the probability for each class is computed according to Bayes's theorem, P(class|sequence)\\[Proportional]P(class) P(sequence|class), where P(sequence|class) is given by the language model of the given class and P(class) is the class prior. The following options can be given: When Order->n, the method partitions sequences into (n+1)-grams. When Order->0, the method uses unigrams (single tokens). The model can then be called a unigram model or naive Bayes model. The value of AdditiveSmoothing is added to all n-gram counts. It is used to regularize the language model."
canonical_url: "https://reference.wolfram.com/language/ref/method/Markov.html"
source: "Wolfram Language Documentation"
---
# "Markov" (Machine Learning Method)

* Method for ``Classify``.

* Model class probabilities using the ``n``-gram frequencies of the given sequence.

---

## Details & Suboptions

* In a Markov model, at training time, an ``n``-gram language model is computed for each class. At test time, the probability for each class is computed according to Bayes's theorem, $P(\text{class}|\text{sequence})\propto P(\text{class}) P(\text{sequence}|\text{class})$, where $P(\text{sequence}|\text{class})$ is given by the language model of the given class and $P(\text{class})$ is the class prior.

* The following options can be given:

| option               | default   | description                                  |
| -------------------- | --------- | -------------------------------------------- |
| "AdditiveSmoothing"  | 0.1       | the smoothing parameter to use               |
| "MinimumTokenCount"  | Automatic | minimum count for an n-gram to be considered |
| "Order"              | Automatic | n-gram length                                |

* When ``"Order" -> n``, the method partitions sequences into (``n``+1)-grams.

* When ``"Order" -> 0``, the method uses unigrams (single tokens). The model can then be called a unigram model or naive Bayes model.

* The value of ``"AdditiveSmoothing"`` is added to all ``n``-gram counts. It is used to regularize the language model.
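The mechanics described above can be sketched in Python for the unigram (``"Order" -> 0``) case. This is an illustrative toy, not Wolfram's implementation; all function names here are made up for the sketch. Per-class token counts form the language model, ``smoothing`` is added to every count, and classes are scored with Bayes's theorem:

```python
from collections import Counter
from math import exp, log


def train(examples, smoothing=0.1):
    """examples: list of (token_sequence, label) pairs.
    Returns per-class token counts, class priors, vocabulary and alpha."""
    counts = {}          # class -> Counter of token frequencies
    priors = Counter()   # class -> number of training examples
    vocab = set()
    for tokens, label in examples:
        counts.setdefault(label, Counter()).update(tokens)
        priors[label] += 1
        vocab.update(tokens)
    return counts, priors, vocab, smoothing


def probabilities(model, tokens):
    """P(class | tokens) via log P(class) + sum_t log P(t | class)."""
    counts, priors, vocab, alpha = model
    total = sum(priors.values())
    scores = {}
    for label, c in counts.items():
        denom = sum(c.values()) + alpha * len(vocab)  # smoothed normalizer
        lp = log(priors[label] / total)               # class prior
        for t in tokens:
            lp += log((c[t] + alpha) / denom)         # smoothed likelihood
        scores[label] = lp
    # normalize the log scores into probabilities
    m = max(scores.values())
    unnorm = {k: exp(v - m) for k, v in scores.items()}
    z = sum(unnorm.values())
    return {k: v / z for k, v in unnorm.items()}


data = [(("flour", "butter"), "dessert"),
        (("pasta", "tomato"), "main course"),
        (("apple", "ice cream"), "dessert"),
        (("salt", "meat"), "main course"),
        (("honey", "sugar", "butter"), "dessert")]
model = train(data)
probs = probabilities(model, ("tomato", "meat", "salt"))
```

For ``"Order" -> n`` with ``n > 0``, the same scheme applies with (``n``+1)-grams of the sequence as tokens instead of single elements.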

---

## Examples (5)

### Basic Examples (1)

Train a classifier function on labeled examples:

```wl
In[1]:= c = Classify[{{"flour", "butter"}, {"pasta", "tomato"}, {"apple", "ice cream"}, {"salt", "meat"}, {"honey", "sugar", "butter"}} -> {"dessert", "main course", "dessert", "main course", "dessert"}, Method -> "Markov"]

Out[1]= ClassifierFunction[…]
```

Obtain information about the classifier:

```wl
In[2]:= Information[c]

Out[2]= MachineLearning`MLInformationObject[…]
```

Classify a new example:

```wl
In[3]:= c[{"tomato", "meat", "salt"}]

Out[3]= "main course"
```

### Options (4)

#### "AdditiveSmoothing" (2)

Train a classifier using the ``"AdditiveSmoothing"`` suboption:

```wl
In[1]:= Classify[{{[image], "this is dark red"} -> Red, {[image], "this is blue"} -> Blue, {[image], "this is dark blue"} -> Blue, {[image], "this red is almost orange"} -> Red}, Method -> {"Markov", "AdditiveSmoothing" -> 2}]

Out[1]= ClassifierFunction[…]
```

---

Train two classifiers on an imbalanced dataset by varying the value of ``"AdditiveSmoothing"``:

```wl
In[1]:= data = {"aaabb" -> True, "cdfff" -> True, "cdffvvv" -> True, "aaa" -> True, "vvvtul" -> False, "dqedewf" -> True};

In[2]:=
c1 = Classify[data, Method -> {"Markov", "AdditiveSmoothing" -> .1}];
c2 = Classify[data, Method -> {"Markov", "AdditiveSmoothing" -> 7}];
```

Look at the corresponding probabilities for the imbalanced element:

```wl
In[3]:= c1["vvvtul", "Probabilities"]

Out[3]= <|False -> 0.999999, True -> 6.24372*^-7|>

In[4]:= c2["vvvtul", "Probabilities"]

Out[4]= <|False -> 0.636523, True -> 0.363477|>
```
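The shift toward uniform probabilities follows from the smoothed estimate $(\text{count} + \alpha)/(\text{total} + \alpha V)$, where $V$ is the vocabulary size: as $\alpha$ grows, the counts matter less. A quick numeric check with illustrative numbers (not taken from the classifiers above), assuming a token unseen in a class with 10 observed tokens over a 20-token vocabulary:

```python
def smoothed(count, total, vocab_size, alpha):
    """Additively smoothed n-gram probability estimate."""
    return (count + alpha) / (total + alpha * vocab_size)

p_small = smoothed(0, 10, 20, 0.1)  # alpha = 0.1: near zero
p_large = smoothed(0, 10, 20, 7)    # alpha = 7: pulled toward 1/V
```

The unseen token's probability rises from about 0.008 to about 0.047, which is why the heavily smoothed classifier assigns less extreme class probabilities.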

#### "Order" (2)

Train a classifier by specifying the ``"Order"``:

```wl
In[1]:= Classify[{{[image], "this is dark red"} -> Red, {[image], "this is blue"} -> Blue, {[image], "this is dark blue"} -> Blue, {[image], "this red is almost orange"} -> Red}, Method -> {"Markov", "Order" -> 2}]

Out[1]= ClassifierFunction[…]
```

---

Generate a dataset of real words and random strings:

```wl
In[1]:=
alphabet = Alphabet[];
randomstring := StringJoin @@ RandomChoice[alphabet, RandomInteger[{2, 5}]];
realwords = DictionaryLookup["a" ~~ ___];
trainingset = <|"RealWord" -> RandomSample[realwords, 200],
	"RandomString" -> Table[randomstring, 200]|>;
```

Generate classifiers using different values for the ``"Order"``:

```wl
In[2]:= c0 = Classify[trainingset, Method -> {"Markov" , "Order" -> 0}]

Out[2]= ClassifierFunction[…]

In[3]:= c2 = Classify[trainingset, Method -> {"Markov", "Order" -> 2}]

Out[3]= ClassifierFunction[…]
```

Compare the probabilities of these classifiers on a new real word:

```wl
In[4]:=
SeedRandom[4];
notRandom = RandomSample[DictionaryLookup["a" ~~ ___], 1];
Dataset@<|"Classifier1" -> c0[notRandom, "Probabilities"], "Classifier2" -> c2[notRandom, "Probabilities"]|>

Out[4]= Dataset[…]
```

## See Also

* [`Classify`](https://reference.wolfram.com/language/ref/Classify.en.md)
* [`ClassifierFunction`](https://reference.wolfram.com/language/ref/ClassifierFunction.en.md)
* [`ClassifierMeasurements`](https://reference.wolfram.com/language/ref/ClassifierMeasurements.en.md)
* [`Predict`](https://reference.wolfram.com/language/ref/Predict.en.md)
* [`SequencePredict`](https://reference.wolfram.com/language/ref/SequencePredict.en.md)
* [`ClusterClassify`](https://reference.wolfram.com/language/ref/ClusterClassify.en.md)
* [`DecisionTree`](https://reference.wolfram.com/language/ref/method/DecisionTree.en.md)
* [`LogisticRegression`](https://reference.wolfram.com/language/ref/method/LogisticRegression.en.md)
* [`NearestNeighbors`](https://reference.wolfram.com/language/ref/method/NearestNeighbors.en.md)
* [`NeuralNetwork`](https://reference.wolfram.com/language/ref/method/NeuralNetwork.en.md)
* [`RandomForest`](https://reference.wolfram.com/language/ref/method/RandomForest.en.md)
* [`SupportVectorMachine`](https://reference.wolfram.com/language/ref/method/SupportVectorMachine.en.md)

## Related Links

* [An Elementary Introduction to the Wolfram Language: Machine Learning](https://www.wolfram.com/language/elementary-introduction/22-machine-learning.html)

## History

* [Introduced in 2014 (10.0)](https://reference.wolfram.com/language/guide/SummaryOfNewFeaturesIn100.en.md)