---
title: "LogisticRegression"
language: "en"
type: "Method"
summary: "LogisticRegression (Machine Learning Method) Method for Classify. Models class probabilities with logistic functions of linear combinations of features. LogisticRegression models the log probabilities of each class with a linear combination of numerical features x = {x_1, x_2, …, x_n}: log(P(class = k | x)) ∝ x·θ^(k), where θ^(k) = {θ_1, θ_2, …, θ_n} corresponds to the parameters for class k. The parameter matrix θ = {θ^(1), θ^(2), …, θ^(nclass)} is estimated by minimizing the loss function Σ_{i=1}^m −log(P_θ(class = y_i | x_i)) + λ_1 Σ_{i=1}^n |θ_i| + (λ_2/2) Σ_{i=1}^n θ_i². Options include L1Regularization, L2Regularization and OptimizationMethod; possible settings for OptimizationMethod include LBFGS, StochasticGradientDescent and Newton."
keywords: 
- Machine Learning
- Classification
- Logistic function
canonical_url: "https://reference.wolfram.com/language/ref/method/LogisticRegression.html"
source: "Wolfram Language Documentation"
---
# "LogisticRegression" (Machine Learning Method)

* Method for ``Classify``.

* Models class probabilities with logistic functions of linear combinations of features.

---

## Details & Suboptions

* ``"LogisticRegression"`` models the log probabilities of each class with a linear combination of numerical features $x=\left\{x_1,x_2,\ldots ,x_n\right\}$, $\log (P(\text{class} = k|x))\propto x\cdot \theta ^{(k)}$, where $\theta ^{(k)}=\left\{\theta _1,\theta _2,\ldots ,\theta _n\right\}$ corresponds to the parameters for class $k$. The parameter matrix $\theta =\left\{\theta ^{(1)},\theta ^{(2)},\ldots ,\theta ^{(\text{nclass})}\right\}$ is estimated by minimizing the loss function $\sum _{i=1}^m -\log \left(P_{\theta }\left(\text{class}=y_i|x_i\right)\right)+\lambda _1 \sum _{i=1}^n \left| \theta _i\right| +\frac{\lambda _2}{2}\sum _{i=1}^n \theta _i^2$, where $m$ is the number of training examples.
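The proportionality above can be made explicit: normalizing the exponentiated linear scores over the classes yields the softmax form of the class probabilities (for two classes this reduces to the logistic sigmoid, hence the method's name):

```latex
P(\text{class}=k\mid x)
  \,=\, \frac{\exp\left(x\cdot\theta^{(k)}\right)}
             {\sum_{j=1}^{n_{\text{class}}} \exp\left(x\cdot\theta^{(j)}\right)}
```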

* The following options can be given:

| option                | default value | description                                |
| --------------------- | ------------- | ------------------------------------------ |
| "L1Regularization"    | 0             | value of $\lambda _1$ in the loss function |
| "L2Regularization"    | Automatic     | value of $\lambda _2$ in the loss function |
| "OptimizationMethod"  | Automatic     | what optimization method to use            |

* Possible settings for ``"OptimizationMethod"`` include:

| setting                     | description                                               |
| --------------------------- | --------------------------------------------------------- |
| "LBFGS"                     | limited-memory Broyden–Fletcher–Goldfarb–Shanno algorithm |
| "StochasticGradientDescent" | stochastic gradient method                                |
| "Newton"                    | Newton method                                             |

---

## Examples (8)

### Basic Examples (2)

Train a classifier function on labeled examples:

```wl
In[1]:= c = Classify[{1, 2, 3, 4} -> {1, 1, 2, 2}, Method -> "LogisticRegression"]

Out[1]= ClassifierFunction[…]
```

Obtain information about the classifier:

```wl
In[2]:= Information[c]

Out[2]= MachineLearning`MLInformationObject[…]
```

Classify a new example:

```wl
In[3]:= c[1.3]

Out[3]= 1
```

---

Generate some normally distributed data:

```wl
In[1]:= sampledata[center_] := RandomVariate[MultinormalDistribution[center, IdentityMatrix[2]], 200];

In[2]:= clusters = sampledata /@ {{1.5, 1}, {-1.5, 1}, {0, -3}};
```

Visualize it:

```wl
In[3]:= ListPlot[clusters, PlotStyle -> Darker@{Yellow, Blue, Green}]

Out[3]= [image]
```

Train a classifier on this dataset:

```wl
In[4]:= c = Classify[<|Yellow -> clusters[[1]], Blue -> clusters[[2]], Green -> clusters[[3]]|>, Method -> "LogisticRegression"]

Out[4]= ClassifierFunction[…]
```

Plot the training set and the probability distribution of each class as a function of the features:

```wl
In[5]:=
Show[
	Plot3D[{
	c[{x, y}, "Probability" -> Yellow], 
	c[{x, y}, "Probability" -> Blue], 
	c[{x, y}, "Probability" -> Green]}, 
	{x, -4, 4}, {y, -5, 4}, 
	Exclusions -> None], ListPointPlot3D[Map[Append[#, 1]&, clusters, {2}], PlotStyle -> {Yellow, Blue, Green}]]

Out[5]= [image]
```

### Options (6)

#### "L1Regularization" (2)

Train a classifier using the ``"L1Regularization"`` option:

```wl
In[1]:= c = Classify[<|1 -> {1, 23, 1.3, 4}, 2 -> {-4, -3.2, -4, -5}|>, Method -> {"LogisticRegression", "L1Regularization" -> 3}]

Out[1]= ClassifierFunction[…]
```

---

Generate some data and visualize it:

```wl
In[1]:=
colors = {RGBColor[1, 0, 0], RGBColor[0, 0, 1], RGBColor[0, 1, 0], RGBColor[1., 0.77, 0.]};
clusters = Table[RandomVariate[BinormalDistribution[
	RandomReal[{-3, 3}, 2], 
	RandomReal[{0.5, 2}, 2], 
	RandomReal[{0.2, 0.8}]], RandomInteger[{30, 40}]], {4}];
plot = ListPlot[clusters, PlotStyle -> Darker[colors, 0.1], ImageSize -> 200, PlotRange -> {{-5, 5}, {-5, 5}}, Frame -> True, AspectRatio -> 1, PlotLabel -> "data"]

Out[1]= [image]

In[2]:=
line = Range[-5, 5, 0.25];
points = Tuples[line, 2];

In[3]:=
makecolormap[probs_]  := Transpose @ Partition[
	Map[Blend[Keys[#], Values[#]]&, probs], 
	Length[line]];
```

Train several classifiers using different values for ``"L1Regularization"`` and compare the results:

```wl
In[4]:=
data = AssociationThread[colors, clusters];
Table[
	ArrayPlot[
	makecolormap @ Classify[data, points, "Probabilities", Method -> {"LogisticRegression", "L1Regularization" -> λ}], 
	PlotLabel -> ("L1Regularization" -> λ), DataReversed -> True, ImageSize -> 150], 
	{λ, {0, 3, 5, 10}}]~Multicolumn~2 ~Legended~plot

Out[4]= [image]
```

#### "L2Regularization" (2)

Train a classifier using the ``"L2Regularization"`` option:

```wl
In[1]:= c = Classify[<|1 -> {1, 23, 1.3, 4}, 2 -> {-4, -3.2, -4, -5}|>, Method -> {"LogisticRegression", "L2Regularization" -> 3}]

Out[1]= ClassifierFunction[…]
```

---

Generate some data and visualize it:

```wl
In[1]:=
colors = {RGBColor[1, 0, 0], RGBColor[0, 0, 1], RGBColor[0, 1, 0], RGBColor[1., 0.77, 0.]};
clusters = Table[RandomVariate[BinormalDistribution[
	RandomReal[{-4, 3}, 2], 
	RandomReal[{0.5, 2}, 2], 
	RandomReal[{0.2, 0.8}]], RandomInteger[{30, 40}]], {4}];
plot = ListPlot[clusters, PlotStyle -> Darker[colors, 0.1], ImageSize -> 200, PlotRange -> {{-5, 5}, {-5, 5}}, Frame -> True, AspectRatio -> 1, PlotLabel -> "data"]

Out[1]= [image]

In[2]:=
line = Range[-5, 5, 0.25];
points = Tuples[line, 2];
```

Train several classifiers using different values for ``"L2Regularization"`` and compare the results:

```wl
In[3]:=
makecolormap[probs_]  := Transpose @ Partition[
	Map[Blend[Keys[#], Values[#]]&, probs], 
	Length[line]];

In[4]:=
data = AssociationThread[colors, clusters];
Table[
	ArrayPlot[
	makecolormap @ Classify[data, points, "Probabilities", Method -> {"LogisticRegression", "L2Regularization" -> λ}], 
	PlotLabel -> ("L2Regularization" -> λ), DataReversed -> True, ImageSize -> 150], 
	{λ, {0, 3, 5, 10}}]~Multicolumn~2 ~Legended~plot

Out[4]= [image]
```

#### "OptimizationMethod" (2)

Train a classifier using a specific ``"OptimizationMethod"``:

```wl
In[1]:= c = Classify[<|1 -> {1, 23, 1.3, 4}, 2 -> {-4, -3.2, -4, -5}|>, Method -> {"LogisticRegression", "OptimizationMethod" -> "StochasticGradientDescent"}]

Out[1]= ClassifierFunction[…]
```

---

Train a classifier using the ``"Newton"`` method:

```wl
In[1]:= trainingset = {[image] -> 2, [image] -> 5, [image] -> 8, [image] -> 0, [image] -> 2, [image] -> 7, [image] -> 5, [image] -> 1, [image] -> 3, [image] -> 0, [image] -> 3, [image] -> 9, [image] -> 6, [image] -> 2, [image] -> 8, [image] -> 2, [image] -> 0, [image] -> 6, [image] -> 6, [image] -> 1, [image] -> 1, [image] -> 7, [image] -> 8, [image] -> 5, [image] -> 0, [image] -> 4, [image] -> 7, [image] -> 6, [image] -> 0, [image] -> 2, [image] -> 5, [image] -> 3, [image] -> 1, [image] -> 5, [image] -> 6, [image] -> 7, [image] -> 5, [image] -> 4, [image] -> 1, [image] -> 9, [image] -> 3, [image] -> 6, [image] -> 8, [image] -> 0, [image] -> 9, [image] -> 3, [image] -> 0, [image] -> 3, [image] -> 7, [image] -> 4, [image] -> 4, [image] -> 3, [image] -> 8, [image] -> 0, [image] -> 4, [image] -> 1, [image] -> 3, [image] -> 7, [image] -> 6, [image] -> 4, [image] -> 7, [image] -> 2, [image] -> 7, [image] -> 2, [image] -> 5, [image] -> 2, [image] -> 0, [image] -> 9, [image] -> 8, [image] -> 9, [image] -> 8, [image] -> 1, [image] -> 6, [image] -> 4, [image] -> 8, [image] -> 5, [image] -> 8, [image] -> 0, [image] -> 6, [image] -> 7, [image] -> 4, [image] -> 5, [image] -> 8, [image] -> 4, [image] -> 3, [image] -> 1, [image] -> 5, [image] -> 1, [image] -> 9, [image] -> 9, [image] -> 9, [image] -> 2, [image] -> 4, [image] -> 7, [image] -> 3, [image] -> 1, [image] -> 9, [image] -> 2, [image] -> 9, [image] -> 6};

In[2]:= logistic = Classify[trainingset, Method -> {"LogisticRegression", "OptimizationMethod" -> "Newton"}]

Out[2]= ClassifierFunction[…]
```

Train a classifier using the ``"StochasticGradientDescent"`` method:

```wl
In[3]:= logistic2 = Classify[trainingset, Method -> {"LogisticRegression", "OptimizationMethod" -> "StochasticGradientDescent"}]

Out[3]= ClassifierFunction[…]
```

Compare the corresponding training times:

```wl
In[4]:=
Information[logistic, "TrainingTime"]
Information[logistic2, "TrainingTime"]

Out[4]= Quantity[4.856789, "Seconds"]

Out[5]= Quantity[1.566438, "Seconds"]
```

## See Also

* [`Classify`](https://reference.wolfram.com/language/ref/Classify.en.md)
* [`ClassifierFunction`](https://reference.wolfram.com/language/ref/ClassifierFunction.en.md)
* [`ClassifierMeasurements`](https://reference.wolfram.com/language/ref/ClassifierMeasurements.en.md)
* [`Predict`](https://reference.wolfram.com/language/ref/Predict.en.md)
* [`PredictorMeasurements`](https://reference.wolfram.com/language/ref/PredictorMeasurements.en.md)
* [`SequencePredict`](https://reference.wolfram.com/language/ref/SequencePredict.en.md)
* [`ClusterClassify`](https://reference.wolfram.com/language/ref/ClusterClassify.en.md)
* [`LogisticSigmoid`](https://reference.wolfram.com/language/ref/LogisticSigmoid.en.md)
* [`DecisionTree`](https://reference.wolfram.com/language/ref/method/DecisionTree.en.md)
* [`Markov`](https://reference.wolfram.com/language/ref/method/Markov.en.md)
* [`NearestNeighbors`](https://reference.wolfram.com/language/ref/method/NearestNeighbors.en.md)
* [`NeuralNetwork`](https://reference.wolfram.com/language/ref/method/NeuralNetwork.en.md)
* [`RandomForest`](https://reference.wolfram.com/language/ref/method/RandomForest.en.md)
* [`SupportVectorMachine`](https://reference.wolfram.com/language/ref/method/SupportVectorMachine.en.md)

## Related Links

* [An Elementary Introduction to the Wolfram Language: Machine Learning](https://www.wolfram.com/language/elementary-introduction/22-machine-learning.html)

## History

* [Introduced in 2014 (10.0)](https://reference.wolfram.com/language/guide/SummaryOfNewFeaturesIn100.en.md)