LearningRateMultipliers
Details


- With the default value of LearningRateMultipliers->Automatic, all layers learn at the same rate.
- LearningRateMultipliers->{rule1,rule2,…} specifies a set of rules that will be used to determine learning rate multipliers for every trainable array in the net.
- In LearningRateMultipliers->{rule1,rule2,…}, each of the rulei can be of the following forms:
- "part"->r   use multiplier r for a named layer, subnetwork or array in a layer
- n->r   use multiplier r for the nth layer
- m;;n->r   use multiplier r for layers m through n
- {part1,part2,…}->r   use multiplier r for a nested layer or array
- _->r   use multiplier r for all layers
- LearningRateMultipliers->r specifies using the same multiplier r for all trainable arrays.
- If r is zero or None, it specifies that the layer or array should not undergo training and will be left unchanged by NetTrain.
- If r is a positive or negative number, it specifies a multiplier to apply to the global learning rate chosen by the training method to determine the learning rate for the given layer or array.
- For each trainable array, the rate used is given by the first matching rule, or 1 if no rule matches.
- Rules that specify a subnet (e.g. a nested NetChain or NetGraph) apply to all layers and arrays within that subnet.
- LearningRateMultipliers->{part->None} can be used to "freeze" a specific part.
- LearningRateMultipliers->{part->1,_->None} can be used to "freeze" all layers except for a specific part.
- The hierarchical specification {part1,part2,…} used by LearningRateMultipliers to refer to parts of a net is equivalent to that used by NetExtract and NetReplacePart.
- Information[net,"ArraysLearningRateMultipliers"] yields the default learning rate multipliers for all arrays of a net.
- The multipliers that are genuinely used when training can be obtained from a NetTrainResultsObject via the property "ArraysLearningRateMultipliers".
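As an illustration of how these rule forms combine, a hedged sketch (the layer names "encoder" and "output" and the variables net and trainingData are assumptions, not from this page) might look like:

```wolfram
(* Sketch: train "output" normally, "encoder" at a tenth of the
   global rate, and freeze every remaining array. *)
NetTrain[net, trainingData,
  LearningRateMultipliers -> {
    "output" -> 1,    (* full learning rate *)
    "encoder" -> 0.1, (* 1/10 of the global rate *)
    _ -> None         (* all other arrays frozen *)
  }]
```

Because the first matching rule wins, the catch-all `_ -> None` must come last.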
Examples
Basic Examples (2): Summary of the most common use cases
Create and initialize a net with three layers, but train only the last layer:

https://wolfram.com/xid/08u6jqjzi3h70ws-js14pr


https://wolfram.com/xid/08u6jqjzi3h70ws-urg4gr
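The original cells are available only via the links above; a minimal sketch of such a setup (the layer sizes, input shape, and the variable data are assumptions) could be:

```wolfram
(* Three-layer chain; only layer 3 is allowed to learn *)
net = NetInitialize@
   NetChain[{LinearLayer[10], Ramp, LinearLayer[1]}, "Input" -> 2];
trained = NetTrain[net, data,
   LearningRateMultipliers -> {3 -> 1, _ -> None}]
```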

Evaluate the trained net on an input:

https://wolfram.com/xid/08u6jqjzi3h70ws-x7rf7y

The first layer of the initial net started with zero biases:

https://wolfram.com/xid/08u6jqjzi3h70ws-m9uwb4

The biases of the first layer remain zero in the trained net:

https://wolfram.com/xid/08u6jqjzi3h70ws-4shnzh

The biases of the third layer have been trained:

https://wolfram.com/xid/08u6jqjzi3h70ws-s8mi6c

Create a frozen layer with given array values:

https://wolfram.com/xid/08u6jqjzi3h70ws-2fo4zr
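The cell itself is behind the link above; a sketch of a frozen layer with explicitly given arrays (the particular weight and bias values are assumptions) might be:

```wolfram
(* A linear layer whose arrays are fixed and never trained *)
frozen = LinearLayer[2,
   "Weights" -> {{1, 0}, {0, 1}}, "Biases" -> {0, 0},
   "Input" -> 2, LearningRateMultipliers -> None]
```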

Nest this layer inside a bigger net:

https://wolfram.com/xid/08u6jqjzi3h70ws-etka1v

Get the learning rate multipliers that will be used by default in NetTrain, for all arrays of the net:

https://wolfram.com/xid/08u6jqjzi3h70ws-wy51jf


https://wolfram.com/xid/08u6jqjzi3h70ws-rxbupu
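Per the Details section, the call in question has the form:

```wolfram
(* Default learning rate multipliers for all arrays of the net *)
Information[net, "ArraysLearningRateMultipliers"]
```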

Check the learning rate multipliers that were used to train:

https://wolfram.com/xid/08u6jqjzi3h70ws-10fxdt

The arrays of the frozen layer were unchanged during training:

https://wolfram.com/xid/08u6jqjzi3h70ws-7bmy4b


https://wolfram.com/xid/08u6jqjzi3h70ws-xjhq6b


https://wolfram.com/xid/08u6jqjzi3h70ws-7kqa13

Scope (1): Survey of the scope of standard use cases
Replace LearningRateMultipliers in a Network (1)

https://wolfram.com/xid/08u6jqjzi3h70ws-7t7hwi

Set the LearningRateMultipliers of the first layer of this net to zero:

https://wolfram.com/xid/08u6jqjzi3h70ws-d7saa0

Check the values of the LearningRateMultipliers option programmatically:

https://wolfram.com/xid/08u6jqjzi3h70ws-3584g7


https://wolfram.com/xid/08u6jqjzi3h70ws-wqconw

Applications (1): Sample problems that can be solved with this function
Train an existing network to solve a new task. Obtain a pre-trained convolutional model that was trained on handwritten digits:

https://wolfram.com/xid/08u6jqjzi3h70ws-9g900d

Remove the final two layers, and attach two new layers, in order to classify images into 3 classes:

https://wolfram.com/xid/08u6jqjzi3h70ws-kn06l8
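One way to perform this surgery, assuming the pre-trained model (here called lenet, a hypothetical name) is a NetChain, is:

```wolfram
(* Sketch: drop the final two layers, then append a new
   3-class classification head *)
newNet = NetAppend[NetDrop[lenet, -2],
   {LinearLayer[3], SoftmaxLayer[]}]
```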

Generate training data by rasterizing the characters "x", "y", and "z" with a variety of fonts, sizes, and cases:

https://wolfram.com/xid/08u6jqjzi3h70ws-ve26qu

https://wolfram.com/xid/08u6jqjzi3h70ws-kjthon

https://wolfram.com/xid/08u6jqjzi3h70ws-ubddz4


https://wolfram.com/xid/08u6jqjzi3h70ws-coeaek
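A hedged sketch of such data generation (the specific fonts, sizes, and image dimensions are assumptions):

```wolfram
(* Rasterize "x", "y", "z" in both cases across fonts and sizes;
   each example is image -> character label *)
chars = {"x", "y", "z"};
fonts = {"Arial", "Times", "Courier"}; (* assumed font names *)
data = Flatten@Table[
    Rasterize[Style[c, FontFamily -> f, FontSize -> s],
      ImageSize -> {28, 28}] -> ToLowerCase[c],
    {c, Join[chars, ToUpperCase /@ chars]},
    {f, fonts}, {s, {12, 16, 20}}];
```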

Train the modified network on the new task:

https://wolfram.com/xid/08u6jqjzi3h70ws-m8jo9g


https://wolfram.com/xid/08u6jqjzi3h70ws-0vjoyk


https://wolfram.com/xid/08u6jqjzi3h70ws-h1lztz
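One hedged way to set up this training, assuming the two newly attached layers sit at the end of the chain and that negative indices are accepted as elsewhere in net part specifications, is to freeze everything else:

```wolfram
(* Train only the two new layers; keep pre-trained weights frozen *)
trained = NetTrain[newNet, data,
   LearningRateMultipliers -> {-2 ;; -1 -> 1, _ -> None}]
```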

Measure the performance on the original training data, which includes the training and validation set:

https://wolfram.com/xid/08u6jqjzi3h70ws-ya18rn

Properties & Relations (1): Properties of the function, and connections to other functions
Train LeNet on the MNIST dataset with specific learning rate multipliers, returning a NetTrainResultsObject:

https://wolfram.com/xid/08u6jqjzi3h70ws-fey8ok

Obtain the actual learning rate multipliers used on individual weight arrays:

https://wolfram.com/xid/08u6jqjzi3h70ws-wvkchy
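As stated in Details, the property is queried on the results object; a sketch (net, data, and the particular multiplier are assumptions):

```wolfram
(* Third argument All makes NetTrain return a NetTrainResultsObject *)
results = NetTrain[net, data, All,
   LearningRateMultipliers -> {1 -> 0.1}];
results["ArraysLearningRateMultipliers"]
```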

Possible Issues (1): Common pitfalls and unexpected behavior
When a shared array occurs at several places in a network, a single learning rate multiplier is applied to all occurrences of the shared array.
Create a network with shared arrays:

https://wolfram.com/xid/08u6jqjzi3h70ws-n2cua7
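A sketch of such a network, using NetSharedArray so that both layers reference the same weight array (the shapes and the name "w" are assumptions):

```wolfram
(* Both linear layers use the single shared array "w" *)
net = NetInitialize@NetChain[{
    LinearLayer[2, "Weights" -> NetSharedArray["w"], "Input" -> 2],
    LinearLayer[2, "Weights" -> NetSharedArray["w"]]}]
```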

Specifying a learning rate multiplier for a shared array assigns the same multiplier to every place where the array occurs:

https://wolfram.com/xid/08u6jqjzi3h70ws-yd9zx2

If there is a conflict, the first matching value will be used:

https://wolfram.com/xid/08u6jqjzi3h70ws-48sw08

The same applies when LearningRateMultipliers is specified at network construction time:

https://wolfram.com/xid/08u6jqjzi3h70ws-i83apx


https://wolfram.com/xid/08u6jqjzi3h70ws-tdv8y3

Text
Wolfram Research (2017), LearningRateMultipliers, Wolfram Language function, https://reference.wolfram.com/language/ref/LearningRateMultipliers.html (updated 2020).
CMS
Wolfram Language. 2017. "LearningRateMultipliers." Wolfram Language & System Documentation Center. Wolfram Research. Last Modified 2020. https://reference.wolfram.com/language/ref/LearningRateMultipliers.html.
APA
Wolfram Language. (2017). LearningRateMultipliers. Wolfram Language & System Documentation Center. Retrieved from https://reference.wolfram.com/language/ref/LearningRateMultipliers.html
BibTeX
@misc{reference.wolfram_2025_learningratemultipliers, author="Wolfram Research", title="{LearningRateMultipliers}", year="2020", howpublished="\url{https://reference.wolfram.com/language/ref/LearningRateMultipliers.html}", note="Accessed: 19-June-2025"}
BibLaTeX
@online{reference.wolfram_2025_learningratemultipliers, organization={Wolfram Research}, title={LearningRateMultipliers}, year={2020}, url={https://reference.wolfram.com/language/ref/LearningRateMultipliers.html}, note={Accessed: 19-June-2025}}