LearningRateMultipliers is an option for net layers and for NetTrain, NetChain and NetGraph that specifies learning rate multipliers to apply during training.

Details

  • With the default value of LearningRateMultipliers->Automatic, all layers learn at the same rate.
  • LearningRateMultipliers->{rule1,rule2,…} specifies a set of rules that will be used to determine learning rate multipliers for every trainable array in the net.
  • In LearningRateMultipliers->{rule1,rule2,…}, each rule can take one of the following forms:
      "part"->r             use multiplier r for a named layer, subnetwork or array in a layer
      n->r                  use multiplier r for the nth layer
      m;;n->r               use multiplier r for layers m through n
      {part1,part2,…}->r    use multiplier r for a nested layer or array
      _->r                  use multiplier r for all layers
  • LearningRateMultipliers->r specifies using the same multiplier r for all trainable arrays.
  • If r is zero or None, it specifies that the layer or array should not undergo training and will be left unchanged by NetTrain.
  • If r is a positive or negative number, it specifies a multiplier to apply to the global learning rate chosen by the training method to determine the learning rate for the given layer or array.
  • For each trainable array, the rate used is given by the first matching rule, or 1 if no rule matches.
  • Rules that specify a subnet (e.g. a nested NetChain or NetGraph) apply to all layers and arrays within that subnet.
  • LearningRateMultipliers->{part->None} can be used to "freeze" a specific part.
  • LearningRateMultipliers->{part->1,_->None} can be used to "freeze" all layers except for a specific part.
  • The hierarchical specification {part1,part2,…} used by LearningRateMultipliers to refer to parts of a net is equivalent to that used by NetExtract and NetReplacePart.
  • Information[net,"ArraysLearningRateMultipliers"] yields the default learning rate multipliers for all arrays of a net.
  • The multipliers that are actually used when training can be obtained from a NetTrainResultsObject via the property "ArraysLearningRateMultipliers".
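
The rule forms and the first-match behavior described above can be sketched as follows. This is a minimal illustration, not taken from the original page: the net, the layer names "enc", "act" and "dec", the random data and the training settings are all assumptions chosen to keep the example small.

```wolfram
(* Hypothetical three-layer net; layer names and sizes are illustrative only *)
net = NetInitialize@NetChain[
    <|"enc" -> LinearLayer[4],
      "act" -> ElementwiseLayer[Ramp],
      "dec" -> LinearLayer[2]|>,
    "Input" -> 3];

(* Synthetic data: length-3 inputs mapped to length-2 targets *)
data = Table[RandomReal[1, 3] -> RandomReal[1, 2], {64}];

(* Rules are tried in order and the first match is used:
   every array in "enc" is frozen, the weights of "dec" use 0.5,
   and any remaining array falls through to _ -> 1 *)
rules = {"enc" -> None, {"dec", "Weights"} -> 0.5, _ -> 1};

results = NetTrain[net, data, All,
    LearningRateMultipliers -> rules,
    MaxTrainingRounds -> 5,
    TrainingProgressReporting -> None];

(* Multipliers that were actually applied to each array *)
results["ArraysLearningRateMultipliers"]

(* The frozen arrays of "enc" are expected to be left unchanged *)
trained = results["TrainedNet"];
Normal@NetExtract[trained, {"enc", "Weights"}] ==
  Normal@NetExtract[net, {"enc", "Weights"}]
```

Replacing the rule list with {"dec" -> 1, _ -> None} would instead freeze everything except "dec".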

Examples

Basic Examples  (2)

Create and initialize a net with three layers, but train only the last layer:

Evaluate the trained net on an input:

The first layer of the initial net started with zero biases:

The biases of the first layer remain zero in the trained net:

The biases of the third layer have been trained:
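
The input cells of this example are not preserved here. The steps above might be reproduced roughly as follows; the layer sizes, the random training data and the training settings are assumptions made for illustration.

```wolfram
(* Three linear layers; NetInitialize sets all biases to zero by default *)
net = NetInitialize@NetChain[
    {LinearLayer[3], LinearLayer[3], LinearLayer[3]},
    "Input" -> 2];

(* Synthetic data, assumed for illustration *)
data = Table[RandomReal[1, 2] -> RandomReal[1, 3], {100}];

(* Train only layer 3; every other layer matches _ -> None and is frozen *)
trained = NetTrain[net, data,
    LearningRateMultipliers -> {3 -> 1, _ -> None},
    MaxTrainingRounds -> 10,
    TrainingProgressReporting -> None];

(* Evaluate the trained net on an input *)
trained[{0.5, 0.5}]

(* Biases of the first layer before and after training *)
Normal@NetExtract[net, {1, "Biases"}]
Normal@NetExtract[trained, {1, "Biases"}]

(* Biases of the third layer after training *)
Normal@NetExtract[trained, {3, "Biases"}]
```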

Create a frozen layer with given array values:

Nest this layer inside a bigger net:

Get the learning rate multipliers that will be used by default in NetTrain, for all arrays of the net:

Train the net:

Check the learning rate multipliers that were used to train:

The arrays of the frozen layer were unchanged during training:
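
The input cells of this example are not preserved here. A sketch of the same workflow, with array values, layer sizes and training data chosen arbitrarily as assumptions, could look like this:

```wolfram
(* A layer with fixed arrays, frozen through the layer-level option *)
frozen = LinearLayer[2,
    "Weights" -> {{1, 0}, {0, 1}},
    "Biases" -> {0, 0},
    LearningRateMultipliers -> None];

(* Nest the frozen layer inside a bigger net *)
net = NetInitialize@NetChain[{frozen, LinearLayer[1]}, "Input" -> 2];

(* Default multipliers that NetTrain would use for every array *)
Information[net, "ArraysLearningRateMultipliers"]

(* Synthetic data, assumed for illustration *)
data = Table[RandomReal[1, 2] -> RandomReal[1, 1], {100}];

results = NetTrain[net, data, All,
    MaxTrainingRounds -> 10,
    TrainingProgressReporting -> None];

(* Multipliers that were actually used *)
results["ArraysLearningRateMultipliers"]

(* Arrays of the frozen layer after training *)
trained = results["TrainedNet"];
Normal@NetExtract[trained, {1, "Weights"}]
Normal@NetExtract[trained, {1, "Biases"}]
```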

Scope  (1)

Replace LearningRateMultipliers in a Network  (1)

Take a net:

Set the LearningRateMultipliers of the first layer of this net to zero:

Programmatically check the values of the LearningRateMultipliers options:

Applications  (1)

Train an existing network to solve a new task. Obtain a pre-trained convolutional model that was trained on handwritten digits:

Remove the final two layers, and attach two new layers, in order to classify images into 3 classes:

Generate training data by rasterizing the characters "x", "y", and "z" with a variety of fonts, sizes, and cases:

Train the modified network on the new task:

Classify an unseen letter:

Measure the performance on the original training data, which includes the training and validation set:
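
The input cells of this application are not preserved here. The workflow above could be sketched along the following lines. Several details are assumptions rather than the original code: the layer names "pretrained", "linearNew" and "softmax", the fonts and sizes, the number of examples, the helper makeExample, and in particular the choice to freeze every pretrained layer and train only the new linear layer. The sketch downloads the pretrained model and needs a front end for Rasterize.

```wolfram
(* Pretrained digit classifier from the Wolfram Neural Net Repository *)
lenet = NetModel["LeNet Trained on MNIST Data"];

(* Drop the final two layers and attach new ones for 3 classes *)
newNet = NetChain[
    <|"pretrained" -> NetDrop[lenet, -2],
      "linearNew" -> LinearLayer[3],
      "softmax" -> SoftmaxLayer[]|>,
    "Output" -> NetDecoder[{"Class", {"x", "y", "z"}}]];

(* Hypothetical helper: rasterize a letter in a random case, size and font,
   negated to give light text on a dark background like the MNIST digits *)
makeExample[c_] := ColorNegate@Rasterize[
     Style[RandomChoice[{c, ToUpperCase[c]}],
       RandomInteger[{20, 60}],
       FontFamily -> RandomChoice[{"Times", "Courier", "Helvetica"}]]] -> c;

trainingData = Flatten@Table[makeExample[c], {c, {"x", "y", "z"}}, {50}];

(* Train only the new linear layer; the pretrained part stays frozen *)
trained = NetTrain[newNet, trainingData,
    LearningRateMultipliers -> {"linearNew" -> 1, _ -> None},
    MaxTrainingRounds -> 20,
    TrainingProgressReporting -> None];

(* Classify a newly generated letter *)
trained[First[makeExample["y"]]]

(* Fraction of the training examples that are classified correctly *)
N@Mean@Boole@MapThread[SameQ,
    {trained[Keys[trainingData]], Values[trainingData]}]
```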

Properties & Relations  (1)

Train LeNet on the MNIST dataset with specific learning rate multipliers, returning a NetTrainResultsObject:

Obtain the actual learning rate multipliers used on individual weight arrays:
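
The input cells of this example are not preserved here. To keep the sketch small and self-contained, a hypothetical two-layer net and synthetic data stand in for LeNet and MNIST; the layer names "hidden", "act" and "out" and the multiplier values are assumptions.

```wolfram
(* Small stand-in for LeNet; names and sizes are illustrative only *)
net = NetChain[
    <|"hidden" -> LinearLayer[8],
      "act" -> ElementwiseLayer[Ramp],
      "out" -> LinearLayer[1]|>,
    "Input" -> 4];

(* Synthetic regression data standing in for MNIST *)
data = Table[With[{x = RandomReal[1, 4]}, x -> {Total[x]}], {256}];

(* The third argument All makes NetTrain return a NetTrainResultsObject *)
results = NetTrain[net, data, All,
    LearningRateMultipliers -> {"hidden" -> 0.1, {"out", "Biases"} -> None},
    MaxTrainingRounds -> 3,
    TrainingProgressReporting -> None];

(* Multipliers actually used on the individual arrays *)
results["ArraysLearningRateMultipliers"]
```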

Possible Issues  (1)

When a shared array occurs at several places in the network, a single learning rate multiplier is applied to all occurrences of the shared array.

Create a network with shared arrays:

Specifying a learning rate multiplier for a shared array in the network assigns the same multiplier to every place where the array occurs:

If there is a conflict, the first matching value will be used:

The same happens when LearningRateMultipliers is specified when constructing the network:
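
The input cells of this example are not preserved here. A sketch of the situation described above, assuming a hypothetical shared array named "w", arbitrary layer sizes and synthetic data, might look like this:

```wolfram
(* Two linear layers whose weights refer to the same shared array "w" *)
net = NetInitialize@NetChain[
    {LinearLayer[2, "Weights" -> NetSharedArray["w"]],
     LinearLayer[2, "Weights" -> NetSharedArray["w"]]},
    "Input" -> 2];

(* Synthetic data, assumed for illustration *)
data = Table[RandomReal[1, 2] -> RandomReal[1, 2], {64}];

(* A multiplier given for one occurrence of the shared array *)
single = NetTrain[net, data, All,
    LearningRateMultipliers -> {{1, "Weights"} -> 0.5},
    MaxTrainingRounds -> 3,
    TrainingProgressReporting -> None];
single["ArraysLearningRateMultipliers"]

(* Conflicting multipliers for the two occurrences of the same array *)
conflict = NetTrain[net, data, All,
    LearningRateMultipliers -> {{1, "Weights"} -> 0.5, {2, "Weights"} -> 2},
    MaxTrainingRounds -> 3,
    TrainingProgressReporting -> None];
conflict["ArraysLearningRateMultipliers"]
```

Inspecting the "ArraysLearningRateMultipliers" property in both cases shows which multiplier was assigned to the shared array.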

Wolfram Research (2017), LearningRateMultipliers, Wolfram Language function, https://reference.wolfram.com/language/ref/LearningRateMultipliers.html (updated 2020).