BatchNormalizationLayer
BatchNormalizationLayer[] represents a trainable net layer that normalizes its input data by learning the data mean and variance.
Details and Options
- BatchNormalizationLayer is typically used inside NetChain, NetGraph, etc. to regularize and speed up network training.
- The following optional parameters can be included:
  "Epsilon"        0.001    stability parameter
  "Interleaving"   False    the position of the channel dimension
  "Momentum"       0.9      momentum used during training
- With the setting "Interleaving"->False, the channel dimension is taken to be the first dimension of the input and output arrays.
- With the setting "Interleaving"->True, the channel dimension is taken to be the last dimension of the input and output arrays.
- The following learnable arrays can be included:
  "Biases"           Automatic   learnable bias array
  "MovingMean"       Automatic   moving estimate of the mean
  "MovingVariance"   Automatic   moving estimate of the variance
  "Scaling"          Automatic   learnable scaling array
- With Automatic settings, the biases, scaling, moving mean and moving variance arrays are initialized automatically when NetInitialize or NetTrain is used.
- The following training parameter can be included:
  LearningRateMultipliers   Automatic   learning rate multipliers for the arrays
- BatchNormalizationLayer freezes the values of "MovingVariance" and "MovingMean" during training with NetTrain if LearningRateMultipliers is 0 or "Momentum" is 1 (see the sketch after this list).
- If biases, scaling, moving variance and moving mean have been set, BatchNormalizationLayer[…][input] explicitly computes the output from applying the layer.
- BatchNormalizationLayer[…][{input1,input2,…}] explicitly computes outputs for each of the inputi.
- When given a NumericArray as input, the output will be a NumericArray.
- BatchNormalizationLayer exposes the following ports for use in NetGraph etc.:
  "Input"    a vector, matrix or higher-rank array
  "Output"   a vector, matrix or higher-rank array
- When it cannot be inferred from other layers in a larger net, the option "Input"->{n1,n2,…} can be used to fix the input dimensions of BatchNormalizationLayer.
- NetExtract can be used to extract biases, scaling, moving variance and moving mean arrays from a BatchNormalizationLayer object.
- Options[BatchNormalizationLayer] gives the list of default options to construct the layer. Options[BatchNormalizationLayer[…]] gives the list of default options to evaluate the layer on some data.
- Information[BatchNormalizationLayer[…]] gives a report about the layer.
- Information[BatchNormalizationLayer[…],prop] gives the value of the property prop of BatchNormalizationLayer[…]. Possible properties are the same as for NetGraph.
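For example, either of the following constructions freezes the moving statistics during NetTrain (a minimal sketch of the two settings named above):

BatchNormalizationLayer["Momentum" -> 1]  (* moving statistics are frozen *)
BatchNormalizationLayer[LearningRateMultipliers -> 0]  (* all arrays are frozen, including "MovingMean" and "MovingVariance" *)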
Examples
Basic Examples (2)
Create a BatchNormalizationLayer:
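For example:

BatchNormalizationLayer[]
(* an uninitialized BatchNormalizationLayer object *)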
Create an initialized BatchNormalizationLayer that takes a vector and returns a vector:
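For example, for length-3 vectors (the length is illustrative):

NetInitialize[BatchNormalizationLayer["Input" -> 3]]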
Scope (4)
Ports (2)
Create an initialized BatchNormalizationLayer that takes a rank-3 array and returns a rank-3 array:
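For example (the dimensions 3×4×4 are illustrative):

NetInitialize[BatchNormalizationLayer["Input" -> {3, 4, 4}]]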
Create an initialized BatchNormalizationLayer that takes a vector and returns a vector:
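For example (length 2 is illustrative):

layer = NetInitialize[BatchNormalizationLayer["Input" -> 2]]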
Apply the layer to a batch of input vectors:
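Continuing the sketch above, with invented data:

layer[{{1., 2.}, {3., 4.}, {5., 6.}}]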
Use NetEvaluationMode to use the training behavior of BatchNormalizationLayer:
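In training mode, the normalization uses the statistics of the current batch rather than the moving estimates:

layer[{{1., 2.}, {3., 4.}, {5., 6.}}, NetEvaluationMode -> "Train"]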
Parameters (2)
"Biases" (1)
Create a BatchNormalizationLayer with an initial value for the "Biases" parameter:
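For instance (the bias values are invented):

layer = NetInitialize[BatchNormalizationLayer["Biases" -> {1., 2.}, "Input" -> 2]]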
Extract the "Biases" parameter:
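Using NetExtract on the layer above:

NetExtract[layer, "Biases"]
(* the bias array {1., 2.} *)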
The default value for "Biases" chosen by NetInitialize is a zero vector:
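For a length-2 input (illustrative):

NetExtract[NetInitialize[BatchNormalizationLayer["Input" -> 2]], "Biases"]
(* a zero vector: {0., 0.} *)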
"Scaling" (1)
Create an initialized BatchNormalizationLayer with the "Scaling" parameter set to zero and the "Biases" parameter set to a custom value:
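A sketch with invented values:

layer = NetInitialize[
  BatchNormalizationLayer["Scaling" -> {0., 0.}, "Biases" -> {1., 2.}, "Input" -> 2]]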
Applying the layer to any input returns the value for the "Biases" parameter:
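With zero scaling the normalized term vanishes, leaving only the biases:

layer[{5., -3.}]
(* {1., 2.} *)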
The default value for "Scaling" chosen by NetInitialize is a vector of 1s:
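Again for a length-2 input:

NetExtract[NetInitialize[BatchNormalizationLayer["Input" -> 2]], "Scaling"]
(* {1., 1.} *)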
Options (2)
"Epsilon" (1)
Create a BatchNormalizationLayer with the "Epsilon" parameter explicitly specified:
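For example (the value 0.01 is arbitrary):

BatchNormalizationLayer["Epsilon" -> 0.01]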
"Momentum" (1)
Create a BatchNormalizationLayer with the "Momentum" parameter explicitly specified:
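For example (the value 0.99 is arbitrary):

BatchNormalizationLayer["Momentum" -> 0.99]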
Applications (1)
BatchNormalizationLayer is commonly inserted between a ConvolutionLayer and its activation function in order to stabilize and speed up training:
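A representative chain (channel counts, kernel size and input dimensions are invented):

NetChain[{ConvolutionLayer[16, {3, 3}], BatchNormalizationLayer[], Ramp},
 "Input" -> {3, 32, 32}]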
Properties & Relations (1)
During ordinary evaluation, BatchNormalizationLayer computes the following function:
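In symbols, writing eps for the "Epsilon" setting (a schematic formula, with array names as in this layer):

output = scaling*(input - movingMean)/Sqrt[movingVariance + eps] + biases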
Evaluate a BatchNormalizationLayer on an example vector containing a single channel:
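For example (with the default initialization the moving mean is zero and the moving variance is one, so the layer acts approximately as the identity):

layer = NetInitialize[BatchNormalizationLayer["Input" -> 1]];
layer[{2.}]
(* ≈ {1.999}, i.e. 2./Sqrt[1. + 0.001] *)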
Possible Issues (3)
Specifying negative values for the "MovingVariance" parameter causes numerical errors during evaluation:
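A sketch (the value -1. is invented):

layer = NetInitialize[
  BatchNormalizationLayer["MovingVariance" -> {-1.}, "Input" -> 1]];
layer[{2.}]
(* fails: the square root of a negative variance is not real *)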
BatchNormalizationLayer cannot be initialized until all its input and output dimensions are known:
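Compare:

NetInitialize[BatchNormalizationLayer[]]
(* fails: the input dimensions cannot be inferred *)
NetInitialize[BatchNormalizationLayer["Input" -> 3]]
(* succeeds once the dimensions are given *)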
The "MovingMean" and "MovingVariance" arrays of BatchNormalizationLayer cannot be shared:
Create a BatchNormalizationLayer with shared arrays:
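One possible construction (the sizes and training data are invented, and the NetSharedArray names are arbitrary):

net = NetChain[{
   BatchNormalizationLayer["Scaling" -> NetSharedArray["scaling"], "Biases" -> NetSharedArray["biases"]],
   BatchNormalizationLayer["Scaling" -> NetSharedArray["scaling"], "Biases" -> NetSharedArray["biases"]]},
  "Input" -> 2];
trained = NetTrain[net, {{1., 2.} -> {0., 1.}, {3., 4.} -> {1., 0.}}, MaxTrainingRounds -> 10]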
Extract the trained batch normalization layers:
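Using NetExtract on the trained chain above:

bn1 = NetExtract[trained, 1];
bn2 = NetExtract[trained, 2];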
The "Scaling" and "Biases" arrays were shared, but not "MovingMean" or "MovingVariance":