Wolfram Language & System Documentation Center

BatchSize

is an option for NetTrain and related functions that specifies the size of a batch of examples to process together.

Details

Setting BatchSizen specifies that n examples should be processed together.
With the default setting BatchSize->Automatic, the batch size is chosen automatically based on factors such as the available GPU or system memory.
BatchSize can be specified when evaluating a net by writing net[input,BatchSize->n]. This can be important when GPU computation is also specified via TargetDevice->"GPU", as memory is typically more limited in this case.
For nets that contain dynamic dimensions (typically specified as "Varying"), a BatchSize of 16 is usually chosen automatically.
The BatchSize used when training can be obtained from a NetTrainResultsObject via the "BatchSize" property.
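
As a minimal sketch of the last few points, the following evaluates a net on a list of inputs in batches of 50, then trains it briefly and reads the batch size back from the results object. The net architecture, the random data and the names inputs, outputs, data and results are hypothetical choices made for illustration. The sketch relies on NetTrain[net, data, All] returning a NetTrainResultsObject:

Wolfram Language code:

(* hypothetical small net taking vectors of length 4; NetInitialize gives it random weights so it can be evaluated *)
net = NetInitialize[NetChain[{10, Tanh, 1}, "Input" -> 4]];
(* 1000 random input vectors, processed 50 at a time *)
inputs = RandomReal[1, {1000, 4}];
outputs = net[inputs, BatchSize -> 50];
Dimensions[outputs]

(* train briefly on random targets and request the full results object with All *)
data = Thread[inputs -> RandomReal[1, {1000, 1}]];
results = NetTrain[net, data, All, MaxTrainingRounds -> 2, BatchSize -> 50];
results["BatchSize"]

When BatchSize is set explicitly as here, the property should simply report that setting; it is more informative when BatchSize is left as Automatic.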

Examples


Basic Examples (1)

Define a small multilayer neural network and train it with a BatchSize of 300:

Wolfram Language code:

net = NetChain[{100, Tanh, 100, Tanh, 1}, "Input" -> "Scalar", "Output" -> "Scalar"];
trainingData = Table[i -> Sin[i], {i, -Pi, Pi, Pi / 300.}];
trained = NetTrain[net, trainingData, BatchSize -> 300]

Predict the value of a new input:

Wolfram Language code: trained[Pi]

Properties & Relations (1)

NetTrain typically processes more inputs per second when larger batch sizes are used, at the cost of extra memory usage. Training a simple net with a BatchSize of 1:

Wolfram Language code:

net = LinearLayer[];
data = Thread[RandomReal[1, 2000] -> RandomReal[1, 2000]];

Wolfram Language code: trained = NetTrain[net, data, MaxTrainingRounds -> 4, BatchSize -> 1]; // AbsoluteTiming

Using a BatchSize of 1000:

Wolfram Language code: trained = NetTrain[net, data, MaxTrainingRounds -> 4, BatchSize -> 1000]; // AbsoluteTiming

This can also be seen by returning the mean examples per second processed by NetTrain:

Wolfram Language code: NetTrain[net, data, "MeanExamplesPerSecond", MaxTrainingRounds -> 4, BatchSize -> 1]

Wolfram Language code: NetTrain[net, data, "MeanExamplesPerSecond", MaxTrainingRounds -> 4, BatchSize -> 1000]

Depending on the task, larger batch sizes provide only marginal benefit to final net quality and may exhaust the available memory when training on a GPU. Furthermore, a given amount of training time may be better spent on making more frequent updates using smaller batches, as long as the batch size is still large enough to produce a low-variance estimate of the gradient.
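
To make the trade-off concrete, here is a rough sketch that counts the batches in one training round over 2000 examples (the size of the data used above) for several batch sizes. It assumes one parameter update per batch and that a final partial batch counts as a full one. Both are simplifying assumptions for illustration, not a description of NetTrain internals, and the names numExamples and batchSizes are hypothetical:

Wolfram Language code:

(* number of training examples, matching the data defined above *)
numExamples = 2000;
(* batch sizes to compare; 16 is included as an intermediate value *)
batchSizes = {1, 16, 1000};
(* batches, and hence assumed updates, per training round *)
Table[bs -> Ceiling[numExamples/bs], {bs, batchSizes}]

Under these assumptions, a round makes 2000 updates at a batch size of 1, 125 at a batch size of 16 and only 2 at a batch size of 1000, so smaller batches trade throughput for many more updates in the same number of rounds.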
