- Setting BatchSize->n specifies that n examples should be processed together.
- The default setting of BatchSize->Automatic specifies that the batch size should be chosen automatically, based on factors such as the available GPU or system memory.
- BatchSize can be specified when evaluating a net by writing net[input,BatchSize->n]. This can be important when GPU computation is also specified via TargetDevice->"GPU", as memory is typically more limited in this case.
- For nets that contain dynamic dimensions (usually specified as "Varying"), the BatchSize is usually automatically chosen to be 16.
- The BatchSize used when training can be obtained from a NetTrainResultsObject via the "BatchSize" property.
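The evaluation-time and training-time uses described above can be sketched as follows. This is a minimal illustration, not taken from the original page: the net, the random inputs and the synthetic training data are hypothetical, and the GPU call is left as a comment because it assumes a supported GPU is available.

```wolfram
(* Hypothetical net and inputs, for illustration only *)
net = NetInitialize[LinearLayer[2, "Input" -> 3]];
inputs = RandomReal[1, {1000, 3}];

(* Evaluate the 1000 inputs in batches of 64 *)
outputs = net[inputs, BatchSize -> 64];
Dimensions[outputs]

(* With a supported GPU, the same call can also target the GPU: *)
(* net[inputs, BatchSize -> 64, TargetDevice -> "GPU"] *)

(* Train on synthetic data (assumed: y = 2x + 1), requesting the full results object *)
data = Table[x -> 2. x + 1., {x, RandomReal[{-1, 1}, 3000]}];
results = NetTrain[LinearLayer[], data, All, BatchSize -> 300];

(* Read back the batch size that was used during training *)
results["BatchSize"]
```

Passing All as the third argument of NetTrain returns a NetTrainResultsObject rather than the trained net, which is what makes the "BatchSize" property available.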
Examples
Basic Examples (1)
Define a single-layer neural network and train this network with a BatchSize of 300:
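A minimal sketch of such an example, assuming synthetic training data (the original input cell is not preserved here, so the data and variable names are illustrative):

```wolfram
(* Synthetic data, assumed for illustration: 3000 examples of y = 2x + 1 *)
data = Table[x -> 2. x + 1., {x, RandomReal[{-1, 1}, 3000]}];

(* A single linear layer; input and output shapes are inferred from the data *)
net = LinearLayer[];

(* Train, processing 300 examples per batch *)
trained = NetTrain[net, data, BatchSize -> 300]
```

With 3000 examples and a batch size of 300, each training round consists of 10 batches.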
Properties & Relations (1)
NetTrain typically processes more inputs per second when larger batch sizes are used, at the cost of extra memory usage. Training a simple net with a BatchSize of 1:
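One way this could look, assuming a small hypothetical net and synthetic data (neither is from the original page), with the number of rounds capped so the run stays short:

```wolfram
(* Synthetic data, assumed for illustration: 2000 examples of y = Sin[x] *)
data = Table[{x} -> {Sin[x]}, {x, RandomReal[{-3, 3}, 2000]}];

(* A small hypothetical net with one hidden layer *)
net = NetChain[{LinearLayer[32], Ramp, LinearLayer[1]}, "Input" -> 1];

(* One example per batch: 2000 parameter updates per round *)
NetTrain[net, data, BatchSize -> 1, MaxTrainingRounds -> 5]
```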
Using a BatchSize of 1000:
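A corresponding sketch for the larger batch size, again with hypothetical data and net definitions repeated so the snippet runs on its own:

```wolfram
(* Synthetic data, assumed for illustration: 2000 examples of y = Sin[x] *)
data = Table[{x} -> {Sin[x]}, {x, RandomReal[{-3, 3}, 2000]}];

(* A small hypothetical net with one hidden layer *)
net = NetChain[{LinearLayer[32], Ramp, LinearLayer[1]}, "Input" -> 1];

(* 1000 examples per batch: only 2 parameter updates per round *)
NetTrain[net, data, BatchSize -> 1000, MaxTrainingRounds -> 5]
```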
This can also be seen by returning the mean examples per second processed by NetTrain:
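A hedged sketch of that comparison follows. It assumes "MeanExamplesPerSecond" is accepted as a NetTrain property in the installed version; the data and net are hypothetical, and the measured rates depend on the hardware, so no output is shown.

```wolfram
(* Synthetic data, assumed for illustration: 2000 examples of y = Sin[x] *)
data = Table[{x} -> {Sin[x]}, {x, RandomReal[{-3, 3}, 2000]}];

(* A small hypothetical net with one hidden layer *)
net = NetChain[{LinearLayer[32], Ramp, LinearLayer[1]}, "Input" -> 1];

(* Return the mean number of examples processed per second for each batch size *)
Table[
 b -> NetTrain[net, data, "MeanExamplesPerSecond",
   BatchSize -> b, MaxTrainingRounds -> 5],
 {b, {1, 1000}}]
```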
Depending on the task, larger batch sizes provide only marginal benefit to final net quality and may exhaust the available memory when training on a GPU. Furthermore, a given amount of training time may be better spent on making more frequent updates using smaller batches, as long as the batch size is still large enough to produce a low-variance estimate of the gradient.
Wolfram Research (2016), BatchSize, Wolfram Language function, https://reference.wolfram.com/language/ref/BatchSize.html (updated 2018).