BayesianMinimization
BayesianMinimization[f,{conf1,conf2,…}]
gives an object representing the result of Bayesian minimization of the function f over the configurations confi.
BayesianMinimization[f,reg]
minimizes over the region represented by the region specification reg.
BayesianMinimization[f,sampler]
minimizes over configurations obtained by applying the function sampler.
BayesianMinimization[f,{conf1,conf2,…}->nsampler]
applies the function nsampler to successively generate configurations starting from the confi.
Details and Options


- BayesianMinimization[…] returns a BayesianMinimizationObject[…] whose properties can be obtained using BayesianMinimizationObject[…]["prop"].
- Possible properties include:
    "EvaluationHistory"      configurations and values explored during minimization
    "Method"                 method used for Bayesian minimization
    "MinimumConfiguration"   configuration found that minimizes the result from f
    "MinimumValue"           estimated minimum value obtained from f
    "NextConfiguration"      configuration to sample next if minimization were continued
    "PredictorFunction"      best prediction model found for the function f
    "Properties"             list of all available properties
- Configurations can be of any form accepted by Predict (single data element, list of data elements, association of data elements, etc.) and of any type accepted by Predict (numerical, textual, sounds, images, etc.).
- The function f must output a real-number value when applied to a configuration conf.
- BayesianMinimization[f,…] attempts to find a good minimum using the smallest number of evaluations of f.
- In BayesianMinimization[f,spec], spec defines the domain of the function f. A domain can be defined by a list of configurations, a geometric region, or a configuration generator function.
- In BayesianMinimization[f,sampler], sampler[] must output a configuration suitable for f to be applied to it.
- In BayesianMinimization[f,{conf1,conf2,…}->nsampler], nsampler[conf] must output a configuration.
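The three domain forms above can be sketched on a hypothetical one-dimensional objective (the function, bounds, and step size here are illustrative, not from the documentation):

```wolfram
(* hypothetical objective with a minimum near x = 1 *)
f = Function[x, (x - 1)^2 + Sin[5 x]];

(* domain as an explicit list of configurations *)
BayesianMinimization[f, RandomReal[{-3, 3}, 50]]

(* domain as a sampler: sampler[] must return a configuration *)
BayesianMinimization[f, RandomReal[{-3, 3}] &]

(* domain as initial configurations plus a neighborhood generator:
   nsampler[conf] must return a new configuration derived from conf *)
BayesianMinimization[f, {-2., 0., 2.} -> (# + RandomReal[{-0.5, 0.5}] &)]
```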
- BayesianMinimization takes the following options:
    AssumeDeterministic       False               whether to assume that f is deterministic
    InitialEvaluationHistory  None                initial set of configurations and values
    MaxIterations             100                 maximum number of iterations
    Method                    Automatic           method used to determine configurations to evaluate
    ProgressReporting         $ProgressReporting  how to report progress
    RandomSeeding             1234                what seeding of pseudorandom generators should be done internally
- Possible settings for Method include:
    Automatic                    pick the method automatically
    "MaxExpectedImprovement"     maximize expected improvement over current best value
    "MaxImprovementProbability"  maximize improvement probability over current best value
- Possible settings for RandomSeeding include:
    Automatic  automatically reseed every time the function is called
    Inherited  use externally seeded random numbers
    seed       use an explicit integer or string as a seed
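Taken together, the options might be combined as in this illustrative call (the objective and the option values are hypothetical):

```wolfram
BayesianMinimization[
  Function[x, (x - 2)^2],        (* deterministic objective *)
  RandomReal[{-5, 5}] &,         (* sampler-defined domain *)
  AssumeDeterministic -> True,   (* f is noise-free *)
  MaxIterations -> 50,           (* cap the number of evaluations of f *)
  Method -> "MaxExpectedImprovement",
  RandomSeeding -> 42]           (* reproducible runs *)
```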
Examples
Basic Examples (3)
Summary of the most common use cases
Minimize a function over an interval:

https://wolfram.com/xid/0ixi7f7wvbpgx32-yafonk

Use the resulting BayesianMinimizationObject[…] to get the estimated minimum configuration:

https://wolfram.com/xid/0ixi7f7wvbpgx32-pz88u4

Get the estimated minimum function value:

https://wolfram.com/xid/0ixi7f7wvbpgx32-wwg70u

Minimize a function over a set of configurations:

https://wolfram.com/xid/0ixi7f7wvbpgx32-6zcz9u

Get the minimum configuration over the set:

https://wolfram.com/xid/0ixi7f7wvbpgx32-uz6dpb
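As a sketch of this use case, one might minimize string length over a small candidate set (the candidate strings are invented for illustration):

```wolfram
obj = BayesianMinimization[StringLength, {"apple", "banana", "kiwi", "clementine"}];
obj["MinimumConfiguration"]  (* expected to be the shortest candidate *)
```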

Minimize a function over a domain defined by a random generator:

https://wolfram.com/xid/0ixi7f7wvbpgx32-zdwoqt

Get the estimated minimum value:

https://wolfram.com/xid/0ixi7f7wvbpgx32-h4xnun
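A minimal sketch of the generator-based form, with a hypothetical objective:

```wolfram
(* sampler[] returns a random configuration from the domain of interest *)
obj = BayesianMinimization[#^2 + 3 Sin[#] &, RandomReal[{-4, 4}] &];
obj["MinimumValue"]  (* estimated minimum of the modeled function *)
```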

Scope (3)
Survey of the scope of standard use cases
Minimize a function over a region:

https://wolfram.com/xid/0ixi7f7wvbpgx32-crxktu

https://wolfram.com/xid/0ixi7f7wvbpgx32-dqz944

Get the list of available properties to query:

https://wolfram.com/xid/0ixi7f7wvbpgx32-m05hl3

Get the history of evaluations:

https://wolfram.com/xid/0ixi7f7wvbpgx32-4jx2ii

Get information about the method used to determine the configurations to explore:

https://wolfram.com/xid/0ixi7f7wvbpgx32-b6mbml

Get the current probabilistic model of the function (this is a PredictorFunction):

https://wolfram.com/xid/0ixi7f7wvbpgx32-dn9cnp

Find the best configuration to explore if the minimization were continued:

https://wolfram.com/xid/0ixi7f7wvbpgx32-nbree6

Find a list of properties simultaneously:

https://wolfram.com/xid/0ixi7f7wvbpgx32-21tcfr
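Assuming the result object accepts a list of property names, as many Wolfram Language result objects do, a query for several properties at once might look like:

```wolfram
obj = BayesianMinimization[Function[x, (x - 1)^2], RandomReal[{-3, 3}] &];
obj[{"MinimumConfiguration", "MinimumValue", "NextConfiguration"}]
```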

Visualize how well the function is modeled, particularly near the minimum:

https://wolfram.com/xid/0ixi7f7wvbpgx32-xmvbai

Minimize a function with initial configurations over a domain defined by a random neighborhood configuration generator:

https://wolfram.com/xid/0ixi7f7wvbpgx32-beev2s

https://wolfram.com/xid/0ixi7f7wvbpgx32-indum1
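A sketch of this form in two dimensions, with an invented quadratic objective and a small random-walk neighborhood generator:

```wolfram
(* objective over 2D configurations, minimum at {1, 2} *)
f = Function[p, Norm[p - {1, 2}]^2];

(* start from two corners; each step perturbs the previous configuration *)
obj = BayesianMinimization[f,
  {{0., 0.}, {3., 3.}} -> (# + RandomReal[{-0.3, 0.3}, 2] &)];
obj["PredictorFunction"]
```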

Get the model of the function:

https://wolfram.com/xid/0ixi7f7wvbpgx32-19buy9

Visualize the model's performance near the minimum:

https://wolfram.com/xid/0ixi7f7wvbpgx32-9ec9b9

Define a function that takes an image and computes the negative of the probability, returned by ImageIdentify, that the image shows a given entity, with the domain defined by a random generator over a corpus of images:

https://wolfram.com/xid/0ixi7f7wvbpgx32-lz0xfs

https://wolfram.com/xid/0ixi7f7wvbpgx32-namz5e

Get the minimum configuration:

https://wolfram.com/xid/0ixi7f7wvbpgx32-50yapp


https://wolfram.com/xid/0ixi7f7wvbpgx32-tm5jl2

Options (4)
Common values & functionality for each option
AssumeDeterministic (1)
Minimize a function over a domain defined by a random generator:

https://wolfram.com/xid/0ixi7f7wvbpgx32-6yo200

https://wolfram.com/xid/0ixi7f7wvbpgx32-pr37wk

The function is assumed to be stochastic; the value from the probabilistic model will differ in general from the function value for evaluated configurations:

https://wolfram.com/xid/0ixi7f7wvbpgx32-z6eirg


https://wolfram.com/xid/0ixi7f7wvbpgx32-ija59a

Include the information that the function is deterministic, i.e. noise-free:

https://wolfram.com/xid/0ixi7f7wvbpgx32-y3xsjp

For a deterministic function, the model value and the function value for evaluated configurations agree to good precision:

https://wolfram.com/xid/0ixi7f7wvbpgx32-dy9h5s


https://wolfram.com/xid/0ixi7f7wvbpgx32-0ji5cq

InitialEvaluationHistory (1)
Minimize a function over a disk region with a small number of iterations:

https://wolfram.com/xid/0ixi7f7wvbpgx32-b2xr57

https://wolfram.com/xid/0ixi7f7wvbpgx32-5hqj2m

Use the information from this evaluation in the next run:

https://wolfram.com/xid/0ixi7f7wvbpgx32-ejlq10


https://wolfram.com/xid/0ixi7f7wvbpgx32-vl667a

Get the estimated minimum configuration now:

https://wolfram.com/xid/0ixi7f7wvbpgx32-p0bco2
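One plausible way to chain runs as this section describes, assuming the "EvaluationHistory" property can be passed back via InitialEvaluationHistory:

```wolfram
f = Function[x, (x - 2)^2];
obj1 = BayesianMinimization[f, RandomReal[{-5, 5}] &, MaxIterations -> 10];

(* reuse the evaluations already paid for in a longer follow-up run *)
obj2 = BayesianMinimization[f, RandomReal[{-5, 5}] &,
  InitialEvaluationHistory -> obj1["EvaluationHistory"]];
obj2["MinimumConfiguration"]
```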

MaxIterations (1)
Minimize a function with a domain defined by a random generator:

https://wolfram.com/xid/0ixi7f7wvbpgx32-bqpg5k

https://wolfram.com/xid/0ixi7f7wvbpgx32-o68e0b

Get the number of function evaluations:

https://wolfram.com/xid/0ixi7f7wvbpgx32-m1n6ee

Specify the maximum number of iterations:

https://wolfram.com/xid/0ixi7f7wvbpgx32-r6xxom


https://wolfram.com/xid/0ixi7f7wvbpgx32-2pd5jj

Applications (2)
Sample problems that can be solved with this function
Define a training set to train predictor functions using the Predict function and a test set to measure their performance:

https://wolfram.com/xid/0ixi7f7wvbpgx32-1kuz80
Create a "loss" function to test the performance of different methods in Predict:

https://wolfram.com/xid/0ixi7f7wvbpgx32-5ixsxo
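A self-contained sketch of such a loss function, using synthetic data and real Predict method names (the dataset and the choice of "StandardDeviation" as the score are assumptions):

```wolfram
(* synthetic regression data: y = 2x + noise *)
train = Table[x -> 2 x + RandomReal[{-0.5, 0.5}], {x, RandomReal[{0, 10}, 200]}];
test  = Table[x -> 2 x + RandomReal[{-0.5, 0.5}], {x, RandomReal[{0, 10}, 50]}];

(* loss: test-set standard deviation of a predictor trained with a given method *)
loss = Function[method,
   PredictorMeasurements[Predict[train, Method -> method], test, "StandardDeviation"]];

BayesianMinimization[loss, {"LinearRegression", "NearestNeighbors", "RandomForest"}]
```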
Minimize the loss function over a domain defined by a list of different methods for Predict:

https://wolfram.com/xid/0ixi7f7wvbpgx32-sjwhr1

Examine the evaluation history:

https://wolfram.com/xid/0ixi7f7wvbpgx32-quyr5r

Find the best configuration to explore next:

https://wolfram.com/xid/0ixi7f7wvbpgx32-0czb4e

Load Fisher's Iris dataset and divide it into a training set and a validation set:

https://wolfram.com/xid/0ixi7f7wvbpgx32-fr84u5
Create "black-box" functions: here, the loss (negative log-likelihood) functions for two different methods used in the Classify function. The arguments of these functions are known as hyperparameters.
Train a logistic regression classifier with two hyperparameters, the L1 and L2 regularization coefficients:

https://wolfram.com/xid/0ixi7f7wvbpgx32-t140d
Minimize the loss function for the logistic regression classifier over a domain defined by a rectangular region in the logarithm of the hyperparameters:

https://wolfram.com/xid/0ixi7f7wvbpgx32-7jcyzv
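An illustrative version of this setup with synthetic two-class data; the Method suboption names "L1Regularization" and "L2Regularization" are assumptions, and the search rectangle is taken in log10 of the regularization coefficients:

```wolfram
(* synthetic, noisily linearly separable data *)
data = Table[With[{x = RandomReal[{-2, 2}], y = RandomReal[{-2, 2}]},
    {x, y} -> If[x + y + RandomReal[{-0.3, 0.3}] > 0, "A", "B"]], {300}];
{train, validation} = TakeDrop[data, 200];

(* loss: negative log-likelihood on the validation set *)
loss[{logl1_, logl2_}] := -ClassifierMeasurements[
   Classify[train, Method -> {"LogisticRegression",
      "L1Regularization" -> 10^logl1, "L2Regularization" -> 10^logl2}],
   validation, "LogLikelihood"];

BayesianMinimization[loss, Rectangle[{-3, -3}, {1, 1}]]
```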

Get the model of the function:

https://wolfram.com/xid/0ixi7f7wvbpgx32-cmt4aa

Visualize the model of the loss function of the classifier together with the estimated minimum:

https://wolfram.com/xid/0ixi7f7wvbpgx32-f3mjc5


https://wolfram.com/xid/0ixi7f7wvbpgx32-uiwp10

Now train a support vector machine (SVM) classifier with two hyperparameters, the soft margin parameter and the gamma scaling parameter:

https://wolfram.com/xid/0ixi7f7wvbpgx32-jpmc35
Minimize the loss function for the SVM classifier over a domain defined by a rectangular region in the logarithm of the hyperparameters:

https://wolfram.com/xid/0ixi7f7wvbpgx32-6enhy6

Get the model of the function:

https://wolfram.com/xid/0ixi7f7wvbpgx32-5l8azx

Visualize the model of the loss function together with the estimated minimum:

https://wolfram.com/xid/0ixi7f7wvbpgx32-78mjwf


https://wolfram.com/xid/0ixi7f7wvbpgx32-da2yjm

Possible Issues (2)
Common pitfalls and unexpected behavior
When the domain of the objective function is defined by an initial configuration set and a neighborhood configuration generator, the results depend on the quality of the generator provided.
Minimize a function where the domain is defined as above:

https://wolfram.com/xid/0ixi7f7wvbpgx32-cwx39c

https://wolfram.com/xid/0ixi7f7wvbpgx32-6u3b3m

Get the model of the function:

https://wolfram.com/xid/0ixi7f7wvbpgx32-d1lalk

Since the initial configurations are "far" from the global minimum and the generator takes relatively small steps, the algorithm can converge to a local minimum:

https://wolfram.com/xid/0ixi7f7wvbpgx32-ij5d68

If the neighborhood configuration generator takes steps that are "too large" or "too small", this could lead to problems:

https://wolfram.com/xid/0ixi7f7wvbpgx32-jajzx2

In this case, at each step the generator gives a configuration value that is the square of the previous one, so the values quickly become extremely large and the probabilistic model does not work properly.
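The failure mode described above can be reproduced with a generator that squares its input (the objective and starting points are invented for illustration):

```wolfram
(* each step squares the previous configuration, so values quickly explode *)
BayesianMinimization[Function[x, (x - 3)^2],
  {2., 4.} -> (#^2 &)]
```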
Text
Wolfram Research (2016), BayesianMinimization, Wolfram Language function, https://reference.wolfram.com/language/ref/BayesianMinimization.html (updated 2017).
CMS
Wolfram Language. 2016. "BayesianMinimization." Wolfram Language & System Documentation Center. Wolfram Research. Last Modified 2017. https://reference.wolfram.com/language/ref/BayesianMinimization.html.
APA
Wolfram Language. (2016). BayesianMinimization. Wolfram Language & System Documentation Center. Retrieved from https://reference.wolfram.com/language/ref/BayesianMinimization.html
BibTeX
@misc{reference.wolfram_2025_bayesianminimization, author="Wolfram Research", title="{BayesianMinimization}", year="2017", howpublished="\url{https://reference.wolfram.com/language/ref/BayesianMinimization.html}", note="[Accessed: 23-May-2025]"}
BibLaTeX
@online{reference.wolfram_2025_bayesianminimization, organization={Wolfram Research}, title={BayesianMinimization}, year={2017}, url={https://reference.wolfram.com/language/ref/BayesianMinimization.html}, note={[Accessed: 23-May-2025]}}