MissingValueSynthesis

MissingValueSynthesis is an option for functions such as Classify that specifies how missing values should be replaced.

Details

  • Missing value synthesis, also known as missing value imputation, is done by conditioning a distribution on known values, as in SynthesizeMissingValues.
  • Missing values are typically represented by Missing[].
  • MissingValueSynthesis can be used at training time, inference time or to update the synthesizer of an existing model.
  • Classify[data, ..., MissingValueSynthesis -> synth] can be used to specify a missing value synthesis method or model for training (and similarly for other training functions).
  • ClassifierFunction[...][example, MissingValueSynthesis -> synth] can be used to temporarily override the synthesis method during classifier inference (and similarly for other machine learning models).
  • Classify[ClassifierFunction[...], MissingValueSynthesis -> synth] can be used to overwrite the internal missing value synthesizer of the classifier (and similarly for other machine learning models).
  • Possible settings for MissingValueSynthesis include:
    Automatic    automatically choose distribution method and synthesis strategy
    None         do not use any missing synthesizer
    method       use the specified method
    strategy     how to synthesize from the distribution
    assoc        specify both distribution method and synthesis strategy
  • Possible settings for method include:
    Automatic                    automatically choose the distribution method
    "Multinormal"                use a multivariate normal (Gaussian) distribution
    "ContingencyTable"           discretize data and store each possible probability
    "KernelDensityEstimation"    use a kernel mixture distribution
    "DecisionTree"               use a decision tree to compute probabilities
    "GaussianMixture"            use a mixture of Gaussian (normal) distributions
    LearnedDistribution[...]     use the specified distribution
  • Possible settings for strategy include:
    Automatic           automatically choose the synthesis strategy
    "RandomSampling"    randomly sample from the conditioned distribution
    "ModeFinding"       attempt to find the mode of the conditioned distribution
  • In the form Method -> assoc, the association assoc should be of the form <|"LearningMethod" -> method, "EvaluationStrategy" -> strategy|>.
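
The three usage patterns described in the details above (training time, inference time, and updating an existing model) can be sketched as follows. This is a minimal illustration, not the page's original example: the training data and the names trainingData, cf and cf2 are assumptions introduced here, while the method and strategy names are the documented settings.

```wolfram
(* Illustrative data (assumed): two numeric features -> class label *)
trainingData = {{1.2, 2.1} -> "A", {0.5, 1.1} -> "A", {1.0, 1.8} -> "A",
   {1.4, 2.6} -> "A", {2.3, 4.5} -> "B", {3.1, 5.2} -> "B",
   {2.8, 4.9} -> "B", {2.0, 4.0} -> "B"};

(* 1. Training time: choose how the feature distribution is learned *)
cf = Classify[trainingData, MissingValueSynthesis -> "Multinormal"];

(* 2. Inference time: override the synthesis strategy for this call only *)
cf[{1.5, Missing[]}, MissingValueSynthesis -> "RandomSampling"]

(* 3. Update: build a classifier whose stored synthesizer is replaced *)
cf2 = Classify[cf, MissingValueSynthesis -> "KernelDensityEstimation"];
cf2[{1.5, Missing[]}]
```

The second form leaves cf unchanged, so later calls without the option fall back to the synthesizer chosen at training.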

Examples

Basic Examples  (2)

Train a predictor with two input features:

Get the prediction for an example that has a missing value:

Set the missing value synthesis to replace missing variables with their most likely value given known values (which is the default behavior):

Replace missing variables with random samples conditioned on known values:

Averaging over many random imputations is usually the best strategy and also gives an estimate of the uncertainty caused by the imputation:
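
The input cells for the steps above are not preserved in this text. A sketch of what they might look like is given below, assuming a small made-up training set; trainingData, pf and samples are illustrative names, and the numeric values are not from the original page.

```wolfram
(* Illustrative training set (assumed): two numeric features -> numeric target *)
trainingData = {{1.2, 2.1} -> 1.3, {2.3, 4.5} -> 4.5, {1.2, 3.4} -> 5.6,
   {2.0, 3.0} -> 3.2, {3.1, 5.2} -> 6.0, {0.5, 1.1} -> 0.9};

(* Train a predictor with two input features *)
pf = Predict[trainingData];

(* Prediction for an example whose second feature is missing *)
pf[{1.5, Missing[]}]

(* Replace the missing value with its most likely value given the known one *)
pf[{1.5, Missing[]}, MissingValueSynthesis -> "ModeFinding"]

(* Replace the missing value with a random sample conditioned on the known one *)
pf[{1.5, Missing[]}, MissingValueSynthesis -> "RandomSampling"]

(* Average over 100 random imputations; the spread reflects imputation uncertainty *)
samples = Table[
   pf[{1.5, Missing[]}, MissingValueSynthesis -> "RandomSampling"], 100];
{Mean[samples], StandardDeviation[samples]}
```

Because "RandomSampling" draws a new value on each call, repeated evaluations generally return different predictions, which is what makes the standard deviation informative.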

Specify a learning method during training to control how the distribution of data is learned:

Predict an example with missing values using the "KernelDensityEstimation" distribution to condition values:
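
A possible form of these two steps, assuming the same kind of made-up training set (trainingData and pf are illustrative names): the learning method is fixed when the predictor is trained, and later evaluations condition that learned distribution on the known values.

```wolfram
(* Illustrative training set (assumed): two numeric features -> numeric target *)
trainingData = {{1.2, 2.1} -> 1.3, {2.3, 4.5} -> 4.5, {1.2, 3.4} -> 5.6,
   {2.0, 3.0} -> 3.2, {3.1, 5.2} -> 6.0, {0.5, 1.1} -> 0.9};

(* Learn the feature distribution with kernel density estimation at training *)
pf = Predict[trainingData, MissingValueSynthesis -> "KernelDensityEstimation"];

(* The missing feature is synthesized from the learned kernel mixture *)
pf[{1.5, Missing[]}]
```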

Provide an existing LearnedDistribution at training to use it when imputing missing values during training and later evaluations:

Specify an existing LearnedDistribution to synthesize missing values for an individual evaluation:
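
Both uses of an existing LearnedDistribution can be sketched as follows. The data and the names trainingData, ld, pf and pf2 are assumptions made for illustration; the distribution is learned over the two input features only, since those are the values that can be missing.

```wolfram
(* Illustrative training set (assumed): two numeric features -> numeric target *)
trainingData = {{1.2, 2.1} -> 1.3, {2.3, 4.5} -> 4.5, {1.2, 3.4} -> 5.6,
   {2.0, 3.0} -> 3.2, {3.1, 5.2} -> 6.0, {0.5, 1.1} -> 0.9};

(* Learn a distribution over the input features (left-hand sides of the rules) *)
ld = LearnDistribution[trainingData[[All, 1]]];

(* Use it for imputation during training and in later evaluations *)
pf = Predict[trainingData, MissingValueSynthesis -> ld];
pf[{1.5, Missing[]}]

(* Or supply it for one evaluation of a predictor trained with default settings *)
pf2 = Predict[trainingData];
pf2[{1.5, Missing[]}, MissingValueSynthesis -> ld]
```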

Control both the learning method and the evaluation strategy by passing an association at training:
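
A sketch of the association form, using the "LearningMethod" and "EvaluationStrategy" keys given in the details; the training data, the chosen method and strategy, and the names trainingData and pf are illustrative assumptions.

```wolfram
(* Illustrative training set (assumed): two numeric features -> numeric target *)
trainingData = {{1.2, 2.1} -> 1.3, {2.3, 4.5} -> 4.5, {1.2, 3.4} -> 5.6,
   {2.0, 3.0} -> 3.2, {3.1, 5.2} -> 6.0, {0.5, 1.1} -> 0.9};

(* Learn a multivariate normal and impute by sampling from its conditional *)
pf = Predict[trainingData,
   MissingValueSynthesis -> <|"LearningMethod" -> "Multinormal",
     "EvaluationStrategy" -> "RandomSampling"|>];

pf[{1.5, Missing[]}]
```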

Train a classifier with two input features:

Get class probabilities for an example that has a missing value:

Set the missing value synthesis to replace missing variables with their most likely value given known values (which is the default behavior):

Replace missing variables with random samples conditioned on known values:

Averaging over many random imputations is usually the best strategy and also gives an estimate of the uncertainty caused by the imputation:
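
The classifier steps above might look like the following sketch. The class labels, the feature values and the names trainingData, cf and probs are assumptions made for illustration; the original input cells are not preserved in this text.

```wolfram
(* Illustrative data (assumed): two numeric features -> class label *)
trainingData = {{1.2, 2.1} -> "A", {0.5, 1.1} -> "A", {1.0, 1.8} -> "A",
   {1.4, 2.6} -> "A", {2.3, 4.5} -> "B", {3.1, 5.2} -> "B",
   {2.8, 4.9} -> "B", {2.0, 4.0} -> "B"};

(* Train a classifier with two input features *)
cf = Classify[trainingData];

(* Class probabilities for an example whose second feature is missing *)
cf[{1.5, Missing[]}, "Probabilities"]

(* Impute the most likely value given the known feature *)
cf[{1.5, Missing[]}, "Probabilities", MissingValueSynthesis -> "ModeFinding"]

(* Impute a random sample conditioned on the known feature *)
cf[{1.5, Missing[]}, "Probabilities", MissingValueSynthesis -> "RandomSampling"]

(* Average class probabilities over 100 random imputations, with their spread *)
probs = Table[
   cf[{1.5, Missing[]}, "Probabilities",
    MissingValueSynthesis -> "RandomSampling"], 100];
{Merge[probs, Mean], Merge[probs, StandardDeviation]}
```

Merge combines the 100 probability associations key by key, so each class gets its own mean and standard deviation.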

Wolfram Research (2021), MissingValueSynthesis, Wolfram Language function, https://reference.wolfram.com/language/ref/MissingValueSynthesis.html.
