ClusterClassify
ClusterClassify[data]
generates a ClassifierFunction[…] by partitioning data into clusters of similar elements.
Details and Options




- ClusterClassify works for a variety of data types, including numerical, textual, and image, as well as dates and times and combinations of these.
- The number of clusters can be specified in the following ways:
    Automatic    find the number of clusters automatically
    n            find exactly n clusters
    UpTo[n]      find at most n clusters
- The following options can be given:
    CriterionFunction       Automatic   criterion for selecting a method
    DistanceFunction        Automatic   the distance function to use
    FeatureExtractor        Identity    how to extract features from which to learn
    FeatureNames            Automatic   feature names to assign for input data
    FeatureTypes            Automatic   feature types to assume for input data
    Method                  Automatic   what method to use
    MissingValueSynthesis   Automatic   how to synthesize missing values
    PerformanceGoal         Automatic   aspect of performance to optimize
    RandomSeeding           1234        what seeding of pseudorandom generators should be done internally
    Weights                 Automatic   what weight to give to each example
- By default, ClusterClassify will preprocess the data automatically unless a DistanceFunction is specified.
- The setting for DistanceFunction can be any distance or dissimilarity function, or a function f defining a distance between two values.
- Possible settings for PerformanceGoal include:
    Automatic         automatic tradeoff among speed, accuracy, and memory
    "Memory"          minimize the storage requirements of the classifier
    "Quality"         maximize the accuracy of the classifier
    "Speed"           maximize the speed of the classifier
    "TrainingSpeed"   minimize the time spent producing the classifier
- Possible settings for Method include:
    Automatic                   automatically select a method
    "Agglomerate"               single linkage clustering algorithm
    "DBSCAN"                    density-based spatial clustering of applications with noise
    "GaussianMixture"           variational Gaussian mixture algorithm
    "JarvisPatrick"             Jarvis–Patrick clustering algorithm
    "KMeans"                    k-means clustering algorithm
    "KMedoids"                  partitioning around medoids
    "MeanShift"                 mean-shift clustering algorithm
    "NeighborhoodContraction"   shift data points toward high-density regions
    "SpanningTree"              minimum spanning tree-based clustering algorithm
    "Spectral"                  spectral clustering algorithm
- The methods "KMeans" and "KMedoids" can only be used when the number of clusters is specified.
- The methods "DBSCAN", "GaussianMixture", "JarvisPatrick", "MeanShift" and "NeighborhoodContraction" can only be used when the number of clusters is Automatic.
- The following plots show results of common methods on toy datasets:
- Possible settings for CriterionFunction include:
    "StandardDeviation"   root-mean-square standard deviation
    "RSquared"            R-squared
    "Dunn"                Dunn index
    "CalinskiHarabasz"    Calinski–Harabasz index
    "DaviesBouldin"       Davies–Bouldin index
    "Silhouette"          Silhouette score
    Automatic             internal index
- Possible settings for RandomSeeding include:
    Automatic   automatically reseed every time the function is called
    Inherited   use externally seeded random numbers
    seed        use an explicit integer or string as a seed
- ClusterClassify[…,FeatureExtractor->"Minimal"] indicates that the internal preprocessing should be as simple as possible.
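
The cluster-count specifications and the Method constraints listed above can be sketched as follows. The data is randomly generated for illustration only, and the variable names are hypothetical:

```wolfram
(* Illustrative data: two well-separated groups of 2D points (made up for this sketch) *)
SeedRandom[1];
data = Join[
   RandomVariate[NormalDistribution[0, 1], {100, 2}],
   RandomVariate[NormalDistribution[6, 1], {100, 2}]];

(* Let ClusterClassify choose the number of clusters *)
cAuto = ClusterClassify[data];

(* Request exactly 2 clusters; "KMeans" needs an explicit number of clusters *)
cTwo = ClusterClassify[data, 2, Method -> "KMeans"];

(* Request at most 4 clusters *)
cUpTo = ClusterClassify[data, UpTo[4]];

(* "DBSCAN" can only be used when the number of clusters is Automatic *)
cDensity = ClusterClassify[data, Method -> "DBSCAN"];

(* Each result is a ClassifierFunction that maps a point to a cluster label *)
cTwo[{5.8, 6.1}]
```

Which clusters are found with Automatic or UpTo[4] depends on the data and on the method that is selected internally.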

Examples
Basic Examples (3)
Train the ClassifierFunction on some numerical data:


Use the classifier function to classify a new unlabeled example:


Obtain classification probabilities for this example:
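
A minimal sketch of these three steps (training, classifying, and requesting probabilities), using made-up one-dimensional values rather than the data of the original example:

```wolfram
(* Hypothetical 1D training data forming two groups *)
data = {-4.1, -3.6, -3.2, -2.9, 2.8, 3.1, 3.5, 4.2};

(* Train a classifier that partitions the data into 2 clusters *)
c = ClusterClassify[data, 2];

(* Assign a cluster label to a new, unlabeled value *)
c[3.3]

(* Probabilities of the new value belonging to each cluster *)
c[3.3, "Probabilities"]
```

The probabilities are returned as an association from cluster label to probability.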


Plot the probabilities for the two different classes in the interval {-5,5}:


Train the ClassifierFunction on some colors by requiring the number of classes to be 5:


Use the ClassifierFunction on some unlabeled data:


Gather the elements by their class number:
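
The color example can be sketched along these lines, assuming randomly generated colors in place of the original inputs:

```wolfram
(* Assumption: random colors stand in for the colors of the original example *)
SeedRandom[2];
colors = RandomColor[100];

(* Train with the number of classes fixed to 5 *)
c = ClusterClassify[colors, 5];

(* Apply the classifier to new, unlabeled colors *)
newColors = RandomColor[20];
c[newColors]

(* Group the new colors by the class number assigned to them *)
GatherBy[newColors, c]
```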


Train the ClassifierFunction on some strings:


Gather the elements by their class number:


Scope (11)


Use the classifier to assign clusters to a new Boolean vector of True and False values:


Use the classifier to assign clusters to a Boolean vector of 1 and 0 values:


Use the classifier to cluster new images:


Use the classifier to cluster new strings:


Use the classifier to cluster the data:


Look at the classifier information:


Get a description for the specific method used:


Generate random points in the plane and visualize them:


Classify new random points in the plane:

Visualize the resulting clustering:


Classify the same test data using IndeterminateThreshold:

Visualize the resulting clustering including the Indeterminate cluster:
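
The sequence above (classify new points, then classify again with IndeterminateThreshold) might be sketched as follows. The point sets, the threshold value 0.9, and the helper plotClusters are assumptions made for this illustration:

```wolfram
(* Assumption: uniformly random points stand in for the original data *)
SeedRandom[3];
train = RandomReal[{-1, 1}, {300, 2}];
c = ClusterClassify[train, 3];
test = RandomReal[{-1, 1}, {500, 2}];

(* Hypothetical helper: group points by label and plot one series per label *)
plotClusters[points_, labels_] := Module[{groups},
   groups = GroupBy[Thread[points -> labels], Last -> First];
   ListPlot[Values[groups], PlotLegends -> Keys[groups], AspectRatio -> 1]];

(* Every test point receives one of the cluster labels *)
plotClusters[test, c[test]]

(* Points whose highest class probability is below the threshold become Indeterminate *)
labels = c[test, IndeterminateThreshold -> 0.9];
plotClusters[test, labels]
```

How many points end up as Indeterminate depends on the threshold and on how sharply the clusters are separated.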


Options (10)
CriterionFunction (1)
Generate some separated data and visualize it:


Construct a classifier function using the Automatic CriterionFunction:


Construct a classifier function using the Calinski–Harabasz index as CriterionFunction:


Compare the two clusterings of the data:
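
A minimal sketch of this comparison, assuming three synthetic Gaussian blobs as the separated data:

```wolfram
(* Assumption: three well-separated blobs generated for illustration *)
SeedRandom[4];
data = Join[
   RandomVariate[MultinormalDistribution[{0, 0}, IdentityMatrix[2]], 100],
   RandomVariate[MultinormalDistribution[{5, 5}, IdentityMatrix[2]], 100],
   RandomVariate[MultinormalDistribution[{0, 6}, IdentityMatrix[2]], 100]];
ListPlot[data, AspectRatio -> 1]

(* Classifier using the Automatic criterion *)
cAuto = ClusterClassify[data, CriterionFunction -> Automatic];

(* Classifier using the Calinski-Harabasz index as the criterion *)
cCH = ClusterClassify[data, CriterionFunction -> "CalinskiHarabasz"];

(* Compare how many clusters each classifier uses on the training data *)
{Length[Union[cAuto[data]]], Length[Union[cCH[data]]]}
```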


FeatureExtractor (1)
Create a ClassifierFunction from a list of images and classify new examples:


Create a custom FeatureExtractor to extract features:


FeatureNames (1)
Generate a classifier function and give a name to each feature:


Use the association format to assign a cluster to a new example:


The list format can still be used:
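
The FeatureNames workflow can be sketched as below. The data values and the feature names "width" and "height" are hypothetical:

```wolfram
(* Hypothetical two-feature data forming two groups *)
data = {{1.5, 10.}, {1.7, 12.}, {1.6, 11.}, {4.8, 30.}, {5.1, 28.}, {5.0, 31.}};

(* Name the two features at training time *)
c = ClusterClassify[data, 2, FeatureNames -> {"width", "height"}];

(* Classify a new example given as an association keyed by feature name *)
c[<|"width" -> 1.6, "height" -> 11.5|>]

(* The plain list format still works *)
c[{1.6, 11.5}]
```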


FeatureTypes (1)
Generate a classifier function assuming numerical and nominal feature types:


Generate a classifier function assuming nominal feature types instead:


Compare the result on new examples:
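
A sketch of the two FeatureTypes settings, using a small made-up dataset of number and string pairs:

```wolfram
(* Hypothetical mixed data: an integer feature and a string feature *)
data = {{1, "a"}, {2, "a"}, {3, "a"}, {1, "b"}, {2, "b"}, {3, "b"}, {1, "a"}, {3, "b"}};

(* Treat the first feature as numerical and the second as nominal *)
c1 = ClusterClassify[data, 2, FeatureTypes -> {"Numerical", "Nominal"}];

(* Treat both features as nominal instead *)
c2 = ClusterClassify[data, 2, FeatureTypes -> {"Nominal", "Nominal"}];

(* Compare the cluster assignments on new examples *)
test = {{1, "b"}, {3, "a"}};
{c1[test], c2[test]}
```

With the nominal setting, the integers 1, 2, and 3 are treated as unordered categories, so the two classifiers may group the same examples differently.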


Method (2)
Generate some data using uniform distributions:


Use Information to obtain a method description:


Classify the data using k-means:
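
As a minimal sketch, assuming two uniformly distributed square patches as the data, the k-means step could look like this:

```wolfram
(* Assumption: two uniform patches of points generated for illustration *)
SeedRandom[5];
data = Join[
   RandomReal[{0, 1}, {200, 2}],
   RandomReal[{2, 3}, {200, 2}]];

(* "KMeans" requires the number of clusters to be given *)
c = ClusterClassify[data, 2, Method -> "KMeans"];

(* Label the data and plot one series per cluster *)
labels = c[data];
ListPlot[Values[GroupBy[Thread[data -> labels], Last -> First]], AspectRatio -> 1]
```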


Generate a large dataset using multinormal distributions and visualize it:


Use ClusterClassify to find clusters by specifying the method to use and look at the AbsoluteTiming:


Look at the resulting clustering:


Use ClusterClassify to find clusters without specifying the method to use and look at the AbsoluteTiming:


MissingValueSynthesis (1)
Generate a large dataset using multinormal distributions and visualize it:


Use ClusterClassify to find clusters:


Get the top cluster probabilities for a point with missing data:


Set the missing value synthesis to replace each missing variable with its estimated most likely value given known values (which is the default behavior):


Replace missing variables with random samples conditioned on known values:


Get the distribution of likely clusters for the point by replacing missing variables repeatedly with the random sampling strategy:


PerformanceGoal (1)
Generate a uniformly distributed dataset and visualize it:


Obtain a classifier from this data, with an emphasis on training speed:


Assign clusters to some randomly generated data and look at the AbsoluteTiming:


Obtain a classifier from this data, with an emphasis on classification speed:


Assign clusters to some randomly generated data and look at the AbsoluteTiming compared to the one above:
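
The timing comparison between the two PerformanceGoal settings can be sketched as follows. The dataset sizes and the choice of 4 clusters are assumptions for this illustration:

```wolfram
(* Assumption: a uniformly distributed dataset generated for illustration *)
SeedRandom[6];
data = RandomReal[{-1, 1}, {2000, 2}];

(* Emphasize short training time *)
cTrain = ClusterClassify[data, 4, PerformanceGoal -> "TrainingSpeed"];

(* Emphasize fast classification of new examples *)
cFast = ClusterClassify[data, 4, PerformanceGoal -> "Speed"];

(* Time both classifiers on the same randomly generated test data *)
test = RandomReal[{-1, 1}, {5000, 2}];
tTrain = First[AbsoluteTiming[cTrain[test];]];
tFast = First[AbsoluteTiming[cFast[test];]];
{tTrain, tFast}
```

Timings vary between machines and runs, so the two numbers are best read as a rough comparison.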


Visualize the two clusterings for the test data and note how the setting "TrainingSpeed" gives better results:


RandomSeeding (1)
Train several classifiers on random colors:

Compute the classifiers on a new color and observe that the result is always the same:


Train several classifiers on the same colors by using different values of the RandomSeeding option:

Apply the classifiers to a color and observe how the results differ:
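
A sketch of the RandomSeeding behavior, assuming random training colors and Red as the test color:

```wolfram
(* Assumption: random colors stand in for the original training data *)
SeedRandom[7];
colors = RandomColor[50];

(* With the default RandomSeeding -> 1234, repeated training is reproducible *)
same = Table[ClusterClassify[colors, 3], 3];
#[Red] & /@ same

(* Different seeds may lead to different clusterings of the same colors *)
seeded = Table[ClusterClassify[colors, 3, RandomSeeding -> s], {s, {1, 2, 3}}];
#[Red] & /@ seeded
```

Because cluster labels are arbitrary integers, two seeds can produce the same partition with different label numbers.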


Weights (1)
Generate some separated data containing outliers:


Use the classifier function to classify the outlier together with another point:


Cluster the data, placing a large weight on the outlier:


Use the classifier function to classify the same points:
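
The Weights example might be sketched like this. The data, the outlier position {20., 20.}, and the weight 1000 are made up for illustration:

```wolfram
(* Assumption: two Gaussian groups plus a single distant outlier *)
SeedRandom[8];
outlier = {20., 20.};
data = Join[
   RandomVariate[NormalDistribution[0, 1], {50, 2}],
   RandomVariate[NormalDistribution[5, 1], {50, 2}],
   {outlier}];

(* Unweighted clustering, then classify the outlier and a regular point *)
c = ClusterClassify[data, 2];
c[{outlier, {5., 5.}}]

(* Give the outlier a much larger weight than every other example *)
weights = Append[ConstantArray[1, 100], 1000];
cw = ClusterClassify[data, 2, Weights -> weights];

(* Classify the same two points with the weighted classifier *)
cw[{outlier, {5., 5.}}]
```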


Applications (3)
Train several classifiers on a small, uniformly distributed dataset:

Divide a triangle into segments by using the classifiers on a large number of uniformly distributed random points:


Generate some normally distributed data:


Cluster the data without specifying the number of classes:


Cluster the data, specifying the number of classes:


Find dominant colors in an image:

Cluster the data given by the array of pixel values of the image:

Use the classifier to assign clusters to each pixel:

Use the classifier function to find four dominant colors:


Use the classifier to get binary masks for each dominant color:
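
The dominant-color steps above could be sketched as follows. The test image, the downsizing to 100 pixels wide, and the use of "KMeans" are assumptions of this sketch and not necessarily what the original example used:

```wolfram
(* Assumption: a standard test image, downsized to keep clustering fast *)
img = ImageResize[ColorConvert[ExampleData[{"TestImage", "House"}], "RGB"], 100];
data = ImageData[img];                     (* rows x cols x 3 array of channel values *)
{rows, cols} = Take[Dimensions[data], 2];
pixels = Flatten[data, 1];                 (* flat list of {r, g, b} triples *)

(* Cluster the pixel values into 4 groups and label every pixel *)
c = ClusterClassify[pixels, 4, Method -> "KMeans"];
labels = c[pixels];

(* A dominant color is taken here as the mean color of each cluster *)
groups = GroupBy[Thread[pixels -> labels], Last -> First];
dominant = RGBColor @@@ (Mean /@ Values[groups])

(* One binary mask per cluster: 1 where the pixel belongs to that cluster *)
masks = Table[
   Image[ArrayReshape[Boole[(# === k &) /@ labels], {rows, cols}]],
   {k, Keys[groups]}]
```

Taking the cluster mean as the dominant color is one reasonable choice; a median or the most frequent quantized color would also work.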


Wolfram Research (2016), ClusterClassify, Wolfram Language function, https://reference.wolfram.com/language/ref/ClusterClassify.html (updated 2020).