ClusteringComponents[array] gives an array in which each element at the lowest level of array is replaced by an integer index representing the cluster in which the element lies.
ClusteringComponents[array, n] finds at most n clusters.
ClusteringComponents[array, n, level] finds clusters at the specified level in array.
ClusteringComponents[image] finds clusters of pixels with similar values in image.
ClusteringComponents[image, n] finds at most n clusters in image.
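As a minimal sketch of the array forms above, assuming small made-up inputs (the names `data` and `points` are illustrative only; no outputs are shown because the labels depend on the automatically selected method):

```wolfram
(* Hypothetical sample data: two well-separated groups of numbers *)
data = {1, 2, 3, 10, 11, 12};

(* Let ClusteringComponents choose the number of clusters *)
ClusteringComponents[data]

(* Ask for at most two clusters *)
ClusteringComponents[data, 2]

(* Hypothetical 2D points; level 1 treats each row as one element to cluster *)
points = {{1, 2}, {1.2, 2.1}, {8, 9}, {8.1, 9.3}};
ClusteringComponents[points, 2, 1]
```

Without the level argument, each number in `points` is an element at the lowest level, so the result would be a matrix of indices rather than one index per row.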
Details and Options
- ClusteringComponents works for a variety of data types, including numerical, textual, and image, as well as dates and times.
- The following options can be given:
| Option | Default | Description |
| --- | --- | --- |
| CriterionFunction | Automatic | criterion for selecting a method |
| DistanceFunction | Automatic | the distance function to use |
| FeatureExtractor | Identity | how to extract features from which to learn |
| FeatureNames | Automatic | feature names to assign for input data |
| FeatureTypes | Automatic | feature types to assume for input data |
| Method | Automatic | what method to use |
| MissingValueSynthesis | Automatic | how to synthesize missing values |
| PerformanceGoal | Automatic | aspect of performance to optimize |
| RandomSeeding | 1234 | what seeding of pseudorandom generators should be done internally |
| Weights | Automatic | what weight to give to each example |
- By default, ClusteringComponents will preprocess the data automatically unless a DistanceFunction is specified.
- The setting for DistanceFunction can be any distance or dissimilarity function, or a function f defining a distance between two values.
- Possible settings for PerformanceGoal include:
| Setting | Description |
| --- | --- |
| Automatic | automatic tradeoff among speed, accuracy, and memory |
| "Quality" | maximize the accuracy of the clustering |
| "Speed" | maximize the speed of the clustering |
- Possible settings for Method include:
| Setting | Description |
| --- | --- |
| Automatic | automatically select a method |
| "Agglomerate" | single linkage clustering algorithm |
| "DBSCAN" | density-based spatial clustering of applications with noise |
| "NeighborhoodContraction" | shift data points toward high-density regions |
| "JarvisPatrick" | Jarvis–Patrick clustering algorithm |
| "KMeans" | k-means clustering algorithm |
| "MeanShift" | mean-shift clustering algorithm |
| "KMedoids" | partitioning around medoids |
| "SpanningTree" | minimum spanning tree-based clustering algorithm |
| "Spectral" | spectral clustering algorithm |
| "GaussianMixture" | variational Gaussian mixture algorithm |
- The methods "KMeans" and "KMedoids" can only be used when the number of clusters is specified.
- The following plots show results of common methods on toy datasets:
- Possible settings for CriterionFunction include:

| Setting | Description |
| --- | --- |
| "StandardDeviation" | root-mean-square standard deviation |
| "RSquared" | R-squared |
| "Dunn" | Dunn index |
| "CalinskiHarabasz" | Calinski–Harabasz index |
| "DaviesBouldin" | Davies–Bouldin index |
| Automatic | internal index |
- Possible settings for RandomSeeding include:
| Setting | Description |
| --- | --- |
| Automatic | automatically reseed every time the function is called |
| Inherited | use externally seeded random numbers |
| seed | use an explicit integer or string as a seed |
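The options described above can be combined with the positional arguments. The following is a sketch only, assuming a small hypothetical point set named `points` and clustering at level 1 so that each row is one example:

```wolfram
(* Hypothetical sample data: two groups of 2D points *)
points = {{1, 2}, {1.2, 2.1}, {0.9, 1.8}, {8, 9}, {8.1, 9.3}, {7.9, 9.1}};

(* "KMeans" requires the number of clusters, given here as 2 *)
ClusteringComponents[points, 2, 1, Method -> "KMeans"]

(* A built-in distance function *)
ClusteringComponents[points, 2, 1, DistanceFunction -> ManhattanDistance]

(* A pure function f defining the distance between two values *)
ClusteringComponents[points, 2, 1,
 DistanceFunction -> (Norm[#1 - #2, Infinity] &)]

(* Reseed the internal pseudorandom generators on every call *)
ClusteringComponents[points, 2, 1, Method -> "KMeans",
 RandomSeeding -> Automatic]
```

Because a DistanceFunction is specified in the second and third calls, the automatic preprocessing of the data is not applied there.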
Examples
Basic Examples (3)
Find a cluster assignment with exactly two clusters using different settings for CriterionFunction:
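A sketch of such a comparison, assuming a small hypothetical list of numbers named `data`; the two criterion names are taken from the CriterionFunction settings listed above:

```wolfram
(* Hypothetical sample data *)
data = {1, 2, 3, 10, 11, 12, 50, 52};

(* Two clusters, selecting the method by root-mean-square standard deviation *)
ClusteringComponents[data, 2, CriterionFunction -> "StandardDeviation"]

(* Two clusters, selecting the method by the Calinski-Harabasz index *)
ClusteringComponents[data, 2, CriterionFunction -> "CalinskiHarabasz"]
```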
Use FeatureNames to name features, and refer to their names in further specifications:
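One way this might look, assuming hypothetical mixed-type rows and made-up feature names `"age"` and `"group"`. The level argument 1 is an assumption so that each row is treated as one example:

```wolfram
(* Hypothetical rows: a numerical feature and a categorical feature *)
data = {{1.4, "A"}, {1.5, "A"}, {2.3, "B"}, {5.4, "B"}};

(* Name the features, then refer to a feature by name in FeatureTypes *)
ClusteringComponents[data, 2, 1,
 FeatureNames -> {"age", "group"},
 FeatureTypes -> <|"group" -> "Nominal"|>]
```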
Use FeatureTypes to enforce the interpretation of the features:
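A minimal sketch, assuming hypothetical integer pairs in which the second column is a category code rather than a quantity. The level argument 1 is an assumption so that each row is one example:

```wolfram
(* Hypothetical rows: the second column encodes a category as an integer *)
data = {{1, 1}, {2, 1}, {3, 2}, {10, 2}, {11, 3}, {12, 3}};

(* Force the second feature to be treated as nominal instead of numerical *)
ClusteringComponents[data, 2, 1,
 FeatureTypes -> {"Numerical", "Nominal"}]
```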
Compute a clustering with PerformanceGoal set to "Quality":
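A sketch of how the effect of this setting could be measured, assuming hypothetical random 2D data and a request for three clusters. Timings vary by machine, so none are shown:

```wolfram
(* Hypothetical data: 500 random points in the unit square *)
SeedRandom[1];
data = RandomReal[1, {500, 2}];

(* Time in seconds with the default PerformanceGoal *)
First[AbsoluteTiming[ClusteringComponents[data, 3, 1]]]

(* Time in seconds with PerformanceGoal set to "Quality" *)
First[AbsoluteTiming[
  ClusteringComponents[data, 3, 1, PerformanceGoal -> "Quality"]]]
```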
Compute the clustering of a dataset several times with different settings for the RandomSeeding option and compare the results:
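A sketch of such a comparison, assuming hypothetical random data and the "KMeans" method, which depends on random initialization. The seeds 1, 2 and 3 are arbitrary:

```wolfram
(* Hypothetical data: 50 random points in the unit square *)
SeedRandom[1];
data = RandomReal[1, {50, 2}];

(* One label list per seed *)
results = Table[
   ClusteringComponents[data, 3, 1, Method -> "KMeans",
    RandomSeeding -> seed],
   {seed, {1, 2, 3}}];

(* True only if every run produced an identical label list *)
SameQ @@ results
```

A result of False does not necessarily mean the partitions differ, since the same grouping can appear with its cluster indices permuted.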
Properties & Relations (3)
Convert the result of ClusteringComponents to partitions of similar elements:
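One possible conversion, assuming a small hypothetical list named `data`, uses Pick to collect the elements that share each cluster index:

```wolfram
(* Hypothetical sample data *)
data = {1, 2, 3, 10, 11, 12};

(* One cluster index per element *)
labels = ClusteringComponents[data, 2];

(* For each index k, pick the elements of data labeled k *)
Table[Pick[data, labels, k], {k, Max[labels]}]
```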
FindClusters yields the same result:
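For comparison, a direct call might look like the following, assuming a hypothetical list named `data`. Agreement between the two approaches relies on both functions selecting the same method automatically:

```wolfram
(* Hypothetical sample data *)
data = {1, 2, 3, 10, 11, 12};

(* Partition the data directly into at most two clusters *)
FindClusters[data, 2]
```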
Convert the result of FindClusters to a list of cluster indices:
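A sketch of one way to do this, assuming a hypothetical list named `data`. It asks FindClusters to return element positions instead of the elements themselves, which keeps the conversion correct even when `data` contains duplicates:

```wolfram
(* Hypothetical sample data *)
data = {1, 2, 3, 10, 11, 12};

(* Each cluster is returned as a list of positions in data *)
indexClusters = FindClusters[data -> Range[Length[data]], 2];

(* Write cluster number k at every position belonging to cluster k *)
labels = ConstantArray[0, Length[data]];
Do[labels[[indexClusters[[k]]]] = k, {k, Length[indexClusters]}];
labels
```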
ClusteringComponents yields the same result:
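The corresponding direct call, again assuming a hypothetical list named `data`, would be:

```wolfram
(* Hypothetical sample data *)
data = {1, 2, 3, 10, 11, 12};

(* One cluster index per element, with at most two clusters *)
ClusteringComponents[data, 2]
```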
Wolfram Research (2010), ClusteringComponents, Wolfram Language function, https://reference.wolfram.com/language/ref/ClusteringComponents.html (updated 2020).