"MeanShift" (Machine Learning Method)

Details & Suboptions

  • "MeanShift" is a density-based clustering method where the density is estimated using a neighbor-based approach. "MeanShift" works for arbitrary cluster shapes and sizes; however, it can fail when clusters have different densities or are intertwined.
  • The following plots show the results of the "MeanShift" method applied to toy datasets:
  • The "MeanShift" method iteratively shifts data points toward higher-density regions. During this procedure, data points tend to collapse to different fixed points, each of them representing a cluster.
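  • Formally, at each step, each data point $x_i$ is set to the mean of its neighbors, $x_i \to \frac{1}{|N_i|} \sum_{x_j \in N_i} x_j$ with $N_i = \{x_j : d(x_i, x_j) < \epsilon\}$, $\epsilon$ defining an effective neighborhood radius. The difference between the new and the old position of $x_i$ is called the mean shift. The algorithm repeats the mean-shift updates until the points stop moving; all points belonging to a cluster have then collapsed onto the same fixed point (up to a tolerance). This algorithm is equivalent to the "NeighborhoodContraction" method but with a different neighborhood definition.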
  • The option DistanceFunction can be used to define which distance to use.
  • The following suboption can be given:
  • "NeighborhoodRadius"    Automatic    radius ϵ
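
The iterative update described above can be sketched in a few lines. This is a minimal illustration and not the built-in implementation: it assumes a flat neighborhood (all points within the radius count equally) and Euclidean distance, and the names meanShiftStep, meanShift, eps and tol are hypothetical.

```wolfram
(* One mean-shift step with a flat neighborhood (an assumption of this sketch):
   every point moves to the mean of all points closer than eps. *)
meanShiftStep[pts_, eps_] :=
  Table[Mean[Select[pts, EuclideanDistance[p, #] < eps &]], {p, pts}]

(* Iterate until no coordinate moves by more than tol (at most 100 steps). *)
meanShift[pts_, eps_, tol_ : 10^-6] :=
  FixedPoint[meanShiftStep[#, eps] &, N[pts], 100,
    SameTest -> (Max[Abs[#1 - #2]] < tol &)]

(* Two well-separated pairs of points; each pair should collapse
   onto its own fixed point. *)
meanShift[{{0, 0}, {0.2, 0}, {5, 5}, {5.2, 5}}, 1]
```

Each point is always within the radius of itself, so the neighborhood is never empty. Points that end up at the same fixed point, up to the tolerance, would then be reported as one cluster.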

Examples

Basic Examples  (3)

Find clusters of nearby values using the "MeanShift" clustering method:
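
A possible input for this example, assuming FindClusters is the clustering function in use; the list of values is illustrative only:

```wolfram
(* Cluster a short list of numbers; the values are made up for illustration. *)
FindClusters[{1, 2, 10, 12, 3, 1, 13, 25}, Method -> "MeanShift"]
```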

Train a ClassifierFunction on a list of colors using the "MeanShift" method:

Gather the elements by their class number:
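
The two steps above might look as follows, assuming ClusterClassify is used to produce the ClassifierFunction. The color list and the names colors and c are illustrative only:

```wolfram
(* An illustrative list of colors. *)
colors = {Red, Pink, Orange, Blue, Cyan, Purple, Green, Yellow};

(* Train a ClassifierFunction that assigns each color to a cluster. *)
c = ClusterClassify[colors, Method -> "MeanShift"];

(* Group the colors by the class number the classifier assigns to them. *)
GatherBy[colors, c]
```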

Train a ClassifierFunction on a list of strings:

Find the cluster assignments and gather the elements by their cluster:
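
A sketch of the string example, again assuming ClusterClassify. The strings and the names strings and c are illustrative only:

```wolfram
(* An illustrative list of strings. *)
strings = {"apple", "apply", "ample", "banana", "bandana", "cabana"};

(* Train the ClassifierFunction with the "MeanShift" method. *)
c = ClusterClassify[strings, Method -> "MeanShift"];

(* Cluster assignment (class number) for each string. *)
c[strings]

(* Group the strings by their assigned cluster. *)
GatherBy[strings, c]
```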

Options  (3)

DistanceFunction  (1)

Cluster data using Manhattan distance:
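
One way this could be written, assuming FindClusters with the built-in ManhattanDistance function. The data points are illustrative only:

```wolfram
(* Two illustrative groups of 2D points. *)
data = {{1, 1}, {1.2, 0.9}, {0.8, 1.1}, {5, 5}, {5.1, 4.8}, {4.9, 5.2}};

(* Use the Manhattan (city-block) distance instead of the default distance. *)
FindClusters[data, DistanceFunction -> ManhattanDistance, Method -> "MeanShift"]
```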

"NeighborhoodRadius"  (2)

Find clusters by specifying the "NeighborhoodRadius" suboption:
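
Suboptions are passed together with the method name in a list. A sketch, with illustrative data and an arbitrarily chosen radius:

```wolfram
(* The radius value 5 is arbitrary and chosen only for illustration. *)
FindClusters[{1, 2, 10, 12, 3, 1, 13, 25},
  Method -> {"MeanShift", "NeighborhoodRadius" -> 5}]
```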

Generate a list of 100 random colors:

Cluster the colors using the "MeanShift" method:

Try different "NeighborhoodRadius" suboptions for clustering the colors:
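
The three steps above could be sketched as follows. The radius values are arbitrary, and the cluster counts depend on the random colors generated, so no output is shown:

```wolfram
(* Generate 100 random colors. *)
colors = RandomColor[100];

(* Cluster them with the default (Automatic) neighborhood radius. *)
FindClusters[colors, Method -> "MeanShift"]

(* Count the clusters found for a few arbitrarily chosen radii. *)
Table[
  r -> Length[
    FindClusters[colors, Method -> {"MeanShift", "NeighborhoodRadius" -> r}]],
  {r, {0.1, 0.3, 1}}]
```

Larger radii generally merge more points into the same neighborhood and therefore tend to give fewer clusters.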