• This algorithm identifies clusters by detecting areas of high density of samples.
  • The algorithm iteratively shifts each point in the direction of increasing KDE (kernel density estimation) until all points are on a KDE surface peak.
  • KDE surface is a weighting given to each sample to estimate the underlying distribution.
  • Initially, a window size \(\lambda\) (bandwidth) is chosen.
  • Then all the samples are copied.
  • Next, the sample copies move in the direction of increasing KDE. To that end, we must compute the distance and then the weight of sample against all the original samples, and update its location by adding the weight to the location.
  • Although the algorithm has been extensively used, a mathematical proof that for the convergence of the algorithm in dimensions higher than one does not exists.

Pros

  • Does not assume any specific shape of data.
  • Only one parameter to choose \(\lambda\).
  • Independent of initialization.
  • Does not need a user-defined number of clusters.
  • Will not get trapped in local minima, like k-means does.

Cons

  • Choosing \(\lambda\) is not trivial.
  • It can be slow for large number of features. Slower than k-means.

Algorithm

  1. Write a kernel function. A kernel function receives two arguments: distance and λ, and returns weight. The simplest kernel function is a flat one where weight is 1 if distance is less than λ and 0 otherwise. For a distance function, you may Euclidean distance.
  2. Write a shift function. This function will be applied to each data point and move them.
    1. The shift function should accept three arguments: a data point p, all data X, and λ.
    2. Write a loop for each sample in X. Calculate the weight of each sample according to its distance from p.
    3. Return the new position of p by taking the weighted average of all the points.
  3. Write the mean shift function.
    1. It accepts three arguments: all data X, λ, and threshold, and returns a list of coordinates for group centroids.
    2. Create an empty array for centroids.
    3. Write a loop to shift each sample.
      1. Create a copy from the sample using deepcopy.
      2. Shift the point until it is at a peak:
        1. In a while loop, keep shifting the point using the shift function.
        2. After each shift, check whether it has moved more than the threshold.
        3. If not, it has reached the peak. In this case, check whether its distance to any of the centroids is less than the threshold. If it is, then move on to the next sample.
        4. Otherwise, add the shifted point to the list of centroids.
  4. After you have the centroids, you can find which samples belong to which centroids by assigning them to the nearest centroid.

Exercises

  1. Use the described algorithm above to write a mean shift algorithm that classifies the synthetic dataset below. (3 points)

       using ScikitLearn
       @sk_import datasets: make_blobs
       features, labels = make_blobs(n_samples=500, centers=3, cluster_std=0.55, random_state=0)
    
  2. Image segmentation is the act of finding chunks of images that are meaningful. Pixels are too small to have any meaning. Segmentation has many applications, for example, to compress images, blur background, remove background, and find objects in an image. Figure below is an example of how mean shift has been able to find meaningful segments in an image.

    From doi: 10.1109/34.1000236

    Find segments in an image of your choice using mean shift (3 points). For each pixel, define five features: x and y positions, luminosity, and color channels (U, V). Luminosity and U, V color channels (YUB color encoding) is a different representation of an image than RGB. To create a YUB encoding from RGB, weighted values of R, G, and B are summed to produce Y, a measure of overall brightness or luminance. U and V are computed as scaled differences between Y and the B and R values. The formula to convert RGB to YUB is as follows. Note that for the constants below, RGB is normalized between 0 and 1, meaning you will have to convert RGB values by 256.

      Y = 0.299R + 0.587G + 0.114B
      U = 0.492 (B-Y)  # or U = -0.147R - 0.289G + 0.436B
      V = 0.877 (R-Y)  # or V = 0.615R - 0.515G - 0.100B
    

    The inverse conversion is with these formulas:

       R = Y + 1.140V
       G = Y - 0.395U - 0.581V
       B = Y + 2.032U
    

    Create a new image where each pixel has the RGB values of the centroid.

Tags: clustering