
Image Compression and Color Quantization using K-Means Clustering

In this post, you'll learn to compress a large image into a smaller one. By size I mean the image's memory consumption on disk, not its aspect ratio (though the two are somewhat related). Before we begin, let's get familiar with what Image Compression, Color Quantization and K-Means Clustering are.

Basically, K-Means Clustering is used to find a central value (centroid) for each of k clusters of data. Each data point is assigned to the cluster whose centroid is nearest to it. Then, a new centroid is calculated for each of the k clusters based upon the data points assigned to that cluster, and these two steps repeat until the centroids stop moving.
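
To make this concrete, here is a minimal NumPy sketch of a single K-Means iteration (the variable names are only illustrative; the actual clustering later in this post is done by scikit-learn):
import numpy as np
# toy data: 100 points with 3 features each, grouped into k = 8 clusters
points = np.random.rand(100, 3)
k = 8
centroids = points[np.random.choice(len(points), k, replace=False)]
# assignment step: each point joins the cluster with the nearest centroid
distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
labels = distances.argmin(axis=1)
# update step: each centroid becomes the mean of the points assigned to it
centroids = np.array([points[labels == j].mean(axis=0) for j in range(k)])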

In our case, the data points will be image pixels. Each pixel comprises 3 channels: Red, Green and Blue. Each channel has an intensity ranging from 0 to 255, i.e., 256 possible values. So, as a whole, the total number of colors a pixel can represent is 256x256x256. Each channel has 2^8 possible values and so takes 8 bits of memory; thus, each pixel requires 8+8+8, i.e., 24 bits of memory for storage.

Now, using K-Means clustering, we'll try to group pixels of similar color together into k clusters; here k = 8, i.e., 8 different colors. Instead of each pixel being able to represent any of the original 256x256x256 colors, each pixel can now only take one of 8 possible colors. The most surprising thing is that each pixel now requires only 3 bits of memory (since 2^3 = 8), instead of the original 24 bits.
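
Just to sanity-check that arithmetic in Python:
total_colors = 256 ** 3            # 16,777,216 possible colors per pixel
bits_per_pixel_original = 3 * 8    # 8 bits per channel, 3 channels
bits_per_pixel_quantized = 3       # log2(8 clusters) = 3 bits
print(total_colors, bits_per_pixel_original, bits_per_pixel_quantized)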

In this process, we're reducing all possible colors of the RGB color space down to k colors. This is called Color Quantization: the k cluster centroids become the representative colors of the 3-dimensional RGB color space, each pixel's color is replaced by the centroid of its cluster, and thus the image ends up with only k colors in it.

That was a bit of background on what we are going to do. 😁
Let us now begin with our real task. Throughout this tutorial, we'll be using OpenCV along with a few Python libraries: basically, Scikit-learn, Matplotlib and NumPy.

Let's install some dependencies.

1. Installing OpenCV
conda install -c conda-forge opencv 

2. Installing Scikit-learn
conda install -c conda-forge scikit-learn

3. Installing a Python 2D plotting library, i.e., matplotlib
conda install -c conda-forge matplotlib
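
If you're not using conda, the same dependencies should also install fine with pip (note that OpenCV's pip package is called opencv-python):
pip install opencv-python scikit-learn matplotlib numpy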

For this tutorial, you can download the image from here.
tiger.jpg

import cv2 
import urllib.request 
image_url = "http://eskipaper.com/images/high-quality-animal-wallpapers-1.jpg"
urllib.request.urlretrieve(image_url, "tiger.jpg") #downloads the image as tiger.jpg
im_tiger = cv2.imread("tiger.jpg") #reading the downloaded image in im_tiger

Note: In case the URL doesn't work, you'll need to download the file manually and provide the path to that image file. For example, you can simply do the following after you've downloaded the file (in my case, it is in the 'images' folder).
im_tiger = cv2.imread("images/tiger.jpg")
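
Whichever way you load it, it's worth checking that the image was actually read, because cv2.imread() silently returns None when the path is wrong:
if im_tiger is None:
    raise FileNotFoundError("Could not read tiger.jpg; check the file path")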

Moving on, let's import some important modules.
import os
import math
import matplotlib.pyplot as plt
%matplotlib inline

First let's see the original file size of the image we downloaded (in KB).
img_corrected = cv2.cvtColor(im_tiger, cv2.COLOR_BGR2RGB)
plt.axis('off')
plt.imshow(img_corrected)
print("The size of im_tiger is: {} Kilo Bytes".format(str(math.ceil((os.stat('im_tiger.jpg').st_size)/1000))))

Run the code to see the output below:

Now it's time to import and use K-Means clustering from Scikit-learn. We'll also import NumPy for transforming the dimensions of the image.
from sklearn.cluster import KMeans
import numpy as np

In order to transform the image into the required shape, we first need to extract the number of rows and columns of the image above. After that, we'll reshape the image into a 2D array of pixels that K-Means can work with.
numberOfRows = im_tiger.shape[0]
numberOfCols = im_tiger.shape[1]
transform_image_for_KMeans = im_tiger.reshape(numberOfRows * numberOfCols, 3)
print(transform_image_for_KMeans)

We'll see something like this in the Jupyter Notebook:

These are the pixels we just reshaped into a matrix of dimension
(numberOfRows * numberOfCols) x 3. Here, numberOfRows = 2560 and numberOfCols = 1600.
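
A quick sanity check on the shapes (the exact numbers depend on the image you downloaded):
print(im_tiger.shape)                    # (numberOfRows, numberOfCols, 3)
print(transform_image_for_KMeans.shape)  # (numberOfRows * numberOfCols, 3)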

Now it's time for some magic.
# calling the KMeans() constructor and initializing with number of clusters = 8
kmeans = KMeans(n_clusters=8)
# fitting KMeans on the reshaped (numberOfRows * numberOfCols) x 3 array of pixels
kmeans.fit(transform_image_for_KMeans)
# centers of the different clusters
cluster_centroids = np.asarray(kmeans.cluster_centers_, dtype=np.uint8)
print(cluster_centroids)  # you may try printing the value to see what's happening
# labels holds, for each pixel, the index of the cluster it belongs to
labels = np.asarray(kmeans.labels_, dtype=np.uint8)
labels = labels.reshape(numberOfRows, numberOfCols)
print(labels)  # you may print this and see how it's actually working
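
Fitting K-Means on every pixel of a large image can be slow. If it takes too long on your machine, scikit-learn's MiniBatchKMeans is a faster, near drop-in alternative (just an optional suggestion, not part of the original workflow):
from sklearn.cluster import MiniBatchKMeans
# same interface as KMeans, but fits on small random batches of pixels
kmeans = MiniBatchKMeans(n_clusters=8)
kmeans.fit(transform_image_for_KMeans)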

It'll take some time to execute this part of the cell. You will find the output to be as follows:

The output shows the final matrices of cluster_centroids and labels. Now we iteratively assign the cluster_centroids to our newly initialized image matrix.
#initializing our new image, compressed_image
compressed_image = np.ones((numberOfRows, numberOfCols, 3), dtype=np.uint8)
for r in range(numberOfRows):
    for c in range(numberOfCols):
        compressed_image[r, c, :] = cluster_centroids[labels[r, c], :]
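
As an aside, the same assignment can be done in one vectorized step with NumPy fancy indexing, which is equivalent to the nested loop above but much faster:
# vectorized equivalent of the nested loop: look up each pixel's centroid color
compressed_image = cluster_centroids[labels]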

cv2.imwrite("compressed_tiger.jpg", compressed_image) #saving the image
compressed_im_tiger = cv2.imread("compressed_tiger.jpg")
compressed_im_tiger_corrected = cv2.cvtColor(compressed_im_tiger, cv2.COLOR_BGR2RGB)
plt.axis('off')
plt.imshow(compressed_im_tiger_corrected)    

#printing the size of the final compressed image
print("Compressed size of tiger's image is: {} Kilo Bytes".format(str(math.ceil((os.stat('compressed_tiger.jpg').st_size)/1000))))

After you execute the cell in the Jupyter Notebook, you'll see the output as follows:
compressed_tiger.jpg
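
If you'd like to compare the original and the compressed images side by side in the notebook, something like this works (reusing the arrays we already created above):
fig, axes = plt.subplots(1, 2, figsize=(12, 6))
axes[0].imshow(img_corrected)
axes[0].set_title("Original")
axes[0].axis('off')
axes[1].imshow(compressed_im_tiger_corrected)
axes[1].set_title("Compressed (k = 8)")
axes[1].axis('off')
plt.show()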

You can see some differences (the background, the color of the tiger's fur, etc.) between the original and the compressed image of the tiger. Now, one last thing to do, i.e., see the change in size of compressed_tiger.jpg.

dat1 = math.ceil((os.stat('compressed_tiger.jpg').st_size)/1000)
dat2 = math.ceil((os.stat('tiger.jpg').st_size)/1000)
perc = ((dat2-dat1)/dat2)*100
print("Original image size: {} Kilo Bytes".format(str(math.ceil(dat2))))
print("Original image size: {} Kilo Bytes".format(str(math.ceil(dat1))))
print("Compressed by: {} percent".format(str(math.ceil(perc))))
print("Compressed size of tiger's image is: {} Kilo Bytes".format(str(math.ceil((os.stat('compressed_tiger.jpg').st_size)/1000))))

Run the code above to see the following output:



Well done! We've successfully compressed the tiger's image by 60%. And this is indeed a great compression. You can further change the value of n_clusters to see the variations. You can find this article on my GitHub page as well.
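
If you want to experiment with different values of n_clusters, a small sketch like the one below (my own addition, reusing the variables defined earlier) makes it easy to compare the results:
# try a few cluster counts and save one quantized image per value
for k in [2, 4, 8, 16, 32]:
    km = KMeans(n_clusters=k).fit(transform_image_for_KMeans)
    centroids = np.asarray(km.cluster_centers_, dtype=np.uint8)
    quantized = centroids[km.labels_].reshape(numberOfRows, numberOfCols, 3)
    cv2.imwrite("compressed_tiger_k{}.jpg".format(k), quantized)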

References:
Coursera's Introduction to Computer Vision with IBM Watson and OpenCV



Hello folks! Are you happy or are you not sure? Alright, let's build a model that will help you find out if you're happy or not.  Well, let's start with some basic understanding of this tutorial and later dive deeper into the neural networks. We're very well known what  popular Computer Vision is. It is one of the most popular field of machine learning. Happiness Detection is also one of such field where we apply Computer Vision techniques. This is a binary classification type of problem where we'll building a model that will detect whether the input image is either smiling or not.   The dataset is already labeled as smiling or not smiling. We'll be using 600 images for training and 150 images as test dataset. Before we get our hands into the core part, let's first import some libraries. Now let's know more about the data.  After the execution, you'll be able to look at the number of data we've taken for training and testing the prepared model. N