
Image Compression and Color Quantization using K-Means Clustering

In this post, you'll learn to compress a large image into a smaller one. By size I mean the image's memory consumption on disk, not its aspect ratio (though the two are somewhat related). Before we begin, let's get familiar with what Image Compression, Color Quantization and K-Means Clustering are.

Basically, K-Means Clustering is used to find a central value (centroid) for each of k clusters of data. Each data point is assigned to the cluster whose centroid is nearest to it. Then, a new centroid is calculated for each of the k clusters based upon the data points assigned to that cluster, and these two steps repeat until the centroids stop moving.
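
To make this concrete, here is a minimal NumPy sketch of a single K-Means iteration (the variable names are only illustrative; the actual clustering later in this post is done by scikit-learn):
import numpy as np
# toy data: 100 points with 3 features each, grouped into k = 8 clusters
points = np.random.rand(100, 3)
k = 8
centroids = points[np.random.choice(len(points), k, replace=False)]
# assignment step: each point joins the cluster with the nearest centroid
distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
labels = distances.argmin(axis=1)
# update step: each centroid becomes the mean of the points assigned to it
centroids = np.array([points[labels == j].mean(axis=0) for j in range(k)])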

In our case, the data points will be image pixels. Each pixel comprises 3 channels: Red, Green and Blue. Each channel has an intensity ranging from 0 to 255, i.e., 256 possible values. So, as a whole, the total number of colors a pixel can represent is 256x256x256. Each channel has 2^8 possible values and so takes 8 bits of memory; thus, each pixel requires 8+8+8, i.e., 24 bits of memory for storage.

Now, using K-Means clustering, we'll try to group pixels of similar color together into k clusters; here k = 8, i.e., 8 different colors. Instead of each pixel being able to represent any of the original 256x256x256 colors, each pixel can now only take one of 8 possible colors. The most surprising thing is that each pixel now requires only 3 bits of memory (since 2^3 = 8), instead of the original 24 bits.
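
Just to sanity-check that arithmetic in Python:
total_colors = 256 ** 3            # 16,777,216 possible colors per pixel
bits_per_pixel_original = 3 * 8    # 8 bits per channel, 3 channels
bits_per_pixel_quantized = 3       # log2(8 clusters) = 3 bits
print(total_colors, bits_per_pixel_original, bits_per_pixel_quantized)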

In this process, we're reducing all possible colors of the RGB color space down to k colors. This is called Color Quantization: the k cluster centroids become the representative colors of the 3-dimensional RGB color space, each pixel's color is replaced by the centroid of its cluster, and thus the image ends up with only k colors in it.

That was a bit of background on what we are going to do. 😁
Let us now begin with our real task. Throughout this tutorial, we'll be using OpenCV along with a few Python libraries: basically, Scikit-learn, Matplotlib and NumPy.

Let's install some dependencies.

1. Installing OpenCV
conda install -c conda-forge opencv 

2. Installing Scikit-learn
conda install -c conda-forge scikit-learn

3. Installing a Python 2D plotting library, i.e., matplotlib
conda install -c conda-forge matplotlib
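
If you're not using conda, the same dependencies should also install fine with pip (note that OpenCV's pip package is called opencv-python):
pip install opencv-python scikit-learn matplotlib numpy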

For this tutorial, you can download the image from here.
tiger.jpg

import cv2 
import urllib.request 
image_url = "http://eskipaper.com/images/high-quality-animal-wallpapers-1.jpg"
urllib.request.urlretrieve(image_url, "tiger.jpg") #downloads the image as tiger.jpg
im_tiger = cv2.imread("tiger.jpg") #reading the downloaded image in im_tiger

Note: In case the URL doesn't work, you'll need to download the file manually and provide the path to that image file. For example, you can simply do the following after you've downloaded the file (in my case, it is in the 'images' folder).
im_tiger = cv2.imread("images/tiger.jpg")
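
Whichever way you load it, it's worth checking that the image was actually read, because cv2.imread() silently returns None when the path is wrong:
if im_tiger is None:
    raise FileNotFoundError("Could not read tiger.jpg; check the file path")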

Moving on, let's import some important modules.
import os
import math
import matplotlib.pyplot as plt
%matplotlib inline

First let's see the original file size of the image we downloaded (in KB).
img_corrected = cv2.cvtColor(im_tiger, cv2.COLOR_BGR2RGB)
plt.axis('off')
plt.imshow(img_corrected)
print("The size of im_tiger is: {} Kilo Bytes".format(str(math.ceil((os.stat('im_tiger.jpg').st_size)/1000))))

Run the code to see the output below:

Now it's time to import and use K-Means clustering from Scikit-learn. We'll also import NumPy for transforming the dimensions of the image.
from sklearn.cluster import KMeans
import numpy as np

In order to transform the image into the required shape, we first need to extract the number of rows and columns of the image above. After that, we'll reshape the image into a 2D array of pixels that K-Means can work with.
numberOfRows = im_tiger.shape[0]
numberOfCols = im_tiger.shape[1]
transform_image_for_KMeans = im_tiger.reshape(numberOfRows * numberOfCols, 3)
print(transform_image_for_KMeans)

We'll see something like this in the Jupyter Notebook:

These are the pixels we just reshaped into a matrix of dimension
(numberOfRows * numberOfCols) x 3. Here, numberOfRows = 2560 and numberOfCols = 1600.
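
A quick sanity check on the shapes (the exact numbers depend on the image you downloaded):
print(im_tiger.shape)                    # (numberOfRows, numberOfCols, 3)
print(transform_image_for_KMeans.shape)  # (numberOfRows * numberOfCols, 3)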

Now it's time for some magic.
# calling the KMeans() constructor and initializing with number of clusters = 8
kmeans = KMeans(n_clusters=8)
# fitting KMeans on the reshaped (numberOfRows * numberOfCols) x 3 array of pixels
kmeans.fit(transform_image_for_KMeans)
# centers of the different clusters
cluster_centroids = np.asarray(kmeans.cluster_centers_, dtype=np.uint8)
print(cluster_centroids)  # you may try printing the value to see what's happening
# labels holds, for each pixel, the index of the cluster it belongs to
labels = np.asarray(kmeans.labels_, dtype=np.uint8)
labels = labels.reshape(numberOfRows, numberOfCols)
print(labels)  # you may print this and see how it's actually working
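
Fitting K-Means on every pixel of a large image can be slow. If it takes too long on your machine, scikit-learn's MiniBatchKMeans is a faster, near drop-in alternative (just an optional suggestion, not part of the original workflow):
from sklearn.cluster import MiniBatchKMeans
# same interface as KMeans, but fits on small random batches of pixels
kmeans = MiniBatchKMeans(n_clusters=8)
kmeans.fit(transform_image_for_KMeans)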

It'll take some time to execute this part of the cell. You will find the output to be as follows:

The output shows the final matrices of cluster_centroids and labels. Now we iteratively assign the cluster_centroids to our newly initialized image matrix.
#initializing our new image, compressed_image
compressed_image = np.ones((numberOfRows, numberOfCols, 3), dtype=np.uint8)
for r in range(numberOfRows):
    for c in range(numberOfCols):
        compressed_image[r, c, :] = cluster_centroids[labels[r, c], :]
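
As an aside, the same assignment can be done in one vectorized step with NumPy fancy indexing, which is equivalent to the nested loop above but much faster:
# vectorized equivalent of the nested loop: look up each pixel's centroid color
compressed_image = cluster_centroids[labels]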

cv2.imwrite("compressed_tiger.jpg", compressed_image) #saving the image
compressed_im_tiger = cv2.imread("compressed_tiger.jpg")
compressed_im_tiger_corrected = cv2.cvtColor(compressed_im_tiger, cv2.COLOR_BGR2RGB)
plt.axis('off')
plt.imshow(compressed_im_tiger_corrected)    

#printing the size of the final compressed image
print("Compressed size of tiger's image is: {} Kilo Bytes".format(str(math.ceil((os.stat('compressed_tiger.jpg').st_size)/1000))))

After you execute the cell in the Jupyter Notebook, you'll see the output as follows:
compressed_tiger.jpg
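
If you'd like to compare the original and the compressed images side by side in the notebook, something like this works (reusing the arrays we already created above):
fig, axes = plt.subplots(1, 2, figsize=(12, 6))
axes[0].imshow(img_corrected)
axes[0].set_title("Original")
axes[0].axis('off')
axes[1].imshow(compressed_im_tiger_corrected)
axes[1].set_title("Compressed (k = 8)")
axes[1].axis('off')
plt.show()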

You can see some differences (the background, the color of the tiger's fur, etc.) between the original and the compressed image of the tiger. Now, one last thing to do, i.e., see the change in size of compressed_tiger.jpg.

dat1 = math.ceil((os.stat('compressed_tiger.jpg').st_size)/1000)
dat2 = math.ceil((os.stat('tiger.jpg').st_size)/1000)
perc = ((dat2-dat1)/dat2)*100
print("Original image size: {} Kilo Bytes".format(str(math.ceil(dat2))))
print("Original image size: {} Kilo Bytes".format(str(math.ceil(dat1))))
print("Compressed by: {} percent".format(str(math.ceil(perc))))
print("Compressed size of tiger's image is: {} Kilo Bytes".format(str(math.ceil((os.stat('compressed_tiger.jpg').st_size)/1000))))

Run the code above to see the following output:



Well done! We've successfully compressed the tiger's image by 60%. And this is indeed a great compression. You can further change the value of n_clusters to see the variations. You can find this article on my GitHub page as well.
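
If you want to experiment with different values of n_clusters, a small sketch like the one below (my own addition, reusing the variables defined earlier) makes it easy to compare the results:
# try a few cluster counts and save one quantized image per value
for k in [2, 4, 8, 16, 32]:
    km = KMeans(n_clusters=k).fit(transform_image_for_KMeans)
    centroids = np.asarray(km.cluster_centers_, dtype=np.uint8)
    quantized = centroids[km.labels_].reshape(numberOfRows, numberOfCols, 3)
    cv2.imwrite("compressed_tiger_k{}.jpg".format(k), quantized)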

References:
Coursera's Introduction to Computer Vision with IBM Watson and OpenCV



Hello folks! Are you happy or are you not sure? Alright, let's build a model that will help you find out if you're happy or not.  Well, let's start with some basic understanding of this tutorial and later dive deeper into the neural networks. We're very well known what  popular Computer Vision is. It is one of the most popular field of machine learning. Happiness Detection is also one of such field where we apply Computer Vision techniques. This is a binary classification type of problem where we'll building a model that will detect whether the input image is either smiling or not.   The dataset is already labeled as smiling or not smiling. We'll be using 600 images for training and 150 images as test dataset. Before we get our hands into the core part, let's first import some libraries. Now let's know more about the data.  After the execution, you'll be able to look at the number of data we've taken for training and testing the prepared model. N