In this post, you'll learn to compress a large image down to a relatively smaller size. By size I mean the image's storage footprint, not its pixel dimensions (though the two are related). Before we begin, let's get familiar with Image Compression, Color Quantization and K-Means Clustering.
Basically, K-Means Clustering finds a central value (centroid) for each of k clusters of data. Each data point is assigned to the cluster whose centroid is nearest to it. Then a new centroid is calculated for each of the k clusters from the data points assigned to that cluster. These two steps repeat until the assignments stop changing.
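To make the assign-then-update cycle concrete, here is a minimal NumPy sketch of it. It is a toy for illustration only, not the Scikit-learn implementation used later in this post; the function name `kmeans_sketch`, its parameters and the demo points are all made up for the example.

```python
import numpy as np

def kmeans_sketch(points, k, n_iters=10, seed=0):
    """Toy K-Means loop (assign, then update), for illustration only."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: index of the nearest centroid for every point.
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids, labels

# Two well-separated groups of 2-D points (made-up data).
demo_points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                        [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
demo_centroids, demo_labels = kmeans_sketch(demo_points, k=2)
print(demo_centroids)
print(demo_labels)
```

The first three points should end up sharing one label and the last three the other. Which group gets label 0 depends on the random start.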
In our case, the data points will be image pixels. Assuming you know what pixels are: each pixel is made up of 3 channels, Red, Green and Blue. Each channel has an intensity ranging from 0 to 255, i.e., 256 levels altogether. So the total number of colors a pixel can take is 256 x 256 x 256. Each channel has 2^8 levels and therefore takes 8 bits of memory. Thus, each pixel requires 8 + 8 + 8, i.e., 24 bits of memory for storage.
Now, using K-Means clustering, we'll group pixels of similar color together into k clusters, here k = 8 different colors. Instead of each pixel representing one of 256 x 256 x 256 colors, each pixel can now represent only one of 8 possible colors. The most surprising thing is that each pixel now needs only 3 bits (2^3 = 8) to identify its color, instead of the original 24 bits.
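The arithmetic above can be checked in a few lines. This is a back-of-the-envelope sketch that assumes raw, uncompressed pixel storage plus a small color table; actual JPEG file sizes will differ because JPEG applies its own compression. The 2560 x 1600 dimensions are the ones quoted later in this post.

```python
import math

levels_per_channel = 2 ** 8                 # 256 intensity levels per channel
total_colors = levels_per_channel ** 3      # every possible RGB combination
bits_original = 3 * 8                       # 8 bits for each of R, G and B

k = 8                                       # number of colors after quantization
bits_quantized = math.ceil(math.log2(k))    # bits needed to index one of k colors

rows, cols = 2560, 1600                     # dimensions quoted in this post
n_pixels = rows * cols
raw_original_bytes = n_pixels * bits_original // 8
palette_bytes = k * 3                       # k colors, 3 bytes each
raw_quantized_bytes = n_pixels * bits_quantized // 8 + palette_bytes

print(total_colors)          # 16777216
print(bits_quantized)        # 3
print(raw_original_bytes)    # 12288000
print(raw_quantized_bytes)   # 1536024
```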
In the process, we're mapping all possible colors of the RGB color space onto k colors. This is called Color Quantization. The k centroids are points in the 3-dimensional RGB color space, each one representing its cluster. Each centroid then replaces the colors of all the points in its cluster, so the image ends up with only k colors in it.
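Here is a small sketch of that replacement step on a tiny 2 x 2 image. The two-color `palette` is a hypothetical stand-in for the centroids that K-Means would produce, and the pixel values are made up for the example.

```python
import numpy as np

# Hypothetical palette of 2 colors (stand-in for K-Means centroids).
palette = np.array([[0, 0, 0], [255, 255, 255]], dtype=np.uint8)

# A tiny 2x2 RGB image with made-up pixel values.
image = np.array([[[10, 20, 30], [240, 250, 245]],
                  [[200, 190, 210], [5, 0, 15]]], dtype=np.uint8)

# Work in a wider integer type so the subtraction cannot wrap around.
pixels = image.reshape(-1, 3).astype(np.int32)
colors = palette.astype(np.int32)

# Squared distance from every pixel to every palette color.
sq_dist = ((pixels[:, None, :] - colors[None, :, :]) ** 2).sum(axis=2)
nearest = sq_dist.argmin(axis=1)            # palette index per pixel

# Replace every pixel with its nearest palette color.
quantized = palette[nearest].reshape(image.shape)
print(nearest)      # [0 1 1 0]
print(quantized)
```

After this step the image contains only the colors present in the palette, which is exactly what happens with k centroids on the full image.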
That was a bit of background on what we are going to do. 😁
Let us now begin with our real task. Throughout this tutorial, we'll be using OpenCV and a few Python libraries: Scikit-learn, Matplotlib and NumPy.
Let's install some dependencies.
1. Installing OpenCV
conda install -c conda-forge opencv
2. Installing Scikit-learn
conda install -c conda-forge scikit-learn
3. Installing a Python 2D plotting library, i.e., Matplotlib
conda install -c conda-forge matplotlib
For this tutorial, you can download the image from here.
(Image: tiger.jpg)
import cv2
import urllib.request
image_url = "http://eskipaper.com/images/high-quality-animal-wallpapers-1.jpg"
urllib.request.urlretrieve(image_url, "tiger.jpg") #downloads the image as tiger.jpg
im_tiger = cv2.imread("tiger.jpg") #reading the downloaded image in im_tiger
Note: In case the URL doesn't work, you'll need to download the file manually and give the path to that image file. For example, you can simply do the following after you've downloaded the file (in my case, it is in the 'images' folder). If you do, use the same path in the os.stat() calls later on.
im_tiger = cv2.imread("images/tiger.jpg")
Moving on, let's import some important modules.
import os
import math
import matplotlib.pyplot as plt
%matplotlib inline
First, let's display the image and see the original file size of the image we downloaded (in KB).
img_corrected = cv2.cvtColor(im_tiger, cv2.COLOR_BGR2RGB) #OpenCV loads images as BGR, matplotlib expects RGB
plt.axis('off')
plt.imshow(img_corrected)
print("The size of im_tiger is: {} Kilo Bytes".format(str(math.ceil((os.stat('tiger.jpg').st_size)/1000))))
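Run the code to see the image and its file size.
Now it's time to import and use K-Means clustering from Scikit-learn. We'll also import NumPy for transforming the dimensions of the image.
from sklearn.cluster import KMeans
import numpy as np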
To feed the image to K-Means, we first need to extract the number of rows and columns of the image above. After that, we'll reshape the image into a 2D matrix with one row per pixel and one column per color channel.
numberOfRows = im_tiger.shape[0]
numberOfCols = im_tiger.shape[1]
transform_image_for_KMeans = im_tiger.reshape(numberOfRows * numberOfCols, 3)
print(transform_image_for_KMeans)
The Jupyter Notebook prints the pixel matrix we just reshaped, which has dimension (numberOfRows * numberOfCols) x 3. Here, numberOfRows = 2560 and numberOfCols = 1600.
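To see what this reshape does without loading the full image, here is a small sketch on a made-up 2 x 3 "image". The names `tiny`, `flat` and `restored` are only for this example.

```python
import numpy as np

rows, cols = 2, 3
# A made-up 2x3 image with 3 channels, filled with the values 0..17.
tiny = np.arange(rows * cols * 3, dtype=np.uint8).reshape(rows, cols, 3)

# One row per pixel, one column per channel, as K-Means expects.
flat = tiny.reshape(rows * cols, 3)
print(flat.shape)                         # (6, 3)

# The pixel at (r, c) lands at row r * cols + c of the flat matrix.
r, c = 1, 2
print(np.array_equal(flat[r * cols + c], tiny[r, c]))   # True

# Reshaping back recovers the original layout exactly.
restored = flat.reshape(rows, cols, 3)
print(np.array_equal(restored, tiny))     # True
```

Because the reshape only changes how the same values are indexed, the labels that K-Means returns for the flat matrix can later be reshaped back to (rows, cols) without losing track of which pixel is which.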
Now it's time for some magic.
#calling KMeans() constructor and initializing with number of clusters = 8
kmeans = KMeans(n_clusters=8)
#passing the reshaped pixel matrix to the fit() method of the KMeans class
kmeans.fit(transform_image_for_KMeans)
#centers of the different clusters, converted to 8-bit color values
cluster_centroids = np.asarray(kmeans.cluster_centers_, dtype=np.uint8)
print(cluster_centroids) #you may try printing the value to see what's happening
#labels tell us which cluster each pixel belongs to
labels = np.asarray(kmeans.labels_, dtype=np.uint8)
labels = labels.reshape(numberOfRows, numberOfCols)
print(labels) #you may print this and see how it's actually working
It'll take some time to execute this cell. The output is the final matrix of cluster_centroids and labels.
Now we iteratively assign the cluster_centroids into a newly initialized image matrix.
#initializing our new image, compressed_image
compressed_image = np.ones((numberOfRows, numberOfCols, 3), dtype=np.uint8)
#replacing every pixel with the centroid color of its cluster
for r in range(numberOfRows):
    for c in range(numberOfCols):
        compressed_image[r, c, :] = cluster_centroids[labels[r, c], :]
cv2.imwrite("compressed_tiger.jpg", compressed_image) #saving the image
compressed_im_tiger = cv2.imread("compressed_tiger.jpg")
compressed_im_tiger_corrected = cv2.cvtColor(compressed_im_tiger, cv2.COLOR_BGR2RGB)
plt.axis('off')
plt.imshow(compressed_im_tiger_corrected) #showing the final compressed image
print("Compressed size of tiger's image is: {} Kilo Bytes".format(str(math.ceil((os.stat('compressed_tiger.jpg').st_size)/1000))))
After you execute the cell in Jupyter Notebook, you'll see output like the following:
(Image: compressed_tiger.jpg)
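The pixel-by-pixel loop used to build compressed_image can be slow on a large image. As an optional alternative (not part of the original notebook), NumPy's integer-array indexing can do the same lookup in one step. The sketch below uses small made-up arrays, `cluster_centroids_demo` and `labels_demo`, to show that both approaches give the same result.

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up stand-ins for the real centroids (8 colors) and labels (4x5 image).
cluster_centroids_demo = rng.integers(0, 256, size=(8, 3)).astype(np.uint8)
labels_demo = rng.integers(0, 8, size=(4, 5))

# Loop version, mirroring the code in this post.
looped = np.ones((4, 5, 3), dtype=np.uint8)
for r in range(4):
    for c in range(5):
        looped[r, c, :] = cluster_centroids_demo[labels_demo[r, c], :]

# Vectorized version: index the centroid table with the whole label matrix.
vectorized = cluster_centroids_demo[labels_demo]

print(vectorized.shape)                     # (4, 5, 3)
print(np.array_equal(looped, vectorized))   # True
```

With the real variables, the equivalent line would be compressed_image = cluster_centroids[labels].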
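You can see some differences (the background, the color of the tiger's fur, etc.) between the original and the compressed image of the tiger. Now for one last thing: let's see the change in size of compressed_tiger.jpg.
dat1 = math.ceil((os.stat('compressed_tiger.jpg').st_size)/1000) #compressed size in KB
dat2 = math.ceil((os.stat('tiger.jpg').st_size)/1000) #original size in KB
perc = ((dat2-dat1)/dat2)*100
print("Original image size: {} Kilo Bytes".format(str(math.ceil(dat2))))
print("Compressed image size: {} Kilo Bytes".format(str(math.ceil(dat1))))
print("Compressed by: {} percent".format(str(math.ceil(perc))))
print("Compressed size of tiger's image is: {} Kilo Bytes".format(str(math.ceil((os.stat('compressed_tiger.jpg').st_size)/1000))))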
Run the code above to see the following output:
Well done! We've successfully compressed the tiger's image by 60%, which is a substantial reduction. You can further change the value of n_clusters to see how the result varies. You can find this article on my GitHub page as well.
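If you try several values of n_clusters, a small helper makes it easier to compare the results. The sketch below computes the percentage saved from exact byte counts rather than rounded kilobytes. The function name `percent_saved` and the two temporary files are made up for the example; with the real files you would pass 'tiger.jpg' and 'compressed_tiger.jpg'.

```python
import os
import tempfile

def percent_saved(original_path, compressed_path):
    """Percentage by which the compressed file is smaller than the original."""
    original = os.path.getsize(original_path)
    compressed = os.path.getsize(compressed_path)
    return (original - compressed) * 100 / original

# Demonstration with two dummy files of 1000 and 400 bytes.
with tempfile.TemporaryDirectory() as tmp:
    big = os.path.join(tmp, "big.bin")
    small = os.path.join(tmp, "small.bin")
    with open(big, "wb") as f:
        f.write(b"\0" * 1000)
    with open(small, "wb") as f:
        f.write(b"\0" * 400)
    saving = percent_saved(big, small)

print(saving)   # 60.0
```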
References:
Coursera's Introduction to Computer Vision with IBM Watson and OpenCV