
Image processing is one branch of data science, and of course Python has a library for the job.

As we know, a picture file contains arrays of numbers that the computer renders as a visible image. In practice we talk about the 3-color system: Red, Green, and Blue (RGB). A pixel of a grayscale picture takes 8 bits, so a pixel of a color picture takes 24 bits, that is, 3 bytes (1 byte = 8 bits).
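To make the arithmetic concrete, here is a small sketch that computes the raw (uncompressed) size of a picture. The helper name raw_size_bytes is made up for illustration, and the 423 x 640 dimensions are borrowed from the lynx picture used later in this note.

```python
BITS_PER_BYTE = 8

def raw_size_bytes(height, width, channels, bits_per_channel=8):
    """Uncompressed size of an image in bytes (hypothetical helper)."""
    bits = height * width * channels * bits_per_channel
    return bits // BITS_PER_BYTE

gray_bytes = raw_size_bytes(423, 640, channels=1)  # 1 byte per pixel
rgb_bytes = raw_size_bytes(423, 640, channels=3)   # 3 bytes per pixel
print(gray_bytes, rgb_bytes)  # 270720 812160
```

Files on disk are usually smaller than this because formats like JPEG and PNG compress the data.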

Each color channel is given as a number from 0 to 255.


Let’s say we want pure red: that is (255, 0, 0). Adding green gives yellow, (255, 255, 0). For a darker color we reduce the numbers until we reach black (0, 0, 0); for a brighter color we increase them until we reach white (255, 255, 255).
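The same colors can be written as NumPy arrays, which is how an image library stores them. This is only a sketch: the variable names are illustrative, and halving the values is just one simple way to darken a color.

```python
import numpy as np

# One uint8 triple (R, G, B) per color; values run from 0 to 255.
red = np.array([255, 0, 0], dtype=np.uint8)
yellow = np.array([255, 255, 0], dtype=np.uint8)
black = np.array([0, 0, 0], dtype=np.uint8)
white = np.array([255, 255, 255], dtype=np.uint8)

# Darken a color by scaling every channel toward 0.
dark_red = (red * 0.5).astype(np.uint8)

# Stack the colors into a 1 x 5 "image" (height, width, channels).
swatch = np.stack([red, yellow, dark_red, black, white])[np.newaxis, :, :]
print(swatch.shape)  # (1, 5, 3)
```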


Yes, we are talking about scikit-image, which is imported as skimage. Install this library via pip (its package name on PyPI is scikit-image).

First of all, we import skimage, plus matplotlib.pyplot so that we can display pictures in a Jupyter notebook.

Open the pictures

Use this code to view the images.

lynx =

Now “lynx” contains an array of integers. Because it is an array, we can check its size with .shape: it is 423 pixels high and 640 pixels wide, with 3 layers for the RGB channels.
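The lynx file itself is not included here, so the sketch below uses a blank array of the same shape as a stand-in. It only illustrates how the three numbers in .shape map to height, width, and channels.

```python
import numpy as np

# Stand-in for the loaded picture: same shape as the lynx image in this note.
image = np.zeros((423, 640, 3), dtype=np.uint8)

height, width, channels = image.shape
print(height, width, channels)  # 423 640 3

# A single pixel is a length-3 vector (R, G, B).
top_left = image[0, 0]
# One channel on its own is a 2-D array.
red_channel = image[:, :, 0]
```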

We can then modify the numbers to change part of the image. Here we paint a block of 30 x 30 pixels dark gray.

for i in range(30, 60):
    for j in range(60, 90):
        lynx[i,j] = [60, 60, 60]
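The double loop works, but NumPy slicing can do the same edit in one line. The sketch below checks that on a synthetic white image (a stand-in, not the lynx picture).

```python
import numpy as np

# Synthetic stand-in image (all white), not the author's lynx picture.
image = np.full((100, 120, 3), 255, dtype=np.uint8)

looped = image.copy()
for i in range(30, 60):
    for j in range(60, 90):
        looped[i, j] = [60, 60, 60]

# Same edit with one slice assignment: rows 30-59, columns 60-89.
sliced = image.copy()
sliced[30:60, 60:90] = [60, 60, 60]

print(np.array_equal(looped, sliced))  # True
```

Both versions change the array in place, so copy the image first if the original is still needed.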


Run this to convert the image from color to grayscale.

import skimage.color
plt.imshow(skimage.color.rgb2gray(lynx), cmap='gray')
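To see what rgb2gray() returns, here is a sketch on a tiny synthetic image. The weights in the manual version are the luminance coefficients given in the scikit-image documentation; the result is a 2-D float array scaled to the range 0 to 1, not 0 to 255.

```python
import numpy as np
import skimage.color

# Synthetic 2 x 2 RGB image: red, green, blue, white.
rgb = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

gray = skimage.color.rgb2gray(rgb)

# Manual version: scale to [0, 1], then take the weighted channel sum.
weights = np.array([0.2125, 0.7154, 0.0721])
manual = (rgb / 255.0) @ weights

print(gray.shape)
print(np.allclose(gray, manual))
```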

Stretch it

skimage.transform.resize() resizes the image to the shape we ask for. Here we double the height (.shape[0] * 2) and make the width 1.5 times larger (.shape[1] * 1.5), converted to an integer because a pixel count should be a whole number.

import skimage.transform
skimage.transform.resize(lynx, (lynx.shape[0] * 2, int(lynx.shape[1] * 1.5)))
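Here is a sketch of the same call on a small random image (a stand-in for the lynx picture). Two things are worth noticing: the channel dimension is kept even though we only pass height and width, and the result comes back as floats between 0 and 1.

```python
import numpy as np
import skimage.transform

# Synthetic stand-in image, not the author's lynx picture.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(40, 60, 3), dtype=np.uint8)

new_shape = (image.shape[0] * 2, int(image.shape[1] * 1.5))
resized = skimage.transform.resize(image, new_shape)

print(resized.shape)  # (80, 90, 3)
print(resized.min() >= 0.0, resized.max() <= 1.0)
```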

Geometric figures

After importing skimage.morphology, we can create geometric shapes such as disks and diamonds.
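As a small sketch, disk() and diamond() each return a little 2-D array of 0s and 1s. A radius of 2 is chosen here only to keep the printout short.

```python
from skimage.morphology import disk, diamond

# Each call returns a (2 * radius + 1)-sized square array of 0s and 1s.
d = disk(2)
m = diamond(2)

print(d)
print(m)
```

These arrays are typically passed to filters and morphology functions as the neighborhood to look at around each pixel.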





Adding filters

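With skimage.filters and skimage.morphology, we can add blur filters to our images. Note that the module is named skimage.filters (plural); the old skimage.filter name no longer exists in current releases.

skimage.filters.median() + disk(5)

skimage.filters.median() + disk(10)

skimage.filters.median() + diamond(10)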
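Below is a minimal sketch of a median filter with a disk-shaped neighborhood. It assumes a grayscale (2-D) image, built synthetically here; a color image would need to be filtered channel by channel or given a 3-D neighborhood.

```python
import numpy as np
import skimage.filters
from skimage.morphology import disk

# Synthetic flat gray image with a few bright outlier pixels.
image = np.full((50, 50), 100, dtype=np.uint8)
image[10, 10] = image[25, 30] = image[40, 5] = 255

# Neighborhood passed positionally; its keyword name differs between releases.
smoothed = skimage.filters.median(image, disk(2))

print(image.max(), smoothed.max())
```

Each pixel is replaced by the median of its neighborhood, so isolated outliers vanish, and a larger radius gives a stronger blur.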


And skimage.filters.try_all_threshold() applies several thresholding methods to a grayscale image and plots the results side by side, so we can compare them.
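Since try_all_threshold() draws figures, the sketch below shows a single method on its own instead. It uses threshold_otsu(), one of the methods try_all_threshold() compares, on a synthetic two-tone image.

```python
import numpy as np
import skimage.filters

# Synthetic grayscale image: dark left half, bright right half.
image = np.full((40, 40), 50, dtype=np.uint8)
image[:, 20:] = 200

threshold = skimage.filters.threshold_otsu(image)
binary = image > threshold  # True for pixels brighter than the threshold

print(threshold, binary.sum())
```

The result is a boolean mask, which is the kind of black-and-white picture each panel of try_all_threshold() shows.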



Here are 3 sample images.




We can calculate how different two images are with this.

import skimage.metrics
skimage.metrics.mean_squared_error(a, b)

.mean_squared_error() computes the Mean Squared Error (MSE) by comparing the images pixel by pixel. The greater the value, the more different the images are.
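To show what the number means, here is the same calculation done by hand with NumPy on two tiny made-up images. This is a sketch of the formula, not the library's own code.

```python
import numpy as np

# Two tiny synthetic grayscale images.
a = np.array([[10, 20], [30, 40]], dtype=np.uint8)
b = np.array([[10, 22], [30, 36]], dtype=np.uint8)

# Cast to float first: subtracting uint8 arrays would wrap around.
diff = a.astype(np.float64) - b.astype(np.float64)
mse = np.mean(diff ** 2)

print(mse)  # 5.0
```

Two identical images give an MSE of 0.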

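import skimage.metrics
skimage.metrics.structural_similarity(a, b, channel_axis=-1)

.structural_similarity() computes the Structural Similarity Index Measure (SSIM), which compares luminance, contrast, and structure rather than raw pixel values alone. The greater the value, the more similar the images are. For color images, channel_axis=-1 says that the last axis holds the color channels; releases before scikit-image 0.19 used multichannel=True instead.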
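Here is a quick sketch of how SSIM behaves, using a random grayscale image and a noisy copy of it (both synthetic). No channel argument is needed because the arrays are 2-D.

```python
import numpy as np
import skimage.metrics

rng = np.random.default_rng(0)

# Synthetic grayscale image and a noisy copy of it.
a = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
noise = rng.integers(-20, 21, size=a.shape)
b = np.clip(a.astype(int) + noise, 0, 255).astype(np.uint8)

same = skimage.metrics.structural_similarity(a, a)
noisy = skimage.metrics.structural_similarity(a, b)

print(same, noisy)
```

An image compared with itself scores 1, and the score drops as the two images drift apart.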

And here we are going to calculate both metrics.

Based on the MSE and SSIM values, we can conclude that “latest” is more similar to “latest_2nd” than to “first”.

Eye-catching differences

The metrics above are just numbers. Now let’s find the differences in the images themselves, like playing a photo-hunt game.

Let’s try it on “latest” and “latest_2nd”. First we find all pixels whose values differ by less than 0.1. These show up white in the mask, while the pixels that changed show up black.

import numpy as np
change_px = np.abs(latest_gray - latest_2nd_gray) < 0.1
plt.imshow(change_px, cmap='gray')
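The comparison can be tried on two tiny made-up arrays. This sketch assumes float grayscale images in the range 0 to 1, which is what rgb2gray() returns; with uint8 arrays the subtraction would wrap around and should be done on floats instead.

```python
import numpy as np

# Synthetic float grayscale images in [0, 1].
before = np.full((5, 5), 0.5)
after = before.copy()
after[1:3, 1:3] = 0.9  # a 2 x 2 patch that changed

unchanged = np.abs(before - after) < 0.1  # True where pixels barely differ
changed = ~unchanged

print(changed.sum())  # 4
```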

Next, we remove "holes" that are smaller than a given value, "area_threshold".

Now we can make out the "road". Its differences come from the "vehicles", don't they?

road = skimage.morphology.remove_small_holes(change_px, area_threshold=400)
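Here is a sketch on a synthetic mask with one small and one large hole, to show which of them survives. The keyword follows the call above; newer scikit-image releases may warn about it or prefer a different parameter name.

```python
import numpy as np
import skimage.morphology

# Synthetic boolean mask: True background with two False "holes".
mask = np.ones((60, 60), dtype=bool)
mask[5:8, 5:8] = False      # small hole, 9 pixels
mask[20:50, 20:50] = False  # large hole, 900 pixels

filled = skimage.morphology.remove_small_holes(mask, area_threshold=400)

print(filled[6, 6], filled[30, 30])  # True False
```

The small hole is filled in, while the large one stays, just as the speckles disappear and the road remains.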

Finally, we highlight the black pixels, a.k.a. the "road".

road = skimage.morphology.erosion(road)
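Erosion shrinks the white (True) areas, which makes the black areas grow. The sketch below shows this on a synthetic mask with one black pixel, using the default neighborhood.

```python
import numpy as np
import skimage.morphology

# Synthetic boolean mask: all True except one False pixel in the middle.
mask = np.ones((7, 7), dtype=bool)
mask[3, 3] = False

eroded = skimage.morphology.erosion(mask)

# Count the black pixels before and after.
print(np.count_nonzero(mask == 0), np.count_nonzero(eroded == 0))
```

The single black pixel spreads to its direct neighbors, so thin dark features such as the road become easier to see.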


These are just a sample of this library's functions.

Stay tuned for next blog.

