Sharpening Images using kernels

22/5/2019

Well, it's been two years since I posted anything. Hello again. It's the same shtick: I'm slowly learning Rust and need somewhere to collect my thoughts, so here we are. I thought I'd stick with the last post's theme, image manipulation. I won't really explain what image kernels are; just think of them as filters that we apply to an image to sharpen it, blur it, detect edges, etc.

I'll obviously explain how they are applied because that's part of the code, but if you want a visual guide to how they work, Victor Powell is where you'll want to start.

Before we start, here's what the program does to my avatar.

This time it'll be a little more complex. I want my program to:

Ask the user for the name of the image they wish to sharpen
If it can't be found, tell the user and exit the program
Sharpen the image and save it to file

This will be... a little bigger than the previous examples so we'll split it up into imports, functions, then complete it with main.

Ah, image, my old friend. You have changed, you have stayed the same, I still love you. Anyway yes, we are still using image because it's still the best library for general image loading, manipulation, and saving. We will be using a DynamicImage for all of our dirty work, but most of its cool pixel things are hidden inside GenericImage(View), so we'll import those as well.

The next two are just to shorten some code later on. Simple stuff.

Onto the meat of the program.

Here's where the magic happens. Firstly, the parameters are the kernel (representing a 3x3 array) as a slice, and the mutable reference to the image we will edit.

Then we take the width & height of the image and copy the image, because our function will be destructive to the input image and we will need the unaltered pixels for our algorithm. You could also keep the input image unaltered and return a new image, but that's a design choice.

For each pixel (x, y) in the image (excluding the edges... edge cases man, the bane of my existence), we take all the surrounding pixels and copy them into subpixels.

Side note: I was planning on just referencing them, but it turns out [u8; 4] (the internal type of Rgba) is half the size of &Rgba, at least on my machine. Four u8s are 32 bits (4 bytes), while on my x64 Mac a reference is 64 bits. The more you know.
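If you want to check those sizes yourself, here's a tiny sketch. It uses a plain [u8; 4] as a stand-in for the pixel data, and the two helper names are made up for illustration.

```rust
use std::mem::size_of;

/// Size in bytes of one RGBA pixel stored as four `u8` channels.
/// This is 4 on every target.
fn pixel_size() -> usize {
    size_of::<[u8; 4]>()
}

/// Size in bytes of a reference to such a pixel.
/// A reference is pointer-sized, so this is 8 on a 64-bit target.
fn pixel_ref_size() -> usize {
    size_of::<&[u8; 4]>()
}
```

On a 32-bit target the two would come out the same, so the saving only shows up on 64-bit machines.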

Anyway, we make a mutable [f32; 4] container called total to hold our new pixel, which we will put into the input image. We use f32 because the filter can add or subtract its way to numbers outside the range a u8 can hold, so we convert back to u8 later.

Then, for each of our subpixels (we use an enumeration so we have access to the index), we add each channel (red, green, blue, alpha) to the total. Each subpixel is multiplied by the corresponding kernel entry, starting with the zeroth.
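Roughly, that loop could look like the sketch below. It is not the code from the program: the function name is made up, and pixels are plain [u8; 4] arrays rather than Rgba values so the snippet stands on its own.

```rust
/// Multiply each of the nine neighbouring pixels by its kernel entry and
/// accumulate the result per channel (red, green, blue, alpha).
/// Assumes `kernel` has at least nine entries; a shorter slice panics.
fn weighted_sum(kernel: &[f32], subpixels: &[[u8; 4]; 9]) -> [f32; 4] {
    let mut total = [0.0f32; 4];
    for (n, pixel) in subpixels.iter().enumerate() {
        // Walk the four channels of this pixel alongside the running totals.
        for (sum, &value) in total.iter_mut().zip(pixel.iter()) {
            *sum += kernel[n] * f32::from(value);
        }
    }
    total
}
```

With a kernel that is 1.0 in the middle and 0.0 everywhere else, this just hands back the centre pixel as floats, which is a handy sanity check.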

So kernel[n] * subpixels[n].data[channel] for n in 0..9 gets added to total[channel]. This "averages", or rather applies the kernel to, the pixel in the original image. Now... this part I'm not too sure about.

f32tou8... Firstly, a terrible name for what is essentially a conversion function. It turns f32 into u8. I know what you are wondering. "William you dumbass, there's the as keyword made just for that, you literally use it in f32tou8 too" and you'd be right. However, there's one small problem with the built-in as. It's too good. Lemme give you an example.

What do you think 0f32 as u8 equals? Yep. 0u8. Perfect.
-1f32 as u8? If you said 255u8 you're smarter than me, I thought it would be 0u8.
256f32 as u8? If you said 0u8, you're still smarter than me. Damn edge cases again.

It bloody handles overflowing like a boss. This is cool, but we REALLY don't want that, because if a pixel value we get is -128, it should be as black as my lungs, not gray(128). If there's a better way to do this, please let me know.
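One way to do that conversion is to clamp into 0..=255 before casting. The sketch below is an assumption about what such a function could look like, not the original f32tou8, and the name clamp_to_u8 is made up. Worth knowing: from Rust 1.45 onward, float-to-integer as casts saturate on their own, so the wrap-around described above is what older compilers did. Clamping explicitly behaves the same on both.

```rust
/// Convert a filtered channel value to `u8`, clamping instead of wrapping.
/// Anything below 0 becomes 0, anything above 255 becomes 255, and the
/// fractional part is dropped.
fn clamp_to_u8(value: f32) -> u8 {
    // `max` ignores a NaN operand, so NaN ends up as 0 here.
    value.max(0.0).min(255.0) as u8
}
```

So -128.0 comes out as 0 (black) and 300.0 as 255, rather than landing somewhere in the middle of the range.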

Then it's a simple matter of mapping our total: [f32; 4] to a Vec<u8> using our function, and putting that pixel into the input image. This is why we needed to copy it, because otherwise the next iteration would have read that edited pixel instead of the original.
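Putting the pieces together, here is a dependency-free sketch of the whole idea. To keep it runnable without the image crate it uses a hypothetical Image struct (a row-major Vec of [u8; 4] pixels) instead of DynamicImage, takes the kernel as a fixed [f32; 9] rather than a slice, and the function name is made up, so treat it as an illustration rather than the actual function.

```rust
/// A tiny stand-in for an RGBA image: row-major pixels, four channels each.
struct Image {
    width: usize,
    height: usize,
    pixels: Vec<[u8; 4]>,
}

/// Apply a 3x3 `kernel` (nine entries, row-major) to every interior pixel.
/// Border pixels are left unchanged.
fn apply_kernel_to_buffer(kernel: &[f32; 9], image: &mut Image) {
    // Read from an untouched copy so edited pixels never feed later sums.
    let source = image.pixels.clone();
    let (w, h) = (image.width, image.height);
    if w < 3 || h < 3 {
        return; // no interior pixels to process
    }
    for y in 1..h - 1 {
        for x in 1..w - 1 {
            let mut total = [0.0f32; 4];
            for ky in 0..3 {
                for kx in 0..3 {
                    // Neighbour at offset (kx - 1, ky - 1) from (x, y).
                    let pixel = source[(y + ky - 1) * w + (x + kx - 1)];
                    let weight = kernel[ky * 3 + kx];
                    for c in 0..4 {
                        total[c] += weight * f32::from(pixel[c]);
                    }
                }
            }
            for c in 0..4 {
                // Clamp into 0..=255 before casting back to u8.
                image.pixels[y * w + x][c] = total[c].max(0.0).min(255.0) as u8;
            }
        }
    }
}
```

If you swap source for image.pixels in the inner loop, pixels that were already sharpened leak into their neighbours' sums, which is exactly the problem the copy avoids.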

That's it, the function finishes and what's left is a sharpened image. Onto main.

We ask the user for the image path, use my readln function from a few posts ago and hello... what is that question mark?

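Now, when I came back to Rust after my hiatus, I found out that main now supports a return type, std::io::Result<()>, which makes error handling with IO super easy, because you can write ? to hand any error straight back to the caller instead of unwrapping every result by hand. Unlike .unwrap(), it doesn't panic; it returns early with the error. My readln also returns an io::Result, hence the ?.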
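Here's a small sketch of ? in action. It is not my readln: read_trimmed_line is a made-up helper that reads from any BufRead so it can be tried without a terminal.

```rust
use std::io::{self, BufRead};

/// Read one line from `input` and strip the trailing newline.
/// If `read_line` fails, `?` returns that `io::Error` to the caller
/// right away instead of panicking.
fn read_trimmed_line<R: BufRead>(input: &mut R) -> io::Result<String> {
    let mut line = String::new();
    input.read_line(&mut line)?;
    Ok(line.trim_end().to_string())
}
```

Inside a main that returns io::Result<()>, you could call it as read_trimmed_line(&mut io::stdin().lock())? and any IO error ends the program with that error reported.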

We then load the image, and gracefully handle it when something goes terribly wrong. And hello, the actual kernel, which I copied off Wikipedia (thanks, Wikipedia).
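For reference, this is the 3x3 sharpen kernel as listed on Wikipedia's "Kernel (image processing)" page, written out row by row. I'm assuming that's the one meant here; the constant and helper names are made up for the sketch.

```rust
/// 3x3 sharpen kernel in row-major order (assumed to be the Wikipedia one).
const SHARPEN: [f32; 9] = [
     0.0, -1.0,  0.0,
    -1.0,  5.0, -1.0,
     0.0, -1.0,  0.0,
];

/// Sum of all kernel entries.
fn kernel_sum(kernel: &[f32]) -> f32 {
    kernel.iter().sum()
}
```

The entries add up to 1, which is why flat areas of the image keep their brightness and only places where neighbouring pixels differ get exaggerated.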

We use our apply_kernel function, then make a new path using format!, which is like println!, but it returns the string rather than printing it. Good stuff.
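A minimal sketch of the format! part. The "sharpened_" prefix and the function name are hypothetical, and prefixing like this only works for a bare file name, not a path with directories in it.

```rust
/// Build an output file name by prefixing the input file name.
/// The "sharpened_" prefix is a made-up naming scheme for illustration.
fn output_name(input_name: &str) -> String {
    format!("sharpened_{}", input_name)
}
```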

We save the image using the new path, then tell the user we have finished.

Lastly, because our main now returns a Result, we need to return an empty Ok result.

Now, there's a problem with my program. God damn edge cases.

Currently, when apply_kernel finishes, the pixels around the border of the image are left untouched. This is fine when we are sharpening the image. However, if you are doing edge detection, this looks awful and does not work. There are ways to handle the edges (make the image 2px smaller, average the pixels near the border, apply only a subsection of the kernel to the edges/corners) but truly I couldn't be bothered.
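For anyone who can be bothered: another common option, not one of the ones listed above, is to pretend the border pixels repeat outwards by clamping the coordinates. A rough sketch on a plain row-major buffer, with made-up helper names and assuming the image is not empty:

```rust
/// Clamp a possibly out-of-range coordinate into `0..len`.
/// Assumes `len` is at least 1.
fn clamp_coord(coord: isize, len: usize) -> usize {
    coord.max(0).min(len as isize - 1) as usize
}

/// Fetch a pixel from a row-major buffer, repeating the nearest border
/// pixel when (x, y) falls outside the image.
fn pixel_clamped(
    pixels: &[[u8; 4]],
    width: usize,
    height: usize,
    x: isize,
    y: isize,
) -> [u8; 4] {
    pixels[clamp_coord(y, height) * width + clamp_coord(x, width)]
}
```

With a lookup like this, the kernel loop could run over every pixel including the border, since asking for x - 1 at x = 0 just returns the edge pixel again.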

And that's it. This was a long one, but I enjoyed playing around with images and Rust again. Thanks for reading.
