What is Convolution?
Let’s use dice and probability as an example.
You have two uniform dice and , and you want to find the probability of the number you get when you add the two up.
=1 | =2 | =3 | =4 | =5 | =6 | |
---|---|---|---|---|---|---|
=1 | 2 | 3 | 4 | 5 | 6 | 7 |
=2 | 3 | 4 | 5 | 6 | 7 | 8 |
=3 | 4 | 5 | 6 | 7 | 8 | 9 |
=4 | 5 | 6 | 7 | 8 | 9 | 10 |
=5 | 6 | 7 | 8 | 9 | 10 | 11 |
=6 | 7 | 8 | 9 | 10 | 11 | 12 |
To find the probability of each possible sum, simply count how many times each sum appears and divide by the total number of combinations (36), since each face has a uniform probability of 1/6.
But what if the dice aren’t uniform?
Suppose you have two three-sided, non-uniform dice, C and D, with probability distributions and for outcomes respectively.
To compute the probability distribution of their sum, we perform a convolution of the two distributions.
What we can then do is first reverse and multiply like so:
[0.5, 0.3, 0.2]
[0.3, 0.6, 0.1]
For example, to calculate the probability of 2, we do: to get that probability.
[0.5, 0.3, 0.2]
[0.3, 0.6, 0.1]
Next, to compute the probability of a sum of 3 (e.g., 1+2 and 2+1), we do: , and so on so forth.
By doing this, we are left with a new probability distribution of , which is the convolution of the two original distributions.
We can generalize this with any two arrays of and with the equation:
to make it continous we use this equation:
Uses
Some other uses for convolutions include calculating moving averages and image processing.
Moving Averages
We set our array being the distribution we want to average and being another array, with all its elements adding up to 1, to preserve the scale of the data. is also what we call the kernel.
By doing , we get an averaged out distribution. by changing the distribution of the elements of , we can have different average distributions.
Image Processing
To blur an image, we can just set as square 2D array, and do convolution on both axes, as we can represent the image as a 2D array as well.
Convolution is used extensively for tasks such as blurring, sharpening, edge detection, and noise reduction.
An image can be represented as a 2D array (or matrix) of pixel values. To apply a filter to the image, we define a 2D kernel and slide it over the image, computing a weighted sum of neighboring pixel values at each location.
To blur an image, for example, a common 2D kernel is a normalized box filter like:
This averages the value of each pixel with its 8 neighbors, resulting in a smoothing effect.