to the last level where it is able to distinguish a car from a truck. Typically, each convolutional layer is followed by a max-pooling layer, which gradually reduces the size of the matrix while increasing the level of "abstraction". So we move from basic filters, such as those for vertical and horizontal lines, to increasingly sophisticated ones, able, for example, to recognize the lights or the windshield. This "sliding" process (below) is precisely what convolutions are about.

Convolutions and filters

Convolutional neural networks work like all neural networks: an input layer, one or more hidden layers that perform calculations via activation functions, and an output layer for the result. Convolutions are precisely what makes the difference.

Each layer holds "feature maps", each of which encodes the specific feature that its nodes are looking for. In the example below, the first layer encodes horizontal and vertical lines: special filters (2×2 in the example) are slid across the image, and at each position a scalar product is computed between the filter and the underlying area.

So what?

First of all, it is worth understanding why such complex algorithms were invented: would it not be better to have a fully connected network? After all, with a fully connected network there is no information loss. The problem is that fully connected networks lead to a combinatorial explosion in the required number of nodes and connections. In short, when analyzing an image, the first step might be to recognize the silhouettes of the figures, so we would find, for example, a filter for vertical lines, one for horizontal lines, and one for diagonals, all three scanning the whole image.

Deeper layers would, for example, recognize eyes, ears, hands, etc. Eventually, the last layer would be able to recognize and distinguish sheep, people, and cars.

This is a strictly mathematical definition of the process, but why does it interest us?
In the case of image analysis, for example, the red function represents the input image, while the second (blue) is known as the "filter" because it identifies a particular signal or structure in the image. Mathematically speaking, convolution means "sliding" one function (blue) over another (red), in effect "mixing" them together. The result is a third function (green) whose value at each shift is the integral of the product of the two functions, that is, how much they overlap at that position.

What is a convolution?

A convolution is an integral that expresses the amount of overlap of one function as it is shifted over another function. It therefore "blends" one function with another. For example, in synthesis imaging, the measured dirty map is a convolution of the "true" CLEAN map with the dirty beam (the Fourier transform of the sampling distribution).

Convolutional Neural Networks (CNNs, or ConvNets) are perhaps the most commonly used deep learning algorithm in computer vision and are used in many fields, from autonomous cars to drones, and from medical diagnosis to support and treatment for the visually impaired.
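
To make the filter "sliding" and max-pooling steps described earlier more concrete, here is a minimal sketch in plain Python. The 5×5 image, the 2×2 vertical-edge filter, and the function names `slide_filter` and `max_pool` are all made up for illustration; real libraries do the same thing far more efficiently. Note that, as in most CNN libraries, the filter is not flipped before sliding, so strictly speaking this is a cross-correlation rather than the textbook convolution.

```python
def slide_filter(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Scalar product between the filter and the patch underneath it.
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size block."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            block = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(block))
        pooled.append(row)
    return pooled


# Hypothetical 5x5 grayscale image: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]

# Hypothetical 2x2 filter that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = slide_filter(image, vertical_edge)  # 4x4 map, peaks where the edge is
pooled = max_pool(feature_map)                    # 2x2 map, smaller but edge still visible

print(feature_map)  # [[0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0]]
print(pooled)       # [[2, 0], [2, 0]]
```

The feature map is non-zero only in the column where the image goes from dark to bright, which is what "detecting a vertical line" means here. Max-pooling then halves each dimension while keeping that response, which is the size reduction the text describes.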