HOG features are widely used for object detection. HOG decomposes an image into small square cells, computes a histogram of oriented gradients in each cell, normalizes the result using a block-wise pattern, and returns a descriptor for each cell.
Stacking the descriptors of the cells in a square image region yields a window descriptor that can be used for object detection, for example by means of an SVM.
This tutorial shows how to use the VLFeat function vl_hog to compute HOG features of various kinds and manipulate them.
We start by considering an example input image:
HOG is computed by calling the vl_hog function:
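A minimal sketch of this call, assuming VLFeat has been added to the MATLAB path with vl_setup. The random placeholder image and the cell size of 8 pixels are illustrative assumptions; in practice im would hold the example image converted to SINGLE, which is the class vl_hog expects.

```matlab
% Placeholder input (assumption): replace with the example image,
% e.g. im = im2single(imread(...)).
im = rand(128, 128, 3, 'single') ;

cellSize = 8 ;                  % side of each square cell, in pixels
hog = vl_hog(im, cellSize) ;    % UoCTTI variant by default
```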
The same function can also be used to generate a pictorial rendition of the features, although this unavoidably destroys some of the information contained in the feature itself. To this end, use the render command:
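The rendering step might look as follows. The placeholder image and the display commands are assumptions added for illustration, so the picture obtained depends on the image actually loaded into im.

```matlab
im = rand(128, 128, 3, 'single') ;   % placeholder image (assumption)
hog = vl_hog(im, 8) ;

% Draw one glyph per cell summarising its orientation histogram.
imhog = vl_hog('render', hog) ;
clf ; imagesc(imhog) ; colormap gray ; axis image off ;
```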
This should produce the following image:
HOG is an array of cells, with the third dimension spanning feature components:
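The array size can be inspected directly. In this sketch the 128×128 placeholder image and the 8-pixel cells are assumptions, chosen so that the image divides into a 16×16 grid of cells.

```matlab
im = rand(128, 128, 3, 'single') ;   % placeholder image (assumption)
hog = vl_hog(im, 8) ;

% Dimensions: rows of cells, columns of cells, feature components.
sz = size(hog)
```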
In this case the feature has 31 dimensions. HOG exists in many variants. VLFeat supports two: the UoCTTI variant (used by default) and the original Dalal-Triggs variant (with 2×2 square HOG blocks for normalization). The main difference is that the UoCTTI variant computes both directed and undirected gradients as well as a four-dimensional texture-energy feature, but projects the result down to 31 dimensions. Dalal-Triggs works instead with undirected gradients only and does not do any compression, for a total of 36 dimensions. The Dalal-Triggs variant can be computed as follows:
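A sketch of selecting the Dalal-Triggs variant through the variant option of vl_hog. The placeholder image and cell size are assumptions; the same option is passed to the render command so that the glyphs are drawn for the matching feature layout.

```matlab
im = rand(128, 128, 3, 'single') ;   % placeholder image (assumption)
cellSize = 8 ;

hogDT = vl_hog(im, cellSize, 'variant', 'dalaltriggs') ;
imhogDT = vl_hog('render', hogDT, 'variant', 'dalaltriggs') ;

% 4 normalization blocks x 9 undirected orientations per cell.
numComponents = size(hogDT, 3)
```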
The result is visually very similar:
Often it is necessary to flip HOG features from left to right (for example in order to model an axis-symmetric object). The flipped feature can be obtained analytically from the feature itself by permuting the histogram dimensions appropriately. The permutation is obtained as follows:
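A sketch of retrieving the permutation for the default variant. It is returned as a vector of indices over the feature components, which can then be used to reorder the third dimension of a HOG array.

```matlab
% One index per feature component (31 for the default UoCTTI variant).
perm = vl_hog('permutation') ;
numel(perm)
```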
Then these two examples produce identical results (provided that the image contains an exact number of cells):
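The two routes can be compared as in the following sketch. The placeholder image is an assumption, sized as a multiple of the cell size so that it contains an exact number of cells; the difference is expected to vanish up to floating-point rounding.

```matlab
im = rand(128, 128, 3, 'single') ;   % 128 is a multiple of cellSize
cellSize = 8 ;
perm = vl_hog('permutation') ;
hog = vl_hog(im, cellSize) ;

% (a) Flip the features: mirror the cell columns, permute the components.
flippedHog = hog(:, end:-1:1, perm) ;

% (b) Flip the image first, then compute the features.
hogFromFlippedImage = vl_hog(im(:, end:-1:1, :), cellSize) ;

maxDiff = max(abs(flippedHog(:) - hogFromFlippedImage(:)))
```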
This is shown in the figure:
vl_hog supports other parameters as well. For example, one can specify the number of orientations in the histograms by the numOrientations option:
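A sketch of setting the number of orientations. The value o = 4 and the placeholder image are illustrative assumptions; the same option is repeated in the render call so that the glyphs use the matching number of orientations.

```matlab
im = rand(128, 128, 3, 'single') ;   % placeholder image (assumption)
cellSize = 8 ;
o = 4 ;                              % number of undirected orientations

hog = vl_hog(im, cellSize, 'numOrientations', o) ;
imhog = vl_hog('render', hog, 'numOrientations', o) ;

% UoCTTI layout: 2*o directed + o undirected + 4 texture components.
numComponents = size(hog, 3)
```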
Changing the number of orientations changes the features quite significantly:
numOrientations equal to 3, 4, 5, 9, and 21 respectively.
Another useful option is BilinearOrientations, which switches on the bilinear orientation assignment of the gradient (this is not used in certain implementations, such as UoCTTI).
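A sketch combining four orientations with bilinear orientation assignment. It assumes that BilinearOrientations is passed as a flag without a value, and the placeholder image is likewise an assumption.

```matlab
im = rand(128, 128, 3, 'single') ;   % placeholder image (assumption)
cellSize = 8 ;

% Each gradient votes softly for its two nearest orientation bins.
hog = vl_hog(im, cellSize, 'numOrientations', 4, 'bilinearOrientations') ;
imhog = vl_hog('render', hog, 'numOrientations', 4) ;
```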
resulting in
numOrientations equal to four, with soft orientation assignments.