Karthik Yearning Deep Learning

Class Activation Map


Learning Deep Features for Discriminative Localization



We know that Convolutional Neural Networks are good at classification tasks. This paper examines how the activations of the convolutional layers contribute to localization, even though the network is trained only on a classification task. The localization ability is studied through the Global Average Pooling layer.

The advantages of Global Average Pooling extend beyond regularizing the network: a CNN trained for a classification task can also localize the discriminative regions of an image.



Two lines of work are most closely related to this paper:

Weakly supervised object localization.

Visualizing CNNs.



Generating Class Activation Map

This paper describes how class activation maps are generated using Global Average Pooling in a CNN.

Global average pooling is performed on the convolutional feature maps, and the pooled values are used as features for a fully connected layer that produces the desired output.

We can identify the importance of the image regions by projecting the weights of the output layer back onto the convolutional feature maps, a technique the authors call class activation mapping.

Global Average Pooling outputs the spatial average of the feature map of each unit in the last convolutional layer. A weighted sum of these values is used to generate the final output.

Similarly, a weighted sum of the feature maps of the last convolutional layer is computed to obtain the class activation maps.
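The pipeline above can be sketched in NumPy. This is a minimal illustration with made-up shapes (512 feature maps of size 14×14 and 10 output classes), not the paper's actual implementation:

```python
import numpy as np

# Hypothetical shapes: 512 feature maps of size 14x14 from the last
# convolutional layer, and one weight vector per output class.
rng = np.random.default_rng(0)
feature_maps = rng.standard_normal((512, 14, 14))   # f_k(x, y)
class_weights = rng.standard_normal((10, 512))      # w_k^c

# Global average pooling (up to a constant factor): F_k = sum_{x,y} f_k(x, y)
F = feature_maps.sum(axis=(1, 2))                   # shape (512,)

# Class scores: S_c = sum_k w_k^c * F_k
scores = class_weights @ F                          # shape (10,)

# Class activation map for the predicted class:
# M_c(x, y) = sum_k w_k^c * f_k(x, y)
c = scores.argmax()
cam = np.tensordot(class_weights[c], feature_maps, axes=1)  # shape (14, 14)
```

In practice the resulting 14×14 map is upsampled to the input image size and overlaid on the image to visualize the discriminative regions.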


Let \(f_k(x,y)\) denote the activation of unit \(k\) in the last convolutional layer at spatial location \((x,y)\).

\[F_k = \sum_{x,y} f_k(x,y)\]

\[S_c = \sum_{k} w_k^c F_k, \quad \text{where } w_k^c \text{ is the weight corresponding to class } c \text{ for unit } k\]

\[P_c = \frac{e^{S_c}}{\sum_{c} e^{S_c}}\]

\[M_c(x,y) = \sum_k w_k^c f_k(x,y), \quad \text{where } M_c \text{ is the class activation map for class } c\]

\[S_c = \sum_{x,y} M_c(x,y)\]
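The last equation says that pooling first and weighting second gives the same class score as weighting first and pooling second, which is what justifies reading \(M_c\) as a spatial decomposition of \(S_c\). A small NumPy check of this identity, using arbitrary toy shapes:

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.standard_normal((8, 7, 7))    # f_k(x, y): 8 units, 7x7 maps
w = rng.standard_normal((3, 8))       # w_k^c: 3 classes

# Path 1: pool first, then weight -> S_c = sum_k w_k^c F_k
F = f.sum(axis=(1, 2))
S_pool_first = w @ F

# Path 2: weight first, then pool -> S_c = sum_{x,y} M_c(x, y)
M = np.tensordot(w, f, axes=1)        # M_c(x, y), shape (3, 7, 7)
S_map_first = M.sum(axis=(1, 2))

# Softmax over classes: P_c = exp(S_c) / sum_c exp(S_c)
P = np.exp(S_map_first) / np.exp(S_map_first).sum()

print(np.allclose(S_pool_first, S_map_first))  # True
```

The two paths agree because both the pooling and the weighted sum are linear, so the order of summation can be swapped.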


Reference:

Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A. "Learning Deep Features for Discriminative Localization." CVPR 2016.
