Region of Interest Pooling
Region of interest pooling is described here: girshick15_fast_r_cnn
- The task: object detection.
- The problem: we have a set of regions of interest (RoIs). We want to run a classifier on each region to predict which object class, if any, it contains. However, the regions have different sizes, so their features have different dimensions and cannot be fed directly to a fully connected layer, which expects a fixed-size input.
- The solution: select a fixed output size \(H \times W\). Divide each region \(i\), of size \(h_i \times w_i\), into an \(H \times W\) grid of sub-windows of (approximate) size \(\frac{h_i}{H} \times \frac{w_i}{W}\), and max-pool within each sub-window. Every region then yields an \(H \times W\) output, regardless of its original size, which can be passed to a fully connected layer.
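The pooling step can be sketched in NumPy as follows. This is a minimal illustration, not the implementation from the cited paper: the function name `roi_max_pool`, the `(y0, x0, y1, x1)` RoI format with exclusive end coordinates, and the floor/ceil rule for placing sub-window boundaries are all assumptions made for this example. The notes above only say the sub-windows have approximate size \(\frac{h_i}{H} \times \frac{w_i}{W}\).

```python
import numpy as np


def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool one RoI of a (C, rows, cols) feature map to (C, out_h, out_w).

    roi is (y0, x0, y1, x1) in feature-map coordinates, end-exclusive
    (an assumed convention for this sketch).
    """
    y0, x0, y1, x1 = roi
    region = feature_map[:, y0:y1, x0:x1]
    channels, h, w = region.shape
    if h <= 0 or w <= 0:
        raise ValueError("RoI must cover at least one feature-map cell")

    out = np.empty((channels, out_h, out_w), dtype=feature_map.dtype)
    for i in range(out_h):
        # Assumed boundary rule: floor for the start, ceil for the end.
        # This keeps every sub-window non-empty, even when h < out_h
        # (neighbouring sub-windows may then overlap).
        ys = (i * h) // out_h
        ye = ((i + 1) * h + out_h - 1) // out_h
        for j in range(out_w):
            xs = (j * w) // out_w
            xe = ((j + 1) * w + out_w - 1) // out_w
            # Max over the spatial extent of the sub-window, per channel.
            out[:, i, j] = region[:, ys:ye, xs:xe].max(axis=(1, 2))
    return out


# Two RoIs of different sizes both produce a 2 x 2 output per channel.
fmap = np.arange(24).reshape(1, 4, 6)
pooled_full = roi_max_pool(fmap, (0, 0, 4, 6), 2, 2)   # 4 x 6 region
pooled_part = roi_max_pool(fmap, (1, 1, 4, 6), 2, 2)   # 3 x 5 region
```

Here `pooled_full` and `pooled_part` both have shape `(1, 2, 2)`, even though the regions they come from differ in size, which is what allows a fully connected layer to consume them.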