Description
For context: I was asked to pick a paper and reproduce some of its results from scratch (it counts for 50% of the course grade), and my deadline is around June 10, 2022.
While rewriting the detection network (in order to fully understand the paper), I found the CPM part strange, and I would like to ask for advice.
Paper Text
The paper says:
with Context Sensitive feature extractor followed by series of transpose convolutions to enhance spatial resolution of feature maps.
and
we augmented on top of each individual FPNs, a Context-sensitive Prediction Module (CPM) [63]. This contextual module consists of 4 Inception-ResNet-A blocks [62] with 128 and 256 filters for 3 × 3 convolution and 1024 filters for 1 × 1 convolution.
The reference 63 says:
We design the Context-sensitive Predict Module (CPM), see Fig. 3(b), in which we replace the convolution layers of context module in SSH by the residual-free prediction module of DSSD.
Issues
From the quotes above, I understand the CPM as an SSH context module with different convolution operations.
But Figure 4 of your paper and your code show a channel expansion that looks like the prediction module of DSSD (a kind of simplified Inception) followed by a standard SSH context module.
I did not find any Inception-ResNet-A blocks.
Additionally, I did not find the transpose convolutions part.
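To make the two readings concrete, here is a rough PyTorch sketch of how I understand them. It is not taken from your code or from either paper: the class names (`PredictionBlock`, `SSHContext`), the helper functions, and all channel counts are my own placeholders, and the blocks are only loosely modeled on the DSSD prediction module and the SSH context module as I read them.

```python
import torch
from torch import nn


class PredictionBlock(nn.Module):
    """Hypothetical DSSD-style block: a 1x1 bottleneck branch plus a 1x1 skip branch, summed."""

    def __init__(self, in_ch, mid_ch, out_ch):
        super().__init__()
        self.branch = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, kernel_size=1), nn.ReLU(),
            nn.Conv2d(mid_ch, mid_ch, kernel_size=1), nn.ReLU(),
            nn.Conv2d(mid_ch, out_ch, kernel_size=1),
        )
        self.skip = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return torch.relu(self.branch(x) + self.skip(x))


class SSHContext(nn.Module):
    """SSH-style context layout: shared stem, one short and one long branch, concatenated.

    `make_op(in_ch, out_ch)` builds each operation, so the same layout can be
    filled with plain 3x3 convolutions or with prediction blocks.
    """

    def __init__(self, in_ch, out_ch, make_op):
        super().__init__()
        half = out_ch // 2
        self.stem = make_op(in_ch, half)
        self.short = make_op(half, half)
        self.long = nn.Sequential(make_op(half, half), make_op(half, half))

    def forward(self, x):
        s = self.stem(x)
        return torch.cat([self.short(s), self.long(s)], dim=1)


def conv3x3(in_ch, out_ch):
    # Plain 3x3 convolution + ReLU, as in a standard SSH context module.
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1), nn.ReLU())


def pred_op(in_ch, out_ch):
    # Prediction block used as a drop-in replacement for a convolution.
    return PredictionBlock(in_ch, out_ch, out_ch)


# Reading A (from the quoted text): SSH layout whose convolutions are replaced
# by prediction blocks.
cpm_a = SSHContext(256, 256, pred_op)

# Reading B (what Figure 4 and the code look like to me): one prediction block
# that expands the channels, followed by a standard SSH context module.
cpm_b = nn.Sequential(PredictionBlock(256, 256, 1024), SSHContext(1024, 256, conv3x3))

x = torch.randn(1, 256, 40, 40)  # placeholder feature map
print(cpm_a(x).shape, cpm_b(x).shape)
```

With these placeholder sizes both variants keep the spatial resolution and return 256 channels, so they are interchangeable from the outside, but internally they are different modules. Neither contains an Inception-ResNet-A block or a transpose convolution.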
Sorry for the inconvenience. I just want to make sure I don't miss any detail and that I get it right as soon as possible.