Commit 1d6970c

updated notes on nn interpretability -- cs230
1 parent 5c547ca commit 1d6970c

Lines changed: 26 additions & 0 deletions


stanford_lectures/cs230/cs230.md

@@ -85,4 +85,30 @@
- **saliency maps**
- the pixels with the strongest influence on the score can be used to segment the image; a simple thresholding value is enough for this
- saliency maps are a quick technique for visualizing what the network is looking at
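The thresholding idea can be sketched with a toy model. Here a linear scoring model stands in for the network (so the gradient of the score with respect to each pixel is just the corresponding weight); all the arrays are made-up illustrations, not from the lecture:

```python
import numpy as np

# Toy stand-in for a network: a linear scoring model, score = sum(w * x).
# For a linear model, d(score)/d(pixel) is just the matching weight,
# so the saliency map is simply |w|.
rng = np.random.default_rng(0)
x = rng.random((4, 4))          # a 4x4 "image" (hypothetical)
w = rng.random((4, 4))          # model weights (hypothetical)

score = float((w * x).sum())
saliency = np.abs(w)            # |d score / d pixel|

# thresholding the saliency map gives a rough segmentation mask
mask = saliency > saliency.mean()
```

For a real CNN the gradient would come from backpropagation (e.g. autograd) rather than being read off the weights, but the thresholding step is the same.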
- **occlusion sensitivity**
- say we're classifying the same dog / cat image as before, but we cover some pixels with a grey square; we then test how much impact the square has at different locations
- wherever the output score is lowest, that's where the dog most likely is, i.e. where the most important pixels are
- **important:** the square may also improve the probability; for example, if there's a human and a dog in the picture and the square covers up the human, the model will most likely perform better
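A minimal NumPy sketch of this sliding-square procedure, with a toy scoring function standing in for the CNN (the function name and patch settings are made up for illustration):

```python
import numpy as np

def occlusion_map(image, score_fn, patch=2, fill=0.5):
    """Slide a grey patch over the image and record the model's score
    at each position; low scores mark the important regions."""
    h, w = image.shape
    out = np.zeros((h - patch + 1, w - patch + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = fill  # grey square
            out[i, j] = score_fn(occluded)
    return out

# toy model: the "dog evidence" lives entirely in the top-left 2x2 corner
score = lambda img: img[:2, :2].sum()
img = np.ones((4, 4))
heat = occlusion_map(img, score)
# the score drops the most when the square covers the top-left corner
```

Here `heat.argmin()` points at the occlusion position that hurt the score most, which is exactly the "where the output is lowest" reading from the notes.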
- **class activation maps**
- classification networks have really good localization ability
- off topic, but your typical CNN structure looks like this
- a stack of conv, relu, max-pool layers (in that order), then you flatten the output, pass it through a couple of fully connected (FC) layers, and apply softmax to get the output; the FC layers play the role of the classifier
- building off that idea, to produce a class activation map we get rid of the flatten layer, since flattening everything into one vector throws away all the spatial information
- instead we use a global average pooling layer: we take all the feature maps produced by the last conv layer and reduce each one to the average of its values
- for example, if the dimension was (4, 4, 6), corresponding to 6 feature maps of size 4x4, the new output after global average pooling would be (1, 1, 6)
- after this, we pass it through a single FC layer + softmax, which outputs a probability for each class
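The (4, 4, 6) → (1, 1, 6) example can be checked directly in NumPy (the values are arbitrary, only the shapes matter):

```python
import numpy as np

# 6 feature maps of size 4x4, as in the example above
fmaps = np.arange(4 * 4 * 6, dtype=float).reshape(4, 4, 6)

# global average pooling: average each 4x4 map down to a single number,
# keeping the channel axis so the result is (1, 1, 6)
gap = fmaps.mean(axis=(0, 1), keepdims=True)
```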
- the feature maps actually contain some **visual patterns**
- some parts of a feature map will be lit up, which tells you the activations have found something in those spots; you can repeat this inspection for all the feature maps
- a lit-up region basically means there was a visual pattern in the input that activated that feature map
![Screenshot 2024-07-13 at 7.26.19 PM.png](https://prod-files-secure.s3.us-west-2.amazonaws.com/c9aa599c-2115-4330-846e-652102e8621e/5fdf3b37-7592-4cef-b54f-dceec38518f1/Screenshot_2024-07-13_at_7.26.19_PM.png)
- looking at the image above, the score for dog is 91%; now we can reverse engineer how much of that score came from each of these feature maps
- if we weight each feature map by its FC weight for "dog" and sum them all up, we get another feature map, which is the **class activation map** for "dog"
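That weighted sum is a one-liner; here random tensors stand in for the real activations and FC weights:

```python
import numpy as np

rng = np.random.default_rng(1)
fmaps = rng.random((4, 4, 6))   # feature maps from the last conv layer (made up)
w_dog = rng.random(6)           # FC weights from the 6 pooled values to "dog"

# class activation map: weight each feature map by its FC weight and sum
# over the channel axis -> a single (4, 4) map
cam = (fmaps * w_dog).sum(axis=-1)
```

Broadcasting handles the per-channel weighting: `(4, 4, 6) * (6,)` scales each feature map by its weight before the sum.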
![Screenshot 2024-07-13 at 7.29.02 PM.png](https://prod-files-secure.s3.us-west-2.amazonaws.com/c9aa599c-2115-4330-846e-652102e8621e/fb9f6ea2-190a-4d75-aa27-c1a67c393784/Screenshot_2024-07-13_at_7.29.02_PM.png)
- you can see the score was probably influenced most by the 2nd feature map, as it highlights components of the dog
- **dataset search**
- take a feature map from the last conv layer and find the examples in the dataset that activate it most strongly; you will likely find a common trend, and then you'll know what that feature map was looking for
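The search itself is just a ranking over the dataset; a sketch with hypothetical per-image activation scores (in practice these would be the mean activation of the chosen feature map on each image):

```python
import numpy as np

rng = np.random.default_rng(2)
# hypothetical: mean activation of one chosen feature map over 100 images
activations = rng.random(100)

# the images that activate the map most strongly hint at what it detects;
# take the 5 highest-scoring indices, strongest first
top5 = np.argsort(activations)[-5:][::-1]
```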
