ImageNet-3DCC and corruption updates #85

Merged: max-andr merged 11 commits into RobustBench:master from ofkar:master on May 9, 2022
ofkar (Contributor) commented Apr 30, 2022

Hi

> Git reports that nearly every file in the repository has been changed in this pull request (224 files in total), although most files have no actual changes. This can be a bit misleading for the git history. Is it possible to remove this effect? The commit responsible for that seems to be this one.

I forked again and made the updates, this should be good now. Sorry for the issue.

> I'd remove print statements here and here.

Done.

> Now thinking a bit more, I'm not sure if it's needed there. In principle, the evaluation process of 3DCC is the same as for 2DCC which is already illustrated in the lines above. The only difference is how to download the data.

Actually, I also created a new loader function to isolate the two. Also, the quickstart includes the names of the corruptions in ImageNet-3DCC which could be handy. Let me know if this sounds good.
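For readers following along, here is a minimal sketch of what "the same evaluation process" amounts to: loop over corruptions and severities, record one accuracy per pair (the unaggregated results), then average. This is not the RobustBench code. The `load` callable stands in for whichever loader the library provides, its `(corruption, severity)` signature is an assumption, and the toy loader, toy predictor and the two corruption names in the usage are illustrative only.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical signatures, chosen only for this sketch.
Loader = Callable[[str, int], Tuple[List[int], List[int]]]
Predictor = Callable[[List[int]], List[int]]


def evaluate_corruptions(
    predict: Predictor,
    load: Loader,
    corruptions: Sequence[str],
    severities: Sequence[int] = (1, 2, 3, 4, 5),
) -> Dict[Tuple[str, int], float]:
    """Return accuracy for every (corruption, severity) pair."""
    results = {}
    for corruption in corruptions:
        for severity in severities:
            x, y = load(corruption, severity)
            preds = predict(x)
            correct = sum(int(p == t) for p, t in zip(preds, y))
            results[(corruption, severity)] = correct / len(y)
    return results


def aggregate(results: Dict[Tuple[str, int], float]) -> float:
    """Average accuracy over all corruptions and severities."""
    return mean(results.values())


# Toy stand-ins so the sketch runs without any dataset or model.
def toy_load(corruption: str, severity: int) -> Tuple[List[int], List[int]]:
    x = list(range(10))
    y = [v % 2 for v in x]  # five examples of class 0, five of class 1
    return x, y


def toy_predict(x: List[int]) -> List[int]:
    return [0] * len(x)  # always predicts class 0


if __name__ == "__main__":
    # Corruption names here are placeholders; check the quickstart for the real list.
    res = evaluate_corruptions(toy_predict, toy_load, ["fog_3d", "near_focus"])
    print(aggregate(res))  # 0.5
```

Under this framing, the only part that differs between 2DCC and 3DCC is the loader passed in, which is consistent with keeping the two loaders separate.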

> Now about how to get the data: we briefly noted here that the test set of ImageNet should be downloaded manually. We didn't say anything about ImageNet-C, though (I think we just forgot to write something about it). I'd suggest that you add a note there on where to download ImageNet-C, followed by your instruction about ImageNet-3DCC:
> Download the data from here using the provided tool. The data will be saved into a folder named ImageNet-3DCC.
> What do you think?

Good point, added a section for that.

> Since it's not going to be a new leaderboard, I'd formulate it a bit differently. More like: "We have extended the common corruptions leaderboard on ImageNet" instead of "created a new benchmark". It's also worth specifying that we still sort the entries according to the 2D common corruptions, and explaining in a single sentence why these 3D common corruptions are interesting: (1) they are more realistic, (2) they can be used to assess generalization of the existing models which may have overfitted to 2DCC.

Done.

> And as a separate news item, I'd also add that we fixed the preprocessing issue and write explicitly that this changed the ranking between the top-1 and top-2 entries.

Done.

Let me know if you see further issues. Thanks!

ofkar changed the title from "ImageNet-3DD and corruption updates" to "ImageNet-3DCC and corruption updates" on Apr 30, 2022

fra31 (Member) commented Apr 30, 2022

Hi,

thanks a lot for the contribution, it looks great!

Is it possible to also add the unaggregated results, as here?

As a minor thing, I'd add, e.g., arrows to the header of the table in the README to indicate that higher is better for robust accuracy, while for mCE it's the opposite.
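To make the "lower is better" direction concrete, here is a small sketch of mCE as defined in the ImageNet-C paper: for each corruption, the model's errors summed over severities are divided by a baseline model's summed errors (AlexNet in the paper), and the ratios are averaged over corruptions. Whether the leaderboard code matches this in every detail is an assumption, and all numbers below are made up for illustration.

```python
from typing import Dict, List


def mean_corruption_error(
    model_err: Dict[str, List[float]],
    baseline_err: Dict[str, List[float]],
) -> float:
    """mCE: per-corruption error normalized by a baseline, averaged over corruptions.

    Both arguments map a corruption name to its error rates, one per severity.
    """
    ratios = []
    for corruption, errs in model_err.items():
        # Corruption error: summed model error relative to summed baseline error.
        ratios.append(sum(errs) / sum(baseline_err[corruption]))
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Made-up error rates for two corruptions at two severities each.
    model = {"fog": [0.25, 0.5], "blur": [0.25, 0.25]}
    baseline = {"fog": [0.5, 1.0], "blur": [0.5, 0.5]}
    print(mean_corruption_error(model, baseline))  # 0.5
```

A value below 1 means the model makes fewer errors than the baseline on average, so smaller mCE is better, whereas robust accuracy improves as it grows.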

ofkar (Contributor, Author) commented Apr 30, 2022

Thanks. I made the updates you suggested. Let me know if everything looks good.

max-andr (Member) commented May 1, 2022

Wow, that was fast, especially for a Saturday ;) The changes look good. A few further suggestions:

  • README, news section: add a space after '-' (so that Markdown renders it as a bullet point).

  • README, news section: I'd explain in a few words: "We fixed the preprocessing issue for ImageNet corruption evaluations." -> "We fixed the preprocessing issue for ImageNet corruption evaluations: previously we used resize to 256x256 and central crop to 224x224 which was a mistake since the ImageNet-C images are already 224x224 and we cropped them further losing information."

  • Slightly reorganized the dataset downloading instructions.

> Actually, I also created a new loader function to isolate the two. Also, the quickstart includes the names of the corruptions in ImageNet-3DCC which could be handy. Let me know if this sounds good.

  • Ok, I agree, some people may find that snippet useful. I've compressed it a bit to make it more concise (e.g., saving as a pickle is not essential, the loop over models is something that users can write themselves, etc.).
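The preprocessing fix described in the second bullet can be checked with a little arithmetic. The sketch below assumes a square image, a square resize and a center crop, as in the 256x256 / 224x224 numbers quoted above; the function name is hypothetical and not part of RobustBench.

```python
def retained_fraction(image_size: int, resize_to: int, crop_size: int) -> float:
    """Fraction of the original image area that survives resize + center crop."""
    # After resizing, each original pixel spans resize_to / image_size resized pixels,
    # so a crop of crop_size resized pixels covers crop_size * image_size / resize_to
    # original pixels per side (never more than the full side).
    side_kept = min(crop_size * image_size / resize_to, image_size)
    return (side_kept / image_size) ** 2


if __name__ == "__main__":
    # ImageNet-C images are already 224x224: resizing to 256 and cropping 224
    # keeps 224/256 = 0.875 of each side.
    print(retained_fraction(224, 256, 224))  # 0.765625
    # Without the extra resize, nothing is lost.
    print(retained_fraction(224, 224, 224))  # 1.0
```

In other words, the old pipeline discarded roughly 23% of each ImageNet-C image's area before the model saw it.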

I applied these and a few other minor changes to the README in a new commit. I'd say that everything looks good to me and we could merge unless others (@VSehwag, @dedeswim) have some further suggestions.

dedeswim (Member) left a comment
LGTM. Sorry if it took me long to review this. Thanks a lot for the contribution! :)

dedeswim (Member) left a comment
Actually, I have just realized that also the Jinja template used to generate the leaderboard website should be updated to reflect this addition, by adding the new columns in the Corruptions ImageNet leaderboard.

@ofkar can you take care of that? Otherwise I can help with this :)

ofkar (Contributor, Author) commented May 3, 2022

@dedeswim I'm not too familiar with it, but I actually created another PR to update the website too: RobustBench/robustbench.github.io#13

So is this update related to those changes as well? I have already entered new entries to the corruption leaderboard there.

dedeswim (Member) commented May 3, 2022

Yeah I saw the PR, thanks also for that one! We have this template and script which we use for generating the leaderboard from the *.json files to make updates easier. If you are not familiar with it, I can edit the template for you, no worries
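As a rough illustration of the "template plus script" idea, the sketch below turns JSON entries into table rows that include a new 3DCC column. It deliberately uses plain Python string formatting instead of Jinja so that it runs with the standard library alone. The JSON keys, model names and numbers are hypothetical and do not reflect the real *.json files or the actual template.

```python
import json
from typing import Dict, List

# Hypothetical entries; the real *.json files may use different keys.
ENTRIES_JSON = """
[
  {"name": "Model A", "acc_2dcc": 52.0, "acc_3dcc": 47.5},
  {"name": "Model B", "acc_2dcc": 55.5, "acc_3dcc": 45.0}
]
"""


def render_rows(entries: List[Dict]) -> str:
    """Render a Markdown table, ranked by 2D common-corruption accuracy."""
    # Entries stay sorted by the 2DCC column, as discussed earlier in this thread.
    ranked = sorted(entries, key=lambda e: e["acc_2dcc"], reverse=True)
    lines = [
        "| Rank | Model | 2DCC acc. | 3DCC acc. |",
        "|---|---|---|---|",
    ]
    for rank, entry in enumerate(ranked, start=1):
        lines.append(
            f"| {rank} | {entry['name']} | "
            f"{entry['acc_2dcc']:.1f} | {entry['acc_3dcc']:.1f} |"
        )
    return "\n".join(lines)


table = render_rows(json.loads(ENTRIES_JSON))

if __name__ == "__main__":
    print(table)
```

With a real template engine, the loop and the column headers would live in the template file, which is why adding the new columns means editing the template rather than the data.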

max-andr (Member) commented May 9, 2022

Edoardo and I agreed that the change to the Jinja template can be done in a separate PR. Merging then! Thanks again, Oguzhan!

max-andr merged commit df31621 into RobustBench:master on May 9, 2022