1 - Ranger is the optimizer we used to beat the high scores for 8 different categories on the FastAI leaderboards! (Previous records all held with AdamW optimizer).

2 - Highly recommend combining Ranger with: Mish activation function, and flat + cosine anneal training curve.
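For reference, Mish is defined as x * tanh(softplus(x)). A minimal pure-Python sketch of the function (illustrative only, not the repo's code; in practice you would use a framework implementation such as the one this repo pairs Ranger with):

```python
import math

def mish(x):
    """Mish activation: x * tanh(softplus(x)), where softplus(x) = ln(1 + e^x)."""
    softplus = math.log1p(math.exp(x))
    return x * math.tanh(softplus)

# Mish is smooth, non-monotonic, and unbounded above: mish(0) == 0,
# mish(x) ~ x for large positive x, and it dips slightly below zero
# for negative inputs before flattening toward 0.
print(mish(0.0), mish(1.0), mish(-1.0))
```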
3 - Based on that, also found .95 is better than .90 for the beta1 (momentum) param (i.e., betas=(0.95, 0.999)).
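In Adam-family optimizers like Ranger, beta1 controls the exponential moving average of the gradients (the momentum buffer). A minimal pure-Python sketch of that update (illustrative only, not the repo's code) showing how 0.95 smooths a noisy gradient sequence more than 0.90:

```python
def ema_momentum(grads, beta1):
    """Exponential moving average of gradients, as used by Adam-family
    optimizers: m_t = beta1 * m_{t-1} + (1 - beta1) * g_t."""
    m = 0.0
    for g in grads:
        m = beta1 * m + (1 - beta1) * g
    return m

# Alternating (noisy) gradients: a higher beta1 damps the oscillation more.
grads = [1.0, -1.0, 1.0, -1.0, 1.0]
m_95 = ema_momentum(grads, 0.95)
m_90 = ema_momentum(grads, 0.90)
print(abs(m_95), abs(m_90))  # the beta1=0.95 buffer oscillates less
```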
Fixes:
1 - Differential group learning rates now supported. This was fixed in RAdam and ported here thanks to @sholderbach.
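Differential group learning rates follow the standard PyTorch param-group convention: each group dict carries its own "lr" that the optimizer reads per group. A minimal plain-Python sketch of that mechanism (illustrative only, not Ranger's implementation) using a bare SGD-style step:

```python
# Each param group has its own learning rate, e.g. a small lr for a
# pretrained backbone and a larger lr for a freshly initialized head.
def sgd_step(param_groups):
    for group in param_groups:
        lr = group["lr"]  # per-group learning rate, as in torch optimizers
        for p in group["params"]:
            p["value"] -= lr * p["grad"]

groups = [
    {"params": [{"value": 1.0, "grad": 0.5}], "lr": 1e-2},  # "backbone"
    {"params": [{"value": 1.0, "grad": 0.5}], "lr": 1e-1},  # "head"
]
sgd_step(groups)
print(groups[0]["params"][0]["value"], groups[1]["params"][0]["value"])
```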
2 - In-progress fix: save and then load may leave first-run weights stranded in memory, slowing down future runs. Investigating a fix now.