-
From Weight Decay to Hyperball Optimization (Part 2): Weight decay theory deep dive
Part 2 of a two-part series: why weight decay sets the effective step size in scale-invariant nets.
-
From Weight Decay to Hyperball Optimization (Part 1): Hyperball optimizer + intuition
Part 1 of a two-part series: Hyperball optimization and empirical intuition.