RMSNorm(Wx)
This demo shows that for positive scales c > 0 (and ε = 0),
replacing W by c·W leaves
f(x) = RMSNorm(Wx) unchanged. Try different values of c and ε, and edit x and W.
Set ε = 0 and c > 0 to see exact invariance (up to floating point). Allow negative c to observe the global sign flip.
If c = 0, then c·W x = 0 and RMSNorm returns the zero vector for any ε > 0 (with ε = 0 the expression is 0/0 and undefined), so invariance does not hold at c = 0. This is the degenerate case.
Let z = W x. RMSNorm(z) = z / √(mean(z²) + ε).
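The definition above can be sketched in plain Python. This is a minimal illustration rather than the demo's own code: the name `rms_norm` and the `eps` parameter are assumptions made here, and no learned gain is applied.

```python
import math

def rms_norm(z, eps=0.0):
    """Divide z by sqrt(mean(z^2) + eps). Hypothetical helper, no learned gain."""
    mean_sq = sum(v * v for v in z) / len(z)
    return [v / math.sqrt(mean_sq + eps) for v in z]

z = [3.0, -4.0]
y = rms_norm(z)  # eps = 0

# With eps = 0 the output has unit root-mean-square.
out_rms = math.sqrt(sum(v * v for v in y) / len(y))

# The zero vector (the c = 0 case) maps to zeros only when eps > 0.
zeros = rms_norm([0.0, 0.0], eps=1e-6)
```

With `eps=0.0` the zero vector would instead hit a 0/0 division, which plain Python reports as a `ZeroDivisionError`.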
For c > 0 and ε = 0:
RMSNorm(c z) = (c z) / √(mean((c z)²))
= (c z) / √(c² mean(z²))
= (c z) / (c √(mean(z²)))   (since √(c²) = |c| = c for c > 0)
= z / √(mean(z²)) = RMSNorm(z).
For c < 0 (ε = 0), RMSNorm(c z) = sign(c) · RMSNorm(z): a global sign flip.
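The scaling behaviour derived above can be checked numerically. The sketch below uses only the Python standard library. The helper names (`rms_norm`, `matvec`, `f`) and the sample values of W and x are assumptions chosen for illustration, not taken from the demo.

```python
import math

def rms_norm(z, eps=0.0):
    """RMSNorm(z) = z / sqrt(mean(z^2) + eps), without a learned gain."""
    mean_sq = sum(v * v for v in z) / len(z)
    return [v / math.sqrt(mean_sq + eps) for v in z]

def matvec(W, x):
    """Plain matrix-vector product W x for a list-of-rows matrix."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def f(W, x, c=1.0, eps=0.0):
    """f(x) = RMSNorm((c W) x); c = 1 gives the unscaled function."""
    scaled = [[c * w for w in row] for row in W]
    return rms_norm(matvec(scaled, x), eps)

# Illustrative values only.
W = [[1.0, -2.0], [0.5, 3.0], [2.0, 1.0]]
x = [1.5, -0.5]

base = f(W, x)                    # c = 1, eps = 0
pos = f(W, x, c=3.0)              # c > 0: should match base
neg = f(W, x, c=-3.0)             # c < 0: should equal -base
damped = f(W, x, c=3.0, eps=1.0)  # eps > 0: invariance is only approximate

def close(a, b):
    return all(math.isclose(u, v, rel_tol=1e-9) for u, v in zip(a, b))

print("c > 0 matches base:", close(base, pos))
print("c < 0 flips sign:  ", close(base, [-v for v in neg]))
print("eps > 0 matches:   ", close(base, damped))
```

The comparison uses a relative tolerance because the equalities hold only up to floating-point rounding. With ε > 0 the added constant does not scale with c, so the cancellation in the derivation no longer goes through exactly.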