Scale Invariance of `RMSNorm(Wx)`

This demo shows that for any positive scale c > 0 (with ε = 0), replacing W by c·W leaves f(x) = RMSNorm(Wx) unchanged. Try different values of c and ε, and edit x and W.

Controls

Dimension d, Seed, ε (epsilon)

Scale c (option: restrict to c ≥ 0)

Set ε = 0 and choose c > 0 to see exact invariance (up to floating point). Allow negative c to observe the global sign flip.

x (input vector)

W (weight matrix)

Outputs

y = RMSNorm(Wx)

ŷ = RMSNorm(c·W x)

Difference (ŷ − y)

‖ŷ − y‖₂ = 0
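
The computation behind these outputs can be sketched in a few lines of plain Python. This is a minimal sketch, not the demo's actual code: the helper names `rms_norm` and `matvec`, the seed, the dimension d = 4, and the scale c = 3.7 are all assumptions chosen for illustration, and RMSNorm is taken without a learned gain, as in the formula used on this page.

```python
import math
import random

def rms_norm(z, eps=0.0):
    """Divide z by sqrt(mean(z^2) + eps); no learned gain (assumed)."""
    mean_sq = sum(v * v for v in z) / len(z)
    return [v / math.sqrt(mean_sq + eps) for v in z]

def matvec(W, x):
    """Plain matrix-vector product for a list-of-rows matrix."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

rng = random.Random(0)  # arbitrary seed, for reproducibility only
d = 4
x = [rng.gauss(0.0, 1.0) for _ in range(d)]
W = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(d)]
c = 3.7  # any positive scale

y = rms_norm(matvec(W, x))                 # y = RMSNorm(Wx), eps = 0
cW = [[c * w for w in row] for row in W]   # scaled weights c·W
y_hat = rms_norm(matvec(cW, x))            # ŷ = RMSNorm(c·W x)

# Euclidean norm of the difference; expected to be at rounding-error level.
diff_norm = math.sqrt(sum((a - b) ** 2 for a, b in zip(y_hat, y)))
print(diff_norm)
```

The printed value may not be exactly 0 because of floating-point rounding, but it should be many orders of magnitude smaller than the entries of y.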

Tip: If c = 0, then c·W x = 0. With ε > 0, RMSNorm returns the zero vector; with ε = 0 it is undefined (0/0). Either way this is a degenerate case, and invariance does not hold at c = 0.

Why this happens (one‑liner)

Let z = W x. RMSNorm(z) = z / √(mean(z²) + ε).
For c > 0 and ε = 0:
  RMSNorm(c z) = (c z) / √(mean((c z)²))
               = (c z) / √(c² mean(z²))
               = (c z) / (|c| √(mean(z²)))
               = z / √(mean(z²)) = RMSNorm(z),   since |c| = c for c > 0.
For c < 0 (ε = 0), RMSNorm(c z) = sign(c) · RMSNorm(z): a global sign flip.
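
The remaining cases can be checked numerically as well. The sketch below is illustrative only: the vector `z` stands in for W x, and the values c = −2, c = 10, and ε = 0.01 are arbitrary assumptions. It checks the sign flip for negative c, shows that invariance is only approximate once ε > 0, and evaluates the c = 0 case with ε > 0.

```python
import math

def rms_norm(z, eps=0.0):
    """Divide z by sqrt(mean(z^2) + eps); no learned gain (assumed)."""
    mean_sq = sum(v * v for v in z) / len(z)
    return [v / math.sqrt(mean_sq + eps) for v in z]

z = [1.0, -2.0, 3.0, 0.5]  # hypothetical stand-in for W x

# Negative scale, eps = 0: the output should equal -RMSNorm(z).
base = rms_norm(z)
flipped = rms_norm([-2.0 * v for v in z])
sign_flip_err = max(abs(f + b) for f, b in zip(flipped, base))

# Positive scale, eps > 0: eps is not rescaled with z, so the outputs differ.
eps = 1e-2
base_eps = rms_norm(z, eps)
scaled_eps = rms_norm([10.0 * v for v in z], eps)
eps_gap = max(abs(s - b) for s, b in zip(scaled_eps, base_eps))

# c = 0 with eps > 0: every entry is 0 / sqrt(eps), i.e. zero.
zero_out = rms_norm([0.0 * v for v in z], eps)

print(sign_flip_err, eps_gap, zero_out)
```

With ε = 0 and c = 0 the same function would divide by zero; in plain Python that raises `ZeroDivisionError`, which is one reason practical implementations keep a small positive ε.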