
Reproducing the deep double descent paper
stpn
created: June 5, 2025, 6:34 p.m. | updated: June 5, 2025, 10:43 p.m.
After reading the Deep Double Descent paper, I wanted to see if I understood enough to reproduce the results.
The phrase "double descent" refers to the behavior where test error improves at first as models get bigger, then peaks much worse near the point where the model can just barely fit the training data, then eventually comes back down again.
To be honest, I'm not sure how much I trust my intuition so far about why double descent happens.
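The runs below differ only in the fraction of corrupted training labels. As a minimal sketch of the kind of label-noise injection involved (the function name and `noise_frac` are mine, and I'm assuming integer class labels and the convention of replacing a fraction of labels with a uniformly random different class, not necessarily the paper's exact recipe):

```python
import numpy as np

def add_label_noise(labels, noise_frac, num_classes=10, seed=0):
    """Replace a noise_frac fraction of labels with a random *different* class."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).copy()
    idx = rng.choice(len(labels), size=int(noise_frac * len(labels)), replace=False)
    # Offset by 1..num_classes-1 so a noisy label never equals the original.
    offsets = rng.integers(1, num_classes, size=len(idx))
    labels[idx] = (labels[idx] + offsets) % num_classes
    return labels
```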
[Plots: results with no label noise, 10% label noise, and 20% label noise]

Some observations

With no label noise, there is no double descent.
In the same 10% label noise plot, scanning from left to right, the larger models (starting around k=16) also show double descent.
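For reference, here k is the width multiplier from the paper, where layer widths scale as [k, 2k, 4k, 8k] and k=64 recovers the standard ResNet18. A minimal sketch of that kind of width sweep; the small stand-in CNN and `make_model` are my own so the sketch runs end to end, not the actual architecture:

```python
import torch.nn as nn

def make_model(k, num_classes=10):
    # Widths scale as [k, 2k, 4k, 8k], mirroring the paper's parameterization.
    return nn.Sequential(
        nn.Conv2d(3, k, 3, padding=1), nn.ReLU(),
        nn.Conv2d(k, 2 * k, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(2 * k, 4 * k, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(4 * k, 8 * k, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(8 * k, num_classes),
    )

for k in [1, 2, 4, 8, 16, 32, 64]:
    model = make_model(k)
    n_params = sum(p.numel() for p in model.parameters())
    print(f"k={k}: {n_params:,} parameters")
    # train(model, ...) and record test error at each k to trace the curve
```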