About 498,000 results
Open links in new tab
  1. What is regularization in plain english? - Cross Validated

    Is regularization really ever used to reduce underfitting? In my experience, regularization is applied on a complex/sensitive model to reduce complexity/sensitvity, but never on a …

  2. How does regularization reduce overfitting? - Cross Validated

    Mar 13, 2015 · A common way to reduce overfitting in a machine learning algorithm is to use a regularization term that penalizes large weights (L2) or non-sparse weights (L1) etc. How can …

  3. What are Regularities and Regularization? - Cross Validated

    Is regularization a way to ensure regularity? i.e. capturing regularities? Why do ensembling methods like dropout, normalization methods all claim to be doing regularization?

  4. L1 & L2 double role in Regularization and Cost functions?

    Mar 19, 2023 · Regularization - penalty for the cost function, L1 as Lasso & L2 as Ridge Cost/Loss Function - L1 as MAE (Mean Absolute Error) and L2 as MSE (Mean Square Error) …

  5. When should I use lasso vs ridge? - Cross Validated

    The regularization can also be interpreted as prior in a maximum a posteriori estimation method. Under this interpretation, the ridge and the lasso make different assumptions on the class of …

  6. neural networks - L2 Regularization Constant - Cross Validated

    Dec 3, 2017 · When implementing a neural net (or other learning algorithm) often we want to regularize our parameters $\\theta_i$ via L2 regularization. We do this usually by adding a …

  7. Boosting: why is the learning rate called a regularization parameter?

    The learning rate parameter ($\nu \in [0,1]$) in Gradient Boosting shrinks the contribution of each new base model -typically a shallow tree- that is added in the series. It was shown to dramatically

  8. Difference between weight decay and L2 regularization

    Apr 6, 2025 · I'm reading [Ilya Loshchilov's work] [1] on decoupled weight decay and regularization. The big takeaway seems to be that weight decay and $L^2$ norm …

  9. machine learning - Why use regularisation in polynomial …

    Aug 1, 2016 · Compare, for example, a second-order polynomial without regularization to a fourth-order polynomial with it. The latter can posit big coefficients for the third and fourth powers so …

  10. How is adding noise to training data equivalent to regularization?

    Oct 18, 2021 · I've noticed that some people argue that adding noise to training data equivalent to regularizing our predictor parameters. How is this the case? Some of the examples listed on …