
regularize in a hacky way #129

Open
beckermr opened this issue Nov 5, 2021 · 3 comments

Comments

@beckermr
Collaborator

beckermr commented Nov 5, 2021

We can use N-fold cross-validation to do hacky regularization. sklearn has a lot of nice tools for this; more or less, you can use a GridSearchCV object to do it. How it works is that you split the data into N sections. You loop over the sections, leave one out, fit a model on the rest, then predict for the held-out section. At the very end, you combine all of the out-of-sample predictions.

With this technique, we can loop through a range of regularization amplitudes, run CV for each of them, and pick the one that has the minimum chi2 (or whatever figure of merit we prefer).

It fits a lot more models, but would do the trick.
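A minimal sketch of the scheme above with sklearn's GridSearchCV. The Ridge model, the alpha grid, and the synthetic data are all illustrative assumptions; the actual model and figure of merit in this code base would differ (neg_mean_squared_error stands in for a chi2-style metric here).

```python
# Hacky regularization via N-fold cross-validation, as described above.
# NOTE: Ridge, the alpha grid, and the fake data are assumptions for
# illustration only, not the model actually used in this project.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic data standing in for the real fitting problem.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 10))
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=200)

# GridSearchCV splits the data into cv=5 folds; for each alpha it fits on
# 4 folds, predicts the held-out fold, and averages the out-of-sample scores.
search = GridSearchCV(
    Ridge(),
    param_grid={"alpha": np.logspace(-3, 2, 6)},
    cv=5,
    scoring="neg_mean_squared_error",  # stand-in for a chi2-style metric
)
search.fit(X, y)

# The regularization amplitude with the best cross-validated score.
best_alpha = search.best_params_["alpha"]
```

This fits len(alphas) * cv models instead of one, which is where the extra cost comes from.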

@rmjarvis
Owner

rmjarvis commented Nov 5, 2021

That sounds like it would multiply the running time by a large factor, which seems untenable. Am I missing something?

@beckermr
Collaborator Author

beckermr commented Nov 5, 2021

How long is the running time now?

Yes, in general it would.

Most other options I know of have similar or higher costs.

@beckermr
Collaborator Author

beckermr commented Nov 5, 2021

You only need to do this for a representative subset, FWIW. Then you can likely fix the regularization at that value for the rest of the survey.
