
regularize in a hacky way #129

Open
beckermr opened this issue Nov 5, 2021 · 3 comments

Comments

@beckermr
Collaborator

beckermr commented Nov 5, 2021

We can use N-fold cross-validation to do hacky regularization. sklearn has a lot of nice tools for this; more or less, you can use a GridSearchCV object to do it. How it works is that you split the data into N sections. You loop over the sections, leave one out, fit a model on the rest, then predict for the held-out section. At the very end, you combine all of the out-of-sample predictions.

With this technique, we can loop through a range of regularization amplitudes, run CV for each of them, and pick the one that has the minimum chi2 (or whatever figure of merit we prefer).

It fits a lot more models, but would do the trick.
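A minimal sketch of the scheme above with sklearn's GridSearchCV. The Ridge model, the alpha grid, and the synthetic data are all illustrative assumptions; the actual model and figure of merit in this code base would differ (neg_mean_squared_error stands in for a chi2-style metric here).

```python
# Hacky regularization via N-fold cross-validation, as described above.
# NOTE: Ridge, the alpha grid, and the fake data are assumptions for
# illustration only, not the model actually used in this project.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic data standing in for the real fitting problem.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 10))
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=200)

# GridSearchCV splits the data into cv=5 folds; for each alpha it fits on
# 4 folds, predicts the held-out fold, and averages the out-of-sample scores.
search = GridSearchCV(
    Ridge(),
    param_grid={"alpha": np.logspace(-3, 2, 6)},
    cv=5,
    scoring="neg_mean_squared_error",  # stand-in for a chi2-style metric
)
search.fit(X, y)

# The regularization amplitude with the best cross-validated score.
best_alpha = search.best_params_["alpha"]
```

This fits len(alphas) * cv models instead of one, which is where the extra cost comes from.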

@rmjarvis
Owner

rmjarvis commented Nov 5, 2021

That sounds like it would multiply the running time by a large factor, which seems untenable. Am I missing something?

@beckermr
Collaborator Author

beckermr commented Nov 5, 2021

How long is the running time now?

Yes, in general it would.

Most other options I know of have similar or higher costs.

@beckermr
Collaborator Author

beckermr commented Nov 5, 2021

You only need to do this for a representative subset, FWIW. Then you can likely fix the regularization at that value for the rest of the survey.
