Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Meta learners #170

Draft
wants to merge 62 commits into
base: main
Choose a base branch
from
Draft

Conversation

matekadlicsko
Copy link

Implemented basic functionalities for frequentist S, T and X learners. Documentations and further features are next. After that, I'm implementing PyMC compatible counterpart.

@matekadlicsko matekadlicsko marked this pull request as ready for review February 27, 2023 13:03

def summary(self):
raise(NotImplementedError())

"Prints summary. Conent is undecided yet."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Prints summary. Conent is undecided yet."
"Prints summary. Content is undecided yet."

@matekadlicsko matekadlicsko marked this pull request as draft March 1, 2023 16:01
@drbenvincent
Copy link
Collaborator

Hi @matekadlicsko. Sorry for the slow reply, some crazy things happened in my life!

Thanks for the work on this!

Once things are more progressed, would you be good to:

  • add some relevant entries in the glossary (in docs/glossary.md)?
  • add one (or more) example notebooks for the docs to demonstrate use?

@matekadlicsko
Copy link
Author

Hi @drbenvincent , I hope everything's fine. Sure, I'll add a few tests as well as soon as I'm ready with implementing all the basic functionalities I envisioned.

@drbenvincent
Copy link
Collaborator

This PR has prompted me to ramp up on meta learners, it's not something I knew about before. There is a really excellent summary of them in this video by @ShawhinT.

We should definitely include a link to that material when it comes to completing the glossary or example notebooks👍

@juanitorduz
Copy link
Collaborator

I think this has to do with the lack of doc-strings. Try commenting https://github.com/pymc-labs/CausalPy/blob/main/.pre-commit-config.yaml#L36-L42 and then run it again. You can also run git commit -m"my message" -n to ignore the pre-commit and push it to test against the ci/cd in the github actions.

pyproject.toml Outdated
@@ -25,6 +25,7 @@ requires-python = ">=3.8"
# For an analysis of this field vs pip's requirements files see:
# https://packaging.python.org/discussions/install-requires-vs-requirements/
dependencies = [
"aesera",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aesara shouldn't be required, we changed the backend to pytensor in pymc 5.0.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was suspicious to me as well, but couldn't figure out why the tests failed with ERROR - ModuleNotFoundError: No module named 'aesara'. I think pymc_bart must be causing this error. Could it be because I did not specify the version requirement for it?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it not pymc-bart with a dash and not an underscore?

@matekadlicsko
Copy link
Author

I don't think I'll be able to work on this a lot this week unfortunately. I'll share where it lacks and what my plans are for (hopefully) the next week.

  • Uplift, QINI and cumulative gain diagrams will be implemented, MetaLearner.plot should be rethought using these.
  • In the PyMC class, summary is too brief, HDI-s, etc. will be added.
  • The notebook at this moment is very much not complete, none of the Bayesian metalearners are included.

@matekadlicsko
Copy link
Author

matekadlicsko commented Mar 22, 2023

  • a bigger one:
    In the X and the DR learner, the prediction of one learner is used in another model. At this moment, I calculate the mean of the first prediction and go with that, however that feels lazy. For example here
        # Split data to treated and untreated subsets
        X_t, y_t = X[treated == 1], y[treated == 1]
        X_u, y_u = X[treated == 0], y[treated == 0]


        # Estimate response function
        _fit(treated_model, X_t, y_t, coords)
        _fit(untreated_model, X_u, y_u, coords)


        pred_u_t = az.extract(
            untreated_model.predict(X_t), group="posterior_predictive", var_names="mu"
        ).mean(axis=1)
        pred_t_u = az.extract(
            treated_model.predict(X_u), group="posterior_predictive", var_names="mu"
        ).mean(axis=1)


        tau_t = y_t - pred_u_t
        tau_u = y_u - pred_t_u


        # Estimate CATE separately on treated and untreated subsets
        _fit(treated_cate_estimator, X_t, tau_t, coords)
        _fit(untreated_cate_estimator, X_u, tau_u, coords)

the posterior predictive pred_u_t should be used to construct the prior of tau_t imo. I guess I'll test things out.

@juanitorduz
Copy link
Collaborator

I don't think I'll be able to work on this a lot this week unfortunately. I'll share where it lacks and what my plans are for (hopefully) the next week.

  • Uplift, QINI and cumulative gain diagrams will be implemented, MetaLearner.plot should be rethought using these.
  • In the PyMC class, summary is too brief, HDI-s, etc. will be added.
  • The notebook at this moment is very much not complete, none of the Bayesian metalearners are included.

I think this is a fantastic job and I also had planned to explore this. Take your time. I hope to re-visit the latest changes soon.

Also, I think it should not be perfect. We can break down the ambitious proposal into some milestones.

@juanitorduz
Copy link
Collaborator

@matekadlicsko let me know whenever you need another review (no rush)

@matekadlicsko
Copy link
Author

@juanitorduz unfortunately I have a bit less time nowadays, but I'm progressing slowly.

I'm pretty much done with skl_meta_learners.py for now. I would be very interested in your opinion about summary.py. It is definitely not in its final form, but I would love to have some feedback.

The PyMC version is not yet ready for review unfortunately.

@juanitorduz
Copy link
Collaborator

Sure! No rush! Take your time. I'll be off for a couple of weeks and come back for a review 🤜🤛

@drbenvincent
Copy link
Collaborator

drbenvincent commented May 25, 2023

@matekadlicsko It looks like the work in summary.py that this PR will close #174. If so, would you be able to edit the issue to mention that so it auto closes?

EDIT: In fact, just a suggestion, but this PR might be trying to do a lot in one go. Feel free to break out the HTML summary output into a separate PR if you want. Up to you though :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants