Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Longitudinal models #520

Merged
merged 20 commits into from
Apr 11, 2023
Merged

Conversation

NathanielF
Copy link
Contributor

@NathanielF NathanielF commented Feb 5, 2023

Example Notebook on Longitudinal Analysis and Growth Curve Trajectories.

Related to this issue: #508

Example of iterative model construction on longitudinal data following the Willett and Singer book. In the first model we follow the text quite closely, and in the second model we choose an alternative likelihood model and fully difrerent set of priors.

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@NathanielF NathanielF marked this pull request as ready for review February 13, 2023 10:13
@NathanielF
Copy link
Contributor Author

I think this is close to done now. Tagging @aloctavodia and @drbenvincent since you both expressed an interest in the topic.
The broad structure of the notebook is as follows:

image

I focus on 2 examples of iterative model building and the extraction of the within-individual and across-individual trajectories of growth learned by the model. I've tried to explain the crucial aspects of the model as i understand them without getting too caught up in the confusing vocabulary of fixed and random effects, so i'd be grateful for feedback on this aspect especially. I don't want to add to the confusion around these terms!

In the first example i follow the textbook very closely using a normal likelihood and derive parameter estimates akin to those reported in the text. In the second example i go off-reservation and use a gumbel likelihood with censoring which looks more appropriate to me than the Normal likelihood they were constrained to use in the text. I also briefly call out that you can (and probably should) use Bambi for these kinds of models where possible.

@drbenvincent
Copy link
Contributor

Sorry for slow response @NathanielF. Fingers crossed I'll have time to review this weekend, or early next week.

@NathanielF
Copy link
Contributor Author

Sorry for slow response @NathanielF. Fingers crossed I'll have time to review this weekend, or early next week.

No problem! Thanks for letting me know.

@NathanielF
Copy link
Contributor Author

Giving this one another nudge in case Friday is a good day for this review @drbenvincent?

@review-notebook-app
Copy link

review-notebook-app bot commented Mar 11, 2023

View / edit / reply to this conversation on ReviewNB

drbenvincent commented on 2023-03-11T17:27:55Z
----------------------------------------------------------------

First line should have the name of the notebook in the parentheses, not the _title_ of the notebook


NathanielF commented on 2023-03-12T13:54:09Z
----------------------------------------------------------------

Amended

@review-notebook-app
Copy link

review-notebook-app bot commented Mar 11, 2023

View / edit / reply to this conversation on ReviewNB

drbenvincent commented on 2023-03-11T17:27:56Z
----------------------------------------------------------------

Double check the pymc.sampling_jax import line, just include what's necessary. Optionally, if you use PyMC 5.1.0, you can avoid importing these I believe and provide a nuts_sampler argument to pm.sample.


NathanielF commented on 2023-03-12T13:57:42Z
----------------------------------------------------------------

Yeah, i don't need the blackjax sampler. Removed now.

@review-notebook-app
Copy link

review-notebook-app bot commented Mar 11, 2023

View / edit / reply to this conversation on ReviewNB

drbenvincent commented on 2023-03-11T17:27:57Z
----------------------------------------------------------------

Capitalize Bayesian

Structure makes it sound like Bambi is used exclusively, but it's only used for one model?


NathanielF commented on 2023-03-12T13:59:12Z
----------------------------------------------------------------

Adjusted this now to be more explicit that the bambi section is a digression to explain an alternative method. Capitalised Bayesian too

@review-notebook-app
Copy link

review-notebook-app bot commented Mar 11, 2023

View / edit / reply to this conversation on ReviewNB

drbenvincent commented on 2023-03-11T17:27:58Z
----------------------------------------------------------------

Ad space after try/except block


NathanielF commented on 2023-03-12T13:59:39Z
----------------------------------------------------------------

Adjusted.

@review-notebook-app
Copy link

review-notebook-app bot commented Mar 11, 2023

View / edit / reply to this conversation on ReviewNB

drbenvincent commented on 2023-03-11T17:27:59Z
----------------------------------------------------------------

Use the hide-input cell tag. See style guide for more info.


NathanielF commented on 2023-03-12T13:59:52Z
----------------------------------------------------------------

Hidden

@review-notebook-app
Copy link

review-notebook-app bot commented Mar 11, 2023

View / edit / reply to this conversation on ReviewNB

drbenvincent commented on 2023-03-11T17:28:00Z
----------------------------------------------------------------

Use the hide-input cell tag. See style guide for more info.


NathanielF commented on 2023-03-12T14:02:43Z
----------------------------------------------------------------

hidden

Copy link
Contributor Author

hidden


View entire conversation on ReviewNB

Copy link
Contributor Author

hidden


View entire conversation on ReviewNB

Copy link
Contributor Author

adjusted. Note one concern here is the color formatting renders differently to color the whole line rather than just the symbols on the site.


View entire conversation on ReviewNB

Copy link
Contributor Author

Hidden


View entire conversation on ReviewNB

Copy link
Contributor Author

Adjusted


View entire conversation on ReviewNB

Copy link
Contributor Author

hidden


View entire conversation on ReviewNB

Copy link
Contributor Author

hidden


View entire conversation on ReviewNB

Copy link
Contributor Author

Adjusted


View entire conversation on ReviewNB

Copy link
Contributor Author

adjusted


View entire conversation on ReviewNB

Copy link
Contributor

@drbenvincent drbenvincent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few minor style issues:

  • For the multi-line equation block where we have the align environment, if you place the $ symbol before the = or ~ then everything aligns to that. At the moment it's right aligned.
  • Add time series to the tags
  • Consider moving from the Case Studies section to the Time Series section of the repo
  • Missing space after full stop for "our model.The"
  • Remove commented out line... # grade = pm.MutableData('grade_data', df_external['GRADE'].values)
  • Ideally, it would be possible to suppress console output after plotting with ;. But I can't remember if the black formatting removes that. I believe there's a way to suppress black from removing that
  • Where you do use sample_numpyro_nuts, can you use the hide-output (cell tag) because we get a whole bunch of progress bar output that we don't need. Or maybe there's a specific kwarg for that sampler to suppress that?
  • In the rendered website output, there's something really broken about the rendering of the L2 Watermark header, after the References L2 header. I've not double checked, but can probably be fixed by having that in it's own cell. Either way, there's something not working there. More info here https://www.pymc.io/projects/docs/en/latest/contributing/jupyter_style.html#watermark

Copy link
Contributor Author

Done.


View entire conversation on ReviewNB

@drbenvincent
Copy link
Contributor

Nice. I'll endeavour to do a final pass, reviewing content/clarity in the next few days and we can ship this :) Do feel free to ping me, but it is on my list.

@NathanielF
Copy link
Contributor Author

Consider moving from the Case Studies section to the Time Series section of the repo

On this point, i think i prefer it in the case studies section. Time series analysis/modelling is (in my head anyway) a different kind of beast than longitudinal curve analysis. They're clearly, related... but just not the same bucket for me. Can move it if you feel strongly about it, but i prefer it as it is.

On the watermark thing i wasn't sure what you meant... it looks ok to me:

image

@NathanielF
Copy link
Contributor Author

Nice. I'll endeavour to do a final pass, reviewing content/clarity in the next few days and we can ship this :) Do feel free to ping me, but it is on my list.

Perfect. Thanks for your help on this!

@NathanielF
Copy link
Contributor Author

Just a small nudge @drbenvincent.

@drbenvincent drbenvincent self-requested a review April 10, 2023 10:58
Copy link
Contributor

@drbenvincent drbenvincent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a coord for the observations for all models. We have 246 in the early models, and 254 with later models.

@drbenvincent
Copy link
Contributor

Typo in this sentence "Implementing the model is PyMC is as follows"

@drbenvincent
Copy link
Contributor

drbenvincent commented Apr 10, 2023

I'm wondering if we could get the model math for the "The Uncontrolled Effects of Parental Alcoholism" model. Don't worry about the colour coding if that gets messy.

Same for the "Model controlling for Peer Effects" model

@drbenvincent
Copy link
Contributor

drbenvincent commented Apr 10, 2023

After this sentence

The formula specification uses 1 to denote an intercept term and a conditional | operator to denote a subject level parameter combined with the global parameter of the same type in the manner specified above.

could you just spell out how to read the hierarchical term (1 + age_14 | id) in the Bambi model equation

And we don't need the 0 + part of the equation I believe?

@drbenvincent
Copy link
Contributor

I think there's one instance of a lower case "bambi", but everything else is capitalised.

This is great. Happy to approve after these minor tweaks 👍
Sorry for the delay in finishing the review

@NathanielF
Copy link
Contributor Author

Thanks @drbenvincent! Will get to these tonight.

@NathanielF
Copy link
Contributor Author

NathanielF commented Apr 10, 2023

Made those changes @drbenvincent. Seemed to work in the Review Notebook, but docs build check renders strangely:
Seems to effect all notebooks, so likely something to do with the broader site render than my specific PR here.

Probably related to this: #541

image

@NathanielF NathanielF requested a review from drbenvincent April 10, 2023 19:56
Copy link
Contributor

@drbenvincent drbenvincent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the formatting issue looks like a broader issue, not something specific to this notebook. So happy to approve. Let me know if there are any minor tweaks you wanted to make, otherwise I'll merge.

Thanks again for the contribution, and for bearing with the tardy reviews.

@NathanielF
Copy link
Contributor Author

Happy with it. Feel free to merge! Thanks again

@drbenvincent drbenvincent merged commit ea709bb into pymc-devs:main Apr 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants