Unified framework for dealing with Nulls? #17
Comments
For functions like Anyway it's not clear whether this system would be the same with statistical models. If the question is as simple as for
So overall, maybe we just need to standardize a keyword argument name, like EDIT: Forgot to address the case of pairwise functions like
Yes, sounds very sensible. Ah, and what's the current policy for predict? I believe I should not have to worry about it for JuliaData/DataFrames.jl#1160.
I forgot to address the case of pairwise functions like cor and cov, where more choices are possible: skip all rows with at least one null in one of the columns ( But this actually has little or no relation to modeling functions. :-) Regarding
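To make the pairwise case concrete, here is a minimal sketch of the two deletion strategies for a correlation. It assumes modern Julia, where absent values are written `missing` (the thread itself speaks of nulls), and `cor_complete` is a hypothetical helper for illustration, not an existing function in any package.

```julia
using Statistics

# Hypothetical helper: correlation of two columns using only the rows
# where both values are observed.
function cor_complete(a, b)
    keep = .!ismissing.(a) .& .!ismissing.(b)
    return cor(Float64.(a[keep]), Float64.(b[keep]))
end

a = [1.0, 2.0, 3.0, 4.0, missing, 6.0]
b = [2.0, 1.0, missing, 8.0, 10.0, 11.0]
c = [1.0, missing, 2.0, 5.0, 7.0, 9.0]

# Listwise deletion: drop a row if *any* column has a missing value,
# so every pair of columns is evaluated on the same rows.
rows = .!ismissing.(a) .& .!ismissing.(b) .& .!ismissing.(c)
r_ab_listwise = cor(Float64.(a[rows]), Float64.(b[rows]))

# Pairwise deletion: each pair uses every row where that pair is observed,
# regardless of what is missing in the other columns.
r_ab_pairwise = cor_complete(a, b)
```

The two results differ because listwise deletion also discards the row where only `c` is missing, which is why a single boolean keyword may not be enough for pairwise functions.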
Cool. Yes, e.g.
It is a bit tricky perhaps to get the
Yes, there's still the option of allowing both
There is some iterator functionality for handling this in
For reduction functions like
I actually recently put a generic
Interesting, but that
That is a good idea - the function has always been intended to be extended to multiple dimensions. Which design do you think would be preferable - one where
I would start assuming a scalar. We can always make it even more general if that turns out to be useful.
Currently, `lm` seems to deal with NAs in DataFrames by ignoring the rows where they appear. This is great, but is there a general syntax for this behaviour? I'm asking, among other things, because I want to add a keyword for this behaviour to plotting with DataFrames in StatPlots.
In R, this is a complete mess. `mean` uses the keyword `na.rm = T`, `lm` uses `na.action = "na.omit"`, and `cor` uses `use = "complete.obs"`, which is just the sort of thing that guarantees that half your coding time in R is spent reading man pages, even after 10 years of use.
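As a rough illustration of the two behaviours being compared, the sketch below shows a reduction that skips absent values and a row-wise "complete case" filter of the kind a model-fitting function might apply. It assumes modern Julia, where absent values are `missing` and `skipmissing` is available in Base. The post itself predates this and talks about NAs, so treat the names here as an approximation rather than the API under discussion.

```julia
using Statistics

# Assumption: absent values are represented as `missing` (modern Julia).
x = [1.0, missing, 3.0, 4.0]
y = [2.0, 5.0, missing, 8.0]

# Reduction over one vector: skip the missing entries.
mean_x = mean(skipmissing(x))

# Row-wise ("complete case") deletion: keep only the rows where every
# column is observed, which is what ignoring rows with NAs amounts to.
complete = [!ismissing(a) && !ismissing(b) for (a, b) in zip(x, y)]
x_cc = Float64.(x[complete])
y_cc = Float64.(y[complete])
```

Both operations drop data, but at different granularity: the reduction drops individual values, while the row filter drops a whole observation as soon as one column is missing.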