Open
Why is the matrix field of the struct ContrastsMatrix a Matrix{Float64}? In many cases, for DummyCoding() or FullDummyCoding(), it could be a BitMatrix or a SparseMatrixCSC{Bool, Int64}.
For big datasets I tried to do something like this:
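```julia
using LinearAlgebra, SparseArrays, StatsModels

mutable struct OwnDummyCoding <: StatsModels.AbstractContrasts
    # Dummy contrasts
end

function StatsModels.contrasts_matrix(C::OwnDummyCoding, baseind, n)
    # n×n sparse identity with the base-level column dropped
    sparse(I, n, n)[:, [1:(baseind-1); (baseind+1):n]]
end
```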
But I run out of memory because ContrastsMatrix tries to convert this to a Matrix{Float64}.
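To give a sense of scale, here is a rough sketch of the storage involved. The level count `n = 10^5` and `baseind = 1` are assumptions chosen for illustration, and the dense size is computed arithmetically rather than allocated. With these values a dense Float64 matrix of the same shape would need about 8×10^10 bytes.

```julia
using LinearAlgebra, SparseArrays

n = 10^5      # assumed number of levels, for illustration only
baseind = 1   # assumed base level

# Sparse Bool dummy-coding matrix: identity with the base-level column dropped
S = sparse(I, n, n)[:, [1:(baseind-1); (baseind+1):n]]

# Bytes a dense Matrix{Float64} of the same shape would need (not allocated here)
dense_bytes = size(S, 1) * size(S, 2) * sizeof(Float64)

println("sparse: ", Base.summarysize(S), " bytes")
println("dense:  ", dense_bytes, " bytes")
```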
PharmCat commented on Jan 3, 2022
Is it possible to make:
palday commented on May 19, 2022
@PharmCat how many contrast levels do you have? If this is for the grouping variable in MixedModels.jl, then there is the `Grouping()` pseudocontrast, which avoids creating an actual matrix.
PharmCat commented on May 20, 2022
@palday
Hello! It can be more than 10^5. Actually, I'm working on Metida.jl, which helps me with some tasks where MixedModels.jl can't be used. I know that this problem is solved in MixedModels, and Metida has a "workaround" too. I have seen `Grouping` in MixedModels.jl; maybe the `Grouping` code should be moved to StatsModels.jl and documented there (perhaps along with some other code from MixedModels, such as the use of "/" in terms).
Also, I don't know why the matrix field of ContrastsMatrix is fixed as Matrix{Float64} and why it can't be more flexible.
I also can't find any roadmap for StatsModels. I think StatsModels is a core package of the JuliaStats ecosystem, but there is no information about its development plan toward version 1.0.
palday commented on May 20, 2022
The nesting syntax `/` is implemented in RegressionFormulae.jl
palday commented on May 20, 2022
The implementation of `Grouping()` is quite simple: https://github.com/JuliaStats/MixedModels.jl/blob/621f88b1f594ea0827d9ac7e8628113dd2121bef/src/grouping.jl#L2-L34
Depending on the exact structure of your model, you might be able to skip using the full formula infrastructure and instead call a custom `modelcols` method directly -- this is how random effects and associated sparse matrices are constructed in MixedModels.
PharmCat commented on May 20, 2022
Yep, but this means that I would have to either copy this code or include MixedModels as a dependency. Could this functionality be placed in StatsModels instead?
palday commented on Jun 28, 2022
There's nothing wrong with copying this code, but maybe @kleinschmidt has thoughts on whether it makes more general sense to include this in StatsModels?