
Why ContrastsMatrix matrix is Matrix{Float64}? #251

Open
@PharmCat

Description

Why is the matrix field of struct ContrastsMatrix a Matrix{Float64}? In many cases with DummyCoding() or FullDummyCoding() it could be a BitMatrix or a SparseMatrixCSC{Bool, Int64}.
For big datasets I tried something like this:

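using LinearAlgebra, SparseArrays, StatsModels

# Dummy contrasts backed by a sparse Bool matrix
mutable struct OwnDummyCoding <: AbstractContrasts
end

# Sparse identity matrix with the base-level column dropped
function StatsModels.contrasts_matrix(C::OwnDummyCoding, baseind, n)
    sparse(I, n, n)[:, [1:(baseind-1); (baseind+1):n]]
end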

But I run out of memory because ContrastsMatrix tries to convert the result to Matrix{Float64}.

Activity

PharmCat (Author) commented on Jan 3, 2022

Would it be possible to define it like this:

struct ContrastsMatrix{C <: AbstractContrasts, T, U, M}
    matrix::M
    termnames::Vector{U}
    levels::Vector{T}
    contrasts::C
    invindex::Dict{T,Int}
    function ContrastsMatrix(matrix::M,
                             termnames::Vector{U},
                             levels::Vector{T},
                             contrasts::C) where {M <: AbstractMatrix, U, T, C <: AbstractContrasts}
        allunique(levels) || throw(ArgumentError("levels must be all unique, got $(levels)"))
        invindex = Dict{T,Int}(x=>i for (i,x) in enumerate(levels))
        new{C,T,U,M}(matrix, termnames, levels, contrasts, invindex)
    end
end

palday (Member) commented on May 19, 2022

@PharmCat how many contrast levels do you have? If this is for the grouping variable in MixedModels.jl, then there is the Grouping() pseudocontrast, which avoids creating an actual matrix.

PharmCat (Author) commented on May 20, 2022

> @PharmCat how many contrast levels do you have? If this is for the grouping variable in MixedModels.jl, then there is the Grouping() pseudocontrast which avoids creating an actual matrix

@palday

Hello! It can be more than 10^5. Actually, I am working on Metida.jl, which helps me with some tasks where MixedModels.jl can't be used. I know that this problem is solved in MixedModels, and Metida has a workaround too. I have seen Grouping in MixedModels.jl; maybe the Grouping code should be moved to StatsModels.jl and documented there (perhaps together with some other code from MixedModels, such as the use of "/" in terms).
Also, I don't know why the matrix field of ContrastsMatrix is fixed to Matrix{Float64} and why it can't be more flexible.
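
For a sense of scale, here is a back-of-the-envelope estimate for 10^5 levels. It is only a sketch under stated assumptions: dummy coding gives an n × (n - 1) matrix with one nonzero per column, dense entries are 8-byte Float64, and the sparse variant uses the usual CSC layout with Int64 indices. The variable names are illustrative only.

```julia
# Rough memory estimate for a dummy-coding contrasts matrix with n levels.
# Assumption: the matrix is n × (n - 1) with exactly one nonzero per column.
n = 10^5

# Dense Matrix{Float64}: every entry takes 8 bytes.
dense_bytes = n * (n - 1) * sizeof(Float64)

# SparseMatrixCSC{Bool, Int64}: one Bool value and one Int64 row index per
# stored entry, plus one Int64 column pointer per column and a final sentinel.
nnz_entries = n - 1
sparse_bytes = nnz_entries * (sizeof(Bool) + sizeof(Int64)) + n * sizeof(Int64)

println("dense:  ", round(dense_bytes / 2^30; digits = 1), " GiB")
println("sparse: ", round(sparse_bytes / 2^20; digits = 1), " MiB")
```

Under these assumptions the dense matrix needs roughly 75 GiB, while the sparse one stays under 2 MiB.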

I also can't find any roadmap for StatsModels. I think StatsModels is a core package of the JuliaStats ecosystem, but there is no information about its development plan toward version 1.0.

palday (Member) commented on May 20, 2022

The nesting syntax / is implemented in RegressionFormulae.jl.

palday (Member) commented on May 20, 2022

The implementation of Grouping() is quite simple: https://github.com/JuliaStats/MixedModels.jl/blob/621f88b1f594ea0827d9ac7e8628113dd2121bef/src/grouping.jl#L2-L34

Depending on the exact structure of your model, you might be able to skip using the full formula infrastructure and instead call a custom modelcols method directly -- this is how random effects and associated sparse matrices are constructed in MixedModels.
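
As a rough illustration of building sparse columns directly without a contrasts matrix, the sketch below constructs a sparse indicator matrix from a vector of group labels using only the SparseArrays standard library. It is not the MixedModels or StatsModels implementation; the helper name sparse_indicators is hypothetical, and sorting the levels is an assumption made for the example.

```julia
using SparseArrays

# Hypothetical helper: one row per observation, one column per level,
# with a single `true` marking the level each observation belongs to.
function sparse_indicators(x::AbstractVector)
    levels = sort(unique(x))
    index = Dict(l => i for (i, l) in enumerate(levels))
    rows = collect(1:length(x))
    cols = [index[v] for v in x]
    vals = fill(true, length(x))
    return sparse(rows, cols, vals, length(x), length(levels)), levels
end

Z, lv = sparse_indicators(["a", "b", "a", "c"])
println(size(Z), " with ", nnz(Z), " stored entries")
```

Storage here grows with the number of observations rather than with the square of the number of levels.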

PharmCat (Author) commented on May 20, 2022

> The implementation of Grouping() is quite simple:

Yep, but this means that I would have to copy this code or add MixedModels as a dependency. Could this functionality be placed in StatsModels instead?

palday (Member) commented on Jun 28, 2022

There's nothing wrong with copying this code, but maybe @kleinschmidt has thoughts on whether it makes more general sense to include this in StatsModels?

      Why ContrastsMatrix matrix is Matrix{Float64}? · Issue #251 · JuliaStats/StatsModels.jl