[python-package] [c++] support scipy sparse arrays #6352

Closed
@jameslamb

Summary

LightGBM currently has APIs to support two formats of sparse matrices:

  • CSC = "Compressed Sparse Column"
  • CSR = "Compressed Sparse Row"

In the Python library scipy, these are represented by classes scipy.sparse.csc_matrix and scipy.sparse.csr_matrix, respectively.
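For readers unfamiliar with the two layouts, here is a minimal sketch of how the same small matrix is stored in each format. It uses only numpy and scipy; the example matrix is made up for illustration and has nothing to do with lightgbm's internals.

```python
import numpy as np
from scipy import sparse

# Illustrative 2x3 matrix with three non-zero entries.
dense = np.array([
    [1.0, 0.0, 2.0],
    [0.0, 3.0, 0.0],
])

csr = sparse.csr_matrix(dense)
csc = sparse.csc_matrix(dense)

# CSR stores non-zero values row by row; `indices` holds column indices
# and `indptr` marks where each row starts and ends in `data`.
csr_parts = (csr.data.tolist(), csr.indices.tolist(), csr.indptr.tolist())

# CSC stores non-zero values column by column; `indices` holds row indices
# and `indptr` marks where each column starts and ends in `data`.
csc_parts = (csc.data.tolist(), csc.indices.tolist(), csc.indptr.tolist())

print("CSR:", csr_parts)
print("CSC:", csc_parts)
```

Both objects describe the same matrix; only the order in which the non-zero values are laid out differs.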

Per scipy's docs (link):

"This package is switching to an array interface, compatible with NumPy arrays, from the older matrix interface. We recommend that you use the array objects (bsr_array, coo_array, etc.) for all new work."

lightgbm should add support for scipy.sparse.csc_array and scipy.sparse.csr_array in all places that currently support the corresponding *_matrix classes.
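As a rough illustration of what accepting both families could look like on the Python side, the sketch below checks the storage format rather than the concrete class. `sparse_format` is a hypothetical helper written for this example, not an existing lightgbm function, and the sketch assumes a scipy version that ships the `*_array` classes (1.8 or later).

```python
import numpy as np
from scipy import sparse


def sparse_format(x):
    """Return "csr" or "csc" for supported scipy sparse inputs, else None.

    Hypothetical helper for illustration only; not part of lightgbm.
    """
    # issparse() is True for both *_matrix and *_array objects, and both
    # expose a `format` attribute naming their storage layout.
    if sparse.issparse(x) and x.format in ("csr", "csc"):
        return x.format
    return None


dense = np.eye(3)
inputs = [
    sparse.csr_matrix(dense),
    sparse.csc_matrix(dense),
    sparse.csr_array(dense),
    sparse.csc_array(dense),
    sparse.coo_array(dense),  # sparse, but neither CSR nor CSC
    dense,                    # not sparse at all
]
formats = [sparse_format(x) for x in inputs]
print(formats)
```

Checking `format` instead of using `isinstance` against `csr_matrix` / `csc_matrix` is one possible design choice: it treats the matrix and array classes uniformly without listing each class by name.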

Motivation

This would allow continued use of scipy sparse types with lightgbm, even if future scipy releases remove the *_matrix classes.

Description

See the discussion in #6348 for more information.

References

Created based on this comment: #6348 (comment)

Good visual summary of these types: https://matteding.github.io/2019/04/25/sparse-matrices/#compressed-sparse-rowcolumn
