Compute Convex (Bi)Clustering Solutions via Algorithmic Regularization
DataSlingers/clustRviz
---
output: github_document
always_allow_html: true
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-"
)
```

[![GitHub Actions Build Status](https://github.com/DataSlingers/clustRviz/workflows/R-CMD-check%20and%20Deploy/badge.svg)](https://github.com/DataSlingers/clustRviz/actions?query=workflow%3A%22R-CMD-check+and+Deploy%22)
[![codecov Coverage Status](https://codecov.io/gh/DataSlingers/clustRviz/branch/develop/graph/badge.svg)](https://codecov.io/gh/DataSlingers/clustRviz/branch/develop)
[![License: GPL v3](https://img.shields.io/badge/License-GPL%20v3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)
[![CRAN\_Status\_Badge](http://www.r-pkg.org/badges/version/clustRviz)](https://cran.r-project.org/package=clustRviz)
[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active)

# clustRviz

`clustRviz` aims to enable fast computation and easy visualization of Convex Clustering solution paths.

## Installation

You can install `clustRviz` from GitHub with:

```{r gh-installation, eval = FALSE}
# install.packages("devtools")
devtools::install_github("DataSlingers/clustRviz")
```

Note that `RcppEigen` (which `clustRviz` uses internally) triggers many compiler warnings, which cannot be suppressed in the package itself per [CRAN policies](http://cran.r-project.org/web/packages/policies.html#Source-packages). Many of these warnings can be suppressed locally by adding the line `CXX11FLAGS+=-Wno-ignored-attributes` to your `~/.R/Makevars` file.

To install an `R` package from source, you will need suitable development tools installed, including a `C++` compiler and potentially a Fortran runtime.
Details about these toolchains are available on CRAN for [Windows](https://cran.r-project.org/bin/windows/Rtools/) and [macOS](https://mac.r-project.org/tools/).

## Quick-Start

There are two main entry points to the `clustRviz` package, the `CARP` and `CBASS` functions, which perform convex clustering and convex biclustering respectively. We demonstrate the use of these two functions on a text mining data set, `presidential_speech`, which measures how often the 44 U.S. presidents used certain words in their public addresses.

```{r load_data}
library(clustRviz)
data(presidential_speech)
presidential_speech[1:6, 1:6]
```

### Clustering

We begin by clustering this data set, grouping the rows (presidents) into clusters:

```{r carp_example}
carp_fit <- CARP(presidential_speech)
print(carp_fit)
```

The algorithmic regularization technique employed by `CARP` makes computation of the whole solution path almost immediate.

We can examine the result of `CARP` graphically. We begin with a standard dendrogram, with three clusters highlighted:

```{r carp_dendro}
plot(carp_fit, type = "dendrogram", k = 3)
```

Examining the dendrogram, we see two clear clusters, consisting of pre-WWII and post-WWII presidents, and Warren G. Harding as a possible outlier. Harding is generally considered one of the worst U.S. presidents of all time, so this is perhaps not too surprising.

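For readers new to the method, it may help to see the optimization problem behind the solution path. The following is the standard convex clustering formulation from the literature, written in our own notation; the particular weights $w_{ij}$ and norm $q$ used by `CARP` are not stated in this README, so treat them as placeholders and consult the package documentation for the exact choices:

$$
\hat{U}_{\lambda} \;=\; \underset{U \in \mathbb{R}^{n \times p}}{\arg\min} \;\; \frac{1}{2} \lVert X - U \rVert_F^2 \;+\; \lambda \sum_{i < j} w_{ij} \, \lVert U_{i\cdot} - U_{j\cdot} \rVert_q
$$

Here $X$ is the $n \times p$ data matrix, row $U_{i\cdot}$ is the centroid assigned to observation $i$, and $\lambda \ge 0$ is the regularization level. At $\lambda = 0$ the solution is $U = X$, so every observation is its own cluster. As $\lambda$ grows, the penalty pulls rows of $U$ together, and observations whose centroids coincide are assigned to the same cluster. Tracing these fusions over a range of $\lambda$ gives the path that the dendrogram summarizes.
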
A more interesting visualization is the dynamic path visualization, whereby we can watch the clusters fuse as the regularization level is increased:

```{r carp_dynamic, eval = FALSE}
plot(carp_fit, type = "path", dynamic = TRUE)
```

### BiClustering

The use of `CBASS` for convex biclustering is similar. We demonstrate it here with a cluster heatmap, with the regularization set to give three observation (row) clusters:

```{r cbass}
cbass_fit <- CBASS(presidential_speech)
plot(cbass_fit, k.row = 3)
```

By default, plotting the result of `CBASS` gives the traditional cluster heatmap, but we can also get the row or column dendrograms:

```{r cbass_rowdendro}
plot(cbass_fit, type = "row.dendrogram", k.row = 3)
```

If a regularization level is specified, all plotting functions in `clustRviz` will plot the clustered data. If the regularization level is not specified, the raw data will be plotted instead:

```{r cbass_heatmap2}
plot(cbass_fit, type = "heatmap")
```

More details about the use and mathematical formulation of `CARP` and `CBASS` may be found in the package documentation.
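
To make the distinction between "raw" and "clustered" data concrete, here is a minimal base-R sketch that does not use `clustRviz` at all. It assumes that "clustered data" means each observation is replaced by the centroid of its cluster, and it uses ordinary hierarchical clustering (`hclust`) as a stand-in for the convex clustering path. The toy matrix `X` and all variable names are hypothetical.

```r
# Toy data: 6 observations in 2 dimensions forming three well-separated pairs.
X <- rbind(
  a1 = c(0.0, 0.1), a2 = c(0.1, 0.0),
  b1 = c(5.0, 5.1), b2 = c(5.1, 5.0),
  c1 = c(9.0, 0.1), c2 = c(9.1, 0.0)
)

# Stand-in clustering: base R hierarchical clustering,
# NOT the convex clustering path computed by CARP or CBASS.
tree   <- hclust(dist(X), method = "average")
labels <- cutree(tree, k = 3)  # analogous in spirit to requesting k = 3

# "Clustered data" (our assumption): replace each row by its cluster centroid.
centroids   <- apply(X, 2, function(col) tapply(col, labels, mean))
X_clustered <- centroids[as.character(labels), , drop = FALSE]
rownames(X_clustered) <- rownames(X)

print(X)            # raw data
print(X_clustered)  # rows within a cluster are now identical
```

In this sketch, plotting `X` corresponds to the case where no regularization level is given, while plotting `X_clustered` corresponds to the case where a cluster count such as `k = 3` is requested.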