Skip to content

Use of custom transofmration in linear models. #160

Open
@SilasK

Description

I would like to use variations of CLR transformation.

e.g., non-zero or iqlr CLR, both using only a subset of the features to estimate the geometric mean.

I can transform a phyloseq object before the taxatree_models, but then it complains that the input contains non-zero values, which messes up the aggregation. I found no way to turn off the aggregation.

Ideally, one could add a custom function to tax_aggregate or directly implement these transformations in microViz.

Here is my function:

#' @title get otu table as a matrix with taxa are columns format
#' @description A wrapper to get otu table as a matrix with taxa are columns format
#' @param pseq The phyloseq object.
#' @return otu table as a matrix with taxa are columns format
#' @examples
#' get_otu_matrix(pseq)
#' @export
#' @import phyloseq
otu_matrix <- function(pseq) {
  OTU <- as(phyloseq::otu_table(pseq), "matrix")
  
  if (phyloseq::taxa_are_rows(pseq)) {
    OTU <- t(OTU)
  }
  
  OTU
}

#' @title robust (non zero) centered log transformation
#' @description A wrapper to perform Bayesian replacement of zeroes and CLR-transformation. It is rebust in the sense that it takes only values that are not zero in the calculation of the geometric mean.
#' @param pseq The phyloseq object.
#' @param log_fun The log function to use. Default is log2.
#' @return phyloseq object with CLR-transformed abundances.
#' @export
#'
rclr_transformation <- function(pseq, log_fun = log2) {
  if (class(pseq) != "phyloseq") {
    stop("pseq must be a phyloseq object.")
  }
  
  abundances <- otu_matrix(pseq)
  mdat <- microbiome::meta(pseq)
  if (!identical(rownames(abundances), rownames(mdat))) {
    stop("OTU table and metadata are not in order.")
  }
  
  
  # remove all columns with all zeros
  abundances <- abundances[, colSums(abundances) > 0]
  
  abundances.log <- zCompositions::cmultRepl(abundances, label = 0, method = "CZM",z.delete = FALSE) |>
    log_fun() |>
    as.matrix()
  
  abundances.log.nz <- abundances.log
  abundances.log.nz[abundances == 0] <- NaN
  
  
  geo_means <- rowMeans(abundances.log.nz, na.rm = TRUE)
  
  
  transformed <- abundances.log - geo_means
  
  
  
  ps <- phyloseq::phyloseq(
    phyloseq::otu_table(t(transformed), taxa_are_rows = TRUE),
    phyloseq::sample_data(pseq),
    phyloseq::phy_tree(pseq, errorIfNULL = FALSE),
    phyloseq::tax_table(pseq)
  )
  
  return(ps)
}

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions