rebase dev #96

Merged
merged 58 commits into from
Jan 4, 2025
Changes from 1 commit
58 commits
56813a7
Merge pull request #79 from roaldarbol/dev
roaldarbol Dec 7, 2024
57d6b36
Lots of new functions! Still rough
Dec 12, 2024
72cd715
Mostly docstrings and a few modifications.
Dec 12, 2024
db721a8
A few name changes
Dec 12, 2024
33f4790
Log y-axis on check_poses
Dec 12, 2024
b95b92a
Fix. Also silence ggplot
Dec 12, 2024
e39fc92
Add reference_keypoint to check_pose
Dec 12, 2024
622b533
Add doc and export to replace_na
Dec 12, 2024
143d137
Allow na_interpolation to return an unfiltered data frame with a warning
Dec 12, 2024
d817a39
Patch to last commit
Dec 12, 2024
d692715
Another small patch
Dec 12, 2024
efcbe5b
Allow plotting of all NAs in check_na_timing
Dec 12, 2024
0118c58
Add min_obs parameter to smooth_movement
Dec 12, 2024
4d752ea
Hopefully improve speed of translate_coords_keypoint
Dec 13, 2024
9e47f16
lots of new functions and test data moved
Dec 14, 2024
d40e5e0
Just filename changes
Dec 14, 2024
906c0f1
Just docs and patches to ensure successful building
Dec 14, 2024
530e868
Add better read_trex docstring
Dec 14, 2024
eb9cb2b
Merge pull request #87 from roaldarbol/main
roaldarbol Dec 14, 2024
4b835ab
Expose and add documentation to set_individual and set_framerate
Dec 15, 2024
6a304fd
Add imports
Dec 15, 2024
91afc51
Export set_ functions and update get_example_data
Dec 15, 2024
0177652
Add to NAMESPACE
Dec 15, 2024
c58f8ca
Fix time series plots when all values are NA
Dec 15, 2024
4904fdd
Tiny patch
Dec 15, 2024
be0c861
Add peak/trough detection
Dec 16, 2024
7111681
Great improvements to the extrema detection functions. Also lots of test…
Dec 16, 2024
004aa31
Add movement classification
Dec 16, 2024
5a12cce
Export classification
Dec 16, 2024
2d5e44d
And add it to NAMESPACE
Dec 16, 2024
add149a
Merge branch 'everything_everywhere_all_at_once' of https://github.co…
Dec 16, 2024
6b71fbc
Fix bug in filter_by_speed
Dec 17, 2024
4f1e22e
Add NA tests
Dec 17, 2024
b547bfd
Bug fix for calculate_kinematics - added group_by keypoint and indivi…
Dec 17, 2024
679d30c
Fix set_framerate so it detects whether a frame rate has previously b…
Dec 17, 2024
a497269
Add bandwidth filters
Dec 19, 2024
d5b0ec6
Updates to the classification functions
Dec 19, 2024
414e90b
Changed method names in smooth_movement function
Dec 19, 2024
eb39902
Added NA testing for filter_by_speed
Dec 19, 2024
fe972ff
Add return_type parameter
Dec 19, 2024
1d35cfc
Add return_type parameter
Dec 19, 2024
4685c92
Merge branch 'everything_everywhere_all_at_once' of https://github.co…
Dec 19, 2024
d746502
Clean-up
Dec 19, 2024
ca26260
Patch
Dec 19, 2024
b05dca9
Another patch
Dec 19, 2024
6a1a08c
Patch again
Dec 19, 2024
65c696d
Improved bandwidth filters
Dec 20, 2024
18e7661
Adds Kalman filters
Dec 20, 2024
d9d1ca5
Add rotation of coordinates and egocentric transformation
Dec 21, 2024
ff8ebdc
Alignment of timeseries and classification w peak+trough
Dec 25, 2024
383aa8d
Improve detection of active periods
Dec 25, 2024
e0589f2
Adds replace_na functions and classify_low_periods
Dec 26, 2024
f8e440e
Filtering functions
Jan 3, 2025
d4b790d
Calculations
Jan 3, 2025
0492954
Filter NA functions
Jan 3, 2025
e64e9fa
The rest
Jan 3, 2025
0c2bba8
Update version
Jan 3, 2025
635e699
Merge pull request #94 from roaldarbol/everything_everywhere_all_at_once
roaldarbol Jan 3, 2025
The rest
Mikkel Roald-Arbøl committed Jan 3, 2025
commit e64e9fa368da32339dcdc3c231a2064fa02d8984
25 changes: 17 additions & 8 deletions NAMESPACE
@@ -14,28 +14,32 @@ export(classify_by_stability)
export(classify_by_threshold)
export(classify_high_periods)
export(classify_low_periods)
export(clean_kinematics)
export(does_file_have_expected_headers)
export(ensure_file_has_expected_headers)
export(ensure_file_has_headers)
export(filter_by_confidence)
export(filter_by_speed)
export(filter_highpass)
export(filter_highpass_fft)
export(filter_kalman)
export(filter_kalman_irregular)
export(filter_lowpass)
export(filter_lowpass_fft)
export(filter_movement)
export(filter_na_confidence)
export(filter_na_roi)
export(filter_na_speed)
export(filter_rollmean)
export(filter_rollmedian)
export(filter_sgolay)
export(find_lag)
export(find_peaks)
export(find_time_lag)
export(find_troughs)
export(get_example_data)
export(get_metadata)
export(ggplot_na_gapsize)
export(group_every)
export(init_metadata)
export(map_to_cartesian)
export(map_to_polar)
export(na_interpolation)
export(plot_position_timeseries)
export(plot_speed_timeseries)
export(read_animalta)
@@ -48,7 +52,6 @@ export(read_sleap)
export(read_trackball)
export(read_treadmill)
export(read_trex)
export(replace_missing)
export(replace_na)
export(replace_na_linear)
export(replace_na_locf)
@@ -60,8 +63,6 @@ export(set_framerate)
export(set_individual)
export(set_start_datetime)
export(set_uuid)
export(smooth_by_savgol)
export(smooth_movement)
export(transform_to_egocentric)
export(translate_coords)
export(validate_animalta)
@@ -132,7 +133,15 @@ importFrom(signal,butter)
importFrom(signal,filtfilt)
importFrom(signal,sgolayfilt)
importFrom(stats,approx)
importFrom(stats,ccf)
importFrom(stats,complete.cases)
importFrom(stats,fft)
importFrom(stats,mad)
importFrom(stats,median)
importFrom(stats,qnorm)
importFrom(stats,quantile)
importFrom(stats,spline)
importFrom(stinepack,stinterp)
importFrom(stringi,stri_rand_strings)
importFrom(utils,download.file)
importFrom(vroom,vroom)
86 changes: 86 additions & 0 deletions R/add_centroid.R
@@ -0,0 +1,86 @@
#' Add Centroid to Movement Data
#'
#' @description
#' Calculates and adds a centroid point to movement tracking data. The centroid
#' represents the mean position of selected keypoints at each time point.
#'
#' @param data A data frame containing movement tracking data with the following
#' required columns:
#' - `individual`: Identifier for each tracked subject
#' - `keypoint`: Factor specifying tracked points
#' - `time`: Time values
#' - `x`: x-coordinates
#' - `y`: y-coordinates
#' - `confidence`: Confidence values for tracked points
#' @param include_keypoints Optional character vector specifying which keypoints
#' to use for centroid calculation. If NULL (default), all keypoints are used
#' unless `exclude_keypoints` is specified.
#' @param exclude_keypoints Optional character vector specifying which keypoints
#' to exclude from centroid calculation. If NULL (default), no keypoints are
#' excluded unless `include_keypoints` is specified.
#' @param centroid_name Character string specifying the name for the centroid
#' keypoint (default: "centroid")
#'
#' @return A data frame with the same structure as the input, but with an
#' additional keypoint representing the centroid. The centroid's confidence
#' values are set to NA.
#'
#' @details
#' The function calculates the centroid as the mean x and y position of the
#' selected keypoints at each time point for each individual. Keypoints can be
#' selected either by specifying which ones to include (`include_keypoints`) or
#' which ones to exclude (`exclude_keypoints`). The resulting centroid is added
#' as a new keypoint to the data frame.
#'
#' @examples
#' \dontrun{
#' # Add centroid using all keypoints
#' add_centroid(movement_data)
#'
#' # Calculate centroid using only specific keypoints
#' add_centroid(movement_data,
#' include_keypoints = c("head", "thorax", "abdomen"))
#'
#' # Calculate centroid excluding certain keypoints
#' add_centroid(movement_data,
#' exclude_keypoints = c("antenna_left", "antenna_right"),
#' centroid_name = "body_centroid")
#' }
#'
#' @seealso
#' `convert_nan_to_na()` for NaN handling in the centroid calculation
#'
#' @importFrom dplyr filter group_by summarise mutate arrange bind_rows
#'
#' @export
add_centroid <- function(data,
include_keypoints=NULL,
exclude_keypoints=NULL,
centroid_name="centroid"){
# Check that centroid isn't there
# Check that it's a movement data frame
# To be optimised with collapse later on
if (!is.null(include_keypoints)){
df_centroid <- data |>
dplyr::filter(.data$keypoint %in% include_keypoints)
} else if (!is.null(exclude_keypoints)){
df_centroid <- data |>
dplyr::filter(!.data$keypoint %in% exclude_keypoints)
} else {
df_centroid <- data
}

df_centroid <- df_centroid |>
dplyr::group_by(.data$individual, .data$time) |>
dplyr::summarise(x = mean(.data$x, na.rm=TRUE),
y = mean(.data$y, na.rm=TRUE),
confidence = NA,
.groups = "keep") |>
dplyr::mutate(keypoint = factor(as.character(centroid_name))) |>
convert_nan_to_na()

data <- bind_rows(data, df_centroid) |>
dplyr::arrange(.data$time, .data$individual, .data$keypoint)

return(data)
}
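
A minimal usage sketch, not part of the diff, based on the signature and required columns documented above; the toy tibble and keypoint names are made up for illustration:

library(tibble)

# Two keypoints ("head", "abdomen") tracked for one individual over three time points
movement_data <- tibble(
  individual = factor("ind1"),
  keypoint   = factor(rep(c("head", "abdomen"), each = 3)),
  time       = rep(1:3, times = 2),
  x          = c(1, 2, 3, 3, 4, 5),
  y          = c(1, 1, 1, 3, 3, 3),
  confidence = 1
)

# Adds a "centroid" keypoint whose x/y are the per-time means of the selected keypoints
with_centroid <- add_centroid(movement_data)

# The same, restricted to an explicit subset of keypoints and a custom name
add_centroid(movement_data, include_keypoints = c("head", "abdomen"), centroid_name = "body_centroid")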
18 changes: 10 additions & 8 deletions R/align_timeseries.R
@@ -21,14 +21,16 @@
#' t <- seq(0, 10, 0.1)
#' reference <- sin(t)
#' signal <- sin(t - 0.5) # Signal delayed by 0.5 units
#' lag <- find_time_lag(signal, reference)
#' lag <- find_lag(signal, reference)
#' print(lag) # Should be approximately 5 samples (0.5 units)
#'
#' @seealso \code{\link{align_time_series}} for applying the computed lag
#' @seealso \code{\link{align_timeseries}} for applying the computed lag
#'
#' @importFrom stats complete.cases ccf
#'
#' @export
find_time_lag <- function(signal, reference, max_lag = 5000, normalize = TRUE) {
complete_cases <- complete.cases(signal, reference)
find_lag <- function(signal, reference, max_lag = 5000, normalize = TRUE) {
complete_cases <- stats::complete.cases(signal, reference)
signal <- signal[complete_cases]
reference <- reference[complete_cases]

@@ -42,7 +44,7 @@ find_time_lag <- function(signal, reference, max_lag = 5000, normalize = TRUE) {
max_lag = length(signal) - 1
}

ccf_result <- ccf(signal, reference, plot = FALSE, lag.max = max_lag)
ccf_result <- stats::ccf(signal, reference, plot = FALSE, lag.max = max_lag)
best_lag <- ccf_result$lag[which.max(abs(ccf_result$acf))]

# Subtract one observation, which seems to be needed in tests
@@ -55,10 +57,10 @@ find_time_lag <- function(signal, reference, max_lag = 5000, normalize = TRUE) {
#'
#' This function aligns two time series by shifting one series relative to the
#' reference based on their cross-correlation. It first finds the optimal lag
#' using \code{find_time_lag}, then applies the shift by padding with NA values
#' using \code{find_lag}, then applies the shift by padding with NA values
#' as needed.
#'
#' @inheritParams find_time_lag
#' @inheritParams find_lag
#' @param signal Time series to align (numeric vector)
#' @param reference Reference time series to align against (numeric vector)
#'
@@ -81,7 +83,7 @@ find_time_lag <- function(signal, reference, max_lag = 5000, normalize = TRUE) {
#'
#' @export
align_timeseries <- function(signal, reference, max_lag = 5000, normalize = TRUE) {
lag <- find_time_lag(signal, reference, max_lag, normalize)
lag <- find_lag(signal, reference, max_lag, normalize)

if (lag > 0) {
aligned <- c(rep(NA, lag), signal[1:(length(signal)-lag)])
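
As a quick, hedged end-to-end sketch of the renamed pair, reusing the synthetic signals from the roxygen example above:

t <- seq(0, 10, 0.1)
reference <- sin(t)
signal <- sin(t - 0.5)                    # reference delayed by 0.5 time units

lag <- find_lag(signal, reference)        # roughly 5 samples at this sampling step
aligned <- align_timeseries(signal, reference)

# align_timeseries() pads the shifted series with NAs, so its length is unchanged
length(aligned) == length(signal)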
78 changes: 78 additions & 0 deletions R/classify_by_high_periods.R
@@ -0,0 +1,78 @@
#' Classifies Periods of High Activity in Time Series Using Peaks and Troughs
#'
#' @description
#' Identifies periods of high activity in a time series by analyzing peaks and troughs,
#' returning a logical vector marking these periods. The function handles special cases
#' like adjacent peaks and the initial/final sequences.
#'
#' @param x numeric vector; the time series values
#' @param peaks logical vector; same length as x, TRUE indicates peak positions
#' @param troughs logical vector; same length as x, TRUE indicates trough positions
#'
#' @return logical vector; TRUE indicates periods of high activity
#'
#' @details
#' The function performs the following steps:
#' 1. Resolves adjacent peaks by keeping only the highest
#' 2. Handles the initial sequence before the first trough
#' 3. Handles the final sequence after the last event
#' 4. Identifies regions between troughs containing exactly one peak
#'
#' @examples
#' \dontrun{
#' x <- c(1, 3, 2, 1, 4, 2, 1)
#' peaks <- c(FALSE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE)
#' troughs <- c(FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE)
#' classify_high_periods(x, peaks, troughs)
#' }
#'
#' @export
classify_high_periods <- function(x, peaks, troughs) {
cli::cli_abort("Doesn't currently work")
# # Input validation
# if (length(peaks) != length(troughs) || length(x) != length(peaks)) {
# cli::cli_abort("Lengths of x, peaks, and troughs must match")
# }
#
# n <- length(x)
# result <- logical(n)
#
# # First handle adjacent peaks - keep only highest
# peak_indices <- which(peaks)
# for(i in 1:(length(peak_indices)-1)) {
# # Look at all peaks until we find a trough
# for(j in (i+1):length(peak_indices)) {
# if(any(troughs[peak_indices[i]:peak_indices[j]])) break
# # Keep highest peak, remove others
# if(x[peak_indices[i]] <= x[peak_indices[j]]) {
# peaks[peak_indices[i]] <- FALSE
# break
# } else {
# peaks[peak_indices[j]] <- FALSE
# }
# }
# }
#
# # Handle start sequence
# first_event <- min(c(peak_indices[1], trough_indices[1]))
# result[1:first_event] <- ifelse(first_event == peak_indices[1], TRUE, FALSE)
#
# # End sequence
# last_event <- max(c(peak_indices[length(peak_indices)],
# trough_indices[length(trough_indices)]))
# result[last_event:n] <- ifelse(last_event == peak_indices[length(peak_indices)],
# TRUE, FALSE)
#
# # Find regions between troughs that have exactly one peak
# for(i in 1:(length(trough_indices)-1)) {
# current_trough <- trough_indices[i]
# next_trough <- trough_indices[i+1]
# peaks_between <- which(peaks[current_trough:next_trough])
#
# if(length(peaks_between) == 1) {
# result[(current_trough+1):(next_trough-1)] <- TRUE
# }
# }
#
# return(result)
}
54 changes: 54 additions & 0 deletions R/classify_by_low_periods.R
@@ -0,0 +1,54 @@
#' Classifies Periods of Low Activity in Time Series Using Peaks and Troughs
#'
#' @description
#' Identifies periods of low activity in a time series by analyzing peaks and troughs,
#' returning a logical vector marking these periods. Low activity periods are defined
#' as regions between consecutive troughs that contain no peaks.
#'
#' @param peaks logical vector; TRUE indicates peak positions
#' @param troughs logical vector; same length as peaks, TRUE indicates trough positions
#'
#' @return logical vector; TRUE indicates periods of low activity
#'
#' @details
#' The function performs the following steps:
#' 1. Validates input lengths
#' 2. Initializes all periods as potentially low activity (TRUE)
#' 3. For each pair of consecutive troughs:
#' - If no peaks exist between them, maintains TRUE for that period
#' - If any peaks exist, marks that period as FALSE (not low activity)
#'
#' @examples
#' peaks <- c(FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE)
#' troughs <- c(FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE)
#' classify_low_periods(peaks, troughs)
#'
#' @export
classify_low_periods <- function(peaks, troughs) {
# Input validation
if (length(peaks) != length(troughs)) {
cli::cli_abort("Lengths of peaks and troughs must match")
}

# Initialize output vector
result <- rep(TRUE, length(peaks))

# Find indices of troughs
trough_indices <- which(troughs)

# For each consecutive pair of troughs
for (i in seq_len(length(trough_indices) - 1)) {
start_idx <- trough_indices[i]
end_idx <- trough_indices[i + 1]

# Check if there are any peaks between these troughs
between_slice <- peaks[(start_idx + 1):(end_idx - 1)]

if (length(between_slice) > 0 && !any(between_slice)) {
# If no peaks between troughs, set those positions to FALSE
result[(start_idx + 1):(end_idx - 1)] <- FALSE
}
}

return(result)
}
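
A hedged sketch of how this could pair with the peak/trough detection added earlier in this PR; it assumes find_peaks() and find_troughs() (exported in the NAMESPACE above) return logical vectors of the same length as their input, which this diff does not show:

speed <- c(1, 3, 2, 1, 4, 2, 1)

peaks   <- find_peaks(speed)     # assumed: logical vector marking local maxima
troughs <- find_troughs(speed)   # assumed: logical vector marking local minima

low <- classify_low_periods(peaks, troughs)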
11 changes: 7 additions & 4 deletions R/classify_by_stability.R
@@ -42,6 +42,9 @@
#' - 1: High activity state
#' - 0: Low activity state
#' - NA: Unable to classify (usually due to missing data)
#'
#' @importFrom stats quantile qnorm median mad
#'
#' @export
classify_by_stability <- function(speed,
window_size = 30,
@@ -78,7 +81,7 @@ classify_by_stability <- function(speed,
))

# Find baseline statistics using stable periods
var_threshold <- quantile(roll_var, 0.75, na.rm = TRUE)
var_threshold <- stats::quantile(roll_var, 0.75, na.rm = TRUE)
stable_periods <- !is.na(roll_var) & roll_var < var_threshold

rle_obj <- rle(stable_periods)
@@ -109,7 +112,7 @@ classify_by_stability <- function(speed,
baseline_sd <- sd(speed[baseline_start:baseline_end], na.rm = TRUE)

# Convert tolerance to threshold using inverse normal CDF
threshold_multiplier <- qnorm(1 - tolerance)
threshold_multiplier <- stats::qnorm(1 - tolerance)
threshold <- baseline_mean + threshold_multiplier * baseline_sd

# Initial classification
@@ -154,8 +157,8 @@

stable_means <- roll_mean[stable_mask]
list(
level = median(stable_means, na.rm = TRUE),
spread = mad(stable_means, na.rm = TRUE)
level = stats::median(stable_means, na.rm = TRUE),
spread = stats::mad(stable_means, na.rm = TRUE)
)
}
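
For illustration only (the baseline numbers below are hypothetical), the tolerance-to-threshold conversion shown earlier in this file works out to:

tolerance     <- 0.05
baseline_mean <- 0.20   # hypothetical mean speed over the stable baseline window
baseline_sd   <- 0.05   # hypothetical SD over the same window

threshold_multiplier <- stats::qnorm(1 - tolerance)   # about 1.64 for tolerance = 0.05
threshold <- baseline_mean + threshold_multiplier * baseline_sd
# speeds above `threshold` are assigned the high-activity state (1)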
