Enhancements to Subsampling Feature in tidygate #1

Merged 2 commits on Nov 27, 2023
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -20,7 +20,7 @@ Depends:
R (>= 3.6.0)
Encoding: UTF-8
LazyData: true
-RoxygenNote: 7.1.1
+RoxygenNote: 7.2.3
Imports:
utils,
graphics,
22 changes: 20 additions & 2 deletions R/functions_chr_int.R
@@ -45,6 +45,7 @@ pretty_plot_chr_int = function(.data,
.color = NULL,
.shape = NULL,
.size = NULL,
subsample = NULL,
opacity = 1,
is_size_fixed) {
# Comply with CRAN NOTES
@@ -58,8 +59,19 @@ pretty_plot_chr_int = function(.data,
.shape = enquo(.shape)
.size = enquo(.size)
my_size_range = c(1, 3)



if (!is.null(subsample)) {
  if (subsample > 0 && subsample < 1) {
    # Subsample as a fraction
    set.seed(123) # Set seed for reproducibility
    .data <- .data %>% sample_frac(subsample)
  } else if (subsample >= 1) {
    # Subsample as a fixed number
    set.seed(123) # Set seed for reproducibility
    .data <- .data %>% sample_n(subsample)
  }
}


.data_formatted =
.data %>%
@@ -263,6 +275,7 @@ pretty_plot_chr_int = function(.data,
#' @param .color A column symbol. Color of points
#' @param .shape A column symbol. Shape of points
#' @param .size A column symbol. Size of points
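#' @param subsample A number. A value between 0 and 1 (exclusive) is the fraction of rows to sample; a value of 1 or greater is the number of rows to sample. The default NULL disables subsampling.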
#' @param opacity A number between 0 and 1. The opacity level of the data points
#' @param how_many_gates An integer. The number of gates to label
#' @param gate_list A list of gates. It is returned by the gate function as the attribute \"gate\". If you want to create this list yourself, each element of the list is a data frame with x and y columns. Each row is a coordinate. The order matters.
@@ -277,6 +290,7 @@ gate_interactive_chr_int <-
.color = NA,
.shape = NULL,
.size = NULL,
subsample = NULL,
opacity = 1,
how_many_gates = 1,
is_size_fixed,
@@ -336,6 +350,8 @@ gate_interactive_chr_int <-
# size can be number or column
.size = !!.size,

subsample = subsample,

opacity = opacity,
is_size_fixed = is_size_fixed
)
@@ -427,6 +443,7 @@ gate_programmatic_chr_int <-
.color = NULL,
.shape = NULL,
.size = NULL,
subsample = NULL,
opacity = 1,
how_many_gates = 1,
.group_by = NULL,
@@ -484,6 +501,7 @@ gate_programmatic_chr_int <-
.color = .color,
.shape = .shape,

subsample = subsample,
# size can be number or column
.size = .size,

7 changes: 7 additions & 0 deletions R/methods.R
@@ -16,6 +16,7 @@
#' @param .color A column symbol. Colour of points
#' @param .shape A column symbol. Shape of points
#' @param .size A column symbol. Size of points
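#' @param subsample A number. A value between 0 and 1 (exclusive) is the fraction of rows to sample; a value of 1 or greater is the number of rows to sample. The default NULL disables subsampling.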
#' @param opacity A number between 0 and 1. The opacity level of the data points
#' @param how_many_gates An integer. The number of gates to label
#' @param .group_by A column symbol. The column that is used to calculate distance (i.e., normally genes)
@@ -73,6 +74,7 @@ gate_chr <- function(.dim1,
.color = NULL,
.shape = NULL,
.size = NULL,
subsample = NULL,
opacity = 1,
how_many_gates = 1,
.group_by = NULL,
@@ -95,6 +97,7 @@ gate_chr.numeric = function( .dim1,
.color = NULL,
.shape = NULL,
.size = NULL,
subsample = NULL,
opacity = 1,
how_many_gates = 1,
.group_by = NULL,
@@ -109,6 +112,7 @@ gate_chr.numeric = function( .dim1,
.color = .color,
.shape = .shape,
.size = .size,
subsample = subsample,
opacity = opacity,
how_many_gates = how_many_gates,
.group_by = .group_by,
@@ -136,6 +140,7 @@ gate_int <- function(.dim1,
.color = NULL,
.shape = NULL,
.size = NULL,
subsample = NULL,
opacity = 1,
how_many_gates = 1,
.group_by = NULL,
@@ -157,6 +162,7 @@ gate_int.numeric = function( .dim1,
.color = NULL,
.shape = NULL,
.size = NULL,
subsample = NULL,
opacity = 1,
how_many_gates = 1,
.group_by = NULL,
@@ -170,6 +176,7 @@ gate_int.numeric = function( .dim1,
.color = .color,
.shape = .shape,
.size = .size,
subsample = subsample,
opacity = opacity,
how_many_gates = how_many_gates,
.group_by = .group_by,
11 changes: 11 additions & 0 deletions README.Rmd
@@ -118,4 +118,15 @@ tidygate_data %>%

```

## Subsampling

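The `subsample` parameter draws a random subset of rows before plotting and gating, which is useful for large datasets or for a quick exploratory pass. A value between 0 and 1 is read as the fraction of rows to keep; a value of 1 or more is read as the number of rows to keep (so `subsample = 1` keeps a single row, not 100% of the data). The seed is fixed internally with `set.seed(123)`, so the same rows are selected on every run. Here's how you can use it: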

```{r}
# Sample 50% of the rows
tidygate_data %>%
  mutate(gate = gate_chr(Dim1, Dim2, subsample = 0.5))

# Sample 100 randomly chosen rows
tidygate_data %>%
  mutate(gate = gate_chr(Dim1, Dim2, subsample = 100))

```
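
The rule applied in `R/functions_chr_int.R` (a value between 0 and 1 is a fraction, a value of 1 or more is a row count) can be sketched outside the package as follows. This is only an illustration: `subsample_rows()` and the `toy` table are hypothetical names, not part of tidygate, and capping the count at the number of rows is an added assumption, since `sample_n()` raises an error when asked for more rows than exist without replacement.

```r
library(dplyr)

# Hypothetical helper (not part of tidygate) that mirrors the
# fraction-versus-count rule used by the `subsample` argument.
subsample_rows <- function(.data, subsample = NULL, seed = 123) {
  # NULL means "no subsampling", as in the package default
  if (is.null(subsample)) return(.data)
  stopifnot(is.numeric(subsample), length(subsample) == 1, subsample > 0)

  set.seed(seed) # fixed seed, so repeated calls pick the same rows
  if (subsample < 1) {
    # Value in (0, 1): keep that fraction of the rows
    sample_frac(.data, subsample)
  } else {
    # Value >= 1: keep that many rows. The cap at nrow() is an
    # assumption added here to avoid the sample_n() error.
    sample_n(.data, min(subsample, nrow(.data)))
  }
}

# Toy data standing in for tidygate_data
toy <- tibble(id = 1:200, Dim1 = rnorm(200), Dim2 = rnorm(200))

half    <- subsample_rows(toy, 0.5)  # fraction of rows
hundred <- subsample_rows(toy, 100)  # fixed number of rows
one     <- subsample_rows(toy, 1)    # a count of one row, not 100%
```

Because the seed is set on every call, `subsample_rows(toy, 0.5)` returns the same rows each time it is run on the same input.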
21 changes: 21 additions & 0 deletions README.md
@@ -131,3 +131,24 @@ tidygate_data %>%
## 9 adrenal 1 fibrobla… immune_c… 0.52 ACC -0.776 -0.383 0
## 10 adrenal 1 immune_c… endothel… 1 ACC 0.980 -0.116 0
## # … with 2,230 more rows

## Subsampling

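The `subsample` parameter draws a random subset of rows before plotting and gating, which is useful for large datasets or for a quick exploratory pass. A value between 0 and 1 is read as the fraction of rows to keep; a value of 1 or more is read as the number of rows to keep (so `subsample = 1` keeps a single row, not 100% of the data). The seed is fixed internally with `set.seed(123)`, so the same rows are selected on every run. Here's how you can use it: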

```r
# Sample 50% of the rows
tidygate_data %>%
  mutate(gate = gate_chr(Dim1, Dim2, subsample = 0.5))

# Sample 100 randomly chosen rows
tidygate_data %>%
  mutate(gate = gate_chr(Dim1, Dim2, subsample = 100))
```
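
A fixed seed makes the selection repeatable, but calling `set.seed()` inside a function also resets the caller's random number stream as a side effect. The sketch below, in base R only, shows one way a fixed seed could be applied without disturbing the caller. `with_fixed_seed()` is a hypothetical helper, not something tidygate provides, and it is not how the merged code behaves.

```r
# Hypothetical helper: evaluate `expr` under a fixed seed, then put the
# caller's RNG state back exactly as it was found.
with_fixed_seed <- function(seed, expr) {
  had_seed <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  old_seed <- if (had_seed) get(".Random.seed", envir = globalenv()) else NULL

  on.exit({
    if (had_seed) {
      # Restore the state the caller had before this call
      assign(".Random.seed", old_seed, envir = globalenv())
    } else if (exists(".Random.seed", envir = globalenv(), inherits = FALSE)) {
      # The caller had no RNG state yet, so remove the one created here
      rm(".Random.seed", envir = globalenv())
    }
  })

  set.seed(seed)
  force(expr) # `expr` is lazy, so it is evaluated only after set.seed()
}

# Same seed, same draw
a <- with_fixed_seed(123, sample(10, 3))
b <- with_fixed_seed(123, sample(10, 3))

# The caller's stream is unaffected by the call in between
set.seed(42)
x1 <- runif(1)
set.seed(42)
invisible(with_fixed_seed(123, sample(10, 3)))
x2 <- runif(1)
```

Here `a` and `b` are identical, and `x1` equals `x2` because the seeded call leaves the surrounding random number stream untouched.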
4 changes: 4 additions & 0 deletions man/gate_chr-methods.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions man/gate_chr.numeric.Rd

3 changes: 3 additions & 0 deletions man/gate_int.numeric.Rd

3 changes: 3 additions & 0 deletions man/gate_interactive_chr_int.Rd