Run the tSNE algorithm (using Rtsne::Rtsne())

Runs the tSNE (t-distributed stochastic neighbor embedding) dimensionality reduction algorithm, a useful way to visualise high-dimensional data in two or three dimensions. Because dimensionality reduction discards some information, it is good practice to validate any populations identified on the plot (for example, by manual gating). Output data will be "tsne.res". Uses the R package "Rtsne" to calculate the embedding.

Usage

run.tsne(dat, use.cols, tsne.x.name, tsne.y.name, tsne.seed, 
dims, initial_dims, perplexity, theta, check_duplicates, pca, 
max_iter, verbose, is_distance, Y_init, stop_lying_iter, 
mom_switch_iter, momentum, final_momentum, eta, exaggeration_factor)

Arguments

dat: NO DEFAULT. data.frame.
use.cols: NO DEFAULT. Vector of column numbers or column names (as in the example below), specifying the columns to use for dimensionality reduction.
tsne.x.name: DEFAULT = "tSNE_X". Character. Name of tSNE x-axis.
tsne.y.name: DEFAULT = "tSNE_Y". Character. Name of tSNE y-axis.
tsne.seed: DEFAULT = 42. Numeric. Seed value for reproducibility.
dims: DEFAULT = 2. Number of dimensions for output results, either 2 or 3.
initial_dims: DEFAULT = 50. Number of dimensions retained in initial PCA step.
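perplexity: DEFAULT = 30. Numeric. Perplexity parameter, roughly the effective number of nearest neighbours considered for each point.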
theta: DEFAULT = 0.5. Use 0.5 for Barnes-Hut tSNE, 0.0 for exact tSNE (takes longer).
check_duplicates: DEFAULT = FALSE. Logical. Checks whether duplicate rows are present before running tSNE.
pca: DEFAULT = TRUE. Runs PCA prior to tSNE run.
max_iter: DEFAULT = 1000. Maximum number of iterations.
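verbose: DEFAULT = TRUE. Logical. Whether progress updates are printed.
is_distance: DEFAULT = FALSE. Experimental. If TRUE, the input is treated as a distance matrix.
Y_init: DEFAULT = NULL. Initial locations of the points. NULL (recommended) gives random initialisation.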
stop_lying_iter: DEFAULT = 250. Number of iterations of early exaggeration.
mom_switch_iter: DEFAULT = 250. Iteration after which 'final_momentum' is used instead of 'momentum'.
momentum: DEFAULT = 0.5. Momentum used in the first part of the optimisation (up to 'mom_switch_iter').
final_momentum: DEFAULT = 0.8. Momentum used in the final part of the optimisation (after 'mom_switch_iter').
eta: DEFAULT = 200. Learning rate.
exaggeration_factor: DEFAULT = 12.0. Factor used during early exaggeration.
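
The arguments above suggest a simple flow: select the columns, set the seed, call Rtsne::Rtsne(), and attach the two embedding columns to the data. The following is a minimal sketch of that flow under stated assumptions, not Spectre's actual implementation. The helper name sketch.tsne and the synthetic "marker" columns are hypothetical, and the sketch assumes a plain data.frame as input and that the Rtsne package is installed.

```r
# Hypothetical helper for illustration only; not Spectre's implementation.
sketch.tsne <- function(dat, use.cols,
                        tsne.x.name = "tSNE_X", tsne.y.name = "tSNE_Y",
                        tsne.seed = 42, perplexity = 30, ...) {
  # Keep only the columns used for the embedding, as a numeric matrix
  mat <- as.matrix(dat[, use.cols, drop = FALSE])

  # Seed R's random number generator so repeated runs can be compared
  set.seed(tsne.seed)

  # Further Rtsne::Rtsne() arguments (theta, max_iter, eta, ...) pass through `...`
  res <- Rtsne::Rtsne(mat, dims = 2, perplexity = perplexity, ...)

  # res$Y holds one row per input row and one column per output dimension
  dat[[tsne.x.name]] <- res$Y[, 1]
  dat[[tsne.y.name]] <- res$Y[, 2]
  dat
}

# Synthetic stand-in data: 300 rows, 5 made-up marker columns
set.seed(1)
toy <- as.data.frame(matrix(rnorm(300 * 5), ncol = 5))
names(toy) <- paste0("marker", 1:5)

# Only run the embedding when the Rtsne package is available
if (requireNamespace("Rtsne", quietly = TRUE)) {
  toy.tsne <- sketch.tsne(toy, use.cols = paste0("marker", 1:5),
                          check_duplicates = FALSE, verbose = FALSE)
  print(head(toy.tsne))
}
```

The sketch fixes dims = 2 to keep the column handling short; a three-dimensional variant would need a third output column name, which this interface does not define.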

Author

Felix Marsh-Wakefield, felix.marsh-wakefield@sydney.edu.au

Examples

# Run tSNE on a subset of the demonstration dataset

cell.dat <- Spectre::do.subsample(Spectre::demo.clustered, 10000) # Subsample the demo dataset to 10000 cells
cell.dat <- Spectre::run.tsne(dat = cell.dat,
                              use.cols = c("NK11_asinh", "CD3_asinh",
                                           "CD45_asinh", "Ly6G_asinh", "CD11b_asinh",
                                           "B220_asinh", "CD8a_asinh",
                                           "Ly6C_asinh", "CD4_asinh"))
#> Performing PCA
#> Read the 10000 x 9 data matrix successfully!
#> OpenMP is working. 1 threads.
#> Using no_dims = 2, perplexity = 30.000000, and theta = 0.500000
#> Computing input similarities...
#> Building tree...
#>  - point 10000 of 10000
#> Done in 1.44 seconds (sparsity = 0.012680)!
#> Learning embedding...
#> Iteration 50: error is 97.360072 (50 iterations in 1.56 seconds)
#> Iteration 100: error is 86.988747 (50 iterations in 1.57 seconds)
#> Iteration 150: error is 83.922629 (50 iterations in 1.30 seconds)
#> Iteration 200: error is 83.434722 (50 iterations in 1.37 seconds)
#> Iteration 250: error is 83.261604 (50 iterations in 1.40 seconds)
#> Iteration 300: error is 3.176013 (50 iterations in 1.22 seconds)
#> Iteration 350: error is 2.820719 (50 iterations in 1.18 seconds)
#> Iteration 400: error is 2.621339 (50 iterations in 1.17 seconds)
#> Iteration 450: error is 2.489056 (50 iterations in 1.18 seconds)
#> Iteration 500: error is 2.393128 (50 iterations in 1.18 seconds)
#> Iteration 550: error is 2.319920 (50 iterations in 1.19 seconds)
#> Iteration 600: error is 2.262865 (50 iterations in 1.20 seconds)
#> Iteration 650: error is 2.216526 (50 iterations in 1.20 seconds)
#> Iteration 700: error is 2.179152 (50 iterations in 1.20 seconds)
#> Iteration 750: error is 2.149116 (50 iterations in 1.21 seconds)
#> Iteration 800: error is 2.125685 (50 iterations in 1.21 seconds)
#> Iteration 850: error is 2.107875 (50 iterations in 1.21 seconds)
#> Iteration 900: error is 2.094636 (50 iterations in 1.21 seconds)
#> Iteration 950: error is 2.083936 (50 iterations in 1.21 seconds)
#> Iteration 1000: error is 2.074844 (50 iterations in 1.21 seconds)
#> Fitting performed in 25.18 seconds.
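
When subsampling to far fewer cells than in the example above, the default perplexity of 30 may be too large for the data. To my understanding, Rtsne::Rtsne() stops with an error when 3 * perplexity exceeds the number of rows minus one; treat that bound as an assumption and check it against the installed Rtsne version. The helpers below are hypothetical (not part of Spectre or Rtsne) and use base R only.

```r
# Hypothetical helpers, assuming Rtsne requires 3 * perplexity <= n.cells - 1.

# TRUE when the perplexity is small enough for the given number of cells
perplexity.ok <- function(n.cells, perplexity = 30) {
  3 * perplexity <= n.cells - 1
}

# Largest whole-number perplexity that satisfies the assumed bound
max.perplexity <- function(n.cells) {
  floor((n.cells - 1) / 3)
}

ok.large <- perplexity.ok(10000)  # 10000 cells, as in the example above
ok.small <- perplexity.ok(50)     # a much smaller subsample
cap.small <- max.perplexity(50)   # upper limit for 50 cells

print(c(ok.large = ok.large, ok.small = ok.small))
print(cap.small)
```

A check like this can be run before calling run.tsne so that a small subsample fails early with a clear message rather than partway through the workflow.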