clustering

Functions

`calc_perm_variance`(pca, embeddings_df[, ...])	Calculates the variance explained by a PCA of the permuted data.
`get_optimal_n_components`(embeddings[, ...])	Calculates the optimal number of principal components to keep when reducing dimensionality.
`hdbscan_clustering`(reduced[, ...])	Uses HDBSCAN to calculate clusters from the reduced data.
`kmeans_clustering`(reduced[, num_clusters, ...])	Uses k-means to calculate clusters from the reduced vectors.
`reduce_dimensions_pca`(embeddings[, dimensions])	Reduces the number of dimensions using PCA.
`reduce_dimensions_umap`(embeddings[, ...])	Uses UMAP to reduce the dimensionality of the embeddings.
`shuffle`(df)	Shuffles the data in a pandas DataFrame column by column or row by row.
`single_sample_t_test`(sample[, population_stat])	Runs a simple t-test on a sample to check whether it differs significantly from the population mean.
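
The statistic behind a single-sample t-test, as summarized in the last entry above, can be sketched with the standard library alone. This is a minimal illustration of the textbook formula t = (mean - mu) / (s / sqrt(n)), not this module's implementation: the function name `one_sample_t_statistic` and the parameter `population_mean` are hypothetical, and the sketch stops at the t value and degrees of freedom rather than computing a p-value.

```python
import math
import statistics


def one_sample_t_statistic(sample, population_mean=0.0):
    """Return (t, degrees_of_freedom) for a one-sample t-test.

    Hypothetical helper for illustration; it is not the module's
    single_sample_t_test and makes no claim about its signature.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")

    mean = statistics.fmean(sample)
    # Sample standard deviation (n - 1 in the denominator).
    sd = statistics.stdev(sample)
    if sd == 0:
        raise ValueError("sample has zero variance; t is undefined")

    # Standard error of the mean, then the t statistic.
    standard_error = sd / math.sqrt(n)
    t = (mean - population_mean) / standard_error
    return t, n - 1


# Example: compare a small sample against an assumed population mean of 3.
t_value, dof = one_sample_t_statistic([2, 4, 4, 4, 5, 5, 7, 9], population_mean=3.0)
```

A large absolute t value relative to the t distribution with `dof` degrees of freedom suggests the sample mean differs from the population mean. For a p-value, a statistics library such as SciPy would normally be used instead of this sketch.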