Implemented Algorithms
The algorithms currently implemented in the package can be divided in five broad groups.
Intrinsic dimension estimation
These algorithms estimate the intrinsic dimension of the data manifold i.e., the minimum number of coordinates needed to describe the manifold without a significant loss of information. The algorithms currently implemented are:
Two NN (“Two nearest neighbour estimator”)
Gride (“Generalized ratios id estimator”)
Density estimation
These algorithms estimate the density profile from which the dataset was sampled. The algorithms currently implemented are:
k-NN (“k-nearest neighbours estimator”)
PAk (“Point adaptive k-NN estomator”)
k*-NN (“k-star nearest neighbours estimator”)
point-adaptive mean-shift gradient estimator
BMTI (“Binless Multidimensional Thermodynamic Integration”)
Density based clustering
These algorithms find the statistically significant peaks of the density profile and use this information to divide the dataset into clusters of data. The algorithms currently implemented are:
DP (“Density peaks clustering”)
ADP(“Advanced density peaks clustering”)
Metric space comparison
These algorithms estimate and quantify whether two spaces endowed with a distance measure are equivalent or not, and whether one space is more informative than the other. The algorithms currently implemented are:
Neighbourhood overlap
Information imbalance
Feature weighting / Differentiable Information Imbalance
This algorithm estimates the information content of an input set of features with respect to a ground truth (which can be the full set), assigning an optimal weight to each feature. The weights can take the value zero, leading to feature selection. The JAX implementation of this method, in the class DiffImbalance, is not compatible with Python versions lower than 3.9. The algorithm currently implemented is:
DII (“Differentiable Information Imbalance”)