Implemented Algorithms

The algorithms currently implemented in the package can be divided into five broad groups.

Intrinsic dimension estimation

These algorithms estimate the intrinsic dimension of the data manifold, i.e., the minimum number of coordinates needed to describe the manifold without a significant loss of information. The algorithms currently implemented are:

  • Two NN (“Two nearest neighbour estimator”)

  • Gride (“Generalized ratios intrinsic dimension estimator”)
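
As an illustration of the idea behind the Two NN estimator, the following is a minimal NumPy sketch of its maximum-likelihood form. It assumes the density is roughly constant on the scale of the first two neighbours, so that the ratio of the second to the first nearest-neighbour distance follows a Pareto law whose exponent is the intrinsic dimension. The function name `two_nn_id` is hypothetical and is not the package's API, and the package may differ in details such as how outliers are handled.

```python
import numpy as np


def two_nn_id(points):
    """Maximum-likelihood Two-NN estimate of the intrinsic dimension.

    `points` is an (N, D) array. Hypothetical helper, not the package API.
    """
    points = np.asarray(points, dtype=float)
    # Dense pairwise Euclidean distances: O(N^2) memory, fine for a sketch.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # exclude each point from its own neighbours
    # After partitioning, columns 0 and 1 hold the two smallest distances in order.
    nearest = np.partition(dist, 1, axis=1)[:, :2]
    mu = nearest[:, 1] / nearest[:, 0]  # ratio r2 / r1, always >= 1
    # For locally uniform density, mu follows a Pareto law with exponent d,
    # whose maximum-likelihood estimate is N / sum(log(mu)).
    return len(mu) / np.log(mu).sum()


# Usage: a 2-dimensional sheet embedded in 5 dimensions.
rng = np.random.default_rng(0)
sheet = np.zeros((1000, 5))
sheet[:, :2] = rng.uniform(size=(1000, 2))
estimated_id = two_nn_id(sheet)  # expected to be close to 2, not to 5
```

The estimate depends only on the two nearest neighbours of each point, which is why it is insensitive to the embedding dimension and to slow variations of the density.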

Density estimation

These algorithms estimate the density profile from which the dataset was sampled. The algorithms currently implemented are:

  • k-NN (“k-nearest neighbours estimator”)

  • PAk (“Point adaptive k-NN estimator”)

  • k*-NN (“k-star nearest neighbours estimator”)
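
The simplest of these estimators can be sketched in a few lines. The block below is a minimal NumPy version of the textbook k-NN density estimate, in which the density at a point is the number of neighbours k divided by N times the volume of the ball reaching the k-th neighbour. The function name `knn_density` and its parameters are hypothetical and not the package's API; PAk and k*-NN additionally adapt k point by point, which this sketch does not attempt.

```python
import math

import numpy as np


def knn_density(points, k, dimension=None):
    """Textbook k-NN density estimate at every data point.

    `dimension` is the dimension used for the ball volume; it defaults to
    the number of columns, but an intrinsic dimension estimate can be
    passed instead. Hypothetical helper, not the package API.
    """
    points = np.asarray(points, dtype=float)
    n_points, n_features = points.shape
    d = n_features if dimension is None else dimension
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    # Distance from each point to its k-th nearest neighbour.
    r_k = np.partition(dist, k - 1, axis=1)[:, k - 1]
    # Volume of the unit ball in d dimensions.
    unit_ball = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    return k / (n_points * unit_ball * r_k ** d)


# Usage: points drawn uniformly from the unit square, whose true density is 1.
rng = np.random.default_rng(0)
square = rng.uniform(size=(1000, 2))
density = knn_density(square, k=15)
```

A fixed k trades bias against variance uniformly over the dataset: a small k gives noisy estimates, while a large k smooths over regions where the density changes.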

Density-based clustering

These algorithms find the statistically significant peaks of the density profile and use this information to divide the dataset into clusters of data. The algorithms currently implemented are:

  • DP (“Density peaks clustering”)

  • ADP (“Advanced density peaks clustering”)
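
The basic density peaks idea can be sketched as follows. This is a simplified NumPy illustration under stated assumptions, not the package's implementation: the density is replaced by the inverse distance to the k-th neighbour, the number of clusters is given by hand, and centres are chosen as the points with the largest product of density and distance to the nearest denser point. The statistical significance test on the peaks, which the text above describes, is not included. The name `density_peaks` and its parameters are hypothetical.

```python
import numpy as np


def density_peaks(points, k=10, n_clusters=2):
    """Simplified density peaks clustering. Hypothetical helper."""
    points = np.asarray(points, dtype=float)
    n_points = len(points)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Density proxy: inverse distance to the k-th neighbour
    # (column 0 of each sorted row is the point itself, at distance 0).
    rho = 1.0 / np.sort(dist, axis=1)[:, k]
    order = np.argsort(-rho)  # indices by decreasing density
    delta = np.empty(n_points)  # distance to the nearest denser point
    parent = np.full(n_points, -1)  # index of that nearest denser point
    delta[order[0]] = dist[order[0]].max()  # convention for the densest point
    for rank in range(1, n_points):
        i = order[rank]
        denser = order[:rank]
        j = denser[np.argmin(dist[i, denser])]
        delta[i], parent[i] = dist[i, j], j
    # Centres: large density and far from any denser point. The densest
    # point has the largest rho and the largest delta, so it is always chosen.
    centres = np.argsort(-(rho * delta))[:n_clusters]
    labels = np.full(n_points, -1)
    labels[centres] = np.arange(n_clusters)
    # Visiting points by decreasing density guarantees that the parent
    # of each point has already been labelled.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centres


# Usage: two well-separated Gaussian blobs of 100 points each.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(100, 2))
blob_b = rng.normal(loc=10.0, scale=0.5, size=(100, 2))
labels, centres = density_peaks(np.vstack([blob_a, blob_b]))
```

Because every non-centre point inherits the label of its nearest denser neighbour, the assignment needs a single pass and no iterative optimisation.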

Metric space comparison

These algorithms estimate and quantify whether two spaces endowed with a distance measure are equivalent or not, and whether one space is more informative than the other. The algorithms currently implemented are:

  • Neighbourhood overlap

  • Information imbalance
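
To make the notion of one space being more informative than another concrete, here is a minimal NumPy sketch of the information imbalance from space A to space B, assuming the common definition: 2/N times the average rank, measured in B, of each point's nearest neighbour in A. Values near 0 suggest that A predicts neighbourhoods in B, and values near 1 suggest that it does not. The function names are hypothetical and not the package's API.

```python
import numpy as np


def _neighbour_ranks(points):
    """ranks[i, j] = rank of j among the neighbours of i (1 = nearest, self = 0)."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, -1.0)  # force each point to rank 0 in its own row
    # argsort of argsort converts distances into integer ranks per row.
    return np.argsort(np.argsort(dist, axis=1), axis=1)


def information_imbalance(space_a, space_b):
    """Information imbalance from A to B. Hypothetical helper."""
    space_a = np.asarray(space_a, dtype=float)
    space_b = np.asarray(space_b, dtype=float)
    n_points = len(space_a)
    ranks_a = _neighbour_ranks(space_a)
    ranks_b = _neighbour_ranks(space_b)
    # Index of the nearest neighbour of each point in space A.
    nearest_in_a = np.argmax(ranks_a == 1, axis=1)
    # Rank of that same neighbour when distances are measured in space B.
    conditional_ranks = ranks_b[np.arange(n_points), nearest_in_a]
    return 2.0 / n_points * conditional_ranks.mean()


# Usage: the full 3-dimensional space versus its first coordinate alone.
rng = np.random.default_rng(0)
full = rng.uniform(size=(300, 3))
first_only = full[:, :1]
full_to_first = information_imbalance(full, first_only)
first_to_full = information_imbalance(first_only, full)
```

In this example `full_to_first` comes out smaller than `first_to_full`, reflecting that the full space determines neighbourhoods in the single coordinate better than the reverse. The measure is asymmetric by construction, which is what allows it to rank spaces by informativeness rather than only test equivalence.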

Feature weighting

These algorithms estimate the information content of each feature in the dataset with respect to a ground truth (or the full feature set) and assign a weight to each feature. The weights can take the value zero, which amounts to feature selection. The algorithm currently implemented is:

  • DII (“Differentiable Information Imbalance”)