The neigh_graph module
The neighbourhood_graph module contains the NeighGraph class.
It contains different methods and attributes which allow to exploit the structure of the directed neighbourhood graph.
- class neigh_graph.NeighGraph(coordinates=None, distances=None, maxk=None, verbose=False, n_jobs=2)[source]
Computes the directed neighbourhood graph (DNG) based on the kstar optimal neighbourhood selection and other DNG-based quantities. Inherits from class KStar. The DNG is stored in nind_list and can be retrieved using the kstar (inherited from Kstar class) and nind_iptr attributes. Can compute and store distances and vector differences between nodes connected on the DNG. Can compute and store the number of points in common in the neighbourhoods of couples of nodes connected on the DNG. Can use the common neighbours to give a geometric estimate of the overlap between neighbourhoods (various methods implemented).
- nspar
total number of edges in the (sparse) directed graph defined by kstar i.e. the sum over all points of (kstar -1).
- Type:
int
- nind_list
size nspar x 2. Each row contains a couple of indices of edges connected in the DNG stored in order of increasing point index and increasing neighbour rank. Therefore, nind_list assigns a unique index from 0 to nspar-1 to all the edges in the DNG.
- Type:
np.ndarray(int), optional
- nind_iptr
size N+1. For each element i, it stores the 0-th index in nind_list at which all the edges of the form [i,.] (i.e. the ones connecting point i to its neighbours) start. The last entry is set to nind_list.shape[0].
- Type:
np.array(int), optional
- common_neighs_array
size nspar. At position p, it contains the total number of points in common between the neighbourhoods of the two points forming the p-th directed edge of the neighbourhood graph.
- Type:
np.array(int), optional
- common_neighs_mat
size N x N. Entry (i,j) contains the total number of points in common between the neighbourhoods of points i and j.
- Type:
np.ndarray(float), optional
- neigh_similarity_index
size nspar. At position p, it contains an estimate of the overlap between the neighbourhoods of the two points forming the p-th directed edge of the neighbourhood graph.
- Type:
np.ndarray(float), optional
- neigh_similarity_index_mat
size N x N. Entry (i,j) contains an estimate of the overlap between the neighbourhoods of points i and j if (i,j) is an edge of the directed neighbourhood graph; it contains a 0 otherwise.
- Type:
np.ndarray(float), optional
- neigh_vector_diffs
size nspar x dims. At position p, it stores the vector difference from point nind_list[p,0] to point nind_list[p,1].
- Type:
np.ndarray(float), optional
- neigh_dists
size nspar. Stores the distances from each point to its k*-1 nearest neighbors in the order defined by nind_list.
- Type:
np.array(float), optional
- compute_common_neighs(comp_common_neighs_mat=False)[source]
Compute the common number of neighbours between the couple of points (i,j) such that j is in the neighbourhod of i.
The numbers are stored in common_neighs_array. If the flag comp_common_neighs_mat has value True, also the symmetric matrix common_neighs_mat is computed.
- compute_neigh_dists()[source]
Compute the (directed) neighbour distances graph using kstar[i] neighbours for each point i.
Distances are stored in the neigh_dists array.
- compute_neigh_indices()[source]
Compute indices of all couples [i,j] where j is a neighbour of i up to k*-th nearest (excluded). The couples of indices are stored in nind_list. Also compute and fill the attributes nspar (the 0-th shape of nind_list) nind_iptr.
- compute_neigh_similarity_index(method='jaccard')[source]
Compute an estimate of the overlaps between the neighbourhoods of the points connected by edges on the DNG, with values from 0 to 1, and stores them in the neigh_similarity_index attribute. See also the documentation for the neigh_similarity_index attribute for completeness.
- Parameters:
method (str) – currently implemented “jaccard”, “geometric”, “squared_geometric”.
Ω_2. (Let us denote the neighbourhoods of points 1 and 2 respectively by the sets Ω_1 and)
k_1 (Then k_1 = #Ω_1 and k_2 = #Ω_2 are the neighbourhood sizes and)
points (2 = Ω_1 ∩ Ω_2 the number of)
common_neighs_mat[1 (in common between the two neighbourhoods (which can be read off at)
2]
are (if common_neighs_mat has been computed). The methods to compute the neighbourhood similarity index)
"jaccard" – p_1,2 = k_1,2 / (k_1 + k_2 - k_1,2) = #(Ω_1 ∩ Ω_2) / #(Ω_1 ∪ Ω_2), i.e. the Jaccard index
"geometric" – p_1,2 = k_1,2 / sqrt(k_1 * k_2), i.e. the number of common points divided by the geometric mean
geometric" ("squared) – p_1,2 = (k_1,2)^2 / (k_1 * k_2), i.e. the square of the “geometric” version
- compute_neigh_similarity_index_mat(method=None, sparse_mat=False)[source]
Compute, for any couple (i,j) of points connected on the directed neighbourhood graph, an estimate of the overlaps between the neighbourhoods of the points connected by edges on the DNG, with values from 0 to 1 and stores them in the neigh_similarity_index_mat matrix attribute.
- Parameters:
method (str) – currently implemented “jaccard”, “geometric”, “squared_geometric”.
Ω_2. (Let us denote the neighbourhoods of points 1 and 2 respectively by the sets Ω_1 and)
k_1 (Then k_1 = #Ω_1 and k_2 = #Ω_2 are the neighbourhood sizes and)
points (2 = Ω_1 ∩ Ω_2 the number of)
common_neighs_mat[1 (in common between the two neighbourhoods (which can be read off at)
2]
are (if common_neighs_mat has been computed). The methods to compute the neighbourhood similarity index)
"jaccard" – p_1,2 = k_1,2 / (k_1 + k_2 - k_1,2) = #(Ω_1 ∩ Ω_2) / #(Ω_1 ∪ Ω_2), i.e. the Jaccard index
"geometric" – p_1,2 = k_1,2 / sqrt(k_1 * k_2), i.e. the number of common points divided by the geometric mean
geometric" ("squared) – p_1,2 = (k_1,2)^2 / (k_1 * k_2), i.e. the square of the “geometric” version
sparse_mat (bool) – if True, the matrix is returned in sparse format (scipy.sparse.lil_matrix). If False, it
format (is returned in dense)
- compute_neigh_vector_diffs()[source]
Compute the vector differences from each point to its kstar nearest neighbors.
The resulting vectors are stored in neigh_vector_diffs.
- return_sparse_distance_graph()[source]
Return the (directed) neighbour distances graph as a N x N scipy sparse csr_matrix form.
If the attribute neigh_dists is not assigned, invokes method compute_neigh_dists.
- set_kstar(k=0)[source]
Set all elements of kstar to a fixed value k.
Overload set_kstar of the superior class (Kstar). First, call the set_kstar from the KStar class. Then also reset all other NeighGraph attributes depending on kstar to None.
- Parameters:
k – number of neighbours used to compute the density. It can be an iteger or an array of integers