Structural Network Properties
This section describes the large collection of algorithms included in NOESIS for computing different structural network properties. These metrics provide a glimpse into the network topology and structure.
NOESIS is able to compute a large collection of structural network properties. Implementations for such structural properties are provided within the noesis.analysis.structure package.
Computing structural network properties
Structural network properties are implemented as NOESIS tasks, providing a unified interface for these metrics. Computing a structural network property requires instantiating the corresponding task first.
For example, in order to compute the node Katz centrality we would create an instance of the KatzCentrality class as follows:
KatzCentrality katzCentralityTask = new KatzCentrality(network);
Once the task has been created, we can use it to compute the structural network property scores. NOESIS tasks can be synchronously or asynchronously computed. The synchronous alternative blocks the current thread until the metric is computed. Synchronously computing a network property can be done using the call method as follows:
NodeScore scores = katzCentralityTask.call();
This operation returns a NodeScore object, which is a vector of node scores. We can easily get the score associated with a node using the get method as shown below:
double nodeScore = scores.get(nodeIndex);
The asynchronous computation of the structural network property is slightly more involved, but it allows a better use of concurrent system resources. NOESIS supports asynchronous tasks through futures, which provide an interface for representing the result of the asynchronous task. In order to schedule the computation of the structural network property algorithm and retrieve the associated future, we would call the getFuture method as follows:
Future<NodeScore> katzCentralityFuture = katzCentralityTask.getFuture();
This method call launches the algorithm execution, but it does not block the current thread. In order to retrieve the computed scores, we need to use the get method implemented by the Future, as shown below:
NodeScore scores = katzCentralityFuture.get();
If this call is carried out when the execution of the algorithm has already finished, it directly returns the scores associated to the nodes. However, if the algorithm is still being executed, this call blocks the current thread until the computation finishes.
In the previous example, we have shown how to compute a node-related structural network property. NOESIS also supports several link-related metrics through the interface provided by the LinkScoreTask class. Both interfaces provided by NodeScoreTask and LinkScoreTask are very similar, but the latter returns a LinkScore object instead of a NodeScore object.
For example, we would compute link betweenness as follows:
LinkBetweenness linkBetweennessTask = new LinkBetweenness(network);
LinkScore linkBetweennessScores = linkBetweennessTask.call();
Given the computed LinkScore object, we can obtain the link betweenness of a link as shown below:
double score = linkBetweennessScores.get(sourceNodeIndex, targetNodeIndex);
Creating a custom structural network property
NOESIS can be extended with custom structural network properties. In this section we describe how to create metrics both for nodes and for links.
Creating a custom node property
In order to create a node-related structural network property, we must implement a class extending the NodeScoreTask class. This requires implementing the compute method, which takes the index of a node and returns the score associated to it, as shown in the following example:
public class MyLocalNodeScore extends NodeScoreTask
{
public MyLocalNodeScore(Network network)
{
super(network);
}
@Override
public double compute(int node)
{
double nodeScore = ...; // Compute the node score here
return nodeScore;
}
}
However, many metrics require computing all node scores at once. This type of network structure properties can be implemented as shown below:
public class MyGlobalNodeScore extends NodeScoreTask
{
public MyGlobalNodeScore(Network network)
{
super(network);
}
@Override
public void compute ()
{
Network network = getNetwork();
NodeScore score = new NodeScore(this,network);
// Implement your algorithm here
...
// Set the score for each a node as follows
int nodeIndex = ...
double value = ...
score.set(nodeIndex, value);
setResult(score);
}
@Override
public double compute(int node)
{
checkDone();
return getResult(node);
}
}
Creating a custom link property
The implementation of a custom link metric is similar. You must extend the LinkScoreTask class, which requires implementing the compute method. This time, the compute method takes the index of the source and the target nodes of the link as parameters. The batch computation of all link scores can be carried out in a similar fashion to the previous example, as shown below:
public class MyGlobalLinkScore extends LinkScoreTask
{
public MyGlobalLinkScore(Network network)
{
super(network);
}
@Override
public void compute ()
{
Network network = getNetwork();
LinkScore score = new LinkScore(this,network);
// Implement your algorithm here
...
// You can set the score of a link as follows
int sourceNodeIndex = ...
int targetNodeIndex = ...
double value = ...
score.set(sourceNodeIndex, targetNodeIndex, value);
setResult(score);
}
@Override
public double compute(int source, int destination)
{
checkDone();
return getResult().get(source, destination);
}
}
Degree-related properties
Node degree is a central concept in network analysis. This subsection covers different structural network properties related to node degree, including different types of degrees and metrics for degree assortativity.
Degree
The degree of a node is defined as number of links the node has. NOESIS provides three different metrics for node degree: InDegree, OutDegree, and Degree, which measure node degree considering incoming links, outgoing links, and both types of links, respectively. A normalized version of these metrics, which computes the fraction of nodes in the network a node is connected to, is also provided with NormalizedInDegree, NormalizedOutDegree, and NormalizedDegree.
An example of how to compute the out-degree of a given node is shown below:
Network network = ...
OutDegree outDegreeTask = new OutDegree(network);
int nodeIndex = ...
int nodeOutDegree = (int) outDegreeTask.compute(nodeIndex);
Degree assortativity
Assortativity is the tendency of nodes to connect with other similar ones. This similarity is usually defined in terms of node degree: nodes with a high degree tend to form links with other high-degree nodes, whereas low-degree nodes do the same between themselves. This metric, implemented in the DegreeAssortativity class, is based on the idea of measuring the Pearson correlation between the degree of linked nodes in the network.
The following example shows how to compute the overall network degree assortativity:
Network network = ...
DegreeAssortativity degreeAssortativityTask = new DegreeAssortativity(network);
double assortativity = degreeAssortativityTask.networkAssortativity();
Since this task is defined as a node metric, we can obtain the contribution of each node, which is known as local assortativity, to the overall assortativity. This local assortativity can be computed as shown below:
NodeScore nodeAssortativityContrib = degreeAssortativityTask.call();
By definition, this metric tends to favor higher local assortativity scores for low-degree nodes compared to high-degree nodes. NOESIS implements a variant, proposed in the scientific literature, which overcomes this limitation. This technique is implemented by the UnbiasedDegreeAssortativity class, which has the same interface than the DegreeAssortativity class.
References:
- Piraveenan, M., Prokopenko, M., & Zomaya, A. Y. (2008). Local assortativeness in scale-free networks. Europhysics Letters, 84(2), 28002. DOI: 10.1209/0295-5075/84/28002.
- Piraveenan, M., Prokopenko, M., & Zomaya, A. Y. (2010). Classifying complex networks using unbiased local assortativity. In ALIFE (pp. 329-336).
Reachability properties
Node reachability measures how well a node can be reached from the other nodes in the network. These techniques revolve around the concept of path length between pairs of nodes. NOESIS implements different approaches for measuring node reachability, which are introduced in this section.
Eccentricity
The eccentricity of a node is defined as the length of the longest shortest path to any other node in the network. Note that the highest eccentricity score in the network is equal to the network diameter.
This metric can be computed using the Eccentricity class as shown below:
Network network = ...
Eccentricity eccentricityTask = new Eccentricity(network);
NodeScore eccentricity = eccentricityTask.call();
References:
- Hage, P. and Harary, F. (1995). Eccentricity and centrality in networks. Social Networks, 17(1):57–63. DOI: 10.1016/0378-8733(94)00248-9.
Average path length
This metric defines node reachability as the average length of the shortest path to all the other nodes in the network.
An example of how to use this metric, implemented by the AveragePathLength class, is shown in the following code snippet:
Network network = ...
AveragePathLength aplTask = new AveragePathLength(network);
NodeScore apl = aplTask.call();
Closeness
The closeness of a node is defined as the inverse of the average path length to other nodes.
The Closeness class, which implements this metric, can be used as shown below:
Network network = ...
Closeness closenessTask = new Closeness(network);
NodeScore closeness = closenessTask.call();
References:
- Freeman, L. C. (1978). Centrality in social networks conceptual clarification. Social networks, 1(3), 215-239. DOI: 10.1016/0378-8733(78)90021-7.
Decay
This metric, implement by the Decay class, defines node reachability as the sum of the exponential decay of the path length to other nodes in the network. This algorithms takes a decay parameter, with a value between 0 and 1, indicating the decay degree of the path length between pairs of nodes. On the one hand, if decay is set to 0, all paths would count equally regardless of their length. On the other hand, if decay is set to 1, there would be no decay at all, contributing each path according to its specific length.
An example of how to compute this structural network property, using a decay value of 0.4, is shown below:
Network network = ...
Decay decayTask = new Decay(network, 0.4);
NodeScore decay = decayTask.call();
References:
- Jackson, M. O. (2008). Social and economic networks. Princeton University Press, Princeton, NJ, USA.
Betweenness
Node betweenness has been widely studied in the field of network theory. This structural network property measures node centrality according to the amount of shortest paths between pairs of nodes that contain a given node.
In NOESIS, the classical Freeman's betweenness is implemented by the Betweenness class. An example how to use this class in shown in the following code snippet:
Network network = ...
Betweenness betweennessTask = new Betweenness(network);
NodeScore betweenness = betweennessTask.call();
Since betweenness depends on the number of nodes in the network, this metric could be biased if there are different strongly-connected component in the network. An alternative definition of betweenness, which adjusts Betweenness centrality to component sizes, is provided by AdjustedBetweenness.
NOESIS also provides a normalized implementation of AdjustedBetweenness, yielding values between 0 and 1, through the NormalizedBetweenness class.
References:
- Freeman, L. C. (1977). A set of measures of centrality based on betweenness. Sociometry, 35-41. DOI: 10.2307/3033543.
Influence metrics
Several algorithms have been proposed in the scientific literature to measure the influence of a node within a network. In this section, different algorithms implemented in NOESIS for measuring node influence are introduced, including examples of how to use them.
PageRank
PageRank is a popular algorithm for measuring the importance of each node in a network. This algorithm iteratively estimates the probability distribution of reaching each node through random walks starting from the other nodes in the network. PageRank also introduces a damping factor, with a value between 0 and 1, indicating the probability of moving to an adjacent node in the random walk versus jumping to a random node in the network.
The following example shows how to compute PageRank using a damping factor of 0.9 for a given network:
Network network = ...
PageRank pageRankTask = new PageRank(network,0.9);
NodeScore pageRank = pageRankTask.call();
References:
- Page, L., Brin, S., Motwani, R., & Winograd, T. (1998). The PageRank citation ranking: bringing order to the web. Technical report, Stanford InfoLab.
HITS
The Hyperlink-Induced Topic Search (HITS) algorithm measures node relevance using two concepts: hubs and authorities. On the one hand, hubs are nodes linked to many other nodes. On the other hand, authorities are nodes linked by many hubs. The HITS algorithm iteratively computes a pair of scores for each node: one for the degree of being a hub and other for the degree of being an authority.
An example of how to use this metric is shown below:
Network network = ...
HITS hitsTask = new HITS(network);
List<NodeScore> hits = hitsTask.call();
NodeScore hub = hits.get(0);
NodeScore authority = hits.get(1);
Note that this metric, which is implemented as a NodeScoreGroupTask, returns a list of NodeScore objects. The first position in this list contains the NodeScore object that represents the hub centrality of each node, while the second position contains the NodeScore that represents the authority centrality of each node.
References:
- Kleinberg, J. M. (1999). Authoritative sources in a hyperlinked environment. Journal of the ACM, 46(5), 604-632. DOI: 10.1145/324133.324140.
Eigenvector centrality
As other influence metrics previously described, eigenvector centrality assigns a higher influence to nodes linked to other high-influence nodes. This structural network property uses the power iteration algorithm on the network adjacency matrix in order to find its greatest eigenvalue and its associated eigenvector. Each element of this eigenvector can be considered as a measure of node influence for the associated node.
An example of how to use this metric is shown in the following code snippet:
Network network = ...
EigenvectorCentrality ecTask = new EigenvectorCentrality(network);
NodeScore centrality = ecTask.call();
Katz centrality
Katz centrality is a generalization of eigenvector centrality, as this technique also uses the power iteration algorithm to find the eigenvector associated to the greatest eigenvalue of the adjacency matrix. However, Katz centrality introduces two parameters in this process. On the one hand, an attenuation factor alpha regulates the influence of each path between pairs of nodes. On the other hand, a beta parameter controlling the initial node centrality.
An example of how to compute Katz centrality with alpha and beta set to 0.1 and 0.5, respectively, is shown in the following code fragment:
Network network = ...
KatzCentrality katzTask = new KatzCentrality(network, 0.1, 0.5);
NodeScore centrality = katzTask.call();
References:
- Katz, L. (1953). A new status index derived from sociometric analysis. Psychometrika, 39–43. DOI: 10.1007/BF02289026.
Diffusion centrality
Diffusion centrality measures node influence in terms of the number of paths with length equal to or less than a value indicated by a parameter. This model also requires specifying a passing probability parameter, which determines the degree of influence for paths with different length.
The following example shows how to compute this structural network property with paths of length up to 4 and a passing probability of 0.2:
Network network = ...
DiffusionCentrality diffCentTask = new DiffusionCentrality(network, 0.2, 4);
NodeScore centrality = diffCentTask.call();
LeaderRank
The LeaderRank algorithm is very similar to PageRank, with the difference that, in LeaderRank, all nodes are connected to a virtual ground node through bidirectional links. The classical PageRank algorithm is applied over this augmented network to obtain the influence degree of each node. Its original authors suggest that this simple change allows LeaderRank to converge faster than PageRank.
An example of how to use the LeaderRank algorithm is shown below:
Network network = ...
LeaderRank leaderRankTask = new LeaderRank(network);
NodeScore influence = leaderRankTask.call();
References:
- Lü, L., Zhang, Y. C., Yeung, C. H., & Zhou, T. (2011). Leaders in social networks, the delicious case. PloS one, 6(6), e21202. DOI: 10.1371/journal.pone.0021202.
Link properties
This section is devoted to link-related structural network properties. In addition to the techniques described here, more link-related methods can be found in the Link Scoring & Prediction page.
Link betweenness
Betweenness can be also defined for links as the number of shortest paths between pairs of nodes that contain a given link. This structural network property has very interesting applications. For example, it is used for detecting communities in networks by the Girvan–Newman algorithm, described in the Community Detection section.
The following example shows how to apply this metric to a given network:
Network network = ...
LinkBetweenness lbTask = new LinkBetweenness(network);
LinkScore linkBetweenness = lbTask.call();
Link embeddedness
Link embeddedness is defined as the number of shared neighbors between the nodes at the ends of the link. This metric can be seen as a measure of the strength of the link. NOESIS includes a large number of metrics for scoring links. More details about these metrics can be found under the Link Scoring & Prediction section.
The following code snippet shows an example of how to use this metric:
Network network = ...
LinkEmbeddedness linkEmbTask = new LinkEmbeddedness(network);
LinkScore linkEmbeddedness = linkEmbTask.call();
Link neighborhood overlap
The link neighborhood overlap is closely related to link embeddedness. This metric is defined as the number of shared neighbors between the nodes involved in the link, divided by the number of neighbors both nodes are connected to.
An example of how to compute the link neighborhood overlap of a link between two nodes, with indices 13 and 22, is shown below:
Network network = ...
LinkNeighborhoodOverlap lnoTask = new LinkNeighborhoodOverlap(network);
double linkNeighborhoodOverlap = lnoTask.compute(13, 22);
Link neighborhood size
This link metric counts the total number of neighbors that nodes at the ends of the link have.
The following example shows how to compute this metric for a link between two nodes with indices 15 and 27:
Network network = ...
LinkNeighborhoodSize lnsTask = new LinkNeighborhoodSize(network);
double linkNeighborhoodSize = lnsTask.compute(15, 27);
Link rays
Link rays count the number of different paths that pass through the link between neighbors of the nodes at the ends. This metric is defined as the product of the source node in-degree and the destination node out-degree.
An example of how to use this metric is shown below:
Network network = ...
LinkRays linkRaysTask = new LinkRays(network);
LinkScore linkRays = linkRaysTask.call();
int linkRaysScore = (int) linkRays.get(13, 28);
Clustering coefficient
The clustering coefficient measures the tendency of the neighbors of a node to connect between them. For a given node, this structural network property is defined as the number of closed triangles in which the node takes part divided by the number of potential triangles the node could take part. This fraction is a good indicator of the tendency of nodes in a network to cluster together.
The following example show how to use the ClusteringCoefficient class for computing node clustering coefficient:
Network network = ...
ClusteringCoefficient ccTask = new ClusteringCoefficient(network);
NodeScore clusteringCoefficient = ccTask.call();
Since the average clustering coefficient is also of interest, a shortcut for its calculation is given by the averageClusteringCoefficient method, as shown below:
double avgClusteringCoefficient = ccTask.averageClusteringCoefficient();
Robustness coefficient
The robustness coefficient measures the network resilience to attacks or faults. This structural network property is based on measuring how the largest component size changes as the network goes through disintegration by removing its nodes. Nodes are removed according to the order given by a list passed as an argument. The result of this metric is a final score measuring the network robustness. This value can be computed using the getRobustness method of the RobustnessCoefficient class. This class is defined as a node score, where each node is assigned the size of the largest component after its removal.
The following code fragment shows an example of how to compute the robustness coefficient using a custom order list in ascending order:
Network network = ...
List order = ...
boolean descending = false;
RobustnessCoefficient rcTask = new RobustnessCoefficient(network, order, descending);
NodeScore nodeRobustness = rcTask.call();
double robustnessCoefficient = rcTask.getRobustness();
The nodeRobustness object contains the scores associated to nodes, whereas the robustnessCoefficient variable contains the network robustness coefficient.
Other network properties can be used to specify the node removal order. For example, the following code snippet shows how to compute the robustness coefficient by removing nodes according to their degree in descending order:
Network network = ...
NodeScore degree = new Degree(network).call();
boolean descending = true;
RobustnessCoefficient rcTask = new RobustnessCoefficient(network, degree, descending);
double robustnessCoefficient = rcTask.getRobustness();
References:
- Piraveenan, M., Thedchanamoorthy, G., Uddin, S., & Chung, K. S. K. (2013). Quantifying topological robustness of networks under sustained targeted attacks. Social Network Analysis and Mining, 3(4), 939-952. DOI: 10.1007/s13278-013-0118-8.