Path-based algorithms

These algorithms consider the paths between different nodes in the network to find the recommendation scores. The framework contains the following algorithms within this family.

Global Leicht-Holme-Newman index

This algorithm considers that two users are similar if their immediate neighbors in the network are themselves similar.

Reference: E.A. Leicht, P. Holme, M.E.J. Newman. Vertex Similarity in Networks. Physical Review E 73(2): 026120 (2006).

Parameters

  • phi: decay factor for the similarity as we use longer paths between the users.

  • orientation: the method to choose the adjacency matrix. This parameter does not influence in undirected networks.

    • IN: coordinate \(A_{ij}\) shows the weight of the \((j,i)\) edge.

    • OUT: coordinate \(A_{ij}\) shows the weight of the \((i,j)\) edge.

    • UND: coordinate \(A_{ij}\) shows the sum of the weights of the \((i,j)\) and \((j,i)\) edges.

    • MUTUAL: coordinate \(A_{ij}\) shows the sum of the weights of the \((i,j)\) and \((j,i)\) edges (but only if both exist).

  • q: the exponent of the similarity.

Configuration file

Global LHN:
  phi:
    type: double
    values: [0.01,0.1,1,10,100]
  orientation:
    type: orientation
    values: [IN,OUT,UND,MUTUAL]

Katz

This algorithm weights the possible paths between the target and candidate users, giving more weight to those at closer distances.

References:
    1. Katz. A new status index derived from sociometric analysis. Psychometrika 18(1), 39-43 (1953)

    1. Liben-Nowell, D., J. Kleinberg. The Link Prediction Problem for Social Networks. Journal of the American Society for Information Science and Technology 58(7) (2007)

Parameters

  • b: decay factor for the greater distance paths.

  • orientation: the method to choose the adjacency matrix. This parameter does not influence in undirected networks.

    • IN: coordinate \(A_{ij}\) shows the weight of the \((j,i)\) edge.

    • OUT: coordinate \(A_{ij}\) shows the weight of the \((i,j)\) edge.

    • UND: coordinate \(A_{ij}\) shows the sum of the weights of the \((i,j)\) and \((j,i)\) edges.

    • MUTUAL: coordinate \(A_{ij}\) shows the sum of the weights of the \((i,j)\) and \((j,i)\) edges (but only if both exist).

Configuration file

Katz:
  b:
    type: double
    values: [0.01,0.1,1,10,100]
  orientation:
    type: orientation
    values: [IN,OUT,UND,MUTUAL]

Local path index

This algorithm weights the possible paths between the target and candidate users, giving more weight to those at closer distances.

References:
    1. Lü, C. Jin, T. Zhou. Similarity Index Based on Local Paths for Link Prediction of Complex Networks. Physical Review E 80(4): 046122 (2009)

    1. Lü, T. Zhou. Link Prediction in Complex Networks: A survey. Physica A 390(6), 1150-1170 (2011)

Parameters

  • b: decay factor for the greater distance paths.

  • k: the maximum distance between the target and candidate users (greater or equal than 3).

  • orientation: the method to choose the adjacency matrix. This parameter does not influence in undirected networks.

    • IN: coordinate \(A_{ij}\) shows the weight of the \((j,i)\) edge.

    • OUT: coordinate \(A_{ij}\) shows the weight of the \((i,j)\) edge.

    • UND: coordinate \(A_{ij}\) shows the sum of the weights of the \((i,j)\) and \((j,i)\) edges.

    • MUTUAL: coordinate \(A_{ij}\) shows the sum of the weights of the \((i,j)\) and \((j,i)\) edges (but only if both exist).

Configuration file

Local path index:
  b:
    type: double
    values: [0.01,0.1,1,10,100]
  k:
    type: int
    values: [3,4,5,6]
  orientation:
    type: orientation
    values: [IN,OUT,UND,MUTUAL]

Matrix forest

This algorithm takes as score the ratio of the number of spanning divergent forests such that the target and candidate belong to the same tree, rooted in the target user.

References:
    1. Lü, T. Zhou. Link Prediction in Complex Networks: A survey. Physica A 390(6), 1150-1170 (2011)

Parameters

  • alpha: importance of the Laplacian matrix.

  • orientation: the method to choose the adjacency matrix. This parameter does not influence in undirected networks.

    • IN: coordinate \(A_{ij}\) shows the weight of the \((j,i)\) edge.

    • OUT: coordinate \(A_{ij}\) shows the weight of the \((i,j)\) edge.

    • UND: coordinate \(A_{ij}\) shows the sum of the weights of the \((i,j)\) and \((j,i)\) edges.

    • MUTUAL: coordinate \(A_{ij}\) shows the sum of the weights of the \((i,j)\) and \((j,i)\) edges (but only if both exist).

Configuration file

Matrix forest:
  alpha:
    type: double
    values: [0.01,0.1,1,10,100]
  orientation:
    type: orientation
    values: [IN,OUT,UND,MUTUAL]

Pseudo inverse cosine

This algorithm represents each user by the u-th row of the pseudo-inverse of the Laplacian matrix of the network. Then, the score is just the cosine similarity between these two vectors.

References:
    1. Fouss, A. Pirotte, J-M. Renders, M. Saerens. Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation. IEEE TKDE 19(3), pp. 355-369 (2007).

Parameters

  • orientation: the method to choose the adjacency matrix. This parameter does not influence in undirected networks.

    • IN: coordinate \(A_{ij}\) shows the weight of the \((j,i)\) edge.

    • OUT: coordinate \(A_{ij}\) shows the weight of the \((i,j)\) edge.

    • UND: coordinate \(A_{ij}\) shows the sum of the weights of the \((i,j)\) and \((j,i)\) edges.

    • MUTUAL: coordinate \(A_{ij}\) shows the sum of the weights of the \((i,j)\) and \((j,i)\) edges (but only if both exist).

Configuration file

Pseudo-inverse cosine:
  alpha:
    type: double
    values: [0.01,0.1,1,10,100]
  orientation:
    type: orientation
    values: [IN,OUT,UND,MUTUAL]

Shortest path distance

The shortest path distance recommends people in the network who are close to the target user (the closest, the highest the score shall be)-

Reference: D. Liben-Nowell and J. Kleinberg. The link prediction problem for social networks. 12th International Conference on Information and Knowledge Management (CIKM 2003), ACM, 556-559 (2003).

Parameters

  • orientation: the orientation to choose for the edges.

    • IN: it considers the distance between the candidate user and the target user.

    • OUT: it considers the distance between the target and the candidate user.

    • UND: it considers the distance between the target and candidate users if the network was undirected.

    • MUTUAL: in this case, we consider just the most natural distance (the OUT case).

Configuration file

Distance:
  orientation:
    type: orientation
    values: [IN,OUT,UND]