Random walks

These algorithms consider the properties of random walks to determine the recommendation score. We include the following approaches:

Commute time
Hitting time
HITS
PageRank
Personalized PageRank
Personalized HITS
Personalized SALSA
PropFlow
SALSA

Commute time

Average time that a random walker needs to go from the target to the origin user and come back. We consider two variants, depending on the underlying random walk algorithm:

PageRank
Personalized PageRank

Reference: D. Liben-Nowell, D., J. Kleinberg. The Link Prediction Problem for Social Networks. Journal of the American Society for Information Science and Technology 58(7) (2007).

Parameters

r: the PageRank teleport rate.

Configuration file

The non-personalized PageRank version is selected as:

Commute time PageRank:
  r:
    type: double
    range:
    - start: 0.1
      end: 0.99
      step: 0.1

and the personalized one as:

Commute time personalized PageRank:
  r:
    type: double
    range:
    - start: 0.1
      end: 0.99
      step: 0.1

Hitting time

Average time that a random walker needs to go from the target to the candidate user. We consider two variants, depending on the underlying random walk algorithm:

PageRank
Personalized PageRank

Reference: D. Liben-Nowell, D., J. Kleinberg. The Link Prediction Problem for Social Networks. Journal of the American Society for Information Science and Technology 58(7) (2007).

Parameters

r: the PageRank teleport rate.

Configuration file

The non-personalized PageRank version is selected as:

Hitting time PageRank:
  r:
    type: double
    range:
    - start: 0.1
      end: 0.99
      step: 0.1

and the personalized one as:

Hitting time personalized PageRank:
  r:
    type: double
    range:
    - start: 0.1
      end: 0.99
      step: 0.1

HITS

Algorithm based on the Hyperlink-Induced Topic Search algorithm.

Reference: J.M. Kleinberg. Authoritative Sources in a Hyperlinked Environment. Journal of the ACM 46(5), 604-642 (1999).

Parameters

mode: true if we want to use the authorities scores, false if we want to use the hubs scores.

Configuration file

HITS:
  mode:
    type: boolean
    values: [true,false]

PageRank

This algorithm takes the non-personalized PageRank algorithm (initially designed for estimating the importance of web pages) as a recommendation / prediction method.

Reference: S. Brin, L. Page. The Anatomy of a Large-Scale Hypertextual Web Search Engine. 7th Annual International Conference on World Wide Web (WWW 1998), 107-117 (1998).

Parameters

r: teleport rate of the random walk.

Configuration file

PageRank:
  r:
    type: double
    range:
    - start: 0.1
      end: 0.99
      step: 0.1

Personalized HITS

Personalized version of the HITS algorithm, where a teleport probability to the target user of the recommendation has been added.

Reference: A. Goel. The Who-To-Follow System at Twitter: Algorithms, Impact and Further Research. 32rd Annual International Conference on World Wide Web (2014), industry track (2014)

Parameters

mode: true if we want to use the authorities scores, false if we want to use the hubs scores.
alpha: teleport rate.

Configuration file

Personalized HITS:
  alpha:
    type: double
    range:
    - start: 0.1
      end: 0.99
      step: 0.1
  mode:
    type: boolean
    values: [true, false]

Personalized PageRank

Personalized version of the PageRank algorithm, where the random walk always teleports to the target user. Also known as rooted PageRank.

Reference: S. White, P. Smyth. Algorithms for Estimating Relative Importance in Networks. 9th Annual ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2003).

Parameters

r: teleport rate of the random walk.

Configuration file

Personalized PageRank:
  r:
    type: double
    range:
    - start: 0.1
      end: 0.99
      step: 0.1

Personalized SALSA

Personalized version of the SALSA algorithm, where a teleport probability to the target user of the recommendation has been added.

Reference: A. Goel, P. Gupta, J. Sirois, D. Wang, A. Sharma, S. Gurumurthy. The who-to-follow system at Twitter: Strategy, algorithms and revenue impact. Interfaces 45(1), 98-107 (2015)

Parameters

mode: true if we want to use the authorities scores, false if we want to use the hubs scores.
alpha: teleport rate.

Configuration file

Personalized SALSA:
  alpha:
    type: double
    range:
    - start: 0.1
      end: 0.99
      step: 0.1
  mode:
    type: boolean
    values: [true, false]

PropFlow

This algorithm considers the probability that a random walker starting in the target user reaches the candidate user in less than few steps, using the edge weights as transition probabilities.

Reference: R. Lichtenwalter, J. Lussier, N. Chawla. New perspectives and methods in link prediction. 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2010), 243-252 (2010).

Parameters

maxLength: importance of the Laplacian matrix.
orientation: the method to choose the orientation of the paths.
- IN: the walker advances through incoming edges.
- OUT: the walker advances through outgoing edges.
- UND: the walker advances through any edge.
- MUTUAL: the walker advances through reciprocal edges.
weighted: (OPTIONAL) true to use the weights of the edges, false to consider them binary.

Configuration file

PropFlow:
  maxLength:
    type: int
    values: [3,4,5,6]
  orientation:
    type: orientation
    values: [IN,OUT,UND,MUTUAL]
  (weighted:
    type: boolean
    values: [true,false])

SALSA

Adaptation of the Stochastic Approach for Link-Structure Analysis (SALSA) algorithm.

Reference: R. Lempel, S. Moran. SALSA: The Stochastic Approach for Link-Structure Analysis. ACM TOIS 19(2), 131-160 (2001)

Parameters

mode: true if we want to use the authorities scores, false if we want to use the hubs scores.

Configuration file

SALSA:
  mode:
    type: boolean
    values: [true,false]