PGX 1.2.0
Documentation

PageRank Algorithms

Assigns a numerical weight to each vertex, measuring its relative importance within the graph.


Classic PageRank

Time Complexity O(k*E) [k = no. of iterations]    Space Requirement O(N)

The classic PageRank algorithm. The implementation of this algorithm uses an iterative method. At each iteration step, the PageRank value of all nodes in the graph are computed.

Signature

Input Argument Type Comment
G graph
e double maximum tolerated error value. The algorithm will stop once the sum of the error values of all nodes is below this value.
d double damping factor
max_iter int maximum number of iterations that will be performed
Output Argument Type Comment
pg_rank nodeProperty node property holding the (normalized) PageRank value for each node.

Green-Marl Code

/*
* Copyright (C) 2013 - 2015 Oracle and/or its affiliates. All rights reserved.
*/
procedure pagerank(G: graph, e,d: double, max_iter_count: int;
                   pg_rank: nodeProp<double>)
{
    double diff;
    int cnt = 0;
    double N = G.numNodes();
    G.pg_rank = 1 / N;
    do {
        diff = 0.0;
        foreach (t: G.nodes) {
            double val = (1-d) / N + d* 
                sum(w: t.inNbrs) {w.pg_rank / w.outDegree()} ;
            diff += | val - t.pg_rank |;
            t.pg_rank <= val;
        }
        cnt++;
    } while ((diff > e) && (cnt < max_iter_count));
}

Personalized PageRank

Time Complexity O(k*N) [k = no. of iterations]    Space Requirement O(N)

The algorithm computes the Personalized PageRank which evaluates the relative importance of nodes in a graph with respect to a given input node. The implementation uses a power iteration method.

Signature

Input Argument Type Comment
G graph
root node root node to start from.
e double maximum tolerated error value. The algorithm will stop once the sum of the error values of all nodes is below this value.
d double damping factor
max_iter int maximum number of iterations that will be performed.
Output Argument Type Comment
pg_rank nodeProperty node property where the PageRank value for each node will be written.

Green-Marl Code

/*
* Copyright (C) 2013 - 2015 Oracle and/or its affiliates. All rights reserved.
*/
procedure personalized_pagerank(G: graph, v: node, e,d: double, max_iter_count: int; pg_rank: nodeProp<double>)
{
    if (G.numNodes() == 0) {
        return;
    }

    double diff;
    int cnt = 0;
    G.pg_rank = 0;
    v.pg_rank = 1.0;

    do {
        diff = 0.0;
        foreach (t: G.nodes) {
            double val1 = (t==v) ? (1-d) : 0;
            double val2 = d* sum(w: t.inNbrs) {w.pg_rank / w.outDegree()} ;
            double val = val1 + val2;
            diff += | val - t.pg_rank |;
            t.pg_rank <= val;
        }
        cnt++;
    } while ((diff > e) && (cnt < max_iter_count));
}

Approximate PageRank

Time Complexity O(k*E) [k = no. of iterations]    Space Requirement O(N)

Computes an approximate PageRank. This algorithm computes less precise values but runs faster than the classic implementation on certain graphs. The top computed values are usually very close to the actual PageRank values.

Signature

Input Argument Type Comment
G graph
e double maximum tolerated error value. The algorithm will stop once the sum of the error values of all nodes is below this value.
d double damping factor
max_iter int maximum number of iterations that will be performed
Output Argument Type Comment
pg_rank nodeProperty node property holding the (normalized) PageRank value for each node.

Green-Marl Code

procedure pg_rank_delta(G: graph, tolerance,d: double, max_iter_count: int; pg_rank: nodeProp<double>)
{
    nodeProp<bool> active;
    nodeProp<double> my_delta;
    nodeProp<double> new_delta;

    double initial_pg_rank_value = 1.0 / G.numNodes();

    // initialize 
    bool nodes_active = true;
    int iteration = 0;
    double N = G.numNodes();
    G.active = true;
    G.my_delta = initial_pg_rank_value;
    G.new_delta = 0.0;
    G.pg_rank = initial_pg_rank_value;

    // push 
    while (nodes_active && iteration < max_iter_count) {

        nodes_active = false;

        // push 
        foreach(n:G.nodes)(n.active) { 
            foreach(k:n.outNbrs) 
                k.new_delta += n.my_delta / n.outDegree(); 
        } 

        // consolidate 
        foreach(n: G.nodes) {

            double new_page_rank;
            double normalized_delta;

            if (iteration == 0) { // first iteration needs special handling
                new_page_rank = initial_pg_rank_value*(1-d) + d*n.new_delta;
                normalized_delta = new_page_rank - initial_pg_rank_value;
            } 
            else { 
                new_page_rank = n.pg_rank + d*n.new_delta;
                normalized_delta = d*n.new_delta;
            }

            n.pg_rank = new_page_rank;
            n.my_delta = normalized_delta;

            if (normalized_delta < 0) normalized_delta = -normalized_delta;

            if (normalized_delta < tolerance) {
                n.active = false;
            } else {
                n.active = true;
                nodes_active |= true;
            }
            n.new_delta = 0;
        }
        iteration++;
    }
}