public class PageRank extends Object
This implementation requires a set of pages and a set of directed links as input and works as
follows.
In each iteration, the rank of every page is evenly distributed to all pages it points to. Each
page collects the partial ranks of all pages that point to it, sums them up, and applies a
dampening factor to the sum. The result is the new rank of the page. A new iteration is started
with the new ranks of all pages. This implementation terminates after a fixed number of
iterations.
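The iteration described above can be sketched in plain Java, without any Flink APIs. This is an illustrative sketch, not the code of this class: the name `PageRankSketch`, the uniform initial rank of `1/n`, and the update `damping * sum + (1 - damping) / n` are assumptions chosen to match the common formulation of the algorithm.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical plain-Java sketch of the iteration; not the Flink implementation. */
final class PageRankSketch {

    /**
     * Runs a fixed number of iterations.
     * adjacency maps a page ID to the IDs of the pages it links to;
     * every key and target is assumed to appear in pages.
     */
    static Map<Long, Double> rank(
            List<Long> pages, Map<Long, List<Long>> adjacency, double damping, int iterations) {
        int n = pages.size();

        // Assumption: every page starts with the same rank.
        Map<Long, Double> ranks = new HashMap<>();
        for (long page : pages) {
            ranks.put(page, 1.0 / n);
        }

        for (int i = 0; i < iterations; i++) {
            // Each page collects the partial ranks sent by pages that point to it.
            Map<Long, Double> collected = new HashMap<>();
            for (long page : pages) {
                collected.put(page, 0.0);
            }
            for (Map.Entry<Long, List<Long>> entry : adjacency.entrySet()) {
                List<Long> targets = entry.getValue();
                if (targets.isEmpty()) {
                    continue;
                }
                // The rank of a page is split evenly among the pages it points to.
                double share = ranks.get(entry.getKey()) / targets.size();
                for (long target : targets) {
                    collected.merge(target, share, Double::sum);
                }
            }

            // Apply the dampening step (assumed formula) to obtain the new ranks.
            Map<Long, Double> next = new HashMap<>();
            for (long page : pages) {
                next.put(page, damping * collected.get(page) + (1 - damping) / n);
            }
            ranks = next;
        }
        return ranks;
    }
}
```

Calling `PageRankSketch.rank(pages, adjacency, 0.85, 10)` would mirror the fixed-iteration termination described above; the value 0.85 is only a commonly used damping factor, not one taken from this class.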
See the Wikipedia entry for the Page Rank algorithm for background.
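Input files are plain text files and must be formatted as follows:
Pages are listed one ID per line. For example, "1\n2\n12\n42\n63"
gives five pages with IDs 1, 2, 12, 42, and 63.
Links are listed one per line as a pair of page IDs separated by a space. For example, "1 2\n2 12\n1 12\n42 63"
gives four (directed) links (1)->(2),
(2)->(12), (1)->(12), and (42)->(63).
Usage: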
PageRankBasic --pages <path> --links <path> --output <path> --numPages <n> --iterations <n>
If no parameters are provided, the program is run with default data from PageRankData
and 10 iterations.
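To make the input format described above concrete, here is a minimal parsing sketch. The class `PageRankInputSketch` and its methods are hypothetical helpers for illustration and are not part of Flink. Splitting on arbitrary whitespace and skipping blank lines are assumptions that make the parser slightly more lenient than the format requires.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that parses the two input formats from in-memory text. */
final class PageRankInputSketch {

    /** Parses one page ID per line, e.g. "1\n2\n12\n42\n63". */
    static List<Long> parsePages(String text) {
        List<Long> pages = new ArrayList<>();
        for (String line : text.split("\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                pages.add(Long.parseLong(trimmed));
            }
        }
        return pages;
    }

    /** Parses one "source target" pair per line, e.g. "1 2\n2 12". */
    static List<long[]> parseLinks(String text) {
        List<long[]> links = new ArrayList<>();
        for (String line : text.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            // Assumption: any run of whitespace is accepted as the separator.
            String[] parts = trimmed.split("\\s+");
            links.add(new long[] {Long.parseLong(parts[0]), Long.parseLong(parts[1])});
        }
        return links;
    }
}
```

With the sample data from the format description, `parsePages` yields the five page IDs and `parseLinks` yields four source and target pairs.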
This example shows how to use:
Note: All Flink DataSet APIs are deprecated since Flink 1.18 and will be removed in a future Flink major version. You can still build your application with the DataSet API, but you should migrate to the DataStream API and/or the Table API. This class is retained for testing purposes.
Modifier and Type | Class and Description |
---|---|
static class | PageRank.BuildOutgoingEdgeList A reduce function that takes a sequence of edges and builds the adjacency list for the vertex where the edges originate. |
static class | PageRank.Dampener The function that applies the page rank dampening formula. |
static class | PageRank.EpsilonFilter Filter that filters vertices where the rank difference is below a threshold. |
static class | PageRank.JoinVertexWithEdgesMatch Join function that distributes a fraction of a vertex's rank to all neighbors. |
static class | PageRank.RankAssigner A map function that assigns an initial rank to all pages. |
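
The per-page steps that the Dampener and EpsilonFilter summaries describe could look roughly like the sketch below. It is written without Flink types, and the names `RankUpdateSketch`, `dampen`, and `changedBeyond` are hypothetical. Two assumptions are made: the dampening formula is the commonly used `rank * d + (1 - d) / numPages`, and the filter retains pages whose rank changed by more than the threshold.

```java
/** Hypothetical stand-ins for the dampening and epsilon-filter steps; not Flink code. */
final class RankUpdateSketch {

    /** Applies the assumed dampening formula to the summed partial ranks of one page. */
    static double dampen(double collectedRank, double dampening, long numPages) {
        return collectedRank * dampening + (1 - dampening) / numPages;
    }

    /** Returns true if the rank of a page moved by more than epsilon between iterations. */
    static boolean changedBeyond(double oldRank, double newRank, double epsilon) {
        return Math.abs(oldRank - newRank) > epsilon;
    }
}
```

A convergence check built on `changedBeyond` would treat the ranks as stable once no page reports a change above the threshold.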
Constructor and Description |
---|
PageRank() |
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.