Google PageRank Algorithm – References
Google PageRank – Algorithms and Computation
- The PageRank Citation Ranking: Bringing Order to the Web, Page, Brin, Motwani & Winograd (January 29, 1998)
This is the original description of PageRank
- The Anatomy of a Large-Scale Hypertextual Web Search Engine, Brin & Page
This is the original description of Google
- US Patents 7269587, 6799176, 7058628, 6285999
- Efficient Computation of PageRank, Haveliwala (1999)
Shows how the PageRank calculation can be scaled to very large web graphs on modest hardware by partitioning the computation, using single-precision arithmetic, and running only the minimum number of iterations needed to produce a stable ordering.
- The Google Pagerank Algorithm and How It Works (2002)
Description of algorithm with many examples
- Web Page Scoring Systems for Horizontal and Vertical Search, Diligenti, Gori, Maggini (2002)
Describes PageRank and HITS and extends their application to search within a vertical.
- The Mathematical Model of Google, Tal-Ezer
Another description of the PageRank algorithm and its computational issues, using the power method
- US Patent 7028029 – Adaptive computation of ranking
Describes a method to reduce the PageRank computation load. A high percentage of nodes in a PageRank computation converge quickly, so they can be removed from subsequent iterations, adaptively reducing the order of the computation.
- Analysis and Computation of Google’s PageRank (2006) (slides)
Looks at convergence and error bounds on PageRank computation; includes consideration of TrustRank/personalisation vector
Toolbar PageRank, SEO
- Graph fibrations, graph isomorphism, and PageRank (2006) (slides)
Describes fibrations, which in this context are mappings of link structures into smaller link structures that retain key properties under the PageRank calculation, therefore providing a possible way to reduce the size of the PageRank computation (at the cost of first determining the mapping)
- Combating Web Spam with TrustRank, Z. Gyöngyi, H. Garcia-Molina, J. Pedersen
Yahoo!-sponsored research describing TrustRank, which incorporates flow-on effects to other sites from a relatively small set of seed sites of known good or bad reputation, using the same computational framework as PageRank. This is supposedly different from the Google-patented “TrustRank”, according to Matt Cutts’ comments.
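The remark that TrustRank uses the same computational framework as PageRank can be made concrete with a minimal sketch. It is a PageRank-style iteration in which the random jump returns only to a set of trusted seed pages instead of to every page. The function name `trustrank`, the graph format, and the fixed iteration count are assumptions for this example, and the sketch covers only propagation from good seeds; seed selection and the other details of the paper are not modelled.

```python
def trustrank(links, trusted_seeds, damping=0.85, iterations=50):
    """Propagate trust from seed pages along outbound links.

    links: dict mapping each node to a list of nodes it links to.
    trusted_seeds: set of nodes assumed to be of known good reputation.
    Names and defaults are illustrative assumptions.
    """
    nodes = sorted(links)
    # Seed distribution: uniform over the trusted seeds, zero elsewhere.
    seed = {v: (1.0 / len(trusted_seeds) if v in trusted_seeds else 0.0)
            for v in nodes}
    trust = dict(seed)

    for _ in range(iterations):
        # The "jump" term goes only to seeds, unlike plain PageRank.
        new_trust = {v: (1 - damping) * seed[v] for v in nodes}
        for u in nodes:
            outs = links[u]
            if not outs:
                continue  # in this sketch, trust at dangling nodes is simply dropped
            share = damping * trust[u] / len(outs)
            for v in outs:
                new_trust[v] += share
        trust = new_trust
    return trust


web = {
    "good": ["a"],
    "a": ["b"],
    "b": [],
    "spam": ["spam2"],
    "spam2": ["spam"],
}
scores = trustrank(web, trusted_seeds={"good"})
```

In this toy graph trust decays with each hop away from the seed, and the two spam pages, which link only to each other and are unreachable from the seed, receive a score of zero.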