
Thursday, January 22, 2009

How Google’s PageRank predicts Nobel Prize winners

January 21st, 2009 | by KFC


Ranking scientists by their citations (the number of times they are mentioned in other scientists’ papers) is a miserable business. Everybody can point to ways in which this system is flawed:

  • Not all citations are equal: the importance of the citing paper is a significant factor.
  • Scientists in different fields of study use citations in different ways. An average paper is cited about six times in the life sciences, three times in physics, and about once in mathematics.
  • Ground-breaking papers may be cited less often because a field is necessarily smaller in its early days.
  • Important papers often stop being cited once they are incorporated into textbooks.

The pattern of citations between papers forms a complex network, not unlike the one the internet forms. Might that be a clue that points us towards a better way of assessing the merits of the papers that make up the network?

Sergei Maslov from Brookhaven National Laboratory in New York state and Sidney Redner at Boston University have asked themselves just that question and suggest that Google’s PageRank algorithm might throw some light on the matter.

In essence, PageRank counts the number of citations a paper receives (or the number of links that point to a webpage). The more citations a paper receives, the higher it is ranked. But each citation is also weighted according to the ranking of the citing paper, so citations from important papers make the cited paper more important.
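For the curious, here is a minimal Python sketch of that idea. It is the textbook power-iteration form of PageRank, not necessarily the exact variant Maslov and Redner use. The function name `pagerank`, the damping value of 0.85 (the figure usually quoted for the web setting), the iteration count and the toy citation data are all assumptions made for illustration.

```python
def pagerank(citations, damping=0.85, iterations=100):
    """Rank papers from a dict mapping each paper to the papers it cites.

    Illustrative sketch only: damping and iterations are assumed values.
    """
    # Collect every paper that appears as a citing or a cited paper.
    papers = set(citations)
    for cited_list in citations.values():
        papers.update(cited_list)
    papers = sorted(papers)
    n = len(papers)

    # Start with rank spread evenly over all papers.
    rank = {p: 1.0 / n for p in papers}

    for _ in range(iterations):
        # Papers that cite nothing pass their rank evenly to every paper.
        dangling = sum(rank[p] for p in papers if not citations.get(p))
        new_rank = {
            p: (1.0 - damping) / n + damping * dangling / n for p in papers
        }
        # Each citing paper splits its rank among the papers it cites,
        # so a citation from a highly ranked paper is worth more.
        for citing, cited_list in citations.items():
            if cited_list:
                share = damping * rank[citing] / len(cited_list)
                for cited in cited_list:
                    new_rank[cited] += share
        rank = new_rank
    return rank


# Hypothetical toy network: X and Y each receive exactly one citation,
# but X is cited by "hub", which is itself cited three times.
toy = {
    "p1": ["hub"],
    "p2": ["hub"],
    "p3": ["hub"],
    "hub": ["X"],
    "obscure": ["Y"],
}
ranks = pagerank(toy)
for paper in sorted(ranks, key=ranks.get, reverse=True):
    print(paper, round(ranks[paper], 4))
```

A plain citation count would tie X and Y at one citation each. In this sketch X ends up ranked above Y, because its single citation comes from a paper that is itself well cited.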

Maslov and Redner have applied the algorithm to 353,268 articles published by the American Physical Society since 1893 in journals such as Physical Review Letters. And the results are a breath of fresh air.

The top 10 papers by Google PageRank are:

  1. Unitary Symmetry & Leptonic Decays by Cabibbo
  2. Theory of Superconductivity by Bardeen, Cooper & Schrieffer
  3. Self-Consistent Equations . . . by Kohn & Sham
  4. Inhomogeneous Electron Gas by Hohenberg & Kohn
  5. A Model of Leptons by Weinberg
  6. Crystal Statistics . . . by Onsager
  7. Theory of the Fermi Interaction by Feynman & Gell-Mann
  8. Absence of Diffusion in . . . by Anderson
  9. The Theory of Complex Spectra by Slater
  10. Scaling Theory of Localization by Abrahams, Anderson, et al.

That’s an impressive list, not least because most of these authors are Nobel Prize winners. (Curiously, the author of the top paper, Nicola Cabibbo, is not. That ought to be of interest to the Nobel committee, which awarded Makoto Kobayashi and Toshihide Maskawa the 2008 Nobel Prize in Physics for work that was heavily based on Cabibbo’s ideas.)

All of which suggests an idea. Mining the later entries in this list might be a good way of predicting future prize winners. So get your bets in before the bookies get wind of it.

Redner and Maslov conclude: “Google’s PageRank algorithm and its modifications hold great promise for quantifying the impact of scientific publications.”

Can’t argue with that.

Ref: arxiv.org/abs/0901.2640: Promise and Pitfalls of Extending Google’s PageRank Algorithm to Citation Networks
