Predicting Citation Counts Using Text and Graph Mining
Avishay Livne, Eytan Adar, Jaime Teevan, and Susan Dumais

As the volume of scientific literature grows ever faster, it becomes more difficult for researchers to identify promising papers that are likely to become influential in their field. We study the problem of predicting a paper's future citation count (five years forward in our pilot study) given information available at the time of publication. We apply machine learning techniques to a dataset of millions of academic papers from several research domains to identify predictive features, including venue reputation, authors and institutions, citation networks, and content measures. We identify how the predictive power of these features differs across domains, and we identify ways in which citation behaviors might lead to these differences.

to appear, iConference 2013 Workshop on Computational Scientometrics: Theory and Applications