The Most Influential NLP Papers on Google Scholar

A Guide to Getting Started with Academic Literature

Severin Perez
8 min read · Sep 7, 2020

Natural language processing (NLP) is a complex and evolving field. Part computer science, part linguistics, and part statistics, it can be hard to know where to begin. Books and online courses are a great place to start, and project-based learning is always a good idea, but at some point it becomes necessary to dig deeper, and that means looking at the academic literature.

Reading academic literature is an art unto itself, and just because a paper is popular doesn’t mean it’s the right place for a beginner to start. However, there is something to be said for papers that have both withstood the test of time and been widely accepted by experts. If a paper has been consistently cited in academic literature, then it’s probably fair to say that the paper is influential.

There are a variety of sources for finding academic papers online, but one of the best is Google Scholar (GS), which helpfully provides citation data. We’re going to use citation counts as our measure of influence. Unfortunately, GS doesn’t provide an API or other easy way to programmatically access data, so we manually downloaded the first 1,000 search results for the term “natural language processing” and then parsed and analyzed the data.
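
To give a rough idea of what the "parse and analyze" step can look like, here is a minimal sketch in Python. It is not the exact code behind this article. It assumes each saved search result has already been reduced to a title plus the footer text that appears under it, and that the footer carries a "Cited by N" label. The sample records, the `citation_count` and `rank_by_citations` helpers, and the regular expression are all illustrative assumptions.

```python
import re

# Hypothetical records: (title, footer text shown under a search result).
# Titles and counts are made-up placeholders, not real Google Scholar data.
SAMPLE_RESULTS = [
    ("Paper A", "Cited by 1,204 Related articles All 12 versions"),
    ("Paper B", "Related articles All 3 versions"),
    ("Paper C", "Cited by 87 Related articles"),
]

# Assumed label format: "Cited by" followed by digits, with optional commas.
CITED_BY = re.compile(r"Cited by ([\d,]+)")


def citation_count(footer: str) -> int:
    """Return the number after 'Cited by', or 0 if the label is absent."""
    match = CITED_BY.search(footer)
    if match is None:
        return 0
    return int(match.group(1).replace(",", ""))


def rank_by_citations(results):
    """Turn (title, footer) pairs into (title, count) pairs, most cited first."""
    counted = [(title, citation_count(footer)) for title, footer in results]
    return sorted(counted, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for title, count in rank_by_citations(SAMPLE_RESULTS):
        print(f"{count:>6}  {title}")
```

Treating a missing label as zero citations keeps every result in the ranking. If you would rather exclude uncited papers, filter them out before sorting.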
