Key Python Libraries for NLP

All the Tools You Need for Your NLP Workflow

Severin Perez
9 min read · Sep 5, 2020
Photo by Susan Yin on Unsplash

One of the great things about using Python for natural language processing (NLP) is its large ecosystem of tools and libraries. From tokenization to machine learning to data visualization, Python has something for every NLP task in your workflow. Of course, choosing the *right* tool isn’t always easy. Every NLP library provides slightly different functionality and has a slightly different implementation. The key to finding the right tool is knowing what is out there and experimenting with each option until you understand its strengths and weaknesses. To that end, below is a list of the major NLP tools in use today. We recommend you try them all out, if only to play around and see how they work.

Core NLP Tasks

Deconstructing text into a machine-interpretable form, be it a bag-of-words, a matrix, or some other representation, is a critical part of the NLP pipeline. The libraries below provide various mechanisms for these core NLP tasks.
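To make "machine-interpretable form" concrete, here is a minimal sketch of a bag-of-words and a document-term matrix using only the Python standard library. It is an illustration rather than how any of the libraries below work internally; the helper names (`tokenize`, `bag_of_words`, `to_matrix`) and the simple regex tokenizer are assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out runs of letters (a deliberately naive tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(text):
    """Map each token in the text to the number of times it occurs."""
    return Counter(tokenize(text))


def to_matrix(docs):
    """Build a sorted vocabulary and a document-term count matrix (one row per document)."""
    bags = [bag_of_words(doc) for doc in docs]
    vocab = sorted(set().union(*bags))
    # Counter returns 0 for words that a document does not contain.
    matrix = [[bag[word] for word in vocab] for bag in bags]
    return vocab, matrix


docs = ["The cat sat on the mat.", "The dog sat."]
vocab, matrix = to_matrix(docs)
print(vocab)
for row in matrix:
    print(row)
```

Each row of the matrix describes one document and each column one vocabulary word. Real libraries add much more on top of this idea, such as smarter tokenization, stop-word handling, and sparse storage.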

Gensim

Gensim is an open-source Python library used for a variety of tasks, including topic modeling, indexing, and document similarity. Gensim has functionality for latent semantic analysis, non-negative matrix factorization, latent Dirichlet…