
Introduction to Search Relevance Models

Information Retrieval with Term Frequency and TF-IDF Models

9 min read · Oct 14, 2020

One of the core tasks in information retrieval is searching. Anyone who deals with large amounts of text data (and that’s almost all of us) knows how difficult this seemingly simple task can be. If your search term is too broad, you may find yourself sifting through an impossible quantity of documents. And if your search term is too narrow, you could be missing out on relevant results. So how do we decide which documents are the most relevant to our search?

Search relevance is a difficult problem, and modern search engines employ highly sophisticated (and proprietary) algorithms to deal with it. We won’t delve into those algorithms, but let’s look at some simple strategies that you might employ in your own information retrieval applications.

If you want to follow along with the full code and dataset for this article, check out the companion notebook, which includes functions for loading, manipulating, and analyzing term-document matrices and term frequency-inverse document frequency (TF-IDF) matrices. And if you want to learn more about information retrieval, Introduction to Information Retrieval by Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze should be your first stop.

State of the Union

Written by Severin Perez
