How Does a Computer Learn to Read?
Believe it or not, computers are illiterate. Despite appearances, a computer can’t “read” a word, understand its meaning, and place it in context next to adjacent words. To a computer, the words you read on the monitor are just a way of presenting data in a form that humans can understand. The computer itself is operating on a much lower level of abstraction, because it can only store information as bits: ones and zeros that must be strung together to represent something more complex.
Consider the letter a. A computer doesn't have any way to store an a in memory, because computer memory is a string of switches that have only two states: on (1) or off (0). So, the computer has to find a way to represent the letter a with those switches. To solve this problem, humans have created various systems that match letters and symbols with decimal numbers. In ASCII, for example, a is matched to 97. Now, instead of the realm of letters, we're in the realm of numbers, and computers are great with numbers. A 97 can in turn be converted from decimal format into binary format, and we get 01100001, which can be represented by the aforementioned switches in computer memory. To humans, this means a, but as far as the computer is concerned, letters don't exist.
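To make the letter-to-number-to-bits chain concrete, here is a minimal Python sketch. The helper names `char_to_bits`, `bits_to_char`, and `word_to_bits` are made up for this illustration. It relies on Python's built-in `ord`, which returns a Unicode code point; for plain English letters that value happens to match the ASCII number.

```python
def char_to_bits(ch: str) -> str:
    """Return the 8-bit binary string for a single ASCII character."""
    code = ord(ch)              # letter -> decimal number, e.g. "a" -> 97
    if code > 127:
        raise ValueError("this sketch only handles ASCII characters")
    return format(code, "08b")  # decimal -> binary, padded to 8 digits


def bits_to_char(bits: str) -> str:
    """Reverse the process: binary string -> decimal number -> letter."""
    return chr(int(bits, 2))


def word_to_bits(word: str) -> list[str]:
    """Convert each character of a word to its 8-bit pattern."""
    return [char_to_bits(ch) for ch in word]


print(ord("a"))                  # 97
print(char_to_bits("a"))         # 01100001
print(bits_to_char("01100001"))  # a
print(word_to_bits("cat"))       # ['01100011', '01100001', '01110100']
```

Notice that nothing in the round trip knows what a letter is. The only thing connecting 01100001 to a is the lookup convention that people agreed on.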
Setting aside the intricacies of digital storage, let’s consider why this matters in text analytics. Although…