This course has two kinds of meetings: lectures and recitations. You’ll attend lecture twice a week with all of the other students in the course, and recitation once a week with a smaller group of students.
Our first meeting of the semester is at 9:05 AM on Monday, August 21 in Peabody 1040. We’ll go over the structure of the course, access to resources such as readings and slides, and guidelines for success.
Our second (or third, if your recitation meets on Tuesday) meeting is at 9:05 AM on Wednesday, August 23. This meeting will be an overview of the content of the course.
All recitation sections will begin meeting this week. See the recitation schedule.
Due to the tragedy on campus, Tuesday’s classes have been cancelled. In order to keep all the recitations synchronized, Wednesday’s lecture and Thursday and Friday’s recitations are also cancelled. We will resume meeting on Monday, September 11.
Due to the Labor Day holiday on September 4 and the well-being day on September 5, neither lectures nor recitations will meet this entire week. Use this time to get a head start on next week’s reading.
The first unit of this course introduces the fundamental concepts of knowledge, meaning, and information.
Total amount of required reading for this week: 3,700 words
What is knowledge? What does it mean to produce, acquire, possess, organize, or preserve knowledge?
To read before this meeting:
Total amount of required reading for this week: 6,500 words
What is meaning? How do we attribute meaning to gestures, sounds, images, artifacts and other perceptible phenomena?
To read before this meeting:
Due to the well-being day on September 25, neither lectures nor recitations will meet this entire week. Use this time to catch up on lectures or readings that you missed.
Total amount of required reading for this week: 9,500 words
Information is the result of transforming knowledge into a measurable commodity by de-emphasizing or ignoring meaning.
To read before this meeting:
Claude Shannon, an engineer who worked at Bell Labs, developed a mathematical theory of communication that came to be known as “information theory.” The papers in which Shannon developed his theory were originally published in 1948 in two parts in the Bell System Technical Journal. A year later, Warren Weaver published this summary of Shannon’s work.
There is some math in this report. If you’re not mathematically inclined, just skip over it—it isn’t necessary to understand the math in order to understand the basic ideas.
About six years after information theory made its debut, Shannon wrote this one-page editorial.
This short article uses the information-theoretic concept of entropy to explain why it is so easy to identify individual people based on their web browsing activity.
This chapter from science writer James Gleick’s book The Information is an engaging mini-biography of Claude Shannon, but it is also an accessible introduction to information theory.
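If you would like to see entropy as a calculation rather than a formula, the following is a minimal Python sketch. It is not taken from any of the readings. The function names, the list of browsers, and the rough world population figure are all assumptions made up for illustration.

```python
import math
from collections import Counter


def entropy_bits(values):
    """Shannon entropy, in bits, of the observed distribution of values."""
    counts = Counter(values)
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def surprisal_bits(probability):
    """Bits of information gained by observing an outcome with this probability."""
    return -math.log2(probability)


def bits_to_identify(population):
    """Bits needed to single out one member of a population of this size."""
    return math.log2(population)


# Hypothetical sample: the browser reported by each of eight visitors.
browsers = ["Chrome"] * 4 + ["Firefox"] * 2 + ["Safari"] * 2

print(entropy_bits(browsers))    # average bits learned from one visitor's browser
print(surprisal_bits(1 / 8))     # bits learned from a trait only 1 in 8 people share
print(bits_to_identify(8_000_000_000))  # assumed rough world population
```

Each trait a website observes (browser, time zone, installed fonts) contributes some number of bits. Roughly speaking, once the bits from reasonably independent traits add up to the number needed to single out one person in the population, that person can be identified.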
Monday’s lecture will review the concepts introduced in the first unit.
On Wednesday, you will take the midterm exam at the same time and place as you usually attend lecture.
Recitations will not meet this week.
Due to the Fall break on October 19–20, neither lectures nor recitations will meet this entire week. Use this time to enjoy Fall break.
The second unit of this course introduces some key techniques employed by information professionals: classification, deduction, and induction.
Total amount of required reading for this week: 8,400 words
Classification is grouping things together in a principled, systematic way for a specific purpose.
To read before this meeting:
Total amount of required reading for this week: 3,400 words
Deduction is a kind of reasoning about classes that takes the form of a chain of premises and conclusions. Each conclusion in the chain automatically follows from its premises. These chains can be expressed and manipulated using the formal language of Boolean algebra.
To read before this meeting:
This article is by Edmund Berkeley, a pioneer of computer science and co-founder of the Association for Computing Machinery, which is still the primary scholarly association for computer scientists. But he wrote this article in 1937, before he became a computer scientist—because computers had yet to exist. At the time he was a mathematician working at the Prudential life insurance company, where he recognized the usefulness of Boolean algebra for modeling insurance data. He published this article in a professional journal for actuaries (people who compile and analyze statistics and use them to calculate insurance risks and premiums).
Berkeley uses some frightening-looking mathematical notation in parts of this article, but everything he discusses is actually quite simple. The most important parts are:
pages 373–374, where he gives a simple explanation of Boolean algebra,
pages 380–381, where he considers practical applications of Boolean algebra, and
page 383 onward, where he pays close attention to translation back and forth between Boolean algebra and English.
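For a concrete feel for Boolean algebra applied to classes, here is a minimal Python sketch. It does not reproduce Berkeley's notation or his examples. The policy identifiers, the class names, and the helper functions `implies` and `is_valid` are hypothetical. The first part treats classes as sets, and the second part checks by brute force whether a conclusion follows automatically from its premises.

```python
from itertools import product

# Hypothetical classes of insurance policies, represented as sets.
policies = {"P1", "P2", "P3", "P4"}   # the universe of discourse
life = {"P1", "P2"}                   # life insurance policies
lapsed = {"P2", "P3"}                 # policies that have lapsed

life_and_lapsed = life & lapsed       # AND: in both classes
life_or_lapsed = life | lapsed        # OR: in at least one class
not_lapsed = policies - lapsed        # NOT: everything outside the class

# One of De Morgan's laws: NOT (A OR B) equals (NOT A) AND (NOT B).
de_morgan_holds = (policies - (life | lapsed)) == (
    (policies - life) & (policies - lapsed)
)


def implies(p, q):
    """Material implication: 'if p then q'."""
    return (not p) or q


def is_valid(premises, conclusion, n_vars):
    """True if the conclusion holds in every case where all premises hold."""
    for values in product([False, True], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False
    return True


# A valid chain: if a then b, if b then c, therefore if a then c.
chain_is_valid = is_valid(
    [lambda a, b, c: implies(a, b), lambda a, b, c: implies(b, c)],
    lambda a, b, c: implies(a, c),
    3,
)

# An invalid argument: if a then b, and b, therefore a.
converse_is_valid = is_valid(
    [lambda a, b: implies(a, b), lambda a, b: b],
    lambda a, b: a,
    2,
)

print(life_and_lapsed, de_morgan_holds, chain_is_valid, converse_is_valid)
```

Checking every combination of true and false is only practical for a handful of variables, but it shows what "automatically follows" means: no case exists in which the premises are true and the conclusion is false.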
Total amount of required reading for this week: 6,500 words
Induction is a kind of reasoning about classes that seeks reoccurring patterns in how things have been grouped together. Conclusions (predictions of further reoccurrence) follow premises (observations of grouping) not automatically, but with some likelihood. Statisticians offer formal languages to characterize these patterns and to quantify the likelihood of their reoccurrence.
To read before this meeting:
Bill Maron was an engineer at missile manufacturer Ramo-Wooldridge when he began investigating statistical methods for classifying and retrieving documents. In this paper he describes a method for statistically modeling the subject matter of texts. He introduces the basic ideas behind what is now known as a Bayesian classifier, a technique that is still widely used today for a variety of automatic classification tasks from spam filtering to face recognition.
Trigger warning: math. The math is relatively basic, and if you’ve studied any probability, you should be able to follow it. But if not, just skip it: Maron explains everything important about his experiment in plain English. Pay extra attention to what he says about “clue words.”
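If it helps to see the general idea in code, here is a minimal Python sketch of a Bayesian classifier that treats the words of a document as clues to its category. It is not Maron's actual procedure or data. The tiny training set, the category names, the function names, and the use of add-one smoothing are all assumptions chosen to keep the example short.

```python
import math
from collections import Counter, defaultdict


def train(labeled_docs):
    """Count how often each word appears in documents of each category."""
    category_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for words, category in labeled_docs:
        category_counts[category] += 1
        word_counts[category].update(words)
        vocabulary.update(words)
    return category_counts, word_counts, vocabulary


def classify(words, model):
    """Return the category with the highest (log) probability given the words."""
    category_counts, word_counts, vocabulary = model
    total_docs = sum(category_counts.values())
    scores = {}
    for category, n_docs in category_counts.items():
        # Start from how common the category is overall (the prior).
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[category].values())
        for word in words:
            if word not in vocabulary:
                continue  # words never seen in training carry no evidence here
            count = word_counts[category][word]
            # Add-one smoothing (an assumption) avoids taking the log of zero.
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        scores[category] = score
    return max(scores, key=scores.get)


# Hypothetical training documents, already reduced to their clue words.
training = [
    (["circuit", "transistor", "voltage"], "electronics"),
    (["transistor", "amplifier", "circuit"], "electronics"),
    (["program", "compiler", "memory"], "computing"),
    (["memory", "program", "storage"], "computing"),
]

model = train(training)
print(classify(["transistor", "voltage"], model))
print(classify(["program", "memory"], model))
```

The conclusion here is a prediction with some likelihood, not a certainty: a different training set could shift the counts and change the answer.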
Monday’s lecture will review the concepts introduced in the second unit.
On Wednesday, you will take the midterm exam at the same time and place as you usually attend lecture.
Recitations will not meet this week.
Due to the Thanksgiving recess, neither lectures nor recitations will meet this entire week. Use this time to enjoy Thanksgiving.
The third and final unit of this course considers a couple of complex social issues involving information professionals: labor and attention.
Total amount of required reading for this week: 4,400 words
Classification is labor. Information professionals use machines to automate this labor, but it is never fully automated. What kinds of classification labor are done by people, and what kinds are done by machines?
To read before this meeting:
The Bots is a video installation work created by media artists Eva & Franco Mattes. The Frankfurter Kunstverein describes the work as follows:
They present anonymous testimonies from content moderators who have worked for Facebook in Berlin. Six videos have been created … The films were executed with the typical aesthetic and features of online make-up tutorials. The statements in the films are derived from investigative research and interviews conducted with numerous witnesses employed as service providers for Facebook. The films were interpreted by actors so as to anonymise the statements of the content moderators. They perform the role of influencers addressing their followers directly. They recorded the videos using smartphones, for which reason the images are in portrait format. Advice on make-up products alternates with distressing descriptions of moderators’ work.
To view the videos, you will need to log in. I will post the login information to the course announcements list in early November.
In this chapter from her book Behind the Screen, Sarah Roberts provides an overview of commercial content moderation at companies like Facebook. She explains what commercial content moderation is, who does it, and the conditions under which they work.
Total amount of required reading for this week: 10,700 words
Information professionals create systems for classifying things as worthy or not worthy of attention. Knowing what people are paying attention to can be valuable. When information professionals seek to profit from what they know about attention, it raises questions about whom their systems serve.
To read before this meeting:
Wednesday, December 6 is the last day of class, so recitations will not meet this week. Lectures will meet as usual.