Foundations of Information Science

UNC School of Information and Library Science, INLS 201, Fall 2019

August 20
Introduction

Today we’ll meet each other, and I’ll go over the syllabus, class policies, and how to use the course website. You’ll also tell me a little about yourself, and we’ll probably finish early.

After class is over, I’ll post any slides I showed to this website, and (if you are logged in) you will see a link to a PDF of them below.

August 22
The qualitative information professions

Total amount of required reading for this meeting: 13,100 words

This is a professional school, so we’ll begin by examining the “information professions.” What are they? How do they relate to “information schools,” or “information science”? The story is complicated. In 1988, sociologist Andrew Abbott, who was interested in how professions emerge and change, tried to sort it all out.

For today, please read pages 215–226 of Abbott’s “The Information Professions.” In the first three paragraphs he discusses his sociological model of professions—just skim or skip those. Focus on the last paragraph of the introductory section, and the section titled “The Qualitative Task Area.”

To read before this meeting:

  1. Abbott, Andrew. “The Information Professions.” In The System of Professions, 215–246. University of Chicago Press, 1988. PDF.
    13,100 words
    Reading tips

    This is an excerpt from a book that advances a theory about how professions change over time, so there is some discussion of that theory here. Don’t worry too much about that—focus on how Abbott tries to define what “the information professions” are, and especially his discussion of the attempt to create a combined jurisdiction that would unify quantitative and qualitative information.

    Abbott’s story ends in 1988, so there is obviously more to say about what has happened to the “information professions” since then.

August 27
The quantitative information professions

Total amount of required reading for this meeting: 13,100 words

For today, please read pages 226–246 of Abbott’s “The Information Professions,” the sections titled “The Quantitative Task Area” and “The Combined Jurisdiction.”

Abbott’s story ends in 1988, so there is obviously more to say. In class I’ll try to sketch some of the major developments in “the information professions” in the past 30 years.

To read before this meeting:

  1. Abbott, Andrew. “The Information Professions.” In The System of Professions, 215–246. University of Chicago Press, 1988. PDF.
    13,100 words
    Reading tips

    This is an excerpt from a book that advances a theory about how professions change over time, so there is some discussion of that theory here. Don’t worry too much about that—focus on how Abbott tries to define what “the information professions” are, and especially his discussion of the attempt to create a combined jurisdiction that would unify quantitative and qualitative information.

    Abbott’s story ends in 1988, so there is obviously more to say about what has happened to the “information professions” since then.

August 29
Documents and evidence

Total amount of required reading for this meeting: 5,400 words

Our lives and our societies are structured by, and constituted through, documents—and this has been true for a long time.

Today’s reading is the second chapter of Michael Buckland’s book on Information and Society. Buckland is a professor at the Berkeley School of Information, and he was my doctoral advisor.

To read before this meeting:

  1. Buckland, Michael. “Document and Evidence.” In Information and Society, 21–49. MIT Press, 2017. PDF.
    5,400 words

September 3
Ryan is at 4S annual meeting; no class

September 5
Ryan is at 4S annual meeting; no class

September 10
Documents: thinking with eyes and hands

Total amount of required reading for this meeting: 9,200 words

For today we’ll read an article by Bruno Latour, a French philosopher, anthropologist and sociologist. Latour wrote this article to persuade his colleagues in the social sciences that they need to pay more attention to documents and processes of documentation.

To read before this meeting:

  1. Latour, Bruno. “Visualisation and Cognition: Thinking with Eyes and Hands.” Knowledge and Society: Studies in the Sociology of Culture Past and Present 6 (1986): 1–40. PDF.
    9,200 words
    Reading tips

    Latour uses some unusual terminology in this article. He refers to documents as inscriptions and practices of documentation as inscription procedures. He also refers to documents as immutable mobiles, highlighting what he considers to be two of their most important qualities: immutability and mobility.

    Latour is interested in the relationship between practices of documentation and thinking (cognition). His basic argument is that what may seem like great advances in thought are actually better understood as the emergence of new practices of documentation. Latour focuses primarily on documents as aids to visualization rather than as carriers of information. Thus he begins by discussing the emergence of new visualization techniques, such as linear perspective.

September 12
Documents: thinking with eyes and hands [continued]

No new reading for today: finish “Visualisation and Cognition: Thinking with Eyes and Hands,” if you haven’t already.

September 17
The recorded information universe

When Google asserts that its mission is to organize the world’s information, to what is it referring? What does “the world’s information” consist of?

Philosopher of librarianship Patrick Wilson wrestled with the same question in 1968, when he attempted to define the limits of “the bibliographical universe.”

To read before this meeting:

  1. Wilson, Patrick. “The Bibliographical Universe.” In Two Kinds of Power, 6–19. Berkeley: University of California Press, 1968. PDF.

September 19
Information science?

Total amount of required reading for this meeting: 6,900 words

Assignment 1 handed out

Can there be a science of information? It depends on what you mean by “science,” and also on what you mean by “information.”

Information scientist Marcia Bates often, and influentially, reflected on the nature of both information and information science. In 1999 she argued that information science should be understood as a “meta-science.”

To read before this meeting:

  1. Bates, Marcia J. “The Invisible Substrate of Information Science.” Journal of the American Society for Information Science 50, no. 12 (October 1999): 1043–50. http://search.proquest.com/docview/231394612/abstract/DBA8FEBAEA134FE5PQ/1.
    6,900 words

September 24
The ideology of information

Total amount of required reading for this meeting: 4,600 words

Today we return to the question of “the information professions.” Does it make sense to create what Abbott called “the combined jurisdiction”? Philip Agre, a computer-scientist-turned-information-scholar, was skeptical. In the article we’ll read for today, he argues that treating different genres of document all as “information” is a way for information professionals to attempt to extend their influence, and not necessarily the best way to communicate about the content or meaning of documents.

To read before this meeting:

  1. Agre, Philip E. “Institutional Circuitry: Thinking about the Forms and Uses of Information.” Information Technology and Libraries 14, no. 4 (December 1995): 225. https://search.proquest.com/docview/215834010/abstract/D4ABCDE862CC4B56PQ/2.
    4,600 words

September 26
Peer review of assignment #1

Upload your PDF of assignment #1 before class today.

You will spend today’s class peer-reviewing two of your classmates’ assignments.

September 26
Assignment 1 due

October 1
Semiotics: a theory of meaning-making

The way we often talk about information can be misleading: we talk about information being “inside” books, or videos, or databases, or else we talk about information as a kind of invisible substance that travels around… but these metaphors do not reflect how communication actually works. It’s good to keep this in mind to avoid some common traps of thinking about information.

An alternative way of talking about information that avoids some of these problems is to focus on “signs”: how signs are organized into codes or languages, and how signs and codes operate in our broader culture. The study of signs and processes involving signs is known as “semiotics.”

To read before this meeting:

  1. Liebenau, Jonathan, and James Backhouse. “Introduction to Semiotics / Pragmatics.” In Understanding Information, 10–34. Macmillan Information Systems Series. London: Macmillan, 1990. PDF.

October 3
Making meaning by drawing distinctions

Total amount of required reading for this meeting: 16,900 words

Making things meaningful involves drawing distinctions—categorizing and classifying the world around us. Eviatar Zerubavel is a cognitive sociologist, meaning that he studies how social processes shape our thinking, and he’s written a number of fascinating and accessible books on the topic. For today we’ll read some selections from his book about making distinctions in everyday life.

To read before this meeting:

  1. Zerubavel, Eviatar. “Introduction / Islands of Meaning / The Great Divide / The Social Lens.” In The Fine Line, 1–17, 21–24, 61–80. New York: Free Press, 1991. PDF.
    16,900 words
    Reading tips

    Eviatar Zerubavel is a cognitive sociologist, meaning that he studies how social processes shape our thinking, and he’s written a number of fascinating and accessible books on the topic. These are selections from his book The Fine Line about making distinctions in everyday life.

October 8
Analyzing meaning: semantics

An important part of what “information professionals” do is try to analyze, understand, and disambiguate the meanings of signs that others use. Today we’ll try our hand at some semantic analysis.

To read before this meeting:

  1. Liebenau, Jonathan, and James Backhouse. “Semantics.” In Understanding Information, 37–51. Macmillan Information Systems Series. London: Macmillan, 1990. PDF.

October 10
Analyzing the meaning of texts

Total amount of required reading for this meeting: 11,900 words

While some forms of communication are fairly straightforward to analyze semantically, written texts pose a challenge. Even a short passage of text can have virtually endless meanings, as Patrick Wilson explains.

To read before this meeting:

  1. Wilson, Patrick. “Subjects and the Sense of Position.” In Two Kinds of Power, 69–92. Berkeley: University of California Press, 1968. PDF.
    11,900 words
    Reading tips

    In this chapter Patrick Wilson considers the problems that arise when one tries to come up with systematic rules for classifying texts by subject.

    Wilson can be a bit long-winded, but his insights are worth it. (You can skip the very long footnotes, so this reading is actually shorter than it looks.) What Wilson calls a “writing” is more typically referred to as a text. In this chapter he is criticizing the assumptions librarians make when cataloging texts by subject. The “sense of position” in the title of the chapter refers to the librarian’s sense of where in a classification scheme a text should be placed. Although he is talking about library classification, everything Wilson says is also applicable to state-of-the-art machine classification of texts today.

October 15
Classification: analyzing meaning systematically

Total amount of required reading for this meeting: 5,600 words

Assignment 2 handed out

We all categorize and classify all the time, but we don’t always do it intentionally and systematically. Today we’ll try out a form of systematic classification known as faceted classification.

To read before this meeting:

  1. Hunter, Eric. “What Is Classification? / Classification in an Information System / Faceted Classification.” In Classification Made Simple, 3rd ed. Farnham: Ashgate, 2009. PDF.
    5,600 words
    Reading tips

    This is an excerpt from a useful and easy-to-read (and very British) textbook on how to classify things.

October 17
Fall break

October 22
Classifying clouds

Total amount of required reading for this meeting: 10,100 words

Most of us would readily agree that our everyday “folk” classifications are historically contingent and somewhat arbitrary. Yet scientific classification presumably is different: science is the study of reality, and so scientific classifications are “real” in a way that other classifications are not.

Today we’ll discuss historian of science Lorraine Daston’s history of scientists’ attempts to classify clouds.

To read before this meeting:

  1. Daston, Lorraine. “Cloud Physiognomy.” Representations 135, no. 1 (August 1, 2016): 45–71. https://doi.org/10.1525/rep.2016.135.1.45.
    10,100 words
    Reading tips

    Things to focus on in this reading:

    • What’s the difference between variety and variability, and why are both problems for classification?

    • What are some of the possible different approaches that might be taken to classify clouds?

    • What motivated the creation of cloud atlases?

    • What role do images play in cloud atlases?

October 24
Formalization

A formal language differs from natural languages (like English or Japanese) by having strict rules governing its use. Programming languages are formal languages, as are logics like propositional logic. Translating communication into a formal language, or formalization, is a necessary step toward making communication computable.

To read before this meeting:

  1. Liebenau, Jonathan, and James Backhouse. “Syntactics.” In Understanding Information, 55–64. Macmillan Information Systems Series. London: Macmillan, 1990. PDF.

October 24
Assignment 2 due

October 29
Computation

People were building systems to automate information organization and retrieval long before the invention of the computer, but the digital computer made possible many techniques that were previously unfeasible. The invention of computing also gave birth to a theory of computation, which gives us a mathematical framework for characterizing and measuring syntactic labor. Today we’ll look at one of the earliest computational techniques to be applied to information organization: Boolean logic.

To read before this meeting:

  1. Hillis, W. Daniel. “Nuts and Bolts / Universal Building Blocks.” In The Pattern on the Stone, 1–38. New York: Basic Books, 1998. PDF.

October 31
Automating semiotic labor

We’ve looked at how people categorize, classify, and name things of interest. As we’ve seen, this can be hard work, and like other kinds of hard work, people have sought to escape it through automation.

To what extent can the organization of information be automated? Information scholar Julian Warner looks at this question by drawing a distinction between different kinds of semiotic labor.

To read before this meeting:

  1. Warner, Julian. “Forms of Labour in Information Systems.” Information Research 7, no. 4 (2002). http://www.informationr.net/ir/7-4/paper135.html.

November 5
Boolean information retrieval

When using Boolean retrieval, we treat texts as simple sets of words. This allows us to obtain lists of texts in response to queries consisting of words combined with the operators AND, OR, and NOT.

To read before this meeting:

  1. Manning, Christopher D, Prabhakar Raghavan, and Hinrich Schütze. “Boolean Retrieval.” In Introduction to Information Retrieval. Cambridge, UK: Cambridge University Press, 2008. http://nlp.stanford.edu/IR-book/pdf/01bool.pdf.
    Reading tips

    Introduces inverted indexes and shows how simple Boolean queries can be processed using such indexes.
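
If it helps to see the idea in code before class, here is a minimal sketch of Boolean retrieval over an inverted index. It is not taken from the reading: the three-text collection and the names TEXTS, build_inverted_index, and texts_with are invented for illustration, and splitting on whitespace is a simplifying assumption about what counts as a word.

```python
from collections import defaultdict

# Hypothetical three-text collection, invented for illustration.
TEXTS = {
    1: "boolean logic for information retrieval",
    2: "classifying clouds in a cloud atlas",
    3: "information retrieval without boolean logic",
}


def build_inverted_index(texts):
    """Map each word to the set of ids of the texts that contain it."""
    index = defaultdict(set)
    for text_id, text in texts.items():
        # Treat each text as a simple set of words: order and counts are ignored.
        for word in set(text.lower().split()):
            index[word].add(text_id)
    return index


INDEX = build_inverted_index(TEXTS)
ALL_IDS = set(TEXTS)


def texts_with(word):
    """Return the ids of the texts containing the word (empty set if none)."""
    return set(INDEX.get(word.lower(), set()))


# Boolean operators become set operations on the lists of text ids.
both = texts_with("information") & texts_with("retrieval")       # AND
either = texts_with("clouds") | texts_with("logic")              # OR
excluding = texts_with("information") - texts_with("without")    # AND NOT
not_boolean = ALL_IDS - texts_with("boolean")                    # NOT

print(sorted(both), sorted(either), sorted(excluding), sorted(not_boolean))
```

Note that the searcher decides exactly which words and operators to combine; the system only carries out the set operations.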

November 7
A case for Boolean retrieval

Total amount of required reading for this meeting: 2,800 words

Boolean retrieval is sometimes characterized as hopelessly outdated. But there is something to be said for the division of labor—and power—that Boolean retrieval organizes.

To read before this meeting:

  1. Hjørland, Birger. “Classical Databases and Knowledge Organization: A Case for Boolean Retrieval and Human Decision-Making during Searches.” Journal of the Association for Information Science and Technology 66, no. 8 (August 1, 2015): 1559–75. PDF.
    2,800 words
    Reading tips

    This is an excerpt from an article arguing that, though they are perceived as outdated, selection systems based on Boolean algebra (more commonly referred to as Boolean retrieval systems) are preferable for some purposes because they offer more opportunities for human decision-making during searches.

November 12
Evaluating information retrieval

Assignment 3 handed out

Information retrieval may involve varying amounts of automated labor. Deciding whether and how to automate requires some way to evaluate the effects of automation on the quality of the information retrieval system.

To read before this meeting:

  1. Buckland, Michael. “Evaluation of Selection Methods.” In Information and Society, 153–164. MIT Press, 2017. PDF.

November 14
Probability and inductive logic

Information science took a major turn when the designers of information retrieval systems began to explore the statistical modeling of language.

Statistics is hard. Most people don’t intuitively understand probability, including me, and including the vast majority of scientists who rely on statistical methods. So today we’ll review some of the basics, so we know just enough to be dangerous.

To read before this meeting:

  1. Hacking, Ian. An Introduction to Probability and Inductive Logic. Cambridge: Cambridge University Press, 2001. PDF.

November 19
Automatic classification

Total amount of required reading for this meeting: 6,500 words

The shift to statistical modeling in information science can be traced to the work of Bill Maron. Maron was an engineer at missile manufacturer Ramo-Wooldridge when he began investigating statistical methods for classifying and retrieving documents. For today we’ll read a classic paper of Maron’s in which he develops the basic ideas behind the Bayesian classifier, a technique that is still widely used today for a variety of automatic classification tasks from spam filtering to face recognition.

To read before this meeting:

  1. Maron, M. E. “Automatic Indexing: An Experimental Inquiry.” Journal of the ACM 8, no. 3 (July 1961): 404–417. https://doi.org/10.1145/321075.321084.
    6,500 words
    Reading tips

    Bill Maron was an engineer at missile manufacturer Ramo-Wooldridge when he began investigating statistical methods for classifying and retrieving documents. In this paper he describes a method for statistically modeling the subject matter of texts. He introduces the basic ideas behind what is now known as a Bayesian classifier, a technique that is still widely used today for a variety of automatic classification tasks from spam filtering to face recognition.

    Trigger warning: math. The math is relatively basic and if you’ve studied any probability, you should be able to follow it. But if not, just skip it: Maron explains everything important about his experiment in plain English. Pay extra attention to what he says about “clue words.”
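
For those who find code easier to follow than equations, here is a rough sketch of the general idea behind a Bayesian classifier that works from clue words. It is not Maron's own procedure or data: the categories, the training examples, the function names train and classify, and the add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labelled examples: (category, clue words found in the text).
TRAINING = [
    ("logic", ["boolean", "circuit", "gate"]),
    ("logic", ["boolean", "proof"]),
    ("weather", ["cloud", "rain", "atlas"]),
    ("weather", ["cloud", "storm"]),
]


def train(examples):
    """Count how often each category and each (category, clue word) pair occurs."""
    category_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for category, words in examples:
        category_counts[category] += 1
        word_counts[category].update(words)
        vocabulary.update(words)
    return category_counts, word_counts, vocabulary


def classify(words, model):
    """Return the category with the highest posterior score for the clue words."""
    category_counts, word_counts, vocabulary = model
    total_examples = sum(category_counts.values())
    scores = {}
    for category, count in category_counts.items():
        # Start from the prior: how common the category is overall.
        score = math.log(count / total_examples)
        words_in_category = sum(word_counts[category].values())
        for word in words:
            if word not in vocabulary:
                continue  # words never seen in training carry no evidence
            # Add-one smoothing (an assumption) avoids zero probabilities.
            likelihood = (word_counts[category][word] + 1) / (
                words_in_category + len(vocabulary)
            )
            score += math.log(likelihood)
        scores[category] = score
    return max(scores, key=scores.get)


MODEL = train(TRAINING)
print(classify(["boolean", "gate"], MODEL))
print(classify(["cloud", "storm", "unseen"], MODEL))
```

Each clue word nudges the score toward the categories in which it appeared most often during training, which is why the choice of clue words matters so much.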

November 19
Assignment 3 due

November 21
Scrutinizing automatic classification

Total amount of required reading for this meeting: 10,300 words

For the remainder of the semester we’ll be “scrutinizing” some of the selection systems currently organizing us. Bernhard Rieder gets us started by scrutinizing Maron’s Bayes classifier.

To read before this meeting:

  1. Rieder, Bernhard. “Scrutinizing an Algorithmic Technique: The Bayes Classifier as Interested Reading of Reality.” Information, Communication & Society 20, no. 1 (January 2, 2017): 100–117. https://doi.org/10.1080/1369118X.2016.1181195.
    10,300 words

November 26
Scrutinizing recommendation systems

What are the consequences of the shift from 1) information systems that allow us to precisely specify the properties of the things we seek, to 2) information systems that attempt to anticipate our needs or desires and recommend things to us? If a YouTube video, a search result, a fashion brand, a scientific paper, or a restaurant that people discover via a recommendation service becomes popular and successful, is it because that video, result, brand, paper, or restaurant is of high quality, or is it perhaps due in part to the way the recommendation service works? Sociologists Matthew Salganik and Duncan Watts sought to investigate this question by building their own streaming music service.

To read before this meeting:

  1. Salganik, Matthew J., and Duncan J. Watts. “Leading the Herd Astray: An Experimental Study of Self-Fulfilling Prophecies in an Artificial Cultural Market.” Social Psychology Quarterly 71, no. 4 (December 1, 2008): 338–55. https://doi.org/10.1177/019027250807100404.

November 28
Thanksgiving

December 3
Scrutinizing large-scale recommendation systems

Final exam handed out

All selection systems, including recommendation systems, organize how we think about the things selected or recommended. But only a few selection systems become so large-scale that they begin to organize the production of those things. One of these few is YouTube.

To read before this meeting:

  1. Bridle, James. “Something Is Wrong on the Internet.” James Bridle, November 6, 2017. https://medium.com/@jamesbridle/something-is-wrong-on-the-internet-c39c471271d2.

December 7
Final exam due

The final exam is due at 3 PM on Saturday, December 7.