Foundations of Information Science

UNC School of Information and Library Science, INLS 201, Spring 2019

January 10
Introduction

Today we’ll meet each other, and I’ll go over the syllabus, class policies, and how to use the course website. You’ll also tell me a little about yourself, and we’ll probably finish early.

After class is over, I’ll post any slides I showed to this website, and (if you are logged in) you will see a link to a PDF of them below.

January 15
The qualitative information professions

Total amount of reading for this week: 13,100 words

This is a professional school, so we’ll begin by examining the “information professions.” What are they? How do they relate to “information schools,” or “information science”? The story is complicated. In 1988, sociologist Andrew Abbott, who was interested in how professions emerge and change, tried to sort it all out.

For today, please read pages 215–226 of Abbott’s “The Information Professions.” In the first three paragraphs he discusses his sociological model of professions—just skim or skip those. Focus on the last paragraph of the introductory section, and the section titled “The Qualitative Task Area.”

To read before this class:

  1. Abbott, Andrew. “The Information Professions.” In The System of Professions, 215–246. University of Chicago Press, 1988. PDF.
    13,100 words
    Reading tips

    This is an excerpt from a book that advances a theory about how professions change over time, so there is some discussion of that theory here. Don’t worry too much about that—focus on what Abbott means by the qualitative and quantitative information professions, and especially his discussion of the attempt to create a combined jurisdiction that would unify quantitative and qualitative information.

    Abbott’s story ends in 1988, so there is obviously more to say about what has happened to the “information professions” since then.

January 17
The quantitative information professions

Total amount of reading for this week: 13,100 words

For today, please read pages 226–246 of Abbott’s “The Information Professions,” the sections titled “The Quantitative Task Area” and “The Combined Jurisdiction.”

Abbott’s story ends in 1988, so there is obviously more to say. In class I’ll try to sketch some of the major developments in “the information professions” in the past 30 years.

To read before this class:

  1. Abbott, Andrew. “The Information Professions.” In The System of Professions, 215–246. University of Chicago Press, 1988. PDF.
    13,100 words
    Reading tips

    This is an excerpt from a book that advances a theory about how professions change over time, so there is some discussion of that theory here. Don’t worry too much about that—focus on what Abbott means by the qualitative and quantitative information professions, and especially his discussion of the attempt to create a combined jurisdiction that would unify quantitative and qualitative information.

    Abbott’s story ends in 1988, so there is obviously more to say about what has happened to the “information professions” since then.

January 22
Documents: thinking with eyes and hands

Total amount of reading for this week: 9,200 words

For today we’ll read an article by Bruno Latour, a French philosopher, anthropologist and sociologist. Latour wrote this article to persuade his colleagues in the social sciences that they need to pay more attention to documents and processes of documentation.

To read before this class:

  1. Latour, Bruno. “Visualisation and Cognition: Thinking with Eyes and Hands.” Knowledge and Society: Studies in the Sociology of Culture Past and Present 6 (1986): 1–40. PDF.
    9,200 words
    Reading tips

    Latour uses some unusual terminology in this article. He refers to documents as inscriptions and practices of documentation as inscription procedures. He also refers to documents as immutable mobiles, highlighting what he considers to be two of their most important qualities: immutability and mobility.

    Latour is interested in the relationship between practices of documentation and thinking (cognition). His basic argument is that what may seem like great advances in thought are actually better understood as the emergence of new practices of documentation. Latour focuses primarily on documents as aids to visualization rather than as carriers of information. Thus he begins by discussing the emergence of new visualization techniques, such as linear perspective.

January 24
Documents and evidence

Our lives and our societies are structured by, and constituted through, documents—and this has been true for a long time.

Today’s reading is the second chapter of Michael Buckland’s book on Information and Society. Buckland is a professor at the Berkeley School of Information, and he was my doctoral advisor.

To read before this class:

  1. Buckland, Michael. “Document and Evidence.” In Information and Society, 21–49. MIT Press, 2017. PDF.

January 29
The recorded information universe

When Google asserts that its mission is to organize the world’s information, to what is it referring? What does “the world’s information” consist of?

Philosopher of librarianship Patrick Wilson wrestled with the same question in 1968, when he attempted to define the limits of “the bibliographical universe.”

To read before this class:

  1. Wilson, Patrick. “The Bibliographical Universe.” In Two Kinds of Power, 6–19. Berkeley: University of California Press, 1968. PDF.

January 31
Information science?

Total amount of reading for this week: 6,900 words

Can there be a science of information? It depends on what you mean by “science,” and also on what you mean by “information.”

Information scientist Marcia Bates often, and influentially, reflected on the nature of both information and information science. In 1999 she argued that information science should be understood as a “meta-science.”

To read before this class:

  1. Bates, Marcia J. “The Invisible Substrate of Information Science.” Journal of the American Society for Information Science 50, no. 12 (October 1999): 1043–50. http://search.proquest.com/docview/231394612/abstract/DBA8FEBAEA134FE5PQ/1.
    6,900 words

February 5
The ideology of information

Total amount of reading for this week: 4,600 words

Assignment #1 handed out

Today we return to the question of “the information professions.” Does it make sense to create what Abbott called “the combined jurisdiction”? Philip Agre, a computer-scientist-turned-information-scholar, was skeptical. In the article we’ll read for today, he argues that treating different genres of document all as “information” is a way for information professionals to attempt to extend their influence, and not necessarily the best way to communicate about the content or meaning of documents.

To read before this class:

  1. Agre, Philip E. “Institutional Circuitry: Thinking about the Forms and Uses of Information.” Information Technology and Libraries 14, no. 4 (December 1995): 225. https://search.proquest.com/docview/215834010/abstract/D4ABCDE862CC4B56PQ/2.
    4,600 words

February 7
Information science is neither

Total amount of reading for this week: 7,500 words

To wrap up our initial unit on the information professions and information science, we will read an article by Jonathan Furner, chair of the UCLA Department of Information Studies. Furner argues that information science is not about information, nor is it a science.

To read before this class:

  1. Furner, Jonathan. “Information Science Is Neither.” Library Trends 63, no. 3 (2015): 362–77. https://doi.org/10.1353/lib.2015.0009.
    7,500 words

February 12
Peer review of assignment #1

Assignment #1 due

Bring three printed copies of your submission for assignment #1 to class today.

You will spend today’s class peer-reviewing two of your classmates’ submissions, before turning them all in to me.

February 14
Semiotics: a theory of meaning-making

The way we often talk about information can be misleading: we talk about information being “inside” books, or videos, or databases, or else we talk about information as a kind of invisible substance that travels around… but these metaphors do not reflect how communication actually works. It’s good to keep this in mind to avoid some common traps of thinking about information.

An alternative way of talking about information that avoids some of these problems is to focus on “signs”: how signs are organized into codes or languages, and how signs and codes operate in our broader culture. The study of signs and processes involving signs is known as “semiotics.”

To read before this class:

  1. Liebenau, Jonathan, and James Backhouse. “Introduction to Semiotics / Pragmatics.” In Understanding Information, 10–34. Macmillan Information Systems Series. London: Macmillan, 1990. PDF.

February 19
Making meaning by drawing distinctions

Total amount of reading for this week: 16,900 words

Making things meaningful involves drawing distinctions—categorizing and classifying the world around us. Eviatar Zerubavel is a cognitive sociologist, meaning that he studies how social processes shape our thinking, and he’s written a number of fascinating and accessible books on the topic. For today we’ll read some selections from his book about making distinctions in everyday life.

To read before this class:

  1. Zerubavel, Eviatar. “Introduction / Islands of Meaning / The Great Divide / The Social Lens.” In The Fine Line, 1–17, 21–24, 61–80. New York: Free Press, 1991. PDF.
    16,900 words

February 21
Scheduling conflict

No class.

February 26
Analyzing meaning: semantics

An important part of what “information professionals” do is try to analyze, understand, and disambiguate the meanings of signs that others use. Today we’ll try our hand at some semantic analysis.

To read before this class:

  1. Liebenau, Jonathan, and James Backhouse. “Semantics.” In Understanding Information, 37–51. Macmillan Information Systems Series. London: Macmillan, 1990. PDF.

February 28
Analyzing the meaning of texts

Total amount of reading for this week: 11,900 words

Assignment #2 handed out

While some forms of communication are fairly straightforward to analyze semantically, written texts pose a challenge. Even a short passage of text can have virtually endless meanings, as Patrick Wilson explains.

To read before this class:

  1. Wilson, Patrick. “Subjects and the Sense of Position.” In Two Kinds of Power, 69–92. Berkeley: University of California Press, 1968. PDF.
    11,900 words
    Reading tips

    Wilson can be a bit long-winded, but his insights are worth it. (You can skip the very long footnotes, so this reading is actually shorter than it looks.) What Wilson calls a “writing” is more typically referred to as a text. In this chapter he is criticizing the assumptions librarians make when cataloging texts by subject. The “sense of position” in the title of the chapter refers to the librarian’s sense of where in a classification scheme a text should be placed. Although he is talking about library classification, everything Wilson says is also applicable to state-of-the-art machine classification of texts today.

March 5
Classification: analyzing meaning systematically

Total amount of reading for this week: 5,600 words

We all categorize and classify all the time, but we don’t always do it intentionally and systematically. Today we’ll try out a form of systematic classification known as faceted classification.

To read before this class:

  1. Hunter, Eric. “What Is Classification? / Classification in an Information System / Faceted Classification.” In Classification Made Simple, 3rd ed. Farnham: Ashgate, 2009. PDF.
    5,600 words
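
As a rough illustration of the general idea (not drawn from Hunter’s chapter), a faceted scheme describes each item by choosing a value from each of several independent facets, rather than assigning it a single slot in one hierarchy. The facets, values, and item names in this sketch are all hypothetical, invented only to show the mechanics.

```python
# Hypothetical facets and values, invented for illustration only.
FACETS = {
    "material": {"cotton", "wool", "leather"},
    "garment": {"shirt", "sock", "jacket"},
    "season": {"summer", "winter"},
}


def classify(**values):
    """Return a faceted description, checking each value against its facet."""
    for facet, value in values.items():
        if facet not in FACETS:
            raise ValueError(f"unknown facet: {facet}")
        if value not in FACETS[facet]:
            raise ValueError(f"{value!r} is not a value of facet {facet!r}")
    return dict(values)


def select(items, **wanted):
    """Return the names of items that match every requested facet value."""
    return sorted(
        name
        for name, description in items.items()
        if all(description.get(facet) == value for facet, value in wanted.items())
    )


# Hypothetical items, each described by one value per facet.
ITEMS = {
    "item-1": classify(material="wool", garment="sock", season="winter"),
    "item-2": classify(material="cotton", garment="shirt", season="summer"),
    "item-3": classify(material="wool", garment="jacket", season="winter"),
}

print(select(ITEMS, material="wool"))                  # ['item-1', 'item-3']
print(select(ITEMS, material="wool", garment="sock"))  # ['item-1']
```

Because the facets are independent, any combination of them can serve as a query, which is the practical difference from a single fixed hierarchy.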

March 7
Classifying clouds

Assignment #2 due

Total amount of reading for this week: 10,100 words

Most of us would readily agree that our everyday “folk” classifications are historically contingent and somewhat arbitrary. Yet scientific classification presumably is different: science is the study of reality, and so scientific classifications are “real” in a way that other classifications are not.

Today we’ll discuss historian of science Lorraine Daston’s history of scientists’ attempts to classify clouds.

To read before this class:

  1. Daston, Lorraine. “Cloud Physiognomy.” Representations 135, no. 1 (August 1, 2016): 45–71. https://doi.org/10.1525/rep.2016.135.1.45.
    10,100 words
    Reading tips

    Things to focus on in this reading:

    • What’s the difference between variety and variability, and why are both problems for classification?

    • What are some of the possible different approaches that might be taken to classify clouds?

    • What motivated the creation of cloud atlases?

    • What role do images play in cloud atlases?

March 12
Spring break

No class.

March 14
Spring break

No class.

March 19
Formalization

A formal language differs from natural languages (like English or Japanese) by having strict rules governing its use. Programming languages are formal languages, as are logics like propositional logic. Translating communication into a formal language, or formalization, is a necessary step toward making communication computable.

To read before this class:

  1. Liebenau, Jonathan, and James Backhouse. “Syntactics.” In Understanding Information, 55–64. Macmillan Information Systems Series. London: Macmillan, 1990. PDF.
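
To make formalization concrete, here is a minimal sketch that translates one English rule into propositional logic and then evaluates it mechanically. The rule itself (“a loan is overdue if the book is checked out and its due date has passed, unless it has been renewed”) and all the names are hypothetical, chosen only for illustration. They do not come from the reading.

```python
from itertools import product


def overdue(checked_out, past_due, renewed):
    """Formalized rule: checked_out AND past_due AND (NOT renewed)."""
    return checked_out and past_due and not renewed


def truth_table(fn, n_args):
    """List (inputs, output) for every combination of True/False inputs."""
    return [
        (values, fn(*values))
        for values in product([False, True], repeat=n_args)
    ]


# Once the rule is formal, every possible case can be checked by machine.
table = truth_table(overdue, 3)
for values, result in table:
    print(values, "->", result)
```

The English sentence leaves room for interpretation (does “unless” cover a renewal made after the due date?). The formal version does not, which is both what makes it computable and what it costs.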

March 21
Computation

People were building systems to automate information organization and retrieval long before the invention of the computer, but the digital computer made possible many techniques that were previously unfeasible. The invention of computing also gave birth to a theory of computation, which gives us a mathematical framework for characterizing and measuring syntactic labor. Today we’ll look at one of the earliest computational techniques to be applied to information organization: Boolean logic.

To read before this class:

  1. Hillis, W. Daniel. “Nuts and Bolts / Universal Building Blocks.” In The Pattern on the Stone, 1–38. New York: Basic Books, 1998. PDF.
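
One standard illustration of Boolean logic, offered here as a sketch and not necessarily the example used in the reading or in class, is that the familiar operations NOT, AND, and OR can all be built from a single operation, NAND. The function names below are my own.

```python
from itertools import product


def nand(a, b):
    """NAND is False only when both inputs are True."""
    return not (a and b)


def not_(a):
    # NOT a  ==  a NAND a
    return nand(a, a)


def and_(a, b):
    # a AND b  ==  NOT (a NAND b)
    return not_(nand(a, b))


def or_(a, b):
    # a OR b  ==  (NOT a) NAND (NOT b)
    return nand(not_(a), not_(b))


# Check the constructions against Python's built-in operators
# for every combination of inputs.
for a, b in product([False, True], repeat=2):
    assert and_(a, b) == (a and b)
    assert or_(a, b) == (a or b)
    print(a, b, "AND:", and_(a, b), "OR:", or_(a, b))
```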

March 26
Catch-up day

Today I’ll return assignment #2 and we’ll discuss it, and we’ll finish the Boolean algebra activity from last week.

March 28
Automating semiotic labor

We’ve looked at how people categorize, classify, and name things of interest. As we’ve seen, this can be hard work, and like other kinds of hard work, people have sought to escape it through automation.

To what extent can the organization of information be automated? Information scholar Julian Warner looks at this question by drawing a distinction between different kinds of semiotic labor.

To read before this class:

  1. Warner, Julian. “Forms of Labour in Information Systems.” Information Research 7, no. 4 (2002). http://www.informationr.net/ir/7-4/paper135.html.

April 2
Boolean information retrieval

When using Boolean retrieval, we treat texts as simple sets of words. This allows us to obtain lists of texts in response to queries consisting of words combined with the operators AND, OR, and NOT.

To read before this class:

  1. Manning, Christopher D., Prabhakar Raghavan, and Hinrich Schütze. “Boolean Retrieval.” In Introduction to Information Retrieval. Cambridge, UK: Cambridge University Press, 2008. http://nlp.stanford.edu/IR-book/pdf/01bool.pdf.
    Reading tips

    Introduces inverted indexes and shows how simple Boolean queries can be processed using such indexes.
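
The ideas above can be sketched in a few lines. This is a toy version under stated assumptions, not the implementation from the textbook: the three “documents” are invented, words are split on whitespace with no stemming, and the helper names (`build_index`, `term`, `AND`, `OR`, `NOT`) are my own.

```python
from collections import defaultdict

# Toy collection, invented for illustration.
DOCS = {
    1: "clouds are classified by form and altitude",
    2: "librarians classify books by subject",
    3: "clouds and books have little in common",
}


def build_index(docs):
    """Inverted index: map each word to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Each text is treated as a simple set of words.
        for word in set(text.lower().split()):
            index[word].add(doc_id)
    return index


INDEX = build_index(DOCS)
ALL_IDS = set(DOCS)


def term(word):
    """Look up the set of documents containing a word (empty if none)."""
    return INDEX.get(word.lower(), set())


# The Boolean operators become set operations on document IDs.
def AND(a, b):
    return a & b


def OR(a, b):
    return a | b


def NOT(a):
    return ALL_IDS - a


print(sorted(AND(term("clouds"), term("books"))))       # [3]
print(sorted(OR(term("clouds"), term("books"))))        # [1, 2, 3]
print(sorted(AND(term("clouds"), NOT(term("books")))))  # [1]
```

Note that a query for “classify” would miss document 1, which contains only “classified.” The system matches strings of characters, and deciding which strings count as the same word is left to the searcher.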

April 4
A case for Boolean retrieval

Total amount of reading for this week: 2,800 words

Assignment #3 handed out

Boolean retrieval is sometimes characterized as hopelessly outdated. But there is something to be said for the division of labor—and power—that Boolean retrieval organizes.

To read before this class:

  1. Hjørland, Birger. “Classical Databases and Knowledge Organization: A Case for Boolean Retrieval and Human Decision-Making during Searches.” Journal of the Association for Information Science and Technology 66, no. 8 (August 1, 2015): 1559–75. PDF.
    2,800 words

April 9
Evaluating information retrieval

Information retrieval may involve varying amounts of automated labor. Deciding whether and how to automate requires some way to evaluate the effects of automation on the quality of the information retrieval system.

To read before this class:

  1. Buckland, Michael. “Evaluation of Selection Methods.” In Information and Society, 153–164. MIT Press, 2017. PDF.
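
Two measures commonly used to evaluate retrieval are precision (what fraction of the retrieved items are relevant) and recall (what fraction of the relevant items were retrieved). Whether and how we use them in class is an assumption on my part, and the document IDs and relevance judgments below are hypothetical. A minimal sketch:

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved and relevant items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # items that are both retrieved and relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search: 4 items retrieved, 3 of them relevant,
# out of 6 relevant items in the whole collection.
p, r = precision_recall({1, 2, 3, 4}, {2, 3, 4, 5, 6, 7})
print(p, r)  # 0.75 0.5
```

Both numbers depend on someone having judged which items are relevant, so the evaluation of an automated system still rests on human labor.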

April 11
Probability and inductive logic

Assignment #3 due

Information science took a major turn when the designers of information retrieval systems began to explore the statistical modeling of language.

Statistics is hard. Most people don’t intuitively understand probability, including me, and including the vast majority of scientists who rely on statistical methods. So today we’ll review some of the basics, so we know just enough to be dangerous.

To read before this class:

  1. Hacking, Ian. An Introduction to Probability and Inductive Logic. Cambridge: Cambridge University Press, 2001. PDF.

April 16
Automatic classification

Total amount of reading for this week: 6,500 words

The shift to statistical modeling in information science can be traced to the work of Bill Maron. Maron was an engineer at missile manufacturer Ramo-Wooldridge when he began investigating statistical methods for classifying and retrieving documents. For today we’ll read a classic paper of Maron’s in which he develops the basic ideas behind the Bayesian classifier, a technique that is still widely used today for a variety of automatic classification tasks from spam filtering to face recognition.

To read before this class:

  1. Maron, M. E. “Automatic Indexing: An Experimental Inquiry.” Journal of the ACM 8, no. 3 (July 1961): 404–417. https://doi.org/10.1145/321075.321084.
    6,500 words
    Reading tips

    Trigger warning: math. The math is relatively basic and if you’ve studied any probability, you should be able to follow it. But if not, just skip it: Maron explains everything important about his experiment in plain English. Pay extra attention to what he says about “clue words.”
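
For those who prefer code to equations, here is a minimal sketch of a Bayesian text classifier. It follows a common modern textbook formulation (word counts per category, add-one smoothing, log probabilities) and is not a reconstruction of Maron’s own procedure. The categories and training texts are hypothetical, and the words in them play roughly the role of “clue words.”

```python
import math
from collections import Counter, defaultdict

# Hypothetical training data: (category, text) pairs.
TRAINING = [
    ("weather", "cloud rain storm cloud"),
    ("weather", "storm wind rain"),
    ("library", "catalog book index"),
    ("library", "book shelf catalog book"),
]


def train(examples):
    """Count documents per category and word occurrences per category."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for category, text in examples:
        words = text.split()
        doc_counts[category] += 1
        word_counts[category].update(words)
        vocab.update(words)
    return doc_counts, word_counts, vocab


def classify(text, model):
    """Return the category with the highest (log) posterior score."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for category in doc_counts:
        # Prior: how common is this category among the training documents?
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for word in text.split():
            if word not in vocab:
                continue  # words never seen in training carry no evidence
            # Likelihood of the word given the category, with add-one smoothing.
            p = (word_counts[category][word] + 1) / (total_words + len(vocab))
            score += math.log(p)
        scores[category] = score
    return max(scores, key=scores.get)


MODEL = train(TRAINING)
print(classify("rain cloud", MODEL))  # weather
print(classify("book index", MODEL))  # library
```

The classifier knows nothing about weather or libraries. It only counts which words co-occurred with which labels in the examples it was given.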

April 18
Scrutinizing automatic classification

Total amount of reading for this week: 10,300 words

For the remainder of the semester we’ll be “scrutinizing” some of the selection systems currently organizing us. Bernhard Rieder gets us started by scrutinizing Maron’s Bayes classifier.

To read before this class:

  1. Rieder, Bernhard. “Scrutinizing an Algorithmic Technique: The Bayes Classifier as Interested Reading of Reality.” Information, Communication & Society 20, no. 1 (January 2, 2017): 100–117. https://doi.org/10.1080/1369118X.2016.1181195.
    10,300 words

April 23
Scrutinizing recommendation systems

What are the consequences of the shift from 1) information systems that allow us to precisely specify the properties of the things we seek, to 2) information systems that attempt to anticipate our needs or desires and recommend things to us? If a YouTube video, a search result, a fashion brand, a scientific paper, or a restaurant that people discover via a recommendation service becomes popular and successful, is it because that video, result, brand, paper, or restaurant is of high quality, or is it perhaps due in part to the way the recommendation service works? Sociologists Matthew Salganik and Duncan Watts sought to investigate this question by building their own streaming music service.

To read before this class:

  1. Salganik, Matthew J., and Duncan J. Watts. “Leading the Herd Astray: An Experimental Study of Self-Fulfilling Prophecies in an Artificial Cultural Market.” Social Psychology Quarterly 71, no. 4 (December 1, 2008): 338–55. https://doi.org/10.1177/019027250807100404.

April 25
Scrutinizing large-scale recommendation systems

Final exam handed out

All selection systems, including recommendation systems, organize how we think about the things selected or recommended. But only a few selection systems become so large-scale that they begin to organize the production of those things. One of these few is YouTube.

To read before this class:

  1. Bridle, James. “Something Is Wrong on the Internet.” James Bridle, November 6, 2017. https://medium.com/@jamesbridle/something-is-wrong-on-the-internet-c39c471271d2.

May 6
Final exam due