Foundations of Information Science

UNC SILS, INLS 201, Fall 2020

August 10

Synchronous recitations begin

August 10
Recitation sections begin meeting

All recitation sections will begin meeting the week starting August 10.

See the recitation schedule and Zoom links.

August 11

Synchronous all hands meeting

August 11
All hands meeting

View slides Updated Tuesday 8/11 10:34 AM

On Tuesday, August 11, starting at 11:30AM Eastern time, we will have our only synchronous meeting of all the INLS 201 sections.

The instructors will introduce themselves, and we’ll go over the syllabus, class policies, and how to use the course website.

Zoom link: https://unc.zoom.us/j/91053749590

The password for the meeting will be posted to Sakai.

The meeting will be recorded.

After the meeting is over, I’ll post any slides I showed to this website, and (if you are logged in) you will see a link to a PDF of them below.

What are data and information ( professions | technologies ) ?

During the first part of this course, we'll try to understand three interrelated phenomena:

Data and information
Data and information professions
Data and information technologies

We'll start the first week by looking at the recent history and present of the “data and information professions” in the United States, and how “data and information” emerge from the interaction of these professions with processes of technological change.

Each of the next three weeks we will examine a different paradigm for understanding data and information: 1) documentation, 2) semiotics, and 3) information theory. Each paradigm is more abstract than the one before it: more general, but also more distant from actual practices of using data and information.

August 17
The data and information professions

View slides Updated Friday 8/14 7:13 PM

Total amount of required reading for this meeting: 13,100 words

SILS is a professional school, so we’ll begin by examining the “data and information professions.” What are they? How do they relate to “information schools,” or “data science”? The story is complicated. In 1988, sociologist Andrew Abbott, who was interested in how professions emerge and change, tried to sort it all out.

For this week, please read Abbott’s “The Information Professions,” a chapter in his book The System of Professions.

📖 To read before this meeting:

Abbott, Andrew. “The Information Professions.” In The System of Professions, 215–246. University of Chicago Press, 1988. PDF.

13,100 words

Reading tips

This is an excerpt from a book that advances a theory about how professions change over time, so there is some discussion of that theory here. Don’t worry too much about that—focus on how Abbott tries to define what “the information professions” are, and especially his discussion of the attempt to create a combined jurisdiction that would unify quantitative and qualitative information.

Abbott’s story ends in 1988, so there is obviously more to say about what has happened to the “information professions” since then.
Optional

Reid, Edna O. “Transitioning from an M.L.S. to M.I.S. Career. A Case Study from a Black Information Scientist.” In The Black Librarian in America Revisited, edited by E. J. Josey, 216–223. Scarecrow Press, 1994. PDF.

2,400 words

Reading tips

In this essay Edna Reid describes her career as an information scientist, providing a view from the inside of the “combined jurisdiction” of information science, as well as showing the wide variety of contexts within which data and information professionals work.
Optional

The iSchool Inclusion Institute. “What Are the Information Sciences?,” January 23, 2020. https://web.archive.org/web/20200123161101/http://www.i3-inclusion.org/about/what-are-the-information-sciences/.

Reading tips

This page is useful for its extensive list of graduate courses, specializations, and research areas in the data and information sciences, organized by the (other) undergraduate majors that they relate to.
Optional

Boykis, Vicki. “Data Science Is Different Now,” August 7, 2020. https://web.archive.org/web/20200807191324/https://veekaybee.github.io/2019/02/13/data-science-is-different/.

Reading tips

A practicing data scientist deflates some of the hype around this “hot job.”

August 24
The social lives of data and documents

View slides Updated Friday 8/21 11:28 AM

Total amount of required reading for this meeting: 9,200 words

In everyday English, to document usually means to provide material evidence that serves as a record, as in: “If you want to apply for a passport, you will have to document your citizenship status.”

In an information technology context, documentation typically means the written instructions that accompany software or hardware, as in: “Please consult the user documentation before using this software.”

But we can also use the word documentation to refer more broadly to all kinds of human practices of “documenting”: creating documents, annotating them, classifying them, aggregating them, etc. This is what we will mean by documentation in this course.

Documents do not have to be printed on paper, of course. Documents can be files on a hard drive or records in a database. As documents get more numerous and thus more difficult to inspect individually, we are more likely to call them “data.”

📖 To read before this meeting:

Buckland, Michael. “Introduction.” In Information and Society, 1–19. MIT Press, 2017. PDF.

3,800 words
Buckland, Michael. “Document and Evidence.” In Information and Society, 21–49. MIT Press, 2017. PDF.

5,400 words
Optional

Turner, Deborah. “Orally‐based Information.” Journal of Documentation 66, no. 3 (April 27, 2010): 370–83. https://doi.org/10.1108/00220411011038458.

7,000 words

Reading tips

Historically the data and information professions have focused almost entirely on written material. But as Deborah Turner argues in this article, we ought to recognize how much communication and learning happens through speech and gestures and other aspects of “face-to-face” communication. Though orality (communicating through speech and gesture) is often contrasted with literacy (communicating through the written word), the distinction becomes blurred when one takes a broad view of documentary practices (especially now that sharing oral communication via audio and video can be even easier than writing something).
Optional

Brown, John Seely, and Paul Duguid. “Reading the Background.” In The Social Life of Information, 173–205. Boston: Harvard Business School Press, 2000. PDF.

10,600 words

Reading tips

In this chapter from their book The Social Life of Information, John Seely Brown and Paul Duguid explain why, despite 50+ years of digital computers and networks, we still use a lot of paper documents.
Optional

Whitworth, Andrew. “Basic Concepts and Terminology.” In Radical Information Literacy, 11–26. Amsterdam: Chandos, 2014. PDF.

6,500 words

Reading tips

This introduction to a book about information literacy (which the author abbreviates as IL) introduces some key concepts, including:

the intersubjective creation of meaning

information landscapes

cognitive authority
Optional

Brown, Marion. “Improvisation and the Aural Tradition in Afro-American Music.” Black World 23, no. 1 (November 1973): 14–19. https://books.google.com/books?id=TjoDAAAAMBAJ&lpg=PA1&pg=PA14.

Reading tips

In this short article jazz musician and musicologist Marion Brown traces the connection and blurs the (European) line between oral communication and musical improvisation.

August 31
Meaning, significance, and codes

View slides Updated Sunday 8/30 2:41 PM

Total amount of required reading for this meeting: 16,000 words

As we discussed last week, practices of “documenting” can take on a wide variety of forms. Given this variety, can we make any general observations? What is it that all these different ways of communicating with the help of material artifacts (documents and data) have in common?

Semiotics, or the study of signs, is rooted in such questions. Charles Sanders Peirce was a scientist who was consumed by the question of how scientists, or any community of people, could best communicate clearly with one another and settle any doubts they might have. Through his attempts to answer this question he developed a large and complex “theory of signs.”

Ferdinand de Saussure was a linguist who helped shift the study of language away from historical study of how language changed over time, to focus instead on the discovery of general principles of how language works. He too, independently of Peirce, developed a theory of signs.

Peirce and Saussure’s ideas were widely influential, inspiring a wide range of semiotic theories. Characteristic of these theories is a focus on structural relations in systems of communication. Semioticians are interested in how signs take on meaning as a function of the structural role they play in a larger system. They look at how signs are organized into languages or codes, and how signs and codes operate within our broader cultures.

Ideas from semiotics are useful for thinking about data and information systems. Data scientists, ontologists, UX designers, and other information professionals employ semiotic concepts all the time, though many of them are unaware of it.

📖 To read before this meeting:

Fiske, John. “Communication, Meaning, and Signs.” In Introduction to Communication Studies, 3rd ed., 37–60. London ; New York: Routledge, 2010. https://ebookcentral.proquest.com/lib/unc/reader.action?docID=958077&ppg=90.

8,400 words
Fiske, John. “Codes.” In Introduction to Communication Studies, 3rd ed., 61–79. London ; New York: Routledge, 2010. https://ebookcentral.proquest.com/lib/unc/reader.action?docID=958077&ppg=114.

7,600 words
Optional

Rolling, James Haywood Jr. “Text, Image, and Bodily Semiotics: Repositioning African American Identity.” In Semiotics and Visual Culture: Sights, Signs, and Significance, edited by Deborah L. Smith-Shank, 72–79. Reston, Va.: National Art Education Association, 2004. PDF.

6,100 words

Reading tips

In this chapter James Haywood Rolling, Jr. uses semiotic analysis to demonstrate how human bodies can be “documentary.” Like Deborah Turner’s work on orality, his argument calls attention to broader patterns of meaning-making beyond the exchange of written documents. Rolling Jr. emphasizes the role of visual culture in meaning-making, looking at how white supremacist culture in the US employs visual signs to “define and delimit” Black American identity, and how Black Americans have countered that denigration by using photography and other technologies of image-making to construct a collective identity independent of white America.
Optional

Daylight, Russell. “The Semiotic Abstraction.” Semiotica 2017, no. 218 (January 26, 2017). PDF.

3,700 words

Reading tips

This article compares how the concept of abstraction is understood by computer scientists and semioticians. The author argues that semiotic systems should be understood as “machines for creating differences,” of which computers are one kind.
Optional

Martynenko, Grigory. “Semiotics of Statistics.” Journal of Quantitative Linguistics 10, no. 2 (2003): 105–15. PDF.

3,800 words

Reading tips

The author of this article analyzes statistics as a semiotic system. This view is useful for thinking about statistics as a means of and tool for communication, rather than simply an abstract body of mathematical knowledge. The author argues for a theoretical distinction between “virtual statistics” (the abstract body of mathematical knowledge) and “actual statistics” (how people actually employ statistics in specific circumstances). This is analogous to the theoretical distinction between actual speech and the abstract, systematic rules and conventions of language.
Optional

Akrich, Madeline, and Bruno Latour. “A Summary of a Convenient Vocabulary for the Semiotics of Human and Nonhuman Assemblies.” In Shaping Technology / Building Society: Studies in Sociotechnical Change, edited by Wiebe E. Bijker and John Law, 259–64. Cambridge, MA: MIT Press, 1992. PDF.

1,400 words

Reading tips

This short manifesto by Madeline Akrich and Bruno Latour argues that “semiotics is the study of order building … and may be applied to settings, machines, bodies, and programming languages as well as texts … semiotics is not limited to signs; the key aspect of the semiotics of machines is its ability to move from signs to things and back.”
Optional

Fry, Paul. Semiotics and Structuralism. ENGL 300: Introduction to Theory of Literature. Open Yale Courses, n.d. https://oyc.yale.edu/english/engl-300/lecture-8.

Reading tips

This lecture video does a good job of introducing the basic ideas of semiotics, though from the perspective of someone studying literature rather than from the perspective of someone studying data and information systems.

September 7

Exam 1 handed out

September 7
Quantifying data and information

View slides Updated Tuesday 9/8 1:43 PM

Total amount of required reading for this meeting: 6,600 words

As people began to communicate through wires and over radio waves, engineers sought to understand and describe how it happens, in order to design better communication systems. Claude Shannon, an engineer who worked at Bell Labs, developed an influential theory that came to be known as “information theory.”

📖 To read before this meeting:

Weaver, Warren. “Recent Contributions to The Mathematical Theory of Communication,” September 1949. PDF.

6,000 words

Reading tips

Claude Shannon, an engineer who worked at Bell Labs, developed a mathematical theory of communication that came to be known as “information theory.” The papers in which Shannon developed his theory were originally published in 1948 in two parts in the Bell System Technical Journal. A year later, Warren Weaver published this summary of Shannon’s work.

There is some math in this report. If you’re not mathematically inclined, just skip over it—it isn’t necessary to understand the math in order to understand the basic ideas.
Shannon, Claude. “The Bandwagon.” IRE Transactions on Information Theory 2, no. 3 (1956): 3. PDF.

600 words

Reading tips

About eight years after information theory made its debut, Shannon wrote this one-page editorial.
Optional

Fiske, John. “Communication Theory.” In Introduction to Communication Studies, 3rd ed., 5–21. London ; New York: Routledge, 2010. https://ebookcentral.proquest.com/lib/unc/reader.action?docID=958077&ppg=58.

6,600 words

Reading tips

This chapter from a standard introductory textbook in communications explains information theory (the author calls it communication theory) in more modern and accessible terms.
Optional

Gleick, James. “Information Theory.” In The Information, 1st ed., 204–232. New York: Pantheon Books, 2011. PDF.

9,100 words

Reading tips

This chapter from science writer James Gleick’s book The Information is an engaging mini-biography of Claude Shannon, but it is also an accessible introduction to information theory.
Optional

Eckersley, Peter. “A Primer on Information Theory and Privacy.” Electronic Frontier Foundation, August 10, 2020. https://www.eff.org/deeplinks/2010/01/primer-information-theory-and-privacy.

700 words

Reading tips

This short article uses the information-theoretic concept of entropy to explain why it is so easy to identify individual people based on their web browsing activity.
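
As a rough illustration of the idea, the sketch below counts how many bits of identifying information a few facts about a browser might carry, using the standard surprisal formula −log2(p). The population size, the facts, and their probabilities are all made-up assumptions chosen to keep the arithmetic simple; they are not figures from the article.

```python
import math

def surprisal_bits(probability):
    """Bits of information gained by learning a fact that is true
    of a randomly chosen person with the given probability."""
    return -math.log2(probability)

# Hypothetical population size (an assumption, roughly 8.6 billion).
population = 2 ** 33

# Bits needed to single out one person from the whole population.
bits_needed = math.log2(population)

# Hypothetical facts about a browser, with made-up probabilities.
facts = {
    "time zone": 1 / 16,
    "screen size": 1 / 64,
    "installed fonts": 1 / 4096,
}

# If the facts are independent of one another, their bits add up.
bits_learned = sum(surprisal_bits(p) for p in facts.values())

print("bits needed to identify one person:", bits_needed)
print("bits learned from these facts:", bits_learned)
```

The closer the bits learned get to the bits needed, the smaller the group of people who match every observed fact. The independence assumption is a simplification: real browser attributes are often correlated, so their bits would not add up so neatly.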

Worldviews and data models

A worldview is a way of looking at the world that shapes one's perception of the world. Worldviews make it easy to see some things and difficult or impossible to see other things. Indeed, one's understanding of what “things” are is part of one's worldview.

It is not possible to communicate, or even to think, without a worldview. For anything to have meaning, to be thinkable and communicable, it must somehow be represented, and representation inevitably means drawing boundaries and making distinctions to establish what “things” there are and how they relate to one another. Collectively, these boundaries and distinctions constitute a worldview.

When we represent things using computers, we encode worldviews into data models. Different approaches to modeling data are different ways of turning the world into computable “information.” The choice of one way of data modeling over another can be consequential.

During the second part of this course, we'll first look at how we draw boundaries and make distinctions, both in everyday life and as part of the construction of scientific knowledge.

We'll then examine and contrast two different ways of formally modeling these boundaries and distinctions so that they can be represented and manipulated mathematically: Boolean algebra (putting things into groups based on their attributes) and Bayesian inference (hypothesizing about the groups to which things might belong, based on past evidence).

September 14
Drawing boundaries, making distinctions

View slides Updated Friday 9/11 4:28 PM

Total amount of required reading for this meeting: 16,900 words

Making things meaningful involves drawing boundaries and making distinctions—categorizing and classifying the world around us. Eviatar Zerubavel is a cognitive sociologist, meaning that he studies how social processes shape our thinking, and he’s written a number of fascinating and accessible books on the topic. For this week read some selections from his book The Fine Line about making distinctions in everyday life.

📖 To read before this meeting:

Zerubavel, Eviatar. “Introduction / Islands of Meaning / The Great Divide / The Social Lens.” In The Fine Line, 1–17, 21–24, 61–80. New York: Free Press, 1991. PDF.

16,900 words

Reading tips

Eviatar Zerubavel is a cognitive sociologist, meaning that he studies how social processes shape our thinking, and he’s written a number of fascinating and accessible books on the topic. These are selections from his book The Fine Line about making distinctions in everyday life.
Optional

Glushko, Robert J, Paul P Maglio, Teenie Matlock, and Lawrence W Barsalou. “Categorization in the Wild.” Trends in Cognitive Sciences 12, no. 4 (April 2008): 129–35. http://dx.doi.org/10.1016/j.tics.2008.01.007.

5,000 words
Optional

Lakoff, George. “The Importance of Categorization / From Wittgenstein to Rosch.” In Women, Fire, and Dangerous Things: What Categories Reveal about the Mind. Chicago: University of Chicago Press, 1987. PDF.

21,600 words

September 17

Exam 1 due

September 21
Making things classifiable

View slides Updated Friday 9/18 4:58 PM

Total amount of required reading for this meeting: 10,100 words

Last week we looked at categorization and classification in everyday life. This week we’ll look at efforts to make things categorizable and classifiable at scale beyond everyday life.

For example, scientists seek to develop universal classifications rather than relying on locally-specific ones. Establishing and maintaining universal classifications is difficult, as the history of the scientific classification of clouds demonstrates. It is not only a matter of classifying clouds, but also a matter of making clouds classifiable.

Science is not the only institution that seeks to stabilize the meanings of things in order to coordinate across great distances and over long periods of time. Law, trade and finance, engineering—every variety of large-scale coordination has its own techniques of making things classifiable (though we can identify some common features).

📖 To read before this meeting:

Daston, Lorraine. “Cloud Physiognomy.” Representations 135, no. 1 (August 1, 2016): 45–71. https://doi.org/10.1525/rep.2016.135.1.45.

10,100 words
Optional

Dupré, John. “Scientific Classification.” Theory, Culture & Society 23, no. 2–3 (May 1, 2006): 30–32. PDF.

1,200 words
Optional

Latour, Bruno. “Visualisation and Cognition: Thinking with Eyes and Hands.” Knowledge and Society: Studies in the Sociology of Culture Past and Present 6 (1986): 1–40. PDF.

9,200 words

Reading tips

Latour uses some unusual terminology in this article. He refers to documents as inscriptions and practices of documentation as inscription procedures. He also refers to documents as immutable mobiles, highlighting what he considers to be two of their most important qualities: immutability and mobility.

Latour is interested in the relationship between practices of documentation and thinking (cognition). His basic argument is that what may seem like great advances in thought are actually better understood as the emergence of new practices of documentation. Latour focuses primarily on documents as aids to visualization rather than as carriers of information. Thus he begins by discussing the emergence of new visualization techniques, such as linear perspective.
Optional

Bennett, Claudette. “Racial Categories Used in the Decennial Censuses, 1790 to the Present.” Government Information Quarterly 17, no. 2 (2000): 161–80. https://doi.org/10.1016/S0740-624X(00)00024-1.

7,000 words

September 28
Modeling data as classes of objects with attributes

View slides Updated Friday 9/25 7:57 PM

Total amount of required reading for this meeting: 14,400 words

This week we will consider a common way of modeling the world so as to turn it into “data.” It is so common that most of us take it for granted.

This way of modeling the world relies on the following “common-sense” assumptions:

The world consists of individual objects or entities.
These entities have attributes that can be counted and described.
Entities can be sorted into types based on the presence or absence or values of their attributes.

If we make these assumptions, we can translate our models of the world into mathematical expressions using Boolean algebra.
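
To make this concrete, here is a minimal sketch in Python, assuming a small made-up set of insurance policies as the entities. The field names (`smoker`, `age`) and the helper `ids_where` are hypothetical, invented for illustration; they do not come from any of the readings. Sorting entities into types becomes set membership, and the Boolean operations AND, OR, and NOT become set intersection, union, and complement.

```python
# Each entity is an individual object with attributes (hypothetical data).
policies = [
    {"id": 1, "smoker": True, "age": 34},
    {"id": 2, "smoker": False, "age": 61},
    {"id": 3, "smoker": True, "age": 58},
    {"id": 4, "smoker": False, "age": 29},
]

def ids_where(condition):
    """Return the set of ids of the entities for which the condition holds."""
    return {p["id"] for p in policies if condition(p)}

# Types defined by the presence, absence, or values of attributes.
everyone = ids_where(lambda p: True)
smokers = ids_where(lambda p: p["smoker"])
over_50 = ids_where(lambda p: p["age"] > 50)

# AND is set intersection, OR is set union, NOT is complement.
smokers_and_over_50 = smokers & over_50
smokers_or_over_50 = smokers | over_50
not_smokers = everyone - smokers

print(smokers_and_over_50, smokers_or_over_50, not_smokers)

# One of De Morgan's laws: NOT (A OR B) equals (NOT A) AND (NOT B).
print(everyone - (smokers | over_50) == (everyone - smokers) & (everyone - over_50))
```

Note how much the sketch takes for granted: that each policy is a distinct entity, that "smoker" is a yes-or-no attribute, and that 50 is a meaningful cutoff. Those are modeling choices, not facts about the world.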

Three short readings each explore this way of modeling, from slightly different perspectives.

The first required reading is an excerpt from a very useful and easy-to-read (and very British) textbook on how to classify things.

The second required reading is by Edmund Berkeley, a pioneer of computer science and co-founder of the Association for Computing Machinery, which is still the primary scholarly association for computer scientists. But he wrote this article in 1937, before he became a computer scientist—because computers had yet to exist. At the time he was a mathematician working at the Prudential life insurance company, where he recognized the usefulness of Boolean algebra for modeling insurance data. He published this article in a professional journal for actuaries (people who compile and analyze statistics and use them to calculate insurance risks and premiums).

The third required reading is an excerpt from one of my favorite books, Data and Reality by Bill Kent. Kent was a computer programmer and database designer at IBM and Hewlett-Packard, during the era when the database technologies we use today were first being developed. He thought deeply and carefully about the challenges of data modeling and management, which he recognized were not primarily technical challenges.

📖 To read before this meeting:

Hunter, Eric. “What Is Classification? / Classification in an Information System / Faceted Classification.” In Classification Made Simple, 3rd ed. Farnham: Ashgate, 2009. PDF.

5,600 words
Berkeley, Edmund C. “Boolean Algebra (the Technique for Manipulating AND, OR, NOT and Conditions).” The Record 26 part II, no. 54 (1937): 373–414. PDF.

3,400 words

Reading tips

This article is by Edmund Berkeley, a pioneer of computer science and co-founder of the Association for Computing Machinery, which is still the primary scholarly association for computer scientists. But he wrote this article in 1937, before he became a computer scientist—because computers had yet to exist. At the time he was a mathematician working at the Prudential life insurance company, where he recognized the usefulness of Boolean algebra for modeling insurance data. He published this article in a professional journal for actuaries (people who compile and analyze statistics and use them to calculate insurance risks and premiums).

Berkeley uses some frightening-looking mathematical notation in parts of this article, but everything he discusses is actually quite simple. The most important parts are:

pages 373–374, where he gives a simple explanation of Boolean algebra,

pages 380–381, where he considers practical applications of Boolean algebra, and

pages 383 on, where he pays close attention to translation back and forth between Boolean algebra and English.
Kent, William. “Attributes / Types and Categories and Sets / Models.” In Data and Reality, 77–94. Amsterdam: North-Holland, 1978. PDF.

5,400 words

Reading tips

This is an excerpt from one of my favorite books, Data and Reality by Bill Kent. Kent was a computer programmer and database designer at IBM and Hewlett-Packard, during the era when the database technologies we use today were first being developed. He thought deeply and carefully about the challenges of data modeling and management, which he recognized were not primarily technical challenges.

The fixed-width typewriter font makes this reading look old-fashioned, but nothing in it is out-of-date. These are precisely the same issues data modelers and “data scientists” struggle with today.
Optional

Evans, Eric. “Crunching Knowledge.” In Domain-Driven Design. Boston: Addison-Wesley, 2004. PDF.

3,000 words

October 5
Modeling data as distributions

View slides Updated Friday 10/9 3:57 PM

Total amount of required reading for this meeting: 18,400 words

The limitations of the kind of modeling we looked at last week become clear if we try to apply it to classify the subject matter of texts. Texts include things like books and news articles, but could also include things like movies and video games—anything for which it makes sense to ask, “What is it about?”

In the first reading, Patrick Wilson considers the problems that arise if one tries to treat the subject of a text as an attribute of that text.

The second reading introduces a way of modeling the world that is radically different from the one we looked at last week. Bill Maron was an engineer at missile manufacturer Ramo-Wooldridge when he began investigating statistical methods for classifying and retrieving documents. In this paper he describes a method for statistically modeling the subject matter of texts. He introduces the basic ideas behind what is now known as a Bayesian classifier, a technique that is still widely used today for a variety of automatic classification tasks from spam filtering to face recognition.
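
For a feel of how this kind of classifier works, here is a minimal sketch of a naive Bayes text classifier in Python. It is not a reconstruction of Maron's experiment: the training documents, the category names, and the use of add-one smoothing are all assumptions made for illustration. The basic move is the same, though: use word counts from already-classified documents to estimate which category a new document most probably belongs to.

```python
import math
from collections import Counter

# Tiny hypothetical training set: (clue words in a document, its category).
training = [
    (["rocket", "engine", "thrust"], "propulsion"),
    (["engine", "fuel", "thrust"], "propulsion"),
    (["circuit", "transistor", "signal"], "electronics"),
    (["signal", "circuit", "noise"], "electronics"),
]

# Count how often each category and each word-in-category occurs.
vocabulary = {w for words, _ in training for w in words}
category_counts = Counter(cat for _, cat in training)
word_counts = {cat: Counter() for cat in category_counts}
for words, cat in training:
    word_counts[cat].update(words)

def log_score(words, cat):
    """Log of (prior * likelihood) for a category, with add-one smoothing."""
    total = sum(word_counts[cat].values())
    score = math.log(category_counts[cat] / len(training))
    for w in words:
        if w in vocabulary:  # ignore words never seen in training
            score += math.log((word_counts[cat][w] + 1) / (total + len(vocabulary)))
    return score

def classify(words):
    """Pick the category with the highest score for these words."""
    return max(category_counts, key=lambda cat: log_score(words, cat))

print(classify(["engine", "noise", "thrust"]))
print(classify(["circuit", "signal"]))
```

Unlike the Boolean approach from last week, nothing here says that a document definitely has or lacks a subject. The classifier only hypothesizes about the most probable category, based on past evidence.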

📖 To read before this meeting:

Wilson, Patrick. “Subjects and the Sense of Position.” In Two Kinds of Power, 69–92. Berkeley: University of California Press, 1968. PDF.

11,900 words

Reading tips

In this chapter Patrick Wilson considers the problems that arise when one tries to come up with systematic rules for classifying texts by subject.

Wilson can be a bit long-winded, but his insights are worth it. (You can skip the very long footnotes, so this reading is actually shorter than it looks.) What Wilson calls a “writing” is more typically referred to as a text. In this chapter he is criticizing the assumptions librarians make when cataloging texts by subject. The “sense of position” in the title of the chapter refers to the librarian’s sense of where in a classification scheme a text should be placed. Although he is talking about library classification, everything Wilson says is also applicable to state-of-the-art machine classification of texts today.
Maron, M. E. “Automatic Indexing: An Experimental Inquiry.” Journal of the ACM 8, no. 3 (July 1961): 404–17. https://doi.org/10.1145/321075.321084.

6,500 words

Reading tips

Bill Maron was an engineer at missile manufacturer Ramo-Wooldridge when he began investigating statistical methods for classifying and retrieving documents. In this paper he describes a method for statistically modeling the subject matter of texts. He introduces the basic ideas behind what is now known as a Bayesian classifier, a technique that is still widely used today for a variety of automatic classification tasks from spam filtering to face recognition.

Trigger warning: math. The math is relatively basic, and if you’ve studied any probability, you should be able to follow it. But if not, just skip it: Maron explains everything important about his experiment in plain English. Pay extra attention to what he says about “clue words.”
Optional

Hacking, Ian. An Introduction to Probability and Inductive Logic. Cambridge: Cambridge University Press, 2001. PDF.
Optional

Smucker, Mark D. “Information Representation.” In Interactive Information Seeking, Behaviour and Retrieval, edited by Ian Ruthven and Diane Kelly, 77–93. London: Facet Pub., 2011. PDF.

October 12

Exam 2 made available

October 12
Exam review

October 16

Exam 2 due

Selection systems in society

Data and information professionals, along with the technological systems that they build, can be understood as constituting selection systems that extract usable information from masses of documents and data. Boolean algebra and Bayesian inference are two different logics according to which selection systems can be constructed (and of course it is possible to construct systems that combine these two logics and possibly other logics as well).

During the last part of this course, we'll look at selection systems in the context of our broader society.

First, we'll reflect on the relationship between technology and society. Does technological change cause social, political, and cultural change? Or do technologies simply reflect social, political, and cultural practices?

Then, we'll consider the trade-offs between using human and machine labor in selection systems.

Finally, we'll look again at Boolean algebra and Bayesian inference, and consider the broader consequences of their differing logics.

October 19
Selection systems / Technology and society

View slides Updated Saturday 10/17 6:33 PM

Total amount of required reading for this meeting: 17,100 words

There are various positions one might take regarding the relationship between technology and society. Sometimes people talk about technology as an external force that exerts influence on society, pushing us in certain directions. Sometimes people insist that technologies are “just tools” that can be used in different ways, for better or for worse. Some people see technologies as “politics by other means,” and certain technologies as inextricably linked to certain political ideas.

The parkway bridges of Long Island provide an illustrative example of how people talk about technology and society.

📖 To read before this meeting:

Buckland, Michael, and Christian Plaunt. “On the Construction of Selection Systems.” Library Hi Tech 12, no. 4 (1994): 15–28. PDF.

8,100 words

Reading tips

An examination of the structure and components of information storage and retrieval systems and information filtering systems. Argues that all selection systems can be represented in terms of combinations of a set of basic components. The components are of only two types: representations of data objects and functions that operate on them.
Winner, Langdon. “Do Artifacts Have Politics?” Daedalus 109, no. 1 (1980): 121–136. https://www.jstor.org/stable/20024652.

9,000 words
Optional

Buckland, Michael. “Discovery and Selection.” In Information and Society, 135–52. MIT Press, 2017. PDF.

2,900 words
Optional

Sawyer, P. H. “Technical Determinism: The Stirrup and the Plough.” Past & Present, no. 24 (1963): 90–95. PDF.

2,600 words
Optional

Pinch, Trevor J., and Wiebe E. Bijker. “The Social Construction of Facts and Artefacts: Or How the Sociology of Science and the Sociology of Technology Might Benefit Each Other.” Social Studies of Science 14, no. 3 (1984): 411–428. PDF.

3,400 words

Reading tips

The authors are attacking what they describe as “linear” models of technological development, which focus on a series of “technological breakthroughs” leading inevitably to where we are today. They argue that looking at the actual historical development of a technology like the bicycle shows that what seem in retrospect to be obvious “technological breakthroughs” were not at all obvious at the time.

It may help to consult these pages to get a sense of the different bicycle models discussed in the reading:
Optional

Joerges, Bernward. “Do Politics Have Artefacts?” Social Studies of Science 29, no. 3 (1999): 411–31. https://doi.org/10.1177/030631299029003004.

11,100 words
Optional

Campanella, Thomas J. “Robert Moses and His Racist Parkway, Explained.” Bloomberg CityLab, July 9, 2017. https://web.archive.org/web/20200719205411/https://www.bloomberg.com/news/articles/2017-07-09/robert-moses-and-his-racist-parkway-explained.

1,400 words

Reading tips

The parkway bridges of Long Island, built by city planner Robert Moses, provide an illustrative example of how people talk about technology and society.

October 26

Investigation proposals due this week

October 26
Selection labor by people and machines

View slides Updated Saturday 10/24 4:22 PM

Total amount of required reading for this meeting: 6,800 words

Regardless of the type of data modeling employed, turning the world into data and information involves labor. This week we’ll consider the question of automation: what kinds of labor are done by people, and what kinds are done by machines?

📖 To read before this meeting:

Warner, Julian. “Description and Search Labor for Information Retrieval.” Journal of the American Society for Information Science and Technology 58, no. 12 (2007): 1783–1790. https://doi.org/10.1002/asi.20664.

6,800 words

Reading tips

Warner’s writing can be hard to follow at times. If you’re getting bogged down, focus on trying to understand the various categories of labor that Warner identifies, and how they relate to one another. What does he mean by “the dynamic compelling the transfer of human syntactic labor to technology stemming from the costs of direct human labor” (page 1789)?
Optional

Seligman, Ben B. “The Social Cost of Cybernation.” In The Evolving Society: The Proceedings of the First Annual Conference on the Cybercultural Revolution—Cybernetics and Automation, edited by Alice Mary Hilton, 159–66. New York: Institute for Cybercultural Research, 1966. PDF.

2,600 words
Optional

Boggs, James. “The Negro and Cybernation.” In The Evolving Society: The Proceedings of the First Annual Conference on the Cybercultural Revolution—Cybernetics and Automation, edited by Alice Mary Hilton, 167–72. New York: Institute for Cybercultural Research, 1966. PDF.

1,900 words

November 2
Comparing selection techniques

View slides Updated Friday 10/30 10:28 PM

Total amount of required reading for this meeting: 13,100 words

Boolean algebra and Bayesian inference are two different—possibly complementary—techniques for building selection systems. We’ve looked at how these techniques work in the abstract, but what consequences do they have?

The first reading for this week is an excerpt from an article arguing that, though they are perceived as outdated, selection systems based on Boolean algebra (more commonly referred to as Boolean retrieval systems) are preferable for some purposes because they offer more opportunities for human decision-making during searches.

The second reading “scrutinizes” Bill Maron’s Bayesian classifier, identifying it as an example of an algorithmic technique now applied to many purposes whose particulars differ considerably from Maron’s “library problem.”
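To make the contrast between the two techniques concrete, here is a minimal sketch in Python. It is not taken from either reading: the toy documents, the training examples, the labels, and the function names (boolean_select, train, classify) are all hypothetical, and the classifier is a generic naive Bayes with add-one smoothing rather than a reconstruction of Maron's original method. In the Boolean case the searcher states the selection rule explicitly; in the Bayesian case the rule is estimated from previously labeled examples.

```python
import math
from collections import Counter

# Hypothetical toy collection: document id -> set of index terms.
DOCS = {
    "d1": {"bridge", "parkway", "politics"},
    "d2": {"bicycle", "technology", "history"},
    "d3": {"bridge", "technology", "design"},
}

def boolean_select(docs, required, excluded=frozenset()):
    """Boolean retrieval: keep documents that contain every required
    term (AND) and none of the excluded terms (NOT)."""
    required, excluded = set(required), set(excluded)
    return sorted(
        doc_id for doc_id, terms in docs.items()
        if required <= terms and not (excluded & terms)
    )

# Hypothetical labeled examples for the Bayesian approach.
TRAINING = [
    ({"bridge", "parkway", "politics"}, "society"),
    ({"politics", "labor"}, "society"),
    ({"bicycle", "technology", "design"}, "engineering"),
    ({"bridge", "design"}, "engineering"),
]

def train(examples):
    """Count how often each class and each (class, term) pair occurs."""
    class_counts = Counter(label for _, label in examples)
    term_counts = {label: Counter() for label in class_counts}
    for terms, label in examples:
        term_counts[label].update(terms)
    vocab = set().union(*(terms for terms, _ in examples))
    return class_counts, term_counts, vocab

def classify(terms, model):
    """Return the most probable class and the log-score of every class,
    using naive Bayes with add-one (Laplace) smoothing."""
    class_counts, term_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        denom = sum(term_counts[label].values()) + len(vocab)
        for term in set(terms) & vocab:  # terms never seen in training are ignored
            score += math.log((term_counts[label][term] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get), scores

if __name__ == "__main__":
    print(boolean_select(DOCS, {"bridge"}, excluded={"politics"}))
    print(classify({"bicycle", "technology"}, train(TRAINING))[0])
```

In the Boolean function every decision is visible in the query itself, which is the property the first reading defends. In the classifier, the outcome depends on which examples were labeled and how, which is the kind of choice the second reading asks us to scrutinize.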

📖 To read before this meeting:

Hjørland, Birger. “Classical Databases and Knowledge Organization: A Case for Boolean Retrieval and Human Decision-Making during Searches.” Journal of the Association for Information Science and Technology 66, no. 8 (August 1, 2015): 1559–75. PDF.

2,800 words

Reading tips

This is an excerpt from an article arguing that, though they are perceived as outdated, selection systems based on Boolean algebra (more commonly referred to as Boolean retrieval systems) are preferable for some purposes because they offer more opportunities for human decision-making during searches.

Rieder, Bernhard. “Scrutinizing an Algorithmic Technique: The Bayes Classifier as Interested Reading of Reality.” Information, Communication & Society 20, no. 1 (January 2, 2017): 100–117. https://doi.org/10.1080/1369118X.2016.1181195.

10,300 words

November 9

Discuss investigative findings in recitation

November 9
Presentations of investigative findings

November 20

Selection system investigation due

August 10 Synchronous recitations begin

August 10 Recitation sections begin meeting

August 11 Synchronous all hands meeting

August 11 All hands meeting

What are data and information ( professions | technologies ) ?

August 17 The data and information professions

August 24 The social lives of data and documents

August 31 Meaning, significance, and codes

September 7 Exam 1 handed out

September 7 Quantifying data and information

Worldviews and data models

September 14 Drawing boundaries, making distinctions

September 17 Exam 1 due

September 21 Making things classifiable

September 28 Modeling data as classes of objects with attributes

October 5 Modeling data as distributions

October 12 Exam 2 made available

October 12 Exam review

October 16 Exam 2 due

Selection systems in society

October 19 Selection systems / Technology and society

October 26 Investigation proposals due this week

October 26 Selection labor by people and machines

November 2 Comparing selection techniques

November 9 Discuss investigative findings in recitation

November 9 Presentations of investigative findings

November 20 Selection system investigation due
