This schedule is still in flux. Specifically, the parts in gray are subject to change.
Total amount of required reading for this meeting: 1,800 words
Today we’ll introduce ourselves and talk about:
In addition to the topics above, please come prepared to talk about:
To read before this meeting:
Total amount of required reading for this meeting: 4,700 words
Today we’ll look at Loomio, the collaboration software that we’ll be using this semester, and we’ll start examining the dataset we compiled last spring. We’ll also start talking about some of the various tools for working with data that we’ll be using: text editors, regular expressions, command line utilities, scripting languages, makefiles, and OpenRefine.
To read before this meeting:
Total amount of required reading for this meeting: 6,300 words
Today we’ll meet with Kristen Merryman, Digital Projects Librarian at the North Carolina Digital Heritage Center. We’ll try to get a sense of what it means to have a collection based on a place like “North Carolina.” And we’ll talk about how geographic metadata is integrated into catalog records and authority records, the process of georeferencing, and how places can be a challenge for curators of collections.
To read before this meeting:
Today we’ll introduce OpenRefine, a tool for cleaning and enriching data.
Before coming to class, please install OpenRefine and complete one of the following OpenRefine tutorials:
If you run into trouble installing or running OpenRefine, ask for help on the Loomio site! Don’t wait until class to tell us you weren’t able to get it running.
Total amount of required reading for this meeting: 10,400 words
Today we’ll introduce the concept of a digital gazetteer as a specific kind of networked knowledge organization system.
To read before this meeting:
You might also explore the Who’s On First gazetteer: https://www.whosonfirst.org
Today we will have a hands-on activity that will involve digging into the digital version of Powell’s gazetteer using regular expressions.
Before coming to class, please complete this regular expression tutorial.
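As a taste of the kind of pattern matching involved, here is a minimal sketch in Python. The sample sentence and its format are invented for illustration and are not taken from the gazetteer itself, so the real entries will likely call for different patterns.

```python
import re

# Hypothetical entry text; the real gazetteer's format may differ.
entry = "Example Creek rises in n Sample County and flows se into Other River."

# One or more capitalized words, followed by a feature type.
pattern = re.compile(r"\b((?:[A-Z][a-z]+ )+)(Creek|River|County)\b")

# Collect (name, feature type) pairs in the order they appear.
matches = [(m.group(1).strip(), m.group(2)) for m in pattern.finditer(entry)]
print(matches)
```

The list of feature types is hard-coded here to keep the pattern readable; a longer list would be easier to maintain if it were built from a separate vocabulary.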
To read before this meeting:
Total amount of required reading for this meeting: 6,600 words
Historical gazetteers aim to record and describe not just place names currently in use, but also place names used in the past.
To read before this meeting:
Today we’ll learn how to match (reconcile) records in OpenRefine with records in another database such as Wikidata.
Before coming to class, read the OpenRefine documentation page on reconciliation.
You may also want to check out these two videos:
Class will not meet.
Total amount of required reading for this meeting: 4,500 words
There are several standard formats for recording the spatial “footprints” of places. The one that is easiest to work with is called GeoJSON. As its name implies, GeoJSON uses JSON (JavaScript Object Notation) to record geographic coordinates and polygons.
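To make this concrete, here is a small sketch that builds a GeoJSON document in Python using only the standard library. The place names and coordinates are placeholders chosen for illustration, not data from the course dataset.

```python
import json

# Hypothetical places. GeoJSON positions are [longitude, latitude], in that order.
point_feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-79.05, 35.91]},
    "properties": {"name": "Example Place"},
}

# A polygon is a list of rings; each ring must end on the position it started from.
area_feature = {
    "type": "Feature",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [-79.1, 35.9], [-79.0, 35.9], [-79.0, 36.0], [-79.1, 36.0], [-79.1, 35.9],
        ]],
    },
    "properties": {"name": "Example Area"},
}

collection = {"type": "FeatureCollection", "features": [point_feature, area_feature]}
geojson_text = json.dumps(collection, indent=2)
print(geojson_text)
```

Because the result is ordinary JSON, it can be saved to a file and opened in any tool that reads GeoJSON.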
To read before this meeting:
Total amount of required reading for this meeting: 3,800 words
Some digital gazetteers specifically aim to link together disparate datasets that relate to the same places. An approach to publishing known as linked data is well suited to this purpose.
To read before this meeting:
Total amount of required reading for this meeting: 3,600 words
Linked data is less a specific technology than a set of best practices for publishing data on the web. Today we’ll introduce the basic concepts of linked data, including the Resource Description Framework (RDF) data model, and we’ll learn to write RDF by hand using the Turtle syntax.
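The RDF data model boils down to statements with three parts: subject, predicate, and object. The sketch below writes two such statements in the simplest form that Turtle allows, with full URIs and one statement per line. The example.org URIs and the Stream class are placeholders; the rdf:type and rdfs:label URIs are the standard ones.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
EX = "http://example.org/"  # placeholder namespace, not a real vocabulary

# Each triple is (subject URI, predicate URI, object); objects are tagged
# as either another resource ("uri") or a plain string ("literal").
triples = [
    (EX + "place/1", RDF_TYPE, ("uri", EX + "Stream")),
    (EX + "place/1", RDFS_LABEL, ("literal", "Example Creek")),
]

def to_line(subject, predicate, obj):
    """Format one triple as a line of Turtle using full URIs."""
    kind, value = obj
    if kind == "uri":
        rendered = f"<{value}>"
    else:
        # Escape backslashes and double quotes inside string literals.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        rendered = f'"{escaped}"'
    return f"<{subject}> <{predicate}> {rendered} ."

turtle = "\n".join(to_line(s, p, o) for s, p, o in triples)
print(turtle)
```

Turtle also offers prefixes and other shorthand that make hand-written data far more compact; this sketch leaves them out so that the three-part structure stays visible.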
To read before this meeting:
Total amount of required reading for this meeting: 5,300 words
One of the things that distinguishes gazetteers from more standard Geographic Information System (GIS) tools is that they can also be used to record information about places with ill-defined or unknown locations: ancient places, mythical places, or (as we will discuss today) vague places.
Content warning: “Perceptual Regions in Texas” contains discussion of ethnic slurs.
To read before this meeting:
GeoJSON gives us a convenient way to express geospatial information (“footprints”). RDF gives us a convenient way to express other kinds of statements about places, such as statements about the things (people, organizations, events, other places) to which they are related, or the categories (feature types) to which they belong. JSON-LD is a data format that allows us to combine the strengths of GeoJSON and RDF.
To read before this meeting:
Total amount of required reading for this meeting: 7,300 words
SPARQL is to RDF triplestores what SQL is to relational databases. A good way to get started with SPARQL is to try the Wikidata Query Service.
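For a sense of what a query looks like, here is a sketch that prepares a request URL for the Wikidata Query Service without sending it. The identifiers in the query (P31 for "instance of", Q4022 for "river", P131 for "located in the administrative territorial entity", Q1454 for "North Carolina") are my reading of Wikidata and are worth double-checking on the site before relying on them.

```python
from urllib.parse import urlencode

ENDPOINT = "https://query.wikidata.org/sparql"

# Assumed identifiers: verify each one on wikidata.org.
QUERY = """
SELECT ?place ?placeLabel WHERE {
  ?place wdt:P31 wd:Q4022 ;
         wdt:P131 wd:Q1454 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10
"""

# Build the URL only; no network request is made here.
url = ENDPOINT + "?" + urlencode({"query": QUERY, "format": "json"})
print(url)
```

The same query text can be pasted directly into the Wikidata Query Service page, which is usually the easier way to experiment.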
To read before this meeting:
Class will not meet.
Total amount of required reading for this meeting: 11,500 words
The motivating idea behind this class is that we can make new things possible by taking the contents of a book (Powell’s gazetteer) and putting it into a different form, using networked computers. This is an old idea. Creating new “knowledge infrastructure” is hard, despite a century of dreaming about it.
To read before this meeting:
H.G. Wells was an English writer now best known for his science fiction, including The Time Machine and The War of the Worlds. During his own lifetime, however, he was prominent as a “futurist” who devoted his literary talents to the development of a progressive vision on a global scale.
This reading is an excerpt from a speech Wells gave in 1936 at The Royal Institution of Great Britain, an organization for scientific education and research founded in 1799.
J.C.R. Licklider was an American psychologist and computer scientist who was one of the first to foresee modern-style interactive computing and its application to all manner of activities. As director of ARPA’s Information Processing Techniques Office, he funded research that led to the canonical graphical user interface and to the ARPANET, the direct predecessor to the Internet.
In the early 1960s, the Council on Library Resources recruited Licklider to address the question of how technology could help libraries gather, index, organize, store, and make accessible the growing body of recorded information. Licklider gathered a small team of engineers and psychologists to explore “concepts and problems of libraries of the future.” He wrote a summary report of the project, which appeared as the book Libraries of the Future in 1965.
This reading is an excerpt from a 1982 speech in which Licklider looks back at the changes he anticipated in 1965.
Danny Hillis is an American computer scientist who pioneered parallel computers and their use in artificial intelligence. In 2005, Hillis founded Metaweb Technologies to develop a semantic data storage infrastructure for the Internet, and Freebase, an open, structured database of the world’s knowledge. That company was acquired by Google, and its technology became the basis of the Google Knowledge Graph.
This reading is a post giving an overview of Hillis’ latest project, the Underlay.
Total amount of required reading for this meeting: 2,500 words
Gazetteers can be combined with annotation tools and/or natural language processing (NLP) tools to “geotag” text. Geotagging involves identifying place names mentioned in a text in order to get a sense of the spatial coverage of the content, or to link the text to other relevant texts, images, or data.
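The simplest form of geotagging is a plain lookup of gazetteer names in a text, sketched below. The two-entry gazetteer, its approximate coordinates, and the sample sentence are invented for illustration; NLP-based tools do considerably more, such as deciding which of several places with the same name is meant.

```python
import re

# Hypothetical mini-gazetteer: place name -> approximate (longitude, latitude).
gazetteer = {
    "Chapel Hill": (-79.06, 35.91),
    "Durham": (-78.90, 35.99),
}

def geotag(text, gazetteer):
    """Return (name, start offset, coordinates) for each gazetteer name found in text."""
    # Try longer names first so a multi-word name wins over a shorter one inside it.
    names = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    return [(m.group(1), m.start(), gazetteer[m.group(1)]) for m in pattern.finditer(text)]

text = "The letter was mailed from Durham and arrived in Chapel Hill."
tags = geotag(text, gazetteer)
for name, offset, coords in tags:
    print(name, offset, coords)
```

Keeping the character offset of each match makes it possible to turn the results into annotations on the original text later.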
To read before this meeting:
You can find the raw data created for this project at https://doi.org/10.18738/T8/BB70Z2.