Organization of Information

UNC SILS, INLS 520, Fall 2012

Scoping & Identifying Resources

Due September 18.

Assignment Overview

In this assignment, you will:

  1. Begin to design an original organizing system.
  2. Decide upon the domain and scope of your system.
  3. Think about how to clearly identify your resources.
  4. Post a design proposal on Piazza succinctly describing your domain, scope, and resources.
  5. Read your classmates’ notes and offer constructive feedback.
  6. Revise your proposal in response to the feedback you receive.

Deadlines

You must post your initial design proposal on Piazza by 11AM on Thursday, September 13, 2012. You should post feedback on your classmates’ proposals by noon on Sunday, September 16, 2012, and revise your proposal in response to the feedback you receive by 11AM on Tuesday, September 18.

So that I have a stable snapshot of your proposal, please paste a copy of your final revised proposal into a plain text file named proposal.txt, zip that file, and submit it by uploading your zip archive before 11AM on Tuesday, September 18, 2012. Late assignments will not be accepted unless you have an exceptionally good excuse. If you submit a file in a format other than plain text, you will be asked to resubmit and your resubmission will be considered late.

Submission Requirements

A complete submission consists of

  1. An initial proposal note on Piazza
  2. Comments posted on classmates’ proposals
  3. Revisions to your initial proposal
  4. A zipped stable snapshot of your final proposal

Your proposal should have the following sections:

  1. Domain
  2. Detailed Scope
  3. Identification

Detailed Instructions

Your task is to begin designing an original system for organizing some kind of resource. You will begin by deciding on the domain of your system and making specific decisions about how to limit its scope. You must clearly delineate and define your individual resources, including how you will identify them.

Part 1: Domain

The first thing you need to decide is: what kind of resource are you organizing? Remember, a resource can be virtually anything. It could be something tangible, like action figures. It could be something intangible, like restaurant dining experiences. It’s totally up to you.

But choose carefully! Most of the rest of your assignments for the semester will involve different approaches to organizing the kind of resource you choose for this assignment. So choose something you’re interested in, and won’t get bored of. Ideally, it will be something you could imagine yourself organizing professionally in the future. But don’t feel that you have to choose something down-to-earth; it’s OK to be creative.

Remember that the domain of an organizing system isn’t just about what is being organized; it’s also about the larger context in which it is being organized. Use the five questions to help you flesh out your (possibly imaginary) domain.

Forbidden domains: Certain domains are just too boring or too easy to support the series of assignments for this class. Organizing your personal collection of books, films, music, or video games is out. (That doesn’t mean that you can’t have books be your resources, just that you need to develop a more interesting idea of your domain than “the books on my shelves.”) Likewise, well-known, established organizing systems are out. This is supposed to be an original organizing system, so don’t define your domain as “circulating books in a large academic library” or “games in the Apple App Store.”

Part 2: Detailed Scope

Once you have decided upon your domain and are able to clearly yet succinctly describe it to yourself in writing, it’s time to start making detailed decisions about your scope.

Suppose you’re creating a catalog recording people’s descriptions of their experiences dining out at restaurants. Are you looking at five-star restaurants or fast food? Or do you want to try to model any dining experience?

At this point, you need to be thinking about why you would be organizing this resource. In the “real world” there would be someone paying you to organize it, and you could talk to them to help you answer the questions that would help you design the organizing system. But for the purposes of this assignment, just make up your own answers to help you define the scope of your system.

Be sure to carefully consider not only what falls within your scope but also what is outside of it. Thinking through the six interrelated aspects of scope should help here.

Part 3: Identification

Now that you’ve clearly defined your scope, you should be able to more precisely define your resources. Defining what your resources are involves deciding on a level of abstraction: for example, are you organizing individual physical books, or abstract literary expressions that may be manifested in various forms and editions? You also need to think about parts and granularity: do your resources have parts that need to be kept track of? Are your resources themselves collections? You also need to think about the persistence of your resources: do they change over time? How much can a resource change before it is no longer the same resource?

As a practical test of whether you’ve answered these questions satisfactorily, explain how you will uniquely identify your resources. How can you distinguish two different resources? Do they have some intrinsic property that you can rely on for identification? Will you need to assign identifiers? How would you manage that assignment? Will you need to verify that identifiers are authentic? How would you do that? Be specific.
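One way to make the identifier questions concrete is to sketch an assignment scheme in code. The following is a minimal sketch, assuming (hypothetically) that you assign sequential numeric identifiers yourself and append a Luhn check digit so that mistyped identifiers can be detected. The function names and the AF prefix are illustrative only and are not part of the assignment. Note that a check digit catches transcription errors; it does not prove that an identifier is authentic.

```python
def luhn_check_digit(digits: str) -> str:
    """Compute the Luhn check digit for a string of decimal digits."""
    total = 0
    # Walk right to left, doubling every other digit starting with the rightmost.
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 0:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return str((10 - total % 10) % 10)


def make_identifier(prefix: str, serial: int) -> str:
    """Build an identifier such as 'AF-0000018' (hypothetical format)."""
    payload = f"{serial:06d}"
    return f"{prefix}-{payload}{luhn_check_digit(payload)}"


def is_valid_identifier(identifier: str) -> bool:
    """Check the shape of an identifier and verify its check digit."""
    prefix, _, body = identifier.partition("-")
    if not prefix or len(body) < 2 or not (body.isascii() and body.isdigit()):
        return False
    payload, check = body[:-1], body[-1]
    return luhn_check_digit(payload) == check
```

If your resources already carry a reliable intrinsic identifier, a scheme like this may be unnecessary; explaining why is as good an answer as designing one.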

Submit this assignment.

Creating a Vocabulary & Descriptions

Due September 27.

Assignment Overview

In this assignment, you will:

  1. Create an XML vocabulary for describing resources as you defined them in your previous assignment.
  2. Describe a specific instance of that resource using your vocabulary.
  3. Swap vocabularies with a partner.
  4. Describe a resource using your partner’s vocabulary.
  5. Reflect on your experience creating and using vocabularies.

Deadline

You must submit your work by uploading your zip archive of assignment files before 11AM on Thursday, September 27, 2012. Late assignments will not be accepted unless you have an exceptionally good excuse. Even though you have nine days for this assignment, you’ll want to start early, as you’ll need to swap vocabularies with a partner in order to successfully complete all parts of the assignment.

Submission Requirements

You will need to use a text editor for this assignment. You may choose to use an XML editor as well, but it isn’t really necessary.

You will submit a total of 3 text files (zipped).

The first file should be named report.txt. It should have the following sections:

  1. Defining the Terms
  2. Reflection

The second file will be an example resource described in XML by you using your vocabulary. Name this file mine.xml.

The third file will be an example resource described in XML by your partner using your vocabulary. Name this file theirs.xml.

Detailed Instructions

Your task is to develop a vocabulary for describing the kind of resource you defined in the previous assignment. Creating a vocabulary involves choosing meaningful names and descriptions and being aware of any biases you might be bringing to the vocabulary. This assignment will take you through that process.

Part 1: Defining the Terms

Identify and define the terms required to describe resources within your scope. For each term, write a short definition. These definitions are your instructions for people using your vocabulary to describe an instance of your resource, so strive for precision.

In addition to these short definitions of the meanings of your terms, you need to explain how descriptions using your terms should be represented as XML. The assumption is that each term will correspond to an XML element. This means that your terms need to be legal XML element names.
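As a quick screen for candidate terms, here is a minimal sketch that checks a simplified, ASCII-only subset of the XML naming rules: a name starts with a letter or underscore, continues with letters, digits, underscores, hyphens, or periods, and contains no spaces. The XML specification also permits many non-ASCII characters and colons, and it reserves names beginning with "xml", so this check is deliberately stricter than the real rule. The function name is an assumption made for illustration.

```python
import re

# Simplified ASCII-only pattern; the full XML Name production allows more characters.
_SIMPLE_XML_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_simple_xml_name(term: str) -> bool:
    """Return True if term is a legal element name under the simplified rule."""
    if term.lower().startswith("xml"):
        return False  # names beginning with "xml" are reserved
    return _SIMPLE_XML_NAME.fullmatch(term) is not None
```

Under this rule street_address passes, while "street address" (contains a space) and 2nd_owner (starts with a digit) do not.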

You may decide that some terms are required elements of any description, while others are optional. You should indicate which are which.

Also, be sure to specify if an element is meant to be nested within another element: for example, an <address> element might nest <street_address>, <city>, <state>, and <zip> elements within it. If you decide to use XML attributes, be sure to specify which elements have attributes, and whether or not these attributes are required. Finally, you may need to specify what kind of content an element or attribute can take: a number? a date? anything?

All of the above should go in the “Defining the Terms” section of your report.txt document.
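To see how decisions about nesting and required elements play out in practice, here is a minimal sketch that checks a description against a list of required child elements. It assumes a hypothetical vocabulary in which <address> must contain <street_address>, <city>, and <state>, while <zip> is optional; your own vocabulary will differ, and nothing in the assignment requires you to write such a checker.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule set: parent element name -> child elements it must contain.
REQUIRED_CHILDREN = {"address": ["street_address", "city", "state"]}


def missing_required(element):
    """Return a list of messages for required children that are absent."""
    problems = []
    for child_name in REQUIRED_CHILDREN.get(element.tag, []):
        if element.find(child_name) is None:
            problems.append(f"<{element.tag}> is missing <{child_name}>")
    # Recurse so that nested elements are checked as well.
    for child in element:
        problems.extend(missing_required(child))
    return problems
```

Writing your required/optional rules down this explicitly, even just in prose, makes it much easier for a partner to follow them.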

Remember the tradeoffs we’ve discussed so far in the semester. They’ll help you think through some key issues, such as:

  • How many “levels” of elements do you want or need? Everything can be at the same hierarchical level, or you may have some elements that are “containers” for others. There’s no right answer; just think about the implications.
  • How many elements do you need to cover everything in your scope?
  • What are the benefits/drawbacks of a simpler vocabulary? A more complex one?
  • Are you being consistent with the levels of abstraction and granularity of your terms?
  • There’s no required upper or lower boundary on the number of elements that will be in your vocabulary. That said, in the past, we’ve found that a useful vocabulary can be developed with somewhere between 10 and 20 terms.

Part 2: Using Your Vocabulary

You might have developed your vocabulary by thinking of a specific instance of your resource. This step will make that connection explicit, testing your vocabulary by having you create a description of a specific resource.

Create an XML description of your resource, using your vocabulary, in the file mine.xml. Make sure the resource falls within the scope you defined in the previous assignment, and that you followed all the definitions you created in part 1.

As you do this step, you may find you want to change some of your elements, make things less (or more) granular, remove (or add) levels of hierarchy, or alter your vocabulary in other ways. That’s OK! You should revisit the previous step and revise your elements and definitions as needed. In some cases, you may realize that you need to clarify or refine your scope from the previous assignment as well. That’s fine, as long as you don’t change it so radically that you can’t defend the claim that it’s still the “same” scope. Make notes on any changes you make – you’ll want them for your reflection later.

When you’re finished, make sure your XML is well-formed (i.e., that it is actually XML) by uploading your file to the W3C Markup Validation Service. You should get the message “This document was successfully checked as well-formed XML!” If you get the message “Errors found while checking this document as XML!” look at the list of errors and try to determine what is wrong. (Don’t worry about the warnings.) Use Piazza if you get stuck! Don’t spend hours trying to solve XML problems on your own.
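If you would like a quick local check before uploading, the following minimal sketch uses Python's standard xml.etree.ElementTree parser to test well-formedness. It is an optional convenience, not a replacement for the validation service named above, and the function name is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xml_text):
    """Return None if xml_text is well-formed XML, otherwise the parser's message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as error:
        return str(error)
    return None
```

To check a file, pass in its contents, for example well_formedness_error(open("mine.xml", "rb").read()). A returned message includes the line and column where the parser gave up, which is usually close to the actual mistake.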

Part 3: Swapping Vocabularies

Now comes the fun part: You’ll be swapping vocabularies with a partner.

For this part of the assignment, you’ll just be sharing part 1 of your report.txt. DO NOT share the description you created using your vocabulary—just the definition of terms.

Once you have your partner’s vocabulary, look at the definition of scope that they posted to Piazza. Then attempt to create an XML description using only the scope and term definitions they provided. Save your description as a file named theirs.xml. Again, make sure your instance is well-formed. Send this file back to your partner.

Part 4: Reflecting on the vocabulary modeling experience

Write a few paragraphs reflecting on your vocabulary modeling experience in your report.txt document under the section “Reflection.” This doesn’t need to be longer than a few hundred words, but it does need to include, at minimum, the following elements:

  1. Your thoughts on the challenges of creating your own vocabulary, what it was like to define your terms, and what changes—if any—you needed to make after you tried to encode your own instance.

  2. What it was like to try to create a description using your partner’s vocabulary. Did any terms confuse you? Was anything particularly clear? What did looking at someone else’s terms and definitions teach you about creating a vocabulary?

  3. What was it like to see the description your partner created using your vocabulary? Did it match the expectations you had for your vocabulary? And what does that tell you about the success of your scope, terms, and definitions?

Note: The “No Busy Work” Principle

Assignments are meant to challenge you intellectually in some way, not to see how much “busy work” you can put up with. So I’ll never intentionally ask you to do something that takes time but that doesn’t give you more insights. Put another way, if you find yourself doing busy work in an assignment, re-read the instructions to see if they advise against doing what you’re doing. If that doesn’t end the busy work, ask me if you’re doing what is expected.

In this assignment there are ample opportunities to do “busy work,” so be careful not to do it. For example, a “Syllabus Vocabulary” would probably have some notion of “Topic,” and a description of a syllabus might have many of them. In part 2 of this assignment your task is to demonstrate how your vocabulary works by creating a description. If you were creating a syllabus description it would be sufficient to include just two or three topics, not 29.

Submit this assignment.

Building a Taxonomy

Due October 11.

Assignment Overview

In this assignment you will:

  1. Define categories usable for organizing your resources.
  2. Sort those categories into a taxonomy.
  3. Create a diagram of your taxonomy.
  4. Write definitions for each part of your taxonomy using hypernyms and hyponyms.
  5. Reflect on your experience.

Deadline

You must submit your work by uploading your zip archive of assignment files to the course website before 11:00AM on Thursday, October 11th, 2012. Late assignments will not be accepted unless you have an exceptionally good excuse.

Submission Requirements

You will submit a zip archive containing two text files, one named reflection.txt and one named urls.txt. reflection.txt should contain your reflection on the assignment (see part 6). urls.txt should contain the published URLs for your Google Docs spreadsheet and drawing (see parts 1–5). To get your published URL for each document, select Publish to the web from the File menu in the upper left-hand corner of the Google Docs interface. For your spreadsheet, make sure you choose to publish All sheets. You will see the URL for your published document under the heading Get a link to the published data. Copy and paste the URLs for your two published documents into the urls.txt file.

Detailed Instructions

For this assignment, you’ll be developing a hierarchical set of categories for your organizing system. The goal of this assignment is to give you some practice thinking about categories and category membership, abstraction, classification, and taxonomy. You’ll also learn a technique for naming and describing a system of categories so that you can clearly convey their meaning to others.

Part 1. Select a property to focus on

In assignment #2, you identified a number of properties of resources to be described in your organizing system. For this assignment, select one of those properties that is suitable for categorizing your resources. As you’ll see in the next step, you’ll want to choose a property that can take on several (at least eight) possible values. If you can’t identify a suitable property, feel free to define a new one.

For example, suppose your resources were superheroes. One of the properties of superheroes you might have chosen to describe is superpower:

<superhero>
  <name>Spiderman</name>
  <alterego>Peter Parker</alterego>
  <superpower>spiderweb casting</superpower>
</superhero>

Superheroes have a lot of different superpowers. You decide that it might make sense to categorize your superheroes by superpower, so you decide to focus on that property.

Part 2. List your descriptors

Now, create a Google Docs spreadsheet by making a copy of this template. (You can do this by selecting Make a copy ... under the File menu.) This spreadsheet contains a full example (using superpower) for you to follow. You’ll need to replace the content with your own, of course. (You don’t need to keep exactly the same number of rows.)

In the first sheet (“Descriptors”) of your spreadsheet, make a list of possible values for the property you’ve selected. These are your descriptors. You’ll need at least eight, but the more the merrier. (Maybe you already have a list of values, if you specified a controlled set of values for that property in assignment #2.)

Now start generalizing away from these specific values. For each value, or instance, identify a more general category to which that instance belongs. For example, you might decide that spiderweb casting is just a specific instance of a more general category of animal power superpowers. Remember, as always, you’re making a choice about the level of abstraction you use.

One thing you don’t want to do here is make your categories so specific that they can’t contain anything but the specific descriptor you’re trying to generalize. (Continuing from the example above: a category called spider power might not be useful, unless you have listed other spider-derived powers beyond spiderweb casting. On the other hand, if your resources were all hybrid human-arthropod superheroes, then a spider power category might be appropriate.)

As you’re making your first pass trying to generalize your descriptors into categories, do not stress out too much about naming these categories. You’re likely to go back to them and revise them as you progress through the assignment. If it’s starting to make you feel crazy, my advice is to come up with something temporary and move on; new ideas might pop up once you’ve started to arrange your hierarchy.

Part 3. Organize your classes into a hierarchy

Now that you’ve taken a crack at identifying categories for each of your instances, begin arranging them into a hierarchy. The goal is to create a hierarchy with four levels. Level 1, the top or “root” category of your hierarchy, should come directly from the definition of the property statement you wrote for assignment #2, something like:

superpower: an extraordinary ability possessed by the superhero

Record this definition in the fourth sheet (“Root category”) of your spreadsheet.

Level 4, the lowest level of your hierarchy, will be your descriptors. When you created your categories in part 2, you added level 3, more abstract than your descriptors, but less abstract than the “root” category. What you’re doing now is adding one more level of abstraction, level 2, between the categories you identified in part 2 and your root category. To keep these levels distinct, henceforth we’ll refer to them using the following names:

  1. Root category
  2. Top-level categories
  3. Subcategories
  4. Descriptors

Think of this as a sorting task. (Sometimes it’s even helpful to write your category names down on pieces of paper or sticky notes and physically sort them.) As you sort, you may discover that some of your original categories are too narrow. You may also realize that they’re too broad and don’t leave you enough room to insert another level before getting to your root category. That’s OK! Revise your categories as many times as you need to and record them in your spreadsheet.

At this phase of the assignment, it’s important that you strive for a consistent level of abstraction among your top-level categories. For example, if we had musical instruments as our root category and our next level down included both clarinets and stringed instruments, that might be a sign that we weren’t maintaining a consistent level of abstraction, because stringed instruments is more abstract than clarinets. A more consistent taxonomy would have wind instruments and stringed instruments on the same level.

When you’re satisfied with your hierarchy of categories, record them in the second and third sheets (“Subcategories” and “Top-level categories”) of your spreadsheet.
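The four levels described above can be sketched as a nested data structure. This is a minimal sketch for thinking about the shape of the hierarchy, not a required deliverable. Only superpower, animal power, and spiderweb casting come from the running example; every other label is a hypothetical placeholder.

```python
taxonomy = {
    "superpower": {                   # level 1: root category
        "biological powers": {        # level 2: top-level category (hypothetical)
            "animal power": [         # level 3: subcategory (from the example)
                "spiderweb casting",  # level 4: descriptor (from the example)
                "wall crawling",      # level 4: descriptor (hypothetical)
            ],
        },
        "technological powers": {     # level 2: top-level category (hypothetical)
            "powered armor": [        # level 3: subcategory (hypothetical)
                "repulsor blasts",    # level 4: descriptor (hypothetical)
            ],
        },
    },
}


def descriptor_paths(tree):
    """Yield (root, top-level category, subcategory, descriptor) for each descriptor."""
    for root, top_levels in tree.items():
        for top_level, subcategories in top_levels.items():
            for subcategory, descriptors in subcategories.items():
                for descriptor in descriptors:
                    yield (root, top_level, subcategory, descriptor)
```

Listing every path this way is a quick check that each descriptor sits at level 4 and that no category has been left without a parent.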

Part 4. Create a diagram of your hierarchy

Create a Google Docs drawing showing the structure of your category hierarchy. This does not have to be fancy. Start with your root category at the top of your diagram, then your top-level categories, then your subcategories. You don’t need to include your descriptors.

Part 5. Define your categories

You’ve already defined your root category. Now, you’re going to write definitions for your other categories such that an ordinary person would be able to categorize new instances. You’ll be following this formula for definitions:

Hyponym = { adjective } hypernym { distinguishing clause }

You do not need to force every definition to include both an adjective and a distinguishing clause, but each definition should include at least one or the other. Record each definition in your spreadsheet in the appropriate place. Remember that your definitions should reflect things that are true for all members of the category.
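As a worked illustration of the formula, here is a minimal sketch that assembles a definition from its parts and rejects one that has neither an adjective nor a distinguishing clause. The function name and the sample wording in the usage note are assumptions made for illustration, not definitions taken from the template.

```python
def define(hyponym, hypernym, adjective="", distinguishing_clause=""):
    """Compose 'hyponym = {adjective} hypernym {distinguishing clause}'."""
    if not adjective and not distinguishing_clause:
        raise ValueError(
            "a definition needs an adjective, a distinguishing clause, or both"
        )
    parts = [adjective, hypernym, distinguishing_clause]
    # Skip whichever optional part was left empty.
    return f"{hyponym} = " + " ".join(part for part in parts if part)
```

For example, define("animal power", "superpower", distinguishing_clause="that mimics an ability of a non-human animal") yields "animal power = superpower that mimics an ability of a non-human animal". Notice that the hypernym is always the category one level up.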

Part 6. Reflect on your experience

In a text file called reflection.txt, write a paragraph or two about the approaches you used to identify categories and organize them into a hierarchy.

Some questions to guide your reflection:

  • What was your thought process like?
  • Were there any “outliers” that you had to work especially hard to fit in?
  • Were you able to keep your top-level categories at a consistent level of abstraction, and how did you do so?

Submit this assignment.

Classifying with Facets

Due November 1.

Assignment Overview

In this assignment you will:

  1. Familiarize yourself with an online tool for creating faceted classification schemes.
  2. Design a faceted classification for a set of resources.
  3. Adjust your classification given additional instances.
  4. Build a web page for using your faceted classification.
  5. Reflect on your experience designing and iterating your classification.

Deadline

You must submit your work by uploading your zip archive of assignment files to the course website before 11AM on Thursday, October 25th, 2012. Late assignments will not be accepted unless you have an exceptionally good excuse.

Submission Requirements

For this assignment you will design a faceted classification, implement a small HTML application for browsing your faceted classification, and write a short report reflecting on your experience designing and implementing.

Detailed Instructions

You will submit a zip archive containing the files for your HTML application and a report text file, as detailed in the assignment instructions below.

Part 1. Create More Instances

In the second assignment you created a vocabulary for describing some resource, and you and your partner created two instances (XML files) that used your vocabulary. For this assignment you will need to create five more instances. Note: you are free to make minor revisions to your vocabulary from what you turned in for the second assignment, but do not radically change the kind of resource you are describing. When creating your five new instances, try to use some of the different descriptors you created for your taxonomy assignment.

Your partner from the second assignment will also create an additional five instances (using your vocabulary), but will not give them to you yet. If you’ve made changes to your vocabulary, be sure to communicate them to your partner.

So, at this point, you should have a total of seven instances (the original two plus the new five you created), each describing a different resource using your vocabulary.

Don’t spend too much time creating the instances: the point is just to give you a reasonable set of resources to classify, not to richly describe each resource. But make sure that your resources are sufficiently different from one another, or else they will be more difficult to divide into classes. For example, if your resources are restaurants, don’t choose to make all your instances descriptions of barbecue joints (unless “barbecue joints” was the scope of your vocabulary).

Part 2. Get Acquainted with the Faceted Browsing Example Application

Download the zip file containing the faceted browsing example application. Unzip the zip file, and open the index.html file in Google Chrome. Note: the application may not work in other browsers. Use Chrome. See how the three enumerative facets of Animal, Continent, and Programming Language and the single numeric facet Age combine to filter and sort dozens of (nonsensical) resource descriptions. Then look at the resources.xml file to see the “raw data” being classified, and settings.js to see how the example application is configured.

Part 3. Design A Faceted Classification

Once you have familiarized yourself with the example application, it’s time to start working on your classification. Review the sections on facets in the Lambe chapter and pages 18–24 in the TDO chapter on classification, noting Ranganathan’s dimensions, the set of general criteria for facet design, and the principles guiding facet ordering. You will use these to design your own faceted classification to organize your set of seven resources.

You can have as many facets as you want, but you must have at least three. Your facets can be enumerative or numeric (spectrum) facets. Hierarchical or geographic facets are out of scope for this assignment. You will need to be creative, but make sure that anybody else could understand your facets without your help. They can be abstract or practical or a mix of both, so long as they classify the resources in a way that other people could understand. Also make sure that your facets are flexible enough to handle additions to the list of instances.
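To make the difference between enumerative and numeric facets concrete, here is a minimal sketch of how the two kinds of facet can combine to filter a set of resources. It is not how the example application is implemented. The facet names echo the example application, while the records and the function name are hypothetical.

```python
# Hypothetical records; the facet names echo the example application.
resources = [
    {"name": "item 1", "animal": "cat", "continent": "Asia", "age": 3},
    {"name": "item 2", "animal": "dog", "continent": "Europe", "age": 12},
    {"name": "item 3", "animal": "cat", "continent": "Europe", "age": 7},
]


def filter_by_facets(items, enumerative=None, numeric=None):
    """Keep items that match every selected facet.

    enumerative maps a facet name to a set of allowed values.
    numeric maps a facet name to an inclusive (low, high) range.
    """
    enumerative = enumerative or {}
    numeric = numeric or {}
    selected = []
    for item in items:
        in_values = all(
            item.get(facet) in allowed for facet, allowed in enumerative.items()
        )
        in_ranges = all(
            facet in item and low <= item[facet] <= high
            for facet, (low, high) in numeric.items()
        )
        if in_values and in_ranges:
            selected.append(item)
    return selected
```

Selecting the value cat on the animal facet and the range 5 to 10 on the age facet leaves only item 3. Each facet narrows the set independently, which is why facets that overlap in meaning tend to be confusing to browse.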

Part 4. Test Your Facets with New Instances

A good set of facets should be able to accommodate new instances without adjustment. Ask your partner to give you the five new instances they created. Revise your facets if necessary so that your system can classify all 12 (7 original plus 5 new) instances. Note: there have been reports of the UNC mail servers eating XML attachments, so zip your XML files before mailing them to one another.

Part 5. Encode Your Work for Faceted Browsing

To make your faceted classification browsable, first you need to convert your resource descriptions into a format the faceted browsing application can understand, and then you need to configure the application so it knows about your facets. This will involve editing the files resources.xml, resources.js, and settings.js. You should make backup copies of these files so you can refer back to the original examples in case you get stuck (but make sure your edited versions keep these original filenames).

  1. Combine your XML descriptions into a single file

    First, take your own and your partner’s resource descriptions, and combine them into a single XML file called resources.xml. Do this by making all the descriptions children of a single <collection> element. For example, in the example resources.xml file, there are dozens of <item> elements under the <collection> element. If the root element of your descriptions is, say, <actionfigure>, then you will have twelve <actionfigure> elements under the <collection> element. Make sure your combined resources.xml file validates before moving on to the next step.

  2. Convert your XML into JSON

    Next, you need to convert your resources.xml file into JSON format. Go to jsontoxml.utilities-online.info, copy and paste the contents of your XML file into the XML box, press the --> button, and then copy and paste the contents of the JSON box into a file called resources.js. (You may also want to paste the JSON into this JSON validator to make sure everything converted OK.) Finally, edit resources.js and add

    var resources =

    to the beginning of it (see the original resources.js for an example).

  3. Configure the faceted browsing application

    Finally, you need to edit settings.js to inform the faceted browsing application about your facets and resource description vocabulary. Read the comments in settings.js carefully, and edit the file accordingly.

  4. Try it out!

    When you’re done editing settings.js, open index.html in Chrome. You may need to refresh the page (Ctrl+R on Windows or ⌘-R on Macs). Try browsing and sorting your resources. If you don’t see anything but a blank page, there’s a problem with your resources.js or settings.js file. Carefully compare them to the example files to track down your problems, and post to Piazza if you get stuck.

    Once you have things working, adjust your settings.js file until you’re happy with what you have.
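If you prefer to script the combining step and the final edit to resources.js rather than doing them by hand, the following is a minimal sketch under stated assumptions. It wraps a list of XML descriptions in a single <collection> element, and it checks that some JSON text parses before prefixing it with the var resources = line. It does not perform the XML-to-JSON conversion itself, because the JSON shape the application expects is whatever the online converter produces; the function names are hypothetical.

```python
import json
import xml.etree.ElementTree as ET


def combine_descriptions(xml_documents):
    """Wrap the root element of each description in one <collection> element."""
    collection = ET.Element("collection")
    for xml_text in xml_documents:
        # fromstring raises ParseError if a description is not well-formed.
        collection.append(ET.fromstring(xml_text))
    return ET.tostring(collection, encoding="unicode")


def as_resources_js(json_text):
    """Confirm json_text parses as JSON, then prefix it with 'var resources ='."""
    json.loads(json_text)  # raises a ValueError subclass if the JSON is invalid
    return "var resources =\n" + json_text
```

You would still paste the combined XML into the converter, then pass the JSON it returns to as_resources_js and save the result as resources.js.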

Troubleshooting

You may run into quirks, bugs, and limitations while using the faceted browsing application. Have patience, make note of any compromises or design changes you had to make because of limitations of the application or facets in general, and send email to the class list if you are having difficulties that other students might encounter. If you encounter a show-stopping bug, post it to Piazza and I will try to get it fixed ASAP.

Part 6. Submit Your Work

Once you have your faceted browsing application working as you like, create a new text file report.txt in the same folder. The report should have two sections, and each should be a paragraph or two:

  1. The first section should describe your use of the faceted browsing application, especially any compromises you made in your design because of perceived limitations.

  2. The second section should answer the following questions:

    • Were there sequence effects? That is, did you find that you had to work through the facets in a particular order to get to a classification you felt comfortable with?
    • Was your vocabulary useful for or a barrier to developing facets? How?
    • Did you use Ranganathan’s PMEST in conceiving your facets? Was it useful?
    • What were your biggest challenges in designing the classification?
    • How well did your initial facet design handle the additional instances? What changes did you need to make to accommodate the new instances?
    • Did you choose to enforce exclusivity in your facets? Why or why not?

Finally, zip up the folder containing all the files for the faceted browsing application, as well as your report.txt, and upload it to the course website.

Submit this assignment.

Midterm Exam

Due November 8.

Download the midterm as an MS Word .docx or .doc (Word 97–2004 compatible) document.

Submit this assignment.

Branch Deliverables

Due December 13.

See your branch materials for details about deliverables. Your deliverables must be uploaded by 3PM on Thursday, December 13th.

Submit this assignment.