Web Information Organization

UNC SILS, INLS 620, Fall 2016

Designing a State Machine

Due September 27.

For this assignment you will conceptualize interactions with an information service in terms of “state machines,” and think about how these state machines could be mapped to the uniform interface of HTTP.

You’ll do the following:

Decide what interactions the service needs to support, and the kinds of resources involved
Draw diagrams of the “state machines” for these interactions
Show how your state machines could be implemented using HTTP

You should work on this assignment on your own. You are welcome to ask your classmates general questions about concepts relevant to the assignment, but don’t design your state machines collaboratively.

Part 1: Thinking about service interactions

In class I will give you some brief, high-level descriptions of possible services for which you will be designing state machines. You can choose to work with one of these, or make up your own, but you must either let me know which of the provided descriptions you are using, or send me your own description, by midnight on Sunday 9/18.

Then, try to answer the following questions:

What kinds of interactions does the service need to support? In the Starbucks example, the customer needed to be able to: order a drink, change her order, pay for her order, and receive her drink. The barista needed to be able to: see what drinks he needs to make, check to see if a drink has been paid for, and cross off his list drinks received by customers.
What are the different kinds of resources involved in the interactions? In the Starbucks example there were the following kinds of resources: Order, Payment, and Drink. There was also a resource that was a queue of Orders.
What kinds of dependencies are there among the steps of the various interactions? In the Starbucks example, the customer could not change her order once the barista started making it; and the customer could not receive her drink until she had paid.

IMPORTANT NOTE: It is likely that you will find yourself thinking about authentication and authorization, e.g. “only the barista should be able to mark a drink as ‘made.’” Do not worry about that for now; assume that all resources are fully public, and that anyone can create, update, or delete anything. In other words, if you find yourself worrying about logins and whatnot, stop.

Deliverable #1: write a few paragraphs addressing the points above and whatever else you think is relevant. Do not get into specifics such as URLs or data formats. Keep things as simple as you can.

Part 2: Draw your state machine diagrams

Now you will take what you decided upon in Part 1 and draw state diagrams. You can use drawing software, or draw your diagrams by hand. I don’t really care as long as they are readable and understandable.

Each diagram should have a “start” node, an “end” node, and a set of state nodes. The state nodes should be given appropriate names; in the Starbucks example these were names like Order placed, Drink made, etc. Arrows between nodes indicate how the user moves from one state to another. The arrows should be labeled to indicate the action taken to move from one state to another. For example, in the Starbucks Customer state diagram, the pay action took the customer from the Order placed state to the Paid state. There may be multiple arrows between the same two states; for example either accept update or reject update took a customer from Order change requested to Order placed in the Starbucks example.
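If it helps to check a diagram before drawing it, the same information can be written down as a transition table. The following is only a sketch: the first three entries come from the Starbucks example described above, while the `order` and `request update` arrows and the function names are assumptions made for illustration.

```python
# Transition table for part of the Starbucks customer diagram.
# Keys are (current state, action); values are the next state.
TRANSITIONS = {
    # Taken from the example in the text:
    ("Order placed", "pay"): "Paid",
    ("Order change requested", "accept update"): "Order placed",
    ("Order change requested", "reject update"): "Order placed",
    # Assumed for illustration; not stated in the assignment text:
    ("start", "order"): "Order placed",
    ("Order placed", "request update"): "Order change requested",
}


def next_state(state, action):
    """Return the state reached by following the arrow `action` from `state`."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"no arrow labeled {action!r} leaves {state!r}") from None


def run(actions, state="start"):
    """Follow a sequence of actions and return the final state."""
    for action in actions:
        state = next_state(state, action)
    return state
```

Each key corresponds to one labeled arrow, so two arrows between the same pair of states simply appear as two entries with the same value.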

Deliverable #2: At least one state machine diagram.

Part 3: Show how your interactions could be implemented using HTTP

Now you will show how your state machines could be implemented using HTTP. Create new versions of your state diagrams in which your nodes are resources, and the arrows are HTTP requests or responses. If an arrow represents an HTTP request, it should be labeled with the request’s HTTP method. If it represents a response, it should be labeled with an HTTP status code. So, for example, in the Starbucks case the Order placed state became the Orders queue resource, and moving from the start node to Order placed by the order action became sending a POST request to the Orders queue resource.

You do not need to give your resources URLs. Just give them meaningful names. Often you will have a resource that is a specific instance of a kind of resource. In this case, ensure that your resource name makes this clear. For example, we might have named the order resource in the Starbucks example Order #123 to indicate that it is one of many Order resources.

You don’t need to indicate the response for every HTTP request unless it is particularly critical to the interaction. For example, in the Starbucks example the response to a PUT request to an Order resource indicated whether the update succeeded or not, which is a pretty critical part of the interaction of changing an order.
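One way to sanity-check the labels on an HTTP version of a diagram is to list the arrows and confirm that each is either a request (labeled with a method) or a response (labeled with a status code). This is a rough sketch, not a required format: the `Arrow` type is hypothetical, the POST and PUT arrows follow the Starbucks example, and the 200 and 409 status codes and the choice to draw responses as loops on Order #123 are assumptions.

```python
from typing import NamedTuple, Optional


class Arrow(NamedTuple):
    """One labeled arrow in an HTTP version of a state diagram."""
    source: str                    # node the arrow leaves
    target: str                    # node the arrow enters
    method: Optional[str] = None   # set if the arrow is an HTTP request
    status: Optional[int] = None   # set if the arrow is an HTTP response

    def label(self) -> str:
        """Text to write next to the arrow: a method or a status code."""
        if (self.method is None) == (self.status is None):
            raise ValueError("an arrow is either a request or a response")
        return self.method if self.method is not None else str(self.status)


ARROWS = [
    # From the Starbucks example: placing an order is a POST to the queue.
    Arrow("start", "Orders queue", method="POST"),
    # From the Starbucks example: changing an order is a PUT to that order.
    Arrow("Order #123", "Order #123", method="PUT"),
    # Assumed status codes for the critical PUT response (illustration only).
    Arrow("Order #123", "Order #123", status=200),
    Arrow("Order #123", "Order #123", status=409),
]
```

Calling `label()` on each arrow gives the text to write on the diagram, and raises an error for an arrow that has neither or both kinds of label.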

Deliverable #3: New versions of your state machine diagrams, showing how they could be implemented using HTTP.

Now look over all your deliverables, and correct any inconsistencies that might have arisen. If, in the process of making your state diagrams, you changed your mind about how to model your resources or interactions, update your answers to part 1 to indicate this.

All three of your deliverables should be printed and turned in at the beginning of class on the date the assignment is due.

Resources and Representations

Due October 6.

Choose a Web site or Web application you use frequently. Identify a potential resource that the site or application does not make individually addressable via a URL. Explain why you think it would be useful if the site or application did.

Note that this is not a question about what additional information or functionality the site or application might provide. Rather, it is a question about how the existing information or functionality might be made better addressable.

For example: The schedule for this course is at the URL https://aeshin.org/teaching/inls-620/2016/fa/schedule/. This resource lists each meeting of the course, what to read before that meeting, and when assignments are due. I also make each individual meeting addressable: the URL for our November 10 meeting is https://aeshin.org/teaching/inls-620/2016/fa/schedule/#on-11-10. Finally, I have defined a resource for the next meeting. The URL for this resource is https://aeshin.org/teaching/inls-620/2016/fa/schedule/#next, and it always points to the upcoming meeting, whenever that may be.

Provide:
- the “main” URL for the site or application, i.e. the URL of the “starting” resource in a typical interaction with the site or application
- an explanation of the new resource you think should exist, and why
- the URLs of 2 existing (kinds of) resources, from which you think your new resource should be directly reachable (e.g. via a link), with short explanations why
Open this page (the one you are reading now) in the Chrome Web browser, using incognito mode. Then open the Developer Tools. Open the Network Panel by clicking on the Network tab at the top of the Developer Tools window. Now click this link to an article at the New York Times (you may want to copy the questions below someplace else first so you can refer to them).

Now answer the following questions:
- How many HTTP requests did following this link result in? How many resources were requested?
- Were all the requests successful? How do you know?
- How many different types of representations were returned? List the different types you saw.
Now click the Back button, returning to this page you are reading now, and follow the link above again.
- Do you see any differences in the Network panel this time? What are they?
Who owns the URL http://ils.unc.edu/ilssa/? How might you try to find out?
How can you determine whether two different URLs refer to the same resource?
DBpedia is a project that publishes on the Web structured data extracted from Wikipedia. For this question you will use Postman to explore a DBpedia resource, its related resources, and their representations. Mainly, you’ll be using Postman to request URLs and to look at the headers of HTTP responses. Note that if you install the Postman Chrome extension, you’ll also need to install the Interceptor extension. Finally, no matter which version of Postman you install, you’ll need to configure it to not follow redirects.

To do content negotiation (i.e., inform the server what kind of representation you want), you need to add an HTTP header to your request. For example, if you wanted to request a representation in plain text format, you could add the header:
```
Accept: text/plain
```
Of course, just because you request a certain type of representation doesn’t mean that that type of representation is actually available.
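The assignment asks you to use Postman, but the same request can be sketched with Python's standard library, which may make it clearer what Postman is doing. The function and class names below are made up for this illustration, and the URL in the usage note is a placeholder. The `NoRedirect` handler plays the role of Postman's "do not follow redirects" setting.

```python
import urllib.error
import urllib.request


def build_request(url, media_type):
    """Build a GET request that asks for a particular media type."""
    return urllib.request.Request(url, headers={"Accept": media_type})


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so that 3xx responses stay visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # returning None tells urllib not to follow the redirect


def fetch_headers(request):
    """Send the request and return (status code, list of response headers)."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(request) as response:
            return response.status, list(response.headers.items())
    except urllib.error.HTTPError as error:
        # Unfollowed redirects and 4xx/5xx responses arrive here.
        return error.code, list(error.headers.items())
```

For example, `fetch_headers(build_request("http://example.org/some-resource", "text/plain"))` would send the request with the `Accept: text/plain` header shown above and return the status code and headers without following any redirect.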

Now, use Postman to request the following resource: http://dbpedia.org/resource/University_of_North_Carolina_at_Chapel_Hill
- Does this resource have any representations? Why or why not?
Examine the headers returned when you request this resource. Find another resource related to this one.
- What is the URL of this second resource?
- What is the relationship between these two resources?
- Investigate this second resource. Does it have a representation?
Look at the headers returned by a request for this second resource. You should see information about a number of related resources, with associated media types. Choose one of these alternate resources, and note the media type. Now make a request for the original resource (http://dbpedia.org/resource/University_of_North_Carolina_at_Chapel_Hill), specifying that you want that media type.
- What was the media type you requested?
- How does specifying a media type change the response you get?

Designing Representations

Due November 8.

For this assignment, you will continue designing the information service you began developing in assignment 1. Specifically, you will design representations of the resources you identified in the previous assignment. While your resources may lend themselves to any number of different representations, I want you to focus on designing hypermedia representations that not only represent the data and metadata about the data, but also use links to represent metadata about your service and the ways it can be interacted with.

It is possible to design hypermedia types using many different data formats, but for the purposes of this assignment you are asked to use HTML. By using HTML as your base format, you will not need to design your own hypermedia controls (i.e. syntax for creating links) since HTML has already defined these for you. So your design effort will focus on expressing the semantics of your information service using the existing elements and attributes of HTML.

Think about your representations

Think about what kind of data needs to be included in the representations of your resources. Consider both data included in requests to your service (i.e. in PUT or POST requests) and representations included in responses from your service. Don’t worry about specific media types for now, just think about what data is needed. Another important thing to consider is what status codes your service will possibly return. This means you need to think not only about successful requests for your resources, but unsuccessful ones as well.

For example, if you were designing a farmer’s market service you might document the following (note that this is incomplete):

GET to the Farmers Market resource returns either a list of links to individual Farm resources, or the message No farms yet. In either case the status code is 200 OK.

POST to Farmers Market returns the message Created farm {farm-name} with the URI of the new Farm in the Location HTTP header, and a 201 Created status code. If the POSTed data is missing some required information (e.g. the farm’s name), it returns the message Farm's name is required with a 400 Bad Request status code.

PUT to a Farm resource requires a representation including (at least) the farm’s name and URL. If either of these is missing, the response will be the message Farm's {property} is required (repeated once for each missing property) with a 400 Bad Request status code. If the Farm doesn’t exist yet, the response will be the message No such farm exists with a 404 Not Found status code. Otherwise the response will be the message Updated farm {farm-name} with a 200 OK status code.

Note that I didn’t bother including 500 Internal Server Error responses, since we assume that any resource can potentially return these.
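To see how a description like the PUT one above pins down behavior, here is a minimal sketch of it as a plain Python function. Nothing here is part of the assignment: the function name, the in-memory `farms` dictionary, and the `name` and `url` keys are assumptions, and the order of the checks (400 before 404) simply follows the order of the sentences above.

```python
# Required properties: key in the submitted data -> label used in messages.
REQUIRED = {"name": "name", "url": "URL"}


def put_farm(farms, farm_id, representation):
    """Simulate PUT to a Farm resource; return (status code, message)."""
    missing = [label for key, label in REQUIRED.items()
               if not representation.get(key)]
    if missing:
        # One message per missing property, as described above.
        message = "\n".join(f"Farm's {label} is required" for label in missing)
        return 400, message
    if farm_id not in farms:
        return 404, "No such farm exists"
    farms[farm_id] = dict(representation)
    return 200, f"Updated farm {representation['name']}"
```

Writing the cases out this way makes it easy to spot a combination of inputs that your documentation has not yet covered.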

Deliverable #1: For each method supported by each kind of resource in your service, specify what kinds of representation (if any) it requires, and what kinds of representations, including status codes, it might return. Be sure to consider possible error conditions.

Designing your hypermedia type

To design the hypermedia representations of your resources, you’ll need to think about:

How to represent the data provided by your various resources in HTML
Common patterns (“blocks”) that appear in your representations, and how these will be identified
How you will use outbound links to link representations to related resources and indicate possible process flows
How you will use templated query links for search actions
How you will use update links to create and update resources

To show your answers to these questions, you will create a set of HTML files that are example representations of your resources. For each kind of resource you have, you will create one example HTML file. This file should be very simple, with just the minimum HTML markup necessary to present the data. Don’t spend any time worrying about the styling of the file; focus only on the structure. So, for example, think about whether you need a list of things, or a table, a paragraph of text, etc.

Now, give your HTML elements class attributes where necessary to describe the specific kind of content they hold. For example, in an HTML list representing a list of Farms, each list item element might be given a class attribute with the value farm.

Once you’re satisfied with your example representations, link them to one another.

Your outbound links (i.e. anchor elements) in each HTML file should have href values that link them to the other HTML files, so that you can open one HTML file in a web browser and click on links to get to the other files. So, for example, in a “real” farmer’s market service each farm in the HTML representation of my “all farms” list would link to the specific URL for that Farm resource. But in these example HTML files, I would just have each element in the list link to farm.html (the example HTML representation of a single Farm resource).

Likewise, a templated query link (i.e. an HTML GET form) should have an action attribute value that refers to the example HTML representation of the search results for that query.

HTML does not support idempotent update links. But for the purposes of this assignment, you can pretend that it does. Just indicate, as the value of the form’s method attribute, the method you intend for the form to use to make the request. Later in the semester we will look at techniques for adding true support for idempotent updates to HTML.

Your anchor elements should have rel attributes that describe why a user agent might want to activate them. Check the registry of link relations and try to find an appropriate one, or make up your own.
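As a rough illustration of why class and rel values matter, the sketch below shows how a client could find links by their rel value rather than by their position or link text. The markup is a made-up "all farms" list in the style described above; the `farms` class, the farm names, and the choice of the registered `item` link relation are assumptions for this example, not requirements.

```python
from html.parser import HTMLParser

# Hypothetical example representation of an "all farms" list.
FARMS_HTML = """
<ul class="farms">
  <li class="farm"><a rel="item" href="farm.html">Example Farm</a></li>
  <li class="farm"><a rel="item" href="farm.html">Another Farm</a></li>
</ul>
"""


class LinkCollector(HTMLParser):
    """Collect (rel values, href) for every anchor element."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attributes = dict(attrs)
            # rel may hold several space-separated values.
            rels = (attributes.get("rel") or "").split()
            self.links.append((rels, attributes.get("href")))


def links_with_rel(html, rel):
    """Return the href of every anchor whose rel includes `rel`."""
    collector = LinkCollector()
    collector.feed(html)
    return [href for rels, href in collector.links if rel in rels]
```

Here `links_with_rel(FARMS_HTML, "item")` returns the two `farm.html` links, which is the kind of lookup a user agent can only do if your rel values are chosen consistently and documented.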

You can describe your HTML forms using class attributes, as explained on pages 119–122 of RESTful Web APIs.

Deliverable #2: An interlinked set of HTML files, one to represent each kind of resource in your service.

Documentation

Deliverable #3: A file documenting the class and rel attribute values you used to describe your service and the data it provides. Use pages 114–115 and 121–122 of RESTful Web APIs as an example of how to document your attribute values.

Each group should email me their three deliverables before Tuesday, November 8.

Final Project

Due December 13.

For your final project, you will take the design work you did for the last two assignments, and turn it into a working Web information service.

Implementing your service

Your service must provide access to at least two kinds of resources that have some kind of relationship to one another. Clients should be able to access the two kinds of resources directly, and they should also be able to access “collection” resources that list all the resources of a particular kind. It should be possible to create and update at least one of the kinds of resources through your service.

For example, the help desk service provides access to one kind of resource: help requests. The service provides a resource that lists existing help requests, and this resource is filterable. Help requests can be created and updated through the service.

Your service should provide (at least) HTML representations for all resources. These representations must include metadata that describe the application (how to transition from one state to another) and the data being provided.

Describing your application flow

Your HTML representations must include the proper hypermedia controls for linking representations to one another, creating query URIs from templates, and updating resources both idempotently and non-idempotently. Your HTML controls must have appropriate class attribute values and rel attribute values that describe their meaning and purpose (this was the work you did for the Designing Representations assignment).

Providing machine-readable access to your data

You have three options for providing machine-readable access to your data:

Option #1: HTML+Microdata

If you choose to use microdata, you should describe your data and relationships using appropriate types and properties from schema.org. If there are no appropriate types or properties for your data at schema.org, you might consider using RDFa instead.

You can use Google’s Structured Data Testing Tool or the omnipotentdatatranslator to check your microdata.
Option #2: HTML+RDFa

If you choose to use RDFa, you should describe your data and relationships using types and properties from some RDF-compatible vocabulary. You can search for appropriate vocabularies at Linked Open Vocabularies.

You can use Google’s Structured Data Testing Tool, or the W3C RDFa Distiller and Parser to check your RDFa.
Option #3: JSON-LD

Instead of using microdata or RDFa to describe the data in your HTML representations, you can provide JSON-LD representations. Your types and properties should come from some RDF-compatible vocabulary. You can search for appropriate vocabularies at Linked Open Vocabularies.
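For a sense of what a JSON-LD representation might look like, here is a small sketch that builds one for a hypothetical Farm resource. The function name, the example farm, and its URL are invented for illustration. The `Organization` type and the `name` and `url` properties are from schema.org, but whether they are the right fit depends on your own data.

```python
import json


def farm_as_json_ld(name, url):
    """Describe a (hypothetical) farm using schema.org terms."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
    }


# Serialize the description as the body of a JSON-LD response.
document = json.dumps(
    farm_as_json_ld("Example Farm", "http://example.org/farms/1"),
    indent=2,
)
```

A service taking this option would return `document` with the `application/ld+json` media type when a client asks for it.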

Deliverables

Your final deliverables for the project are:

The URL of a single GitHub repository containing the complete source code for your service.
A Readme.md file in the GitHub repository documenting:
- the attribute values used to describe your application flow, and
- the types and properties used to describe your data.
(The .md suffix indicates a plain text file that uses Markdown syntax. This allows you to produce something more nicely formatted than plain text alone, and GitHub will automatically display it as HTML.)

You may simply email me your URL by 12pm on Tuesday, December 13th.