KDI (NSF, Annotations)

  Keywords: Annotations, Metadata, Activity Histories  

Annotation: Form, Function and Promise
If you look closely at knowledge-oriented work, annotation is likely to figure prominently. If at first this does not seem true, it is because annotation is so ubiquitous and serves so many functions that we rarely realize just how often we annotate. Annotations can be spoken comments, written text, graphical elements, and even gestures. They may be applied to representations and sometimes to non-representational activities. Thus, the thing being annotated may be a video or audio tape, a textual document, a picture, a graphic, an animation, or an ongoing activity, such as a sports event, for which a running commentary may be recorded.

To give a flavor of the diversity of functions that annotation can serve, here is what we have informally observed written annotations alone being used to do:

frame an interpretation (marginalia are often used to hold reader's comments, to frame a perspective)
call attention to or highlight ideas (as when text is underlined, ideas individuated and numbered, or graphical callouts are posted)
provide retrieval indices (e.g. notes on photocopies marking them as containing valuable information on some specific topic of interest)
give directives for action (e.g. act on it, mandatory read, background reading)
classify an item (give subject headings, note relevance to other work, state how hard it is)
summarize large text (e.g. annotated bibliography, abstract or executive summary)
give location of items of interest (informal index, ‘see page 12 for Jones’ remarks’)
give instructions for use (annotate a structural diagram of a bookcase explaining how it is to be assembled)
convert speech to spatially surveyable text (annotate a recording)
correct grammar, spelling, ideas (often by convention as in galley proofs)
layer information (a soil engineer annotates the working blueprint and passes it on)
record the opinions of different members of a group (mark who said what, or who defends a given position)
record reminders (beside the phone, on the refrigerator)
serve as thumbnails to help in navigation (little images or the first few words of a larger document)
identify oneself as an author or agent (nurses and doctors initial a patient's chart both for billing purposes and to be tracked down later for follow up information)

In studying the natural history of annotation we will pay particular attention to the diversity of forms annotations take, the variety of functions they serve, the degree to which there are conventions governing their creation, their capacity to be exploited as new meta-data, and their role as key elements in groupware.

Annotations come in many forms, from the most highly structured or conventional (e.g. author’s marks on galley proofs) to the most free and unconstrained (e.g. doodles, images, wisecracks). If the workflow of documents requires signatures, directives, commentaries and the like, then there are often conventions governing their addition. But equally, because annotations are a collection of highly general techniques for posting contextualized or situated content, the meaning being contributed to a document, film, etc., may not be evident to everyone. Thus, although annotations may be in a language, such as English, with well understood syntactic and semantic structure, their meaning may be elusive because not everyone knows the context (the annotational frame of reference) from which the annotational contribution was originally made.

In looking back at our earlier work on history-enriched digital objects [19] it is now clear that the encapsulated history of interaction can be viewed as yet another form of annotation. In a digital medium these annotations can be automatically created as by-products of normal interactions. We will design work materials that capture such interaction histories and explore mechanisms to exploit them. We think they have the potential to fundamentally change the digital workplace.

For instance, by learning more about the form, function, and conventionality of annotations and the role they play in distributed cognition we expect to be able to design digital media which can help solve the problem of situatedness by automatically storing aspects of the annotator’s context. This has always been an elusive problem in settling on the proper meaning of someone else’s annotation or in determining why someone interacted with a particular object (whose history we are tracking). Here theory can nicely guide design. As we push forward the horizon of digital media we expect annotations to explode in profusion and importance. We also expect that annotations will transform the way we search, filter and collaborate using large knowledge repositories. The reason is that annotations serve as a valuable new source of meta-data, and can be used as the basis for more powerful forms of collaborative filtering and of social interaction.

Here is a simple example. One of the digital ways annotations can be used is to comment on the nature of a link between two documents. Currently on the WWW, a hyperlink between two information elements indicates only that there is some unspecified relation between them. Each author who specifies a link, however, usually has some implicit notion of the relevance of the link. In the author’s mind the linked node often has a rhetorical significance: it may be an elaboration, an exegesis, a different viewpoint, a counterargument, etc. But, at present, there is no official mechanism to share that personal notion, since links are untyped. If we had tools that made it easy for authors to make the personal meaning of their links explicit, then browsers and other tools could provide more suitable interactions to users. The easier this job of annotating links becomes (for readers too), the sooner we can begin to leverage the obvious power of typing links. Searching, browsing and filtering will improve, and we will be one step closer to that elusive goal of finding quality on the web.
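
As a rough illustration of what typing links could make possible, the Python sketch below attaches one of the rhetorical relations named above to each link and then filters by it. The names `LinkType`, `TypedLink` and `filter_links`, and the sample file names, are hypothetical; they are assumptions made for the example, not an existing web mechanism or a tool from this project.

```python
from dataclasses import dataclass
from enum import Enum


class LinkType(Enum):
    # Rhetorical relations mentioned in the text; the enum itself is illustrative.
    ELABORATION = "elaboration"
    EXEGESIS = "exegesis"
    DIFFERENT_VIEWPOINT = "different viewpoint"
    COUNTERARGUMENT = "counterargument"


@dataclass(frozen=True)
class TypedLink:
    source: str          # document that contains the link
    target: str          # document the link points to
    link_type: LinkType  # the author's stated reason for making the link
    author: str          # who made the link and typed it


def filter_links(links, wanted):
    """Return only the links whose type matches the one the reader asked for."""
    return [link for link in links if link.link_type is wanted]


# Hypothetical sample data.
links = [
    TypedLink("essay.html", "notes.html", LinkType.ELABORATION, "author-a"),
    TypedLink("essay.html", "reply.html", LinkType.COUNTERARGUMENT, "author-b"),
]
counterarguments = filter_links(links, LinkType.COUNTERARGUMENT)
```

A browser or search tool built along these lines could, for example, show a reader only the counterarguments to a page, which is not possible while links remain untyped.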

The groupware aspects of annotation are of equal importance. Members of a group currently annotate paper documents to indicate what they believe is right or wrong about a document, what is important, what can be ignored, or what is relevant to specific colleagues. As more documents and representations are used in digital form we can exploit more aspects of the context of authorship, more of the historical facts of the annotations, to filter, navigate and visualize contents we are interested in. For instance, at present, teachers use annotations to comment on students’ exercises, and students often annotate readings to identify concepts or aspects they find problematic. If students were to have online access to the annotations of others – the frequently asked questions, the teacher’s replies, the replies of students – then they might study in a different manner. They might appreciate that the areas of the course where they are having problems are ones others also find difficult. Moreover, students will be able to help their fellow students directly. These are merely a few of the virtues digital annotations promise to provide. In the next section, we discuss how we propose to study the general topic of annotation, and how our theoretical results may lead to better mechanisms for coordinating cognition.

General Theory of Annotation
In their most common form annotations resemble moves in a conversation. You begin a story … I interrupt with my objection, my request for clarification, my two cents’ worth of reinforcement, etc. Such conversational interactions become annotations when they are stored in a form that others can view. This description is helpful and marks the way for deeper analysis, but it is true only to a first approximation. We do not normally consider the dialogue in books, plays, correspondence, documentaries and narratives to be annotation. Although these works store conversational moves, they are self-contained, whereas annotations are always parasitic on pre-existing documents or representations (or in certain cases activities). This means that annotations, at least the textual or graphical kind, are highly situated by their spatial location in a document or representation, and not just by the author who adds them and the conversational context they come from. Nonetheless, the insights of conversational analysis offer us a good starting point for analyzing annotations, particularly as conversational theory identifies several of the key components constituting the speaker-hearer or author-audience context. Without this sort of deep analysis we cannot expect to design new forms of filters, information visualizations, search engines, and browsers that can take advantage of the valuable information stored in annotations. Following conversational analysis then, to understand an annotation we need to know:

1. Who is the intended audience of the annotation? Oneself, the original author of the document, a different group who will be taking your recommendation, whoever happens to read the annotation? Depending on the intended audience, readers of annotated documents will evaluate the meaning of the annotation in different ways. Is the annotation for a novice? An employee? For those who have already read several other documents?
2. What relation holds between author and audience? Is the author writing for him or herself at a later date (self-to-self), much as we underline or highlight phrases to remind ourselves of key points or to call attention to top level ideas? Is the author a doctor who is writing a comment on a patient’s chart for another doctor (doctor-to-doctor), a nurse (doctor-to-nurse), or the patient herself? Is the author a manager annotating a document for his subordinates, marking it as essential reading, or good background, or as something to be converted to a timeline and acted on? Is the author reviewing a colleague’s paper or a collaborator’s contribution to a proposal? Or is the author writing comments on a paper that is supposed to now be read by a third party? In each case, there are power or organizational relationships between author and audience, which again influence how the reader of an annotation will or ought to interpret the import of the annotation. A doctor leaving a note to a patient does not expect contradiction or request for justification. The same note to another doctor though may be an invitation to reconsider the evidence.
3. What is the intended function or illocutionary force of the annotation? Is it intended to mark approval or disapproval, or to state one’s own beliefs on the topic? Is the author intending to prescribe or prohibit action by the annotation, or to warn, promise, punish, stimulate, or provoke? When the author is marking material for his own later perusal is the function to help him to remember, to situate comments so that note-taking is briefer, to help himself to understand the document by formulating his own views and thereby increase the depth of processing occurring during reading? A note to oneself intended to jog memory may require reading the surrounding passage to get the deeper point being marked down, whereas the function of a note meant to summarize the surrounding passage is to make it unnecessary to re-read the local material.
4. What is the larger pattern of activity (the larger workflow) of which the annotation is a part? Does the annotation appear as a component of reviewing a document for publication, of monitoring a patient’s progress, of preparing a manager’s assessment of a project, and so on? Since annotations are part of an ongoing activity of reading, interpreting, evaluating, commenting, cooperating, and so forth, it is important to know where in the process the annotator is, for that helps to determine the annotation’s appropriate interpretation.
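
The four questions above can be read as the fields of a context record stored alongside each annotation. The following Python sketch is one possible encoding under that assumption; the class names, field names and sample values are hypothetical and chosen only to mirror the list.

```python
from dataclasses import dataclass


@dataclass
class AnnotationContext:
    audience: str  # 1. who the annotation is intended for
    relation: str  # 2. author-audience relation, e.g. "doctor-to-nurse"
    function: str  # 3. intended function or illocutionary force
    workflow: str  # 4. larger activity the annotation is part of


@dataclass
class Annotation:
    author: str
    body: str
    context: AnnotationContext


def by_relation(annotations, relation):
    """Select annotations written under a given author-audience relation."""
    return [a for a in annotations if a.context.relation == relation]


# Hypothetical chart notes, loosely following the doctor example in the text.
notes = [
    Annotation("Dr. A", "Reduce dosage.",
               AnnotationContext("nurse", "doctor-to-nurse", "directive",
                                 "monitoring a patient's progress")),
    Annotation("Dr. A", "Evidence seems thin; reconsider?",
               AnnotationContext("doctor", "doctor-to-doctor",
                                 "invitation to reconsider",
                                 "monitoring a patient's progress")),
]
for_doctors = by_relation(notes, "doctor-to-doctor")
```

Storing the context explicitly is what would let a filter distinguish a directive from an invitation to reconsider the evidence, even when the two notes come from the same author on the same chart.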

These elements of annotational context, redundant at points, distinct at others, remind us of the contextually salient elements of conversations. Indeed a theory of annotation may open new lines of thought for theories of conversation. But annotations also have other aspects that require an even richer theory. First, as mentioned, annotations have an essentially spatial or temporal position in a document or representation. In simple text an annotation always has a specific 2-D position. It can span contiguous regions of lines of words, or even serve to combine disjoint regions. Moreover, as annotations accumulate, particularly annotations on annotations, their positions become meaningful relative to each other. In the Talmud, for example, pages are structured so that surrounding a central quote from the Mishnah are comments by the most important early rabbis, and in subsequent pages there are comments on these comments. Where a comment is located is an indication of the historical importance of that comment and its subsequent influence on Talmudic thought.
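
To make the positional point concrete, here is a minimal Python sketch in which an annotation is anchored to one or more line ranges and may also point at another annotation. Reducing position to line ranges is a simplifying assumption, and the names `Span`, `Annotation` and `depth` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Span:
    start_line: int  # first annotated line, inclusive
    end_line: int    # last annotated line, inclusive


@dataclass
class Annotation:
    ident: str
    body: str
    # One span marks a contiguous region; several spans combine disjoint regions.
    spans: list = field(default_factory=list)
    # Identifier of the annotation being commented on, if any.
    parent: Optional[str] = None


def depth(annotation, index):
    """Count how many layers of commentary lie beneath this annotation."""
    levels = 0
    while annotation.parent is not None:
        annotation = index[annotation.parent]
        levels += 1
    return levels


# Hypothetical example: a comment joining two passages, and a reply to it.
comment = Annotation("a1", "Compare these two passages.",
                     [Span(3, 5), Span(40, 42)])
reply = Annotation("a2", "The second passage is the later one.", parent="a1")
index = {a.ident: a for a in (comment, reply)}
```

Here `depth(reply, index)` is 1 and `depth(comment, index)` is 0, a crude analogue of the layered commentary described for the Talmud page.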

Second, annotations are often multimodal. Words may be used to comment on maps, diagrams or pictures; sketches may be used to comment on prose or timelines; voice-overs may be used to add commentary to films, graphics or text; and almost any modality of annotation can be used to help interpret instructions. For instance, to help interpret instructions explaining how to leave voice annotation in MS Word, we might use animations, graphics, textual call-outs, voice-overs, or even videos. In conversations we do not have to worry about the nature of multimodal interaction, since all conversational interactions are linguistic or gestural. Accordingly, a general theory of annotation must include some discussion of the nature of multimodal interactions – how different modalities can constructively or destructively interfere with each other.

Third, the range of documents or representations which annotations may be attached to is far more diverse than the types of real time conversations that exist. Because annotations can be attached to virtually any representation (conversation being just one type, or family of types, of representations), the range of entities they may be attached to is as broad as the space of possible representations. For instance, textual documents constitute a type of representation distinct from conversation and they range over student exercises, academic readings, syllabi, contact sheets, press releases, letters, hardcopies of PowerPoint presentations (called decks by high-level bureaucrats), travel forms, budget reports, and requisition slips, to name only a few. In each case, the type of document and the workflow it is part of constrain the range of plausible annotations one might find, and similarly, the range of plausible interpretations a reader might make upon receipt of the annotated document. This takes us well out of the scope of conversational analysis.

Lastly, annotations are often asynchronous with the representations they are about. In a conversation, there is a clear notion of turn taking. You speak, then I reply. You send or read me a document, and I comment on it. A commentator may address several people or positions at once, but the presupposition still holds that these comments respond to utterances or text that temporally precede the commentator’s own. Yet in widely distributed electronic systems, particularly those based on hypertext, a speaker or author may leave annotations conditionally lurking, so that when a certain comment is made, a particular annotation or reply is automatically triggered.
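
A conditionally lurking annotation can be pictured as a stored reply paired with a test on future comments. The Python sketch below is only an illustration under that assumption; `LurkingAnnotation`, `post_comment` and the sample texts are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LurkingAnnotation:
    condition: Callable[[str], bool]  # decides whether a new comment triggers the reply
    reply: str                        # annotation text released when triggered


def post_comment(comment, lurking):
    """Return the replies triggered by a newly posted comment."""
    return [item.reply for item in lurking if item.condition(comment)]


# Hypothetical example: the reply was written before the comment that triggers it.
lurking = [
    LurkingAnnotation(lambda text: "deadline" in text.lower(),
                      "See the schedule note on page 2."),
]
triggered = post_comment("Has the deadline moved?", lurking)
```

The reply is authored before the comment it answers, which is exactly the break from ordinary turn taking described above.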

Literary Text

Annotation made by an actor in a script. Color coding denotes lines (green), stage notes (yellow) and other material, such as lines of other characters (no color).






Rehearsals for the studio occur in this space. Dancers claim to use physical structures of the space, such as vertical lines of the mirror, as guides in learning a routine.






The orthopedist annotated this mold of a polio patient’s right leg to mark the areas that should not be filed.



  Project Team  
David Kirsh
(202) 623-3624
Office: CSB173