Enriching personal information management with document interaction histories

Last Modified
  • March 20, 2019
  • Gyllstrom, Karl
    • Affiliation: College of Arts and Sciences, Department of Computer Science
  • Personal information management is increasingly challenging as more and more of our personal and professional activity migrates to personal computers. Manual organization and search remain the only two options available to users, and both have significant limitations: the former requires too much effort from the user, while the latter depends on users' ability to recall discriminating information. I pursue an alternative approach in which users' interactions with their computer workspaces are recorded, algorithms draw inferences from these interactions, and the inferences are applied to improve information management and retrieval. This approach requires no effort from users and enables retrieval to be more personalized, natural, and intuitive. The Passages system enhances information management by maintaining a detailed chronicle of all the text the user ever reads or edits, and making this chronicle available for rich temporal queries about the user's information workspace. Passages enables queries such as "Which papers and web pages did I read when writing the related work section of this paper?" and "Which of the emails in this folder have I skimmed, but not yet read in detail?" Because time and interaction history are important attributes in users' recall of their personal information, supporting them effectively creates useful possibilities for information retrieval. I present methods to collect information about the large volume of text with which the user interacts, and to use this information to improve retrieval. I show through user evaluation the accuracy of Passages in building interaction history, and illustrate its capacity both to improve existing retrieval systems and to enable novel ways to characterize document activity across time. Before the Passages system, I developed two other systems with similar goals.
Confluence extends an existing system that identifies task-based links among users' data based on their use at proximal points in time. For example, if a user frequently interacts with a report and a graph at the same time, those documents likely belong to a common task even though they may have no semantic relationship. Once such links are identified, they are applied when users issue search queries, expanding traditional text-based results with other documents that share task-based links with those results. This creates a form of task-based retrieval that is independent of document semantics and enhances users' ability to retrieve information. The SeeTrieve system extends this concept by tracing the visible text in the GUI with which the user interacts and associating it with files accessed at proximal points in time. In addition to improving retrieval for users, it creates a form of automated, task-oriented tagging of files.
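The idea of task-based links described above can be illustrated with a minimal Python sketch. It is not the Confluence implementation: the access log, the 60-second window, the co-access threshold, and the function names `build_links` and `expand_results` are all assumptions made for illustration. The sketch counts how often two documents are accessed close together in time, then expands a list of text-based search hits with documents linked to those hits.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical access log: (timestamp in seconds, document name).
ACCESS_LOG = [
    (0, "report.doc"),
    (20, "graph.xls"),
    (45, "report.doc"),
    (50, "graph.xls"),
    (4000, "recipe.txt"),
]


def build_links(log, window=60):
    """Count accesses of two distinct documents within `window` seconds.

    The window size is an assumed parameter, not a value from the source.
    """
    links = defaultdict(int)
    for (t1, d1), (t2, d2) in combinations(sorted(log), 2):
        if d1 != d2 and abs(t2 - t1) <= window:
            links[frozenset((d1, d2))] += 1
    return links


def expand_results(text_hits, links, min_count=2):
    """Append documents that share a sufficiently strong link with a text hit."""
    expanded = list(text_hits)
    for pair, count in links.items():
        if count < min_count:
            continue  # ignore weak, possibly coincidental co-accesses
        if not any(hit in pair for hit in text_hits):
            continue  # this link does not touch any text-based result
        for doc in sorted(pair):
            if doc not in expanded:
                expanded.append(doc)
    return expanded


links = build_links(ACCESS_LOG)
results = expand_results(["report.doc"], links)
print(results)
```

In this toy log the report and the graph are repeatedly used within the same minute, so a query that matches only the report also returns the graph, while the unrelated recipe file is left out. Note that no document text is compared at any point; the link depends only on access times.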
Date of publication
Resource type
Rights statement
  • In Copyright
  • Stotts, P. David
  • Open access