This site contains information and resources related to Andrew Lampert's email classification research.
As the complexity and sophistication of text processing tools increase, we can expect to see techniques that go beyond the syntactic and semantic features of documents to consider the more nuanced, context-sensitive aspects of language use that generally fall within the realm of pragmatics. This requires data that has been carefully considered by human annotators in ecologically valid real-world contexts. To facilitate such in-context annotation, we have developed a plug-in for Microsoft Outlook that allows annotators to judge relevant aspects of collections of email messages. Integration into a widely used application provides a natural environment that closely approximates the real-world environment in which this data is conventionally created and consumed. We describe how this tool is used to collect data for the development of a system that automatically classifies requests and commitments in email.
If you make use of any of these resources, please cite the following paper:
Lampert A, Breese D, Paris C and Dale R. A Tool for Capturing Context-Sensitive Judgements in Email Data. CSIRO ICT Centre Technical Report (EP092057), September 2009.