This site contains information and resources related to Andrew Lampert's email text segmentation and classification research.
The Zebra system is implemented in Java, using the following libraries and components:
The Zebra Email Segmentation Tool is not yet available for download: I'm working through some licensing issues before I can release it. If you'd like a status update, please contact me.
Our annotated data is licensed under a Creative Commons Attribution-Noncommercial 2.0 Generic License.
If you make use of any of these resources, please cite the following paper:
Andrew Lampert, Robert Dale and Cécile Paris (2009). Segmenting Email Message Text into Zones. In Proceedings of Empirical Methods in Natural Language Processing (EMNLP 2009), pp. 919-928, Singapore, August 6-7.