A Self-Learning Context-Aware Lemmatizer for German

Abstract
Accurate lemmatization of German nouns mandates the use of a lexicon. Comprehensive lexicons, however, are expensive to build and maintain. We present a self-learning lemmatizer capable of automatically creating a full-form lexicon by processing German documents.
Reference
Praharshana Perera and René Witte, A Self-Learning Context-Aware Lemmatizer for German. Human Language Technology Conference/Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP 2005), pp. 636–643, October 6–8, 2005, Vancouver, B.C., Canada.
BibTeX entry (also for download):
@InProceedings{perera-witte:2005:HLTEMNLP,
author = {Praharshana Perera and Ren\'{e} Witte},
title = {{A Self-Learning Context-Aware Lemmatizer for German}},
booktitle = {Proceedings of Human Language Technology Conference and
Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP 2005)},
month = {October 6--8},
year = {2005},
address = {Vancouver, British Columbia, Canada},
publisher = {Association for Computational Linguistics},
pages = {636--643},
url = {http://www.aclweb.org/anthology/H/H05/H05-1080}
}
You can also visit the conference website.
Software
The Durm German Lemmatization System is available as free/open source software.
Download
URL: http://acl.ldc.upenn.edu//H/H05/H05-1080.pdf. Also available: local copy.
MD5 checksum: 967bc9caf77b5ab09dbfecfa6b74b973
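A download can be checked against the checksum above with a few lines of Python. This is a minimal sketch, not part of the original distribution: it assumes the checksum refers to the paper PDF, and the local file name `H05-1080.pdf` and the helper names are hypothetical.

```python
import hashlib
from pathlib import Path

# Checksum as published on this page.
EXPECTED_MD5 = "967bc9caf77b5ab09dbfecfa6b74b973"


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_MD5):
    """Compare the file's MD5 digest with the expected hex string."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    pdf = Path("H05-1080.pdf")  # assumed local file name, adjust as needed
    if pdf.exists():
        print("checksum OK" if matches_published_checksum(pdf) else "checksum MISMATCH")
    else:
        print(f"{pdf} not found; download the file first")
```

MD5 is adequate for detecting a corrupted or incomplete download, but it does not protect against deliberate tampering.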
Copyright © 2005 ACL.
