UKP Lab develops natural language processing techniques for automatically understanding written text and applies them to information management tasks such as information retrieval, question answering, and structuring information in wikis. The Ubiquitous Knowledge Processing Lab is among the leading research institutes in the use of Web 2.0 content as a source of lexical semantic information for natural language processing. Wikipedia and Wiktionary are employed as collaboratively constructed lexical semantic resources and used to improve expert-built resources like WordNet. These resources are used to develop semantically enhanced algorithms for information retrieval and question answering. An example is semantic search: if a user enters the query "pie-fruit" into a search engine, a standard search engine will retrieve pages containing the word "pie" but not the word "fruit", which still yields plenty of pages on "apple pie". An intelligent search engine will "understand" that the user is interested in pie recipes that do not use any type of fruit and retrieve appropriate documents. Further research activities at UKP Lab are automatic quality assessment of text, sentiment analysis, and opinion mining. Research activities are organized into several research areas.
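To make the semantic search example above concrete, the following is a minimal Python sketch. It is not UKP Lab's implementation: the hyponym table, the document collection, and the function names are all hypothetical and written by hand for this illustration, standing in for what a lexical semantic resource such as WordNet or Wiktionary might provide.

```python
# Toy hyponym table standing in for a lexical semantic resource
# (hypothetical data, hand-written for this example).
HYPONYMS = {
    "fruit": {"apple", "cherry", "lemon"},
}

# Hypothetical document collection.
DOCUMENTS = {
    "d1": "classic apple pie with a flaky crust",
    "d2": "fruit pie with mixed berries",
    "d3": "chocolate cream pie",
    "d4": "cherry tart",
}


def tokens(text):
    """Split a text into a set of lowercase whitespace-separated tokens."""
    return set(text.lower().split())


def keyword_search(include, exclude, docs):
    """Keep documents containing `include` but not the literal word `exclude`."""
    return sorted(
        doc_id for doc_id, text in docs.items()
        if include in tokens(text) and exclude not in tokens(text)
    )


def semantic_search(include, exclude, docs, hyponyms):
    """Also reject documents that mention any known hyponym of `exclude`."""
    banned = {exclude} | hyponyms.get(exclude, set())
    return sorted(
        doc_id for doc_id, text in docs.items()
        if include in tokens(text) and not (banned & tokens(text))
    )


print(keyword_search("pie", "fruit", DOCUMENTS))
print(semantic_search("pie", "fruit", DOCUMENTS, HYPONYMS))
```

In this toy collection the keyword search still returns the apple pie document, because the literal word "fruit" never occurs in it, while the semantic variant filters it out because "apple" is listed as a kind of fruit.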
Part of the research efforts at UKP Lab is the development of natural language processing software. The following software packages are freely available for research purposes:
DKPro
The Darmstadt Knowledge Processing Software Repository (DKPro) is an open source community of software projects focused on natural language processing. It offers robust, ready-to-use NLP components built on top of IBM's Unstructured Information Management Architecture (UIMA) as a common and open framework. DKPro contains basic natural language processing components such as part-of-speech tagging and lemmatization. Additionally, the package offers components that support the processing of user-generated discourse. User-generated content contains spelling errors, abbreviations, and emoticons, which hinder the direct application of standard NLP components. DKPro provides the required preprocessing tools.
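The kind of preprocessing that user-generated text needs can be illustrated with a small Python sketch. This is not the DKPro API (DKPro components are Java-based and run inside UIMA pipelines). The lookup table, the emoticon pattern, and the `normalize` function are assumptions made for this example, and spelling correction is left out entirely.

```python
import re

# Hypothetical abbreviation table; a real system would use a far larger resource.
ABBREVIATIONS = {"u": "you", "gr8": "great", "thx": "thanks"}

# Simplified pattern covering a few common Western-style emoticons such as :-) or ;D
EMOTICON = re.compile(r"[:;]-?[)(DP]")


def normalize(text):
    """Replace emoticons with a placeholder token and expand known abbreviations."""
    text = EMOTICON.sub(" <EMOTICON> ", text)
    words = [ABBREVIATIONS.get(word.lower(), word) for word in text.split()]
    return " ".join(words)


print(normalize("thx u r gr8 :-)"))  # prints "thanks you r great <EMOTICON>"
```

After such a normalization step, a standard tokenizer or part-of-speech tagger sees mostly ordinary words plus one uniform placeholder, rather than forms it was never trained on.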
Parallel to JWPL (the Java Wikipedia Library), the Java Wiktionary Library (JWKTL) offers programmatic access to the information contained in the English and German editions of Wiktionary.