AllRight - Know It All and Know It Right: High Quality Knowledge Mining in the Web
From a user's point of view, today's world offers an enormous and rapidly
changing supply of information, goods, and services. The average user,
however, is overwhelmed by this unmanageable number of offers, which leads
to wrong decisions, wasted time, and frustration. Knowledge-based systems
are therefore deployed to assist users and to simplify decision processes
in a complex and fast-moving world. Because the environment changes so
rapidly, the efficient acquisition and maintenance of knowledge bases is a
key problem that must be solved before semantic systems can ease the life
of users.
The goal of the proposed project is to exploit the information available on
the Web so that knowledge acquisition can be largely automated. Typically,
this information is stored in semi-structured documents containing tables,
lists, and natural language descriptions. The task of automatic knowledge
acquisition is therefore to transform this semi-structured information into
structured descriptions of concepts that can be processed by a reasoning system.
Based on a general description of a concept (e.g. the definition of a digital
camera), the proposed system will automatically discover instances of this
concept (e.g. digital cameras of a particular brand), their specifications, and
their relations to other instances (e.g. accessories). Since the quality of a
knowledge-based system is tightly linked to the quality of its knowledge base,
we aim for the highest possible recall and precision (both above 90%). To
achieve this goal, we will explore a knowledge mining framework based on
information extraction, natural language processing, machine learning, and
model-based reasoning (i.e. deep domain models). This framework shall be highly
adaptable to different domains, so that the effort for acquiring a new knowledge
base is kept low (less than 3 days of work for a trained person). The central
research task is to show that such a knowledge mining framework can be developed
with reasonable effort. We will show this by developing new methods and
algorithms and by performing an empirical evaluation.
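To make the quality target concrete, the following minimal sketch shows how
precision and recall could be computed for a set of automatically discovered
instances against a hand-labelled reference set. The function name, the
camera identifiers, and the reference set are hypothetical illustrations and
are not taken from the project itself.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) of extracted instances against a gold set."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    # Precision: share of extracted instances that are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of gold instances that were found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical example: four known cameras, four extracted, one of them wrong.
gold = {"cam-a", "cam-b", "cam-c", "cam-d"}
extracted = {"cam-a", "cam-b", "cam-c", "cam-x"}

precision, recall = precision_recall(extracted, gold)
# The target stated above is that both values exceed 90%.
meets_target = precision > 0.9 and recall > 0.9
print(precision, recall, meets_target)
```

In this example both measures are 0.75, so the extraction run would fall short
of the stated target.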
Our group is involved in this project together with the University of Klagenfurt, ConfigWorks, and Lixto Software GmbH. Project duration: 1.1.2005-31.12.2006.
Funded by FFG Fit-IT.