Number and Type: |
181130 VU WS 2005/06
|
Lecturer: |
Robert Baumgartner |
Short Description: |
Approaches to web data extraction and integration |
Preliminary Meeting: |
Thursday 6th of October, 10:00 (s.t.), seminar room
184/2 |
Registration: |
until 2nd of October via e-mail
(limited number of participants) |
Language: |
Slides in English; the lecture language depends on whether
non-German-speaking students from the Computational Logic program join |
Timetable: |
about every other Thursday 10:00-13:00 (starting with
20th of October), seminar room 184/2 (partly held as block sessions) |
Procedure: |
Lecture coupled with exercises and group work; exercise
evaluation at 10:00, lecture at 11:00 (on the first
lecture day, the lecture starts at 10:00) |
Keywords: |
XML family, XML Schema, XPath, XSLT, XQuery, (HTML)
data extraction and wrapper generation, definition and areas of IE,
differentiation from IR, Lixto project: Visual Wrapper and Transformation
Server, application generation with Lixto, other wrapper generation
languages and tools, wrapper learning and automatic data extraction,
data aggregation and syndication, portal integration, e-biz frameworks,
PDF data extraction |
Fields of Study: |
This VU is a compulsory course or compulsory elective
in some bachelor's and master's programs. It can also be selected
as part of KfK
Semantic Web Advanced Topics, and it is part of the European
Master's Program in Computational Logic. |
Related Lectures: |
Proseminar Web
Information Extraction (Herzog, Gatterbauer) |
Please note: Due
to organisational issues, the session planned for October 20 has been moved
to November 3; consequently, the second session has been moved to November 17,
and November 24 has been added for the third session.
Structure of the lecture and slides
Prelim. | Preliminary Meeting | 6.10. |
1st | Motivation IE, XML and XML Schema (till 12:30) | 3.11. |
2nd | XML Navigation, Query and Transformation Languages (+Ex.) | 17.11. |
3rd | XML Query Languages, techniques in IE, approaches to wrapper generation (+Ex.) | 24.11. |
4th | Lixto Visual Wrapper and Elog (+Ex.) | 1.12. |
5th | Lixto Transformation Server (+Ex.) | 15.12. |
6th | Three sample projects: on inductive wrapper generation, automatic data extraction, and PDF data extraction (+Ex.) | 12.1. |
7th | Talks of Student Groups (10:00-12:30, 13:30-16:00; five groups each) | 26.1. |
Group Distribution
Group Talk Topics & Timetable
|