GSLT: Statistical Methods (Level 2)



The aim of this course is to give a research-oriented introduction to probabilistic modeling, statistical methods and their use within the field of language technology. The course is intended for students with a basic knowledge of natural language processing and/or speech technology (at least the equivalent of a GSLT level 1 course in one of these areas). Basic programming skills are useful, as is a rudimentary knowledge of statistics and probability theory.

NB: The official language within GSLT is English but we can decide to have lectures, seminars and discussions in Swedish instead, provided of course that all participants are comfortable with this. In any case, participants are free to formulate their contributions to discussions, whether oral or written, in any language that can be understood by the other participants (which in most circumstances means Swedish or English).


Part 1: Basic Course

The main part of the course is based on a permanent web course where all course material can be found, including lecture notes, slides, recommended reading, projects, tools and resources. The main text for the basic course is Manning & Schütze (1999) Foundations of Statistical Natural Language Processing. Lectures during the two intensive weeks will cover most of the material in the basic course according to the following schedule.

Date    Time     Room    Contents
1/2     8-10     D326    Introduction; Basic probability theory
1/2     10-12    D326    Stochastic variables; Statistical inference
2/2     8-10     D311    Language modeling
14/3    15-17    D326    Part-of-speech tagging; Syntactic parsing
15/3    8-10     D326    Word sense disambiguation; Machine translation
15/3    10-12    D326    Evaluation

Teachers for the course are Joakim Nivre (lecturer) and Leif Grönqvist (course assistant). Messages that are relevant for all participants can be sent to the course mailing list statmet@gslt.hum.gu.se.

During the first part of the course there will be two practical assignments, one after each intensive week.

The practical assignments are group assignments (two persons). All assignments should be sent to both Joakim (nivre@msi.vxu.se) and Leif (leifg@ling.gu.se).


Part 2: Course Project

The second part of the course is a project, which can be done individually or in groups of two, and which will be reported in the form of a term paper. Topics have to be submitted (and approved) by 8 April. The deadline for the submission of the term paper is 18 May. The closing seminar will take place at Växjö University on 25-26 May 2005.