GSLT: Statistical Methods (Level 2)



The aim of this course is to give a research-oriented introduction to probabilistic modeling and statistical methods and to their use within the field of language technology. The course is aimed at students with a basic knowledge of natural language processing and/or speech technology (at least the equivalent of a GSLT level 1 course in one of these areas). Basic programming skills are useful, as is a rudimentary knowledge of statistics and probability theory.

NB: The official language within GSLT is English but we can decide to have lectures, seminars and discussions in Swedish instead, provided of course that all participants are comfortable with this. In any case, participants are free to formulate their contributions to discussions, whether oral or written, in any language that can be understood by the other participants (which in most circumstances means Swedish or English).


Part 1: Basic Course

The main part of the course is based on a permanent web course where all course material can be found, including lecture notes, slides, recommended reading, projects, tools and resources. The main text for the basic course is Manning & Schütze (1999), Foundations of Statistical Natural Language Processing. Lectures will cover most of the material in the basic course according to the following schedule.

Date   Time    Room   Contents                                        Slides
12/3   10-12   C430   Introduction; Basic probability theory          Lectures 1-3
12/3   13-15   C430   Stochastic variables; Statistical inference
12/3   15-17   C430   Language modeling
13/3   8-10    C430   Part-of-speech tagging; Syntactic parsing       Lectures 4-6
13/3   10-12   C430   Word sense disambiguation; Machine translation
13/3   13-15   C430   Evaluation

The teacher for the course is Joakim Nivre. Messages that are relevant for all participants can be sent to the course mailing list statmet AT gslt DOT hum DOT gu DOT se.

During the first part of the course there will be two practical assignments. These are group assignments (two persons). All assignments should be sent to nivre AT msi DOT vxu DOT se.


Part 2: Course Project

The second part of the course is a project, which can be done individually or in groups of two and which will be reported in the form of a term paper. Topics have to be submitted by 5 April. The deadline for submitting the term paper is 17 May. The closing seminar will take place at Uppsala University on 26-27 May 2009.