Nordic Treebank Network Meeting

Tartu, 8-10 September 2004

Day 1: Working Group Sessions and Discussion

Tools and Resources (TIGER-XML)

Coordinator: Matthias Trautner Kromann

In the TIGER-XML session at the Tartu meeting, the network members voted to accept proposals 1.1 on character encoding, 1.2 on intersegmental links, and 1.3 on glosses (cf. http://www.id.cbs.dk/~mtk/ntn/tiger-xml.html) as recommendations in their current form. For the other proposals on the ballot, the network members voted to set up the following working groups for each proposal:

The members of the working groups are responsible for producing a proposal for a recommendation before November 1 (December 1 for 1.6, where the work starts from scratch). The network decided that the working groups should remain open to everybody, and that all discussion within the working groups should be carried out on the nordic-treebank list so that all members of the network can participate in the discussion.

Parallel Treebank

Coordinator: Martin Volk

Martin Volk presented two projects about parallel treebanks that were done by PhD students as part of the Treebank Course. Both were on the topic of transferring information from a treebank in one language (e.g. English) to a parallel language (e.g. Amharic).

Atelach Alemu did a project on "Projecting Dependency Parses - English to Amharic". She parsed English Sofie sentences and wrote a program to transfer the information to Amharic. Her conclusions were rather negative. Svetoslav Marinov (Skövde), however, did a project on "(Semi-)Automatic transfer of syntactic information" from Swedish to Bulgarian. He transferred the dependency information computed for Swedish by the Växjö group to Bulgarian, and his recall and precision values were encouraging.

In this section we also presented three other projects that were done by PhD students but were not related to parallel treebanks:

Then Martin Volk and Yvonne Samuelsson presented their work on a Swedish-German parallel treebank. Matthias Trautner Kromann presented his pseudo-automatic word alignment program within his DTAG treebank tool.

Janne Bondi Johannessen compiled a status list of the annotated Sofie sentences in the various languages. She will follow up on this and see that the various groups submit their annotations to the online database in Oslo. We also asked that the problem with the display of crossing branches be solved. One option would be to use the SVG trees from TIGER-Search (instead of a local tree display).

Various other research topics with respect to Parallel Treebanks were discussed, but no actions were taken:

Notational Harmonization (VISL)

Coordinator: Eckhard Bick

In the session on notational harmonization, Eckhard Bick presented the VISL category system, providing definitions for the individual form and function categories, with a special focus on co-ordination and the stacking notation. Joakim Nivre presented a VISL transformation of Swedish dependency treebank edge labels.

The following actions were agreed upon:

Spoken Language and Discourse

Coordinator: Jens Allwood (absent)

The network decided to set up a working group consisting of Janne Bondi Johannessen and Matthias Trautner Kromann for planning NTN's work on spoken language treebanks. The primary task for this working group is:

Day 2: Planning

Six main topics were discussed in the final planning session:

Participants

Site: Participants
Copenhagen Business School: Matthias Trautner Kromann
CSC Scientific Computing: Manne Miettinen
Stockholm University: Martin Volk, Yvonne Samuelsson
University of Bergen: Koenraad de Smedt
University of Helsinki: Kimmo Koskenniemi
University of Oslo: Janne Bondi Johannessen, Gunnar Hrafn Hrafnbjargarson
University of Southern Denmark: Eckhard Bick, Søren Harder
University of Tartu: Heli Uibo, Kadri Muischnek, Kaili Müürisep
Växjö University: Joakim Nivre
Nordic Language Technology Program: Henrik Holmboe
