Large Software systems are distributed systems. They possibly contain
data parallel components. My research activities aim at methods, tools, and
frameworks to design and implement such systems. More information on the whole
research team can be found on the Research
In Computer Science (RICS) home page of the Växjö University.
Current (last update 2003-12-09) activities include the projects:
1.
Specification
of Services and Components (SoSaC)
Systems of components and services are usually built in two steps. First,
pre-existing components and services are selected and assessed, missing
components are designed and implemented. In the second stage, components and
services are assembled to a system. Systems integration is usually physically
and temporally separated from component design and mismatches between
components are the rule, not the exception. This makes the selection and the
assessment of existing components and services as well as their adaptation an
integral part of component-based systems development process. This project
contributes to the specification of components and services to support their
assessment for selection and adaptation for assembly.
2.
Component Analysis and Transformation
Engine (Cate)
Cate aims at the promotion of reuse of software. We improve adaptations and optimisations
of components in their system environments. Therefore, we enhance
representations, analyses and transformations of component systems by combining
approaches from software engineering with techniques and data structures from
compiler construction.
3.
Analysing and Visualizing Programs
(VizzAnalyzer)
Successful software comprehension methods (and tools) need the synergy of
low-level code analyses known from the field of compiler construction,
high-level analyses from the field of re-engineering and software visualization
techniques. We argue that each individual technique would be either not goal
directed or too shallow (or both). The VizzAnalyzer is a framework for an easy
integration of low- and high-level analysis as well as visualization
components.
4.
An Adaptive GRID-Service Architecture for a
Distributed, Time-coherent Radio/IT-Network
LOFAR/LOIS is a new-generation
wide-area, time-coherent network of digital sensors and emitters for space,
environmental, telecom and IT research. From the IT perspective, LOFAR/LOIS is
an open network infrastructure with full public access to primary data. The enormously large amount of data
transmitted over a combination of wireless/radio channels, fibre and copper networks,
will put challenging demands on the computation facilities. GRID architectures
connecting computational resources of mainframes and PC from different Web
locations potentially provided the required computational power but open up new
research questions for software architecture and program optimisations. The
proposed project aims at a GRID-service infrastructure comprising an optimisation
engine, which will be able to decide dynamically about adapting the application
software architecture. It will monitor the system state and automatically
decide on adaptations in a robust and efficient way. This infrastructure is
applicable in complex multi-media telecom and IT networks such as LOFAR/LOIS.
My diploma thesis and the subsequent work at the University of Dresden
deals with a distributed, object-oriented language called OPAL. This language
maps objects to processes, method calls to process communications. The naive
mapping leads to poor performance unless a tuning phase clusters the processes.
I support this process by a monitoring technique detecting the performance
critical parts of the program. Additionally, I improved the Opal compiler.
I moved to Karlsruhe where I was granted a fellowship and focused my
activities on data parallel programs. Again, the naive mapping of such programs
to real parallel architectures with distributed memory and message passing
leads to poor performance results. I solve this problem with transformation and
optimization techniques. Therefore, I describe parallel machines with the LogP
model that proved adequate for real parallel architectures. Most data parallel
programs can be compiled to LogP. These programs are optimized with scheduling
techniques. Although the general problem is NP-hard, I found approximation
guaranteeing a performance of a small constant factor of the theoretical
optimum in the worst case. In practice, the results are even closer to the
optimum. This work lead to my dissertation on “Optimization
of Parallel Programs”.
I continued my work on parallel and distributed system at the ICSI where
I was granted a post doc fellowship. With the team at the ICSI, I worked on
language design problems. The parallel, object-oriented language Sather allowed
the implementation of save parallel programs. To guarantee safety, programs
contained numerous runtime synchronizations of threads. With the
differentiation of synchronized and unsynchronized classes at design level,
many synchronizations are skipped while correctness is still preserved. This
work is summarized in the language report for Sather-2.
Back to Karlsruhe, I concentrated on connecting single, predefined
components to distributed systems. Therefore, I had to solve adaptation
problems at different levels: the output of a component must be transformed
accordingly before it can be processed by another component. With the
specification of data and their transformation using XML technology, I found a
correct solution. In the aXMLelerate project, I designed generators producing mini
compilers from these XML specifications which leading to fast solution.
Adaptation is also required for the communication layer of the components.
Explicit adapters guarantee correctness. The adapter code is then woven into
the components in order to make the solution fast. The next adaptation is
required at protocol level. The solution here is also explicit adaptation and
code weaving. However, the ordering constraints due to the local protocols of
components could lead to global consistency problems. Locally correct
adaptation could bring in deadlocks. Unfortunately, solutions to this problem
are known to be not decidable or not complete (or both). I propose theoretical
tools based on state machines (in general not complete), context free languages
(in general not complete and not decidable) and Petri nets (in general not
decidable) which work fine for special cases. For a practical solution, I
created a method for the visualization of the dynamic system behavior. The
resulting tool VizzEditor connects program points with views of Petri net states
and can serve as a visual debugger.
Both projects, aXMLelerate and VizzEditor, showed to be
applicable in other contexts. The aXMLelerate project, where I supervise four researchers and
a few more students, aims at document exchange in general. The toolbox that we
designed and implemented is applied successfully in projects with the Daimler
Chrysler AG and the MOST Cooperation.
Originally, I created the VizzEditor project to improve the teaching of algorithms. It
allows the rapid design of visualizations of programs using a debugger
interface to read the program state and mapping it to predefined views. Such
visualizations have been used to explain sorting and graph algorithms to
students in the beginners’ classes at our university. Another instantiation of
the VizzEditor lead to a framework for the visualization of LogP
scheduling algorithms. We also apply the VizzEditor in the software
reengineering context: we merge program structure and dynamic program
information to detect design flaws. The VizzEditor project is implemented mainly with students. Currently more then 20
students contribute to the project.
Today, there is a great variety on parallel
algorithms for shared-memory architectures, mainly the PRAM. However, the PRAM
model does not take into account properties of realistic architectures. Culler
et al. defined a more realistic machine model which reflects better the
practical behaviour of massively parallel computers. Their LogP model differs
from the PRAM in the following points. First, the synchronous execution is
dropped. Instead, all processors perform their computation asynchronously.
Second, there is no shared memory assumption. Instead, they consider a
communication latency, communication overhead and network bandwidth. Finally,
the number of processors is fixed and cannot increase with the problem size.
Various parallel machines are modelled with the parameters latency, overhead,
inverse of the bandwidth, and number of processors. It turned out that the LogP
model is adequate to predict the runtime of programs on these architectures.
We focus on transformation of parallel PRAM
programs to equivalent LogP programs and on the optimization of the resulting
programs w.r.t. the LogP parameters of the target architecture. With LogP cost
model, we can not only perform optimization automatically but also predict the
quality of the optimized programs in terms of their runtime on a specific
target machine. Our optimization techniques guarantee a small factor of the
theoretically optimal performance. We proved the time bound and confirmed it by
performance measurements of different programs on different parallel computers
and a clusters of workstations.