Research

Large Software systems are distributed systems. They possibly contain data parallel components. My research activities aim at methods, tools, and frameworks to design and implement such systems. More information on the whole research team can be found on the Research In Computer Science (RICS) home page of the Växjö University.

Current (last update 2003-12-09) activities include the projects:

1.      Specification of Services and Components (SoSaC)
Systems of components and services are usually built in two steps. First, pre-existing components and services are selected and assessed, missing components are designed and implemented. In the second stage, components and services are assembled to a system. Systems integration is usually physically and temporally separated from component design and mismatches between components are the rule, not the exception. This makes the selection and the assessment of existing components and services as well as their adaptation an integral part of component-based systems development process. This project contributes to the specification of components and services to support their assessment for selection and adaptation for assembly.

2.      Component Analysis and Transformation Engine (Cate)
Cate aims at the promotion of reuse of software. We improve adaptations and optimisations of components in their system environments. Therefore, we enhance representations, analyses and transformations of component systems by combining approaches from software engineering with techniques and data structures from compiler construction.

3.      Analysing and Visualizing Programs (VizzAnalyzer)
Successful software comprehension methods (and tools) need the synergy of low-level code analyses known from the field of compiler construction, high-level analyses from the field of re-engineering and software visualization techniques. We argue that each individual technique would be either not goal directed or too shallow (or both). The VizzAnalyzer is a framework for an easy integration of low- and high-level analysis as well as visualization components.

4.      An Adaptive GRID-Service Architecture for a Distributed, Time-coherent Radio/IT-Network
LOFAR/LOIS is a new-generation wide-area, time-coherent network of digital sensors and emitters for space, environmental, telecom and IT research. From the IT perspective, LOFAR/LOIS is an open network infrastructure with full public access to primary data.  The enormously large amount of data transmitted over a combination of wireless/radio channels, fibre and copper networks, will put challenging demands on the computation facilities. GRID architectures connecting computational resources of mainframes and PC from different Web locations potentially provided the required computational power but open up new research questions for software architecture and program optimisations. The proposed project aims at a GRID-service infrastructure comprising an optimisation engine, which will be able to decide dynamically about adapting the application software architecture. It will monitor the system state and automatically decide on adaptations in a robust and efficient way. This infrastructure is applicable in complex multi-media telecom and IT networks such as LOFAR/LOIS.

 

[ HOME ]  [ TOP ]


Scientific Career

My diploma thesis and the subsequent work at the University of Dresden deals with a distributed, object-oriented language called OPAL. This language maps objects to processes, method calls to process communications. The naive mapping leads to poor performance unless a tuning phase clusters the processes. I support this process by a monitoring technique detecting the performance critical parts of the program. Additionally, I improved the Opal compiler.

I moved to Karlsruhe where I was granted a fellowship and focused my activities on data parallel programs. Again, the naive mapping of such programs to real parallel architectures with distributed memory and message passing leads to poor performance results. I solve this problem with transformation and optimization techniques. Therefore, I describe parallel machines with the LogP model that proved adequate for real parallel architectures. Most data parallel programs can be compiled to LogP. These programs are optimized with scheduling techniques. Although the general problem is NP-hard, I found approximation guaranteeing a performance of a small constant factor of the theoretical optimum in the worst case. In practice, the results are even closer to the optimum. This work lead to my dissertation on “Optimization of Parallel Programs”.

I continued my work on parallel and distributed system at the ICSI where I was granted a post doc fellowship. With the team at the ICSI, I worked on language design problems. The parallel, object-oriented language Sather allowed the implementation of save parallel programs. To guarantee safety, programs contained numerous runtime synchronizations of threads. With the differentiation of synchronized and unsynchronized classes at design level, many synchronizations are skipped while correctness is still preserved. This work is summarized in the language report for Sather-2.

Back to Karlsruhe, I concentrated on connecting single, predefined components to distributed systems. Therefore, I had to solve adaptation problems at different levels: the output of a component must be transformed accordingly before it can be processed by another component. With the specification of data and their transformation using XML technology, I found a correct solution. In the aXMLelerate project, I designed generators producing mini compilers from these XML specifications which leading to fast solution. Adaptation is also required for the communication layer of the components. Explicit adapters guarantee correctness. The adapter code is then woven into the components in order to make the solution fast. The next adaptation is required at protocol level. The solution here is also explicit adaptation and code weaving. However, the ordering constraints due to the local protocols of components could lead to global consistency problems. Locally correct adaptation could bring in deadlocks. Unfortunately, solutions to this problem are known to be not decidable or not complete (or both). I propose theoretical tools based on state machines (in general not complete), context free languages (in general not complete and not decidable) and Petri nets (in general not decidable) which work fine for special cases. For a practical solution, I created a method for the visualization of the dynamic system behavior. The resulting tool VizzEditor connects program points with views of Petri net states and can serve as a visual debugger.

Both projects, aXMLelerate and VizzEditor, showed to be applicable in other contexts. The aXMLelerate project, where I supervise four researchers and a few more students, aims at document exchange in general. The toolbox that we designed and implemented is applied successfully in projects with the Daimler Chrysler AG and the MOST Cooperation.

Originally, I created the VizzEditor project to improve the teaching of algorithms. It allows the rapid design of visualizations of programs using a debugger interface to read the program state and mapping it to predefined views. Such visualizations have been used to explain sorting and graph algorithms to students in the beginners’ classes at our university. Another instantiation of the VizzEditor lead to a framework for the visualization of LogP scheduling algorithms. We also apply the VizzEditor in the software reengineering context: we merge program structure and dynamic program information to detect design flaws. The VizzEditor project is implemented mainly with students. Currently more then 20 students contribute to the project.

[ HOME ]  [ TOP ]


Dissertation on Optimization of Parallel Programs

Today, there is a great variety on parallel algorithms for shared-memory architectures, mainly the PRAM. However, the PRAM model does not take into account properties of realistic architectures. Culler et al. defined a more realistic machine model which reflects better the practical behaviour of massively parallel computers. Their LogP model differs from the PRAM in the following points. First, the synchronous execution is dropped. Instead, all processors perform their computation asynchronously. Second, there is no shared memory assumption. Instead, they consider a communication latency, communication overhead and network bandwidth. Finally, the number of processors is fixed and cannot increase with the problem size. Various parallel machines are modelled with the parameters latency, overhead, inverse of the bandwidth, and number of processors. It turned out that the LogP model is adequate to predict the runtime of programs on these architectures.

We focus on transformation of parallel PRAM programs to equivalent LogP programs and on the optimization of the resulting programs w.r.t. the LogP parameters of the target architecture. With LogP cost model, we can not only perform optimization automatically but also predict the quality of the optimized programs in terms of their runtime on a specific target machine. Our optimization techniques guarantee a small factor of the theoretically optimal performance. We proved the time bound and confirmed it by performance measurements of different programs on different parallel computers and a clusters of workstations.

 

[ HOME ]  [ TOP ]