Research Overview

The goal of my research is to make engineering scalable and efficient distributed systems easier.

My research thus intersects distributed systems, software engineering, cloud computing and programming languages.

Why is research in distributed systems and software engineering important?

Modern distributed applications present a unique set of challenges, arising from the need to support customer-facing, highly available services and applications, while scaling to thousands of servers and network components, possibly running in multiple geographically distributed data centers. Engineering effective distributed systems is thus a hard problem. Research in this area has the potential to impact several emerging applications ranging from electronic commerce, financial trading, advertising and marketing to social networks and sensor networks. Moreover, with the increasing use of mobile devices, a large percentage of software is becoming distributed, or so-called "cloud-backed software"/"apps". Such applications need to seamlessly execute on multiple devices while maintaining durability and consistency of data; well-known examples are text editing, spreadsheet and presentation software.

My approach

One way to identify inefficiencies and scalability limitations/bottlenecks in existing distributed systems is through empirical analysis. Once I identify such bottlenecks, I start by developing fundamental building blocks and corresponding programming abstractions to alleviate them. I then align these abstractions with other programming language techniques, middleware systems and distributed algorithms to increase scalability and efficiency.

[Jan 2018 : Well, I guess this is now slightly out of date. For most recent work, see my publications page and CV]

The information below dates from my job hunt in December 2013. It is NOT terribly out-of-date and gives you an idea of my research philosophy, but in 2015, @ IBM Research, I am primarily working on elastic scalability and security of distributed data analytics platforms.

This page gives a short overview of two of my research projects. For a detailed description of my research, along with some future plans, please see my [Research Statement].

Programmable Elasticity

To optimize the performance of new or existing distributed applications while deploying or moving them to the cloud, engineering robust elasticity management components is essential. I am interested in the design and implementation of high-level programming language extensions for elastic distributed programs, which are easy to use, familiar to a large community of programmers and support a variety of programming idioms.

As a first step in this journey, I have developed ElasticRMI [MIDDLEWARE 2013], a distributed programming framework and middleware system that (1) enables application developers to dynamically change the number of (server) objects available to handle remote method invocations with respect to the application's workload, without requiring changes to clients (invokers) of remote methods, (2) enables flexible elastic scaling, and (3) provides a generic, high-level programming framework that handles elasticity at the level of classes and objects, masking low-level platform-specific tasks (like creating and provisioning virtual machine (VM) images) from the developer. Our experiments with four real-world applications implemented in ElasticRMI demonstrate that using fine-grained application-specific metrics helps and significantly increases elasticity.
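To make the idea of elasticity at the level of objects concrete, here is a minimal sketch in plain Java. It is not ElasticRMI's API: the class ElasticPool, its methods pick and rescale, and the "pending requests per object" metric are all hypothetical names chosen for illustration, and the sketch runs in a single process with no remote invocation or VM provisioning. It only shows the shape of the idea: callers go through one stable handle while the number of objects behind it follows an application-specific metric.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical sketch: a resizable pool of server objects behind one stable handle. */
class ElasticPool<T> {
    private final Supplier<T> factory;           // creates a new server object on demand
    private final List<T> members = new ArrayList<>();
    private final int min;
    private final int max;
    private int next = 0;                        // round-robin cursor

    ElasticPool(Supplier<T> factory, int min, int max) {
        if (min < 1 || max < min) {
            throw new IllegalArgumentException("need 1 <= min <= max");
        }
        this.factory = factory;
        this.min = min;
        this.max = max;
        for (int i = 0; i < min; i++) {
            members.add(factory.get());
        }
    }

    /** Callers use this to reach a server object; they never see the pool size. */
    synchronized T pick() {
        T chosen = members.get(next % members.size());
        next = (next + 1) % members.size();
        return chosen;
    }

    /** Resize using an application-specific metric (assumed: pending requests per object). */
    synchronized void rescale(int pendingRequests, int targetPerObject) {
        if (targetPerObject < 1) {
            throw new IllegalArgumentException("targetPerObject must be positive");
        }
        int wanted = (pendingRequests + targetPerObject - 1) / targetPerObject; // ceiling division
        wanted = Math.max(min, Math.min(max, wanted));                          // clamp to [min, max]
        while (members.size() < wanted) {
            members.add(factory.get());
        }
        while (members.size() > wanted) {
            members.remove(members.size() - 1);
        }
    }

    synchronized int size() {
        return members.size();
    }
}

class ElasticPoolDemo {
    public static void main(String[] args) {
        ElasticPool<StringBuilder> pool = new ElasticPool<>(StringBuilder::new, 2, 5);
        pool.rescale(35, 10);   // 35 pending requests, aiming for about 10 per object
        System.out.println("objects after load spike: " + pool.size());
        pool.rescale(0, 10);    // load has gone away
        System.out.println("objects after load drop: " + pool.size());
    }
}
```

In this sketch the caller of pick is unaffected by rescale, which mirrors point (1) above in spirit only; a real system would also have to migrate or drain state held by the objects it removes.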

Engineering Event-based Distributed Systems

Event-based distributed applications like algorithmic trading, highway traffic management and business process monitoring typically use middleware systems for event dissemination and event correlation (i.e., detection of complex events). Event-based distributed systems are inefficient because middleware systems are viewed as a separate layer by the programming language used to develop event-based applications, and the interaction only happens through APIs. The central claim of my doctoral thesis is that the efficiency of event-based distributed systems can be increased by aligning middleware systems with specialized programming abstractions.

I explore these ideas through the EventJava language framework, consisting of:

1) Specialized Programming Abstractions: EventJava is an extension to the Java programming language with advanced abstractions for event correlation (including aggregation, temporal and causal correlation), unicasting and multicasting of events. It supports reactions to combinations of events, and predicates guarding those reactions. EventJava is implemented as a framework to allow customizable propagation and matching of events. [ECOOP'09, DEBS'11]

2) Efficient Event Correlation: The EventJava compiler uses definitions of complex events to generate an optimized complex event detection component, called GenTrie, at each event consumer. GenTrie consists of an event flow graph that is designed to discard unwanted events early and to effectively store events that partially match correlation patterns. [COORDINATION'10]

3) Parametric Subscriptions: To address “dynamic” event-based applications that have to update their subscriptions for various reasons, we introduce parametric subscriptions to support subscription adaptations, and discuss their desirable and feasible guarantees. We propose novel algorithms for updating routing mechanisms effectively and efficiently in classic content-based publish/subscribe broker overlay networks. Compared to re-subscriptions, our algorithms significantly improve the reaction time to subscription updates without hampering throughput or latency under high update rates. In fact, our algorithms significantly improve the latter two metrics under high update rates. [MIDDLEWARE'10, ICDCS'11, ACM TOCS'13]

4) Subscription Normalization: The EventJava framework also contains a distributed event transmission middleware (called Beretta), which leverages a simple model of typed events, enabling a succinct and uniform normalized representation of subscriptions. This, in turn, supports highly effective subsumption and attribute-wise split filtering with worst-case complexity logarithmic in the number of subscriptions, and enables the systematic introduction of parameters into subscriptions. [ICDCS'11]
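The notions of a parametric subscription and of subsumption from items 3 and 4 can be illustrated with a small plain-Java sketch. It is not the API of EventJava or Beretta: the classes Range and ParametricSubscription are hypothetical, the filter covers a single numeric attribute, and nothing here models broker overlays, routing updates or the logarithmic-time filtering described above. It only shows a subscription whose filter parameter is changed in place rather than by unsubscribing and re-subscribing, and a subsumption test between two filters.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical filter on one numeric attribute: matches when lo <= value <= hi. */
final class Range {
    final double lo;
    final double hi;

    Range(double lo, double hi) {
        if (lo > hi) {
            throw new IllegalArgumentException("lo must not exceed hi");
        }
        this.lo = lo;
        this.hi = hi;
    }

    boolean matches(double value) {
        return lo <= value && value <= hi;
    }

    /** True if every value matched by other is also matched by this filter. */
    boolean subsumes(Range other) {
        return lo <= other.lo && other.hi <= hi;
    }
}

/** Hypothetical subscription whose filter is a parameter that can be replaced in place. */
final class ParametricSubscription {
    private final AtomicReference<Range> range;

    ParametricSubscription(Range initial) {
        this.range = new AtomicReference<>(initial);
    }

    /** Adapt the subscription without cancelling and re-issuing it. */
    void update(Range newRange) {
        range.set(newRange);
    }

    boolean matches(double value) {
        return range.get().matches(value);
    }

    Range current() {
        return range.get();
    }
}

class SubscriptionDemo {
    public static void main(String[] args) {
        ParametricSubscription priceWatch = new ParametricSubscription(new Range(10, 20));
        System.out.println("15 matches before update: " + priceWatch.matches(15));
        priceWatch.update(new Range(20, 30));   // the parameter moves, the subscription stays
        System.out.println("15 matches after update: " + priceWatch.matches(15));

        Range broad = new Range(0, 100);
        System.out.println("broad subsumes current: " + broad.subsumes(priceWatch.current()));
    }
}
```

Subsumption matters because a broker that already forwards events for a broader filter does not need to propagate a narrower one further upstream; the sketch shows only the test itself, not that routing decision.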