My areas of interest span Operating Systems and Distributed Systems. Within these disciplines, I work mainly in the areas of resource management, scheduling, and performance analysis. I am interested in performance issues in a variety of distributed systems: Clouds, Volunteer Grids, Data centers, and Mobile platforms. The key focus of my work has been to provide system support for data- and compute-intensive applications, and to make these systems scalable, self-managing, and reliable.
A complete list of my publications is available here. Please note that this list is generally more up-to-date than the list of projects below.
Currently, I am working on a number of projects involving data-intensive computing and user-cloud interactions. Some of my current projects include:
Geo-Distributed Data-intensive Computing: Across a large number of application domains that include web analytics, social analytics, scientific computing, and energy analytics, large quantities of data are generated from disparate sources such as users, devices, and sensors located around the globe. Much of this data needs to be processed and analyzed quickly to extract timely information, leading to tradeoffs in cost, performance, and accuracy. In this project, we are developing new scheduling algorithms and resource management techniques for optimizing data-intensive analytics, covering both batch and stream computing, in geo-distributed (wide-area) environments.
MESH: With the rapid growth of large online social networks, the ability to analyze large-scale social structure and behavior has become critically important, and this has led to the development of several scalable graph processing systems. In reality, social interaction takes place not just between pairs of individuals, but rather in the context of multi-user groups. Research has shown that such group dynamics can be better modeled through hypergraphs: a generalization of graphs in which a single edge can connect any number of vertices. In this project, we are building MESH (Minnesota Engine for Scalable evolving Hypergraph analysis): a framework of algorithms and system components to support scalable analysis of evolving hypergraphs.
June 2017: Source code for MESH v1.0 released. Please check it out.
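To make the hypergraph model mentioned in the MESH description concrete, here is a minimal Python sketch of a hypergraph in which each hyperedge stands for a group and can contain any number of members. This is purely illustrative and is not the MESH API or its internal representation; the class name `Hypergraph`, its methods, and the sample group names are all hypothetical.

```python
from collections import defaultdict


class Hypergraph:
    """Toy hypergraph (hypothetical, not MESH): a hyperedge links any number of vertices."""

    def __init__(self):
        self.edges = {}                      # hyperedge id -> set of member vertices
        self.memberships = defaultdict(set)  # vertex -> set of hyperedge ids

    def add_hyperedge(self, edge_id, members):
        """Add a group, replacing any existing hyperedge with the same id."""
        # Drop stale memberships if this id was already in use.
        for v in self.edges.get(edge_id, ()):
            self.memberships[v].discard(edge_id)
        members = set(members)
        self.edges[edge_id] = members
        for v in members:
            self.memberships[v].add(edge_id)

    def degree(self, vertex):
        """Number of hyperedges (groups) the vertex belongs to."""
        return len(self.memberships.get(vertex, ()))

    def neighbors(self, vertex):
        """Vertices that share at least one hyperedge with `vertex`."""
        result = set()
        for edge_id in self.memberships.get(vertex, ()):
            result |= self.edges[edge_id]
        result.discard(vertex)
        return result


# Sample data (made up): two overlapping groups.
hg = Hypergraph()
hg.add_hyperedge("study_group", ["alice", "bob", "carol"])
hg.add_hyperedge("band", ["carol", "dave"])
```

A plain graph would have to break the three-person group into three pairwise edges and lose the fact that it was one group; keeping it as a single hyperedge preserves that information, which is the kind of group structure the project description refers to.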
Wiera: Cloud providers offer an array of storage services that represent different points along the performance, cost, and durability spectrum. Further, such storage services are provided across multiple data center locations that are dispersed geographically. To take advantage of the combined benefits of multiple storage tiers and locations, applications have traditionally had to identify the desired locations themselves, and to manage the complexity of the different interfaces to these storage services and their diverse policies. In this project, we are building a middleware that enables the provision of multi-tiered, multi-data center cloud storage instances that are easy to specify, flexible, and enable a rich array of storage policies and desired metrics to be realized.
Nebula: Centralized cloud infrastructures have traditionally been used as the computational platform for data-intensive computing. However, they suffer from inefficient data mobility due to the centralization of cloud resources, as well as a high cost of execution, and hence are poorly suited for dispersed-data-intensive applications, where the data may be spread across multiple geographical locations. In this project, we are building Nebula: a dispersed, low-cost cloud infrastructure that uses voluntary edge resources for both computation and data storage. The lightweight Nebula architecture enables distributed data-intensive computing through a number of optimizations including location-aware data and computation placement, replication, and recovery.
Mobilizing the Cloud: Users increasingly rely on mobile devices for much of their computational and information needs, leading to the advent of a growing number of rich mobile applications. However, mobile devices are inherently limited in their compute and storage capacities and battery power. The abundance of compute and storage resources available in the cloud makes it well-suited to addressing the limitations of mobile devices. In this project, we are building a mobile/cloud computing framework that explores the use of cloud infrastructure to accelerate mobile applications using data-driven, context-dependent optimization techniques.
Some earlier projects that I worked on: