
 Tian Guo

Assistant Professor
Computer Science Department
Worcester Polytechnic Institute
100 Institute Road
Worcester, MA 01609
Office: Fuller Labs B23

tian@cs.wpi.edu

(508) 831-6860

github.com/belindanju


Projects

As a proud member of LASS, I have been fortunate to work on many interesting research projects on building a green future cloud architecture. Here is a summary of my most recent efforts. I hope you find these projects useful, or at least interesting! Drop me an email if you have comments or would like to collaborate in any of these areas.

  • Cloud Cost Reduction: It is always about money, isn't it?

    One key benefit of cloud platforms is the ability to acquire resources on demand to handle peak workloads. For enterprises with existing IT infrastructure, it is not clear how to transition to cloud platforms cost-effectively. I built a cloud bursting system that answers questions such as when, and how much, workload to move from a private data center to public clouds, and automates the process. This system serves as a building block for investigating problems such as pooling cloud resources.

    Of course, as cloud customers (people who host cloud services of some sort), we constantly face cost and performance trade-offs. Can we get away with 10 servers, 100 servers or even 1000 servers? And what about our monthly bill? Yes, budgets are a real-world problem. To help out, I did some work to exploit a type of very cheap but volatile cloud resource: spot servers. The high-level idea is to provide system- and application-level mechanisms that allow customers to run interactive, batch and batch-interactive applications on spot servers for as long as possible. Such mechanisms are guided by our risk-aware, cost-effective policies.

  • Mobility Support in the Cloud: Providing better services for mobile users.

    Traffic from mobile devices has surpassed other types and is starting to dominate cloud traffic. Despite growing computation and storage, mobile devices' capabilities are still limited by their battery life. Also, today's mobile devices are equipped with at least two network interfaces, one for WiFi and the other for cellular. It is not just about the devices, either: mobile users tend to move around, which introduces mobility problems. The bottom line is that mobile traffic has unique characteristics that current cloud platforms support poorly. I think it is very interesting to enhance cloud platforms' support for mobile applications by taking these characteristics into account. As part of my effort, I worked on VMShadow, which moves cloud applications closer to users, thus improving performance. The techniques proposed in VMShadow are general and could potentially be applied to other mobility problems. I would love to work more in this area.

  • Geo-elasticity for Global Workload: I say automation is always preferred.

    As applications serve more users from geographically distributed locations, their workloads exhibit spatial dynamics in addition to temporal ones. What exactly are spatial workload dynamics? Imagine a global application that needs to handle different amounts of requests from various locations on a particular day. For example, application traffic from the United States might spike on Black Friday, while traffic from China might increase on Singles' Day, i.e., 11.11. Handling workload dynamics effectively requires provisioning enough server resources in data centers that are closer to the workload spikes. Manually selecting cloud sites and setting up servers is tedious and time-consuming. So, I worked on two systems, DBScale and GeoScale, that automate the scaling process across geographically distributed data centers. This ability is referred to as geo-elasticity and is an important step towards a mobile-aware cloud platform.

  • Big Data Analytics Framework: Towards faster and cheaper data processing.

    Today, data are literally everywhere, being collected, transferred and analyzed in cloud platforms. Tasks ranging from simple aggregation to interactive machine learning benefit from having access to a large amount of server resources and running in parallel. However, the trade-off between how many servers to use and how long to wait is often constrained by budgets. Luckily, cloud providers have rolled out cheaper resources that potentially allow us to rent 10x more servers at the same cost. But such low costs come with the risk of losing servers with only minutes of warning. I worked on SpotOn and Flint, which manage such risks for batch and interactive applications. I believe the mechanisms underlying these two systems are useful for providing cost-effective big data analytics in cloud environments.

  • Green Data Center: It is not just about cutting electricity bills.

    Data centers consume humongous amounts of electricity every year. As more data centers are built, electricity consumption will continue to increase if no precautions are taken. Companies like Google, Facebook and Apple are constantly working on improving data center energy efficiency and publishing their power usage effectiveness (PUE) values. Because such PUEs often correspond to yearly averages, they contain very limited information about how energy is consumed inside each data center. To gain insights into energy efficiency, my co-authors and I analyzed MGHPCC, a state-of-the-art 15MW green data center that incorporates many of the technological advances used in commercial data centers. I believe such insights are useful in helping the research community design and evaluate new energy-efficiency optimizations.
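    The PUE metric mentioned above reduces to a simple ratio, as do its water and carbon cousins (WUE and CUE) used in our MGHPCC study. A minimal sketch, with made-up numbers rather than MGHPCC measurements:

```python
# Sketch: data center effectiveness metrics as simple ratios.
# PUE = total facility energy / IT energy (1.0 is the ideal),
# WUE = water consumed / IT energy, CUE = CO2 emitted / IT energy.
# All numbers below are hypothetical, not MGHPCC measurements.

def pue(total_facility_kwh, it_kwh):
    return total_facility_kwh / it_kwh

def wue(water_liters, it_kwh):
    return water_liters / it_kwh   # liters per kWh of IT load

def cue(co2_kg, it_kwh):
    return co2_kg / it_kwh         # kg CO2 per kWh of IT load

# Hypothetical month: 1.2 GWh facility total against 1.0 GWh of IT load,
# i.e., 20% overhead for cooling, power distribution, lighting, etc.
print(pue(1_200_000, 1_000_000))
print(wue(450_000, 1_000_000))
print(cue(300_000, 1_000_000))
```

    A yearly-average PUE hides exactly the structure these per-period ratios expose, which is why seasonal breakdowns are more informative.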


Publications [Google Scholar Profile] [Bibtex Entries]

On the Feasibility of Cloud-Based SDN Controllers for Residential Networks
Curtis R. Taylor, Tian Guo, Craig A. Shue, and Mohamed E. Najd
To appear in 2017 IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN'17)

Towards Efficient Deep Inference for Mobile Applications
Tian Guo
arXiv:1707.04610
abstract

Mobile applications are benefiting significantly from advances in deep learning, e.g., providing new features. Given a trained deep learning model, applications usually need to perform a series of matrix operations based on the input data in order to infer possible output values. Because of model computation complexity and increased model sizes, these trained models are usually hosted in the cloud. When mobile apps need to utilize these models, they have to send input data over the network. While cloud-based deep learning can provide reasonable response times for mobile apps, it also restricts the use case scenarios, e.g., mobile apps need to have access to the network. With mobile-specific deep learning optimizations, it is now possible to employ device-based inference. However, because mobile hardware, e.g., GPU and memory size, can be very different from and more limited than its desktop counterpart, it is important to understand the feasibility of this new device-based deep learning inference architecture. In this paper, we empirically evaluate the inference efficiency of three Convolutional Neural Networks using a benchmark Android application we developed. Based on our application-driven analysis, we have identified several performance bottlenecks for mobile applications powered by on-device deep learning inference.

Paper
Performance and Cost Considerations for Providing Geo-Elasticity in Database Clouds
Tian Guo, Prashant Shenoy
To appear in Transactions on Autonomous and Adaptive Systems (TAAS'17)
abstract

Online applications that serve a global workload have become the norm, and these applications experience not only temporal but also spatial workload variations. In addition, more applications are hosting their backend tiers separately for benefits such as ease of management. To provision for such applications, traditional elasticity approaches that only consider temporal workload dynamics and assume well-provisioned backends are insufficient. Instead, in this paper, we propose a new type of provisioning mechanism---geo-elasticity---by utilizing distributed clouds with different locations. Centered on this idea, we build a system called DBScale that tracks geographic variations in the workload to dynamically provision database replicas at different cloud locations across the globe. Our geo-elastic provisioning approach comprises a regression-based model that infers database query workload from spatially distributed front-end workload, a two-node open queueing network model that estimates the capacity of databases serving both CPU- and I/O-intensive query workloads, and greedy algorithms for selecting the best cloud locations based on latency and cost. We implement a prototype of our DBScale system on Amazon EC2’s distributed cloud. Our experiments with our prototype show up to a 66% improvement in response time when compared to local elasticity approaches.
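The two-node open queueing network mentioned in the abstract can be illustrated with a textbook tandem M/M/1 approximation (a CPU node followed by a disk I/O node). The sketch below uses hypothetical service and arrival rates and is not DBScale's actual model:

```python
# Sketch: estimating database response time with a two-node open queueing
# network, treating the CPU and the disk I/O stages each as an M/M/1 queue.
# All rates are hypothetical queries (or I/Os) per second; this is a
# textbook approximation, not DBScale's actual model.

def mm1_response_time(arrival_rate, service_rate):
    """Mean time in an M/M/1 queue; requires arrival_rate < service_rate."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate >= service rate")
    return 1.0 / (service_rate - arrival_rate)

def db_response_time(query_rate, cpu_rate, io_rate, io_visits_per_query=1.0):
    """Total response time = time at the CPU node + time at the I/O node."""
    io_arrival = query_rate * io_visits_per_query
    return (mm1_response_time(query_rate, cpu_rate)
            + io_visits_per_query * mm1_response_time(io_arrival, io_rate))

# Response time grows sharply as load approaches either node's capacity:
for qps in (50, 80, 95):
    print(qps, round(db_response_time(qps, cpu_rate=100, io_rate=120), 4))
```

A capacity model like this answers the provisioning question directly: add replicas (raising the effective service rates) until the estimated response time drops back under the target SLA.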

Latency-aware Virtual Desktops Optimization in Distributed Clouds
Tian Guo, Prashant Shenoy, K. K. Ramakrishnan, Vijay Gopalakrishnan
To appear in Multimedia Systems (MMSJ'17)
abstract

Distributed clouds offer a choice of data center locations for providers to host their applications. In this paper we consider distributed clouds that host virtual desktops which are then accessed by users through remote desktop protocols. Virtual desktops have different levels of latency-sensitivity, primarily determined by the actual applications running and affected by the end users’ locations. In the scenario of mobile users, even switching between 3G and WiFi networks affects the latency sensitivity. We design VMShadow, a system to automatically optimize the location and performance of latency-sensitive VMs in the cloud. VMShadow performs black-box fingerprinting of a VM’s network traffic to infer the latency-sensitivity and employs both ILP and greedy heuristic based algorithms to move highly latency-sensitive VMs to cloud sites that are closer to their end users. VMShadow employs a WAN-based live migration and a new network connection migration protocol to ensure that the VM migration and subsequent changes to the VM’s network address are transparent to end-users. We implement a prototype of VMShadow in a nested hypervisor and demonstrate its effectiveness for optimizing the performance of VM-based desktops in the cloud. Our experiments on a private as well as the public EC2 cloud show that VMShadow is able to discriminate between latency-sensitive and insensitive desktop VMs and judiciously moves only those that will benefit the most from the migration. For desktop VMs with video activity, VMShadow improves VNC’s refresh rate by 90% by migrating virtual desktop to the closer location. Transcontinental remote desktop migrations only take about 4 minutes and our connection migration proxy imposes 13µs overhead per packet.

Paper
Managing Risk in a Derivative IaaS Cloud
Prateek Sharma, Stephen Lee, Tian Guo, David Irwin, and Prashant Shenoy
To appear in Transactions on Parallel and Distributed Systems (TPDS'17)
abstract

Infrastructure-as-a-Service (IaaS) cloud platforms rent computing resources with different cost and availability tradeoffs. For example, users may acquire virtual machines (VMs) in the spot market that are cheap, but can be unilaterally terminated by the cloud operator. Because of this revocation risk, spot servers have conventionally been used for delay- and risk-tolerant batch jobs. In this paper, we develop risk mitigation policies which allow even interactive applications to run on spot servers.

Our system, SpotCheck, is a derivative cloud platform that provides the illusion of an IaaS platform offering always-available VMs on demand for a cost near that of spot servers, and supports unmodified applications. SpotCheck’s design combines virtualization-based mechanisms for fault tolerance with bidding and server selection policies for managing risk and cost. We implement SpotCheck on EC2 and show that it i) provides nested VMs with 99.9989% availability, ii) achieves nearly 5× cost savings compared to using on-demand VMs, and iii) eliminates any risk of losing VM state.

Paper
Placement Strategies for Virtualized Network Functions in a NFaaS Cloud
Xin He, Tian Guo, Erich Nahum and Prashant Shenoy
Fourth IEEE Workshop on Hot Topics in Web Systems and Technologies (HotWeb'16)
abstract

Enterprises that host services in the cloud need to protect their cloud resources using network services such as firewalls and deep packet inspection systems. While middleboxes have typically been used to implement such network functions in traditional enterprise networks, their use in cloud environments by cloud tenants is problematic due to the boundary between cloud providers and cloud tenants. Instead, we argue that network function virtualization is a natural fit in cloud environments, where the cloud provider can implement Network Functions as a Service using virtualized network functions running on cloud servers, and enterprise cloud tenants can employ these services to implement security and performance optimizations for their cloud resources. In this paper, we focus on placement issues in the design of a NFaaS cloud and present two placement strategies---tenant-centric and service-centric---for deploying virtualized network services in multi-tenant settings. We discuss several trade-offs of these two strategies. We implement a prototype NFaaS testbed and conduct a series of experiments to quantify the benefits and drawbacks of our two strategies. Our results suggest that the tenant-centric placement provides lower latencies, while the service-centric approach is more flexible for reconfiguration and capacity scaling.

Paper
Elastic Resource Management in Distributed Clouds
Tian Guo
Ph.D. thesis, University of Massachusetts Amherst.
abstract

The ubiquitous nature of computing devices and their increasing reliance on remote resources have driven and shaped public cloud platforms into unprecedented large-scale, distributed data centers. Concurrently, a plethora of cloud-based applications are experiencing multi-dimensional workload dynamics—workload volumes that vary along both time and space axes and with higher frequency. The interplay of diverse workload characteristics and distributed clouds raises several key challenges for efficiently and dynamically managing server resources. First, current cloud platforms impose certain restrictions that might hinder some resource management tasks. Second, an application-agnostic approach might not entail appropriate performance goals and therefore requires numerous application-specific methods. Third, provisioning resources outside the LAN boundary might incur huge delays, which would impact the desired agility. In this dissertation, I investigate the above challenges and present the design of automated systems that manage resources for various applications in distributed clouds. The intermediate goal of these automated systems is to fully exploit potential benefits such as reduced network latency offered by increasingly distributed server resources. The ultimate goal is to improve end-to-end user response time with novel resource management approaches, within a certain cost budget. Centered around these two goals, I first investigate how to optimize the location and performance of virtual machines in distributed clouds. I use virtual desktops, mostly serving a single user, as an example use case for developing a black-box approach that ranks virtual machines based on their dynamic latency requirements. Those with high latency sensitivities have a higher priority of being placed or migrated to a cloud location closest to their users.
Next, I relax the assumption of well-provisioned virtual machines and look at how to provision enough resources for applications that exhibit both temporal and spatial workload fluctuations. I propose an application-agnostic queueing model that captures the resource utilization and server response time. Building upon this model, I present a geo-elastic provisioning approach—referred to as geo-elasticity—for replicable multi-tier applications that can spin up an appropriate amount of server resources at any cloud location. Last, I explore the benefits of providing geo-elasticity for database clouds, a popular platform for hosting application backends. Performing geo-elastic provisioning for backend database servers entails several challenges that are specific to database workloads, and therefore requires tailored solutions. In addition, cloud platforms offer resources at various prices for different locations. Towards this end, I propose a cost-aware geo-elasticity that combines a regression-based workload model and a queueing network capacity model for database clouds. In summary, hosting a diverse set of applications in an increasingly distributed cloud makes it interesting and necessary to develop new, efficient and dynamic resource management approaches.

Paper
Flint: Batch-Interactive Data-Intensive Processing on Transient Servers
Prateek Sharma, Tian Guo, Xin He, David Irwin, Prashant Shenoy
Proceedings of the Eleventh European Conference on Computer Systems (EuroSys'16)
abstract

Cloud providers now offer transient servers, which they may revoke at any time, for significantly lower prices than on-demand servers, which they cannot revoke. Transient servers’ low price is particularly attractive for executing an emerging class of workload, which we call Batch-Interactive Data-Intensive (BIDI), that is becoming increasingly important for data analytics. BIDI workloads require large sets of servers to cache massive datasets in memory to enable low latency operation. In this paper, we illustrate the challenges of executing BIDI workloads on transient servers, where revocations (akin to failures) are the common case. To address these challenges, we design Flint, which is based on Spark and includes automated checkpointing and server selection policies that i) support batch and interactive applications and ii) dynamically adapt to application characteristics. We evaluate a prototype of Flint using EC2 spot instances, and show that it yields cost savings of up to 90% compared to using on-demand servers, while increasing running time by < 2%.

Paper
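Flint's checkpointing policies adapt to application characteristics. As general background on checkpointing under revocations (not Flint's actual policy, and with hypothetical numbers), the classic Young approximation gives a near-optimal checkpoint interval:

```python
import math

# Sketch: the Young approximation for a near-optimal checkpoint interval,
#   interval ~= sqrt(2 * MTTF * checkpoint_cost),
# where MTTF is the mean time between failures (here, spot revocations).
# Shown as general background only; this is not Flint's actual policy,
# and the numbers below are hypothetical.

def young_checkpoint_interval(mttf_seconds, checkpoint_cost_seconds):
    return math.sqrt(2.0 * mttf_seconds * checkpoint_cost_seconds)

# A spot server revoked on average every 2 hours, with 30 s checkpoints:
interval = young_checkpoint_interval(mttf_seconds=2 * 3600,
                                     checkpoint_cost_seconds=30)
print(f"checkpoint every ~{interval / 60:.0f} minutes")
```

The intuition: checkpointing too often wastes time on checkpoints, too rarely wastes time redoing lost work; the square root balances the two, and shorter MTTFs (riskier spot markets) push the interval down.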
GeoScale: Providing Geo-Elasticity in Distributed Clouds
Tian Guo, Prashant Shenoy, Hakan Hacigumus
Proceedings of 2016 IEEE International Conference on Cloud Engineering (IC2E'16)
abstract

Distributed cloud platforms are well suited for serving a geographically diverse user base. However, traditional cloud provisioning mechanisms that make local scaling decisions are not well suited for the temporal and spatial workload fluctuations seen by modern web applications. In this paper, we argue for the need for geo-elasticity and present GeoScale, a system to provide geo-elasticity in distributed clouds. We describe GeoScale’s model-driven proactive provisioning approach and conduct an initial evaluation of GeoScale on Amazon’s distributed EC2 cloud. Our results show up to 31% improvement in the 95th percentile response time when compared to traditional elasticity techniques.

Paper
Analyzing the Efficiency of a Green University Data Center
Patrick Pegus II, Benoy Varghese, Tian Guo, David Irwin, Prashant Shenoy, Anirban Mahanti, James Culbert, John Goodhue, Chris Hill
Proceedings of 2016 ACM International Conference on Performance Engineering (ICPE'16)
abstract

Data centers are an indispensable part of today’s IT infrastructure. To keep pace with modern computing needs, data centers continue to grow in scale and consume increasing amounts of power. While prior work on data centers has led to significant improvements in their energy-efficiency, detailed measurements from these facilities’ operations are not widely available, as data center design is often considered part of a company’s competitive advantage. However, such detailed measurements are critical to the research community in motivating and evaluating new energy-efficiency optimizations. In this paper, we present a detailed analysis of a state-of-the-art 15MW green multi-tenant data center that incorporates many of the technological advances used in commercial data centers. We analyze the data center’s computing load and its impact on power, water, and carbon usage using standard effectiveness metrics, including PUE, WUE, and CUE. Our results reveal the benefits of optimizations, such as free cooling, and provide insights into how the various effectiveness metrics change with the seasons and increasing capacity usage. More broadly, our PUE, WUE, and CUE analysis validate the green design of this LEED Platinum data center.

Paper
SpotOn: A Batch Computing Service for the Spot Market
Supreeth Subramanya, Tian Guo, Prateek Sharma, David Irwin, and Prashant Shenoy
Proceedings of the 6th Annual Symposium on Cloud Computing (SoCC'15)
abstract

Cloud spot markets enable users to bid for compute resources, such that the cloud platform may revoke them if the market price rises too high. Due to their increased risk, revocable resources in the spot market are often significantly cheaper (by as much as 10X) than the equivalent non-revocable on-demand resources. One way to mitigate spot market risk is to use various fault-tolerance mechanisms, such as checkpointing or replication, to limit the work lost on revocation. However, the additional performance overhead and cost for a particular fault-tolerance mechanism is a complex function of both an application’s resource usage and the magnitude and volatility of spot market prices.

We present the design of a batch computing service for the spot market, called SpotOn, that automatically selects a spot market and fault-tolerance mechanism to mitigate the impact of spot revocations without requiring application modification. SpotOn’s goal is to execute jobs with the performance of on-demand resources, but at a cost near that of the spot market. We implement and evaluate SpotOn in simulation and using a prototype on Amazon’s EC2 that packages jobs in Linux Containers. Our simulation results using a job trace from a Google cluster indicate that SpotOn lowers costs by 91.9% compared to using on-demand resources with little impact on performance.

Paper Slides Poster
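To see why a fault-tolerance mechanism can make 10X-cheaper spot servers worthwhile, consider a toy expected-cost model. The prices, revocation rate, and loss model below are hypothetical and far simpler than SpotOn's actual policies:

```python
# Sketch: expected cost of a batch job on revocable spot servers vs.
# non-revocable on-demand servers. The loss model is crude: each
# revocation costs, on average, half a checkpoint interval of redone
# work. Prices and the revocation rate are hypothetical.

def expected_job_cost(base_hours, price_per_hour,
                      revocations_per_hour=0.0,
                      checkpoint_interval_hours=0.0):
    # Expected fraction of time spent redoing work after revocations.
    lost_fraction = revocations_per_hour * checkpoint_interval_hours / 2
    expected_runtime = base_hours / (1.0 - lost_fraction)
    return expected_runtime * price_per_hour

on_demand = expected_job_cost(base_hours=10, price_per_hour=0.50)
spot = expected_job_cost(base_hours=10, price_per_hour=0.05,
                         revocations_per_hour=0.1,
                         checkpoint_interval_hours=1.0)
print(f"on-demand: ${on_demand:.2f}  spot: ${spot:.2f}")
```

Even with the overhead of redone work, the order-of-magnitude price gap dominates, which is the economic core of systems like SpotOn: pay a small performance tax in exchange for spot prices.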
Model-driven Geo-Elasticity In Database Clouds
Tian Guo and Prashant Shenoy
International Conference on Autonomic Computing and Communications (ICAC'15)
abstract

Motivated by the emergence of distributed clouds, we argue for the need for geo-elastic provisioning of application replicas to effectively handle temporal and spatial workload fluctuations seen by such applications. We present DBScale, a system that tracks geographic variations in the workload to dynamically provision database replicas at different cloud locations across the globe. Our geo-elastic provisioning approach comprises a regression-based model to infer the database query workload from observations of the spatially distributed frontend workload and a two-node open queueing network model to provision databases with both CPU and I/O-intensive query workloads. We implement a prototype of our DBScale system on Amazon EC2’s distributed cloud. Our experiments with our prototype show up to a 66% improvement in response time when compared to local elasticity approaches.

Paper Slides
SpotCheck: Designing a Derivative IaaS Cloud on the Spot Market
Prateek Sharma, Stephen Lee, Tian Guo, David Irwin, and Prashant Shenoy
Proceedings of the Tenth European Conference on Computer Systems (EuroSys'15)
abstract

Infrastructure-as-a-Service (IaaS) cloud platforms rent resources, in the form of virtual machines (VMs), under a variety of contract terms that offer different levels of risk and cost. For example, users may acquire VMs in the spot market that are often cheap but entail significant risk, since their price varies over time based on market supply and demand and they may terminate at any time if the price rises too high. Currently, users must manage all the risks associated with using spot servers. As a result, conventional wisdom holds that spot servers are only appropriate for delay-tolerant batch applications. In this paper, we propose a derivative cloud platform, called SpotCheck, that transparently manages the risks associated with using spot servers for users.

SpotCheck provides the illusion of an IaaS platform that offers always-available VMs on demand for a cost near that of spot servers, and supports all types of applications, including interactive ones. SpotCheck’s design combines the use of nested VMs with live bounded-time migration and novel server pool management policies to maximize availability, while balancing risk and cost. We implement SpotCheck on Amazon’s EC2 and show that it i) provides nested VMs to users that are 99.9989% available, ii) achieves nearly 5X cost savings compared to using equivalent types of on-demand VMs, and iii) eliminates any risk of losing VM state.

Paper Slides Poster
Cost-Aware Cloud Bursting for Enterprise Applications
Tian Guo, Upendra Sharma, Prashant Shenoy, Timothy Wood, and Sambit Sahu
ACM Transactions on Internet Technology, Volume 13 Issue 3, May 2014, Article No. 10 (TOIT'14)
abstract

The high cost of provisioning resources to meet peak application demands has led to the widespread adoption of pay-as-you-go cloud computing services to handle workload fluctuations. Some enterprises with existing IT infrastructure employ a hybrid cloud model where the enterprise uses its own private resources for the majority of its computing, but then “bursts” into the cloud when local resources are insufficient. However, current commercial tools rely heavily on the system administrator’s knowledge to answer key questions such as when a cloud burst is needed and which applications must be moved to the cloud. In this paper we describe Seagull, a system designed to facilitate cloud bursting by determining which applications should be transitioned into the cloud and automating the movement process at the proper time. Seagull optimizes the bursting of applications using an optimization algorithm as well as a more efficient but approximate greedy heuristic. Seagull also optimizes the overhead of deploying applications into the cloud using an intelligent precopying mechanism that proactively replicates virtualized applications, lowering the bursting time from hours to minutes. Our evaluation shows over 100% improvement compared to naive solutions, though our greedy heuristic produces more expensive solutions than the ILP; however, the scalability of the greedy algorithm is dramatically better as the number of VMs increases. Our evaluation illustrates scenarios where our prototype can reduce cloud costs by more than 45% when bursting to the cloud, and shows that the incremental cost added by precopying applications is offset by a burst time reduction of nearly 95%.

Paper
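The flavor of Seagull's greedy heuristic can be sketched as ranking applications by their cloud cost per unit of local capacity freed. The application data below are hypothetical, and the real algorithm also weighs factors omitted here, such as precopy migration overheads:

```python
# Sketch: a greedy cloud-bursting heuristic in the spirit of Seagull.
# To free `capacity_needed` units of local capacity, burst the
# applications that are cheapest to run in the cloud per unit of local
# capacity they free. Application data are hypothetical, and real
# placement must also account for migration (precopy) costs.

def choose_apps_to_burst(apps, capacity_needed):
    """apps: list of (name, local_capacity_used, cloud_cost_per_hour)."""
    ranked = sorted(apps, key=lambda a: a[2] / a[1])  # $/hour per unit freed
    chosen, freed = [], 0
    for name, capacity_used, _cost in ranked:
        if freed >= capacity_needed:
            break
        chosen.append(name)
        freed += capacity_used
    return chosen, freed

apps = [("web", 4, 2.0), ("batch", 8, 1.0), ("db", 6, 6.0)]
print(choose_apps_to_burst(apps, capacity_needed=10))
```

An ILP can beat this ranking by packing applications more precisely, but, as the abstract notes, the greedy approach scales far better as the number of VMs grows.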
VMShadow: Optimizing the Performance of Latency-sensitive Virtual Desktops in Distributed Clouds
Tian Guo, Vijay Gopalakrishnan, K. K. Ramakrishnan, Prashant Shenoy, Arun Venkataramani, and Seungjoon Lee
Proceedings of the 5th ACM Multimedia Systems Conference (MMSys'14)
abstract

Distributed clouds offer a choice of data center locations to application providers to host their applications. In this paper we consider distributed clouds that host virtual desktops which are then accessed by their users through remote desktop protocols. We argue that virtual desktops that run latency-sensitive applications such as games or video players are particularly sensitive to the choice of the cloud data center location. We design VMShadow, a system to automatically optimize the location and performance of location-sensitive virtual desktops in the cloud. VMShadow performs black-box fingerprinting of a VM’s network traffic to infer its location-sensitivity and employs a greedy heuristic based algorithm to move highly location-sensitive VMs to cloud sites that are closer to their end-users. VMShadow employs WAN-based live migration and a new network connection migration protocol to ensure that the VM migration and subsequent changes to the VM’s network address are transparent to end-users. We implement a prototype of VMShadow in a nested hypervisor and demonstrate its effectiveness for optimizing the performance of VM-based desktops in the cloud. Our experiments on a private and the public EC2 cloud show that VMShadow is able to discriminate between location-sensitive and insensitive desktop applications and judiciously move only those VMs that will benefit the most. For desktop VMs with video activity, VMShadow improves VNC’s refresh rate by 90%. Further our connection migration proxy, which utilizes dynamic rewriting of packet headers, imposes a rewriting overhead of only 13µs per packet. Trans-continental VM migrations take about 4 minutes.

Paper
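VMShadow's approach can be caricatured as scoring VMs by latency sensitivity and greedily giving the most sensitive ones the scarce slots at cloud sites near their users. The score weights and VM data below are hypothetical; the real system infers sensitivity by black-box fingerprinting of live network traffic:

```python
# Sketch: rank desktop VMs by a latency-sensitivity score and greedily
# assign the most sensitive VMs to free slots at the cloud site nearest
# their users, loosely in the spirit of VMShadow's greedy heuristic.
# The weights and VM data are hypothetical; VMShadow itself derives
# sensitivity from a VM's network traffic fingerprint.

def sensitivity_score(interactive_pps, video_fps):
    # Illustrative weights: video frames count more than interactive packets.
    return 1.0 * interactive_pps + 5.0 * video_fps

def plan_migrations(vms, free_slots):
    """vms: (name, interactive_pps, video_fps, nearest_site);
    free_slots: nearest_site -> number of free VM slots at that site."""
    plan = []
    ranked = sorted(vms, key=lambda v: -sensitivity_score(v[1], v[2]))
    for name, pps, fps, site in ranked:
        if free_slots.get(site, 0) > 0:   # site has room: migrate there
            free_slots[site] -= 1
            plan.append((name, site))
    return plan                           # VMs not in the plan stay put

vms = [("vm-video", 10, 30, "us-east"),
       ("vm-idle", 1, 0, "us-east"),
       ("vm-ssh", 40, 0, "eu-west")]
print(plan_migrations(vms, {"us-east": 1, "eu-west": 1}))
```

The point of the ranking is the one made in the abstract: with limited capacity near users, move only the VMs that benefit the most, and leave insensitive VMs where they are.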
VMShadow: Optimizing The Performance of Virtual Desktops in Distributed Clouds
Tian Guo, Vijay Gopalakrishnan, K. K. Ramakrishnan, Prashant Shenoy, Arun Venkataramani, and Seungjoon Lee
Proceedings of the 4th Annual Symposium on Cloud Computing (SOCC '13)
abstract

We present VMShadow, a system that automatically optimizes the location and performance of applications based on their dynamic workloads. We prototype VMShadow and demonstrate its efficacy using VM-based desktops in the cloud as an example application. Our experiments on a private cloud as well as the EC2 cloud, using a nested hypervisor, show that VMShadow is able to discriminate between location-sensitive and location-insensitive desktop VMs and judiciously moves only those that will benefit the most from the migration. For example, VMShadow performs transcontinental VM migrations in ∼4 mins and can improve VNC’s video refresh rate by up to 90%.

Paper Poster
Seagull: Intelligent Cloud Bursting for Enterprise Applications
Tian Guo, Upendra Sharma, Timothy Wood, Sambit Sahu, and Prashant Shenoy
Proceedings of the 2012 USENIX conference on Annual Technical Conference (ATC'12)
abstract

Enterprises with existing IT infrastructure are beginning to employ a hybrid cloud model where the enterprise uses its own private resources for the majority of its computing, but then “bursts” into the cloud when local resources are insufficient. However, current approaches to cloud bursting cannot be effectively automated because they heavily rely on system administrator knowledge to make decisions. In this paper we describe Seagull, a system designed to facilitate cloud bursting by determining which applications can be transitioned into the cloud most economically, and automating the movement process at the proper time. We further optimize the deployment of applications into the cloud using an intelligent precopying mechanism that proactively replicates virtualized applications, lowering the bursting time from hours to minutes. Our evaluation illustrates how our prototype can reduce cloud costs by more than 45% when bursting to the cloud, and the incremental cost added by precopying applications is offset by a burst time reduction of nearly 95%.

Paper Slides Videos