Measuring the Impact of Gradient Accumulation on Cloud-based Distributed Training
The 23rd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid'23)
Though GA is a commonly adopted technique for addressing the GPU memory shortage problem in model training, its benefits to model training have not been systematically studied. This paper evaluates and summarizes the benefits of GA, especially in cloud-based distributed training scenarios, where training cost is determined by both execution time and resource consumption.
LayerCake: Efficient Inference Serving with Cloud and Mobile Resources
The 23rd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid'23)
The landscape of DL inference has changed drastically since our first paper on mobile deep inference! Many mobile-oriented models have arisen, and more apps are leveraging DL models. This paper considers the dynamic inference execution environment and schedules each request to the best available resource.
Multi-Camera Lighting Estimation for Photorealistic Front-Facing Mobile Augmented Reality
The Twenty-fourth International Workshop on Mobile Computing Systems and Applications (Hotmobile'23)
We demonstrate the promise of dual-camera lighting estimation in improving rendering effects for virtual try-on AR applications. We also show that an existing SoTA lighting estimation model can't fully utilize the enlarged camera view.
FuncPipe: A Pipelined Serverless Framework for Fast and Cost-efficient Training of Deep Learning Models
Proceedings of ACM SIGMETRICS, 2023 (SIGMETRICS'23)
FuncPipe co-optimizes model partitioning and serverless resource allocation to reduce memory consumption and relieve the communication burden in distributed training. Further, we designed a pipelined scatter-reduce to simultaneously utilize downlink/uplink bandwidth.
LitAR: Visually Coherent Lighting for Mobile Augmented Reality
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT'22)
LitAR reconstructs a high-quality environment map using a mobile LiDAR sensor and RGB camera with a two-field reconstruction technique. LitAR thus supports features like reflection rendering and correct color tone. Further, our multi-resolution
Privacy-preserving Reflection Rendering for Augmented Reality
30th ACM International Conference on Multimedia (MM) (MM'22)
It is desirable to support visually coherent rendering for end-user-facing applications, such as AR content streaming. However, rendered reflections might reveal sensitive information about the physical space. This paper demonstrates the ease of such attacks and proposes two simple defense mechanisms with different visual impacts.
FusedAR: Adaptive Environment Lighting Reconstruction for Visually Coherent Mobile AR Rendering
IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW) (IEEEVRW'22)
A follow-up to Xihe, FusedAR provides lighting information that can be used for high-quality reflection rendering in mobile AR.
Multi-objective Optimization by Learning Space Partitions
International Conference on Learning Representations (ICLR'22)
A novel multi-objective optimizer that learns a model from observed samples to partition the search space and then focuses on promising regions that are likely to contain a subset of the Pareto frontier.
FiShNet: Fine-Grained Filter Sharing for Resource-Efficient Multi-Task Learning
ACM International Conference on Information and Knowledge Management (CIKM'21)
Have multiple deep learning tasks that you want to run on mobile devices? Our FiShNet circumvents the need to manage multiple DL models and provides flexible and fine-grained sharing among different tasks.
Many Models at the Edge: Scaling Deep Inference via Model-Level Caching
2nd IEEE International Conference on Autonomic Computing and Self-Organizing Systems (ACSOS'21)
Want to know how to effectively manage a large number of deep learning models, some popular and some rarely requested, at a resource-constrained edge? Check out our model-specific caching work CremeBrulee!
Enabling Sustainable Clouds: The Case for Virtualizing the Energy System
ACM Symposium on Cloud Computing 2021 (SoCC'21)
It is time to treat carbon as a first-class citizen when designing and managing data centers and clouds! Our vision paper outlines a roadmap with energy virtualization that leads us toward a near-zero-carbon future.
On the Future of Cloud Engineering
9th IEEE International Conference on Cloud Engineering (IC2E'21)
Quantifying and Improving Performance of Distributed Deep Learning with Cloud Storage
9th IEEE International Conference on Cloud Engineering (IC2E'21)
DELI 🥪 is a PyTorch-based prototype for enabling efficient distributed deep learning using cloud storage buckets. DELI can reduce the time that the training loop is waiting for data by 85.6% - 93.5% compared to loading from a storage bucket.
Memory-Efficient Deep Learning Inference in Trusted Execution Environments
9th IEEE International Conference on Cloud Engineering (IC2E'21)
Do you want to securely run unmodified DL models? TEEs help but can lead to more than 20X overhead! Check out our work that reduces the execution overhead to 1.09X!
Xihe: A 3D Vision-based Lighting Estimation Framework for Mobile Augmented Reality
The 19th ACM International Conference on Mobile Systems, Applications, and Services (MobiSys'21)
No more physical probes or undesirable visual effects! With our system Xihe, mobile AR developers can access accurate spatially-variant lighting estimation in ~20ms. All you need is a LiDAR-enabled device!
Few-shot Neural Architecture Search
Thirty-eighth International Conference on Machine Learning (ICML'21) long oral
Are you intrigued by one-shot NAS but worried about the inaccurate performance estimation? Try out our few-shot NAS! Few-shot NAS establishes new SoTAs, e.g., on ImageNet, it finds models that reach 80.5% top-1 accuracy at 600 MB FLOPS.
Sync-Switch: Hybrid Parameter Synchronization for Distributed Deep Learning
41st IEEE International Conference on Distributed Computing Systems (ICDCS'21)
This paper presents a hybrid synchronization approach that exploits the benefits of both BSP and ASP, i.e., reducing training time while simultaneously maintaining the converged accuracy.
PieSlicer: Dynamically Improving Response Time for Cloud-based CNN Inference
12th ACM/SPEC International Conference on Performance Engineering (ICPE'21)
The bottleneck for cloud-based inference can come down to poor mobile or network performance. PieSlicer improves response time by dynamically deciding where to preprocess the inference input based on empirically driven performance models.
CINET: Redesigning Deep Neural Networks for Efficient Mobile-Cloud Collaborative Inference
SIAM International Conference on Data Mining (SDM'21)
We design a collaboration-aware neural network called CiNet by considering the low on-device computation and network transmission cost from the outset. CiNet allows easy and efficient partitioning of inference computation across a mobile device and a remote server.
GRAD: Learning for Overhead-aware Adaptive Video Streaming with Scalable Video Coding
28th ACM International Conference on Multimedia (ACM MM'20)
We provide a new mechanism for bitrate adaptation algorithms, enabling finer-grained bitrate adjustments to both buffered and incoming video chunks. Our deep reinforcement learning based approach outperforms the state of the art, especially under highly variable network conditions.
Virtual reality streaming at the edge: a power perspective: poster
ACM/IEEE Symposium on Edge Computing (SEC'19)
EdgeServe: efficient deep learning model caching at the edge
ACM/IEEE Symposium on Edge Computing (SEC'19)
Characterizing the deep neural networks inference performance of mobile applications
arXiv (arXiv'19)
An experimental evaluation of garbage collectors on big data applications
The 45th International Conference on Very Large Data Bases (VLDB'19)
Cloud-based or On-device: An Empirical Study of Mobile Deep Inference
IEEE International Conference on Cloud Engineering (IC2E'18)
Providing geo-elasticity in geographically distributed clouds
Transactions on Internet Technology (TOIT'18)
Performance and cost considerations for providing geo-elasticity in database clouds
ACM Transactions on Autonomous and Adaptive Systems (TAAS'17)
On the feasibility of cloud-based SDN controllers for residential networks
IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN'17)
Managing risk in a derivative IaaS cloud
IEEE Transactions on Parallel and Distributed Systems (TPDS'17)
Placement Strategies for Virtualized Network Functions in a NFaaS Cloud
Fourth IEEE Workshop on Hot Topics in Web Systems and Technologies (HotWeb'16)
Flint: Batch-Interactive Data-Intensive Processing on Transient Servers
The Eleventh European Conference on Computer Systems (EuroSys'16)
GeoScale: Providing Geo-Elasticity in Distributed Clouds
IEEE International Conference on Cloud Engineering (IC2E'16)
Analyzing the Efficiency of a Green University Data Center
ACM International Conference on Performance Engineering (ICPE'16)
SpotOn: A Batch Computing Service for the Spot Market
The sixth ACM Symposium on Cloud Computing (SoCC'15)
Model-driven Geo-Elasticity In Database Clouds
International Conference on Autonomic Computing (ICAC'15)
SpotCheck: Designing a Derivative IaaS Cloud on the Spot Market
The Tenth European Conference on Computer Systems (EuroSys'15)
VMShadow: Optimizing The Performance of Latency-sensitive Virtual Desktops in Distributed Clouds
Proceedings of the 5th ACM Multimedia Systems Conference (MMSys'14)
VMShadow: Optimizing the performance of virtual desktops in distributed clouds
Proceedings of the 4th annual Symposium on Cloud Computing (SoCC'13)
Cost-aware Cloud Bursting for Enterprise Applications
Transactions on Internet Technology (TOIT'13)
Seagull: intelligent cloud bursting for enterprise applications
2012 USENIX Annual Technical Conference (USENIX ATC'12)