Efficient Mobile Deep Inference

An ever-increasing number of mobile applications leverage deep learning models to provide novel and useful features, such as real-time language translation and object recognition. However, the current mobile inference paradigm requires application developers to statically trade off inference accuracy against inference speed at development time. As a result, mobile user experience suffers under dynamic inference scenarios and heterogeneous device capabilities. The MODI project proposes new research in designing and implementing a mobile-aware deep inference platform that combines innovations in both algorithm and system optimizations.
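The static trade-off described above can be contrasted with a dynamic approach: choosing, at request time, the most accurate model whose expected latency fits the current budget. The sketch below illustrates this idea only; the model names, accuracy figures, and latency estimates are hypothetical placeholders, not measurements from the MODI project.

```python
# Hypothetical sketch of dynamic model selection: instead of fixing one
# model at development time, pick the most accurate model whose estimated
# latency fits the current latency budget. All numbers are illustrative.

MODELS = [
    # (name, top-1 accuracy, estimated on-device latency in ms)
    ("mobilenet_v2_0.35", 0.603, 25),
    ("mobilenet_v2_1.0", 0.718, 75),
    ("inception_v3", 0.780, 220),
]

def select_model(latency_budget_ms):
    """Return the most accurate model meeting the latency budget."""
    feasible = [m for m in MODELS if m[2] <= latency_budget_ms]
    if not feasible:
        # No model fits the budget; fall back to the fastest one.
        return MODELS[0]
    return max(feasible, key=lambda m: m[1])

print(select_model(100)[0])  # most accurate model under a 100 ms budget
```

Under a tight budget the selector degrades gracefully to a smaller, faster model, while a generous budget (or cloud offloading) allows a larger, more accurate one.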

Acknowledgements: This project is generously supported by NSF Grants #1755659 and #1815619, and by Google Cloud Research Credits.

Project Personnel

Papers

MDInference: Balancing Inference Accuracy and Latency for Mobile Applications.

Samuel S. Ogden and Tian Guo.

2020 IEEE International Conference on Cloud Engineering (IC2E'20), Invited paper

@inproceedings{mdinference_ic2e2020,
  title={{MDInference}: Balancing Inference Accuracy and Latency for Mobile Applications},
  author={Samuel S. Ogden and Tian Guo},
  booktitle={2020 IEEE International Conference on Cloud Engineering (IC2E)},
  year={2020},
}

Perseus: Characterizing Performance and Cost of Multi-Tenant Serving for CNN Models.

Matthew LeMay, Shijian Li, Tian Guo.

2020 IEEE International Conference on Cloud Engineering (IC2E'20)

@inproceedings{perseus_ic2e2020,
  title={Perseus: Characterizing Performance and Cost of Multi-Tenant Serving for {CNN} Models},
  author={Matthew LeMay and Shijian Li and Tian Guo},
  booktitle={2020 IEEE International Conference on Cloud Engineering (IC2E)},
  year={2020},
}

MODI: Mobile Deep Inference Made Efficient by Edge Computing

Samuel S. Ogden, Tian Guo

The USENIX Workshop on Hot Topics in Edge Computing (HotEdge '18)

@inproceedings{modi_hotedge2018,
  author = {Samuel S. Ogden and Tian Guo},
  title = {{MODI}: Mobile Deep Inference Made Efficient by Edge Computing},
  booktitle = {{USENIX} Workshop on Hot Topics in Edge Computing (HotEdge 18)},
  year = {2018},
  address = {Boston, MA},
  url = {https://www.usenix.org/conference/hotedge18/presentation/ogden},
  publisher = {{USENIX} Association},
  month = jul,
}

Cloud-based or On-device: An Empirical Study of Mobile Deep Inference

Tian Guo

2018 IEEE International Conference on Cloud Engineering (IC2E'18)

@inproceedings{mdmeasurement_ic2e2018,
  title={Cloud-Based or On-Device: An Empirical Study of Mobile Deep Inference},
  author={Tian Guo},
  booktitle={2018 IEEE International Conference on Cloud Engineering (IC2E)},
  year={2018},
  pages={184-190},
}
© Tian Guo 2020. Last Updated: May 30, 2020