Big Data

So much data, so few computational resources! How should we crunch that data and extract insights in a cost-efficient and resource-efficient manner? New hardware, a new software stack, or simply ignoring the data? Let us find out!

Papers

DistStream: An Order-Aware Distributed Framework for Online-Offline Stream Clustering Algorithms

Lijie Xu, Xingtong Ye, Kai Kang, Tian Guo, Wensheng Dou, Wei Wang and Jun Wei

40th IEEE International Conference on Distributed Computing Systems (ICDCS'20)

@inproceedings{distream_icdcs2020,
    author    = {Xu, Lijie and Ye, Xingtong and Kang, Kai and Guo, Tian and Dou, Wensheng and Wang, Wei and Wei, Jun},
    title     = {{DistStream}: An Order-Aware Distributed Framework for Online-Offline Stream Clustering Algorithms},
    booktitle = {40th {IEEE} International Conference on Distributed Computing Systems, {ICDCS} 2020},
    year      = {2020},
}