Dystri: A Dynamic Inference based Distributed DNN Service Framework on Edge
Document Type
Conference Proceeding
Publication Date
8-7-2023
Abstract
Deep neural network (DNN) inference poses unique challenges in serving computational requests due to high request intensity, concurrent multi-user scenarios, and diverse heterogeneous service types. Simultaneously, mobile and edge devices provide users with enhanced computational capabilities, enabling them to utilize local resources for deep inference processing. Moreover, dynamic inference techniques allow content-based computational cost selection per request. This paper presents Dystri, an innovative framework devised to facilitate dynamic inference on distributed edge infrastructure, thereby accommodating multiple heterogeneous users. Dystri offers broad applicability in practical environments, encompassing heterogeneous device types, DNN-based applications, and dynamic inference techniques, surpassing state-of-the-art (SOTA) approaches. With distributed controllers and a global coordinator, Dystri allows per-request, per-user adjustments of quality-of-service, ensuring instantaneous, flexible, and discrete control. The decoupled workflows in Dystri naturally support user heterogeneity and scalability, addressing crucial aspects overlooked by existing SOTA works. Our evaluation involves three multi-user, heterogeneous DNN inference service platforms deployed on distributed edge infrastructure, encompassing seven DNN applications. Results show that Dystri achieves near-zero deadline misses and excels in adapting to varying user numbers and request intensities. Dystri outperforms baselines with accuracy improvements of up to 95×.
Identifier
85179889219 (Scopus)
ISBN
9798400708435
Publication Title
ACM International Conference Proceeding Series
External Full Text Location
https://doi.org/10.1145/3605573.3605598
First Page
625
Last Page
634
Grant
2147623
Fund Ref
National Science Foundation
Recommended Citation
Hou, Xueyu; Guan, Yongjie; and Han, Tao, "Dystri: A Dynamic Inference based Distributed DNN Service Framework on Edge" (2023). Faculty Publications. 1525.
https://digitalcommons.njit.edu/fac_pubs/1525