Towards real-time embodied AI agent: a bionic visual encoding framework for mobile robotics
Document Type
Article
Publication Date
12-1-2024
Abstract
Embodied artificial intelligence (AI) agents, which navigate and interact with their environment using sensors and actuators, are being applied to mobile robotic platforms with limited computing power, such as autonomous vehicles, drones, and humanoid robots. These systems make decisions based on environmental perception from deep neural network (DNN)-based visual encoders. However, the constrained computational resources and the large amounts of visual data to be processed can create bottlenecks: a single decision can take almost 300 milliseconds on an embedded GPU board (Jetson Xavier). Existing DNN acceleration methods need model retraining and can still reduce accuracy. To address these challenges, our paper introduces a bionic visual encoder framework, Robye, to support the real-time requirements of embodied AI agents. The proposed framework complements existing DNN acceleration techniques. Specifically, we integrate motion data to identify overlapping areas between consecutive frames, which reduces DNN workload by propagating encoding results. We bifurcate processing into high resolution for task-critical areas and low resolution for less significant regions. This dual-resolution approach allows us to maintain task performance while lowering the overall computational demands. We evaluate Robye across three robotic scenarios: autonomous driving, vision-and-language navigation, and drone navigation, using various DNN models and mobile platforms. Robye outperforms baselines in speed (1.2–3.3×), performance (+4% to +29%), and power consumption (-36% to -47%).
Identifier
85201279190 (Scopus)
Publication Title
International Journal of Intelligent Robotics and Applications
External Full Text Location
https://doi.org/10.1007/s41315-024-00363-w
e-ISSN
2366-598X
ISSN
2366-5971
First Page
1038
Last Page
1056
Issue
4
Volume
8
Recommended Citation
Hou, Xueyu; Guan, Yongjie; Han, Tao; and Wang, Cong, "Towards real-time embodied AI agent: a bionic visual encoding framework for mobile robotics" (2024). Faculty Publications. 57.
https://digitalcommons.njit.edu/fac_pubs/57