Faculty Publications

MutualNet: Adaptive ConvNet via Mutual Learning From Different Model Configurations

Taojiannan Yang, College of Engineering and Computer Science
Sijie Zhu, College of Engineering and Computer Science
Matias Mendieta, College of Engineering and Computer Science
Pu Wang, College of Computing and Informatics
Ravikumar Balakrishnan, Intel Corporation
Minwoo Lee, College of Computing and Informatics
Tao Han, Newark College of Engineering
Mubarak Shah, College of Engineering and Computer Science
Chen Chen, College of Engineering and Computer Science

Document Type

Article

Publication Date

1-1-2023

Abstract

Most existing deep neural networks are static, which means they can only perform inference at a fixed complexity. But the resource budget can vary substantially across different devices. Even on a single device, the affordable budget can change with different scenarios, and repeatedly training networks for each required budget would be incredibly expensive. Therefore, in this work, we propose a general method called MutualNet to train a single network that can run at a diverse set of resource constraints. Our method trains a cohort of model configurations with various network widths and input resolutions. This mutual learning scheme not only allows the model to run at different width-resolution configurations but also transfers the unique knowledge among these configurations, helping the model to learn stronger representations overall. MutualNet is a general training methodology that can be applied to various network structures (e.g., 2D networks: MobileNets, ResNet; 3D networks: SlowFast, X3D) and various tasks (e.g., image classification, object detection, segmentation, and action recognition), and is demonstrated to achieve consistent improvements on a variety of datasets. Since we only train the model once, it also greatly reduces the training cost compared to independently training several models. Surprisingly, MutualNet can also be used to significantly boost the performance of a single network if dynamic resource constraints are not a concern. In summary, MutualNet is a unified method for both static and adaptive, 2D and 3D networks. Code and pre-trained models are available at https://github.com/taoyang1122/MutualNet.

Identifier

85122290885 (Scopus)

Publication Title

IEEE Transactions on Pattern Analysis and Machine Intelligence

External Full Text Location

https://doi.org/10.1109/TPAMI.2021.3138389

e-ISSN

19393539

ISSN

01628828

PubMed ID

34962861

First Page

811

Last Page

827

Grant

2003198

Fund Ref

Intel Corporation

Recommended Citation

Yang, Taojiannan; Zhu, Sijie; Mendieta, Matias; Wang, Pu; Balakrishnan, Ravikumar; Lee, Minwoo; Han, Tao; Shah, Mubarak; and Chen, Chen, "MutualNet: Adaptive ConvNet via Mutual Learning From Different Model Configurations" (2023). Faculty Publications. 2370.
https://digitalcommons.njit.edu/fac_pubs/2370

DOI

10.1109/TPAMI.2021.3138389
