Faculty Publications

Unifying Fourteen Post-Hoc Attribution Methods With Taylor Interactions

Huiqi Deng, Shanghai Jiao Tong University
Na Zou, University of Houston
Mengnan Du, Ying Wu College of Computing
Weifu Chen, Guangzhou Maritime University
Guocan Feng, Sun Yat-Sen University
Ziwei Yang, Hangzhou Hikvision Digital Technology Co., Ltd.
Zheyang Li, Hangzhou Hikvision Digital Technology Co., Ltd.
Quanshi Zhang, Shanghai Jiao Tong University

Document Type

Article

Publication Date

7-1-2024

Abstract

Various attribution methods have been developed to explain deep neural networks (DNNs) by inferring the attribution (importance or contribution) score of each input variable to the final output. However, existing attribution methods are often built upon different heuristics, and there remains no unified theoretical understanding of why these methods are effective or how they are related. Furthermore, there is still no universally accepted criterion for judging whether one attribution method is preferable to another. In this paper, we use Taylor interactions to show, for the first time, that fourteen existing attribution methods, which define attributions based on entirely different heuristics, actually share the same core mechanism. Specifically, we prove that the attribution scores estimated by the fourteen methods can all be mathematically reformulated as a weighted allocation of two typical types of effects: the independent effect of each input variable and the interaction effects between input variables. The essential difference among these methods lies in the weights used to allocate the different effects. Inspired by these insights, we propose three principles for fairly allocating the effects, which serve as new criteria for evaluating the faithfulness of attribution methods. In summary, this study offers a unified perspective from which to revisit the fourteen attribution methods and theoretically clarifies their essential similarities and differences. In addition, the proposed principles enable a direct and fair comparison among different methods under this unified perspective.
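The idea of a "weighted allocation" of independent and interaction effects can be illustrated with a small sketch. The example below is not taken from the paper: the toy model `f`, the zero baseline, and the helper names `masked`, `shapley`, and `occlusion_1` are all assumptions chosen for illustration. It compares two well-known attribution rules on a function whose independent effects (2 and 3) and interaction effect (4) are known by construction, and shows that the rules differ only in how much of the interaction each variable receives.

```python
from itertools import permutations

def f(x1, x2):
    # Hypothetical toy model: independent effects 2*x1 and 3*x2,
    # plus a pairwise interaction effect 4*x1*x2.
    return 2 * x1 + 3 * x2 + 4 * x1 * x2

def masked(x, baseline, keep):
    # Keep the variables whose indices are in `keep`; reset the rest to the baseline.
    return tuple(x[i] if i in keep else baseline[i] for i in range(len(x)))

def shapley(model, x, baseline):
    # Exact Shapley values: average marginal contribution over all orderings.
    n = len(x)
    attr = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        present = set()
        for i in order:
            before = model(*masked(x, baseline, present))
            present.add(i)
            after = model(*masked(x, baseline, present))
            attr[i] += (after - before) / len(orders)
    return attr

def occlusion_1(model, x, baseline):
    # Single-variable occlusion: output drop when one variable is reset to the baseline.
    n = len(x)
    full = model(*x)
    return [full - model(*masked(x, baseline, set(range(n)) - {i})) for i in range(n)]

x, baseline = (1.0, 1.0), (0.0, 0.0)
shap = shapley(f, x, baseline)
occ = occlusion_1(f, x, baseline)
print(shap)  # [4.0, 5.0]: each variable gets its own effect plus half of the interaction
print(occ)   # [6.0, 7.0]: each variable gets its own effect plus the whole interaction
```

In this toy setting the Shapley attributions sum to the output change f(x) - f(baseline) = 9, whereas the occlusion attributions sum to 13 because the interaction effect is counted once for each variable.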

Identifier

85183983693 (Scopus)

Publication Title

IEEE Transactions on Pattern Analysis and Machine Intelligence

External Full Text Location

https://doi.org/10.1109/TPAMI.2024.3358410

e-ISSN

1939-3539

ISSN

0162-8828

PubMed ID

38271170

First Page

4625

Last Page

4640

Grant

2021ZD0111602

Fund Ref

National Key Research and Development Program of China

Recommended Citation

Deng, Huiqi; Zou, Na; Du, Mengnan; Chen, Weifu; Feng, Guocan; Yang, Ziwei; Li, Zheyang; and Zhang, Quanshi, "Unifying Fourteen Post-Hoc Attribution Methods With Taylor Interactions" (2024). Faculty Publications. 317.
https://digitalcommons.njit.edu/fac_pubs/317

DOI

10.1109/TPAMI.2024.3358410
