Faculty Publications

SEDN: A Spatiotemporal Encoder-Decoder Network for End-to-End Object Removal Forgery Detection in High-Resolution Videos

Lizhi Xiong, Nanjing University of Information Science & Technology
Linsen Ding, Nanjing University of Information Science & Technology
Mengqi Cao, Nanjing University of Information Science & Technology
Zhihua Xia, Jinan University
Yun Qing Shi, Newark College of Engineering

Document Type

Article

Publication Date

1-1-2024

Abstract

With the growing popularity of high-resolution (HR) video and the continuous growth of network bandwidth, the challenge of object removal detection in HR videos has attracted significant attention. Expert forgers leverage the rich detail in HR videos for meticulous pixel manipulation and apply sophisticated postprocessing techniques to hide high-frequency artifacts, thereby making forgery detection and localization more difficult when existing schemes are used. Additionally, the end-to-end framework simplifies the detection and localization process, which has not been considered in previous work. To solve the above issues, a spatiotemporal encoder-decoder network (SEDN) is proposed for end-to-end object removal forgery detection in HR videos. In the SEDN, a new model composed of a 3D asymmetric dual-stream network (3D-ADSN) and Transformer is proposed. The 3D-ADSN is utilized as the encoder, which fully integrates the high-frequency and low-frequency spatiotemporal information of videos. Transformer is utilized as the decoder to capture the global structure spatiotemporal information of the long-range feature sequence obtained by the encoder. This network combination successfully achieves simultaneous detection in the temporal and spatial domains without any additional postprocessing calculations. The experimental results demonstrate the better performance of the SEDN at different resolutions.

Identifier

85213549272 (Scopus)

Publication Title

IEEE Transactions on Multimedia

External Full Text Location

https://doi.org/10.1109/TMM.2024.3521804

e-ISSN

19410077

ISSN

15209210

Recommended Citation

Xiong, Lizhi; Ding, Linsen; Cao, Mengqi; Xia, Zhihua; and Shi, Yun Qing, "SEDN: A Spatiotemporal Encoder-Decoder Network for End-to-End Object Removal Forgery Detection in High-Resolution Videos" (2024). Faculty Publications. 754.
https://digitalcommons.njit.edu/fac_pubs/754

DOI

10.1109/TMM.2024.3521804
