(Partial) Program Dependence Learning

Document Type

Conference Proceeding

Publication Date

1-1-2023

Abstract

Code fragments from developer forums often migrate into applications through code reuse. Because such fragments are incomplete, analyzing them early to determine whether they contain potential vulnerabilities is challenging. In this work, we introduce NeuralPDA, a neural network-based program dependence analysis tool for both complete and partial programs. Our tool efficiently incorporates intra-statement and inter-statement contextual features into statement representations, thereby modeling program dependence analysis as a statement-pair dependence decoding task. In the empirical evaluation, we report that NeuralPDA predicts the CFG and PDG edges in complete Java and C/C++ code with combined F-scores of 94.29% and 92.46%, respectively. The F-scores for partial Java and C/C++ code range from 94.29% to 97.17% and from 92.46% to 96.01%, respectively. We also test the usefulness of the PDGs predicted by NeuralPDA (i.e., PDG*) on the downstream task of method-level vulnerability detection. We find that the performance of the vulnerability detection tool when using PDG* is only 1.1% lower than when it uses the PDGs generated by a program analysis tool. We also report the detection of 14 real-world vulnerable code snippets from StackOverflow by a machine learning-based vulnerability detection tool that employs the PDGs predicted by NeuralPDA for these code snippets.

Identifier

85171736652 (Scopus)

ISBN

9781665457019

Publication Title

Proceedings International Conference on Software Engineering

External Full Text Location

https://doi.org/10.1109/ICSE48619.2023.00209

ISSN

0270-5257

First Page

2501

Last Page

2513

Grant

CNS-2120386

Fund Ref

National Science Foundation
