Faculty Publications

Combining program analysis and statistical language model for code statement completion

Son Nguyen, The University of Texas at Dallas
Tien Nguyen, The University of Texas at Dallas
Yi Li, New Jersey Institute of Technology
Shaohua Wang, New Jersey Institute of Technology

Document Type

Conference Proceeding

Publication Date

11-1-2019

Abstract

Automatic code completion helps improve developers' productivity in their programming tasks. A program contains instructions expressed via code statements, which are considered the basic units of program execution. In this paper, we introduce AutoSC, which combines program analysis and the principle of software naturalness to fill in a partially completed statement. AutoSC benefits from the strengths of both directions, so that the completed code statement is both frequent and valid. AutoSC is first trained on a large code corpus to derive the templates of candidate statements. Then, it uses program analysis to validate and concretize the templates into syntactically valid and type-valid candidate statements. Finally, these candidates are ranked using a language model trained on the lexical form of the source code in the code corpus. Our empirical evaluation on large datasets of real-world projects shows that AutoSC achieves 38.9-41.3% top-1 accuracy and 48.2-50.1% top-5 accuracy in statement completion. It also outperforms a state-of-the-art approach by 9X-69X in top-1 accuracy.
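The three stages named in the abstract (derive templates, validate and concretize them, rank with a language model) can be illustrated with a toy sketch. This is not AutoSC's implementation: the corpus, the METHODS type table, the single fixed statement shape, and every function name below are hypothetical, and an add-one smoothed bigram model stands in for whatever language model the paper uses.

```python
import math
from collections import Counter

# Hypothetical toy corpus of tokenized statements (stands in for a large code corpus).
CORPUS = [
    "int count = items . size ( ) ;".split(),
    "int total = items . size ( ) ;".split(),
    "String label = user . getName ( ) ;".split(),
]

# Hypothetical type facts: method -> (receiver type, return type).
# A real program analysis would derive these from the project under edit.
METHODS = {"size": ("List", "int"), "getName": ("User", "String")}


def mine_templates(corpus):
    """Stage 1: abstract the declared variable and receiver, then count templates."""
    counts = Counter()
    for toks in corpus:
        template = list(toks)
        template[1], template[3] = "<VAR>", "<RECV>"
        counts[tuple(template)] += 1
    return counts


def concretize(template, decl_type, var, scope):
    """Stage 2: fill placeholders, keeping only type-valid candidates."""
    recv_type, ret_type = METHODS[template[5]]
    if template[0] != decl_type or ret_type != decl_type:
        return []
    fill = {"<VAR>": var}
    return [
        [name if t == "<RECV>" else fill.get(t, t) for t in template]
        for name, typ in scope.items()
        if typ == recv_type
    ]


def train_bigram(corpus):
    """Count bigrams and their left contexts over the lexical tokens."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for toks in corpus:
        seq = ["<s>"] + toks
        contexts.update(seq[:-1])
        bigrams.update(zip(seq, seq[1:]))
        vocab.update(toks)
    return contexts, bigrams, len(vocab)


def log_prob(tokens, contexts, bigrams, vocab_size):
    """Add-one smoothed bigram log-probability of a token sequence."""
    seq = ["<s>"] + tokens
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(seq, seq[1:])
    )


def complete(decl_type, var, scope, corpus=CORPUS):
    """Stage 3: rank all type-valid candidates by language-model score."""
    model = train_bigram(corpus)
    candidates = [
        c for t in mine_templates(corpus) for c in concretize(t, decl_type, var, scope)
    ]
    ranked = sorted(candidates, key=lambda c: log_prob(c, *model), reverse=True)
    return [" ".join(c) for c in ranked]
```

Calling complete("int", "n", {"names": "List", "items": "List", "user": "User"}) yields two type-valid candidates, with the one using items ranked first because that receiver appears in the toy corpus. The variable user is never proposed, since its type does not match the receiver type of size.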

Identifier

85078944075 (Scopus)

ISBN

9781728125084

Publication Title

Proceedings - 2019 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019

External Full Text Location

https://doi.org/10.1109/ASE.2019.00072

First Page

710

Last Page

721

Grant

CCF-1518897

Fund Ref

National Science Foundation

Recommended Citation

Nguyen, Son; Nguyen, Tien; Li, Yi; and Wang, Shaohua, "Combining program analysis and statistical language model for code statement completion" (2019). Faculty Publications. 7230.
https://digitalcommons.njit.edu/fac_pubs/7230

DOI

10.1109/ASE.2019.00072
