A Hybrid Approach for Inference between Behavioral Exception API Documentation and Implementations, and Its Applications

Document Type

Conference Proceeding

Publication Date

9-19-2022

Abstract

Automatically producing behavioral exception (BE) API documentation helps developers use libraries correctly. State-of-the-art approaches are either rule-based, which are too restrictive in their applicability, or deep learning (DL)-based, which require large training datasets. To address these limitations, we propose StatGen, a novel hybrid approach that combines statistical machine translation (SMT) and tree-structured translation to generate BE documentation for any code and vice versa. We consider the documentation and source code of an API method as two abstraction levels of the same intent. StatGen is specifically designed for this two-way inference and takes advantage of the structures of both for higher accuracy. We conducted several experiments to evaluate StatGen. We show that it achieves high precision (75% and 75%) and recall (81% and 84%) in inferring BE documentation from source code and vice versa. StatGen achieves higher precision, recall, and BLEU score than the state-of-the-art DL-based baseline models. We show StatGen's usefulness in two applications. First, we use it to generate BE documentation for Apache APIs that lack documentation by learning from the documentation of the equivalent APIs in JDK. 44% of the generated documentation was rated as useful and 42% as somewhat useful. In the second application, we use StatGen to detect inconsistencies between the BE documentation and the corresponding implementations of several JDK8 packages.

Identifier

85146930883 (Scopus)

ISBN

9781450396240

Publication Title

ACM International Conference Proceeding Series

External Full Text Location

https://doi.org/10.1145/3551349.3560434

Grant

CCF-1723432

Fund Ref

National Science Foundation
