Evaluating Type Prediction with Textual Hints
Alipour, Mohammad Amin
One of the main obstacles to program comprehension and software maintenance is the lack of information about the types of variables in source code. In this paper, we explore the effectiveness of type inference using textual hints. We formulate type inference as a classification task: we train classification models that use textual features in the source code to predict the types of new variables in a program. To evaluate the effectiveness of this approach, we train and test several classifiers on the types of variables in five open-source Java projects from two open-source organizations under different scenarios. The choice of Java provides a large number of instances with accurate labels for assessing our classification approach. Our experiments show that textual hints can correctly predict the types of new variables with high accuracy (F-measure above 80%) when the model is trained and tested on disjoint parts of the same project. However, the accuracy of this approach drops sharply in settings where the project in the test set differs from the project used for training, because the test set contains types that the models have not seen in the training set. Despite the negative results in some scenarios, our experiments show the potential of textual features to enhance the performance of classifiers in predicting the types of variables. These results imply that although textual features have limitations in predicting types accurately, they can supplement other type prediction techniques.
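To make the classification formulation concrete, the following is a minimal sketch of the general idea: predicting a Java variable's declared type from textual hints in its name. The variable names, type labels, and feature choice (character n-grams fed to a naive Bayes classifier) are illustrative assumptions, not the feature set or models evaluated in the paper.

```python
# Illustrative sketch (not the paper's actual setup): treat type inference
# as classification, mapping textual hints in a variable's name to a type.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data: (variable name, declared Java type) pairs.
names = ["userName", "itemCount", "isActive", "filePath",
         "retryCount", "isEnabled", "fileName", "totalCount"]
types = ["String", "int", "boolean", "String",
         "int", "boolean", "String", "int"]

# Character n-grams capture sub-token hints such as "Count" or "Path".
model = make_pipeline(
    CountVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    MultinomialNB(),
)
model.fit(names, types)

# Predict types for previously unseen variable names.
print(model.predict(["lineCount", "dirPath"]))
```

A cross-project test would correspond to fitting on names from one project and predicting on another, where unseen type labels make the task much harder, as the abstract reports.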