DICE: Data Imputation for Cost Estimates from Multiple Sources to Model User Decision-Making

Document Type

Conference Proceeding

Publication Date

1-1-2023

Abstract

Understanding the key factors that affect users' commute mode choice is essential for designing policies that promote sustainable transportation. However, studies of mode choice rely on survey data, which are often incomplete. One of the regional transportation surveys obtained for this study on commute mode decision-making is missing 97% of its parking cost data, an important factor in people's decision-making. To tackle this problem, we propose the data imputation for cost estimates (DICE) scheme, which synthesizes data from multiple sources to infer the missing values. DICE linearly maps imputed values to missing entries based on the assumption that higher-income users can spend more on their commute. In the absence of ground truth data, we propose using the accuracy of a regression model trained with the imputed data as a metric to evaluate DICE. We train the regression model with 75% of the imputed data, test it with the remainder, and evaluate it with the complete cases. The prediction accuracies on the test data and the evaluation data are 0.89 and 0.77, respectively. These results indicate that the imputed data and the complete cases share similar distributions and that the model trained with the imputed data can perform classification. We also tested DICE on a 1995 transportation survey data set and a 2021 housing survey data set, in both of which cost is considered a key feature in decision-making. In both cases, the regression model achieves a prediction accuracy higher than 0.7, which demonstrates the applicability of DICE to different data sets.
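
The linear mapping described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name linear_impute, the cost bounds, and the choice to scale by raw income (rather than, say, income rank or bracket) are assumptions made only for illustration. It assumes an external source supplies a plausible cost range and assigns each user with a missing cost a value within that range in proportion to income.

```python
from typing import List


def linear_impute(incomes: List[float], cost_low: float, cost_high: float) -> List[float]:
    """Hypothetical helper: map each income linearly onto [cost_low, cost_high].

    The lowest income receives cost_low and the highest receives cost_high,
    reflecting the assumption that higher-income users can spend more.
    """
    lo, hi = min(incomes), max(incomes)
    if hi == lo:
        # All incomes are identical, so no ordering exists; use the midpoint.
        midpoint = (cost_low + cost_high) / 2
        return [midpoint for _ in incomes]
    span = cost_high - cost_low
    # Fraction of the income range covered by each user, scaled to the cost range.
    return [cost_low + (x - lo) / (hi - lo) * span for x in incomes]


if __name__ == "__main__":
    # Illustrative values only; these are not figures from the paper.
    incomes = [30000.0, 60000.0, 90000.0]
    print(linear_impute(incomes, 0.0, 20.0))
```

Under these assumptions, the imputed costs preserve the income ordering of the users, which is the property the abstract's assumption requires; any monotone mapping would do so, and the linear form is simply the one the abstract names.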

Identifier

85182390379 (Scopus)

ISBN

[9798350342734]

Publication Title

Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI

External Full Text Location

https://doi.org/10.1109/ICTAI59109.2023.00029

ISSN

1082-3409

First Page

149

Last Page

154
