Population synthesis involves generating synthetic yet realistic representations of a target population of micro-agents for behavioral modeling and simulation. Traditional methods, often reliant on target population samples, such as census data or travel surveys, face limitations due to high costs and small sample sizes, particularly at smaller geographical scales. We propose a novel framework based on copulas to generate synthetic data for target populations where only empirical marginal distributions are known. This method utilizes samples from different populations with similar marginal dependencies, introduces a spatial component into population synthesis, and considers various information sources for more realistic generators. Concretely, the process involves normalizing the data and treating it as realizations of a given copula, and then training a generative model before incorporating the information on the marginals of the target population. Utilizing American Community Survey data, we assess our framework's performance through standardized root mean squared error (SRMSE) and so-called sampled zeros. We focus on its capacity to transfer a model learned from one population to another. Our experiments include transfer tests between regions at the same geographical level as well as to lower geographical levels, hence evaluating the framework's adaptability in varied spatial contexts. We compare Bayesian Networks, Variational Autoencoders, and Generative Adversarial Networks, both individually and combined with our copula framework. Results show that the copula enhances machine learning methods in matching the marginals of the reference data. Furthermore, it consistently surpasses Iterative Proportional Fitting in terms of SRMSE in the transferability experiments, while introducing unique observations not found in the original training sample.

Copula-based transferable models for synthetic population generation / Jutras-Dube, P.; Al-Khasawneh, M. B.; Yang, Z.; Bas, J.; Bastin, F.; Cirillo, C.. - In: TRANSPORTATION RESEARCH. PART C, EMERGING TECHNOLOGIES. - ISSN 0968-090X. - 169:(2024). [10.1016/j.trc.2024.104830]

Copula-based transferable models for synthetic population generation

Cirillo C.
2024

Abstract

Population synthesis involves generating synthetic yet realistic representations of a target population of micro-agents for behavioral modeling and simulation. Traditional methods, often reliant on target population samples, such as census data or travel surveys, face limitations due to high costs and small sample sizes, particularly at smaller geographical scales. We propose a novel framework based on copulas to generate synthetic data for target populations where only empirical marginal distributions are known. This method utilizes samples from different populations with similar marginal dependencies, introduces a spatial component into population synthesis, and considers various information sources for more realistic generators. Concretely, the process involves normalizing the data and treating it as realizations of a given copula, and then training a generative model before incorporating the information on the marginals of the target population. Utilizing American Community Survey data, we assess our framework's performance through standardized root mean squared error (SRMSE) and so-called sampled zeros. We focus on its capacity to transfer a model learned from one population to another. Our experiments include transfer tests between regions at the same geographical level as well as to lower geographical levels, hence evaluating the framework's adaptability in varied spatial contexts. We compare Bayesian Networks, Variational Autoencoders, and Generative Adversarial Networks, both individually and combined with our copula framework. Results show that the copula enhances machine learning methods in matching the marginals of the reference data. Furthermore, it consistently surpasses Iterative Proportional Fitting in terms of SRMSE in the transferability experiments, while introducing unique observations not found in the original training sample.
File in questo prodotto:
Non ci sono file associati a questo prodotto.
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11583/2994692
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo