Statistical methods for longitudinal medical data with applications / Amongero, Martina; Gasparini, Mauro. - (2024).
Statistical methods for longitudinal medical data with applications
Amongero, Martina; Gasparini, Mauro
2024
Abstract
In this thesis, we discuss the use and analysis of longitudinal data in biostatistics, focusing on three real case studies. Longitudinal data are collections of repeated measurements of specific variables of interest at multiple time points. Their analysis offers many advantages: among others, it enables the evaluation of the temporal evolution of quantities of interest (biomarkers, tumor size, daily counts, etc.), including at the level of the individual, and it provides stronger evidence for causal relationships. Various statistical methods can be used to analyze longitudinal data, ranging from generalized mixed-effects models and growth and evolution modeling (often combined with mixed-effects structures) to time-to-event analyses. However, these methodologies can raise difficult issues, especially those related to censoring and missing data. In this work, we present three longitudinal studies.
(I) The first study focuses on modeling and forecasting the COVID-19 pandemic in Italy using a newly developed compartmental model called SIPRO. Its analysis shows that the well-known SIR model must be extended to account for the asymptomatic part of the population in order to describe the COVID-19 pandemic realistically. Moreover, it warns about identifiability issues that arise when the extended model is too complex relative to the collected information.
(II) The second study focuses on longitudinal data from prostate cancer patients and aims at estimating the optimal time to recommend an expensive examination for patients who experienced a resurgence of the disease after surgery. In particular, it highlights that a more complex joint model incorporating each patient's entire clinical history yields better estimates than the logistic models applied so far.
(III) Finally, the third study addresses the practical implementation of pre-existing methodologies discussed in the literature. Specifically, it focuses on adapting one of these methods to account for informative withdrawal in recurrent event problems, with the aim of estimating vaccine efficacy. Based on a real case study provided by GSK, this work shows how to obtain more reliable estimates when data are missing due to informative censoring, and warns about numerical issues that can arise during the analyses.

File | Size | Format |
---|---|---|
Tesi_AmongeroMartina.pdf (open access; type: doctoral thesis; license: Creative Commons) | 4.81 MB | Adobe PDF |
https://hdl.handle.net/11583/2995738