Thèse de doctorat en Biostatistiques
Soutenue le 24-06-2013
à Paris 5 , dans le cadre de École doctorale Frontières de l'innovation en recherche et éducation (Paris) , en partenariat avec Mathématiques appliquées Paris 5 (laboratoire) .
Le président du jury était René Ecochard.
Propriétés statistiques des estimateurs de la densité parasitaire dans les études portant sur le paludisme et applications opérationnelles
Pas de résumé en français
Malaria is a devastating global health problem that affected 219 million people and caused 660,000 deaths in 2010. Inaccurate estimation of the level of infection may have adverse clinical and therapeutic implications for patients, and for epidemiological endpoint measurements. The level of infection, expressed as the parasite density (PD), is classically defined as the number of asexual parasites relative to a microliter of blood. Microscopy of Giemsa-stained thick blood smears (TBSs) is the gold standard for parasite enumeration. Parasites are counted in a predetermined number of high-power fields (HPFs) or against a fixed number of leukocytes. PD estimation methods usually involve threshold values; either the number of leukocytes counted or the number of HPFs read. Most of these methods assume that (1) the distribution of the thickness of the TBS, and hence the distribution of parasites and leukocytes within the TBS, is homogeneous; and that (2) parasites and leukocytes are evenly distributed in TBSs, and thus can be modeled through a Poisson-distribution. The violation of these assumptions commonly results in overdispersion. Firstly, we studied the statistical properties (mean error, coefficient of variation, false negative rates) of PD estimators of commonly used threshold-based counting techniques and assessed the influence of the thresholds on the cost-effectiveness of these methods. Secondly, we constituted and published the first dataset on parasite and leukocyte counts per HPF. Two sources of overdispersion in data were investigated: latent heterogeneity and spatial dependence. We accounted for unobserved heterogeneity in data by considering more flexible models that allow for overdispersion. Of particular interest were the negative binomial model (NB) and mixture models. The dependent structure in data was modeled with hidden Markov models (HMMs). We found evidence that assumptions (1) and (2) are inconsistent with parasite and leukocyte distributions. The NB-HMM is the closest model to the unknown distribution that generates the data. Finally, we devised a reduced reading procedure of the PD that aims to a better operational optimization and a practical assessing of the heterogeneity in the distribution of parasites and leukocytes in TBSs. A patent application process has been launched and a prototype development of the counter is in process.