An empirical approach to machine learning: algorithm selection, hyperparameter optimization and automatic principal design

by Sourava Prasad Mishra

Thesis project in Computer Science

Supervised by Balazs Kegl and Michèle Sebag.

Thesis in preparation at Paris Saclay, within the framework of Sciences et Technologies de l'Information et de la Communication, in partnership with LAL - Laboratoire de l'Accélérateur Linéaire (laboratory), Applied Statistics and Machine Learning (AppStat) (research team) and Université Paris-Sud (institution where the thesis is being prepared), since 1 November 2014.


  • Abstract

    In this thesis project we propose to apply the scientific method to machine learning. We will explore two lines of research. In the first, we will build on recent work that applies modern experimental design to algorithm selection and hyperparameter tuning. The main thrust of this sub-project is the multi-problem approach: we will explore the interaction between methods (and their hyperparameters) and data sets to find out whether, and to what extent, experience can be generalized across data sets. The outputs of this sub-project are a toolbox for practitioners and a body of knowledge on which algorithms work on which (kinds of) data sets. The second output will feed into the second line of research, in which we will ask why certain methods work on certain data sets. We will study algorithms as natural phenomena, form hypotheses, design and evaluate experiments, and carry out measurements that could validate or refute our hypotheses.
