Sebastian Engelke
Mon 14 Feb 2022, 15:00 - 16:00
online (Zoom)

If you have a question about this talk, please contact: Zohreh Kaheh (zkaheh)

Methods for extreme quantile regression in high dimensions

Quantile regression relies on minimizing the conditional quantile loss, which is based on the quantile check function. This has been extended to flexible regression functions such as the gradient forest (Athey et al., 2019). These methods break down if the quantile of interest lies outside the range of the data. Extreme value theory provides the mathematical foundation for the estimation of such extreme quantiles. A common approach is to approximate the exceedances over a high threshold by the generalized Pareto distribution. For conditional extreme quantiles, one may model the parameters of this distribution as functions of the predictors. Existing methods, however, are either insufficiently flexible or do not generalize well to higher dimensions. We develop two new approaches for extreme quantile regression that estimate the parameters of the generalized Pareto distribution in a flexible way even in higher dimensions. The first approach is based on gradient boosting and the second on random forests. These estimators outperform classical quantile regression methods and methods from extreme value theory in simulation studies. We illustrate the methodology on U.S. wage data.
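As background for the peaks-over-threshold idea mentioned in the abstract, the following is a minimal, illustrative sketch (not the speaker's method): it defines the quantile check function, fits a generalized Pareto distribution to exceedances over a high threshold using `scipy.stats.genpareto`, and extrapolates to an extreme quantile. The simulated Pareto data, the threshold level, and all variable names are assumptions for illustration only.

```python
import numpy as np
from scipy.stats import genpareto

def check_loss(r, tau):
    """Quantile check (pinball) function rho_tau(r) = r * (tau - 1{r < 0});
    the tau-quantile minimizes its expected value."""
    return r * (tau - (r < 0.0))

rng = np.random.default_rng(0)
# Hypothetical heavy-tailed sample (Pareto tail with index 3),
# standing in for data such as wages.
x = rng.pareto(3.0, size=20_000) + 1.0

u = np.quantile(x, 0.95)          # high threshold
exc = x[x > u] - u                # exceedances over the threshold
# Fit the generalized Pareto distribution to exceedances (location fixed at 0).
xi, _, sigma = genpareto.fit(exc, floc=0.0)

def extreme_quantile(p, u, xi, sigma, n, n_u):
    """Peaks-over-threshold quantile estimator:
    q_p = u + (sigma / xi) * (((n / n_u) * (1 - p))**(-xi) - 1)."""
    t = (n / n_u) * (1.0 - p)
    if abs(xi) < 1e-8:            # xi -> 0: exponential-tail limit
        return u - sigma * np.log(t)
    return u + (sigma / xi) * (t ** (-xi) - 1.0)

# Extrapolate to the 99.9% quantile, beyond the reliable range of
# direct empirical estimation at this threshold.
q999 = extreme_quantile(0.999, u, xi, sigma, len(x), len(exc))
```

In a conditional (regression) setting, as in the talk, `xi` and `sigma` would instead be functions of predictors, estimated here by gradient boosting or random forests rather than by a single unconditional fit.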