Date of Award

May 2020

Degree Type


Degree Name

Master of Science



First Advisor

Daniel Gervini


This thesis addresses the problem of prediction of taxi trip duration for any given day, time, pickup point and dropoff point. Data on taxi trips from the Chicago Data Portal is used. The main idea of the model is to cluster similar trips together and use the mean duration of all those clustered taxi trips to predict the duration of a new taxi trip in that cluster. Furthermore, for a possible additional reduction of prediction error, estimators from different days which are not significantly different from each other are pooled together. It is shown that this procedure improves prediction error.
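
The cluster-mean idea can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes, purely for illustration, that a cluster is the set of trips sharing the same day, hour, pickup area and dropoff area, and the function names, area codes and durations below are hypothetical. The pooling of estimators across days is not shown.

```python
from collections import defaultdict
from statistics import mean


def fit_cluster_means(trips):
    """Group trips by (day, hour, pickup, dropoff) and average their durations.

    `trips` is an iterable of (day, hour, pickup, dropoff, duration_seconds).
    The exact-match grouping key is an assumption made for this sketch.
    """
    groups = defaultdict(list)
    for day, hour, pickup, dropoff, duration in trips:
        groups[(day, hour, pickup, dropoff)].append(duration)
    return {key: mean(values) for key, values in groups.items()}


def predict(cluster_means, day, hour, pickup, dropoff):
    """Return the mean duration of the matching cluster, or None if unseen."""
    return cluster_means.get((day, hour, pickup, dropoff))


# Made-up example trips; durations are in seconds.
trips = [
    ("Mon", 8, 32, 8, 600),
    ("Mon", 8, 32, 8, 720),
    ("Mon", 8, 32, 8, 660),
    ("Tue", 8, 32, 8, 900),
]
cluster_means = fit_cluster_means(trips)
monday_prediction = predict(cluster_means, "Mon", 8, 32, 8)
```

A new trip is predicted by looking up the cluster it falls into. A trip with no matching cluster returns `None` here, which a fuller model would need to handle.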

Included in

Mathematics Commons