Date of Award

May 2019

Degree Type


Degree Name

Doctor of Philosophy



First Advisor

Xiao Qin

Committee Members

Robert J. Schneider, Jie Yu, Yin Wang, Chao Zhu


Crash causation mechanism, Driver-behavioral factors, Driver Error, Multiple Risk generating process, Negative Binomial-Lindley, Unobserved heterogeneity


All travelers are exposed to crash risk on the road, as no roadway is entirely safe. Under Vision Zero, improving traffic safety on our nation’s highways is and will continue to be one of the most pivotal tasks on the national transportation agenda. For decades, researchers and transportation professionals have strived to identify causal relationships between crash occurrence and roadway geometry and traffic-related variables, with the aim of creating a safe environment for the traveling public. Although great achievements have been made, such as the publication of the Highway Safety Manual (HSM), research on incorporating driver behavior variables into safety modeling remains limited. As driver errors are responsible for more than 90 percent of crashes, excluding such important information can lead to ineffective, inaccurate, and incorrect prediction results and parameter inferences.

The primary reasons for this research void are the lack of driver information and the lack of methods for integrating driver data with roadway and traffic characteristics. Standard procedures for collecting and archiving driver behavior data do not exist, as highway agencies are not obligated to collect them. The most relevant source of driver behavior information is perhaps the crash report, in which police officers may record driver conditions and the possible driver factors contributing to the crash. However, such information is not available for near misses, traffic conflicts, and non-crash traffic events where good behaviors prevail. As a result, unobserved heterogeneity induces overdispersion in crash data, which is a significant limiting factor in safety modeling. Furthermore, the conventional approach of treating crashes as originating from a single risk source also induces heterogeneity in crash data and yields biased parameter estimates. Thus, a statistically rigorous methodology is urgently needed to account for the consequences of missing critical driver information in a crash model, as well as to distinguish between distinct risk generating sources of a crash event when driver information is available.
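The link between unobserved heterogeneity and overdispersion described above can be illustrated with a small simulation. This is a minimal sketch and not the dissertation's model: it assumes that every site shares the same observed mean crash count and that the missing driver information acts as a gamma-distributed multiplier with mean 1. The function names, the base mean of 2.0, and the gamma shape of 1.5 are all hypothetical choices made for illustration.

```python
import math
import random
from statistics import mean, pvariance


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate by Knuth's multiplication method.

    Adequate for the small means used here; not meant for large lam.
    """
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def dispersion_ratio(counts):
    """Variance-to-mean ratio; roughly 1 for Poisson data, above 1 if overdispersed."""
    return pvariance(counts) / mean(counts)


def simulate(n_sites=20000, base_mean=2.0, shape=1.5, seed=42):
    """Compare crash counts with and without an unobserved site-level effect.

    Assumption (illustrative only): the unobserved driver effect is a
    gamma multiplier with mean 1, i.e. gammavariate(shape, 1 / shape).
    """
    rng = random.Random(seed)
    homogeneous = [poisson_draw(rng, base_mean) for _ in range(n_sites)]
    heterogeneous = [
        poisson_draw(rng, base_mean * rng.gammavariate(shape, 1.0 / shape))
        for _ in range(n_sites)
    ]
    return dispersion_ratio(homogeneous), dispersion_ratio(heterogeneous)


if __name__ == "__main__":
    ratio_without, ratio_with = simulate()
    print(f"variance/mean without unobserved effect: {ratio_without:.2f}")
    print(f"variance/mean with unobserved effect:    {ratio_with:.2f}")
```

Under these assumptions the counts with the hidden multiplier follow a Poisson-gamma (negative binomial) mixture, whose variance exceeds its mean even though the observed covariates are identical across sites.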

This dissertation contributes to the prediction of crash frequency and severity by explicitly considering human factors and driver behaviors in the modeling process. This endeavor began with a comprehensive literature review that identified and addressed data needs, technical issues, and the latest developments in incorporating human factors into safety analysis, and concluded with the proposal, development, and evaluation of an analytical framework and modeling alternatives to quantify driver behavior.

Given the myriad data elements to be explored, the varying availability of contributing factors, and crash data issues, a three-pronged modeling approach was adopted to accommodate a broad spectrum of data aggregated over areas, sites, and crash events. This approach was informed by the complex nature of crashes, which involve highway geometry, traffic exposure, contextual factors, driver characteristics, and vehicle factors, as well as the interactions among them. The availability of direct or surrogate measures of crash contributing factors varies by spatial unit. For example, socioeconomic and demographic features of the driving population are available at the census tract level; roadway geometry and traffic variables are available for segments and intersections; and specific driver conditions are collected only when a crash takes place. With its flexibility in spatial context and risk generating sources, the three-pronged approach provides direct benefits to guide different safety applications such as planning, design, and operations, and informs different programs such as engineering and enforcement.

The area-based crash models were developed to incorporate human factors and driver behavior in the form of socioeconomic and demographic data. In particular, separate behavior-based crash prediction models were developed for speed-related and alcohol-related crashes. Results showed that driver behavior-related crashes were more strongly correlated with socioeconomic and demographic variables than with traffic and trip-related explanatory variables.

The site-specific crash models were exploited to address the effect of human factors and driver behaviors on two fronts: 1) developing rigorous statistical models to account for overdispersion induced by unobserved heterogeneity when driver behavior information is not available, and 2) treating behavior variables as a separate risk source in a prediction model. The first pursuit led to the development of a mixed distribution random parameter model to explicitly account for unobserved heterogeneity. The second pursuit resulted in the development of a multivariate multiple risk source regression model to simultaneously predict crash count and severity. Modeling results show that the proposed multiple risk source models achieve better model performance and support valid inferences about the effect of driver factors on crash occurrence.
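The idea of treating behavior variables as a separate risk source can be sketched as follows. This is only an illustration under stated assumptions, not the dissertation's multivariate specification: it assumes the expected crash count at a site is the sum of two independent log-linear components, one driven by roadway and traffic variables and one driven by driver behavior variables. The function name, the coefficient vectors `beta` and `gamma`, and the example values are hypothetical.

```python
import math


def expected_crashes(roadway, behavior, beta, gamma):
    """Expected crash count from two additive risk sources (illustrative).

    roadway, behavior: covariate vectors for the two hypothetical sources.
    beta, gamma: their coefficients; each source has its own log-linear mean.
    """
    mu_road = math.exp(sum(b * x for b, x in zip(beta, roadway)))
    mu_driver = math.exp(sum(g * z for g, z in zip(gamma, behavior)))
    total = mu_road + mu_driver
    return {
        "roadway": mu_road,
        "driver": mu_driver,
        "total": total,
        # Share of the expected count attributed to the driver source.
        "driver_share": mu_driver / total,
    }


if __name__ == "__main__":
    # Hypothetical site: intercept plus one covariate per source.
    result = expected_crashes(
        roadway=[1.0, 0.5], behavior=[1.0, 0.2],
        beta=[0.1, 0.4], gamma=[-0.3, 1.2],
    )
    print(result)
```

Keeping the two means separate, rather than folding every variable into a single log-linear predictor, is what allows each source's contribution to the expected count to be read off directly in this sketch.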

The event-oriented models were utilized to evaluate the interaction between human factors and engineering variables in a crash event. Driver errors were categorized by the driver’s action during a crash on a roadway segment. The modeling results identified many highway geometric features, traffic conditions, and driver characteristics as statistically correlated with different types of driver mistakes. An exploratory analysis followed to evaluate the effect of driver mistakes on crash injury outcomes.

The dissertation demonstrates the strength of using diverse methods and models under various circumstances to incorporate human factors and driver behavior in crash prediction. Safety professionals can choose appropriate models based on their own data availability and unit of analysis, and design effective treatments or training programs. This research offers new insights that reinforce informed decision support for cumulative safety improvement of the roadway network, helps recognize opportunities to address high-priority safety issue areas, and helps determine appropriate countermeasures.