Date of Award

August 2014

Degree Type


Degree Name

Doctor of Philosophy

Department

Management Science

First Advisor

Sanjoy Ghose

Committee Members

Kanti Prasad, Amit Bhatnagar, Xiaojing Yang, Tingting He

Keywords

Advertising, Internet, Purchase, Reviews, Search, Text Mining

Abstract

With the explosive growth in Internet usage, consumers now perform all kinds of activities online, such as searching and buying. We study these different consumer activities in the online domain.

In their daily lives, people make various kinds of product purchases. Many factors can affect consumers' decisions when they make such purchases. These include the nature of the product category and, especially in the online domain, the nature of their search activities. In the first essay/chapter, we develop an econometric model to understand the relationships between different dimensions of online search and purchase behavior. Our approach corrects for endogeneity, which yields a model that is more accurate than the typical model without such corrections. We therefore believe our results truly reflect what is happening in the search-buying domain. We use extensive empirical data to test several hypotheses that we develop. Parameter estimates from our model reveal interesting variations in the search-purchase relationships across types of product categories. This difference is especially evident between utilitarian and hedonic goods. Our findings have important theoretical and managerial implications.

Text reviews contain far more information than typical numerical data. A major challenge for marketers is how to extract the most relevant information from this big data source. In our second essay/chapter, we do this using a text mining methodology that draws on machine learning algorithms. We collect data using a Java-based web crawler approach. We use a word-based model to predict consumers' recommendations, and the model's prediction accuracy is high. In the marketing literature, there has been almost no work in which such a methodology has been used to predict recommendations from big data stemming from textual information. An interesting finding from our research is that as the number of textual features increases, the predictive accuracy of the model increases only up to a point; beyond that, including more words in the model decreases predictive accuracy. We also use a diagnostic approach to identify key words that are determinants of user recommendations. Since our model deals with big data, we address in detail the issue of scalability; our computations show that our approach is very scalable. The potential for marketing implications seems considerable.

Marketers are always interested in predicting market sales so that they can plan the firm's activities accordingly. This sales information can also help consumers make the right buying decisions. However, the available data are costly and slow to collect and arrive with a lag, which makes them inconvenient to use and out of date. With the rise of multimedia and social media sharing websites such as YouTube, Flickr, and various blogs, consumers can search for and learn various types of information from these websites. The availability of large amounts of data on the Internet enables us to use large-scale data mining algorithms to solve complex problems. Users' online search activities can be captured to predict market sales. In the third essay/chapter, we focus on the impact of different search behaviors on marketing outcomes such as product sales. We examine three major types of online search (text, image, and video) on search engines such as Google to help us predict automobile sales accurately and easily. We believe that our work opens a brand new arena for using multimedia search activities and will have a big impact on marketing science.

Included in

Marketing Commons