Date of Award
Doctor of Philosophy
Susan W McRoy, Tian Zhao, Yi Hu, Seyed H Hosseini
Bayesian Method, Image Classification, Image Restoration, Image Segmentation, Machine Learning, Natural Language Processing
Classification and clustering form an important class of unstructured data processing problems. Classification (supervised, semi-supervised, and unsupervised) aims to discover clusters and group similar data into categories for information organization and knowledge discovery. My work focuses on using Bayesian methods and machine learning techniques to classify free-text and image data, and on overcoming the limitations of traditional methods. The Bayesian approach accommodates a wider variety of variables (numerical or categorical) and estimates probabilities instead of relying on explicit rules, which is beneficial in ambiguous cases. Maximum a posteriori (MAP) estimation is used to address the local-maximum problems for which the maximum likelihood (ML) method gives inaccurate estimates. The expectation-maximization (EM) algorithm can be combined with MAP estimation for problems with incomplete or missing data. Our proposed framework can be used for both supervised and unsupervised classification. For natural language processing (NLP), we applied machine learning techniques to sentence and text classification. For 3D CT image segmentation, we propose a MAP EM clustering approach that automatically detects the number of objects in a 3D CT luggage image; the prior knowledge and constraints in the MAP estimation are used to avoid or mitigate local-maximum problems. The algorithm automatically determines the number of classes and finds the optimal parameters for each class. As a result, it automatically detects the number of objects and produces a better segmentation of each object in the image. For segmented object recognition, we applied machine learning techniques to classify each object as a target or non-target, achieving a 90% probability of detection (PD) and a 6% probability of false alarm (PFA).
For image restoration in X-ray imaging, scatter can produce noise, artifacts, and decreased contrast. In practice, hardware such as an anti-scatter grid is often used to reduce scatter. However, the remaining scatter can still be significant, and additional software-based correction is desirable. Furthermore, good software solutions can potentially reduce the amount of anti-scatter hardware needed, thereby reducing cost. In this work, scatter correction is formulated as a Bayesian MAP problem with a non-local prior, which leads to better preservation of textural detail during scatter reduction. The efficacy of our algorithm is demonstrated through experimental and simulation results.
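The contrast the abstract draws between ML and MAP estimation can be illustrated with a toy example. The sketch below is not the method developed in the dissertation; it only assumes a categorical class label with a symmetric Dirichlet prior, and the function names, the `alpha` value, and the three-sample data set are all hypothetical. With very little data, the ML estimate assigns zero probability to a class that was never observed, while the MAP estimate is pulled toward the prior.

```python
from collections import Counter


def ml_estimate(labels, classes):
    """Maximum likelihood: the relative frequency of each class."""
    counts = Counter(labels)
    n = len(labels)
    return {c: counts[c] / n for c in classes}


def map_estimate(labels, classes, alpha=2.0):
    """MAP under a symmetric Dirichlet(alpha) prior (alpha > 1 assumed).

    The posterior mode is (count + alpha - 1) / (n + K * (alpha - 1)),
    where K is the number of classes.
    """
    counts = Counter(labels)
    n = len(labels)
    k = len(classes)
    denom = n + k * (alpha - 1.0)
    return {c: (counts[c] + alpha - 1.0) / denom for c in classes}


# Toy data (hypothetical, not from the dissertation): three observations,
# none of which is a "target".
classes = ["target", "non-target"]
observed = ["non-target", "non-target", "non-target"]

ml = ml_estimate(observed, classes)
mp = map_estimate(observed, classes, alpha=2.0)

print("ML :", ml)   # the unseen class gets probability zero
print("MAP:", mp)   # the prior keeps the unseen class at a nonzero probability
```

Here the prior acts as two pseudo-observations, one per class, so the MAP estimate for the unseen class is 1/5 instead of 0. The priors and constraints used in the dissertation for segmentation and scatter correction are more elaborate than this symmetric Dirichlet prior.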
Gu, Yingying, "Bayesian Methods and Machine Learning for Processing Text and Image Data" (2017). Theses and Dissertations. 1633.