Date of Award

August 2024

Degree Type

Dissertation

Degree Name

Doctor of Philosophy

Department

Engineering

First Advisor

Tian Zhao

Committee Members

Rohit Kate, Jun Zhang, Min Wu, Amir Kordijazi

Keywords

Computer Vision, CS-UNet, GCtx-UNet, Microscopy image segmentation

Abstract

In the medical and materials sciences, accurate and robust segmentation of microscopy images is of great importance. The manual analysis methods used in the past face limitations, from the subjectivity introduced by user judgments to an inability to scale to large image datasets. Automated segmentation with traditional computer vision methods, such as image thresholding and morphological operations, offers speed and repeatability but is challenging to implement and is not robust to changes in imaging or sample conditions. Integrated computer vision techniques have emerged as promising solutions to these challenges, with Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) being two widely used architectures. Deep learning algorithms have demonstrated significant potential for improving microscopy image analysis in the medical and materials sciences. In medicine, they have been employed for disease diagnosis, medical imaging analysis, and drug discovery by analyzing images such as X-rays, CT scans, and MRI scans. In materials science, they have been used for material characterization, defect detection, and quality control, analyzing materials such as metals, ceramics, and polymers.

Segmentation is a critical task in microscopy image analysis, as it involves identifying and separating different objects or regions of interest within an image. It can be used for tasks such as cell counting, measuring cell size and shape, and identifying specific structures or features within a tissue or organ. Deep learning models have shown great potential for improving the accuracy and efficiency of segmentation in microscopy image analysis. However, training deep learning models for segmentation can be challenging, as it requires a large amount of labeled data and can be computationally intensive.
Additionally, microscopy images differ significantly from natural images due to variable signal-to-noise ratios, rich information content, and complex structures. Acquiring microscopy images can be difficult and expensive, and labeling and annotating them often requires expert knowledge. To overcome these challenges, transfer learning can be used: models trained on large-scale datasets are fine-tuned for specific microscopy analysis tasks when training data is limited. Nevertheless, transfer learning based on natural images tends to yield sub-optimal results when applied to microscopy image segmentation because of the domain mismatch between natural and microscopy images. Moreover, popular segmentation networks are based on CNNs, which often struggle to capture long-range dependencies. Transformer-based segmentation models address this deficiency, though they are limited in capturing low-level features. To address these challenges, this thesis develops several deep learning models for microscopy image analysis that use transfer learning based on large sets of microscopy images to provide accurate, robust, and generalizable approaches to microscopy image segmentation. Such advancements have the potential to improve the analysis of medical and materials science images, impacting disease diagnosis, material characterization, and other critical research areas. One of the key steps is to pre-train CNN and Transformer models on large in-domain datasets to learn high-level microscopy features that are not typically present in natural images. Experiments show that such models outperform those pre-trained on natural image data such as ImageNet.

Practical and effective strategies emerge when hybrid approaches are employed that combine the strengths of CNNs and Transformers. These approaches exploit the unique advantages inherent in each architectural design. The fusion of domain expertise, the power of pre-training, and the combination of diverse architectural approaches offers a path toward enhanced precision and efficiency in microscopy image segmentation.

In this thesis, we make the following contributions:

1. We surveyed Computer Vision (CV) research in microscopy image analysis, covering topics such as image classification and image segmentation.

2. We designed a robust CV pipeline for automating area fraction measurement in ductile iron microstructures, with significant improvements in accuracy and efficiency.

3. We developed a new U-shaped hybrid segmentation network called CS-UNet that combines the strengths of CNNs and Transformers for image segmentation. Experimental results show that CS-UNet outperforms state-of-the-art CNN-based, Transformer-based, and hybrid models. We also collected a set of microscopy images, MicroLite, for pre-training segmentation encoders. Models pre-trained on MicroLite perform better than models pre-trained on natural images.

4. While CS-UNet achieves higher segmentation accuracy than state-of-the-art models, it is computationally intensive because it combines both CNN and Transformer encoders. To improve computational efficiency, we developed a lightweight segmentation network, GCtx-UNet. This network captures both global and local image features with accuracy comparable to state-of-the-art approaches. GCtx-UNet uses Vision Transformers that combine global context self-attention modules with local self-attention to model long- and short-range spatial dependencies. GCtx-UNet is more efficient than state-of-the-art models in that it has fewer model parameters, a lower computational workload, higher inference speed, and a smaller model size. The efficiency and accuracy of GCtx-UNet make it a practical choice for image segmentation applications.

Available for download on Wednesday, July 01, 2026
