Date of Award

May 2023

Degree Type


Degree Name

Master of Science


Department

Computer Science

First Advisor

Mohammad H. Rahman

Committee Members

Mohammad H. Rahman, Rohit Kate, Inga Wang


Keywords

Activities of Daily Living, Assistive Robot, Computer Vision, Localization, Pre-trained Model, Robot Manipulation


Abstract

Upper and lower extremity (ULE) functional deficiencies, which limit a person's ability to perform everyday tasks, have increased at an alarming rate over the past few decades. It is essential that individuals with such impairments be able to take care of themselves without requiring significant support from others. A few assistive devices are available on the market to make their lives more comfortable, yet controlling these devices can be challenging for this group. Robotic devices are emerging as a way to assist individuals with limited ULE function in activities of daily living (ADL). Because most of these devices only allow manual control via a joystick, users find it hard to follow and precisely complete a task with them, especially users with severe limitations of hand function and distance vision impairment. Autonomous or semi-autonomous control of a robotic assistive device for ADL tasks therefore remains an open research problem, and solving it would eventually give these users the independence to perform everyday tasks without the help of others. This thesis proposes a vision-based control system for a 6 Degrees of Freedom (DoF) robotic manipulator that semi-autonomously performs the "pick-and-place" task, by far the most common activity among ADLs. The first part of the thesis describes the design of a deep learning-based detection model. A dataset covering 47 ADL objects is compiled to develop an ADL object detection model for this application. A YOLO (You Only Look Once) model is then retrained on this dataset for real-time ADL object detection, and subsequently fine-tuned and validated to ensure it works properly in the intended setting. The second part covers the design and development of an algorithm that localizes ADL objects in 3D space using the trained YOLO model.
The system combines a RealSense depth camera (D435), the RealSense Python SDK, and the trained ADL object detection model to localize objects in the 3D environment and interact with them using an assistive robot. The thesis concludes with the design of the underlying architecture of the vision-based control system that carries out the "pick-and-place" ADL task. The system uses the designed localization technique to localize ADL objects such as apples, oranges, capsicums, or cups in real time, pick them up semi-autonomously, and bring them to a specified location to complete the task. An xArm6 (6 DoF) robot from Ufactory is used to evaluate the system and make the adjustments needed for it to function appropriately. The proposed vision-based control system is validated experimentally in various settings with a wide range of ADL objects, and its performance is analyzed at each level. The experimental results show that the proposed system achieves promising performance, with an overall success rate of 72.9% in detecting, localizing, and performing the ADL task, demonstrating the feasibility of the vision-guided ADL assistance system in the real world.
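
The abstract does not spell out the localization algorithm, so the following is only an illustrative sketch of how a detected bounding box and a depth reading could be turned into a 3D target for the robot. It assumes an ideal pinhole camera without lens distortion; the function names, the intrinsics (fx, fy, cx, cy), and the camera-to-robot transform are hypothetical and are not taken from the thesis. In practice the RealSense SDK offers its own deprojection routine that also accounts for the sensor's distortion model.

```python
from typing import Sequence, Tuple

Point3D = Tuple[float, float, float]


def bbox_center(box: Sequence[float]) -> Tuple[float, float]:
    """Return the pixel center (u, v) of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def deproject_pixel(u: float, v: float, depth_m: float,
                    fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Map a pixel and its depth (meters) to a 3D point in the camera frame.

    Assumes an ideal pinhole model: no lens distortion is corrected here.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def camera_to_base(point_cam: Point3D,
                   T_base_cam: Sequence[Sequence[float]]) -> Point3D:
    """Apply a 4x4 homogeneous transform (camera frame -> robot base frame)."""
    p = (point_cam[0], point_cam[1], point_cam[2], 1.0)
    rows = [sum(T_base_cam[r][c] * p[c] for c in range(4)) for r in range(3)]
    return (rows[0], rows[1], rows[2])


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    box = (300.0, 200.0, 380.0, 280.0)   # detector output in pixels
    depth_m = 0.5                        # depth at the box center, in meters
    fx, fy, cx, cy = 600.0, 600.0, 320.0, 240.0
    # Assumed calibration: camera axes aligned with the base, offset by (0.1, 0, 0.2) m.
    T_base_cam = [
        [1.0, 0.0, 0.0, 0.1],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.2],
        [0.0, 0.0, 0.0, 1.0],
    ]
    u, v = bbox_center(box)
    target = camera_to_base(deproject_pixel(u, v, depth_m, fx, fy, cx, cy), T_base_cam)
    print(target)
```

Using the box center is the simplest choice; a real system would likely need a more robust depth estimate (for example, a median over a small window) because single depth pixels can be missing or noisy.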

Available for download on Friday, June 14, 2024