Date of Award
12-2024
Degree Type
Thesis
Degree Name
M.S.
Degree Program
Engineering and Applied Science - Computer Science
Department
Computer Science
Major Professor
Dr. Tamjidul Hoque
Second Advisor
Dr. Abdul Rahman Alsamman
Third Advisor
Dr. Abdullah Nur Yasin
Abstract
This research investigates the development of a robust AI-powered detection and tracking engine aimed at revolutionizing the retail checkout experience. The foundation of this work is a comprehensive exploration of state-of-the-art Computer Vision methodologies, particularly focusing on object detection, segmentation, and tracking. The study employs a modular pipeline that integrates advanced visual recognition algorithms with a robust data processing framework.
Key to this work is the construction of a synthetic dataset using Unity3D, enabling the generation of high-quality annotated data that mirrors real-world retail scenarios. This approach addresses the challenge of insufficient labeled datasets by simulating diverse and cluttered shopping environments, ensuring models are trained on realistic, high-variance data. The pipeline leverages cutting-edge architectures, including Transformer-based models, which are evaluated extensively against benchmarks such as the COCO dataset. Detailed experimentation was conducted to optimize model performance, encompassing preprocessing, model fine-tuning, and hyperparameter adjustment. Additionally, the research explores strategies for edge deployment, incorporating quantization techniques to ensure computational efficiency without compromising accuracy. The system's performance was validated using metrics such as mAP and latency, showcasing its ability to operate effectively in real-time conditions. Further downstream tasks, such as the integration of recommendation engines powered by LLMs, are outlined as potential future extensions. This methodology-centric study provides a roadmap for developing scalable, accurate, and efficient AI solutions for automated retail systems, offering a significant contribution to the field of Computer Vision and its application in commercial environments.
Recommended Citation
Naeem, Abdullah Bin, "A Talking Cart" (2024). University of New Orleans Theses and Dissertations. 3200.
https://scholarworks.uno.edu/td/3200
Included in
Operations Research, Systems Engineering and Industrial Engineering Commons, Robotics Commons, Systems Science Commons
Rights
The University of New Orleans and its agents retain the non-exclusive license to archive and make accessible this dissertation or thesis in whole or in part in all forms of media, now or hereafter known. The author retains all other ownership rights to the copyright of the thesis or dissertation.