Date of Award


Degree Type


Degree Name


Degree Program

Computer Science


Computer Science

Major Professor

Roussev, Vassil

Second Advisor

Richard, Golden

Third Advisor

Deng, Jing


Digital forensics investigations become more time consuming as the amount of data to be investigated grows. Secular growth trends between hard drive and memory capacity just exacerbate the problem. Bloom filters are space-efficient, probabilistic data structures that can represent data sets with quantifiable false positive rates that have the potential to alleviate the problem by reducing space requirements. We provide a framework using Bloom filters to allow fine-grained content identification to detect similarity, instead of equality. We also provide a method to compare filters directly and a statistical means of interpreting the results. We developed a tool--md5bloom--that uses Bloom filters for standard queries and direct comparisons. We provide a performance comparison with a commonly used tool, md5deep, and achieved a 50% performance gain that only increases with larger hash sets. We compared filters generated from different versions of KNOPPIX and detected similarities and relationships between the versions.


The University of New Orleans and its agents retain the non-exclusive license to archive and make accessible this dissertation or thesis in whole or in part in all forms of media, now or hereafter known. The author retains all other ownership rights to the copyright of the thesis or dissertation.