Graduation Year
2024
Graduation Month
May
Document Type
Thesis
Degree Name
Bachelor of Science
School or Department
Wildlife Biology
Major
Wildlife Biology – Terrestrial
Faculty Mentor Department
Biological Sciences, Division of
Faculty Mentor
Jeffrey Good
Faculty Reader(s)
Jeffrey Good, Lucia Williams, Zac Cheviron
Keywords
Genome, Database, Sequencing, Genetic, Computational, Biology
Subject Categories
Computational Biology | Databases and Information Systems | Genetics and Genomics | Genomics
Abstract
As the field of computational genomics continues to expand in both potential and application, it is more imperative than ever to ensure that massive genetic sequencing datasets are stored properly and accessibly. This project sought to establish a practical, user-friendly, secure system for a genomics research lab (the Good Lab; thegoodlab.org) at the University of Montana. A MySQL database with a connected web application was judged the best configuration to maximize utility and accessibility for the lab’s researchers. Building the logical framework for the database, creating the server, and sourcing data took several months. The dataset ranged from experimental details of sequencing (such as experiment dates, sequencing platform, and provider) to sample metadata (specific biological specimen information and molecular protocols). A combination of lab notebooks and a master Excel spreadsheet yielded over 3,500 individual biological sequencing samples that spanned terabytes of archived data. These data represent 10 years of lab sequencing efforts, with numerous examples of incomplete or non-standardized documentation. Once the database was seeded with these data, efforts shifted to user functionality and the front end. One goal became the creation of a web application that allows individuals without a MySQL background to efficiently execute basic functions (insertions, selective deletions, updates, and queries). However, because of the complexity of such an interface, a thorough backend users’ guide was designed as a temporary substitute to maximize usability of the system in the immediate future. Ultimately, the fundamental goal was accomplished: a clear, organized system for sequencing data was created, with a structure and function that will permit many years of continued data collection and recall in a manner befitting the importance of the data being collected.
Areas for future improvement and development for the stack were also identified.
Honors College Research Project
1
GLI Capstone Project
no
Recommended Citation
Olexa, Jacquelin W., "Creation of a Digital Storage System for Genome Sequencing Metadata" (2024). Undergraduate Theses, Professional Papers, and Capstone Artifacts. 483.
https://scholarworks.umt.edu/utpp/483
Included in
Computational Biology Commons, Databases and Information Systems Commons, Genomics Commons
© Copyright 2024 Jacquelin W. Olexa