Graduation Year

2024

Graduation Month

May

Document Type

Thesis

Degree Name

Bachelor of Science

School or Department

Wildlife Biology

Major

Wildlife Biology – Terrestrial

Faculty Mentor Department

Biological Sciences, Division of

Faculty Mentor

Jeffrey Good

Faculty Reader(s)

Jeffrey Good, Lucia Williams, Zac Cheviron

Keywords

Genome, Database, Sequencing, Genetic, Computational, Biology

Subject Categories

Computational Biology | Databases and Information Systems | Genetics and Genomics | Genomics

Abstract

As computational genomics continues to expand in both potential and application, it is more important than ever that massive genetic sequencing datasets are stored properly and accessibly. This project sought to establish a practical, user-friendly, secure data management system for a genomics research lab (the Good Lab; thegoodlab.org) at the University of Montana. A MySQL database with a connected web application was judged the best configuration to maximize utility and accessibility for the lab’s researchers. Building the logical framework for the database, creating the server, and sourcing data took several months. The data ranged from experimental details of sequencing (such as experiment dates, sequencing platform, and provider) to sample metadata (specific biological specimen information and molecular protocols). A combination of lab notebooks and a master Excel spreadsheet yielded over 3,500 individual biological sequencing samples spanning terabytes of archived data. These data represent 10 years of lab sequencing efforts and include numerous examples of incomplete or non-standardized documentation. Once the database was seeded with these data, efforts shifted to user functionality and the front end. One goal was a web application that lets individuals without a MySQL background efficiently perform basic functions (insertions, selective deletions, updates, and queries). However, because of the complexity of such an interface, a thorough backend users’ guide was written as a temporary substitute to maximize the system’s usability in the immediate future. Ultimately, the fundamental goal was accomplished: a clear, organized system for sequencing data was created, with a structure and function that will permit many years of continued data collection and recall in a manner befitting the importance of the data. Areas for future improvement and development of the stack were also identified.

Honors College Research Project

1

GLI Capstone Project

no

© Copyright 2024 Jacquelin W. Olexa