Oral Presentations and Performances: Session I
Project Type
Presentation
Faculty Mentor’s Full Name
Andrew Whiteley
Faculty Mentor’s Department
Wildlife Biology
Additional Mentor
Sam Rosenbaum (sam.rosenbaum@umconnect.umt.edu); Chad Bishop (chad.bishop@mso.umt.edu); Winsor Lowe (winsor.lowe@mso.umt.edu)
Abstract / Artist's Statement
Advances in genomics now allow us to obtain multiple complete genomes for a given species. A major outstanding question in population genomics is whether it matters to use a ‘local’ reference genome. Using a reference genome from a distant population may bias estimates of population genomic parameters, a problem referred to as ‘reference bias’.
Brook trout (Salvelinus fontinalis) have a native range that extends from the upper reaches of eastern Canada to northern Georgia. Across this broad range, some genetic differentiation has been found between the northern and southern extents (Kazyak et al. 2021). The current reference genome used when mapping brook trout was built using individuals from the northern extent of the range. We therefore hypothesized that mapping individuals from the southern extent to this reference may bias the mapping of reads, resulting in differences in metrics such as heterozygosity and nucleotide diversity.
Thirty-two individuals from Virginia were mapped to the reference genome built from Canadian brook trout and then to a reference genome built from the same population in Virginia. Using VCFtools and downstream analyses in R, we tested for detectable reference bias between the two reference assemblies. This work will help researchers better understand how their choice of reference genome may affect the conclusions drawn from genomic data.
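As an illustration only, the comparison described above could be sketched as a paired, per-individual difference in observed heterozygosity between the two mappings. The sketch assumes each mapping was summarized with VCFtools `--het`, whose tab-delimited output has the columns INDV, O(HOM), E(HOM), N_SITES and F. It is written in Python rather than R, and the function names, individual IDs and table values are hypothetical; they are not data or code from this study.

```python
import csv
import io
from statistics import mean


def observed_het(het_text):
    """Return {individual: observed heterozygosity} from VCFtools --het output text."""
    reader = csv.DictReader(io.StringIO(het_text), delimiter="\t")
    result = {}
    for row in reader:
        n_sites = int(row["N_SITES"])
        observed_hom = int(row["O(HOM)"])
        # Sites that are not homozygous are counted as heterozygous.
        result[row["INDV"]] = (n_sites - observed_hom) / n_sites
    return result


def paired_differences(het_distant, het_local):
    """Per-individual difference (distant minus local) for individuals in both tables."""
    shared = sorted(set(het_distant) & set(het_local))
    return {ind: het_distant[ind] - het_local[ind] for ind in shared}


# Made-up toy tables (NOT study results): two hypothetical individuals,
# each summarized once per reference assembly.
DISTANT_REF = (
    "INDV\tO(HOM)\tE(HOM)\tN_SITES\tF\n"
    "fish1\t800\t790.0\t1000\t0.04762\n"
    "fish2\t750\t790.0\t1000\t-0.19048\n"
)
LOCAL_REF = (
    "INDV\tO(HOM)\tE(HOM)\tN_SITES\tF\n"
    "fish1\t700\t740.0\t1000\t-0.15385\n"
    "fish2\t750\t740.0\t1000\t0.03846\n"
)

diffs = paired_differences(observed_het(DISTANT_REF), observed_het(LOCAL_REF))
mean_diff = mean(diffs.values())
print(diffs)
print(mean_diff)
```

A mean difference that is consistently non-zero across individuals would be one simple signal of reference bias under these assumptions. A real analysis would also need to account for differences in the number of sites called against each assembly.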
Category
Life Sciences
From Canada to Virginia: Identifying Reference Bias in Whole Genome Sequencing of Brook Trout
UC 327