Oral Presentations and Performances: Session I

Author Information

Project Type

Presentation

Faculty Mentor’s Full Name

Andrew Whiteley

Faculty Mentor’s Department

Wildlife Biology

Additional Mentor

Sam Rosenbaum (sam.rosenbaum@umconnect.umt.edu), Chad Bishop (chad.bishop@mso.umt.edu), Winsor Lowe (winsor.lowe@mso.umt.edu)

Abstract / Artist's Statement

Advances in genomics now make it possible to assemble multiple complete genomes for a single species. A major outstanding question in population genomics is whether it matters that the reference genome comes from a ‘local’ population. Using a reference genome from a distant population may bias estimates of population genomic parameters, a problem referred to as ‘reference bias’.

Brook trout (Salvelinus fontinalis) have a native range that extends from the upper reaches of eastern Canada to northern Georgia. Across this broad range, genetic differentiation has been found between the northern and southern extents (Kazyak et al. 2021). The reference genome currently used for mapping brook trout reads was built from individuals at the northern extent of the range. We therefore hypothesized that mapping individuals from the southern extent to this reference may bias read mapping, resulting in differences in metrics such as heterozygosity and nucleotide diversity.

We mapped thirty-two individuals from Virginia to the reference genome built from Canadian brook trout, and then mapped the same individuals to a reference genome built from the same Virginia population. Using VCFtools and downstream analyses in R, we tested for detectable reference bias between the two reference assemblies. This work will help researchers better understand how their choice of reference genome may affect the conclusions drawn from genomic data.
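
The comparison described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline (which used VCFtools and R); it assumes genotype calls have already been extracted per individual as VCF-style strings such as `0/1`, and the sample names and genotypes are made-up toy values. Because site coordinates differ between two assemblies, the sketch compares a per-individual summary (observed heterozygosity) rather than matching sites one-to-one.

```python
from statistics import mean


def observed_heterozygosity(genotypes):
    """Return the fraction of called diploid genotypes that are heterozygous.

    `genotypes` is a list of VCF-style GT strings ("0/1", "0|0", "./.").
    Missing or non-diploid calls are skipped.
    """
    called = 0
    het = 0
    for gt in genotypes:
        alleles = gt.replace("|", "/").split("/")
        if len(alleles) != 2 or "." in alleles:
            continue  # missing or non-diploid call
        called += 1
        if alleles[0] != alleles[1]:
            het += 1
    return het / called if called else float("nan")


def paired_differences(het_distant, het_local):
    """Per-individual difference (distant minus local) for shared individuals."""
    shared = sorted(set(het_distant) & set(het_local))
    return {ind: het_distant[ind] - het_local[ind] for ind in shared}


# Toy data (hypothetical): the same two individuals called against each reference.
calls_distant = {
    "VA01": ["0/1", "0/0", "1/1", "./."],
    "VA02": ["0/0", "0/0", "0/1", "0/1"],
}
calls_local = {
    "VA01": ["0/1", "0/1", "1/1", "0/0"],
    "VA02": ["0/1", "0/0", "0/1", "0/1"],
}

het_distant = {ind: observed_heterozygosity(g) for ind, g in calls_distant.items()}
het_local = {ind: observed_heterozygosity(g) for ind, g in calls_local.items()}

diffs = paired_differences(het_distant, het_local)
mean_diff = mean(diffs.values())  # negative: lower heterozygosity with the distant reference
print(diffs, mean_diff)
```

A mean difference that is consistently non-zero across individuals would be one signal of reference bias; a real analysis would also need a paired statistical test and a comparable set of filtered sites, neither of which this sketch attempts.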

Category

Life Sciences

Apr 17th, 9:00 AM to 9:15 AM

From Canada to Virginia: Identifying Reference Bias in Whole Genome Sequencing of Brook Trout

UC 327
