Detecting Gene Conversion in Modern Genomic Studies
The detection of gene conversion events in genomic data is technically challenging because conversion leaves signatures that can resemble those of other evolutionary processes — recombination, point mutation, or complex population dynamics. As sequencing technology has advanced from single-gene Sanger sequencing to whole-genome short-read sequencing to long-read technologies, the methods for detecting gene conversion have evolved substantially, enabling genome-wide analyses of conversion rates, hotspots, and biases that were impossible even a decade ago.
Population Genetic Signatures of Gene Conversion
Gene conversion has distinctive population genetic signatures that distinguish it from point mutation and reciprocal recombination. Unlike point mutation, which affects a single site, gene conversion simultaneously changes multiple linked sites within the conversion tract. Unlike reciprocal recombination, which involves flanking marker exchange, non-crossover gene conversion homogenizes the internal conversion tract without exchanging flanking markers. This produces a pattern in haplotype analysis that has been called "non-crossover recombination" — two haplotypes that differ internally within a region despite sharing flanking haplotype backgrounds.
Statistical tests designed to detect gene conversion include analysis of four-gamete violations (pairs of linked biallelic sites at which all four allele combinations are observed, which cannot arise without recurrent mutation and therefore imply recombination or conversion), comparisons of internal vs. flanking recombination rates, and hidden Markov model-based approaches that scan for chromosomal segments where a haplotype switches to match a different haplotype over a short internal segment before reverting, the classic pattern of a received conversion tract.
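As a rough illustration of the four-gamete test, the sketch below scans phased haplotypes for pairs of sites at which all four allele combinations appear. It assumes biallelic sites coded as '0' and '1' and error-free phasing; the function name and input format are hypothetical, chosen only for this example.

```python
from itertools import combinations


def four_gamete_violations(haplotypes):
    """Return pairs of site indices at which all four two-site gametes occur.

    haplotypes: list of equal-length strings over '0'/'1',
                one string per phased haplotype (assumed input format).
    """
    n_sites = len(haplotypes[0])
    violations = []
    for i, j in combinations(range(n_sites), 2):
        # Collect the distinct allele combinations seen at sites i and j.
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            violations.append((i, j))
    return violations


# Toy data: sites 0 and 1 (and sites 0 and 2) show all four combinations.
haps = ["000", "011", "100", "111"]
print(four_gamete_violations(haps))
```

A violation on its own does not say whether a crossover or a conversion was responsible; it only marks a pair of sites whose history needs one of the two (or recurrent mutation).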
Detecting Conversion in Pedigree and Sperm Studies
The most direct evidence for gene conversion comes from pedigree studies and sperm genotyping. By densely genotyping parents and offspring with SNP arrays or whole-genome sequencing, apparent de novo mutations that involve multiple closely spaced variants rather than single base changes — and that match one grandparental haplotype in the converted region — can be identified as probable gene conversion events. Sperm genotyping allows direct analysis of individual recombination products: a single sperm carries the result of one meiosis, permitting analysis of both crossovers and non-crossovers (gene conversions) at molecular resolution. High-throughput sperm sequencing has enabled genome-wide catalogs of human meiotic recombination events, providing direct evidence for conversion tract lengths, hotspot usage, and the GC-biasing of mismatch repair at conversion sites.
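The pedigree logic can be sketched in a few lines. Given the grandparental origin of a transmitted haplotype at each informative site, a candidate non-crossover is a short run that differs from both flanks, whereas a crossover is a switch that does not revert. This is a minimal sketch under stated assumptions: the function name, the 'A'/'B' labels, and the 2,000 bp span threshold are illustrative choices, not values from any published pipeline, and real analyses must also filter genotyping errors.

```python
def find_noncrossover_tracts(positions, origins, max_span_bp=2000):
    """Flag short runs whose grandparental origin differs from both flanks.

    positions:   sorted base-pair coordinates of informative sites
    origins:     grandparental haplotype label at each site, e.g. 'A' or 'B'
    max_span_bp: illustrative upper bound on the observed tract span
    Returns a list of (first_pos, last_pos, n_sites) candidate tracts.
    """
    if len(positions) != len(origins):
        raise ValueError("positions and origins must have the same length")

    # Collapse the per-site labels into runs: [label, first_index, last_index].
    runs = []
    for idx, label in enumerate(origins):
        if runs and runs[-1][0] == label:
            runs[-1][2] = idx
        else:
            runs.append([label, idx, idx])

    tracts = []
    # Only interior runs can be tracts: they need a flank on each side.
    for k in range(1, len(runs) - 1):
        _, first, last = runs[k]
        same_flanks = runs[k - 1][0] == runs[k + 1][0]
        span = positions[last] - positions[first]
        if same_flanks and span <= max_span_bp:
            tracts.append((positions[first], positions[last], last - first + 1))
    return tracts


positions = [100, 500, 900, 1000, 1100, 5000, 9000, 12000]
origins = list("AABBAABB")
print(find_noncrossover_tracts(positions, origins))
```

The span between the first and last converted marker is only a lower bound on the true tract length, since the tract may extend into the gaps between informative sites.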
Bioinformatic Tools
Several bioinformatic tools specifically address gene conversion detection. GENECONV is a widely used software package that scans aligned DNA sequences for statistical evidence of gene conversion by identifying segments in which a pair of sequences is unusually similar relative to the rest of the alignment, which is evidence of homogenization. It is particularly useful for within-species comparisons of paralogous gene family members. BEAGLE and SHAPEIT are haplotype phasing tools that, when applied to dense population genomic data, permit the inference of non-crossover events by identifying the transfer of allelic blocks. LDhot uses a composite-likelihood approach to detect elevated recombination (including conversion) in population genetic data by analyzing linkage disequilibrium patterns.
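The idea behind fragment-based scans such as GENECONV can be illustrated with a simplified sketch: among the alignment columns that vary at all, find the longest run over which two sequences are identical. This is not GENECONV's actual implementation, which also assesses statistical significance; the function name and toy alignment below are hypothetical.

```python
def longest_shared_fragment(alignment, a, b):
    """Find the longest run of polymorphic columns where a and b agree.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths)
    a, b:      names of the two sequences to compare
    Returns (start_col, end_col, n_polymorphic_sites), or None if the two
    sequences never agree at a polymorphic column.
    """
    seqs = list(alignment.values())
    length = len(seqs[0])
    # Only columns that vary somewhere in the alignment are informative.
    poly = [c for c in range(length) if len({s[c] for s in seqs}) > 1]

    best = None
    run_start = None
    run_len = 0
    for c in poly:
        if alignment[a][c] == alignment[b][c]:
            if run_len == 0:
                run_start = c
            run_len += 1
            if best is None or run_len > best[2]:
                best = (run_start, c, run_len)
        else:
            run_len = 0  # a mismatch ends the current shared fragment
    return best


toy = {
    "p1": "ACGTACGTAC",
    "p2": "ATGTACGTTC",
    "p3": "ACGACCTAAG",
}
print(longest_shared_fragment(toy, "p1", "p2"))
```

An unusually long shared fragment is only suggestive; whether it is longer than expected by chance has to be judged against a null distribution, which this sketch leaves out.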
For structural variants and paralogs, long-read sequencing (PacBio HiFi, Oxford Nanopore) has transformed our ability to phase and compare highly similar genomic sequences that were previously collapsed or misaligned by short-read assemblers. Many gene conversion events in medically relevant gene families, including SMN1/SMN2 (spinal muscular atrophy), STRC/STRCP1 (hearing loss), and CYP21A2/CYP21A1P (congenital adrenal hyperplasia), lie in genomic regions that were previously refractory to accurate analysis but can now be characterized in detail with long-read approaches.
Distinguishing Conversion from Heteroduplex Repair
A technical challenge in conversion detection is distinguishing genuine gene conversion (donor-instructed overwriting of recipient sequence) from heteroduplex repair occurring at mismatches in a Holliday junction intermediate. Both processes produce similar outcomes — transfer of sequence from one homolog to another — but through slightly different molecular pathways. Modern single-molecule sequencing of recombination intermediates, combined with precise mapping of repair enzyme activities, has begun to address this distinction experimentally. For the evolutionary consequences of biased conversion, see our article on biased gene conversion and genome evolution.