Invited review: Bioinformatic methods to discover the likely causal variant of a new autosomal recessive genetic condition using genome-wide data

Pollott, G E (2018) Invited review: Bioinformatic methods to discover the likely causal variant of a new autosomal recessive genetic condition using genome-wide data. Animal.

11661.pdf - Published Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Abstract

In animals, new autosomal recessive genetic diseases (ARGD) arise all the time due to the regular, random mutations that occur during meiosis. In order to reduce the effect of any damaging new variant, it is necessary to find its cause. To evaluate the best way of doing this, 34 papers which found the exact location of a new genetic disease in livestock were reviewed and found to require at least two stages. In the initial stage the commonly used χ² method, applied in a case-control association analysis with single nucleotide polymorphism (SNP)-chip data, was found to have limitations and was almost always used in conjunction with a second method to locate the target region on the genome containing the variant. The commonly used methods had their drawbacks, so a new method was devised based on long runs of homozygosity, a common feature of new ARGD. This ‘autozygosity by difference’ method was found to be as good as, or better than, all the reviewed methods tested based on its ability to unambiguously find the shortest known target region in an already analysed data set. Mean target region length was found to be 4.6 megabases in the published reports. Success did not depend on the size of commercial SNP-chip used, and studies with as few as three cases and four controls were large enough to find the target region. The final stage relied on either sequencing the candidate genes found in the target region or using whole genome sequencing (WGS) on a small number of cases. Sometimes this latter method was used in conjunction with WGS on a number of control animals or resources such as the 1000 bull genomes data. Calculations showed that, in cattle, fewer than 15 animals would be needed in order to locate the new variant when using WGS data. This could be any combination of cases plus parents or other unrelated animals in the breed.
Using WGS data, it would be necessary to search the three billion bases of the cattle genome for base positions which were homozygous for the same allele in all cases and heterozygous for that allele in parents, or not containing that homozygote in unrelated controls. This site could be confirmed on other healthy animals using much cheaper methods, and then a genetic test could be devised for that variant in order to screen the whole population and to devise a breeding programme to eliminate the disorder from the population.
