Statistical and Population Genetics - Statistical Science - Exam, Exams of Statistics

This is the Exam of Statistical Science which includes Stochastic Differential Equation, Brownian Motion, Solution, Measurable Function, Markov Process, Starting, Bounded Functions, Local Martingale, First Time etc. Key important points are: Statistical and Population Genetics, Problem, Coalescent Process, Approximates, Population, Generations, Segment Per Generation, Number of Mutations, Coalescent Tree, Expected Value

Typology: Exams

2012/2013

Uploaded on 02/26/2013

M. PHIL. IN STATISTICAL SCIENCE
Friday 4 June, 2004 13:30 to 15:30
Statistical and Population Genetics
Attempt THREE questions.
There are four questions in total.
The questions carry equal weight.
You may not start to read the questions
printed on the subsequent pages until
instructed to do so by the Invigilator.

1 This problem concerns the coalescent process that approximates the evolution of the ancestry of a sample of n chromosomal segments from a population of large but constant size N. Time is measured in units of N generations, and θ = 2Nu is the mutation parameter, u being the mutation rate per segment per generation. The effects of recombination in the segment may be ignored. Let S be the number of mutations that occur on the coalescent tree of the sample.

(i) Show that the expected value of S is given by

$$\mathbb{E}S = \theta \sum_{i=1}^{n-1} \frac{1}{i}.$$

(ii) Find an expression for the variance of S.

(iii) Using the result of (i), write down an unbiased estimator θ̂ (say) of θ, and show that it is asymptotically consistent as n → ∞.

(iv) For j = 2, 3, ..., n, let Y_j be the number of mutations that arise on the coalescent tree while there are j distinct ancestors of the sample. Show that the distribution of Y_j is geometric.

Note: if X has a Poisson distribution with parameter λ, then the probability generating function of X is

$$\mathbb{E}s^{X} = e^{-\lambda(1-s)}, \quad 0 \le s \le 1.$$

(v) Using (iv) or otherwise, establish that the quantity

$$\sqrt{\log n}\,(\hat{\theta} - \theta)$$

is asymptotically Normally distributed as n → ∞, and identify the variance.

(vi) What are the practical implications of the result in (v)?
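
As a numerical sanity check on the formula in (i), the following Python sketch simulates S directly from the coalescent description above. It assumes the standard scaling in which, with time in units of N generations, the time spent with j ancestors is exponential with rate j(j − 1)/2 and each lineage accumulates mutations at rate θ/2. The function names are illustrative and not part of the exam.

```python
import random


def expected_mutations(n, theta):
    """Value stated in part (i): theta * sum_{i=1}^{n-1} 1/i."""
    return theta * sum(1.0 / i for i in range(1, n))


def poisson(mean, rng):
    """Poisson draw obtained by counting unit-rate arrivals before time `mean`."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def simulate_mutations(n, theta, rng):
    """Draw one value of S for a sample of n segments (assumed scaling, see text)."""
    total = 0
    for j in range(n, 1, -1):
        # Assumed: time with j ancestors ~ Exp(rate j(j-1)/2), in units of N generations.
        t_j = rng.expovariate(j * (j - 1) / 2)
        # Assumed: each of the j lineages mutates at rate theta/2 per unit time.
        total += poisson(theta * j * t_j / 2, rng)
    return total
```

Averaging many draws of `simulate_mutations` for fixed n and θ should land close to `expected_mutations(n, theta)`. This checks the stated result numerically; it does not replace the derivation asked for in (i).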

2 One of the major problems in statistical and population genetics is to understand linkage disequilibrium (LD). Write an essay on this topic. You should include brief descriptions of the patterns of LD across chromosome 21, the ancestral recombination graph, its role in understanding LD, and its role in fine-scale mapping.


3 [Pedigree diagram: father (g_f) and mother (g_m) joined above their affected child (g_c).]

The above diagram shows a pedigree drawing for a trio consisting of a father, mother and affected child, with genotypes at a single genetic locus denoted g_f, g_m, g_c respectively.

In a genetic association study, twelve such families are collected with genotypes as tabulated below (where ‘?’ denotes unknown genotypes).

Family   g_f   g_m   g_c
1        2/2   1/1   1/2
2        1/2   1/2   1/1
3        1/2   2/2   1/2
4        1/2   1/2   1/2
5        1/1   2/2   1/2
6        1/2   1/2   1/1
7        1/2   ?/?   1/2
8        1/2   1/2   2/2
9        1/2   1/2   1/1
10       1/2   ?/?   1/1
11       1/2   1/2   1/2
12       2/2   1/2   1/2

i) For each family, calculate the contribution that it would make to the cells of the following transmission table and thus calculate the values of the counts a, b, c, d in the table.

                      Untransmitted allele
Transmitted allele    1      2
1                     a      b
2                     c      d

ii) Calculate the value of the transmission disequilibrium test (TDT) statistic from this table. Is there any evidence for genetic association? (You may need to know that the percentage points for the upper 5% level are 1.64 for the standard normal distribution and 3.84 for a χ² distribution on 1 df.)


iii) Convert the data in the transmission table to the following table based on unmatched transmissions:

Marker allele   Transmitted   Untransmitted
1               w             y
2               x             z

and use the data in the cells of this table to calculate the haplotype relative risk (GHRR) odds ratio, and test of association.

iv) Prove that for such a trio, the probability of the child's genotype, given the parents' genotypes and the event (D) that the child is affected with disease, P(g_c | g_f, g_m, D), may be written as

$$P(g_c \mid g_m, g_f, D) = \frac{R_{g_c}}{\sum_{g^* \in G} R_{g^*}},$$

where R_g is the relative risk for genotype g relative to some arbitrary baseline genotype, and the sum in the denominator is over the set G of the four possible offspring genotypes that the parents can produce.

v) Thus prove that the likelihood contribution from the 10 families with both parents genotyped is

$$\frac{R_{1/1}^{3}\, R_{1/2}^{4}\, R_{2/2}}{(R_{1/1} + 2R_{1/2} + R_{2/2})^{6}\,(R_{1/2} + R_{2/2})^{2}}.$$
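
Parts i) and ii) can be cross-checked mechanically. The Python sketch below tallies the transmitted and untransmitted parental alleles for each trio in the family table and computes the McNemar-type statistic (b − c)²/(b + c), the usual form of the TDT. It assumes that the two families with an ungenotyped mother (7 and 10) contribute nothing to the table; that is one possible convention and not necessarily the one the examiners intended. The names `TRIOS`, `transmissions`, `transmission_table` and `tdt` are illustrative.

```python
from itertools import product

# (father, mother, child) genotypes from the family table; None marks '?/?'.
TRIOS = {
    1: ((2, 2), (1, 1), (1, 2)),
    2: ((1, 2), (1, 2), (1, 1)),
    3: ((1, 2), (2, 2), (1, 2)),
    4: ((1, 2), (1, 2), (1, 2)),
    5: ((1, 1), (2, 2), (1, 2)),
    6: ((1, 2), (1, 2), (1, 1)),
    7: ((1, 2), None, (1, 2)),
    8: ((1, 2), (1, 2), (2, 2)),
    9: ((1, 2), (1, 2), (1, 1)),
    10: ((1, 2), None, (1, 1)),
    11: ((1, 2), (1, 2), (1, 2)),
    12: ((2, 2), (1, 2), (1, 2)),
}


def transmissions(father, mother, child):
    """Return (transmitted, untransmitted) allele pairs for father and mother."""
    for i, j in product(range(2), repeat=2):
        if sorted((father[i], mother[j])) == sorted(child):
            # Any consistent assignment yields the same table contribution.
            return [(father[i], father[1 - i]), (mother[j], mother[1 - j])]
    raise ValueError("child genotype is inconsistent with the parents")


def transmission_table(trios):
    """Count parents by (transmitted, untransmitted) allele."""
    counts = {(1, 1): 0, (1, 2): 0, (2, 1): 0, (2, 2): 0}
    for father, mother, child in trios.values():
        if father is None or mother is None:
            continue  # assumption: incomplete trios are skipped
        for pair in transmissions(father, mother, child):
            counts[pair] += 1
    return counts


def tdt(counts):
    """McNemar-type TDT statistic (b - c)^2 / (b + c)."""
    b, c = counts[(1, 2)], counts[(2, 1)]
    return (b - c) ** 2 / (b + c)
```

Under that assumption, `transmission_table(TRIOS)` gives a = 2, b = 10, c = 4, d = 4, so `tdt` returns 36/14 ≈ 2.57, to be compared with the χ² percentage point quoted in ii).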
