Trial Enrichment Design

From Statgen Internal Wiki

Overview

Parameters

Type 2 diabetes (T2D)

Follow up time: A median of 3 years

Incidence:

Rosiglitazone group: 11.6%
Placebo group: 26.0% 

Cumulative incidence in the first 2 years:

15% in the placebo group 
5% in the rosiglitazone group

Ref: Effect of rosiglitazone on the frequency of diabetes in patients with impaired glucose tolerance or impaired fasting glucose: a randomised controlled trial. The DREAM study: http://www.ncbi.nlm.nih.gov/pubmed/16997664

Sibling recurrence risk : 2.5 - 4.2

This range is the ratio of the risk in first-degree relatives to the population prevalence (6%), using the statement that "15-25% of first-degree relatives of patients with T2D develop impaired glucose tolerance or diabetes" in http://www.ncbi.nlm.nih.gov/pubmed/15823385 (15%/6% = 2.5 and 25%/6% ≈ 4.2).
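The arithmetic behind the sibling recurrence risk range can be sketched as follows. This is a minimal illustration, assuming the risk in first-degree relatives spans 15% to 25% (the range that reproduces 2.5 - 4.2 at 6% prevalence); the function name is illustrative only.

```python
def recurrence_risk_ratio(relative_risk, prevalence):
    """Risk in relatives of affected individuals divided by population prevalence."""
    return relative_risk / prevalence


prevalence = 0.06  # T2D prevalence quoted on this page
low = recurrence_risk_ratio(0.15, prevalence)   # assumed lower bound of relative risk
high = recurrence_risk_ratio(0.25, prevalence)  # assumed upper bound of relative risk
print(f"lambda_s range: {low:.1f} - {high:.1f}")
```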

Whitehall SNPs: http://www.bmj.com/cgi/content/full/bmj.b4838/DC1 Table A

Age-related Macular Degeneration (AMD)

Follow up time: 6.3 years.

Sample size: 1446.

From Table 2, we find the incidence rates

Placebo: 22%
Antioxidants: 21%
Zinc: 19%
Antioxidants and Zinc: 16%
Incidence rate: 279 participants progressed to advanced AMD, giving an incidence rate of 6.4%.

Ref: Seddon et al. Prediction Model for Prevalence and Incidence of Advanced Age-Related Macular Degeneration Based on Genetic, Demographic, and Environmental Variables http://www.iovs.org/cgi/content/abstract/50/5/2044

Sibling recurrence risk : 2.95 http://www.citeulike.org/user/ykaminoh/article/5382860

Type 1 diabetes (T1D)

From a subgroup analysis of the Diabetes Prevention Trial of Type 1 Diabetes (DPT-1), we find the following information

Follow up time: 5 years

Incidence rate:

6.2% per year in the oral insulin group
10.4% per year in the placebo group

Hazard ratio: 0.566, 95% CI (0.361, 0.888) p = 0.015
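For power calculations it is often handy to recover the standard error of log(HR) from the reported interval. The sketch below assumes the interval is a 95% Wald interval (symmetric on the log scale, z = 1.96), which the source does not state; it also computes the crude ratio of the two annual rates for comparison.

```python
import math
from statistics import NormalDist

hr, ci_low, ci_high = 0.566, 0.361, 0.888  # values reported above

# A Wald interval is symmetric on the log scale, so the geometric mean of the
# bounds should land back on the point estimate.
hr_from_ci = math.sqrt(ci_low * ci_high)

# Standard error of log(HR), assuming a 95% Wald interval (z = 1.96).
se_log_hr = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)

# Two-sided Wald p-value implied by the point estimate and that standard error.
z = math.log(hr) / se_log_hr
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Crude ratio of the annual incidence rates (oral insulin / placebo).
rate_ratio = 6.2 / 10.4

print(f"HR from CI: {hr_from_ci:.3f}, SE(log HR): {se_log_hr:.3f}")
print(f"implied p-value: {p_value:.3f}, crude rate ratio: {rate_ratio:.3f}")
```

The implied p-value need not match the reported one exactly, since the original analysis may have used a different test.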

Ref: Update on Worldwide Efforts to Prevent Type 1 Diabetes http://www3.interscience.wiley.com/journal/121570859/abstract?CRETRY=1&SRETRY=0

Sibling recurrence risk : 15 http://www.plosgenetics.org/article/info:doi%2F10.1371%2Fjournal.pgen.1000540

Myocardial Infarction (MI)

Recent results from the Framingham Heart Study (FHS) report that 1174 out of 8491 participants (13.8%), including 456 out of 4522 women (10.0%), developed a first cardiovascular disease event during the 12 years of follow-up. http://circ.ahajournals.org/cgi/content/full/117/6/743?ijkey=66b179ef0f2e12e73d6be67108550728c864ce9b

The Collaborative Atorvastatin Diabetes Study (CARDS) was a randomized clinical trial that followed 2838 type 2 diabetic patients for 3.9 years and observed incidence rates of 2.47%/yr (placebo) and 1.54%/yr (atorvastatin). http://linkinghub.elsevier.com/retrieve/pii/S0140673604168955

We used the CARDS results to model the trial population. Note the caveats of using type 2 diabetic patients during the screening procedure.

Follow up time: 3.9 years

Incidence rate:

 2.47%/yr placebo
 1.54%/yr statin
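To turn these annual rates into the expected fraction of events over the follow-up period, one option is a constant-hazard (exponential) model. This is a sketch under that assumption, not necessarily what the R code on this page does; the function name is illustrative.

```python
import math


def cumulative_incidence(annual_rate, years):
    """Cumulative incidence under a constant hazard (exponential) model."""
    return 1.0 - math.exp(-annual_rate * years)


follow_up = 3.9  # years, from CARDS
placebo = cumulative_incidence(0.0247, follow_up)
statin = cumulative_incidence(0.0154, follow_up)
print(f"placebo: {placebo:.3f}, statin: {statin:.3f}, "
      f"risk ratio: {statin / placebo:.3f}")
```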

Sibling recurrence risk : ??? (provide references)

Table 1 from this paper: http://genomemedicine.com/content/2/2/10 gives sib risk for various diseases.

AUC value for ROC curve

Disease  Prevalence  <math>\lambda_s</math>  AUC_max  AUC_half
T1D      0.0054      13.7                    0.99     0.93
T2D      0.03        3.5                     0.92     0.82
AMD      0.015       2.9                     0.89     0.80
MI       0.04        3.2                     0.93     0.83


From this table, we find that the AUC values estimated by the logit and probit models are the same.

  • From Evans DM et al, "Harnessing the information contained within genome-wide association studies to improve individual prediction of complex disease risk.", Hum Mol Genet - Table 2, at p-value threshold 0.1
    • T2D - 0.696
    • T1D - 0.788
    • CHD - 0.595
    • AMD - ??

http://mgetit.lib.umich.edu/sfx_local?sid=google&auinit=DM&aulast=Evans&atitle=Harnessing+the+information+contained+within+genome-wide+association+studies+to+improve+individual+prediction+of+complex+disease+risk&id=doi:10.1093/hmg/ddp295&title=Human+Molecular+Genetics&volume=18&issue=18&date=2009&spage=3525&issn=0964-6906

Current list of working scripts and R codes

  • All files are in the directory fantasia:/home/hmkang/prj/CT
  • Scripts and R codes are listed in the intended order of execution.
  • Currently, the R/ and scripts/ directories need to be copied into your local directory for the code to work properly, due to a permission issue.
  • The results/ and tmp/ directories need to be created before running the following code.
  1. R CMD BATCH --slave --vanilla R/runCTE.hist.v3.R # sample 1e5 genotypes with 100 samplings of odds ratios, and write histograms, ROC curves, and AUC values using the known genetic parameters for each of the four disease traits (T2D, AMD, T1D, and MI). This takes 30 minutes for 100 samples and only 3 minutes for 10 samples.
  2. python scripts/draw_CTE.v3.hist.py # draw risk factor histograms and ROC curves for each of the four diseases
  3. R CMD BATCH --slave --vanilla R/runCTE.trials.v3.R # translate the histogram/ROC curves into the trial enrichment
  4. python scripts/draw_CTE.v3.trials.py # draw trial enrichment curves for each disease
  5. R CMD BATCH --slave --vanilla R/runCTE.gen.v3.R # trial enrichment using general risk scores based on various AUCs, assuming normality of the risk scores, given the prevalence and treatment effect size of each disease
  6. python scripts/draw_CTE.v3.gen.py # draw plots for trial enrichment using general risk scores based on various AUCs
  7. R/CTE.v3.R # core R code
  8. python scripts/draw-CTE.v3.con.py # draw contour plots of the mean versus the s.d. of risk scores with respect to AUCs. This should be preceded by running the ContourSigmaFromAUC function in R/CTE.v3.R
  • To do list
    • Draw ROC curves and CT enrichment plots (_numcost.pdf) from an AUC of 0.792, and compare them with the plots using the five loci for the AMD case.
      I've done so. The absolute percentage difference between the costs estimated the two ways is 2.9%. One visual difference is that using the AUC value gives smoother curves, unlike the zigzagging result from using the five loci.
    • If they are consistent, incorporate C-statistics from the Seddon et al. paper with only genetic factors versus with both genetic and clinical factors.
      I've also done the comparison for the T1D case. The result is very similar (the difference in cost is less than 2%), indicating consistency.
    • Need to incorporate the variance of the betas in R/runCTE.trial.v3.R and R/runCTE.gen.v3.R
    • Wait until new data arrives
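The idea behind the R/runCTE.gen.v3.R step (enrichment from a general risk score with a given AUC, assuming normally distributed scores) can be sketched in a few lines. This is not the code in R/CTE.v3.R; it assumes an equal-variance binormal score (cases N(d, 1), controls N(0, 1)), uses population prevalence as the baseline risk, and all function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def separation_from_auc(auc):
    """Mean shift d between cases and controls for an equal-variance binormal score."""
    return 2 ** 0.5 * STD_NORMAL.inv_cdf(auc)


def selected_fraction(threshold, d, prevalence):
    """Population fraction whose score exceeds the threshold."""
    cases = 1 - STD_NORMAL.cdf(threshold - d)
    controls = 1 - STD_NORMAL.cdf(threshold)
    return prevalence * cases + (1 - prevalence) * controls


def enriched_risk(auc, prevalence, top_fraction):
    """Disease risk among the top_fraction of the population ranked by score."""
    d = separation_from_auc(auc)
    lo, hi = -10.0, 10.0 + d
    for _ in range(100):  # bisection: selected_fraction decreases in threshold
        mid = (lo + hi) / 2
        if selected_fraction(mid, d, prevalence) > top_fraction:
            lo = mid
        else:
            hi = mid
    threshold = (lo + hi) / 2
    cases_selected = prevalence * (1 - STD_NORMAL.cdf(threshold - d))
    return cases_selected / top_fraction


# MI row of the AUC table above: prevalence 0.04, AUC_half 0.83, AUC_max 0.93.
for auc in (0.83, 0.93):
    risk = enriched_risk(auc, prevalence=0.04, top_fraction=0.15)
    print(f"AUC {auc}: risk in top 15% = {risk:.3f} ({risk / 0.04:.1f}x baseline)")
```

With AUC = 0.5 the score carries no information and the enriched risk equals the baseline, which is a quick sanity check on the threshold search.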

T1D trial design involving an antibody screening

Introduction (for communicating with GSK)

  • A little bit of background
  • Original request from Dawn and Li Li
                                                     Standard trial          Genetically enriched trial
Targeted proportion via genetic screening            N/A                     Top 15%
Fraction of individuals passing antibody screening   Top 0.5%, 1%, 3% or 5%  2x, 5x, or 10x the rate in the standard trial
T1D risk in placebo arm                              50%                     50%
T1D risk in treatment arm                            35%                     35%
  • What we want or need to change
    • Add flexibility to vary "targeted proportion via genetic screening"
    • Need to impose a certain probability model between "genetic risk", "T1D incidence" and "Antibody screening" variables.
    • Under some model, T1D risk may not be the same between standard and enriched trial even after applying the antibody screening

General Framework

Definitions

  • Y - binary variable - progression of disease at the end of a trial period (if they were followed-up)
  • Z - binary variable - positive/negative results from antibody screening
  • x - vector of genetic markers (or possibly including other clinical variables)
  • b(x) - risk of disease progression learned from GWAS

Input Parameters (numbers may vary)

  • Pr(Y=1|Z=1,Ctrl) = 0.5
  • Pr(Z=1|Y=1) = 0.9 or 0.8 (1 or 2 antibodies)
  • Pr(Z=1|Y=0) = 0.01 or 0.005
  • Pr(Y,Z) joint table can be calculated from the values above
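The joint table follows from Bayes' rule: the posterior odds of Y=1 given Z=1 equal the prior odds times the likelihood ratio Pr(Z=1|Y=1)/Pr(Z=1|Y=0), so the prior Pr(Y=1) can be solved for directly. A minimal sketch using the first set of values above (the function name and the dictionary layout are illustrative):

```python
def joint_table(ppv, sens, false_pos):
    """Return Pr(Y=y, Z=z) keyed by (y, z).

    ppv       = Pr(Y=1 | Z=1)
    sens      = Pr(Z=1 | Y=1)
    false_pos = Pr(Z=1 | Y=0)
    """
    # posterior odds = prior odds * (sens / false_pos), so invert for the prior
    prior_odds = ppv / (1 - ppv) * false_pos / sens
    p = prior_odds / (1 + prior_odds)  # Pr(Y=1)
    return {
        (1, 1): p * sens,
        (1, 0): p * (1 - sens),
        (0, 1): (1 - p) * false_pos,
        (0, 0): (1 - p) * (1 - false_pos),
    }


table = joint_table(ppv=0.5, sens=0.9, false_pos=0.01)
for (y, z), prob in sorted(table.items()):
    print(f"Pr(Y={y}, Z={z}) = {prob:.5f}")
```

Note that a 50% risk among antibody-positive subjects with these screening properties implies that only about 1% of the screened population would progress.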

Models

  • logit Pr(Y=1|b(x)) = \mu + b(x) : GWAS risk predictions are transferrable to clinical trials
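One way to make this model concrete is to choose the intercept \mu so that the average predicted risk matches a target value. The sketch below is an assumption-laden illustration: the scores b(x) are drawn from a normal distribution with s.d. 1 on the log-odds scale (hypothetical), and the target is the T1D prevalence of 0.0054 taken from the AUC table; neither choice comes from the trial design itself.

```python
import math
import random


def expit(v):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-v))


def calibrate_intercept(scores, target_risk):
    """Find mu so that the mean of expit(mu + b) over scores equals target_risk."""
    lo, hi = -30.0, 30.0
    for _ in range(80):  # bisection: mean risk is increasing in mu
        mid = (lo + hi) / 2
        mean_risk = sum(expit(mid + b) for b in scores) / len(scores)
        if mean_risk < target_risk:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


random.seed(0)
scores = [random.gauss(0.0, 1.0) for _ in range(5000)]  # hypothetical b(x) values
mu = calibrate_intercept(scores, target_risk=0.0054)    # assumed target risk
mean_risk = sum(expit(mu + b) for b in scores) / len(scores)
print(f"mu = {mu:.3f}, mean predicted risk = {mean_risk:.4f}")
```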

Whitehall II prospective cohort T2D study

Data

ROC

File:ROC.pdf

Reproducing ROC and AUC

Fig 1 from Talmud et al. shows

  • AUC = 0.53 (0.50, 0.58) for gene count score only
  • AUC = 0.78 (0.75, 0.82) for Framingham offspring risk score
  • AUC = 0.78 (0.75, 0.81) when gene count score incorporated into Framingham offspring score

(The paper treats these two scores as independent, so they are simply added to form a combined risk.)

Our analysis gives AUC = 0.51, 0.75 and 0.77 for the above corresponding cases.
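For reference, an AUC of this kind is the probability that a randomly chosen case has a higher score than a randomly chosen control, and the combined score is the element-wise sum of the two scores. The sketch below uses made-up toy scores on a common log-odds scale, not Whitehall II data, and is not the analysis code used for the numbers above.

```python
def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count 1/2)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical toy scores: three cases, four controls.
gene_cases, gene_controls = [0.1, 0.3, 0.0], [0.2, -0.1, 0.1, 0.0]
clin_cases, clin_controls = [1.5, 0.9, -0.4], [0.6, -0.8, -1.2, -2.0]

# Combined risk: add the two scores subject by subject.
comb_cases = [g + c for g, c in zip(gene_cases, clin_cases)]
comb_controls = [g + c for g, c in zip(gene_controls, clin_controls)]

print("genetic :", round(auc(gene_cases, gene_controls), 3))
print("clinical:", round(auc(clin_cases, clin_controls), 3))
print("combined:", round(auc(comb_cases, comb_controls), 3))
```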

AUC for subjects with impaired glucose tolerance (IGT)

Using Li Li's table for standard clinical trials, we define IGT as a 2-hour glucose level of 7.8 - 11.0 mmol/L after a 75 g oral glucose load.

  • AUC = 0.47 when only genetic risk is considered
  • AUC = 0.68 when only Framingham risk is considered
  • AUC = 0.70 when the added Framingham and genetic risk is considered

AUC for subjects with impaired fasting glucose tolerance (IFT)

Using WHO criteria: fasting plasma glucose level from 6.1 mmol/l (110 mg/dL) to 6.9 mmol/l (125 mg/dL).

  • AUC = 0.51 when only genetic risk is considered
  • AUC = 0.71 when only Framingham risk is considered
  • AUC = 0.72 when the added Framingham and genetic risk is considered

AUC for subjects with either IGT or IFT

IGT and IFT are defined as above.

  • AUC = 0.49 when only genetic risk is considered
  • AUC = 0.70 when only Framingham risk is considered
  • AUC = 0.72 when the added Framingham and genetic risk is considered


Simulation

When we do not have data, we use GWAS results to simulate genetic risk and obtain a series of AUC values. In this cohort study, the proportion of diabetic subjects is 0.037, and 16 SNPs were used in the simulation.

Result:

  • "AUC-Expected-OR" 0.589006157844499
  • "AUC-very-pessimistic" 0.546650465402407
  • "AUC-very-optimistic" 0.646407762728796
  • "AUC-Sample-Mean" 0.597908111804416
  • "AUC-SE" 0.00710488677620161
  • "preval" 0.037
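A simulation of this kind can be sketched as follows. This is not the script that produced the numbers above: the odds ratios (1.15) and risk-allele frequencies (0.3) are placeholders rather than the 16 SNPs actually used, genotypes are drawn under Hardy-Weinberg equilibrium, and risk is assumed log-additive across SNPs. Only the prevalence (0.037) and the number of SNPs (16) are taken from this page.

```python
import math
import random


def rank_auc(case_scores, control_scores):
    """AUC via the Mann-Whitney rank-sum statistic (average ranks for ties)."""
    labelled = sorted([(s, 1) for s in case_scores] +
                      [(s, 0) for s in control_scores])
    rank_sum, i = 0.0, 0
    while i < len(labelled):
        j = i
        while j < len(labelled) and labelled[j][0] == labelled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum += avg_rank * sum(lab for _, lab in labelled[i:j])
        i = j
    n1, n0 = len(case_scores), len(control_scores)
    return (rank_sum - n1 * (n1 + 1) / 2) / (n1 * n0)


def simulate_auc(odds_ratios, allele_freqs, prevalence, n=20000, seed=1):
    """Simulate genotypes and disease status, then return the AUC of the genetic score."""
    rng = random.Random(seed)
    betas = [math.log(o) for o in odds_ratios]
    # Genetic score: sum of log(OR) times risk-allele count (0, 1 or 2) per SNP.
    scores = [
        sum(b * ((rng.random() < f) + (rng.random() < f))
            for b, f in zip(betas, allele_freqs))
        for _ in range(n)
    ]
    # Choose the intercept so the mean risk matches the prevalence (bisection).
    lo, hi = -20.0, 20.0
    for _ in range(60):
        mid = (lo + hi) / 2
        mean_risk = sum(1 / (1 + math.exp(-(mid + s))) for s in scores) / n
        if mean_risk < prevalence:
            lo = mid
        else:
            hi = mid
    mu = (lo + hi) / 2
    # Draw disease status from the logistic risk and split scores by status.
    cases, controls = [], []
    for s in scores:
        risk = 1 / (1 + math.exp(-(mu + s)))
        (cases if rng.random() < risk else controls).append(s)
    return rank_auc(cases, controls)


# Placeholder effect sizes and frequencies for 16 SNPs (not the real values).
sim_auc = simulate_auc([1.15] * 16, [0.3] * 16, prevalence=0.037)
print(f"simulated AUC: {sim_auc:.3f}")
```

Repeating the call with different seeds, or with odds ratios drawn from their confidence intervals, gives a spread of AUC values analogous to the sample mean and SE reported above.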