Trial Enrichment Design
Overview
Parameters
Type 2 diabetes (T2D)
Follow-up time: a median of 3 years
Incidence:
- Rosiglitazone group: 11.6%
- Placebo group: 26.0%
Cumulative incidence in the first 2 years:
- Placebo group: 15%
- Rosiglitazone group: 5%
Ref: Effect of rosiglitazone on the frequency of diabetes in patients with impaired glucose tolerance or impaired fasting glucose: a randomised controlled trial. The DREAM study: http://www.ncbi.nlm.nih.gov/pubmed/16997664
Sibling recurrence risk: 2.5 - 4.2
This range is the ratio of the risk in first-degree relatives to the population prevalence (6%), using the statement that "15-25% of first-degree relatives of patients with T2D develop impaired glucose tolerance or diabetes" in http://www.ncbi.nlm.nih.gov/pubmed/15823385 (0.15/0.06 = 2.5 and 0.25/0.06 ≈ 4.2).
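The arithmetic behind this range can be sketched as follows. This is a minimal illustration, assuming the risk in first-degree relatives spans 15% to 25% (the endpoints implied by the 2.5 - 4.2 range at 6% prevalence); the function name is hypothetical.

```python
def sibling_recurrence_risk(risk_in_relatives: float, prevalence: float) -> float:
    """Recurrence risk ratio: risk in relatives divided by population prevalence."""
    return risk_in_relatives / prevalence

PREVALENCE = 0.06  # population prevalence of T2D quoted above

# Assumed endpoints of the risk in first-degree relatives (15% and 25%).
low = sibling_recurrence_risk(0.15, PREVALENCE)
high = sibling_recurrence_risk(0.25, PREVALENCE)
print(f"{low:.1f} - {high:.1f}")
```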
Whitehall SNPs: http://www.bmj.com/cgi/content/full/bmj.b4838/DC1 Table A
Age-related macular degeneration (AMD)
Follow-up time: 6.3 years
Sample size: 1446
From Table 2, we find the incidence rates:
- Placebo: 22%
- Antioxidants: 21%
- Zinc: 19%
- Antioxidants and zinc: 16%
Incidence rate: 279 subjects progressed to advanced AMD, so the incidence rate is 6.4%.
Ref: Seddon et al. Prediction Model for Prevalence and Incidence of Advanced Age-Related Macular Degeneration Based on Genetic, Demographic, and Environmental Variables http://www.iovs.org/cgi/content/abstract/50/5/2044
Sibling recurrence risk : 2.95 http://www.citeulike.org/user/ykaminoh/article/5382860
Type 1 diabetes (T1D)
From a subgroup analysis of the Diabetes Prevention Trial of Type 1 Diabetes (DPT-1), we find the following information
Follow-up time: 5 years
Incidence rate:
- 6.2% per year in the oral insulin group
- 10.4% per year in the placebo group
Hazard ratio: 0.566, 95% CI (0.361, 0.888), p = 0.015
Ref: Update on Worldwide Efforts to Prevent Type 1 Diabetes http://www3.interscience.wiley.com/journal/121570859/abstract?CRETRY=1&SRETRY=0
Sibling recurrence risk : 15 http://www.plosgenetics.org/article/info:doi%2F10.1371%2Fjournal.pgen.1000540
Myocardial Infarction (MI)
The Framingham Heart Study (FHS) recently reported that 1174 out of 8491 participants (13.8%), including 456 out of 4522 women (10.0%), developed a first cardiovascular disease event during 12 years of follow-up. http://circ.ahajournals.org/cgi/content/full/117/6/743?ijkey=66b179ef0f2e12e73d6be67108550728c864ce9b
The Collaborative Atorvastatin Diabetes Study (CARDS), a randomized clinical trial of 2838 type 2 diabetic patients followed for 3.9 years, observed incidence rates of 2.47%/yr (placebo) and 1.54%/yr (atorvastatin). http://linkinghub.elsevier.com/retrieve/pii/S0140673604168955
We used the CARDS results to model the trial population. Note the caveats of using type 2 diabetic patients in the screening procedure.
Follow-up time: 3.9 years
Incidence rate:
- 2.47%/yr placebo
- 1.54%/yr statin
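To use these per-year rates over the whole follow-up period, one option is to convert them to cumulative incidence. The sketch below assumes a constant hazard over the 3.9 years; that is a simplifying assumption made here, not something stated by CARDS, and the function name is hypothetical.

```python
import math

def cumulative_incidence(rate_per_year: float, years: float) -> float:
    """Cumulative incidence under a constant hazard: 1 - exp(-rate * t)."""
    return 1.0 - math.exp(-rate_per_year * years)

FOLLOW_UP_YEARS = 3.9
placebo = cumulative_incidence(0.0247, FOLLOW_UP_YEARS)
statin = cumulative_incidence(0.0154, FOLLOW_UP_YEARS)
rate_ratio = 0.0154 / 0.0247  # crude ratio of the two annual rates

print(f"placebo: {placebo:.3f}, statin: {statin:.3f}, rate ratio: {rate_ratio:.2f}")
```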
Sibling recurrence risk : ??? (provide references)
Table 1 from this paper: http://genomemedicine.com/content/2/2/10 gives sib risk for various diseases.
AUC value for ROC curve
- Table (Compare different AUC estimate)
- The probit AUC is extracted from this paper: http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1000864
- Use this web calculator http://gump.qimr.edu.au/genroc/
Disease | Prevalence | <math>\lambda_s</math> | AUC_max | AUC_half |
---|---|---|---|---|
T1D | 0.0054 | 13.7 | 0.99 | 0.93 |
T2D | 0.03 | 3.5 | 0.92 | 0.82 |
AMD | 0.015 | 2.9 | 0.89 | 0.80 |
MI | 0.04 | 3.2 | 0.93 | 0.83 |
From this table, we find that the AUC values estimated by the logit and probit models are the same.
- From Evans DM et al, "Harnessing the information contained within genome-wide association studies to improve individual prediction of complex disease risk.", Hum Mol Genet - Table 2, at p-value threshold 0.1
- T2D - 0.696
- T1D - 0.788
- CHD - 0.595
- AMD - ??
Current list of working scripts and R codes
- All files are in the directory fantasia:/home/hmkang/prj/CT
- Scripts and R codes are in order of desirable execution.
- Currently, the directories R/ and scripts/ need to be copied into your local directory for the code to work properly, due to a permission issue.
- results/ and tmp/ directories need to be created before running the following codes.
- R CMD BATCH --slave --vanilla R/runCTE.hist.v3.R # sample 1e5 genotypes with 100 samplings of odds ratios, then write histograms, ROC curves, and AUC values from the known genetic parameters for each of the four disease traits (T2D, AMD, T1D, and MI)
This takes 30 minutes for 100 samplings and only 3 minutes for 10 samplings.
- python scripts/draw_CTE.v3.hist.py # draw risk factor histograms and ROC curves for each of the four diseases
- R CMD BATCH --slave --vanilla R/runCTE.trials.v3.R # translate the histogram/ROC curves into the trial enrichment
- python scripts/draw_CTE.v3.trials.py # draw trial enrichment curves for each disease
- R CMD BATCH --slave --vanilla R/runCTE.gen.v3.R # trial enrichment using general risk scores based on various AUCs, assuming normality of the risk scores, given the prevalence and treatment effect size of each disease
- python scripts/draw_CTE.v3.gen.py # draw plots for trial enrichment using general risk scores based on various AUCs.
- R/CTE.v3.R # core R code
- python scripts/draw-CTE.v3.con.py # draw contour plots of the mean versus the s.d. of risk scores with respect to AUCs. This should be followed by running the ContourSigmaFromAUC function in R/CTE.v3.R
- To do list
- Draw ROC curves and CT enrichment plots (_numcost.pdf) from an AUC of 0.792 and compare them with the plots using the five loci for the AMD case.
I've done so. The absolute percentage difference between the costs estimated the two ways is 2.9%. One visual difference is that using the AUC value gives smoother curves, unlike the zigzagging result from using the five loci.
- If they are consistent, incorporate C-statistics from the Seddon et al. paper with only genetic factors versus with both genetic and clinical factors
I've also done the comparison for the T1D case. The result is very similar (the difference in cost is less than 2%), indicating consistency.
- Need to incorporate the variance of the betas in R/runCTE.trial.v3.R and R/runCTE.gen.v3.R
- Wait until new data arrives
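The "general risk score" step listed above assumes normally distributed risk scores. One common way to parameterize that is the equal-variance binormal model, in which non-progressors score N(0, 1), progressors score N(d, 1), and AUC = Φ(d/√2). The sketch below uses that model to compute the event risk among subjects in the top fraction of scores. It is an illustration under that assumption only; R/CTE.v3.R may parameterize the scores differently, and all function names here are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def shift_from_auc(auc: float) -> float:
    """Mean shift d between cases and controls such that AUC = Phi(d / sqrt(2))."""
    return 2 ** 0.5 * STD_NORMAL.inv_cdf(auc)

def selected_fraction(threshold: float, d: float, risk: float) -> float:
    """Fraction of the whole population scoring above the threshold."""
    cases = risk * (1.0 - STD_NORMAL.cdf(threshold - d))
    controls = (1.0 - risk) * (1.0 - STD_NORMAL.cdf(threshold))
    return cases + controls

def enriched_risk(auc: float, risk: float, top_fraction: float) -> float:
    """Event risk among subjects in the top `top_fraction` of risk scores."""
    d = shift_from_auc(auc)
    lo, hi = -10.0, 10.0 + d
    for _ in range(200):  # bisection: selected_fraction decreases in threshold
        mid = (lo + hi) / 2.0
        if selected_fraction(mid, d, risk) > top_fraction:
            lo = mid
        else:
            hi = mid
    threshold = (lo + hi) / 2.0
    cases_selected = risk * (1.0 - STD_NORMAL.cdf(threshold - d))
    return cases_selected / selected_fraction(threshold, d, risk)
```

For example, `enriched_risk(0.80, 0.10, 0.15)` gives the event risk among the top 15% of scores when the unselected risk is 10% and the AUC is 0.80. With an AUC of 0.5 the score carries no information and the function returns the unselected risk.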
T1D trial design involving an antibody screening
Introduction (for communicating with GSK)
- A little bit of background
- Original request from Dawn and Li Li
Parameter | Standard trial | Genetically enriched trial |
---|---|---|
Targeted proportion via genetic screening | N/A | top 15% |
Fraction of individuals passing antibody screening | Top 0.5%, 1%, 3% or 5% | 2x, 5x, or 10x the rate in the standard trial |
T1D risk in placebo arm | 50% | 50% |
T1D risk in treatment arm | 35% | 35% |
- What we want or need to change
- Add flexibility to vary "targeted proportion via genetic screening"
- Need to impose a certain probability model between "genetic risk", "T1D incidence" and "Antibody screening" variables.
- Under some models, T1D risk may not be the same in the standard and enriched trials even after applying the antibody screening
General Framework
Definitions
- Y - binary variable - progression of disease at the end of a trial period (if the subject were followed up)
- Z - binary variable - positive/negative results from antibody screening
- x - vector of genetic markers (or possibly including other clinical variables)
- b(x) - risk of disease progression learned from GWAS
Input Parameters (numbers may vary)
- Pr(Y=1|Z=1,Ctrl) = 0.5
- Pr(Z=1|Y=1) = 0.9 or 0.8 (1 or 2 antibodies)
- Pr(Z=1|Y=0) = 0.01 or 0.005
- Pr(Y,Z) joint table can be calculated from the values above
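The joint table follows from Bayes' rule: the three inputs determine Pr(Y=1), and the four cells follow from it. A minimal sketch for the control arm, using the first set of values listed above; the function and argument names are hypothetical.

```python
def joint_table(ppv: float, sens: float, false_pos: float) -> dict:
    """Joint Pr(Y, Z) from Pr(Y=1|Z=1), Pr(Z=1|Y=1), and Pr(Z=1|Y=0)."""
    # Bayes' rule: ppv = sens*p / (sens*p + false_pos*(1-p)); solve for p = Pr(Y=1).
    p = ppv * false_pos / (sens * (1.0 - ppv) + ppv * false_pos)
    return {
        ("Y=1", "Z=1"): sens * p,
        ("Y=1", "Z=0"): (1.0 - sens) * p,
        ("Y=0", "Z=1"): false_pos * (1.0 - p),
        ("Y=0", "Z=0"): (1.0 - false_pos) * (1.0 - p),
    }

# Pr(Y=1|Z=1,Ctrl) = 0.5, Pr(Z=1|Y=1) = 0.9, Pr(Z=1|Y=0) = 0.01
table = joint_table(ppv=0.5, sens=0.9, false_pos=0.01)
for cell, prob in table.items():
    print(cell, round(prob, 5))
```

With these inputs the implied Pr(Y=1) is 0.01/0.91, roughly 1.1%, and about 2% of subjects screen antibody positive.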
Models
- <math>\operatorname{logit} \Pr(Y=1 \mid b(x)) = \mu + b(x)</math> : GWAS risk predictions are transferable to clinical trials
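Under this model the GWAS-derived score b(x) enters as a fixed offset, and only the intercept μ is specific to the trial population. One way to use it, sketched below, is to choose μ so that the average predicted risk matches a target such as the 0.5 control-arm risk. The calibration step, the function names, and the example scores are assumptions for illustration, not part of the original analysis.

```python
import math

def risk(mu: float, b: float) -> float:
    """Pr(Y=1 | b(x)) under logit Pr(Y=1 | b(x)) = mu + b(x)."""
    return 1.0 / (1.0 + math.exp(-(mu + b)))

def calibrate_mu(scores: list[float], target_risk: float) -> float:
    """Find mu such that the mean predicted risk equals target_risk (bisection)."""
    lo, hi = -30.0, 30.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        mean_risk = sum(risk(mid, b) for b in scores) / len(scores)
        if mean_risk < target_risk:
            lo = mid  # mean risk increases with mu, so move up
        else:
            hi = mid
    return (lo + hi) / 2.0

scores = [-0.4, -0.1, 0.0, 0.2, 0.5]  # hypothetical b(x) values on the log-odds scale
mu = calibrate_mu(scores, target_risk=0.5)
print(round(mu, 4))
```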
Whitehall II prospective cohort T2D study
Data
ROC
Reproducing ROC and AUC
Fig 1 from Talmud et al. shows
- AUC = 0.53 (0.50, 0.58) for gene count score only
- AUC = 0.78 (0.75, 0.82) for Framingham offspring risk score
- AUC = 0.78 (0.75, 0.81) when gene count score incorporated into Framingham offspring score
(the paper treats these two scores as independent, so we simply add them to form a combined risk)
Our analysis gives AUC = 0.51, 0.75 and 0.77 for the above corresponding cases.
AUC for subjects with impaired glucose tolerance (IGT)
Using Li Li's table for standard clinical trials, we define IGT as a 2-hour 75 g oral glucose tolerance test value of 7.8 - 11.0 mmol/L.
- AUC = 0.47 using genetic risk only
- AUC = 0.68 using Framingham risk only
- AUC = 0.70 using the sum of Framingham and genetic risk
AUC for subjects with impaired fasting glucose tolerance (IFT)
Using WHO criteria: fasting plasma glucose level from 6.1 mmol/l (110 mg/dL) to 6.9 mmol/l (125 mg/dL).
- AUC = 0.51 using genetic risk only
- AUC = 0.71 using Framingham risk only
- AUC = 0.72 using the sum of Framingham and genetic risk
AUC for subjects with either IGT or IFT
IGT and IFT are defined as above.
- AUC = 0.49 using genetic risk only
- AUC = 0.70 using Framingham risk only
- AUC = 0.72 using the sum of Framingham and genetic risk
Simulation
When we do not have data, we use GWAS results to simulate genetic risk and obtain a series of AUC values. In this cohort study, the proportion of diabetic subjects is 0.037, and 16 SNPs were used in the simulation.
Result:
- "AUC-Expected-OR" 0.589006157844499
- "AUC-very-pessimistic" 0.546650465402407
- "AUC-very-optimistic" 0.646407762728796
- "AUC-Sample-Mean" 0.597908111804416
- "AUC-SE" 0.00710488677620161
- "preval" 0.037
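A minimal sketch of this kind of simulation is given below. It assumes independent SNPs in Hardy-Weinberg equilibrium, additive effects on the log-odds scale, and an intercept set from the prevalence. The allele frequencies and odds ratios are placeholders, not the 16 SNPs used above, so the resulting AUC is not expected to reproduce the values listed.

```python
import math
import random

def simulate_auc(freqs, odds_ratios, prevalence, n=20000, seed=1):
    """Simulate genotypes and disease status, then return the empirical AUC."""
    rng = random.Random(seed)
    betas = [math.log(o) for o in odds_ratios]
    # Center the score so the average risk is roughly the prevalence.
    mean_score = sum(2 * f * b for f, b in zip(freqs, betas))
    intercept = math.log(prevalence / (1 - prevalence)) - mean_score
    scores, labels = [], []
    for _ in range(n):
        # Risk-allele count per SNP: two independent draws (Hardy-Weinberg).
        score = sum(b * ((rng.random() < f) + (rng.random() < f))
                    for f, b in zip(freqs, betas))
        p = 1 / (1 + math.exp(-(intercept + score)))
        scores.append(score)
        labels.append(rng.random() < p)
    # AUC via the rank-sum (Mann-Whitney) statistic, averaging ranks of exact ties.
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    n_case = sum(labels)
    n_ctrl = n - n_case
    rank_sum = sum(r for r, y in zip(ranks, labels) if y)
    return (rank_sum - n_case * (n_case + 1) / 2) / (n_case * n_ctrl)

# Placeholder parameters: 16 SNPs with identical frequency and odds ratio.
auc = simulate_auc([0.3] * 16, [1.15] * 16, prevalence=0.037)
print(round(auc, 3))
```

Repeating the call with odds ratios drawn from their confidence intervals (and different seeds) would give a spread of AUC values comparable in spirit to the expected, pessimistic, and optimistic figures above.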