********************************************************************************* ***** ***** ***** REPLICATION CODE FOR ***** ***** ***** ***** Treatment eligibility and retention in clinical HIV care: ***** ***** regression-discontinuity evidence from South Africa ***** ***** ***** ***** Jacob Bor, Matthew P. Fox, Sydney Rosen, Atheendar Venkataramani, ***** ***** Frank Tanser, Deenan Pillay, Till Barnighausen ***** ***** ***** ***** PLOS Medicine 2017. Correspondence to: jbor@bu.edu ***** ***** ***** ********************************************************************************* * This file contains the Stata code (Stata/SE 14.2) used to implement our analysis. * Part 1 of this code constructs the analytical dataset used in this paper from * the raw datasets available from the Africa Health Research Institute (www.ahri.org). * Part 2 of this code implements the analysis, begining with the analytical dataset. * We have made the de-identified analytical dataset available as Supplementary * Information "S1 Data". Part 1 is included for purposes of transparency, but is * commented out, as most users will begin with the analytic dataset. * Information on the Hlabisa HIV Treatment and Care Programme (Hlabisa Cohort) can * be found in the Cohort Profile published as Houlihan et al. (2011) Int J Epidemiology. * https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3195268/ *** PART 1 - Construction of Analytical Dataset * Move these three raw data files to $filepath: * (1) "AH03-01 All patients in HIV Care/AH03-01 All patients in HIV Care.dta" * (2) "AH03-02 All Lab Tests/AH03-02 All Lab Tests.dta" * (3) "AH03-03 ART Clinic Visits/AH03-03 ART Clinic Visits.dta" * REMOVE "/*" ON LINE 41 AND "*/" ON LINE 235 TO EXECUTE PART 1 /* global filepath = "/Users/jacobbor/Dropbox/RD350_retention" clear clear matrix clear mata set mem 150m set more off cd $filepath *** Extract patient-level data use "AH03-01 All patients in HIV Care/AH03-01 All patients in HIV Care.dta", clear foreach v of varlist DateOfBirth DateOfDeath DateOfInitiation DateOfEarliestCD4 DateLastFollowedUp LastClinicVisitDate DateLatestPCcollected DateLastPickedUpMedication DateOfLastWeightKgBeforeInit DateOfLastHeightCmBeforeInit DateOfLastHIVStagingChildrenBefo DateOfLastHIVStagingAdultsBefore DateRegdAtEarliestClinic ARTDateEligible ARTDateTrainingStarted DoctorsPrescriptionDate { local newvar = substr("`v'",1,25) rename `v' _`newvar' gen `v' = date(_`newvar',"YMD") format `v' %td drop _`newvar' } local variables "PersonNo ACDIS_IIntId HIS_PatientId DateOfBirth DateOfDeath Sex EarliestRegistrationClinic* CauseOfDeath Isigodi DateOfEarliestCD4Count EarliestCD4Count EarliestCD4Pct DateOfInitiation LastCD4DateBeforeInitiation LastCD4CountBeforeInitiation LastCD4PctBeforeInitiation IsTransferIn IsTransferinName CurrentFollowUpStatus CurrentFollowUpStatusName DateLastFollowedUp LastClinicVisitDate DateLatestPCcollected DateLastPickedUpMedication DateOfLastWeightKgBeforeInit LastWeightKgBeforeInit DateOfLastHeightCmBeforeInit LastHeightCmBeforeInit DateOfLastHIVStagingChildrenBefo LastHIVStagingChildrenBeforeInit DateOfLastHIVStagingAdultsBefore LastHIVStagingAdultsBeforeInit DateRegdAtEarliestClinic DateOfEarliestCD4 ARTDateEligible ARTDateTrainingStarted DateOfInitiation DoctorsPrescriptionDate ARTDaysEligibleToInitiation DisclosedHIVStatusToAnyone" local variables "PersonNo DateOfBirth Sex EarliestRegistrationClinicName DateOfEarliestCD4Count EarliestCD4Count DateOfInitiation" keep `variables' order `variables' *** Recode variables gen female = Sex == "F" replace female = . if Sex == "U" gen age = (DateOfEarliestCD4 - DateOfBirth) / 365.24 egen agecat = cut(age), at(0,18,25,30,35,40,45,55,999) /* for public use data */ gen QtrOfEarliestCD4Count = qofd(DateOfEarliestCD4Count) /* for public use data */ format QtrOfEarliestCD4Count %tq gen Clinic_A = EarliestRegistrationClinicName == "SUPPRESSED" gen Clinic_B = EarliestRegistrationClinicName == "SUPPRESSED" gen Clinic_C = EarliestRegistrationClinicName == "SUPPRESSED" gen Clinic_Other = (1-Clinic_A)*(1-Clinic_B)*(1-Clinic_C) *** Limit data to study population * Exclusions count drop if DateOfEarliestCD4 == . drop if EarliestCD4Count == . drop if DateOfEarliestCD4 < td(12aug2011) drop if DateOfEarliestCD4 >= td(1jan2013) /* ensure at least one year of potential follow-up for all patients (follow up ends Jan2014)*/ count if DateOfEarliestCD4 > DateOfInitiation local N_drop = r(N) drop if DateOfEarliestCD4 > DateOfInitiation count dis `N_drop'/(`N_drop' + r(N)) save "temp Hlabisa350 patients.dta", replace *** Bring in labs and clinic visits data * Extract routine clinic visits (labeled as "TestDate") use "AH03-03 ART Clinic Visits/AH03-03 ART Clinic Visits.dta", clear keep if VisitType ==8 keep PersonNo VisitDate rename VisitDate _VisitDate gen VisitDate = date(_VisitDate,"YMD") format VisitDate %td drop _VisitDate * Now, bring in routine labs data append using "AH03-02 All Lab Tests/AH03-02 All Lab Tests.dta", keep(PersonNo TestDate ViralLoad ViralLoadBelowLDL CD4Count) * drop lab tests that were not CD4/VL tests keep if ViralLoad < . | CD4Count < . | VisitDate < . keep PersonNo VisitDate TestDate * merge back into patient-level data merge m:1 PersonNo using "temp Hlabisa350 patients.dta" keep if _merge ==3 drop _merge rename TestDate _TestDate gen TestDate = date(_TestDate ,"YMD") format TestDate %td drop _TestDate * include DatesOfInitiation as a "testdate" gen TestOrVisitDate = TestDate replace TestOrVisitDate = VisitDate if TestDate ==. sort PersonNo TestOrVisitDate by PersonNo: gen rank = _n expand 2 if rank ==1, gen(new) replace TestDate = . if new ==1 replace VisitDate = . if new ==1 replace TestDate = DateOfInitiation if new == 1 replace TestOrVisitDate = DateOfInitiation if new == 1 sort PersonNo TestOrVisitDate drop new *** Define ART Initiation Outcomes cap drop ART_* gen ART_2w = 0 replace ART_2w = 1 if DateOfInitiation < DateOfEarliestCD4Count + 14 gen ART_6m = 0 replace ART_6m = 1 if DateOfInitiation < DateOfEarliestCD4Count + 182 gen ART_12m = 0 replace ART_12m = 1 if DateOfInitiation < DateOfEarliestCD4Count + 365 *** Define Retention in Care Outcomes * Note: for followup out to 18 and 24 mo, restrict sample to people presenting with at least 18 (or 24) mo follow-up sort PersonNo TestOrVisitDate * define main outcome - retention at 12mo (6-18), including routine clinic visits gen _visit_test_6_18 = TestOrVisitDate > DateOfEarliestCD4Count + 182 & TestOrVisitDate < DateOfEarliestCD4Count + 365 + 182 by PersonNo: egen visit_test_6_18 = max(_visit_test_6_18) by PersonNo: replace visit_test_6_18 = . if _n>1 replace visit_test_6_18 = . if DateOfEarliestCD4Count + 365 +182 > td(1jan2014) * alternate definition, without clinic visits gen _test_6_18 = TestDate > DateOfEarliestCD4Count + 182 & TestDate < DateOfEarliestCD4Count + 365 + 182 by PersonNo: egen test_6_18 = max(_test_6_18) by PersonNo: replace test_6_18 = . if _n>1 replace test_6_18 = . if DateOfEarliestCD4Count + 365 +182 > td(1jan2014) * time-path of retention, w/o clinic visits gen _test_0_6 = TestDate > DateOfEarliestCD4Count + 0 & TestDate < DateOfEarliestCD4Count + 182 by PersonNo: egen test_0_6 = max(_test_0_6) by PersonNo: replace test_0_6 = . if _n>1 gen _test_6_12 = TestDate > DateOfEarliestCD4Count + 182 & TestDate < DateOfEarliestCD4Count + 365 by PersonNo: egen test_6_12 = max(_test_6_12) by PersonNo: replace test_6_12 = . if _n>1 gen _test_12_18 = TestDate > DateOfEarliestCD4Count + 365 & TestDate < DateOfEarliestCD4Count + 365 + 182 by PersonNo: egen test_12_18 = max(_test_12_18) by PersonNo: replace test_12_18 = . if _n>1 replace test_12_18 = . if DateOfEarliestCD4Count + 365 +182 > td(1jan2014) gen _test_18_24 = TestDate > DateOfEarliestCD4Count + 365 + 182 & TestDate < DateOfEarliestCD4Count + 365 + 365 by PersonNo: egen test_18_24 = max(_test_18_24) by PersonNo: replace test_18_24 = . if _n>1 replace test_18_24 = . if DateOfEarliestCD4Count + 365 +365 > td(1jan2014) * keep only the first obs per patient, so that the dataset is at the level of the patient, not the lab test by PersonNo: keep if _n==1 count if test_6_18 < . count if visit_test_6_18 <. tab test_6_18 visit_test_6_18 *** Drop identifying information for public use data (and extraneous variables) drop _* rank DateOfEarliestCD4Count DateOfInitiation PersonNo VisitDate TestDate TestOrVisitDate DateOfBirth age Sex EarliestRegistrationClinicName order EarliestCD4Count QtrOfEarliestCD4Count female agecat Clinic_A Clinic_B Clinic_C Clinic_Other ART_2w ART_6m ART_12m visit_test_6_18 test_6_18 test_* *** Label variables label var EarliestCD4Count "Value of Earliest CD4 Count, cells per mm3" label var QtrOfEarliestCD4Count "Quarter of Earliest CD4 Count (exact date suppressed)" label var female "Female" label var agecat "Age category (exact age suppressed)" label var Clinic_A "Entered care at Clinic A" label var Clinic_B "Entered care at Clinic B" label var Clinic_C "Entered care at Clinic C" label var Clinic_Other "Entered care at a clinic other than A, B, or C" label var ART_2w "Started ART within 2 weeks of earliest CD4 count" label var ART_6m "Started ART within 6 months of earliest CD4 count" label var ART_12m "Started ART within 12 months of earliest CD4 count" label var visit_test_6_18 "Retained at 12 months; any labs or clinic visits 6-18 mo after earliest CD4" label var test_6_18 "Retained at 12 months; any lab tests 6-18 months after earliest CD4" label var test_0_6 "Retained 0-6 months; any lab tests 0-6 months after earliest CD4" label var test_6_12 "Retained 6-12 months; any lab tests 6-12 months after earliest CD4" label var test_12_18 "Retained 12-18 months; any lab tests 12-18 months after earliest CD4" label var test_18_24 "Retained 18-24 months; any lab tests 18-24 months after earliest CD4" label define agecat 0 "0 to <18 yrs" label define agecat 18 "18 to <25 yrs", add label define agecat 25 "25 to <30 yrs", add label define agecat 30 "30 to <35 yrs", add label define agecat 35 "35 to <40 yrs", add label define agecat 40 "40 to <45 yrs", add label define agecat 45 "45 to <55 yrs", add label define agecat 55 "55+ yrs", add label values agecat agecat save "Hlabisa350 patients, w labs and visits.dta", replace save "S1_Data.dta", replace */ ************************* *** PART 2 - ANALYSIS *** ************************* cd $filepath * install rdrobust package net install st0366.pkg global filepath = "/Users/jbor/Dropbox/RD350_retention/RR2" clear clear matrix clear mata set mem 150m set more off use "S1_Data.dta", clear *** Generate additional variables based on earliest CD4 count gen d350_FirstCD4 = EarliestCD4Count - 350 gen CD4elig = EarliestCD4Count <350 egen CD4cat = cut(EarliestCD4C), at(0(25)800) gen CD4cat_5 = CD4cat + 12.5 *** Descriptive statistics and support for study design count ** Fig 1, Density of first CD4 counts cap drop Yj Xj r0 fhat se_fhat f_uci f_lci DCdensity EarliestCD4Count, breakpoint(350) h(75) generate(Xj Yj r0 fhat se_fhat) nograph gen f_uci = fhat + 1.96*se_fhat gen f_lci = fhat - 1.96*se_fhat local ldiff: display %3.2f round(r(theta), .01) local ldiff_lci: display %3.2f round(r(theta)-1.96*r(se), .01) local ldiff_uci: display %3.2f round(r(theta)+1.96*r(se), .01) dis `ldiff' dis `ldiff_lci' dis `ldiff_uci' twoway (scatter Yj Xj if Xj >=200 & Xj < 500, ms(oh)) /// (line fhat r0 if r0 >=200 & r0 < 350, sort lc(maroon)) /// (line fhat r0 if r0 >=350 & r0 < 500, sort lc(maroon)) /// (line f_uci r0 if r0 >=200 & r0 < 350, sort lc(gs8) lw(thin)) /// (line f_uci r0 if r0 >=350 & r0 < 500, sort lc(gs8) lw(thin)) /// (line f_lci r0 if r0 >=200 & r0 < 350, sort lc(gs8) lw(thin)) /// (line f_lci r0 if r0 >=350 & r0 < 500, sort lc(gs8) lw(thin)) /// , xline(350) xlabel(200(50)500, labsize(med)) legend(off) /// xtitle("Earliest CD4 Count, cells/uL", size(medlarge)) /// ytitle(Density, size(medlarge)) ylabel(0(.001).003, labsize(med) nogrid) /// saving(DCdensity_350_HLABISA_AC, replace) graphregion(col(white)) graph export DCdensity_350_HLABISA_AC.eps, replace dis "Log-difference at discontinuity: is `ldiff', 95% CI: (`ldiff_lci', `ldiff_uci')" ** Table 1, continuity in baseline covariates (NOTE: Age and DateOfEarliestCD4 are made categorical in de-identified data) * Sex, Age, Date rdbwselect female EarliestCD4C, c(350) kernel(uni) all local h_fem = e(h_IK) dis `h_fem' rdbwselect age EarliestCD4C, c(350) kernel(uni) all local h_age = e(h_IK) dis `h_age' rdbwselect DateOfEarliestCD4 EarliestCD4C, c(350) kernel(uni) all local h_date = e(h_IK) dis `h_date' *local h_fem = 143.35493 *local h_age = 124.58904 *local h_date = 153.6015 reg female i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <`h_fem', r est sto fem reg age i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <`h_age', r est sto age reg DateOfEarliestCD4 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <`h_date', r est sto date * Clinic at presentation rdbwselect Clinic_A EarliestCD4C, c(350) kernel(uni) all local h_A = e(h_IK) dis `h_A' rdbwselect Clinic_B EarliestCD4C, c(350) kernel(uni) all local h_B = e(h_IK) dis `h_B' rdbwselect Clinic_C EarliestCD4C, c(350) kernel(uni) all local h_C = e(h_IK) dis `h_C' rdbwselect Clinic_Other EarliestCD4C, c(350) kernel(uni) all local h_Other = e(h_IK) dis `h_Other' *local h_A = 83.177498 *local h_B = 128.16771 *local h_C = 142.9457 reg Clinic_A i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <`h_A', r est sto A reg Clinic_B i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <`h_B', r est sto B reg Clinic_C i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <`h_C', r est sto C reg Clinic_Other i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <`h_Other', r est sto Other outreg2 [fem age date A B C Other] using RD350observables, replace dec(2) pdec(3) stats(coef ci pval) noaster *** FIRST STAGE - Effect of eligibility on ART uptake ** Fig S1, Fig 3, and Table 2 (column 1) replace ART_2w = 100*ART_2w replace ART_6m = 100*ART_6m replace ART_12m = 100*ART_12m * estimate binned averages reg ART_2w i.CD4cat, r predict ART_2w_hat reg ART_6m i.CD4cat, r predict ART_6m_hat reg ART_12m i.CD4cat, r predict ART_12m_hat ** Fig S1 bys CD4cat_5: gen firstobs = 1 if _n==1 twoway (scatter ART_2w_hat CD4cat_5 if firstobs==1, msymbol(+)) /// (scatter ART_6m_hat CD4cat_5 if firstobs==1, msymbol(Oh)) /// (scatter ART_12m_hat CD4cat_5 if firstobs==1, msymbol(t)) /// , legend(order(3 "12 months" 2 "6 months" 1 "2 weeks") pos(2) ring(0) region(lc(white)) col(1)) /// ylabel(0(20)60, labsize(med) nogrid) ytitle(Percent Started ART, size(medlarge)) xline(350) xlabel(0(200)800, labsize(med)) /// xtitle("Earliest CD4 Count, cells/{&mu}L", size(medlarge)) saving(FirstStageRD350_Appendix, replace) graphregion(col(white)) graph export FirstStageRD350_Appendix.eps, replace ** Fig 3 * choose bw for first stage RD model rdbwselect ART_6m EarliestCD4C, c(350) kernel(uni) all local h_firststage = e(h_IK) dis `h_firststage' * 96.42979 local h_firststage = 96.42979 local l_bw = 350 - `h_firststage' local r_bw = 350 + `h_firststage' twoway (scatter ART_6m_hat CD4cat_5 if firstobs==1, msymbol(oh)) /// (lpolyci ART_6m EarliestCD4C if EarliestCD4C < 350 & EarliestCD4C>=0, deg(1) bw(`h_firststage') fc(none) alc(gs8) alw(thin)) /// (lpolyci ART_6m EarliestCD4C if EarliestCD4C < 800 & EarliestCD4C>=350, deg(1) bw(`h_firststage') lc(maroon) fc(none) alc(gs8) alw(thin)), /// ylabel(0(20)60, labsize(med) nogrid) ytitle("Percent Starting ART Within 6 Months", size(medlarge)) /// xline(350) xline(`l_bw' `r_bw', lp(shortdash) lw(vthin) lc(gs4)) xlabel(0(200)800, labsize(med)) xtitle("Earliest CD4 Count, cells/{&mu}L", size(medlarge)) /// legend(off) saving(FirstStageRD350, replace) graphregion(col(white)) graph export FirstStageRD350.eps, replace *** INTENT-TO-TREAT: Effect of eligibility on retention ** Fig 2, ITT effect of eligibility on 12-month retention replace visit_test_6_18 = 100*visit_test_6_18 * binned averages reg visit_test_6_18 i.CD4cat, r predict visit_test_6_18_hat * choose bw for reduced form RD model rdbwselect visit_test_6_18 EarliestCD4C, c(350) kernel(uni) all local h_rf_6_18_alt = e(h_IK) dis `h_rf_6_18_alt' *142.1 local h_rf_6_18_alt = 142.1 local l_bw = 350 - `h_rf_6_18_alt' local r_bw = 350 + `h_rf_6_18_alt' twoway (scatter visit_test_6_18_hat CD4cat_5 if firstobs==1, msymbol(oh)) /// (lpolyci visit_test_6_18 EarliestCD4C if EarliestCD4C < 350 & EarliestCD4C>=0, deg(1) bw(`h_rf_6_18_alt') fc(none) alc(gs8) alw(thin)) /// (lpolyci visit_test_6_18 EarliestCD4C if EarliestCD4C < 800 & EarliestCD4C>=350, deg(1) bw(`h_rf_6_18_alt') lc(maroon) fc(none) alc(gs8) alw(thin)), /// ylabel(0(20)60, labsize(med) nogrid) ytitle(Percent Retained at 12 Months, size(medlarge)) /// xline(350) xline(`l_bw' `r_bw', lp(shortdash) lw(vthin) lc(gs4)) xlabel(0(200)800, labsize(med)) xtitle("Earliest CD4 Count, cells/uL", size(medlarge)) /// legend(off) saving(ITT_visit_test_6_18_RD350, replace) graphregion(color(white)) graph export ITT_visit_test_6_18_RD350.eps, replace ** Fig S2, alternate definition of retention replace test_6_18 = 100*test_6_18 reg test_6_18 i.CD4cat, r predict test_6_18_hat rdbwselect test_6_18 EarliestCD4C, c(350) kernel(uni) all local h_rf_6_18 = e(h_IK) dis `h_rf_6_18' local h_rf_6_18 = 116.8066 twoway (scatter test_6_18_hat CD4cat_5 if firstobs==1, msymbol(oh)) /// (lpolyci test_6_18 EarliestCD4C if EarliestCD4C < 350 & EarliestCD4C>=0, deg(1) bw(`h_rf_6_18') fc(none) alc(gs8) alw(thin)) /// (lpolyci test_6_18 EarliestCD4C if EarliestCD4C < 800 & EarliestCD4C>=350, deg(1) bw(`h_rf_6_18') lc(maroon) fc(none) alc(gs8) alw(thin)), /// ylabel(0(20)60, labsize(med) nogrid) ytitle(Percent Retained at 12 Months, size(medlarge)) /// xline(350) xlabel(0(200)800, labsize(med)) xtitle("Earliest CD4 Count, cells/uL", size(medlarge)) /// legend(off) saving(ITT_test_6_18_RD350, replace) graphregion(color(white)) graph export ITT_test_6_18_RD350.eps, replace ** Fig S3, Retention in 6-mo intervals (based just on lab results; combining visits and lab results gives similar results) replace test_0_6 = 100*test_0_6 replace test_6_12 = 100*test_6_12 replace test_12_18 = 100*test_12_18 replace test_18_24 = 100*test_18_24 * binned averages reg test_0_6 i.CD4cat, r predict test_0_6_hat reg test_6_12 i.CD4cat, r predict test_6_12_hat reg test_12_18 i.CD4cat, r predict test_12_18_hat reg test_18_24 i.CD4cat, r predict test_18_24_hat rdbwselect test_0_6 EarliestCD4C, c(350) kernel(uni) all local h_rf_0_6 = e(h_IK) rdbwselect test_6_12 EarliestCD4C, c(350) kernel(uni) all local h_rf_6_12 = e(h_IK) rdbwselect test_12_18 EarliestCD4C, c(350) kernel(uni) all local h_rf_12_18 = e(h_IK) rdbwselect test_18_24 EarliestCD4C, c(350) kernel(uni) all local h_rf_18_24 = e(h_IK) dis `h_rf_0_6' *114.24279 dis `h_rf_6_12' *164.74241 dis `h_rf_12_18' *125.3563 dis `h_rf_18_24' *164.15499 twoway (scatter test_0_6_hat CD4cat_5 if firstobs==1, msymbol(oh)) /// (lpolyci test_0_6 EarliestCD4C if EarliestCD4C < 350 & EarliestCD4C>=0, deg(1) bw(`h_rf_0_6') fc(none) alc(gs8) alw(thin)) /// (lpolyci test_0_6 EarliestCD4C if EarliestCD4C < 800 & EarliestCD4C>=350, deg(1) bw(`h_rf_0_6') lc(maroon) fc(none) alc(gs8) alw(thin)), /// ylabel(0(20)60, labsize(med) nogrid) ytitle(Retained 0-6 mo, size(medlarge)) /// xline(350) xlabel(0(200)800, labsize(med)) xtitle("Earliest CD4 Count, cells/uL", size(medlarge)) /// legend(off) saving(ITT_test_0_6_RD350, replace) graphregion(color(white)) graph export ITT_test_0_6_RD350.eps, replace twoway (scatter test_6_12_hat CD4cat_5 if firstobs==1, msymbol(oh)) /// (lpolyci test_6_12 EarliestCD4C if EarliestCD4C < 350 & EarliestCD4C>=0, deg(1) bw(`h_rf_6_12') fc(none) alc(gs8) alw(thin)) /// (lpolyci test_6_12 EarliestCD4C if EarliestCD4C < 800 & EarliestCD4C>=350, deg(1) bw(`h_rf_6_12') lc(maroon) fc(none) alc(gs8) alw(thin)), /// ylabel(0(20)60, labsize(med) nogrid) ytitle(Retained 6-12 mo, size(medlarge)) /// xline(350) xlabel(0(200)800, labsize(med)) xtitle("Earliest CD4 Count, cells/uL", size(medlarge)) /// legend(off) saving(ITT_test_6_12_RD350, replace) graphregion(color(white)) graph export ITT_test_6_12_RD350.eps, replace twoway (scatter test_12_18_hat CD4cat_5 if firstobs==1, msymbol(oh)) /// (lpolyci test_12_18 EarliestCD4C if EarliestCD4C < 350 & EarliestCD4C>=0, deg(1) bw(`h_rf_12_18') fc(none) alc(gs8) alw(thin)) /// (lpolyci test_12_18 EarliestCD4C if EarliestCD4C < 800 & EarliestCD4C>=350, deg(1) bw(`h_rf_12_18') lc(maroon) fc(none) alc(gs8) alw(thin)), /// ylabel(0(20)60, labsize(med) nogrid) ytitle(Retained 12-18 mo, size(medlarge)) /// xline(350) xlabel(0(200)800, labsize(med)) xtitle("Earliest CD4 Count, cells/uL", size(medlarge)) /// legend(off) saving(ITT_test_12_18_RD350, replace) graphregion(color(white)) graph export ITT_test_12_18_RD350.eps, replace twoway (scatter test_18_24_hat CD4cat_5 if firstobs==1, msymbol(oh)) /// (lpolyci test_18_24 EarliestCD4C if EarliestCD4C < 350 & EarliestCD4C>=0, deg(1) bw(`h_rf_18_24') fc(none) alc(gs8) alw(thin)) /// (lpolyci test_18_24 EarliestCD4C if EarliestCD4C < 800 & EarliestCD4C>=350, deg(1) bw(`h_rf_18_24') lc(maroon) fc(none) alc(gs8) alw(thin)), /// ylabel(0(20)60, labsize(med) nogrid) ytitle(Retained 18-24 mo, size(medlarge)) /// xline(350) xlabel(0(200)800, labsize(med)) xtitle("Earliest CD4 Count, cells/uL", size(medlarge)) /// legend(off) saving(ITT_test_18_24_RD350, replace) graphregion(color(white)) graph export ITT_test_18_24_RD350.eps, replace graph combine ITT_test_0_6_RD350.gph ITT_test_6_12_RD350.gph ITT_test_12_18_RD350.gph ITT_test_18_24_RD350.gph, /// col(2) saving(retention_combined_RD350, replace) graphregion(color(white)) graph export retention_combined_RD350.eps, replace *** Regression results ** Table 2 and Table S1, Intent-to-treat regression results * calculate IK bandwidth rdbwselect ART_6m EarliestCD4C, c(350) kernel(uni) all local h_ART_0_6 = e(h_IK) rdbwselect test_6_18 EarliestCD4C, c(350) kernel(uni) all local h_rf_6_18 = e(h_IK) rdbwselect visit_test_6_18 EarliestCD4C, c(350) kernel(uni) all local h_rf_6_18_alt = e(h_IK) rdbwselect test_0_6 EarliestCD4C, c(350) kernel(uni) all local h_rf_0_6 = e(h_IK) rdbwselect test_6_12 EarliestCD4C, c(350) kernel(uni) all local h_rf_6_12 = e(h_IK) rdbwselect test_12_18 EarliestCD4C, c(350) kernel(uni) all local h_rf_12_18 = e(h_IK) rdbwselect test_18_24 EarliestCD4C, c(350) kernel(uni) all local h_rf_18_24 = e(h_IK) dis ` h_ART_0_6' *96.42979 dis `h_rf_6_18' *116.8066 dis `h_rf_6_18_alt' *142.0977 dis `h_rf_0_6' *114.24279 dis `h_rf_6_12' *164.74241 dis `h_rf_12_18' *125.3563 dis `h_rf_18_24' *164.15499 reg ART_6m i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <`h_ART_0_6', r est sto fs margins CD4elig, at(d350_FirstCD4==0) reg visit_test_6_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <`h_rf_6_18_alt', r est sto r_main margins CD4elig, at(d350_FirstCD4==0) reg test_0_6 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <`h_rf_0_6', r est sto r_1 margins CD4elig, at(d350_FirstCD4==0) reg test_6_12 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <`h_rf_6_12', r est sto r_2 margins CD4elig, at(d350_FirstCD4==0) reg test_12_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <`h_rf_12_18', r est sto r_3 margins CD4elig, at(d350_FirstCD4==0) reg test_18_24 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <`h_rf_18_24', r est sto r_4 margins CD4elig, at(d350_FirstCD4==0) reg test_6_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <`h_rf_6_18', r est sto r_alt margins CD4elig, at(d350_FirstCD4==0) outreg2 [fs r_main r_1 r_2 r_3 r_4 r_alt] using RD350regs, replace dec(2) pdec(3) stats(coef ci pval) noaster ** Table S2, Robustness, for Appendix: bw = 50 estimates clear reg ART_6m i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 50, r est sto fs reg visit_test_6_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 50, r est sto r_main reg test_0_6 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 50, r est sto r_1 reg test_6_12 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 50, r est sto r_2 reg test_12_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 50, r est sto r_3 reg test_18_24 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 50, r est sto r_4 reg test_6_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 50, r est sto r_alt outreg2 [fs r_main r_1 r_2 r_3 r_4 r_alt] using RD350regs_bw50, replace dec(2) pdec(3) stats(coef ci pval) noaster ** Table S3, Robustness, for Appendix: bw = 100 estimates clear reg ART_6m i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 100, r est sto fs reg visit_test_6_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 100, r est sto r_main reg test_0_6 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 100, r est sto r_1 reg test_6_12 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 100, r est sto r_2 reg test_12_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 100, r est sto r_3 reg test_18_24 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 100, r est sto r_4 reg test_6_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 100, r est sto r_alt outreg2 [fs r_main r_1 r_2 r_3 r_4 r_alt] using RD350regs_bw100, replace dec(2) pdec(3) stats(coef ci pval) noaster ** Table S4, Robustness, for Appendix: bw = 150 estimates clear reg ART_6m i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 150, r est sto fs reg visit_test_6_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 150, r est sto r_main reg test_0_6 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 150, r est sto r_1 reg test_6_12 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 150, r est sto r_2 reg test_12_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 150, r est sto r_3 reg test_18_24 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 150, r est sto r_4 reg test_6_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 150, r est sto r_alt outreg2 [fs r_main r_1 r_2 r_3 r_4 r_alt] using RD350regs_bw150, replace dec(2) pdec(3) stats(coef ci pval) noaster ** Table S5, Robustness, for Appendix: bw = 200 estimates clear reg ART_6m i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 200, r est sto fs reg visit_test_6_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 200, r est sto r_main reg test_0_6 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 200, r est sto r_1 reg test_6_12 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 200, r est sto r_2 reg test_12_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 200, r est sto r_3 reg test_18_24 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 200, r est sto r_4 reg test_6_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) < 200, r est sto r_alt outreg2 [fs r_main r_1 r_2 r_3 r_4 r_alt] using RD350regs_bw200, replace dec(2) pdec(3) stats(coef ci pval) noaster ** Table S6, Robustness, local LOGISTIC regression model *IK bandwidths for linear model local h_ART_0_6 = 96.42979 local h_rf_6_18_alt = 142.09767 local h_rf_0_6 = 114.24279 local h_rf_6_12 = 164.74241 local h_rf_12_18 = 125.3563 local h_rf_18_24 = 164.15499 local h_rf_6_18 = 116.8066 *using logistic regression instead, and obtaining predicted margins at the threshold logistic ART_6m i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <`h_ART_0_6', r margins, at(d350_FirstCD4==0) by(CD4elig) margins, at(d350_FirstCD4==0) dydx(CD4elig) logistic visit_test_6_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <`h_rf_6_18_alt', r margins, at(d350_FirstCD4==0) by(CD4elig) margins, at(d350_FirstCD4==0) dydx(CD4elig) logistic test_0_6 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <`h_rf_0_6', r margins, at(d350_FirstCD4==0) by(CD4elig) margins, at(d350_FirstCD4==0) dydx(CD4elig) logistic test_6_12 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <`h_rf_6_12', r margins, at(d350_FirstCD4==0) by(CD4elig) margins, at(d350_FirstCD4==0) dydx(CD4elig) logistic test_12_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <`h_rf_12_18', r margins, at(d350_FirstCD4==0) by(CD4elig) margins, at(d350_FirstCD4==0) dydx(CD4elig) logistic test_18_24 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <`h_rf_18_24', r margins, at(d350_FirstCD4==0) by(CD4elig) margins, at(d350_FirstCD4==0) dydx(CD4elig) logistic test_6_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <`h_rf_6_18', r margins, at(d350_FirstCD4==0) by(CD4elig) margins, at(d350_FirstCD4==0) dydx(CD4elig) * note: table S6 constructed by hand *** Fuzzy RD / Instrumental Variables Analysis *** Note: to simplify, we use a bandwidth of 100 throughout. ** Complier average causal effect (CACE) / Local average treatment effect (LATE) estimate * first stage (effect of eligibility on ART uptake within first 6mo) reg ART_6m i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <100 & visit_test_6_18 <., r scalar define fs = _b[1.CD4elig] * intent-to-treat / reduced form (effect of eligibility on retention at 12mo) reg visit_test_6_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <100, r est sto alt_rf scalar define rf = _b[1.CD4elig] * wald estimator dis rf/fs * two-stage least squares ivregress 2sls visit_test_6_18 i.CD4elig#c.d350_FirstCD4 c.d350_FirstCD4 (ART_6m=CD4elig) if abs(d350_FirstCD4) <100 scalar define late_2sls = _b[ART_6m] ** Control complier mean * obtain p_d1z1 and p_d1z0 reg ART_6m i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <100 & visit_test_6_18 <., r scalar define p_d1z1 = _b[1.CD4elig] + _b[_cons] scalar define p_d1z0 = _b[_cons] * obtain Ey_d0z1 and Ey_d0z0 from "conditional on not being treated" regression. reg visit_test_6_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <100 & ART_6m==0, r est sto alt_rf_c scalar define Ey_d0z1 = _b[1.CD4elig] + _b[_cons] scalar define Ey_d0z0 = _b[_cons] * calculate control complier mean scalar define Ey0_c = (Ey_d0z1*(1-p_d1z1) - Ey_d0z0*(1-p_d1z0)) / ((1-p_d1z1) - (1-p_d1z0)) scalar list Ey0_c ** Treated complier mean * obtain Ey_d1z1 and Ey_d1z0 from "conditional on treated" regression. reg visit_test_6_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <100 & ART_6m==100, r est sto alt_rf_t scalar define Ey_d1z1 = _b[1.CD4elig] + _b[_cons] scalar define Ey_d1z0 = _b[_cons] * calculate control complier mean scalar define Ey1_c = (Ey_d1z1*(p_d1z1) - Ey_d1z0*(p_d1z0)) / (p_d1z1 - p_d1z0) scalar list Ey1_c scalar list *LATE two ways: estimates are close dis "LATE 2sls = " late_2sls dis "Absolute risk difference | compliers = " Ey1_c - Ey0_c * Note, for consistency across estimates, we estimate the control complier mean as: Ey1_c - late * However, similar results are obtained using the above estimates of Ey1_c and Ey0_c, or replacing Ey1_c with Ey0_c + late * These alternate estimates can be computed using the regression output from Table S7. ** Table S7 - conditional on treated / untreated regressions used for the fuzzy RDD estimates outreg2 [fs rf rf_c rf_t alt_rf alt_rf_c alt_rf_t] using RD350_cond_on_trt, replace dec(2) pdec(3) stats(coef ci pval) noaster ** Table 3, Fuzzy RDD Results * TC dis Ey1_c * CC dis Ey1_c - 100*late_2sls * AT, estimated in conditional-on-treated regression (as 350 approached from above) dis Ey_d1z0 * NT, estimated in conditional-on-untreated regression (as 350 approached from below) dis Ey_d0z1 *CACE dis "Complier Average Causal Effect (CACE) = " late_2sls*100 *CRR dis "Causal Relative Risk | compliers = " Ey1_c/(Ey1_c - 100*late_2sls) dis "Causal Relative Risk of attrition | compliers = " (100-Ey1_c)/(100-(Ey1_c - 100*late_2sls)) * Bootstrap to get CI's for fuzzy RD estimates cap program drop fuzzyRDD program define fuzzyRDD, rclass preserve keep if abs(d350_FirstCD4) <100 & visit_test_6_18 <. * Using visit_test_6_18 * Fuzzy RD - using bw = 100 qui reg ART_6m i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <100 & visit_test_6_18 <., r scalar define fs = _b[1.CD4elig] scalar define p_d1z1 = _b[1.CD4elig] + _b[_cons] scalar define p_d1z0 = _b[_cons] qui reg visit_test_6_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <100, r scalar define rf = _b[1.CD4elig] scalar define late_wald = rf/fs return scalar late_wald = rf/fs *treated complier mean qui reg visit_test_6_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <100 & ART_6m==100, r scalar define Ey_d1z1 = _b[1.CD4elig] + _b[_cons] * AT scalar define Ey_d1z0 = _b[_cons] return scalar Ey_d1z0 = _b[_cons] * TC scalar define Ey1_c = (Ey_d1z1*(p_d1z1) - Ey_d1z0*(p_d1z0)) / (p_d1z1 - p_d1z0) return scalar Ey1_c = (Ey_d1z1*(p_d1z1) - Ey_d1z0*(p_d1z0)) / (p_d1z1 - p_d1z0) * CC scalar define Ey0_c = Ey1_c - late_wald return scalar Ey0_c = Ey1_c - late_wald *CRR return scalar CRR = (1-Ey1_c)/(1-Ey0_c) *treated complier mean qui reg visit_test_6_18 i.CD4elig##c.d350_FirstCD4 if abs(d350_FirstCD4) <100 & ART_6m==0, r * NT, estimated in conditional-on-untreated regression (as 350 approached from below) return scalar Ey_d0z1 = _b[1.CD4elig] + _b[_cons] restore end fuzzyRDD return list bootstrap r(CRR) r(Ey_d1z0) r(Ey_d0z1) r(Ey1_c) r(Ey0_c) r(late_wald), r(501): fuzzyRDD estat bootstrap, p estat bootstrap, all