In this notebook, you will compile your final report. For each step, show one figure that justifies your inclusion or exclusion of variables.
Remember, you can save your figures to the images folder and include them using the following syntax:
![](images/ace.png)
Each group should submit the HTML preview (it will be called final_report.nb.html) to laderast@ohsu.edu. Rename the file with your group name first (for example, final_report_groupx.nb.html). Make sure to fill out the author field above with everyone’s name!
Don’t worry about removing variables from the model at each step; we’re only adding variables to the model. Also, you don’t have to show the performance of the model after each step. Only show the performance of the final model.
Step 1: Initial Model
Choose the variables for your initial model from the following list. You don’t need to show any figures here.
any_cvd (your outcome)
age_s1
gender
bmi_s1
neck20
#show your code for the basic model here
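A minimal sketch of what the basic model could look like, assuming a logistic regression with any_cvd coded 0/1. To stay self-contained it runs on toy_data, a hypothetical synthetic data frame that only borrows the column names listed above. It is not SHHS data, and the gender labels and value ranges are made up for illustration. In the notebook you would pass shhs_data instead.

```r
set.seed(42)
n <- 500

# Synthetic stand-in for shhs_data: the column names match the assignment,
# but every value is simulated and carries no real-world meaning.
toy_data <- data.frame(
  age_s1 = round(runif(n, 40, 90)),
  gender = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  bmi_s1 = rnorm(n, mean = 28, sd = 5),
  neck20 = rnorm(n, mean = 38, sd = 4)
)

# Simulated binary outcome that depends on age only (illustrative assumption)
toy_data$any_cvd <- rbinom(n, 1, plogis(-6 + 0.07 * toy_data$age_s1))

# Logistic regression: outcome on the left, starting covariates on the right
base_model <- glm(any_cvd ~ age_s1 + gender + bmi_s1 + neck20,
                  data = toy_data, family = binomial)

# Coefficient table (estimates are on the log-odds scale)
summary(base_model)$coefficients
```

Exponentiating a coefficient with exp() turns it into an odds ratio, which is often easier to discuss in a report.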
Step 2: Do you add race to your model?
Put a short definition of the race variable used in your model. If you think it is important to add race and you are satisfied with its quality in the dataset, show a figure here that supports including it. If you don’t think it’s important or you aren’t satisfied with the quality of the race variable, show a figure here that supports leaving it out.
#put model code here
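One possible figure for this step is a bar chart of outcome counts within each race category, which shows both how sparse each category is and how the outcome is distributed across them. The sketch below uses toy_data, a hypothetical synthetic data frame; the category labels group_a, group_b and group_c are placeholders, not the coding used in the real dataset.

```r
set.seed(1)
n <- 300

# Synthetic stand-in: placeholder categories with deliberately uneven sizes
toy_data <- data.frame(
  race = factor(sample(c("group_a", "group_b", "group_c"), n,
                       replace = TRUE, prob = c(0.8, 0.1, 0.1))),
  any_cvd = rbinom(n, 1, 0.3)
)

# Cross-tabulate outcome (rows) by race category (columns)
counts <- table(outcome = toy_data$any_cvd, race = toy_data$race)

# Side-by-side bars: small categories are immediately visible
barplot(counts, beside = TRUE, legend.text = c("no CVD", "CVD"),
        xlab = "race", ylab = "participants")

# Proportion with the outcome inside each category
prop_cvd <- prop.table(counts, margin = 2)["1", ]
prop_cvd

# Number of missing values in the variable
sum(is.na(toy_data$race))
```

Categories with very few participants give unstable proportions, which is one data-quality argument you could make in either direction.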
Step 3: Do you add hypertension to your model?
Investigate adding one of these variables to your model. If you think the variable is important, show one figure that supports including it. Discuss your choice of variable, how it is measured or calculated, and its impact on your model.
htnderv_s1
srhype
systbp
#put model code here
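To judge a candidate variable's impact, one option is to compare the model with and without it using a likelihood-ratio test and AIC. The sketch below is an illustration on toy_data, a hypothetical synthetic data frame, and it assumes htnderv_s1 is a 0/1 indicator; check the data dictionary for the real coding.

```r
set.seed(7)
n <- 500

# Synthetic stand-in: values are simulated, only the column names are borrowed
toy_data <- data.frame(
  age_s1     = round(runif(n, 40, 90)),
  htnderv_s1 = rbinom(n, 1, 0.4)   # assumed 0/1 coding
)
toy_data$any_cvd <- rbinom(
  n, 1, plogis(-5 + 0.05 * toy_data$age_s1 + 0.8 * toy_data$htnderv_s1)
)

# Nested models: the larger one adds a single covariate to the smaller one
smaller <- glm(any_cvd ~ age_s1, data = toy_data, family = binomial)
larger  <- update(smaller, . ~ . + htnderv_s1)

# Likelihood-ratio test of the added term
lrt <- anova(smaller, larger, test = "Chisq")
lrt

# Lower AIC indicates a better trade-off between fit and complexity
AIC(smaller, larger)
```

Both models must be fit to the same rows for this comparison to be valid, so drop rows with missing values in the candidate variable before fitting either one.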
Step 4: Apnea Hypopnea Index
Investigate adding one of these variables to your model. If you think the variable is important, show one figure that supports including it. Discuss your choice of variable, how it is measured or calculated, and its impact on your model.
ahi_a0h3
ahi_a0h4
#put model code here
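Index variables such as these are often right-skewed, so a figure on a log scale can be easier to read than one on the raw scale. The sketch below draws a boxplot of log(1 + AHI) by outcome on toy_data, a hypothetical synthetic data frame; the exponential distribution used to simulate ahi_a0h3 is an assumption made only to produce skewed values.

```r
set.seed(3)
n <- 400

# Synthetic stand-in: simulated, right-skewed index and an unrelated outcome
toy_data <- data.frame(
  ahi_a0h3 = rexp(n, rate = 1 / 10),
  any_cvd  = rbinom(n, 1, 0.3)
)

# log1p(x) = log(1 + x), which stays defined when the index is exactly zero
toy_data$log_ahi <- log1p(toy_data$ahi_a0h3)

# One box per outcome group
boxplot(log_ahi ~ any_cvd, data = toy_data,
        names = c("no CVD", "CVD"),
        xlab = "any_cvd", ylab = "log(1 + ahi_a0h3)")

# Group medians to quote alongside the figure
medians <- tapply(toy_data$log_ahi, toy_data$any_cvd, median)
medians
```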
Step 5: Evaluation of Final Model
Assess the impact of selecting complete cases for your covariates. If you like, you can show a before/after vis_dat for your set of variables (before dropping NAs and after dropping NAs). At the very least, show the number of rows before and after.
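A minimal sketch of the row count before and after keeping complete cases, restricted to the variables that are actually in the model. The toy_data frame and the model_vars vector are hypothetical and exist only for illustration.

```r
# Tiny hand-made example with a few missing values (not real data)
toy_data <- data.frame(
  any_cvd = c(0, 1, 0, 1, 0, NA),
  age_s1  = c(55, 61, NA, 70, 48, 66),
  bmi_s1  = c(27.1, NA, 30.2, 25.4, 29.9, 31.0),
  neck20  = c(38, 40, 36, NA, 39, 41)
)

# Only variables in the final model should decide which rows are dropped
model_vars <- c("any_cvd", "age_s1", "bmi_s1")

rows_before <- nrow(toy_data)
complete    <- toy_data[complete.cases(toy_data[, model_vars]), ]
rows_after  <- nrow(complete)

c(before = rows_before, after = rows_after,
  dropped = rows_before - rows_after)

# Missing values per variable show which covariate costs the most rows
colSums(is.na(toy_data[, model_vars]))
```

Here the row with a missing neck20 value is kept because neck20 is not in model_vars. For the optional before/after picture, vis_dat() from the visdat package can be called on the same two data frames.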
For your test set, calculate your predicted probabilities and plot them as a histogram. Choose a threshold based on your priorities (do you want fewer false positives or fewer false negatives?), and assess the accuracy and balanced accuracy of your thresholded model.
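The evaluation described above can be sketched as follows. This is an illustration on toy_data, a hypothetical synthetic data frame, with a single covariate, a 70/30 split, and a threshold of 0.3. All three choices are arbitrary assumptions and not recommendations.

```r
set.seed(11)
n <- 600

# Synthetic stand-in: outcome simulated from age only
toy_data <- data.frame(age_s1 = round(runif(n, 40, 90)))
toy_data$any_cvd <- rbinom(n, 1, plogis(-6 + 0.08 * toy_data$age_s1))

# Hold out 30% of rows as a test set
test_idx <- sample(n, size = round(0.3 * n))
train <- toy_data[-test_idx, ]
test  <- toy_data[test_idx, ]

fit <- glm(any_cvd ~ age_s1, data = train, family = binomial)

# type = "response" returns probabilities rather than log-odds
test$prob <- predict(fit, newdata = test, type = "response")
hist(test$prob, breaks = 20,
     main = "Predicted probabilities (test set)", xlab = "P(any_cvd = 1)")

# Lowering the threshold trades false negatives for false positives
threshold <- 0.3
predicted <- factor(as.integer(test$prob >= threshold), levels = c(0, 1))
observed  <- factor(test$any_cvd, levels = c(0, 1))
conf_mat  <- table(predicted = predicted, observed = observed)
conf_mat

sensitivity <- conf_mat["1", "1"] / sum(conf_mat[, "1"])
specificity <- conf_mat["0", "0"] / sum(conf_mat[, "0"])
accuracy    <- sum(diag(conf_mat)) / sum(conf_mat)

# Balanced accuracy is the mean of sensitivity and specificity
balanced_accuracy <- (sensitivity + specificity) / 2
c(accuracy = accuracy, balanced_accuracy = balanced_accuracy)
```

Balanced accuracy is the more informative of the two when the outcome is uncommon, because plain accuracy can look good for a model that predicts the majority class for everyone. The confusionMatrix() function in caret reports these quantities as well, if you prefer not to compute them by hand.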
Were the sleep covariates (neck20, ahi_a0h3, and ahi_a0h4) useful in predicting any_cvd? Talk about why or why not.
Given your final results, how would you recommend the model be used?