In this notebook, you will compile your final report. For each step, show one figure that justifies your inclusion or exclusion of variables.

Remember, you can save your figures to the images folder and include them using the following syntax:

![](images/ace.png)

Each group should submit the HTML preview to laderast@ohsu.edu. The file will be called final_report.nb.html; rename it to include your group name (for example, final_report_groupx.nb.html). Make sure to fill out the author field above with everyone’s name!

Don’t worry about removing variables from the model at each step. We’re only adding variables to the model. Also, you don’t have to show the performance of the model after each step; show only the performance of the final model.

Step 1: Initial Model

Choose the variables for your initial model from the following. You don’t need to show any figures here.

- any_cvd (your outcome)

- age_s1
- gender
- bmi_s1
- neck20

#show your code for the basic model here
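One possible shape for the basic model is a logistic regression of any_cvd on the four candidate predictors. The sketch below is only an illustration: it runs on simulated data (the data frame `sim` and every value in it are made up), it assumes any_cvd is coded 0/1, and it uses base R's `glm()` rather than any particular modeling workflow from the course.

```r
# Simulated stand-in for the course data; the real columns may be coded differently.
set.seed(1)
n <- 500
sim <- data.frame(
  age_s1 = round(runif(n, 40, 85)),
  gender = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  bmi_s1 = rnorm(n, 28, 5),
  neck20 = rnorm(n, 38, 4)
)
# Outcome generated so that risk rises with age (an arbitrary choice for this demo)
sim$any_cvd <- rbinom(n, 1, plogis(-6 + 0.07 * sim$age_s1))

# Logistic regression: any_cvd is the outcome, the other four are predictors
base_model <- glm(any_cvd ~ age_s1 + gender + bmi_s1 + neck20,
                  data = sim, family = binomial())

# Coefficient table (estimates are on the log-odds scale)
summary(base_model)$coefficients
```

In the notebook itself, `broom::tidy(base_model)` should return the same coefficient table as a data frame.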

Step 2: Do you add race to your model?

Give a short definition of race as it is recorded in the dataset. Whether you decide to add race (because you think it is important and you are satisfied with its quality in the dataset) or to leave it out (because you don’t think it’s important or you aren’t satisfied with the quality of the race variable), show a figure here that supports your decision.

#put model code here
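A figure for this step could show both how the race variable is distributed (including missing values) and how the outcome varies across its categories. The following is a minimal sketch on simulated data: the category labels, the proportions, and the share of missing values are all assumptions and say nothing about the real dataset.

```r
# Simulated race and outcome vectors; categories and missingness are made up.
set.seed(2)
n <- 500
race <- sample(c("White", "Black", "Other", NA), n, replace = TRUE,
               prob = c(0.70, 0.15, 0.10, 0.05))
any_cvd <- rbinom(n, 1, 0.2)

# Recode NA as its own category so missingness shows up in the figure
race_f <- ifelse(is.na(race), "Missing", race)

race_counts <- table(race_f)               # participants per category
cvd_rate <- tapply(any_cvd, race_f, mean)  # proportion with any_cvd per category

# Two panels side by side: data quality on the left, outcome rate on the right
op <- par(mfrow = c(1, 2))
barplot(race_counts, main = "Participants per category", ylab = "Count", las = 2)
barplot(cvd_rate, main = "Proportion with any_cvd", ylab = "Proportion", las = 2)
par(op)
```

Wrapping the plotting calls between `png("images/race.png")` (a hypothetical file name) and `dev.off()` would save the figure to the images folder.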

Step 3: Do you add hypertension to your model?

Investigate adding one of these variables to your model. If you think the variable is important, show one figure for including it. Talk about your choice of variable, how it is measured/calculated, and its impact on your model.

- htnderv_s1
- srhype
- systbp

#put model code here
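One way to describe a variable's impact on the model is to compare the model with and without it. The sketch below does this for htnderv_s1 using a likelihood-ratio test and AIC. It is illustrative only: the data are simulated, htnderv_s1 is assumed to be a 0/1 indicator, and the effect size built into the simulation is arbitrary.

```r
# Simulated data; htnderv_s1 is assumed here to be a 0/1 hypertension indicator.
set.seed(3)
n <- 500
sim <- data.frame(
  age_s1     = round(runif(n, 40, 85)),
  bmi_s1     = rnorm(n, 28, 5),
  htnderv_s1 = rbinom(n, 1, 0.4)
)
sim$any_cvd <- rbinom(n, 1, plogis(-6 + 0.07 * sim$age_s1 + 0.8 * sim$htnderv_s1))

# Model without the candidate variable
base_model <- glm(any_cvd ~ age_s1 + bmi_s1, data = sim, family = binomial())

# Same model with the candidate variable added
htn_model <- update(base_model, . ~ . + htnderv_s1)

# Likelihood-ratio test for the nested models, and AIC for both
anova(base_model, htn_model, test = "Chisq")
AIC(base_model, htn_model)
```

A lower AIC and a small p-value in the likelihood-ratio test would both point toward keeping the variable, though neither replaces the figure and the discussion of how the variable is measured.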

Step 4: Apnea Hypopnea Index

Investigate adding one of these variables to your model. If you think the variable is important, show one figure for including it. Talk about your choice of variable, how it is measured/calculated, and its impact on your model.

- ahi_a0h3
- ahi_a0h4

#put model code here
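For a continuous variable such as an apnea hypopnea index, a figure comparing its distribution between outcome groups is one option. The sketch below assumes the index is non-negative and right-skewed, which is why it plots log(1 + x); the simulated values and the group difference are invented for illustration and do not reflect the real data.

```r
# Simulated outcome and AHI values; the exponential shape is an assumption.
set.seed(4)
n <- 500
any_cvd <- rbinom(n, 1, 0.2)
ahi_a0h3 <- rexp(n, rate = 1 / (8 + 6 * any_cvd))

# log1p() compresses the long right tail so both groups are readable
boxplot(log1p(ahi_a0h3) ~ any_cvd,
        names = c("No CVD", "CVD"),
        xlab = "any_cvd", ylab = "log(1 + ahi_a0h3)",
        main = "AHI by outcome group")

# Group medians on the original scale, to quote alongside the figure
tapply(ahi_a0h3, any_cvd, median)
```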

Step 5: Evaluation of Final Model

Assess the impact of selecting complete cases for your covariates. If you like, you can show a before/after vis_dat for your set of variables (before dropping NAs and after dropping NAs). At the very least, show the number of rows before and after.
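The row counts before and after dropping incomplete cases can be obtained in a few lines. This is a sketch on simulated data with missing values injected at random; the variable set in `model_vars` is a placeholder for whichever variables end up in the final model.

```r
# Simulated data with missing values injected into two covariates.
set.seed(5)
n <- 500
sim <- data.frame(
  any_cvd = rbinom(n, 1, 0.2),
  age_s1  = round(runif(n, 40, 85)),
  bmi_s1  = rnorm(n, 28, 5),
  neck20  = rnorm(n, 38, 4)
)
sim$bmi_s1[sample(n, 30)] <- NA
sim$neck20[sample(n, 50)] <- NA

# Placeholder: replace with the variables in your final model
model_vars <- c("any_cvd", "age_s1", "bmi_s1", "neck20")

# Keep only rows with no NA in any of the model variables
complete <- sim[complete.cases(sim[, model_vars]), ]

# Rows before and after, plus the number of NAs per variable
c(before = nrow(sim), after = nrow(complete))
colSums(is.na(sim[, model_vars]))
```

In the notebook, `tidyr::drop_na()` is an alternative to the `complete.cases()` subsetting, and `visdat::vis_dat()` called on the data before and after gives the visual comparison.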

For your test set, calculate your predicted probabilities and plot them as a histogram. Choose a threshold based on your priorities (do you want fewer false positives or fewer false negatives?), and assess the accuracy/balanced accuracy of your thresholded model.
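The evaluation steps above can be sketched as follows. Everything here is an assumption made for illustration: the data are simulated, the 70/30 split is arbitrary, and the threshold of 0.2 is a placeholder rather than a recommendation. Balanced accuracy is computed as the mean of sensitivity and specificity.

```r
# Simulated data; any_cvd is assumed to be coded 0/1.
set.seed(6)
n <- 1000
sim <- data.frame(
  age_s1 = round(runif(n, 40, 85)),
  bmi_s1 = rnorm(n, 28, 5)
)
sim$any_cvd <- rbinom(n, 1, plogis(-6 + 0.07 * sim$age_s1))

# Hold out 30% of the rows as a test set
test_idx <- sample(n, size = 0.3 * n)
train <- sim[-test_idx, ]
test  <- sim[test_idx, ]

fit <- glm(any_cvd ~ age_s1 + bmi_s1, data = train, family = binomial())

# Predicted probabilities on the test set, shown as a histogram
test$prob <- predict(fit, newdata = test, type = "response")
hist(test$prob, breaks = 20,
     main = "Predicted probability of any_cvd (test set)",
     xlab = "Predicted probability")

# Placeholder threshold: lowering it trades false negatives for false positives
threshold <- 0.2
pred  <- factor(as.integer(test$prob >= threshold), levels = c(0, 1))
truth <- factor(test$any_cvd, levels = c(0, 1))
cm <- table(predicted = pred, actual = truth)

sensitivity <- cm["1", "1"] / sum(cm[, "1"])  # true positive rate
specificity <- cm["0", "0"] / sum(cm[, "0"])  # true negative rate
accuracy <- sum(diag(cm)) / sum(cm)
balanced_accuracy <- (sensitivity + specificity) / 2

cm
c(accuracy = accuracy, balanced_accuracy = balanced_accuracy)
```

`caret::confusionMatrix()` reports the same quantities from the predicted and true factors; its `positive` argument should be set to the level that marks an event.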

Were the sleep covariates (neck20, ahi_a0h3, and ahi_a0h4) useful in predicting any_cvd? Talk about why or why not.

Given your final results, how would you recommend the model be used?
