Data Exploration of NHANES

Ted Laderas, Jessica Minnier, Thomas Frohwein

2019-01-25

Introductions

Please Note

This workshop adheres to the BioData Club Code of Conduct.

The Code of Conduct exists to maintain a psychologically safe and inclusive environment for everyone. Please email me at laderast@ohsu.edu or text me (503-481-8470) if you see any potential violations.

If you violate the CoC, you may be asked to leave.

These Slides are Here

http://bit.ly/data_nhanes

Our Overall Goal

Why are we here?

What is NHANES and why are we looking at it?

Take a Look at the Data as a Sheet

NHANES Extract in Google Sheet Form

NHANES is a valuable dataset in many ways

We can understand an outcome and look at its association with measured variables in the data.

Let’s look at three outcomes today:

Before we Start

Get into groups by your chosen outcome. Introduce yourselves, and pair off within your groups.

Come up with one question about your outcome you’re curious about.

What do you expect is the case?

See if you can answer it!

What is Exploratory Data Analysis?

Remember

“Exploratory data analysis can never be the whole story, but nothing else can serve as the foundation stone.” - John Tukey, Exploratory Data Analysis

Why Look at your Data?

Systolic Blood Pressure

Why Visualization?

EDA is about Visualization First

Running the Explorer App

We’ll start exploring the data immediately!

Go to the apps below:

We’ll split the scavenger hunt by outcome, ask questions of the data, and then come back together to present.

Map your questions to a tab:

What is the Overview Tab for?

Data Explorer

Overview Tab

Let’s try it!

  1. How many categorical variables are there? (in R, we call them factors)
  2. How many missing cases are there for your outcome?
  3. What is the mean age for the dataset?
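
For readers who want to see what these three questions amount to outside the app, here is a minimal Python sketch. The rows are made up for illustration and are not NHANES values. The column names `Age`, `Gender`, and `Depressed`, the category labels, and the rule "a column is categorical if its values are text" are all assumptions of this sketch, not a description of how the Overview tab works internally.

```python
from statistics import mean

# Hypothetical toy rows, NOT real NHANES data; None marks a missing value.
rows = [
    {"Age": 34, "Gender": "female", "Depressed": "Several"},
    {"Age": 51, "Gender": "male", "Depressed": None},
    {"Age": 27, "Gender": "female", "Depressed": "Most"},
    {"Age": 62, "Gender": "male", "Depressed": "Several"},
]


def is_categorical(column):
    """Treat a column as categorical if all its non-missing values are text."""
    values = [row[column] for row in rows if row[column] is not None]
    return bool(values) and all(isinstance(v, str) for v in values)


# 1. Which variables are categorical?
categorical_columns = [c for c in rows[0] if is_categorical(c)]

# 2. How many cases are missing the outcome (assumed here to be "Depressed")?
missing_outcome = sum(1 for row in rows if row["Depressed"] is None)

# 3. Mean age, skipping any missing ages.
mean_age = mean(row["Age"] for row in rows if row["Age"] is not None)

print(categorical_columns, missing_outcome, mean_age)
```

Swapping in a different outcome only means changing the column name used in step 2.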

What is the Category Tab for?

Categorical Tab

Categorical Example

Do people with the most days of LittleInterest also have the most days of Depression?

Categories: Let’s try it!

  1. How many categories are there for your outcome?
  2. What is the category with the largest counts for your outcome?
  3. Do the proportions of people with your outcome look the same for those who use marijuana versus those who don’t use it?
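
The questions above come down to counting categories and comparing proportions between two groups. A minimal Python sketch of that idea follows. The records are invented for illustration and are not NHANES values, and the labels `"Yes"`/`"No"` and `"Never"`/`"Several"`/`"Most"` are placeholders rather than the dataset's actual coding.

```python
from collections import Counter

# Hypothetical (marijuana use, outcome) pairs, NOT real NHANES data.
records = [
    ("Yes", "Several"), ("Yes", "Most"), ("Yes", "Several"), ("Yes", "Never"),
    ("No", "Never"), ("No", "Never"), ("No", "Several"), ("No", "Never"),
]

# 1 and 2. Number of outcome categories and the most common one.
outcome_counts = Counter(outcome for _, outcome in records)
largest_category, largest_count = outcome_counts.most_common(1)[0]


def proportions(group):
    """Share of each outcome category within one marijuana-use group."""
    counts = Counter(outcome for g, outcome in records if g == group)
    total = sum(counts.values())
    return {outcome: n / total for outcome, n in counts.items()}


# 3. Compare the outcome proportions between the two groups.
print(len(outcome_counts), largest_category, largest_count)
print("Yes:", proportions("Yes"))
print("No: ", proportions("No"))
```

Comparing proportions rather than raw counts matters when the two groups differ in size, which is usually the case in survey data.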

Continuous Tab

Continuous Scatter

If you get fewer hours of sleep per night, does that mean you have a higher BMI?
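
A scatter plot answers this visually. One common numeric companion is the Pearson correlation coefficient, sketched below in plain Python. The data points are invented for illustration and are not NHANES values, and using Pearson's r here is a choice made for this sketch, not something the app is stated to compute.

```python
from math import sqrt

# Hypothetical (hours of sleep, BMI) pairs, NOT real NHANES data.
pairs = [(5, 31.0), (6, 29.0), (7, 26.0), (8, 25.0), (9, 24.0)]


def pearson(points):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    var_y = sum((y - mean_y) ** 2 for _, y in points)
    return cov / sqrt(var_x * var_y)


r = pearson(pairs)
print(round(r, 2))  # negative here: less sleep goes with higher BMI in the toy data
```

A value near -1 or 1 indicates a strong linear association. It does not show that one variable causes the other.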

Continuous Boxplot

If you have many depressed episodes, do you also get less sleep?
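
A boxplot compares the distribution of a continuous variable across categories. Its center line is the median, so a rough text version of the same comparison is a per-group median and range. The sketch below uses invented numbers, not NHANES values, and the group labels are placeholders.

```python
from statistics import median

# Hypothetical hours of sleep per depressed-days category, NOT real NHANES data.
sleep_by_group = {
    "Never": [7, 8, 7, 6, 8],
    "Several": [6, 7, 6, 5, 7],
    "Most": [5, 6, 4, 5, 6],
}

# The median is the line drawn inside each box of a boxplot.
medians = {group: median(hours) for group, hours in sleep_by_group.items()}

for group, hours in sleep_by_group.items():
    print(f"{group}: median={medians[group]}, range={min(hours)}-{max(hours)}")
```

If the medians shift steadily from one category to the next, the boxes in the plot will look like a staircase.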

Continuous: Let’s Try it!

Let’s start the scavenger hunt

Let’s learn from each other

Each group should present the findings from 2-3 interesting questions:

  1. Where did you find it in the app?
  2. What variables did you look at?
  3. What did you expect?
  4. What did you find?

Physical Activity Notes

Jess’ excellent overview of physical activity in NHANES

Some Final Notes on NHANES

Congratulations

You are now a full-fledged data explorer!

https://waynepelletier.com/work/tasty-icons

Please!

Fill out our post-workshop survey form. We need this information to run more workshops!

Post Survey Form

Let’s do more scavenger hunts

Are there other interesting public datasets that people want to look at?

Contact us! laderast@ohsu.edu

We can build apps for them!

http://laderast.github.io/burro