Posts

I want to thank everyone who has reached out to me after I wrote my post on struggling with my depression and self-care. I am incredibly grateful for everyone’s concern about me. I wrote that at a low point in my life because I had to. I was suffering too long in silence, and I needed to do something. Writing that post was incredibly scary. I am still worried that it may be used against me somehow down the line when I am reviewed for tenure.

CONTINUE READING

Motivation

A few weeks ago, as part of the rOpenSci Unconference, a group of us (Sean Hughes, Malisa Smith, Angela Li, Ju Kim and me) decided to work on making the UMAP algorithm accessible within R. UMAP (Uniform Manifold Approximation and Projection) is a dimensionality reduction technique that lets the user reduce high-dimensional data (many columns) to a smaller number of columns (usually two) for visualization purposes. It is related to both Principal Component Analysis (PCA) and t-SNE, which are techniques often used in the single-cell omics world to visualize high-dimensional data.

CONTINUE READING

Well, Cascadia-R 2018 has come and gone. This year we tried our best to make it as inclusive, welcoming, and friendly as we could. Considering we had 224 participants this time around, I’d say it was a success. I thought I would do a little write-up of some of the things we did at our conference and why we did them. I’m hoping it will help other conference planners create a welcoming environment.

CONTINUE READING

Note: I am not writing the following to complain, or to excuse any past behavior. I am writing this just to be honest and transparent about my current struggles in academia. I hope it helps someone, or encourages others to seek help. I have to confess that I haven’t really been feeling all that well the past few months. Right now I am plagued with feelings that I am doing my work as an Assistant Professor wrong.

CONTINUE READING

Even though I’ve been using the tidyverse for a couple of years, there are always a couple of new applications of tidyverse verbs. This one, in retrospect, is pretty simple. I had a one-to-many table that I wanted to collapse, tidy-style. Let’s look at the diamonds dataset:

diamonds %>% select(color, cut)
## # A tibble: 53,940 x 2
##    color cut
##    <ord> <ord>
##  1 E     Ideal
##  2 E     Premium
##  3 E     Good
##  4 I     Premium
##  5 J     Good
##  6 J     Very Good
##  7 I     Very Good
##  8 H     Very Good
##  9 E     Fair
## 10 H     Very Good
## # .

CONTINUE READING

In academia, it’s inevitable to have to travel and present at conferences and meetings. As an introvert, I’ve been trying to compile a few tips that have helped me navigate large conferences so I don’t feel overwhelmed. It is unfortunate that though the academic community has many introverts, conference and meeting structure is heavily biased towards extroverts. (No offense to extroverts, but some of you can sometimes seem like blowhards to us introverts.

CONTINUE READING

Sorry for the lack of posts. I have been busy with co-teaching our Health Informatics course (HSMP410) for the OHSU/PSU School of Public Health. I’m trying to make most of my lectures activity-driven for my students, who are Community Health Education and Nursing majors. I believe that you can teach mathematical concepts visually, so I am experimenting with using LearnR/Shiny to teach the basics of data literacy. I’m also using datacamp-light to show my students a simple intro to data science.

CONTINUE READING

This term, I’m co-teaching an undergraduate course for the PSU/OHSU School of Public Health called Health Informatics with a number of my colleagues in my department, including Bill Hersh, Eilis Boudreau, Karen Eden, and Virginia Lankes. We’re trying to give students a feel for what informatics is about in an accessible way. I’m trying to make the lectures as understandable as I can. This week we tackled Genome-Wide Association Studies and discussed the strength of evidence behind SNP variants associated with phenotypes.

CONTINUE READING

Last week I had the good fortune to attend the From Evidence to Scholarship Conference at my alma mater, Reed College. The focus of the conference was improving the research process for undergraduates using digital scholarship. I came away from it excited about the work other people are doing in this realm and thinking about ways we could adapt these approaches. Nicole Vasilevsky and I (both former Reedies) each gave talks about our experience developing materials for Data Science and giving data science workshops to undergraduates.

CONTINUE READING

I gave a talk for the Portland State University Systems Science seminar called “How are Data Science and Systems Science Connected?” In this talk, I highlighted current blind spots in Data Science that I think Systems Science approaches can address, especially interactions between features. I talked a little about my dissertation research (surrogate oncogenes) and the problem of black-box interpretability of predictive models. If you’re interested in listening to the recording, the playback is available here: https://us.

CONTINUE READING