tladeras’s Twitter Archive—№ 6,450

I think bioinformaticians need a really strong shell scripting background. We underemphasize its importance (in terms of time spent learning) compared to R/Python. This matters especially when you run cloud jobs and need to glue code together.
On twitter.com ♻️ 55 Retweets ❤️ 429 Favorites 2022 Mar 2 Mood +6 🙂

…in reply to @tladeras
Running on a remote worker is an indirect process: you need to bring your software and dependencies to the worker, and I find shell scripting essential for that. That, and learning how to reproducibly manage software dependencies, are real gaps.
Permalink On twitter.com ❤️ 16 Favorites 2022 Mar 2 Mood +2 🙂

…in reply to @tladeras
Yes, the technologies shift, but we should probably spend a couple of sessions just on reproducibly managing software dependencies. This would include container-based tech: both using existing containers and extending them.
Permalink On twitter.com ❤️ 18 Favorites 2022 Mar 2 Mood +1 🙂

…in reply to @tladeras
I often joke that bioinformatics is about transforming one file format to another. There's usually a lot of glue involved in doing that reproducibly, and I think we underestimate its importance in training bioinformaticians.
Permalink On twitter.com ❤️ 35 Favorites 2022 Mar 2 Mood +3 🙂

…in reply to @tladeras
I think part of this is the curse of knowledge: for a lot of us, our knowledge of Linux/shell scripting was self-taught and hard-learned. We underestimate how much time (and trial and error) it takes to learn these skills.
Permalink On twitter.com ❤️ 30 Favorites 2022 Mar 2 Mood -4 🙁

…in reply to @tladeras
For solid foundations, I highly recommend both the Carpentries materials (swcarpentry.github.io/shell-novice/) and The Missing Semester of Your CS Education: missing.csail.mit.edu/
Permalink On twitter.com ♻️ 9 Retweets ❤️ 90 Favorites 2022 Mar 2 Mood +2 🙂

…in reply to @tladeras
But we also need applications courses for bioinformaticians that go well beyond this material. I don't know what that looks like, but I'd be willing to work on it.
Permalink On twitter.com ❤️ 20 Favorites 2022 Mar 2 Mood +2 🙂

…in reply to @tladeras
Just a quick note: when I’m discussing the importance of shell scripting, I’m mostly talking about secondary analysis: the running and batching of aligners, variant callers, etc.
Permalink On twitter.com ❤️ 10 Favorites 2022 Mar 2 Mood +2 🙂

…in reply to @tladeras
Running GATK on one file on your local computer is one thing, but automating a pipeline to run on 200K files on cloud/HPC is another thing entirely.
Permalink On twitter.com ❤️ 20 Favorites 2022 Mar 2 Mood 0
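
To make the gap between "one file" and "many files" concrete, here is a minimal sketch of the kind of glue involved. It is not anyone's actual pipeline: the names `build_command` and `run_batch`, the `{input}`/`{output}` placeholders, the `.out` suffix, and the `my_tool` command are all hypothetical, and Python stands in for what is often written as a shell loop. A real run on cloud/HPC would hand these commands to a scheduler rather than a local thread pool.

```python
import shlex
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_command(template, input_path, out_dir):
    """Fill a command template for one input file and return an argv list.

    The template is a list of strings; "{input}" and "{output}" are
    placeholders invented for this sketch.
    """
    input_path = Path(input_path)
    output_path = Path(out_dir) / (input_path.stem + ".out")
    return [
        part.format(input=str(input_path), output=str(output_path))
        for part in template
    ]


def run_batch(template, inputs, out_dir, max_workers=4, dry_run=False):
    """Run one command per input file, a few at a time.

    Returns a list of (input_path, returncode) pairs in input order.
    With dry_run=True the commands are only printed and returncode is None.
    """
    commands = [(str(p), build_command(template, p, out_dir)) for p in inputs]

    if dry_run:
        for _, argv in commands:
            print(shlex.join(argv))
        return [(path, None) for path, _ in commands]

    Path(out_dir).mkdir(parents=True, exist_ok=True)

    def run_one(item):
        path, argv = item
        # Capture output so that one noisy job does not flood the terminal.
        result = subprocess.run(argv, capture_output=True, text=True)
        return path, result.returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, commands))


if __name__ == "__main__":
    # "my_tool" and its flags are placeholders, not a real command line.
    template = ["my_tool", "--in", "{input}", "--out", "{output}"]
    samples = ["data/sample1.bam", "data/sample2.bam"]
    run_batch(template, samples, "results", dry_run=True)
```

The dry-run mode is a deliberate choice: printing every command before anything executes is a cheap way to catch path and quoting mistakes, which is where much of the trial and error in this kind of work tends to go. Collecting return codes per input also makes it possible to rerun only the failures instead of the whole batch.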