Best Practices on the DNAnexus Platform

Author

Ted Laderas

Published

January 23, 2023

Preface

This is a companion book to Bash for Bioinformatics.

In this book we attempt to highlight best practices on the DNAnexus platform given some previous knowledge.

Learning Objectives

After reading this book and doing the exercises, you should be able to:

  • Explain the object based filesystem on the DNAnexus platform and how it differs from Linux/Unix POSIX filesystems.
  • Explain key details of the UKB Research Analysis Platform and how these details impact your work.
  • Utilize Projects and Organizations to effectively manage data for your group.
  • Use Spark to connect, extract, and manipulate Apollo Datasets on the platform.
  • Utilize Hail to load, query, model, annotate, and visualize large-scale genomics data such as the Exome Data on the UK Biobank Research Analysis Platform.
  • Build native DNAnexus apps effectively that manage inputs, outputs, and work with batch mode.
  • Utilize existing and build native DNAnexus workflows using Workflow Description Language (WDL).
  • Execute NextFlow pipelines on the DNAnexus Platform.

Our goal is to bring information together in a task-oriented format to achieve things on the platform.

Four Levels of Using DNAnexus

One way to approach learning DNAnexus is to think about the skills you need to process a number of files. Ben Busby has noted there are 4 main skill levels in processing files on the DNAnexus platform:

Level # of Files Skill
1 1 Interactive Analysis (Cloud Workstation, JupyterLab)
2 1-50 Files dx run, Swiss Army Knife
3 50-1000 Files Building your own apps
4 1000+ Files, multiple steps Using WDL (Workflow Description Language)

We’ll be covering mostly level 3 and 4 in this book. But you will need to be at level 2 before you can tackle these topics.

The key is to gradually build on your skills.

Prerequisites

Before you tackle this book, you should be able to accomplish the following:

We recommend reviewing Bash for Bioinformatics if you need to brush up on these prerequisite skills.

Want to be a Contributor?

This is the first draft of this book. It’s not going to be perfect, and we need help.

If you find a problem/issue or have a question, you can file it as an issue using this link.

In your issue, please note the following:

  • Your Name
  • What your issue was
  • Which section, and line you found problematic or wouldn’t run

If you have large edits: If you’re quarto/GitHub savvy, you can fork and file a pull request for typos/edits. If you’re not, you can file an issue.

Just be aware that this is not my primary job - I’ll try to be as responsive as I can.

As-Is Software Disclaimer

This content in this book is delivered “As-Is”. Notwithstanding anything to the contrary, DNAnexus will have no warranty, support, liability or other obligations with respect to Materials provided hereunder.

Licensing

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.