11  Working with WDL

In this section, we’ll learn about one of the recommended workflow runners, Cromwell, and the WDL files it runs.

11.1 Learning Objectives

  • Explain the basic architecture of a WDL file
  • Explain the role of a task in WDL
  • Utilize Cromwell to execute a WDL script on one file
  • Utilize Cromwell to batch execute a WDL script on multiple files

11.2 Architecture of a WDL file

The best way to read WDL files is to read them top down. We’ll focus on the basic sections of a WDL file before we see how they work together.

The code below is from the WILDs WDL Repo.

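workflow SRA_STAR2Pass {
  input {
    Array[String] sra_id_list
    RefGenome ref_genome
  }

  # Run both tasks once per SRA ID; iterations can run in parallel
  scatter ( id in sra_id_list ){
    call fastqdump {
      # call inputs are elided in this excerpt
    }

    call STARalignTwoPass {
      # call inputs are elided in this excerpt
    }
  } # End scatter

  # Outputs that will be retained when execution is complete
  output {
    # output declarations are elided in this excerpt
  }
}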

11.3 Anatomy of a Task

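task fastqdump {
  input {
    String sra_id
    Int ncpu = 12
  }

  command <<<
    set -eo pipefail
    # check if paired ended: one split spot yields 8 FASTQ lines (two reads)
    numLines=$(fastq-dump -X 1 -Z --split-spot "~{sra_id}" | wc -l)
    paired_end="false"
    if [ $numLines -eq 8 ]; then
      paired_end="true"
    fi
    # perform fastqdump
    # (--gzip is assumed here because the outputs below end in .fastq.gz)
    if [ $paired_end == 'true' ]; then
      echo true > paired_file
      parallel-fastq-dump \
        --sra-id ~{sra_id} \
        --threads ~{ncpu} \
        --outdir ./ \
        --split-files \
        --gzip
    else
      touch paired_file
      parallel-fastq-dump \
        --sra-id ~{sra_id} \
        --threads ~{ncpu} \
        --outdir ./ \
        --gzip
      # Assumed: give single-end runs the file names the output section expects
      mv ~{sra_id}.fastq.gz ~{sra_id}_1.fastq.gz
      touch ~{sra_id}_2.fastq.gz
    fi
  >>>

  output {
    File r1_end = "~{sra_id}_1.fastq.gz"
    File r2_end = "~{sra_id}_2.fastq.gz"
    String paired_end = read_string('paired_file')
  }

  runtime {
    memory: 2 * ncpu + " GB"
    docker: "getwilds/pfastqdump:0.6.7"
    cpu: ncpu
  }

  parameter_meta {
    # parameter descriptions are elided in this excerpt
  }
}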
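The `numLines` check in the command section relies on a FASTQ record being four lines long: when one spot is dumped with `--split-spot`, a paired-end run gives two records (8 lines) and a single-end run gives one (4 lines). The sketch below isolates that logic so it can be tried without the SRA Toolkit. The function name `detect_layout` and the synthetic records are assumptions made for illustration and stand in for real `fastq-dump` output.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Classify a one-spot FASTQ dump read from stdin by its line count.
# 8 lines means the spot was split into two reads (paired-end).
detect_layout() {
  local num_lines
  num_lines=$(wc -l | tr -d ' ')
  if [ "$num_lines" -eq 8 ]; then
    echo "paired"
  else
    echo "single"
  fi
}

# Synthetic records standing in for `fastq-dump -X 1 -Z --split-spot <id>`
single_spot=$'@r1\nACGT\n+\nIIII\n'
paired_spot=$'@r1\nACGT\n+\nIIII\n@r1\nTTGA\n+\nIIII\n'

printf '%s' "$single_spot" | detect_layout   # prints: single
printf '%s' "$paired_spot" | detect_layout   # prints: paired
```

Keeping the check separate from the download step makes it easy to confirm the branch logic before spending time on a full `parallel-fastq-dump` run.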

11.4 Resources

  • Developing WDL Workflows is a full guide from the Data Science Lab (DaSL) that shows you how to develop your own WDL workflows and covers WDL file architecture in much more detail.