Upload data and run a custom method

Setup

Please download the materials for this section: FireCloud Files

Create a workspace in FireCloud

  1. Workspaces > Create a new workspace
    1. name: hello_gatk_fc_YOUR_NAME
    2. billing project: YOUR_PROJECT

Add workspace attributes

  1. Workspaces > Summary > Workspace attributes > Import Attributes
    1. data_bundle > FireCloud > workspaceAttributes.tsv
  2. Once the upload completes, check the workspace attributes section to confirm the import was successful.
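For reference, the attributes TSV is a header row whose column names carry a workspace: prefix on the first column, followed by a single row of values. A minimal sketch, with placeholder attribute names (beyond refDict, which this workshop uses later) and placeholder gs:// paths rather than the real workshop values:

```tsv
workspace:refDict	refFasta	refIndex
gs://YOUR_BUCKET/ref.dict	gs://YOUR_BUCKET/ref.fasta	gs://YOUR_BUCKET/ref.fasta.fai
```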

Set up data model

  1. Workspaces > Data > Import Metadata > Import from file
    1. Upload in this order:
      1. data_bundle > FireCloud > participant.txt
      2. data_bundle > FireCloud > sample.txt
  2. Once the upload completes, check that the two tables in the Data tab are filled in to confirm the import was successful.
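The load files are tab-separated, with a header whose first column names the entity type; samples reference their participant by ID, which is why participant.txt must be uploaded first. A minimal sketch of each file, with placeholder IDs and paths (the real values are in the data bundle):

```tsv
entity:participant_id
PARTICIPANT_1
```

```tsv
entity:sample_id	participant	inputBam
SAMPLE_1	PARTICIPANT_1	gs://YOUR_BUCKET/sample1.bam
```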

Put WDL on FireCloud

  1. Method Repository > Create New Method
    1. namespace: YOUR_NAME
    2. name: hello_gatk_fc
    3. wdl: load from file
      1. This WDL calls HaplotypeCaller in GVCF mode, which takes a BAM as input and outputs a GVCF file of variant likelihoods.
      2. The FireCloud version specifies a docker image among other runtime settings: the memory and disk size of the machine we will request from Google’s cloud, as well as the number of times we will try to run on a preemptible machine.
      3. Notice that you can type in the WDL field to edit if needed.
    4. documentation: We won’t be filling this out today, but in general documentation here is highly recommended, as it is helpful for others who may want to run your method.
    5. Upload
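For orientation, a task with those runtime settings looks roughly like the sketch below (draft-2 WDL syntax). This is an illustrative assumption, not necessarily the workshop’s exact method: the docker tag, memory value, and reference-input names are placeholders, while inputBam, disk_size, java_opt, and output_gvcf match the inputs and output used later in this section.

```wdl
task HaplotypeCallerGvcf {
  File inputBam
  File refFasta     # assumed reference inputs; names may differ in the real WDL
  File refDict
  File refIndex
  Int disk_size
  String java_opt

  command {
    gatk --java-options "${java_opt}" HaplotypeCaller \
      -R ${refFasta} \
      -I ${inputBam} \
      -O output.g.vcf.gz \
      -ERC GVCF
  }
  runtime {
    docker: "broadinstitute/gatk:4.1.0.0"       # image tag is a placeholder
    memory: "3 GB"
    disks: "local-disk " + disk_size + " HDD"   # disk size requested from Google Cloud
    preemptible: 3                              # attempts on preemptible machines
  }
  output {
    File output_gvcf = "output.g.vcf.gz"
  }
}
```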

Import configuration to workspace

  1. Method Repository > your method > Export to Workspace
    1. Use Blank Configuration
      1. Name: hello_gatk_fc
      2. Root Entity Type: sample
      3. Destination Workspace: YOUR_PROJECT/hello_gatk_fc_YOUR_NAME
    2. Would you like to go to the edit page now? Yes

    3. Note: if you see a “Synchronize Access to Method” popup, select Grant Read Permission.

Fill in method config

  1. Workspace > Method Configurations > hello_gatk_fc
  2. Select the Edit Configuration button to fill it in. There are 3 types of inputs.
    1. In the data model
      1. You’ll find this value in your data tab. Since it is under the sample section, and your root entity type is sample, simply type this. and allow autocomplete to guide you.
      2. eg: inputBam = this.inputBam
    2. In the workspace attributes
      1. You’ll find this value in your workspace attributes section under the summary tab. To find it, type in workspace. and let autocomplete guide you.
      2. eg: refDict = workspace.refDict
    3. Hard-coded
      1. These are values which are not in your data model or workspace attributes. They are fixed numbers or strings that are typed in here. You can find the values for these inputs in the inputs json in your data bundle (data_bundle > hello_gatk > hello_gatk_fc.inputs.json)
      2. eg: disk_size = 10
      3. eg: java_opt = "-Xmx2G"
  3. Fill in the remaining inputs on your own, helping your neighbors as needed.
  4. Fill out the output. It won’t autocomplete, but we want to write it back to the data model, so the value should be this.output_gvcf.
  5. Save the configuration
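Once saved, the configuration maps each input to one of the three sources above, roughly like this sketch (the workflow and task names are placeholders for whatever your WDL declares; the values shown are the ones named in the steps above):

```text
# inputs
hello_gatk_fc.HaplotypeCallerGvcf.inputBam    this.inputBam       (data model)
hello_gatk_fc.HaplotypeCallerGvcf.refDict     workspace.refDict   (workspace attribute)
hello_gatk_fc.HaplotypeCallerGvcf.disk_size   10                  (hard-coded)
hello_gatk_fc.HaplotypeCallerGvcf.java_opt    "-Xmx2G"            (hard-coded)

# outputs
hello_gatk_fc.HaplotypeCallerGvcf.output_gvcf  this.output_gvcf   (written back to the data model)
```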

Run

  1. Refresh the page and check for the yellow refresh-credentials banner BEFORE running. This isn’t typically an issue in normal use, but because we start and stop a lot in a workshop, the idle time can cause the credentials to time out. If you run anyway, a Rawls error will be thrown that doesn’t appear until after the job has been submitted and queued, which can be frustrating.
  2. Method Config > Launch Analysis > Select sample > Launch
  3. Watch and refresh from the Monitor tab. Click the view link when it appears, and open the timing diagram to see what’s happening.