Upload data and run a custom method

Setup

Please download the materials for this section: FireCloud Files

Create a workspace in FireCloud

  1. Workspaces > Create a new workspace
    1. name: hello_gatk_fc_YOUR_NAME
    2. billing project: YOUR_PROJECT

Add workspace attributes

  1. Workspaces > Summary > Workspace attributes > Import Attributes
    1. data_bundle > FireCloud > workspaceAttributes.tsv
  2. Once the upload completes, check the workspace attributes section to confirm the import was successful.
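For reference, the attributes TSV is a header row whose column names carry a workspace: prefix on the first column, followed by a single row of values. A minimal sketch, with placeholder attribute names (beyond refDict, which this workshop uses later) and placeholder gs:// paths rather than the real workshop values:

```tsv
workspace:refDict	refFasta	refIndex
gs://YOUR_BUCKET/ref.dict	gs://YOUR_BUCKET/ref.fasta	gs://YOUR_BUCKET/ref.fasta.fai
```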

Set up data model

  1. Workspaces > Data > Import Metadata > Import from file
    1. Upload in this order:
      1. data_bundle > FireCloud > participant.txt
      2. data_bundle > FireCloud > sample.txt
  2. Once the upload completes, check that the two tables in the Data tab are filled in to confirm the import was successful.
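The load files are tab-separated, with a header whose first column names the entity type; samples reference their participant by ID, which is why participant.txt must be uploaded first. A minimal sketch of each file, with placeholder IDs and paths (the real values are in the data bundle):

```tsv
entity:participant_id
PARTICIPANT_1
```

```tsv
entity:sample_id	participant	inputBam
SAMPLE_1	PARTICIPANT_1	gs://YOUR_BUCKET/sample1.bam
```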

Put WDL on FireCloud

  1. Method Repository > Create New Method
    1. namespace: YOUR_NAME
    2. name: hello_gatk_fc
    3. wdl: load from file
      1. This WDL calls HaplotypeCaller in GVCF mode, which takes a BAM as input and outputs a GVCF file of variant likelihoods.
      2. The FireCloud version specifies a docker image among other runtime settings: the memory and disk size of the machine we will request from Google’s cloud, as well as the number of times we will try to run on a preemptible machine.
      3. Notice that you can type in the WDL field to edit if needed.
    4. documentation: We won’t be filling this out today, but in general documentation here is highly recommended, as it is helpful for others who may want to run your method.
    5. Upload
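For orientation, a task with those runtime settings looks roughly like the sketch below (draft-2 WDL syntax). This is an illustrative assumption, not necessarily the workshop’s exact method: the docker tag, memory value, and reference-input names are placeholders, while inputBam, disk_size, java_opt, and output_gvcf match the inputs and output used later in this section.

```wdl
task HaplotypeCallerGvcf {
  File inputBam
  File refFasta     # assumed reference inputs; names may differ in the real WDL
  File refDict
  File refIndex
  Int disk_size
  String java_opt

  command {
    gatk --java-options "${java_opt}" HaplotypeCaller \
      -R ${refFasta} \
      -I ${inputBam} \
      -O output.g.vcf.gz \
      -ERC GVCF
  }
  runtime {
    docker: "broadinstitute/gatk:4.1.0.0"       # image tag is a placeholder
    memory: "3 GB"
    disks: "local-disk " + disk_size + " HDD"   # disk size requested from Google Cloud
    preemptible: 3                              # attempts on preemptible machines
  }
  output {
    File output_gvcf = "output.g.vcf.gz"
  }
}
```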

Import configuration to workspace

  1. Method Repository > your method > Export to Workspace
    1. Use Blank Configuration
      1. Name: hello_gatk_fc
      2. Root Entity Type: sample
      3. Destination Workspace: YOUR_PROJECT/hello_gatk_fc_YOUR_NAME
    2. Would you like to go to the edit page now? Yes

    3. Note: if you see a “Synchronize Access to Method” popup, select Grant Read Permission.

Fill in method config

  1. Workspace > Method Configurations > hello_gatk_fc
  2. Select the Edit Configuration button to fill it in. There are 3 types of inputs.
    1. In the data model
      1. You’ll find this value in your data tab. Since it is under the sample section, and your root entity type is sample, simply type this. and allow autocomplete to guide you.
      2. eg: inputBam = this.inputBam
    2. In the workspace attributes
      1. You’ll find this value in your workspace attributes section under the summary tab. To find it, type in workspace. and let autocomplete guide you.
      2. eg: refDict = workspace.refDict
    3. Hard-coded
      1. These are values which are not in your data model or workspace attributes. They are fixed numbers or strings that are typed in here. You can find the values for these inputs in the inputs json in your data bundle (data_bundle > hello_gatk > hello_gatk_fc.inputs.json)
      2. eg: disk_size = 10
      3. eg: java_opt = "-Xmx2G"
  3. Fill in the remaining inputs on your own, helping your neighbors as needed.
  4. Fill out the output. It won’t autocomplete, but we want to write it back to the data model, so the value should be this.output_gvcf.
  5. Save the configuration
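Once saved, the configuration maps each input to one of the three sources above, roughly like this sketch (the workflow and task names are placeholders for whatever your WDL declares; the values shown are the ones named in the steps above):

```text
# inputs
hello_gatk_fc.HaplotypeCallerGvcf.inputBam    this.inputBam       (data model)
hello_gatk_fc.HaplotypeCallerGvcf.refDict     workspace.refDict   (workspace attribute)
hello_gatk_fc.HaplotypeCallerGvcf.disk_size   10                  (hard-coded)
hello_gatk_fc.HaplotypeCallerGvcf.java_opt    "-Xmx2G"            (hard-coded)

# outputs
hello_gatk_fc.HaplotypeCallerGvcf.output_gvcf  this.output_gvcf   (written back to the data model)
```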

Run

  1. Refresh the page and check for the yellow refresh-credentials banner BEFORE running. This isn’t typically an issue in normal use, but because we start and stop a lot in a workshop, the idle time can cause the credentials to time out. If you run anyway, a Rawls error will be thrown that doesn’t appear until after the job has been submitted and queued, which can be frustrating.
  2. Method Config > Launch Analysis > Select sample > Launch
  3. Watch and refresh from the Monitor tab. Click the view link when it appears, and open the timing diagram to see what’s happening.