7. Custom Workflow
This example shows how to run the introductory T20S tutorial workflow from the CryoSPARC Guide as a fully automated script. It includes the following steps:
Import Movies
Motion Correction
CTF Estimation
Curate Exposures
Blob Picker
Template Picker
Inspect Picks
Extract Particles
2D Classification for Blob Picks
2D Classification for Template Picks
Select 2D Classes
Ab-Initio Reconstruction
Homogeneous Refinement
Use this example as a template for writing automated cryo-EM workflows that may be repeated with different datasets.
7.1. Import Movies
First initialize a connection to CryoSPARC, find the target project and workspace where the workflow will run, and choose the scheduler lane to which jobs will be queued.
from cryosparc.tools import CryoSPARC
cs = CryoSPARC(host="cryoem0.sbi", base_port=61000)
assert cs.test_connection()
project = cs.find_project("P251")
workspace = project.find_workspace("W10")
lane = "cryoem3"
Connection succeeded to CryoSPARC API at http://cryoem0.sbi:61002
Import the movies with an Import Movies job. Note that you may use the CryoSPARC.print_job_types method to inspect available job type keys to use with Workspace.create_job.
job_sections = cs.print_job_types()
import_movies_job = workspace.create_job(
"import_movies",
params={
"blob_paths": "/bulk5/data/EMPIAR/10025/data/empiar_10025_subset/*.tif",
"gainref_path": "/bulk5/data/EMPIAR/10025/data/empiar_10025_subset/norm-amibox05-0.mrc",
"psize_A": 0.6575,
"accel_kv": 300,
"cs_mm": 2.7,
"total_dose_e_per_A2": 53,
},
)
Category | Job | Title | Stability
====================================================================================================
import | import_movies | Import Movies | stable
| import_micrographs | Import Micrographs | stable
| import_particles | Import Particle Stack | stable
| import_volumes | Import 3D Volumes | stable
| import_templates | Import Templates | stable
| import_result_group | Import Result Group | stable
| import_beam_shift | Import Beam Shift | stable
motion_correction | patch_motion_correction_multi | Patch Motion Correction | stable
| rigid_motion_correction_multi | Full-frame Motion Correction | stable
| rigid_motion_correction | Full-frame Motion Correction | develop
| local_motion_correction | Local Motion Correction | stable
| local_motion_correction_multi | Local Motion Correction | stable
| motion_correction_motioncor2 | MotionCor2 | beta
| reference_motion_correction | Reference Based Motion Correction | beta
| local_applytraj | Apply Trajectories | develop
| patch_to_local | Patch Motion to Local Motion | develop
| recenter_trajectories | Recenter Trajectories | develop
ctf_estimation | patch_ctf_estimation_multi | Patch CTF Estimation | stable
| patch_ctf_extract | Patch CTF Extraction | stable
| ctf_estimation | CTF Estimation (CTFFIND4) | stable
exposure_curation | denoise_train | Micrograph Denoiser | beta
| curate_exposures_v2 | Manually Curate Exposures | stable
particle_picking | manual_picker_v2 | Manual Picker | stable
| blob_picker_gpu | Blob Picker | stable
| template_picker_gpu | Template Picker | stable
| filament_tracer_gpu | Filament Tracer | stable
| auto_blob_picker_gpu | Blob Picker Tuner | stable
| inspect_picks_v2 | Inspect Particle Picks | stable
| create_templates | Create Templates | stable
extraction | extract_micrographs_multi | Extract From Micrographs (GPU) | stable
| extract_micrographs_cpu_parallel | Extract From Micrographs (CPU) | stable
| downsample_particles | Downsample Particles | stable
| restack_particles | Restack Particles | stable
deep_picker | topaz_train | Topaz Train | stable
| topaz_cross_validation | Topaz Cross Validation (BETA) | beta
| topaz_extract | Topaz Extract | stable
| topaz_denoise | Topaz Denoise | stable
particle_curation | class_2D_new | 2D Classification (NEW) | stable
| select_2D | Select 2D Classes | stable
| reference_select_2D | Reference Based Auto Select 2D | beta
| reconstruct_2D | Reconstruct 2D Classes | stable
| rebalance_classes_2D | Rebalance 2D Classes | stable
| class_probability_filter | Class Probability Filter | stable
| rebalance_3D | Rebalance Orientations | stable
reconstruction | homo_abinit | Ab-Initio Reconstruction | stable
refinement | homo_refine_new | Homogeneous Refinement | stable
| hetero_refine | Heterogeneous Refinement | stable
| nonuniform_refine_new | Non-uniform Refinement | stable
| homo_reconstruct | Homogeneous Reconstruction Only | stable
| hetero_reconstruct_new | Heterogenous Reconstruction Only | stable
ctf_refinement | ctf_refine_global | Global CTF Refinement | stable
| ctf_refine_local | Local CTF Refinement | stable
| exposure_groups | Exposure Group Utilities | stable
variability | var_3D | 3D Variability | stable
| var_3D_disp | 3D Variability Display | stable
| class_3D | 3D Classification | stable
| regroup_3D_new | Regroup 3D Classes | stable
| reference_select_3D | Reference Based Auto Select 3D | beta
| reorder_3D | Reorder 3D Classes | beta
flexibility | flex_prep | 3D Flex Data Prep | beta
| flex_meshprep | 3D Flex Mesh Prep | beta
| flex_train | 3D Flex Training | beta
| flex_highres | 3D Flex Reconstruction | beta
| flex_generate | 3D Flex Generator | beta
postprocessing | sharpen | Sharpening Tools | stable
| deepemhancer | DeepEMhancer | stable
| validation | Validation (FSC) | stable
| local_resolution | Local Resolution Estimation | stable
| local_filter | Local Filtering | stable
| reslog | ResLog Analysis | stable
local_refinement | new_local_refine | Local Refinement | stable
| particle_subtract | Particle Subtraction | stable
helix | helix_refine | Helical Refinement | stable
| helix_search | Symmetry Search Utility | stable
| helix_initmodel | Helical Initial Model Utility | develop
| helix_symmetrize | Apply Helical Symmetry | develop
| helix_average_power_spectra | Average Power Spectra | stable
utilities | exposure_sets | Exposure Sets Tool | stable
| exposure_tools | Exposure Tools | stable
| generate_thumbs | Generate Micrograph Thumbnails | stable
| cache_particles | Cache Particles on SSD | stable
| check_corrupt_particles | Check For Corrupt Particles | stable
| check_corrupt_micrographs | Check For Corrupt Micrographs | stable
| particle_sets | Particle Sets Tool | stable
| reassign_particles_mics | Reassign Particles to Micrographs | stable
| remove_duplicate_particles | Remove Duplicate Particles | stable
| sym_expand | Symmetry Expansion | stable
| volume_tools | Volume Tools | stable
| volume_alignment_tools | Volume Alignment Tools | stable
| align_3D_new | Align 3D Maps | stable
| split_volumes_group | Split Volumes Group | stable
| orientation_diagnostics | Orientation Diagnostics | stable
simulations | simulator_gpu | Simulate Data | stable
instance_testing | instance_launch_test | Test Job Launch | stable
| worker_ssd_test | Test Worker SSD | stable
| worker_gpu_test | Test Worker GPUs | stable
| worker_benchmark | Benchmark | stable
workflows | extensive_workflow_bench | Extensive Validation | stable
You may inspect any job’s internal document to view available parameter keys, their standard titles, type and default values:
import_movies_job.print_param_spec()
Param | Title | Type | Default
=========================================================================================================
blob_paths | Movies data path | string | None
gainref_path | Gain reference path | string | None
defect_path | Defect file path | string | None
gainref_flip_x | Flip gain ref & defect file in X? | boolean | False
gainref_flip_y | Flip gain ref & defect file in Y? | boolean | False
gainref_rotate_num | Rotate gain ref? | integer | 0
psize_A | Pixel size (A) | number | None
accel_kv | Accelerating Voltage (kV) | number | None
cs_mm | Spherical Aberration (mm) | number | None
total_dose_e_per_A2 | Total exposure dose (e/A^2) | number | None
negative_stain_data | Negative Stain Data | boolean | False
phase_plate_data | Phase Plate Data | boolean | False
override_exp_group_id | Override Exposure Group ID | integer | None
skip_header_check | Skip Header Check | boolean | True
output_constant_ctf | Output Constant CTF | boolean | False
eer_num_fractions | EER Number of Fractions | integer | 40
eer_upsamp_factor | EER Upsampling Factor | number | 2
parse_xml_files | Import Beam Shift Values from XML Files | boolean | False
xml_paths | EPU XML metadata path | string | None
mov_cut_prefix_xml | Length of input filename prefix to cut for XML correspondence | integer | None
mov_cut_suffix_xml | Length of input filename suffix to cut for XML correspondence | integer | None
xml_cut_prefix_xml | Length of XML filename prefix to cut for input correspondence | integer | None
xml_cut_suffix_xml | Length of XML filename suffix to cut for input correspondence | integer | 4
compute_num_cpus | Number of CPUs to parallelize during header check | integer | 4
Set further parameter values with Job.set_param while the job is in ‘building’ status.
import_movies_job.set_param("skip_header_check", False)
True
Queue and run the job. Wait until it completes.
import_movies_job.queue(lane)
import_movies_job.wait_for_done()
'completed'
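To repeat the import step with other datasets, the acquisition settings passed to Workspace.create_job above can be gathered in one place. The following is a minimal sketch rather than part of cryosparc-tools: the MovieDataset class and its field names are hypothetical, and only the keys of the returned dictionary come from the parameter spec printed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MovieDataset:
    """Hypothetical container for the acquisition settings of one dataset."""

    movies_glob: str
    pixel_size_A: float
    voltage_kv: float
    cs_mm: float
    total_dose_e_per_A2: float
    gain_ref: Optional[str] = None

    def import_params(self) -> dict:
        """Build the params dictionary for an "import_movies" job."""
        params = {
            "blob_paths": self.movies_glob,
            "psize_A": self.pixel_size_A,
            "accel_kv": self.voltage_kv,
            "cs_mm": self.cs_mm,
            "total_dose_e_per_A2": self.total_dose_e_per_A2,
        }
        # Only pass a gain reference when the dataset has one.
        if self.gain_ref is not None:
            params["gainref_path"] = self.gain_ref
        return params


# The T20S subset used in this example, expressed as a MovieDataset
t20s = MovieDataset(
    movies_glob="/bulk5/data/EMPIAR/10025/data/empiar_10025_subset/*.tif",
    gain_ref="/bulk5/data/EMPIAR/10025/data/empiar_10025_subset/norm-amibox05-0.mrc",
    pixel_size_A=0.6575,
    voltage_kv=300,
    cs_mm=2.7,
    total_dose_e_per_A2=53,
)
params = t20s.import_params()
```

With a live connection, the result could be passed directly as workspace.create_job("import_movies", params=t20s.import_params()).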
7.2. Motion Correction and CTF Estimation
Repeat with Patch Motion Correction and Patch CTF Estimation jobs. Use the connections parameter to connect the jobs to the Import Movies job and to each other.
Both jobs may be queued at the same time. The CryoSPARC scheduler starts Patch CTF Estimation only after Patch Motion Correction, whose output it depends on, has completed.
motion_correction_job = workspace.create_job(
"patch_motion_correction_multi",
connections={"movies": (import_movies_job.uid, "imported_movies")},
params={"compute_num_gpus": 2},
)
ctf_estimation_job = workspace.create_job(
"patch_ctf_estimation_multi",
connections={"exposures": (motion_correction_job.uid, "micrographs")},
params={"compute_num_gpus": 2},
)
motion_correction_job.queue(lane)
ctf_estimation_job.queue(lane)
motion_correction_job.wait_for_done(), ctf_estimation_job.wait_for_done()
('completed', 'completed')
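In an unattended script it may be useful to stop as soon as a job ends in any status other than 'completed'. The sketch below assumes only what the output above shows, namely that wait_for_done returns the final status as a string, and that each job has a uid attribute. The wait_all helper and the Waitable protocol are hypothetical names, not part of cryosparc-tools.

```python
from typing import Iterable, List, Protocol


class Waitable(Protocol):
    """Anything with a uid and a wait_for_done() method, such as a cryosparc-tools Job."""

    uid: str

    def wait_for_done(self) -> str: ...


def wait_all(jobs: Iterable[Waitable]) -> List[str]:
    """Wait for each job in order and fail fast on any status other than 'completed'."""
    statuses = []
    for job in jobs:
        status = job.wait_for_done()
        if status != "completed":
            raise RuntimeError(f"Job {job.uid} ended with status {status!r}")
        statuses.append(status)
    return statuses
```

For the two jobs above this would be called as wait_all([motion_correction_job, ctf_estimation_job]).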
7.3. Curate Exposures
Use half the micrographs to pick particles with the Blob Picker. The resulting picks will be used to generate templates for more precise template-based picking on the full dataset. Selecting the subset of micrographs requires running an interactive Curate Exposures job.
Note
Interactive jobs are special jobs that allow visual adjustment of data curation parameters from the CryoSPARC web interface. The following interactive jobs are used in this workflow:
Curate Exposures
Inspect Picks
Select 2D Classes
When queued, interactive jobs soon enter status “waiting” (unlike regular jobs which get status “running”). This means they are ready for interaction from the CryoSPARC interface.
After the job enters “waiting” status, either interact with the job from the
CryoSPARC interface or use the Job.interact method to
programmatically invoke the same interactive actions.
Example interactive invocation for a Curate Exposures job:
data = job.interact("get_fields_and_thresholds")
This returns a curation data structure which may be mutated in Python and written back with the following:
job.interact("set_thresholds", data)
An interactive job has a shutdown function that may be invoked when interaction is complete. Example shutdown invocations for different interactive job types:
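| Job Type | Shutdown Function |
|---|---|
| Manual Picker | |
| Curate Exposures | job.interact("shutdown_interactive") |
| Inspect Picks | job.interact("shutdown_interactive") |
| Select 2D Classes | job.interact("finish") |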
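The shutdown actions invoked later in this example can be kept in a small lookup table keyed by job type, so that generic workflow code does not need to hard-code them. This is a sketch under stated assumptions: SHUTDOWN_ACTIONS and shutdown_action are hypothetical names, and the table covers only the three interactive job types this workflow actually uses.

```python
# Interactive action that ends each interactive job type used in this workflow,
# as passed to Job.interact in the sections that follow.
SHUTDOWN_ACTIONS = {
    "curate_exposures_v2": "shutdown_interactive",
    "inspect_picks_v2": "shutdown_interactive",
    "select_2D": "finish",
}


def shutdown_action(job_type: str) -> str:
    """Return the interact() action name that completes the given interactive job type."""
    try:
        return SHUTDOWN_ACTIONS[job_type]
    except KeyError:
        raise ValueError(f"No known shutdown action for job type {job_type!r}") from None
```

A Select 2D Classes job could then be completed with job.interact(shutdown_action("select_2D")).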
Build and queue a Curate Exposures job and wait for “waiting” status.
curate_exposures_job = workspace.create_job(
"curate_exposures_v2",
connections={"exposures": (ctf_estimation_job.uid, "exposures")},
)
curate_exposures_job.queue()
curate_exposures_job.wait_for_status("waiting")
'waiting'
Either curate exposures from the CryoSPARC interface or use the Job.interact method to perform interactive job actions, as follows:
from cryosparc.util import first
data = curate_exposures_job.interact("get_fields_and_thresholds")
idx_field = first(field for field in data["fields"] if field["name"] == "idx")
assert idx_field
idx_field["thresholds"] = [5, 14]
idx_field["active"] = True
curate_exposures_job.interact("set_thresholds", data)
curate_exposures_job.interact("shutdown_interactive")
curate_exposures_job.wait_for_done()
'completed'
Detailed explanation of the previous code block:
1. Call get_fields_and_thresholds to get a dictionary with a fields key. The value is a list of adjustable curation fields and thresholds. Each item has this format:

   {
       'name': str,
       'title': str,
       'short': str,
       'active': bool,
       'range': [number, number],
       'thresholds': [number, number],
   }

2. For each field to threshold (just the Index field in this case):
   - Modify the thresholds list to [MIN, MAX], where MIN is a number greater than or equal to the first item in range and MAX is a number less than or equal to the second item in range
   - Set active to True to enable the threshold
3. Call set_thresholds with the modified dictionary.
4. Call shutdown_interactive to finish curating and wait until the job is Completed.
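The threshold rules described above can be wrapped in a small function that refuses values outside a field's range. This is a minimal sketch, assuming the dictionary layout described in the explanation; set_field_threshold is a hypothetical helper, and the sample data is a made-up stand-in for the value returned by get_fields_and_thresholds.

```python
from typing import Any, Dict


def set_field_threshold(data: Dict[str, Any], name: str, low: float, high: float) -> Dict[str, Any]:
    """Activate one curation field and set its thresholds, checked against the field's range."""
    matches = [f for f in data["fields"] if f["name"] == name]
    if not matches:
        raise KeyError(f"No curation field named {name!r}")
    field = matches[0]
    range_min, range_max = field["range"]
    if not (range_min <= low <= high <= range_max):
        raise ValueError(f"Thresholds [{low}, {high}] fall outside range {field['range']}")
    field["thresholds"] = [low, high]
    field["active"] = True
    return field


# Made-up stand-in for the dictionary returned by get_fields_and_thresholds
data = {
    "fields": [
        {
            "name": "idx",
            "title": "Index",
            "short": "Idx",
            "active": False,
            "range": [0, 19],
            "thresholds": [0, 19],
        }
    ]
}
set_field_threshold(data, "idx", 5, 14)
```

The mutated dictionary would then be written back with curate_exposures_job.interact("set_thresholds", data), exactly as in the code block above.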
7.4. Blob Picker
The completed curation job will have 10 accepted and 10 rejected exposures. Provide the accepted ones as input to the Blob Picker.
blob_picker_job = workspace.create_job(
"blob_picker_gpu",
connections={"micrographs": (curate_exposures_job.uid, "exposures_accepted")},
params={"diameter": 100, "diameter_max": 200},
)
blob_picker_job.queue(lane)
blob_picker_job.wait_for_done()
'completed'
7.5. Inspect Picks
Create an Inspect Picks job and interact with it similarly to Curate Exposures.
inspect_blob_picks_job = workspace.create_job(
"inspect_picks_v2",
connections={
"micrographs": (blob_picker_job.uid, "micrographs"),
"particles": (blob_picker_job.uid, "particles"),
},
)
inspect_blob_picks_job.queue()
inspect_blob_picks_job.wait_for_status("waiting")
inspect_blob_picks_job.interact(
"set_thresholds",
{"ncc_score_thresh": 0.3, "lpower_thresh_min": 600, "lpower_thresh_max": 1000},
)
inspect_blob_picks_job.interact("shutdown_interactive")
inspect_blob_picks_job.wait_for_done()
'completed'
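Because the same set_thresholds payload is needed again for the template picks later in this workflow, it may help to build it with a function that also catches an inverted power range. This is a sketch only: inspect_thresholds is a hypothetical helper, and the three dictionary keys are the ones used in the call above.

```python
def inspect_thresholds(ncc_score: float, power_min: float, power_max: float) -> dict:
    """Build the payload for the Inspect Picks "set_thresholds" action."""
    if power_min >= power_max:
        raise ValueError("power_min must be smaller than power_max")
    return {
        "ncc_score_thresh": ncc_score,
        "lpower_thresh_min": power_min,
        "lpower_thresh_max": power_max,
    }


# Same values as used for the blob picks above
blob_thresholds = inspect_thresholds(0.3, 600, 1000)
```

The result could be passed as inspect_blob_picks_job.interact("set_thresholds", blob_thresholds).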
7.6. 2D Classification
Extract the selected particles and classify them with a 2D Classification job.
extract_blob_picks_job = workspace.create_job(
"extract_micrographs_cpu_parallel",
connections={
"micrographs": (inspect_blob_picks_job.uid, "micrographs"),
"particles": (inspect_blob_picks_job.uid, "particles"),
},
params={"box_size_pix": 448},
)
classify_blob_picks_job = workspace.create_job(
"class_2D_new",
connections={"particles": (extract_blob_picks_job.uid, "particles")},
params={"class2D_K": 10},
)
extract_blob_picks_job.queue(lane)
classify_blob_picks_job.queue(lane)
extract_blob_picks_job.wait_for_done(), classify_blob_picks_job.wait_for_done()
('completed', 'completed')
7.7. Select 2D Classes
Create a Select 2D Classes job and either select templates from the CryoSPARC interface or interact with the job as follows:
select_blob_templates_job = workspace.create_job(
"select_2D",
connections={
"particles": (classify_blob_picks_job.uid, "particles"),
"templates": (classify_blob_picks_job.uid, "class_averages"),
},
)
select_blob_templates_job.queue()
select_blob_templates_job.wait_for_status("waiting")
# Auto-interact: select classes with sufficient resolution and particle count
class_info = select_blob_templates_job.interact("get_class_info")
for c in class_info:
    if 1.0 < c["res_A"] < 19.0 and c["num_particles_total"] > 900:
        select_blob_templates_job.interact(
            "set_class_selected",
            {
                "class_idx": c["class_idx"],
                "selected": True,
            },
        )
select_blob_templates_job.interact("finish")
select_blob_templates_job.wait_for_done()
'completed'
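The selection rule above can be separated from the interactive calls so that it can be checked on its own and reused with different cutoffs. The following is a minimal sketch: choose_classes is a hypothetical helper, the sample records are made up, and the only fields assumed are the three read from get_class_info in the code above (class_idx, res_A and num_particles_total).

```python
from typing import Dict, Iterable, List


def choose_classes(
    class_info: Iterable[Dict],
    max_res_A: float,
    min_particles: int,
    min_res_A: float = 1.0,
) -> List[int]:
    """Return indices of classes within the resolution window and above the particle count."""
    return [
        c["class_idx"]
        for c in class_info
        if min_res_A < c["res_A"] < max_res_A and c["num_particles_total"] > min_particles
    ]


# Made-up records in the shape read from get_class_info
example = [
    {"class_idx": 0, "res_A": 8.5, "num_particles_total": 1500},
    {"class_idx": 1, "res_A": 25.0, "num_particles_total": 2000},
    {"class_idx": 2, "res_A": 9.1, "num_particles_total": 300},
]
selected = choose_classes(example, max_res_A=19.0, min_particles=900)
```

Here only class 0 passes both criteria. Each returned index would then be sent to the job with the set_class_selected action shown above.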
7.8. Template Picker
Create and run a Template Picker job with all micrographs.
template_picker_job = workspace.create_job(
"template_picker_gpu",
connections={
"micrographs": (ctf_estimation_job.uid, "exposures"),
"templates": (select_blob_templates_job.uid, "templates_selected"),
},
params={"diameter": 200},
)
template_picker_job.queue(lane)
template_picker_job.wait_for_done()
'completed'
Repeat all previous steps from Inspect Picks to Select 2D, using the template picks as input. Note that when queuing a series of connected jobs, only interactive jobs and the last job in the chain need to be waited on.
For example, given the following job chain:
Inspect Picks -> Extract -> 2D Classification -> Select 2D Classes
1. Queue all the jobs
2. Wait for Inspect Picks to be interactive
3. Invoke shutdown_interactive when finished interacting
4. Wait for Select 2D Classes to be interactive (occurs after Extraction and 2D Classification complete)
5. Shut down when finished interacting
6. Wait for Select 2D to be done
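The queue-then-wait pattern for a job chain can be sketched as one generic function. It assumes only the Job methods already used in this example (queue with or without a lane, wait_for_status and wait_for_done). The run_chain function, the ChainJob protocol and the step tuple layout are hypothetical, not part of cryosparc-tools.

```python
from typing import Callable, Optional, Protocol, Sequence, Tuple


class ChainJob(Protocol):
    """The subset of the cryosparc-tools Job interface that this sketch relies on."""

    def queue(self, lane: Optional[str] = None) -> None: ...

    def wait_for_status(self, status: str) -> str: ...

    def wait_for_done(self) -> str: ...


# One step: (job, lane or None, interaction callback or None for regular jobs)
Step = Tuple[ChainJob, Optional[str], Optional[Callable[[ChainJob], None]]]


def run_chain(steps: Sequence[Step]) -> str:
    """Queue a chain of connected jobs, drive the interactive ones, and wait for the last."""
    # Queue everything up front; the scheduler runs jobs as their inputs become ready.
    for job, lane, _ in steps:
        if lane is None:
            job.queue()
        else:
            job.queue(lane)
    # Interactive jobs must be waited on individually, in chain order.
    for job, _, interact in steps:
        if interact is not None:
            job.wait_for_status("waiting")
            interact(job)  # the callback is expected to end with the job's shutdown action
    # Only the final job needs an explicit wait for completion.
    return steps[-1][0].wait_for_done()
```

Each interaction callback would contain the Job.interact calls for that job, ending with its shutdown action, so that the next job in the chain can start.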
# Create and connect jobs
inspect_template_picks_job = workspace.create_job(
"inspect_picks_v2",
connections={
"micrographs": (template_picker_job.uid, "micrographs"),
"particles": (template_picker_job.uid, "particles"),
},
)
extract_template_picks_job = workspace.create_job(
"extract_micrographs_cpu_parallel",
connections={
"micrographs": (inspect_template_picks_job.uid, "micrographs"),
"particles": (inspect_template_picks_job.uid, "particles"),
},
params={"box_size_pix": 448},
)
classify_template_picks_job = workspace.create_job(
"class_2D_new",
connections={"particles": (extract_template_picks_job.uid, "particles")},
params={"class2D_K": 50},
)
select_templates_job = workspace.create_job(
"select_2D",
connections={
"particles": (classify_template_picks_job.uid, "particles"),
"templates": (classify_template_picks_job.uid, "class_averages"),
},
)
# Queue Jobs
inspect_template_picks_job.queue()
extract_template_picks_job.queue(lane)
classify_template_picks_job.queue(lane)
select_templates_job.queue()
# Inspect template picks
inspect_template_picks_job.wait_for_status("waiting")
inspect_template_picks_job.interact(
"set_thresholds",
{"ncc_score_thresh": 0.3, "lpower_thresh_min": 900.0, "lpower_thresh_max": 1800.0},
)
inspect_template_picks_job.interact("shutdown_interactive")
# Select 2D Classes
select_templates_job.wait_for_status("waiting")
class_info = select_templates_job.interact("get_class_info")
for c in class_info:
    if 1.0 < c["res_A"] < 19.0 and c["num_particles_total"] > 100:
        select_templates_job.interact(
            "set_class_selected",
            {
                "class_idx": c["class_idx"],
                "selected": True,
            },
        )
select_templates_job.interact("finish")
select_templates_job.wait_for_done()
'completed'
7.9. Reconstruction and Refinement
Finally, queue and run Ab-Initio Reconstruction and Homogeneous Refinement jobs.
abinit_job = workspace.create_job(
"homo_abinit",
connections={"particles": (select_templates_job.uid, "particles_selected")},
)
refine_job = workspace.create_job(
"homo_refine_new",
connections={
"particles": (abinit_job.uid, "particles_all_classes"),
"volume": (abinit_job.uid, "volume_class_0"),
},
params={
"refine_symmetry": "D7",
"refine_defocus_refine": True,
"refine_ctf_global_refine": True,
},
)
abinit_job.queue(lane)
refine_job.queue(lane)
abinit_job.wait_for_done(), refine_job.wait_for_done()
('completed', 'completed')