cryosparc.job#
Defines the Job and ExternalJob classes for accessing CryoSPARC jobs.
Classes:
ExternalJob
Mutable custom output job with customizable input slots and output results.
Job
Accessor class to a job in CryoSPARC with ability to load inputs and outputs, add to job log, download job files.
Data:
GROUP_NAME_PATTERN
Input and output result groups may only contain letters, numbers and underscores.
- class cryosparc.job.ExternalJob(cs: CryoSPARC, project_uid: str, uid: str)#
Mutable custom output job with customizable input slots and output results. Use external jobs to save cryo-EM data generated by a software package outside of CryoSPARC.
Created external jobs may be connected to any other CryoSPARC job result as an input. Their outputs must be created manually and may be configured to pass through inherited input fields, just as with regular CryoSPARC jobs.
Create a new External Job with Project.create_external_job. ExternalJob is a subclass of Job and inherits all its methods.
- uid#
Job unique ID, e.g., “J42”
- Type:
str
- project_uid#
Project unique ID, e.g., “P3”
- Type:
str
- doc#
All job data from the CryoSPARC database. Database contents may change over time, use the refresh method to update.
- Type:
Examples
Import multiple exposure groups into a single job
>>> from cryosparc.tools import CryoSPARC
>>> cs = CryoSPARC()
>>> project = cs.find_project("P3")
>>> job = project.create_external_job("W3", title="Import Image Sets")
>>> for i in range(3):
...     dset = job.add_output(
...         type="exposure",
...         name=f"images_{i}",
...         slots=["movie_blob", "mscope_params", "gain_ref_blob"],
...         alloc=10  # allocate a dataset for this output with 10 rows
...     )
...     dset['movie_blob/path'] = ...  # populate dataset
...     job.save_output(f"images_{i}", dset)
Methods:
add_input(type[, name, min, max, slots, title])
Add an input slot to the current job.
add_output(type[, name, slots, passthrough, title, alloc])
Add an output slot to the current job.
alloc_output(name[, alloc])
Allocate an empty dataset for the given output with the given name.
connect(target_input, source_job_uid, source_output, *[, slots, title, desc, refresh])
Connect the given input for this job to an output with given job UID and name.
kill()
Kill this job.
queue([lane, hostname, gpus, cluster_vars])
Queue a job to a target lane.
run()
Start a job within a context manager and stop the job when the context ends.
save_output(name, dataset, *[, refresh])
Save output dataset to external job.
start([status])
Set job status to "running" or "waiting".
stop([error])
Set job status to "completed" or "failed".
- add_input(type: Literal['exposure', 'particle', 'template', 'volume', 'volume_multi', 'mask', 'live', 'ml_model', 'symmetry_candidate', 'flex_mesh', 'flex_model', 'hyperparameter', 'denoise_model', 'annotation_model'], name: str | None = None, min: int = 0, max: int | Literal['inf'] = 'inf', slots: Iterable[str | Datafield] = [], title: str | None = None)#
Add an input slot to the current job. May be connected to zero or more outputs from other jobs (depending on the min and max values).
- Parameters:
type (Datatype) – cryo-EM data type for this input, e.g., "particle"
name (str, optional) – Input name key, e.g., "picked_particles". Defaults to None.
min (int, optional) – Minimum number of required input connections. Defaults to 0.
max (int | Literal["inf"], optional) – Maximum number of input connections. Specify "inf" for unlimited connections. Defaults to "inf".
slots (list[SlotSpec], optional) – List of slots that should be connected to this input, such as "location" or "blob". Defaults to [].
title (str, optional) – Human-readable title for this input. Defaults to None.
- Raises:
CommandError – General CryoSPARC network access error such as timeout, URL or HTTP
InvalidSlotsError – slots argument is invalid
- Returns:
name of created input
- Return type:
str
Examples
Create an external job that accepts micrographs as input:
>>> cs = CryoSPARC()
>>> project = cs.find_project("P3")
>>> job = project.create_external_job("W1", title="Custom Picker")
>>> job.uid
"J3"
>>> job.add_input(
...     type="exposure",
...     name="input_micrographs",
...     min=1,
...     slots=["micrograph_blob", "ctf"],
...     title="Input micrographs for picking"
... )
"input_micrographs"
- add_output(type: Literal['exposure', 'particle', 'template', 'volume', 'volume_multi', 'mask', 'live', 'ml_model', 'symmetry_candidate', 'flex_mesh', 'flex_model', 'hyperparameter', 'denoise_model', 'annotation_model'], name: str | None = None, slots: List[str | Datafield] = [], passthrough: str | None = None, title: str | None = None, *, alloc: Literal[None] = None) str #
- add_output(type: Literal['exposure', 'particle', 'template', 'volume', 'volume_multi', 'mask', 'live', 'ml_model', 'symmetry_candidate', 'flex_mesh', 'flex_model', 'hyperparameter', 'denoise_model', 'annotation_model'], name: str | None = None, slots: List[str | Datafield] = [], passthrough: str | None = None, title: str | None = None, *, alloc: int | Dataset = None) Dataset
Add an output slot to the current job. Optionally returns the corresponding empty dataset if alloc is specified.
- Parameters:
type (Datatype) – cryo-EM datatype for this output, e.g., "particle"
name (str, optional) – Output name key, e.g., "selected_particles". Same as type if not specified. Defaults to None.
slots (list[SlotSpec], optional) – List of slots expected to be created for this output, such as location or blob. Do not specify any slots that were passed through from an input unless those slots are modified in the output. Defaults to [].
passthrough (str, optional) – Indicates that this output inherits slots from an existing input with the specified name. The input must first be added with add_input(). Defaults to None.
title (str, optional) – Human-readable title for this output. Defaults to None.
alloc (int | Dataset, optional) – If specified, pre-allocate and return a dataset with the requested slots. Specify an integer to allocate a specific number of rows. Specify a Dataset from which to inherit unique row IDs (useful when adding passthrough outputs). Defaults to None.
- Raises:
CommandError – General CryoSPARC network access error such as timeout, URL or HTTP
InvalidSlotsError – slots argument is invalid
- Returns:
Name of the created output. If alloc is specified as an integer, instead returns a blank dataset with the given size and random UIDs. If alloc is specified as a Dataset, returns a blank dataset with the same UIDs.
- Return type:
str | Dataset
Examples
Create and allocate an output for new particle picks
>>> cs = CryoSPARC()
>>> project = cs.find_project("P3")
>>> job = project.find_external_job("J3")
>>> particles_dset = job.add_output(
...     type="particle",
...     name="picked_particles",
...     slots=["location", "pick_stats"],
...     alloc=10000
... )
Create an inherited output for input micrographs
>>> job.add_output(
...     type="exposure",
...     name="picked_micrographs",
...     passthrough="input_micrographs",
...     title="Passthrough picked micrographs"
... )
"picked_micrographs"
Create an output with multiple slots of the same type
>>> job.add_output(
...     type="particle",
...     name="particle_alignments",
...     slots=[
...         {"dtype": "alignments3D", "prefix": "alignments_class_0", "required": True},
...         {"dtype": "alignments3D", "prefix": "alignments_class_1", "required": True},
...         {"dtype": "alignments3D", "prefix": "alignments_class_2", "required": True},
...     ]
... )
"particle_alignments"
- alloc_output(name: str, alloc: int | ArrayLike | Dataset = 0) Dataset #
Allocate an empty dataset for the given output with the given name. Initialize with the given number of empty rows. The result may be used with save_output with the same output name.
- Parameters:
name (str) – Name of job output to allocate
alloc (int | ArrayLike | Dataset, optional) – Specify as one of the following: (A) an integer to allocate a specific number of rows, (B) a numpy array of numbers to use for UIDs in the allocated dataset or (C) a dataset from which to inherit unique row IDs (useful for allocating passthrough outputs). Defaults to 0.
- Returns:
Empty dataset with the given number of rows
- Return type:
Dataset
Examples
Allocate a dataset of size 10,000 for an output for new particle picks
>>> cs = CryoSPARC()
>>> project = cs.find_project("P3")
>>> job = project.find_external_job("J3")
>>> job.alloc_output("picked_particles", 10000)
Dataset([  # 10000 items, 11 fields
    ("uid": [...]),
    ("location/micrograph_path", ["", ...]),
    ...
])
Allocate a dataset from an existing input passthrough dataset
>>> input_micrographs = job.load_input("input_micrographs")
>>> job.alloc_output("picked_micrographs", input_micrographs)
Dataset([  # same "uid" field as input_micrographs
    ("uid": [...]),
])
- connect(target_input: str, source_job_uid: str, source_output: str, *, slots: List[str | Datafield] = [], title: str = '', desc: str = '', refresh: bool = True) bool #
Connect the given input for this job to an output with given job UID and name. If this input does not exist, it will be added with the given slots. At least one slot must be specified if the input does not exist.
- Parameters:
target_input (str) – Input name to connect into. Will be created if does not already exist.
source_job_uid (str) – Job UID to connect from, e.g., “J42”
source_output (str) – Job output name to connect from, e.g., "particles"
slots (list[SlotSpec], optional) – List of slots to add to created input. All if not specified. Defaults to [].
title (str, optional) – Human readable title for created input. Defaults to “”.
desc (str, optional) – Human readable description for created input. Defaults to “”.
refresh (bool, optional) – Auto-refresh job document after connecting. Defaults to True.
- Raises:
CommandError – General CryoSPARC network access error such as timeout, URL or HTTP
InvalidSlotsError – slots argument is invalid
Examples
Connect J3 to CTF-corrected micrographs from J2's micrographs output.
>>> cs = CryoSPARC()
>>> project = cs.find_project("P3")
>>> job = project.find_external_job("J3")
>>> job.connect("input_micrographs", "J2", "micrographs")
- kill()#
Kill this job.
- queue(lane: str | None = None, hostname: str | None = None, gpus: List[int] = [], cluster_vars: Dict[str, Any] = {})#
Queue a job to a target lane. Available lanes may be queried with CryoSPARC.get_lanes.
Optionally specify a hostname for a node or cluster in the given lane. Optionally specify specific GPUs indexes to use for computation.
Available hostnames for a given lane may be queried with CryoSPARC.get_targets.
- Parameters:
lane (str, optional) – Configured compute lane to queue to. Leave unspecified to run directly on the master or current workstation. Defaults to None.
hostname (str, optional) – Specific hostname in compute lane, if more than one is available. Defaults to None.
gpus (list[int], optional) – GPUs to queue to. If specified, must have as many GPUs as required in job parameters. Leave unspecified to use first available GPU(s). Defaults to [].
cluster_vars (dict[str, Any], optional) – Specify custom cluster variables when queuing to a cluster. Keys are variable names. Defaults to {}.
Examples
Queue a job to lane named “worker”:
>>> cs = CryoSPARC()
>>> job = cs.find_job("P3", "J42")
>>> job.status
"building"
>>> job.queue("worker")
>>> job.status
"queued"
- run()#
Start a job within a context manager and stop the job when the context ends.
- Yields:
ExternalJob – self.
Examples
Job will be marked as “failed” if the contents of the block throw an exception
>>> with job.run():
...     job.save_output(...)
- save_output(name: str, dataset: Dataset, *, refresh: bool = True)#
Save output dataset to external job.
- Parameters:
name (str) – Name of output on this job.
dataset (Dataset) – Value of output with only required fields.
refresh (bool, optional) – Auto-refresh job document after saving. Defaults to True.
Examples
Save a previously-allocated output.
>>> cs = CryoSPARC()
>>> project = cs.find_project("P3")
>>> job = project.find_external_job("J3")
>>> particles = job.alloc_output("picked_particles", 10000)
>>> job.save_output("picked_particles", particles)
- start(status: Literal['running', 'waiting'] = 'waiting')#
Set job status to “running” or “waiting”
- Parameters:
status (str, optional) – “running” or “waiting”. Defaults to “waiting”.
- stop(error=False)#
Set job status to “completed” or “failed”
- Parameters:
error (bool, optional) – Job completed with errors. Defaults to False.
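Together, start and stop bracket the lifetime of a manually managed external job (run is the context-manager equivalent). A sketch, assuming a reachable CryoSPARC instance; the UIDs and the output name "picked_particles" are hypothetical:

```python
# Sketch of the manual start/stop lifecycle for an external job.
from cryosparc.tools import CryoSPARC

cs = CryoSPARC()
job = cs.find_project("P3").find_external_job("J3")
job.start("running")              # mark the job as running
try:
    particles = job.alloc_output("picked_particles", 100)
    # ... populate the dataset fields here ...
    job.save_output("picked_particles", particles)
    job.stop()                    # sets status to "completed"
except Exception:
    job.stop(error=True)          # sets status to "failed"
    raise
```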
- cryosparc.job.GROUP_NAME_PATTERN = '^[A-Za-z][0-9A-Za-z_]*$'#
Input and output result groups may only contain letters, numbers and underscores.
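The pattern can be checked locally before creating inputs or outputs; a minimal, self-contained sketch:

```python
import re

# Same pattern as cryosparc.job.GROUP_NAME_PATTERN: a letter followed
# by any mix of letters, numbers and underscores.
GROUP_NAME_PATTERN = r"^[A-Za-z][0-9A-Za-z_]*$"

def is_valid_group_name(name: str) -> bool:
    """Return True if name is usable as an input/output result group name."""
    return re.match(GROUP_NAME_PATTERN, name) is not None

print(is_valid_group_name("picked_particles"))  # True
print(is_valid_group_name("2d_classes"))        # False: starts with a digit
print(is_valid_group_name("picked-particles"))  # False: hyphen not allowed
```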
- class cryosparc.job.Job(cs: CryoSPARC, project_uid: str, uid: str)#
Accessor class to a job in CryoSPARC with ability to load inputs and outputs, add to job log, download job files. Should be instantiated through CryoSPARC.find_job or Project.find_job.
- uid#
Job unique ID, e.g., “J42”
- Type:
str
- project_uid#
Project unique ID, e.g., “P3”
- Type:
str
- doc#
All job data from the CryoSPARC database. Database contents may change over time, use the refresh method to update.
- Type:
Examples
Find an existing job.
>>> cs = CryoSPARC()
>>> job = cs.find_job("P3", "J42")
>>> job.status
"building"
Queue a job.
>>> job.queue("worker_lane")
>>> job.status
"queued"
Create a 3-class ab-initio job connected to existing particles.
>>> job = cs.create_job("P3", "W1", "homo_abinit",
...     connections={"particles": ("J20", "particles_selected")},
...     params={"abinit_K": 3}
... )
>>> job.queue()
>>> job.status
"queued"
Methods:
clear()
Clear this job and reset to building status.
connect(target_input, source_job_uid, source_output, *[, refresh])
Connect the given input for this job to an output with given job UID and name.
cp(source_path[, target_path])
Copy a file or folder into the job directory.
dir()
Get the path to the job directory.
disconnect(target_input[, connection_idx, refresh])
Clear the given job input group.
download(path)
Initiate a download request for a file inside the job's directory.
download_asset(fileid, target)
Download a job asset from the database with the given ID.
download_dataset(path)
Download a .cs dataset file from the given path in the job directory.
download_file(path[, target])
Download file from job directory to the given target path or writeable file handle.
download_mrc(path)
Download a .mrc file from the given relative path in the job directory.
interact(action[, body, timeout, refresh])
Call an interactive action on a waiting interactive job.
kill()
Kill this job.
list_assets()
Get a list of files available in the database for this job.
list_files([prefix, recursive])
Get a list of files inside the job directory.
load_input(name[, slots])
Load the dataset connected to the job's input with the given name.
load_output(name[, slots, version])
Load the dataset for the job's output with the given name.
log(text[, level])
Append to a job's event log.
log_checkpoint([meta])
Append a checkpoint to the job's event log.
log_plot(figure, text[, formats, raw_data, ...])
Add a log line with the given figure.
mkdir(target_path[, parents, exist_ok])
Create a folder in the given job.
print_input_spec()
Print a table of input keys, their title, type, connection requirements and details about their low-level required slots.
print_output_spec()
Print a table of output keys, their title, type and details about their low-level results.
print_param_spec()
Print a table of parameter keys, their title, type and default to standard output.
queue([lane, hostname, gpus, cluster_vars])
Queue a job to a target lane.
refresh()
Reload this job from the CryoSPARC database.
set_param(name, value, *[, refresh])
Set the given param name on the current job to the given value.
subprocess(args[, mute, checkpoint, ...])
Launch a subprocess and write its text-based output and error to the job log.
symlink(source_path[, target_path])
Create a symbolic link in job's directory.
upload(target_path, source, *[, overwrite])
Upload the given file to the job directory at the given path.
upload_asset(file[, filename, format])
Upload an image or text file to the current job.
upload_dataset(target_path, dset, *[, ...])
Upload a dataset as a CS file into the job directory.
upload_mrc(target_path, data, psize, *[, ...])
Upload a numpy 2D or 3D array to the job directory as an MRC file.
upload_plot(figure[, name, formats, ...])
Upload the given figure.
wait_for_done(*[, error_on_incomplete, timeout])
Wait until a job reaches status "completed", "killed" or "failed".
wait_for_status(status, *[, timeout])
Wait for a job's status to reach the specified value.
Attributes:
status
Scheduling status.
type
Job type key.
- clear()#
Clear this job and reset to building status.
- connect(target_input: str, source_job_uid: str, source_output: str, *, refresh: bool = True) bool #
Connect the given input for this job to an output with given job UID and name.
- Parameters:
target_input (str) – Input name to connect into. Will be created if not specified.
source_job_uid (str) – Job UID to connect from, e.g., “J42”
source_output (str) – Job output name to connect from, e.g., "particles"
refresh (bool, optional) – Auto-refresh job document after connecting. Defaults to True.
- Returns:
False if the job encountered a build error.
- Return type:
bool
Examples
Connect J3 to CTF-corrected micrographs from J2's micrographs output.
>>> cs = CryoSPARC()
>>> project = cs.find_project("P3")
>>> job = project.find_job("J3")
>>> job.connect("input_micrographs", "J2", "micrographs")
- cp(source_path: str | PurePosixPath, target_path: str | PurePosixPath = '')#
Copy a file or folder into the job directory.
- Parameters:
source_path (str | Path) – Relative or absolute path of source file or folder to copy. If relative, assumed to be within the job directory.
target_path (str | Path, optional) – Name or path in the job directory to copy into. If not specified, uses the same file name as the source. Defaults to “”.
- dir() PurePosixPath #
Get the path to the job directory.
- Returns:
job directory Pure Path instance
- Return type:
Path
- disconnect(target_input: str, connection_idx: int | None = None, *, refresh: bool = True)#
Clear the given job input group.
- Parameters:
target_input (str) – Name of input to disconnect
connection_idx (int, optional) – Connection index to clear. Set to 0 to clear the first connection, 1 for the second, etc. If unspecified, clears all connections. Defaults to None.
refresh (bool, optional) – Auto-refresh job document after disconnecting. Defaults to True.
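A sketch of clearing a single connection versus the whole input group; the job UID and input name are hypothetical:

```python
# Assumes a job fetched as in the surrounding examples.
job = cs.find_job("P3", "J42")

# Clear only the second connection on the "particles" input...
job.disconnect("particles", connection_idx=1)

# ...or clear every connection on that input at once.
job.disconnect("particles")
```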
- download(path: str | PurePosixPath)#
Initiate a download request for a file inside the job’s directory. Use to get files from a remote CryoSPARC instance where the job directory is not available on the client file system.
- Parameters:
path (str | Path) – Name or path of file in job directory.
- Yields:
HTTPResponse –
- Use a context manager to read the file from the
request body.
Examples
Download a job’s metadata
>>> cs = CryoSPARC()
>>> job = cs.find_job("P3", "J42")
>>> with job.download("job.json") as res:
...     job_data = json.loads(res.read())
- download_asset(fileid: str, target: str | PurePath | IO[bytes])#
Download a job asset from the database with the given ID. Note that the file does not necessarily have to belong to the current job.
- Parameters:
fileid (str) – GridFS file object ID
target (str | Path | IO) – Local file path, directory path or writeable file handle to write response data.
- Returns:
resulting target path or file handle.
- Return type:
Path | IO
- download_dataset(path: str | PurePosixPath)#
Download a .cs dataset file from the given path in the job directory.
- Parameters:
path (str | Path) – Name or path of .cs file in job directory.
- Returns:
Loaded dataset instance
- Return type:
Dataset
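A sketch of pulling a dataset file from a remote job directory and inspecting it locally; the file name is hypothetical:

```python
# Download a .cs file from the job directory without needing filesystem
# access to the CryoSPARC instance.
particles = job.download_dataset("extracted_particles.cs")
print(len(particles))      # number of rows
print(particles.fields())  # available fields, e.g. "blob/path"
```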
- download_file(path: str | PurePosixPath, target: str | PurePath | IO[bytes] = '')#
Download file from job directory to the given target path or writeable file handle.
- Parameters:
path (str | Path) – Name or path of file in job directory.
target (str | Path | IO) – Local file path, directory path or writeable file handle to write response data. If not specified, downloads to current working directory with same file name. Defaults to “”.
- Returns:
resulting target path or file handle.
- Return type:
Path | IO
- download_mrc(path: str | PurePosixPath)#
Download a .mrc file from the given relative path in the job directory.
- Parameters:
path (str | Path) – Name or path of .mrc file in job directory.
- Returns:
MRC file header and data as a numpy array
- Return type:
tuple[Header, NDArray]
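A sketch of fetching a volume as a numpy array; the file name is hypothetical:

```python
# Returns the MRC header and the voxel data as a numpy array.
header, vol = job.download_mrc("J42_map_sharp.mrc")
print(vol.shape)  # e.g. a cubic volume such as (256, 256, 256)
print(vol.dtype)
```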
- interact(action: str, body: Any = {}, *, timeout: int = 10, refresh: bool = False) Any #
Call an interactive action on a waiting interactive job. The possible actions and expected body depends on the job type.
- Parameters:
action (str) – Interactive endpoint to call.
body (any) – Body parameters for the interactive endpoint. Must be JSON-encodable.
timeout (int, optional) – Maximum time to wait for the action to complete, in seconds. Defaults to 10.
refresh (bool, optional) – If True, refresh the job document after posting. Defaults to False.
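A sketch of posting to an interactive job. The action name and body below are entirely hypothetical; the valid actions and their expected body schema depend on the specific interactive job type:

```python
# Hypothetical interactive call; consult the target job type's
# documentation for real action names and body fields.
result = job.interact(
    "get_state_update",   # hypothetical endpoint name
    {"threshold": 0.5},   # body must be JSON-encodable
    timeout=30,
)
```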
- kill()#
Kill this job.
- list_assets() List[AssetDetails] #
Get a list of files available in the database for this job. Returns a list with details about the assets. Each entry is a dict with an _id key which may be used to download the file with the download_asset method.
- Returns:
Asset details
- Return type:
list[AssetDetails]
- list_files(prefix: str | PurePosixPath = '', recursive: bool = False) List[str] #
Get a list of files inside the job directory.
- Parameters:
prefix (str | Path, optional) – Subdirectory inside job to list. Defaults to “”.
recursive (bool, optional) – If True, lists files recursively. Defaults to False.
- Returns:
List of file paths relative to the job directory.
- Return type:
list[str]
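A sketch of listing job files; the subdirectory name is an assumption:

```python
# Top-level files, then everything under a (hypothetical) subdirectory.
print(job.list_files())
print(job.list_files("gridfs_data", recursive=True))
```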
- load_input(name: str, slots: Iterable[str] = [])#
Load the dataset connected to the job’s input with the given name.
- Parameters:
name (str) – Input to load
slots (list[str], optional) – List of specific slots to load, such as movie_blob or location, or all slots if not specified. Defaults to [].
- Raises:
TypeError – If the job doesn't have the given input or the dataset cannot be loaded.
- Returns:
Loaded dataset
- Return type:
Dataset
- load_output(name: str, slots: Iterable[str] = [], version: int | Literal['F'] = 'F')#
Load the dataset for the job’s output with the given name.
- Parameters:
name (str) – Output to load
slots (list[str], optional) – List of specific slots to load, such as movie_blob or location, or all slots if not specified (including passthrough). Defaults to [].
version (int | Literal["F"], optional) – Specific output version to load. Use this to load the output at different stages of processing. Leave unspecified to load the final version. Defaults to "F".
- Raises:
TypeError – If job does not have any results for the given output
- Returns:
Loaded dataset
- Return type:
Dataset
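A sketch combining load_input and load_output; loading only the slots you need keeps transfers small. The job, input and output names are hypothetical, and the slot names follow the examples elsewhere on this page:

```python
# Load selected slots from an output and an input of the same job.
picks = job.load_output("picked_particles", slots=["location"])
micrographs = job.load_input("input_micrographs", slots=["micrograph_blob"])

for path in micrographs["micrograph_blob/path"]:
    ...  # process each micrograph file
```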
- log(text: str, level: Literal['text', 'warning', 'error'] = 'text')#
Append to a job’s event log.
- Parameters:
text (str) – Text to log
level (str, optional) – Log level (“text”, “warning” or “error”). Defaults to “text”.
- Returns:
Created log event ID
- Return type:
str
- log_checkpoint(meta: dict = {})#
Append a checkpoint to the job’s event log.
- Parameters:
meta (dict, optional) – Additional meta information. Defaults to {}.
- Returns:
Created checkpoint event ID
- Return type:
str
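A sketch of writing progress to the event log, with a checkpoint separating processing stages; the messages are made up:

```python
# Plain text, a checkpoint marker, a warning, and the returned event ID.
job.log("Beginning particle extraction")
job.log_checkpoint()
job.log("Low particle count on micrograph 17", level="warning")
event_id = job.log("Extraction complete")  # returns the created event ID
```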
- log_plot(figure: str | PurePath | IO[bytes] | Any, text: str, formats: Iterable[Literal['pdf', 'gif', 'jpg', 'jpeg', 'png', 'svg']] = ['png', 'pdf'], raw_data: str | bytes | None = None, raw_data_file: str | PurePath | IO[bytes] | None = None, raw_data_format: Literal['txt', 'csv', 'html', 'json', 'xml', 'bild', 'bld'] | None = None, flags: List[str] = ['plots'], savefig_kw: dict = {'bbox_inches': 'tight', 'pad_inches': 0})#
Add a log line with the given figure. figure must be one of the following:
- Path to an existing image file in PNG, JPEG, GIF, SVG or PDF format
- A file handle-like object with the binary data of an image
- A matplotlib plot
If a matplotlib figure is specified, uploads the plots in png and pdf formats. Override the formats argument with formats=['<format1>', '<format2>', ...] to save in different image formats.
If a text version of the given plot is available (e.g., in csv format), specify raw_data with the full contents or raw_data_file with a path or binary file handle pointing to the contents. Assumes file format from extension or raw_data_format. Defaults to "txt" if the format cannot be determined.
- Parameters:
figure (str | Path | IO | Figure) – Image file path, file handle or matplotlib figure instance
text (str) – Associated description for given figure
formats (list[ImageFormat], optional) – Image formats to save plot into. If figure is a file handle, specify formats=['<format>'], where <format> is a valid image extension such as png or pdf. Assumes png if not specified. Defaults to ["png", "pdf"].
raw_data (str | bytes, optional) – Raw text data for associated plot, generally in CSV, XML or JSON format. Cannot be specified with raw_data_file. Defaults to None.
raw_data_file (str | Path | IO, optional) – Path to raw text data. Cannot be specified with raw_data. Defaults to None.
raw_data_format (TextFormat, optional) – Format for raw text data. Defaults to None.
flags (list[str], optional) – Flags to use for UI rendering. Generally should not be specified. Defaults to ["plots"].
savefig_kw (dict, optional) – If a matplotlib figure is specified, optionally specify keyword arguments for the savefig method. Defaults to dict(bbox_inches="tight", pad_inches=0).
- Returns:
Created log event ID
- Return type:
str
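A sketch of logging a matplotlib figure together with a CSV of the underlying values; assumes matplotlib is installed and the data shown is made up:

```python
import matplotlib.pyplot as plt

# Build a simple figure, then log it with raw CSV data attached.
fig, ax = plt.subplots()
ax.plot([1, 2, 3], [0.9, 0.7, 0.4])
ax.set_xlabel("Iteration")
ax.set_ylabel("Error")

job.log_plot(
    fig,
    "Error per iteration",
    raw_data="iteration,error\n1,0.9\n2,0.7\n3,0.4",
    raw_data_format="csv",
)
```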
- mkdir(target_path: str | PurePosixPath, parents: bool = False, exist_ok: bool = False)#
Create a folder in the given job.
- Parameters:
target_path (str | Path) – Name or path of folder to create inside the job directory.
parents (bool, optional) – If True, any missing parents are created as needed. Defaults to False.
exist_ok (bool, optional) – If True, does not raise an error for existing directories. Still raises if the target path is not a directory. Defaults to False.
- print_input_spec()#
Print a table of input keys, their title, type, connection requirements and details about their low-level required slots.
The “Required?” heading also shows the number of outputs that must be connected to the input for this job to run.
Examples
>>> cs = CryoSPARC()
>>> job = cs.find_job("P3", "J42")
>>> job.doc['type']
'extract_micrographs_multi'
>>> job.print_input_spec()
Input       | Title       | Type     | Required? | Input Slots     | Slot Types      | Slot Required?
=====================================================================================================
micrographs | Micrographs | exposure | ✓ (1+)    | micrograph_blob | micrograph_blob | ✓
            |             |          |           | mscope_params   | mscope_params   | ✓
            |             |          |           | background_blob | stat_blob       | ✕
            |             |          |           | ctf             | ctf             | ✕
particles   | Particles   | particle | ✕ (0+)    | location        | location        | ✓
            |             |          |           | alignments2D    | alignments2D    | ✕
            |             |          |           | alignments3D    | alignments3D    | ✕
- print_output_spec()#
Print a table of output keys, their title, type and details about their low-level results.
Examples
>>> cs = CryoSPARC()
>>> job = cs.find_job("P3", "J42")
>>> job.doc['type']
'extract_micrographs_multi'
>>> job.print_output_spec()
Output      | Title       | Type     | Result Slots           | Result Types
==========================================================================================
micrographs | Micrographs | exposure | micrograph_blob        | micrograph_blob
            |             |          | micrograph_blob_non_dw | micrograph_blob
            |             |          | background_blob        | stat_blob
            |             |          | ctf                    | ctf
            |             |          | ctf_stats              | ctf_stats
            |             |          | mscope_params          | mscope_params
particles   | Particles   | particle | blob                   | blob
            |             |          | ctf                    | ctf
- print_param_spec()#
Print a table of parameter keys, their title, type and default to standard output:
Examples
>>> cs = CryoSPARC()
>>> job = cs.find_job("P3", "J42")
>>> job.doc['type']
'extract_micrographs_multi'
>>> job.print_param_spec()
Param            | Title                 | Type   | Default
=======================================================================
box_size_pix     | Extraction box size   | number | 256
bin_size_pix     | Fourier crop box size | number | None
compute_num_gpus | Number of GPUs        | number | 1
...
- queue(lane: str | None = None, hostname: str | None = None, gpus: List[int] = [], cluster_vars: Dict[str, Any] = {})#
Queue a job to a target lane. Available lanes may be queried with CryoSPARC.get_lanes.
Optionally specify a hostname for a node or cluster in the given lane. Optionally specify specific GPUs indexes to use for computation.
Available hostnames for a given lane may be queried with CryoSPARC.get_targets.
- Parameters:
lane (str, optional) – Configured compute lane to queue to. Leave unspecified to run directly on the master or current workstation. Defaults to None.
hostname (str, optional) – Specific hostname in compute lane, if more than one is available. Defaults to None.
gpus (list[int], optional) – GPUs to queue to. If specified, must have as many GPUs as required in job parameters. Leave unspecified to use first available GPU(s). Defaults to [].
cluster_vars (dict[str, Any], optional) – Specify custom cluster variables when queuing to a cluster. Keys are variable names. Defaults to {}.
Examples
Queue a job to lane named “worker”:
>>> cs = CryoSPARC()
>>> job = cs.find_job("P3", "J42")
>>> job.status
"building"
>>> job.queue("worker")
>>> job.status
"queued"
- set_param(name: str, value: Any, *, refresh: bool = True) bool #
Set the given param name on the current job to the given value. Only works if the job is in “building” status.
- Parameters:
name (str) – Param name, as defined in the job document's params_base.
value (any) – Target parameter value.
refresh (bool, optional) – Auto-refresh job document after setting. Defaults to True.
- Returns:
False if the job encountered a build error.
- Return type:
bool
Examples
Set the number of GPUs used by a supported job
>>> cs = CryoSPARC()
>>> job = cs.find_job("P3", "J42")
>>> job.set_param("compute_num_gpus", 4)
True
- property status: Literal['building', 'queued', 'launched', 'started', 'running', 'waiting', 'completed', 'killed', 'failed']#
scheduling status.
- Type:
JobStatus
- subprocess(args: str | list, mute: bool = False, checkpoint: bool = False, checkpoint_line_pattern: str | Pattern[str] | None = None, **kwargs)#
Launch a subprocess and write its text-based output and error to the job log.
- Parameters:
args (str | list) – Process arguments to run
mute (bool, optional) – If True, does not also forward process output to standard output. Defaults to False.
checkpoint (bool, optional) – If True, creates a checkpoint in the job event log just before process output begins. Defaults to False.
checkpoint_line_pattern (str | Pattern[str], optional) – Regular expression to match checkpoint lines for processes with a lot of output. If a process outputs a line that matches this pattern, a checkpoint is created in the event log before this line is forwarded. Defaults to None.
**kwargs – Additional keyword arguments for
subprocess.Popen
.
- Raises:
TypeError – For invalid arguments
RuntimeError – If process exits with non-zero status code
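A sketch of forwarding an external tool's output into the job log; the second command is hypothetical, and checkpoint_line_pattern marks iteration boundaries in verbose output:

```python
# Simple command, with a checkpoint logged before output begins.
job.subprocess(["echo", "hello from an external tool"], checkpoint=True)

# Hypothetical long-running tool; a checkpoint is created each time a
# line matching the pattern appears in its output.
job.subprocess(
    ["my_tool", "--input", "data.mrc"],   # hypothetical command
    mute=True,
    checkpoint_line_pattern=r"^Iteration \d+",
)
```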
- symlink(source_path: str | PurePosixPath, target_path: str | PurePosixPath = '')#
Create a symbolic link in job’s directory.
- Parameters:
source_path (str | Path) – Relative or absolute path of source file or folder to create a link to. If relative, assumed to be within the job directory.
target_path (str | Path) – Name or path of new symlink in the job directory. If not specified, creates link with the same file name as the source. Defaults to “”.
- property type: str#
Job type key
- upload(target_path: str | PurePosixPath, source: str | bytes | PurePath | IO, *, overwrite: bool = False)#
Upload the given file to the job directory at the given path. Fails if target already exists.
- Parameters:
target_path (str | Path) – Name or path of file to write in job directory.
source (str | bytes | Path | IO) – Local path or file handle to upload. May also be specified as raw bytes.
overwrite (bool, optional) – If True, overwrite existing files. Defaults to False.
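A sketch of uploading from a local path and from raw bytes; all paths shown are hypothetical:

```python
# Upload a local file into the job directory...
job.upload("notes/run_config.json", "/home/user/run_config.json")

# ...or write raw bytes directly, replacing any existing file.
job.upload("notes/readme.txt", b"Generated outside CryoSPARC", overwrite=True)
```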
- upload_asset(file: str | PurePath | IO[bytes], filename: str | None = None, format: Literal['txt', 'csv', 'html', 'json', 'xml', 'bild', 'bld'] | Literal['pdf', 'gif', 'jpg', 'jpeg', 'png', 'svg'] | None = None) EventLogAsset #
Upload an image or text file to the current job. Specify either an image (PNG, JPG, GIF, PDF, SVG), text file (TXT, CSV, JSON, XML) or a binary IO object with data in one of those formats.
If a binary IO object is specified, either a filename or file format must be specified.
Unlike the upload method, which saves files to the job directory, this method saves files to the database and exposes them for use in the job event log.
- Parameters:
file (str | Path | IO) – Source asset file path or handle.
filename (str, optional) – Filename of asset. If file is a handle, specify one of filename or format. Defaults to None.
format (AssetFormat, optional) – Format of the asset file. If file is a handle, specify one of filename or format. Defaults to None.
- Raises:
ValueError – If incorrect arguments specified
- Returns:
Dictionary including details about the uploaded asset.
- Return type:
EventLogAsset
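A sketch of attaching a generated CSV to the event log; because the data is passed as a binary handle, a filename (or format) must be given. The helper name and column names are illustrative.

```python
import io

def log_metrics_csv(job, values, filename="metrics.csv"):
    """Attach per-item metric values to the job's event log as a CSV asset."""
    csv_text = "index,value\n" + "\n".join(
        f"{i},{v}" for i, v in enumerate(values)
    )
    # A binary handle requires an explicit filename or format:
    return job.upload_asset(io.BytesIO(csv_text.encode("utf-8")),
                            filename=filename)
```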
- upload_dataset(target_path: str | PurePosixPath, dset: Dataset, *, format: int = 1, overwrite: bool = False)#
Upload a dataset as a CS file into the job directory. Fails if the target already exists, unless overwrite is True.
- Parameters:
target_path (str | Path) – Name or path of dataset to save in the job directory. Should have a .cs extension.
dset (Dataset) – Dataset to save.
format (int) – Format to save in, from cryosparc.dataset.*_FORMAT. Defaults to NUMPY_FORMAT.
overwrite (bool, optional) – If True, overwrite existing files. Defaults to False.
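A brief sketch, assuming dset is a cryosparc.dataset.Dataset and a live connection; the name particle_scores is hypothetical.

```python
def save_scores(job, dset, name="particle_scores"):
    """Save a Dataset as <name>.cs in the job directory, overwriting any
    previous copy (illustrative helper)."""
    target = f"{name}.cs"  # the target should carry the .cs extension
    job.upload_dataset(target, dset, overwrite=True)
    return target
```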
- upload_mrc(target_path: str | PurePosixPath, data: NDArray, psize: float, *, overwrite: bool = False)#
Upload a numpy 2D or 3D array to the job directory as an MRC file. Fails if the target already exists, unless overwrite is True.
- Parameters:
target_path (str | Path) – Name or path of MRC file to save in the job directory. Should have a .mrc extension.
data (NDArray) – Numpy array with MRC file data.
psize (float) – Pixel size to include in MRC header.
overwrite (bool, optional) – If True, overwrite existing files. Defaults to False.
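A hedged sketch of writing a reconstructed map; the file name volume.mrc is an assumption, and the actual upload needs a live connection.

```python
import numpy as np

def save_volume(job, volume, psize):
    """Write a 2D or 3D map to volume.mrc with the given pixel size (Å).
    Illustrative helper; not part of cryosparc-tools."""
    if volume.ndim not in (2, 3):
        raise ValueError("upload_mrc expects a 2D or 3D array")
    job.upload_mrc("volume.mrc", volume.astype(np.float32), psize,
                   overwrite=True)
    return volume.shape
```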
- upload_plot(figure: str | PurePath | IO[bytes] | Any, name: str | None = None, formats: Iterable[Literal['pdf', 'gif', 'jpg', 'jpeg', 'png', 'svg']] = ['png', 'pdf'], raw_data: str | bytes | None = None, raw_data_file: str | PurePath | IO[bytes] | None = None, raw_data_format: Literal['txt', 'csv', 'html', 'json', 'xml', 'bild', 'bld'] | None = None, savefig_kw: dict = {'bbox_inches': 'tight', 'pad_inches': 0}) List[EventLogAsset] #
Upload the given figure. Returns a list of the created asset objects. Avoid using directly; use log_plot instead. See log_plot for additional details.
- Parameters:
figure (str | Path | IO | Figure) – Image file path, file handle or matplotlib figure instance
name (str) – Associated name for given figure
formats (list[ImageFormat], optional) – Image formats to save the plot into. If figure is a file handle, specify formats=['<format>'], where <format> is a valid image extension such as png or pdf; png is assumed if not specified. Defaults to ["png", "pdf"].
raw_data (str | bytes, optional) – Raw text data for the associated plot, generally in CSV, XML or JSON format. Cannot be specified with raw_data_file. Defaults to None.
raw_data_file (str | Path | IO, optional) – Path to raw text data. Cannot be specified with raw_data. Defaults to None.
raw_data_format (TextFormat, optional) – Format of the raw text data. Defaults to None.
savefig_kw (dict, optional) – If a matplotlib figure is specified, optional keyword arguments for its savefig method. Defaults to dict(bbox_inches="tight", pad_inches=0).
- Raises:
ValueError – If incorrect argument specified
- Returns:
Details about the created job assets
- Return type:
list[EventLogAsset]
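The raw_data option can carry the plotted values alongside the image. A hedged sketch follows: figure may be a matplotlib Figure or an image path, and the name fsc_curve plus the CSV column names are assumptions.

```python
def log_fsc_plot(job, figure, resolutions, fsc_values):
    """Upload an FSC curve figure plus its raw CSV data to the job log
    (illustrative helper)."""
    csv_text = "resolution,fsc\n" + "\n".join(
        f"{r},{f}" for r, f in zip(resolutions, fsc_values)
    )
    return job.upload_plot(
        figure,
        name="fsc_curve",
        formats=["png"],
        raw_data=csv_text,       # mutually exclusive with raw_data_file
        raw_data_format="csv",
    )
```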
- wait_for_done(*, error_on_incomplete: bool = False, timeout: int | None = None) str #
Wait until a job reaches status “completed”, “killed” or “failed”.
- Parameters:
error_on_incomplete (bool, optional) – If True, raises an assertion error when job finishes with status other than “completed” or timeout is reached. Defaults to False.
timeout (int, optional) – If specified, wait at most this many seconds. Once the timeout is reached, returns the current status, or fails if error_on_incomplete is True. Defaults to None.
- wait_for_status(status: Literal['building', 'queued', 'launched', 'started', 'running', 'waiting', 'completed', 'killed', 'failed'] | Iterable[Literal['building', 'queued', 'launched', 'started', 'running', 'waiting', 'completed', 'killed', 'failed']], *, timeout: int | None = None) str #
Wait for a job’s status to reach the specified value. Must be one of the following:
‘building’
‘queued’
‘launched’
‘started’
‘running’
‘waiting’
‘completed’
‘killed’
‘failed’
- Parameters:
status (str | set[str]) – Specific status or set of statuses to wait for. If a set of statuses is specified, waits until the job reaches any of them.
timeout (int, optional) – If specified, wait at most this many seconds. Once timeout is reached, returns current status. Defaults to None.
- Returns:
Current job status
- Return type:
str
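The two waiting methods compose naturally: wait_for_status for an intermediate state, then wait_for_done for a terminal one. A sketch assuming a queued job and a live connection; the timeouts are arbitrary.

```python
def watch_job(job, start_timeout=600, finish_timeout=3600):
    """Wait for a job to start running, then block until it reaches a
    terminal status; returns the final status string (illustrative helper)."""
    # wait_for_status accepts a single status or any iterable of statuses:
    job.wait_for_status({"launched", "started", "running"},
                        timeout=start_timeout)
    return job.wait_for_done(error_on_incomplete=False,
                             timeout=finish_timeout)
```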