cryosparc.tools#

Main module exporting the CryoSPARC class for interfacing with a CryoSPARC instance from Python

Examples

>>> from cryosparc.tools import CryoSPARC
>>> license = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
>>> email = "ali@example.com"
>>> password = "password123"
>>> cs = CryoSPARC(
...     license=license,
...     email=email,
...     password=password,
...     host="localhost",
...     base_port=39000
... )
>>> project = cs.find_project("P3")

Classes:

CryoSPARC([license, host, base_port, email, ...])

High-level session class for interfacing with a CryoSPARC instance.

Data:

SUPPORTED_EXPOSURE_FORMATS

Supported micrograph file formats.

Functions:

downsample(arr[, factor])

Downsample a micrograph or movie by the given factor.

get_exposure_format(data_format[, voxel_type])

Get the movie_blob/format or micrograph_blob/format value for an exposure type.

get_import_signatures(abs_paths)

Get list of import signatures for the given path or paths.

lowpass2(arr, psize_A[, ...])

Apply a Butterworth lowpass filter to the 2D array data with the given pixel size (psize_A).

class cryosparc.tools.CryoSPARC(license: str = '', host: str = 'localhost', base_port: int = 39000, email: str = '', password: str = '', timeout: int = 300)#

High-level session class for interfacing with a CryoSPARC instance.

Initialize with the host and base port of the running CryoSPARC instance. This host and (at minimum) base_port + 2, base_port + 3 and base_port + 5 should be accessible on the network.
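The port arithmetic above can be sketched as follows. This is a minimal illustration, not part of cryosparc.tools: service_ports and SERVICE_OFFSETS are hypothetical names, and the offsets 2, 3 and 5 are paired with the command_core, command_vis and command_rtp services as described for the cli, vis and rtp attributes.

```python
# Hypothetical helper (not part of cryosparc.tools): list the ports that
# must be reachable on the network for a given base port.
SERVICE_OFFSETS = {"command_core": 2, "command_vis": 3, "command_rtp": 5}


def service_ports(base_port=39000):
    """Return a mapping of service name to port number."""
    return {name: base_port + offset for name, offset in SERVICE_OFFSETS.items()}


for name, port in service_ports(39000).items():
    print(f"{name}: {port}")
```

A firewall rule or connectivity check would need to cover each of these ports on the master host.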

Parameters:
  • license (str, optional) – CryoSPARC license key. Defaults to os.getenv("CRYOSPARC_LICENSE_ID").

  • host (str, optional) – Hostname or IP address running CryoSPARC master. Defaults to os.getenv("CRYOSPARC_MASTER_HOSTNAME", "localhost").

  • base_port (int, optional) – CryoSPARC services base port number. Defaults to os.getenv("CRYOSPARC_BASE_PORT", 39000).

  • email (str, optional) – CryoSPARC user account email address. Defaults to os.getenv("CRYOSPARC_EMAIL").

  • password (str, optional) – CryoSPARC user account password. Defaults to os.getenv("CRYOSPARC_PASSWORD").

  • timeout (int, optional) – Timeout in seconds for HTTP requests to CryoSPARC command services. Defaults to 300.

cli#

HTTP/JSONRPC client for command_core service (base_port + 2).

Type:

CommandClient

vis#

HTTP/JSONRPC client for command_vis service (base_port + 3).

Type:

CommandClient

rtp#

HTTP/JSONRPC client for command_rtp service (base_port + 5).

Type:

CommandClient

user_id#

Mongo object ID of user account performing operations for this session.

Type:

str

Examples

Load project job and micrographs

>>> from cryosparc.tools import CryoSPARC
>>> cs = CryoSPARC(
...     license="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
...     email="ali@example.com",
...     password="password123",
...     base_port=39000
... )
>>> job = cs.find_job("P3", "J42")
>>> micrographs = job.load_output('exposures')

Remove corrupt exposures (assumes an is_mic_corrupt function that returns True for corrupt files)

>>> filtered_micrographs = micrographs.query(
...     lambda mic: not is_mic_corrupt(mic["micrograph_blob/path"])
... )
>>> cs.save_external_result(
...     project_uid="P3",
...     workspace_uid="W1",
...     dataset=filtered_micrographs,
...     type="exposure",
...     name="filtered_exposures",
...     passthrough=("J42", "exposures")
... )
"J43"

Methods:

cp(project_uid, source_path_rel, target_path_rel)

Copy a file or folder within a project to another location within that same project.

create_external_job(project_uid, workspace_uid)

Add a new External job to this project to save generated outputs to.

create_job(project_uid, workspace_uid, type)

Create a new job with the given type.

create_workspace(project_uid, title[, desc])

Create a new empty workspace in the given project.

download(project_uid, path_rel)

Open a file in the given project for reading.

download_asset(fileid, target)

Download a file from CryoSPARC's MongoDB GridFS storage.

download_dataset(project_uid, path_rel)

Download a .cs dataset file from the given relative path in the project directory.

download_file(project_uid, path_rel, target)

Download a file from the project directory to the given writeable target.

download_mrc(project_uid, path_rel)

Download a .mrc file from the given relative path in the project directory.

find_external_job(project_uid, job_uid)

Get the External job accessor instance for an External job in this project with the given UID.

find_job(project_uid, job_uid)

Get a job by its unique project and job ID.

find_project(project_uid)

Get a project by its unique ID.

find_workspace(project_uid, workspace_uid)

Get a workspace accessor instance for the workspace in the given project with the given UID.

get_job_sections()

Get a summary of job types available for this instance, organized by category.

get_lanes()

Get a list of available scheduler lanes.

get_targets([lane])

Get a list of available scheduler targets.

list_assets(project_uid, job_uid)

Get a list of files available in the database for given job.

list_files(project_uid[, prefix, recursive])

Get a list of files inside the project directory.

mkdir(project_uid, target_path_rel[, ...])

Create a directory in the given project.

save_external_result(project_uid, ...[, ...])

Save the given result dataset to the project.

symlink(project_uid, source_path_rel, ...)

Create a symbolic link in the given project.

test_connection()

Verify connection to CryoSPARC command services.

upload(project_uid, target_path_rel, source, *)

Upload the given source file to the project directory at the given relative path.

upload_dataset(project_uid, target_path_rel, ...)

Upload a dataset as a CS file into the project directory.

upload_mrc(project_uid, target_path_rel, ...)

Upload a numpy 2D or 3D array to the project directory as an MRC file.

cp(project_uid: str, source_path_rel: Union[str, PurePosixPath], target_path_rel: Union[str, PurePosixPath])#

Copy a file or folder within a project to another location within that same project.

Parameters:
  • project_uid (str) – Target project UID, e.g., “P3”.

  • source_path_rel (str | Path) – Relative path in project of source file or folder to copy.

  • target_path_rel (str | Path) – Relative path in project to copy to.

create_external_job(project_uid: str, workspace_uid: str, title: Optional[str] = None, desc: Optional[str] = None) ExternalJob#

Add a new External job to this project to save generated outputs to.

Parameters:
  • project_uid (str) – Project UID to create in, e.g., “P3”

  • workspace_uid (str) – Workspace UID to create job in, e.g., “W1”

  • title (str, optional) – Title for external job (recommended). Defaults to None.

  • desc (str, optional) – Markdown description for external job. Defaults to None.

Returns:

created external job instance

Return type:

ExternalJob

create_job(project_uid: str, workspace_uid: str, type: str, connections: Dict[str, Union[Tuple[str, str], List[Tuple[str, str]]]] = {}, params: Dict[str, Any] = {}, title: Optional[str] = None, desc: Optional[str] = None) Job#

Create a new job with the given type. Use CryoSPARC.get_job_sections to query available job types on the connected CryoSPARC instance.

Parameters:
  • project_uid (str) – Project UID to create job in, e.g., “P3”

  • workspace_uid (str) – Workspace UID to create job in, e.g., “W1”

  • type (str) – Job type identifier, e.g., “homo_abinit”

  • connections (dict[str, tuple[str, str] | list[tuple[str, str]]], optional) – Initial input connections. Each key is an input name and each value is a (job uid, output name) tuple or a list of such tuples. Defaults to {}.

  • params (dict[str, any], optional) – Specify parameter values. Defaults to {}.

  • title (str, optional) – Job title. Defaults to None.

  • desc (str, optional) – Job markdown description. Defaults to None.

Returns:

created job instance. Raises error if job cannot be created.

Return type:

Job
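Because each value in connections may be either a single (job uid, output name) tuple or a list of such tuples, client code that inspects the dictionary may find it convenient to bring both shapes into one form first. The sketch below shows one way to do that. normalize_connections is a hypothetical helper written for illustration and is not part of cryosparc.tools.

```python
def normalize_connections(connections):
    """Map every input name to a list of (job uid, output name) tuples."""
    normalized = {}
    for input_name, value in connections.items():
        is_single = (
            isinstance(value, tuple)
            and len(value) == 2
            and all(isinstance(part, str) for part in value)
        )
        if is_single:
            # A lone ("J20", "particles_selected") tuple becomes a one-item list.
            normalized[input_name] = [value]
        else:
            # Otherwise treat the value as a sequence of connection tuples.
            normalized[input_name] = [tuple(item) for item in value]
    return normalized


single = {"particles": ("J20", "particles_selected")}
multiple = {"particles": [("J20", "particles_selected"), ("J21", "particles_selected")]}
print(normalize_connections(single))
print(normalize_connections(multiple))
```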

Examples

Create an Import Movies job.

>>> from cryosparc.tools import CryoSPARC
>>> cs = CryoSPARC()
>>> import_job = cs.create_job("P3", "W1", "import_movies")
>>> import_job.set_param("blob_paths", "/bulk/data/t20s/*.tif")
True

Create a 3-class ab-initio job connected to existing particles.

>>> abinit_job = cs.create_job("P3", "W1", "homo_abinit",
...     connections={"particles": ("J20", "particles_selected")},
...     params={"abinit_K": 3}
... )

create_workspace(project_uid: str, title: str, desc: Optional[str] = None) Workspace#

Create a new empty workspace in the given project.

Parameters:
  • project_uid (str) – Project UID to create in, e.g., “P3”.

  • title (str) – Title of new workspace.

  • desc (str, optional) – Markdown text description. Defaults to None.

Returns:

created workspace instance

Return type:

Workspace

download(project_uid: str, path_rel: Union[str, PurePosixPath])#

Open a file in the given project for reading. Use to get files from a remote CryoSPARC instance whose project directories are not available on the client file system.

Parameters:
  • project_uid (str) – Short unique ID of CryoSPARC project, e.g., “P3”

  • path_rel (str | Path) – Relative path to file in project directory

Yields:

HTTPResponse – Use a context manager to read the file from the request body

Examples

Download a job’s metadata

>>> import json
>>> cs = CryoSPARC()
>>> with cs.download('P3', 'J42/job.json') as res:
...     job_data = json.loads(res.read())

download_asset(fileid: str, target: Union[str, PurePath, IO[bytes]])#

Download a file from CryoSPARC’s MongoDB GridFS storage.

Parameters:
  • fileid (str) – GridFS file object ID

  • target (str | Path | IO) – Local file path, directory path or writeable file handle to write response data.

Returns:

resulting target path or file handle.

Return type:

Path | IO

download_dataset(project_uid: str, path_rel: Union[str, PurePosixPath])#

Download a .cs dataset file from the given relative path in the project directory.

Parameters:
  • project_uid (str) – project unique ID, e.g., “P3”

  • path_rel (str | Path) – Relative path to .cs file in project directory.

Returns:

Loaded dataset instance

Return type:

Dataset

download_file(project_uid: str, path_rel: Union[str, PurePosixPath], target: Union[str, PurePath, IO[bytes]])#

Download a file from the project directory to the given writeable target.

Parameters:
  • project_uid (str) – project unique ID, e.g., “P3”

  • path_rel (str | Path) – Relative path of file in project directory.

  • target (str | Path | IO) – Local file path, directory path or writeable file handle to write response data.

Returns:

resulting target path or file handle.

Return type:

Path | IO

download_mrc(project_uid: str, path_rel: Union[str, PurePosixPath])#

Download a .mrc file from the given relative path in the project directory.

Parameters:
  • project_uid (str) – project unique ID, e.g., “P3”

  • path_rel (str | Path) – Relative path to .mrc file in project directory.

Returns:

MRC file header and data as a numpy array

Return type:

tuple[Header, NDArray]

find_external_job(project_uid: str, job_uid: str) ExternalJob#

Get the External job accessor instance for an External job in this project with the given UID. Fails if the job does not exist or is not an external job.

Parameters:
  • project_uid (str) – Project unique ID, e.g., “P3”

  • job_uid (str) – Job unique ID, e.g., “J42”

Raises:

TypeError – If job is not an external job

Returns:

accessor instance

Return type:

ExternalJob

find_job(project_uid: str, job_uid: str) Job#

Get a job by its unique project and job ID.

Parameters:
  • project_uid (str) – project unique ID, e.g., “P3”

  • job_uid (str) – job unique ID, e.g., “J42”

Returns:

job instance

Return type:

Job

find_project(project_uid: str) Project#

Get a project by its unique ID.

Parameters:

project_uid (str) – project unique ID, e.g., “P3”

Returns:

project instance

Return type:

Project

find_workspace(project_uid: str, workspace_uid: str) Workspace#

Get a workspace accessor instance for the workspace in the given project with the given UID. Fails with an error if workspace does not exist.

Parameters:
  • project_uid (str) – Project unique ID, e.g., “P3”

  • workspace_uid (str) – Workspace unique ID, e.g., “W1”

Returns:

accessor instance

Return type:

Workspace

get_job_sections() List[JobSection]#

Get a summary of job types available for this instance, organized by category.

Returns:

List of job section dictionaries. Job types are listed in the "contains" key in each dictionary.

Return type:

list[JobSection]

get_lanes() List[SchedulerLane]#

Get a list of available scheduler lanes.

Returns:

Details about available lanes.

Return type:

list[SchedulerLane]

get_targets(lane: Optional[str] = None) List[Union[SchedulerTargetNode, SchedulerTargetGpuNode, SchedulerTargetCluster]]#

Get a list of available scheduler targets.

Parameters:

lane (str, optional) – Only get targets from this specific lane. Returns all targets if not specified. Defaults to None.

Returns:

Details about available targets.

Return type:

list[SchedulerTarget]

list_assets(project_uid: str, job_uid: str) List[AssetDetails]#

Get a list of files available in the database for given job. Returns a list with details about the assets. Each entry is a dict with a _id key which may be used to download the file with the download_asset method.

Parameters:
  • project_uid (str) – project unique ID, e.g., “P3”

  • job_uid (str) – job unique ID, e.g., “J42”

Returns:

Asset details

Return type:

list[AssetDetails]
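The pairing of list_assets and download_asset described above can be sketched as a small loop. This is an illustration only: download_job_assets is a hypothetical helper, cs stands for a connected CryoSPARC instance, and naming each local file after its _id is a choice made for this sketch, since only the _id key is described here.

```python
from pathlib import Path


def download_job_assets(cs, project_uid, job_uid, target_dir):
    """Download every asset of a job into target_dir, one file per asset ID."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for asset in cs.list_assets(project_uid, job_uid):
        file_id = str(asset["_id"])
        # Name the local file after the asset ID (a choice made for this sketch).
        target = target_dir / file_id
        cs.download_asset(file_id, target)
        saved.append(target)
    return saved
```

With a live session, a call such as download_job_assets(cs, "P3", "J42", "assets") would be expected to leave one file per asset in the assets folder.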

list_files(project_uid: str, prefix: Union[str, PurePosixPath] = '', recursive: bool = False) List[str]#

Get a list of files inside the project directory.

Parameters:
  • project_uid (str) – Project unique ID, e.g., “P3”

  • prefix (str | Path, optional) – Subdirectory inside project to list. Defaults to “”.

  • recursive (bool, optional) – If True, lists files recursively. Defaults to False.

Returns:

List of file paths relative to the project directory.

Return type:

list[str]
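Since list_files returns plain relative path strings, filtering the result is ordinary string handling. The sketch below collects files with a given extension under a subdirectory. find_files_with_suffix is a hypothetical helper, and cs stands for a connected CryoSPARC instance.

```python
def find_files_with_suffix(cs, project_uid, suffix, prefix=""):
    """Return sorted project-relative paths under prefix that end with suffix."""
    paths = cs.list_files(project_uid, prefix=prefix, recursive=True)
    return sorted(path for path in paths if path.endswith(suffix))
```

For example, find_files_with_suffix(cs, "P3", ".mrc", prefix="J42") would list the .mrc files found anywhere below the J42 folder of project P3.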

mkdir(project_uid: str, target_path_rel: Union[str, PurePosixPath], parents: bool = False, exist_ok: bool = False)#

Create a directory in the given project.

Parameters:
  • project_uid (str) – Target project UID, e.g., “P3”.

  • target_path_rel (str | Path) – Relative path to create inside project directory.

  • parents (bool, optional) – If True, any missing parents are created as needed. Defaults to False.

  • exist_ok (bool, optional) – If True, does not raise an error for existing directories. Still raises if the target path is not a directory. Defaults to False.
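The parents and exist_ok flags share their names with pathlib.Path.mkdir in the Python standard library. Assuming they behave analogously (an assumption, not something confirmed here), the local example below illustrates the semantics described above. It operates on a temporary local directory, not on a CryoSPARC project.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "a" / "b"

    # Without parents=True, a missing parent directory is an error.
    try:
        target.mkdir()
        missing_parent_failed = False
    except FileNotFoundError:
        missing_parent_failed = True

    # parents=True creates both "a" and "b" in one call.
    target.mkdir(parents=True)
    created = target.is_dir()

    # exist_ok=True tolerates a directory that already exists.
    target.mkdir(exist_ok=True)

    # It still raises when the path exists but is not a directory.
    file_path = Path(tmp) / "file.txt"
    file_path.write_text("not a directory")
    try:
        file_path.mkdir(exist_ok=True)
        file_conflict_failed = False
    except FileExistsError:
        file_conflict_failed = True

print(missing_parent_failed, created, file_conflict_failed)
```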

save_external_result(project_uid: str, workspace_uid: Optional[str], dataset: Dataset[R], type: Literal['exposure', 'particle', 'template', 'volume', 'mask', 'live', 'ml_model', 'symmetry_candidate', 'flex_mesh', 'flex_model', 'hyperparameter'], name: Optional[str] = None, slots: Optional[List[Union[str, Datafield]]] = None, passthrough: Optional[Tuple[str, str]] = None, title: Optional[str] = None, desc: Optional[str] = None) str#

Save the given result dataset to the project. Specify at least the dataset to save and the type of data.

Returns UID of the External job where the results were saved.

Examples

Save all particle data

>>> particles = Dataset()
>>> cs.save_external_result("P1", "W1", particles, 'particle')
"J43"

Save new particle locations that inherit passthrough slots from a parent job.

>>> particles = Dataset()
>>> cs.save_external_result(
...     project_uid="P1",
...     workspace_uid="W1",
...     dataset=particles,
...     type='particle',
...     name='particles',
...     slots=['location'],
...     passthrough=('J42', 'selected_particles'),
...     title='Re-centered particles'
... )
"J44"

Save a result with multiple slots of the same type.

>>> cs.save_external_result(
...     project_uid="P1",
...     workspace_uid="W1",
...     dataset=particles,
...     type="particle",
...     name="particle_alignments",
...     slots=[
...         {"dtype": "alignments3D", "prefix": "alignments_class_0", "required": True},
...         {"dtype": "alignments3D", "prefix": "alignments_class_1", "required": True},
...         {"dtype": "alignments3D", "prefix": "alignments_class_2", "required": True},
...     ]
... )
"J45"

Parameters:
  • project_uid (str) – Project UID to save results into.

  • workspace_uid (str | None) – Workspace UID to save results into. Specify None to auto-select a workspace.

  • dataset (Dataset) – Result dataset.

  • type (Datatype) – Type of output dataset.

  • name (str, optional) – Name of output on created External job. Same as type if unspecified. Defaults to None.

  • slots (list[SlotSpec], optional) – List of slots expected to be created for this output such as location or blob. Do not specify any slots that were passed through from an input unless those slots are modified in the output. Defaults to None.

  • passthrough (tuple[str, str], optional) – Indicates that this output inherits slots from the specified output. e.g., ("J1", "particles"). Defaults to None.

  • title (str, optional) – Human-readable title for this output. Defaults to None.

  • desc (str, optional) – Markdown description for this output. Defaults to None.

Returns:

UID of created job where this output was saved

Return type:

str
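The examples for this method pass slots either as plain strings such as 'location' or as dictionaries with dtype, prefix and required keys. The sketch below reads both forms into (prefix, dtype) pairs. describe_slots is a hypothetical helper, and treating a plain string as both the prefix and the dtype is an assumption made for illustration, not documented behavior.

```python
def describe_slots(slots):
    """Return a (prefix, dtype) pair for each slot specification."""
    described = []
    for slot in slots:
        if isinstance(slot, str):
            # Assumption: a plain string names both the prefix and the dtype.
            described.append((slot, slot))
        else:
            described.append((slot["prefix"], slot["dtype"]))
    return described


slots = [
    "location",
    {"dtype": "alignments3D", "prefix": "alignments_class_0", "required": True},
]
print(describe_slots(slots))
```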

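symlink(project_uid: str, source_path_rel: Union[str, PurePosixPath], target_path_rel: Union[str, PurePosixPath])#

Create a symbolic link in the given project. May only create links for files within the project.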

Parameters:
  • project_uid (str) – Target project UID, e.g., “P3”.

  • source_path_rel (str | Path) – Relative path in project to file from which to create symlink.

  • target_path_rel (str | Path) – Relative path in project to new symlink.

test_connection()#

Verify connection to CryoSPARC command services.

Returns:

True if connection succeeded, False otherwise

Return type:

bool

upload(project_uid: str, target_path_rel: Union[str, PurePosixPath], source: Union[str, bytes, PurePath, IO], *, overwrite: bool = False)#

Upload the given source file to the project directory at the given relative path. Fails if target already exists, unless overwrite is True.

Parameters:
  • project_uid (str) – project unique ID, e.g., “P3”

  • target_path_rel (str | Path) – Relative target path in project directory.

  • source (str | bytes | Path | IO) – Local path or file handle to upload. May also be specified as raw bytes.

  • overwrite (bool, optional) – If True, overwrite existing files. Defaults to False.
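Because source may be given as raw bytes, in-memory data can be uploaded without writing a temporary file first. The sketch below serializes a Python object to JSON and passes the bytes to upload. upload_json is a hypothetical helper, and cs stands for a connected CryoSPARC instance.

```python
import json


def upload_json(cs, project_uid, target_path_rel, obj, overwrite=False):
    """Serialize obj to JSON and upload it as raw bytes. Returns the byte count."""
    payload = json.dumps(obj, indent=2).encode("utf-8")
    # overwrite is keyword-only in the upload signature.
    cs.upload(project_uid, target_path_rel, payload, overwrite=overwrite)
    return len(payload)
```

A call such as upload_json(cs, "P3", "notes/settings.json", {"threshold": 0.5}) would then be expected to create the file at that relative path in the project directory.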

upload_dataset(project_uid: str, target_path_rel: Union[str, PurePosixPath], dset: Dataset, *, format: int = 1, overwrite: bool = False)#

Upload a dataset as a CS file into the project directory. Fails if target already exists, unless overwrite is True.

Parameters:
  • project_uid (str) – project unique ID, e.g., “P3”

  • target_path_rel (str | Path) – relative path to save dataset in project directory. Should have a .cs extension.

  • dset (Dataset) – dataset to save.

  • format (int, optional) – Format to save in, one of the cryosparc.dataset.*_FORMAT constants. Defaults to NUMPY_FORMAT.

  • overwrite (bool, optional) – If True, overwrite existing files. Defaults to False.

upload_mrc(project_uid: str, target_path_rel: Union[str, PurePosixPath], data: NDArray, psize: float, *, overwrite: bool = False)#

Upload a numpy 2D or 3D array to the project directory as an MRC file. Fails if target already exists, unless overwrite is True.

Parameters:
  • project_uid (str) – project unique ID, e.g., “P3”

  • target_path_rel (str | Path) – filename or relative path. Should have .mrc extension.

  • data (NDArray) – Numpy array with MRC file data.

  • psize (float) – Pixel size to include in MRC header.

  • overwrite (bool, optional) – If True, overwrite existing files. Defaults to False.

cryosparc.tools.LICENSE_REGEX = re.compile('[a-f\\d]{8}-[a-f\\d]{4}-[a-f\\d]{4}-[a-f\\d]{4}-[a-f\\d]{12}')#

Regular expression for matching CryoSPARC license IDs.
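As a quick illustration of what this pattern accepts, the sketch below reproduces the same expression so that it runs without cryosparc installed. looks_like_license is a hypothetical helper, and the sample ID is made up.

```python
import re

# Same pattern as LICENSE_REGEX, reproduced so the sketch runs standalone.
LICENSE_PATTERN = re.compile(r"[a-f\d]{8}-[a-f\d]{4}-[a-f\d]{4}-[a-f\d]{4}-[a-f\d]{12}")


def looks_like_license(value):
    """Return True if value is exactly one license-shaped ID."""
    # The pattern is unanchored, so fullmatch is used to reject extra text.
    return LICENSE_PATTERN.fullmatch(value) is not None


print(looks_like_license("0123abcd-4567-89ef-0123-456789abcdef"))
print(looks_like_license("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"))
```

The pattern only allows digits and the lowercase letters a to f, so the placeholder made of x characters and any uppercase ID do not match.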

cryosparc.tools.SUPPORTED_EXPOSURE_FORMATS = {'CMRCBZ2', 'EER', 'MRC', 'MRCBZ2', 'MRCS', 'TIFF'}#

Supported micrograph file formats.

cryosparc.tools.VERSION_REGEX = re.compile('^v\\d+\\.\\d+\\.\\d+')#

Regular expression for CryoSPARC minor version, e.g., ‘v4.1.0’
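The sketch below applies the same expression to a few strings and splits a match into integers. The pattern is reproduced so the example runs standalone. parse_version is a hypothetical helper, not part of cryosparc.tools.

```python
import re

# Same pattern as VERSION_REGEX, reproduced so the sketch runs standalone.
VERSION_PATTERN = re.compile(r"^v\d+\.\d+\.\d+")


def parse_version(version):
    """Return the three leading version numbers as a tuple of ints, or None."""
    match = VERSION_PATTERN.match(version)
    if match is None:
        return None
    # Drop the leading "v" and split the three dot-separated numbers.
    return tuple(int(part) for part in match.group(0)[1:].split("."))


print(parse_version("v4.1.0"))
print(parse_version("4.1.0"))
```

The pattern is anchored only at the start, so any text after the third number is ignored by the match.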

cryosparc.tools.downsample(arr: NDArray, factor: int = 2)#

Downsample a micrograph or movie by the given factor.

Parameters:

  • arr (NDArray) – 2D or 3D numpy array.

  • factor (int, optional) – How much to reduce size by, e.g., a factor of 2 would reduce a 1024px MRC to 512px, and a factor of 4 would reduce it to 256px. Defaults to 2.

Returns:

Downsampled micrograph or movie array

Return type:

NDArray
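To make the effect of the factor concrete, the sketch below downsamples a small 2D image by averaging non-overlapping factor × factor blocks. This is a generic binning technique chosen for illustration. It is not claimed to be how cryosparc.tools.downsample works internally, and bin_2d is a hypothetical name.

```python
def bin_2d(image, factor=2):
    """Downsample a 2D list of numbers by averaging factor x factor blocks."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    # Rows and columns that do not fill a whole block are dropped.
    rows = len(image) // factor
    cols = len(image[0]) // factor if image else 0
    binned = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            block = [
                image[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            out_row.append(sum(block) / len(block))
        binned.append(out_row)
    return binned


image = [[4 * r + c for c in range(4)] for r in range(4)]
print(bin_2d(image, factor=2))
```

In this sketch a 4 × 4 input with factor 2 gives a 2 × 2 output, matching the 1024px to 512px example in the parameter description.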

cryosparc.tools.get_exposure_format(data_format: str, voxel_type: Optional[str] = None) str#

Get the movie_blob/format or micrograph_blob/format value for an exposure type, where data_format is one of

  • “MRC”

  • “MRCS”

  • “TIFF”

  • “CMRCBZ2”

  • “MRCBZ2”

  • “EER”

And voxel_type (if specified) is one of

  • “16 BIT FLOAT”

  • “32 BIT FLOAT”

  • “SIGNED 16 BIT INTEGER”

  • “UNSIGNED 8 BIT INTEGER”

  • “UNSIGNED 16 BIT INTEGER”

Parameters:
  • data_format (str) – One of SUPPORTED_EXPOSURE_FORMATS such as "TIFF" or "MRC". The value of the <dataFormat> tag in an EPU XML file.

  • voxel_type (str, optional) – The value of the <voxelType> tag in an EPU file such as "32 BIT FLOAT". Required when data_format is MRC or MRCS. Defaults to None.

Returns:

The format string to save into the {prefix}/format field of a CryoSPARC exposure dataset, e.g., "TIFF" or "MRC/2"

Return type:

str
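The argument rules above (data_format must be a supported format, and voxel_type is required for MRC and MRCS) can be checked before calling the function. The sketch below is a hypothetical pre-check built only from the values listed in this section. It does not compute the returned format string, and check_exposure_args is not part of cryosparc.tools.

```python
# Values copied from the lists in this section.
SUPPORTED_FORMATS = {"CMRCBZ2", "EER", "MRC", "MRCBZ2", "MRCS", "TIFF"}
VOXEL_TYPES = {
    "16 BIT FLOAT",
    "32 BIT FLOAT",
    "SIGNED 16 BIT INTEGER",
    "UNSIGNED 8 BIT INTEGER",
    "UNSIGNED 16 BIT INTEGER",
}


def check_exposure_args(data_format, voxel_type=None):
    """Raise ValueError if the arguments break the documented rules."""
    if data_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported data_format: {data_format!r}")
    if data_format in {"MRC", "MRCS"} and voxel_type is None:
        raise ValueError("voxel_type is required when data_format is MRC or MRCS")
    if voxel_type is not None and voxel_type not in VOXEL_TYPES:
        raise ValueError(f"unknown voxel_type: {voxel_type!r}")
    return True


print(check_exposure_args("TIFF"))
print(check_exposure_args("MRC", "32 BIT FLOAT"))
```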

cryosparc.tools.get_import_signatures(abs_paths: Union[str, Iterable[str], NDArray])#

Get list of import signatures for the given path or paths.

Parameters:

abs_paths (str | Iterable[str] | NDArray) – Absolute path or list of file paths.

Returns:

Import signatures as 64-bit numpy integers

Return type:

list[int]

cryosparc.tools.lowpass2(arr: NDArray, psize_A: float, cutoff_resolution_A: float = 0.0, order: float = 1.0)#

Apply a Butterworth lowpass filter to the 2D array data with the given pixel size (psize_A). cutoff_resolution_A should be a non-negative number specified in Angstroms.

Parameters:
  • arr (NDArray) – 2D numpy array to apply lowpass to.

  • psize_A (float) – Pixel size of array data.

  • cutoff_resolution_A (float, optional) – Cutoff resolution, in Angstroms. Defaults to 0.0.

  • order (float, optional) – Filter order. Defaults to 1.0.

Returns:

Lowpass-filtered copy of given numpy array

Return type:

NDArray
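For intuition about the cutoff and order parameters, the sketch below evaluates the textbook Butterworth amplitude response, 1 / sqrt(1 + (f / f_c)^(2 · order)), at a spatial frequency f, with the cutoff frequency taken as f_c = 1 / cutoff_resolution_A. This is the generic formula under stated assumptions and is not claimed to match the exact normalization used by lowpass2. butterworth_gain and nyquist_frequency are hypothetical names, and the sketch requires a positive cutoff.

```python
import math


def butterworth_gain(freq_inv_A, cutoff_resolution_A, order=1.0):
    """Amplitude gain at spatial frequency freq_inv_A (in 1/Angstrom)."""
    if cutoff_resolution_A <= 0:
        raise ValueError("this sketch requires a positive cutoff resolution")
    cutoff_freq = 1.0 / cutoff_resolution_A
    return 1.0 / math.sqrt(1.0 + (freq_inv_A / cutoff_freq) ** (2.0 * order))


def nyquist_frequency(psize_A):
    """Highest spatial frequency (1/Angstrom) representable at this pixel size."""
    return 1.0 / (2.0 * psize_A)


# Gain at zero frequency, at a 10 A cutoff, and at Nyquist for 1 A pixels.
for freq in (0.0, 0.1, nyquist_frequency(1.0)):
    print(freq, butterworth_gain(freq, cutoff_resolution_A=10.0, order=2.0))
```

In this formula the gain is 1 at zero frequency and 1/sqrt(2) at the cutoff. A higher order makes the falloff beyond the cutoff steeper.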