cryosparc.tools
Main module exporting the CryoSPARC class for interfacing with a CryoSPARC instance from Python.
Examples
>>> from cryosparc.tools import CryoSPARC
>>> license = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
>>> email = "ali@example.com"
>>> password = "password123"
>>> cs = CryoSPARC(
... license=license,
... email=email,
... password=password,
... host="localhost",
... base_port=39000
... )
>>> project = cs.find_project("P3")
Classes:
CryoSPARC – High-level session class for interfacing with a CryoSPARC instance.
Data:
SUPPORTED_EXPOSURE_FORMATS – Supported micrograph file formats.
Functions:
downsample – Downsample a micrograph or movie by the given factor.
get_exposure_format – Get the movie_blob/format or micrograph_blob format value for an exposure type.
get_import_signatures – Get list of import signatures for the given path or paths.
lowpass2 – Apply butterworth lowpass filter to the 2D array data with the given pixel size (psize_A).
- class cryosparc.tools.CryoSPARC(license: str = '', host: str = 'localhost', base_port: int = 39000, email: str = '', password: str = '', timeout: int = 300)#
High-level session class for interfacing with a CryoSPARC instance.
Initialize with the host and base port of the running CryoSPARC instance. This host and (at minimum) ports base_port + 2, base_port + 3 and base_port + 5 should be accessible on the network.
- Parameters
license (str, optional) – CryoSPARC license key. Defaults to os.getenv("CRYOSPARC_LICENSE_ID").
host (str, optional) – Hostname or IP address running CryoSPARC master. Defaults to os.getenv("CRYOSPARC_MASTER_HOSTNAME", "localhost").
base_port (int, optional) – CryoSPARC services base port number. Defaults to os.getenv("CRYOSPARC_BASE_PORT", 39000).
email (str, optional) – CryoSPARC user account email address. Defaults to os.getenv("CRYOSPARC_EMAIL").
password (str, optional) – CryoSPARC user account password. Defaults to os.getenv("CRYOSPARC_PASSWORD").
timeout (int, optional) – Timeout for HTTP requests to CryoSPARC command services. Defaults to 300.
- cli
HTTP/JSONRPC client for command_core service (port + 2).
- Type
- vis
HTTP/JSONRPC client for command_vis service (port + 3).
- Type
- rtp
HTTP/JSONRPC client for command_rtp service (port + 5).
- Type
- user_id
Mongo object ID of user account performing operations for this session.
- Type
str
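The documented defaults and port offsets can be sketched in plain Python. This only illustrates the behavior described above and is not the library's code: the helper names `resolve_connection_settings` and `service_ports` are hypothetical, and the `CRYOSPARC_BASE_PORT` variable name is an assumption.

```python
import os


def resolve_connection_settings(env=None):
    """Resolve connection settings from environment variables (sketch).

    `env` is any mapping and defaults to os.environ. The CRYOSPARC_BASE_PORT
    name is an assumption; the other names come from the parameter list.
    """
    env = os.environ if env is None else env
    return {
        "license": env.get("CRYOSPARC_LICENSE_ID", ""),
        "host": env.get("CRYOSPARC_MASTER_HOSTNAME", "localhost"),
        "base_port": int(env.get("CRYOSPARC_BASE_PORT", 39000)),
        "email": env.get("CRYOSPARC_EMAIL", ""),
        "password": env.get("CRYOSPARC_PASSWORD", ""),
    }


def service_ports(base_port):
    """Ports that should be reachable, using the offsets listed for cli, vis and rtp."""
    return {
        "command_core": base_port + 2,
        "command_vis": base_port + 3,
        "command_rtp": base_port + 5,
    }


# Example with an explicit mapping instead of the real process environment.
settings = resolve_connection_settings({"CRYOSPARC_MASTER_HOSTNAME": "cs.example.org"})
ports = service_ports(settings["base_port"])
```

A check like this can help decide which ports to open in a firewall before creating a session.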
Examples
Load project job and micrographs
>>> from cryosparc.tools import CryoSPARC
>>> cs = CryoSPARC(
...     license="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
...     email="ali@example.com",
...     password="password123",
...     base_port=39000
... )
>>> job = cs.find_job("P3", "J42")
>>> micrographs = job.load_output('exposures')
Remove corrupt exposures (assumes an is_mic_corrupt function is defined)
>>> filtered_micrographs = micrographs.query(
...     lambda mic: not is_mic_corrupt(mic["micrograph_blob/path"])
... )
>>> cs.save_external_result(
...     project_uid="P3",
...     workspace_uid="W1",
...     dataset=filtered_micrographs,
...     type="exposure",
...     name="filtered_exposures",
...     passthrough=("J42", "exposures")
... )
"J43"
Methods:
cp(project_uid, source_path[, target_path]) – Copy a file or folder within a project to another location within that same project.
create_external_job(project_uid, workspace_uid) – Add a new External job to this project to save generated outputs to.
create_job(project_uid, workspace_uid, type) – Create a new job with the given type.
create_workspace(project_uid, title[, desc]) – Create a new empty workspace in the given project.
download(project_uid, path) – Open a file in the given project for reading.
download_asset(fileid, target) – Download a file from CryoSPARC's MongoDB GridFS storage.
download_dataset(project_uid, path) – Download a .cs dataset file from the given relative path in the project directory.
download_file(project_uid, path[, target]) – Download a file from the directory of the specified project to the given target path or writeable file handle.
download_mrc(project_uid, path) – Download a .mrc file from the given relative path in the project directory.
find_external_job(project_uid, job_uid) – Get the External job accessor instance for an External job in this project with the given UID.
find_job(project_uid, job_uid) – Get a job by its unique project and job ID.
find_project(project_uid) – Get a project by its unique ID.
find_workspace(project_uid, workspace_uid) – Get a workspace accessor instance for the workspace in the given project with the given UID.
get_job_sections() – Get a summary of job types available for this instance, organized by category.
get_job_specs() – Get a detailed summary of jobs and their specifications available on this instance, organized by category.
get_lanes() – Get a list of available scheduler lanes.
get_targets([lane]) – Get a list of available scheduler targets.
list_assets(project_uid, job_uid) – Get a list of files available in the database for the given job.
list_files(project_uid[, prefix, recursive]) – Get a list of files inside the project directory.
mkdir(project_uid, target_path[, parents, ...]) – Create a directory in the given project.
print_job_types([section, show_legacy]) – Print a table of job types and their titles, organized by category.
save_external_result(project_uid, ...[, ...]) – Save the given result dataset to the project.
symlink(project_uid, source_path[, target_path]) – Create a symbolic link in the given project.
test_connection() – Verify connection to CryoSPARC command services.
upload(project_uid, target_path, source, *) – Upload the given source file to the project directory at the given relative path.
upload_dataset(project_uid, target_path, dset, *) – Upload a dataset as a CS file into the project directory.
upload_mrc(project_uid, target_path, data, ...) – Upload a numpy 2D or 3D array to the project directory as an MRC file.
- cp(project_uid: str, source_path: Union[str, PurePosixPath], target_path: Union[str, PurePosixPath] = '')#
Copy a file or folder within a project to another location within that same project.
- Parameters
project_uid (str) – Target project UID, e.g., “P3”.
source_path (str | Path) – Relative or absolute path of source file or folder to copy. If relative, assumed to be within the project directory.
target_path (str | Path, optional) – Name or path in the project directory to copy into. If not specified, uses the same file name as the source. Defaults to “”.
- create_external_job(project_uid: str, workspace_uid: str, title: Optional[str] = None, desc: Optional[str] = None) ExternalJob #
Add a new External job to this project to save generated outputs to.
- Parameters
project_uid (str) – Project UID to create in, e.g., “P3”
workspace_uid (str) – Workspace UID to create job in, e.g., “W1”
title (str, optional) – Title for external job (recommended). Defaults to None.
desc (str, optional) – Markdown description for external job. Defaults to None.
- Returns
created external job instance
- Return type
- create_job(project_uid: str, workspace_uid: str, type: str, connections: Dict[str, Union[Tuple[str, str], List[Tuple[str, str]]]] = {}, params: Dict[str, Any] = {}, title: Optional[str] = None, desc: Optional[str] = None) Job #
Create a new job with the given type. Use CryoSPARC.get_job_sections to query available job types on the connected CryoSPARC instance.
- Parameters
project_uid (str) – Project UID to create job in, e.g., “P3”
workspace_uid (str) – Workspace UID to create job in, e.g., “W1”
type (str) – Job type identifier, e.g., “homo_abinit”
connections (dict[str, tuple[str, str] | list[tuple[str, str]]]) – Initial input connections. Each key is an input name and each value is a (job uid, output name) tuple. Defaults to {}
params (dict[str, any], optional) – Specify parameter values. Defaults to {}.
title (str, optional) – Job title. Defaults to None.
desc (str, optional) – Job markdown description. Defaults to None.
- Returns
created job instance. Raises error if job cannot be created.
- Return type
Examples
Create an Import Movies job.
>>> from cryosparc.tools import CryoSPARC
>>> cs = CryoSPARC()
>>> import_job = cs.create_job("P3", "W1", "import_movies")
>>> import_job.set_param("blob_paths", "/bulk/data/t20s/*.tif")
True
Create a 3-class ab-initio job connected to existing particles.
>>> abinit_job = cs.create_job("P3", "W1", "homo_abinit",
...     connections={"particles": ("J20", "particles_selected")},
...     params={"abinit_K": 3}
... )
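The connections argument accepts either a single (job uid, output name) tuple or a list of such tuples per input. A minimal sketch of bringing both shapes into one uniform form before passing them on; the helper name `normalize_connections` is hypothetical and not part of cryosparc.tools.

```python
from typing import Dict, List, Tuple, Union

Connection = Tuple[str, str]


def normalize_connections(
    connections: Dict[str, Union[Connection, List[Connection]]],
) -> Dict[str, List[Connection]]:
    """Return a copy where every input maps to a list of (job uid, output name) tuples."""
    normalized = {}
    for input_name, value in connections.items():
        # A single connection is a 2-tuple of strings; anything else is treated as a list.
        if isinstance(value, tuple) and len(value) == 2 and all(isinstance(v, str) for v in value):
            normalized[input_name] = [value]
        else:
            normalized[input_name] = [tuple(v) for v in value]
    return normalized


single = normalize_connections({"particles": ("J20", "particles_selected")})
multiple = normalize_connections(
    {"particles": [("J20", "particles_selected"), ("J21", "particles_selected")]}
)
```

Either result is still a valid value for the connections parameter as described above.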
- create_workspace(project_uid: str, title: str, desc: Optional[str] = None) Workspace #
Create a new empty workspace in the given project.
- Parameters
project_uid (str) – Project UID to create in, e.g., “P3”.
title (str) – Title of new workspace.
desc (str, optional) – Markdown text description. Defaults to None.
- Returns
created workspace instance
- Return type
- download(project_uid: str, path: Union[str, PurePosixPath])#
Open a file in the given project for reading. Use to get files from a remote CryoSPARC instance whose project directories are not available on the client file system.
- Parameters
project_uid (str) – Short unique ID of CryoSPARC project, e.g., “P3”
path (str | Path) – Name or path of file in project directory.
- Yields
HTTPResponse – Use a context manager to read the file from the request body.
Examples
Download a job’s metadata
>>> import json
>>> cs = CryoSPARC()
>>> with cs.download('P3', 'J42/job.json') as res:
...     job_data = json.loads(res.read())
- download_asset(fileid: str, target: Union[str, PurePath, IO[bytes]])#
Download a file from CryoSPARC’s MongoDB GridFS storage.
- Parameters
fileid (str) – GridFS file object ID
target (str | Path | IO) – Local file path, directory path or writeable file handle to write response data.
- Returns
resulting target path or file handle.
- Return type
Path | IO
- download_dataset(project_uid: str, path: Union[str, PurePosixPath])#
Download a .cs dataset file from the given relative path in the project directory.
- Parameters
project_uid (str) – Project unique ID, e.g., “P3”.
path (str | Path) – Name or path to .cs file in project directory.
- Returns
Loaded dataset instance
- Return type
- download_file(project_uid: str, path: Union[str, PurePosixPath], target: Union[str, PurePath, IO[bytes]] = '')#
Download a file from the directory of the specified project to the given target path or writeable file handle.
- Parameters
project_uid (str) – Project unique ID, e.g., “P3”.
path (str | Path) – Name or path of file in project directory.
target (str | Path | IO, optional) – Local file path, directory path or writeable file handle to write response data. If not specified, downloads to current working directory with same file name. Defaults to “”.
- Returns
resulting target path or file handle.
- Return type
Path | IO
- download_mrc(project_uid: str, path: Union[str, PurePosixPath])#
Download a .mrc file from the given relative path in the project directory.
- Parameters
project_uid (str) – Project unique ID, e.g., “P3”
path (str | Path) – Name or path to .mrc file in project directory.
- Returns
MRC file header and data as a numpy array
- Return type
tuple[Header, NDArray]
- find_external_job(project_uid: str, job_uid: str) ExternalJob #
Get the External job accessor instance for an External job in this project with the given UID. Fails if the job does not exist or is not an external job.
- Parameters
project_uid (str) – Project unique ID, e.g., “P3”
job_uid (str) – Job unique ID, e.g., “J42”
- Raises
TypeError – If job is not an external job
- Returns
accessor instance
- Return type
- find_job(project_uid: str, job_uid: str) Job #
Get a job by its unique project and job ID.
- Parameters
project_uid (str) – Project unique ID, e.g., “P3”
job_uid (str) – job unique ID, e.g., “J42”
- Returns
job instance
- Return type
- find_project(project_uid: str) Project #
Get a project by its unique ID.
- Parameters
project_uid (str) – Project unique ID, e.g., “P3”
- Returns
project instance
- Return type
- find_workspace(project_uid: str, workspace_uid: str) Workspace #
Get a workspace accessor instance for the workspace in the given project with the given UID. Fails with an error if workspace does not exist.
- Parameters
project_uid (str) – Project unique ID, e.g., “P3”
workspace_uid (str) – Workspace unique ID, e.g., “W1”
- Returns
accessor instance
- Return type
- get_job_sections() List[JobSection] #
Get a summary of job types available for this instance, organized by category.
- Returns
List of job section dictionaries. Job types are listed in the "contains" key in each dictionary.
- Return type
list[JobSection]
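Because job types sit under the "contains" key of each section dictionary, collecting all of them is a short loop. The sketch below works on a stand-in list rather than a live session; the "name" key and the section names are assumptions made for illustration, and only "contains" is taken from the description above.

```python
def job_types_in(sections):
    """Collect job type identifiers from a list of section dictionaries."""
    job_types = []
    for section in sections:
        job_types.extend(section["contains"])
    return job_types


# Stand-in for the return value of cs.get_job_sections().
# The "name" key and its values are hypothetical.
example_sections = [
    {"name": "imports", "contains": ["import_movies"]},
    {"name": "reconstruction", "contains": ["homo_abinit"]},
]
available = job_types_in(example_sections)
```

With a connected session, `job_types_in(cs.get_job_sections())` would give the identifiers accepted by the type argument of create_job.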
- get_job_specs() List[JobSpecSection] #
Get a detailed summary of jobs and their specifications available on this instance, organized by category.
- Returns
List of job section dictionaries. Job specs are listed in the "contains" key in each dictionary.
- Return type
list[JobSpecSection]
- get_lanes() List[SchedulerLane] #
Get a list of available scheduler lanes.
- Returns
Details about available lanes.
- Return type
list[SchedulerLane]
- get_targets(lane: Optional[str] = None) List[Union[SchedulerTargetNode, SchedulerTargetGpuNode, SchedulerTargetCluster]] #
Get a list of available scheduler targets.
- Parameters
lane (str, optional) – Only get targets from this specific lane. Returns all targets if not specified. Defaults to None.
- Returns
Details about available targets.
- Return type
list[SchedulerTarget]
- list_assets(project_uid: str, job_uid: str) List[AssetDetails] #
Get a list of files available in the database for the given job. Returns a list with details about the assets. Each entry is a dict with an _id key which may be used to download the file with the download_asset method.
- Parameters
project_uid (str) – Project unique ID, e.g., “P3”
job_uid (str) – job unique ID, e.g., “J42”
- Returns
Asset details
- Return type
list[AssetDetails]
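The list_assets and download_asset methods combine naturally. A minimal sketch, assuming `cs` is a connected session object that provides both methods as documented here; the helper name `download_job_assets` is hypothetical, and only the `_id` key of each entry is relied upon.

```python
from pathlib import Path


def download_job_assets(cs, project_uid, job_uid, target_dir):
    """Download every database asset of a job into target_dir (sketch).

    Only the "_id" key of each asset entry is used, since that is the
    key described for list_assets.
    """
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    downloaded = []
    for asset in cs.list_assets(project_uid, job_uid):
        file_id = asset["_id"]
        # Name each local file after its GridFS ID to avoid relying on other keys.
        target = target_dir / str(file_id)
        cs.download_asset(file_id, target)
        downloaded.append(target)
    return downloaded
```

A call such as `download_job_assets(cs, "P3", "J42", "assets")` would then place one file per asset in a local assets folder.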
- list_files(project_uid: str, prefix: Union[str, PurePosixPath] = '', recursive: bool = False) List[str] #
Get a list of files inside the project directory.
- Parameters
project_uid (str) – Project unique ID, e.g., “P3”.
prefix (str | Path, optional) – Subdirectory inside project to list. Defaults to “”.
recursive (bool, optional) – If True, lists files recursively. Defaults to False.
- Returns
List of file paths relative to the project directory.
- Return type
list[str]
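Since list_files returns plain relative path strings, filtering them needs nothing beyond string methods. A small sketch, assuming `cs` is a connected session; the helper name `find_dataset_files` is hypothetical.

```python
def find_dataset_files(cs, project_uid, prefix=""):
    """List .cs dataset files under a project subdirectory (sketch)."""
    paths = cs.list_files(project_uid, prefix=prefix, recursive=True)
    # Keep only dataset files and return them in a stable order.
    return sorted(p for p in paths if p.endswith(".cs"))
```

Each returned path could then be passed to download_dataset together with the same project UID.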
- mkdir(project_uid: str, target_path: Union[str, PurePosixPath], parents: bool = False, exist_ok: bool = False)#
Create a directory in the given project.
- Parameters
project_uid (str) – Target project UID, e.g., “P3”.
target_path (str | Path) – Name or path of folder to create inside the project directory.
parents (bool, optional) – If True, any missing parents are created as needed. Defaults to False.
exist_ok (bool, optional) – If True, does not raise an error for existing directories. Still raises if the target path is not a directory. Defaults to False.
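The parents and exist_ok flags read like those of `pathlib.Path.mkdir`. Assuming they behave the same way (an assumption, not something stated here), the local equivalent below shows the three cases described in the parameter list. It runs against a temporary local directory, not a CryoSPARC project.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / "exports" / "run1"

    # parents=False (the default) fails when "exports" does not exist yet.
    try:
        nested.mkdir()
        missing_parent_failed = False
    except FileNotFoundError:
        missing_parent_failed = True

    # parents=True creates the intermediate directory as well.
    nested.mkdir(parents=True)

    # exist_ok=True tolerates an existing directory...
    nested.mkdir(exist_ok=True)

    # ...but still raises when the path exists and is not a directory.
    file_path = nested / "notes.txt"
    file_path.write_text("hello")
    try:
        file_path.mkdir(exist_ok=True)
        file_target_failed = False
    except FileExistsError:
        file_target_failed = True
```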
- print_job_types(section: Optional[Union[str, Container[str]]] = None, *, show_legacy: bool = False)#
Print a table of job types and their titles, organized by category.
- Parameters
section (str | list[str], optional) – Only show jobs from the given section or list of sections. Defaults to None.
show_legacy (bool, optional) – If True, also show legacy jobs. Defaults to False.
- save_external_result(project_uid: str, workspace_uid: Optional[str], dataset: Dataset[R], type: Literal['exposure', 'particle', 'template', 'volume', 'mask', 'live', 'ml_model', 'symmetry_candidate', 'flex_mesh', 'flex_model', 'hyperparameter'], name: Optional[str] = None, slots: Optional[List[Union[str, Datafield]]] = None, passthrough: Optional[Tuple[str, str]] = None, title: Optional[str] = None, desc: Optional[str] = None) str #
Save the given result dataset to the project. Specify at least the dataset to save and the type of data.
Returns UID of the External job where the results were saved.
Examples
Save all particle data
>>> particles = Dataset()
>>> cs.save_external_result("P1", "W1", particles, 'particle')
"J43"
Save new particle locations that inherit passthrough slots from a parent job.
>>> particles = Dataset()
>>> cs.save_external_result(
...     project_uid="P1",
...     workspace_uid="W1",
...     dataset=particles,
...     type='particle',
...     name='particles',
...     slots=['location'],
...     passthrough=('J42', 'selected_particles'),
...     title='Re-centered particles'
... )
"J44"
Save a result with multiple slots of the same type.
>>> cs.save_external_result(
...     project_uid="P1",
...     workspace_uid="W1",
...     dataset=particles,
...     type="particle",
...     name="particle_alignments",
...     slots=[
...         {"dtype": "alignments3D", "prefix": "alignments_class_0", "required": True},
...         {"dtype": "alignments3D", "prefix": "alignments_class_1", "required": True},
...         {"dtype": "alignments3D", "prefix": "alignments_class_2", "required": True},
...     ]
... )
"J45"
- Parameters
project_uid (str) – Project UID to save results into.
workspace_uid (str | None) – Workspace UID to save results into. Specify None to auto-select a workspace.
dataset (Dataset) – Result dataset.
type (Datatype) – Type of output dataset.
name (str, optional) – Name of output on created External job. Same as type if unspecified. Defaults to None.
slots (list[SlotSpec], optional) – List of slots expected to be created for this output such as location or blob. Do not specify any slots that were passed through from an input unless those slots are modified in the output. Defaults to None.
passthrough (tuple[str, str], optional) – Indicates that this output inherits slots from the specified output, e.g., ("J1", "particles"). Defaults to None.
title (str, optional) – Human-readable title for this output. Defaults to None.
desc (str, optional) – Markdown description for this output. Defaults to None.
- Raises
CommandError – General CryoSPARC network access error such as a timeout, URL or HTTP error
InvalidSlotsError – slots argument is invalid
- Returns
UID of created job where this output was saved
- Return type
str
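When an output carries several slots of the same type, the slot dictionaries follow a regular pattern and can be generated instead of typed out. A minimal sketch that reproduces the slot list from the multi-slot example above; the helper name `alignment_slots` is hypothetical.

```python
def alignment_slots(num_classes):
    """Build slot specs for several alignments3D slots, one per class."""
    return [
        {"dtype": "alignments3D", "prefix": f"alignments_class_{i}", "required": True}
        for i in range(num_classes)
    ]


slots = alignment_slots(3)
```

The resulting list has the same shape as the value passed to the slots argument in that example.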
- symlink(project_uid: str, source_path: Union[str, PurePosixPath], target_path: Union[str, PurePosixPath] = '')#
Create a symbolic link in the given project. May only create links for files within the project.
- Parameters
project_uid (str) – Target project UID, e.g., “P3”.
source_path (str | Path) – Relative or absolute path of source file or folder to create a link to. If relative, assumed to be within the project directory.
target_path (str | Path) – Name or path of new symlink in the project directory. If not specified, creates link with the same file name as the source. Defaults to “”.
- test_connection()#
Verify connection to CryoSPARC command services.
- Returns
True if connection succeeded, False otherwise
- Return type
bool
- upload(project_uid: str, target_path: Union[str, PurePosixPath], source: Union[str, bytes, PurePath, IO], *, overwrite: bool = False)#
Upload the given source file to the project directory at the given relative path. Fails if target already exists.
- Parameters
project_uid (str) – Project unique ID, e.g., “P3”
target_path (str | Path) – Name or path of file to write in project directory.
source (str | bytes | Path | IO) – Local path or file handle to upload. May also be specified as raw bytes.
overwrite (bool, optional) – If True, overwrite existing files. Defaults to False.
- upload_dataset(project_uid: str, target_path: Union[str, PurePosixPath], dset: Dataset, *, format: int = 1, overwrite: bool = False)#
Upload a dataset as a CS file into the project directory. Fails if target already exists.
- Parameters
project_uid (str) – Project unique ID, e.g., “P3”
target_path (str | Path) – Name or path of dataset to save in the project directory. Should have a .cs extension.
dset (Dataset) – Dataset to save.
format (int, optional) – Format to save in from cryosparc.dataset.*_FORMAT. Defaults to NUMPY_FORMAT.
overwrite (bool, optional) – If True, overwrite existing files. Defaults to False.
- upload_mrc(project_uid: str, target_path: Union[str, PurePosixPath], data: NDArray, psize: float, *, overwrite: bool = False)#
Upload a numpy 2D or 3D array to the project directory as an MRC file. Fails if target already exists.
- Parameters
project_uid (str) – Project unique ID, e.g., “P3”
target_path (str | Path) – Name or path of MRC file to save in the project directory. Should have a .mrc extension.
data (NDArray) – Numpy array with MRC file data.
psize (float) – Pixel size to include in MRC header.
overwrite (bool, optional) – If True, overwrite existing files. Defaults to False.
- cryosparc.tools.LICENSE_REGEX = re.compile('[a-f\\d]{8}-[a-f\\d]{4}-[a-f\\d]{4}-[a-f\\d]{4}-[a-f\\d]{12}')#
Regular expression for matching CryoSPARC license IDs.
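The pattern can be tried out with the standard `re` module. The sketch below recompiles the same expression locally so that it runs without the package installed. Whether the library applies it with `match`, `search` or `fullmatch` is not stated here, so the use of `fullmatch` and the helper name `looks_like_license` are assumptions.

```python
import re

# Same pattern as shown for cryosparc.tools.LICENSE_REGEX.
LICENSE_REGEX = re.compile(r"[a-f\d]{8}-[a-f\d]{4}-[a-f\d]{4}-[a-f\d]{4}-[a-f\d]{12}")


def looks_like_license(value):
    """Return True if the whole string has the shape of a license ID."""
    return LICENSE_REGEX.fullmatch(value) is not None


valid = looks_like_license("0123abcd-4567-89ef-0123-456789abcdef")
placeholder = looks_like_license("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")
uppercase = looks_like_license("0123ABCD-4567-89EF-0123-456789ABCDEF")
```

Only lowercase hexadecimal digits are accepted, so the all-x placeholder used in the examples and an uppercase ID both fail to match.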
- cryosparc.tools.SUPPORTED_EXPOSURE_FORMATS = {'CMRCBZ2', 'EER', 'MRC', 'MRCBZ2', 'MRCS', 'TIFF'}#
Supported micrograph file formats.
- cryosparc.tools.VERSION_REGEX = re.compile('^v\\d+\\.\\d+\\.\\d+')#
Regular expression for CryoSPARC minor version, e.g., ‘v4.1.0’
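Because the pattern is anchored only at the start, it also matches version strings that carry a suffix. A minimal sketch of extracting the numeric parts; the pattern is recompiled locally, and the helper name `parse_version` and the suffixed sample string are hypothetical.

```python
import re

# Same pattern as shown for cryosparc.tools.VERSION_REGEX.
VERSION_REGEX = re.compile(r"^v\d+\.\d+\.\d+")


def parse_version(version):
    """Return (major, minor, patch) from a string such as 'v4.1.0'."""
    match = VERSION_REGEX.match(version)
    if match is None:
        raise ValueError(f"Not a recognized version string: {version!r}")
    # Drop the leading "v" and split the three numeric components.
    major, minor, patch = (int(part) for part in match.group(0)[1:].split("."))
    return major, minor, patch


plain = parse_version("v4.1.0")
suffixed = parse_version("v4.1.0-example")
```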
- cryosparc.tools.downsample(arr: NDArray, factor: int = 2)#
Downsample a micrograph or movie by the given factor.
- Parameters
arr (NDArray) – 2D or 3D numpy array.
factor (int, optional) – How much to reduce size by, e.g., a factor of 2 would reduce a 1024px MRC to 512px, and a factor of 4 would reduce it to 256px. Defaults to 2.
- Returns
Downsampled array
- Return type
NDArray
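To make the size arithmetic concrete, here is one common way to downsample by an integer factor: averaging non-overlapping factor × factor blocks. This is a sketch under stated assumptions and not necessarily the method used by `downsample`. The function name `bin_average` is hypothetical, and cropping trailing rows or columns that do not fill a block is a choice made for this example.

```python
import numpy as np


def bin_average(arr, factor=2):
    """Downsample the last two axes of a 2D or 3D array by block averaging."""
    arr = np.asarray(arr)
    ny, nx = arr.shape[-2:]
    out_ny, out_nx = ny // factor, nx // factor
    # Crop so both axes divide evenly by the factor.
    cropped = arr[..., : out_ny * factor, : out_nx * factor]
    # Split each axis into (blocks, factor) and average over the factor axes.
    blocks = cropped.reshape(arr.shape[:-2] + (out_ny, factor, out_nx, factor))
    return blocks.mean(axis=(-3, -1))


small = bin_average(np.arange(16.0).reshape(4, 4), factor=2)
micrograph = bin_average(np.zeros((1024, 1024), dtype=np.float32), factor=4)
movie = bin_average(np.zeros((3, 64, 64), dtype=np.float32), factor=2)
```

With this scheme a 1024px image becomes 512px at factor 2 and 256px at factor 4, and the frame axis of a 3D movie is left untouched.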
- cryosparc.tools.get_exposure_format(data_format: str, voxel_type: Optional[str] = None) str #
Get the movie_blob/format or micrograph_blob format value for an exposure type, where data_format is one of
“MRC”
“MRCS”
“TIFF”
“CMRCBZ2”
“MRCBZ2”
“EER”
And voxel_type (if specified) is one of
“16 BIT FLOAT”
“32 BIT FLOAT”
“SIGNED 16 BIT INTEGER”
“UNSIGNED 8 BIT INTEGER”
“UNSIGNED 16 BIT INTEGER”
- Parameters
data_format (str) – One of SUPPORTED_EXPOSURE_FORMATS such as "TIFF" or "MRC". The value of the <dataFormat> tag in an EPU XML file.
voxel_type (str, optional) – The value of the <voxelType> tag in an EPU file such as "32 BIT FLOAT". Required when data_format is MRC or MRCS. Defaults to None.
- Returns
The format string to save into the {prefix}/format field of a CryoSPARC exposure dataset, e.g., "TIFF" or "MRC/2".
- Return type
str
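The rule described above can be sketched as a lookup: non-MRC formats pass through unchanged, while MRC and MRCS get a numeric suffix derived from the voxel type. The mapping table below is an assumption based on MRC mode numbers and the "MRC/2" example; the library's actual table may differ, and the function name `sketch_exposure_format` is hypothetical.

```python
SUPPORTED_EXPOSURE_FORMATS = {"CMRCBZ2", "EER", "MRC", "MRCBZ2", "MRCS", "TIFF"}

# Assumed mapping from EPU voxel type to MRC mode number.
# The entry for unsigned 8-bit data in particular is a guess.
VOXEL_TYPE_TO_MRC_MODE = {
    "16 BIT FLOAT": 12,
    "32 BIT FLOAT": 2,
    "SIGNED 16 BIT INTEGER": 1,
    "UNSIGNED 8 BIT INTEGER": 0,
    "UNSIGNED 16 BIT INTEGER": 6,
}


def sketch_exposure_format(data_format, voxel_type=None):
    """Return a format string such as "TIFF" or "MRC/2" (illustrative only)."""
    if data_format not in SUPPORTED_EXPOSURE_FORMATS:
        raise ValueError(f"Unsupported data format: {data_format!r}")
    if data_format not in {"MRC", "MRCS"}:
        # Non-MRC formats need no voxel type.
        return data_format
    if voxel_type not in VOXEL_TYPE_TO_MRC_MODE:
        raise ValueError(f"voxel_type is required for {data_format}, got {voxel_type!r}")
    return f"{data_format}/{VOXEL_TYPE_TO_MRC_MODE[voxel_type]}"


tiff_format = sketch_exposure_format("TIFF")
mrc_format = sketch_exposure_format("MRC", "32 BIT FLOAT")
```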
- cryosparc.tools.get_import_signatures(abs_paths: Union[str, Iterable[str], NDArray])#
Get list of import signatures for the given path or paths.
- Parameters
abs_paths (str | Iterable[str] | NDArray) – Absolute path or list of file paths.
- Returns
Import signatures as 64-bit numpy integers
- Return type
list[int]
- cryosparc.tools.lowpass2(arr: NDArray, psize_A: float, cutoff_resolution_A: float = 0.0, order: float = 1.0)#
Apply Butterworth lowpass filter to the 2D array data with the given pixel size (psize_A). cutoff_resolution_A should be a non-negative number specified in Angstroms.
- Parameters
arr (NDArray) – 2D numpy array to apply lowpass to.
psize_A (float) – Pixel size of array data.
cutoff_resolution_A (float, optional) – Cutoff resolution, in Angstroms. Defaults to 0.0.
order (float, optional) – Filter order. Defaults to 1.0.
- Returns
Lowpass-filtered copy of given numpy array
- Return type
NDArray
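For readers unfamiliar with the technique, a Butterworth lowpass can be written with NumPy's FFT routines. The sketch below uses the gain 1 / (1 + (f / f_c)^(2·order)) with f_c = 1 / cutoff_resolution_A, which is one common textbook form. It is not taken from the `lowpass2` source, the function name `butterworth_lowpass_2d` is hypothetical, and returning an unfiltered copy for a cutoff of 0 is an assumption.

```python
import numpy as np


def butterworth_lowpass_2d(arr, psize_A, cutoff_resolution_A, order=1.0):
    """Lowpass-filter a 2D array in Fourier space (illustrative sketch)."""
    arr = np.asarray(arr, dtype=float)
    if cutoff_resolution_A <= 0:
        # Assumption: a zero cutoff means "no filtering".
        return arr.copy()
    ny, nx = arr.shape
    # Spatial frequencies in 1/Angstrom along each axis.
    fy = np.fft.fftfreq(ny, d=psize_A)[:, None]
    fx = np.fft.rfftfreq(nx, d=psize_A)[None, :]
    freq = np.sqrt(fy**2 + fx**2)
    cutoff_freq = 1.0 / cutoff_resolution_A
    gain = 1.0 / (1.0 + (freq / cutoff_freq) ** (2.0 * order))
    return np.fft.irfft2(np.fft.rfft2(arr) * gain, s=arr.shape)


flat = np.full((8, 8), 3.0)
checkerboard = np.indices((8, 8)).sum(axis=0) % 2 * 2.0 - 1.0
flat_out = butterworth_lowpass_2d(flat, psize_A=1.0, cutoff_resolution_A=10.0)
checker_out = butterworth_lowpass_2d(checkerboard, psize_A=1.0, cutoff_resolution_A=10.0)
```

A constant image passes through unchanged because the gain at zero frequency is 1, while a pixel-scale checkerboard, which lives at the highest representable frequency, is strongly attenuated.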