CDatasetIO

class ikomia.dataprocess.pydataprocess.CDatasetIO

Virtual base class to define a task input or output containing a deep learning dataset structure. Derived from CWorkflowTaskIO. Instances can be added as input or output of a CWorkflowTask or derived object. Such an input or output is required for deep learning training tasks. A custom dataset loader must inherit from this class and implement the required virtual methods.

The dataset structure consists of one dict per image plus metadata common to the whole dataset, with the following specifications (mandatory fields may vary depending on the training goal).

  • images (list[dict]): image information and corresponding annotations:
    • filename (str): full path of the image file.

    • height, width (int): size of the image.

    • image_id (int): unique image identifier.

    • annotations (list[dict]): each dict corresponds to annotations of one instance in this image.
      • bbox (list[float]): x, y, width, height of the bounding box.

      • category_id (int): integer representing the category label.

      • segmentation_poly (list[list[float]]): list of polygons, one for each connected component.

      • keypoints (list[float]).

      • iscrowd (boolean): whether the instance is labelled as a crowd region (COCO).

    • segmentation_masks (numpy array [N, H, W]).

    • instance_seg_masks_file: full path of the ground truth instance segmentation image file.

    • semantic_seg_masks_file: full path of the ground truth semantic segmentation image file.

  • metadata (dict): key-value mapping that contains information that’s shared among the entire dataset:
    • category_names (dict{id (int): name (str)}).

    • category_colors (list[tuple(r,g,b)]).

    • keypoint_names (list[str]).

    • keypoint_connection_rules (list[tuple(str, str, (r,g,b))]): each tuple specifies a pair of keypoints that are connected and the color to use for the line between them.
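
As a rough illustration of the structure above, the following sketch builds a one-image dataset as plain Python data and counts annotated instances per category. All paths, sizes and category names are made up for the example, the helper count_instances_per_category is hypothetical, and the flat [x0, y0, x1, y1, ...] polygon layout is an assumption borrowed from COCO rather than something stated on this page.

```python
from collections import Counter

# Hypothetical values, chosen only to illustrate the layout described above.
dataset = {
    "images": [
        {
            "filename": "/data/demo/images/0001.jpg",
            "height": 480,
            "width": 640,
            "image_id": 1,
            "annotations": [
                {
                    "bbox": [120.0, 80.0, 200.0, 150.0],  # x, y, width, height
                    "category_id": 0,
                    # One polygon per connected component (assumed flat x, y pairs).
                    "segmentation_poly": [
                        [120.0, 80.0, 320.0, 80.0, 320.0, 230.0, 120.0, 230.0]
                    ],
                    "iscrowd": False,
                }
            ],
        }
    ],
    "metadata": {
        "category_names": {0: "cat", 1: "dog"},
        "category_colors": [(255, 0, 0), (0, 255, 0)],
    },
}


def count_instances_per_category(data):
    """Count annotated instances, keyed by category name."""
    names = data["metadata"]["category_names"]
    counts = Counter(
        names[ann["category_id"]]
        for img in data["images"]
        for ann in img["annotations"]
    )
    return dict(counts)


print(count_instances_per_category(dataset))
```

Which fields are actually required depends on the training goal, so a detection task might only need bbox and category_id while a segmentation task would also rely on the mask-related fields.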

Import

from ikomia.dataprocess import CDatasetIO

Note

A default implementation is provided in the pure Python API. See datasetio for details.

Methods

__init__(arg1)

__init__( (object)arg1) -> None :

get_categories(arg1)

Virtual method to reimplement, return the categories of the dataset.

get_category_count(arg1)

Virtual method to reimplement, return the number of instance categories (i.e. classes) in the dataset.

get_graphics_annotations(self, image_path)

Virtual method to reimplement, return the list of graphics items corresponding to dataset annotations for the given image.

get_image_paths(arg1)

Virtual method to reimplement, return the file path list of all images contained in the dataset.

get_mask_path(self, image_path)

Virtual method to reimplement, return the path of the segmentation mask associated with the given image contained in the dataset.

get_source_format(arg1)

Get the source format of the dataset.

Overridden methods

clear_data(self)

See CWorkflowTaskIO.clear_data().

from_json(self, json_str)

Set input/output data from JSON formatted string.

is_data_available(self)

See CWorkflowTaskIO.is_data_available().

save(arg1, arg2)

Virtual method to reimplement, save dataset structure as JSON.

to_json(self)

Return input/output data in JSON formatted string (compact mode).

Inherited methods

copy_static_data(self, io)

Copy the static data from the given input or output.

get_unit_element_count(self)

Get the number of unit elements in terms of processing scheme.

Attributes

auto_save

Auto-save status

data_type

I/O data type

dim_count

Number of dimensions

description

Custom description to explain input/output type and use

displayable

Displayable status (Ikomia Studio)

name

I/O name

source_file_path

Path to the source file used as workflow input (if any)

Details

__init__((object)arg1) -> None
__init__( (object)arg1) -> None :

Default constructor

__init__( (object)self, (str)name) -> None :

Construct a CDatasetIO object specifying the input or output name.

Args:

name (str): custom name

__init__( (object)self, (str)name, (str)source_format) -> None :

Construct a CDatasetIO object specifying the name and the source format.

Args:

name (str): custom name

source_format (str): unique string identifier
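
A custom loader could be sketched along these lines, using the two-argument constructor and a few of the virtual methods listed on this page. This is only a minimal sketch: the class name MyDatasetIO, its data attribute, the default name "my_dataset" and the "other" format string are assumptions made for the example, and the stand-in base class exists solely so the snippet can run where Ikomia is not installed.

```python
try:
    from ikomia.dataprocess import CDatasetIO
except ImportError:
    # Stand-in used only when Ikomia is not installed, so the sketch still runs.
    class CDatasetIO:
        def __init__(self, name="", source_format=""):
            self.name = name
            self._source_format = source_format

        def get_source_format(self):
            return self._source_format


class MyDatasetIO(CDatasetIO):
    """Hypothetical loader keeping the dataset structure in a plain dict."""

    def __init__(self, name="my_dataset", source_format="other"):
        # Explicit base-class call with the name and the source format.
        CDatasetIO.__init__(self, name, source_format)
        self.data = self._empty_data()

    @staticmethod
    def _empty_data():
        return {"images": [], "metadata": {"category_names": {}}}

    def get_image_paths(self):
        return [img["filename"] for img in self.data["images"]]

    def get_categories(self):
        return self.data["metadata"]["category_names"]

    def get_category_count(self):
        return len(self.data["metadata"]["category_names"])

    def is_data_available(self):
        return len(self.data["images"]) > 0

    def clear_data(self):
        self.data = self._empty_data()


dataset_io = MyDatasetIO()
dataset_io.data["images"].append({"filename": "/data/demo/images/0001.jpg"})
dataset_io.data["metadata"]["category_names"] = {0: "cat"}
print(dataset_io.get_image_paths(), dataset_io.get_category_count())
```

The remaining virtual methods (get_mask_path, get_graphics_annotations, save, to_json, from_json) would be added in the same way when the training task needs them.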

clear_data((CDatasetIO)self) -> None

See CWorkflowTaskIO.clear_data().

clear_data( (CDatasetIO)self) -> None

copy_static_data((CWorkflowTaskIO)self, (CWorkflowTaskIO)io) -> None

Copy the static data from the given input or output. Static data are those which are not generated at runtime. Should be overridden for custom input or output.

Args:

io (CWorkflowTaskIO): input or output instance from which data is copied

copy_static_data( (CWorkflowTaskIO)self, (CWorkflowTaskIO)io) -> None

from_json((CDatasetIO)self, (str)json_str) -> None

Set input/output data from JSON formatted string. Must be reimplemented.

Args:

json_str (str): data as JSON formatted string

from_json( (CDatasetIO)self, (str)json_str) -> None

get_categories((CDatasetIO)arg1) -> object

Virtual method to reimplement, return the categories of the dataset.

Returns:

MapIntStr: categories as a mapping from category id to name

get_categories( (CDatasetIO)arg1) -> object

get_category_count((CDatasetIO)arg1) -> int

Virtual method to reimplement, return the number of instance categories (i.e. classes) in the dataset.

Returns:

int: number of categories

get_category_count( (CDatasetIO)arg1) -> int

get_graphics_annotations((CDatasetIO)self, (str)image_path) -> object

Virtual method to reimplement, return the list of graphics items corresponding to dataset annotations for the given image.

Args:

image_path (str): path of the associated image in the dataset

Returns:

list of CGraphicsItem or derived: graphics items

get_graphics_annotations( (CDatasetIO)self, (str)image_path) -> object

get_image_paths((CDatasetIO)arg1) -> object

Virtual method to reimplement, return the file path list of all images contained in the dataset.

Returns:

list of str: path list

get_image_paths( (CDatasetIO)arg1) -> object

get_mask_path((CDatasetIO)self, (str)image_path) -> str

Virtual method to reimplement, return the path of the segmentation mask associated with the given image contained in the dataset.

Args:

image_path (str): path of the associated image in the dataset

Returns:

str: mask path or empty string if image does not exist

get_mask_path( (CDatasetIO)self, (str)image_path) -> str
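
The lookup described for get_mask_path can be sketched as a plain function over the dataset structure. The function name find_mask_path is hypothetical, and preferring semantic_seg_masks_file over instance_seg_masks_file when both are present is an assumption of this sketch, not documented behaviour.

```python
def find_mask_path(data, image_path):
    """Return the mask path stored for image_path, or "" if nothing matches."""
    for img in data.get("images", []):
        if img.get("filename") == image_path:
            # Assumed priority: semantic mask first, then instance mask.
            return (
                img.get("semantic_seg_masks_file")
                or img.get("instance_seg_masks_file")
                or ""
            )
    # Image not found in the dataset: empty string, as documented above.
    return ""


# Hypothetical paths, for illustration only.
data = {
    "images": [
        {
            "filename": "/data/demo/images/0001.jpg",
            "semantic_seg_masks_file": "/data/demo/masks/0001.png",
        },
        {"filename": "/data/demo/images/0002.jpg"},
    ]
}

print(find_mask_path(data, "/data/demo/images/0001.jpg"))
print(repr(find_mask_path(data, "/data/demo/images/9999.jpg")))
```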

get_unit_element_count((CWorkflowTaskIO)self) -> int

Get the number of unit elements in terms of processing scheme. This value is used to define the number of progress steps for the progress bar component in Ikomia Studio. For an image, the count is 1. For a Z-stack volume, the count is the number of Z levels. Should be overridden for custom input or output.

Returns:

int: number of unit elements to process

get_unit_element_count( (CWorkflowTaskIO)self) -> int

is_data_available((CDatasetIO)self) -> bool

See CWorkflowTaskIO.is_data_available().

is_data_available( (CDatasetIO)self) -> bool

save((CDatasetIO)arg1, (str)arg2) -> None

Virtual method to reimplement, save dataset structure as JSON.

Args:

str: file path

save( (CDatasetIO)arg1, (str)arg2) -> None
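
One possible way to reimplement save is to dump the dataset dict with the standard json module. The sketch below uses hypothetical helpers save_dataset and load_dataset and made-up data. It also shows a detail worth keeping in mind: JSON object keys are always strings, so the integer keys of category_names have to be restored when reading the file back. NumPy arrays such as segmentation_masks are not serializable by the standard json module and would need to be converted or stored separately, which this sketch does not cover.

```python
import json
import os
import tempfile


def save_dataset(data, path):
    """Write the dataset structure to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)


def load_dataset(path):
    """Read the structure back and restore integer category ids."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    names = data.get("metadata", {}).get("category_names", {})
    if names:
        data["metadata"]["category_names"] = {int(k): v for k, v in names.items()}
    return data


# Hypothetical dataset content, for illustration only.
data = {
    "images": [
        {"filename": "/data/demo/images/0001.jpg", "image_id": 1, "annotations": []}
    ],
    "metadata": {"category_names": {0: "cat", 1: "dog"}},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = os.path.join(tmp_dir, "dataset.json")
    save_dataset(data, json_path)
    restored = load_dataset(json_path)

print(restored["metadata"]["category_names"])
```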

to_json((CDatasetIO)self) -> str

Return input/output data in JSON formatted string (compact mode).

Returns:

string: JSON formatted string

to_json( (CDatasetIO)self) -> str

to_json( (CDatasetIO)self, (object)options) -> str :

Return input/output data in JSON formatted string. Must be reimplemented and should manage the common option that sets the JSON format. It can be ['json_format', 'compact'] or ['json_format', 'indented'].

Args:

list of str: format-specific options encoded as pairs (option_name, option_value)

Returns:

string: JSON formatted string

to_json( (CDatasetIO)self, (object)options) -> str
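
The options list encodes (option_name, option_value) pairs in a flat sequence. A reimplementation might handle the json_format option roughly as follows. The helpers parse_options and dataset_to_json are hypothetical, and the choice of a 4-space indent for the 'indented' mode is an assumption of this sketch.

```python
import json


def parse_options(options):
    """Turn ['name1', 'value1', 'name2', 'value2'] into a dict."""
    if len(options) % 2 != 0:
        raise ValueError("options must contain (name, value) pairs")
    return dict(zip(options[0::2], options[1::2]))


def dataset_to_json(data, options=None):
    """Serialize data, honouring the 'json_format' option (compact by default)."""
    opts = parse_options(options or [])
    if opts.get("json_format", "compact") == "indented":
        return json.dumps(data, indent=4)
    # Compact mode: no whitespace after separators.
    return json.dumps(data, separators=(",", ":"))


print(dataset_to_json({"images": []}))
print(dataset_to_json({"images": []}, ["json_format", "indented"]))
```

Defaulting to the compact form keeps the sketch consistent with the no-argument overload, which is documented as returning compact JSON.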