`mmagic.datasets.transforms.loading`

Module Contents

Classes

`LoadImageFromFile`	Load a single image or image frames from corresponding paths.
`LoadMask`	Load Mask for multiple types.
`GetSpatialDiscountMask`	Get spatial discounting mask constant.
`LoadPairedImageFromFile`	Load a pair of images from file.

class mmagic.datasets.transforms.loading.LoadImageFromFile(key: str, color_type: str = 'color', channel_order: str = 'bgr', imdecode_backend: Optional[str] = None, use_cache: bool = False, to_float32: bool = False, to_y_channel: bool = False, save_original_img: bool = False, backend_args: Optional[dict] = None)

Bases: mmcv.transforms.BaseTransform

Load a single image or image frames from corresponding paths.

Required Keys:
- [KEY]_path

New Keys:
- [KEY]
- ori_[KEY]_shape
- ori_[KEY]

Parameters

key (str) – Keys in results to find corresponding path.
color_type (str) – The flag argument for :func:mmcv.imfrombytes. Defaults to ‘color’.
channel_order (str) – Order of channel, candidates are ‘bgr’ and ‘rgb’. Default: ‘bgr’.
imdecode_backend (str) – The image decoding backend type. The backend argument for :func:mmcv.imfrombytes. See :func:mmcv.imfrombytes for details. Candidates are ‘cv2’, ‘turbojpeg’, ‘pillow’, and ‘tifffile’. Defaults to None.
use_cache (bool) – If True, load all images at once. Default: False.
to_float32 (bool) – Whether to convert the loaded image to a float32 numpy array. If set to False, the loaded image is a uint8 array. Defaults to False.
to_y_channel (bool) – Whether to convert the loaded image to the Y channel. Only ‘rgb2ycbcr’ and ‘bgr2ycbcr’ are supported. Defaults to False.
backend_args (dict, optional) – Arguments to instantiate the prefix of uri corresponding backend. Defaults to None.
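
As a rough illustration, a pipeline entry for this transform might look like the following. This is a sketch rather than a configuration taken from the MMagic repository: the key name 'gt' is a hypothetical choice, the `type=...` dict form assumes the usual registry-style config, and `expected_keys` is a made-up helper that simply spells out the key patterns listed above.

```python
# Hypothetical pipeline entry; 'gt' is an assumed key name.
load_gt = dict(
    type='LoadImageFromFile',
    key='gt',
    color_type='color',
    channel_order='rgb',
    to_float32=True,
)


def expected_keys(key):
    """Return (required, new) key names following the documented patterns."""
    required = [f'{key}_path']
    new = [key, f'ori_{key}_shape', f'ori_{key}']
    return required, new


required, new = expected_keys(load_gt['key'])
print(required)  # ['gt_path']
print(new)       # ['gt', 'ori_gt_shape', 'ori_gt']
```

With key='gt', the transform would read the path from results['gt_path'] and write its outputs under the names printed above.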

transform(results: dict) → dict

Function to load an image or frames.

Parameters: results (dict) – Result dict from :obj:mmcv.BaseDataset.
Returns: The dict contains loaded image and meta information.
Return type: dict

_load_image(filename)

Load an image from file.

Parameters: filename (str) – Path of image file.
Returns: Image.
Return type: np.ndarray

_convert(img: numpy.ndarray)

Convert an image to the required format.

Parameters: img (np.ndarray) – The original image.
Returns: The converted image.
Return type: np.ndarray
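
When to_y_channel is enabled, the conversion step reduces the image to its luma (Y) channel. The sketch below shows the widely used ITU-R BT.601 formula for a single RGB pixel with values in [0, 1]. It assumes these are the coefficients applied by the underlying conversion, which this page does not state, and `rgb_to_y` is a hypothetical helper, not part of MMagic.

```python
def rgb_to_y(r, g, b):
    """Luma of one RGB pixel in [0, 1], BT.601 studio range (16..235).

    Assumption: coefficients follow ITU-R BT.601; the library may differ.
    """
    return 16.0 + 65.481 * r + 128.553 * g + 24.966 * b


print(rgb_to_y(0.0, 0.0, 0.0))            # 16.0 (black)
print(round(rgb_to_y(1.0, 1.0, 1.0), 3))  # 235.0 (white)
```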

__repr__(): Return repr(self).

class mmagic.datasets.transforms.loading.LoadMask(mask_mode='bbox', mask_config=None)

Bases: mmcv.transforms.BaseTransform

Load Mask for multiple types.

For different types of mask, users need to provide the corresponding config dict.

Example config for bbox:

config = dict(img_shape=(256, 256), max_bbox_shape=128)

Example config for irregular:

config = dict(
    img_shape=(256, 256),
    num_vertices=(4, 12),
    max_angle=4.,
    length_range=(10, 100),
    brush_width=(10, 40),
    area_ratio_range=(0.15, 0.5))

Example config for ff:

config = dict(
    img_shape=(256, 256),
    num_vertices=(4, 12),
    mean_angle=1.2,
    angle_range=0.4,
    brush_width=(12, 40))

Example config for set:

config = dict(
    mask_list_file='xxx/xxx/ooxx.txt',
    prefix='/xxx/xxx/ooxx/',
    io_backend='local',
    color_type='unchanged',
    file_client_kwargs=dict()
)

The mask_list_file contains a list of mask file names, one per line, like this:
    test1.jpeg
    test2.jpeg
    ...
    ...

The prefix gives the data path.
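
To make the 'set' mode concrete, the sketch below shows one way the list file and the prefix could be combined into full mask paths. It is an assumption about the behaviour, not the library's code: `build_mask_paths` is a hypothetical helper, '/data/masks/' is a made-up prefix, and the file contents are passed in as a string so the example runs without touching disk.

```python
import os.path
import random


def build_mask_paths(list_text, prefix):
    """Join each non-empty line of the list file with the data prefix."""
    names = [line.strip() for line in list_text.splitlines() if line.strip()]
    return [os.path.join(prefix, name) for name in names]


# Stand-in for the contents of mask_list_file.
list_text = "test1.jpeg\ntest2.jpeg\n"
paths = build_mask_paths(list_text, '/data/masks/')
print(paths)  # ['/data/masks/test1.jpeg', '/data/masks/test2.jpeg']

# 'set' mode is described as picking a mask at random from the set.
chosen = random.choice(paths)
print(chosen in paths)  # True
```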

Parameters

mask_mode (str) – Mask mode in [‘bbox’, ‘irregular’, ‘ff’, ‘set’, ‘file’]. Default: ‘bbox’.
    * bbox: square bounding box masks.
    * irregular: irregular holes.
    * ff: free-form holes from DeepFillv2.
    * set: randomly get a mask from a mask set.
    * file: get mask from ‘mask_path’ in results.
mask_config (dict) – Params for creating masks. Each type of mask needs different configs. Default: None.
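
Putting the two parameters together, a pipeline entry for bbox masks might be written as below. This is only a sketch: the `type=...` dict form assumes the usual registry-style config, and `MASK_MODES` is a hypothetical constant that mirrors the modes listed above.

```python
# Mask modes listed in the parameter description above.
MASK_MODES = ('bbox', 'irregular', 'ff', 'set', 'file')

# mask_config reuses the bbox example config from this page.
load_mask = dict(
    type='LoadMask',
    mask_mode='bbox',
    mask_config=dict(img_shape=(256, 256), max_bbox_shape=128),
)

# A simple sanity check before handing the config to a pipeline.
if load_mask['mask_mode'] not in MASK_MODES:
    raise ValueError(f"Unsupported mask_mode: {load_mask['mask_mode']}")
```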

_init_info()

_get_random_mask_from_set()

_get_mask_from_file(path)

transform(results)

Transform function.

Parameters: results (dict) – A dict containing the necessary information and data for augmentation.
Returns: A dict containing the processed data and information.
Return type: dict

__repr__(): Return repr(self).

class mmagic.datasets.transforms.loading.GetSpatialDiscountMask(gamma=0.99, beta=1.5)

Bases: mmcv.transforms.BaseTransform

Get spatial discounting mask constant.

The spatial discounting mask was first introduced in: Generative Image Inpainting with Contextual Attention.

Parameters

gamma (float, optional) – Gamma for computing spatial discounting. Defaults to 0.99.
beta (float, optional) – Beta for computing spatial discounting. Defaults to 1.5.

spatial_discount_mask(mask_width, mask_height)

Generate spatial discounting mask constant.

Parameters

mask_width (int) – The width of the bbox hole.
mask_height (int) – The height of the bbox hole.

Returns: Spatial discounting mask.
Return type: np.ndarray
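
The idea behind spatial discounting can be sketched in plain Python. The version below assumes that each weight is gamma ** (beta * d), where d is the distance in pixels from the nearest edge of the hole, so pixels near the border weigh more than pixels deep inside. That formula is an assumption made for illustration and may differ from the library's implementation, and nested lists stand in for the NumPy array the real method returns.

```python
def spatial_discount_mask(mask_width, mask_height, gamma=0.99, beta=1.5):
    """Return a mask_height x mask_width grid of discount weights.

    Assumed formula: weight = gamma ** (beta * distance_to_nearest_edge).
    """
    mask = []
    for y in range(mask_height):
        row = []
        for x in range(mask_width):
            # Distance to the closest border along each axis.
            dy = min(y, mask_height - 1 - y)
            dx = min(x, mask_width - 1 - x)
            row.append(gamma ** (beta * min(dx, dy)))
        mask.append(row)
    return mask


mask = spatial_discount_mask(mask_width=4, mask_height=3)
print(len(mask), len(mask[0]))  # 3 4
print(mask[0][0])               # 1.0 (on the border)
print(mask[1][1] < mask[0][0])  # True (interior is discounted)
```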

transform(results)

Transform function.

Parameters: results (dict) – A dict containing the necessary information and data for augmentation.
Returns: A dict containing the processed data and information.
Return type: dict

__repr__(): Return repr(self).

class mmagic.datasets.transforms.loading.LoadPairedImageFromFile(key: str, domain_a: str = 'A', domain_b: str = 'B', color_type: str = 'color', channel_order: str = 'bgr', imdecode_backend: Optional[str] = None, use_cache: bool = False, to_float32: bool = False, to_y_channel: bool = False, save_original_img: bool = False, backend_args: Optional[dict] = None)

Bases: LoadImageFromFile

Load a pair of images from file.

Each sample contains a pair of images, which are concatenated along the width dimension (a|b). This is a special loading class for paired generation datasets. It loads the concatenated image as the common loader does and crops it into two images of the same shape, one per domain.
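
The cropping step can be pictured as cutting the concatenated image down the middle. The sketch below uses nested lists in place of NumPy arrays so it runs without dependencies. `split_pair` is a hypothetical helper, and the even-width check is an assumption rather than documented behaviour.

```python
def split_pair(image):
    """Split an image (list of rows of pixels) into left and right halves."""
    width = len(image[0])
    if width % 2 != 0:
        # Assumed guard: both halves must have the same shape.
        raise ValueError(f'Width must be even to split evenly, got {width}.')
    half = width // 2
    image_a = [row[:half] for row in image]
    image_b = [row[half:] for row in image]
    return image_a, image_b


# A 2 x 4 "image" whose left half belongs to domain a, right half to b.
pair = [
    ['a0', 'a1', 'b0', 'b1'],
    ['a2', 'a3', 'b2', 'b3'],
]
image_a, image_b = split_pair(pair)
print(image_a)  # [['a0', 'a1'], ['a2', 'a3']]
print(image_b)  # [['b0', 'b1'], ['b2', 'b3']]
```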

Required key is “pair_path”. Added or modified keys are “pair”, “pair_ori_shape”, “ori_pair”, “img_{domain_a}”, “img_{domain_b}”, “img_{domain_a}_path”, “img_{domain_b}_path”, “img_{domain_a}_ori_shape”, “img_{domain_b}_ori_shape”, “ori_img_{domain_a}” and “ori_img_{domain_b}”.

Parameters

key (str) – Keys in results to find corresponding path.
domain_a (str, Optional) – One of the paired image domains. Defaults to ‘A’.
domain_b (str, Optional) – The other paired image domain. Defaults to ‘B’.
color_type (str) – The flag argument for :func:mmcv.imfrombytes. Defaults to ‘color’.
channel_order (str) – Order of channel, candidates are ‘bgr’ and ‘rgb’. Default: ‘bgr’.
imdecode_backend (str) – The image decoding backend type. The backend argument for :func:mmcv.imfrombytes. See :func:mmcv.imfrombytes for details. Candidates are ‘cv2’, ‘turbojpeg’, ‘pillow’, and ‘tifffile’. Defaults to None.
use_cache (bool) – If True, load all images at once. Default: False.
to_float32 (bool) – Whether to convert the loaded image to a float32 numpy array. If set to False, the loaded image is a uint8 array. Defaults to False.
to_y_channel (bool) – Whether to convert the loaded image to the Y channel. Only ‘rgb2ycbcr’ and ‘bgr2ycbcr’ are supported. Defaults to False.
backend_args (dict, optional) – Arguments to instantiate the prefix of uri corresponding backend. Defaults to None.
io_backend (str, optional) – IO backend where images are stored. Defaults to None.
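
For orientation, a pipeline entry for paired loading might look like the following sketch. The `type=...` dict form assumes the usual registry-style config, and `paired_keys` is a hypothetical helper that expands the per-domain key patterns listed in the class description. It is not part of MMagic.

```python
# Hypothetical pipeline entry; key='pair' matches the required "pair_path".
load_pair = dict(
    type='LoadPairedImageFromFile',
    key='pair',
    domain_a='A',
    domain_b='B',
    channel_order='rgb',
)


def paired_keys(domain_a, domain_b):
    """Expand the per-domain key patterns from the class description."""
    keys = []
    for domain in (domain_a, domain_b):
        keys += [
            f'img_{domain}',
            f'img_{domain}_path',
            f'img_{domain}_ori_shape',
            f'ori_img_{domain}',
        ]
    return keys


keys = paired_keys(load_pair['domain_a'], load_pair['domain_b'])
print(keys[:4])  # ['img_A', 'img_A_path', 'img_A_ori_shape', 'ori_img_A']
```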

transform(results: dict) → dict

Function to load paired images.

Parameters: results (dict) – A dict containing the necessary information and data for augmentation.
Returns: A dict containing the processed data and information.
Return type: dict
