`mmagic.datasets.transforms.crop`¶

Module Contents¶

Classes¶

`Crop`	Crop data to specific size for training.
`CropLike`	Crop/pad the image in the target_key according to the size of image in the reference_key.
`FixedCrop`	Crop paired data (at a specific position) to specific size for training.
`ModCrop`	Mod crop images, used during testing.
`PairedRandomCrop`	Paired random crop.
`RandomResizedCrop`	Crop data to random size and aspect ratio.
`CropAroundCenter`	Randomly crop the images around the unknown area in the center 1/4 of the images.
`CropAroundFg`	Crop around the whole foreground in the segmentation mask.
`CropAroundUnknown`	Crop around unknown area with a randomly selected scale.
`RandomCropLongEdge`	Randomly crop the given image by the long edge.
`CenterCropLongEdge`	Center crop the given image by the long edge.
`InstanceCrop`	Use Mask R-CNN to detect instances in an image.

Attributes¶

mmdet_apis

mmagic.datasets.transforms.crop.mmdet_apis[source]¶

class mmagic.datasets.transforms.crop.Crop(keys, crop_size, random_crop=True, is_pad_zeros=False)[source]¶

Bases: mmcv.transforms.BaseTransform

Crop data to specific size for training.

Parameters

keys (Sequence[str]) – The images to be cropped.
crop_size (Tuple[int]) – Target spatial size (h, w).
random_crop (bool) – If set to True, the image is cropped at a random position. Otherwise, it is center cropped. Default: True.
is_pad_zeros (bool, optional) – Whether to pad the image with 0 if crop_size is greater than image size. Default: False.
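
The choice between a random and a centered crop can be sketched as below. This is a minimal illustration rather than the mmagic implementation: the helper name `crop_box`, the `(x_offset, y_offset, crop_w, crop_h)` box layout, and the clamping of the crop size to the image size (the case without zero padding) are assumptions.

```python
import random


def crop_box(data_h, data_w, crop_size, random_crop=True, rng=random):
    """Return a hypothetical (x_offset, y_offset, crop_w, crop_h) box."""
    crop_h, crop_w = crop_size  # crop_size is (h, w), as in the docs above
    # Assumption: without padding, the box is clamped to the image size.
    crop_h = min(data_h, crop_h)
    crop_w = min(data_w, crop_w)
    if random_crop:
        # randint is inclusive on both ends, so the box always fits.
        x_offset = rng.randint(0, data_w - crop_w)
        y_offset = rng.randint(0, data_h - crop_h)
    else:
        # Center crop: split the leftover margin evenly.
        x_offset = (data_w - crop_w) // 2
        y_offset = (data_h - crop_h) // 2
    return x_offset, y_offset, crop_w, crop_h


center = crop_box(100, 200, (50, 60), random_crop=False)
```

Passing a seeded `random.Random` instance as `rng` makes the random branch reproducible in tests.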

_crop(data)[source]¶

Crop the data.

Parameters: data (Union[List, np.ndarray]) – Input data to crop.
Returns: cropped data and corresponding crop box.
Return type: tuple

transform(results)[source]¶

Transform function.

Parameters: results (dict) – A dict containing the necessary information and data for augmentation.
Returns: A dict containing the processed data and information.
Return type: dict
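
A transform like this is usually referenced from a data pipeline config. The fragment below is a sketch that assumes the common OpenMMLab convention of a `type` key naming the registered transform; the variable name `train_pipeline` and the key names `img` and `gt` are illustrative, not taken from this page.

```python
# Hypothetical pipeline fragment: arguments mirror the Crop signature above.
train_pipeline = [
    dict(
        type='Crop',            # assumed registry name of the transform
        keys=['img', 'gt'],     # illustrative keys of the images to crop
        crop_size=(128, 128),   # target spatial size (h, w)
        random_crop=True,       # random position instead of center crop
        is_pad_zeros=False,     # do not pad when the image is too small
    ),
]
```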

__repr__()[source]¶: Return repr(self).

class mmagic.datasets.transforms.crop.CropLike(target_key, reference_key=None)[source]¶

Bases: mmcv.transforms.BaseTransform

Crop/pad the image in the target_key according to the size of image in the reference_key.

Parameters

target_key (str) – The key of the image to be cropped.
reference_key (str | None) – The key of the reference image, whose size is used. Default: None.
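
The crop-or-pad behaviour can be sketched on plain nested lists. This is an illustration under stated assumptions, not the library code: the helper name `crop_like`, the top-left anchoring, and the zero fill value are all assumptions.

```python
def crop_like(target, ref_h, ref_w, fill=0):
    """Crop or pad a 2-D list of rows to (ref_h, ref_w), anchored top-left."""
    # Start from a canvas of the reference size filled with the pad value.
    out = [[fill] * ref_w for _ in range(ref_h)]
    # Copy only the region that both shapes have in common.
    h = min(len(target), ref_h)
    w = min(len(target[0]), ref_w) if target else 0
    for y in range(h):
        out[y][:w] = target[y][:w]
    return out


# A 2x3 target fitted to a 3x2 reference: one column cropped, one row padded.
fitted = crop_like([[1, 2, 3], [4, 5, 6]], ref_h=3, ref_w=2)
```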

transform(results)[source]¶

Transform function.

Parameters

results (dict) – A dict containing the necessary information and data for augmentation. Requires self.target_key and self.reference_key.

Returns

A dict containing the processed data and information. Modifies self.target_key.

Return type

dict

__repr__()[source]¶: Return repr(self).

class mmagic.datasets.transforms.crop.FixedCrop(keys, crop_size, crop_pos=None)[source]¶

Bases: mmcv.transforms.BaseTransform

Crop paired data (at a specific position) to specific size for training.

Parameters

keys (Sequence[str]) – The images to be cropped.
crop_size (Tuple[int]) – Target spatial size (h, w).
crop_pos (Tuple[int]) – Specific position (x, y). If set to None, the crop position is initialized randomly for the paired data batch. Default: None.

_crop(data, x_offset, y_offset, crop_w, crop_h)[source]¶

Crop the data.

Parameters

data (Union[List, np.ndarray]) – Input data to crop.
x_offset (int) – The offset of x axis.
y_offset (int) – The offset of y axis.
crop_w (int) – The width of crop bbox.
crop_h (int) – The height of crop bbox.

Returns

cropped data and corresponding crop box.

Return type

tuple
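
How the offsets and crop size select a region can be sketched with list slicing. This is a simplified stand-in for array slicing: the helper name `fixed_crop` and the `[x_offset, y_offset, crop_w, crop_h]` layout of the returned box are assumptions.

```python
def fixed_crop(data, x_offset, y_offset, crop_w, crop_h):
    """Slice a 2-D list of rows and return (cropped, crop_bbox)."""
    crop_bbox = [x_offset, y_offset, crop_w, crop_h]  # assumed box layout
    # Rows are selected by the y range, columns by the x range.
    rows = data[y_offset:y_offset + crop_h]
    cropped = [row[x_offset:x_offset + crop_w] for row in rows]
    return cropped, crop_bbox


# A 4x4 grid whose values encode their position as 10 * y + x.
grid = [[10 * y + x for x in range(4)] for y in range(4)]
patch, bbox = fixed_crop(grid, x_offset=1, y_offset=2, crop_w=2, crop_h=1)
```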

transform(results)[source]¶

Transform function.

Parameters: results (dict) – A dict containing the necessary information and data for augmentation.
Returns: A dict containing the processed data and information.
Return type: dict

__repr__()[source]¶: Return repr(self).

class mmagic.datasets.transforms.crop.ModCrop(key='gt')[source]¶

Bases: mmcv.transforms.BaseTransform

Mod crop images, used during testing.

Required keys are “scale” and “KEY”, added or modified keys are “KEY”.

Parameters: key (str) – The key of image. Default: ‘gt’
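
A minimal sketch of mod cropping, assuming the usual super-resolution meaning: the image is trimmed so that its height and width are divisible by the scale. The helper name `mod_crop` and the nested-list image are illustrative.

```python
def mod_crop(image, scale):
    """Trim a 2-D list of rows so both sides are multiples of scale."""
    h, w = len(image), len(image[0])
    # Drop the remainder rows and columns at the bottom and right.
    new_h, new_w = h - h % scale, w - w % scale
    return [row[:new_w] for row in image[:new_h]]


image = [[0] * 7 for _ in range(5)]  # 5 rows, 7 columns
trimmed = mod_crop(image, scale=2)
```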

transform(results)[source]¶

Transform function.

Parameters: results (dict) – A dict containing the necessary information and data for augmentation.
Returns: A dict containing the processed data and information.
Return type: dict

__repr__()[source]¶: Return repr(self).

class mmagic.datasets.transforms.crop.PairedRandomCrop(gt_patch_size, lq_key='img', gt_key='gt')[source]¶

Bases: mmcv.transforms.BaseTransform

Paired random crop.

It crops a pair of img and gt images at corresponding locations. It also accepts a list of img images and a list of gt images. Required keys are “scale”, “lq_key”, and “gt_key”, added or modified keys are “lq_key” and “gt_key”.

Parameters

gt_patch_size (int) – Cropped GT patch size.
lq_key (str) – Key of LQ img. Default: ‘img’.
gt_key (str) – Key of GT img. Default: ‘gt’.
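
The idea of cropping at corresponding locations can be sketched as follows. This is an illustration, not the library code: it assumes the GT image is `scale` times larger than the LQ image and that `gt_patch_size` is divisible by `scale`; the helper name `paired_crop_boxes` and the `(left, top, width, height)` box layout are hypothetical.

```python
import random


def paired_crop_boxes(lq_h, lq_w, gt_patch_size, scale, rng=random):
    """Return matching (left, top, width, height) boxes for LQ and GT."""
    # Assumption: gt_patch_size is a multiple of scale.
    lq_patch_size = gt_patch_size // scale
    if lq_h < lq_patch_size or lq_w < lq_patch_size:
        raise ValueError('LQ image is smaller than the requested patch')
    # Sample the position once, in LQ coordinates.
    top = rng.randint(0, lq_h - lq_patch_size)
    left = rng.randint(0, lq_w - lq_patch_size)
    lq_box = (left, top, lq_patch_size, lq_patch_size)
    # The GT box covers the same content, scaled up.
    gt_box = (left * scale, top * scale, gt_patch_size, gt_patch_size)
    return lq_box, gt_box


lq_box, gt_box = paired_crop_boxes(32, 48, gt_patch_size=64, scale=4,
                                   rng=random.Random(0))
```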

transform(results)[source]¶

Transform function.

Parameters: results (dict) – A dict containing the necessary information and data for augmentation.
Returns: A dict containing the processed data and information.
Return type: dict

__repr__()[source]¶: Return repr(self).

class mmagic.datasets.transforms.crop.RandomResizedCrop(keys, crop_size, scale=(0.08, 1.0), ratio=(3.0 / 4.0, 4.0 / 3.0), interpolation='bilinear')[source]¶

Bases: mmcv.transforms.BaseTransform

Crop data to random size and aspect ratio.

A crop of a random proportion of the original image and a random aspect ratio of the original aspect ratio is made. The cropped image is finally resized to a given size specified by ‘crop_size’. Modified keys are the attributes specified in “keys”.

This code is partially adopted from torchvision.transforms.RandomResizedCrop: [https://pytorch.org/vision/stable/_modules/torchvision/transforms/transforms.html#RandomResizedCrop].

Parameters

keys (list[str]) – The images to be resized and random-cropped.
crop_size (int | tuple[int]) – Target spatial size (h, w).
scale (tuple[float], optional) – Range of the proportion of the original image to be cropped. Default: (0.08, 1.0).
ratio (tuple[float], optional) – Range of aspect ratio of the crop. Default: (3. / 4., 4. / 3.).
interpolation (str, optional) – Algorithm used for interpolation. It must be one of the following: “nearest” | “bilinear” | “bicubic” | “area” | “lanczos”. Default: “bilinear”.

get_params(data)[source]¶

Get parameters for a random sized crop.

Parameters: data (np.ndarray) – Image of type numpy array to be cropped.
Returns: A tuple containing the coordinates of the top left corner and the chosen crop size.
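
The sampling can be sketched along the lines of the torchvision scheme that this class is adopted from: draw an area fraction and a log-uniform aspect ratio, retry a few times, then fall back to a central crop. This is a sketch under assumptions, not the mmagic code: the helper name `get_crop_params`, the number of attempts, and the `(top, left, crop_h, crop_w)` return order are all assumed.

```python
import math
import random


def get_crop_params(height, width, scale=(0.08, 1.0),
                    ratio=(3.0 / 4.0, 4.0 / 3.0), rng=random, attempts=10):
    """Return a hypothetical (top, left, crop_h, crop_w) tuple."""
    area = height * width
    log_ratio = (math.log(ratio[0]), math.log(ratio[1]))
    for _ in range(attempts):
        target_area = area * rng.uniform(scale[0], scale[1])
        # Sample the aspect ratio uniformly in log space.
        aspect = math.exp(rng.uniform(log_ratio[0], log_ratio[1]))
        crop_w = int(round(math.sqrt(target_area * aspect)))
        crop_h = int(round(math.sqrt(target_area / aspect)))
        if 0 < crop_w <= width and 0 < crop_h <= height:
            top = rng.randint(0, height - crop_h)
            left = rng.randint(0, width - crop_w)
            return top, left, crop_h, crop_w
    # Fallback: central crop with the aspect ratio clamped to the range.
    in_ratio = width / height
    if in_ratio < min(ratio):
        crop_w = width
        crop_h = int(round(crop_w / min(ratio)))
    elif in_ratio > max(ratio):
        crop_h = height
        crop_w = int(round(crop_h * max(ratio)))
    else:
        crop_w, crop_h = width, height
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    return top, left, crop_h, crop_w


fallback = get_crop_params(100, 400, attempts=0)
```

Setting `attempts=0` skips the random sampling, which makes the fallback branch easy to inspect.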

transform(results)[source]¶

Transform function.

Parameters: results (dict) – A dict containing the necessary information and data for augmentation.
Returns: A dict containing the processed data and information.
Return type: dict

__repr__()[source]¶: Return repr(self).

class mmagic.datasets.transforms.crop.CropAroundCenter(crop_size)[source]¶

Bases: mmcv.transforms.BaseTransform

Randomly crop the images around the unknown area in the center 1/4 of the images.

This cropping strategy is adopted in GCA matting. The unknown area is the same as the semi-transparent area. https://arxiv.org/pdf/2001.04069.pdf

It retains the center 1/4 of the images and resizes them to ‘crop_size’. Required keys are “fg”, “bg”, “trimap” and “alpha”, added or modified keys are “crop_bbox”, “fg”, “bg”, “trimap” and “alpha”.

Parameters: crop_size (int | tuple) – Desired output size. If int, square crop is applied.

transform(results)[source]¶

Transform function.

Parameters: results (dict) – A dict containing the necessary information and data for augmentation.
Returns: A dict containing the processed data and information.
Return type: dict

__repr__()[source]¶: Return repr(self).

class mmagic.datasets.transforms.crop.CropAroundFg(keys, bd_ratio_range=(0.1, 0.4), test_mode=False)[source]¶

Bases: mmcv.transforms.BaseTransform

Crop around the whole foreground in the segmentation mask.

Required keys are “seg” and the keys in argument keys. Meanwhile, “seg” must be in argument keys. Added or modified keys are “crop_bbox” and the keys in argument keys.

Parameters

keys (Sequence[str]) – The images to be cropped. It must contain ‘seg’.
bd_ratio_range (tuple, optional) – The range of the boundary (bd) ratio to select from. The boundary ratio is the ratio of the boundary to the minimal bbox that contains the whole foreground given by segmentation. Defaults to (0.1, 0.4).
test_mode (bool) – Whether to use test mode. In test mode, the tight crop area of the foreground will be extended to a square. Defaults to False.
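
One way to read these parameters is sketched below: find the tight foreground box, optionally extend it to a square, then grow it by a randomly chosen boundary. This is an interpretation, not the library code. In particular, measuring the boundary against the long side of the box, clipping to the image, the helper name `crop_around_fg_box`, and the `(left, top, right, bottom)` layout are assumptions.

```python
import random


def crop_around_fg_box(seg, bd_ratio_range=(0.1, 0.4), test_mode=False,
                       rng=random):
    """Return a hypothetical (left, top, right, bottom) crop box."""
    height, width = len(seg), len(seg[0])
    ys = [y for y, row in enumerate(seg) if any(row)]
    xs = [x for x in range(width) if any(row[x] for row in seg)]
    if not ys:
        raise ValueError('segmentation mask has no foreground')
    # Tight box around the foreground (right and bottom are exclusive).
    top, bottom = ys[0], ys[-1] + 1
    left, right = xs[0], xs[-1] + 1
    long_side = max(bottom - top, right - left)
    if test_mode:
        # Extend the tight box to a square.
        bottom = top + long_side
        right = left + long_side
    # Assumption: the boundary is a fraction of the long side.
    boundary = int(round(rng.uniform(*bd_ratio_range) * long_side))
    top = max(top - boundary, 0)
    left = max(left - boundary, 0)
    bottom = min(bottom + boundary, height)
    right = min(right + boundary, width)
    return left, top, right, bottom


# 10x10 mask with foreground in rows 4-5 and columns 3-6.
seg = [[1 if 4 <= y <= 5 and 3 <= x <= 6 else 0 for x in range(10)]
       for y in range(10)]
box = crop_around_fg_box(seg, bd_ratio_range=(0.5, 0.5))
```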

transform(results)[source]¶

Transform function.

Parameters: results (dict) – A dict containing the necessary information and data for augmentation.
Returns: A dict containing the processed data and information.
Return type: dict

class mmagic.datasets.transforms.crop.CropAroundUnknown(keys, crop_sizes, unknown_source='alpha', interpolations='bilinear')[source]¶

Bases: mmcv.transforms.BaseTransform

Crop around unknown area with a randomly selected scale.

Randomly select the w and h from a list of (w, h). Required keys are the keys in argument keys, added or modified keys are “crop_bbox” and the keys in argument keys. This class assumes that the values of “alpha” range from 0 to 255.

Parameters

keys (Sequence[str]) – The images to be cropped. It must contain ‘alpha’. If unknown_source is set to ‘trimap’, then it must also contain ‘trimap’.
crop_sizes (list[int | tuple[int]]) – List of (w, h) to be selected.
unknown_source (str, optional) – Unknown area to select from. It must be ‘alpha’ or ‘trimap’. Default to ‘alpha’.
interpolations (str | list[str], optional) – Interpolation method of mmcv.imresize. The interpolation operation will be applied when the image size is smaller than the crop_size. If given as a list of str, it should have the same length as keys. If given as a single str, all the keys will be resized with the same method. Defaults to ‘bilinear’.
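
A simplified sketch of the selection step, assuming the unknown area is where alpha is strictly between 0 and 255. It is not the library implementation: resizing of small images is omitted, only `(w, h)` tuples are handled, and the centering on a random unknown pixel, the fallback to the image center, the clamping, and the helper name `crop_around_unknown_box` are assumptions.

```python
import random


def crop_around_unknown_box(alpha, crop_sizes, rng=random):
    """Return a hypothetical (left, top, crop_w, crop_h) crop box."""
    h, w = len(alpha), len(alpha[0])
    crop_w, crop_h = rng.choice(crop_sizes)  # entries are (w, h)
    # Unknown pixels: neither fully transparent nor fully opaque.
    unknown = [(y, x) for y in range(h) for x in range(w)
               if 0 < alpha[y][x] < 255]
    if unknown:
        cy, cx = rng.choice(unknown)
    else:
        cy, cx = h // 2, w // 2  # assumption: fall back to the center
    # Center the box on the chosen pixel, then clamp it into the image.
    top = min(max(cy - crop_h // 2, 0), max(h - crop_h, 0))
    left = min(max(cx - crop_w // 2, 0), max(w - crop_w, 0))
    return left, top, crop_w, crop_h


# 8x8 alpha matte with a single semi-transparent pixel at row 5, column 6.
alpha = [[128 if (y, x) == (5, 6) else 0 for x in range(8)] for y in range(8)]
unknown_box = crop_around_unknown_box(alpha, crop_sizes=[(4, 4)])
```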

transform(results)[source]¶

Transform function.

Parameters: results (dict) – A dict containing the necessary information and data for augmentation.
Returns: A dict containing the processed data and information.
Return type: dict

__repr__()[source]¶: Return repr(self).

class mmagic.datasets.transforms.crop.RandomCropLongEdge(keys='img')[source]¶

Bases: mmcv.transforms.BaseTransform

Randomly crop the given image by the long edge.

Parameters: keys (list[str]) – The images to be cropped.

transform(results)[source]¶

Call function.

Parameters: results (dict) – A dict containing the necessary information and data for augmentation.
Returns: A dict containing the processed data and information.
Return type: dict

__repr__()[source]¶: Return repr(self).

class mmagic.datasets.transforms.crop.CenterCropLongEdge(keys='img')[source]¶

Bases: mmcv.transforms.BaseTransform

Center crop the given image by the long edge.

Parameters: keys (list[str]) – The images to be cropped.
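
Cropping by the long edge can be read as trimming the longer side to the length of the shorter one, which yields a square. The sketch below covers both the centered variant and the random variant of RandomCropLongEdge; that reading, the helper name `long_edge_crop_box`, and the `(left, top, width, height)` layout are assumptions.

```python
import random


def long_edge_crop_box(height, width, center=True, rng=random):
    """Return a hypothetical square (left, top, width, height) box."""
    crop_size = min(height, width)  # the short edge sets the square size
    if center:
        top = (height - crop_size) // 2
        left = (width - crop_size) // 2
    else:
        # Only the long edge has room to move; the other offset is 0.
        top = rng.randint(0, height - crop_size)
        left = rng.randint(0, width - crop_size)
    return left, top, crop_size, crop_size


centered = long_edge_crop_box(300, 500, center=True)
```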

transform(results)[source]¶

Call function.

Parameters: results (dict) – A dict containing the necessary information and data for augmentation.
Returns: A dict containing the processed data and information.
Return type: dict

__repr__()[source]¶: Return repr(self).

class mmagic.datasets.transforms.crop.InstanceCrop(config_file, from_pretrained=None, key='img', box_num_upbound=-1, finesize=256)[source]¶

Bases: mmcv.transforms.BaseTransform

Use Mask R-CNN to detect instances in an image.

Mask R-CNN is used to detect the instances in the image. pred_bbox is used to segment the instances in the image.

Parameters

config_file (str) – config file name relative to detectron2’s “configs/”
key (str) – Unused
box_num_upbound (int) – The upper limit on the number of instances in the figure

transform(results: dict) → dict[source]¶

The transform function of InstanceCrop.

Parameters

results (dict) – A dict containing the necessary information and data for conversion.

Returns

A dict containing the processed data and information.

Return type

dict

predict_bbox(image)[source]¶
