`mmagic.datasets.transforms.aug_pixel`¶

Module Contents¶

Classes¶

`BinarizeImage`	Binarize image.
`Clip`	Clip the pixels.
`ColorJitter`	An interface for torch color jitter so that it can be invoked in the mmagic pipeline.
`RandomAffine`	Apply random affine to input images.
`RandomMaskDilation`	Randomly dilate binary masks.
`UnsharpMasking`	Apply unsharp masking to an image or a sequence of images.

class mmagic.datasets.transforms.aug_pixel.BinarizeImage(keys, binary_thr, a_min=0, a_max=1, dtype=np.uint8)[source]¶

Bases: mmcv.transforms.BaseTransform

Binarize image.

Parameters

keys (Sequence[str]) – The images to be binarized.
binary_thr (float) – Threshold for binarization.
a_min (int) – Lower limit of the pixel values.
a_max (int) – Upper limit of the pixel values.
dtype (np.dtype) – Set the data type of the output. Default: np.uint8

_binarize(img)[source]¶

Binarize image.

Parameters: img (np.ndarray) – Input image.
Returns: Output image.
Return type: img (np.ndarray)
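The parameters above suggest a threshold followed by a rescale to the [a_min, a_max] range. The following is a minimal NumPy sketch of that idea, not the library's implementation: the standalone function `binarize`, the strict `>` comparison, and the rescaling step are all assumptions made for illustration.

```python
import numpy as np


def binarize(img, binary_thr, a_min=0, a_max=1, dtype=np.uint8):
    """Threshold `img`, then map {0, 1} onto {a_min, a_max} (assumed behaviour)."""
    # Pixels strictly above the threshold become 1, all others become 0.
    out = (img > binary_thr).astype(dtype)
    # Rescale only when the requested range differs from the default [0, 1].
    if a_min != 0 or a_max != 1:
        out = out * (a_max - a_min) + a_min
    return out


mask = np.array([[0.1, 0.6], [0.5, 0.9]], dtype=np.float32)
binary_01 = binarize(mask, binary_thr=0.5)               # values in {0, 1}
binary_255 = binarize(mask, binary_thr=0.5, a_max=255)   # values in {0, 255}
```

In this sketch a pixel equal to the threshold (0.5 here) maps to the lower value, because the comparison is strict.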

transform(results)[source]¶

The transform function of BinarizeImage.

Parameters: results (dict) – A dict containing the necessary information and data for augmentation.
Returns: A dict containing the processed data and information.
Return type: dict

__repr__()[source]¶: Return repr(self).

class mmagic.datasets.transforms.aug_pixel.Clip(keys, a_min=0, a_max=255)[source]¶

Bases: mmcv.transforms.BaseTransform

Clip the pixels.

Modified keys are the attributes specified in “keys”.

Parameters

keys (list[str]) – The keys whose values are clipped.
a_min (int) – Lower limit of the pixel values.
a_max (int) – Upper limit of the pixel values.

_clip(input_)[source]¶

Clip the pixels.

Parameters: input_ (Union[List, np.ndarray]) – Pixels to clip.
Returns: Clipped pixels.
Return type: Union[List, np.ndarray]
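Since `_clip` accepts either a single array or a list, a sketch of that dispatch may help. The helper `clip_pixels` below is hypothetical and only clips with `np.clip`; it does not attempt any rounding, and the real method may behave differently.

```python
import numpy as np


def clip_pixels(input_, a_min=0, a_max=255):
    """Clip one array, or every array in a list, to [a_min, a_max]."""
    if isinstance(input_, list):
        # A list (e.g. a frame sequence) is clipped element by element.
        return [np.clip(v, a_min, a_max) for v in input_]
    return np.clip(input_, a_min, a_max)


frame = np.array([-12.0, 40.5, 300.0])
clipped = clip_pixels(frame)                    # single array
clipped_list = clip_pixels([frame, frame * 2])  # list of arrays
```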

transform(results)[source]¶

The transform function of Clip.

Parameters

results (dict) – A dict containing the necessary information and data for augmentation.

Returns

A dict in which the values of the specified keys are rounded and clipped.

Return type

dict

__repr__()[source]¶: Return repr(self).

class mmagic.datasets.transforms.aug_pixel.ColorJitter(keys, channel_order='rgb', **kwargs)[source]¶

Bases: mmcv.transforms.BaseTransform

An interface for torch color jitter so that it can be invoked in mmagic pipeline.

Randomly change the brightness, contrast and saturation of an image. Modified keys are the attributes specified in “keys”.

Required Keys:

[KEYS]

Modified Keys:

[KEYS]

Parameters

keys (list[str]) – The keys of the images to be jittered.
channel_order (str) – Order of channel, candidates are ‘bgr’ and ‘rgb’. Default: ‘rgb’.

Notes

**kwargs follows the argument list of torchvision.transforms.ColorJitter.

brightness (float or tuple of float (min, max)): How much to jitter brightness. brightness_factor is chosen uniformly from [max(0, 1 - brightness), 1 + brightness] or the given [min, max]. Should be non-negative numbers.
contrast (float or tuple of float (min, max)): How much to jitter contrast. contrast_factor is chosen uniformly from [max(0, 1 - contrast), 1 + contrast] or the given [min, max]. Should be non-negative numbers.
saturation (float or tuple of float (min, max)): How much to jitter saturation. saturation_factor is chosen uniformly from [max(0, 1 - saturation), 1 + saturation] or the given [min, max]. Should be non-negative numbers.
hue (float or tuple of float (min, max)): How much to jitter hue. hue_factor is chosen uniformly from [-hue, hue] or the given [min, max]. Should have 0 <= hue <= 0.5 or -0.5 <= min <= max <= 0.5.
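The sampling intervals described in these notes can be made concrete with a small pure-Python sketch. The helpers `jitter_range` and `hue_range` are hypothetical names that only restate the documented formulas; they are not part of mmagic or torchvision. The config dict at the end assumes the usual `type=...` pipeline convention and a hypothetical key name `'img'`.

```python
import random


def jitter_range(value, center=1.0, floor=0.0):
    """Interval from which a brightness/contrast/saturation factor is drawn."""
    if isinstance(value, (tuple, list)):
        low, high = value  # an explicit [min, max] is used as given
    else:
        low, high = max(floor, center - value), center + value
    return float(low), float(high)


def hue_range(value):
    """Interval from which the hue factor is drawn."""
    if isinstance(value, (tuple, list)):
        low, high = value
    else:
        low, high = -value, value
    return float(low), float(high)


brightness = jitter_range(0.4)         # roughly (0.6, 1.4)
contrast = jitter_range(1.5)           # lower end is clamped at 0
saturation = jitter_range((0.8, 1.2))
hue = hue_range(0.1)

# Drawing one factor uniformly from its interval.
rng = random.Random(0)
brightness_factor = rng.uniform(*brightness)

# Hypothetical pipeline entry; extra keyword arguments are forwarded as **kwargs.
color_jitter_cfg = dict(
    type='ColorJitter', keys=['img'], channel_order='rgb',
    brightness=0.4, contrast=0.4, saturation=0.4, hue=0.1)
```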

_color_jitter(image, this_seed)[source]¶

Color Jitter Function.

Parameters

image (np.ndarray) – Image.
this_seed (int) – Seed of torch.

Returns

The output image.

Return type

image (np.ndarray)

transform(results: Dict) → Dict[source]¶

The transform function of ColorJitter.

Parameters: results (dict) – The result dict.
Returns: The result dict.
Return type: dict

__repr__()[source]¶: Return repr(self).

class mmagic.datasets.transforms.aug_pixel.RandomAffine(keys, degrees, translate=None, scale=None, shear=None, flip_ratio=None)[source]¶

Bases: mmcv.transforms.BaseTransform

Apply random affine to input images.

This class is adapted from https://github.com/pytorch/vision/blob/v0.5.0/torchvision/transforms/transforms.py#L1015. Note that a random flip is added in https://github.com/Yaoyi-Li/GCA-Matting/blob/master/dataloader/data_generator.py#L70; see the explanation of flip_ratio below. Required keys are the keys in attribute “keys”; modified keys are the same keys.

Parameters

keys (Sequence[str]) – The images to be affined.
degrees (float | tuple[float]) – Range of degrees to select from. If it is a float instead of a tuple like (min, max), the range of degrees will be (-degrees, +degrees). Set to 0 to deactivate rotations.
translate (tuple, optional) – Tuple of maximum absolute fractions for horizontal and vertical translations. For example, with translate=(a, b), the horizontal shift is randomly sampled in the range -img_width * a < dx < img_width * a and the vertical shift is randomly sampled in the range -img_height * b < dy < img_height * b. Default: None.
scale (tuple, optional) – Scaling factor interval, e.g. (a, b); the scale is then randomly sampled from the range a <= scale <= b. Default: None.
shear (float | tuple[float], optional) – Range of shear degrees to select from. If shear is a float, a shear parallel to the x axis and a shear parallel to the y axis in the range (-shear, +shear) will be applied. If shear is a tuple of 2 values, an x-axis shear and a y-axis shear in (shear[0], shear[1]) will be applied. Default: None.
flip_ratio (float, optional) – Probability of the image being flipped. The flips in horizontal direction and vertical direction are independent. The image may be flipped in both directions. Default: None.
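To illustrate how these ranges could translate into concrete values, here is a hedged sketch of a sampler. The function `sample_affine_params`, its return format, the `(width, height)` order of `img_size`, the single sampled shear angle, and the independent x/y scale draws are all assumptions for illustration; the actual `_get_params` may differ.

```python
import random


def as_range(value):
    """Expand a single number v into (-v, +v); pass (min, max) pairs through."""
    if isinstance(value, (int, float)):
        return (-value, value)
    return tuple(value)


def sample_affine_params(img_size, degrees, translate=None, scale=None,
                         shear=None, flip_ratio=None, rng=random):
    width, height = img_size  # assumed (width, height) order
    angle = rng.uniform(*as_range(degrees))
    if translate is not None:
        # Shifts are bounded by a fraction of the image size.
        dx = rng.uniform(-width * translate[0], width * translate[0])
        dy = rng.uniform(-height * translate[1], height * translate[1])
    else:
        dx = dy = 0.0
    if scale is not None:
        # Assumption: x and y scale factors are drawn independently.
        scale_x, scale_y = rng.uniform(*scale), rng.uniform(*scale)
    else:
        scale_x = scale_y = 1.0
    shear_angle = rng.uniform(*as_range(shear)) if shear is not None else 0.0
    if flip_ratio is not None:
        # Horizontal and vertical flips are decided independently.
        flip_h = rng.random() < flip_ratio
        flip_v = rng.random() < flip_ratio
    else:
        flip_h = flip_v = False
    return dict(angle=angle, translate=(dx, dy), scale=(scale_x, scale_y),
                shear=shear_angle, flip=(flip_h, flip_v))


params = sample_affine_params(
    (640, 480), degrees=30, translate=(0.1, 0.2), scale=(0.8, 1.2),
    shear=10, flip_ratio=0.5, rng=random.Random(0))
```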

static _get_params(degrees, translate, scale_ranges, shears, flip_ratio, img_size)[source]¶

Get parameters for affine transformation.

Returns: Params to be passed to the affine transformation.
Return type: params (tuple)

static _get_inverse_affine_matrix(center, angle, translate, scale, shear, flip)[source]¶

Helper method to compute inverse matrix for affine transformation.

As explained in PIL.Image.rotate, we need to compute the INVERSE of the affine transformation matrix M = T * C * RSS * C^-1, where:

T is the translation matrix: [1, 0, tx | 0, 1, ty | 0, 0, 1];

C is the translation matrix that keeps the center: [1, 0, cx | 0, 1, cy | 0, 0, 1];

RSS is the rotation with scale and shear matrix.

It differs from the original function in torchvision in two ways:

1. The order is changed to flip -> scale -> rotation -> shear.
2. x and y have different scale factors.

RSS(shear, a, scale, f) =

[ cos(a + shear)*scale_x*f,  -sin(a + shear)*scale_y,  0 ]
[ sin(a)*scale_x*f,           cos(a)*scale_y,          0 ]
[ 0,                          0,                       1 ]

Thus, the inverse is M^-1 = C * RSS^-1 * C^-1 * T^-1.
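The factorisation can be checked numerically. The sketch below builds M and M^-1 from the matrices written above and confirms that their product is the identity. It is only a verification aid under stated assumptions: angles are taken in radians, `f` is treated as a ±1 flip factor, and the helper names `translation` and `rss` are hypothetical rather than part of the library.

```python
import math

import numpy as np


def translation(tx, ty):
    """Homogeneous 3x3 translation matrix."""
    return np.array([[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]])


def rss(a, shear, scale_x, scale_y, f):
    """Rotation-scale-shear matrix, term by term as written in the docstring."""
    return np.array([
        [math.cos(a + shear) * scale_x * f, -math.sin(a + shear) * scale_y, 0.0],
        [math.sin(a) * scale_x * f, math.cos(a) * scale_y, 0.0],
        [0.0, 0.0, 1.0],
    ])


center, shift = (64.0, 48.0), (5.0, -3.0)
T = translation(*shift)
C = translation(*center)
C_inv = translation(-center[0], -center[1])
T_inv = translation(-shift[0], -shift[1])
R = rss(math.radians(30), math.radians(10), scale_x=1.2, scale_y=0.9, f=-1.0)

M = T @ C @ R @ C_inv                        # forward transform
M_inv = C @ np.linalg.inv(R) @ C_inv @ T_inv  # inverse, factor by factor

# Mapping a point forward and back should return the original point.
point = np.array([10.0, 20.0, 1.0])
round_trip = M_inv @ (M @ point)
```

RSS is invertible here because its determinant is scale_x * scale_y * f * cos(shear), which is non-zero unless a scale factor is zero or the shear is ±90 degrees.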

transform(results)[source]¶

The transform function of RandomAffine.

Parameters: results (dict) – A dict containing the necessary information and data for augmentation.
Returns: A dict containing the processed data and information.
Return type: dict

__repr__()[source]¶: Return repr(self).

class mmagic.datasets.transforms.aug_pixel.RandomMaskDilation(keys, binary_thr=0.0, kernel_min=9, kernel_max=49)[source]¶

Bases: mmcv.transforms.BaseTransform

Randomly dilate binary masks.

Parameters

keys (Sequence[str]) – The masks to be dilated.
binary_thr (float) – Threshold for obtaining binary mask. Default: 0.
kernel_min (int) – Min size of dilation kernel. Default: 9.
kernel_max (int) – Max size of dilation kernel. Default: 49.
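As a rough illustration of these parameters, the following NumPy-only sketch binarizes a mask and dilates it with a square kernel whose size is drawn from [kernel_min, kernel_max]. The functions `dilate` and `random_dilate` are hypothetical, and the square all-ones kernel and the inclusive size range are assumptions; the library may rely on an image-processing backend instead.

```python
import numpy as np


def dilate(mask, kernel_size):
    """Binary dilation of a 2-D mask with a square all-ones kernel."""
    pad = kernel_size // 2
    padded = np.pad(mask, pad, mode='constant')
    h, w = mask.shape
    out = np.zeros_like(mask)
    # Take the maximum over every shifted copy covered by the kernel.
    for dy in range(kernel_size):
        for dx in range(kernel_size):
            out = np.maximum(out, padded[dy:dy + h, dx:dx + w])
    return out


def random_dilate(img, binary_thr=0.0, kernel_min=9, kernel_max=49, rng=None):
    """Binarize `img`, then dilate with a randomly sized kernel."""
    if rng is None:
        rng = np.random.default_rng()
    # The upper bound of `integers` is exclusive, hence the +1.
    kernel_size = int(rng.integers(kernel_min, kernel_max + 1))
    binary = (img > binary_thr).astype(np.float32)
    return dilate(binary, kernel_size), kernel_size


mask = np.zeros((7, 7), dtype=np.float32)
mask[3, 3] = 0.8
grown = dilate((mask > 0.0).astype(np.float32), kernel_size=3)
dilated, k = random_dilate(mask, kernel_min=3, kernel_max=5,
                           rng=np.random.default_rng(0))
```

The small kernel range in the usage lines is chosen only so the result fits inside the 7x7 toy mask.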

_random_dilate(img)[source]¶

transform(results)[source]¶

The transform function of RandomMaskDilation.

Parameters: results (dict) – A dict containing the necessary information and data for augmentation.
Returns: A dict containing the processed data and information.
Return type: dict

__repr__()[source]¶: Return repr(self).

class mmagic.datasets.transforms.aug_pixel.UnsharpMasking(kernel_size, sigma, weight, threshold, keys)[source]¶

Bases: mmcv.transforms.BaseTransform

Apply unsharp masking to an image or a sequence of images.

Parameters

kernel_size (int) – The kernel_size of the Gaussian kernel.
sigma (float) – The standard deviation of the Gaussian.
weight (float) – The weight of the “details” in the final output.
threshold (float) – Pixel differences larger than this value are regarded as “details”.
keys (list[str]) – The keys whose values are processed.

Added keys are “xxx_unsharp”, where “xxx” are the attributes specified in “keys”.
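The roles of kernel_size, sigma, weight and threshold can be seen in a sketch of a common unsharp masking recipe: blur the image, treat large differences from the blur as details, and add the weighted details back where a softened detail mask is active. This is an assumption-laden illustration for 2-D grayscale images with an odd kernel_size, not the library's code; `gaussian_blur`, `unsharp_mask` and the key name `'gt'` are hypothetical.

```python
import numpy as np


def gaussian_kernel1d(kernel_size, sigma):
    """Normalised 1-D Gaussian kernel."""
    x = np.arange(kernel_size) - (kernel_size - 1) / 2.0
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()


def gaussian_blur(img, kernel_size, sigma):
    """Separable Gaussian blur of a 2-D image (odd kernel_size assumed)."""
    k = gaussian_kernel1d(kernel_size, sigma)
    padded = np.pad(img, kernel_size // 2, mode='reflect')
    # Filter along rows, then columns; 'valid' removes the padding again.
    rows = np.apply_along_axis(np.convolve, 1, padded, k, mode='valid')
    return np.apply_along_axis(np.convolve, 0, rows, k, mode='valid')


def unsharp_mask(img, kernel_size, sigma, weight, threshold):
    img = img.astype(np.float32)
    residue = img - gaussian_blur(img, kernel_size, sigma)
    # Differences larger than the threshold count as details.
    mask = (np.abs(residue) > threshold).astype(np.float32)
    soft_mask = gaussian_blur(mask, kernel_size, sigma)
    sharpened = np.clip(img + weight * residue, 0, 255)
    # Blend sharpened and original pixels according to the soft mask.
    return soft_mask * sharpened + (1 - soft_mask) * img


# A vertical step edge: dark on the left, bright on the right.
edge = np.full((16, 16), 50.0)
edge[:, 8:] = 200.0
results = {'gt': edge}
results['gt_unsharp'] = unsharp_mask(results['gt'], kernel_size=5, sigma=1.5,
                                     weight=0.5, threshold=10)
```

In this sketch, flat regions pass through unchanged, while pixels next to the edge overshoot on the bright side and undershoot on the dark side, which is the visual effect of sharpening.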

_unsharp_masking(imgs)[source]¶: Unsharp masking function.

transform(results)[source]¶

The transform function of UnsharpMasking.

Parameters: results (dict) – A dict containing the necessary information and data for augmentation.
Returns: A dict containing the processed data and information.
Return type: dict

__repr__()[source]¶: Return repr(self).
