`mmagic.datasets.transforms.aug_pixel`

Module Contents

Classes

`BinarizeImage`	Binarize image.
`Clip`	Clip the pixels.
`ColorJitter`	An interface for torch color jitter so that it can be invoked in the mmagic pipeline.
`RandomAffine`	Apply random affine to input images.
`RandomMaskDilation`	Randomly dilate binary masks.
`UnsharpMasking`	Apply unsharp masking to an image or a sequence of images.

class mmagic.datasets.transforms.aug_pixel.BinarizeImage(keys, binary_thr, a_min=0, a_max=1, dtype=np.uint8)[source]

Bases: mmcv.transforms.BaseTransform

Binarize image.

Parameters

keys (Sequence[str]) – The images to be binarized.
binary_thr (float) – Threshold for binarization.
a_min (int) – Lower limit of the pixel value.
a_max (int) – Upper limit of the pixel value.
dtype (np.dtype) – Data type of the output. Default: np.uint8

_binarize(img)[source]

Binarize image.

Parameters: img (np.ndarray) – Input image.
Returns: Output image.
Return type: np.ndarray

transform(results)[source]

The transform function of BinarizeImage.

Parameters: results (dict) – A dict containing the necessary information and data for augmentation.
Returns: A dict containing the processed data and information.
Return type: dict

__repr__()[source]: Return repr(self).
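To make the roles of binary_thr, a_min, a_max and dtype concrete, here is a minimal NumPy sketch of thresholding. It is not the mmagic implementation: the helper name `binarize_sketch`, the strict `>` comparison, and the mapping of below/above-threshold pixels to a_min/a_max are all assumptions made for illustration.

```python
import numpy as np


def binarize_sketch(img, binary_thr, a_min=0, a_max=1, dtype=np.uint8):
    """Hypothetical helper: threshold `img` and emit only a_min / a_max.

    Assumption: pixels strictly above `binary_thr` become `a_max`,
    all others become `a_min`. The real transform may differ.
    """
    above = img > binary_thr              # boolean mask of "foreground" pixels
    out = np.where(above, a_max, a_min)   # map True -> a_max, False -> a_min
    return out.astype(dtype)


img = np.array([[0.1, 0.6], [0.5, 0.9]], dtype=np.float32)
binary = binarize_sketch(img, binary_thr=0.5)
```

In a pipeline config the transform itself would typically be referenced by name, with the arguments from the signature above.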

class mmagic.datasets.transforms.aug_pixel.Clip(keys, a_min=0, a_max=255)[source]

Bases: mmcv.transforms.BaseTransform

Clip the pixels.

Modified keys are the attributes specified in “keys”.

Parameters

keys (list[str]) – The keys whose values are clipped.
a_min (int) – Lower limit of the pixel value.
a_max (int) – Upper limit of the pixel value.

_clip(input_)[source]

Clip the pixels.

Parameters: input_ (Union[List, np.ndarray]) – Pixels to clip.
Returns: Clipped pixels.
Return type: Union[List, np.ndarray]

transform(results)[source]

The transform function of Clip.

Parameters

results (dict) – A dict containing the necessary information and data for augmentation.

Returns

A dict in which the values of the specified keys are rounded and clipped.

Return type

dict

__repr__()[source]: Return repr(self).
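Since _clip accepts either a single array or a list of arrays, a short sketch may help show both cases. This is an illustration built on np.clip, not the library code; the helper name `clip_sketch` is hypothetical, and the rounding mentioned in the return description is not reproduced here.

```python
import numpy as np


def clip_sketch(input_, a_min=0, a_max=255):
    """Hypothetical helper: clip one array, or every array in a list."""
    if isinstance(input_, np.ndarray):
        return np.clip(input_, a_min, a_max)
    # Assumed list handling: clip each element and keep the list structure.
    return [np.clip(v, a_min, a_max) for v in input_]


frame = np.array([-10.0, 12.5, 300.0])
clipped = clip_sketch(frame)                    # single image
clipped_list = clip_sketch([frame, frame * 2])  # sequence of images
```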

class mmagic.datasets.transforms.aug_pixel.ColorJitter(keys, channel_order='rgb', **kwargs)[source]

Bases: mmcv.transforms.BaseTransform

An interface for torch color jitter so that it can be invoked in mmagic pipeline.

Randomly change the brightness, contrast and saturation of an image. Modified keys are the attributes specified in “keys”.

Required Keys:

[KEYS]

Modified Keys:

[KEYS]

Parameters

keys (list[str]) – The images to be color-jittered.
channel_order (str) – Order of channels; candidates are ‘bgr’ and ‘rgb’. Default: ‘rgb’.

Note

**kwargs follows the argument list of torchvision.transforms.ColorJitter.

brightness (float or tuple of float (min, max)) – How much to jitter brightness. brightness_factor is chosen uniformly from [max(0, 1 - brightness), 1 + brightness] or the given [min, max]. Should be non-negative numbers.
contrast (float or tuple of float (min, max)) – How much to jitter contrast. contrast_factor is chosen uniformly from [max(0, 1 - contrast), 1 + contrast] or the given [min, max]. Should be non-negative numbers.
saturation (float or tuple of float (min, max)) – How much to jitter saturation. saturation_factor is chosen uniformly from [max(0, 1 - saturation), 1 + saturation] or the given [min, max]. Should be non-negative numbers.
hue (float or tuple of float (min, max)) – How much to jitter hue. hue_factor is chosen uniformly from [-hue, hue] or the given [min, max]. Should have 0 <= hue <= 0.5 or -0.5 <= min <= max <= 0.5.
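The interval rules for the jitter arguments can be written out as a small pure-Python sketch. The helper `factor_interval` and its `center` and `floor` parameters are hypothetical names introduced here; the code only mirrors the ranges stated in the note and does not call torchvision.

```python
import random


def factor_interval(value, center=1.0, floor=0.0):
    """Hypothetical helper: turn a float or (min, max) pair into an interval.

    A float v gives [max(floor, center - v), center + v];
    a (min, max) pair is used as given.
    """
    if isinstance(value, (tuple, list)):
        low, high = value
        return float(low), float(high)
    low = center - value
    if floor is not None:
        low = max(floor, low)
    return low, center + value


brightness_interval = factor_interval(0.4)
contrast_interval = factor_interval(1.5)            # lower end is floored at 0
saturation_interval = factor_interval((0.8, 1.1))   # explicit (min, max)
hue_interval = factor_interval(0.1, center=0.0, floor=None)  # [-hue, hue]

rng = random.Random(0)
brightness_factor = rng.uniform(*brightness_interval)
```

Brightness, contrast and saturation are centered on 1 and floored at 0, while hue is centered on 0 and symmetric, which is why the sketch passes a different `center` and disables the floor for hue.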

_color_jitter(image, this_seed)[source]

Color Jitter Function.

Parameters

image (np.ndarray) – Image.
this_seed (int) – Seed of torch.

Returns

The output image.

Return type

np.ndarray

transform(results: Dict) → Dict[source]

The transform function of ColorJitter.

Parameters: results (dict) – The result dict.
Returns: The result dict.
Return type: dict

__repr__()[source]: Return repr(self).

class mmagic.datasets.transforms.aug_pixel.RandomAffine(keys, degrees, translate=None, scale=None, shear=None, flip_ratio=None)[source]

Bases: mmcv.transforms.BaseTransform

Apply random affine to input images.

This class is adapted from torchvision (https://github.com/pytorch/vision/blob/v0.5.0/torchvision/transforms/transforms.py#L1015). It should be noted that the implementation in GCA-Matting (https://github.com/Yaoyi-Li/GCA-Matting/blob/master/dataloader/data_generator.py#L70) adds random flip. See the explanation of flip_ratio below. Required keys are the keys in attribute “keys”; modified keys are the keys in attribute “keys”.

Parameters

keys (Sequence[str]) – The images to be transformed.
degrees (float | tuple[float]) – Range of degrees to select from. If it is a float instead of a tuple like (min, max), the range of degrees will be (-degrees, +degrees). Set to 0 to deactivate rotations.
translate (tuple, optional) – Tuple of maximum absolute fractions for horizontal and vertical translations. For example, with translate=(a, b), the horizontal shift is randomly sampled in the range -img_width * a < dx < img_width * a and the vertical shift is randomly sampled in the range -img_height * b < dy < img_height * b. Default: None.
scale (tuple, optional) – Scaling factor interval, e.g. (a, b); the scale is randomly sampled from the range a <= scale <= b. Default: None.
shear (float | tuple[float], optional) – Range of shear degrees to select from. If shear is a float, a shear parallel to the x axis and a shear parallel to the y axis in the range (-shear, +shear) will be applied. If shear is a tuple of 2 values, an x-axis shear and a y-axis shear in (shear[0], shear[1]) will be applied. Default: None.
flip_ratio (float, optional) – Probability of the image being flipped. The flips in the horizontal and vertical directions are independent, so the image may be flipped in both directions. Default: None.

static _get_params(degrees, translate, scale_ranges, shears, flip_ratio, img_size)[source]

Get parameters for affine transformation.

Returns: Params to be passed to the affine transformation.
Return type: tuple
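The sampling ranges documented for degrees, translate, scale, shear and flip_ratio can be sketched in pure Python as follows. This is not the body of _get_params: the function name `sample_affine_params`, the (width, height) order of `img_size`, the independent x/y scale draw, and the encoding of a flip as -1 are assumptions made for illustration.

```python
import random


def sample_affine_params(degrees, translate, scale, shear, flip_ratio,
                         img_size, rng=random):
    """Hypothetical sampler that follows the documented parameter ranges."""
    if isinstance(degrees, (int, float)):
        degrees = (-degrees, degrees)
    angle = rng.uniform(degrees[0], degrees[1])

    if translate is not None:
        width, height = img_size          # assumed order: (width, height)
        max_dx = translate[0] * width
        max_dy = translate[1] * height
        shift = (rng.uniform(-max_dx, max_dx), rng.uniform(-max_dy, max_dy))
    else:
        shift = (0.0, 0.0)

    if scale is not None:
        # Assumption: x and y factors are drawn independently.
        scales = (rng.uniform(scale[0], scale[1]),
                  rng.uniform(scale[0], scale[1]))
    else:
        scales = (1.0, 1.0)

    if shear is None:
        shear_angle = 0.0
    else:
        if isinstance(shear, (int, float)):
            shear = (-shear, shear)
        shear_angle = rng.uniform(shear[0], shear[1])

    if flip_ratio is not None:
        # Horizontal and vertical flips are decided independently.
        flip = tuple(-1 if rng.random() < flip_ratio else 1 for _ in range(2))
    else:
        flip = (1, 1)

    return angle, shift, scales, shear_angle, flip


params = sample_affine_params(
    degrees=30, translate=(0.1, 0.2), scale=(0.8, 1.2), shear=10,
    flip_ratio=0.5, img_size=(100, 50), rng=random.Random(0))
```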

static _get_inverse_affine_matrix(center, angle, translate, scale, shear, flip)[source]

Helper method to compute inverse matrix for affine transformation.

As explained in PIL.Image.rotate, we need to compute the INVERSE of the affine transformation matrix M = T * C * RSS * C^-1, where:

T is the translation matrix:

[1, 0, tx]
[0, 1, ty]
[0, 0, 1]

C is the translation matrix that keeps the center:

[1, 0, cx]
[0, 1, cy]
[0, 0, 1]

RSS is the rotation matrix with scale and shear.

This differs from the original function in torchvision in two ways:

1. The order is changed to flip -> scale -> rotation -> shear.
2. x and y have different scale factors.

RSS(shear, a, scale, f) =

[cos(a + shear)*scale_x*f, -sin(a + shear)*scale_y, 0]
[sin(a)*scale_x*f, cos(a)*scale_y, 0]
[0, 0, 1]

Thus, the inverse is M^-1 = C * RSS^-1 * C^-1 * T^-1.
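The factorization of the inverse can be checked numerically with a small NumPy sketch. The helper names `translation` and `rss` are hypothetical, angles are assumed to be given in degrees, and `f` is assumed to be a flip sign of +1 or -1; the matrices follow the definitions stated above rather than the library source.

```python
import math

import numpy as np


def translation(tx, ty):
    """3x3 homogeneous translation matrix."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])


def rss(angle, shear, scale_x, scale_y, f):
    """Rotation-scale-shear matrix as written above (angles in degrees)."""
    a, s = math.radians(angle), math.radians(shear)
    return np.array([
        [math.cos(a + s) * scale_x * f, -math.sin(a + s) * scale_y, 0.0],
        [math.sin(a) * scale_x * f, math.cos(a) * scale_y, 0.0],
        [0.0, 0.0, 1.0],
    ])


center, shift = (32.0, 24.0), (5.0, -3.0)
T, T_inv = translation(*shift), translation(-shift[0], -shift[1])
C, C_inv = translation(*center), translation(-center[0], -center[1])
R = rss(angle=30.0, shear=10.0, scale_x=1.2, scale_y=0.9, f=-1.0)

M = T @ C @ R @ C_inv                          # forward transform
M_inv = C @ np.linalg.inv(R) @ C_inv @ T_inv   # factorized inverse
```

Inverting only the 3x3 RSS block and negating the translations avoids a general inverse of M, which is the point of the factorized form.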

transform(results)[source]

The transform function of RandomAffine.

Parameters: results (dict) – A dict containing the necessary information and data for augmentation.
Returns: A dict containing the processed data and information.
Return type: dict

__repr__()[source]: Return repr(self).

class mmagic.datasets.transforms.aug_pixel.RandomMaskDilation(keys, binary_thr=0.0, kernel_min=9, kernel_max=49)[source]

Bases: mmcv.transforms.BaseTransform

Randomly dilate binary masks.

Parameters

keys (Sequence[str]) – The masks to be dilated.
binary_thr (float) – Threshold for obtaining binary mask. Default: 0.
kernel_min (int) – Min size of dilation kernel. Default: 9.
kernel_max (int) – Max size of dilation kernel. Default: 49.

_random_dilate(img)[source]

transform(results)[source]

The transform function of RandomMaskDilation.

Parameters: results (dict) – A dict containing the necessary information and data for augmentation.
Returns: A dict containing the processed data and information.
Return type: dict

__repr__()[source]: Return repr(self).
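As an illustration of how binary_thr, kernel_min and kernel_max might interact in RandomMaskDilation, here is a NumPy-only sketch of random binary dilation. It is not the mmagic implementation: the helper names are hypothetical, the square all-ones kernel, the inclusive kernel_max and the strict `>` threshold are assumptions, and a sliding-window maximum stands in for whatever dilation routine the library actually calls.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def dilate_sketch(mask, kernel_size):
    """Binary dilation of a 2-D mask with a square all-ones kernel."""
    before = kernel_size // 2
    after = kernel_size - 1 - before
    padded = np.pad(mask, ((before, after), (before, after)), mode="constant")
    windows = sliding_window_view(padded, (kernel_size, kernel_size))
    return windows.max(axis=(-2, -1))     # max over each window = dilation


def random_mask_dilation_sketch(img, binary_thr=0.0, kernel_min=9,
                                kernel_max=49, rng=None):
    """Hypothetical helper: binarize, then dilate with a random kernel size."""
    rng = np.random.default_rng() if rng is None else rng
    # Assumption: kernel_max is inclusive.
    kernel_size = int(rng.integers(kernel_min, kernel_max + 1))
    binary = (img > binary_thr).astype(np.uint8)
    return dilate_sketch(binary, kernel_size), kernel_size


mask = np.zeros((21, 21), dtype=np.float32)
mask[10, 10] = 1.0                        # a single foreground pixel
dilated, used_kernel = random_mask_dilation_sketch(
    mask, kernel_min=3, kernel_max=5, rng=np.random.default_rng(0))
```

A single foreground pixel grows into a used_kernel x used_kernel block, which makes the effect of the sampled kernel size easy to inspect.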

class mmagic.datasets.transforms.aug_pixel.UnsharpMasking(kernel_size, sigma, weight, threshold, keys)[source]

Bases: mmcv.transforms.BaseTransform

Apply unsharp masking to an image or a sequence of images.

Parameters

kernel_size (int) – The kernel_size of the Gaussian kernel.
sigma (float) – The standard deviation of the Gaussian.
weight (float) – The weight of the “details” in the final output.
threshold (float) – Pixel differences larger than this value are regarded as “details”.
keys (list[str]) – The keys whose values are processed.

Added keys are “xxx_unsharp”, where “xxx” are the attributes specified in “keys”.

_unsharp_masking(imgs)[source]: Unsharp masking function.

transform(results)[source]

The transform function of UnsharpMasking.

Parameters: results (dict) – A dict containing the necessary information and data for augmentation.
Returns: A dict containing the processed data and information.
Return type: dict

__repr__()[source]: Return repr(self).
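The roles of kernel_size, sigma, weight and threshold in UnsharpMasking can be illustrated with a simplified NumPy sketch for a single 2-D float image. It is a generic form of unsharp masking, not the library's code: the helper names are hypothetical, kernel_size is assumed to be odd, the threshold is compared on the image's own value scale, and no soft mask or output clipping is applied.

```python
import numpy as np


def gaussian_kernel_1d(kernel_size, sigma):
    """Normalized 1-D Gaussian kernel."""
    offsets = np.arange(kernel_size) - (kernel_size - 1) / 2.0
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    return kernel / kernel.sum()


def gaussian_blur(img, kernel_size, sigma):
    """Separable Gaussian blur with reflect padding (odd kernel_size)."""
    kernel = gaussian_kernel_1d(kernel_size, sigma)
    padded = np.pad(img, kernel_size // 2, mode="reflect")
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")


def unsharp_masking_sketch(img, kernel_size, sigma, weight, threshold):
    """Add weighted details back where the residue exceeds the threshold."""
    residue = img - gaussian_blur(img, kernel_size, sigma)   # the "details"
    detail_mask = np.abs(residue) > threshold
    return np.where(detail_mask, img + weight * residue, img)


flat = np.full((16, 16), 0.5)
flat_out = unsharp_masking_sketch(flat, 5, 1.0, 0.5, 0.01)

edge = np.zeros((16, 16))
edge[:, 8:] = 1.0                          # vertical step edge
edge_out = unsharp_masking_sketch(edge, 5, 1.0, 0.5, 0.01)
```

A flat image has no details and is returned unchanged, while the step edge overshoots on both sides, which is the characteristic sharpening effect.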
