mmagic.datasets.transforms.aug_pixel
Module Contents
Classes
BinarizeImage | Binarize image.
Clip | Clip the pixels.
ColorJitter | An interface for torch color jitter so that it can be invoked in mmagic pipeline.
RandomAffine | Apply random affine to input images.
RandomMaskDilation | Randomly dilate binary masks.
UnsharpMasking | Apply unsharp masking to an image or a sequence of images.
- class mmagic.datasets.transforms.aug_pixel.BinarizeImage(keys, binary_thr, a_min=0, a_max=1, dtype=np.uint8)[source]
Bases: mmcv.transforms.BaseTransform
Binarize image.
- Parameters
keys (Sequence[str]) – The images to be binarized.
binary_thr (float) – Threshold for binarization.
a_min (int) – Lower limit of pixel value. Default: 0.
a_max (int) – Upper limit of pixel value. Default: 1.
dtype (np.dtype) – Data type of the output. Default: np.uint8.
- _binarize(img)[source]
Binarize image.
- Parameters
img (np.ndarray) – Input image.
- Returns
Output image.
- Return type
np.ndarray
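The behaviour described by the parameters above can be sketched in plain NumPy. This is a minimal illustration, not the mmagic implementation: the function name `binarize` is hypothetical, and the strict `>` comparison against `binary_thr` is an assumption.

```python
import numpy as np


def binarize(img, binary_thr, a_min=0, a_max=1, dtype=np.uint8):
    """Hypothetical helper: threshold `img`, then map the result to a_min/a_max."""
    # Assumption: pixels strictly above the threshold become a_max.
    above = img > binary_thr
    out = np.where(above, a_max, a_min)
    return out.astype(dtype)


img = np.array([[0.1, 0.6], [0.5, 0.9]], dtype=np.float32)
result = binarize(img, binary_thr=0.5, a_min=0, a_max=255)
```

Here `a_min` and `a_max` act as the two output levels, so choosing `a_max=255` with the default `dtype` yields an 8-bit mask.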
- class mmagic.datasets.transforms.aug_pixel.Clip(keys, a_min=0, a_max=255)[source]
Bases: mmcv.transforms.BaseTransform
Clip the pixels.
Modified keys are the attributes specified in “keys”.
- Parameters
keys (list[str]) – The keys whose values are clipped.
a_min (int) – Lower limit of pixel value. Default: 0.
a_max (int) – Upper limit of pixel value. Default: 255.
- _clip(input_)[source]
Clip the pixels.
- Parameters
input_ (Union[List, np.ndarray]) – Pixels to clip.
- Returns
Clipped pixels.
- Return type
Union[List, np.ndarray]
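Since `_clip` accepts either a single array or a list, a rough equivalent can be written with `np.clip`. The helper name `clip_pixels` is hypothetical, and treating a list as a sequence of frames that are clipped one by one is an assumption.

```python
import numpy as np


def clip_pixels(input_, a_min=0, a_max=255):
    """Hypothetical helper: clip one array, or every array in a list."""
    if isinstance(input_, list):
        # Assumption: a list holds several frames, each clipped independently.
        return [np.clip(frame, a_min, a_max) for frame in input_]
    return np.clip(input_, a_min, a_max)


frame = np.array([-20, 100, 300])
clipped = clip_pixels(frame)
clipped_list = clip_pixels([frame, frame + 10])
```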
- class mmagic.datasets.transforms.aug_pixel.ColorJitter(keys, channel_order='rgb', **kwargs)[source]
Bases: mmcv.transforms.BaseTransform
An interface for torch color jitter so that it can be invoked in mmagic pipeline.
Randomly change the brightness, contrast and saturation of an image. Modified keys are the attributes specified in “keys”.
Required Keys:
[KEYS]
Modified Keys:
[KEYS]
- Parameters
keys (list[str]) – The images to be color-jittered.
channel_order (str) – Order of channel, candidates are ‘bgr’ and ‘rgb’. Default: ‘rgb’.
Note
**kwargs follows the argument list of torchvision.transforms.ColorJitter:
- brightness (float or tuple of float (min, max)): How much to jitter brightness. brightness_factor is chosen uniformly from [max(0, 1 - brightness), 1 + brightness] or the given [min, max]. Should be non-negative numbers.
- contrast (float or tuple of float (min, max)): How much to jitter contrast. contrast_factor is chosen uniformly from [max(0, 1 - contrast), 1 + contrast] or the given [min, max]. Should be non-negative numbers.
- saturation (float or tuple of float (min, max)): How much to jitter saturation. saturation_factor is chosen uniformly from [max(0, 1 - saturation), 1 + saturation] or the given [min, max]. Should be non-negative numbers.
- hue (float or tuple of float (min, max)): How much to jitter hue. hue_factor is chosen uniformly from [-hue, hue] or the given [min, max]. Should have 0 <= hue <= 0.5 or -0.5 <= min <= max <= 0.5.
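The sampling intervals listed in the note can be computed with a few lines of plain Python. This is only a sketch of the interval rules as stated above; `jitter_range` and `hue_range` are hypothetical helpers and are not part of torchvision or mmagic.

```python
def jitter_range(value, center=1.0, lower=0.0):
    """Interval used for brightness, contrast or saturation factors."""
    if isinstance(value, (tuple, list)):
        lo, hi = value  # an explicit [min, max] is used as given
    else:
        if value < 0:
            raise ValueError("jitter strength should be non-negative")
        # [max(0, 1 - value), 1 + value]
        lo, hi = max(lower, center - value), center + value
    return float(lo), float(hi)


def hue_range(hue):
    """Interval used for the hue factor, limited to [-0.5, 0.5]."""
    lo, hi = hue if isinstance(hue, (tuple, list)) else (-hue, hue)
    if not -0.5 <= lo <= hi <= 0.5:
        raise ValueError("hue should satisfy -0.5 <= min <= max <= 0.5")
    return float(lo), float(hi)


brightness_interval = jitter_range(0.4)
wide_interval = jitter_range(1.5)  # the lower bound is clamped at 0
hue_interval = hue_range(0.1)
```

A factor would then be drawn uniformly from the returned interval, for example with `random.uniform(*brightness_interval)`.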
- _color_jitter(image, this_seed)[source]
Color jitter function.
- Parameters
image (np.ndarray) – Image.
this_seed (int) – Seed of torch.
- Returns
The output image.
- Return type
np.ndarray
- class mmagic.datasets.transforms.aug_pixel.RandomAffine(keys, degrees, translate=None, scale=None, shear=None, flip_ratio=None)[source]
Bases: mmcv.transforms.BaseTransform
Apply random affine to input images.
This class is adapted from torchvision (https://github.com/pytorch/vision/blob/v0.5.0/torchvision/transforms/transforms.py#L1015). Note that the GCA-Matting data generator (https://github.com/Yaoyi-Li/GCA-Matting/blob/master/dataloader/data_generator.py#L70) adds a random flip; see the explanation of flip_ratio below. Required keys are the keys in attribute “keys”, modified keys are keys in attribute “keys”.
- Parameters
keys (Sequence[str]) – The images to apply the affine transformation to.
degrees (float | tuple[float]) – Range of degrees to select from. If it is a float instead of a tuple like (min, max), the range of degrees will be (-degrees, +degrees). Set to 0 to deactivate rotations.
translate (tuple, optional) – Tuple of maximum absolute fraction for horizontal and vertical translations. For example translate=(a, b), then horizontal shift is randomly sampled in the range -img_width * a < dx < img_width * a and vertical shift is randomly sampled in the range -img_height * b < dy < img_height * b. Default: None.
scale (tuple, optional) – Scaling factor interval, e.g. (a, b); the scale is then randomly sampled from the range a <= scale <= b. Default: None.
shear (float | tuple[float], optional) – Range of shear degrees to select from. If shear is a float, a shear parallel to the x axis and a shear parallel to the y axis in the range (-shear, +shear) will be applied. If shear is a tuple of 2 values, an x-axis shear and a y-axis shear in (shear[0], shear[1]) will be applied. Default: None.
flip_ratio (float, optional) – Probability of the image being flipped. The flips in horizontal direction and vertical direction are independent. The image may be flipped in both directions. Default: None.
- static _get_params(degrees, translate, scale_ranges, shears, flip_ratio, img_size)[source]
Get parameters for affine transformation.
- Returns
Params to be passed to the affine transformation.
- Return type
tuple
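The parameter descriptions above suggest how one set of affine parameters could be drawn. The following is a simplified sketch, not the body of `_get_params`: the helper name, the `(width, height)` order of `img_size`, the single scale factor, and the single shear angle are all assumptions made for illustration.

```python
import random


def sample_affine_params(degrees, translate, scale, shear, flip_ratio,
                         img_size, rng=random):
    """Hypothetical helper: draw one set of affine parameters."""
    width, height = img_size  # assumed (width, height) order

    # A single number means the symmetric range (-degrees, +degrees).
    if isinstance(degrees, (int, float)):
        degrees = (-degrees, degrees)
    angle = rng.uniform(degrees[0], degrees[1])

    # Shifts are bounded by a fraction of the image size.
    if translate is not None:
        max_dx = translate[0] * width
        max_dy = translate[1] * height
        shift = (rng.uniform(-max_dx, max_dx), rng.uniform(-max_dy, max_dy))
    else:
        shift = (0.0, 0.0)

    zoom = rng.uniform(scale[0], scale[1]) if scale is not None else 1.0

    if shear is None:
        shear_angle = 0.0
    elif isinstance(shear, (int, float)):
        shear_angle = rng.uniform(-shear, shear)
    else:
        shear_angle = rng.uniform(shear[0], shear[1])

    # Horizontal and vertical flips are decided independently.
    if flip_ratio is not None:
        flip = (rng.random() < flip_ratio, rng.random() < flip_ratio)
    else:
        flip = (False, False)

    return angle, shift, zoom, shear_angle, flip


random.seed(0)
params = sample_affine_params(30, (0.1, 0.2), (0.8, 1.2), 10, 0.5, (100, 50))
identity = sample_affine_params(0, None, None, None, None, (100, 50))
```

With `degrees=0` and every optional argument left as `None`, the sampled parameters describe the identity transform.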
- static _get_inverse_affine_matrix(center, angle, translate, scale, shear, flip)[source]
Helper method to compute the inverse matrix for the affine transformation.
As explained in PIL.Image.rotate, we need to compute the INVERSE of the affine transformation matrix M = T * C * RSS * C^-1, where:
- T is the translation matrix:
[1, 0, tx | 0, 1, ty | 0, 0, 1];
- C is the translation matrix that keeps the center:
[1, 0, cx | 0, 1, cy | 0, 0, 1];
- RSS is the rotation with scale and shear matrix.
This differs from the original function in torchvision in two ways:
1. The order is changed to flip -> scale -> rotation -> shear.
2. x and y have different scale factors.
RSS(shear, a, scale, f) =
[ cos(a + shear)*scale_x*f, -sin(a + shear)*scale_y, 0 ]
[ sin(a)*scale_x*f, cos(a)*scale_y, 0 ]
[ 0, 0, 1 ]
Thus, the inverse is M^-1 = C * RSS^-1 * C^-1 * T^-1.
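The factorisation of the inverse can be checked numerically. The sketch below builds M from the matrices given above and verifies that C * RSS^-1 * C^-1 * T^-1 undoes it. The helper names are hypothetical, angles are assumed to be in radians, and f is assumed to be +1 or -1.

```python
import math

import numpy as np


def translation(tx, ty):
    """3x3 homogeneous translation matrix."""
    return np.array([[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]])


def rss(shear, a, scale_x, scale_y, f):
    """Rotation with scale and shear, following the matrix in the text."""
    return np.array([
        [math.cos(a + shear) * scale_x * f, -math.sin(a + shear) * scale_y, 0.0],
        [math.sin(a) * scale_x * f, math.cos(a) * scale_y, 0.0],
        [0.0, 0.0, 1.0],
    ])


T, T_inv = translation(5, -3), translation(-5, 3)
C, C_inv = translation(32, 24), translation(-32, -24)
R = rss(math.radians(10), math.radians(30), 1.2, 0.8, -1)

M = T @ C @ R @ C_inv                          # forward transform
M_inv = C @ np.linalg.inv(R) @ C_inv @ T_inv   # inverse, factor by factor

# Largest deviation of M @ M_inv from the identity matrix.
residual = np.abs(M @ M_inv - np.eye(3)).max()
```

The residual stays at floating-point noise level as long as RSS is invertible, which requires cos(shear) to be non-zero.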
- class mmagic.datasets.transforms.aug_pixel.RandomMaskDilation(keys, binary_thr=0.0, kernel_min=9, kernel_max=49)[source]
Bases: mmcv.transforms.BaseTransform
Randomly dilate binary masks.
- Parameters
keys (Sequence[str]) – The masks to be dilated.
binary_thr (float) – Threshold for obtaining binary mask. Default: 0.
kernel_min (int) – Min size of dilation kernel. Default: 9.
kernel_max (int) – Max size of dilation kernel. Default: 49.
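A NumPy-only sketch of the idea follows: pick a kernel size between `kernel_min` and `kernel_max`, dilate with a square all-ones kernel, then re-binarize with `binary_thr`. The helper name, the square kernel, the explicit shift-and-maximum dilation, and the float32 output are assumptions; the actual transform may rely on an image library instead.

```python
import numpy as np


def random_dilate(mask, binary_thr=0.0, kernel_min=9, kernel_max=49, rng=None):
    """Hypothetical helper: dilate a non-negative 2-D mask with a random kernel."""
    rng = np.random.default_rng() if rng is None else rng
    k = int(rng.integers(kernel_min, kernel_max + 1))  # upper bound inclusive

    # Zero-pad so that a k x k window exists around every pixel.
    before, after = k // 2, k - 1 - k // 2
    padded = np.pad(mask, ((before, after), (before, after)))

    # Dilation with an all-ones kernel is a sliding-window maximum.
    h, w = mask.shape
    dilated = padded[:h, :w].copy()
    for dy in range(k):
        for dx in range(k):
            dilated = np.maximum(dilated, padded[dy:dy + h, dx:dx + w])

    return (dilated > binary_thr).astype(np.float32), k


mask = np.zeros((7, 7), dtype=np.float32)
mask[3, 3] = 1.0
dilated_mask, kernel_size = random_dilate(mask, kernel_min=3, kernel_max=3)
```

With the kernel size fixed to 3, the single foreground pixel grows into a 3 x 3 block.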
- class mmagic.datasets.transforms.aug_pixel.UnsharpMasking(kernel_size, sigma, weight, threshold, keys)[source]
Bases: mmcv.transforms.BaseTransform
Apply unsharp masking to an image or a sequence of images.
- Parameters
kernel_size (int) – The size of the Gaussian kernel.
sigma (float) – The standard deviation of the Gaussian.
weight (float) – The weight of the “details” in the final output.
threshold (float) – Pixel differences larger than this value are regarded as “details”.
keys (list[str]) – The keys whose values are processed.
Added keys are “xxx_unsharp”, where “xxx” are the attributes specified in “keys”.
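To make the roles of `weight` and `threshold` concrete, here is a NumPy-only sketch of one common unsharp masking formulation on a single-channel image in [0, 1]. It is not taken from the mmagic source: the helper names, the reflect padding, the soft blending mask, and the comparison of `threshold` on a 0 to 255 scale are all assumptions.

```python
import numpy as np


def gaussian_blur(img, kernel_size, sigma):
    """Separable Gaussian blur of a 2-D array (assumes an odd kernel_size)."""
    offsets = np.arange(kernel_size) - (kernel_size - 1) / 2.0
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(img, kernel_size // 2, mode="reflect")
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")


def unsharp_mask(img, kernel_size, sigma, weight, threshold):
    """Hypothetical helper: sharpen only where the residue exceeds threshold."""
    residue = img - gaussian_blur(img, kernel_size, sigma)  # the "details"
    # Assumption: threshold is expressed on a 0-255 scale.
    mask = (np.abs(residue) * 255 > threshold).astype(np.float64)
    soft_mask = gaussian_blur(mask, kernel_size, sigma)
    sharpened = np.clip(img + weight * residue, 0, 1)
    return soft_mask * sharpened + (1 - soft_mask) * img


img = np.full((9, 9), 0.25, dtype=np.float32)
img[:, 5:] = 0.75  # vertical step edge

results = {"img": img}
# Mirrors the "xxx_unsharp" naming of the added keys.
results["img_unsharp"] = unsharp_mask(results["img"], 5, 1.0, 0.5, 10)
```

Flat regions are left unchanged because their residue is zero, while the contrast across the edge is increased.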