
mmagic.datasets.data_utils

Module Contents

Functions

infer_io_backend(→ str)

Infer the io backend from the given data_root.

calculate_md5(→ str)

Calculate MD5 of the file.

check_md5(→ bool)

Check whether the MD5 of the file matches the target MD5.

check_integrity(→ bool)

Check whether the file is intact by comparing the MD5 of the file with the target MD5.

download_url_to_file(url, dst[, hash_prefix, progress])

Download object at the given URL to a local path.

download_url(url, root[, filename, md5])

Download a file from a url and place it in root.

_is_tarxz(filename)

Judge whether the file is .tar.xz

_is_tar(filename)

Judge whether the file is .tar

_is_targz(filename)

Judge whether the file is .tar.gz

_is_tgz(filename)

Judge whether the file is .tgz

_is_gzip(filename)

Judge whether the file is .gzip

_is_zip(filename)

Judge whether the file is .zip

extract_archive(from_path[, to_path, remove_finished])

Extract the archive.

download_and_extract_archive(url, download_root[, ...])

Download and extract the archive.

open_maybe_compressed_file(path)

Return a file object that possibly decompresses 'path' on the fly.

expanduser(path)

Expand ~ and ~user constructions.

find_folders(→ Tuple[List[str], Dict[str, int]])

Find classes by folders under a root.

get_samples(root, folder_to_idx, is_valid_file, ...)

Make dataset by walking all images under a root.

mmagic.datasets.data_utils.infer_io_backend(data_root: str) → str [source]

Infer the io backend from the given data_root.

Parameters

data_root (str) – The path of data root.

Returns

The io backend.

Return type

str

mmagic.datasets.data_utils.calculate_md5(fpath: str, file_backend: mmengine.fileio.backends.BaseStorageBackend = None, chunk_size: int = 1024 * 1024) → str [source]

Calculate MD5 of the file.

Parameters
  • fpath (str) – The path of the file.

  • file_backend (BaseStorageBackend, optional) – The file backend to fetch the file. Defaults to None.

  • chunk_size (int, optional) – The chunk size to calculate MD5. Defaults to 1024*1024.

Returns

The string of MD5.

Return type

str
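The chunk_size parameter suggests the file is hashed piece by piece rather than loaded whole. The following is a minimal sketch of that idea using only Python's hashlib on a local file. The helper name md5_of_file is hypothetical, and the sketch ignores the file_backend argument, so it is not the library's implementation.

```python
import hashlib
import os
import tempfile


def md5_of_file(fpath: str, chunk_size: int = 1024 * 1024) -> str:
    """Hypothetical helper: hex MD5 of a local file, read chunk by chunk."""
    md5 = hashlib.md5()
    with open(fpath, 'rb') as f:
        # iter() with a sentinel stops at the first empty read (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b''):
            md5.update(chunk)
    return md5.hexdigest()


# Usage: hash a small temporary file with a deliberately tiny chunk size.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'hello world')
digest = md5_of_file(tmp.name, chunk_size=4)
os.remove(tmp.name)
```

The digest does not depend on the chunk size; a smaller chunk only lowers peak memory use at the cost of more read calls.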

mmagic.datasets.data_utils.check_md5(fpath, md5, **kwargs) → bool [source]

Check whether the MD5 of the file matches the target MD5.

Parameters
  • fpath (str) – The path of the file.

  • md5 (str) – Target MD5 value.

Returns

True if the MD5 of the given file is the same as the target MD5.

Return type

bool

mmagic.datasets.data_utils.check_integrity(fpath, md5=None) → bool [source]

Check whether the file is intact by comparing the MD5 of the file with the target MD5.

Parameters
  • fpath (str) – The path of the file.

  • md5 (str, optional) – The target MD5 value. Defaults to None.

Returns

True if the given file is intact.

Return type

bool
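Since md5 is optional, an integrity check of this kind has to decide what to do when no target is given. The sketch below shows one plausible reading, using a hypothetical helper named is_intact. Treating a missing file as not intact, and an existing file with no target MD5 as intact, are both assumptions and are not stated on this page.

```python
import hashlib
import os
import tempfile
from typing import Optional


def is_intact(fpath: str, md5: Optional[str] = None) -> bool:
    """Hypothetical helper: file exists and matches md5 when one is given."""
    if not os.path.isfile(fpath):
        return False  # assumption: a missing file never counts as intact
    if md5 is None:
        return True  # assumption: without a target MD5, existence is enough
    with open(fpath, 'rb') as f:
        return hashlib.md5(f.read()).hexdigest() == md5


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'data.bin')
    with open(path, 'wb') as f:
        f.write(b'abc')
    good = hashlib.md5(b'abc').hexdigest()
    results = (
        is_intact(path),                                  # no target MD5
        is_intact(path, good),                            # matching MD5
        is_intact(path, '0' * 32),                        # wrong MD5
        is_intact(os.path.join(tmp_dir, 'missing.bin')),  # no such file
    )
```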

mmagic.datasets.data_utils.download_url_to_file(url, dst, hash_prefix=None, progress=True)[source]

Download object at the given URL to a local path.

Modified from https://pytorch.org/docs/stable/hub.html#torch.hub.download_url_to_file

Parameters
  • url (str) – URL of the object to download

  • dst (str) – Full path where object will be saved, e.g. /tmp/temporary_file

  • hash_prefix (string, optional) – If not None, the SHA256 hash of the downloaded file should start with hash_prefix. Defaults to None.

  • progress (bool) – Whether to display a progress bar on stderr. Defaults to True.
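The hash_prefix check can be illustrated without any network access. The sketch below hashes in-memory bytes with hashlib and compares the hex digest against a prefix. The helper name matches_hash_prefix and the sample payload are hypothetical, and how the function reacts to a mismatch is not covered here.

```python
import hashlib


def matches_hash_prefix(data: bytes, hash_prefix: str) -> bool:
    """Hypothetical helper: does the SHA256 hex digest start with the prefix?"""
    digest = hashlib.sha256(data).hexdigest()
    return digest.startswith(hash_prefix)


payload = b'example payload'
full_digest = hashlib.sha256(payload).hexdigest()

# A prefix taken from the real digest is accepted.
prefix_ok = matches_hash_prefix(payload, full_digest[:8])
# Altered data is rejected when compared against the original full digest.
prefix_bad = matches_hash_prefix(payload + b'!', full_digest)
```

A short prefix is convenient to embed in a file name but gives a weaker guarantee than comparing the full digest.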

mmagic.datasets.data_utils.download_url(url, root, filename=None, md5=None)[source]

Download a file from a url and place it in root.

Parameters
  • url (str) – URL to download file from.

  • root (str) – Directory to place downloaded file in.

  • filename (str | None) – Name to save the file under. If filename is None, use the basename of the URL.

  • md5 (str | None) – MD5 checksum of the download. If md5 is None, download without md5 check.
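The filename fallback described above can be sketched with os.path alone. The helper name resolve_target and the example URL are hypothetical, and the sketch only computes the destination path; it downloads nothing.

```python
import os
from typing import Optional


def resolve_target(url: str, root: str, filename: Optional[str] = None) -> str:
    """Hypothetical helper: where a download of `url` would be saved."""
    if not filename:
        # Fall back to the last path component of the URL.
        filename = os.path.basename(url)
    return os.path.join(root, filename)


target = resolve_target('https://example.com/files/archive.tar.gz', 'data')
renamed = resolve_target('https://example.com/files/archive.tar.gz', 'data',
                         filename='train.tar.gz')
```

Note that a plain basename keeps any query string in the URL, so passing filename explicitly is the safer choice for such URLs.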

mmagic.datasets.data_utils._is_tarxz(filename)[source]

Judge whether the file is .tar.xz

mmagic.datasets.data_utils._is_tar(filename)[source]

Judge whether the file is .tar

mmagic.datasets.data_utils._is_targz(filename)[source]

Judge whether the file is .tar.gz

mmagic.datasets.data_utils._is_tgz(filename)[source]

Judge whether the file is .tgz

mmagic.datasets.data_utils._is_gzip(filename)[source]

Judge whether the file is .gzip

mmagic.datasets.data_utils._is_zip(filename)[source]

Judge whether the file is .zip
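The helpers above each test for one archive suffix. As a rough illustration, the same checks can be written as a single table-driven lookup. The function name archive_kind is hypothetical, case-sensitive matching is an assumption, and the .gzip helper is left out because this page does not spell out which suffix it matches.

```python
from typing import Optional

# Longer suffixes come first so that '.tar.gz' is not reported as something
# shorter by accident.
ARCHIVE_SUFFIXES = ('.tar.xz', '.tar.gz', '.tgz', '.tar', '.zip')


def archive_kind(filename: str) -> Optional[str]:
    """Hypothetical helper: return the matching archive suffix, or None."""
    for suffix in ARCHIVE_SUFFIXES:
        if filename.endswith(suffix):
            return suffix
    return None


kinds = [archive_kind(name) for name in
         ('data.tar.xz', 'data.tar', 'data.tar.gz', 'data.tgz',
          'data.zip', 'data.txt')]
```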

mmagic.datasets.data_utils.extract_archive(from_path, to_path=None, remove_finished=False)[source]

Extract the archive.

mmagic.datasets.data_utils.download_and_extract_archive(url, download_root, extract_root=None, filename=None, md5=None, remove_finished=False)[source]

Download and extract the archive.

mmagic.datasets.data_utils.open_maybe_compressed_file(path: str)[source]

Return a file object that possibly decompresses ‘path’ on the fly.

Decompression occurs when argument path is a string and ends with ‘.gz’ or ‘.xz’.
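The suffix rule above maps naturally onto the standard gzip and lzma modules. The following is a minimal sketch under that assumption; the helper name open_maybe_compressed is hypothetical, and returning non-string arguments unchanged and opening in binary mode are guesses rather than documented behaviour.

```python
import gzip
import lzma
import os
import tempfile


def open_maybe_compressed(path):
    """Hypothetical helper: open `path`, decompressing .gz/.xz on the fly."""
    if not isinstance(path, str):
        return path  # assumption: non-string inputs are passed through
    if path.endswith('.gz'):
        return gzip.open(path, 'rb')
    if path.endswith('.xz'):
        return lzma.open(path, 'rb')
    return open(path, 'rb')


with tempfile.TemporaryDirectory() as tmp_dir:
    gz_path = os.path.join(tmp_dir, 'sample.txt.gz')
    with gzip.open(gz_path, 'wb') as f:
        f.write(b'compressed text')
    # The caller reads plain bytes without caring about the compression.
    with open_maybe_compressed(gz_path) as f:
        content = f.read()
```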

mmagic.datasets.data_utils.expanduser(path)[source]

Expand ~ and ~user constructions.

If user or $HOME is unknown, do nothing.
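Python's standard os.path.expanduser documents the same behaviour, so it serves as a reasonable stand-in for illustration. Whether this helper wraps it is an assumption, and the paths below are made up.

```python
import os

# A leading ~ is replaced by the current user's home directory when known.
home_path = os.path.expanduser('~/datasets')

# A path without a leading ~ is returned unchanged.
plain_path = os.path.expanduser('/data/datasets')
```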

mmagic.datasets.data_utils.find_folders(root: str, file_backend: mmengine.fileio.backends.BaseStorageBackend) → Tuple[List[str], Dict[str, int]] [source]

Find classes by folders under a root.

Parameters

root (string) – root directory of folders

Returns

  • folders: The name of sub folders under the root.

  • folder_to_idx: The map from folder name to class idx.

Return type

Tuple[List[str], Dict[str, int]]
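The two return values can be sketched for a local directory with os.scandir. The helper name list_class_folders is hypothetical, it does not use a file_backend, and sorting the folder names before assigning indices is an assumption made so that the mapping is deterministic.

```python
import os
import tempfile
from typing import Dict, List, Tuple


def list_class_folders(root: str) -> Tuple[List[str], Dict[str, int]]:
    """Hypothetical helper: sub-folder names under root and their indices."""
    folders = sorted(
        entry.name for entry in os.scandir(root) if entry.is_dir())
    folder_to_idx = {name: idx for idx, name in enumerate(folders)}
    return folders, folder_to_idx


with tempfile.TemporaryDirectory() as root:
    for name in ('dog', 'cat'):
        os.mkdir(os.path.join(root, name))
    # Plain files directly under the root are not treated as classes.
    open(os.path.join(root, 'README.txt'), 'w').close()
    folders, folder_to_idx = list_class_folders(root)
```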

mmagic.datasets.data_utils.get_samples(root: str, folder_to_idx: Dict[str, int], is_valid_file: Callable, file_backend: mmengine.fileio.backends.BaseStorageBackend)[source]

Make dataset by walking all images under a root.

Parameters
  • root (string) – root directory of folders

  • folder_to_idx (dict) – the map from class name to class idx

  • is_valid_file (Callable) – A function that takes the path of a file and checks whether the file is a valid sample file.

Returns

  • samples: A list of tuples, where each element is (image, class_idx).

  • empty_folders: The folders that do not contain any valid files.

Return type

Tuple[list, set]
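A local-filesystem sketch of this walk is shown below. The helper name collect_samples is hypothetical, it uses os.walk instead of a file_backend, and storing the full file path as the image entry is an assumption; the page only says each element is (image, class_idx).

```python
import os
import tempfile
from typing import Callable, Dict, List, Set, Tuple


def collect_samples(
        root: str, folder_to_idx: Dict[str, int],
        is_valid_file: Callable[[str], bool]
) -> Tuple[List[Tuple[str, int]], Set[str]]:
    """Hypothetical helper: gather (path, class_idx) pairs per class folder."""
    samples, empty_folders = [], set()
    for folder in sorted(folder_to_idx):
        found = False
        for dirpath, dirnames, filenames in os.walk(os.path.join(root, folder)):
            dirnames.sort()  # visit sub-folders in a deterministic order
            for fname in sorted(filenames):
                path = os.path.join(dirpath, fname)
                if is_valid_file(path):
                    samples.append((path, folder_to_idx[folder]))
                    found = True
        if not found:
            empty_folders.add(folder)
    return samples, empty_folders


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, 'cat'))
    os.mkdir(os.path.join(root, 'dog'))  # left without any valid file
    for fname in ('a.png', 'notes.txt'):
        open(os.path.join(root, 'cat', fname), 'w').close()
    samples, empty_folders = collect_samples(
        root, {'cat': 0, 'dog': 1}, lambda p: p.endswith('.png'))
    sample_names = [(os.path.basename(p), idx) for p, idx in samples]
```

Reporting empty folders separately lets the caller decide whether a class without samples is an error or merely worth a warning.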
