mmagic.datasets.data_utils
Module Contents
Functions
|
Infer the io backend from the given data_root. |
|
Calculate MD5 of the file. |
|
Check whether the MD5 of the file matches the target MD5. |
|
Check the integrity of the file by comparing its MD5 with the target MD5. |
|
Download object at the given URL to a local path. |
|
Download a file from a url and place it in root. |
|
Judge whether the file is .tar.xz |
|
Judge whether the file is .tar |
|
Judge whether the file is .tar.gz |
|
Judge whether the file is .tgz |
|
Judge whether the file is .gzip |
|
Judge whether the file is .zip |
|
Extract the archive. |
|
Download and extract the archive. |
|
Return a file object that possibly decompresses 'path' on the fly. |
|
Expand ~ and ~user constructions. |
|
Find classes by folders under a root. |
|
Make dataset by walking all images under a root. |
- mmagic.datasets.data_utils.infer_io_backend(data_root: str) → str
Infer the io backend from the given data_root.
- Parameters
data_root (str) – The path of data root.
- Returns
The io backend.
- Return type
str
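As a rough illustration of what inferring a backend from a path can look like, the sketch below maps a path prefix to a backend name. The prefix rules, the returned names ('http', 'petrel', 'local') and the helper name sketch_infer_io_backend are assumptions made for this example; they are not confirmed to match the logic of infer_io_backend.

```python
def sketch_infer_io_backend(data_root: str) -> str:
    """Hypothetical prefix-based inference; not the library's actual rules."""
    lowered = data_root.lower()
    # Remote HTTP(S) locations (assumed rule).
    if lowered.startswith(("http://", "https://")):
        return "http"
    # Object-storage style paths (assumed rule and backend name).
    if lowered.startswith("s3://"):
        return "petrel"
    # Everything else is treated as a path on the local file system.
    return "local"
```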
- mmagic.datasets.data_utils.calculate_md5(fpath: str, file_backend: mmengine.fileio.backends.BaseStorageBackend = None, chunk_size: int = 1024 * 1024) → str
Calculate MD5 of the file.
- Parameters
fpath (str) – The path of the file.
file_backend (BaseStorageBackend, optional) – The file backend to fetch the file. Defaults to None.
chunk_size (int, optional) – The chunk size to calculate MD5. Defaults to 1024*1024.
- Returns
The string of MD5.
- Return type
str
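The role of chunk_size can be shown with a minimal sketch that hashes a local file in fixed-size pieces, so the whole file never has to sit in memory. It uses only the standard library and plain open(); the helper name md5_of_file is hypothetical, and the file_backend handling of calculate_md5 is not reproduced here.

```python
import hashlib


def md5_of_file(fpath: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex MD5 digest of a local file, read chunk by chunk."""
    md5 = hashlib.md5()
    with open(fpath, "rb") as f:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()
```

A smaller chunk_size lowers peak memory use at the cost of more read calls; the resulting digest is the same either way.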
- mmagic.datasets.data_utils.check_md5(fpath, md5, **kwargs) → bool
Check whether the MD5 of the file matches the target MD5.
- Parameters
fpath (str) – The path of the file.
md5 (str) – Target MD5 value.
- Returns
True if the MD5 of the given file is the same as the target MD5.
- Return type
bool
- mmagic.datasets.data_utils.check_integrity(fpath, md5=None) → bool
Check the integrity of the file by comparing its MD5 with the target MD5.
- Parameters
fpath (str) – The path of the file.
md5 (str, optional) – The target MD5 value. Defaults to None.
- Returns
True if the given file passes the integrity check.
- Return type
bool
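A minimal sketch of an MD5-based integrity check for local files follows. Two behaviours in it are assumptions, not statements about check_integrity: a missing file is reported as not intact, and md5=None skips the comparison. The helper names are hypothetical.

```python
import hashlib
import os
from typing import Optional


def file_md5(fpath: str, chunk_size: int = 1024 * 1024) -> str:
    """Hex MD5 digest of a local file, read in chunks."""
    md5 = hashlib.md5()
    with open(fpath, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def sketch_check_integrity(fpath: str, md5: Optional[str] = None) -> bool:
    """Hypothetical integrity check for a local file."""
    # Assumption: a file that does not exist can never be intact.
    if not os.path.isfile(fpath):
        return False
    # Assumption: with no target MD5, existence alone is enough.
    if md5 is None:
        return True
    return file_md5(fpath) == md5
```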
- mmagic.datasets.data_utils.download_url_to_file(url, dst, hash_prefix=None, progress=True)
Download object at the given URL to a local path.
Modified from https://pytorch.org/docs/stable/hub.html#torch.hub.download_url_to_file
- Parameters
url (str) – URL of the object to download
dst (str) – Full path where the object will be saved, e.g. /tmp/temporary_file.
hash_prefix (str, optional) – If not None, the SHA256 hash of the downloaded file should start with hash_prefix. Defaults to None.
progress (bool) – Whether to display a progress bar on stderr. Defaults to True.
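The hash_prefix check can be illustrated without any network access. The sketch below, with a hypothetical helper name, computes the SHA256 hex digest of some bytes and tests whether it starts with the given prefix; how download_url_to_file reacts to a mismatch is not shown in this page and is not modelled here.

```python
import hashlib


def sha256_matches_prefix(data: bytes, hash_prefix: str) -> bool:
    """Return True if the SHA256 hex digest of `data` starts with `hash_prefix`."""
    digest = hashlib.sha256(data).hexdigest()
    return digest.startswith(hash_prefix)
```

A short prefix (for example the first eight hex characters) is a lightweight way to detect a truncated or corrupted download, though it is weaker than comparing the full digest.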
- mmagic.datasets.data_utils.download_url(url, root, filename=None, md5=None)
Download a file from a url and place it in root.
- Parameters
url (str) – URL to download file from.
root (str) – Directory to place downloaded file in.
filename (str | None) – Name to save the file under. If filename is None, use the basename of the URL.
md5 (str | None) – MD5 checksum of the download. If md5 is None, download without md5 check.
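The filename fallback described above can be sketched as a small path-resolution helper. The name resolve_download_path is hypothetical, and the sketch performs no download; it only shows where a file would land under the stated rule.

```python
import os
from typing import Optional


def resolve_download_path(url: str, root: str,
                          filename: Optional[str] = None) -> str:
    """Return the local path a download would be saved to."""
    if filename is None:
        # Fall back to the last path component of the URL. Any query
        # string in the URL would be kept as part of the name.
        filename = os.path.basename(url)
    return os.path.join(root, filename)
```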
- mmagic.datasets.data_utils.extract_archive(from_path, to_path=None, remove_finished=False)
Extract the archive.
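Since the parameters are not documented here, the following is only a sketch of how suffix-based extraction is commonly written with the standard library. The default for to_path (the archive's own directory), the meaning of remove_finished (delete the archive afterwards), and the set of handled suffixes are assumptions; the helper name is hypothetical.

```python
import os
import tarfile
import zipfile
from typing import Optional


def sketch_extract_archive(from_path: str, to_path: Optional[str] = None,
                           remove_finished: bool = False) -> None:
    """Hypothetical extraction helper; use only on trusted archives."""
    if to_path is None:
        # Assumption: extract next to the archive by default.
        to_path = os.path.dirname(from_path)
    if from_path.endswith((".tar", ".tar.gz", ".tgz", ".tar.xz")):
        # "r:*" lets tarfile detect the compression transparently.
        with tarfile.open(from_path, "r:*") as tar:
            tar.extractall(path=to_path)
    elif from_path.endswith(".zip"):
        with zipfile.ZipFile(from_path, "r") as zf:
            zf.extractall(to_path)
    else:
        raise ValueError(f"Unsupported archive type: {from_path}")
    if remove_finished:
        # Assumption: remove the archive once extraction succeeded.
        os.remove(from_path)
```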
- mmagic.datasets.data_utils.download_and_extract_archive(url, download_root, extract_root=None, filename=None, md5=None, remove_finished=False)
Download and extract the archive.
- mmagic.datasets.data_utils.open_maybe_compressed_file(path: str)
Return a file object that possibly decompresses ‘path’ on the fly.
Decompression occurs when argument path is a string and ends with ‘.gz’ or ‘.xz’.
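The suffix rule above can be sketched with the standard gzip and lzma modules. The helper name is hypothetical; returning a non-string argument unchanged and opening other paths in binary mode are assumptions of this sketch.

```python
import gzip
import lzma


def sketch_open_maybe_compressed(path):
    """Open `path` for binary reading, decompressing .gz/.xz on the fly."""
    if not isinstance(path, str):
        # Assumption: anything that is not a path string is already a file object.
        return path
    if path.endswith(".gz"):
        return gzip.open(path, "rb")
    if path.endswith(".xz"):
        return lzma.open(path, "rb")
    return open(path, "rb")
```

Callers can then read from the returned object the same way regardless of whether the file on disk was compressed.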
- mmagic.datasets.data_utils.expanduser(path)
Expand ~ and ~user constructions.
If user or $HOME is unknown, do nothing.
- mmagic.datasets.data_utils.find_folders(root: str, file_backend: mmengine.fileio.backends.BaseStorageBackend) → Tuple[List[str], Dict[str, int]]
Find classes by folders under a root.
- Parameters
root (str) – Root directory of the folders.
- Returns
folders: The name of sub folders under the root.
folder_to_idx: The map from folder name to class idx.
- Return type
Tuple[List[str], Dict[str, int]]
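The two return values can be illustrated with a local-file-system sketch. It uses os.scandir instead of a file backend, and the alphabetical ordering of folders (which fixes the class indices) is an assumption; the helper name is hypothetical.

```python
import os
from typing import Dict, List, Tuple


def sketch_find_folders(root: str) -> Tuple[List[str], Dict[str, int]]:
    """List sub-folders of `root` and map each name to a class index."""
    # Assumption: folders are sorted so indices are stable across runs.
    folders = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    folder_to_idx = {name: idx for idx, name in enumerate(folders)}
    return folders, folder_to_idx
```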
- mmagic.datasets.data_utils.get_samples(root: str, folder_to_idx: Dict[str, int], is_valid_file: Callable, file_backend: mmengine.fileio.backends.BaseStorageBackend)
Make dataset by walking all images under a root.
- Parameters
root (str) – Root directory of the folders.
folder_to_idx (dict) – The map from class name to class idx.
is_valid_file (Callable) – A function that takes the path of a file and checks whether the file is a valid sample file.
- Returns
samples: A list of tuples, where each element is (image, class_idx).
empty_folders: The folders that do not contain any valid files.
- Return type
Tuple[list, set]
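A local-file-system sketch of the same walk is given below. It uses os.walk rather than a file backend; the sorted traversal order and the choice to record empty folders by name are assumptions, and the helper name is hypothetical.

```python
import os
from typing import Callable, Dict, List, Set, Tuple


def sketch_get_samples(
    root: str,
    folder_to_idx: Dict[str, int],
    is_valid_file: Callable[[str], bool],
) -> Tuple[List[Tuple[str, int]], Set[str]]:
    """Collect (path, class_idx) pairs and the folders with no valid file."""
    samples: List[Tuple[str, int]] = []
    empty_folders: Set[str] = set()
    for folder_name in sorted(folder_to_idx):
        folder = os.path.join(root, folder_name)
        found = False
        # Walk the class folder recursively in a deterministic order.
        for dirpath, _, filenames in sorted(os.walk(folder)):
            for fname in sorted(filenames):
                path = os.path.join(dirpath, fname)
                if is_valid_file(path):
                    samples.append((path, folder_to_idx[folder_name]))
                    found = True
        if not found:
            empty_folders.add(folder_name)
    return samples, empty_folders
```

Here is_valid_file is typically a suffix check, for example `lambda p: p.lower().endswith(".png")`.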