mmagic.evaluation.evaluator
Module Contents¶
Classes¶
Evaluator – Evaluator for generative models.
- class mmagic.evaluation.evaluator.Evaluator(metrics: Union[dict, mmengine.evaluator.BaseMetric, Sequence])[source]¶
Bases: mmengine.evaluator.Evaluator
Evaluator for generative models. Unlike high-level vision tasks, metrics for generative models have various input types. For example, Inception Score (IS, InceptionScore) only needs fake images as input, while Fréchet Inception Distance (FID, FrechetInceptionDistance) needs both real and fake images as input, and the numbers of real and fake images can be set arbitrarily. For Perceptual Path Length (PPL, PerceptualPathLength), the generator needs to sample images along a latent path.
In order to be compatible with different metrics, we designed two critical functions, prepare_metrics() and prepare_samplers(), to support those requirements. prepare_metrics() sets the images' color order and passes the dataloader to all metrics, so that metrics which need pre-processing can prepare their corresponding features. prepare_samplers() passes the dataloader and model to the metrics and obtains the corresponding sampler for each kind of metric. Metrics with the same sample mode can share one sampler.
The whole evaluation process can be found in mmagic.engine.runner.MultiValLoop.run() and mmagic.engine.runner.MultiTestLoop.run().
- Parameters
metrics (dict or BaseMetric or Sequence) – The config of metrics.
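As a usage sketch, the metrics argument is typically a list of metric config dicts. The exact config keys below (type, fake_nums, real_nums) are illustrative assumptions, not guaranteed to match mmagic's real metric configs:

```python
# Hypothetical metric configs for an Evaluator; keys are illustrative only.
metric_configs = [
    dict(type='InceptionScore', fake_nums=50000),
    dict(type='FrechetInceptionDistance', real_nums=50000, fake_nums=50000),
    dict(type='PerceptualPathLength', fake_nums=50000),
]

# The Evaluator would then be built from these configs, e.g.:
# evaluator = Evaluator(metrics=metric_configs)
assert all('type' in cfg for cfg in metric_configs)
```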
- prepare_metrics(module: mmengine.model.BaseModel, dataloader: torch.utils.data.dataloader.DataLoader)[source]¶
Prepare metrics before evaluation starts. Some metrics use a pretrained model to extract features, and the input channel order may vary among those models. Therefore, we first parse the output color order from the data preprocessor and set the color order for each metric. Then we pass the dataloader to each metric to prepare pre-calculated items (e.g. inception features of the real images). If a metric has no pre-calculated items, metric.prepare() will be ignored. Once this function has been called, self.is_ready will be set to True. If self.is_ready is already True, this function returns directly to avoid duplicate computation.
- Parameters
module (BaseModel) – Model to evaluate.
dataloader (DataLoader) – The dataloader for real images.
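The color-order propagation can be sketched with stand-in classes. Everything below (DummyPreprocessor, DummyMetric, the method names) is illustrative; mmagic's real data preprocessor and metric interfaces differ in detail:

```python
class DummyPreprocessor:
    """Stand-in for a data preprocessor that knows its output color order."""
    output_color_order = 'BGR'

class DummyMetric:
    """Stand-in metric that caches the color order and a prepared flag."""
    def __init__(self):
        self.color_order = None
        self.prepared = False

    def set_color_order(self, order):
        self.color_order = order

    def prepare(self, module, dataloader):
        # e.g. pre-compute inception features of the real images
        self.prepared = True

def prepare_metrics_sketch(preprocessor, metrics, module=None, dataloader=None):
    """Mimic prepare_metrics: parse the color order once, push it to all metrics."""
    for metric in metrics:
        metric.set_color_order(preprocessor.output_color_order)
        metric.prepare(module, dataloader)

metrics = [DummyMetric(), DummyMetric()]
prepare_metrics_sketch(DummyPreprocessor(), metrics)
assert all(m.color_order == 'BGR' and m.prepared for m in metrics)
```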
- static _cal_metric_hash(metric: mmagic.evaluation.metrics.base_gen_metric.GenMetric)[source]¶
Calculate a unique hash value based on the SAMPLER_MODE and sample_model.
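A minimal sketch of such a hash function (the hashing scheme here is an assumption for illustration; the real implementation may combine the values differently):

```python
import hashlib

def cal_metric_hash_sketch(sampler_mode: str, sample_model: str) -> str:
    """Hash the (SAMPLER_MODE, sample_model) pair so that metrics sharing
    both values map to the same key and can share one sampler."""
    key = f'{sampler_mode}-{sample_model}'
    return hashlib.md5(key.encode()).hexdigest()

# Metrics with identical sampler mode and sample model collide on purpose:
assert cal_metric_hash_sketch('normal', 'ema') == cal_metric_hash_sketch('normal', 'ema')
assert cal_metric_hash_sketch('normal', 'ema') != cal_metric_hash_sketch('path', 'ema')
```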
- prepare_samplers(module: mmengine.model.BaseModel, dataloader: torch.utils.data.dataloader.DataLoader) List[Tuple[List[mmengine.evaluator.BaseMetric], Iterator]] [source]¶
Prepare samplers for metrics whose sampling modes differ. For generative models, different metrics need images generated from different inputs. For example, FID, KID, and IS need images generated from random noise, while PPL needs paired images on a specific noise interpolation path. Therefore, we first group metrics by their sampler's mode (refers to GenMetric.SAMPLER_MODE) and build a shared sampler for each group. Note that the length of each shared sampler depends on the metric requiring the most images in its group.
- Parameters
module (BaseModel) – Model to evaluate. Some metrics (e.g. PPL) require module in their sampler.
dataloader (DataLoader) – The dataloader for real images.
- Returns
A list of "metrics-shared sampler" pairs.
- Return type
List[Tuple[List[BaseMetric], Iterator]]
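The grouping logic described above can be sketched as follows. The stand-in objects and the range-based sampler are illustrative assumptions; mmagic's real sampler construction is more involved:

```python
from collections import defaultdict

class DummyMetric:
    """Stand-in metric exposing only what the grouping sketch needs."""
    def __init__(self, name, sampler_mode, fake_nums):
        self.name = name
        self.SAMPLER_MODE = sampler_mode
        self.fake_nums = fake_nums  # how many generated images this metric needs

def prepare_samplers_sketch(metrics):
    """Group metrics by sampler mode; each group's shared sampler length is
    driven by the metric that needs the most images."""
    groups = defaultdict(list)
    for metric in metrics:
        groups[metric.SAMPLER_MODE].append(metric)
    pairs = []
    for mode, group in groups.items():
        sampler_len = max(m.fake_nums for m in group)
        sampler = iter(range(sampler_len))  # stand-in for a real sample iterator
        pairs.append((group, sampler))
    return pairs

metrics = [DummyMetric('FID', 'normal', 50000),
           DummyMetric('IS', 'normal', 10000),
           DummyMetric('PPL', 'path', 50000)]
pairs = prepare_samplers_sketch(metrics)
assert len(pairs) == 2  # FID and IS share one sampler; PPL gets its own
```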
- process(data_samples: Sequence[mmagic.structures.DataSample], data_batch: Optional[Any], metrics: Sequence[mmengine.evaluator.BaseMetric]) None [source]¶
Pass data_batch from dataloader and predictions (generated results) to corresponding metrics.
- Parameters
data_samples (Sequence[DataSample]) – A batch of generated results from model.
data_batch (Optional[Any]) – A batch of data from the metric-specific sampler or the dataloader.
metrics (Sequence[BaseMetric]) – Metrics to evaluate.
- evaluate() dict [source]¶
Invoke the evaluate method of each metric and collect the resulting metric dictionaries. Unlike Evaluator.evaluate, this function does not take size as input; instead, each element in self.metrics calls its own evaluate method to calculate its results.
- Returns
Evaluation results of all metrics. The keys are the names of the metrics, and the values are the corresponding results.
- Return type
dict
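The collection step can be sketched with stand-in metrics (the merge-into-one-dict behavior is what the docstring describes; the class below is illustrative, not mmagic's real metric class):

```python
class DummyMetric:
    """Stand-in metric whose evaluate() returns a precomputed results dict."""
    def __init__(self, results):
        self._results = results

    def evaluate(self):
        # each metric computes its own results over the samples it collected
        return self._results

def evaluate_sketch(metrics):
    """Mimic Evaluator.evaluate: call each metric's own evaluate() and merge
    the per-metric dicts into one results dict."""
    results = {}
    for metric in metrics:
        results.update(metric.evaluate())
    return results

merged = evaluate_sketch([DummyMetric({'fid': 4.2}), DummyMetric({'is': 3.1})])
assert merged == {'fid': 4.2, 'is': 3.1}
```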