Migration of Runtime Settings
We update runtime settings in MMagic 1.x. The important modifications are as follows.
- The `checkpoint_config` is moved to `default_hooks.checkpoint` and the `log_config` is moved to `default_hooks.logger`. Many hook settings are also moved from the script code to the `default_hooks` field in the runtime configuration.
- `resume_from` is removed and replaced by `resume`, which works together with `load_from` (see the sketch after this list):
  - If `resume=True` and `load_from` is not None, resume training from the checkpoint in `load_from`.
  - If `resume=True` and `load_from` is None, try to resume from the latest checkpoint in the work directory.
  - If `resume=False` and `load_from` is not None, only load the checkpoint, do not resume training.
  - If `resume=False` and `load_from` is None, do not load nor resume.
- The `dist_params` field is now a sub-field of `env_cfg`, and `env_cfg` contains some new configurations.
- The `workflow`-related functionalities are removed.
- New field `visualizer`: the visualizer is a new design. The runner uses a visualizer instance to handle result and log visualization and to save the outputs to different backends, such as Local, TensorBoard, and Wandb (see the example at the end of this section).
- New field `default_scope`: the starting point for module search across all registries.
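The four `resume` / `load_from` combinations above map directly onto two config fields. A minimal sketch follows; the checkpoint path is a hypothetical placeholder:

```python
# Resume training from a specific checkpoint:
load_from = 'work_dirs/example/iter_5000.pth'  # hypothetical path, adjust to your run
resume = True

# Resume from the latest checkpoint in the work directory:
# load_from = None
# resume = True

# Only load the weights as a pre-trained model, without resuming training:
# load_from = 'work_dirs/example/iter_5000.pth'
# resume = False

# Neither load nor resume (train from scratch):
# load_from = None
# resume = False
```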
**Original**

```python
checkpoint_config = dict(  # Config to set the checkpoint hook, refer to https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/checkpoint.py for implementation
    interval=5000,  # The save interval is 5000 iterations
    save_optimizer=True,  # Also save optimizers
    by_epoch=False)  # Count by iterations
log_config = dict(  # Config to register the logger hook
    interval=100,  # Interval to print the log
    hooks=[
        dict(type='TextLoggerHook', by_epoch=False),  # The logger used to record the training process
        dict(type='TensorboardLoggerHook'),  # The Tensorboard logger is also supported
    ])
visual_config = None  # Visual config; we do not use it
# runtime settings
dist_params = dict(backend='nccl')  # Parameters to set up distributed training; the port can also be set
log_level = 'INFO'  # The level of logging
load_from = None  # Load a model from a given path as a pre-trained model; this will not resume training
resume_from = None  # Resume from a checkpoint at a given path; training will be resumed from the iteration at which the checkpoint was saved
workflow = [('train', 1)]  # Workflow for the runner. [('train', 1)] means there is only one workflow, named 'train', which is executed once. Keep this unchanged when training current matting models
```
**New**

```python
default_hooks = dict(  # Used to build default hooks
    checkpoint=dict(  # Config to set the checkpoint hook
        type='CheckpointHook',
        interval=5000,  # The save interval is 5000 iterations
        save_optimizer=True,
        by_epoch=False,  # Count by iterations
        out_dir=save_dir,
    ),
    timer=dict(type='IterTimerHook'),
    logger=dict(type='LoggerHook', interval=100),  # Config to register the logger hook
    param_scheduler=dict(type='ParamSchedulerHook'),
    sampler_seed=dict(type='DistSamplerSeedHook'),
)
default_scope = 'mmedit'  # Used to set the registry location
env_cfg = dict(  # Parameters to set up distributed training; the port can also be set
    cudnn_benchmark=False,
    mp_cfg=dict(mp_start_method='fork', opencv_num_threads=4),
    dist_cfg=dict(backend='nccl'),
)
log_level = 'INFO'  # The level of logging
log_processor = dict(type='LogProcessor', window_size=100, by_epoch=False)  # Used to build the log processor
load_from = None  # Load a model from a given path as a pre-trained model; this will not resume training
resume = False  # Whether to resume training; if True, resume from `load_from` or from the latest checkpoint in the work directory
```
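As referenced in the list above, here is a minimal sketch of the new `visualizer` field. It assumes the generic MMEngine `Visualizer` together with the `LocalVisBackend` and `TensorboardVisBackend` backends; concrete MMagic configs may use a task-specific visualizer type instead:

```python
vis_backends = [
    dict(type='LocalVisBackend'),        # save visualization results to local files
    dict(type='TensorboardVisBackend'),  # also write them to TensorBoard
    # dict(type='WandbVisBackend'),      # Wandb is supported as well
]
visualizer = dict(
    type='Visualizer',  # assumption: the generic MMEngine visualizer
    vis_backends=vis_backends,
    name='visualizer')
```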