Training

This Howto guides you through training your first model. Before starting with this Howto make sure you already have

This Howto can be completed using the sample dataset from “create a dataset”. The same dataset (with labels) can be downloaded here:

sample_dataset.zip

Start training

We don’t need any further preparation. Make sure your virtual environment or conda environment is active and start training with the following command:

python ufld.py configs/sample_dataset.py --mode train --epoch 5

This command loads our config and runs ufld.py in training mode. It also sets --epoch to 5, which is enough for a test run.

If the following error occurs, you don’t have enough VRAM. Lower the batch size until the error goes away. Remember that this can also be done temporarily via the CLI, e.g. --batch_size 4. For more information on batch_size, see the batch_size section of the config howto.
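
If you would rather not lower the batch size by hand, the trial-and-error loop can be scripted. The following is only a minimal sketch, not part of ufld.py: the helper names build_train_command and run_with_shrinking_batch are hypothetical, the starting batch size is up to you, and the sketch assumes that a failed run shows up as a non-zero exit code. A non-zero exit code can have other causes than missing VRAM, so check the actual error message before trusting the result.

```python
import subprocess
from typing import Callable, List, Optional


def build_train_command(config: str, epochs: int, batch_size: Optional[int] = None) -> List[str]:
    """Assemble the training command from this Howto as an argument list."""
    cmd = ["python", "ufld.py", config, "--mode", "train", "--epoch", str(epochs)]
    if batch_size is not None:
        # Temporary override via CLI; the config file itself stays unchanged.
        cmd += ["--batch_size", str(batch_size)]
    return cmd


def run_with_shrinking_batch(
    config: str,
    epochs: int,
    start_batch_size: int,
    runner: Callable[[List[str]], int] = subprocess.call,
) -> Optional[int]:
    """Halve the batch size after each failed run.

    Returns the first batch size whose run exits with code 0,
    or None if even a batch size of 1 fails.
    Assumption: any non-zero exit code is treated as "not enough VRAM".
    """
    batch_size = start_batch_size
    while batch_size >= 1:
        if runner(build_train_command(config, epochs, batch_size)) == 0:
            return batch_size
        batch_size //= 2
    return None
```

Calling run_with_shrinking_batch("configs/sample_dataset.py", 5, 16) would try batch sizes 16, 8, 4, 2 and 1 in that order and stop at the first run that succeeds. Once a working value is found, it is usually better to write it into the config than to keep overriding it on the command line.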

Monitoring

As soon as training has started (and also after it has finished), we can use TensorBoard to monitor our network.

tensorboard --logdir /home/user/work_dir/ --bind_all

This command starts TensorBoard. Set --logdir to your working directory (see config). --bind_all makes TensorBoard accessible from other computers in your network.

The web interface can now be opened at http://localhost:6006
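
If the page does not load, it can help to check whether anything is listening on the port at all. The following is a small sketch using only the Python standard library; the function name is_port_open is hypothetical, and port 6006 is taken from the URL above. It only tests whether a TCP connection can be established, not whether the service behind the port is really TensorBoard.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 6006, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False
```

Calling is_port_open() on the machine that runs TensorBoard checks the default address. To test access from another computer (which requires --bind_all), pass that machine's host name or IP address as host.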

What now?