Output modules - src.runtime.modules.output namespace¶
An output module is responsible to do something with the results supplied by the neural network.
It is usually a good idea to use a class here to easily keep context between each call (e.g. a file handler).
An output module has to provide a method which will be called for every batch the net processed.
This method is usually called out
. It has to accept three parameters
predictions (torch.Tensor): The raw output of the neural network
names (List[str], optional): Filenames (relative paths starting from
data_root
) for each sample in the batch.frames (List[np.ndarray], optional): Source Frames, can be used to visualize the network results
Important: As python does not support interfaces it is not guaranteed the parameters of the out-method will have the same names. -> The parameters are called as positional parameters.
A module might also provide a method which is called after the input method finished.
This method is usually called post
and takes no parameters.
It is not called if the applications exits because of an exception (like ctrl + c). Use a destructor for these cases.
predictions¶
The result is a four dimensional tensor. The innermost dimension contains the probabilities that a point belongs to a lane. A sample Tensor of the innermost dimension could look like:
# [0th lane, 1st lane, 2nd lane, 3rd lane]
[ 0.1005, -0.6627, -1.0224, -0.8216]
The next two dimensions describes a point on the image. Let’s ignore for a second that this is about estimations and just think about a grayscale image. A grayscale image could be represented as follows
[
[50, 60, 30], # column 0
[30, 40, 30], # column 1
[30, 20, 10], # column 2
[10, 20, 10] # column 3
]
It’s a bit unintuitive because rows and lines are inverted compared to how it would normally be. If rendered this would look as follows:
50, 30, 30, 10
60, 40, 20, 20
30, 30, 10, 10
Let’s transfer this back to the network result. The second and third innermost dimensions are exactly what is described above, but instead of grayscale values they contain the probabilities.
The outer dimension equals the batch size. Batch size one means there will be one sample inside the outer dimension, batch size two results in being two samples inside this dimension.
To summarize this, the four dimensional tensor is, from inside to outside, constructed as follows
Probabilities for one of the four lanes
y coordinates (column)
x coordinates (rows)
samples
Hint: The prediction resolution does not match the resolution of the sample supplied to the net. The y resolution matches the amount of h_samples (cls_num_per_lane) and the x resolution the config’s griding_num.
predictions evaluation helpers¶
The output module contains a common.py file, which provides functions helping with evaluating the predictions. It’s recommended to use them as follows to get the most accurate predictions scaled to the image’s width (cfg.img_width).
map_x_to_image(evaluate_predictions(y[i]))
The result per sample will look as follows:
...
[-2. -2. 38.73001617 -2. ]
[-2. 33.28607988 38.91600986 -2. ]
[-2. 32.92785092 39.23658818 -2. ]
[-2. 32.48616481 39.6095107 -2. ]
[-2. 31.6839597 40.14521082 -2. ]
[27.46894699 30.73299785 40.73678818 -2. ]
[25.48803848 29.7993621 41.32698695 -2. ]
[23.15705156 28.82870762 42.06111232 -2. ]
[20.72240206 27.86122163 42.75888292 -2. ]
[18.6873128 26.90077982 43.4880209 -2. ]
[16.89856385 26.03937897 44.17240046 -2. ]
...
Every column represents one lane. Remember: -2 means this points belongs to the residue class.
Submodules¶
src.runtime.modules.output.common module¶
Contains some common helper functions for output modules
-
src.runtime.modules.output.common.
evaluate_predictions
(y)[source]¶ Evaluate predictions Tries to improve the estimation by including all probabilities instead of only using the most probable class :Parameters: y – one result sample
- Returns
2D array containing x values (float) per h_sample and lane
src.runtime.modules.output.out_json module¶
-
class
src.runtime.modules.output.out_json.
JsonOut
(filepath='cfg.work_dir/2021-03-19_16-44-42_cfg.dataset_t.json')[source]¶ Bases:
object
provides the ability to output detected data in a json like format (one json object per line) to a file This file will be analog to the source labels you are using for training
- Parameters
filepath – full file path where the results will be stored
src.runtime.modules.output.out_prod module¶
-
class
src.runtime.modules.output.out_prod.
ProdOut
[source]¶ Bases:
object
You can use this class as a starting point to implement your “real use case”
-
out
(predictions: torch.Tensor, names: List[str], frames: List[numpy.ndarray])[source]¶ This is the place where you implement your out-logic. notice that you’ll receive a list of predictions and filenames. The list list size equals your batch size.
You will probably want to find the most probable class first or do more complex things to get more accurate class assignments
Next step will be mapping the predicted class in a format suitable for your use case. Eg for image export this would be image coordinates
- Parameters
predictions – network result (list of samples containing probabilities per sample)
names – filenames for predictions, might not be available depending on input module
frames – source frames, might not be available depending on input module
-
src.runtime.modules.output.out_test module¶
-
class
src.runtime.modules.output.out_test.
TestOut
(out_file: str = 'cfg.test_validation_data')[source]¶ Bases:
object
This module allows to validate predictions against known labels. It prints the accuracy after the test completed. Additionally it writes its results as csv to the directory where the trained model is located.
used non-basic-cfg values:
cfg.test_validation_data
- Parameters
out_file – relative path to cfg.data_root
src.runtime.modules.output.out_video module¶
-
class
src.runtime.modules.output.out_video.
VisualOut
(enable_live_video='cfg.video_out_enable_live_video', enable_video_export='cfg.video_out_enable_video_export', enable_image_export='cfg.video_out_enable_image_export', enable_line_mode='cfg.video_out_enable_line_mode')[source]¶ Bases:
object
provides different visual output types
live video
record video
save images
visualization can be points or lines
used non-basic-cfg values:
cfg.video_out_enable_live_video
cfg.video_out_enable_video_export
cfg.video_out_enable_image_export
cfg.video_out_enable_line_mode
- Parameters
enable_live_video – show video
enable_video_export – save as video to disk
enable_image_export – save as image files to disk
enable_line_mode – visualization as lines instead of dots
-
out
(y: torch.Tensor, names: List[str], frames: List[numpy.ndarray])[source]¶ Generate visual output
- Parameters
y – network result (list of samples containing probabilities per sample)
names – filenames for y, if empty: frames have to be provided
frames – source frames, if empty: names have to be provided