Output modules - src.runtime.modules.output namespace

An output module is responsible for doing something with the results supplied by the neural network.

It is usually a good idea to use a class here so that context (e.g. a file handle) can easily be kept between calls.

An output module has to provide a method which will be called for every batch the net processes. This method is usually called out. It has to accept three parameters:

  • predictions (torch.Tensor): The raw output of the neural network

  • names (List[str], optional): Filenames (relative paths starting from data_root) for each sample in the batch.

  • frames (List[np.ndarray], optional): Source frames; can be used to visualize the network results

Important: As Python does not support interfaces, there is no guarantee that the parameters of the out method have exactly these names. The parameters are therefore passed positionally.

A module might also provide a method which is called after the whole input has been processed. This method is usually called post and takes no parameters. It is not called if the application exits because of an exception (like ctrl + c); use a destructor for these cases.
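A minimal sketch of such a module (the class name, the file handling and what gets written are illustrative assumptions; only the out/post conventions come from this page):

from typing import List

import numpy as np
import torch


class ExampleOut:
    def __init__(self):
        # a class makes it easy to keep context between calls, e.g. a file handle
        self.file = open("results.txt", "w")

    def out(self, predictions: torch.Tensor, names: List[str], frames: List[np.ndarray]):
        # called once for every batch the net processes (parameters are positional)
        self.file.write(f"batch with {len(predictions)} samples\n")

    def post(self):
        # called after the whole input was processed; not called on exceptions
        self.file.close()

    def __del__(self):
        # destructor covers abnormal exits such as ctrl + c
        if not self.file.closed:
            self.file.close()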

predictions

The result is a four-dimensional tensor. The innermost dimension contains the probabilities that a point belongs to each lane (as raw network output these are unnormalized, so values can be negative). A sample of the innermost dimension could look like this:

# [0th lane, 1st lane, 2nd lane, 3rd lane]
[ 0.1005, -0.6627, -1.0224, -0.8216]

The next two dimensions describe a point on the image. Let’s ignore for a second that this is about estimations and just think about a grayscale image. A grayscale image could be represented as follows:

[
    [50, 60, 30],  # column 0
    [30, 40, 30],  # column 1
    [30, 20, 10],  # column 2
    [10, 20, 10]  # column 3
]

It’s a bit unintuitive because rows and columns are swapped compared to the usual representation. If rendered, this would look as follows:

50, 30, 30, 10
60, 40, 20, 20
30, 30, 10, 10
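A quick numpy sketch confirms this: transposing the array from above yields the rendered view.

import numpy as np

img = np.array([[50, 60, 30],   # column 0
                [30, 40, 30],   # column 1
                [30, 20, 10],   # column 2
                [10, 20, 10]])  # column 3

print(img.T)  # swapping the axes restores the usual row-major view
# [[50 30 30 10]
#  [60 40 20 20]
#  [30 30 10 10]]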

Let’s transfer this back to the network result. The second and third innermost dimensions are exactly what is described above, but instead of grayscale values they contain the probabilities.

The outer dimension equals the batch size: batch size one means there is one sample inside the outer dimension, batch size two means there are two.

To summarize, the four-dimensional tensor is, from inside to outside, constructed as follows (an indexing sketch follows the hint below):

  1. Probabilities for one of the four lanes

  2. y coordinates (position within a column)

  3. x coordinates (column index)

  4. samples

Hint: The prediction resolution does not match the resolution of the sample supplied to the net. The y resolution matches the number of h_samples (cls_num_per_lane) and the x resolution the config’s griding_num.
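As an indexing sketch (the concrete sizes below are illustrative assumptions, not the project’s actual config values):

import torch

batch_size, griding_num, h_samples, lanes = 2, 100, 56, 4  # assumed sizes
predictions = torch.randn(batch_size, griding_num, h_samples, lanes)

# from outside to inside: sample -> x coordinate -> y coordinate -> lane scores
lane_scores = predictions[0, 5, 10]  # shape (4,): one score per lane at that point
print(lane_scores.argmax().item())   # most probable lane for that point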

predictions evaluation helpers

The output package contains a common.py file, which provides functions that help with evaluating the predictions. It’s recommended to use them as follows to get the most accurate predictions, scaled to the image’s width (cfg.img_width):

map_x_to_image(evaluate_predictions(y[i]))

The result per sample will look as follows:

...
[-2.         -2.         38.73001617 -2.        ]
[-2.         33.28607988 38.91600986 -2.        ]
[-2.         32.92785092 39.23658818 -2.        ]
[-2.         32.48616481 39.6095107  -2.        ]
[-2.         31.6839597  40.14521082 -2.        ]
[27.46894699 30.73299785 40.73678818 -2.        ]
[25.48803848 29.7993621  41.32698695 -2.        ]
[23.15705156 28.82870762 42.06111232 -2.        ]
[20.72240206 27.86122163 42.75888292 -2.        ]
[18.6873128  26.90077982 43.4880209  -2.        ]
[16.89856385 26.03937897 44.17240046 -2.        ]
...

Every column represents one lane. Remember: -2 means this point belongs to the residue class.
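To turn one such column into drawable points, a small sketch (lane_points, lanes_x and h_samples are assumed names, not part of common.py):

import numpy as np

def lane_points(lanes_x: np.ndarray, h_samples: list, lane: int):
    # pair each x value of the chosen lane with its y coordinate (h_sample)
    # and drop the points that belong to the residue class (-2)
    return [(x, y) for x, y in zip(lanes_x[:, lane], h_samples) if x != -2]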

Submodules

src.runtime.modules.output.common module

Contains some common helper functions for output modules

src.runtime.modules.output.common.evaluate_predictions(y)[source]

Evaluate predictions. Tries to improve the estimation by including all probabilities instead of only using the most probable class.

Parameters

y – one result sample

Returns

2D array containing x values (float) per h_sample and lane

src.runtime.modules.output.common.get_filename_date_string()[source]

Get the current date and time in a format suitable for file exports.

Returns: string

src.runtime.modules.output.common.map_x_to_image(y)[source]

Map x-axis (griding_num) estimations to image coordinates

Parameters

y – one result sample (can come directly from the net or be post-processed; all number types should be accepted)

Returns: x coordinates for each lane

src.runtime.modules.output.out_json module

class src.runtime.modules.output.out_json.JsonOut(filepath='cfg.work_dir/2021-03-19_16-44-42_cfg.dataset_t.json')[source]

Bases: object

Provides the ability to output detected data in a JSON-like format (one JSON object per line) to a file. This file will be analogous to the source labels you are using for training.

Parameters

filepath – full file path where the results will be stored

out(y, names, frames: List[numpy.ndarray])[source]

Generate JSON output to a text file

Parameters
  • y – network result (list of samples)

  • names – filenames for y
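Since the file is analogous to the training labels, a single output line might look roughly like the following sketch (assuming TuSimple-style labels; the keys and values are illustrative, not taken from the project):

import json

line = {
    "lanes": [[-2, -2, 27.5], [33.3, 30.7, 29.8]],  # one x list per lane, -2 = residue
    "h_samples": [160, 170, 180],                   # y coordinates shared by all lanes
    "raw_file": "clips/example/20.jpg",             # path relative to data_root
}
print(json.dumps(line))  # JsonOut writes one such object per line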

src.runtime.modules.output.out_prod module

class src.runtime.modules.output.out_prod.ProdOut[source]

Bases: object

You can use this class as a starting point to implement your “real use case”

out(predictions: torch.Tensor, names: List[str], frames: List[numpy.ndarray])[source]

This is the place where you implement your out-logic. Notice that you’ll receive a list of predictions and filenames; the list size equals your batch size.

You will probably want to find the most probable class first, or do more complex things to get more accurate class assignments.

The next step is mapping the predicted class to a format suitable for your use case, e.g. image coordinates for an image export; see the sketch after the parameter list below.

Parameters
  • predictions – network result (list of samples containing probabilities per sample)

  • names – filenames for predictions, might not be available depending on input module

  • frames – source frames, might not be available depending on input module
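A hedged sketch of such an out implementation (the class name and the downstream handling are assumptions; the helpers come from common.py as described above):

from src.runtime.modules.output.common import evaluate_predictions, map_x_to_image


class ProdOutSketch:
    def out(self, predictions, names, frames):
        for i, sample in enumerate(predictions):
            # class assignment plus mapping to image coordinates
            lanes_x = map_x_to_image(evaluate_predictions(sample))
            # stand-in for the real use case, e.g. feeding a lane keeping system
            print(names[i] if names else i, lanes_x)

    def post(self):
        pass  # cleanup or stats could go here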

post()[source]

Called after dataset/video/whatever was completely processed. You can do things like cleanup or printing stats here

src.runtime.modules.output.out_test module

class src.runtime.modules.output.out_test.TestOut(out_file: str = 'cfg.test_validation_data')[source]

Bases: object

This module allows validating predictions against known labels. It prints the accuracy after the test has completed. Additionally it writes its results as CSV to the directory where the trained model is located.

used non-basic-cfg values:

  • cfg.test_validation_data

Parameters

out_file – relative path to cfg.data_root

out(predictions: torch.Tensor, names: List[str], _)[source]

Collect the results of one batch

Parameters
  • predictions – network result (list of samples containing probabilities per sample)

  • names – filenames for predictions, if empty

post()[source]

Evaluate collected data and print accuracy

src.runtime.modules.output.out_video module

class src.runtime.modules.output.out_video.VisualOut(enable_live_video='cfg.video_out_enable_live_video', enable_video_export='cfg.video_out_enable_video_export', enable_image_export='cfg.video_out_enable_image_export', enable_line_mode='cfg.video_out_enable_line_mode')[source]

Bases: object

provides different visual output types

  • live video

  • record video

  • save images

visualization can be points or lines

used non-basic-cfg values:

  • cfg.video_out_enable_live_video

  • cfg.video_out_enable_video_export

  • cfg.video_out_enable_image_export

  • cfg.video_out_enable_line_mode

Parameters
  • enable_live_video – show video

  • enable_video_export – save as video to disk

  • enable_image_export – save as image files to disk

  • enable_line_mode – visualization as lines instead of dots

out(y: torch.Tensor, names: List[str], frames: List[numpy.ndarray])[source]

Generate visual output

Parameters
  • y – network result (list of samples containing probabilities per sample)

  • names – filenames for y, if empty: frames have to be provided

  • frames – source frames, if empty: names have to be provided

src.runtime.modules.output.out_video.get_lane_color(i: int) → Tuple[source]

Get a predefined color depending on i. Colors repeat if i gets too big.

Parameters

i – any number, same number -> same color

Returns: Tuple containing 3 values, e.g. (255, 0, 0)
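The described behaviour suggests simple modulo indexing into a fixed palette; a hedged re-implementation sketch (the palette itself is an assumption):

from typing import Tuple

PALETTE = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0)]  # assumed colors

def get_lane_color(i: int) -> Tuple[int, int, int]:
    # the same i always yields the same color; colors repeat once i exceeds
    # the palette size
    return PALETTE[i % len(PALETTE)]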