3rd Stage

The workflow for the 2nd and 3rd stages with exposed parameters is illustrated below.

class rsdtlib.Window(tf_record_path, delta_size, window_stride, omega, Omega, generate_triple, n_threads=1, use_new_save=False)

Class for windowing the time series of observations.

Parameters:

tf_record_path (str) – Path to stacked TFRecord files
delta_size (int) – Window size in seconds (\(\Delta\))
window_stride (int) – Stride of window (\(\rho\))
omega (int) – Minimum window size in number of observations (\(\omega\))
Omega (int) – Maximum window size in number of observations (\(\Omega\))
generate_triple (boolean) – Indicate whether a window triplet should be generated or just a single window
n_threads (int) – Number of threads to use for concurrent processing (default = 1)
use_new_save (boolean) – If true, save with tf.data.Dataset.save(...) from TensorFlow. If false (default), create TFRecord files.

Example:

Define the windowing (and optional labeling) with different parameters. This can be run in two modes:

Write to disk: write_tf_files(…)

Interactive: get_infer_dataset(…)

Irrespective of the mode, the windows are described by the class instantiation. In the example below, the path to the stacked TFRecord files is provided as tf_record_path. The window parameters are also provided here: \(\Delta\) for the window size in seconds, the window stride \(\rho\), and the lower (\(\omega\)) and upper (\(\Omega\)) bounds on the number of observations per window.

import rsdtlib

tf_record_path = "<SOURCE PATH STACKED TFRECORD FILES>"

window = rsdtlib.Window(
              tf_record_path,        # stacked TFRecord file path
              60*60*24*30,           # Delta (size)
              1,                     # window stride
              10,                    # omega (min. window size)
              16,                    # Omega (max. window size)
              True,                  # generate triplet
              n_threads=1,           # number of threads to use
              use_new_save=False)    # new TF Dataset save
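
To make the interplay of \(\Delta\), \(\rho\), \(\omega\) and \(\Omega\) more concrete, the following is a minimal pure-Python sketch of one plausible reading of these parameters. It is not rsdtlib's implementation: the function sketch_windows, the truncation to \(\Omega\) observations, and the sample timestamps are all assumptions made for illustration.

```python
from typing import List

def sketch_windows(timestamps: List[int], delta: int, stride: int,
                   omega: int, Omega: int) -> List[List[int]]:
    """Illustrative only; NOT how rsdtlib builds its windows."""
    windows = []
    # A new window starts at every `stride`-th observation (assumption)
    for start in range(0, len(timestamps), stride):
        t0 = timestamps[start]
        # Keep observations less than `delta` seconds after the first one
        window = [t for t in timestamps[start:] if t - t0 < delta]
        # Cap the window at Omega observations (assumed truncation)
        window = window[:Omega]
        # Drop windows with fewer than omega observations
        if len(window) >= omega:
            windows.append(window)
    return windows

# Hypothetical sorted observation times: one every 3 days, 20 in total
day = 60 * 60 * 24
obs = [i * 3 * day for i in range(20)]

wins = sketch_windows(obs, delta=30 * day, stride=1, omega=10, Omega=16)
print(len(wins), [len(w) for w in wins])
```

With these sample values, windows that start too close to the end of the series are dropped because they cannot reach the minimum of omega observations.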

get_infer_dataset(tile, win_filter=None)

Return a dataset for inference.

Parameters:

tile ([y, x]) – Tile coordinates in y and x dimensions
win_filter (tf.data.Dataset.filter predicate) – Filter for windows (default = None)

Returns:

If tile exists, return the dataset, otherwise None

Return type:

tf.data.Dataset | None

Example:

This is the interactive mode. In the example below, the windows of tile [5, 10] are used for inference. The win_filter parameter allows additional filters to be applied to windows. In the example, only windows whose first observation is later than 2022-07-01 00:00:00 (GMT) are considered. Note that window[0][0] denotes the first timestamp. When filters are applied, there is no notion of previous, current, or next windows (irrespective of the setting of generate_triple).

import rsdtlib
import tensorflow as tf

later_than = lambda window: tf.math.greater(
                                tf.cast(window[0][0], tf.int64),
                                1656633600) # 2022-07-01 00:00:00 (GMT)

windows_ds = window.get_infer_dataset([5, 10],
                                      win_filter=later_than)

# Use for inference on a loaded Tensorflow/Keras model
result = model.predict(windows_ds)
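
The threshold compared against window[0][0] is a Unix timestamp in seconds. As a minimal sketch, such a value can be derived from a calendar date with the Python standard library rather than being hard-coded. The helper utc_epoch is hypothetical and not part of rsdtlib.

```python
from datetime import datetime, timezone

def utc_epoch(year: int, month: int, day: int) -> int:
    """Seconds since 1970-01-01 00:00:00 UTC for midnight of the given date."""
    # tzinfo=timezone.utc avoids a silent dependence on the local time zone
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())

threshold = utc_epoch(2022, 7, 1)

# Round trip back to a readable, time-zone-aware date as a sanity check
readable = datetime.fromtimestamp(threshold, tz=timezone.utc).isoformat()
print(threshold, readable)
```

Making the time zone explicit matters here: a naive datetime is interpreted in the local time zone, which shifts the threshold by the local UTC offset.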

get_num_tiles()

Get the number of tiles in the y and x dimensions. It assumes that there are no gaps between tiles.

Note: Omitting tiles is possible. This function only takes the maximum y-x tile coordinates. In further processing a selector can be used to filter non-available tiles.

Returns:

A tuple (y, x)

Return type:

Tuple of (int, int)

Example:

In the example below, the number of tiles in each dimension is returned.

import rsdtlib

num_tiles_y, num_tiles_x = window.get_num_tiles()
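
Because only the maximum extent is reported, tiles that were omitted have to be skipped explicitly in later steps. The sketch below shows one way to do this with a selector. The grid size, the set of missing tiles, the zero-based coordinates, and the (y, x) argument order of the selector are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical values, standing in for window.get_num_tiles()
num_tiles_y, num_tiles_x = 3, 4

# Hypothetical tiles that were omitted during earlier processing
missing = {(1, 2), (2, 0)}

# Selector in the two-argument form used elsewhere in this section;
# the (y, x) order of its arguments is an assumption
selector = lambda j, i: (j, i) not in missing

# Enumerate the full grid and keep only the tiles the selector accepts
tiles = [(j, i)
         for j, i in product(range(num_tiles_y), range(num_tiles_x))
         if selector(j, i)]
print(len(tiles), tiles)
```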

windows_list()

Retrieve the list of windows without constructing them.

Returns:

Returns a list of window descriptors

Return type:

Each window is described as a quadruple (id, starttime, endtime, no_obs):

id: ID of the window (zero based, sequential enumeration)

starttime: Time stamp of first observation in window

endtime: Time stamp of last observation in window

no_obs: Number of observations in window

Example:

In the example below, the list of generated windows is saved to a CSV file.

import rsdtlib
import csv
from datetime import datetime

window_list = window.windows_list()

with open("windows_training.csv", mode="w", newline="") as csv_file:
    csv_writer = csv.writer(csv_file,
                            delimiter=",",
                            quotechar="\"",
                            quoting=csv.QUOTE_MINIMAL)
    for item in window_list:
        csv_writer.writerow([item[0],
                             datetime.utcfromtimestamp(item[1]),
                             datetime.utcfromtimestamp(item[2]),
                             item[3]])
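
The descriptors can also be inspected directly before anything is written. The sketch below unpacks each quadruple, converts the timestamps to time-zone-aware datetimes (datetime.utcfromtimestamp is deprecated as of Python 3.12), and selects windows by their number of observations. The descriptor values are made up for illustration, the helper describe is hypothetical, and the timestamps are assumed to be Unix seconds.

```python
from datetime import datetime, timezone

# Hypothetical descriptors in the form (id, starttime, endtime, no_obs)
window_list = [
    (0, 1656633600, 1658361600, 12),
    (1, 1656892800, 1658620800, 9),
    (2, 1657152000, 1659225600, 16),
]

def describe(item):
    """Turn one descriptor quadruple into a readable dictionary."""
    win_id, start, end, no_obs = item
    return {
        "id": win_id,
        "start": datetime.fromtimestamp(start, tz=timezone.utc),
        "end": datetime.fromtimestamp(end, tz=timezone.utc),
        "span_days": (end - start) / 86400,  # time covered by the window
        "no_obs": no_obs,
    }

rows = [describe(item) for item in window_list]

# IDs of windows that contain at least 10 observations
dense = [row["id"] for row in rows if row["no_obs"] >= 10]
print(dense)
```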

write_tf_files(tf_record_out_path, selector, win_filter=None, label_args_ds=None, gen_label=None)

Write the windows as TFRecord files (one for each tile).

Parameters:

tf_record_out_path (str) – Path to destination where to store the TFRecord files
selector – Functor to query which tile should be written
win_filter (tf.data.Dataset.filter predicate) – Filter for windows (default = None)
label_args_ds (tf.data.Dataset) – Arguments for labeling (default = None). The sequence has to be identical to the samples.
gen_label (gen_label(data, label_args) -> [y, x]) – Functor to generate the labels (default = None). Training data is provided via data, the label arguments via label_args. The output is a label in the spatial y and x dimensions.

Returns:

None

Example:

This is the windowing mode that writes to disk. In the example below, the path for the windowed TFRecord files is provided as tf_record_out_path. A checkerboard pattern of tiles is windowed, which is useful for separating training and validation/testing data. The win_filter parameter allows additional filters to be applied to windows. In the example, each window is kept at random with a probability of 1 in 10 and the rest are discarded. If labels should be assigned to every window, gen_label is the generator for these labels. In the example, only labels with values of one are created.

import rsdtlib
import tensorflow as tf

tf_record_out_path = "<DESTINATION PATH OF WINDOWED TFRECORD FILES>"

selector = lambda j, i: (i + j) % 2 == 0 # checkerboard pattern

# Randomly keep each window with a probability of 1 in 10
randomize = lambda *args:                                         \
                tf.random.uniform([], 0, 10,
                                  dtype=tf.dtypes.int32) == 0

# Generate 32x32 pixel label with only values of one
my_label = lambda window, label_args:                             \
      (tf.concat(
            # Only serialize current window (index 1)
            # Note: This requires that window triplets are generated!
            [
             #window[1][0][:],          # timestamps (not used)
             window[1][1][:, :, :, :],  # SAR ascending
             window[1][2][:, :, :, :],  # SAR descending
             window[1][3][:, :, :, :]], # optical
            axis=-1),
       tf.ensure_shape(                 # label
            tf.ones([32, 32]), [32, 32]))

window.write_tf_files(
              tf_record_out_path, # path to write TFRecord files to
              selector,
              win_filter=randomize,
              gen_label=my_label)
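
To illustrate how the checkerboard selector separates training from validation/testing tiles, the sketch below applies it and its complement to a small grid. The 4x4 grid and the names train_selector and val_selector are hypothetical. The complement selector is an assumption about how a second call to write_tf_files could cover the remaining tiles.

```python
from itertools import product

# Checkerboard selector as in the example, plus its complement
train_selector = lambda j, i: (i + j) % 2 == 0
val_selector = lambda j, i: not train_selector(j, i)

# Hypothetical 4x4 tile grid
size = 4
grid = list(product(range(size), range(size)))

train = [tile for tile in grid if train_selector(*tile)]
val = [tile for tile in grid if val_selector(*tile)]

# Print the pattern: T = training tile, V = validation/testing tile
for j in range(size):
    print(" ".join("T" if train_selector(j, i) else "V"
                   for i in range(size)))
```

The two selectors never pick the same tile and together cover the whole grid, so no tile is shared between the two data sets.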