list_min

ListMinLayer

ListMinLayer(
    name=None,
    input_dtype=None,
    output_dtype=None,
    top_n=None,
    sort_order="asc",
    with_segment=False,
    min_filter_value=None,
    nan_fill_value=0.0,
    axis=1,
    **kwargs
)

Bases: BaseLayer

Calculate the min across the axis dimension.

- If one tensor is passed, the layer calculates the min of the tensor over all the items in the given axis dimension.
- If two tensors are passed (inputCols is set):
    - If with_segment = True: the layer calculates the min of the first tensor segmented by the values of the second tensor. Example: calculate the minimum price of hotels within each star rating.
    - If with_segment = False: the layer calculates the min of the first tensor restricted to the topN items, ranked by the second tensor, in the same axis dimension. Example: calculate the min price in the same query, based only on the top N items sorted by descending production.

Using only the topN items to calculate the statistic gives a better approximation of the statistic seen in production. Choose a topN large enough to give a good approximation, and sort on an important feature, such as an item's past production.
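The one-tensor and top-N behaviours described above can be illustrated with a minimal pure-Python sketch for axis=1. This is not the layer's TensorFlow implementation: the helper names `listwise_min` and `top_n_min` are hypothetical, the `"desc"` value for sort_order is an assumption, and the ranking step only approximates what `get_top_n` is assumed to do.

```python
from typing import List


def listwise_min(values: List[List[float]]) -> List[List[float]]:
    """Min over each row (axis=1), repeated so the output keeps the input shape."""
    return [[min(row)] * len(row) for row in values]


def top_n_min(
    values: List[List[float]],
    sort_keys: List[List[float]],
    top_n: int,
    sort_order: str = "asc",
) -> List[List[float]]:
    """Min over the top_n items of each row, ranked by sort_keys."""
    out = []
    for row, keys in zip(values, sort_keys):
        # Rank item positions by the sort key (assumed behaviour of get_top_n).
        order = sorted(
            range(len(row)),
            key=lambda i: keys[i],
            reverse=(sort_order == "desc"),  # "desc" is an assumed option value
        )
        kept = [row[i] for i in order[:top_n]]
        out.append([min(kept)] * len(row))
    return out


prices = [[120.0, 80.0, 200.0, 95.0]]
production = [[5.0, 1.0, 9.0, 7.0]]

print(listwise_min(prices))  # [[80.0, 80.0, 80.0, 80.0]]
print(top_n_min(prices, production, top_n=2, sort_order="desc"))
# [[95.0, 95.0, 95.0, 95.0]]
```

With descending production, only the items priced 200.0 and 95.0 are kept, so the cheapest item overall (80.0) no longer determines the result.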

Initializes the Listwise Min layer.

WARNING: The code is fully tested for axis=1 only. Further testing is needed.

WARNING: The result can be affected by the value of padding items. Always filter out the padding value with min_filter_value.
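To see why padding matters, the following pure-Python sketch mirrors the masking approach in the layer source for a single row: values below min_filter_value are swapped for a sentinel, the min is taken, and a result that is still the sentinel is replaced by nan_fill_value. The function name `filtered_min`, the padding value of -1.0, and the use of `sys.float_info.max` as the sentinel are assumptions made for illustration.

```python
import sys
from typing import List, Optional

# Stand-in for val_tensor.dtype.max, which the layer uses as its sentinel.
SENTINEL = sys.float_info.max


def filtered_min(
    row: List[float],
    min_filter_value: Optional[float] = None,
    nan_fill_value: float = 0.0,
) -> float:
    """Min of one row, ignoring values strictly below min_filter_value."""
    if min_filter_value is None:
        return min(row)
    # Keep values >= threshold; replace the rest so they cannot win the min.
    masked = [v if v >= min_filter_value else SENTINEL for v in row]
    result = min(masked)
    # If every value was filtered out, fall back to nan_fill_value.
    return nan_fill_value if result == SENTINEL else result


row = [120.0, 80.0, -1.0, -1.0]  # -1.0 is a hypothetical padding value

print(filtered_min(row))                                   # -1.0
print(filtered_min(row, min_filter_value=0.0))             # 80.0
print(filtered_min([-1.0, -1.0], min_filter_value=0.0))    # 0.0
```

Without the filter, the padding value becomes the reported minimum. With min_filter_value=0.0, the padding is ignored.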

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| name | Optional[str] | Name of the layer. | None |
| input_dtype | Optional[str] | The dtype to cast the input to. | None |
| output_dtype | Optional[str] | The dtype to cast the output to. | None |
| top_n | Optional[int] | The number of top items to consider when calculating the min. Required when a second tensor is passed and with_segment is False. | None |
| sort_order | str | The order to sort the second tensor by. | 'asc' |
| with_segment | bool | Whether the second tensor should be used for segmentation (True) or sorting (False). | False |
| min_filter_value | Optional[float] | Threshold for filtering: values strictly below it are ignored during the calculation. None disables the filter. | None |
| nan_fill_value | float | The value to fill NaN results with. In the source, it is applied only when min_filter_value is set, at positions where every contributing value was filtered out. | 0.0 |
| axis | int | The axis to calculate the statistics across. | 1 |
Source code in src/kamae/tensorflow/layers/list_min.py
def __init__(
    self,
    name: Optional[str] = None,
    input_dtype: Optional[str] = None,
    output_dtype: Optional[str] = None,
    top_n: Optional[int] = None,
    sort_order: str = "asc",
    with_segment: bool = False,
    min_filter_value: Optional[float] = None,
    nan_fill_value: float = 0.0,
    axis: int = 1,
    **kwargs: Any,
) -> None:
    """
    Initializes the Listwise Min layer.

    WARNING: The code is fully tested for axis=1 only. Further testing is needed.

    WARNING: The code can be affected by the value of the padding items. Always
    make sure to filter out the padding items value with min_filter_value.

    :param name: Name of the layer, defaults to `None`.
    :param input_dtype: The dtype to cast the input to. Defaults to `None`.
    :param output_dtype: The dtype to cast the output to. Defaults to `None`.
    :param top_n: The number of top items to consider when calculating the min.
    :param sort_order: The order to sort the second tensor by. Defaults to `asc`.
    :param with_segment: Whether the second tensor should be used for segmentation (True)
    or sorting (False). Defaults to False.
    :param min_filter_value: The minimum filter value to ignore values during
    calculation. Defaults to None (no filter).
    :param nan_fill_value: The value to fill NaNs results with. Defaults to 0.
    :param axis: The axis to calculate the statistics across. Defaults to 1.
    """
    super().__init__(
        name=name, input_dtype=input_dtype, output_dtype=output_dtype, **kwargs
    )
    self.top_n = top_n
    self.sort_order = sort_order
    self.min_filter_value = min_filter_value
    self.nan_fill_value = nan_fill_value
    self.axis = axis
    self.with_segment = with_segment

compatible_dtypes property

compatible_dtypes

Returns the compatible dtypes of the layer.

Returns:

| Type | Description |
| --- | --- |
| Optional[List[DType]] | The compatible dtypes of the layer. |

_call

_call(inputs, **kwargs)

Calculate the listwise min, optionally sorting and filtering based on the second input tensor, or segmenting based on the second input tensor. Behaviour is set by with_segment.

Parameters:

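| Name | Type | Description | Default |
| --- | --- | --- | --- |
| inputs | Iterable[Tensor] | One or two input tensors: the value tensor and, optionally, a second tensor used for sorting or segmentation. | required |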

Returns:

| Type | Description |
| --- | --- |
| Tensor | The new tensor result column. |
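The segmented behaviour (with_segment = True) can be sketched in pure Python as follows. The function name `segmented_min` is hypothetical, and the sketch assumes that each item receives the min of its own segment within the row. The library helper `segmented_operation` is not shown on this page, so its exact semantics are an assumption here.

```python
from typing import Dict, Hashable, List


def segmented_min(
    values: List[List[float]],
    segments: List[List[Hashable]],
) -> List[List[float]]:
    """For each row, give every item the min of the items sharing its segment."""
    out = []
    for row, segs in zip(values, segments):
        seg_min: Dict[Hashable, float] = {}
        for v, s in zip(row, segs):
            # Track the running min per segment id.
            seg_min[s] = min(v, seg_min.get(s, v))
        # Map each item back to its segment's min (assumed output layout).
        out.append([seg_min[s] for s in segs])
    return out


prices = [[120.0, 80.0, 200.0, 95.0, 150.0]]
stars = [[3, 3, 5, 4, 5]]

print(segmented_min(prices, stars))
# [[80.0, 80.0, 150.0, 95.0, 150.0]]
```

This corresponds to the hotel example in the class description: each hotel is assigned the minimum price among the hotels with the same star rating in the same query.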

Source code in src/kamae/tensorflow/layers/list_min.py
@allow_single_or_multiple_tensor_input
def _call(self, inputs: Iterable[Tensor], **kwargs: Any) -> Tensor:
    """
    Calculate the listwise min, optionally sorting and
    filtering based on the second input tensor, or segmenting
    based on the second input tensor. Behaviour is set by with_segment.

    :param inputs: The iterable tensor for the feature.
    :returns: The new tensor result column.
    """
    val_tensor = inputs[0]
    output_shape = tf.shape(val_tensor)

    # Define use of second input
    if len(inputs) == 2:
        if self.with_segment:
            segment_tensor = inputs[1]
        else:
            sort_tensor = inputs[1]
            if self.top_n is None:
                raise ValueError("topN must be specified when using a sort column.")
            val_tensor = get_top_n(
                val_tensor=val_tensor,
                axis=self.axis,
                sort_tensor=sort_tensor,
                sort_order=self.sort_order,
                top_n=self.top_n,
            )
    else:
        if self.with_segment:
            raise ValueError("with_segment set to True, expected two inputs.")

    # Mask out elements strictly below the threshold by replacing them with the
    # largest value of the dtype, so they cannot win the min.
    if self.min_filter_value is not None:
        mask = tf.greater_equal(val_tensor, self.min_filter_value)
        inf = val_tensor.dtype.max
        val_tensor = tf.where(mask, val_tensor, inf)

    # Apply segmented calculation
    # A single input with with_segment=True has already been rejected above
    # with a ValueError.
    if self.with_segment:
        listwise_min = map_fn_w_axis(
            elems=[val_tensor, segment_tensor],
            fn=lambda x: segmented_operation(x, tf.math.unsorted_segment_min),
            axis=self.axis,
            fn_output_signature=tf.TensorSpec(
                shape=val_tensor.shape[self.axis], dtype=val_tensor.dtype
            ),
        )
    # Apply global calculation
    else:
        listwise_min = tf.reduce_min(val_tensor, axis=self.axis, keepdims=True)
        listwise_min = tf.broadcast_to(listwise_min, output_shape)

    if self.min_filter_value is not None:
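        # Replace results that still equal the sentinel (every contributing
        # value was filtered out) with nan_fill_value.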
        fill_val = tf.constant(self.nan_fill_value, dtype=listwise_min.dtype)
        listwise_min = tf.where(listwise_min != inf, listwise_min, fill_val)

    return listwise_min

get_config

get_config()

Gets the configuration of the layer. Used for saving and loading from a model.

Returns:

| Type | Description |
| --- | --- |
| Dict[str, Any] | Dictionary of the configuration of the layer. |
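As a rough illustration of how such a config dictionary supports saving and loading, the sketch below uses a hypothetical stand-in class, `ListMinConfigSketch`, that copies only the six layer-specific keys. It assumes the layer is rebuilt through the default Keras pattern of calling `cls(**config)`. The real layer also inherits whatever `BaseLayer.get_config()` returns, which is not shown on this page.

```python
from typing import Any, Dict, Optional


class ListMinConfigSketch:
    """Hypothetical stand-in that mirrors only the layer's config handling."""

    def __init__(
        self,
        top_n: Optional[int] = None,
        sort_order: str = "asc",
        with_segment: bool = False,
        min_filter_value: Optional[float] = None,
        nan_fill_value: float = 0.0,
        axis: int = 1,
    ) -> None:
        self.top_n = top_n
        self.sort_order = sort_order
        self.with_segment = with_segment
        self.min_filter_value = min_filter_value
        self.nan_fill_value = nan_fill_value
        self.axis = axis

    def get_config(self) -> Dict[str, Any]:
        # Same keys as the config.update(...) call in the layer source.
        return {
            "top_n": self.top_n,
            "sort_order": self.sort_order,
            "min_filter_value": self.min_filter_value,
            "nan_fill_value": self.nan_fill_value,
            "axis": self.axis,
            "with_segment": self.with_segment,
        }

    @classmethod
    def from_config(cls, config: Dict[str, Any]) -> "ListMinConfigSketch":
        # Assumed reconstruction path: pass the config back to the constructor.
        return cls(**config)


layer = ListMinConfigSketch(top_n=10, sort_order="desc", min_filter_value=0.0)
restored = ListMinConfigSketch.from_config(layer.get_config())
print(restored.get_config() == layer.get_config())  # True
```

The round trip works because every constructor argument appears in the config under the same name, which is the property the real `get_config` needs to preserve.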

Source code in src/kamae/tensorflow/layers/list_min.py
def get_config(self) -> Dict[str, Any]:
    """
    Gets the configuration of the layer.
    Used for saving and loading from a model.

    :returns: Dictionary of the configuration of the layer.
    """
    config = super().get_config()
    config.update(
        {
            "top_n": self.top_n,
            "sort_order": self.sort_order,
            "min_filter_value": self.min_filter_value,
            "nan_fill_value": self.nan_fill_value,
            "axis": self.axis,
            "with_segment": self.with_segment,
        }
    )
    return config