Loss

Ensemble-enabled loss functions.

CrossEntropyEnsembleBoost

class supertransformerlib.Loss.CrossEntropyEnsembleBoost(ensemble_channel: int = -3, label_smoothing: float = 0.0, boost_smoothing: float = 0.2)

Cross entropy computed sequentially along the ensemble channel, with boosting-style reweighting.

Starting from the lowest entry along the ensemble channel, the cross entropy loss is calculated and then used as weights for the next entry's calculation. The same step is repeated for each subsequent entry in the channel.

The final result will be the sum of all the intermediate losses.
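The procedure above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the function names, the nested-list data layout, and in particular the weight-update rule (per-example loss divided by the mean loss, blended with uniform weights through `boost_smoothing`) are assumptions made for the example. Label smoothing is omitted.

```python
import math


def cross_entropy(logits, label):
    """Per-example cross entropy from raw logits, using a stable log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]


def ensemble_boost_loss(channels, labels, boost_smoothing=0.2):
    """Hypothetical sketch: channels[c][i] is the logit vector that
    ensemble entry c produces for example i."""
    n = len(labels)
    weights = [1.0] * n  # the first entry sees uniform weights
    total = 0.0
    for channel in channels:
        losses = [cross_entropy(lg, y) for lg, y in zip(channel, labels)]
        # Weighted mean loss for this entry, accumulated into the final sum.
        total += sum(w * l for w, l in zip(weights, losses)) / n
        # ASSUMED update rule: weights follow the relative loss, blended
        # with uniform weights so that their mean stays at 1.
        mean_loss = sum(losses) / n
        if mean_loss > 0:
            weights = [
                boost_smoothing + (1 - boost_smoothing) * l / mean_loss
                for l in losses
            ]
        else:
            weights = [1.0] * n
    return total
```

Under this assumed rule, examples that the previous entry handled poorly receive weights above 1 in the next entry, and a larger `boost_smoothing` pulls all weights back toward uniform.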

CrossEntropyAdditiveBoost

class supertransformerlib.Loss.CrossEntropyAdditiveBoost(logit_width: int, ensemble_channel: int = 1, label_smoothing: float = 0.0, boost_smoothing: float = 0.3)

Cross entropy over additively merged ensemble logits, with boosting-style reweighting.

Each ensemble channel is processed independently, and the results are then merged starting from channel one. Each channel contributes an additive term to the final logits. In addition, each intermediate (partially summed) logit is evaluated, and the resulting losses are used to update the cross entropy weights.

The net result is that if the prior channel incurred a lot of loss, the current channel will train aggressively to minimize further loss.
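A rough sketch of the additive variant follows, again in plain Python. It is an illustration under stated assumptions rather than the library's code: the function names, the nested-list layout, the choice to sum the weighted stage losses into one total, and the weight-update rule built on `boost_smoothing` are all hypothetical. Label smoothing is omitted.

```python
import math


def cross_entropy(logits, label):
    """Per-example cross entropy from raw logits, using a stable log-sum-exp."""
    m = max(logits)
    return m + math.log(sum(math.exp(x - m) for x in logits)) - logits[label]


def additive_boost_loss(channels, labels, boost_smoothing=0.3):
    """Hypothetical sketch: channels[c][i] is the additive logit
    contribution of channel c for example i."""
    n = len(labels)
    width = len(channels[0][0])
    running = [[0.0] * width for _ in range(n)]  # cumulative logits
    weights = [1.0] * n
    total = 0.0
    for channel in channels:
        # Merge this channel's contribution into the running logits.
        running = [
            [r + c for r, c in zip(run, contrib)]
            for run, contrib in zip(running, channel)
        ]
        # Evaluate the intermediate logits.
        losses = [cross_entropy(lg, y) for lg, y in zip(running, labels)]
        # ASSUMED aggregation: weighted stage losses are summed.
        total += sum(w * l for w, l in zip(weights, losses)) / n
        # ASSUMED update rule: relative loss blended with uniform weights.
        mean_loss = sum(losses) / n
        if mean_loss > 0:
            weights = [
                boost_smoothing + (1 - boost_smoothing) * l / mean_loss
                for l in losses
            ]
        else:
            weights = [1.0] * n
    return total, running
```

The second return value holds the final merged logits, which is what would be used for prediction once all channels have contributed.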
