Authors: You Xie, Nils Thuerey

Abstract
We propose a novel training approach for improving generalization in neural networks. We show that, in contrast to conventional orthogonality constraints, our approach represents a data-dependent orthogonality constraint and is closely related to singular value decompositions of the weight matrices. We also show how our formulation can easily be realized in practical network architectures via a reverse pass that aims to reconstruct the full sequence of internal states of the network. Despite being a surprisingly simple change, we demonstrate that this forward-backward training approach, which we refer to as racecar training, leads to significantly more generic features being extracted from a given data set. Networks trained with our approach show more balanced mutual information between input and output throughout all layers, yield improved explainability, and exhibit improved performance for a variety of tasks and task transfers.

Links
Preprint
Code

Fig. 1: To enable racecar training, we propose a novel forward-reverse network structure that re-uses all weights of a network (left). It has positive effects on learning generic features across a wide range of tasks. In addition, we demonstrate that it yields a way to embed human-interpretable singular vectors into the weight matrices (right).
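To illustrate the forward-reverse idea described above, the following is a minimal sketch (not the authors' reference implementation): a small MLP whose reverse pass re-uses the transposed forward weight matrices to reconstruct the input, with a reconstruction term added to the task loss. The layer sizes, the loss weighting `lambda_rev`, and the class name `ForwardReverseMLP` are assumptions for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ForwardReverseMLP(nn.Module):
    """Sketch of a forward-reverse structure: the reverse pass re-uses
    the forward weights (transposed) instead of introducing new ones."""
    def __init__(self, d_in=784, d_hidden=256, d_out=10):
        super().__init__()
        self.w1 = nn.Linear(d_in, d_hidden)
        self.w2 = nn.Linear(d_hidden, d_out)

    def forward(self, x):
        # Forward pass through the regular task network.
        h = F.relu(self.w1(x))
        y = self.w2(h)
        # Reverse pass: map the output back towards the internal state
        # and the input using the same (transposed) weight matrices.
        h_rev = F.relu(F.linear(y, self.w2.weight.t()))
        x_rev = F.linear(h_rev, self.w1.weight.t())
        return y, x_rev

model = ForwardReverseMLP()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
lambda_rev = 1.0  # weighting of the reconstruction term (assumed value)

x = torch.randn(32, 784)              # dummy input batch
target = torch.randint(0, 10, (32,))  # dummy labels

y, x_rev = model(x)
# Combined objective: task loss plus reconstruction of the input
# via the shared, transposed weights.
loss = F.cross_entropy(y, target) + lambda_rev * F.mse_loss(x_rev, x)
opt.zero_grad()
loss.backward()
opt.step()
```

In this sketch only the input is reconstructed and biases are not re-used; extending the reverse pass to all intermediate states, as the abstract describes, follows the same weight-sharing pattern per layer.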