Autoencoders Implementation Design I

In this post, I’d like to discuss the existing architecture of variable transformation.

The following issues arise when implementing autoencoders in the current design:

[Figure: architecture]
  • All the current variable transformations inherit from TMVA::VariableTransformBase. To add a new transformation method in the current architecture, we need to inherit from the TMVA::VariableTransformBase class and implement all of its virtual functions. However, some of these virtual functions do not make sense for an autoencoder transformation. InverseTransform is one example: there is no guarantee that an inverse exists for the transformation that a deep neural network (here, the autoencoder) learns.

  • In the current design, the user is allowed to select variables of their choice, and the rest of the variables are then transformed by some one-to-one function. The number of variables is not reduced after the input selection; the transformation is strictly one-to-one. In an autoencoder transformation, by contrast, events are mapped to a completely new space with fewer variables.

  • The current variable transformation architecture is so tightly coupled that it is hard to connect it to the DNN namespace. It also works on the underlying assumption that a variable x is transformed to Ax, which is no longer the case.
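
To make the first two points concrete, here is a minimal C++ sketch. The classes below are hypothetical, simplified stand-ins that I made up for illustration (TransformBaseSketch, ScaleTransformSketch, EncoderTransformSketch); they are not the actual TMVA::VariableTransformBase interface, and the "encoder" is a fixed pair-averaging projection rather than a trained network.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

using Event = std::vector<double>;

// Hypothetical, simplified base class. It only mimics the idea that every
// transformation must provide both a forward and an inverse mapping.
class TransformBaseSketch {
public:
   virtual ~TransformBaseSketch() = default;
   virtual Event Transform(const Event& ev) const = 0;
   virtual Event InverseTransform(const Event& ev) const = 0;
};

// One-to-one transformation: same number of variables, exactly invertible.
class ScaleTransformSketch : public TransformBaseSketch {
public:
   explicit ScaleTransformSketch(double factor) : fFactor(factor) {
      assert(factor != 0.0);  // a zero factor would not be invertible
   }
   Event Transform(const Event& ev) const override {
      Event out(ev);
      for (double& x : out) x *= fFactor;
      return out;
   }
   Event InverseTransform(const Event& ev) const override {
      Event out(ev);
      for (double& x : out) x /= fFactor;
      return out;
   }
private:
   double fFactor;
};

// Encoder-like transformation: maps N inputs to N/2 outputs by averaging
// adjacent pairs (a stand-in for a trained encoder). Information is lost,
// so the inverse required by the base class cannot be implemented honestly.
class EncoderTransformSketch : public TransformBaseSketch {
public:
   Event Transform(const Event& ev) const override {
      Event out;
      for (std::size_t i = 0; i + 1 < ev.size(); i += 2)
         out.push_back(0.5 * (ev[i] + ev[i + 1]));
      return out;
   }
   Event InverseTransform(const Event&) const override {
      throw std::logic_error("no exact inverse for a dimension-reducing map");
   }
};
```

The scaling class fits the interface naturally, while the encoder-like class is forced to provide an InverseTransform that can only throw, and its output has fewer variables than its input.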

All of the above issues seem to be resolved if we implement the autoencoder transformation in the DataLoader class. I’d pass an option string for the autoencoder to the VarTransform method, similar to what was done for the Variance Threshold transformation. It would return a new DataLoader with the transformed variables, and all further analysis can then be done using the new DataLoader. I intend to use exactly the same design for the other transformations I’ll be implementing in the future, such as feature clustering and LLE.
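
As a rough illustration of this design, the sketch below uses a stand-alone mock class instead of the real TMVA::DataLoader. Everything in it is an assumption made for the example: the class name LoaderSketch, the option string format "AE(n)" (with n the number of output variables), the output variable names, and the placeholder encoder, which just averages input variables rather than running a trained network.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

using Event = std::vector<double>;

// Hypothetical stand-in for a data loader; NOT the real TMVA::DataLoader.
class LoaderSketch {
public:
   LoaderSketch(std::vector<std::string> names, std::vector<Event> events)
      : fNames(std::move(names)), fEvents(std::move(events)) {
      for (const Event& ev : fEvents) assert(ev.size() == fNames.size());
   }

   std::size_t NVariables() const { return fNames.size(); }
   std::size_t NEvents() const { return fEvents.size(); }
   const Event& GetEvent(std::size_t i) const { return fEvents.at(i); }

   // Parses the hypothetical option string "AE(n)" and returns a NEW loader
   // with n variables. The loader it is called on is left untouched.
   std::unique_ptr<LoaderSketch> VarTransform(const std::string& option) const {
      const std::string prefix = "AE(";
      if (option.size() < prefix.size() + 2 ||
          option.compare(0, prefix.size(), prefix) != 0 || option.back() != ')')
         throw std::invalid_argument("unsupported option: " + option);
      const std::size_t nOut = std::stoul(
         option.substr(prefix.size(), option.size() - prefix.size() - 1));
      if (nOut == 0 || nOut > NVariables())
         throw std::invalid_argument("bad output dimension in: " + option);

      std::vector<std::string> names;
      for (std::size_t j = 0; j < nOut; ++j)
         names.push_back("ae_" + std::to_string(j));

      std::vector<Event> encoded;
      for (const Event& ev : fEvents) encoded.push_back(Encode(ev, nOut));
      return std::make_unique<LoaderSketch>(std::move(names), std::move(encoded));
   }

private:
   // Placeholder for a trained encoder: output j is the mean of all inputs i
   // with i % nOut == j. A real implementation would run the network here.
   static Event Encode(const Event& ev, std::size_t nOut) {
      Event sum(nOut, 0.0), count(nOut, 0.0);
      for (std::size_t i = 0; i < ev.size(); ++i) {
         sum[i % nOut] += ev[i];
         count[i % nOut] += 1.0;
      }
      for (std::size_t j = 0; j < nOut; ++j) sum[j] /= count[j];
      return sum;
   }

   std::vector<std::string> fNames;
   std::vector<Event> fEvents;
};
```

With this shape, a call such as loader.VarTransform("AE(2)") on a four-variable loader yields a second, two-variable loader, and the original loader stays available for comparison.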

I am now figuring out how to use the functionality of the DNN namespace in this autoencoder transformation. I feel it is necessary to use the DNN namespace because I need to set up and train the net before the transformation. To obtain a good representation of the data, we pre-train the autoencoder as a stack of single-layer autoencoders: we train the first hidden layer to reconstruct the raw input data, train the second hidden layer to reconstruct the states of the first hidden layer, and so on. After this pre-training phase, the results are fine-tuned using backpropagation. The implementation and design of this pre-training procedure therefore also need to be figured out.
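
The greedy layer-wise pre-training described above could look roughly like the following stand-alone sketch. It does not use the TMVA DNN namespace; the names LayerAE, PretrainStack and EncodeStack, the tanh hidden units with a linear decoder, the squared reconstruction loss, and plain per-event SGD are all my own assumptions for the example. The final fine-tuning pass over the whole stack is left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // Mat[row][col]

// Single-layer autoencoder: h = tanh(W x + b), y = V h + c.
struct LayerAE {
   Mat W; Vec b;  // encoder: nHidden x nIn weights, nHidden biases
   Mat V; Vec c;  // decoder: nIn x nHidden weights, nIn biases

   LayerAE(std::size_t nIn, std::size_t nHidden, std::mt19937& rng)
      : W(nHidden, Vec(nIn)), b(nHidden, 0.0), V(nIn, Vec(nHidden)), c(nIn, 0.0) {
      std::uniform_real_distribution<double> u(-0.1, 0.1);
      for (Vec& row : W) for (double& w : row) w = u(rng);
      for (Vec& row : V) for (double& v : row) v = u(rng);
   }

   Vec Encode(const Vec& x) const {
      assert(!W.empty() && x.size() == W[0].size());
      Vec h(b);
      for (std::size_t j = 0; j < W.size(); ++j) {
         for (std::size_t i = 0; i < x.size(); ++i) h[j] += W[j][i] * x[i];
         h[j] = std::tanh(h[j]);
      }
      return h;
   }

   Vec Decode(const Vec& h) const {
      Vec y(c);
      for (std::size_t i = 0; i < V.size(); ++i)
         for (std::size_t j = 0; j < h.size(); ++j) y[i] += V[i][j] * h[j];
      return y;
   }

   // One SGD step on 0.5 * ||Decode(Encode(x)) - x||^2.
   // Returns the loss measured before the update.
   double TrainStep(const Vec& x, double lr) {
      const Vec h = Encode(x);
      const Vec y = Decode(h);
      Vec dy(x.size()), dh(h.size(), 0.0);
      double loss = 0.0;
      for (std::size_t i = 0; i < x.size(); ++i) {
         dy[i] = y[i] - x[i];
         loss += 0.5 * dy[i] * dy[i];
      }
      // Backpropagate through the decoder and tanh before changing V.
      for (std::size_t j = 0; j < h.size(); ++j) {
         for (std::size_t i = 0; i < x.size(); ++i) dh[j] += V[i][j] * dy[i];
         dh[j] *= 1.0 - h[j] * h[j];
      }
      for (std::size_t i = 0; i < x.size(); ++i) {
         for (std::size_t j = 0; j < h.size(); ++j) V[i][j] -= lr * dy[i] * h[j];
         c[i] -= lr * dy[i];
      }
      for (std::size_t j = 0; j < h.size(); ++j) {
         for (std::size_t i = 0; i < x.size(); ++i) W[j][i] -= lr * dh[j] * x[i];
         b[j] -= lr * dh[j];
      }
      return loss;
   }
};

// Greedy layer-wise pre-training: layer k learns to reconstruct the hidden
// states of layer k-1 (layer 0 reconstructs the raw input).
std::vector<LayerAE> PretrainStack(std::vector<Vec> data,
                                   const std::vector<std::size_t>& hiddenSizes,
                                   int epochs, double lr, unsigned seed) {
   assert(!data.empty());
   std::mt19937 rng(seed);
   std::vector<LayerAE> stack;
   for (std::size_t nHidden : hiddenSizes) {
      LayerAE ae(data[0].size(), nHidden, rng);
      for (int e = 0; e < epochs; ++e)
         for (const Vec& x : data) ae.TrainStep(x, lr);
      for (Vec& x : data) x = ae.Encode(x);  // training input for the next layer
      stack.push_back(std::move(ae));
   }
   return stack;
}

// Passes an event through all pre-trained encoders in order.
Vec EncodeStack(const std::vector<LayerAE>& stack, Vec x) {
   for (const LayerAE& ae : stack) x = ae.Encode(x);
   return x;
}
```

For example, PretrainStack(data, {3, 2}, 200, 0.05, 42) on four-variable events would train a 4-to-3 layer and then a 3-to-2 layer, and EncodeStack would map each event to two variables. How this maps onto the existing net and training classes in the DNN namespace is exactly the open design question.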

I’ll discuss the prototype of the autoencoder transformation soon, in my next post, Autoencoders Implementation Design II.

