Autoencoders Implementation Design I

In this post, I’d like to discuss the existing architecture of variable transformation.

The following are the issues with implementing autoencoders in the current design:

[Figure: current design]

All of the above issues seem to be resolved if we implement the autoencoder transformation in the DataLoader class. I’d pass an option string for the autoencoder to the VarTransform method, similar to what was done for the Variance Threshold transformation. It would return a new DataLoader with the transformed variables, and all further analysis could then be done using the new DataLoader. I intend to use exactly the same design for the other transformations I’ll be implementing in the future, such as feature clustering and LLE.
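The pattern described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the actual TMVA code: `ToyDataLoader`, `var_transform`, `TRANSFORMS` and the `"VT(threshold)"` option format are hypothetical stand-ins. The point is that the method parses an option string, dispatches to a transformation, and returns a new loader while leaving the original untouched.

```python
import re
from statistics import pvariance


class ToyDataLoader:
    """Hypothetical stand-in for a data loader: variable names plus rows of values."""

    def __init__(self, names, rows):
        self.names = list(names)
        self.rows = [list(r) for r in rows]

    def var_transform(self, option):
        """Parse an option string such as "VT(0.5)" and return a NEW loader."""
        match = re.fullmatch(r"\s*(\w+)\(([^)]*)\)\s*", option)
        if match is None:
            raise ValueError(f"malformed option string: {option!r}")
        name, arg = match.group(1), match.group(2)
        if name not in TRANSFORMS:
            raise ValueError(f"unknown transformation: {name!r}")
        return TRANSFORMS[name](self, arg)


def variance_threshold(loader, arg):
    """Keep only the variables whose population variance exceeds the threshold."""
    threshold = float(arg)
    columns = list(zip(*loader.rows))
    keep = [i for i, col in enumerate(columns) if pvariance(col) > threshold]
    names = [loader.names[i] for i in keep]
    rows = [[row[i] for i in keep] for row in loader.rows]
    return ToyDataLoader(names, rows)


# Registry of transformations; an autoencoder entry would be added the same way.
TRANSFORMS = {"VT": variance_threshold}
```

With this layout, adding a new transformation only means registering one more function in `TRANSFORMS`; the calling code keeps working with whichever loader it gets back.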

Now I am figuring out how to use the functionality of the DNN namespace in this autoencoder transformation. I feel it is necessary to use the DNN namespace, as I need to set up and train the net before the transformation. To obtain a good representation of the data, we pre-train the autoencoder as a stack of single-layer autoencoders. We train the first hidden layer to reconstruct the raw input data, train the second hidden layer to reconstruct the states of the first hidden layer, and so on. After this pre-training phase, the results are fine-tuned using backpropagation. So the implementation and design of this pre-training procedure also need to be figured out.
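To make the pre-training procedure concrete, here is a minimal pure-Python sketch of greedy layer-wise pre-training. It does not use the DNN namespace; all class and function names are hypothetical, and the choices of a sigmoid encoder, a linear decoder, squared-error loss and plain stochastic gradient descent are my own assumptions for illustration. The final fine-tuning pass through the whole stack is left out.

```python
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


class SingleLayerAutoencoder:
    """One hidden layer: sigmoid encoder, linear decoder (assumed for this sketch)."""

    def __init__(self, n_in, n_hidden, rng):
        self.W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b = [0.0] * n_hidden
        self.V = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
        self.c = [0.0] * n_in

    def encode(self, x):
        return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bk)
                for row, bk in zip(self.W, self.b)]

    def decode(self, h):
        return [sum(v * hk for v, hk in zip(row, h)) + cj
                for row, cj in zip(self.V, self.c)]

    def loss(self, data):
        """Mean squared reconstruction error per sample."""
        total = 0.0
        for x in data:
            r = self.decode(self.encode(x))
            total += sum((rj - xj) ** 2 for rj, xj in zip(r, x))
        return total / len(data)

    def train(self, data, epochs=200, lr=0.1):
        for _ in range(epochs):
            for x in data:
                h = self.encode(x)
                err = [rj - xj for rj, xj in zip(self.decode(h), x)]
                # Gradients w.r.t. hidden activations, computed before any update.
                dh = [sum(err[j] * self.V[j][k] for j in range(len(x)))
                      for k in range(len(h))]
                da = [dh[k] * h[k] * (1.0 - h[k]) for k in range(len(h))]
                for j in range(len(x)):
                    for k in range(len(h)):
                        self.V[j][k] -= lr * err[j] * h[k]
                    self.c[j] -= lr * err[j]
                for k in range(len(h)):
                    for i in range(len(x)):
                        self.W[k][i] -= lr * da[k] * x[i]
                    self.b[k] -= lr * da[k]


def pretrain_stack(data, hidden_sizes, seed=0):
    """Train each layer to reconstruct the hidden states of the layer before it."""
    rng = random.Random(seed)
    layers, current = [], data
    for n_hidden in hidden_sizes:
        layer = SingleLayerAutoencoder(len(current[0]), n_hidden, rng)
        layer.train(current)
        layers.append(layer)
        current = [layer.encode(x) for x in current]  # input for the next layer
    return layers


def encode_stack(layers, x):
    """Pass one sample through all pre-trained encoders in order."""
    for layer in layers:
        x = layer.encode(x)
    return x
```

For example, `pretrain_stack(data, [3, 2])` would train a 3-unit layer on the raw inputs and then a 2-unit layer on the 3-unit hidden states, and `encode_stack` would give the 2-dimensional representation of a sample.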

I’ll discuss the prototype of the autoencoder transformation soon in my next post, Autoencoders Implementation Design II.