Date of Award

May 2018

Degree Type


Degree Name

Doctor of Philosophy


Computer Science

First Advisor

Tian Zhao

Committee Members

Yu Cao, Ethan Munson, Susan McRoy, Xiangming Mu


compiler, convolutional neural network, deep learning, deep neural network, DSL, functional programming


\textit {Deep Learning} (DL) has found great success in well-diversified areas such as machine vision, speech recognition, big data analysis, and multimedia understanding recently. However, the existing state-of-the-art DL frameworks, e.g. Caffe2, Theano, TensorFlow, MxNet, Torch7, and CNTK, are programming libraries with fixed user interfaces, internal representations, and execution environments. Modifying the code of DL layers or data structure is very challenging without in-depth understanding of the underlying implementation. The optimization of the code and execution in these tools is often limited and relies on the specific DL computation graph manipulation and scheduling that lack systematic and universal strategies. Furthermore, most of these tools demand many dependencies beside the tool itself and require to be built to some specific platforms for DL training or inference.


\noindent This dissertation presents {\it DeepDSL}, a \textit {domain specific language} (DSL) embedded in Scala, that compiles DL networks encoded with DeepDSL to efficient, compact, and portable Java source programs for DL training and inference. DeepDSL represents DL networks as abstract tensor functions, performs symbolic gradient derivations to generate the Intermediate Representation (IR), optimizes the IR expressions, and compiles the optimized IR expressions to cross-platform Java code that is easily modifiable and debuggable. Also, the code directly runs on GPU without additional dependencies except a small set of \textit{JNI} (Java Native Interface) wrappers for invoking the underneath GPU libraries. Moreover, DeepDSL provides static analysis for memory consumption and error detection.


\noindent DeepDSL\footnote{Our previous results are reported in~\cite{zhao2017}; design and implementation details are summarized in~\cite{Zhao2018}.} has been evaluated with many current state-of-the-art DL networks (e.g. Alexnet, GoogleNet, VGG, Overfeat, and Deep Residual Network). While the DSL code is highly compact with less than 100 lines for each of the network, the Java source code generated by the DeepDSL compiler is highly efficient. Our experiments show that the output java source has very competitive runtime performance and memory efficiency compared to the existing DL frameworks.