Residual Capsule Networks
Mentor 1
Lingfeng Wang
Location
Union Wisconsin Room
Start Date
5-4-2019 1:30 PM
End Date
5-4-2019 3:30 PM
Description
Artificial neural network systems have excelled in classification, decision-making, and prediction tasks because they can effectively learn representational relationships by processing large numbers of examples. For neural networks to be applicable and widely accessible, architectures must be developed that decrease training time, cost, and the amount of data required. The most prominent architectures in use are convolutional neural networks (CNNs), which learn shared “concepts” throughout the data for a given example. CNNs excel at capturing the hierarchical properties of information, yet one of their challenges is keeping perspective independent of the learned features. Capsule networks are an extension of CNNs that treat perspective, or pose, independently of learned features. This architecture can be built with fewer parameters but still requires substantial computation. I propose an architecture that passes additional residual information through the layers of a capsule network in order to reach convergence faster and to open avenues for exploring the use of ODE solvers to determine the proper weight matrix of each capsule in every layer. It is feasible to use ODEs to calculate the activation output of a higher-layer capsule given the current activation values of the layers before it.
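To make the proposed residual connection concrete, here is a minimal sketch of a single capsule layer with a residual skip, written in PyTorch. The shapes, the single-pass (uniform) routing, and the placement of the skip connection are illustrative assumptions rather than the exact architecture described above; the squash nonlinearity follows the standard capsule formulation of Sabour et al. (2017).

```python
import torch
import torch.nn as nn

def squash(s, dim=-1, eps=1e-8):
    # Capsule squashing nonlinearity: shrinks short vectors toward zero
    # and scales long vectors toward unit length, preserving direction.
    norm_sq = (s ** 2).sum(dim=dim, keepdim=True)
    return (norm_sq / (1.0 + norm_sq)) * s / torch.sqrt(norm_sq + eps)

class ResidualCapsuleLayer(nn.Module):
    """One capsule layer with a residual skip connection (illustrative).

    Transforms `in_caps` capsules of dimension `in_dim` into `out_caps`
    capsules of dimension `out_dim`. For the skip to be a plain addition,
    the input and output capsule shapes must match.
    """
    def __init__(self, in_caps, in_dim, out_caps, out_dim):
        super().__init__()
        # One transformation matrix per (output capsule, input capsule) pair.
        self.W = nn.Parameter(0.01 * torch.randn(out_caps, in_caps, out_dim, in_dim))

    def forward(self, u):
        # u: (batch, in_caps, in_dim)
        # Prediction vectors: u_hat[b, j, i] = W[j, i] @ u[b, i]
        u_hat = torch.einsum('jiod,bid->bjio', self.W, u)
        # Uniform single-pass routing stands in for iterative routing here.
        s = u_hat.mean(dim=2)            # (batch, out_caps, out_dim)
        v = squash(s)
        # Residual connection: add the input capsules to the squashed output
        # so lower-layer activation information flows directly upward.
        if v.shape == u.shape:
            v = v + u
        return v

if __name__ == "__main__":
    layer = ResidualCapsuleLayer(in_caps=8, in_dim=16, out_caps=8, out_dim=16)
    x = torch.randn(2, 8, 16)
    print(layer(x).shape)  # torch.Size([2, 8, 16])
```

Adding the input capsules directly to the layer's output mirrors how residual blocks in ResNets let earlier activations flow forward unchanged, which is the property the proposed architecture relies on to speed convergence and to frame the layer-to-layer update as a step an ODE solver could integrate.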