Residual Capsule Networks

Mentor 1

Lingfeng Wang

Location

Union Wisconsin Room

Start Date

5-4-2019 1:30 PM

End Date

5-4-2019 3:30 PM

Description

Artificial neural network systems have excelled in classification, decision-making, and prediction tasks because they can effectively learn representational relationships from large numbers of examples. For neural networks to be widely applicable and accessible, architectures must be developed that decrease training time, cost, and the amount of data needed. The most prominent architecture in use is the convolutional neural network (CNN), which learns "concepts" shared throughout the data for a given example. CNNs excel at capturing the hierarchical structure of information, yet one challenge is keeping perspective independent from the learned features. Capsule networks extend CNNs by modeling perspective, or pose, independently of the learned features. They achieve this with fewer parameters but still require substantial computation. I propose an architecture that passes additional residual information through the layers of a capsule network in order to reach convergence faster, and that opens avenues to explore ODE solvers for determining the proper weight matrix of each capsule in every layer. It is feasible to use ODEs to calculate the activation output of a higher-layer capsule given the current activation values of the layers before it.
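The connection between residual layers and ODE solvers comes from viewing a residual update u + f(u) as one explicit Euler step of an ODE du/dt = f(u). A minimal NumPy sketch of a residual capsule-layer update under that view (toy single-iteration routing with uniform coupling; the function and variable names here are hypothetical, not from the proposed architecture):

```python
import numpy as np

def squash(v, axis=-1, eps=1e-8):
    # Capsule squashing nonlinearity: preserves direction,
    # maps the vector norm into [0, 1).
    sq = np.sum(v ** 2, axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * v / np.sqrt(sq + eps)

def residual_capsule_step(u, W):
    # u: (n_caps, d)   lower-layer capsule activation vectors
    # W: (n_caps, d, d) per-capsule transform (toy stand-in for routing)
    u_hat = np.einsum('nij,nj->ni', W, u)  # prediction "votes"
    s = squash(u_hat)                      # routed layer output
    # Residual skip: u_{t+1} = u_t + f(u_t), i.e. one Euler step
    # of du/dt = f(u) with unit step size.
    return u + s
```

With zero transforms the residual path vanishes and the activations pass through unchanged, which is the Euler-step identity behavior the residual formulation relies on.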

