arxiv:1904.04862

SWNet: Small-World Neural Networks and Rapid Convergence

Published on Apr 9, 2019

Authors:

Abstract

Small-world network transformations improve deep learning model convergence by optimizing layer connectivity and reducing redundant parameters while maintaining classification accuracy across multiple datasets.

AI-generated summary

Training large and highly accurate deep learning (DL) models is computationally costly. This cost is in great part due to the excessive number of trained parameters, which are well-known to be redundant and compressible for the execution phase. This paper proposes a novel transformation which changes the topology of the DL architecture such that it reaches an optimal cross-layer connectivity. This transformation leverages our important observation that for a set level of accuracy, convergence is fastest when network topology reaches the boundary of a Small-World Network. Small-world graphs are known to possess a specific connectivity structure that enables enhanced signal propagation among nodes. Our small-world models, called SWNets, provide several intriguing benefits: they facilitate data (gradient) flow within the network, enable feature-map reuse by adding long-range connections and accommodate various network architectures/datasets. Compared to densely connected networks (e.g., DenseNets), SWNets require a substantially fewer number of training parameters while maintaining a similar level of classification accuracy. We evaluate our networks on various DL model architectures and image classification datasets, namely, CIFAR10, CIFAR100, and ILSVRC (ImageNet). Our experiments demonstrate an average of ~2.1x improvement in convergence speed to the desired accuracy

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/1904.04862 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/1904.04862 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/1904.04862 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.