The Propensity for Density in Feed-forward Models

Nandi Schoots*, Alex Jackson*, Ali Kholmovaia, Peter McBurney, Murray Shanahan

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review


Abstract

Does the process of training a neural network to solve a task tend to use all of the available weights, even when the task could be solved with fewer? To address this question, we study the effects of pruning fully connected, convolutional, and residual models while varying their widths. We find that the proportion of weights that can be pruned without degrading performance is largely invariant to model size. Increasing the width of a model has little effect on the density of the pruned model relative to the increase in absolute size of the pruned network. In particular, we find substantial prunability across a large range of model sizes, where our biggest model is 50 times as wide as our smallest model. We explore three hypotheses that could explain these findings.
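
To illustrate the kind of experiment the abstract describes, the sketch below trains nothing and fixes nothing about the authors' actual protocol; it simply shows how one could build fully connected models of several widths, apply one-shot global magnitude pruning in PyTorch, and record the surviving weight density. The input dimension, widths, and the 90% pruning fraction are illustrative assumptions, not values from the paper.

    # Minimal sketch (assumed setup, not the authors' exact method): magnitude-prune
    # MLPs of varying width and report the fraction of weights that survive.
    import torch
    import torch.nn as nn
    import torch.nn.utils.prune as prune

    def make_mlp(width: int, in_dim: int = 784, n_classes: int = 10) -> nn.Sequential:
        """Two-hidden-layer fully connected network of a given width."""
        return nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_dim, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, n_classes),
        )

    def density_after_pruning(model: nn.Module, amount: float) -> float:
        """Globally prune the smallest-magnitude weights; return surviving density."""
        params = [(m, "weight") for m in model.modules() if isinstance(m, nn.Linear)]
        prune.global_unstructured(params, pruning_method=prune.L1Unstructured, amount=amount)
        remaining = sum(int(m.weight_mask.sum()) for m, _ in params)
        total = sum(m.weight_mask.numel() for m, _ in params)
        return remaining / total

    if __name__ == "__main__":
        # Widths spanning roughly a 50x range, mirroring the smallest-to-largest
        # ratio mentioned in the abstract (the specific values are assumptions).
        for width in [16, 64, 256, 800]:
            model = make_mlp(width)
            # Training and evaluation on a real dataset would go here; the paper's
            # question is whether the prunable proportion stays flat as width grows.
            n_weights = sum(p.numel() for p in model.parameters() if p.dim() > 1)
            d = density_after_pruning(model, amount=0.9)
            print(f"width={width:4d}  weights={n_weights:8d}  density after pruning={d:.2f}")

In the paper's framing, the quantity of interest is how the achievable post-pruning density changes (or fails to change) as width increases; the loop above is the scaffolding on which such a measurement would sit.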
Original language: English
Title of host publication: 27th European Conference on Artificial Intelligence
Publisher: IOS Press
Pages: 2830-2837
Number of pages: 8
Volume: 392
DOIs
Publication status: Published - 18 Oct 2024

Keywords

  • science of deep learning
  • sparsity
  • neural networks
  • simplicity
  • deep learning
  • machine learning
