Abstract
The explosion of interest in deep learning over the last decade has been driven by scaling brain-inspired mathematical models known as Deep Neural Networks (DNNs). Common deep learning architectures have millions of parameters that require optimisation (training) for the network to learn tasks, and empirical evidence suggests that training an overparameterised network (one with more parameters than data points in the training set) is necessary for the network to learn. Despite this, it has been shown that overparameterised DNNs can typically be compressed to a fraction of their original size once trained (frequently by over 90%). Further, the Lottery Ticket Hypothesis (LTH) asserts that there exists a much sparser but equally trainable subnetwork within any sufficiently overparameterised DNN initialisation. In other words, the number of parameters could be significantly reduced from the outset if we could find these smaller initialisations (referred to as winning tickets). In this paper, we introduce a new evolutionary algorithm for finding winning tickets given a feed-forward or convolutional DNN architecture, and compare our approach to the current state-of-the-art. We refer to our algorithm as Neuroevolutionary Ticket Search (NeTS) and find that it discovers competitive winning tickets for a variety of architectures and two common training datasets (MNIST and CIFAR-10). We show that NeTS can be applied to pruning DNNs before substantial training with gradient descent by genetically optimising a genome consisting of a set of initial weights and a binary pruning mask; this appears to offer a significant performance benefit.
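The genome representation described in the abstract (a set of initial weights paired with a binary pruning mask, optimised evolutionarily) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy network, synthetic data, fitness proxy, selection scheme, and all hyperparameters below are assumptions chosen for demonstration only.

```python
# Illustrative sketch of an evolutionary "ticket search" over (initial weights, mask)
# genomes. Everything here (fitness proxy, hyperparameters, toy data) is an assumption
# for demonstration, not taken from the NeTS paper.
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: a single-hidden-layer network on synthetic binary-classification data.
X = rng.normal(size=(256, 20))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
HIDDEN = 32
N_W1, N_W2 = 20 * HIDDEN, HIDDEN * 2          # weight counts per layer

def init_genome(sparsity=0.8):
    """A genome is a flat weight vector plus a binary pruning mask of equal length."""
    weights = rng.normal(scale=0.1, size=N_W1 + N_W2)
    mask = (rng.random(N_W1 + N_W2) > sparsity).astype(np.float32)
    return weights, mask

def fitness(genome):
    """Proxy fitness: accuracy of the masked, untrained network on the toy data.
    (A faithful implementation would presumably train briefly before scoring.)"""
    w, m = genome
    wm = w * m
    W1 = wm[:N_W1].reshape(20, HIDDEN)
    W2 = wm[N_W1:].reshape(HIDDEN, 2)
    logits = np.maximum(X @ W1, 0) @ W2       # ReLU MLP forward pass
    return float((logits.argmax(axis=1) == y).mean())

def crossover(a, b):
    """Uniform crossover applied position-wise to both weights and mask."""
    pick = rng.random(a[0].size) < 0.5
    return np.where(pick, a[0], b[0]), np.where(pick, a[1], b[1])

def mutate(genome, w_std=0.02, flip_p=0.01):
    """Perturb weights with Gaussian noise and flip mask bits with small probability."""
    w, m = genome
    w = w + rng.normal(scale=w_std, size=w.size)
    flips = rng.random(m.size) < flip_p
    return w, np.where(flips, 1.0 - m, m)

# Simple generational loop with elitist selection.
population = [init_genome() for _ in range(30)]
for gen in range(20):
    scored = sorted(population, key=fitness, reverse=True)
    elite = scored[: len(scored) // 5]         # keep the top 20%
    children = []
    while len(children) < len(population) - len(elite):
        p1, p2 = rng.choice(len(elite), size=2, replace=True)
        children.append(mutate(crossover(elite[p1], elite[p2])))
    population = elite + children

best = max(population, key=fitness)
print(f"best proxy fitness: {fitness(best):.3f}, "
      f"density: {best[1].mean():.2%} of weights kept")
```

The key design point the sketch tries to convey is that pruning decisions and initial weight values evolve jointly, so the search can favour sparse masks whose surviving weights happen to be well placed at initialisation.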
| Field | Value |
|---|---|
| Original language | English |
| Title of host publication | 2023 Conference on Artificial Life |
| Publisher | MIT Press |
| Number of pages | 11 |
| Publication status | Published - 24 Jul 2023 |