On the initialization of long short-term memory networks

Mostafa Mehdipour Ghazi*, Mads Nielsen, Akshay Pai, Marc Modat, M. Jorge Cardoso, Sébastien Ourselin, Lauge Sørensen

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference paper; peer-review

8 Citations (Scopus)

Abstract

Weight initialization is important for faster convergence and stability of deep neural network training. In this paper, a robust initialization method is developed to address training instability in long short-term memory (LSTM) networks. It is based on a normalized random initialization of the network weights that aims to keep the variance of the network input and output in the same range. The method is applied to standard LSTMs for univariate time series regression and to LSTMs robust to missing values for multivariate disease progression modeling. The results show that, in all cases, the proposed initialization method outperforms state-of-the-art initialization techniques in terms of training convergence and generalization performance of the obtained solution.
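The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea of variance-preserving random initialization, not the authors' method. It uses the well-known Glorot (Xavier) uniform scheme as a stand-in. The function names `glorot_uniform` and `init_lstm_weights`, the gate labels, and the zero biases are assumptions made for illustration.

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, rng):
    """Glorot/Xavier uniform init (a stand-in, NOT the paper's scheme).

    Samples from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)),
    so that Var(W) = 2 / (fan_in + fan_out).
    """
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_out, fan_in))

def init_lstm_weights(input_size, hidden_size, seed=0):
    """Initialize the four LSTM gates with variance-scaled random weights.

    Hypothetical helper: gate names, dict layout, and zero biases are
    illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    params = {}
    for gate in ("input", "forget", "cell", "output"):
        params[gate] = {
            "W": glorot_uniform(input_size, hidden_size, rng),   # input-to-hidden
            "U": glorot_uniform(hidden_size, hidden_size, rng),  # hidden-to-hidden
            "b": np.zeros(hidden_size),                          # bias
        }
    return params

params = init_lstm_weights(input_size=100, hidden_size=200)
print(params["input"]["W"].shape, params["input"]["W"].var())
```

Scaling the sampling range by the layer's fan-in and fan-out is what keeps the weight variance tied to the layer size; the paper's normalization may differ in its exact form.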

Original language: English
Title of host publication: Neural Information Processing - 26th International Conference, ICONIP 2019, Proceedings
Editors: Tom Gedeon, Kok Wai Wong, Minho Lee
Publisher: Springer
Pages: 275-286
Number of pages: 12
ISBN (Print): 9783030367077
Publication status: Published - 1 Jan 2019
Event: 26th International Conference on Neural Information Processing, ICONIP 2019 - Sydney, Australia
Duration: 12 Dec 2019 to 15 Dec 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11953 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 26th International Conference on Neural Information Processing, ICONIP 2019
Country/Territory: Australia
City: Sydney
Period: 12/12/2019 to 15/12/2019

Keywords

  • Deep neural networks
  • Disease progression modeling
  • Initialization
  • Long short-term memory
  • Time series regression
