Research output: Contribution to journal › Article › peer-review

**How to iron out rough landscapes and get optimal performances: Averaged Gradient Descent and its application to tensor PCA.** / Biroli, Giulio; Cammarota, Chiara; Ricci-Tersenghi, Federico.

Research output: Contribution to journal › Article › peer-review

Biroli, G, Cammarota, C & Ricci-Tersenghi, F 2020, 'How to iron out rough landscapes and get optimal performances: Averaged Gradient Descent and its application to tensor PCA', *Journal Of Physics A-Mathematical And Theoretical*, vol. 53, no. 17, 174003. https://doi.org/10.1088/1751-8121/ab7b1f

Biroli, G., Cammarota, C., & Ricci-Tersenghi, F. (2020). How to iron out rough landscapes and get optimal performances: Averaged Gradient Descent and its application to tensor PCA. *Journal Of Physics A-Mathematical And Theoretical*, *53*(17), [174003]. https://doi.org/10.1088/1751-8121/ab7b1f

Biroli G, Cammarota C, Ricci-Tersenghi F. How to iron out rough landscapes and get optimal performances: Averaged Gradient Descent and its application to tensor PCA. Journal Of Physics A-Mathematical And Theoretical. 2020 Apr 8;53(17). 174003. https://doi.org/10.1088/1751-8121/ab7b1f

@article{1dc38108eadb49bea07c9ab5bf80df3a,

title = "How to iron out rough landscapes and get optimal performances: Averaged Gradient Descent and its application to tensor PCA",

abstract = "In many high-dimensional estimation problems the main task consists in minimizing a cost function, which is often strongly non-convex when scanned in the space of parameters to be estimated. A standard solution to flatten the corresponding rough landscape consists in summing the losses associated to different data points and obtaining a smoother empirical risk. Here we propose a complementary method that works for a single data point. The main idea is that a large amount of the roughness is uncorrelated in different parts of the landscape. One can then substantially reduce the noise by evaluating an empirical average of the gradient obtained as a sum over many random independent positions in the space of parameters to be optimized. We present an algorithm, called averaged gradient descent, based on this idea and we apply it to tensor PCA, which is a very hard estimation problem. We show that averaged gradient descent over-performs physical algorithms such as gradient descent and approximate message passing and matches the best algorithmic thresholds known so far, obtained by tensor unfolding and methods based on sum-of-squares.",

keywords = "averaged gradient descent, gradient descent, tensor PCA",

author = "Giulio Biroli and Chiara Cammarota and Federico Ricci-Tersenghi",

year = "2020",

month = apr,

day = "8",

doi = "10.1088/1751-8121/ab7b1f",

language = "English",

volume = "53",

journal = "Journal Of Physics A-Mathematical And Theoretical",

issn = "1751-8113",

publisher = "IOP Publishing",

number = "17",

}

TY - JOUR

T1 - How to iron out rough landscapes and get optimal performances: Averaged Gradient Descent and its application to tensor PCA

AU - Biroli, Giulio

AU - Cammarota, Chiara

AU - Ricci-Tersenghi, Federico

PY - 2020/4/8

Y1 - 2020/4/8

N2 - In many high-dimensional estimation problems the main task consists in minimizing a cost function, which is often strongly non-convex when scanned in the space of parameters to be estimated. A standard solution to flatten the corresponding rough landscape consists in summing the losses associated to different data points and obtaining a smoother empirical risk. Here we propose a complementary method that works for a single data point. The main idea is that a large amount of the roughness is uncorrelated in different parts of the landscape. One can then substantially reduce the noise by evaluating an empirical average of the gradient obtained as a sum over many random independent positions in the space of parameters to be optimized. We present an algorithm, called averaged gradient descent, based on this idea and we apply it to tensor PCA, which is a very hard estimation problem. We show that averaged gradient descent over-performs physical algorithms such as gradient descent and approximate message passing and matches the best algorithmic thresholds known so far, obtained by tensor unfolding and methods based on sum-of-squares.

AB - In many high-dimensional estimation problems the main task consists in minimizing a cost function, which is often strongly non-convex when scanned in the space of parameters to be estimated. A standard solution to flatten the corresponding rough landscape consists in summing the losses associated to different data points and obtaining a smoother empirical risk. Here we propose a complementary method that works for a single data point. The main idea is that a large amount of the roughness is uncorrelated in different parts of the landscape. One can then substantially reduce the noise by evaluating an empirical average of the gradient obtained as a sum over many random independent positions in the space of parameters to be optimized. We present an algorithm, called averaged gradient descent, based on this idea and we apply it to tensor PCA, which is a very hard estimation problem. We show that averaged gradient descent over-performs physical algorithms such as gradient descent and approximate message passing and matches the best algorithmic thresholds known so far, obtained by tensor unfolding and methods based on sum-of-squares.

KW - averaged gradient descent

KW - gradient descent

KW - tensor PCA

UR - http://www.scopus.com/inward/record.url?scp=85085151184&partnerID=8YFLogxK

U2 - 10.1088/1751-8121/ab7b1f

DO - 10.1088/1751-8121/ab7b1f

M3 - Article

VL - 53

JO - Journal Of Physics A-Mathematical And Theoretical

JF - Journal Of Physics A-Mathematical And Theoretical

SN - 1751-8113

IS - 17

M1 - 174003

ER -

King's College London - Homepage

© 2020 King's College London | Strand | London WC2R 2LS | England | United Kingdom | Tel +44 (0)20 7836 5454