Abstract
Machine learning is an artificial intelligence technique used to automatically infer rules from data and to apply these rules to tasks on unseen data. This technique is widely used in finance and other disciplines and is characterised by the combination of massive amounts of data and the powerful computing abilities of modern computers. The surge of cryptocurrency markets, with their high fluctuations, has challenged both traditional econometric tools, based on statistics and time series analysis, and machine learning. Investigations into the analysis of cryptocurrency markets and the use of emerging machine learning techniques within them are therefore useful for researchers who wish to compare market performance and technological innovation across traditional equity/bond markets and cryptocurrency markets.
A difficulty in this area, however, is the wide array of disciplines contributing to the field. Although there exists a wealth of surveys related to the research on blockchain and cryptocurrencies, none is truly comprehensive or cuts across different fields. The first part of the thesis focuses on a widely cited survey on cryptocurrency trading, which informs much academic and industry work in this area. The survey provides an in-depth analysis of the literature from the perspective of research distribution among properties, categories, technologies, datasets, research trends and opportunities.
One of the research directions identified in the survey is the prediction of signals for cryptocurrency markets on live data. We address this research challenge in the second part of the thesis. An important finding of this work is that by using multi-layer architectures, deep learning models, and dynamic retraining methods, we can overcome the decay in predictive power on live data due to non-stationary features of the order book. A new dynamic retraining structure is proposed and compared to existing training frameworks in this part.
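The idea of retraining on recent data to counter non-stationarity can be illustrated with a minimal sketch. It is not the retraining structure proposed in the thesis: the toy mean predictor, the synthetic level-shift series, and the names `fit_mean`, `walk_forward`, `train_size` and `retrain_every` are all assumptions made for illustration.

```python
from statistics import mean


def fit_mean(window):
    """Toy 'model' (illustrative assumption): predicts the mean of its training window."""
    m = mean(window)
    return lambda: m


def walk_forward(series, train_size, retrain_every=None):
    """Return one-step-ahead predictions for series[train_size:].

    retrain_every=None -> fit once on the first train_size points (static model).
    retrain_every=k    -> refit on the latest train_size points every k steps.
    """
    model = fit_mean(series[:train_size])
    preds = []
    for t in range(train_size, len(series)):
        steps = t - train_size
        if retrain_every and steps > 0 and steps % retrain_every == 0:
            # Refit using only data observed strictly before time t.
            model = fit_mean(series[t - train_size:t])
        preds.append(model())
    return preds


def mae(preds, actual):
    """Mean absolute error between predictions and observed values."""
    return mean(abs(p - a) for p, a in zip(preds, actual))


# Synthetic non-stationary series: the level shifts from 0 to 5 halfway through.
series = [0.0] * 50 + [5.0] * 50
train_size = 20
actual = series[train_size:]

static_err = mae(walk_forward(series, train_size), actual)
dynamic_err = mae(walk_forward(series, train_size, retrain_every=5), actual)
print(f"static MAE:  {static_err}")
print(f"dynamic MAE: {dynamic_err}")
```

On this synthetic series the static model never adapts to the level shift, while the periodically refitted model catches up once its training window has moved past the shift. A real order-book setting would replace the mean predictor with an actual learner and would need to account for retraining cost and latency.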
In the last part of the thesis, we look more closely at model selection, motivated by the success of regular retraining for cryptocurrency prediction. We study model selection from first principles and independently of the application domain, with the objective of finding alternatives to cross-validation (which often relies on the absence of temporal relationships in the data). We focus on tree models and use the dispersion of feature importance as a criterion for model selection. We show how this new method can help us choose models with better generalisation more efficiently.
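A dispersion-based ranking of candidate models might look like the sketch below. The abstract does not define the dispersion measure or say whether low or high dispersion is preferred, so both are assumptions here: dispersion is taken as the population standard deviation of the normalised importances, and the direction is exposed as the hypothetical `prefer_low` parameter. The importance vectors and model names are made up for illustration.

```python
from statistics import pstdev


def importance_dispersion(importances):
    """Population std. dev. of importances after normalising them to sum to 1.

    This choice of dispersion measure is an assumption, not the thesis's definition.
    """
    total = sum(importances)
    if total <= 0:
        raise ValueError("importances must have a positive sum")
    return pstdev([v / total for v in importances])


def rank_by_dispersion(candidates, prefer_low=True):
    """Sort model names by dispersion; prefer_low is an assumed direction."""
    scores = {name: importance_dispersion(imp) for name, imp in candidates.items()}
    return sorted(scores, key=scores.get, reverse=not prefer_low)


# Hypothetical feature-importance vectors for two candidate tree models.
candidates = {
    "shallow_tree": [0.70, 0.10, 0.10, 0.10],
    "deep_tree": [0.25, 0.25, 0.25, 0.25],
}
ranking = rank_by_dispersion(candidates)
print(ranking)
```

Unlike cross-validation, a criterion of this kind needs no held-out folds: it uses only the importances of models that have already been fitted, which is one way such a method could be cheaper to apply.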
Date of Award: 1 Nov 2023
Supervisors: Carmine Ventre & Maria Polukarov