Zaremba, Anna Barbara;
(2022)
Assessing causality in financial time series.
Doctoral thesis (Ph.D), UCL (University College London).
Abstract
We develop new classes of semiparametric multivariate time series models based on Multi-Output Gaussian Processes and warped Multi-Output Gaussian Processes. These models describe relationships between a current vector of observations and the lagged history of each marginal time series. We encode a serial dependence structure through mean and covariance functions, and introduce a more complex dependence structure by using copulas to couple each warped marginal Gaussian process. Within this class of models our primary goal is to detect causality and to study the interplay between the causal structure and the dependence structure. We do not, however, require a true representation of the data generating process; instead, we model structural hypotheses regarding how causality may have manifested in the observed vector-valued processes. With our framework we test the dependence with regard to the structures that are specified, and we can use testing for causality under different model assumptions as a way to explore the data and its potentially complex dependence relationships. To perform the testing we consider several families of causality tests and develop compound tests, which first require estimation/calibration of the mean and covariance functions parametrising the nonparametric vector-valued time series. Our approach allows very general nonlinear dependence and causal relationships that are not often considered in classical parametric time series models, including causality in higher order information and joint extreme dependence features. We provide a generic framework which can be applied to a variety of different problem classes and discuss a number of examples to illustrate the ideas developed.
Throughout, we will consider, without loss of generality, two multivariate time series denoted by $X_t \in \mathbb{R}^d$ and $Y_t \in \mathbb{R}^{d'}$, where one may assume, for instance, that these have been generated by observing partial realisations of generalised diffusion processes:

$$dX_t = \mu_X(t, X_t^{-k}, Y_t^{-l}, Z_t^{-m})\,dt + \Sigma_X(t, X_t^{-k}, Y_t^{-l}, Z_t^{-m})\,dW_t,$$

$$dY_t = \mu_Y(t, X_t^{-k}, Y_t^{-l}, Z_t^{-m})\,dt + \Sigma_Y(t, X_t^{-k}, Y_t^{-l}, Z_t^{-m})\,dW'_t,$$

where $Z_t$, which may or may not be included, is some real process that we will call side information, and $W_t$, $W'_t$ are two different Brownian motions, possibly with marginal serial correlation and/or instantaneous cross-correlation. All of these processes are only partially observed, and may be sampled at irregular intervals. The form of the drift and volatility in the diffusion equations means that the processes $X_t$ and $Y_t$ can be conditionally dependent on each other, and that this dependence can enter through both the drift and the volatility. Such generalised diffusion models can induce different types of extremal dependence in the marginal process between $X_t$ and $Y_t$, depending on the forms of the drift and volatility functions. We propose a smooth stochastic process statistical model to capture the smooth variation of the partially observed time series represented by the data $X_t$, $Y_t$, $Z_t$, using multiple-output Warped Gaussian Process models. In this work we are interested in partial observations of these processes, for which the partially observed time series of $X_t$ and $Y_t$ will have different types of extremal dependence characteristics. We wish to detect the presence or absence of statistical causality in settings where such extremal dependence features may or may not obfuscate the ability to detect causality in nonlinear partially observed time series models.
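As a rough illustration of the kind of data generating process described above, the following sketch simulates a toy bivariate diffusion with an Euler-Maruyama scheme. Every specific choice here is an assumption made for illustration and is not taken from the thesis: the drift and volatility functions, the parameter names `coupling` and `rho`, the single lag, the absence of side information, and the regular thinning used to mimic partial observation.

```python
import numpy as np

def simulate_coupled_diffusion(n_steps=2000, dt=0.01, coupling=0.8, rho=0.3, seed=0):
    """Euler-Maruyama simulation of a toy bivariate diffusion in which the
    previous value of X enters both the drift and the volatility of Y.
    All functional forms are hypothetical; `coupling` must be non-negative."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_steps)
    y = np.zeros(n_steps)
    sqrt_dt = np.sqrt(dt)
    for t in range(1, n_steps):
        # Brownian increments dW_t and dW'_t with instantaneous correlation rho.
        z1, z2 = rng.standard_normal(2)
        dw_x = sqrt_dt * z1
        dw_y = sqrt_dt * (rho * z1 + np.sqrt(1.0 - rho**2) * z2)
        # X mean-reverts using only its own past.
        mu_x = -x[t - 1]
        sigma_x = 0.5
        # Y mean-reverts and is pushed by a nonlinear function of the past of X.
        mu_y = -y[t - 1] + coupling * np.tanh(x[t - 1])
        # The volatility of Y also depends on the past of X.
        sigma_y = 0.5 * np.sqrt(1.0 + coupling * x[t - 1] ** 2)
        x[t] = x[t - 1] + mu_x * dt + sigma_x * dw_x
        y[t] = y[t - 1] + mu_y * dt + sigma_y * dw_y
    return x, y

x, y = simulate_coupled_diffusion()

# Keep every 10th point to mimic partial observation of the latent paths.
obs_idx = np.arange(0, len(x), 10)
x_obs, y_obs = x[obs_idx], y[obs_idx]
```

Setting `coupling=0.0` removes the influence of X on Y while keeping the correlated noise, which gives a simple non-causal baseline for comparison.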
The rationale for developing a semiparametric solution for modelling the partially observed time series is that, through the use of Gaussian Process models, we can accommodate a wide variety of features in the hypotheses about the trends and volatility and, importantly, their possible causal structures, all of which can be formally tested in our framework. Furthermore, the use of Warped Gaussian Process models allows us to incorporate higher order dependence such as extremal tail dependence features.

Statistical Causality. The notion of causality that lies at the centre of our research is the concept of statistical causality, based on comparing two predictive models. Quoting Wiener [1956]: "For two simultaneously measured signals, if we can predict the first signal better by using the past information from the second one than by using the information without it, then we call the second signal causal to the first one". The null hypothesis of no causal relationship from time series $X_t$ to $Y_t$ means that including the past of $X_t$ does not improve the prediction of the future of $Y_t$. In its most general form this can be written as equality of the conditional distributions of $Y_t$ given either set of explanatory variables (here $X_t^{-k}$, $Y_t^{-l}$, $Z_t^{-m}$ denote the past of the $X_t$, $Y_t$, $Z_t$ time series up to lags $k$, $l$, $m$ respectively):

$$H_0: \; p(Y_t \mid X_{t-1}^{-k}, Y_{t-1}^{-l}, Z_{t-1}^{-m}) = p(Y_t \mid Y_{t-1}^{-l}, Z_{t-1}^{-m}),$$

$$H_1: \; p(Y_t \mid X_{t-1}^{-k}, Y_{t-1}^{-l}, Z_{t-1}^{-m}) \neq p(Y_t \mid Y_{t-1}^{-l}, Z_{t-1}^{-m}).$$

The type of causal dependence described by statistical causality is a mechanism that occurs at multiple lags over time, and which could have been triggered by a sequence of processes rather than an individual one. It can help to gain insight into both the cross-sectional and the temporal dynamics of the data analysed.

Warped Multi-Output Gaussian Processes. A Gaussian process is a stochastic process such that all of its finite dimensional distributions are Gaussian.
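The finite dimensional property in the definition above can be made concrete with a short sketch: evaluating a covariance function on any finite set of inputs gives a covariance matrix, and a realisation of the process at those inputs is a draw from the corresponding multivariate Gaussian. The squared-exponential kernel, the zero mean, the input grid and the jitter value are all assumptions chosen for illustration and are not taken from the thesis.

```python
import numpy as np

def se_kernel(t, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance evaluated on a 1-D array of inputs."""
    d = t[:, None] - t[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

rng = np.random.default_rng(1)
t = np.linspace(0.0, 5.0, 50)

# Finite dimensional covariance matrix; the small jitter on the diagonal
# keeps the Cholesky factorisation numerically stable.
cov = se_kernel(t) + 1e-6 * np.eye(len(t))
chol = np.linalg.cholesky(cov)

# Three independent zero-mean draws: each column is one realisation of the
# process restricted to the 50 input points.
draws = chol @ rng.standard_normal((len(t), 3))
```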
Although Gaussian process models can accommodate a wide range of properties and are very attractive for their easy implementation and optimisation, they do not allow higher order dependence such as extremal tail dependence features. One way to generalise Gaussian process models so that higher order dependence can be handled is to apply a transformation to the joint collection of Gaussian processes for each marginal time series model. We apply a mean-variance transformation, which results in the transformed variables having multivariate skew-t distributions and being finite dimensional realisations of a general multivariate skew-t process.

Motivation for the Model Choice. There are numerous advantages to using Gaussian Processes (GPs), beginning with ease of optimisation, interpretability of hyperparameters, flexibility, and a rich choice of covariance functions that allows for various model structures. Using a likelihood ratio type test with a GP is a very natural choice: GP model parameters are often estimated by maximising the likelihood, and this estimation can therefore be incorporated into the compound version of the likelihood ratio test (the Generalised Likelihood Ratio Test, GLRT). From Gaussian variables, GPs inherit the property of being fully specified by the mean and the covariance, so testing for model equivalence inherently means testing for equivalence of the mean and covariance functions. However, many popular kernels do not have the automatic relevance determination (ARD) property, and using them in a likelihood ratio test setting gives no easy way to account for causal structures in the covariance. Consequently, it is the combination of the GLRT with an ARD-GP that gives a uniformly most powerful test with unparalleled flexibility: a known asymptotic distribution under the null, explicit evaluation in closed form, and usefulness even for misspecified models.
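To make the role of the ARD property more tangible, here is a minimal sketch of a likelihood-ratio style statistic for a zero-mean GP regression with an ARD squared-exponential kernel. It is a simplified stand-in and not the thesis's procedure: it uses a single lag, no side information, a coarse grid search in place of gradient-based likelihood maximisation, and it omits the calibration of the statistic against its null distribution. The function names `profile_log_lik` and `glrt_statistic`, the grids, and the toy data are all hypothetical.

```python
import numpy as np

def profile_log_lik(inputs, targets, inv_lengthscales, noise_ratio):
    """GP log marginal likelihood with the overall variance scale profiled out.
    The covariance is scale * (K_ard + noise_ratio * I); an inverse
    lengthscale of 0 switches the corresponding input dimension off."""
    n = len(targets)
    diff = (inputs[:, None, :] - inputs[None, :, :]) * inv_lengthscales
    k = np.exp(-0.5 * np.sum(diff**2, axis=-1)) + noise_ratio * np.eye(n)
    chol = np.linalg.cholesky(k)
    white = np.linalg.solve(chol, targets)
    scale = white @ white / n  # closed-form maximiser of the variance scale
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * n * (np.log(2.0 * np.pi * scale) + 1.0) - 0.5 * log_det

def glrt_statistic(cause, effect, weights=(0.0, 0.5, 1.0, 2.0),
                   ratios=(0.01, 0.1, 1.0, 10.0)):
    """2 * (max log-lik with the lagged cause - max log-lik without it)."""
    inputs = np.column_stack([effect[:-1], cause[:-1]])  # lag-1 regressors
    targets = effect[1:]

    def best(cause_weights):
        return max(
            profile_log_lik(inputs, targets, np.array([w_e, w_c]), r)
            for w_e in weights for w_c in cause_weights for r in ratios
        )

    # Null model: the inverse lengthscale of the candidate cause is fixed at 0.
    return 2.0 * (best(weights) - best((0.0,)))

# Toy data in which x drives y nonlinearly, but y does not drive x.
rng = np.random.default_rng(0)
n = 120
x = rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + np.tanh(2.0 * x[t - 1]) + 0.1 * rng.standard_normal()

stat_xy = glrt_statistic(cause=x, effect=y)  # direction x -> y
stat_yx = glrt_statistic(cause=y, effect=x)  # direction y -> x
```

Because the null grid is a subset of the alternative grid, the statistic is non-negative by construction. Under the null it fixes one ARD inverse lengthscale at zero, which is the sense in which an ARD kernel lets a causal hypothesis be expressed through the covariance function.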
The proposed use of copula warping allows the introduction of additional dependence, in particular tail dependence, while keeping the likelihood in closed form.

Application. We provide a generic framework which can be applied to a wide range of problems, and which can be readily tailored or further extended. The illustrative examples included demonstrate how a range of data properties can be encoded in the model, and how they might affect the detection of causality. We present two real-data applications: to commodity futures data, and to inflation and interest rates. We show how the framework can be used in practice, and how it can be combined with, or enhance, more common approaches to analysing financial time series. Our observations are in line with financial interpretations, but they also offer additional insight and pose thought-provoking questions.

Structure of the thesis. This thesis presents the research as it evolved: it starts with an overview of a range of known causality methods and demonstrates why they are unsatisfactory. Subsequently, a new approach is presented: a method based on Gaussian processes that was developed to address the drawbacks of the methods presented in the first part. Afterwards, an extension is proposed to widen the range of dependence structures, as well as the marginal properties of the data, that can be incorporated. Chapter 1 introduces the topic of the thesis and reviews the relevant literature. Chapter 2 discusses the philosophical roots of the concept of statistical causality, as well as alternative notions of causality. After illustrating some of the varied conceptual representations of causality, we present four distinct ways of modelling statistical causality. Chapter 3 contains background on the models considered: Gaussian processes, copulas and selected distributions.
Chapter 4 describes the inference procedures used: the assessment of hypothesis tests, the generalised likelihood ratio test, permutation tests, and the likelihood ratio test.

The second part, New Perspectives on Causality Representation and Inference, presents the main contribution of our work. It starts with Chapter 5, which contains the theoretical background for describing and testing causality with GP models. Chapter 6 extends the model from the previous chapter by introducing a mean-variance transformation that results in a warped GP model, which can describe causality in the presence of skewness and tail dependence. Chapter 7 describes how the synthetic data were simulated, details the algorithm for approximating the likelihood in the warped GP, and provides information on other relevant algorithms and the software used to implement our method. Chapter 8 presents an extensive experimental section, which aims to show, firstly, the good behaviour of the proposed procedures (model sensitivity and misspecification analysis); secondly, the good power of the test for a range of structures; and, thirdly, the interaction of causality and tail dependence. Applications to real-world data are described in Chapter 9, where time series for commodities and currency markets are analysed. Finally, Chapter 10 presents the conclusions and directions for further development, and the Appendices provide supplementary material.
Type: | Thesis (Doctoral) |
---|---|
Qualification: | Ph.D |
Title: | Assessing causality in financial time series |
Open access status: | An open access version is available from UCL Discovery |
Language: | English |
Additional information: | Copyright © The Author 2022. Original content in this thesis is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) Licence (https://creativecommons.org/licenses/by-nc/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request. |
UCL classification: | UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL |
URI: | https://discovery.ucl.ac.uk/id/eprint/10143981 |