UCL Discovery

Building Information Filtering Networks with Topological Constraints: Algorithms and Applications

Previde Massara, Guido; (2020) Building Information Filtering Networks with Topological Constraints: Algorithms and Applications. Doctoral thesis (Ph.D), UCL (University College London). Green open access

Abstract

We propose a new methodology for learning the structure of sparse networks from data; in doing so we adopt a dual perspective in which we consider networks both as weighted graphs and as simplicial complexes. The proposed learning methodology belongs to the family of preferential attachment algorithms, where a network is extended by iteratively adding new vertices. In the conventional preferential attachment algorithm a new vertex is added to the network by adding a single edge to another existing vertex; in our approach a new vertex is attached to a set of vertices by adding one or more new simplices to the simplicial complex. We propose the use of a score function to quantify the strength of the association between the new vertex and the attachment points. The methodology performs a greedy optimisation of the total score by selecting, at each step, the new vertex and the attachment points that maximise the gain in the score. Sparsity is enforced by restricting the space of feasible configurations through the imposition of topological constraints on the candidate networks; the constraint is fulfilled by allowing only topological operations that are invariant with respect to the required property. For instance, if the topological constraint requires the constructed network to be planar, then only planarity-invariant operations are allowed; if the constraint is that the network must be a clique forest, then only simplicial vertices can be added. At each step of the algorithm, the vertex to be added and the attachment points are those that provide the maximum increase in score while maintaining the topological constraints. As a concrete but general realisation we propose the clique forest as a possible topological structure for the representation of sparse networks, and we allow further constraints to be specified, such as the allowed range of clique sizes and the saturation of the attachment points.
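The greedy attachment step described above can be illustrated with a short sketch. It is not the thesis implementation: the function name `greedy_clique_forest`, the parameter `max_clique_size`, the choice of the heaviest edge as seed, and the gain (the sum of similarities between the new vertex and its attachment points) are all assumptions made for illustration. Gain validation and the saturation of attachment points are left out, so the result is a single clique tree.

```python
from itertools import combinations

import numpy as np


def greedy_clique_forest(W, max_clique_size=3):
    """Illustrative sketch: attach one simplicial vertex per step.

    W is a symmetric similarity matrix (NumPy array, at least 2 x 2).
    Returns the list of cliques and the list of separators.
    """
    n = W.shape[0]
    # Seed with the heaviest edge (an illustrative choice, not from the thesis).
    rows, cols = np.triu_indices(n, k=1)
    k = int(np.argmax(W[rows, cols]))
    seed = frozenset((int(rows[k]), int(cols[k])))
    cliques, separators = [seed], []
    outstanding = set(range(n)) - seed
    while outstanding:
        best = None  # (gain, vertex, attachment set, parent clique)
        for v in sorted(outstanding):
            for c in cliques:
                # Attachment sets are sub-cliques small enough that the
                # new clique respects the size limit.
                top = min(len(c), max_clique_size - 1)
                for size in range(1, top + 1):
                    for s in combinations(sorted(c), size):
                        # Assumed gain: total similarity to the attachment set.
                        gain = sum(W[v, u] for u in s)
                        if best is None or gain > best[0]:
                            best = (gain, v, frozenset(s), c)
        _, v, sep, parent = best
        if sep == parent:
            # The vertex joins the whole clique, which grows in place.
            cliques[cliques.index(parent)] = parent | {v}
        else:
            # A new clique hangs off the parent through the separator.
            cliques.append(sep | {v})
            separators.append(sep)
        outstanding.remove(v)
    return cliques, separators
```

Because every new vertex is connected only to an existing sub-clique, it is simplicial when added, so the clique-forest constraint holds by construction. With `max_clique_size=2` and this gain the sketch reduces to growing a maximum spanning tree from the heaviest edge.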
In this thesis we introduce the Maximally Filtered Clique Forest (MFCF) algorithm: the MFCF builds a clique forest by repeated application of a suitably invariant operation that we call the Clique Expansion operator, and adds vertices according to a strategy that greedily maximises the gain in a local score function. The gains produced by the Clique Expansion operator can be validated in a number of ways, including statistical testing, cross-validation or value thresholding. The algorithm does not prescribe a specific form for the gain function, but allows the use of any gain function that is consistent with the Clique Expansion operator. We describe several examples of gain functions suited to different problems. As a specific practical realisation we study the extraction of planar networks with the Triangulated Maximally Filtered Graph (TMFG). The TMFG, in its simplest form, is a specialised version of the MFCF, but it can be made more powerful by allowing the use of specialised planarity-invariant operators that are not based on the Clique Expansion operator. We provide applications to two well-known applied problems: the Maximum Weight Planar Subgraph Problem (MWPSP) and the Covariance Selection problem. With regard to the Covariance Selection problem we compare our results to the state-of-the-art solution (the Graphical Lasso) and we highlight the benefits of our methodology. We also study the geometry of clique trees as simplicial complexes and note that statistics based on cliques and separators provide information equivalent to that obtained by homological methods, such as the analysis of Betti numbers, while our approach is computationally more efficient and intuitively simpler. Finally, we use the geometric tools developed to provide a possible methodology for inferring the size of a dataset generated by a factor model.
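To make the role of cliques and separators in covariance selection concrete, the sketch below uses the standard closed-form maximum-likelihood estimate for Gaussian models on decomposable graphs: the sparse precision matrix is the sum of the inverted clique blocks of the sample covariance minus the sum of the inverted separator blocks. Whether the thesis uses exactly this estimator is an assumption here, and the function name `precision_from_clique_forest` is hypothetical.

```python
import numpy as np


def precision_from_clique_forest(S, cliques, separators):
    """Sparse precision matrix from a clique forest (decomposable-model formula).

    S is a sample covariance matrix; cliques and separators are iterables
    of vertex-index sets. Each separator is listed once per occurrence.
    """
    J = np.zeros(S.shape, dtype=float)
    for c in cliques:
        idx = np.ix_(sorted(c), sorted(c))
        # Add the inverse of the covariance block restricted to the clique.
        J[idx] += np.linalg.inv(S[idx])
    for s in separators:
        idx = np.ix_(sorted(s), sorted(s))
        # Subtract the inverse of the block restricted to the separator.
        J[idx] -= np.linalg.inv(S[idx])
    return J


# Usage: a three-variable chain 0 - 1 - 2 with separator {1}.
S = np.array([[1.0, 0.5, 0.3],
              [0.5, 1.0, 0.4],
              [0.3, 0.4, 1.0]])
J = precision_from_clique_forest(S, [{0, 1}, {1, 2}], [{1}])
Sigma = np.linalg.inv(J)  # agrees with S on every clique block
```

The entry of `J` for the pair (0, 2), which shares no clique, is exactly zero, and only small clique-sized blocks are inverted, never the full matrix.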

Type: Thesis (Doctoral)
Qualification: Ph.D
Title: Building Information Filtering Networks with Topological Constraints: Algorithms and Applications
Event: UCL (University College London)
Open access status: An open access version is available from UCL Discovery
Language: English
Additional information: Copyright © The Author 2021. Original content in this thesis is licensed under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) Licence (https://creativecommons.org/licenses/by/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request.
UCL classification: UCL
UCL > Provost and Vice Provost Offices
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10118525
Downloads since deposit: 265