UCL Discovery

Wikipedia Information Flow Analysis Reveals the Scale-Free Architecture of the Semantic Space

Masucci, AP; Kalampokis, A; Eguiluz, VM; Hernandez-Garcia, E (2011) Wikipedia Information Flow Analysis Reveals the Scale-Free Architecture of the Semantic Space. PLOS ONE, 6 (2), Article e17333. 10.1371/journal.pone.0017333. Green open access

Abstract

In this paper we extract the topology of the semantic space in its encyclopedic sense, measuring the semantic flow between the different entries of the largest modern encyclopedia, Wikipedia, and thus creating a directed complex network of semantic flows. Notably, at the percolation threshold the semantic space is characterised by scale-free behaviour at different levels of complexity, which relates the semantic space to a wide range of biological, social and linguistic phenomena. In particular, we find that the cluster size distribution, representing the size of different semantic areas, is scale-free. Moreover, the topology of the resulting semantic space is scale-free in the connectivity distribution and displays small-world properties. However, its statistical properties do not allow a classical interpretation via a generative model based on a simple multiplicative process. After giving a detailed description and interpretation of the topological properties of the semantic space, we introduce a stochastic model of a content-based network, based on a copy-and-mutation algorithm and on Heaps' law, that is able to capture the main statistical properties of the analysed semantic space, including Zipf's law for the word frequency distribution.
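
The cluster size distribution mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes a list of weighted directed edges between entries, drops every edge whose weight falls below a chosen threshold, and reports the sizes of the remaining weakly connected clusters. The function name `cluster_sizes`, the toy edge list, and the threshold values are all hypothetical; the paper defines semantic flow and the percolation threshold in its own terms.

```python
from collections import Counter


def cluster_sizes(edges, threshold):
    """Return cluster sizes (largest first) after dropping weak edges.

    edges: iterable of (source, target, weight) tuples (assumed format).
    Edge direction is ignored, so clusters are weakly connected components.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving; unseen nodes start as their own root.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for source, target, weight in edges:
        # Register both endpoints so isolated entries count as size-1 clusters.
        root_source, root_target = find(source), find(target)
        if weight >= threshold and root_source != root_target:
            parent[root_source] = root_target

    sizes = Counter(find(node) for node in list(parent))
    return sorted(sizes.values(), reverse=True)


# Hypothetical toy data: (source entry, target entry, flow weight).
toy_edges = [
    ("A", "B", 0.9),
    ("B", "C", 0.8),
    ("C", "D", 0.2),
    ("D", "E", 0.7),
    ("E", "F", 0.1),
]

print(cluster_sizes(toy_edges, threshold=0.5))  # [3, 2, 1]
print(cluster_sizes(toy_edges, threshold=0.0))  # [6]
```

Sweeping the threshold and recording how the list of sizes changes is one simple way to look for the point at which a single large cluster breaks apart, which is the regime the abstract refers to as the percolation threshold.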

Type: Article
Title: Wikipedia Information Flow Analysis Reveals the Scale-Free Architecture of the Semantic Space
Open access status: An open access version is available from UCL Discovery
DOI: 10.1371/journal.pone.0017333
Publisher version: http://dx.doi.org/10.1371/journal.pone.0017333
Language: English
Additional information: © 2011 Masucci et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Supported by Ministerio de Ciencia e Innovacion and Fondo Europeo de Desarrollo Regional through project FISICOS (FIS2007-60327). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Keywords: NETWORKS, LANGUAGE, WEB, DYNAMICS, MODEL
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of the Built Environment
URI: https://discovery.ucl.ac.uk/id/eprint/1329721
