Research output: Contribution to journal › Article › peer-review

**Estimating network parameters using random walks.** / Cooper, Colin; Radzik, Tomasz; Siantos, Yiannis.

Research output: Contribution to journal › Article › peer-review

Cooper, C, Radzik, T & Siantos, Y 2014, 'Estimating network parameters using random walks', *Social Network Analysis and Mining*, vol. 4, no. 1, 168. https://doi.org/10.1007/s13278-014-0168-6

Cooper, C., Radzik, T., & Siantos, Y. (2014). Estimating network parameters using random walks. *Social Network Analysis and Mining*, *4*(1), [168]. https://doi.org/10.1007/s13278-014-0168-6

Cooper C, Radzik T, Siantos Y. Estimating network parameters using random walks. Social Network Analysis and Mining. 2014 Dec;4(1). 168. https://doi.org/10.1007/s13278-014-0168-6

@article{6119ccaddb414015811aa7519e8dde90,

title = "Estimating network parameters using random walks",

abstract = "Sampling from large graphs is an area of great interest, especially since the emergence of huge structures such as Online Social Networks and the World Wide Web (WWW). The large scale properties of a network can be summarized in terms of parameters of the underlying graph, such as the total number of vertices, edges and triangles. However, the large size of these networks makes it computationally expensive to obtain such structural properties of the underlying graph by exhaustive search. If we can estimate these properties by taking small but representative samples from the network, then size is no longer such a problem. In this paper we present a general framework to estimate network properties using random walks. These methods work under the assumption we are able to obtain local characteristics of a vertex during each step of the random walk, for example the number of its neighbours, and their labels. As examples of this approach, we present practical methods to estimate the total number of edges/links m, number of vertices/nodes n and number of connected triads of vertices (triangles) t in graphs. We also give a general method to count any type of small connected subgraph, of which vertices, edges and triangles are specific examples. Additionally we present experimental estimates for n, m, t we obtained using our methods on real or synthetic networks. The synthetic networks were random graphs with power-law degree distributions and designed to have a large number of triangles. We used these graphs as they tend to correspond to the structure of large online networks. The real networks were samples of the WWW and social networks obtained from the SNAP database. In order to test that the methods are indeed practical, the total number of steps made by the walk was limited to at most the size n of the network. In fact the estimates appear to converge to the correct value at a lower number of steps, indicating that our proposed methods are feasible in practice.",

keywords = "Graph crawling, Graph sampling, Markov chain, Property estimation, Random walk",

author = "Colin Cooper and Tomasz Radzik and Yiannis Siantos",

year = "2014",

month = dec,

doi = "10.1007/s13278-014-0168-6",

language = "English",

volume = "4",

journal = "Social Network Analysis and Mining",

issn = "1869-5450",

publisher = "Springer Wien",

number = "1",

}

TY - JOUR

T1 - Estimating network parameters using random walks

AU - Cooper, Colin

AU - Radzik, Tomasz

AU - Siantos, Yiannis

PY - 2014/12

Y1 - 2014/12

N2 - Sampling from large graphs is an area of great interest, especially since the emergence of huge structures such as Online Social Networks and the World Wide Web (WWW). The large scale properties of a network can be summarized in terms of parameters of the underlying graph, such as the total number of vertices, edges and triangles. However, the large size of these networks makes it computationally expensive to obtain such structural properties of the underlying graph by exhaustive search. If we can estimate these properties by taking small but representative samples from the network, then size is no longer such a problem. In this paper we present a general framework to estimate network properties using random walks. These methods work under the assumption we are able to obtain local characteristics of a vertex during each step of the random walk, for example the number of its neighbours, and their labels. As examples of this approach, we present practical methods to estimate the total number of edges/links m, number of vertices/nodes n and number of connected triads of vertices (triangles) t in graphs. We also give a general method to count any type of small connected subgraph, of which vertices, edges and triangles are specific examples. Additionally we present experimental estimates for n, m, t we obtained using our methods on real or synthetic networks. The synthetic networks were random graphs with power-law degree distributions and designed to have a large number of triangles. We used these graphs as they tend to correspond to the structure of large online networks. The real networks were samples of the WWW and social networks obtained from the SNAP database. In order to test that the methods are indeed practical, the total number of steps made by the walk was limited to at most the size n of the network. In fact the estimates appear to converge to the correct value at a lower number of steps, indicating that our proposed methods are feasible in practice.

AB - Sampling from large graphs is an area of great interest, especially since the emergence of huge structures such as Online Social Networks and the World Wide Web (WWW). The large scale properties of a network can be summarized in terms of parameters of the underlying graph, such as the total number of vertices, edges and triangles. However, the large size of these networks makes it computationally expensive to obtain such structural properties of the underlying graph by exhaustive search. If we can estimate these properties by taking small but representative samples from the network, then size is no longer such a problem. In this paper we present a general framework to estimate network properties using random walks. These methods work under the assumption we are able to obtain local characteristics of a vertex during each step of the random walk, for example the number of its neighbours, and their labels. As examples of this approach, we present practical methods to estimate the total number of edges/links m, number of vertices/nodes n and number of connected triads of vertices (triangles) t in graphs. We also give a general method to count any type of small connected subgraph, of which vertices, edges and triangles are specific examples. Additionally we present experimental estimates for n, m, t we obtained using our methods on real or synthetic networks. The synthetic networks were random graphs with power-law degree distributions and designed to have a large number of triangles. We used these graphs as they tend to correspond to the structure of large online networks. The real networks were samples of the WWW and social networks obtained from the SNAP database. In order to test that the methods are indeed practical, the total number of steps made by the walk was limited to at most the size n of the network. In fact the estimates appear to converge to the correct value at a lower number of steps, indicating that our proposed methods are feasible in practice.

KW - Graph crawling

KW - Graph sampling

KW - Markov chain

KW - Property estimation

KW - Random walk

UR - http://www.scopus.com/inward/record.url?scp=84947224727&partnerID=8YFLogxK

U2 - 10.1007/s13278-014-0168-6

DO - 10.1007/s13278-014-0168-6

M3 - Article

AN - SCOPUS:84947224727

VL - 4

JO - Social Network Analysis and Mining

JF - Social Network Analysis and Mining

SN - 1869-5450

IS - 1

M1 - 168

ER -

King's College London - Homepage

© 2020 King's College London | Strand | London WC2R 2LS | England | United Kingdom | Tel +44 (0)20 7836 5454