## Abstract

An ego-network is a graph representing the interactions of a node (*ego*) with its neighbors and the interactions among those neighbors. A sequence of ego-networks having the same ego can thus model the evolution of these interactions over time. We introduce the problem of segmenting a sequence of ego-networks into *k* segments, for any given integer *k*. Each segment is represented by a summary network, and the goal is to minimize the total loss of representing *k * segments by *k* summaries. The problem allows partitioning the sequence into homogeneous segments with respect to the activities or properties of the ego (e.g., to identify time periods when a user acquired different circles of friends in a social network) and to compactly represent each segment with a summary. The main challenge is to construct a summary that represents a collection of ego-networks with minimum loss. To address this challenge, we employ Jaccard Median (JM), a well-known NP-hard problem for summarizing sets, for which, however, no effective and efficient algorithms are known. We develop a series of algorithms for JM offering different effectiveness/efficiency trade-offs: (I) an exact exponential-time algorithm, based on Mixed Integer Linear Programming; (II) exact and approximation polynomial-time algorithms for minimizing an upper bound of the objective function of JM; and (III) efficient heuristics for JM, which are based on an effective scoring scheme and one of them also on sketching. We also study a generalization of the segmentation problem, in which there may be multiple edges between a pair of nodes in an ego-network. To tackle this problem, we develop a series of algorithms, based on a more general problem than JM, called Weighted Jaccard Median WJM: (I) an exact exponential-time algorithm, based on Mixed Integer Linear Programming; (II) exact algorithms for minimizing an upper bound of the objective function of WJM; and (III) efficient heuristics, based on the percentiles of edge multiplicities and one of them also on divide-and-conquer. By building upon the above results, we design algorithms for segmenting a sequence of ego-networks. Experiments with 10 real datasets and with synthetic datasets show that our algorithms produce optimal or near-optimal solutions to JM or to WJM, and that they substantially outperform state-of-the-art methods which can be employed for ego-network segmentation.

Original language | English |
---|---|

Pages (from-to) | 1-16 |

Number of pages | 16 |

Journal | IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING |

DOIs | |

Publication status | Accepted/In press - 3 Mar 2024 |

## Keywords

- Approximation algorithms
- ego-network
- Heuristic algorithms
- Jaccard median
- Linear programming
- Mixed integer linear programming
- segmentation
- Social networking (online)
- Synthetic data
- Upper bound