Community detection in multilayer networks aims to identify groups of well-connected nodes across multiple layers. While existing methods have been developed to deal with large graphs with few layers (typically fewer than 10), many real-world datasets are structured by transitive relationships that give rise to networks with thousands of extremely dense layers (e.g., co-citation networks, IMDb actor graphs, co-reference information networks, social networks). In addition, in these datasets the layers are often associated with textual summaries, which provide important and hitherto unexploited information on the nature of the relation encoded by the layer. In this paper, we propose a new method that exploits the text associated with the layers to identify communities of nodes connected through several semantically close layers. The method consists in embedding the textual information of the layers in a Euclidean space and using this embedding to group together, in the same community, nodes belonging to semantically close layers. To that end, we develop a pattern mining approach that extracts communities from numerical data. This approach, which combines symbolic and numeric techniques, is particularly well suited to identifying communities in multilayer graphs. Indeed, we show that it obtains more diverse and higher-quality communities than state-of-the-art competitors on datasets where the ground truth is known. We also show that taking the semantic information into account improves the quality of the communities.
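To make the idea of "nodes connected through several semantically close layers" concrete, the following minimal Python sketch may help. It is not the pattern mining approach proposed in the paper: it assumes that each layer already has a text embedding (here, hypothetical 2-D vectors), groups layers whose embeddings lie within a chosen Euclidean radius of one another, and keeps the nodes that appear in at least a minimum number of layers of each group. The layer names, the toy data, and the parameters `radius` and `min_layers` are all illustrative assumptions.

```python
from itertools import combinations
from math import dist

# Hypothetical toy data: each layer has a 2-D text embedding and a set of edges.
layers = {
    "ml_paper_1": {"vec": (0.0, 0.1), "edges": {("a", "b"), ("b", "c"), ("a", "c")}},
    "ml_paper_2": {"vec": (0.1, 0.0), "edges": {("a", "b"), ("a", "c"), ("c", "d")}},
    "bio_paper_1": {"vec": (5.0, 5.0), "edges": {("d", "e"), ("e", "f")}},
}


def group_layers(layers, radius):
    """Single-linkage grouping: layers within `radius` of each other share a group."""
    parent = {name: name for name in layers}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in combinations(layers, 2):
        if dist(layers[u]["vec"], layers[v]["vec"]) <= radius:
            parent[find(u)] = find(v)

    groups = {}
    for name in layers:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


def candidate_communities(layers, radius, min_layers):
    """Keep, per group of close layers, the nodes present in >= `min_layers` layers."""
    result = []
    for group in group_layers(layers, radius):
        counts = {}
        for name in group:
            # Nodes incident to at least one edge of this layer.
            nodes = {n for edge in layers[name]["edges"] for n in edge}
            for n in nodes:
                counts[n] = counts.get(n, 0) + 1
        members = {n for n, c in counts.items() if c >= min_layers}
        if members:
            result.append((frozenset(group), frozenset(members)))
    return result


communities = candidate_communities(layers, radius=1.0, min_layers=2)
```

On this toy input, the two layers with nearby embeddings form one group and nodes `a`, `b`, and `c` are retained, while the isolated third layer yields no community because no node reaches the `min_layers` threshold. A fixed radius and a simple count are stand-ins chosen for brevity; the abstract indicates that the actual method extracts communities through pattern mining on numerical data.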
Article ID: 2021L09
Publisher: Canadian Artificial Intelligence Association