The detection of communities in social networks is a challenging task. A rigorous way to model communities considers maximal cliques, that is, maximal subgraphs in which each pair of nodes is connected by an edge. State-of-the-art strategies for finding maximal cliques in very large networks decompose the network in blocks and then perform a distributed computation. These approaches exhibit a trade-off between efficiency and completeness: decreasing the size of the blocks has been shown to improve efficiency but some cliques may remain undetected since high-degree nodes, also called hubs, may not fit with all their neighborhood into a small block. In this paper, we present a distributed approach that, by suitably handling hub nodes, is able to detect maximal cliques in large networks meeting both completeness and efficiency. The approach relies on a two-level decomposition process. The first level aims at recursively identifying and isolating tractable portions of the network. The second level further decomposes the tractable portions into small blocks. We demonstrate that this process is able to correctly detect all maximal cliques, provided that the sparsity of the network is bounded, as it is the case of real-world social networks. An extensive campaign of experiments confirms the effectiveness, efficiency, and scalability of our solution and shows that, if hub nodes were neglected, significant cliques would be undetected.
1 of 36
More Related Content
Finding All Maximal Cliques in Very Large Social Networks
1. Finding All Maximal Cliques in
Very Large Social Networks
16 March 2016, EDBT 2016, Bordeaux, France
Alessio Conte<, Roberto De Virgilio′
, Antonio Maccioni′
,
Maurizio Patrignani′
, Riccardo Torlone′
<Universit┐ di Pisa ′
Universit┐ Roma Tre
5. Maximal Clique Enumeration (MCE)
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 4
A
J
H H
F
D D E
S
E Y
E G
S U
S W
6. Distributed MCE
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 5
[Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010]
Block size m (Max number of nodes) = 5
A
J
H
F
D
D
U
S
W
E
E
S D
GY
7. Distributed MCE
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 6
[Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010]
Block size m (Max number of nodes) = 5
A
J
H
F
D
D
U
S
W
E
E
S D
GY
8. Distributed MCE
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 7
[Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010]
Block size m (Max number of nodes) = 5
A
J
H
F
D
D
U
S
W
E
E
S D
GY
9. Distributed MCE
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 8
[Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010]
Block size m (Max number of nodes) = 5
A
J
H
F
Z R
P
D
L
E
S
W
U
X
G
Y
10. Distributed MCE
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 9
[Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010]
Block size m (Max number of nodes) = 5
A
J
H
F
D
Z R
P
D E
E
S X
GY
D
L
S
W
U
11. Distributed MCE
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 10
[Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010]
Block size m (Max number of nodes) = 5
A
J
H
F
D
Z R
P
D E
E
S X
GY
D
L
S
W
U
12. Distributed MCE
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 11
[Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010]
Block size m (Max number of nodes) = 5
A
J
H
F
D
Z R
P
D E
E
S
X
GY
D
L
S
W
U
D E
S
undetected cliques
13. Hub Nodes Effect
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 12
Block size m (Max number of nodes) = 5
A
J
H
F
Z R
P
D
L
E
S
W
U
X
G
Y
14. The Block Size Effect
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 13
* Taken from [Cheng et al., KDD 2012]
efficiency vs completeness/correcteness
15. Overview of the Approach
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 14
G = (N, E)
C = c1
, c2
, ..., cn
1st level decomposition
Induced graph2nd level decomposition
FIND MAX CLIQUES
FIND MAX
CLIQUES
Block analysis
Nf Nh
Block 1 Block z...
U
Cf
Ch
16. 1st Level Decomposition
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 15
Separate hubs from the rest of nodes in N
- according to a maximum block size m
1st level decomposition
FIND MAX CLIQUES
Nf Nh
G = (N, E)
A
J
H
F
Z R
P
L W
U
X
G
Y
D E
S
17. 1st Level Decomposition - Lemma
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 16
The set of all maximal cliques C of G can
be obtained by computing Cf
and Ch
alone
- C = Cf
U Ch
- Cf
is the set of cliques with at least one node in Nf
- Ch
is the set of cliques with all the nodes in Nh
- proof of this Lemma is in our paper
18. 2nd Level Decomposition
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 17
2nd level decomposition
FIND MAX CLIQUES
Nf
Block 1 Block z...
19. 2nd Level Decomposition
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 18
Kernel node Border nodeKernel node Visited node
B1
A
J
H
2nd level decomposition
FIND MAX CLIQUES
Nf
Block 1 Block z...
20. 2nd Level Decomposition
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 19
Kernel node Border nodeKernel node Visited node
B1
A
J
H
2nd level decomposition
FIND MAX CLIQUES
Nf
Block 1 Block z...
21. 2nd Level Decomposition
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 20
Kernel node Border nodeKernel node Visited node
B1
A
J
H
F
D
2nd level decomposition
FIND MAX CLIQUES
Nf
Block 1 Block z...
22. 2nd Level Decomposition
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 21
Kernel node Border nodeKernel node Visited node
23. Maximal Clique Enumeration
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 22
There are many algorithms for MCE
- no one outperforms the others
- but each has specific advantages
Tomita et al. Eppstein et al....
FIND MAX CLIQUES
Block analysis
Block 1 Block z...
Cf
24. Block Analysis
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 23
We determine the best-fit MCE algorithm on
each block
Block analysis
Select
best-fit
Tomita et al. Eppstein et al....
Cf
Block
FIND MAX CLIQUES
Block analysis
Block 1 Block z...
Cf
26. Recursion Over Hub Nodes
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 25
Induced graph
FIND MAX CLIQUES
FIND MAX
CLIQUES
Nh
Ch
27. Recursion Over Hub Nodes
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 26
The induced graph of the hub nodes is recursively
processed
B12
D
E
S
Kernel node Border nodeKernel node Visited node
Induced graph
FIND MAX CLIQUES
FIND MAX
CLIQUES
Nh
Ch
28. Convergence Guarantee
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 27
Given a degeneracy of the graph lower than
the block size, the whole process converges
Theorem. Let G be a graph and let Gi
, with i=1, 2, 3, ... be a sequence
of subgraphs of G such that G1
= G and Gi
, for i > 1 is the graph induced
by the nodes of Gi?1
of degree greater or equal than m.
Let the degeneracy d of G be strictly less than m + 1.
1. There is a value q such that all Gj
, with j − q, are empty graphs.
2. There exists a graph with n nodes for which q is Ω(n).
29. Degeneracy and Sparsity
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 28
A measure of the sparsity of a graph
- it is the highest value d for which the network
contains a d-core. A d-core is obtained by
recursively removing nodes with degree less
than d
- degeneracy is typically < 100 on scale-free real
graphs
- facebook has a degeneracy ~ 54
34. Conclusion
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 33
Approach for computing maximal cliques over an
arbitrarily large graph
- taking advantage of distributed computation
- leveraging on the advantages of different existing
algorithms for MCE
Completeness and correcteness are not
compromised by hub nodes
- unlike state-of-the-art
35. Future Work
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 34
Take into account the ^semantics ̄ of the graph
in order to search for specific kind of cliques
Extend our framework for computing more
relaxed communities
- k-plexes, k-clans, k-cores, etc.
36. Thanks For The Attention
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 35