際際滷

際際滷Share a Scribd company logo
Finding All Maximal Cliques in
Very Large Social Networks
16 March 2016, EDBT 2016, Bordeaux, France
Alessio Conte<, Roberto De Virgilio′
, Antonio Maccioni′
,
Maurizio Patrignani′
, Riccardo Torlone′
<Universit┐ di Pisa ′
Universit┐ Roma Tre
Social Network Analysis
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 1
Community Detection
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 2
Cliques
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 3
A
J H
F
Z
A
J
A
J
F
A
J H
F
Maximal Clique Enumeration (MCE)
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 4
A
J
H H
F
D D E
S
E Y
E G
S U
S W
Distributed MCE
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 5
[Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010]
Block size m (Max number of nodes) = 5
A
J
H
F
D
D
U
S
W
E
E
S D
GY
Distributed MCE
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 6
[Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010]
Block size m (Max number of nodes) = 5
A
J
H
F
D
D
U
S
W
E
E
S D
GY
Distributed MCE
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 7
[Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010]
Block size m (Max number of nodes) = 5
A
J
H
F
D
D
U
S
W
E
E
S D
GY
Distributed MCE
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 8
[Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010]
Block size m (Max number of nodes) = 5
A
J
H
F
Z R
P
D
L
E
S
W
U
X
G
Y
Distributed MCE
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 9
[Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010]
Block size m (Max number of nodes) = 5
A
J
H
F
D
Z R
P
D E
E
S X
GY
D
L
S
W
U
Distributed MCE
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 10
[Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010]
Block size m (Max number of nodes) = 5
A
J
H
F
D
Z R
P
D E
E
S X
GY
D
L
S
W
U
Distributed MCE
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 11
[Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010]
Block size m (Max number of nodes) = 5
A
J
H
F
D
Z R
P
D E
E
S
X
GY
D
L
S
W
U
D E
S
undetected cliques
Hub Nodes Effect
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 12
Block size m (Max number of nodes) = 5
A
J
H
F
Z R
P
D
L
E
S
W
U
X
G
Y
The Block Size Effect
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 13
* Taken from [Cheng et al., KDD 2012]
efficiency vs completeness/correcteness
Overview of the Approach
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 14
G = (N, E)
C = c1
, c2
, ..., cn
1st level decomposition
Induced graph2nd level decomposition
FIND MAX CLIQUES
FIND MAX
CLIQUES
Block analysis
Nf Nh
Block 1 Block z...
U
Cf
Ch
1st Level Decomposition
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 15
Separate hubs from the rest of nodes in N
- according to a maximum block size m
1st level decomposition
FIND MAX CLIQUES
Nf Nh
G = (N, E)
A
J
H
F
Z R
P
L W
U
X
G
Y
D E
S
1st Level Decomposition - Lemma
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 16
The set of all maximal cliques C of G can
be obtained by computing Cf
and Ch
alone
- C = Cf
U Ch
- Cf
is the set of cliques with at least one node in Nf
- Ch
is the set of cliques with all the nodes in Nh
- proof of this Lemma is in our paper
2nd Level Decomposition
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 17
2nd level decomposition
FIND MAX CLIQUES
Nf
Block 1 Block z...
2nd Level Decomposition
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 18
Kernel node Border nodeKernel node Visited node
B1
A
J
H
2nd level decomposition
FIND MAX CLIQUES
Nf
Block 1 Block z...
2nd Level Decomposition
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 19
Kernel node Border nodeKernel node Visited node
B1
A
J
H
2nd level decomposition
FIND MAX CLIQUES
Nf
Block 1 Block z...
2nd Level Decomposition
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 20
Kernel node Border nodeKernel node Visited node
B1
A
J
H
F
D
2nd level decomposition
FIND MAX CLIQUES
Nf
Block 1 Block z...
2nd Level Decomposition
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 21
Kernel node Border nodeKernel node Visited node
Maximal Clique Enumeration
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 22
There are many algorithms for MCE
- no one outperforms the others
- but each has specific advantages
Tomita et al. Eppstein et al....
FIND MAX CLIQUES
Block analysis
Block 1 Block z...
Cf
Block Analysis
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 23
We determine the best-fit MCE algorithm on
each block
Block analysis
Select
best-fit
Tomita et al. Eppstein et al....
Cf
Block
FIND MAX CLIQUES
Block analysis
Block 1 Block z...
Cf
Decision Tree
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 24
Recursion Over Hub Nodes
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 25
Induced graph
FIND MAX CLIQUES
FIND MAX
CLIQUES
Nh
Ch
Recursion Over Hub Nodes
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 26
The induced graph of the hub nodes is recursively
processed
B12
D
E
S
Kernel node Border nodeKernel node Visited node
Induced graph
FIND MAX CLIQUES
FIND MAX
CLIQUES
Nh
Ch
Convergence Guarantee
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 27
Given a degeneracy of the graph lower than
the block size, the whole process converges
Theorem. Let G be a graph and let Gi
, with i=1, 2, 3, ... be a sequence
of subgraphs of G such that G1
= G and Gi
, for i > 1 is the graph induced
by the nodes of Gi?1
of degree greater or equal than m.
Let the degeneracy d of G be strictly less than m + 1.
1. There is a value q such that all Gj
, with j − q, are empty graphs.
2. There exists a graph with n nodes for which q is Ω(n).
Degeneracy and Sparsity
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 28
A measure of the sparsity of a graph
- it is the highest value d for which the network
contains a d-core. A d-core is obtained by
recursively removing nodes with degree less
than d
- degeneracy is typically < 100 on scale-free real
graphs
- facebook has a degeneracy ~ 54
Experiments: Decomposition
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 29
Experiments: the Block Size Effect
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 30
Experiments: Decision Tree
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 31
Experiments: Effectiveness
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 32
Conclusion
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 33
Approach for computing maximal cliques over an
arbitrarily large graph
- taking advantage of distributed computation
- leveraging on the advantages of different existing
algorithms for MCE
Completeness and correcteness are not
compromised by hub nodes
- unlike state-of-the-art
Future Work
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 34
Take into account the ^semantics ̄ of the graph
in order to search for specific kind of cliques
Extend our framework for computing more
relaxed communities
- k-plexes, k-clans, k-cores, etc.
Thanks For The Attention
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 35

More Related Content

Finding All Maximal Cliques in Very Large Social Networks

  • 1. Finding All Maximal Cliques in Very Large Social Networks 16 March 2016, EDBT 2016, Bordeaux, France Alessio Conte<, Roberto De Virgilio′ , Antonio Maccioni′ , Maurizio Patrignani′ , Riccardo Torlone′ <Universit┐ di Pisa ′ Universit┐ Roma Tre
  • 2. Social Network Analysis Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 1
  • 3. Community Detection Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 2
  • 4. Cliques Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 3 A J H F Z A J A J F A J H F
  • 5. Maximal Clique Enumeration (MCE) Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 4 A J H H F D D E S E Y E G S U S W
  • 6. Distributed MCE Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 5 [Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010] Block size m (Max number of nodes) = 5 A J H F D D U S W E E S D GY
  • 7. Distributed MCE Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 6 [Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010] Block size m (Max number of nodes) = 5 A J H F D D U S W E E S D GY
  • 8. Distributed MCE Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 7 [Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010] Block size m (Max number of nodes) = 5 A J H F D D U S W E E S D GY
  • 9. Distributed MCE Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 8 [Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010] Block size m (Max number of nodes) = 5 A J H F Z R P D L E S W U X G Y
  • 10. Distributed MCE Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 9 [Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010] Block size m (Max number of nodes) = 5 A J H F D Z R P D E E S X GY D L S W U
  • 11. Distributed MCE Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 10 [Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010] Block size m (Max number of nodes) = 5 A J H F D Z R P D E E S X GY D L S W U
  • 12. Distributed MCE Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 11 [Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010] Block size m (Max number of nodes) = 5 A J H F D Z R P D E E S X GY D L S W U D E S undetected cliques
  • 13. Hub Nodes Effect Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 12 Block size m (Max number of nodes) = 5 A J H F Z R P D L E S W U X G Y
  • 14. The Block Size Effect Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 13 * Taken from [Cheng et al., KDD 2012] efficiency vs completeness/correcteness
  • 15. Overview of the Approach Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 14 G = (N, E) C = c1 , c2 , ..., cn 1st level decomposition Induced graph2nd level decomposition FIND MAX CLIQUES FIND MAX CLIQUES Block analysis Nf Nh Block 1 Block z... U Cf Ch
  • 16. 1st Level Decomposition Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 15 Separate hubs from the rest of nodes in N - according to a maximum block size m 1st level decomposition FIND MAX CLIQUES Nf Nh G = (N, E) A J H F Z R P L W U X G Y D E S
  • 17. 1st Level Decomposition - Lemma Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 16 The set of all maximal cliques C of G can be obtained by computing Cf and Ch alone - C = Cf U Ch - Cf is the set of cliques with at least one node in Nf - Ch is the set of cliques with all the nodes in Nh - proof of this Lemma is in our paper
  • 18. 2nd Level Decomposition Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 17 2nd level decomposition FIND MAX CLIQUES Nf Block 1 Block z...
  • 19. 2nd Level Decomposition Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 18 Kernel node Border nodeKernel node Visited node B1 A J H 2nd level decomposition FIND MAX CLIQUES Nf Block 1 Block z...
  • 20. 2nd Level Decomposition Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 19 Kernel node Border nodeKernel node Visited node B1 A J H 2nd level decomposition FIND MAX CLIQUES Nf Block 1 Block z...
  • 21. 2nd Level Decomposition Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 20 Kernel node Border nodeKernel node Visited node B1 A J H F D 2nd level decomposition FIND MAX CLIQUES Nf Block 1 Block z...
  • 22. 2nd Level Decomposition Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 21 Kernel node Border nodeKernel node Visited node
  • 23. Maximal Clique Enumeration Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 22 There are many algorithms for MCE - no one outperforms the others - but each has specific advantages Tomita et al. Eppstein et al.... FIND MAX CLIQUES Block analysis Block 1 Block z... Cf
  • 24. Block Analysis Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 23 We determine the best-fit MCE algorithm on each block Block analysis Select best-fit Tomita et al. Eppstein et al.... Cf Block FIND MAX CLIQUES Block analysis Block 1 Block z... Cf
  • 25. Decision Tree Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 24
  • 26. Recursion Over Hub Nodes Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 25 Induced graph FIND MAX CLIQUES FIND MAX CLIQUES Nh Ch
  • 27. Recursion Over Hub Nodes Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 26 The induced graph of the hub nodes is recursively processed B12 D E S Kernel node Border nodeKernel node Visited node Induced graph FIND MAX CLIQUES FIND MAX CLIQUES Nh Ch
  • 28. Convergence Guarantee Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 27 Given a degeneracy of the graph lower than the block size, the whole process converges Theorem. Let G be a graph and let Gi , with i=1, 2, 3, ... be a sequence of subgraphs of G such that G1 = G and Gi , for i > 1 is the graph induced by the nodes of Gi?1 of degree greater or equal than m. Let the degeneracy d of G be strictly less than m + 1. 1. There is a value q such that all Gj , with j − q, are empty graphs. 2. There exists a graph with n nodes for which q is Ω(n).
  • 29. Degeneracy and Sparsity Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 28 A measure of the sparsity of a graph - it is the highest value d for which the network contains a d-core. A d-core is obtained by recursively removing nodes with degree less than d - degeneracy is typically < 100 on scale-free real graphs - facebook has a degeneracy ~ 54
  • 30. Experiments: Decomposition Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 29
  • 31. Experiments: the Block Size Effect Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 30
  • 32. Experiments: Decision Tree Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 31
  • 33. Experiments: Effectiveness Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 32
  • 34. Conclusion Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 33 Approach for computing maximal cliques over an arbitrarily large graph - taking advantage of distributed computation - leveraging on the advantages of different existing algorithms for MCE Completeness and correcteness are not compromised by hub nodes - unlike state-of-the-art
  • 35. Future Work Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 34 Take into account the ^semantics ̄ of the graph in order to search for specific kind of cliques Extend our framework for computing more relaxed communities - k-plexes, k-clans, k-cores, etc.
  • 36. Thanks For The Attention Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 35