ºÝºÝߣ

ºÝºÝߣShare a Scribd company logo
Learning Without Peeking:
Secure Multi-party Computation Genetic
Programming
Jinhan Kim, KAIST
Michael G. Epitropakis, Lancaster University
Shin Yoo, KAIST
SSBSE 2018
Montpellier, France
Genetic Programming
• GP has been successfully applied to various domains.
Data Security
Test coverage data
Defect proneness data
Internal resources
Historical
defect proneness
data
Algorithm
(GP)
We want to
analyze our data,
without revealing
any sensitive data
Secure Multiparty Computation (SMC)
Example
Users don’t trust each other,
but they want to achieve a common goal.
Alice Bob
I don’t
trust Bob
I don’t
trust Alice
Example
Alice Bob
A = 1 if Alice loves Bob
0 otherwise
B = 1 if Bob loves Alice
0 otherwise
Common goal: A & B
Without trusted third party, we can do this by using
Secure Multiparty Computation (SMC)
Secure Multiparty Computation
• 1-2 Oblivious Transfer
• Garbled Circuit
1-2 Oblivious Transfer
Alice Bob
Message m0, m1
Random bit b
C = m0 if b = 0
m1 if b = 1
Alice (sender) has no idea
what message has been sent.
Bob (receiver) don’t know
the other message.
The 1–2 oblivious transfer protocol can be implemented over
asymmetric cryptography, such as the RSA.
enctypted
Garbled Circuit
Alice
(garbler)
Bob
(evaluator)
Alice garbles (encrypts) the circuit,
and sends with her input (encrypted)
â‘  Bob evaluates (decrypts) the circuit,
using oblivious transfer
â‘¡
Obliv-C
• Obliv-C is both a domain specific extension of C and a gcc
wrapper that compiles the extension.
High-level interface for 2PC
Extension of C
Obliv-C Example
Variables that should
remain oblivious.
The body of obliv if will be
always executed.â‘ 
â‘¡
â‘ 
â‘¡
Secure Multiparty Computation GP
GP
Obliv-C
Fitness evaluation Data
Scenario 1: Multiparty Dataholder (2PC)
Obliv-CData Party 1 Data Party 2
GP Party
Fault info
(Jan 2018 - Mar 2018)
Fault info
(April 2018 - June 2018)
Scenario 2: Singleparty Dataholder (1PC)
Obliv-CData Party GP Party
Study Subjects
• Symbolic Regression using GP
• Keijzer-6
• Nguyen-7
• Vladislavleva-4
• Dow Chemical, Chemical process dataset
• Learning to Ranking using GP
• FLUCCS, Real-world fault dataset from Defects4j
RQ1: How well does the SMCGP perform
compared to the Normal-GP?
SMCGP and Normal-GP have
same logic, do same job
RQ1: Symbolic Regression
There is no statistical difference
between SMCGP and Normal-GP
For 2PC, we split dataset in half
RQ1: FLUCCS
Localize faults in Mockito-1 is
particularly challenging
Learning from 32 versions, we
tested on four versions.
RQ2: What is the runtime overhead of
SMCGP when compared to Normal-GP?
RQ2: Runtime Overhead
We observed that SMCGP is about 1,290 times slower than Normal-GP
Learning Without Peeking: Secure Multi-party Computation Genetic Programming

More Related Content

Learning Without Peeking: Secure Multi-party Computation Genetic Programming