These slides describe a detection method for game bots. I presented them at NDSS 2016.
You can find the original paper at https://www.internetsociety.org/sites/default/files/blogs-media/you-are-game-bot-uncovering-game-bots-mmorpgs-via-self-similarity-wild.pdf
1. You are a Game Bot:
Uncovering Game Bots in MMORPGs via Self-similarity in the Wild
Eunjo Lee (NCSOFT)
Jiyoung Woo (Korea University)
Hyoungshick Kim (Sungkyunkwan University)
Aziz Mohaisen (State University of New York at Buffalo)
Huy Kang Kim (Korea University)
4. 4 / 45
Introduction
Game BOT
Program that plays a game autonomously
(instead of human users)
Bot configurations
5. 5 / 45
Introduction
Real Money Trading (RMT)
Collect valuable items and monetize them by trading the items to other players
[Figure: Game World; Virtual Assets exchanged for Real Money]
8. 8 / 45
Introduction
Game BOT
Widespread cheating in online games
Collapses the in-game economy
Causes human users to churn
Reduces revenue
9. 9 / 45
Introduction
Countermeasures
Client-side
Bot process detection using anti-malware programs
Server-side
Bot classification using game log analysis
12. 12 / 45
Introduction
Challenges
Game A: Raw data collection -> Preprocessing -> Cleaning -> Visualizing -> Bot patterns
Game B: Raw data collection -> Preprocessing -> Cleaning -> Visualizing -> Bot patterns
High cost, time-consuming
14. 14 / 45
Introduction
Our proposals
Using self-similarity as a generic feature
Focus on the repetitive activities of game bots, not specific behavior
Proposing a framework to maintain a prediction model autonomously
Detect the change in performance of the prediction model and retrain it
16. 16 / 45
Definition
Measurement of the similarity of periodic actions per user
Self-similarity
17. 17 / 45
Motivation and consideration
Intrinsic attributes
Bot programs repeat routines using predetermined settings
Human users may exhibit similar behavior, but not for long periods of time
Stability
Little affected by game updates or bot program changes
Considering various actions rather than a single action
Computing efficiency
Easy to apply distributed algorithms (e.g., MapReduce) for log processing
Self-similarity
19. 19 / 45
Generating log vectors
Game logs are distributed by user (User A, User B, User C)
Game logs for User A:
Time                    Event id   Character info.   Other info.
15/08/13 12:00:12.131   1205       AAA, 34           N/A
15/08/13 12:00:14.237   1204       AAA, 34           N/A
15/08/13 12:00:59.436   1208       AAA, 34           Ogre
15/08/13 12:00:59.436   1208       AAA, 34           Ogre
15/08/13 12:00:59.857   1208       AAA, 34           Troll
15/08/13 12:01:17.019   1022       AAA, 34           Ring
15/08/13 12:01:21.341   1022       AAA, 34           Sword
15/08/13 12:01:23.151   1205       AAA, 34           N/A
15/08/13 12:01:54.354   1204       AAA, 34           N/A
15/08/13 12:01:56.445   1208       AAA, 34           Wolf
15/08/13 12:02:07.351   1205       AAA, 34           N/A
15/08/13 12:02:41.847   1204       AAA, 34           N/A
15/08/13 12:02:47.650   1208       AAA, 34           Ogre
15/08/13 12:03:09.353   1208       AAA, 34           Ogre
Log count per event id for each time period (hour:min):
Time period   1022   1204   1205   1208
12:00         0      1      1      3
12:01         2      1      1      1
12:02         0      1      1      1
12:03         0      0      0      1
Self-similarity
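The per-minute counting shown on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `build_log_vectors` is hypothetical, and the timestamp format (`yy/mm/dd`) is an assumption about the example rows.

```python
from collections import Counter, defaultdict
from datetime import datetime

# (timestamp, event id) pairs copied from the User A example on the slide.
LOGS = [
    ("15/08/13 12:00:12.131", 1205), ("15/08/13 12:00:14.237", 1204),
    ("15/08/13 12:00:59.436", 1208), ("15/08/13 12:00:59.436", 1208),
    ("15/08/13 12:00:59.857", 1208), ("15/08/13 12:01:17.019", 1022),
    ("15/08/13 12:01:21.341", 1022), ("15/08/13 12:01:23.151", 1205),
    ("15/08/13 12:01:54.354", 1204), ("15/08/13 12:01:56.445", 1208),
    ("15/08/13 12:02:07.351", 1205), ("15/08/13 12:02:41.847", 1204),
    ("15/08/13 12:02:47.650", 1208), ("15/08/13 12:03:09.353", 1208),
]


def build_log_vectors(logs, time_format="%y/%m/%d %H:%M:%S.%f"):
    """Count logs per event id within each one-minute period.

    The time format is an assumption about how the example rows are encoded.
    """
    counts = defaultdict(Counter)
    event_ids = set()
    for timestamp, event_id in logs:
        minute = datetime.strptime(timestamp, time_format).strftime("%H:%M")
        counts[minute][event_id] += 1
        event_ids.add(event_id)
    columns = sorted(event_ids)
    # One log vector per time period, with one component per event id.
    vectors = {
        minute: [counts[minute][event_id] for event_id in columns]
        for minute in sorted(counts)
    }
    return columns, vectors


columns, vectors = build_log_vectors(LOGS)
for minute, vector in vectors.items():
    print(minute, vector)
```

Because each user's logs are counted independently, the same function could be applied per user in parallel, which fits the distributed-processing point made earlier in the talk.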
20. 20 / 45
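Measuring the cosine similarity between the log vector (V_t) and the unit vector (E)
[Figure: V_t = (2, 1) and E = (1, 1) plotted on axes "Frequency of event A" and "Frequency of event B", with angle θ between them]
$$\cos(\theta) = \frac{V_t \cdot E}{|V_t|\,|E|} = \frac{\sum_i v_i e_i}{\sqrt{\sum_i v_i^2}\,\sqrt{\sum_i e_i^2}} = \frac{2 \times 1 + 1 \times 1}{\sqrt{2 \times 2 + 1 \times 1}\,\sqrt{1 \times 1 + 1 \times 1}} = \frac{3}{\sqrt{5}\,\sqrt{2}} \approx 0.9487$$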
Self-similarity
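A small sketch of the cosine computation for the slide's example, V_t = (2, 1) against E = (1, 1). The function name `cosine_to_ones` is hypothetical, and scoring an all-zero log vector as 0 is an assumption (the slide does not say how empty periods are handled).

```python
import math


def cosine_to_ones(vector):
    """Cosine similarity between a log vector and the all-ones vector E."""
    norm_v = math.sqrt(sum(x * x for x in vector))
    if norm_v == 0:
        return 0.0  # assumption: a period with no logs is scored as 0
    dot = sum(vector)  # dot product with (1, 1, ..., 1)
    norm_e = math.sqrt(len(vector))
    return dot / (norm_v * norm_e)


print(round(cosine_to_ones([2, 1]), 4))  # 0.9487
```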
21. 21 / 45
Measuring self-similarity
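Measuring the standard deviation of the cosine similarities and transforming it using the following model
$$H = 1 - \frac{1}{2}\sigma, \quad (0.5 \le H \le 1,\ \sigma: \text{standard deviation of the cosine similarities})$$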
Self-similarity
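The transformation can be sketched as below, reading the model as "one minus half the standard deviation of a user's cosine similarities". The function names are hypothetical, and the use of the population standard deviation is an assumption; the slide does not say which estimator is used.

```python
import math
import statistics


def cosine_to_ones(vector):
    """Cosine similarity between a log vector and the all-ones vector."""
    norm_v = math.sqrt(sum(x * x for x in vector))
    if norm_v == 0:
        return 0.0  # assumption: empty periods are scored as 0
    return sum(vector) / (norm_v * math.sqrt(len(vector)))


def self_similarity(log_vectors):
    """1 - sigma / 2, where sigma is the spread of the cosine similarities."""
    cosines = [cosine_to_ones(v) for v in log_vectors]
    sigma = statistics.pstdev(cosines)  # assumption: population std. dev.
    return 1.0 - 0.5 * sigma


# Log vectors from the User A example (one per minute).
user_a = [[0, 1, 1, 3], [2, 1, 1, 1], [0, 1, 1, 1], [0, 0, 0, 1]]
print(self_similarity(user_a))

# A perfectly repetitive sequence has zero spread, so the value is 1.
print(self_similarity([[1, 2], [1, 2], [1, 2]]))
```

The more a user's per-period activity mix repeats, the smaller the spread and the closer the value gets to 1.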
22. 22 / 45
Modeling
Logistic regression
Calculating the probability of a character being a game bot
Modeling and Evaluation
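As an illustration of how a logistic regression turns a feature into a bot probability, here is a minimal single-feature sketch trained by gradient descent. It is not the authors' model: the training values are toy numbers invented for the example, and `fit_logistic`, the learning rate, and the epoch count are all hypothetical choices.

```python
import math
import statistics


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.5, epochs=2000):
    """Fit p(bot) = sigmoid(w * z + b) on a standardized feature z."""
    mean, std = statistics.fmean(xs), statistics.pstdev(xs)
    zs = [(x - mean) / std for x in xs]
    w = b = 0.0
    for _ in range(epochs):
        # Gradient of the average log loss with respect to w and b.
        errors = [sigmoid(w * z + b) - y for z, y in zip(zs, ys)]
        w -= learning_rate * sum(e * z for e, z in zip(errors, zs)) / len(zs)
        b -= learning_rate * sum(errors) / len(zs)

    def bot_probability(x):
        return sigmoid(w * (x - mean) / std + b)

    return bot_probability


# Toy self-similarity values (invented for illustration): 0 = human, 1 = bot.
features = [0.80, 0.82, 0.85, 0.88, 0.93, 0.95, 0.97, 0.99]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

bot_probability = fit_logistic(features, labels)
print(bot_probability(0.99), bot_probability(0.80))
```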
25. 25 / 45
Bots have cosine similarities with fewer variations than human users
[Figure: cosine similarities of bots and humans]
Experiments
26. 26 / 45
Self-similarity
Almost all bots have higher values than human users
[Figure: self-similarity for Lineage, Aion, and B&S]
Experiments
27. 27 / 45
Additional feature selection
Exceptional cases: short play time or no activity over a long time
Outliers
No.   Field name           Description
1     self_sim             Self-similarity
2     cosim_count          Count of a set of log vectors
3     cosim_uniq_count     Unique count of a set of log vectors
4     cosim_zero_count     Count of data in which cosine similarity is zero
5     cosim_mode           Count of data that appears most often in a set of log vectors
6     total_log_count      Total count of logs generated by user
7     main_char_level      Character level
8     total_use_time_min   Play time during a certain period per user
9     npc_kill_count       NPC kill count
10    trade_get_count      Count of trades in which user takes items
11    trade_give_count     Count of trades in which user gives items
12    retrieve_count       Count of activities in which user retrieves items from warehouse
13    deposit_count        Count of activities in which user deposits items to warehouse
14    log_count_per_min    Average count of logs generated per minute
Feature selection
28. 28 / 45
Performance evaluation
Model 1: using only self-similarity. Model 2: using all features
Game      BOT   Human   AUC (model 1)   AUC (model 2)
Lineage   128   149     0.8967          0.9455
Aion      186   160     0.9557          0.9942
B&S       131   129     0.8280          0.9399
Experiments
[Figure: results for Lineage, Aion, and B&S]
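For readers unfamiliar with AUC, the metric in the table can be computed from predicted bot probabilities as the chance that a randomly chosen bot scores higher than a randomly chosen human. The sketch below uses that pairwise definition; the function name `auc` and the sample scores are hypothetical and do not come from the evaluation above.

```python
def auc(bot_scores, human_scores):
    """Probability that a random bot outscores a random human (ties count 0.5)."""
    wins = 0.0
    for bot in bot_scores:
        for human in human_scores:
            if bot > human:
                wins += 1.0
            elif bot == human:
                wins += 0.5
    return wins / (len(bot_scores) * len(human_scores))


# Invented scores: three of the four bot/human pairs are ranked correctly.
print(auc([0.9, 0.3], [0.5, 0.1]))  # 0.75
```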
30. 30 / 45
Motivation and consideration
How to optimize the time for retraining
Too often -> high cost
Too rarely -> obsolete model
How to retrain a model autonomously
Model maintenance
31. 31 / 45
System Flow
[Figure: BOT Detection System, with components Game Logs, Preprocessor, Predictor, Change Detector, Modeler, Model (PMML), Ground Truth, and Inspector]
Model maintenance
32. 32 / 45
Logic Flow
If a change is detected, retrain the model
Notify the operator if the new model is invalid or a change is detected consecutively
Flow:
Calculate bot probability
Change detected? no -> End
Change detected? yes -> Already retraining?
Already retraining? yes -> Notify operator
Already retraining? no -> Model retraining -> Validation check
Validation check: invalid -> Notify operator
Validation check: valid -> End
Model maintenance
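One way to read the logic flow is as a small decision function. This is a sketch of that reading only: the function name, the returned labels, and the `retrain_and_validate` callback are hypothetical, and the branch order is inferred from the slide text rather than taken from the deployed system.

```python
def maintenance_step(change_detected, already_retraining, retrain_and_validate):
    """Return the action for one pass of the maintenance flow.

    retrain_and_validate is a hypothetical callback that retrains the model
    and returns True if the new model passes the validation check.
    """
    if not change_detected:
        return "end"
    if already_retraining:
        # A change detected again while retraining is escalated.
        return "notify_operator"
    if retrain_and_validate():
        return "end"
    return "notify_operator"


print(maintenance_step(False, False, lambda: True))
print(maintenance_step(True, False, lambda: False))
```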
33. 33 / 45
EWMA Algorithm
Model maintenance
34. 34 / 45
EWMA algorithm
User   Bot probability (time t)   Bot probability (time t-1)
A      0.99                       0.95
B      0.95                       0.92
C      0.23                       0.25
D      0.55                       0.55
Calculating the correlation coefficient of bot probability between time t and t-1
Model maintenance
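The correlation step can be sketched with the four users from the table. The slide only says "correlation coefficient"; using the Pearson coefficient here is an assumption, and the function name `pearson` is hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Bot probabilities for users A, B, C, D at time t and time t-1.
prob_t = [0.99, 0.95, 0.23, 0.55]
prob_prev = [0.95, 0.92, 0.25, 0.55]
print(pearson(prob_t, prob_prev))
```

A coefficient close to 1 means the model ranks users almost the same way in consecutive periods; a drop suggests that the predictions have shifted.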
35. 35 / 45
EWMA algorithm
Calculating the correlation coefficient of bot probability between time t and t-1
Calculating the weighted moving average of coefficients
(X: coefficient, Z: moving average)
Model maintenance
36. 36 / 45
EWMA algorithm
Model maintenance
Measuring upper and lower control limits
37. 37 / 45
EWMA algorithm
Model maintenance
Retraining the model unless LCL < Z < UCL, i.e., unless the moving average Z stays between the lower and upper control limits
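The formulas are not legible in this transcript, so the sketch below uses the textbook EWMA control chart as an assumption: Z_t = lam * X_t + (1 - lam) * Z_(t-1), with limits at baseline_mean ± width * baseline_std * sqrt(lam / (2 - lam) * (1 - (1 - lam)^(2t))). The parameter names (`lam`, `width`, `baseline_mean`, `baseline_std`), their defaults, and the sample coefficients are hypothetical.

```python
import math


def ewma_chart(coefficients, baseline_mean, baseline_std, lam=0.2, width=3.0):
    """Return (Z, LCL, UCL, retrain) for each correlation coefficient X."""
    z = baseline_mean  # the moving average starts at the baseline
    results = []
    for t, x in enumerate(coefficients, start=1):
        z = lam * x + (1.0 - lam) * z
        spread = width * baseline_std * math.sqrt(
            lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t))
        )
        lcl, ucl = baseline_mean - spread, baseline_mean + spread
        retrain = not (lcl < z < ucl)  # retrain unless LCL < Z < UCL
        results.append((z, lcl, ucl, retrain))
    return results


# Invented coefficients: stable for two periods, then a sharp drop.
chart = ewma_chart([0.98, 0.98, 0.60], baseline_mean=0.98, baseline_std=0.01)
for z, lcl, ucl, retrain in chart:
    print(round(z, 4), round(lcl, 4), round(ucl, 4), retrain)
```

A smaller `lam` smooths out single noisy periods, which relates to the trade-off between retraining too often and too rarely raised earlier.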
39. 39 / 45
BOT detection system dashboard
Real-World Deployment
Provides trends in the number and rate of bots, and charts of bot statistics by main activity zone
[Figure: dashboard panels: trend of bot count, trend of bot rate, trend of bot rate at a specific zone, bot statistics by activity zone]
40. 40 / 45
BOT detection system search and filter
Real-World Deployment
Search and filter the list of accounts to ban
[Figure: search screen. Fill in the conditions to filter accounts to ban; the list of accounts to ban is printed according to the search conditions]
42. 42 / 45
We proposed self-similarity as a feature and demonstrated its effectiveness with real datasets
We proposed a bot detection framework that includes a detection model maintenance process
We implemented the proposed framework and utilized it for live MMORPGs
Conclusion
Contributions
43. 43 / 45
Future work: short-time playing bots
A massive number of bots that play for less than 10 hours per week go undetected
Star-shaped trading network structure
Conclusion
[Figure: short-time playing bots vs. human users]
44. 44 / 45
Future work: occasional bot users
Human players who play for hours and then turn on a bot for a few hours
Self-similarity shows pulses if we aggregate over short periods of time
Conclusion