You are a Game Bot:
Uncovering Game Bots in MMORPGs via Self-similarity in the Wild
Eunjo Lee (NCSOFT)
Jiyoung Woo (Korea University)
Hyoungshick Kim (Sungkyunkwan University)
Aziz Mohaisen (State University of New York at Buffalo)
Huy Kang Kim (Korea University)
2 / 45
Contents
Introduction
Feature selection and modeling
Experiments
Model maintenance
Real-World deployment
Conclusions
3 / 45
Introduction
4 / 45
Introduction
Game BOT
 A program that plays a game autonomously in place of a human user
Bot configurations
5 / 45
Introduction
Real Money Trading (RMT)
 Collect valuable items and monetize them by trading the items to other players
[Figure: virtual assets in the game world are exchanged for real money]
6 / 45
Introduction
Gold Farming Group (GFG)
7 / 45
Introduction
Game BOT
https://www.youtube.com/watch?v=k6tk8_R2w08
8 / 45
Introduction
Game BOT
 A widespread form of cheating in online games
 Collapses the in-game economy
 Causes human users to churn
 Reduces revenue
9 / 45
Introduction
Countermeasures
 Client-side: bot process detection using anti-malware programs
 Server-side: bot classification using game log analysis
10 / 45
Machine Learning-based Approach
Introduction
Prediction Model
Character ID T1 T2 T3 Response
686042 0 0 0 0
854209 1 1 1 3
1032131 0 0 0 0
1049483 1 1 1 3
1340479 0 0 0 0
1352850 0 0 0 0
1771815 1 1 1 3
1832497 0 0 0 0
1884884 1 1 1 3
2130576 1 1 1 3
2445903 1 0 0 1
[Diagram labels: Game Logs, Ground Truth, Feature Selection, Learning Algorithm]
11 / 45
Introduction
Challenges
Raw data collection -> Preprocessing -> Cleaning -> Visualizing -> Game bots' pattern
12 / 45
Introduction
Challenges
Game A: Raw data collection -> Preprocessing -> Cleaning -> Visualizing -> Bots' pattern
Game B: Raw data collection -> Preprocessing -> Cleaning -> Visualizing -> Bots' pattern
High cost, time-consuming
13 / 45
Introduction
Challenges
Consistent maintenance
Game update
Bot change
14 / 45
Introduction
Our proposals
 Using self-similarity as a generic feature
 Focus on the repetitive activities of game bots, not specific behavior
 Proposing a framework to maintain a prediction model autonomously
 Detect the change in performance of the prediction model and retrain it
15 / 45
Feature Selection and Modeling
16 / 45
Definition
 Measurement of the similarity of periodic actions per user
Self-similarity
17 / 45
Motivation and consideration
 Intrinsic attributes
 Bot programs repeat routines using predetermined settings
 Human users may exhibit similar behavior, but not for a long period of time
 Stability
 Little effect of game update or bot program changes
 Considering various actions rather than a single action
 Computing efficiency
 Easy to apply distributed algorithms (e.g., MapReduce) for log processing
Self-similarity
18 / 45
Detailed process
 Generating log vectors
 Measuring cosine similarity
 Measuring self-similarity
Self-similarity
19 / 45
Generating log vectors
Game logs are distributed by user (User A, User B, User C). Example: game logs of User A

Time                   Event id  Character info.  Other info.
15/08/13 12:00:12.131  1205      AAA, 34          N/A
15/08/13 12:00:14.237  1204      AAA, 34          N/A
15/08/13 12:00:59.436  1208      AAA, 34          Ogre
15/08/13 12:00:59.436  1208      AAA, 34          Ogre
15/08/13 12:00:59.857  1208      AAA, 34          Troll
15/08/13 12:01:17.019  1022      AAA, 34          Ring
15/08/13 12:01:21.341  1022      AAA, 34          Sword
15/08/13 12:01:23.151  1205      AAA, 34          N/A
15/08/13 12:01:54.354  1204      AAA, 34          N/A
15/08/13 12:01:56.445  1208      AAA, 34          Wolf
15/08/13 12:02:07.351  1205      AAA, 34          N/A
15/08/13 12:02:41.847  1204      AAA, 34          N/A
15/08/13 12:02:47.650  1208      AAA, 34          Ogre
15/08/13 12:03:09.353  1208      AAA, 34          Ogre

Log count per event id, by time period (hour:min), for User A

Time period  1022  1204  1205  1208
12:00        0     1     1     3
12:01        2     1     1     1
12:02        0     1     1     1
12:03        0     0     0     1
Self-similarity
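The log-vector step can be sketched as follows. This is a minimal illustration, not the production pipeline: it assumes one-minute periods (as in the hour:min table), assumes the timestamp format is yy/mm/dd, and uses hypothetical names (`LOGS`, `log_vectors`, `EVENT_IDS`).

```python
from collections import Counter, defaultdict
from datetime import datetime

# (timestamp, event id) pairs for User A, copied from the example slide.
LOGS = [
    ("15/08/13 12:00:12.131", 1205), ("15/08/13 12:00:14.237", 1204),
    ("15/08/13 12:00:59.436", 1208), ("15/08/13 12:00:59.436", 1208),
    ("15/08/13 12:00:59.857", 1208), ("15/08/13 12:01:17.019", 1022),
    ("15/08/13 12:01:21.341", 1022), ("15/08/13 12:01:23.151", 1205),
    ("15/08/13 12:01:54.354", 1204), ("15/08/13 12:01:56.445", 1208),
    ("15/08/13 12:02:07.351", 1205), ("15/08/13 12:02:41.847", 1204),
    ("15/08/13 12:02:47.650", 1208), ("15/08/13 12:03:09.353", 1208),
]

def log_vectors(logs, event_ids):
    """Count events per one-minute period; return {period: [count per event id]}."""
    counts = defaultdict(Counter)
    for stamp, event_id in logs:
        # Assumption: the date part is yy/mm/dd.
        when = datetime.strptime(stamp, "%y/%m/%d %H:%M:%S.%f")
        counts[when.strftime("%H:%M")][event_id] += 1
    return {period: [c[e] for e in event_ids]
            for period, c in sorted(counts.items())}

EVENT_IDS = [1022, 1204, 1205, 1208]
VECTORS = log_vectors(LOGS, EVENT_IDS)
```

Each value in `VECTORS` is one log vector, one row of the count table. Because every user's logs are processed independently, this step maps naturally onto a distributed job keyed by user.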
20 / 45
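Measuring the cosine similarity between the log vector (V_t) and the unit vector (E, all elements equal to 1)
Axes: frequency of event A, frequency of event B. Example: V_t = (2, 1), E = (1, 1)
$$\cos\theta = \frac{V_t \cdot E}{\lVert V_t \rVert \, \lVert E \rVert} = \frac{\sum_i v_i e_i}{\sqrt{\sum_i v_i^2} \, \sqrt{\sum_i e_i^2}} = \frac{2 \times 1 + 1 \times 1}{\sqrt{2 \times 2 + 1 \times 1} \, \sqrt{1 \times 1 + 1 \times 1}} = \frac{3}{\sqrt{5} \, \sqrt{2}} \approx 0.949$$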
Self-similarity
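The worked example on this slide can be checked with a few lines of code. This is a sketch under one stated assumption: an all-zero log vector (a period with no activity) is scored as 0, which the slide does not specify. The function name `cosine_to_ones` is hypothetical.

```python
import math

def cosine_to_ones(vector):
    """Cosine similarity between `vector` and an all-ones vector of equal length."""
    dot = sum(vector)  # v . E, because every element of E is 1
    norm_v = math.sqrt(sum(v * v for v in vector))
    norm_e = math.sqrt(len(vector))
    if norm_v == 0:
        return 0.0  # assumption: an empty period is scored as 0
    return dot / (norm_v * norm_e)

EXAMPLE = cosine_to_ones([2, 1])  # the slide's V_t = (2, 1)
```

Comparing every log vector against the same fixed reference vector reduces each time period to a single number, so a user's activity becomes a one-dimensional series that is cheap to summarize.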
21 / 45
Measuring self-similarity
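 Measuring the standard deviation of the cosine similarities and transforming it using the following model
$$H = 1 - \frac{1}{2}\sigma, \quad 0.5 \le H \le 1, \quad \sigma: \text{standard deviation of the cosine similarities}$$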
Self-similarity
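A minimal sketch of this transformation, assuming the population standard deviation is meant (the slide does not say whether the population or sample form is used). The name `self_similarity` is hypothetical.

```python
import statistics

def self_similarity(cosine_similarities):
    """Return 1 - sigma / 2, where sigma is the std. dev. of the cosine similarities."""
    # Assumption: population standard deviation (pstdev), not the sample form.
    sigma = statistics.pstdev(cosine_similarities)
    return 1.0 - 0.5 * sigma

# Illustrative values only, not taken from the datasets in the talk.
BOT_LIKE = self_similarity([0.95, 0.95, 0.94, 0.95])
HUMAN_LIKE = self_similarity([0.20, 0.90, 0.50, 0.70])
```

A character whose cosine similarities barely change from period to period gets a value close to 1, while irregular activity pushes the value down.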
22 / 45
Modeling
 Logistic regression
 Calculating the probability of a character being a game bot
Modeling and Evaluation
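As a rough illustration of the modeling step, the sketch below fits a one-feature logistic regression by plain gradient descent. Everything in it is an assumption made for illustration: the toy data, the single feature, the learning rate, and the names `fit_logistic` and `bot_probability`. The talk does not describe the training procedure or library that was actually used.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=1.0, steps=2000):
    """Fit p(bot) = sigmoid(w * (x - mean) + b) by batch gradient descent."""
    mean = sum(xs) / len(xs)  # center the feature so training is better conditioned
    w = b = 0.0
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * (x - mean) + b) - y  # gradient of the log-loss
            grad_w += err * (x - mean)
            grad_b += err
        w -= lr * grad_w / len(xs)
        b -= lr * grad_b / len(xs)
    return lambda x: sigmoid(w * (x - mean) + b)

# Toy self-similarity values: label 1 = bot, 0 = human (illustrative only).
SELF_SIM = [0.99, 0.97, 0.95, 0.92, 0.75, 0.70, 0.65, 0.60]
IS_BOT = [1, 1, 1, 1, 0, 0, 0, 0]
bot_probability = fit_logistic(SELF_SIM, IS_BOT)
```

Calling `bot_probability(0.98)` returns the estimated probability that a character with that self-similarity is a bot. A probability output, rather than a hard label, is what lets the later maintenance stage compare scores across time.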
23 / 45
Experiments
24 / 45
Datasets
Experiments
Lineage Aion B&S
Release year 1997 2008 2012
Daily active users 300K 200K 100K
Concurrent users 150K 80K 50K
25 / 45
 Bots' cosine similarities vary less than those of human users
[Figure: cosine similarities of bots vs. humans]
Experiments
26 / 45
Self-similarity
 Almost all bots have higher self-similarity values than human users
[Figure panels: Lineage, Aion, B&S]
Experiments
27 / 45
Additional feature selection
 Exceptional cases: short play time, or no activity over a long time
 Outliers
No.  Field name          Description
1    self_sim            Self-similarity
2    cosim_count         Number of log vectors in the set
3    cosim_uniq_count    Number of unique log vectors in the set
4    cosim_zero_count    Number of data points whose cosine similarity is zero
5    cosim_mode          Count of the value that appears most often in the set of log vectors
6    total_log_count     Total number of logs generated by the user
7    main_char_level     Character level
8    total_use_time_min  Play time per user during a certain period
9    npc_kill_count      NPC kill count
10   trade_get_count     Number of trades in which the user receives items
11   trade_give_count    Number of trades in which the user gives items
12   retrieve_count      Number of times the user retrieves items from the warehouse
13   deposit_count       Number of times the user deposits items in the warehouse
14   log_count_per_min   Average number of logs generated per minute
Feature selection
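One possible reading of the four `cosim_*` fields is sketched below. It assumes they are computed from a user's list of per-period cosine similarities, and that values are rounded before counting so that near-identical floats are treated as equal. Both points are assumptions, as is the function name `cosim_features`; the slide gives only the one-line descriptions.

```python
from collections import Counter

def cosim_features(cosine_similarities, digits=3):
    """Summary counts over one user's per-period cosine similarities."""
    # Assumption: round first, so tiny float differences do not inflate the unique count.
    rounded = [round(c, digits) for c in cosine_similarities]
    freq = Counter(rounded)
    return {
        "cosim_count": len(rounded),                      # how many periods were observed
        "cosim_uniq_count": len(freq),                    # how many distinct values
        "cosim_zero_count": freq[0.0],                    # periods scored as zero
        "cosim_mode": max(freq.values()) if freq else 0,  # count of the most frequent value
    }

FEATURES = cosim_features([0.9487, 0.9487, 0.0, 0.5])
```

These counts help with the exceptional cases named on the slide: a user with very few periods, or mostly zero-valued periods, can have a misleadingly high self-similarity on its own.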
28 / 45
Performance evaluation
 Model 1 uses only self-similarity; Model 2 uses all features
Game     Bot  Human  AUC (Model 1)  AUC (Model 2)
Lineage  128  149    0.8967         0.9455
Aion     186  160    0.9557         0.9942
B&S      131  129    0.8280         0.9399
Experiments
Lineage Aion B&S
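For readers unfamiliar with the metric, AUC can be computed directly from its pairwise definition: the probability that a randomly chosen bot receives a higher score than a randomly chosen human. The sketch below uses that definition with made-up scores; it is not the evaluation code behind the table, and the name `auc` is hypothetical.

```python
def auc(bot_scores, human_scores):
    """Fraction of (bot, human) pairs in which the bot scores higher; ties count half."""
    wins = 0.0
    for b in bot_scores:
        for h in human_scores:
            if b > h:
                wins += 1.0
            elif b == h:
                wins += 0.5
    return wins / (len(bot_scores) * len(human_scores))

# Illustrative scores only: three of the four pairs are ordered correctly.
EXAMPLE_AUC = auc([0.9, 0.4], [0.5, 0.1])
```

The quadratic loop is fine for sample sizes like those in the table (a few hundred characters per game); a rank-based formula would be preferable for much larger sets.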
29 / 45
Model Maintenance
30 / 45
Motivation and consideration
 How to optimize the timing of retraining
 Too often -> high cost
 Too rarely -> obsolete model
 How to retrain a model autonomously
Model maintenance
31 / 45
System Flow
[Figure: BOT Detection System. Labels: Game Logs, Preprocessor, Predictor, Change Detector, Modeler, Model (PMML), Ground Truth, Inspector]
Model maintenance
32 / 45
Logic Flow
 If a change is detected, retrain the model
 Notify the operator if the new model is invalid or a change is detected consecutively
[Flowchart: Calculate bot probability -> Change detected? (no: End; yes: Already retraining?) -> Already retraining? (yes: Notify operator; no: Model retraining) -> Validation check (invalid: Notify operator; valid: End)]
Model maintenance
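The decision logic on this slide can be written as one small function. This is a sketch of one reading of the flowchart: the branch order, the return labels, and the callback names (`retrain`, `validate`, `notify`) are all hypothetical and not taken from the deployed system.

```python
def maintenance_step(change_detected, already_retraining, retrain, validate, notify):
    """Run one pass of the maintenance logic and report what happened."""
    if not change_detected:
        return "end"  # model still looks healthy, nothing to do
    if already_retraining:
        # A change right after a retrain suggests retraining alone is not enough.
        notify("change detected consecutively")
        return "notified"
    model = retrain()
    if not validate(model):
        notify("new model is invalid")
        return "notified"
    return "retrained"

# Usage with stand-in callbacks.
messages = []
outcome = maintenance_step(
    change_detected=True,
    already_retraining=False,
    retrain=lambda: "new-model",
    validate=lambda model: True,
    notify=messages.append,
)
```

Passing the retraining, validation, and notification steps in as callbacks keeps the control flow testable without a real model or alerting channel.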
33 / 45
Logic Flow
[Same bullets and flowchart as the previous slide, with the change-detection step annotated: EWMA Algorithm]
Model maintenance
34 / 45
EWMA algorithm
User  Bot probability (time t)  Bot probability (time t-1)
A     0.99                      0.95
B     0.95                      0.92
C     0.23                      0.25
D     0.55                      0.55
...   ...                       ...
 Calculating the correlation coefficient of bot probability between time t and t-1
Model maintenance
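As an illustration, the sketch below computes a correlation coefficient for the four example users in the table. It assumes the Pearson coefficient is meant, since the slide says only "correlation coefficient", and the names `pearson`, `PROB_T`, and `PROB_PREV` are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Bot probabilities of users A-D at time t and at time t-1, from the table.
PROB_T = [0.99, 0.95, 0.23, 0.55]
PROB_PREV = [0.95, 0.92, 0.25, 0.55]
R = pearson(PROB_T, PROB_PREV)
```

For these four users the coefficient is very close to 1, meaning the model ranks them almost identically at both times. A sharp drop in this value between consecutive periods is the kind of signal the change detector looks for.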
35 / 45
EWMA algorithm
 Calculating the correlation coefficient of bot probability between time t and t-1
 Calculating the weighted moving average of coefficients
(X: coefficient, Z: moving average)
Model maintenance
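The formula image for this slide did not survive extraction. The sketch below uses the standard EWMA recursion, Z_t = lam * X_t + (1 - lam) * Z_(t-1), with X the correlation coefficient and Z the moving average as labeled on the slide. The smoothing weight `lam = 0.2` and the choice of starting value are assumptions.

```python
def ewma(coefficients, lam=0.2, start=None):
    """Exponentially weighted moving average of a series of coefficients."""
    # Assumption: start from the first observation unless a baseline is supplied.
    z = coefficients[0] if start is None else start
    averages = []
    for x in coefficients:
        z = lam * x + (1.0 - lam) * z  # recent values weigh more, older ones decay
        averages.append(z)
    return averages

# Illustrative series: a stable model followed by a sudden drop in correlation.
SMOOTHED = ewma([0.98, 0.97, 0.98, 0.60], lam=0.2)
```

A smaller `lam` makes the average react more slowly, which suppresses one-off dips but delays detection of a genuine change.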
36 / 45
EWMA algorithm
Model maintenance
 Calculating the correlation coefficient of bot probability between time t and t-1
 Calculating the weighted moving average of coefficients
 Measuring upper and lower control limits
37 / 45
EWMA algorithm
Model maintenance
 Calculating the correlation coefficient of bot probability between time t and t-1
 Calculating the weighted moving average of coefficients
 Measuring upper and lower control limits
 Retraining the model unless LCL < Z < UCL (Z: moving average; LCL, UCL: lower and upper control limits)
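The slide does not show how the control limits are computed. The sketch below uses the textbook EWMA control chart, where the limits are the baseline mean plus or minus `width * sigma * sqrt(lam / (2 - lam))`, optionally with the time-dependent correction for early periods. The baseline `mean` and `sigma`, the values `lam = 0.2` and `width = 3.0`, and both function names are assumptions.

```python
import math

def control_limits(mean, sigma, lam=0.2, width=3.0, t=None):
    """Lower and upper control limits of a standard EWMA chart."""
    factor = lam / (2.0 - lam)
    if t is not None:
        # Early periods have tighter limits; the correction vanishes as t grows.
        factor *= 1.0 - (1.0 - lam) ** (2 * t)
    half_width = width * sigma * math.sqrt(factor)
    return mean - half_width, mean + half_width

def needs_retraining(z, mean, sigma, lam=0.2, width=3.0):
    """True unless the moving average z lies strictly inside the control limits."""
    lcl, ucl = control_limits(mean, sigma, lam, width)
    return not (lcl < z < ucl)

# Baseline estimated from a stable period (illustrative numbers).
LCL, UCL = control_limits(mean=0.9, sigma=0.03)
```

With these numbers the limits are about 0.87 and 0.93, so a moving average of 0.80 would trigger retraining while 0.90 would not.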
38 / 45
Real-World Deployment
39 / 45
BOT detection system: dashboard
Real-World Deployment
 Shows trends in the number and rate of bots, plus bot statistics by main activity zone
[Dashboard panels: trend of bot count; trend of bot rate; trend of bot rate in a specific zone; bot statistics by activity zone]
40 / 45
BOT detection system: search and filter
Real-World Deployment
 Search and filter the list of accounts to ban
[Screenshot callouts: fill in the conditions to filter accounts to ban; print the list of accounts to ban that match the search conditions]
41 / 45
Conclusion
42 / 45
 We proposed self-similarity as a feature and demonstrated its effectiveness with real datasets
 We proposed a bot detection framework that includes a detection model maintenance process
 We implemented the proposed framework and utilized it for live MMORPGs
Conclusion
Contributions
43 / 45
Future work: short-time playing bots
 A massive number of bots that play for less than 10 hours per week go undetected
 Star-shaped trading network structure
Conclusion
[Figure: short-time playing bots vs. human users]
44 / 45
Future work: occasional bot users
 Human players who play for hours and then turn on a bot for a few hours
 Self-similarity shows pulses when a short aggregation period is used
Conclusion
Questions and Answers


Ndss 2016 game_bot_final_no_video
