VxClass for Incident Response

zynamics
info@zynamics.com
Introduction

 Binary code is often left behind by attackers
   Running processes
   Dropped executables
   Kernel memory snapshots
   Network traffic
   Crash dumps
Introduction

 Useful evidence but difficult to analyze
 Current methods:
   Use AV scanner
   Run executable to provoke/observe behavior
   Remove packer/obfuscator code
   Manual analysis using IDA Pro
Current Methods

   Error-prone and time-consuming
   AV signatures are brittle and often out of date
   Behavior can be difficult to provoke
   Removal of protection code is difficult
   Manual analysis
     Does not scale
     No easy correlation of results
VxClass

   Structural malware classification tool
   Categorizes malware samples into families
   Groups malware that shares code
   Allows correlation between samples
     Regardless of how they were obtained
VxClass

   Upload of samples through a web server
   Generic unpacking through emulation
   Extraction of structural information
   Comparison with known samples
   Storage of the results in a SQL database
   Visualization of the results in the browser
Uploading

 Upload samples
   Through a web interface in your browser
   Through XML-RPC
   User-based access control to samples:
      Public: All users can see and download
      Limited: All users can see, but not download
      Private: Only original uploader can see and download
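
The three access levels above can be read as two permission checks. The following is a minimal sketch, not the VxClass implementation: the names `Visibility`, `Sample`, `can_see` and `can_download` are hypothetical, and it assumes that the original uploader can always download their own samples, which the slide does not state for the Limited level.

```python
from dataclasses import dataclass
from enum import Enum


class Visibility(Enum):
    PUBLIC = "public"    # all users can see and download
    LIMITED = "limited"  # all users can see, but not download
    PRIVATE = "private"  # only the original uploader can see and download


@dataclass(frozen=True)
class Sample:
    sha256: str
    uploader: str
    visibility: Visibility


def can_see(user: str, sample: Sample) -> bool:
    """Private samples are visible only to their uploader."""
    if sample.visibility is Visibility.PRIVATE:
        return user == sample.uploader
    return True


def can_download(user: str, sample: Sample) -> bool:
    """Only public samples can be downloaded by everyone.

    Assumption: the uploader keeps download rights at every level.
    """
    if sample.visibility is Visibility.PUBLIC:
        return True
    return user == sample.uploader
```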
Unpacking

 Generic unpacking is difficult
   Anti-debugging tricks
   Attempts to foil emulators
   Creation of and interaction between multiple
    processes
   Code obfuscation
Unpacking

 Our approach: Full system emulation
 Emulated Windows XP SP2 in Bochs
 Run the executable until it looks unpacked
 Acquire memory of all processes and dirty
  kernel pages
 Use code in acquired memory for classification
Unpacking

 Solved problems
   Anti-Debugging tricks
   Legacy API calls
   Multiple processes
   Interprocess communication
   Kernel memory analysis

 Result: Most packers can be unpacked
  automatically
Comparison

 Problem: Meaningful comparison of binary
  code
 Byte-by-byte comparison is useless
 Our approach: Structural comparison
   Award-winning (German IT-Security Award 2006)
   Uses industry-standard BinDiff engine
   Uses patent-pending MD-Index (more later)
Structural Comparison

 Extract call graph and flow graph information
  from samples
 Compare the structure of these graphs instead
  of byte sequences
 Matches code derived from the same source
   Regardless of compiler settings
   Regardless of compiler
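
To illustrate why graph structure survives changes that alter the bytes, here is a minimal sketch of a structural comparison. It is not the BinDiff engine: it assumes each function is reduced to a hypothetical signature of (basic blocks, flow-graph edges, calls) and scores two samples by how many signatures they share.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Function:
    blocks: int                          # number of basic blocks
    edges: Tuple[Tuple[int, int], ...]   # flow-graph edges between block indices
    calls: int                           # number of call instructions


def signature(fn: Function) -> Tuple[int, int, int]:
    """Structural fingerprint that ignores the actual instruction bytes."""
    return (fn.blocks, len(fn.edges), fn.calls)


def structural_similarity(a: List[Function], b: List[Function]) -> float:
    """Share of functions with matching signatures, between 0.0 and 1.0."""
    if not a and not b:
        return 0.0  # nothing to compare
    sig_a = Counter(signature(f) for f in a)
    sig_b = Counter(signature(f) for f in b)
    shared = sum((sig_a & sig_b).values())  # multiset intersection
    return 2.0 * shared / (len(a) + len(b))
```

Two builds of the same source tend to keep these counts even when register allocation or instruction selection differs, which is the property a byte comparison lacks. A real engine would go further and match the graphs themselves.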
MD-Index

 Patent-pending
 Clever hash function for directed graphs
 Assigns 80-bit value to a directed graph
   Allows keeping a database of flow graphs
   Allows efficient queries into the database
 Is used within VxClass for several purposes:
   Very fast approximate comparison
   Code search
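
The idea of hashing a directed graph to a fixed 80-bit value can be sketched as follows. This is a stand-in, not the patent-pending MD-Index: the function name `graph_hash_80` and the choice of features (in- and out-degrees of each edge's endpoints, fed through SHA-256) are assumptions made for illustration only.

```python
import hashlib
from collections import Counter
from typing import Iterable, Tuple

Edge = Tuple[str, str]


def graph_hash_80(edges: Iterable[Edge]) -> int:
    """Hash a directed graph to 80 bits, independent of node names.

    Each edge is described by the degrees of its endpoints, so relabeling
    the nodes or reordering the edges does not change the result.
    """
    edges = list(edges)
    out_deg = Counter(src for src, _ in edges)
    in_deg = Counter(dst for _, dst in edges)
    features = sorted(
        (in_deg[s], out_deg[s], in_deg[d], out_deg[d]) for s, d in edges
    )
    digest = hashlib.sha256(repr(features).encode("ascii")).digest()
    return int.from_bytes(digest[:10], "big")  # 10 bytes = 80 bits
```

Because the value is a plain integer, it can serve as a database key, which is what makes exact-match queries over many flow graphs cheap. Note that a degree-based hash like this one can collide for graphs that are not isomorphic.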
Results

   Memory dumps and recovered strings
   IDA files (IDB) of the resulting disassemblies
   Pairwise similarity scores
   Visualisation:
     Family trees
     Top-10-most-similar list
Architecture
 Malware enters through one of two front ends
   Automated upload: XMLRPC-Server
   Manual upload: Web-Interface
 Central SQL Database
 Distributed workers
   Unpacker (Bochs)
   Disassembly (IDA)
   Classifier (MD-Index, BinDiff)
Case Studies

 Noise reduction
   Automatically filter uninteresting samples
 Knowledge management
   Share information between analysts
 Attacker Correlation
   Is a set of attacks performed with one toolset?
 Code searching
   Find certain functions in known samples
Noise Reduction

 Upload new files to the system
 How similar are they to interesting samples?
   Comparison to database of known samples
 Prioritize accordingly
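
The triage step can be sketched as a ranking by the best similarity score against samples already marked as interesting. This is a minimal sketch under assumptions: the function `triage`, the score layout and the sample names are hypothetical, and scores are taken to lie between 0.0 and 1.0.

```python
from typing import Dict, List, Set, Tuple


def triage(
    scores: Dict[str, Dict[str, float]],
    interesting: Set[str],
) -> List[Tuple[str, float]]:
    """Rank new samples by their best match against interesting known samples.

    scores maps each new sample to {known sample: similarity}.
    Samples with no interesting match get 0.0 and sink to the bottom.
    """
    ranked = []
    for new_sample, by_known in scores.items():
        best = max(
            (s for known, s in by_known.items() if known in interesting),
            default=0.0,
        )
        ranked.append((new_sample, best))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked
```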
Knowledge management


 Each analyst uploads the samples they know to
  VxClass
 New malware comes in and gets uploaded
 VxClass determines which known samples it
  is similar to
 The analyst who knows the similar samples can be found
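
Finding the right analyst amounts to a lookup from the most similar known sample to whoever uploaded it. A minimal sketch, with hypothetical names (`find_expert`, the `owner` mapping) and an assumed similarity cut-off of 0.5:

```python
from typing import Dict, Optional


def find_expert(
    similarities: Dict[str, float],
    owner: Dict[str, str],
    threshold: float = 0.5,
) -> Optional[str]:
    """Return the analyst who uploaded the most similar known sample.

    similarities maps known sample -> score for the new sample;
    owner maps known sample -> analyst. Returns None when nothing
    reaches the threshold.
    """
    best_analyst = None
    best_score = threshold
    for sample, score in similarities.items():
        if sample in owner and score >= best_score:
            best_analyst, best_score = owner[sample], score
    return best_analyst
```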
Attacker Correlation

 A series of incidents is investigated
 Code is found on a large number of machines
 Classify the code using VxClass to find out:
   Is this one group of attackers?
   Is this similar to attacks seen in the past?
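
One way to answer the "one group of attackers?" question from pairwise similarity scores is to link every pair above a threshold and read off the connected groups. This is a minimal sketch, not how VxClass builds its family trees: the function `correlate`, the threshold of 0.7 and the single-linkage grouping are all assumptions.

```python
from typing import Dict, List, Set, Tuple


def correlate(
    samples: List[str],
    pair_scores: Dict[Tuple[str, str], float],
    threshold: float = 0.7,
) -> List[Set[str]]:
    """Group samples whose similarity chains together above the threshold.

    Every sample named in pair_scores must also appear in samples.
    Groups are returned largest first.
    """
    parent = {s: s for s in samples}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in pair_scores.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    groups: Dict[str, Set[str]] = {}
    for s in samples:
        groups.setdefault(find(s), set()).add(s)
    return sorted(groups.values(), key=lambda g: (-len(g), min(g)))
```

A single large group suggests one shared toolset across the incidents; several small groups suggest unrelated code.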
Code Searching

 A particularly strange piece of code (just one
  function) is identified
   Perhaps a strange encryption function
 Does this particular piece of code appear in
  other samples in the database?
 Search is not byte-based, but flow graph
  based (MD-Index)
 The answer is one click away
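
A search based on flow graphs rather than bytes can be sketched as an index keyed by a graph hash. This is an illustration only: the class `FlowGraphIndex` is hypothetical, and the hash values are treated as opaque integers produced by some structural graph hash, not by the actual MD-Index.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class FlowGraphIndex:
    """Maps a flow-graph hash to every (sample, function) that contains it."""

    def __init__(self) -> None:
        self._by_hash: Dict[int, Set[Tuple[str, str]]] = defaultdict(set)

    def add(self, graph_hash: int, sample: str, function: str) -> None:
        self._by_hash[graph_hash].add((sample, function))

    def search(self, graph_hash: int) -> List[Tuple[str, str]]:
        """Return all known occurrences of a function with this graph hash."""
        return sorted(self._by_hash.get(graph_hash, ()))
```

Looking up the hash of the suspicious function returns every sample that contains a structurally identical function, regardless of how the bytes differ.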
Performance

 One VxClass machine
   800-1600 samples per day
 Performance depends on
   Obfuscation complexity
   Size of the malware
   Size of the database
 Can be fully parallelized
   The only bottleneck is the central database
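
The throughput figure above allows a rough capacity estimate. The sketch below assumes linear scaling across machines, which ignores the central database bottleneck the slide mentions; the function name `machines_needed` is hypothetical, and the default rate is the lower bound of 800 samples per day.

```python
import math


def machines_needed(samples_per_day: int, per_machine_rate: int = 800) -> int:
    """Estimate how many VxClass machines a daily sample volume requires.

    Assumes throughput scales linearly with the number of machines.
    """
    if per_machine_rate <= 0:
        raise ValueError("per_machine_rate must be positive")
    return math.ceil(samples_per_day / per_machine_rate)
```

For 10,000 samples per day this gives 13 machines at the conservative rate of 800 and 7 machines at the optimistic rate of 1600.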
Behavioral Analysis

 VxClass is not a behavioral-analysis tool
 VxClass is complementary to such tools
 We recommend combining VxClass with
  behavior-monitoring tools such as
   CWSandbox (http://www.cwsandbox.org)
   Anubis (Free) (http://anubis.iseclabs.org)
VxClass Options

 VxClass on a single machine
   Run it inside your organisation
 VxClass distributed
   Scale it to your needs
 VxClass as service
   We host a machine for you
 VxClass as shared service
   We host a machine for you
   Multiple clients use a shared database
Existing Customers

 The German BSI
   Agency for security in information systems
 Vodafone Germany
   Pre-filters Symbian/ARM executables
 Other government entities and private
  companies
 Mostly used for attacker correlation and noise
  filtering
Limitations

 Heavy obfuscation of control flow
 Virtualizing packers
 Unpacking only works on 32-bit Windows
   No Linux / OSX / Mobile unpacking
   64-bit support is in the works


 Workaround: upload IDBs, which allows heavy
  manual intervention beforehand
FAQ

 What OS does it run on?
   It runs on a 64-bit Debian Lenny install
 Does it have any network dependencies?
   No
 How can we extend the system?
   All generated data is accessible through XML-RPC
   If needed, the SQL database can be accessed directly
   The SQL schema is available on request
Other questions?
