BITS Pilani
Hyderabad Campus
Hades: A Hadoop-based Framework
for Detection of Peer-to-Peer Botnets
Pratik Narang, Abhishek Thakur, Chittaranjan Hota
P2P: uses & misuses
Approach & Contributions
• Hades:
o A host-aggregation based detection system for P2P botnets
o Distinguishes Peer-to-Peer (P2P) botnets from benign P2P applications
• We propose a distributed data collection architecture in which data collectors are placed at multiple locations inside a network, close to the nodes (say, at an access switch or a Wi-Fi AP)
o This gives a view of inside-to-inside communication, which can be vital for detecting P2P bots inside a network that communicate with each other over the LAN
• Hades adopts a host-aggregation based approach, which obtains statistical features per host for all P2P hosts involved in network communications
o No signatures or Deep Packet Inspection (DPI) required
• Built on top of the Hadoop ecosystem, Hades is scalable by design

[Figure: Hades pipeline on a Hadoop cluster (name node and data nodes)]
1. Data collection
2. Parse packets with Tshark
3. Push data to HDFS
4. Host-based features extracted with Hive
5. Feature set evaluated against models built with Mahout
Output: P2P bots detected; trigger firewall rules
Network segments shown: Distributed Systems Lab, Student Hostels
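Step 2 of the pipeline can be sketched as follows. The slides do not give the exact Tshark invocation, so the field list here is an assumption chosen to match the six columns of the packet_data table; in particular, mapping the "payload" column to frame.len is a guess. The helper name tshark_argv is hypothetical.

```python
def tshark_argv(pcap_path):
    """Build a tshark command line that prints one comma-separated row per packet.

    Assumed field mapping (not stated in the slides):
    timestamp, ip_source, ip_destination, ttl, proto, payload.
    """
    fields = [
        "frame.time_epoch",  # packet timestamp as epoch seconds
        "ip.src",            # source IP address
        "ip.dst",            # destination IP address
        "ip.ttl",            # IP time-to-live
        "ip.proto",          # IP protocol number
        "frame.len",         # frame length, used here as the size column (assumption)
    ]
    argv = ["tshark", "-r", pcap_path, "-T", "fields", "-E", "separator=,"]
    for field in fields:
        argv += ["-e", field]
    return argv


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without tshark installed.
    print(" ".join(tshark_argv("capture.pcap")))
```

The resulting text file could then be copied into the HDFS directory behind packet_data (step 3), for example with hdfs dfs -put.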
Host-aggregated features
• Number of distinct destination hosts contacted
o Less destination diversity in botnets
• The total volume of data sent from the host
o Bots don't share movies!
• The average TTL value of the packets sent from the host
o Lower TTL expected in bots
Host-based approach
Parsed Packet data stored with Hive:
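-- Raw packet fields parsed by Tshark, read in place from HDFS
CREATE EXTERNAL TABLE packet_data (
`timestamp` DECIMAL, ip_source STRING,
ip_destination STRING, ttl INT,
proto INT, payload INT )
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/hdfs/PacketDump';
-- One row of aggregated features per source host
-- (on Hive 0.13 and later, a bare DECIMAL means DECIMAL(10,0) and drops the fractional part; DOUBLE avoids this)
CREATE TABLE host_data (
host STRING, destinations DECIMAL,
avg_ttl DECIMAL, volume BIGINT )
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n' STORED AS TEXTFILE;
-- Aggregate per source IP: distinct destinations, mean TTL, total bytes sent
INSERT INTO TABLE host_data
SELECT ip_source, COUNT(DISTINCT ip_destination), AVG(ttl), SUM(payload)
FROM packet_data
GROUP BY ip_source;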
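To sanity-check the aggregation on a small sample without a cluster, the same GROUP BY can be mirrored in plain Python. This is only a sketch: the function name host_features and the sample rows are made up for illustration, and rows are assumed to arrive in the packet_data column order.

```python
from collections import defaultdict


def host_features(rows):
    """Aggregate packet rows per source host.

    Each row is (timestamp, ip_source, ip_destination, ttl, proto, payload).
    Returns {host: (distinct_destinations, average_ttl, total_volume)}.
    """
    destinations = defaultdict(set)
    ttl_sum = defaultdict(int)
    packets = defaultdict(int)
    volume = defaultdict(int)
    for _ts, src, dst, ttl, _proto, payload in rows:
        destinations[src].add(dst)   # COUNT(DISTINCT ip_destination)
        ttl_sum[src] += int(ttl)     # numerator of AVG(ttl)
        packets[src] += 1            # denominator of AVG(ttl)
        volume[src] += int(payload)  # SUM(payload)
    return {
        src: (len(destinations[src]), ttl_sum[src] / packets[src], volume[src])
        for src in destinations
    }


# Made-up sample rows, for illustration only.
sample = [
    ("1.0", "10.0.0.1", "10.0.0.2", "64", "17", "100"),
    ("2.0", "10.0.0.1", "10.0.0.3", "62", "17", "300"),
    ("3.0", "10.0.0.2", "10.0.0.1", "128", "6", "50"),
]
features = host_features(sample)
```

Each value in features lines up with one row of host_data: destinations, avg_ttl, volume.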
Results over Mahout
[Figure: bar chart of true positive and false positive rates, on a 0 to 100 scale, for the Botnet and Benign classes on training and testing data]
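The chart's numbers are not recoverable from this transcript. For readers unfamiliar with the two metrics, the sketch below shows how a true positive rate and a false positive rate are usually computed for one class from actual and predicted labels. The function name and the labels are illustrative assumptions, not the authors' evaluation code.

```python
def tp_fp_rates(actual, predicted, positive):
    """Return (true positive rate, false positive rate) in percent for one class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    tpr = 100.0 * tp / (tp + fn) if (tp + fn) else 0.0
    fpr = 100.0 * fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Made-up labels, for illustration only.
actual = ["botnet", "botnet", "benign", "benign"]
predicted = ["botnet", "benign", "benign", "botnet"]
botnet_rates = tp_fp_rates(actual, predicted, "botnet")
```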
Feedback: pratiknarang@outlook.com

Editor's Notes

  • #4: Hades defines our ETL logic