
DETECTION OF BOTNET BASED ON ABNORMAL DNS TRAFFIC

BY

AWSAN ABDULRAHMAN HASAN ABDULLAH

Thesis submitted in partial fulfillment of the requirements for the degree of

Master of Science

June 2009


DECLARATION

Name: Awsan Abdulrahman Hasan Abdullah Matric No: P-COM0024/08

Faculty: School of Computer Sciences

Thesis Title: Detection of Botnet Based on Abnormal DNS Traffic

I hereby declare that this thesis which I have submitted to the School of Computer Sciences on [date illegible] is my own work. I have stated all references used for the completion of my thesis.

I agree to prepare electronic copies of the said thesis for the external examiner or internal examiner for the determination of the number of words used or to check for plagiarism should a request be made.

I make this declaration with the belief that what is stated in this declaration is true and the thesis as forwarded is free from plagiarism as provided under Rule 6 of the Universities and University Colleges (Amendment) Act 2008, University Science Malaysia Rules (Student Discipline) 1999.

I conscientiously believe and agree that the University can take disciplinary actions against me under Rule 48 of the Act should my thesis be found to be the work or ideas of other persons.

Student's Signature:

Acknowledgement of receipt by:

MOHD REDZUAN ASMI, Assistant Registrar, School of Computer Sciences

Date: 18/6/2009

Date:


ACKNOWLEDGEMENTS

First of all, I would like to thank the Almighty Allah, the most merciful, the most beneficial for giving me the opportunity to do my post graduation in the School of Computer Sciences, Universiti Sains Malaysia.

I would like to express my heartfelt gratitude and special regards to my parents, my wife and my lovely daughter Afnan for their generous support and encouragement during my research work. The successful completion of my work is the fruit of their sacrifices, their devotion and their determination.

I wish to record my deep sense of gratitude and appreciation to my supervisor, Professor Dr. Sureswaran Ramadass, for his expert guidance and support throughout my research tenure. My acknowledgment also goes to my coordinator, Dr. Shahida Sulaiman, for her patience and wisdom, which made things immensely easier for me and led to the success of my research.

Last but not least, I have been extremely lucky to have support, encouragement, and inspiration from many people; without them, this work would not have been possible. I would also like to extend my sincere appreciation to my friends Ahmed Mubarak, Mohammed Imran, Bassam Altamimi, Mohammed Fadhil, Salah Salem, Rais Altamimi, and all NAv6 members, who generously gave me very good advice and assisted me in my research work.

TABLE OF CONTENTS

DECLARATION
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF SYMBOLS AND ABBREVIATIONS
ABSTRAK
ABSTRACT

CHAPTER 1: INTRODUCTION
1.1 Overview
1.2 Motivation
1.3 Problem Statement
1.4 Objectives of Research
1.5 Scope of the Research
1.6 Contribution of Research
1.7 Proposed Method
1.8 Research Methodology
1.9 Outline of Research

CHAPTER 2: LITERATURE REVIEW
2.1 Introduction
2.2 Botnet Phenomenon
2.2.1 Malicious Botnet Activities
2.2.2 Botnet Behaviors
2.3 Botnet Command and Controller (C&C)
2.4 Domain Name System (DNS)
2.4.1 Query Process
2.4.2 DNS Weakness
2.5 DNS Utilization for Malicious Botnet Activities
2.6 Related Works on DNS Monitoring
2.7 Critical Analysis
2.8 Summary

CHAPTER 3: BOTNET DETECTION MECHANISM (BDM)
3.1 Introduction
3.2 Jaccard Similarity Coefficients
3.3 Monitoring Normal and Abnormal Behaviors
3.4 The Proposed Method for DNS Monitoring
3.5 Botnet Detection Mechanism (BDM) Framework
3.5.1 Capturing Phase
3.5.2 Analyzing Phase
3.5.3 Classifying Phase
3.6 Summary

CHAPTER 4: IMPLEMENTATION AND RESULT
4.1 Introduction
4.2 Implementation of the BDM
4.3 Performance Test of BDM
4.4 Performance Test Results
4.5 Performance Evaluation
4.5.1 False Positive
4.5.2 False Negative
4.6 Summary

CHAPTER 5: CONCLUSION AND FUTURE WORK
5.1 Conclusion
5.2 Future Work

REFERENCES

Appendix A

LIST OF TABLES

Table 2.1 Difference between Botnet and Legitimate Hosts (Choi et al., 2007)
Table 2.2 Summary of Related Work on DNS Monitoring
Table 3.1 Jaccard Similarity Values (Kim and Choi, 1998)
Table 3.2 Insertion Query Data in Main Table
Table 3.3 Query Data in Results Table
Table 4.1 Results of Domain Names Classification
Table 4.2 First Experiment Results
Table 4.3 Second Experiment Results
Table 4.4 Third Experiment Results

LIST OF FIGURES

Figure 1.1 Network Infected with Botnets (Cisco, 2007)
Figure 1.2 Research Procedure
Figure 2.1 Response Time for Humans and Bots (Akiyama et al., 2007)
Figure 2.2 Communication Flow in C&C (Freiling et al., 2005)
Figure 2.3 Hierarchical Structure for C&C Server (Zou and Cunningham, 2006)
Figure 2.4 DNS Hierarchical Structure (Behrouz, 2006)
Figure 2.5 DNS Query Process (Davies, 2006)
Figure 3.1 Similarity between Two Blocks of Hosts (Kim and Choi, 1998)
Figure 3.2 Normal and Abnormal DNS Traffic
Figure 3.3 Applying Jaccard Similarity between Two Blocks of Hosts
Figure 3.4 Similarity Measurement based on MAC Addresses
Figure 3.5 BDM Framework
Figure 3.6 DNS Packets
Figure 3.7 DNS Packets of Type A
Figure 3.8 Capturing and Analyzing Phase
Figure 3.9 DNS Query Classification Phase
Figure 3.10 Single Host Classification
Figure 3.11 Matching Single Host with Database
Figure 4.1 BDM Database Tables
Figure 4.2 Botnet Detection Mechanism (BDM) Interface
Figure 4.3 Dividing Query Data into Two Interval Time
Figure 4.4 Grouping Query Data in Each Time Interval
Figure 4.5 Experiment Topology
Figure 4.6 Simulator BotDNS Interface
Figure 4.7 Domain Names Classification Based on Jaccard Similarity
LIST OF SYMBOLS AND ABBREVIATIONS

A         Address Record
ADSL      Asymmetric Digital Subscriber Line
BDM       Botnet Detection Mechanism
BotDNS    Simulator Botnet DNS
C&C       Command-And-Control
CDRR      Canonical DNS Request Rate
CNAME     Host's Canonical Name
CPU       Central Processing Unit
DDoS      Distributed Denial of Service
DDNS      Dynamic Domain Name System
DHCP      Dynamic Host Configuration Protocol
DNS       Domain Name System
DNSBL     DNS Blacklist
FQDN      Fully Qualified Domain Name
FTP       File Transfer Protocol
HTTP      Hypertext Transfer Protocol
IP        Internet Protocol
IRC       Internet Relay Chat
MAC       Media Access Control
MX        Mail Exchange
NS        Name Server(s)
NXDOMAIN  Non-Existent Domain
PTR       Pointer Records
RR        Resource Record
SQL       Structured Query Language
TTL       Time-To-Live
∩         Intersection
| |       Magnitude
∪         Union

DETECTION OF BOTNET BASED ON ABNORMAL DNS TRAFFIC

ABSTRAK

The rapid growth of the networking sector has attracted the attackers' community. They are constantly developing new techniques in their efforts to compromise large numbers of computers around the world. The Botnet is one example of such a technique. A Botnet is a group of Bots residing on a network and its hosts that are controlled remotely by an attacker through a Command and Control (C&C) server. Botnets are used to carry out many malicious activities, such as spam and DDoS attacks.

The Botnet is regarded as a major part of the Internet because of its rapidly growing mechanism. Botnets now use the DNS and query DNS servers just like any legitimate host. In this case, it is difficult to distinguish legitimate DNS traffic from illegitimate DNS traffic. It is important to build a suitable solution for detecting Botnets in DNS traffic and thereby protecting the network from malicious Botnet activity. In this research, we propose a simple mechanism, which we call the Botnet Detection Mechanism (BDM).

BDM is able to monitor DNS traffic and detect any abnormality in it caused by Botnet activity, based on Botnet behavior, in particular the periodic appearance of the Botnet as a group. BDM is able to classify DNS traffic requested by a group of hosts (group behavior) and by a single host (individual behavior), and thereby detect abnormal domain names generated through malicious Botnet activity. Finally, our experimental results show that BDM is able to classify DNS traffic and to detect Botnet activity effectively, with an average detection rate of 89%. This shows that BDM is more robust than previous Botnet detection approaches.

DETECTION OF BOTNET BASED ON ABNORMAL DNS TRAFFIC

ABSTRACT

The immense growth of the network sector has attracted the attackers' community. Attackers are constantly developing new techniques to help them compromise large numbers of computers around the world. The Botnet is one such technique. A Botnet is a group of Bots running on compromised networks and hosts, controlled remotely by the botmaster via a Command and Control (C&C) server. Botnets are used to perform many malicious activities, such as Spam and DDoS attacks.

Botnets are considered a major problem on the Internet because they spread rapidly. Recently, Botnets have begun to use the DNS, querying DNS servers just like any legitimate host, which makes it difficult to distinguish legitimate DNS traffic from illegitimate DNS traffic. It is therefore important to build a suitable solution for detecting Botnets in DNS traffic and thereby protecting the network from malicious Botnet activities. This research proposes a simple mechanism called the Botnet Detection Mechanism (BDM). BDM monitors DNS traffic and detects the abnormal DNS traffic issued by Botnet activity, based on Botnet behaviors, particularly the periodic appearance of the Botnet as a group. BDM classifies the DNS traffic requested by groups of hosts (group behavior) and by single hosts (individual behavior), and consequently detects the abnormal domain names issued by malicious Botnets. The experimental results show that BDM is able to classify DNS traffic and detects Botnet activity efficiently, with an average detection rate of 89%. This indicates that BDM is more robust than previous approaches to Botnet detection.

CHAPTER 1 INTRODUCTION

1.1 Overview

The growth of computer networks over the past few years is part of the exponential growth of communication systems. A network is like a computer: it needs software to simplify its functionality and make it easy to use. Internet browsing, e-mail, and instant messaging are a few simple examples of computer communication across the Internet. Personal computers are now widely used, and the number of Internet subscribers has increased steadily. These computers generally contain important data, such as users' information and details of business activities (Ianelli and Hackworth, 2005).

With the growth in computer networks and Internet users, computers have become a favorite target of the attackers' community. Even though these systems are protected by antivirus software and firewalls, they may still be exposed to different malicious attacks. Attackers are always looking for techniques that help them compromise a large number of computer systems around the world (Bacher et al., 2005).

One such technique is the Botnet, which is used to compromise computer systems and can support several malicious activities (Bacher et al., 2005; Ianelli and Hackworth, 2005):

(i) Malware dissemination
(ii) DDoS attack
(iii) Phishing
(iv) Spamming

A Botnet is a collection of Bots running on compromised networks and hosts, which can be remotely controlled by a human attacker, called the botmaster, via a Command-and-Control (C&C) server. These Bots are pieces of programmable software that belong to the same Botnet (Zou and Cunningham, 2006). Figure 1.1 shows an example of a compromised network that is controlled by botmasters.

Figure 1.1: Network Infected with Botnets (Cisco, 2007)

The Domain Name System (DNS) has become a desirable target for botmasters because of its importance to the Internet. The DNS is a distributed database spread over the Internet that is used to translate domain names into Internet Protocol (IP) addresses and vice versa (Weimer, 2005). It exists because it is difficult for a human to memorize the IP addresses of all the hosts on the Internet.

DNS is not owned or controlled by a specific organization and the DNS traffic flows between the clients and DNS server without any protection or restriction. As such, the Botnet can exploit the DNS to perform their malicious activities. The Botnet queries the DNS server just like any legitimate host and the DNS server responds to this query without distinguishing the source of the query (Castillo-Perez and Garcia-Alfaro, 2008).

1.2 Motivation

Online crimes committed through Botnets have increased rapidly because attackers exploit the DNS (Ianelli and Hackworth, 2005). Many computer applications and legitimate users rely on the DNS to access the Internet and do their work correctly (Wills et al., 2003). Botnets also use the DNS to perform their malicious activities (Schiller et al., 2007), which makes it difficult to single out the DNS traffic that belongs to Botnet activity. However, by monitoring the DNS queries on a network, it is possible to identify and detect Botnet activities in the DNS traffic (Kristoff, 2004; Choi et al., 2007). The motivation behind this research is the detection of abnormal DNS traffic caused by Botnet activity.

1.3 Problem Statement

The Botnet relies on the C&C server to receive further commands from the botmaster. If the C&C server is blocked by the administrator, all communication between the botmaster and the Bots fails (Schiller et al., 2007). As a result, botmasters tend to hide behind a new C&C server whose IP address is unknown to the Bots. In this case, the Bots must query a DNS server to resolve the configured domain name, which now points to the IP address of the new C&C server. By using dynamic DNS, the botmaster can point the Bots to a specific domain name that is under his control. The Botnet queries the DNS server and, in return, the DNS server responds without distinguishing the source of the query.

Since many normal applications require DNS to access the Internet, the problem is how the normal DNS traffic caused by a legitimate user or application can be distinguished from the abnormal DNS traffic caused by Botnet activity. Several studies have addressed this problem, focusing on distinguishing the normal DNS traffic generated legitimately on the monitored network from traffic that is suspicious and resembles Botnet behavior.

1.4 Objectives of Research

The main goal of this research is to propose a mechanism that is able to classify DNS traffic as normal or abnormal based on Botnet behavior, through the following objectives:

(i) To distinguish between normal DNS traffic and abnormal DNS traffic in terms of DNS queries.

(ii) To develop an algorithm that can detect the abnormal DNS traffic issued by malicious Botnet activities, for both individual hosts and groups of hosts.

(iii) To evaluate the performance of the proposed algorithm.

1.5 Scope of Research

The scope of this research is restricted to monitoring the DNS traffic on a local network, and classifying the DNS traffic into normal DNS traffic and abnormal DNS traffic issued by malicious Botnet based on the Botnet behavior.

1.6 Contribution of Research

This research proposes a detection mechanism for abnormal DNS traffic issued by malicious Botnet activities, for both individual hosts and groups of hosts. The mechanism is based on Botnet behavior and on monitoring DNS traffic, which is considered the main resource for spreading malicious activities.

1.7 Proposed Method

The proposed method uses the Jaccard similarity coefficient $S_j$ (Kim and Choi, 1998; Rieck et al., 2006), which measures the ratio of similarity between two individual objects, X and Y, to detect Botnet activity. The Media Access Control (MAC) address is used as the host's identifier instead of the IP address; however, MAC address spoofing is not taken into consideration.

This method classifies a domain name that is requested by two blocks of hosts, X and Y (group behavior), at time intervals $t_{i1}$ and $t_{i2}$ by calculating the ratio of overlapping MAC addresses between the two blocks of hosts based on the Jaccard similarity $S_j$. It classifies a single host (individual behavior) by matching the domain name requested by that host, together with its MAC address, against the results obtained from classifying the blocks of hosts.
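As a minimal sketch of the coefficient described above, the following Python snippet computes the standard Jaccard similarity, |X ∩ Y| / |X ∪ Y|, over two sets of MAC addresses. The function name, the sample MAC addresses, and the handling of two empty blocks are assumptions made for illustration; the decision rule that turns the value into a normal or abnormal label is not stated in this section and is not shown.

```python
def jaccard_similarity(block_x, block_y):
    """Return |X ∩ Y| / |X ∪ Y| for two collections of MAC addresses."""
    x, y = set(block_x), set(block_y)
    union = x | y
    if not union:
        return 0.0  # convention assumed here for two empty blocks
    return len(x & y) / len(union)


# Hypothetical MAC addresses that requested the same domain name
# in two different time intervals.
hosts_t1 = {"00:11:22:33:44:01", "00:11:22:33:44:02", "00:11:22:33:44:03"}
hosts_t2 = {"00:11:22:33:44:02", "00:11:22:33:44:03", "00:11:22:33:44:04"}

similarity = jaccard_similarity(hosts_t1, hosts_t2)
print(similarity)  # 2 shared hosts out of 4 distinct hosts
```

A value close to 1 means that nearly the same hosts requested the domain name in both intervals; a value close to 0 means the two blocks share few hosts.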


1.8 Research Methodology

This section describes the methodology of this research, which consists of three main phases, as illustrated in Figure 1.2. The first phase is data capturing, which consists of two steps: in the first step, the network packets are captured; in the second step, the network packets are filtered and the DNS packets are extracted from them. The second phase also consists of two steps: in the first step, the query name and the source MAC address are extracted from the DNS packets; in the second step, the query data is inserted into the database.

[Figure: Phase 1: Capturing phase; Phase 2: Analyzing phase (extract query name and MAC, insert query data into database); Phase 3: Classifying phase (classifying group behavior, classifying individual behavior)]

Figure 1.2: Research Procedure

The third phase consists of three steps: in the first step, the query data is read from the database and is grouped. In the second step, the query data is classified and this step consists of two parts; the first part classifies the group of hosts' behavior and the second part classifies the individual host behavior. Finally, the third step combines the behaviors and performs evaluation.
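The analyzing and grouping steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis implementation: the table name `queries`, its columns, the 60-second interval, and the sample records are all hypothetical, and an in-memory SQLite database stands in for the database used in the research.

```python
import sqlite3
from collections import defaultdict

# Hypothetical query records: (capture time in seconds, source MAC, query name).
records = [
    (5, "00:11:22:33:44:01", "example.test"),
    (8, "00:11:22:33:44:02", "example.test"),
    (65, "00:11:22:33:44:01", "example.test"),
    (70, "00:11:22:33:44:03", "other.test"),
]

INTERVAL_SECONDS = 60  # assumed interval length, not taken from the thesis

# Analyzing phase: insert the extracted query data into a database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE queries (ts INTEGER, mac TEXT, qname TEXT)")
conn.executemany("INSERT INTO queries VALUES (?, ?, ?)", records)


def group_by_domain_and_interval(connection, interval_seconds):
    """Map (query name, interval index) to the set of MACs that requested it."""
    groups = defaultdict(set)
    for ts, mac, qname in connection.execute("SELECT ts, mac, qname FROM queries"):
        groups[(qname, ts // interval_seconds)].add(mac)
    return dict(groups)


# Classifying phase, first step: read the query data back and group it.
groups = group_by_domain_and_interval(conn, INTERVAL_SECONDS)
for key in sorted(groups):
    print(key, sorted(groups[key]))
```

Each resulting set is one block of hosts for a given domain name and time interval, which is the input that a group-behavior comparison would need.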


To evaluate the performance of the proposed method, quantitative DNS traffic data is collected and the results are tested. The experiment is performed at the National Advanced IPv6 Centre (NAv6), USM, and the method is implemented on the Java platform. In addition, a simulated Botnet named BotDNS is created and installed on some hosts to test the performance of the proposed method.

The performance of the proposed method is tested by using it to capture real DNS requests and measuring its ability to classify DNS traffic and detect Botnet activity. The results are compared with those of Dagon (2005) to assess the improvement. The potential for the method to generate false positives and false negatives during the monitoring period is also tested.

This evaluation indicates how well the proposed method classifies DNS traffic and detects the abnormal DNS traffic issued by Botnet activities.

1.9 Outline of Research

This research is organized into five chapters. Chapter 1 has briefly outlined the background of Botnets, the motivation, problem statement, objectives, research scope, contribution, proposed method, and research methodology. Chapter 2 covers the literature on Botnets and the DNS, discusses the relationship between them, and summarizes related work on DNS-based Botnet detection. Chapter 3 explains the method proposed in this research to distinguish between normal and abnormal DNS traffic in the network. Chapter 4 presents the experimental results and their evaluation. Finally, Chapter 5 discusses the conclusions of this research and recommendations for future work.

CHAPTER 2 LITERATURE REVIEW

2.1 Introduction

Computer networks are growing exponentially, and the use of personal computers is spreading widely in homes, companies, and almost everywhere else. This has increased the number of Internet subscribers, especially subscribers to Asymmetric Digital Subscriber Line (ADSL) and other broadband services. A very large number of these computers are left switched on for long periods, sometimes for several days, and remain connected to the Internet.

Computer systems generally contain important data, such as users' information and details of business activities. In addition, a huge amount of data is available on the Internet and accessible to any user. This data is transferred between computers within the same network or across different networks at all times. Moreover, e-commerce transactions are also carried out online (Ianelli and Hackworth, 2005).

Most users believe that no one can access this valuable data and are not aware of the dangers around them. Attackers know that most users have little knowledge of computer security, so they attack these weak computer systems and steal the users' data. Some curious attackers are interested only in the users' data, but the majority seek financial gain, which leads to many criminal activities such as online e-commerce crimes (Bacher et al., 2005; Ianelli and Hackworth, 2005).

All these issues have encouraged attackers to target a wider range of computer networks, and they continuously develop tools that help them compromise large numbers of computer networks. Viruses, Trojans, Worms, Sniffers, and Backdoor tools are examples of tools that threaten Internet users. In the past few years, a new type of threat has evolved on the Internet, called the "Botnet".

2.2 Botnet Phenomenon

A Bot is an individual piece of programmable software which is used to perform malicious activities on the network (Zou and Cunningham, 2006). It can be installed and run automatically in any compromised system, and it has the ability to spread, similar to the Worms spreading mechanism, and also it can evade any detection programs similar to Viruses (Cooke et al., 2005; Gu, 2008).

A compromised network infected with a large number of Bots is called a Botnet (Akiyama et al., 2007). A Botnet is a group of compromised computers, called zombies or drones, which are controlled remotely by an attacker or botmaster (Freiling et al., 2005; Gu, 2008). Bots can be installed on any computer by Trojan horses, Backdoor tools, and Worms. Once this malicious software, or malware, has been installed successfully, the computer becomes a compromised computer and responds to any command issued by the botmaster. The computer also becomes part of the zombie network, which is likewise under the control of the botmaster (Bacher et al., 2005; Cooke et al., 2005).

The botmaster can now communicate with the Bot on the infected host through the C&C server. All Bots receive the same commands from the botmaster via the same C&C server and respond by executing them (Cooke et al., 2005).

The Botnet is now considered a serious problem because it forms a major and dangerous part of the Internet. It spreads rapidly across networks over the Internet, and it is difficult to detect because it stays hidden (Ianelli and Hackworth, 2005). About five million active Bot-infected computers were discovered in the first half of 2007; approximately 30,000 Botnets were discovered every day, and about 800,000 to 900,000 Bots had infected systems around the world (Symantec, 2007).

2.2.1 Malicious Botnet Activities

Botmasters have several motivations for using Botnets, the main one being financial gain. Some of the uses are listed below:

(a) Propagation of New Bots

Botnets are able to propagate and spread new Bots because every Bot can download any file that the botmaster hosts on the Internet, using the File Transfer Protocol (FTP) or the Hypertext Transfer Protocol (HTTP), and then execute it. Bots can also spread other malware, such as computer viruses or mail Spam (Bacher et al., 2005).

(b) Distributed Denial-of-Service Attacks

Distributed Denial-of-Service (DDoS) attacks are carried out by Botnets and are considered an Internet crime. The main sign of such an attack is a slow Internet connection: the Bots consume all the bandwidth of the network by sending a huge number of packets per second, which leads to loss of connectivity in the victim's network (Freiling et al., 2005).

(c) Spamming

Any compromised machine infected with Bots is a major source of new Spam. By using large numbers of Bots, botmasters can send a huge number of Spam mails in a short time (Bacher et al., 2005). About 80% of email Spam is spread by Botnets, and through Spam about 30,000 new computers are compromised and infected with Bots every day (Husna et al., 2008).

(d) Phishing

The main purpose of phishing is to steal a user's personal information, such as user names and passwords. The botmaster creates a website that looks similar to a legitimate website and instructs the Bots to direct users to it instead of the original one. When users open this illegitimate website, all the information they enter is sent to the botmaster (Ianelli and Hackworth, 2005; Heron, 2007).

(e) Traffic Sniffing

Bots can sniff the network packets that pass through the infected computers in the network, either out of curiosity or to steal user information such as usernames, passwords, and license keys (Bacher et al., 2005).

2.2.2 Botnet Behaviors

Akiyama et al. (2007) identified three important Botnet behaviors, which were discovered by monitoring Botnet activity in the data flowing through C&C servers. These behaviors are:

(a) Bots Relationship

The relationship between botmaster and Bots is one to many, because the botmaster usually controls a number of Bots and issues the same command to all the Bots. Hence, the Bots work as one group and it is possible to detect their behavior by monitoring the activities of these groups of Botnet in the network traffic.

(b) Bots Synchronization

The Bots receive the same command from the botmaster. They communicate with each other and attack at the same time; for example, DDoS attacks are performed by a large group of Bots simultaneously. This can expose the group, because the volume of traffic released from it is very high compared to that of other hosts.

(c) Bots Responding

When an infected host receives commands from the botmaster, it responds immediately and executes them accurately. A human, by contrast, performs different activities at different times, so the time a human takes to respond is not constant, because a human needs to think before acting. A Bot executes a command as soon as it receives it, without any need to think, so the time it takes is always constant. This response time can therefore be used to discover the presence of a Botnet.

Figure 2.1 shows the difference between the responses of a human and a Bot.

[Figure: legitimate host (human behavior, action) and Bot (Bot behavior, constant)]

Figure 2.1: Response Time for Humans and Bots (Akiyama et al., 2007)
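One hedged way to illustrate the constant-versus-variable response time idea is to measure how much a host's response delays vary. The sketch below uses the population standard deviation as the spread measure; that choice, the function name, and the sample delays in milliseconds are assumptions for illustration and are not taken from Akiyama et al. (2007).

```python
from statistics import pstdev


def response_time_spread(delays_ms):
    """Population standard deviation of a host's response delays (milliseconds)."""
    if len(delays_ms) < 2:
        raise ValueError("need at least two response delays")
    return pstdev(delays_ms)


# Hypothetical delays between receiving a command and acting on it.
bot_delays = [200, 200, 200, 200]       # a Bot responds after a constant time
human_delays = [350, 1200, 800, 2500]   # a human responds after varying times

bot_spread = response_time_spread(bot_delays)
human_spread = response_time_spread(human_delays)
print(bot_spread, human_spread)
```

A spread near zero matches the constant response described for Bots, while a large spread matches the irregular timing of human activity. Any threshold separating the two would have to be chosen empirically.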

Choi et al. (2007) discussed some Botnet features in the DNS, and how a Botnet can exploit the DDNS to move to a new C&C server when the old one is blocked. When that happens, the Botnet queries the DNS server to find the location of the domain name. Table 2.1 compares the activities of a legitimate host and a Botnet when both are using the DNS.

Table 2.1: Difference between Botnet and Legitimate Hosts (Choi et al., 2007)

[Table partly illegible in source. Recoverable cells: group size: anonymous legitimate users have random size; activity and appearance pattern: group appears immediately; usually appears randomly and continuously; final row: usually DNS]

2.3 Botnet Command and Controller (C&C)

In most cases, the C&C is a compromised server under the control of the botmaster, who controls the entire Botnet in the compromised networks. The Bots on the infected computers need to communicate with the C&C server regularly to receive further instructions from the botmaster (Schiller et al., 2007), as shown in Figure 2.2.

Figure 2.2: Communication Flow in C&C (Freiling et al., 2005)

The botmaster uses the existing Internet Relay Chat (IRC) network to issue his commands easily through IRC channels, which serve as the medium that connects the Bots with the botmaster. These channels carry the commands issued by the botmaster to the Bots (Oikarinen and Reed, 1993; Kugisaki et al., 2007). Therefore, if the C&C server is blocked by network administrators or authorities, the Bots cannot receive the commands issued by the botmaster. In this case, the botmaster compromises a new C&C server and uses the Dynamic Domain Name System (DDNS) to move his domain name from the old C&C server to the new one (Schiller et al., 2007).

Because the domain name is hard-coded in the Bots' binary, the botmaster can use the DNS to direct the Bots to the new C&C server to which the domain name has been moved. The Botnet queries the DNS server to resolve the botmaster's domain name. In return, the DNS server replies to the Bots with the new IP address of the botmaster's domain name, which belongs to the newly compromised C&C server (Schiller et al., 2007).

The botmaster is able to compromise more than one C&C server, and this constructs a hierarchical structure for the Botnet as shown in Figure 2.3. All the Bots in the infected hosts will try to communicate with one or more C&C servers at the same time to receive commands from the botmaster. The botmaster could use a number of C&C servers and shift between them all the time (Zou and Cunningham, 2006).

Figure 2.3: Hierarchical Structure for C&C Server (Zou and Cunningham, 2006)

The volume of traffic at any one C&C server is high for only a short time; after that, the botmaster shifts to another C&C server in the Botnet's hierarchical structure, and so on. This hierarchical structure may hide the botmaster's identity, so detecting the C&C server can be difficult when the botmaster uses it (Zou and Cunningham, 2006).

2.4 Domain Name System (DNS)

DNS is an important part of the Internet. It consists of a huge distributed database, spread across the Internet, that contains the IP addresses and domain names of Internet hosts. The goal of DNS is to translate domain names into IP addresses and vice versa, because it is very difficult for humans to memorize the IP addresses of all the hosts on the Internet. A Fully Qualified Domain Name (FQDN) consists of the host name and the domain name, such as hostname.domainname.com (Behrouz, 2006).

DNS has a hierarchical structure that branches into labels. Each label, that is, each text word between the dots, can be up to 63 characters long, as shown in Figure 2.4 (Mockapetris, 1987; Behrouz, 2006). The DNS is divided into a set of zones, where each zone is served by a set of authoritative servers called "name servers". These servers answer queries for the data in the zone under their control. The resolver is the client-side component that sends the queries, and the name server answers them (Mockapetris, 1987).
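The label structure described above can be illustrated with a minimal Python sketch. The function names are hypothetical and only the 63-character label limit stated in the text is checked; other DNS naming rules are deliberately left out.

```python
MAX_LABEL_LENGTH = 63  # per-label limit given in the text (Mockapetris, 1987)


def split_labels(fqdn: str) -> list[str]:
    """Split a domain name into its dot-separated labels."""
    return fqdn.rstrip(".").split(".")


def has_valid_label_lengths(fqdn: str) -> bool:
    """True if every label is non-empty and at most 63 characters long."""
    return all(0 < len(label) <= MAX_LABEL_LENGTH for label in split_labels(fqdn))


# The FQDN example from the text, then a name with an over-long label.
print(split_labels("hostname.domainname.com"))
print(has_valid_label_lengths("hostname.domainname.com"))
print(has_valid_label_lengths("a" * 64 + ".com"))
```

Reading the label list from left to right moves from the host name up towards the top level of the hierarchy.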

Figure 2.4: DNS Hierarchical Structure (Behrouz, 2006).

The DNS infrastructure uses many different Resource Record (RR) types, such as Address (A), Name Server (NS), Canonical Name (CNAME), Mail Exchange (MX), and Pointer (PTR) records (Mockapetris, 1987). The most important and most commonly used among them is the A record, which translates a hostname into an IP address and corresponds to the ordinary, frequently used DNS query (Oberheide et al., 2007).

On the other hand, the PTR record translates the queried IP address into a valid domain name, which is the reverse of the translation performed by A records. If the client needs to perform a mail query, the MX record is used (Mockapetris, 1987; Oberheide et al., 2007).


2.4.1 Query Process

When the client requests a domain name, such as the USM domain name www.usm.my, it first sends a query to a local DNS server to ask for the IP address of this domain name. The local DNS server checks its cache for the requested domain name, and if the answer exists in its cache, the local DNS server replies to the client (Mockapetris, 1987; Davies, 2006). If the answer does not exist in the cache, the local DNS server asks the Root DNS server about the requested domain, as shown in Figure 2.5.

[Figure 2.5 labels: Query for www.usm.my; Root Name Server; Authoritative DNS server; Respond 10.202.1.4]

Figure 2.5: DNS Query Process (Davies, 2006)

The Root DNS server replies by referring the local DNS server to the .my name space, which in turn refers it to the authoritative DNS server that knows the IP address of the requested domain name. The authoritative DNS server then answers the query, and the local DNS server caches the answer for the time determined by the Time To Live (TTL) field and returns the answer to the client (Mockapetris, 1987; Davies, 2006).
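The query process can be sketched as a toy simulation in Python. This is only an illustration under stated assumptions: the server names, the dictionaries standing in for the root, .my and authoritative servers, and the TTL of 300 seconds are all hypothetical, and no real network traffic is generated. The address 10.202.1.4 is the one shown in Figure 2.5.

```python
import time

# Hypothetical delegation data standing in for real name servers.
ROOT = {"my": "my-name-space"}
TLD = {"my-name-space": {"usm.my": "usm-authoritative"}}
AUTHORITATIVE = {"usm-authoritative": {"www.usm.my": ("10.202.1.4", 300)}}  # (address, TTL in seconds)


class LocalDnsServer:
    """Toy local DNS server that caches answers until their TTL expires."""

    def __init__(self, clock=time.monotonic):
        self.cache = {}            # name -> (address, expiry time)
        self.clock = clock
        self.upstream_lookups = 0  # number of times the hierarchy was walked

    def resolve(self, name):
        now = self.clock()
        cached = self.cache.get(name)
        if cached and cached[1] > now:
            return cached[0]                      # answered from the cache
        self.upstream_lookups += 1
        labels = name.split(".")
        tld_server = ROOT[labels[-1]]             # root refers to the .my name space
        auth_server = TLD[tld_server][".".join(labels[-2:])]  # referral to the authoritative server
        address, ttl = AUTHORITATIVE[auth_server][name]
        self.cache[name] = (address, now + ttl)   # cache for the TTL period
        return address


server = LocalDnsServer()
print(server.resolve("www.usm.my"))  # walks root, .my, authoritative
print(server.resolve("www.usm.my"))  # served from the cache
print(server.upstream_lookups)
```

Passing a custom `clock` makes it possible to observe cache expiry without waiting for the TTL to elapse.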

2.4.2 DNS Weakness

The DNS is a distributed hierarchical database that can be accessed by anyone. It is not centrally controlled and does not belong to any specific organization. It was originally designed to map domain names to IP addresses and vice versa. Unfortunately, DNS traffic flows between the client's resolver and the DNS server without any protection or restriction from the firewall. Hence, it can be easily captured by attackers and exploited to perform their malicious activities (Castillo-Perez and Garcia-Alfaro, 2008). The DNS has been used in malicious attacks since 1990, which demonstrates its weakness and confirms that attackers can exploit it to perform attacks (Bellovin, 1995).

2.5 DNS Utilization for Malicious Botnet Activities

Botnets have become a significant presence on networks and use the DNS just like any legitimate host. The botmasters exploit the DDNS to revive the Botnet each time the C&C server is blocked, so that it can continue to perform malicious activities (Heron, 2007).

It is difficult to distinguish between the normal traffic caused by legitimate applications and the abnormal traffic caused by Botnet activity, because little information is available about the domain names under the control of the botmasters (Tu et al., 2007). On the other hand, the Botnet can be detected by monitoring the DNS traffic and the hosts' activities in the network (Wills et al., 2003; Kristoff, 2004). Botnets exploit DNS queries in several situations:

(i) Most malicious activities, such as DDoS attacks, are carried out by Bots acting together as one group, coordinated through DNS queries. So, when a new host is infected with a Bot, it communicates with the other infected hosts by using DNS queries (Cooke et al., 2005).

(ii) If the domain name is moved to a new C&C server, the IP address of the C&C server changes. Hence, the Bots cannot connect to the previous IP address of the C&C server. In this case, the Bots send DNS queries to connect to the new C&C server (Zou and Cunningham, 2006).

(iii) Sometimes, the botmaster uses only one compromised host as a sensor, which continuously sends DNS queries to test whether the C&C server is available or blocked. If the test passes, the botmaster instructs all the Bots to perform their nefarious tasks. However, if there is no response from this sensor, the botmaster updates all the Bots and, by using DNS queries, points them to a new domain name under his control (Zou and Cunningham, 2006).

(iv) The Botnet can trick legitimate users into visiting the botmaster's domain name instead of the legitimate web site; this is called phishing and it is considered one of the Botnet's financial crimes (Gu, 2008). In this case, the botmaster poisons the cache of the DNS server that serves the legitimate domain he intends to attack with a fake IP address that belongs to his own site. Any client who queries the legitimate domain name, whose IP address has already been replaced with the fake IP address, will open the botmaster's site. Consequently, the botmaster steals the user's information. Moreover, this host will also be infected with the botmaster's malicious software and taken under the control of the botmaster (Heron, 2007).

(v) A large number of Bots can generate a DNS amplification attack by using DNS queries. The Bots query the DNS server and replace the source IP address with the spoofed IP address of the target victim. The DNS server then replies to the target victim instead of to the querying Bots, which causes a DDoS attack that brings the targeted server down (Freiling et al., 2005).

2.6 Related Work on DNS Monitoring

Botnets use DNS queries to locate the C&C server. Therefore, by monitoring the DNS, it is possible to detect Botnet activity in DNS (Kristoff, 2004). Some research has focused on DNS monitoring, such as Oberheide et al. (2007), but does not primarily aim to detect Botnet infection; such studies generally track and measure Botnets to understand their technology and characteristics. The following are previous works related to this research.

Kristoff (2004) wrote software called "DNSwatch". The study depended on prior knowledge of blacklisted servers that spread malicious software, or of any server in the network that connects to these blacklisted servers. The purpose of DNSwatch is to analyze the DNS logs of the local DNS servers to detect hosts infected with a Bot. This software only works when the infected hosts use their local DNS servers to resolve DNS names.

Weimer (2005) conducted a study on passive DNS replication at the University of Stuttgart that monitored DNS traffic. The project, called "dnslogger", is composed of sensors spread across the network. These sensors capture the replies to the clients' DNS queries. All query data, such as domain names and IP addresses, are stored in a database for analysis. The purpose is to build reverse lookups for IP addresses for which no PTR records exist. This makes it easy to find any domain name used to contact a system on the Internet. The most important result of this study is the detection of Botnets and of some DDoS attacks that use abnormal DNS resource records (RR).

Dagon (2005) concentrated on identifying Botnet C&C servers by monitoring domain names that have abnormally high or temporally dynamic DNS query rates, because high query rates indicate the presence of Botnet activity. The study was based on measuring the canonical DNS request rate and comparing DNS densities. The proposed method, called the "Canonical DNS Request Rate" (CDRR), combines the query rate of a second-level domain (SLD) with the query rates of its third-level children (3LDs) and then calculates this ratio. By using Chebyshev's inequality, he suggested that when the CDRR of a name is anomalous, the name has an abnormally high query rate, which indicates that it belongs to a Botnet C&C server.
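The role of Chebyshev's inequality in such a test can be sketched in Python. The sketch is not Dagon's implementation: aggregating an SLD with its 3LD children by a plain sum, the threshold k, and the domain names and rates are all assumptions made for illustration. Chebyshev's inequality states that, for any distribution, at most 1/k^2 of the values lie k or more standard deviations from the mean, so a rate that far from the mean can be treated as anomalous.

```python
from statistics import mean, pstdev


def canonical_rate(sld_rate, child_rates):
    """Aggregate an SLD's query rate with its 3LD children (assumed: plain sum)."""
    return sld_rate + sum(child_rates)


def chebyshev_outliers(rates, k=2.0):
    """Names whose rate lies at least k standard deviations from the mean."""
    mu = mean(rates.values())
    sigma = pstdev(rates.values())
    if sigma == 0:
        return []  # all rates identical, nothing stands out
    return sorted(name for name, rate in rates.items() if abs(rate - mu) >= k * sigma)


# Invented rates: ten ordinary names and one with a very high aggregated rate.
rates = {f"site{i}.example": canonical_rate(8, [1, 1]) for i in range(10)}
rates["cnc.example"] = canonical_rate(200, [400, 400])
print(chebyshev_outliers(rates))
```

A larger k makes the test stricter; the bound itself holds regardless of how the rates are distributed, which is why no particular traffic model is needed.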

Schonewille and Van Helmond (2006) proposed an approach based on abnormally recurring Non-Existent Domain (NXDOMAIN) reply rates. To do this, they use an algorithm similar to the one used by Dagon (2005) to classify the query rate. They observed that DDNS responses indicating the name error NXDOMAIN are related to C&C servers controlled by Botnets. Any host infected with Bots tends to send such queries repeatedly and is also vulnerable to similar infections. This approach detects abnormal domain names effectively and produces fewer false positives.

Ramachandran et al. (2006) proposed techniques and heuristics that use DNS Blacklist (DNSBL) lookup traffic to identify Botnets. The technique performs counter-intelligence: it detects DNSBL inspection carried out for the Botnet groups that spread mail Spam. It relies on the fact that botmasters perform lookups against the DNSBL to check whether their Bots are blacklisted, and it looks for machines that send multiple queries about many other hosts. The heuristics work as active counter-measures to detect the inspection activities in real time. However, this detection technique does not disrupt the activity of the Botnet, because it does not need any direct communication with a Botnet component. The technique works well, but it can generate many false positives due to active counter-measures such as inspection poisoning.

Choi et al. (2007) proposed an anomaly-based Botnet detection algorithm that monitors DNS traffic and uses the group activity feature of Botnets to distinguish Botnet DNS traffic from the other DNS traffic in the network. They also developed an algorithm that detects the frequent migration of a Botnet from one C&C server to another when the server is blocked by the administrator's monitoring system. The method can also detect Botnets that use encrypted channels, because it depends only on the information in the IP headers. The proposed method is robust and can detect any type of Botnet in DNS, because it depends on the group activity feature of the Botnet.

Tu et al. (2007) conducted a study to identify the activities of Botnets by mining DNS traffic data. Their system is connected to routers through an optical splitter; it classifies and filters the DNS traffic data and stores it in a database for further analysis. They collected the data and took one period of DNS traffic as a training dataset. They created a profile that defines a suspicious domain using some prior knowledge and compared it with the current domain names. The results showed many suspicious domain names that were clearly recognizable as abnormal domains controlled by a Botnet, and these domain names had less query time. They also found that many MX queries were classified as suspicious, because their percentage was higher than that of other query types.

Villamarin-Salomon and Brustoloni (2008) evaluated the study by Dagon (2005) on CDRR, which relies on DDNS to identify Botnet C&C servers by their abnormal query rates. They also evaluated the study by Schonewille and Van Helmond (2006), which relies on recurring NXDOMAIN reply rates. They performed experiments on DNS traffic captured at the University of Pittsburgh over 192 hours. They discovered that Dagon's approach produced too many false positives, classifying legitimate domains as C&C servers, so that popular sites such as Gmail that use a low TTL or DDNS were classified as abnormal domains. On the other hand, they discovered that the approach of Schonewille and Van Helmond (2006) detected several abnormal domain names effectively and probably generated fewer false positives, as NXDOMAIN replies are more related to DDNS than to other domain names.

2.7 Critical Analysis

Kristoff (2004) conducted a study that monitors the DNS in order to detect Botnets using prior knowledge of blacklisted servers that spread malware or connect to it. However, Choi et al. (2007) criticized this idea because the approach can easily be evaded once the botmaster knows the mechanism; it can be tricked by using fake DNS queries. The study by Choi et al. (2007) detects the Botnet by exploiting the group activity feature of the Botnet and evaluating the relationship between the Bots in each group. This approach is stronger than the previous approaches, but its main weakness is that the processing time becomes high when it is applied to a large-scale network.

Dagon (2005) discovered that the ratio of abnormal DNS traffic is high compared to the rest, which indicates the presence of Botnet activity. However, Villamarin-Salomon and Brustoloni (2008) criticized the study by Dagon (2005), finding that the CDRR may generate false results and could classify a legitimate domain name as an abnormal domain name. Moreover, the technique by Ramachandran et al. (2006) also generates false positives due to the active nature of counter-measures such as inspection poisoning; besides, this approach cannot detect distributed inspection.

Schonewille and Van Helmond (2006) proposed an approach based on recurring NXDOMAIN reply rates. Villamarin-Salomon and Brustoloni (2008) evaluated their study and discovered that the approach could detect several abnormal domain names effectively and generate fewer false positives.

Meanwhile, the approach proposed in this thesis does not require any prior knowledge of blacklisted servers to classify DNS traffic. It also does not depend on a high ratio of DNS traffic to detect the Botnet. Instead, it exploits the Botnet's behavior in DNS traffic, particularly the periodic appearance of the Botnet as a group.

The probability of Botnet detection can be obtained by measuring the ratio of similarity between any blocks of hosts that requested the same domain name. Hence, this study could provide better results than previous studies and avoid the false results that appeared in them.

The works of the researchers discussed in the previous section are summarized in Table 2.2.

Table 2.2: Summary of Related Work on DNS Monitoring

| Researcher | Year | Approach | Description | Remarks |
|---|---|---|---|---|
| Kristoff | 2004 | DNSwatch | Detects Botnet with prior knowledge of server that spreads or connects to malicious malware. | The approach can simply be evaded by using fake DNS queries. |
| Dagon | 2005 | CDRR | Discovered that the ratio of abnormal traffic is higher compared to others and this indicates the presence of Botnet activity. | Generates false results and could classify the legitimate domain name as abnormal domain name. |
| Schonewille and Van Helmond | 2006 | NXDOMAIN | Detects several abnormal domain names effectively. | Generates fewer false positive results. |
| Ramachandran et al. | 2006 | DNSBL | Uses DNSBL (DNS blacklist) counter-intelligence to locate the Botnet members that generate spam. | Generates false positives due to active nature of counter-measures such as inspection poisoning; besides, this approach cannot detect distributed inspection. |
| Choi et al. | 2007 | Group Activity | Detects the Botnet in DNS by exploiting the group activity features of Botnet. | When it is applied to large scale network, the processing time will be high. |
| Tu et al. | 2007 | DNS Mine | Detected the Botnet by mining the DNS traffic. | Generates fewer false positive results. |

2.8 Summary

Internet attacks have increased rapidly, and attackers are continuously developing new tools and techniques to help them compromise large numbers of systems in networks. The Botnet is one of these tools. A Botnet is a group of compromised computers, called zombies or drones, which is controlled remotely by a botmaster via a C&C server and used to perform many malicious activities such as Spam and DDoS attacks.

The Botnet relies on the C&C server to receive further instructions from the botmaster. However, if the C&C server is blocked by the administrator's monitoring system, the botmaster will compromise a new C&C server, and the Bots query the DNS server to locate the botmaster's domain name. The DNS server replies to these queries without distinguishing the source of the query.

DNS is a distributed database spread over the Internet that translates domain names into IP addresses and vice versa. DNS data flows between clients and servers without any restriction or protection, which makes it vulnerable to capture and exploitation by botmasters. Due to the importance of DNS in the Internet, the Botnet uses it to perform its malicious activities. Legitimate hosts also use the DNS to do their job correctly, so it is difficult to distinguish the traffic that belongs to Botnet activity. Several past studies have focused on monitoring DNS traffic to detect Botnets. Accordingly, this research focuses on classifying DNS traffic and detecting the abnormal DNS traffic issued by Botnet activity.


CHAPTER 3

BOTNET DETECTION MECHANISM (BDM)

3.1 Introduction

In this chapter, a mechanism is proposed that monitors DNS traffic in order to identify and detect the abnormal DNS traffic caused by Botnet activity. DNS is an important part of the Internet; hence the Botnet uses it to perform its malicious activities.

The proposed mechanism monitors and captures the DNS traffic at different time intervals $t_i$ and measures the ratio of similarity between any two blocks of hosts X and Y (group behavior) that request the same domain name at time intervals $t_{i1}$ and $t_{i2}$. For this purpose, a similarity measurement formula is used, with the MAC address as the host's identifier.

3.2 Jaccard Similarity Coefficients

A number of similarity measurement formulas exist that measure the ratio of similarity between two individual objects, X and Y. These formulas operate on binary term vectors, and the similarity value is standardized between 0 and 1. In this research, the Jaccard similarity coefficient $S_j$ (Kim and Choi, 1998; Rieck et al., 2006) is chosen because it is simple and provides good results. The Jaccard similarity coefficient consists of three summation variables, X, Y and Z, as shown in Equation 3.1:

$$S_j = \frac{Z}{Z + X + Y} \qquad (3.1)$$

Z is the number of elements that are in both objects X and Y.

X is the number of elements that are in the first object X only and not in Y.

Y is the number of elements that are in the second object Y only and not in X.

The coefficient can also be expressed in terms of sets of objects, where set X is the first block of hosts and set Y is the second block of hosts.

The Jaccard similarity between any two sets of objects is defined as the size of the intersection between the two sets divided by the size of the union between the two sets (Kim and Choi, 1998). A Jaccard similarity coefficient for the two sets is represented by Equation 3.2:

$$S_j(X, Y) = \frac{|X \cap Y|}{|X \cup Y|} \qquad (3.2)$$

where $|X \cap Y|$ is the size of the intersection and $|X \cup Y|$ is the size of the union, as shown in Figure 3.1.

Figure 3.1: Similarity between Two Blocks of Hosts (Kim and Choi, 1998)
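Equations 3.1 and 3.2 describe the same quantity. A short derivation may make this explicit; the two blocks are written here as sets $A$ and $B$, a notational choice made only to avoid clashing with the counts $X$, $Y$ and $Z$ of Equation 3.1:

```latex
\begin{align*}
Z &= |A \cap B|, \qquad X = |A \setminus B|, \qquad Y = |B \setminus A| \\
|A \cup B| &= |A \cap B| + |A \setminus B| + |B \setminus A| = Z + X + Y \\
S_j(A, B) &= \frac{|A \cap B|}{|A \cup B|} = \frac{Z}{Z + X + Y}
\end{align*}
```

The second line holds because the three sets on its right-hand side are pairwise disjoint and together cover the union.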

The probability value of the Jaccard similarity coefficient $S_j$, used to match two blocks of hosts, is explained in Table 3.1.

Table 3.1: Jaccard Similarity Values (Kim and Choi, 1998)

| Value of $S_j$ | Interpretation |
|---|---|
| $S_j = 1$ | The similarity ratio between all hosts in the two blocks is 100%. So this domain name is an abnormal domain name issued by Botnet activity. |
| $0.8 \le S_j < 1$ | There is an assurance that 80% of the hosts make association in a direct or indirect relation. It is a good value for this research (considering false alarm rates). Hence, this domain name is an abnormal domain name issued by Botnet activity. |
| $0 < S_j < 0.8$ | The similarity ratio is less than 80%, so it cannot be stated exactly that there is a similarity between the two blocks of hosts. So this domain can be classified as a normal domain name. |
| $S_j = 0$ | There is no similarity between the hosts in the two blocks, and consequently the domain name is a normal domain name. |
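The decision rule implied by these values can be written as a small Python function. This is a sketch only: the function name and the constant are hypothetical, and the 0.8 threshold is the 80% value discussed in the table.

```python
ABNORMAL_THRESHOLD = 0.8  # the 80% similarity value used in this research


def classify_domain(similarity: float) -> str:
    """Label a domain name from the Jaccard similarity of its two host blocks."""
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("Jaccard similarity must lie between 0 and 1")
    return "abnormal" if similarity >= ABNORMAL_THRESHOLD else "normal"


for value in (1.0, 0.8, 0.5, 0.0):
    print(value, classify_domain(value))
```

Keeping the threshold in a single named constant makes it easy to study how the false alarm rate changes when a different value is tried.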

To apply the Jaccard similarity values between the two blocks of hosts, the MAC address is preferred over the IP address as the host's identifier for matching the hosts in the two blocks. This is because the botmaster exploits dynamic IP addressing, which may hide the identity of the hosts infected with the Bot and makes it difficult to place their IP addresses on a blacklist. The Dynamic Host Configuration Protocol (DHCP) can assign multiple dynamic IP addresses to the same host over time.

Dynamic IP addresses are assigned for wireless, dial-up, and DSL connections. An infected host such as a laptop can move from one network to another and is assigned a new IP address each time it connects to a new network. Identifying the host by its IP address therefore leads to IP address aliasing and generates false information about the activity of this host (Xie et al., 2007).

By using the MAC address as the host's identifier, accurate results can be obtained even if the hosts move from one location to another, because DHCP does not change the MAC address. Moreover, the infected hosts that caused the abnormal traffic can be identified. MAC address spoofing is not taken into consideration, because any Bot-infected host wants the reply to come back to itself when it sends a query to the DNS server. In the case of spoofing, the reply is sent to a different host, which is outside the scope of this research; such spoofing occurs in other contexts, such as DDoS attacks.
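The effect of the identifier choice can be illustrated with a short Python sketch. All identifiers and addresses are invented: the same three hosts are observed in two intervals, and two of them are assumed to have received new dynamic IP addresses in between.

```python
def jaccard(first, second):
    """Size of the intersection divided by the size of the union."""
    a, b = set(first), set(second)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Invented observations of the same three hosts: (MAC identifier, IP address).
interval_1 = [("mac-1", "192.0.2.10"), ("mac-2", "192.0.2.11"), ("mac-3", "192.0.2.12")]
# Hosts mac-2 and mac-3 reconnected and were given new addresses.
interval_2 = [("mac-1", "192.0.2.10"), ("mac-2", "192.0.2.21"), ("mac-3", "192.0.2.22")]

by_mac = jaccard((mac for mac, _ in interval_1), (mac for mac, _ in interval_2))
by_ip = jaccard((ip for _, ip in interval_1), (ip for _, ip in interval_2))
print(by_mac, by_ip)
```

Matching by MAC address recognizes the two blocks as identical, whereas matching by IP address makes the same group of hosts look largely unrelated.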

3.3 Monitoring Normal and Abnormal Behaviors

The ratio of abnormal traffic at the C&C server appears to be higher than that of normal DNS traffic due to Botnet activity (Dagon, 2005). This abnormal DNS traffic appears only for short, discrete periods, whereas the activity of a legitimate host appears over a longer and possibly continuous time, as shown in Figure 3.2.

[Figure 3.2 legend: Normal Traffic Appeared with Random Ratio; Abnormal Traffic Appeared in Fixed Ratio]

Figure 3.2: Normal and Abnormal DNS Traffic

Monitoring normal and abnormal behaviors is one of the keys to detecting Botnet activities. Botmasters always try to avoid exposing the Botnet's activity. So the botmaster instructs all the Bots to perform their malicious activities simultaneously, as groups, for a short and discrete time, and then to stop all these activities suddenly, and so on, as shown in Figure 3.2. Taking this important behavior into consideration makes the detection of the Botnet possible.

3.4 The Proposed Method for DNS Monitoring

This method depends on monitoring DNS traffic for a certain amount of time $T_m$, which is divided into time intervals such as $t_{i1}$ and $t_{i2}$. This makes the implementation easier, given the large volume of DNS traffic captured by this method. A relationship is formed between any two blocks of hosts (group behavior) requesting the same domain name. The probability of Botnet detection between these two blocks of hosts is calculated by using Jaccard similarity as shown in Figure 3.2.

The probability of Botnet detection $P_{bots}$ can be evaluated only if the sizes of blocks X and Y are not equal to zero and if the DNS ratio $R_{DNS}$ in the monitoring time $T_m$ is also greater than zero. Hence, Equation 3.3 is suggested in this research:

$$\|X\| \neq 0 \ \text{and} \ \|Y\| \neq 0, \qquad R_{DNS} \ \text{in} \ T_m > 0 \qquad (3.3)$$

Assume that a domain name is requested by a block of hosts (X) at time interval $t_{i1}$ and that the same domain name is requested again by a block of hosts (Y) at time interval $t_{i2}$, as shown in Figure 3.3. By calculating the Jaccard similarity coefficient $S_j$ between the two blocks of hosts X and Y, the domain name can be classified, based on a value between 0 and 1, as either a normal domain name or an abnormal domain name that belongs to Botnet activity, and consequently the Botnet can be detected.

[Figure 3.3 labels: Block of Hosts; Calculate $S_j$; Another domain names; Same Domain Name; $t_{i1}$; $t_{i2}$; $T_m$]

Figure 3.3: Applying Jaccard Similarity between Two Blocks of Hosts
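The grouping of queries into blocks of hosts per time interval can be sketched in Python. The record layout (timestamp, MAC identifier, domain name), the function names, the 60-second interval and the sample queries are all assumptions made for illustration; they are not taken from the BDM implementation.

```python
from collections import defaultdict


def build_blocks(queries, interval_seconds):
    """Group querying MAC identifiers by domain name and time-interval index."""
    blocks = defaultdict(lambda: defaultdict(set))
    for timestamp, mac, domain in queries:
        interval = int(timestamp // interval_seconds)
        blocks[domain][interval].add(mac)
    return blocks


def block_similarity(blocks, domain, first, second):
    """Jaccard similarity of the blocks seen for a domain in two intervals."""
    per_interval = blocks.get(domain, {})
    x = per_interval.get(first, set())
    y = per_interval.get(second, set())
    if not x or not y:
        return None  # mirrors the non-empty-block condition of Equation 3.3
    return len(x & y) / len(x | y)


# Invented sample: (seconds since start of monitoring, MAC identifier, queried name).
queries = [
    (5, "mac-1", "cnc.example"), (7, "mac-2", "cnc.example"), (9, "mac-3", "cnc.example"),
    (65, "mac-1", "cnc.example"), (66, "mac-2", "cnc.example"), (70, "mac-3", "cnc.example"),
    (12, "mac-1", "news.example"), (80, "mac-4", "news.example"),
]
blocks = build_blocks(queries, interval_seconds=60)
print(block_similarity(blocks, "cnc.example", 0, 1))   # same group reappears
print(block_similarity(blocks, "news.example", 0, 1))  # unrelated hosts
```

Storing each block as a set means that repeated queries from one host within an interval count only once, which matches the idea of comparing groups of hosts instead of query volumes.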

For instance, assume that a block of hosts X of size 9 requests a domain name at $t_{i1}$ and another block of hosts Y of size 9 requests the same domain name at $t_{i2}$, as shown in Figure 3.4. By applying the Jaccard similarity coefficient $S_j$ and using the MAC address as the host's identifier, the value of $S_j$ is obtained as follows:

Hosts of block X = { 00-FB-03-32-15-1C, 00-CC-AB-10-11-20, 02-FF-C4-22-10-EE, 00-E5-C6-10-02-AA, 00-1F-1E-E2-48-5C, 03-AA-00-3F-2A-1C, 03-20-1B-2E-08-EE, 05-30-10-21-10-1A, 06-01-02-01-3C-4B }

Hosts of block Y = { 00-FB-03-32-15-1C, 00-CC-AB-10-F1-20, 02-FF-C4-22-10-EE, 00-E5-C6-10-77-AA, 00-1F-1E-E2-48-5C, 03-AA-00-3F-2F-1C, 03-20-1B-2E-08-EE, 05-30-10-21-1D-1A, 06-01-02-01-3C-4B }

Five MAC addresses appear in both blocks X and Y. Four MAC addresses appear in block X only, and four appear in block Y only. By applying Jaccard similarity, the following result is obtained:

$$S_j = \frac{Z}{Z + X + Y} = \frac{5}{5 + 4 + 4} \approx 0.38$$

The ratio of similarity between the hosts in the two blocks is 38%, which indicates a normal domain name that belongs to a legitimate host, as shown in Figure 3.4.

[Figure 3.4 labels: Blocks of Hosts; Block X; Block Y; Similar MAC Addresses; Same Domain Name Requested by Two Blocks of Hosts; Apply Jaccard Coefficient between Blocks X and Y Using MAC Address as Host's Identifier]

Figure 3.4: Similarity Measurement Based on MAC Address
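The worked example can be checked with a few lines of Python. The two sets hold the MAC addresses of blocks X and Y from the example; the variable names are hypothetical.

```python
block_x = {
    "00-FB-03-32-15-1C", "00-CC-AB-10-11-20", "02-FF-C4-22-10-EE",
    "00-E5-C6-10-02-AA", "00-1F-1E-E2-48-5C", "03-AA-00-3F-2A-1C",
    "03-20-1B-2E-08-EE", "05-30-10-21-10-1A", "06-01-02-01-3C-4B",
}
block_y = {
    "00-FB-03-32-15-1C", "00-CC-AB-10-F1-20", "02-FF-C4-22-10-EE",
    "00-E5-C6-10-77-AA", "00-1F-1E-E2-48-5C", "03-AA-00-3F-2F-1C",
    "03-20-1B-2E-08-EE", "05-30-10-21-1D-1A", "06-01-02-01-3C-4B",
}

z = len(block_x & block_y)       # hosts in both blocks
x_only = len(block_x - block_y)  # hosts in block X only
y_only = len(block_y - block_x)  # hosts in block Y only
s_j = z / (z + x_only + y_only)  # Equation 3.1

print(z, x_only, y_only, round(s_j, 2))  # prints: 5 4 4 0.38
```

Since 0.38 is below the 0.8 value used in this research, the domain name in this example is treated as normal.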

3.5 Botnet Detection Mechanism (BDM) Framework

A simple mechanism framework is formed to classify the DNS traffic and detect Botnet activity in DNS; it is called the Botnet Detection Mechanism (BDM). The BDM consists of three main phases: the capturing phase, the analyzing phase, and the classifying phase.

Each phase is elaborated in detail to show the relationship between them. The flow of data in the BDM is presented to illustrate the process, from capturing the DNS traffic to classifying it into normal DNS traffic issued by legitimate hosts and abnormal DNS traffic issued by Botnet activity, as shown in Figure 3.5.

[Figure 3.5 (flowchart) labels, in reading order: Filter network packets and extract DNS packet of type A; Extract from DNS packet: query name and MAC address; Insert query data into database; Read query data; Classify every query issued from single host or blocks of hosts; If two blocks of hosts request the same domain name at time intervals $t_{i1}$ and $t_{i2}$: Calculate Jaccard similarity $S_j$ between these two blocks of hosts; $S_j \ge 0.8$ and $S_j \le 1$: Store as abnormal domain name and send alarm; $S_j \ge 0$ and $S_j < 0.8$: Store as normal domain name in database; No: Match the domain name requested by single host and its MAC with database results; Abnormal domain name: Send alarm; Database]

3.5.1 Capturing Phase

The BDM continuously receives a huge number of network packets. However, in this research, only the DNS packets are required. For this purpose, the BDM filters the network packets and captures the required DNS packets. These DNS packets carry different RR types, such as A, CNAME, PTR, etc., as shown in Figure 3.6. In the proposed method, only DNS packets of type A are required, because they carry the domain name queries.

Figure 3.6: DNS Packets
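Selecting type A queries can be sketched in Python by reading the question section of a DNS message. This is an illustration under stated assumptions and not the BDM capture code: it works on a DNS payload that has already been extracted from a packet, the helper names are hypothetical, name compression is not handled, and the MAC address, which comes from the Ethernet frame, is not covered.

```python
import struct

TYPE_A = 1     # record type code for address (A) queries
TYPE_PTR = 12  # record type code for pointer (PTR) queries


def build_query(name: str, qtype: int, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query message (used here only to create test input)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)  # one question, RD flag set
    qname = b"".join(bytes([len(label)]) + label.encode("ascii")
                     for label in name.split(".")) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class IN


def parse_question(payload: bytes):
    """Return (query name, query type) of the first question, or None."""
    if len(payload) < 12 or struct.unpack("!H", payload[4:6])[0] < 1:
        return None
    offset, labels = 12, []        # the DNS header is 12 bytes long
    while payload[offset] != 0:    # labels are length-prefixed, ended by a zero byte
        length = payload[offset]
        labels.append(payload[offset + 1:offset + 1 + length].decode("ascii"))
        offset += 1 + length
    qtype, _qclass = struct.unpack("!HH", payload[offset + 1:offset + 5])
    return ".".join(labels), qtype


def is_type_a_query(payload: bytes) -> bool:
    question = parse_question(payload)
    return question is not None and question[1] == TYPE_A


print(parse_question(build_query("www.usm.my", TYPE_A)))
print(is_type_a_query(build_query("www.usm.my", TYPE_PTR)))
```

Filtering on the query type before storing anything keeps the database limited to the records the method actually analyzes.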

The BDM filters the DNS packet by utilizing the iNetmon project that belo
