DISASTER RECOVERY WITH MINIMUM REPLICA PLAN

Academic year: 2022

BY

MOHAMMAD MATAR ALSHAMMARI

A thesis submitted in fulfilment of the requirement for the degree of Doctor of Philosophy in Information Technology

Kulliyyah of Information and Communication Technology International Islamic University Malaysia

NOVEMBER 2018

ABSTRACT

Cloud computing has emerged as a new paradigm for hosting and delivering computing resources over the Internet. The cloud has become a dominant and preferred method for storing large amounts of data and sharing that data among several users. It also enables pay-as-you-go pricing models. Today's cloud computing environment requires data centers to increase the amount of available storage. There are two main concerns with cloud storage: data reliability and the cost of storage. This research proposes an approach to data replication management in a multi-cloud environment that determines the number of replicas (which should be less than 3) and reduces cloud storage consumption while meeting data reliability requirements.

Furthermore, it proposes a preventive approach for data backup and recovery that aims to minimize the number of replicas and ensure high data reliability before a disaster. The approach, named Preventive Disaster Recovery Plan with Minimum Replica (PDRPMR), is a cost-effective mechanism that reduces the number of replicas in the cloud to only 1 or 2 without compromising data reliability. The name PDRPMR originates from its preventive actions: checking the availability of replicas and monitoring for denial-of-service attacks to maintain data reliability. Several experiments have been carried out to demonstrate that PDRPMR reduces the amount of storage space used by one-third to two-thirds compared with typical 3-replica replication strategies, which in turn reduces the cost of storage.

The approach is evaluated using two metrics, the cost of backup storage and the Recovery Time Objective (RTO), which have been used most frequently in the literature. In this thesis, we focus on the critical factors that influence the Disaster Recovery (DR) plan, including minimizing storage cost, reducing the RTO, ensuring a high reliability rate and decreasing the number of replicas to less than 3 (the typical number of replicas).

خلاصة البحث (Research Abstract)

ABSTRACT IN ARABIC

Cloud computing has emerged as a new model for hosting and delivering computing resources over the Internet. The cloud has become a dominant and preferred way to store large amounts of data and to enable the exchange of that data among several users. It also allows the use of pay-as-you-go pricing models. Today's cloud computing environment requires data centers to increase the amount of available storage. There are two main concerns with cloud storage: data reliability and the cost of storage. This research proposes data replication management in a multi-cloud approach that determines the number of replicas (which should be less than 3) and reduces cloud storage consumption while meeting data reliability requirements.

Furthermore, the research proposes a preventive approach to data backup and recovery that aims to minimize the number of replicas and to ensure high data reliability before a disaster occurs. The approach, named Preventive Disaster Recovery Plan with Minimum Replica, is a cost-effective mechanism for limiting the number of replicas in the cloud to only 1 or 2 without compromising data reliability. Its name derives from its preventive actions of checking the availability of replicas and monitoring denial-of-service attacks in order to maintain data reliability. Several experiments have been carried out to show that the mechanism reduces the amount of storage space used by one-third to two-thirds compared with typical replication strategies that use 3 replicas, which in turn lowers the cost of storage.

These two metrics have been used in most previous studies. In this thesis, we focus on the critical factors that influence the disaster recovery plan, including minimizing storage cost, reducing the recovery time objective, ensuring a high reliability rate and decreasing the number of replicas to less than 3 (the typical number of replicas).

APPROVAL PAGE

The thesis of Mohammad Matar Alshammari has been approved by the following:

_____________________________

Ali A. Alwan Supervisor

_____________________________

Azlin Nordin Co-Supervisor

_____________________________

Norsaremah Salleh Internal Examiner

_____________________________

Siddeeq Yousif Ameen External Examiner

_____________________________

Jafreezal Jaafar External Examiner

_____________________________

Saim Kayadibi Chairperson

DECLARATION

I hereby declare that this thesis is the result of my own investigations, except where otherwise stated. I also declare that it has not been previously or concurrently submitted as a whole for any other degrees at IIUM or other institutions.

Mohammad Matar Alshammari

Signature ... Date ...

COPYRIGHT

INTERNATIONAL ISLAMIC UNIVERSITY MALAYSIA

DECLARATION OF COPYRIGHT AND AFFIRMATION OF FAIR USE OF UNPUBLISHED RESEARCH

DISASTER RECOVERY WITH MINIMUM REPLICA PLAN

I declare that the copyright of this thesis is jointly owned by the student and IIUM.

Copyright © 2018 Mohammad Matar Alshammari and International Islamic University Malaysia. All rights reserved.

No part of this unpublished research may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without prior written permission of the copyright holder except as provided below:

1. Any material contained in or derived from this unpublished research may only be used by others in their writing with due acknowledgement.

2. IIUM or its library will have the right to make and transmit copies (print or electronic) for institutional and academic purposes.

3. The IIUM library will have the right to make, store in a retrieval system and supply copies of this unpublished research if requested by other universities and research libraries.

By signing this form, I acknowledge that I have read and understood the IIUM Intellectual Property Right and Commercialization policy.

Affirmed by Mohammad Matar Alshammari

……..……….. ………..

Signature Date

DEDICATION

This thesis is dedicated to my family

ACKNOWLEDGEMENTS

First of all, I am grateful from the bottom of my heart to Almighty Allah, who made it possible to complete this thesis.

I must acknowledge my dear mother, my wife and my family. Without their support, concern, and love, it would have been impossible for me to complete my Ph.D. studies. I especially thank my wife, who encouraged me to pursue my Ph.D.

I am also grateful to my supportive supervisor Assist. Prof. Dr. Ali A. Alwan and co-supervisor Assoc. Prof. Dr. Azlin Nordin Salleh, who have continuously encouraged me throughout my research. I am especially thankful to my main supervisor Dr. Ali, who guided me with great patience and kept encouraging me during my research. He taught me many things, as I enrolled in the Ph.D. program at IIUM with a weak research background. Thank you very much, Dr. Ali, for being my supervisor and mentor.

Finally, I wish to express my appreciation and thanks to those who provided their time, effort and support for this project. To the members of my thesis committee, thank you for sticking with me.

TABLE OF CONTENTS

Abstract
Abstract in Arabic
Approval Page
Declaration
Copyright
Dedication
Acknowledgements
List of Tables
List of Figures
List of Abbreviations

CHAPTER ONE: INTRODUCTION
1.1 Overview
1.2 Problem Statement
1.3 Research Questions
1.4 Research Objectives
1.5 Research Scope
1.6 Research Significance
1.7 Organization of the Thesis

CHAPTER TWO: BACKGROUND AND LITERATURE REVIEW
2.1 Introduction
2.2 Cloud Computing Overview
2.2.1 Definition of Cloud Computing
2.2.2 Essential Characteristics of Cloud Computing
2.2.3 Service Models of Cloud Computing
2.2.4 Deployment Models of Cloud Computing
2.3 Disaster Recovery Overview
2.3.1 Definition of Disaster Recovery
2.3.2 Types of Disaster Recovery
2.3.3 Importance of Disaster Recovery
2.3.4 Issues with Disaster Recovery
2.4 An Overview of Disaster Recovery in Cloud
2.4.1 Importance of Disaster Recovery in the Cloud
2.4.2 Issues with Disaster Recovery in the Cloud
2.4.3 Advantages and Disadvantages of Disaster Recovery in the Cloud
2.5 Data Reliability in the Cloud
2.6 Data Disaster Recovery in the Cloud
2.6.1 Traditional Data Disaster Recovery
2.6.2 Data Disaster Recovery in the Cloud
2.6.3 Data Disaster Recovery Models in the Cloud
2.7 Previous Approaches of Disaster Recovery
2.8 Existing Studies on Disaster Recovery in the Cloud
2.9 Previous Works of Data Reliability in the Cloud
2.10 Previous Approaches of Data Management in Cloud
2.11 Previous Works on Data Disaster Recovery in the Cloud
2.12 Summary

CHAPTER THREE: RESEARCH METHODOLOGY
3.1 Introduction
3.2 Methodology of the Research
3.3 Data Replication Management in a Multi-Cloud
3.4 Data Backup and Recovery in Multi-Cloud
3.5 Performance Measurements
3.6 Cloud Simulator
3.6.1 CloudSim
3.6.2 CloudAnalyst
3.7 Implementation
3.8 Summary

CHAPTER FOUR: PROPOSED APPROACH FOR DATA REPLICATION MANAGEMENT IN MULTI-CLOUD
4.1 Introduction
4.2 Proposed Approach of Data Replication Management in Multi-Cloud
4.2.1 Proactive Replica Checking
4.2.2 Overview of the PDRPMR
4.2.3 Working Procedure of the PDRPMR
4.3 Optimization Algorithms in the PDRPMR
4.3.1 Minimum Replication Algorithm
4.3.2 Metadata Distribution Algorithm
4.3.2.1 The Maximum Capacity of the PDRPMR
4.3.2.2 Provision of Sufficient Data Reliability Assurance
4.4 Summary

CHAPTER FIVE: PROPOSED DATA BACKUP, RECOVERY AND SCHEDULING IN MULTI-CLOUD ENVIRONMENT
5.1 Introduction
5.2 System Model of the Proposed Approach
5.2.1 Architecture of the Proposed Approach
5.2.2 Data Backup Model
5.2.3 Data Recovery Model
5.3 Scheduling Strategy of the Proposed Approach
5.4 Summary

CHAPTER SIX: PROPOSED SYSTEMS IMPLEMENTATION AND EVALUATION
6.1 Introduction
6.2 System Architecture of the Simulation
6.3 Experimental Settings
6.3.1 Experiment Evaluation Metrics
6.3.2 Experiment Evaluation Scenarios
6.3.3 Simulation Configuration
6.4 Experimental Results and Analysis
6.4.1 Data Replication Results
6.4.1.1 Cost-Preferred Strategy
6.4.1.2 RTO-Preferred Strategy
6.4.1.3 Results Discussion for Data Replication
6.4.2 Data Backup and Recovery Results
6.4.2.1 Cost for 1 and 3-Replicas
6.4.2.2 Cost for 2 and 3-Replicas
6.4.2.3 RTO for 1 and 3-Replicas
6.4.2.4 RTO for 2 and 3-Replicas
6.4.2.5 Results Discussion for Data Backup and Recovery
6.5 Summary

CHAPTER SEVEN: CONCLUSIONS AND FUTURE WORK
7.1 Research Summary
7.2 Conclusions of Research
7.3 Contribution of Research
7.4 Future Work

REFERENCES

LIST OF PUBLICATIONS

LIST OF TABLES

Table 2.1 Standards Platform Recovery
Table 2.2 The Events Categories
Table 2.3 Advantages and Disadvantages of Disaster Recovery in the Cloud
Table 2.4 Summary of Previous Approaches of Disaster Recovery in the Cloud
Table 2.5 Summary of Previous Work on Data Disaster Recovery in the Cloud
Table 4.1 Types of Metadata
Table 6.1 Simulation Setup Requirements
Table 6.2 Parameter Settings of CPs
Table 6.3 Simulation Parameters
Table 6.4 Latency Matrix Values (ms)
Table 6.5 Bandwidth Matrix Values (Mbps)
Table 6.6 Cost and RTO Results Comparison in Cost-Preferred Strategy
Table 6.7 Cost and RTO Results Comparison in RTO-Preferred Strategy
Table 6.8 Performance Values of Cost and RTO

LIST OF FIGURES

Figure 2.1 Cloud Computing Fundamentals
Figure 2.2 Layers of Cloud Services
Figure 2.3 Deployment Model of Cloud Computing
Figure 2.4 Comparison Between Traditional and Cloud DR Models
Figure 2.5 Recovery Point Objective & Recovery Time Objective
Figure 2.6 The Model of Disaster Recovery System
Figure 2.7 Typical Deployment Scenario
Figure 2.8 Disaster-CDM Architecture
Figure 2.9 Deployment Architecture of Optimal
Figure 2.10 Framework of Disaster Recovery Assistance
Figure 2.11 Data Backup Process
Figure 2.12 Data Recovery Process
Figure 2.13 PRCR Architecture
Figure 2.14 Procedure Between DBMS/CSP
Figure 3.1 Methodology of the Research
Figure 3.2 The Proposed Mechanism Architecture
Figure 3.3 The Proposed Framework of Disaster Recovery
Figure 3.4 CloudSim Architecture
Figure 3.5 CloudAnalyst Architecture
Figure 3.6 The Process Diagram of Disaster Recovery Model
Figure 4.1 PDRPMR Architecture
Figure 4.2 Working Process of Proposed Mechanism
Figure 4.3 The Algorithm of Minimum Replication Algorithm
Figure 4.4 Pseudo Code of Metadata Distribution Algorithm
Figure 5.1 Data Backup Model
Figure 5.2 Flowchart of the Data Backup Model
Figure 5.3 Data Recovery Model
Figure 5.4 Flowchart of the Data Recovery Model
Figure 6.1 Flowchart of the Disaster Recovery Model
Figure 6.2 The Cost of 1, 2, and 3-Replicas Using the Cost-Preferred Strategy
Figure 6.3 The RTO of 1, 2, and 3-Replicas Using the Cost-Preferred Strategy
Figure 6.4 The Cost of 1, 2, and 3-Replicas Using the RTO-Preferred Strategy
Figure 6.5 The RTO of 1, 2, and 3-Replicas Using the RTO-Preferred Strategy
Figure 6.6 Cost ($) for 1 and 3-Replicas Using 3 Scheduling Strategies With 500 Tasks
Figure 6.7 Cost ($) for 2 and 3-Replicas Using 3 Scheduling Strategies With 500 Tasks
Figure 6.8 RTO (ms) for 1 and 3-Replicas Using 3 Scheduling Strategies With 500 Tasks
Figure 6.9 RTO (ms) for 2 and 3-Replicas Using 3 Scheduling Strategies With 500 Tasks

LIST OF ABBREVIATIONS

A/C           Alternating Current
Amazon S3     Amazon Simple Storage Service
ARPANET       Advanced Research Projects Agency Network
BC            Business Continuity
BIA           Business Impact Analysis
CBF           Critical Business Function
CI            Checking Interval
CIS           Set of Checking Interval values
CP            Cloud Provider
CPE           Cloud Provider have Enough space
CPU           Central Processing Unit
CSP           Cloud Service Provider
DaaS          Database as a Service
DAR           Data storage, request Allocation and resource Reservation
DC            Data Center
DDP-DR        Data Distribution Plan for multi-site DR
Disaster CDM  Disaster Cloud Data Management
DNS           Domain Name System
DR            Disaster Recovery
DRaaS         Disaster Recovery as a Service
DR-Cloud      Cloud Disaster Recovery
DRP           Disaster Recovery Plan
EC2           Elastic Compute Cloud
EHR           Electronic Health Record
ERP           Enterprise Resource Planning
ET            Expected Time/Expected Storage Duration
GB            Gigabyte
GFS           Google File System
GRA           Geographical Redundancy Approach
GUI           Graphical User Interface
HDFS          Hadoop Distributed File System
IaaS          Infrastructure as a Service
IDEMA         Impact of Decoupling and Modulation
iSCSI         Internet Small Computer System Interface
IT            Information Technology
JTA           Java Transaction API
KaaS          Knowledge as a Service
MAO           Maximum Acceptable Outage
MB            MegaByte
Mbps          Megabits Per Second
NAS           Network Attached Storage
NetDB2        Network Database2
NetDB2-MS     Network Database2 Management System
NIST          National Institute of Standards and Technology
NoSQL         Non Structured Query Language
OA & M        Operations Administration and Management
OLTP          Online Transaction Processing
OMNet++       Objective Modular Network Testbed in C++
OS            Operating System
OSM           Organizational Sustainability Modeling
PaaS          Platform as a Service
PC            Personal Computer
PDRPMR        Preventive Disaster Recovery Plan with Minimum Replica
PRCR          Proactive Replica Checking for Reliability
RMAN          Recovery Manager
RPO           Recovery Point Objective
RTO           Recovery Time Objective
RTT           Round Trip Time
SaaS          Software as a Service
SAN           Storage Area Networks
SLO           Service Level Objective
SMB           Small and Medium Business
SOA           Service-Oriented Architectures
SQL           Structured Query Language
SSP           Storage Service Provider
TB            TeraByte
VM            Virtual Machine
ZB            ZettaByte

CHAPTER ONE: INTRODUCTION

1.1 OVERVIEW

With the rapid growth of Internet technologies, large-scale online services such as data backup and data recovery have increased in recent years. Because these services require substantial networking, processing and storage capacities, it is a critical challenge to design large-scale computing infrastructures that support these services in a cost-effective manner. As a solution, cloud computing has been refined during the past decade and has become an attractive business for organizations that own large datacenters and rent their computing resources (Rimal et al., 2011; Tsai et al., 2010).

Cloud computing delivers numerous benefits, including reduced costs for data storage backup and data accessibility.

The essential cloud characteristic is its ability to store data while ensuring its availability, which is an important feature when storing sensitive information.

However, the rapid growth in the scale and complexity of today's cloud services and infrastructures has also revealed important challenges in the design of fundamental cloud computing architectures. These challenges specifically concern high data reliability requirements and storage costs.

Various studies on maintaining data reliability have focused on software-based approaches. The majority of the proposed solutions suggest that the data must be replicated into at least three copies (3 replicas) to ensure high data reliability (Li et al., 2012; Gu et al., 2014). These replicas can be placed either in one location or distributed over multiple locations. However, this solution incurs high storage costs, consumes significant volumes of storage space, and causes high network traffic, mainly for data-intensive applications in the cloud. Furthermore, the current approaches to data backup and recovery for single-cloud environments require vast amounts of storage space due to the creation of multiple replicas in numerous Data Centers (DCs) (Li et al., 2012; Gu et al., 2014; Sengupta and Annervaz, 2014).

Accordingly, the use of a single-cloud paradigm can generate risks, including hardware faults and software errors, natural disasters, and damage by human interference. These issues can lead to service disruptions or a total loss of data through a system collapse (Li et al., 2012; Gu et al., 2014; Sengupta and Annervaz, 2014).

Cloud computing development is not recommended without considering the risks, which may be particularly pronounced when only one DC is involved. Various Cloud Providers (CPs) address these risks via practical measures, including the geographic dispersion of data. However, DCs in different locations are still operated by a single cloud service provider. They usually use the same infrastructures and software stacks and have similar or identical operational processes and management teams (Gu et al., 2014). Many surveys conducted over recent years have shown that enterprises and critical business organizations are moving from the single-cloud to the multi-cloud (Tebaa and Hajji, 2014; Sengupta and Annervaz, 2014). Moreover, using two or more clouds reduces the risk of failure with regard to service availability, data loss, and compromised privacy, and using multiple clouds simultaneously can reduce the risk of relying on a public cloud for applications and data. The most common barriers to the adoption of the cloud are cost, security, reliability, and loss of control. However, a multi-cloud environment can give an organization greater flexibility and control in deciding which workloads will be run and where they should be run (Sulochana and Dubey, 2015).

This study focuses on the critical factors that influence the Disaster Recovery (DR) plan, including minimizing storage costs, reducing the Recovery Time Objective (RTO), ensuring a high reliability rate and decreasing the number of replicas to less than 3 (the typical number of replicas) in a multi-cloud environment.

1.2 PROBLEM STATEMENT

In today's business environment, the Information Technology (IT) data services operated by CPs face many challenges in ensuring the reliability of data services before and after disasters (Saquib et al., 2013). Data services must ensure reliability and flexibility through an effective and practical DR plan, which is a vital initiative for any organization that wants to prosper and sustain growth (Saquib et al., 2013).

The main concern with DR in the cloud is how to ensure an effective data backup and recovery process that achieves high data reliability before a disaster while maintaining a reasonable cost (Saquib et al., 2013). Several solutions for data backup have been designed for a single-cloud architecture (Saquib et al., 2013; Suguna and Suhasini, 2014; Lenk, 2015; Jena and Mohanty, 2016). However, keeping only one copy of the data in a single-cloud environment may not be a good solution because any damage to the data in the case of disaster will result in a permanent loss (Tebaa and Hajji, 2014; Gu et al., 2014; Sengupta and Annervaz, 2014). Other solutions for developing a data backup and recovery plan involve multiple cloud providers, in which multiple data replicas are generated for several remote CPs (Gu et al., 2014; Sengupta and Annervaz, 2014; Sulochana and Dubey, 2015; Toosi and Buyya, 2017). This approach guarantees high data reliability and minimizes the risk of data loss in case of disaster, thereby ensuring that user data are recoverable in the event of catastrophic failure.

According to Vukolic (2010), the main purpose of moving to a multi-cloud environment is to improve what can be offered by single-cloud by distributing data reliability among multiple CPs. The single-cloud is expected to become less popular with customers due to the risks of data service availability failure and the possibility of malicious insiders. DCs in different locations owned by one CP primarily use similar operational environments and infrastructures, which may affect the recovery of data services. For instance, if we entrust our data DR solution to a single-cloud provider that does not have a backup solution or that hosts the data in a single platform or in the same geographic area, the risk of downtime for customers, who might be unable to access their data for several hours, could increase.

Most proposed solutions assume that the data should be replicated into at least three copies (3 replicas) to ensure high reliability (Li et al., 2012; Gu et al., 2014; Li et al., 2014; Li et al., 2016; Du et al., 2017). These copies might be in one location or distributed over multiple remote locations. Nevertheless, these solutions incur high storage costs and consume a significant amount of storage space, which leads to high network traffic, particularly for data-intensive applications in the cloud (Li et al., 2012; Gu et al., 2014; Li et al., 2014; Li et al., 2016; Du et al., 2017).
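The storage argument can be illustrated with simple arithmetic. The sketch below assumes that raw storage consumption grows linearly with the number of full replicas and ignores metadata and compression overheads; the function names and the 90 GB data size are hypothetical and are not taken from this thesis.

```python
from fractions import Fraction


def storage_used(data_size_gb: float, replicas: int) -> float:
    """Raw storage consumed, assuming every replica is a full copy."""
    if replicas < 1:
        raise ValueError("at least one replica is required")
    return data_size_gb * replicas


def saving_vs_baseline(replicas: int, baseline: int = 3) -> Fraction:
    """Fraction of storage saved relative to a baseline replica count."""
    if not 1 <= replicas <= baseline:
        raise ValueError("replicas must be between 1 and the baseline")
    return Fraction(baseline - replicas, baseline)


if __name__ == "__main__":
    size_gb = 90  # hypothetical data size, for illustration only
    for r in (1, 2, 3):
        print(f"{r} replica(s): {storage_used(size_gb, r)} GB, "
              f"saving {saving_vs_baseline(r)} of the 3-replica footprint")
```

Under these assumptions, keeping 2 replicas saves one-third of the 3-replica footprint and keeping 1 replica saves two-thirds.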

Moreover, most of the previous approaches do not consider the required level of data reliability, that is, whether the data to be stored are critical or non-critical. In addition, the storage duration, that is, whether the user wishes to store the data for the short term or the long term, has not been taken into account when replicating the data across distributed CPs.

Thus, an efficient data backup and recovery strategy for DR in a multi-cloud environment that takes into account the level of importance of the data and the duration of storage must be explored. The solution should take into consideration the critical factors that influence the DR plan, including minimizing storage costs, reducing RTO, ensuring high data reliability rates and decreasing the number of replicas (to less than 3).

1.3 RESEARCH QUESTIONS

In the following, we outline the research questions addressed in this research work:

1. What are the current methods available for data backup and recovery and for the maintenance of these services during disasters?

2. What are the limitations of the current methods used for data backup and recovery operations, and how does an effective and practical DR plan ensure the availability, reliability and flexibility of services?

3. Is it possible to apply the current DR techniques designed for a single-cloud environment to DR in a multi-cloud environment?

4. How does the data backup and recovery process perform during disasters in a multi-cloud context?

5. How is the availability of services maintained and the continuity of these services ensured during disasters?

1.4 RESEARCH OBJECTIVES

The objectives of this thesis are as follows:

1. To design an approach for data replication management in a multi-cloud environment that determines the number of replicas (which should be less than 3) and reduces cloud storage consumption while meeting data reliability requirements.

2. To propose an approach for data backup and recovery for multi-cloud architecture with the aim of minimizing backup storage costs and RTOs and ensuring high data reliability.

3. To propose scheduling strategies that offer different data backup and recovery solutions based on user-given criteria such as Cost, RTO and Cost/RTO.

4. To design and develop a framework for data recovery in a multi-cloud environment that provides solutions based on user preferences during disasters.
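The three user criteria named in the objectives above (Cost, RTO and Cost/RTO) can be sketched as a simple plan selector. This is an illustrative sketch only, not the scheduling algorithm of this thesis: the `BackupPlan` type, the provider labels, the sample values, and in particular the equal-weight normalized score used for the combined Cost/RTO criterion are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupPlan:
    provider: str   # hypothetical cloud provider label
    cost: float     # storage cost in dollars (illustrative value)
    rto_ms: float   # recovery time objective in milliseconds (illustrative value)


def _normalise(value: float, lo: float, hi: float) -> float:
    """Scale value into [0, 1]; return 0 when all candidates are equal."""
    return 0.0 if hi == lo else (value - lo) / (hi - lo)


def choose_plan(plans: list[BackupPlan], strategy: str) -> BackupPlan:
    """Pick a plan by 'cost', 'rto', or the combined 'cost/rto' criterion."""
    if not plans:
        raise ValueError("no candidate plans")
    if strategy == "cost":
        return min(plans, key=lambda p: (p.cost, p.rto_ms))
    if strategy == "rto":
        return min(plans, key=lambda p: (p.rto_ms, p.cost))
    if strategy == "cost/rto":
        costs = [p.cost for p in plans]
        rtos = [p.rto_ms for p in plans]

        def score(p: BackupPlan) -> float:
            # Assumption: cost and RTO are weighted equally after normalisation.
            return (0.5 * _normalise(p.cost, min(costs), max(costs))
                    + 0.5 * _normalise(p.rto_ms, min(rtos), max(rtos)))

        return min(plans, key=score)
    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    candidates = [
        BackupPlan("CP-A", cost=10.0, rto_ms=900.0),
        BackupPlan("CP-B", cost=30.0, rto_ms=200.0),
        BackupPlan("CP-C", cost=18.0, rto_ms=400.0),
    ]
    for s in ("cost", "rto", "cost/rto"):
        print(s, "->", choose_plan(candidates, s).provider)
```

With these made-up candidates, the cost-preferred strategy selects the cheapest plan, the RTO-preferred strategy selects the fastest, and the combined criterion selects the plan that balances both. A different weighting would change the combined choice.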

1.5 RESEARCH SCOPE

The scope of this research work is outlined in the following points:

• This research focuses on designing and developing a framework for data recovery in a multi-cloud environment to provide numerous solutions based on user preferences before and after disasters. Moreover, we examine the critical factors that influence a DR plan, including minimizing storage costs, reducing RTO, ensuring high reliability rates and decreasing the number of replicas to less than three (the typical number of replicas).

• Furthermore, we focus on issues related to data reliability services in a multi-cloud environment, including a new approach for cost-effective data reliability with minimum replicas and effective data recovery solutions before and after disasters.

• Because the data backup and recovery process requires a significant amount of time, data can often be lost during disasters. Therefore, this research considers the following two performance metrics to evaluate the proposed approach: the cost of backup storage and the RTO. These two metrics have been used most frequently in the literature (Sengupta and Annervaz, 2012; Saquib et al., 2013; Gu et al., 2014; Khoshkholghi et al., 2014; Sengupta and Annervaz, 2014; Suguna and Suhasini, 2014; Alhazmi, 2016).

• The three primary DR levels are the data level, system level, and application level. The concern at the system level is data backup and recovery in the shortest recovery time, whereas the focus at the application level is on maintaining data reliability before and after disasters. Thus, this research mainly emphasizes the system and application levels (Prakash et al., 2012; Khoshkholghi et al., 2014).

1.6 RESEARCH SIGNIFICANCE

The aim of this research is to design and develop a framework for data recovery in a multi-cloud environment that provides numerous solutions before and after disasters based on user preferences. There is significant demand for a multi-cloud infrastructure that guarantees data reliability and ensures these services during disasters (Li et al., 2012; Gu et al., 2014; Li et al., 2014; Li et al., 2016; Du et al., 2017). This research also aims to propose a cost-effective approach that determines the number of replicas (which should be less than 3), thereby reducing cloud storage consumption while meeting data reliability requirements. In addition, it proposes various scheduling strategies that offer different data backup and recovery solutions based on user criteria such as Cost, RTO and Cost/RTO. These two factors, cost and RTO, are the most critical factors that influence the user when choosing the best plan for data backup and recovery (Sengupta and Annervaz, 2012; Saquib et al., 2013; Gu et al., 2014; Khoshkholghi et al., 2014; Sengupta and Annervaz, 2014; Suguna and Suhasini, 2014; Alhazmi, 2016).

1.7 ORGANIZATION OF THE THESIS

This thesis is organized as follows:

Chapter 1 is an introductory chapter that discusses the problem statement, the research questions, the objectives of the research, the scope of the research and the research significance.

Chapter 2 is a background chapter that explains the fundamental concepts of DR and cloud computing and introduces the main concepts of the preferred data DR techniques in cloud computing. It examines and extensively discusses various backup replica scheduling strategies that offer different data backup and recovery processes in a multi-cloud architecture. It also reviews relevant works by previous researchers on DR in cloud computing, covering both single-cloud and multi-cloud environments.

Chapter 3 depicts the research methodology of the thesis and describes how this research was conducted. The chapter also discusses the different phases in this research and the methodology followed during each phase. The measurement metrics and the datasets used in the experiments are presented.

Chapter 4 presents a detailed description of the proposed approach for data replication management strategy in a multi-cloud environment. This chapter also
