Advantages of Cloud Computing and Virtualization
Background
Cloud computing is used all over the world because of its versatility and flexibility in storing and transferring data while maintaining the required secrecy. The technology provides IT products and services on demand, with virtualization playing a major role: computing is distributed over the internet and computer services are delivered over the internet [94 p.164]. Virtualization in cloud computing provides services to the cloud clients. It is a new business paradigm whose basic concepts are multi-tenancy, virtualization and shared infrastructure. In IT terms it covers the configuring, manipulating and accessing of online applications, offering online storage, infrastructure and applications. It is based on computing resources, a combination of software and hardware, delivered as a network service [81]. Deployment models and service models are the two working models of cloud computing. The service model, also called the reference model, is divided into the IaaS, PaaS and SaaS service clouds.
Motivation
Startups have several reasons to move from intranet-based models that depend on the capital purchase of IT infrastructure to cloud-based, utility-style services delivered on demand [10]. Cloud computing handles shifting computing requirements by giving the organization greater flexibility in the services it consumes, and a cloud-based infrastructure is much more versatile from both a service and a security perspective.
Scalability in cloud computing provides greater flexibility, allowing organizations to manage shifting computing requirements through the services they have purchased; this is far more flexible than a local, intranet-based infrastructure [14]. Reliability improves because cloud vendors can build greater redundancy into their systems than an organization could build into its own intranet, and the vendor spreads the infrastructure investment cost across its entire customer base while allocating the necessary resources [59]. Virtualization means the cloud-based IT infrastructure can be geographically dispersed, so startups are freed from having to treat the physical location of data centers as a factor in business decisions. Affordability is rarely achieved with traditional infrastructure, whereas the marginal cost of cloud computing features such as enhanced security, and many other otherwise unaffordable services, is low; some are even offered for free to startups using cloud computing options [95 p.169].
Several security options are provided, and some are still required, in cloud services: hiding a user's data from other users (individuals or organizations) of the same cloud service, securing the data from the cloud provider itself, securing and hiding computation distributed across several servers, and securing computation between unauthorized and unreliable parties [3].
Security Considerations for Cloud Computing
Research Aim and Objectives
This report discusses the IaaS approach to migrating an organization's database to the cloud. The advantages and disadvantages of each approach are discussed, together with the reasons why this approach was selected. It then discusses why optimizing the allocation of compute resources under a dynamic processing workload deserves investigation, and finally examines the MapReduce programming model that supports distributed computing on large data sets across clusters of computing resources.
Research Questions
- How can allocated computing resources be optimized when the processing workload is dynamic?
- Which infrastructure can be used to scale computing resources?
- How can sentiment analysis of Twitter data be performed effectively using MapReduce programming?
- Does the MapReduce programming model support distributed computing on large data sets at the cluster level?
Research Methodology
Cloud computing has been treated as a vital topic because of its innovative implementation and approach. The most important point about cloud computing is that it redefines computing in a way that can be embedded in almost anything people do or want to do, and there is consistent discussion of whether it represents a paradigm shift. The main focus of this chapter is the specific procedure and framework used in the research work [46]. Once the security-related aspects are resolved, cloud computing can enable an enterprise to expand. Expansion can take several forms, such as growing its own infrastructure or outsourcing the whole infrastructure, resulting in better flexibility, a wider choice of computing resources and significant cost savings. The emerging taxonomy of cloud computing includes infrastructure as a service (IaaS), platform as a service (PaaS) and software as a service (SaaS) [9]. Considerations such as data privacy, network performance, security and economics are likely to be contributing factors both within and outside the organization. Cloud computing has gained acceptance and popularity, and many concepts can be related to it.
Research Outlines
The two types of research approach are inductive and deductive. Planning and framing appropriate theories is the main function of the inductive approach, which helps in fulfilling the main objectives of the research. In the deductive approach, by contrast, research is generally conducted to support existing theory, so the research theories are based on previously studied theories and models [6].
First the researcher covers the research subject and then aims at planning and framing an exact model and theory related to previous studies. The main aim of an exploratory research design is to evaluate the different contexts of the area in which the research is to be done. The researcher adopts the inductive approach and then divides the research work into various parts. Data, defined here as sets of information, can be considered useful and relevant to the research; under the inductive approach its effect extends over a longer time period [28]. How this collected information can be utilized in the particular study is therefore one of the most significant aspects of conducting any piece of research.
IaaS Approach to Migrating Databases to the Cloud
In the deductive approach, comparisons are detected and evaluated against existing research theory, and new models or theories are framed accordingly [16]. It has been stated that using an exploratory design does not guarantee that conclusive solutions will be reached at the end of a particular piece of research.
Conclusion
In this chapter we have discussed the basic concepts of cloud computing and virtualization, along with the need for these cloud services. Virtualization in cloud computing provides services to the cloud clients; it is a new business paradigm based on the concepts of multi-tenancy, virtualization and shared infrastructure. The chapter has also covered the motivation for this project, the research aim and objectives, the research questions and the methodology used to carry the research forward.
IaaS Software
Definition of the IaaS approach: Infrastructure as a Service (IaaS) is the type of cloud computing that provides virtualized computing resources over the internet. It follows a self-service model through which users can access, monitor and manage remote data-center infrastructure, such as compute, storage, virtualized networks and networking services like firewalls [68]. Customers buy IaaS capacity on the basis of consumption instead of purchasing hardware outright.
Compared with other approaches, IaaS users are responsible for managing the data, applications, middleware, runtime and operating system themselves. The platform offers databases, query management, servers, hard drives and networking on top of the virtualization layer, and users can install whatever platform they require on top of the IaaS infrastructure [8]. When new versions become available, these users are responsible for updating their systems.
IaaS providers' software can be combined or used on its own to provide services to organizations and individuals worldwide. Some examples of IaaS software are:
- Microsoft Azure: A public cloud service providing both platform and infrastructure. It offers a wide array of integrated web services that help in building and maintaining solutions for businesses of any type and size [63 p.141]. Microsoft Azure combines both IaaS and PaaS services.
- Google Cloud Platform: A cloud-hosted platform that enables users to use predefined web services and tools to create web-based solutions for business needs. The platform provides the flexibility and scalability of IaaS, supporting both Google-offered web-service building blocks and third-party software services.
- IBM SoftLayer: This cloud infrastructure provides hardware-based servers, virtual servers and storage options, delivered over public, private and management networks. It offers solutions for e-commerce, big data, government, digital marketing, gaming, private clouds and reseller hosting.
- CloudSigma: A cloud hosting service that hosts servers in the cloud and helps construct a hybrid or fully cloud-based infrastructure. It helps increase availability and performance without sacrificing visibility, and migrates existing systems to the cloud while preserving visibility of the system and protecting data privacy and security with proper data-protection standards.
- Oracle Cloud Infrastructure: Businesses use this cloud service to run workloads, replicate networks, set up VPNs, back up data offsite and much more.
- Amazon Web Services: AWS is an IaaS-based platform that organizations can use to create web-based solutions. It provides a wide variety of individual services that can also be combined to cover business needs such as content delivery, database management, storage, networking, mobile services, analytics and more. AWS lets businesses construct the solutions they need at the IaaS level while leaving the physical management of the infrastructure to Amazon, and it provides a marketplace where third-party providers offer AWS-integrated solutions [47].
There are several other providers of IaaS services, such as Rackspace, VMware vCloud Air, Verizon Enterprise, SingleHop, ServerCentral, Citrix Workspace Cloud, IronOrbit, CDI (Computer Design and Integration), Virtustream, Interoute IaaS and Aviatrix [96 p.64].
Amazon Services
As discussed in the previous section, Amazon Web Services (AWS) is a cloud-based platform for constructing business solutions from integrated web services [5]. It provides the hardware infrastructure that enables a business to grow. Beyond core web services, AWS also provides platform services such as email, mobile development, analytics, calendaring and application testing, as well as management tools, developer tools, security services, identity protection and other application services that increase the productivity and efficiency of business development. A dedicated network connection can also be established between AWS services and the business [77]. Through its Internet of Things (IoT) services, AWS uses chip-embedded objects to integrate the business. AWS is a global provider whose data centers, compliance and security programs are designed so that its services can meet the regional and global compliance standards of governments and industries. AWS can provide a Virtual Private Cloud (VPC), which logically isolates a section of AWS, so one can have complete control over the environment. It also supports hybrid environments in which some system resources remain on premises.
Optimizing Compute Resources in the Cloud
There are several benefits of AWS, as the platform provides far more than IaaS offerings alone.
Access Credentials
AWS-specific security credentials verify the identity of an individual or group and whether they have permission to access the resources they request. In some cases one can make calls to AWS without security credentials, such as downloading a file that has been publicly shared in an Amazon S3 bucket [79]. There are also temporary security credentials, which are created and provided to trusted users so that they can access AWS resources. These work almost identically to long-term access keys, with a few differences. First, temporary security credentials are short-term: they are valid only for a limited amount of time, and after a credential expires AWS no longer recognizes it or allows any access from API requests made with it [41]. Second, the credentials are not stored with the user; they are generated dynamically and provided on request, and the user must request new credentials when the old ones expire.
There are several advantages to using these temporary credentials (a minimal sketch of requesting them follows the list):
- One does not need to embed or distribute long-term AWS security credentials with an application.
- Users can be granted access to AWS resources without needing a permanent AWS identity; the temporary credentials are issued on the basis of roles and identity federation.
- The credentials have a limited lifetime, so they do not need to be rotated or explicitly revoked when no longer needed. After a credential expires it can no longer be used, and the validity period can be specified [93].
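The following is a minimal sketch, using the AWS SDK for Java, of how an application might request such temporary credentials from the AWS Security Token Service. The one-hour duration and the use of the default credential chain are illustrative assumptions, not settings taken from this project.

```java
import com.amazonaws.services.securitytoken.AWSSecurityTokenService;
import com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClientBuilder;
import com.amazonaws.services.securitytoken.model.Credentials;
import com.amazonaws.services.securitytoken.model.GetSessionTokenRequest;
import com.amazonaws.services.securitytoken.model.GetSessionTokenResult;

public class TemporaryCredentialsExample {
    public static void main(String[] args) {
        // Build an STS client using the default credential chain (the long-term keys).
        AWSSecurityTokenService sts = AWSSecurityTokenServiceClientBuilder.defaultClient();

        // Request short-lived credentials valid for one hour (3600 seconds, an assumed value).
        GetSessionTokenRequest request = new GetSessionTokenRequest().withDurationSeconds(3600);
        GetSessionTokenResult result = sts.getSessionToken(request);
        Credentials temp = result.getCredentials();

        // The temporary access key, secret key and session token expire automatically.
        System.out.println("Temporary access key: " + temp.getAccessKeyId());
        System.out.println("Expires at: " + temp.getExpiration());
    }
}
```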
MapReduce and Apache Hadoop
MapReduce is a framework for writing applications that process huge amounts of data across large clusters of commodity hardware in a reliable manner [29].
Apache Hadoop is an open-source software framework used to store and process big-data datasets with the help of the MapReduce programming model. It runs on clusters of computers built from commodity hardware.
MapReduce
MapReduce is a processing technique and programming model for distributed, Java-based computing. The algorithm contains two major tasks, Map and Reduce [21]. The Map task takes a data set and converts it into another set of data in which individual elements are broken into tuples, referred to as key/value pairs. The Reduce task takes the output of the Map task as its input and combines those tuples into a smaller set of data tuples. The main advantage of MapReduce is that it makes data processing easy to scale over multiple computing nodes [7]. A MapReduce job has three stages: the Map stage, the Shuffle stage and the Reduce stage. Scaling overhead is likewise lower in these cluster designs, and scaling up is more complex to execute than scaling out because of the switching of node types. The literature also presents MLscale, an application-agnostic autoscaler that requires very little manual tuning and knowledge. A neural network is used to build performance models of the online application, and black-box modeling components predict the impact of scaling actions on the state of the system. The architecture is implemented on Amazon AWS, utilizing a number of resources. This reduces the scaling speed and draws out the reconfiguration overhead. Mixed scaling is the most troublesome to execute, but it offers the best flexibility for matching workload changes.
MapReduce Programming Model for Distributed Computing
Apache Hadoop
As described, Apache Hadoop is an open-source software library that allows the distributed processing of large data sets across clusters of computers using simple programming models [19]. It is designed to scale from a single server to many machines, each offering local storage and computation.
Apache Hadoop by Example
Apache Hadoop consists of two core components:
- The Hadoop Distributed File System (HDFS).
- The framework and API for MapReduce jobs [84].
Example: Hadoop word count. This program consists of a MapReduce job that counts the number of occurrences of each word in a file; a standard sketch of the job is shown below.
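A word-count job in the Hadoop MapReduce Java API looks roughly as follows. The class and job names are the conventional ones from the Hadoop documentation rather than anything specific to this project.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map: emit (word, 1) for every token in the input line.
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reduce: sum the counts collected for each word.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);   // combine locally before the shuffle stage
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));    // input file or directory in HDFS
    FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output directory in HDFS
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```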
Figure 1: Basic working of Map and Reduce Tasks in a MapReduce Job
(Source: Mishra et al. 2013)
The literature review is carried out before the thesis research itself begins, in order to build knowledge and skills about the change-management process in organizations. The initial literature review provides the data on which the design of the thesis is based. With the help of this chapter, the researcher can evaluate the concept of the change process in detail. In this literature review we begin our discussion of the topic of virtualization and cloud computing.
Background data on Cloud Computing
The cloud computing trend has introduced various data and information platforms into the cloud marketplace. Along with advantages, there are several business risks and challenges involved in realizing the business benefits. Cloud providers create cloud-based solutions for users that reduce cost and improve the scalability, flexibility and effectiveness of the business [78]. The main idea behind IaaS, SaaS and PaaS is to build, buy and deploy the business infrastructure. Very large sets of data and information cannot be handled by traditional data-processing application software, since such platforms lack the capability to manage big data. Improving storage capability and acquiring resources on demand from other devices is what is generally known as cloud computing. It also refers to accessing, configuring and manipulating online applications over the internet and the network. Cloud computing distributes computing on the internet and delivers computer services over the internet: it covers the configuring, manipulating and accessing of online applications, offering online storage, infrastructure and applications. It is a combination of software- and hardware-based computing resources delivered as a network service. Deployment models and service models are the two working models of cloud computing; the service model, also called the reference model, is divided into the IaaS, PaaS and SaaS service clouds.
Inductive and Deductive Research Approaches
Scalability is the capability of a network, system or process to handle a growing amount of work and to accommodate that growth in the workflow. A system or process is scalable if it can deliver full output under increased load when hardware is added [71]. The middleware infrastructure of a cloud provider typically supports ACID transactions, which makes it easier to distribute transactions. The term is associated with four main properties: Atomicity, Consistency, Isolation and Durability. Atomicity, consistency and durability are strong goals in transaction-processing systems, whereas isolation is often relaxed because of the potential impact of strict isolation on performance.
The scalability of cloud computing depends on sharing computing resources rather than acquiring a personal device or a local server to handle applications [67]. Most firms try to use cloud services so that they can focus on their core competencies rather than worrying about infrastructure. Google, for example, offers the Google Cloud platform, a cloud computing service that runs on the infrastructure behind Google's own end products, which include YouTube and other Google services [54]. Google also provides management tools and modular cloud services such as data analytics, data storage and machine learning; this is another example of a cloud-based service platform. Cloud scalability brings several benefits: security of the business system under a full cloud service, reduced cost of implementing hardware servers, and standardized hardware and software maintained by the hosting company across the business. Along with the benefits there are drawbacks: a third party usually hosts the business information, a good and consistent internet connection is not always available within organizations, and monthly costs fluctuate as hardware usage changes. Because it is important to process large volumes of data, some recent studies have examined the use of Hadoop [65]. What was barely explored initially is how, when processing big data, such large data sets should be matched to computing resources that can efficiently handle a dynamic processing workload. This matters because traditional computing infrastructure cannot support the rapid growth of data volume. Some related work, such as Amazon services, MapReduce and Apache Hadoop, is discussed briefly in the next section.
Motivation, Aim, and Objectives of the Research
Amazon Simple Storage Service
An example of computing scalability on a cloud platform is Amazon S3. Cloud-storage services that provide a simple system interface reduce the complexity of traditional direct hardware management. Such services eliminate direct oversight and enable users with high service-level requirements. Several works in the literature have addressed security-related issues such as integrity, availability and privacy, but few have focused on the network performance of this kind of service [42]. Analyses of the network behavior of the storage service offered by Amazon S3 show that the quality of service experienced by users depends on their location and on the configuration of the service, with considerable variation in performance across customers and across the cloud regions they depend on. As the number of organizations and users depending on cloud-based infrastructure increases, pressure grows on the development of the services and applications provided [52].
Cloud storage, often called storage as a service, has become a popular class of online service for backing up and archiving a variety of data. The paper 'On the Network Performance of Amazon S3 Cloud-storage Service' [67] aimed to improve knowledge of the cloud-to-user network performance experienced when depending on the Amazon Simple Storage Service (S3) [71]. Amazon provides general-purpose storage as a service through S3, where customer data is organized as objects stored in buckets. A bucket is placed in a cloud (storage) region, and the cost to customers also depends on the region under the pay-as-you-go model. CloudFront (CF) is the global content-delivery service offered by Amazon; it integrates with S3 in order to distribute content with lower latency and higher speed to end users. Data is distributed to users through a global network of Amazon edge locations spread all over the world, and CloudFront is leveraged in combination with S3. The study followed a non-cooperative approach. This example of cloud scalability makes two contributions: (i) a general assessment of Amazon S3; and (ii) an examination of the factors that affect general users, comprising the cloud region, the user's location and the size of the stored file. Cloud storage offers a convenient and fast way to archive and share objects of various kinds, and a high-level management interface is used to hide the implementation details and performance figures of the system [50]. The case study concerns the cloud-to-user network performance of the Amazon S3 cloud-storage service, using a dataset obtained from the BISmark platform to assess service performance.
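As an illustration of the object-and-bucket model described above, the sketch below uses the AWS SDK for Java to upload and retrieve one object. The bucket name, key, file and region are placeholders, not values from the cited study.

```java
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.S3Object;
import java.io.File;

public class S3Example {
    public static void main(String[] args) {
        // Placeholder bucket and key; the bucket's region affects both latency and cost.
        String bucket = "example-bucket";
        String key = "reports/sample.txt";

        AmazonS3 s3 = AmazonS3ClientBuilder.standard()
                .withRegion("eu-west-1")   // choosing a region close to the users improves cloud-to-user performance
                .build();

        // Upload an object to the bucket (pay-as-you-go storage).
        s3.putObject(bucket, key, new File("sample.txt"));

        // Download the same object back and inspect its metadata.
        S3Object object = s3.getObject(bucket, key);
        System.out.println("Content type: " + object.getObjectMetadata().getContentType());
    }
}
```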
Definition of IaaS Approach
Big-data analysis using Apache Hadoop suggests various methods for the problems addressed through the MapReduce framework over the Hadoop Distributed File System (HDFS). MapReduce uses minimization techniques for file indexing along with mapping, sorting, shuffling and reducing the files. Big data is categorized by volume, variety and velocity, since most of the data present is unstructured or semi-structured. Traditional data-management and warehousing systems have several shortcomings when analyzing such large amounts of data [78]. Apache Hadoop and HDFS are widely used for storing and managing big data. Hadoop runs on commodity hardware and uses the HDFS storage architecture, which is designed to store very large files as sequences of blocks. The cluster follows a strict write-once model, with decisions about the replication of the available blocks made centrally and carried out by the DataNodes [50]. The HDFS cluster is implemented with the Hadoop architecture, distributing tasks over two parallel clusters with a single master server and two separate slave nodes.
Figure 2: HDFS Architecture
(Source: Mahmoud and Abdulabbas 2017)
For dynamic processing workloads, an auto-scaling framework has been offered that automatically scales the cloud computing resources of a Hadoop cluster, aiming to improve the efficiency and performance of big geospatial data processing [54]. The framework follows a predictive scale-up algorithm that considers the proposed scale-up time and accurately determines the number of slave nodes to add, based on monitored white-box metrics. The efficiency and feasibility of the scaling framework were demonstrated on a prototype cloud.
MapReduce Functions effort-based
The case study 'Multiple MapReduce Functions for Health Care Monitoring in a Smart Environment' [3] uses a health-care monitoring system to address problems related to aging. The analysis was done to improve the lifestyle of elderly people by helping them stay in their own homes rather than in hospitals and other care houses, thereby decreasing the cost of medical care for governments and families [54]. The health data is collected from a Wireless Sensor Network (WSN), which is typically used in medical rehabilitation, to obtain accurate readings. Manvi and Shyam (2014) introduced statistical techniques such as simple linear regression based on the MapReduce algorithm. Two case studies were investigated, aiming to detect influential outliers and anomalies in data extracted from the WSN, as described below.
Examples of IaaS Software
Case 1: An elderly lady living in a smart home, where data was collected from the real environment. The data analysis examined behavioral change in the lady by modeling the data with a simple linear regression model. The study predicts the start time and duration of the person's stay at significant locations within the house, and it assesses the quality of the model fit by comparing the distance-index values that exceed their threshold with the actual data.
Case 2: The MapReduce algorithm is used to efficiently process, in parallel, a large dataset collected from different smart environments, following the case study of the lady.
Motivation of the Project
Since it is difficult to store and process a large amount of data and to manage processors and distribution in such an environment, MapReduce provides a lean solution by supporting distributed and parallel I/O scheduling that is fault tolerant and supports the scalability of the cloud and of large big-data datasets [55]. Big-data analysis tools such as MapReduce over HDFS and Hadoop give organizations a better understanding of their customers and the marketplace.
The case study introduces a statistical technique, simple linear regression based on the MapReduce algorithm, in a simple smart home. The main objective was to detect influential outliers and anomalies in the data extracted from the WSN. The MapReduce algorithm is used to process in parallel the large dataset collected from the WSN, as it reduces the cost of processing such a large volume of data [54]. The detection of influential outliers was performed to find abnormally long duration times.
Conclusion
This chapter introduced a variety of cloud-based applications and services. Hadoop is an open-source software framework that can store and process big data. Cloud computing increases productivity by managing and increasing the utilization of resources at lower cost, with appropriate delivery of resources that enables organizations to access data correctly [56]. The main purposes of cloud computing are thus to minimize energy cost, maximize effectiveness, make data widely accessible to the public and maximize the capability to store data. There are still significant issues in exploiting the capabilities and advantages of the cloud: the technology and business models have not yet reached maturity, and standards are still being developed. The approaches bring several advantages and disadvantages to business and IT infrastructures. The case studies related to computing scalability follow two approaches. The first is scale-up, preferred in the classical enterprise setting of relational database strategies, supporting ACID (Atomicity, Consistency, Isolation, Durability) transactions [75] and accessing a single node. The second is scale-out, suited to a cloud-friendly environment, executing beyond a single server but typically not allowing multi-row or multi-statement transactions [89]. A scalable and fault-tolerant architecture enables hosted internet applications to draw on virtually limitless on-demand resources, so that they can cope with unpredicted spikes in workload. In the current cloud environment, SaaS and PaaS providers take responsibility for scalability, whereas with IaaS the scalability parameters that strongly influence application performance must be defined by the customers. Data always needs to be scaled, and the main aspects that need to scale are memory, networks, storage, I/O latency, backups, provisioning time and more.
Benefits of AWS
The main aim of the project is to implement a dynamic MapReduce cluster using Hadoop that scales itself on the basis of the cluster load. Both private and public clouds can be used to run the cluster, and the cluster can be monitored and debugged in real time to evaluate the autoscaling strategies [94]. Amazon Web Services (AWS) is used as the public cloud provider, and Eucalyptus, the private cloud software used to run Scientific Computing in the Cloud (generally called SciCloud), provides the private cloud.
The Apache Hadoop project was chosen as the MapReduce implementation. It is open source and platform agnostic, and a large amount of documentation is available for configuring, installing, debugging and running a Hadoop installation [61]. Provisioning a larger number of machines is also easier in AWS than in the SciCloud infrastructure. As a result, the AWS setup is of production quality, while the SciCloud setup remains experimental, since the SciCloud Eucalyptus installation is still in an active beta phase. With production-level AMIs on IaaS it is easier to fine-tune the system.
Autoscalable AMIs
In this context AMI stands for Amazon Machine Image, not the advanced metering infrastructure used in utility metering. An AMI is a pre-configured template, comprising the operating system, software and configuration, from which new virtual machine instances are launched in AWS or Eucalyptus [22]. Building custom AMIs allows identical Hadoop nodes to be booted on demand.
The implementation was done on Amazon using EC2, EMR, CloudWatch, S3, Ganglia, Hue and HDFS [30], and the project includes comparison graphs. Two AMI files are required to run the Hadoop cluster, each with a specific role: one acts as the Hadoop master and the other as the Hadoop slave. At start-up only the Hadoop master is needed, and it has to be started once as a single booted instance [43]. No human intervention is required to run the Hadoop cluster or to monitor its status; only the first master instance needs to be started manually. The Control Panel runs as an instance on a separate machine and is accessible to the cluster only via HTTP [58]. It is important for the Hadoop cluster to be dynamic: when a new Hadoop slave node boots, it joins the running cluster by itself. The Control Panel is defined as a web service because of the instances it runs continuously. In this architecture each instance starts its own Hadoop daemons during boot (slaves start the slave daemons, the master starts the master daemons) and informs the Control Panel of its status [17]. Instances need ample disk space at their disposal, and the AMIs are expected to be large enough for the Hadoop data, although this slows down the booting of an AMI significantly [31]. AMI storage therefore needs to be external. The AMIs do not depend on any specific Eucalyptus or AWS user: the user's credentials only have to be set up in the Control Panel when configuring the cluster. In the future the AMIs should require no human intervention to boot or terminate, including creating or joining the Hadoop cluster. The whole AMI start-up process is managed, and its success monitored, from the Control Panel [57].
Processing large volumes of data is crucial; the massive data volume brings several challenges owing to the intrinsic complexity and high dimensionality of the data sets [32]. Traditional computing infrastructures do not always scale well with the rapid increase in data volume. Some recent studies have investigated the adoption of Hadoop to process big data, but how the computing resources should be adjusted to efficiently handle a dynamic processing workload was barely explored in the past [23].
To bridge this gap, a Hadoop cluster was created that scales automatically in the cloud environment, allocating the exact amount of computing resources required by the dynamic processing workload. University datasets taken from Twitter were used to process the data, and sentiment analysis is also performed using MapReduce programming.
Step 1: Created a cluster on Amazon AWS to process the large volume of data.
Cloud computing offers a potential solution to the challenge of automatically sizing the cluster, since a virtual cluster can be provisioned within a few minutes. Cloud providers typically make Hadoop clusters available as a web service, allowing users to provision and terminate a Hadoop cluster through a web-based interface [12]. For example, Amazon Elastic MapReduce (EMR) can be configured to auto-scale the cluster; a sketch of creating such a cluster programmatically is shown below.
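A possible way to create a small EMR cluster programmatically with the AWS SDK for Java is sketched below. The cluster name, EMR release label, instance types and IAM role names are assumptions for illustration, not values taken from the project.

```java
import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduce;
import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClientBuilder;
import com.amazonaws.services.elasticmapreduce.model.Application;
import com.amazonaws.services.elasticmapreduce.model.JobFlowInstancesConfig;
import com.amazonaws.services.elasticmapreduce.model.RunJobFlowRequest;
import com.amazonaws.services.elasticmapreduce.model.RunJobFlowResult;

public class CreateEmrCluster {
    public static void main(String[] args) {
        AmazonElasticMapReduce emr = AmazonElasticMapReduceClientBuilder.defaultClient();

        // One master and two core (slave) instances, matching the three-instance cluster described in Step 2.
        JobFlowInstancesConfig instances = new JobFlowInstancesConfig()
                .withInstanceCount(3)
                .withMasterInstanceType("m4.large")          // assumed instance type
                .withSlaveInstanceType("m4.large")
                .withKeepJobFlowAliveWhenNoSteps(true);      // keep the cluster running after jobs finish

        RunJobFlowRequest request = new RunJobFlowRequest()
                .withName("hadoop-autoscaling-demo")         // hypothetical cluster name
                .withReleaseLabel("emr-5.20.0")               // example EMR release
                .withApplications(new Application().withName("Hadoop"))
                .withServiceRole("EMR_DefaultRole")           // default EMR IAM roles assumed to exist
                .withJobFlowRole("EMR_EC2_DefaultRole")
                .withInstances(instances);

        RunJobFlowResult result = emr.runJobFlow(request);
        System.out.println("Started cluster: " + result.getJobFlowId());
    }
}
```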
Step 2: Created a cluster with three instances, which ran successfully.
It is important to recognize that optimizing the allocated computing resources in the face of a dynamic processing workload deserves investigation. A threshold-based scaling approach is proposed that automatically adds more nodes on the basis of the cluster's black-box performance metrics, namely CPU and RAM usage, compared against predefined threshold values [80]. When the cluster's load average exceeds a threshold, the scaler provisions new virtual machines and automatically adds them to the cluster [11, p.11].
Step 3: Setting up the threshold values for auto-scaling the cluster.
It is important to set up the values for the nodes, including the instance type, instance count, auto-scaling settings, purchasing options and so on; a minimal sketch of such a threshold rule follows.
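The following is a minimal, hypothetical sketch of the threshold rule described above. The metric names, threshold values and the policy of adding or removing one node at a time are illustrative assumptions rather than the project's actual configuration.

```java
public class ThresholdScaler {
    // Illustrative thresholds; the real values would come from the Step 3 configuration.
    private static final double CPU_SCALE_UP = 80.0;   // percent
    private static final double RAM_SCALE_UP = 75.0;   // percent
    private static final double CPU_SCALE_DOWN = 20.0; // percent

    /** Returns the desired cluster size, given the current black-box metrics and size bounds. */
    public static int decide(double avgCpu, double avgRam, int nodes, int minNodes, int maxNodes) {
        if ((avgCpu > CPU_SCALE_UP || avgRam > RAM_SCALE_UP) && nodes < maxNodes) {
            return nodes + 1;   // add one slave node to the cluster
        }
        if (avgCpu < CPU_SCALE_DOWN && avgRam < RAM_SCALE_UP && nodes > minNodes) {
            return nodes - 1;   // release one idle slave node
        }
        return nodes;           // keep the current cluster size
    }

    public static void main(String[] args) {
        // Example: CPU at 85% pushes a 3-node cluster (bounds 2..10) up to 4 nodes.
        System.out.println(decide(85.0, 40.0, 3, 2, 10));
    }
}
```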
Meeting the Requirements
Trigger-based scaling up depends on the current workload [66]. The difficulty the cluster faces is that scaling up is problematic because provisioning a new virtual machine takes time (from minutes to hours depending on the cloud platform). For example, it may take almost 10 minutes to provision a virtual machine on Amazon's Elastic Compute Cloud (Amazon EC2) and add it to a cluster, and the workload may change during that period so that the additional VMs are no longer needed [87]. For scaling down, Romer suggested manually terminating individual nodes. A further objective is to produce commonly accepted cloud benchmarks and testing procedures. Provisioning is stable under MLscale, highlighting its benefits, and the use of MLscale for performance-oriented autoscaling has been discussed. Relevant metrics such as CPI (cycles per instruction) can be included as inputs to the neural-network model to detect interference; CPI improves modeling accuracy by 6.7% under CPU oversubscription in an OpenStack setup. MLscale can also be modified for long-term changes in the workload, with the prediction components kept online and retrained in response to an increase in modeling errors. These undertakings extend naturally from the cloud execution models being proposed.
It has been shown that by using autoscaling of clusters, Hadoop jobs can be run quickly and the load and processing can be shared rapidly between resources, whereas manually creating Hadoop clusters would take more time [26]. In this way computing scalability is exploited in Hadoop jobs.
The Master and Slave AMIs
When there was no workload on the cluster, the three-node cluster was scaled down to a two-node cluster. Linux was chosen as the operating system for the project [90]. For the auto-scaling structure, the cluster monitor continuously gathers real-time workload data from the Hadoop cluster using the Hadoop API (Application Programming Interface), including running/pending jobs, running map/reduce tasks and pending map/reduce tasks. The images on AWS were built using versions of Debian and Eucalyptus [36]. The latest stable Hadoop release was used for debugging, and Java HotSpot 1.6.0_20 was used to run Hadoop. Initially there are two AMIs each for AWS and Eucalyptus: the Hadoop master image and the slave image, the difference between them being the set of Hadoop services each runs [61].
Dynamic Scaling of a Hadoop Cluster
There are two kinds of slave configuration in Hadoop. One specifies all the slave IPs in a file on the master instance, providing a single location for the addition or removal of slaves. This helps in controlling the life cycle of the slave daemons in a central place [65]: a single command on the master server can start or stop the slaves of the cluster.
Figure 3: Startup of Master and Slave Instance
(Source: Foster et al. 2008)
Ample Disk Space
The AMIs need to be kept as small as possible, so external storage is used to satisfy the data-storage needs; EBS volumes are used for this purpose. The Control Panel creates, attaches and manages the EBS volumes, and the instances communicate with it via web services to request a volume [12]. When an instance boots, it notifies the Control Panel that it needs external storage. The Control Panel creates the volume and then attaches it to the instance that requested it, and the instance keeps polling the Control Panel until it is notified that the volume has been successfully created and attached [40]. The instance then formats the attached disk with a file system, mounts it, creates the necessary folder structure for Hadoop and starts the Hadoop services. Because of the size limit on AMI files, this approach was used only in the SciCloud; AWS uses 20 GB EBS-backed AMI files and does not require any external storage.
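A sketch of the create-and-attach step the Control Panel performs, using the AWS SDK for Java, is shown below. The availability zone, instance ID and device name are placeholders, and a production version would wait for the volume to reach the 'available' state before attaching it.

```java
import com.amazonaws.services.ec2.AmazonEC2;
import com.amazonaws.services.ec2.AmazonEC2ClientBuilder;
import com.amazonaws.services.ec2.model.AttachVolumeRequest;
import com.amazonaws.services.ec2.model.CreateVolumeRequest;
import com.amazonaws.services.ec2.model.CreateVolumeResult;

public class AttachEbsVolume {
    public static void main(String[] args) {
        AmazonEC2 ec2 = AmazonEC2ClientBuilder.defaultClient();

        // Create a 20 GB volume in the same availability zone as the requesting instance (placeholder zone).
        CreateVolumeResult created = ec2.createVolume(new CreateVolumeRequest()
                .withSize(20)
                .withAvailabilityZone("us-east-1a"));
        String volumeId = created.getVolume().getVolumeId();

        // In practice: poll describeVolumes until the volume state is 'available' before attaching.

        // Attach the new volume to the instance that asked for external storage (hypothetical instance id).
        ec2.attachVolume(new AttachVolumeRequest()
                .withVolumeId(volumeId)
                .withInstanceId("i-0123456789abcdef0")
                .withDevice("/dev/sdf"));
    }
}
```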
Hadoop Load Monitoring
The load and the size of a Hadoop cluster can be very dynamic in practice: as jobs run, the load on the nodes changes [1]. Following the actions of the autoscaling framework, the cluster changes size, so both the size and the performance of the cluster have to be monitored. The nodes send a heartbeat to the Control Panel every minute, giving insight into the slaves of the cluster and the IP of the current master (Al 2013). The heartbeat and the system performance metrics are gathered separately, so that one system can be replaced without affecting the other. Once the metrics have been collected, if they are not sufficient on their own, the data needs to be visualized so that historical data can be stored, comparisons run and the cluster load monitored in real time.
Heartbeat
A heartbeat is the signal used between a DataNode and the NameNode, or between a TaskTracker and the JobTracker; if the NameNode or JobTracker stops receiving the signal, the corresponding DataNode or TaskTracker is considered to have a problem. Here the NameNode is the master and the heart of Hadoop: it maintains the Hadoop namespace and stores the metadata of the data blocks, while the data itself is stored permanently on the local disks, which also keeps the NameNode's disk usage small [4]. The DataNodes are the slaves, used for the actual storage, and their work is entirely based on the instructions of the NameNode.
The nodes are set up to run wget against the Control Panel every minute via cron. The IP of the node and a Boolean flag indicating whether it is the master are embedded in the request as GET parameters [3]. At the Control Panel the request is logged and the IP is entered into a database together with the master flag.
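A minimal Java equivalent of that wget call is sketched below; the Control Panel URL and the system property used to mark the master are assumptions, and in the described setup the call is simply scheduled by cron.

```java
import java.net.HttpURLConnection;
import java.net.InetAddress;
import java.net.URL;

public class Heartbeat {
    public static void main(String[] args) throws Exception {
        // Hypothetical Control Panel endpoint; cron would invoke this once per minute.
        String controlPanel = "http://control-panel.example.org/heartbeat";
        String nodeIp = InetAddress.getLocalHost().getHostAddress();
        boolean isMaster = Boolean.parseBoolean(System.getProperty("hadoop.master", "false"));

        // Report this node's IP and master flag as GET parameters.
        URL url = new URL(controlPanel + "?ip=" + nodeIp + "&master=" + isMaster);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        System.out.println("Heartbeat response: " + conn.getResponseCode());
        conn.disconnect();
    }
}
```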
System Metrics
Metrics are statistical information exposed by the Hadoop daemons. They are used for performance tuning, monitoring and debugging. A number of metrics are available by default, and they are quite vital for troubleshooting.
In this thesis the metrics are used for identifying application hangs and resource hogging, and for localizing faults to a subset of the slave nodes within the Hadoop cluster. For example, the cluster was scaled up to 4 nodes based on the data, and up to 5 nodes based on the workload [8].
Ganglia
Ganglia is an open-source software tool for high-performance monitoring of systems such as grids and clusters.
It comprises gmond, gmetad and a web front end. Gmond is the multi-threaded daemon that runs on all nodes and reports data back to gmetad. Gmetad collects the metrics, archives them and produces reports [13]. It was used initially for graphing and data logging. The data is visualized by the web front end bundled with Ganglia. This monitoring framework offers a complete solution for archiving, monitoring and visualizing system metrics, providing the monitors without any special coding.
Amazon CloudWatch
In this project the auto-scaling framework is driven by the cluster monitor, which continuously collects real-time workload information from the Hadoop cluster using the Hadoop API (Application Programming Interface), including running/pending jobs, running map/reduce tasks and pending map/reduce tasks. An example is setting up alarms for notifications; Amazon offers CloudWatch services for monitoring the performance of EC2 instances as well [15]. A sketch of creating such an alarm is given below.
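As an illustration of such an alarm, the sketch below uses the AWS SDK for Java to create a CloudWatch alarm on the CPU utilization of a single EC2 instance. The alarm name, instance ID, threshold and periods are illustrative, not the project's actual settings.

```java
import com.amazonaws.services.cloudwatch.AmazonCloudWatch;
import com.amazonaws.services.cloudwatch.AmazonCloudWatchClientBuilder;
import com.amazonaws.services.cloudwatch.model.ComparisonOperator;
import com.amazonaws.services.cloudwatch.model.Dimension;
import com.amazonaws.services.cloudwatch.model.PutMetricAlarmRequest;
import com.amazonaws.services.cloudwatch.model.StandardUnit;
import com.amazonaws.services.cloudwatch.model.Statistic;

public class CpuAlarmExample {
    public static void main(String[] args) {
        AmazonCloudWatch cloudWatch = AmazonCloudWatchClientBuilder.defaultClient();

        // Alarm when the average CPU of one EC2 instance stays above 80% for two 5-minute periods.
        PutMetricAlarmRequest alarm = new PutMetricAlarmRequest()
                .withAlarmName("hadoop-node-high-cpu")        // hypothetical alarm name
                .withNamespace("AWS/EC2")
                .withMetricName("CPUUtilization")
                .withDimensions(new Dimension().withName("InstanceId").withValue("i-0123456789abcdef0"))
                .withStatistic(Statistic.Average)
                .withUnit(StandardUnit.Percent)
                .withPeriod(300)
                .withEvaluationPeriods(2)
                .withThreshold(80.0)
                .withComparisonOperator(ComparisonOperator.GreaterThanThreshold);

        cloudWatch.putMetricAlarm(alarm);
        System.out.println("Alarm created");
    }
}
```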
Conclusion
Every minute the nodes of the cluster send a heartbeat to the Control Panel. The Control Panel thereby determines and logs the size of the cluster and updates the IP address of the master node. Various performance metrics are also logged from each node every 30 seconds [18], using Python scripts that write them into a database together with the IP of the machine. The data is visualized using a third system, the Ganglia reporting tool. The AWS CloudWatch service was not used because there is no CloudWatch-API-compliant service for Eucalyptus.
To keep the heartbeat and the gathering of performance metrics separate, three services were used instead of one [26]. Ideally we would want to run at most two services, one for the heartbeat and one for gathering performance metrics [38]. The right approach at present seems to be to disable the gathering of custom metrics and rely exclusively on Ganglia. However, we would like to see an AWS CloudWatch API-compliant subsystem for Eucalyptus; until then we consider Ganglia the best instrument, as it can offer one API for both AWS and Eucalyptus for this task [39].
Generating Test Load
Load testing is the process of measuring the response of a computer device or software system when demand is placed on it. It is performed to determine the system's behavior under normal conditions and under the anticipated peak load [45]. To test and observe real-world metrics of the system, MapReduce jobs have to be run to generate load and disk utilization. The HDFS Synthetic Load Generator and the sort benchmark, both bundled with the Apache Hadoop project, were used [49].
HDFS Synthetic Load Generator
The Synthetic Load Generator (SLG) is designed to test the behavior of the NameNode under various workloads. In this research the default settings were used to generate a directory structure and random files [51]. The load generator was then run to collect the system metrics. No CPU load is involved; this test only generates load at the HDFS level, involving only network I/O and disk I/O.
Apache Hadoop Sort Benchmark
The Apache Hadoop sort benchmark comprises two stages. The first is to generate random data; the default setting generates 1 TB of data, which in our case was changed to 5 GB [60]. This way we could run more tests in the same amount of time and did not need to increase the storage size. Generating 5 GB of data on a 4-node cluster takes around 10 minutes [62]. The second step is to sort the data. This benchmark puts a heavy load on the cluster: even on a small cluster (10 Hadoop nodes) the load spiked as high as 20 on some nodes. It also produces realistic data transfer between the nodes, so we could monitor the various system metrics of the nodes [67].
Identifying Relevant Metrics
Through our testing we recorded 15,000 entries from 80 distinct IP addresses in the metrics table. We did not run any statistical analyses on the data, as we were not able to obtain even similar cluster load averages when running the same MapReduce job on the same data. We chose the load average as the metric to continue with in our experiments and in defining our autoscaling strategy [70].
Autoscaling Framework
The AMIs start the Hadoop services on boot, and we also configured the Hadoop installations so that on start-up they join a cluster. On every node we also configured metric gathering so that the load of the cluster can be evaluated from a central location. We now assemble all of these pieces into an autoscaling framework [72].
Load Average Autoscaling
We scaled our cluster on the basis of the load average. The period during which the load average has to exceed the threshold is 10 minutes, and the threshold itself is also 10. In addition, once the autoscaling framework has taken an action we introduce a cool-off period, during which the framework must not make any changes to the size of the cluster. This time is meant for the new nodes to join the cluster and wait to be utilized [73]. A minimal sketch of this decision rule is shown below.
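The sketch below captures the rule just described, with the 10-minute breach duration, the threshold of 10 and the cool-off period. The surrounding code that measures the cluster load average and actually starts new nodes is assumed to exist elsewhere.

```java
import java.time.Duration;
import java.time.Instant;

public class LoadAverageAutoscaler {
    // Threshold of 10 on the cluster load average, sustained for 10 minutes, then a 10-minute cool-off.
    private static final double LOAD_THRESHOLD = 10.0;
    private static final Duration BREACH_DURATION = Duration.ofMinutes(10);
    private static final Duration COOL_OFF = Duration.ofMinutes(10);

    private Instant breachStart = null;
    private Instant lastAction = Instant.MIN;

    /** Returns true when a new slave node should be started; starting it is handled elsewhere. */
    public boolean shouldScaleUp(double clusterLoadAverage, Instant now) {
        if (now.isBefore(lastAction.plus(COOL_OFF))) {
            return false;                      // still in the cool-off window after the last action
        }
        if (clusterLoadAverage < LOAD_THRESHOLD) {
            breachStart = null;                // load dropped below the threshold, reset the timer
            return false;
        }
        if (breachStart == null) {
            breachStart = now;                 // threshold crossed, start counting the breach duration
        }
        if (Duration.between(breachStart, now).compareTo(BREACH_DURATION) >= 0) {
            lastAction = now;                  // sustained breach: scale up and begin the cool-off
            breachStart = null;
            return true;
        }
        return false;
    }
}
```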
Amazon AutoScaling
Setting up autoscaling with Amazon AutoScaling is reasonably simple. We define an AutoScaling group with our custom Hadoop AMIs and the hardware selection (in our case the Small Instance), then create a trigger on CPU utilization above 1000% with a period of 1 minute and a breach duration of 10 minutes, and also define the minimum and maximum number of instances we will use [76]. Amazon AutoScaling then watches the CPU utilization of the whole group every minute; when it observes that CPU utilization has been at 1000% for 10 minutes, it starts another instance. The drawbacks of Amazon AutoScaling are that it is not supported in Eucalyptus and that it depends on Amazon CloudWatch, which is also not supported in Eucalyptus [74].
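The Trigger API described above has since been superseded, so the sketch below only shows, with the AWS SDK for Java, how the launch configuration and AutoScaling group themselves might be created. The AMI ID, group name, availability zone and size limits are placeholders, and the CPU-based scaling policy would be added separately on top of this.

```java
import com.amazonaws.services.autoscaling.AmazonAutoScaling;
import com.amazonaws.services.autoscaling.AmazonAutoScalingClientBuilder;
import com.amazonaws.services.autoscaling.model.CreateAutoScalingGroupRequest;
import com.amazonaws.services.autoscaling.model.CreateLaunchConfigurationRequest;

public class HadoopAutoScalingGroup {
    public static void main(String[] args) {
        AmazonAutoScaling autoScaling = AmazonAutoScalingClientBuilder.defaultClient();

        // Launch configuration using the custom Hadoop slave AMI on a small instance type (placeholders).
        autoScaling.createLaunchConfiguration(new CreateLaunchConfigurationRequest()
                .withLaunchConfigurationName("hadoop-slave-lc")
                .withImageId("ami-0123456789abcdef0")          // placeholder for the custom Hadoop slave AMI
                .withInstanceType("m1.small"));

        // AutoScaling group bounded by the minimum and maximum number of slave instances.
        autoScaling.createAutoScalingGroup(new CreateAutoScalingGroupRequest()
                .withAutoScalingGroupName("hadoop-slaves")
                .withLaunchConfigurationName("hadoop-slave-lc")
                .withMinSize(2)
                .withMaxSize(10)
                .withAvailabilityZones("us-east-1a"));          // placeholder zone
    }
}
```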
Custom Scaling Framework
Our custom scaling framework is a PHP script that analyzes the performance metrics gathered by our Python script. The PHP script runs on the Control Panel and is invoked every minute via cron [82].
Scaling the Cluster Down
With Amazon EMR versions 5.1.0 and later, you can configure the scale-down behavior of Amazon EC2 instances when a termination request is issued. A termination request can originate from an automatic scaling policy that triggers a scale-in activity, or from the manual removal of instances from an instance group when a cluster is resized [83].
Apache Hadoop and Cluster Changes
It was observed that by the time the 3 new nodes had joined the cluster, there were 7 reduce tasks running on the existing 4 nodes. Hadoop started copies of the same running reduce tasks on the new nodes in order to make use of them. This shows how inefficient it is to autoscale a Hadoop cluster purely on the basis of load: we now had 7 nodes, of which 5 were running the same reduce tasks in competition with each other. It also shows that there is room for tuning how MapReduce implementations behave when the cluster size changes; the new nodes could be used for jobs in the queue rather than competing with each other [85].
Conclusion
Generally, higher efficiency promotes productivity, but the reverse does not necessarily hold. QoS depends on the client's objectives: different clients may set their own satisfaction thresholds for the QoS they will accept, while efficiency is controlled by the providers, who must consider the interests of all clients at the same time. We summarize below our main research findings from the thorough cloud benchmark tests performed in 2014, and then suggest a few directions for further R&D in advancing cloud computing applications. The work produced various outcomes with respect to usage and cost. Exploring both the batch and real-time processing architectures showed that local tests should be performed first, followed by deployment of the scripts and execution of the cluster within a few minutes; future work may involve more instances. For the real-time online process, Amazon cloud management allows rights and roles to be assigned to more than one member of a team. A system for real-time analysis of Twitter data was designed to collect, filter and analyze streams of data and to give insight into what is popular during a specific time and condition. The framework comprises three main steps: data ingestion, stream processing and data visualization. Data ingestion is performed by Kafka, a powerful message-brokering framework, which imports tweets, delivers them based on the topics it defines and makes them available to consumer nodes for processing by analytical tools. Apache Spark is used to access these consumers directly and analyze the data with Spark Streaming, together with data-analysis and machine-learning algorithms. The accuracy obtained for multi-class sentiment analysis on the data set used was 60.2%; we believe a more refined training set would demonstrate better performance. Throughout this work we showed that multi-class sentiment analysis can achieve a high accuracy level, although it remains a challenging task. An even more interesting task is to quantify the sentiments present in a tweet. In future work we will therefore use the results obtained for ternary classification (which achieved an accuracy of 70.1%) to group tweets into "Positive", "Negative" and "Neutral"; the tweets classified as sentiment-bearing (i.e., "Positive" or "Negative") will then be given scores for the corresponding sentiment subclasses.
Overall, we find that scaling out is the easiest to implement on homogeneous clusters. Our research contributions are summarized below in five technical aspects:
- New performance metrics and benchmarking models are proposed and tested in cloud benchmark experiments. We study the scalability performance driven by efficiency and by productivity separately; this approach appeals to different user groups with increasing performance demands.
- Sustained cloud performance comes mostly from fast, elastic resource provisioning that matches workload variation. Scaling out should be practiced when elasticity is high; scaling up favors using more powerful nodes with higher efficiency and productivity.
- To achieve profitable services, both scale-up and scale-out schemes can be practiced. Scale-out reconfiguration has lower implementation overhead than that observed in scale-up tests. The scaling speed plays a fundamental role in limiting the over-provisioning or under-provisioning gaps of resources.
- We observe high scale-out performance in the HiBench and BenchClouds tests. On the other hand, we show that scaling up is more cost-effective, with higher productivity and p-scalability, in the YCSB and TPC-W tests. These findings may be useful for predicting the performance of other benchmarks that attempt to scale out or scale up with similar cloud settings and workloads.
- Cloud productivity is largely attributable to system elasticity, efficiency and performance-driven scalability. Cloud providers must enforce performance isolation for their standing customers.
Three recommendations are made below for further work.
- Other cloud benchmarks, such as CloudStone, CloudCmp and C-Meter, could also be tested with the new performance models presented. Future benchmarks are encouraged to evaluate PaaS and SaaS clouds.
- To make clouds universally acceptable, we encourage cloud researchers and developers to work jointly on building a set of application-specific benchmarks for important cloud and big-data application domains.
- The cloud community is short of benchmarks to test cloud capability in big-data analytics and machine-learning intelligence. This area is wide open, awaiting significant research and development challenges.
Given the major issues of data security, data loss, and inconvenient data access, cloud computing can be used to resolve these issues effectively. Some of the major future aspects are:
- Minimizing data loss and providing stronger data security.
- Making migration time negligible.
- Supporting one-to-many device relationships.
- Identifying the processes and the operations on stored data that are to be passed to the cloud.
- Evaluating the legal and technical security requirements.
- Analyzing which cloud types suit the planned processing.
- Choosing a service provider that offers sufficient guarantees.
References
- Ahram, A. 2011. The theory and method of comparative area studies. Qualitative Research, 11(1), pp.69-90.
- Al, S. 2013. Interpretive research design: concepts and processes. International Journal of Social Research Methodology, 16(4), pp.351-352.
- Almorsy, M., Grundy, J. and Müller, I., 2016. An analysis of the cloud computing security problem. arXiv preprint arXiv:1609.01107.
- Alvesson, M. and Sandberg, J. 2013. Constructing Research Questions. London: SAGE Publications.
- Asma, A., Chaurasia, M.A. and Mokhtar, H., 2012. Cloud Computing Security Issues. International Journal of Application or Innovation in Engineering & Management, 1(2), pp.141-147.
- Bhadra, M. 2012. Mental Health & Mental Illness: Our Responsibility. Health Renaissance, 10(1).
- Botta, A., De Donato, W., Persico, V. and Pescapé, A., 2016. Integration of cloud computing and internet of things: a survey. Future Generation Computer Systems, 56, pp.684-700.
- Bouayad, A., Blilat, A., Mejhed, N.E.H. and El Ghazi, M., 2012, October. Cloud computing: security challenges. In Information Science and Technology (CIST), 2012 Colloquium in (pp. 26-31). IEEE.
- Brodkin, J., 2008. Gartner: Seven cloud-computing security risks. Infoworld, 2008, pp.1-3.
- Brown, J. and Stowers, E. 2013. Use of Data in Collections Work: An Exploratory Survey. Collection Management, 38(2), pp.143-162.
- Buyya, R., Ranjan, R. and Calheiros, R.N., 2009, June. Modeling and simulation of scalable Cloud computing environments and the CloudSim toolkit: Challenges and opportunities. In High Performance Computing & Simulation, 2009. HPCS’09. International Conference on (pp. 1-11). IEEE.
- Calheiros, R.N., Ranjan, R., Beloglazov, A., De Rose, C.A. and Buyya, R., 2011. CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms. Software: Practice and experience, 41(1), pp.23-50.
- Cameron, R. 2009 ‘A sequential mixed model research design: design, analytical and display issues’, International Journal of Multiple Research Approaches, 3(2), 140-152.
- Carlin, S. and Curran, K., 2011. Cloud computing security.
- Chandra, S. and Sharma, M. 2013. Research methodology. Oxford: Alpha Science International Ltd.
- Chen, Y., Paxson, V. and Katz, R.H., 2010. What’s new about cloud computing security. University of California, Berkeley Report No. UCB/EECS-2010-5 January, 20(2010), pp.2010-5.
- Colton, P. and Sarid, U., Aptana, Inc., 2016. System and method for developing, deploying, managing and monitoring a web application in a single environment. U.S. Patent 7,596,620.
- Creswell, J., Hanson, W., Clark Plano, V. and Morales, A. 2010. Qualitative Research Designs: Selection and Implementation. The Counseling Psychologist, 35(2), pp.236-264.
- Curtis, P.M., Cochran, M.J., Clarke, K.J., Matczynski, M.J. and Shung, C., Verizon Patent and Licensing Inc., 2016. Virtual packet analyzer for a cloud computing environment. U.S. Patent 9,509,760.
- Curtis, P.M., Cochran, M.J., Considine, J.F. and Clarke, K.J., Verizon Patent and Licensing Inc., 2017. Virtual network device in a cloud computing environment. U.S. Patent 9,559,865.
- Da Cunha Rodrigues, G., Calheiros, R.N., Guimaraes, V.T., Santos, G.L.D., De Carvalho, M.B., Granville, L.Z., Tarouco, L.M.R. and Buyya, R., 2016, April. Monitoring of cloud computing environments: concepts, solutions, trends, and future directions. In Proceedings of the 31st Annual ACM Symposium on Applied Computing (pp. 378-383). ACM.
- Dave, A., Patel, B. and Bhatt, G., 2016, October. Load balancing in cloud computing using optimization techniques: A study. In Communication and Electronics Systems (ICCES), International Conference on (pp. 1-6). IEEE.
- Di Spaltro, D., Polvi, A. and Welliver, L., Rackspace Us, Inc., 2016. Methods and systems for cloud computing management. U.S. Patent 9,501,329.
- Dillon, S., & Vossen, G. (2015). SaaS cloud computing in small and medium enterprises: A comparison between Germany and New Zealand. International Journal of Information Technology, Communications and Convergence, 3(2), 87-104.
- Eaton, S. 2013. The Oxford handbook of empirical legal research. International Journal of Social Research Methodology, 16(6), pp.548-550.
- Erl, T., Puttini, R., & Mahmood, Z. (2013). Cloud computing. Prentice Hall, May, 2, 528.
- Feng, D.G., Zhang, M., Zhang, Y. and Xu, Z., 2011. Study on cloud computing security. Journal of software, 22(1), pp.71-83.
- Ferris, J.M. and Darcy, A.P., 2008. Methods and systems for building custom appliances in a cloud-based network. U.S. Patent Application 12/128,787.
- Ferris, J.M. and Riveros, G.E., 2010. Methods and systems for converting standard software licenses for use in cloud computing environments. U.S. Patent Application 12/714,099.
- Ferris, J.M. and Riveros, G.E., Red Hat, Inc., 2016. Detecting events in cloud computing environments and performing actions upon occurrence of the events. U.S. Patent 9,389,980.
- Ferris, J.M., 2015. Methods and systems for optimizing resource usage for cloud-based networks. U.S. Patent Application 12/196,459.
- Ferris, J.M., 2015. Systems and methods for service level backup using re-cloud network. U.S. Patent Application 12/324,803.
- Ferris, J.M., 2016. Methods and systems for providing on-demand cloud computing environments. U.S. Patent Application 12/324,437.
- Forouzan, A. B. (2006). Data communications & networking (sie). Tata McGraw-Hill Education.
- Foster, I., Zhao, Y., Raicu, I. and Lu, S., 2008, November. Cloud computing and grid computing 360-degree compared. In Grid Computing Environments Workshop, 2008. GCE'08 (pp. 1-10). IEEE.
- Gornall, L. 2011. Book Review: Mixed Method Design: Principles and Procedures. Qualitative Research, 11(4), pp.456-457.
- Harrison, R. L. and Reilly, T. M. 2011 “Mixed methods designs in marketing research”, Qualitative Market Research: an International Journal, 14(1), pp. 7 – 26
- Hoffa, C., Mehta, G., Freeman, T., Deelman, E., Keahey, K., Berriman, B. and Good, J., 2008, December. On the use of cloud computing for scientific workflows. In eScience, 2008. eScience’08. IEEE Fourth International Conference on (pp. 640-645). IEEE.
- Hong-hui, C.H.E.N., 2011. Cloud Computing Security Challenges. Computer Knowledge and Technology, 24, p.014.
- Hwang, K. (2017). Cloud and Cognitive Computing: Principles, Architecture, Programming. MIT Press.
- Jandhyala, D., Nayak, R., Qiao, Y. and Kinman, M., Evapt, Inc., 2009. Systems and methods for metered software as a service. U.S. Patent Application 12/435,680.
- Kavis, M. J. (2014). Architecting the cloud: design decisions for cloud computing service models (SaaS, PaaS, and IaaS). John Wiley & Sons.
- Klassen, A., Creswell, J., Plano Clark, V., Smith, K. and Meissner, H. 2012. Best practices in mixed methods for quality of life research. Qual Life Res, 21(3), pp.377-380.
- Krutz, R.L. and Vines, R.D., 2010. Cloud security: A comprehensive guide to secure cloud computing. Wiley Publishing.
- LD, D.B. and Krishna, P.V., 2013. Honey bee behavior inspired load balancing of tasks in cloud computing environments. Applied Soft Computing, 13(5), pp.2292-2303.
- Leedy, P. and Ormrod, J. 2013. Practical research. Boston: Pearson.
- Leppink, J., Paas, F., Van Gog, T., van Der Vleuten, C.P. and Van Merrienboer, J.J., 2014. Effects of pairs of problems and examples on task performance and different types of cognitive load. Learning and Instruction, 30, pp.32-42.
- Li, B., Li, J., Huai, J., Wo, T., Li, Q. and Zhong, L., 2009, September. Enacloud: An energy-saving application live placement approach for cloud computing environments. In Cloud Computing, 2009. CLOUD’09. IEEE International Conference on (pp. 17-24). IEEE.
- Magilvy, J. K. and Thomas, E. 2009 ‘A first qualitative project: Qualitative description design for novice researcher’, Journal of the Society, vol. 14, no. 1, pp. 298-300
- Mahmoud, S.M. and Abdulabbas, T.E., 2017, June. Multiple MapReduce functions for health care monitoring in a smart environment. In E-Health and Bioengineering Conference (EHB), 2017 (pp. 507-510). IEEE.
- Manikandan, S.G. and Ravi, S., 2014, October. Big data analysis using Apache Hadoop. In IT Convergence and Security (ICITCS), 2014 International Conference on (pp. 1-4). IEEE.
- Manvi, S. S., & Shyam, G. K. (2014). Resource management for Infrastructure as a Service (IaaS) in cloud computing: A survey. Journal of Network and Computer Applications, 41, 424-440.
- Marino, C.C., Brendel, J., Amor, P. and Kothari, P., Cisco Technology, Inc., 2015. User-configured on-demand virtual layer-2 network for infrastructure-as-a-service (IaaS) on a hybrid cloud network. U.S. Patent 9,154,327.
- McKnight, R. and Anderson, G., Gateway, Inc., 2001. System and method for providing distributed computing services. U.S. Patent Application 09/847,828.
- Mestha, L.K. and Fisher, P.S., Xerox Corporation, 2008. Web enabled color management service system and method. U.S. Patent Application 12/127,649.
- Mishra, A., Mathur, R., Jain, S. and Rathore, J.S., 2013. Cloud computing security. International Journal on Recent and Innovation Trends in Computing and Communication, 1(1), pp.36-39.
- Mitchell, M. and Jolley, J. 2013. Research design explained. Australia: Wadsworth Cengage Learning.
- Moreno-Vozmediano, R., Montero, R.S., Huedo, E. and Llorente, I.M., 2017. Cross-Site Virtual Network in Cloud and Fog Computing. IEEE Cloud Computing, 4(2), pp.46-53.
- Novikov, A. and Novikov, D. 2013. Research methodology. Leiden, Netherlands: CRC Press/Balkema.
- Ogigau-Neamtiu, F., 2012. Cloud computing security issues. Journal of Defense Resources Management, 3(2), p.141.
- Paas, F., Renkl, A. and Sweller, J. eds., 2016. Cognitive load theory: a special issue of educational psychologist. Routledge.
- Pandey, S., Wu, L., Guru, S.M. and Buyya, R., 2010, April. A particle swarm optimization-based heuristic for scheduling workflow applications in cloud computing environments. In Advanced information networking and applications (AINA), 2010 24th IEEE international conference on (pp. 400-407). IEEE.
- Persico, V., Montieri, A. and Pescapè, A., 2016, October. On the network performance of amazon S3 cloud-storage service. In Cloud Networking (Cloudnet), 2016 5th IEEE International Conference on (pp. 113-118). IEEE.
- Popović, K. and Hocenski, Ž., 2010, May. Cloud computing security issues and challenges. In MIPRO, 2010 proceedings of the 33rd international convention (pp. 344-349). IEEE.
- Popping, R. 2012. Qualitative Decisions in Quantitative Text Analysis Research. Sociological Methodology, 42(1), pp.88-90.
- Potrata, B. 2010. Rethinking the Ethical Boundaries of a Grounded Theory Approach. Research Ethics, 6(4), pp.154-158.
- Rittinghouse, J.W. and Ransome, J.F., 2016. Cloud computing: implementation, management, and security. CRC press.
- Salaberry, M. and Comajoan, L. 2013. Research Design and Methodology in Studies on L2 Tense and Aspect. Boston: De Gruyter.
- Salkind, N. 2010. Encyclopedia of research design. Thousand Oaks, Calif.: Sage.
- Saunders, M.N., Lewis, P. and Thornhill, A. 2009. Research methods for business students. 5th ed. Harlow: Prentice Hall, p.52.
- Shin, J.Y., Balakrishnan, M., Marian, T. and Weatherspoon, H., 2017. Isotope: ACID Transactions for Block Storage. ACM Transactions on Storage (TOS), 13(1), p.4.
- Somekh, B. and Lewin, C. 2011 Theory and Methods in Social Research, 2nd ed. London: Sage Publications
- Subashini, S. and Kavitha, V., 2011. A survey on security issues in service delivery models of cloud computing. Journal of network and computer applications, 34(1), pp.1-11.
- Swan, M., 2013. The quantified self: Fundamental disruption in big data science and biological discovery. Big Data, 1(2), pp.85-99.
- Takabi, H., Joshi, J.B. and Ahn, G.J., 2010. Security and privacy challenges in cloud computing environments. IEEE Security & Privacy, 8(6), pp.24-31.
- Talluri, S., 2016. Novel Techniques In Detecting Reputation based Attacks And Effectively Identify Trustworthy Cloud Services. IJSEAT, 4(6), pp.287-289.
- Thomas, J. 2013. Empathic design: Research strategies. Australasian Medical Journal, 6(1), pp.1-6.
- Truscott, D. M., Smith, S., Thornton-Reid, F., Williams, B. and Matthews, M. 2010 “A cross-disciplinary examination of the prevalence of mixed methods in educational research: 1995-2005”, International Journal of Social Research Methodology, 13(4), pp. 317-28.
- Tsukernik, M.A. and Curtis, P.M., Verizon Patent and Licensing Inc., 2017. Learning information associated with shaping resources and virtual machines of a cloud computing environment. U.S. Patent 9,641,441.
- Uprichard, E. 2013. Sampling: bridging probability and non-probability designs. International Journal of Social Research Methodology, 16(1), pp.1-11.
- Wang, S. and Mao, H. 2012. The Strategy Research of Modern Service Industry Development. Asian Social Science, 8(2).
- Wickremasinghe, B., Calheiros, R.N. and Buyya, R., 2010, April. Cloudanalyst: A cloudsim-based visual modeller for analysing cloud computing environments and applications. In Advanced Information Networking and Applications (AINA), 2010 24th IEEE International Conference on (pp. 446-452). IEEE.
- Williamson, E., 2016. Systems and methods for promotion of calculations to cloud-based computation resources. U.S. Patent Application 12/200,281.
- Woodside, J. M. (2015). Advances in Information, Security, Privacy & Ethics: Use of Cloud Computing For Education. In Handbook of Research on Security Considerations in Cloud Computing(pp. 173-183). IGI Global.
- Wu, L., Garg, S.K. and Buyya, R., 2011, May. Sla-based resource allocation for software as a service provider (saas) in cloud computing environments. In Cluster, Cloud and Grid Computing (CCGrid), 2011 11th IEEE/ACM International Symposium on (pp. 195-204). IEEE.
- Xia, Z., Wang, X., Zhang, L., Qin, Z., Sun, X. and Ren, K., 2016. A privacy-preserving and copy-deterrence content-based image retrieval scheme in cloud computing. IEEE Transactions on Information Forensics and Security, 11(11), pp.2594-2608.
- Yadav, D.S. and Doke, K., 2016. Mobile Cloud Computing Issues and Solution Framework.
- Yan, Q., Yu, F.R., Gong, Q. and Li, J., 2016. Software-defined networking (SDN) and distributed denial of service (DDoS) attacks in cloud computing environments: A survey, some research issues, and challenges. IEEE Communications Surveys & Tutorials, 18(1), pp.602-622.
- Zissis, D. and Lekkas, D., 2012. Addressing cloud computing security issues. Future Generation computer systems, 28(3), pp.583-592.