Relations between Cloud Computing, Big Data Management and Hadoop Technology
Discuss the Emerging Technologies and the Innovation.
In this article, the authors discuss the use of cloud computing and the security issues that arise during big data management. They emphasize that cloud computing is widely used in business organizations and that the security of the cloud must therefore not be compromised during big data management. Big data management is complex because it involves large and varied datasets, and cloud computing helps to handle these operations according to requirements. Many of the datasets involved contain confidential information of business organizations, which makes them preferred targets for hackers and other malicious internet users. Cloud security is therefore an essential requirement for business organizations, so that confidential data and information are not lost through security breaches.
In this article, the authors describe the relations between cloud computing, big data management and Hadoop technology. Examining these relations allowed the authors to gauge the amount of data that is managed daily through big data systems and the need for proper security. According to the authors, the volume of data entering big data systems is rising at an exponential pace as more and more data servers are virtualized instead of relying on physical storage. As a result, hackers are using more advanced techniques to break into such servers and steal confidential information. Even though business organizations use security measures such as data encryption, system and network firewalls and security software, thousands of gigabytes of data are stolen every day by hackers and other unauthorized third parties, because hacking technologies are also evolving rapidly.
The authors mainly study the hacking technologies that are most often responsible for breaching cloud security in big data management. They emphasize the need to develop and upgrade current security systems at a faster pace than the hacking technologies that threaten cloud computing. According to the authors, hackers use decoding technologies to break through data encryption and steal data from the decrypted packets. The most common technique is to inject malicious files into the system, which then work their way to the data packets and begin decrypting them. While some malicious files are easily detected by system firewalls and removed, others remain undetected until they have completed the decryption process. Such files are masked with different identities and are therefore hard to detect as spam. They generally arrive through online advertisements, emails and similar channels.
The authors explore how masked spam files enter the system without alerting the system firewalls. While browsing the internet, users encounter pop-up advertisement links on many websites. Some are authentic advertisements; others are malicious files masked as advertisements. Most of the masked files pose as advertisements for very popular organizations such as Amazon and quickly attract the user's interest. When the user clicks the link, instead of being redirected to the company page, the user's system downloads the malicious file. Once downloaded, the file spreads to other systems connected to the same server, and even to the central database server. In each location, the file captures data packets and, where the data is encrypted, applies its decryption techniques to break into the packets and capture the information.
Storage Control, Data Logs, Ransomware and Provider Security
The authors investigate another possible cause of security issues in cloud computing. They state that cloud users can manage the data they use in the cloud but have no control over the data once it is stored on the server; that is, users can determine the data management processes to be carried out by the cloud, but they cannot determine the location where the data will be stored. As a result, the data storage location cannot be secured from the user's end, and if a malicious file is injected into the cloud storage, the user cannot recover the contents from the server. The authors therefore recommend securing the entire network connected to the cloud so that malicious injections cannot succeed.
The authors of this article suggest maintaining data logs while working in the cloud. A regular data log keeps track of what data has been entered into the system and what data has been extracted. The log should also be audited regularly to ensure that it contains no errors. If unusual data is found during an audit, it should be removed from storage immediately, as it may be the result of a malicious file injected into the system. Such injections are very common, so regular logging and auditing are required to check whether unusual data has been inserted along with the normal data stored in the cloud. Furthermore, since users have no control inside the cloud, they need to secure the gateway that carries data from the user's end to the cloud storage server.
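The logging and auditing idea can be illustrated with a short Python sketch. It is not taken from the article: the class name DataLog, the fields of each entry and the "known users" rule are assumptions made for illustration. Each entry stores the hash of the previous entry, so an audit can notice both altered log entries and entries made by unexpected users.

```python
import hashlib
import json


def _digest(body):
    # Hash a log entry body in a stable (sorted-key) JSON form.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class DataLog:
    """Hypothetical append-only log; each entry is chained to the previous one."""

    def __init__(self):
        self.entries = []

    def record(self, action, object_name, user):
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = {"action": action, "object": object_name,
                "user": user, "prev": prev_hash}
        self.entries.append({**body, "hash": _digest(body)})

    def audit(self, known_users):
        """Return (index, reason) pairs for entries that look suspicious."""
        findings = []
        prev_hash = "0" * 64
        for i, entry in enumerate(self.entries):
            body = {k: entry[k] for k in ("action", "object", "user", "prev")}
            if entry["prev"] != prev_hash or entry["hash"] != _digest(body):
                findings.append((i, "log entry was altered"))
            if entry["user"] not in known_users:
                findings.append((i, "unknown user"))
            prev_hash = entry["hash"]
        return findings


log = DataLog()
log.record("upload", "q3_report.csv", "alice")
log.record("upload", "payload.bin", "mallory")    # not an expected user
log.record("download", "q3_report.csv", "alice")
findings = log.audit(known_users={"alice", "bob"})
print(findings)
```

In this sketch the audit only reports suspicious entries; deciding whether to remove the corresponding data from storage, as the authors recommend, would be a separate step.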
The authors discuss ransomware, which they regard as probably the largest threat to cloud computing for big data management. Ransomware is malicious software that enters the system, captures a significant amount of data and locks it with encryption so strong that it cannot be broken by any regular means. According to the authors' study, ransomware uses extremely complex encryption that could take thousands of years to break by regular means. The main purpose of ransomware is to extract a large ransom in exchange for decrypting the data. The ransom demanded increases significantly over time. If the ransom is not paid within a set period, the files are either deleted permanently or sold to a rival organization for a large amount of money. According to the authors, there is no specific counter to ransomware, as existing techniques are not sufficient for decrypting the captured files. It is therefore better to protect the system and prevent ransomware from entering it in the first place.
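A rough calculation shows why decrypting such files by "regular means" (trying keys one by one) is impractical. The figures below are assumptions chosen for illustration, not values from the article: a 128-bit key and an attacker who can test 10^12 keys per second.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def brute_force_years(key_bits, guesses_per_second):
    """Expected years to find a key when, on average, half the key space is tried."""
    expected_guesses = 2 ** (key_bits - 1)
    return expected_guesses / guesses_per_second / SECONDS_PER_YEAR


# Assumed values: 128-bit key, 10**12 guesses per second.
years_128 = brute_force_years(128, 1e12)
print(f"{years_128:.2e} years")
```

Under these assumptions the expected time is on the order of 10^18 years, far beyond the "thousands of years" mentioned above, which supports the authors' conclusion that prevention is more realistic than decryption.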
The authors explore another major issue related to cloud security in big data management. According to the authors, cloud service providers do not provide any security to their clients, so cloud users have to secure the cloud themselves. Many ordinary users are not aware of this lack of security and continue to use cloud services without securing them. Most of these users report that they did not know that cloud service providers do not supply cloud security systems. Hence, in addition to developing cloud security, awareness should be raised among users so that they put a cloud security system in place before they start using the services.
Practical Solutions and Distributed Storage Algorithms
The authors discuss some practical solutions that can be used to counter cloud security issues in big data management. The primary and most practical solution, as explained by the authors, is to reinforce the internal security of the system in order to prevent unauthorized access to the server. Attacks arrive from various sources on the internet, so the network layer should be secured with suitable network firewalls. The main emphasis should be on the transport layer, which is the most vulnerable point because attackers try to steal data during transmission. The authors also recommend suitable antivirus software to resist the injection of malware and other malicious files into the system.
The authors explore more advanced methods of cloud computing security that are more applicable to business organizations. They suggest strong data encryption to secure the data during transmission. In addition, they suggest that the encryption should be multilayered, i.e. there should be at least two or three layers of encryption instead of one. This strengthens the security of the data, as it is extremely difficult to break through several layers of encryption. According to their studies, most attacks on business organizations succeed because the data is insufficiently encrypted, so that the hackers' advanced decryption technologies easily break through the encryption and steal confidential and valuable data.
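The layering idea can be sketched in Python as follows. This is only an illustration of how layers are applied and removed in order: the cipher used here is a toy keystream built from SHA-256 in the standard library, chosen so that the example is self-contained, and it must not be used to protect real data. A real system would use a vetted cipher from an established cryptography library. All function names are hypothetical.

```python
import hashlib
import secrets


def keystream(key, nonce, length):
    # Toy keystream: SHA-256 over key + nonce + counter (illustration only).
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt_layer(key, plaintext):
    # Each layer gets a fresh random 16-byte nonce, stored in front of the data.
    nonce = secrets.token_bytes(16)
    stream = keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt_layer(key, blob):
    nonce, body = blob[:16], blob[16:]
    stream = keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


def encrypt_layers(keys, plaintext):
    # Apply one layer per key, first key innermost.
    data = plaintext
    for key in keys:
        data = encrypt_layer(key, data)
    return data


def decrypt_layers(keys, blob):
    # Remove the layers in reverse order, outermost first.
    data = blob
    for key in reversed(keys):
        data = decrypt_layer(key, data)
    return data


keys = [secrets.token_bytes(32) for _ in range(3)]   # three independent keys
message = b"confidential finance report"
ciphertext = encrypt_layers(keys, message)
recovered = decrypt_layers(keys, ciphertext)
```

Each layer uses its own independent key, so an attacker who recovers one key still faces the remaining layers.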
The authors propose the use of SA-EDS (Security-Aware Efficient Distributed Storage) to address the efficiency of data handling in the cloud. They also suggest the development of the Security Improvement Problem in Distributed Storage (SIPDS) for addressing cloud security issues. Given the input data and the cloud server, this system finds an appropriate solution for securing the data while it is stored in the cloud. It also delivers the data to the cloud server, although the execution time increases by some margin. The AD2 algorithm of the SA-EDS model is used to analyze the efficiency of the system with regard to cloud security and to indicate how many further security measures are required on the server.
The authors explore several algorithms that manage data operations inside the cloud in a secure way. They use the AD2 algorithm to manage data packets that enter the cloud server as inputs. This algorithm splits a data packet into several parts and names them separately. The advantage of this process is that, because the data is split into separate sub-packets, an attacker cannot access the entire data even after breaking into the system successfully. An attacker who captures one or two sub-packets obtains only a fragment of the data that is not meaningful on its own. The authors state that this can be a very effective method of securing cloud servers on which confidential data such as business strategies and finance reports are stored.
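The effect of splitting a packet into separately named parts can be sketched in Python. This is not the AD2 algorithm itself, whose details are not given here; it is a generic XOR-based splitting scheme substituted for illustration, and the names split_packet, join_packet and the "part-N" labels are hypothetical. With this scheme, every part is needed to rebuild the packet, and any incomplete set of parts looks like random bytes.

```python
import secrets
from functools import reduce


def xor_bytes(a, b):
    # Byte-wise XOR of two equal-length byte strings.
    return bytes(x ^ y for x, y in zip(a, b))


def split_packet(data, parts):
    """Split data into `parts` named shares; all shares are needed to rebuild it."""
    if parts < 2:
        raise ValueError("need at least two parts")
    # The first parts-1 shares are random; the last one is data XOR all of them.
    shares = [secrets.token_bytes(len(data)) for _ in range(parts - 1)]
    shares.append(reduce(xor_bytes, shares, data))
    return {f"part-{i}": share for i, share in enumerate(shares)}


def join_packet(named_parts):
    """Rebuild the original data by XOR-ing every share together."""
    return reduce(xor_bytes, named_parts.values())


packet = b"business strategy 2017"
sub_packets = split_packet(packet, 3)     # could be stored on different servers
rebuilt = join_packet(sub_packets)
```

An attacker who captures only "part-0" and "part-1" in this sketch holds random bytes and learns nothing about the packet, which is a stronger form of the property the authors describe.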
Data Processing, Encryption Techniques and Honeypot Nodes
The authors explore other algorithms, such as SED2 and EDCon, for securing cloud networks. According to their work, the SED2 algorithm can be used to complete data processing before the data is sent to the cloud. In most cases, data processing is done inside the cloud server after the data has been transmitted. As a result, hackers gain access to raw data if they are able to break into the transport layer and capture the data packets during transmission. If the data is processed beforehand, it is more secure in transit, because processed data is of much less use to hackers than unprocessed raw data. According to the authors, the EDCon algorithm can be used to develop a schematic data architecture that is much more secure and resistant to attacks than normal architectures.
The authors explore a number of encryption techniques that can be used to secure cloud data. The first method they explain is file encryption, in which files are encrypted before being sent to the cloud server for storage. However, file encryption has often been found to be ineffective against very powerful attacks that use extremely advanced decryption techniques. Another method explained by the authors is network encryption, in which the entire network is encrypted rather than individual data packets, i.e. all communications and transmissions through the network are encrypted according to specific standards.
The authors explain some other techniques that are useful in developing cloud security. One method is to maintain nodes so that no unwanted or unauthorized access is possible through them. The authors highlight the use of honeypot nodes, specially designed trap nodes that are placed within a cluster of nodes. Whenever an attacker tries to enter such a cluster, the honeypot nodes trap the attacker and immediately eject the attacker from the system. Honeypot nodes are masked so that attackers cannot distinguish them from other nodes. The authors also emphasize access control, i.e. controlling access to the network. This is achieved by using several layers of verification to ensure that there is no unauthorized access to the system.
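The honeypot behaviour described above can be sketched as a small Python model. The class name Cluster, the node names and the blocking rule are assumptions for illustration; the sketch also assumes that legitimate clients learn the real node addresses from a trusted source, so that only an intruder scanning the cluster would ever touch a honeypot node.

```python
class Cluster:
    """Hypothetical node cluster in which some nodes are honeypots."""

    def __init__(self, real_nodes, honeypot_nodes):
        self.real_nodes = set(real_nodes)
        self.honeypot_nodes = set(honeypot_nodes)
        self.blocked_clients = set()

    def visible_nodes(self):
        # From outside, honeypots are listed exactly like real nodes.
        return sorted(self.real_nodes | self.honeypot_nodes)

    def connect(self, client, node):
        if client in self.blocked_clients:
            return "rejected"
        if node in self.honeypot_nodes:
            # Touching a honeypot marks the client as an attacker and ejects it.
            self.blocked_clients.add(client)
            return "rejected"
        if node in self.real_nodes:
            return "connected"
        return "unknown node"


cluster = Cluster(real_nodes=["n1", "n2"], honeypot_nodes=["n3"])
first = cluster.connect("alice", "n1")       # legitimate client, real node
trapped = cluster.connect("mallory", "n3")   # intruder probes the honeypot
after = cluster.connect("mallory", "n1")     # intruder is now blocked everywhere
```

In practice, a check like this would sit behind the several layers of verification the authors mention, rather than replace them.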
The authors have researched the overall cloud security issues in big data management and emphasize that cloud security is a problem that must be addressed immediately. Each day, several terabytes of data are lost because of insufficient cloud security, as more and more attacks are attempted on cloud data servers, especially those of large business organizations that use cloud computing extensively to manage and store large volumes of data. It is therefore important to develop more advanced security systems for cloud computing, which will not only help to protect these large volumes of data but will also support the further development of cloud computing.
Transport Layer Vulnerability
The area marked in red in the diagram shows the vulnerability in the cloud architectures currently used in most organizations: there is insufficient security in the transport layer between the user and the cloud, which makes it the most vulnerable point in the architecture. The proposed detailed structure of transport layer security is shown in the following diagram.
Assunção, M. D., Calheiros, R. N., Bianchi, S., Netto, M. A., & Buyya, R. (2015). Big Data computing and clouds: Trends and future directions. Journal of Parallel and Distributed Computing, 79, 3-15.
Baccarelli, E., Cordeschi, N., Mei, A., Panella, M., Shojafar, M., & Stefa, J. (2016). Energy-efficient dynamic traffic offloading and reconfiguration of networked data centers for big data stream mobile computing: review, challenges, and a case study. IEEE Network, 30(2), 54-61.
Baek, J., Vu, Q. H., Liu, J. K., Huang, X., & Xiang, Y. (2015). A secure cloud computing based framework for big data information management of smart grid. IEEE Transactions on Cloud Computing, 3(2), 233-244.
Bahrami, M., & Singhal, M. (2015). The role of cloud computing architecture in big data. In Information granularity, big data, and computational intelligence (pp. 275-295). Springer International Publishing.
Botta, A., De Donato, W., Persico, V., & Pescapé, A. (2016). Integration of cloud computing and internet of things: a survey. Future Generation Computer Systems, 56, 684-700.
Branch, R., Tjeerdsma, H., Wilson, C., Hurley, R., & McConnell, S. (2014). Cloud computing and big data: a review of current service models and hardware perspectives. Journal of Software Engineering and Applications, 7(08), 686.
Chang, V., & Ramachandran, M. (2016). Towards achieving data security with the cloud computing adoption framework. IEEE Transactions on Services Computing, 9(1), 138-151.
Chang, V., Kuo, Y. H., & Ramachandran, M. (2016). Cloud computing adoption framework: A security framework for business clouds. Future Generation Computer Systems, 57, 24-41.
Cui, L., Yu, F. R., & Yan, Q. (2016). When big data meets software-defined networking: SDN for big data and big data for SDN. IEEE Network, 30(1), 58-65.
Cuzzocrea, A. (2014, November). Privacy and security of big data: current challenges and future research perspectives. In Proceedings of the First International Workshop on Privacy and Security of Big Data (pp. 45-47). ACM.
Demchenko, Y., De Laat, C., & Membrey, P. (2014). Defining architecture components of the Big Data Ecosystem. In Collaboration Technologies and Systems (CTS), 2014 International Conference on (pp. 104-112). IEEE.
Dou, W., Zhang, X., Liu, J., & Chen, J. (2015). HireSome-II: Towards privacy-aware cross-cloud service composition for big data applications. IEEE Transactions on Parallel and Distributed Systems, 26(2), 455-466.
Fernández, A., del Río, S., López, V., Bawakid, A., del Jesus, M. J., Benítez, J. M., & Herrera, F. (2014). Big Data with Cloud Computing: an insight on the computing environment, MapReduce, and programming frameworks. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 4(5), 380-409.
Gai, K., Qiu, M., & Zhao, H. (2016, April). Security-aware efficient mass distributed storage approach for cloud systems in big data. In Big Data Security on Cloud (BigDataSecurity), IEEE International Conference on High Performance and Smart Computing (HPSC), and IEEE International Conference on Intelligent Data and Security (IDS), 2016 IEEE 2nd International Conference on (pp. 140-145). IEEE.
Gai, K., Qiu, M., Zhao, H., & Xiong, J. (2016, June). Privacy-aware adaptive data encryption strategy of big data in cloud computing. In Cyber Security and Cloud Computing (CSCloud), 2016 IEEE 3rd International Conference on (pp. 273-278). IEEE.
Hashem, I. A. T., Yaqoob, I., Anuar, N. B., Mokhtar, S., Gani, A., & Khan, S. U. (2015). The rise of “big data” on cloud computing: Review and open research issues. Information Systems, 47, 98-115.
Inukollu, V. N., Arsi, S., & Ravuri, S. R. (2014). Security issues associated with big data in cloud computing. International Journal of Network Security & Its Applications, 6(3), 45.
Kim, G. H., Trimi, S., & Chung, J. H. (2014). Big-data applications in the government sector. Communications of the ACM, 57(3), 78-85.
Li, Y., Gai, K., Ming, Z., Zhao, H., & Qiu, M. (2016). Intercrossed access controls for secure financial services on multimedia big data in cloud systems. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 12(4s), 67.
Li, Y., Gai, K., Qiu, L., Qiu, M., & Zhao, H. (2017). Intelligent cryptography approach for secure distributed big data storage in cloud computing. Information Sciences, 387, 103-115.
Liu, C., Yang, C., Zhang, X., & Chen, J. (2015). External integrity verification for outsourced big data in cloud and IoT: A big picture. Future Generation Computer Systems, 49, 58-67.
Lu, R., Zhu, H., Liu, X., Liu, J. K., & Shao, J. (2014). Toward efficient and privacy-preserving computing in big data era. IEEE Network, 28(4), 46-50.
Purcell, B. M. (2014). Big data using cloud computing. Journal of Technology Research, 5, 1.
Puthal, D., Sahoo, B. P. S., Mishra, S., & Swain, S. (2015, January). Cloud computing features, issues, and challenges: a big picture. In Computational Intelligence and Networks (CINE), 2015 International Conference on (pp. 116-123). IEEE.
Reed, D. A., & Dongarra, J. (2015). Exascale computing and big data. Communications of the ACM, 58(7), 56-68.
Sookhak, M., Gani, A., Khan, M. K., & Buyya, R. (2017). Dynamic remote data auditing for securing big data storage in cloud computing. Information Sciences, 380, 101-116.
Tan, Z., Nagar, U. T., He, X., Nanda, P., Liu, R. P., Wang, S., & Hu, J. (2014). Enhancing big data security with collaborative intrusion detection. IEEE Cloud Computing, 1(3), 27-33.
Terzi, D. S., Terzi, R., & Sagiroglu, S. (2015, December). A survey on security and privacy issues in big data. In Internet Technology and Secured Transactions (ICITST), 2015 10th International Conference for (pp. 202-207). IEEE.
Yi, S., Li, C., & Li, Q. (2015, June). A survey of fog computing: concepts, applications and issues. In Proceedings of the 2015 Workshop on Mobile Big Data (pp. 37-42). ACM.
Youssef, A. E. (2014). A framework for secure healthcare systems based on big data analytics in mobile cloud computing environments. Int J Ambient Syst Appl, 2(2), 1-11.
Zhang, Y., Qiu, M., Tsai, C. W., Hassan, M. M., & Alamri, A. (2017). Health-CPS: Healthcare cyber-physical system assisted by cloud and big data. IEEE Systems Journal, 11(1), 88-95.
Zhao, J., Wang, L., Tao, J., Chen, J., Sun, W., Ranjan, R., … & Georgakopoulos, D. (2014). A security framework in G-Hadoop for big data computing across distributed Cloud data centres. Journal of Computer and System Sciences, 80(5), 994-1007.