The Challenges and Opportunities of Big Data and Analytics
1. You are required to demonstrate and present some theoretical background on the concept of big data, a newly emergent technology trend that is considered a major challenge for the majority of organizations in today’s turbulent business environment.
2. As an example for this assignment, you need to show how the following facilitate the management of big data: Hadoop, in-memory computing, and analytic platforms.
1. Big data is an approach to data management that supports the capture, storage, and analysis of high-velocity data. It is currently a significant emerging technology, and the 4Vs of big data (volume, velocity, veracity, and variety) make data management and analytics difficult for traditional data repositories. It is essential to consider big data and analytics together: big data describes the recent explosion of varied data from varied sources, while analytics is the exploration of that data to derive interesting and relevant trends and patterns that can be used to inform decisions, improve processes, and even initiate new business models (El-Seoud, El-Sofany, Abdelfattah, & Mohamed, 2017). Several challenges and threats must be considered when deploying big data in a cloud environment. Security vulnerabilities arise from the integration of big data with the cloud environment, which results in platform heterogeneity (Manyika et al., 2011).
Online services and mobile apps gain various benefits from analyzing and using the big data they collect. A company that gathers data from any source and analyzes it can achieve cost savings, time reductions, new product development, and a better understanding of market conditions, as well as control of its online reputation. The accuracy of big data analysis can be improved through the use of advanced technology. A company can use big data platforms such as Hadoop and Apache Spark, which can analyze huge data sets (McKinsey & Company, 2018). Innovative big data technology can also generate sophisticated data models that better represent the collected data. Some companies might prefer to work with big data providers, selecting from the available options the one that suits their needs and produces accurate results (Delgado, 2015).
Although big data analytics is a remarkable tool that can assist in business decisions, it has certain limitations as well. Data analysts use big data to tease out correlations, that is, cases where one variable is associated with another; however, not all of these correlations are significant. Big data can be used to identify correlations and insights through a virtually limitless range of queries and questions. It is up to the user to judge the significance of the questions. Getting the right answer to the wrong question can be costly for the user, the clients, and the business. In addition, big data analytics is susceptible to data breaches, and it can be difficult to consistently transfer data to specialists for repeat analysis (Ciklum, 2017).
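The point about correlations can be made concrete with a small worked example. The sketch below computes the Pearson correlation coefficient in plain Python for two pairs of variables. All figures and variable names (ad_spend, revenue, office_plants) are hypothetical and invented for illustration only. The coefficient measures only how strongly two variables move together; whether that relationship matters to the business remains a judgment for the analyst.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical monthly figures, invented for illustration only.
ad_spend = [10, 12, 15, 18, 20, 25]
revenue = [100, 118, 152, 170, 205, 246]
office_plants = [3, 7, 2, 8, 4, 6]

r_spend = pearson(ad_spend, revenue)        # strong positive association
r_plants = pearson(office_plants, revenue)  # weak association
```

With data of this kind, the first pair shows a strong association and the second a weak one, but neither number says anything about cause or about which question was worth asking in the first place.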
Benefits of Utilizing Big Data Platforms such as Hadoop and Apache Spark
Despite these limitations, it is beneficial for organizations to analyze big data, as it reveals information about customer behavior that helps them conduct business effectively. It assists in developing new business applications and ensuring customer satisfaction. For this reason, before deciding to work with big data, a company should address various people, organization, and technology issues. Well-planned analytical processes and people with the right talent and skills are required to manage the technologies and carry out an effective big data analytics initiative. Organizations and people should decide what they want from big data before working with it, and determine who has security access to it. A general way to quantify the return on investment from the system is to measure the speed of transactions and then infer its significance in terms of captured revenue (Shacklett, 2015). Thus, the right steps must be taken to optimize the value of big data for the benefit of the organization.
2. Hadoop is the de facto platform for big data and plays a significant part in big data analytics. The RapidMiner platform is an outstanding solution for dealing with unstructured data such as text files, web traffic logs, and images. The analytical engines in RapidMiner use in-memory data storage that is highly optimized for the data access typically performed in analytical processes. In-memory analytics is generally the quickest way to develop analytical models. The size of the data set is limited by the available memory or hardware; with more memory, larger data sets can be analyzed. Hadoop offers a distributed storage engine along with the possibility of using the Hadoop cluster as a distributed analytical engine for big data analytics. It is not suited to quick, interactive analysis, and runtime depends on the power of the Hadoop cluster; however, it has practically unlimited scalability. Whereas memory was traditionally a scarce and valued resource in databases, in-memory computing uses memory first and avoids touching the disk. In-memory computing is advancing in terms of speed, and the evolution of memory technology will make persistent memory possible (Anadiotis, 2017).
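The idea of a distributed analytical engine can be illustrated with a toy map/reduce word count. This is a minimal sketch in plain Python, not the Hadoop API: the partitions stand in for blocks of data stored on different nodes, and the function names (map_partition, merge_counts, word_count) and the sample log lines are hypothetical. Everything here runs in memory on one machine, whereas a Hadoop cluster would run the map step on many nodes in parallel and classic Hadoop MapReduce would write intermediate results to disk between steps.

```python
from collections import Counter
from functools import reduce


def map_partition(lines):
    """Map step: count words within one partition (one simulated node)."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Reduce step: merge the partial counts of two partitions."""
    return left + right


def word_count(partitions):
    # On a real cluster each partition would be processed on a separate node.
    partials = [map_partition(p) for p in partitions]
    return reduce(merge_counts, partials, Counter())


# Hypothetical log fragments split into two partitions.
partitions = [
    ["error disk full", "warning disk slow"],
    ["error network down", "error disk full"],
]
totals = word_count(partitions)
```

Because each partition is counted independently and the partial results are merged afterwards, adding more partitions (and more nodes to process them) does not change the logic, which is the property that gives this style of processing its scalability.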
Sophisticated predictive and prescriptive modeling can help organizations anticipate business opportunities and make decisions that affect profits in areas such as targeting marketing campaigns and avoiding equipment failures. In predictive analytics, historical data sets are mined for patterns that indicate future situations and behaviors; in prescriptive analytics, the results of predictive analytics are used to recommend actions that take the greatest advantage of the predicted outcomes. Big data analytics tools can consume a wide range of data types: structured data with distinct and consistent fields, such as transaction data kept in relational databases; semi-structured data, such as web server or mobile application log files; and unstructured data, such as emails, text messages, text files, and social media posts (Loshin, 2015).
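The relationship between predictive and prescriptive analytics can be sketched with a very small example based on the equipment failure case mentioned above. The code below is an illustrative assumption, not a method taken from the cited sources: it fits a straight line to hypothetical historical readings (the predictive step) and then turns the forecast into a suggested action using a made-up threshold (the prescriptive step). The data, the limit value, and the function names are all invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Predictive step: estimate y for a new x from the fitted line."""
    slope, intercept = model
    return slope * x + intercept


def recommend(predicted_vibration, limit):
    """Prescriptive step: turn a forecast into a suggested action."""
    if predicted_vibration >= limit:
        return "schedule maintenance"
    return "keep running"


# Hypothetical history: operating hours vs. measured vibration level.
hours = [100, 200, 300, 400, 500]
vibration = [1.0, 1.5, 2.0, 2.5, 3.0]

model = fit_line(hours, vibration)
forecast = predict(model, 800)            # expected vibration at 800 hours
action = recommend(forecast, limit=4.0)   # limit is an assumed threshold
```

Real predictive models are usually far richer than a single straight line, but the division of labor is the same: one component estimates what is likely to happen, and another maps that estimate to a recommended decision.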
Web mining is the process of using data mining techniques and algorithms to extract information directly from the Web, by mining web content, hyperlinks, web documents, and server logs. The main objective of web mining is to find patterns in web data by gathering and examining information in order to gain insight into trends, the industry, and users in general (Patel, Chauhan, & Patel, 2011). It is a branch of data mining that treats the World Wide Web, with all of its components from web content to server logs, as the primary data source. The data mined from the Web may be the collection of facts that web pages are meant to contain, and may consist of text or structured data such as lists and tables, as well as images, video, and audio.
There are various categories of web mining:
- Web content mining: It is the process of mining useful information from the contents of web pages and documents, mainly text, images, and audio or video files.
- Web structure mining: It is the process of analyzing the nodes and link structure of a website using graph theory. Two things can be obtained this way: the structure of a website in terms of its connections to other sites, and the document structure of the website itself, that is, the way its individual pages are connected (Techopedia, 2018).
- Web usage mining: It is the process of extracting patterns and information from server logs to understand user activity, including where users come from, the number of clicks on each item on the site, and the various activities carried out on the site.
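Two of the categories above, web usage mining and web structure mining, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the log format ("<ip> <method> <path> <status>") is a simplified, hypothetical one rather than a real server log format, the sample lines and links are invented, and counting inbound links is only the simplest possible graph measure.

```python
from collections import Counter


def page_hits(log_lines):
    """Web usage mining: count successful requests per path.

    Assumes a simplified, hypothetical log format:
    "<ip> <method> <path> <status>".
    """
    hits = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip malformed lines
        _ip, _method, path, status = parts
        if status == "200":
            hits[path] += 1
    return hits


def in_degree(links):
    """Web structure mining: count inbound links per page.

    `links` is an iterable of (source, target) pairs; self-links are ignored.
    """
    degree = Counter()
    for source, target in links:
        if source != target:
            degree[target] += 1
    return degree


# Hypothetical sample data, invented for illustration only.
log_lines = [
    "10.0.0.1 GET /home 200",
    "10.0.0.2 GET /products 200",
    "10.0.0.1 GET /products 200",
    "10.0.0.3 GET /missing 404",
]
links = [("/home", "/products"), ("/about", "/products"), ("/products", "/home")]

hits = page_hits(log_lines)
inbound = in_degree(links)
```

The first function answers a usage question (which pages are requested most often), while the second answers a structural one (which pages are linked to most often); real tools build on the same inputs with more robust parsing and richer graph algorithms.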
References
Anadiotis, G. (2017). In-memory computing: Where fast data meets big data. Retrieved from Zdnet.com: https://www.zdnet.com/article/in-memory-computing-where-fast-data-meets-big-data/
Ciklum. (2017). Limitations of Big Data analytics. Retrieved from Ciklum.com: https://www.ciklum.com/blog/limitations-of-big-data-analytics/
Delgado, R. (2015). Improving the accuracy of Big Data analysis. Retrieved from Dataconomy.com: https://dataconomy.com/2015/10/improving-the-accuracy-of-big-data-analysis-2/
El-Seoud, S. A., El-Sofany, H. F., Abdelfattah, M., & Mohamed, R. (2017). Big Data and cloud computing: trends and challenges. iJIM, 11(2), 34-52.
Loshin, D. (2015). How big data analytics tools can help your organization. Retrieved from Techtarget.com: https://searchbusinessanalytics.techtarget.com/feature/How-big-data-analytics-tools-can-help-your-organization
Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., & Byers, A. H. (2011). Big Data: The next frontier for innovation, competition, and productivity. McKinsey Global Institute.
McKinsey & Company. (2018). How companies are using big data and analytics. Retrieved from Mckinsey.com: https://www.mckinsey.com/business-functions/mckinsey-analytics/our-insights/how-companies-are-using-big-data-and-analytics
Patel, K. B., Chauhan, J. A., & Patel, J. D. (2011). Web mining in E-Commerce: Pattern discovery, issues and applications. International Journal of P2P Network Trends and Technology, 11(3), 40-45.
Shacklett, M. (2015). 10 things you shouldn’t expect big data to do. Retrieved from Techrepublic.com: https://www.techrepublic.com/blog/10-things/10-things-you-shouldnt-expect-big-data-to-do/
Techopedia. (2018). Web mining. Retrieved from Techopedia.com: https://www.techopedia.com/definition/15634/web-mini