Machine Learning Platforms
1. Artificial intelligence (AI) is often preferred to human analysts because it is less subject to prejudice and bias (Nils Nilsson, 1998). AI systems work from the facts and figures in a history of performance, analyzing them to identify trends and highlight anomalies.
They tend to have a track record of accuracy and precision because predefined algorithms produce their output from the data fed into them.
2. A number of machine learning platforms have been incorporated into business activities for various purposes. Among the best are SAP’s HANA, Domo, Apptus, Microsoft’s Cortana and Avanade, General Electric’s Predix and Siemens’ MindSphere.
Apptus learns the search patterns of customers and, after a short time, provides relevant suggestions for what customers are searching for (Kurzweil Ray, 2013). This makes it easier for customers to buy products online. It keeps a record of the trends in all customers’ searches and purchases, and therefore of their preferences. It then suggests how to improve the product offering in order to attract more customers and improve sales.
Predix uses predictive analytics to compare the performance history of equipment in the firm, then predicts a range of possible outcomes from this data. It then suggests when to stop operating the equipment and carry out maintenance. It does this in conjunction with other industrial applications (Kurzweil Ray, 2013).
Avanade, a joint venture between Microsoft and Accenture, uses Cortana to learn the search patterns of customers, much as Apptus does. It makes relevant product suggestions to customers depending on what they search for. The AI can also recognize speech, and it uses predictive analytics (Marcus Hutter, 2005).
HANA, a cloud platform, analyzes transaction history and other business data for patterns and anomalies. It can be installed on the business’ own servers or accessed through the cloud. After collecting business data from all digital devices used for company purposes, including purchase orders, it analyzes this data, learns the trend and spots irregularities. It then issues alerts and suggests possible solutions. HANA can spot any unexpected difference in the running of the business, such as an unusually large product order or a piece of equipment performing slower than usual. This can reduce operating costs considerably. Walmart uses this AI in its more than 11,000 stores. An inventory study done by SAP revealed that 10 selected organizations using its cloud-based platform expected a 575% five-year return on investment (Russel Stuart & Norvig Peter, 2003).
Domo, on the other hand, collects data from third-party platforms and applications such as Twitter, Instagram, Facebook, Square and Salesforce, and uses it to provide business insights. It automatically and periodically searches for key words relating to the business’ line of trade and analyzes this information using data mining techniques. By monitoring the performance of the business at its sales points, it flags any unexpected rise or drop in sales and suggests possible solutions, such as increasing or decreasing supply to certain sales points. Just like HANA and Predix, it needs predetermined data against which to make real-time comparisons.
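The comparison of real-time figures against predetermined data can be illustrated with a minimal sketch. It is not how Domo, HANA or Predix are actually implemented; it simply assumes that each sales point has a short history of past sales and flags a new figure when it lies more than a chosen number of standard deviations from that history. The function name, the threshold and the sales figures are all hypothetical.

```python
from statistics import mean, stdev


def flag_unexpected_sales(baseline, current, threshold=2.0):
    """Return (sales_point, z_score) pairs whose current sales deviate
    strongly from the predetermined baseline history."""
    flagged = []
    for point, history in baseline.items():
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            continue  # no variation in the baseline, so a z-score is undefined
        z = (current[point] - mu) / sigma
        if abs(z) > threshold:
            flagged.append((point, round(z, 2)))
    return flagged


# Hypothetical weekly sales per sales point (illustrative numbers only).
baseline = {
    "store_a": [100, 104, 98, 102, 96],
    "store_b": [50, 52, 49, 51, 48],
}
current = {"store_a": 101, "store_b": 80}

flagged = flag_unexpected_sales(baseline, current)
print(flagged)  # only store_b is far outside its usual range
```

A flagged sales point would then be a candidate for the kind of supply adjustment described above.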
MindSphere, just like Predix, monitors the performance of equipment by comparing current results with predetermined data (Nils Nilsson, 1998). It uses machine tool analytics and drive train analytics to monitor fleets of equipment and suggests when to stop operations for servicing and maintenance. It can be used with machines from all manufacturers, unlike some AIs that only work with equipment from specific companies.
Development in AI technology is currently underway, with some Europe-based companies researching and developing AIs that can think like human beings (McCorduck, 2004).
If these platforms get hacked, business operations may be compromised and consumers’ private information exposed.
Unlike human beings, AIs can currently learn only from the data they are given access to.
Artificial intelligence machines and software work by analyzing data fed into them by humans. They learn trends and patterns in this data and produce output according to how they are programmed (Bouckaert Remco & Frank Eibe, 2010). They are used in data mining, a subfield of computer science, to obtain relevant, organized information from a dataset. The work discussed in this section covers the analysis, classification and visual representation of information (bank data).
J48 decision tree classification is used to show the relationship among attributes in the data (Bouckaert Remco & Frank Eibe, 2010). J48 is Weka’s Java implementation of the C4.5 algorithm, which is the successor of the pioneering ID3 algorithm. The resulting visual representation is easy to understand and to follow from the root down to the least informative attribute.
A decision tree is a recursive partition of the instance space, made up of nodes that form a rooted tree (Trevor & Robert, 2012). It is a directed tree whose root node has no incoming edges. Every other node has exactly one incoming edge. A node with outgoing edges is called an internal node. All other nodes are terminal nodes (leaves). Each internal node splits the instance space into two or more sub-spaces according to a function of the attribute values.
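The structure just described can be sketched in a few lines of code. This is only an illustration under simple assumptions: the `Node` class, the `classify` function and the attribute names `has_savings` and `income` are hypothetical, and the toy tree is not the one produced from the bank data.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Node:
    # A terminal node carries a class label; an internal node carries a
    # test function and one child per outcome of that test.
    label: Optional[str] = None
    test: Optional[Callable[[dict], str]] = None
    children: Dict[str, "Node"] = field(default_factory=dict)

    def is_leaf(self) -> bool:
        return not self.children


def classify(node: Node, instance: dict) -> str:
    # Start at the root and follow one outgoing edge per internal node
    # until a terminal node is reached.
    while not node.is_leaf():
        node = node.children[node.test(instance)]
    return node.label


# Toy tree (hypothetical): the root is the only node with no incoming edge.
root = Node(
    test=lambda x: "yes" if x["has_savings"] else "no",
    children={
        "yes": Node(label="NO"),
        "no": Node(
            test=lambda x: "high" if x["income"] > 20000 else "low",
            children={"high": Node(label="YES"), "low": Node(label="NO")},
        ),
    },
)

print(classify(root, {"has_savings": False, "income": 25000}))
```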
Weka’s user interface is user-friendly, and the algorithm works on basic principles (Han Kamber & Jaiwei, 2011). Navigating the software and understanding the represented data is rather simple. Weka analyzes the dataset according to the chosen algorithm, treating the variables available in the dataset as attributes (Trevor & Robert, 2012). It is these attributes that are evaluated and split on for deeper analysis. Child nodes are generated from the best split, which in turn depends on the entropy evaluation of each attribute. The algorithm then recurses into each child node, evaluating the entropy and information gain of the remaining attributes until a stopping criterion is met. This could be because the decision tree has become too big, or because all instances at the node belong to the same class, meaning that there are no more comparisons to be made.
When building a decision tree, it is difficult to decide which attribute to split on first so that the other attributes can be evaluated beneath it (Lovell Michael C, 1983). The solution is to evaluate the entropy and information gain of each attribute. Entropy is a measure of the uncertainty about the class that remains after splitting on an attribute. Information gain, on the other hand, is a measure of how much that split reduces the uncertainty. Low entropy after a split means high information gain. The attribute with the lowest entropy and highest information gain is split on first, and the remaining attributes are then considered within each branch.
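The selection rule and the recursion can be sketched as follows. This is a simplified, ID3-style illustration rather than J48 itself: C4.5 additionally handles numeric thresholds and uses gain ratio, which are left out here. The function names, the depth limit used as a "tree too big" criterion, and the four toy records are assumptions made for the example.

```python
import math
from collections import Counter


def entropy(labels):
    # Shannon entropy (in bits) of the class labels.
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    # Entropy before the split minus the weighted entropy after it.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def build_tree(rows, labels, attributes, max_depth=3):
    # Stopping criteria: one class left, no attributes left, or depth limit.
    if len(set(labels)) == 1 or not attributes or max_depth == 0:
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in sorted({row[best] for row in rows}):
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows, sub_labels = zip(*subset)
        tree[best][value] = build_tree(
            list(sub_rows), list(sub_labels), remaining, max_depth - 1)
    return tree


# Toy records (hypothetical, not the bank data).
rows = [
    {"married": "YES", "mortgage": "NO"},
    {"married": "YES", "mortgage": "YES"},
    {"married": "NO", "mortgage": "NO"},
    {"married": "NO", "mortgage": "YES"},
]
labels = ["NO", "NO", "YES", "YES"]

tree = build_tree(rows, labels, ["married", "mortgage"])
print(tree)
```

In this toy data the `married` attribute separates the classes perfectly, so it has the highest information gain and becomes the root, while `mortgage` carries no information and is never used.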
An expression such as YES 151.0/19.0 at a leaf means that 151 instances reached that leaf and were classified as YES, and that 19 of them were wrongly classified (Bouckaert Remco & Frank Eibe, 2010). In this assignment, I will refer to such an expression as a YES over NO classification fraction of 151.0/19.0.
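A short sketch shows how such an expression can be turned into numbers. It assumes the usual Weka reading, in which the first figure is the number of instances reaching the leaf and the second is the number of those that are misclassified; the helper `parse_leaf` is hypothetical and not part of Weka.

```python
import re

# Matches forms such as "YES 151.0/19.0", "YES (151.0/19.0)" or "NO (20.0)".
LEAF_PATTERN = re.compile(
    r"(\w+)\s*\(?(\d+(?:\.\d+)?)(?:/(\d+(?:\.\d+)?))?\)?")


def parse_leaf(text):
    match = LEAF_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a leaf expression: {text!r}")
    label, reached, wrong = match.groups()
    reached = float(reached)
    # Assumption: a missing second figure means no misclassified instances.
    wrong = float(wrong) if wrong else 0.0
    return {
        "label": label,
        "reached": reached,
        "misclassified": wrong,
        "accuracy": (reached - wrong) / reached,
    }


leaf = parse_leaf("YES 151.0/19.0")
print(leaf["label"], leaf["reached"], leaf["misclassified"],
      round(leaf["accuracy"], 3))
```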
The initial attribute arrived at in the bank data is the number of children.
The first attribute, chosen for its low entropy, is Children (the number of children an individual has), split on whether the number is less than or equal to 1, or greater than 1. The branch with Children less than or equal to 1 is further split into two: individuals with no children (<=0) and those with more than 0 children. Those with no children are further evaluated under the Married attribute. Unmarried instances are further evaluated under the Mortgage attribute. Those with no mortgage have a YES over NO classification fraction of 48.0/3.0. Instances with a mortgage are further analyzed under the Save_act attribute. Those with no savings have a YES over NO classification fraction of 12.0/0.0, while those with savings have a NO over YES classification fraction of 23.0/0.0.
Individuals who are married are analyzed under Save_act, which is split into two: those with savings (=YES) and those without savings (=NO). Those with savings have a NO over YES classification fraction of 119.0/12.0. Those with no savings are analyzed further under the mortgage attribute, which is also split into two: those with a mortgage and those without. Individuals with a mortgage have a YES over NO classification fraction of 25.0/3.0. Individuals without a mortgage are further analyzed under the income attribute, which is also split into two: those with an income less than or equal to 21506.2 and those with an income greater than 21506.2. Instances with an income greater than 21506.2 have a NO over YES classification fraction of 20.0/0.0. Instances with an income less than or equal to 21506.2 are analyzed under the age attribute, which is split into two: those aged 41 years or less, with a NO over YES classification fraction of 11.0/1.0, and those older than 41, with a YES over NO classification fraction of 5.0/1.0.
Instances with more than 0 children are analyzed under the income attribute, which is split into two: individuals with an income less than or equal to 15538.8, and those with an income greater than 15538.8, who score a YES over NO classification fraction of 111.0/5.0. Individuals with an income less than or equal to 15538.8 are analyzed under the age attribute, which is split into two: those aged 41 years or less, scoring a NO over YES classification fraction of 22.0/2.0, and those older than 41 years, scoring a YES over NO classification fraction of 2.0/0.0.
Individuals with more than 1 child (>1) are analyzed under the income attribute, which is split into two: those with an income less than or equal to 30404.3, scoring a NO over YES classification fraction of 124.0/12.0, and those with an income greater than 30404.3. Instances with an income greater than 30404.3 are further analyzed under the Children (number of children) attribute. This branch is split into two: those with an income less than or equal to 44288.3, scoring a NO over YES classification fraction of 19.0/2.0, and those with an income greater than 44288.3, scoring a YES over NO classification fraction of 8.0/0.0.
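The leaf fractions quoted in this walkthrough can be added up to give a rough picture of how well the tree fits the data. The sketch below copies only the fractions stated in the text and assumes, following the usual Weka reading, that the first figure of each fraction is the number of instances reaching the leaf and the second the number misclassified. Any leaf not mentioned in the text is not counted, so the totals describe the listed leaves only.

```python
# (predicted class, instances reaching the leaf, misclassified instances)
leaves = [
    ("YES", 48.0, 3.0),    # no children, not married, no mortgage
    ("YES", 12.0, 0.0),    # no children, not married, mortgage, no savings
    ("NO", 23.0, 0.0),     # no children, not married, mortgage, savings
    ("NO", 119.0, 12.0),   # no children, married, savings
    ("YES", 25.0, 3.0),    # no children, married, no savings, mortgage
    ("NO", 20.0, 0.0),     # ... no mortgage, income > 21506.2
    ("NO", 11.0, 1.0),     # ... income <= 21506.2, age <= 41
    ("YES", 5.0, 1.0),     # ... income <= 21506.2, age > 41
    ("YES", 111.0, 5.0),   # one child, income > 15538.8
    ("NO", 22.0, 2.0),     # one child, income <= 15538.8, age <= 41
    ("YES", 2.0, 0.0),     # one child, income <= 15538.8, age > 41
    ("NO", 124.0, 12.0),   # more than one child, income <= 30404.3
    ("NO", 19.0, 2.0),     # more than one child, income <= 44288.3
    ("YES", 8.0, 0.0),     # more than one child, income > 44288.3
]

reached = sum(n for _, n, _ in leaves)
misclassified = sum(e for _, _, e in leaves)
accuracy = (reached - misclassified) / reached

print(f"{reached:.0f} instances, {misclassified:.0f} misclassified, "
      f"accuracy {accuracy:.3f}")
# prints: 549 instances, 41 misclassified, accuracy 0.925
```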
Conclusion on results of analysis
Individuals with more than 1 child have an income greater than 15,538.8, and most of them are older than 41 years. Those who have no children and are married have no savings but have a mortgage. Those with no mortgage have an income of less than 21,506.2.
Individuals who are not married have mortgages.
Those with more than 1 child have an income greater than 30,404.3, most having a maximum of 2 children. Individuals with income greater than 44,288.3 have more than 2 children.
A progress report or a dashboard is a visual representation highlighting at a glance the performance of the organization (Victoria Herrington, 2010). It is necessary that the information be consolidated in one screen for ease of accessibility.
It is normally displayed on a web page that is linked to a company database and can therefore be easily updated (Stacey Bar, 2010). For instance, the dashboard of a manufacturing firm may give numbers related to its productivity, such as the number of failed quality inspections or the number of manufactured parts. In the same way, a human resources dashboard may display numbers related to staff recruitment, retention and composition. Some digital dashboards allow managers to monitor closely the performance of the different departments under them. Such dashboards capture specific data points to gauge the exact performance of a particular entity of the business. Detailed reports that show new trends can therefore be generated and appropriate corrective measures taken.
Key performance indicators can be viewed conveniently (Faw, 2006). Efficiencies and inefficiencies can be evaluated better and informed decisions made. Dashboards give organizations a new ability to make more informed decisions using the information displayed, as it is a conclusive summary of the organization’s performance. Different dashboards may be developed for different purposes depending on the organization’s intention (Kantardzic Mehmed, 2003). They can generally be broken down into the following categories: analytical dashboards, strategic dashboards and informational dashboards. Analytical dashboards offer more comparison, context or history, along with subtler evaluators of performance. They support interaction with the data, such as drilling down for more detail. Strategic dashboards, on the other hand, show a few snapshots of data that change constantly.
Many software packages have been developed for building dashboards. Depending on the organization’s line of trade, the software used should be appropriate and suited to the context (Eckerson, May 2010). This is not the case in the dashboard shown above, where technology-oriented software has been used for a business context.
Key indicators of a good dashboard include: minimal distractions that could confuse a viewer; simple and easy communication that does not strain the viewer; a design that applies human visual perception to the presentation of information; and support for an organized business with data that is meaningful and useful (Faw, 2006).
The dashboard represented above shows customer information: Customer ID, Account number, Primary contact, Telephone number, Email address, Industry and Titles. It also has financial metrics, including Net Profit, Contribution Margin and Revenue Trend, and non-financial metrics, including contracts renewed, service quality index and cases closed.
The dashboard has, in its bottom left corner, a remarks tab dedicated to comments by analysts (Lovell Michael C, 1983). The bar graphs and pie charts have an appealing design. The figures have been detailed adequately. The graphs are also big enough to allow fairly accurate estimation of values that go beyond those indicated on the graphs.
The information has been fairly well consolidated into one piece, however small, and the dashboard’s intended value has not diminished significantly.
The two bar graphs and the pie chart have been squeezed into very little space. Occasionally, dashboards are allowed to cover a whole page (Faw, 2006). The information represented by the three graphs could have been clearer if more space had been used. The lower section on non-financial metrics is also squeezed into very little space, and little can be made out of it from a distance (Kantardzic Mehmed, 2003). The data has been fragmented in such a way that viewers struggle to derive meaningful relationships from it.
The dazzling and flashing nature of the two charts on non-financial metrics has subverted the dashboard’s goal of communication. The attractive display will lose its spark in a matter of days and then become an annoying pictorial representation of performance. A dashboard should be centered more on providing information than on displaying flashy designs (Victoria Herrington, 2010).
The data on the dashboard has been poorly arranged. The non-financial metrics section is not clearly defined, and one cannot tell what the data means simply by looking at it. The flow of the data should come naturally to the viewer. After viewing a graph of one performance indicator, one should almost know what to expect next. This dashboard, however, lacks that flow, and it is hard to predict what will be presented next.
Adequate context for the data has not been supplied. We do not know the exact product the organization is working on. From a general point of view, the organization could be in the investment banking business, judging by the indicators displayed: escalations, customer information, profits and revenue.
The design has meaningless variety (Faw, 2006). The relationship between the customer and the graphs is not indicated. It is difficult to tell whether the customer is an investor in the business who is making the profit shown to the right of the customer information tab. The product being traded is not shown anywhere on the dashboard. The origin, context and use of the figures for cases closed, escalations, training units purchased and professional service engagements have not been explained.
Important data has not been highlighted adequately (Stacey Bar, 2010). The tabs on escalations, cases closed, training units purchased and professional service engagements have not been brought out clearly. Viewers have to strain their eyes to make out the approximate values of these components. They are an important part of the dashboard, as they show general performance in terms of, for example, low consumer escalations. This information should therefore be given more space on the dashboard (Briggs Jonathan, 2013).
The contracts renewed and service quality index displays are designed to look like a car speedometer, which is not visually appealing (Victoria Herrington, 2010). The screen has been cluttered with useless decoration that makes the dashboard look like something it is not. It is a serious distraction, because it takes the viewer’s mind off the main task at hand, which is assessing the organization’s performance. Since this dashboard is business-based, it is only logical to use data representation techniques that point to a corporate setting. The design instead points to a more technology-based setting, where it might have been excused. Better visual displays, such as a good pie chart or a bar graph, should have been used for this data (Stacey Bar, 2010).
Color has been grossly misused (Faw, 2006). Grey shading has been used in the background, which makes it difficult to see part of the customer’s information, such as the email address. Figures in the YOY revenue trend table can barely be seen without straining. The background should have been left plain white for better visibility (Briggs Jonathan, 2013).
References
Bouckaert Remco & Frank Eibe, 2010. WEKA experiences with a Java Open-Source Project. Journal of Machine Learning.
Briggs Jonathan, 2013. Management reports and Dashboard best practice. Target dashboard.
Douglas Hawkins, 2004. The Problem of overfitting. Journal of chemical information and computer sciences.
Eckerson, W. W., May 2010. Performance Dashboard: Measuring, Monitoring, and Managing Your Business. s.l.:Wiley.
Faw, S., 2006. Information Dashboard Design: The effective Visual Communication of Data. s.l.:O’Reilly.
Fayyad Usama & Smyth Padhraic, 2008. From data mining to Knowledge Discovery in Databases. s.l.:s.n.
Han Jiawei & Kamber Micheline, 2001. Data Mining: Concepts and Techniques. s.l.:Morgan Kaufmann.
Han Kamber & Jaiwei, P., 2011. Data Mining: Concepts and Techniques. 3rd ed. s.l.:Morgan Kaufmann.
Kantardzic Mehmed, 2003. Data Mining: Concepts, Models, Methods, and Algorithms. s.l.:John Wiley and Sons.
Kurzweil Ray, 1999. The Age of Spiritual Machines. s.l.:Penguin Books.
Lovell Michael C, 1983. Data Mining: The Review of Economics and Statistics. 4th ed. s.l.:s.n.
Lukasz Kurgan, 2006. A survey of Knowledge Discovery and Data Mining Process Models. 1st ed. New York: Cambridge University Press.
Marcus Hutter, 2005. Universal Artificial Intelligence. Berlin: Springer.
McCorduck, P., 2004. The rough shatterung of AI. AIs and Business, p. 424.
Nils Nilsson, 1998. Artificial Intelligence: a new synthesis. s.l.:Morgan Kaufmann.
Oscar Marban, 2009. A data mining and knowledge discovery process model. Vienna: Julio Ponce.
Russel Stuart & Norvig Peter, 2003. Artificial Intelligence: A Modern Approach. 2nd ed. Upper Saddle River: Prentice Hall.
Stacey Bar, 2010. 7 Small Business Dashboard Design Dos and Don’ts. s.l.:An Insight ZSL Inc.
Trevor, H. & Robert, T., 2012. The Elements of Statistical Learning. s.l.:s.n.
Victoria Herrington, 2010. What is a dashboard. s.l.:s.n.
Witten Ian, 2011. Data Mining: Practical Machine Learning Tools. s.l.:Elsevier.