Task 01
a) Two possible applications of word embedding in business
Word embedding is widely used in natural language processing (NLP), a field that relies heavily on machine learning.
One application of word embedding is the analysis of survey responses. It addresses a common business problem: not having enough tools or time to analyse survey responses effectively. Vector representations of words adapted to survey data sets can capture the relationship between the responses under review and the context in which they were given (Zhong et al. 2017).
Word embedding is also applicable to the analysis of verbatim comments, which are very important for organisations, particularly customer-centric ones. Word2Vec, a kind of word embedding, helps identify the specific context in which a verbatim comment was made. This can be very useful in understanding customer sentiment towards a business (Zhong et al. 2017).
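As a rough illustration of how vector representations let comments be compared by meaning rather than by exact wording, the sketch below averages word vectors for each comment and measures cosine similarity. The three-dimensional vectors and the names `TOY_EMBEDDINGS`, `comment_vector` and `cosine_similarity` are hypothetical and chosen only for illustration; a trained model such as Word2Vec would supply vectors with far more dimensions.

```python
import math

# Hypothetical 3-dimensional embeddings, invented for illustration only.
# A trained model would learn these values from a large text corpus.
TOY_EMBEDDINGS = {
    "slow": [0.9, 0.1, 0.0],
    "delayed": [0.8, 0.2, 0.1],
    "friendly": [0.0, 0.9, 0.3],
    "helpful": [0.1, 0.8, 0.4],
    "delivery": [0.5, 0.1, 0.8],
    "staff": [0.1, 0.5, 0.8],
}


def comment_vector(comment):
    """Average the vectors of the known words in a comment."""
    vectors = [TOY_EMBEDDINGS[w] for w in comment.lower().split()
               if w in TOY_EMBEDDINGS]
    if not vectors:
        raise ValueError("comment contains no known words")
    dims = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


complaint = comment_vector("slow delivery")
similar_complaint = comment_vector("delayed delivery")
praise = comment_vector("friendly staff")

# The two complaints share no adjective, yet their vectors point in
# nearly the same direction; the praise comment points elsewhere.
print(cosine_similarity(complaint, similar_complaint))
print(cosine_similarity(complaint, praise))
```

With these toy values, the two delivery complaints score much closer to each other than either does to the praise comment, which is the property that makes embeddings useful for grouping survey responses or verbatim comments by theme.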
b) Two popular implicit biases
One implicit bias is technical bias. It emerges from mathematical, software and hardware constraints. An overfitted algorithm is considered biased because it fits the training data almost perfectly but is non-generalizable to new cases. Different types of mathematical models trained on the same data may reach comparable prediction accuracies; however, they will carry different amounts of bias. This is largely due to the different cost functions that the different models optimize (Ruder, Vulić and Søgaard 2019).
Emergent bias arises when results are evaluated and applied in context. When algorithmic results are used to form a decision, an ethical problem may arise: the proposed decision may contradict the normative values of society. Algorithmic decision-making can lead to unfair practices such as unfavorable treatment of a particular group or individual (Ruder, Vulić and Søgaard 2019).
c) Two most important measures/best practices
One way to control word embedding biases is to enlist data scientists in identifying the best model for a specific context, since one model cannot suit every context. Data scientists possess the expertise needed to identify the most suitable model (Bender and Friedman 2018). They consider different strategies before they build a model. It is better to test different ideas up front and identify the best model than to wait for vulnerabilities to appear and then fix them.
If there is insufficient data for one group, weighting can be used to increase its importance in training. However, this needs to be done with the utmost caution, as it can lead to new, unexpected biases. A sample containing only a small group of people may require forcing the chosen model to consider the trends that those few exhibit. This calls for a large weight multiplier, which in turn raises the risk that the model picks up spurious trends. It is, therefore, crucial to be careful when selecting the weights, particularly the large ones (Bender and Friedman 2018).
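The weighting idea can be sketched as follows. This is a minimal example, assuming the common "balanced" heuristic in which each group's weight is the total sample count divided by the number of groups times that group's count. The function name `group_weights` and the `max_weight` cap are hypothetical additions; the cap reflects the caution about large multipliers and is not a value taken from the cited source.

```python
from collections import Counter


def group_weights(groups, max_weight=5.0):
    """Return one training weight per group, inversely proportional
    to group size and capped at max_weight (a hypothetical safeguard)."""
    counts = Counter(groups)
    total = len(groups)
    n_groups = len(counts)
    weights = {}
    for group, count in counts.items():
        raw = total / (n_groups * count)  # "balanced" inverse-frequency weight
        weights[group] = min(raw, max_weight)  # limit very large multipliers
    return weights


# Hypothetical sample: 95 records from group A, only 5 from group B.
sample = ["A"] * 95 + ["B"] * 5
weights = group_weights(sample)
print(weights)
```

Without the cap, group B would receive a weight of 10, so each of its five records would count as much as ten ordinary ones; capping the multiplier trades some balance for a lower risk of the model overreacting to a handful of examples.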
a) Ethically significant benefits
Banks need to assess loan applications on a scale from low to high risk to determine whether their investment is in safe hands. Big Data can help banks understand customer behavior based on inputs received from various channels, such as investment patterns, financial backgrounds, motivation to invest and shopping trends (Jagtiani, Vermilyea and Wall 2018). Big Data can therefore help to identify which loans to approve and which to decline.
b) Ethically significant harms
As a result of the loan denial, Fred and Tamara might have suffered three ethically significant harms. These are as follows (Calem, Correa and Lee 2019):
- Credit Rating– A rejected loan does not in itself affect the credit score; however, creditors may review any loan applied for later on. The hard inquiry that creditors carry out can hurt the credit score.
- Longer Wait for Next Loan Application– Because the loan was rejected, Fred and Tamara would have to wait more than six months before they could apply for another loan at another bank. By contrast, applicants with excellent credit scores obtain loans more easily than those whose scores are not stellar.
- Fear of Further Damage to Credit Rating– Because their loan application was rejected, Fred and Tamara will have to be very cautious before applying for the next loan; otherwise, their credit rating might suffer further. The credit rating indicates whether a customer is eligible for credit and how much the person can borrow. In addition, a rejected loan may also affect the interest rate charged on a later loan.
c) The broader harms to society
The loan evaluation process at the bank where Fred and Tamara applied is increasingly dependent on a single algorithm, which perhaps has a few flaws and lacks significant measures to check its validity. Such a loan evaluation system would not only thwart the dreams of many potential entrepreneurs like Fred and Tamara but also disclose their lifestyles without their permission (Calem, Correa and Lee 2019).
d) Three measures/best practices
The harms that might have been caused to Fred and Tamara can be reduced by some good measures. These are as follows (Lenz 2016):
- Evolving attitudes need to be closely monitored– Banks should ensure that the individuals responsible for machine learning development receive appropriate training on fair lending practices. This would encourage them to act in the best interests of the bank, the software vendor and the customers.
- Potential bias should be tested repeatedly– Promising technological solutions are emerging that help companies test and correct for bias in their algorithmic systems. Banks are expected to monitor their algorithmic systems continuously to identify potential bias problems and move towards a more accurate algorithmic solution.
- The rationale for algorithmic features needs to be documented– As part of a risk management strategy, banks should have defenses prepared for claims before these are raised. Institutions using algorithmic solutions in credit transactions need to consider how best to comply with the legal requirements for providing statements of specific reasons and for handling requests for record retention and information.
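One simple form that repeated bias testing could take is a comparison of approval rates across applicant groups. The sketch below is an assumption about how such a check might look, not a method described in the cited source; the function names, the group labels and the sample decisions are all hypothetical.

```python
def approval_rates(decisions):
    """Compute the share of approved applications per group.

    decisions is a list of (group, approved) pairs, where approved is a bool.
    """
    totals = {}
    approved = {}
    for group, was_approved in decisions:
        totals[group] = totals.get(group, 0) + 1
        approved[group] = approved.get(group, 0) + (1 if was_approved else 0)
    return {group: approved[group] / totals[group] for group in totals}


def approval_ratio(rates):
    """Ratio of the lowest group approval rate to the highest.

    A value near 1.0 means groups are approved at similar rates;
    a low value flags a gap that deserves investigation.
    """
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was approved, so there is no gap between groups
    return min(rates.values()) / highest


# Hypothetical decisions: group A approved 8 of 10, group B approved 4 of 10.
decisions = ([("A", True)] * 8 + [("A", False)] * 2
             + [("B", True)] * 4 + [("B", False)] * 6)
rates = approval_rates(decisions)
print(rates, approval_ratio(rates))
```

A low ratio does not prove that the algorithm is unfair, since the groups may differ in legitimate risk factors, but running such a check on every model update gives the bank an early signal and a documented record of what was tested.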
References
Bender, E.M. and Friedman, B., 2018. Data statements for natural language processing: Toward mitigating system bias and enabling better science. Transactions of the Association for Computational Linguistics, 6, pp.587-604.
Calem, P., Correa, R. and Lee, S.J., 2019. Prudential policies and their impact on credit in the United States. Journal of Financial Intermediation, p.100826.
Jagtiani, J., Vermilyea, T. and Wall, L.D., 2018. The roles of big data and machine learning in bank supervision. Forthcoming, Banking Perspectives.
Lenz, R., 2016. Peer-to-peer lending: opportunities and risks. European Journal of Risk Regulation, 7(4), pp.688-700.
Ruder, S., Vulić, I. and Søgaard, A., 2019. A survey of cross-lingual word embedding models. Journal of Artificial Intelligence Research, 65, pp.569-631.
Zhong, J., Ogata, T., Cangelosi, A. and Yang, C., 2017, September. Understanding natural language sentences with word embedding and multi-modal interaction. In 2017 Joint IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob) (pp. 184-189). IEEE.