Data Collection and Storage Methods
Question:
Discuss the COIT 20253 report on fraud detection in the banking sector.
Banks of the current era have increasingly turned to analytics to predict and prevent fraud in real time. This has often been an inconvenience for clients making large purchases or travelling. The critical challenge for banks, however, has been to reduce the billions of losses caused by fraud.
Fraud detection at banks already looks at factors such as suspicious IP addresses and unusual login times. Big Data applications, however, are drastically altering this approach with enhanced analytic solutions (Sobolevsky et al. 2014). These are fast and powerful enough to detect fraud in real time and to identify risks proactively.
The following report analyzes data collection and storage for the current scenario. It then demonstrates the data in action through consumer-centric product design and a recommendation system. Lastly, business continuity is discussed: how today's online businesses survive disasters and power outages.
Data collection system:
Banks collect various kinds of data through multiple processes. Examples are given below.
The data to be collected by banks, and the processes used to collect it, are as follows.

Corrupted data: First, the customers who appear on the Treasury Department's foreign assets control list must be identified. Compliance with the Financial Action Task Force's anti-money-laundering requirements must then be ensured. Finally, a list of transactions must be produced against the various agencies' lists of non-cooperative countries and territories.

Data about cash: Cash transactions just below the regulatory reporting thresholds are to be identified. Series of cash disbursements under one customer number that together exceed the statutory reporting thresholds are then determined. Finally, statistically unusual amounts of cash transfers made by customers or through bank accounts are identified (Chen, Mao and Liu 2014). A code sketch of this check follows the list.

Billing data: These are retrieved through the large numbers of waived fees transferred to a bank account or customer.

Check tampering: Out-of-sequence, void, duplicate and missing check numbers are to be identified. Paid checks that do not match any check issued by the bank are then determined. Finally, falsification and check forgery are to be located on loan applications (Martin 2015).

Skimming: First, short-term deposits followed by withdrawals of similar amounts on the same account must be highlighted. Indicators of check kiting are then flagged. Duplicate skimming and credit-card transactions are also highlighted.

Larceny: First, customer account takeovers must be identified, along with co-opted customer account information (Hilbert 2016). Loans taken out in the name of a bank employee or customer without corresponding repayments must then be located. Cases where the investment amount is higher than the value of the item or collateral must also be found. Sudden activity in dormant customer accounts is then highlighted, which involves identifying the person who processed transactions against those accounts. Lastly, mortgage fraud schemes are isolated by determining the straw-buyer indicators of a scheme.

Financial statement fraud: For this, suspense and dormant general-ledger accounts must be monitored, and journal entries posted at suspicious times identified.
Storage systems:
To understand storage systems for big data, various factors are considered: first, the area where the fraud takes place; next, what the fraudulent activity looks like in the data; and last, the data sources needed to test the fraud indicators.
The requirements for storage, and the methods to achieve them, are as follows.

Calculating statistical parameters: These include averages, standard deviations and high/low values. The calculation identifies outliers that may indicate fraud.

Classification: This is done by finding patterns among various data elements.

Stratification of numbers: This is done by identifying unusual entries, such as excessively low or high values (Kitchin 2014).

Digital analysis through Benford's law: This is done by comparing the occurrences of digits in a data set against the frequencies at which they occur naturally; deviations can indicate manipulation. A code sketch of this test follows the list.

Joining diverse sources: Big Data does this by identifying matching values, such as account numbers, addresses and names, where they should not exist.

Duplicate testing: This is done by identifying duplicate transactions, such as expense claims and payments of report items.

Gap testing: This is done by identifying missing values in sequential data, such as gaps in check or invoice numbers.

Adding up numeric values: This is done by identifying control totals that may have been falsified.

Validating entry dates: This identifies postings or data entries made at inappropriate times or dates.
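As a minimal sketch of the Benford's-law test, the following pandas snippet compares the observed first-digit frequencies of positive transaction amounts against the expected log10(1 + 1/d) distribution; the input column of amounts is an assumption.

```python
# A minimal sketch of the Benford's-law digital analysis described above.
# It compares observed first-digit frequencies of positive transaction
# amounts against the expected distribution log10(1 + 1/d).
import numpy as np
import pandas as pd

def benford_report(amounts: pd.Series) -> pd.DataFrame:
    """Compare observed vs. expected first-digit frequencies."""
    amounts = amounts[amounts > 0]  # Benford's law applies to positive values
    # Extract each amount's first significant digit
    first_digit = (amounts // 10 ** np.floor(np.log10(amounts))).astype(int)
    observed = first_digit.value_counts(normalize=True).sort_index()
    digits = np.arange(1, 10)
    expected = pd.Series(np.log10(1 + 1 / digits), index=digits)
    report = pd.DataFrame({"observed": observed, "expected": expected}).fillna(0)
    report["deviation"] = report["observed"] - report["expected"]
    return report

# A large excess of, say, amounts starting with 9 (just under an approval
# limit) is a candidate for closer review.
```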
It must be noted that, for stored data, random sampling is not listed as a smart fraud-detection technique for banks. Sampling is an effective data-analysis technique for investigating data values that are consistent across the population. Fraud, by its very nature, is different, since it tends to occur randomly (Sobolevsky et al. 2015).
Investigation of big data is the method of examining very large data sets to uncover hidden patterns, show changes over time, and confirm or challenge theories. It is often related to data analytics and decision-making, and has applications beyond the for-profit business world. Here, two methods are investigated: consumer-centric product design and the recommendation system.
Consumer-centric product design:
It is also known as user-centric design. It is the process of framing services and products around the limitations and necessities of end users, carried out with Big Data. It concerns both the design and the quality of the content, service and product (Varian 2014). For a bank, it indicates that the entire design and development process must keep the database user in mind from step one. The rules of customer-centric design involve the following.
Filling the reservoirs of the bank's customer knowledge:
For instance, the bank's salespeople are accountable for sending customer enquiries on to support. The support staff are responsible for funneling the content of customer conversations to the proper product person, and the project managers are accountable for deciding which feedback must be acted upon (Chen and Lin 2014).
Using customer knowledge to make the bank customer-centric:
Here, the customer-centric product team is attentive to the voice of the customer at every stage of product development: before development, during construction and after launch. Customer experiences and needs are then the decisive factors in deciding which new features are implemented, where to expand, and how to push the vision forward (Cao 2015).
Prioritizing roadmaps for business and users:
This can be done effectively using the formula (Value / Complexity) * Necessity^2.
This must be applied regularly to weigh the long list of ideas and pick the one that deserves the highest priority. For every idea, a score from 1 to 10 is assigned to each variable; the total score is then calculated to rank the ideas, as illustrated below.
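A small Python illustration of this scoring; the idea names and 1-to-10 scores are invented for the example.

```python
# A toy illustration of the prioritization formula (Value/Complexity) * Necessity^2.
# Idea names and scores are invented; a real backlog would come from the
# customer-knowledge reservoir described above.
ideas = {
    "real-time fraud alerts": {"value": 9, "complexity": 7, "necessity": 9},
    "redesigned statements":  {"value": 5, "complexity": 3, "necessity": 4},
    "chatbot support":        {"value": 6, "complexity": 8, "necessity": 5},
}

def priority(s: dict) -> float:
    return (s["value"] / s["complexity"]) * s["necessity"] ** 2

for name, s in sorted(ideas.items(), key=lambda kv: priority(kv[1]), reverse=True):
    print(f"{name}: {priority(s):.1f}")
# real-time fraud alerts scores 104.1, redesigned statements 26.7 and
# chatbot support 18.8, so the alerts idea tops the roadmap.
```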
Solving issues for users during creation:
Here, the critical phase is designing the user experience. As implementation goes on, the customer-centric team focuses on developing and delivering a user experience that is easy and intuitive for customers (Hu et al. 2014). User-experience design is, further, design reframed: the term emphasizes solving problems for users and is a reminder that the job is not merely making things pretty. It is the process of crafting a holistic beginning-to-end experience that accounts for every detail of the user's journey.
Communicating changes to customers after development is complete:
Once implementation is done and a new feature request has been delivered, someone must be accountable for communicating the changes to customers. Communication is the key method of establishing transparency and trust around delivery. Banks must remember that quick turnarounds on customers' feature requests go a long way; even a slower turnaround can garner goodwill, provided there is communication (Stimmel 2016).
Recommendation system:
Recommendation systems for banks are complex approaches to achieving customer goals. The steps undertaken in the Big Data analysis are described below.
Problem framing:
This includes collecting expectations, requirements and issues from the employees and business departments concerned. Interviews are conducted with stakeholders to clarify the needs for the system and to manage the client's vision.
Data collection and preprocessing:
At this step, all the necessary CRM and transactional data for the bank's clients are gathered for preprocessing and subsequent analysis. The data set includes socio-demographic data, clients' purchase history, balance history, monthly payments and client information. The step involves checking data consistency across sources and treating inconsistencies, as well as handling duplicates and null values (Li, Cao and Yao 2015). It further provides dictionary transformation and formation, along with the calculation of statistics. Problems detected, and the solutions to them, are reported to the client. A minimal preprocessing sketch follows.
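A minimal pandas sketch of this step, assuming toy CRM and transaction extracts; the file names and column layout are illustrative assumptions.

```python
# A minimal sketch of the collection/preprocessing step: merging assumed CRM
# and transactional extracts, checking consistency across the two sources,
# and handling duplicates and null values.
import pandas as pd

crm = pd.read_csv("crm_clients.csv")    # assumed columns: client_id, segment, age, ...
txn = pd.read_csv("transactions.csv")   # assumed columns: client_id, amount, date, ...

# Consistency check across sources: clients with transactions but no CRM record
orphans = set(txn["client_id"]) - set(crm["client_id"])
print(f"{len(orphans)} transaction clients are missing from the CRM extract")

data = txn.merge(crm, on="client_id", how="inner")
data = data.drop_duplicates()                # remove duplicate records
data["amount"] = data["amount"].fillna(0)    # make null amounts explicit
```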
Data processing and development of algorithms:
This stage starts with data augmentation, which yields features such as changes in spending speed, income growth and client event triggers. To understand financial behavior, the bank calculates aggregate balances and income and outcome transfers for multiple periods. The features are merged with elements from the CRM system, such as clients' purchase history and socio-demographic data (Olson and Wu 2017). Various machine-learning algorithms are then applied to the features to predict the best product recommendations for clients; a minimal modelling sketch follows.
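A minimal scikit-learn sketch of this modelling step; the feature table, its columns and the target product are assumptions for illustration, and a real pipeline would add validation, class balancing and richer feature engineering.

```python
# A minimal sketch of the algorithm step: training a classifier on the merged
# CRM/behavioural features to predict which clients will take up a product.
# The file, feature names and target column are assumptions.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

data = pd.read_csv("client_features.csv")
features = ["avg_balance", "income_transfers", "spend_growth", "age"]
X_train, X_test, y_train, y_test = train_test_split(
    data[features], data["took_credit_card"], test_size=0.2, random_state=42)

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)
print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))

# Rank clients by predicted uptake probability to build the offer list
data["score"] = model.predict_proba(data[features])[:, 1]
offer_list = data.sort_values("score", ascending=False).head(1000)
```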
A/B testing:
To prove the effectiveness of the designed solution, the bank must plan an A/B test. There are three groups in the experiment: the group created by the recommender algorithm, a group formed by conventional segmentation along social and demographic lines, and a reference group receiving no recommendations. The bank offers credit cards and multiple cards through call-downs at call centers, and premium and debit products through SMS broadcasting, to the various client groups (Kim, Trimi and Chung 2014). The outcomes of the A/B test show significantly better sales and conversions in the group created by the recommender algorithm than in the other groups; a sketch of evaluating such a test follows.
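A minimal sketch of checking such a result with a two-proportion z-test; the conversion and offer counts are invented for illustration.

```python
# A minimal sketch of checking whether the recommender group's conversion
# rate beats the reference group's. The counts are invented for illustration.
from statsmodels.stats.proportion import proportions_ztest

conversions = [340, 255]    # recommender group, reference group
offers_sent = [5000, 5000]

stat, p_value = proportions_ztest(conversions, offers_sent)
print(f"z = {stat:.2f}, p = {p_value:.4f}")
# A small p-value (e.g. below 0.05) suggests the recommender's lift is
# unlikely to be chance; otherwise, keep collecting data before concluding.
```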
Business continuity:
This section discusses how banks survive power outages and other disasters. Various instances show that vulnerabilities within power systems affect whole regions, and real-world cases show how slowly power companies can react once disaster strikes. A study conducted in 2015 revealed that some of these inconsistencies and vulnerabilities are present in banking systems as well. It proved that recovery plans in such failure events were delayed and prolonged: information was inaccurate, and restoring power to the coverage area took about ten days. As understood today, the study remains largely accurate (Thorstad and Wolff 2018).
As a result, today's banks are working on big data systems that help alleviate some of these issues for the future. Any analytics solution should therefore be designed to assist companies in managing power outages and recovering faster. In the next phase, predictive analysis is required so that officials can react appropriately when they need to plan for an outage or system failure.
For example, banks can use software such as PowerON Response, which is designed to collect data from the network and from field-service crews. They can manage and assess data on facility damage, circuit conditions and failures, equipment and devices, and restoration times for customers. The data can also be shared with other parties, including management teams, utility crews and even customers (Frizzo-Barker et al. 2016).
The predictive side of the software clearly helps the various groups prepare for future scenarios and identify the weak points within the system. Field crews can then react by replacing old tools, repairing damaged devices and retrofitting old circuit breakers. The aim is to help banks and their teams avoid power outages and additional failures as much as possible, or at least to recover very quickly when anything goes wrong (Hale and Lopez 2018).
All the collected data and software must be used to prevent wide-scale outages, for the sake of the people who would otherwise be left without electricity. The worst cases are those that were easily preventable but were aggravated by the two most relevant factors: poor coordination and miscommunication between service teams. With a big data system available, the scenario becomes much easier (Meier 2015). This is exactly what banks and their power-related clients are betting on: by preparing for and quelling failures before they happen, the system ensures that people are not left in the dark for long.
Conclusion:
As more and more data are generated and gathered through big data, analytic capabilities continue to progress. Banks can thus create real value through the monetization of data. Doing so accurately helps maximize that value and secure the vital corporate reputation and brand. In conclusion, the potential and promise of big data need to be matched with a considered approach to collecting, storing, licensing and using it. Remedies for the misuse of data at a bank are challenging in the absence of a well-thought-out data strategy. Further, current copyright protections, contracts and confidentiality undertakings can help, and are very significant solutions for banks.
Various suggestions for checking fraudulent activities at banks through big data are identified below.
- Using data science solution pillars:
Banks can search data using Apache Solr and create models using Apache Pig, Spark and ecosystem analytics tools such as R. Machine-learning models can be tested, and investigations run on real, substantial data sets, to iterate on and improve processing (a minimal Spark sketch follows this list).
- Real-time recommended offers:
New products and services are targeted at the right customers by implementing an analytics engine that supports an integrated, flexible process. This is driven by a better understanding of customers' motivations, buying patterns, preferences and needs.
- Ensuring proper compliance with laws and regulations:
Various rules and regulations apply to different types of data and affect their collection, storage, use, sale and security. Compliance with every applicable law and regulation must be ensured and addressed according to the type of data being handled.
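As a minimal PySpark sketch of the data-science pillar above, the following snippet computes per-customer statistics over the full transaction history and flags statistically extreme amounts; the storage path and column names are assumptions.

```python
# A minimal PySpark sketch of the data-science-pillar suggestion: compute
# per-customer statistics over the full transaction history and flag
# statistically extreme amounts. The path and columns are assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("fraud-screen").getOrCreate()
txn = spark.read.parquet("s3://bank-lake/transactions/")   # assumed location

# Per-customer mean and standard deviation of transaction amounts
stats = txn.groupBy("customer_id").agg(
    F.mean("amount").alias("mean_amt"),
    F.stddev("amount").alias("std_amt"))

# Flag transactions more than four standard deviations from the customer's mean
flagged = (txn.join(stats, "customer_id")
              .withColumn("z", (F.col("amount") - F.col("mean_amt")) / F.col("std_amt"))
              .filter(F.abs(F.col("z")) > 4))
flagged.write.mode("overwrite").parquet("s3://bank-lake/flagged-transactions/")
```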
References:
Cao, L 2015, ‘Big data analytics: innovation and practices’, Reliability, Infocom Technologies and Optimization (ICRITO) (Trends and Future Directions), 2015 4th International Conference, pp. 1-1, viewed 7 April 2018,
https://www.amity.edu/aiit/icrito2015/ICRITO’2015%20Sessions.pdf
Chen, M, Mao, S & Liu, Y 2014, ‘Big data: A survey’, Mobile Networks and Applications, vol.19, no.2, pp.171-209, viewed 3 April 2018,
https://dl.acm.org/citation.cfm?id=2843712
Chen, X.W & Lin, X 2014, ‘Big data deep learning: challenges and perspectives’, IEEE Access, vol.2, pp.514-525, viewed 4 April 2018,
https://pdfs.semanticscholar.org/2724/5e65a27bde90b5b0bb25d157bb75a0ad8b5a.pdf
Ekbia, H et al. 2015, ‘Big data, bigger dilemmas: a critical review’, Journal of the Association for Information Science and Technology, vol.66, no.8, pp.1523-1545, viewed 4 April 2018,
https://onlinelibrary.wiley.com/doi/pdf/10.1002/asi.23294
Frizzo-Barker, J et al. 2016, ‘An empirical study of the rise of big data in business scholarship’, International Journal of Information Management, vol.36, no.3, pp.403-413, viewed 4 April 2018,
https://daneshyari.com/article/preview/1025514.pdf
Hale, G & Lopez, J.A 2018, Monitoring Banking System Fragility with Big Data, Federal Reserve Bank of San Francisco.
Hilbert, M, 2016, ‘Big data for development: A review of promises and challenges’, Development Policy Review, vol.34, no.1, pp.135-174, viewed 5 April 2018,
https://onlinelibrary.wiley.com/doi/abs/10.1111/dpr.12142
Hu, H et al. 2014, ‘Toward scalable systems for big data analytics: a technology tutorial’, IEEE Access, vol.2, pp.652-687, viewed 5 April 2018,
https://ieeexplore.ieee.org/abstract/document/6842585/
Kim, G.H, Trimi, S & Chung, J.H 2014, ‘Big-data applications in the government sector’, Communications of the ACM, vol.57, no.3, pp.78-85, viewed 6 April 2018,
https://dl.acm.org/citation.cfm?id=2500873
Kitchin, R 2014, ‘The real-time city? Big data and smart urbanism’, GeoJournal, vol.79, no.1, pp.1-14, viewed 6 April 2018,
https://pdfs.semanticscholar.org/6e73/7a0e5ef29303760a565ba5e9d98510ab0976.pdf
Li, D, Cao, J & Yao, Y 2015, ‘Big data in smart cities’, Science China Information Sciences, vol. 58, no.10, pp.1-12, viewed 6 April 2018,
https://epublications.bond.edu.au/cgi/viewcontent.cgi?article=1478&context=fsd_papers
Martin, K.E 2015, ‘Ethical issues in the big data industry’, MIS Quarterly Executive, vol.14, no.2.
Meier, P 2015, Digital humanitarians: how big data is changing the face of humanitarian response, CRC Press, viewed 5 April 2018,
https://www.crcpress.com/Digital-Humanitarians-How-Big-Data-Is-Changing-the-Face-of-Humanitarian/Meier/p/book/9781482248395
Meune, C 2017, ‘P4895 Current use and misuse of troponin measurements from a large cohort: results of big data analysis’, European Heart Journal, vol.38, suppl.1, viewed 5 April 2018,
https://academic.oup.com/eurheartj/article/38/suppl_1/ehx493.P4895/4086414
Mulder, F 2016, ‘Questioning Big Data: Crowd sourcing crisis data towards an inclusive humanitarian response’, Big Data & Society, vol.3, no.2, viewed 5 April 2018,
https://doi.org/10.1177/2053951716662054
Olson, DL & Wu, D 2017, ‘Predictive models and big data’, in Predictive Data Mining Models, pp. 95-97, Springer, Singapore.
Reimsbach-Kounatze, C 2015, ‘The proliferation of “big data” and implications for official statistics and statistical agencies’, OECD Digital Economy Papers, viewed 6 April 2018,
https://www.oecd-ilibrary.org/docserver/5js7t9wqzvg8-en.pdf?expires=1523874446&id=id&accname=guest&checksum=32D97D4990081385D34AA2161A42F89E
Sobolevsky, S et al. 2015, ‘Scaling of city attractiveness for foreign visitors through big data of human economical and social media activity’, in Big Data (BigData Congress), 2015 IEEE International Congress, pp.600-607, viewed 6 April 2018,
https://arxiv.org/ftp/arxiv/papers/1609/1609.09340.pdf
Sobolevsky, S et al. 2015, ‘Money on the move: big data of bank card transactions as the new proxy for human mobility patterns and regional delineation’, in Big Data (BigData Congress), 2015 IEEE International Congress, pp.136-143, viewed 6 April 2018,
https://dl.acm.org/citation.cfm?id=2671976
Stimmel, CL 2016, Big data analytics strategies for the smart grid, Auerbach Publications.
Thorstad, R & Wolff, P 2018, ‘A big data analysis of the relationship between future thinking and decision-making’, Proceedings of the National Academy of Sciences, vol.115, no.8, pp.E1740-E1748, viewed 8 April 2018,
https://doi.org/10.1073/pnas.1706589115
Varian, HR 2014, ‘Big data: new tricks for econometrics’, Journal of Economic Perspectives, vol.28, no.2, pp.3-28, viewed 9 April 2018,
https://www.aeaweb.org/articles?id=10.1257/jep.28.2.3
Verhoef, PC, Kooge, E & Walk, N 2016, Creating value with big data analytics: Making smarter marketing decisions, Routledge.
Zhong, RY 2016, ‘Big Data for supply chain management in the service and manufacturing sectors: Challenges, opportunities, and future perspectives’, Computers & Industrial Engineering, vol.101, pp.572-591, viewed 10 April 2018,
https://dx.doi.org/10.1016/j.cie.2016.07.013