Illegal transactions detection model to prevent money laundering

working

Created by Riya Jain on 08, Feb 24 & supervised by Abhinash Jena

Money laundering is a multi-billion-dollar issue, and detecting it is very difficult. Banks and regulatory authorities struggle to identify these illegal transactions. Not only does laundering cost billions in unpaid taxes, but it also promotes crime at the cost of a country's socioeconomic health.

Many automated algorithms aim to detect illegal transactions, but most of them have a high false positive rate: legitimate transactions are incorrectly flagged as laundering. The converse is also a major problem: false negatives, i.e. laundering transactions that go undetected. Naturally, criminals work hard to cover their tracks.
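To make the two error types concrete, here is a minimal sketch in plain Python. The labels are made-up toy values (1 = laundering, 0 = legitimate), and the helper names `confusion_counts` and `error_rates` are hypothetical, introduced only for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = laundering, 0 = legitimate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # laundering, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate, not flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # laundering, missed
    return tp, fp, tn, fn


def error_rates(y_true, y_pred):
    """Return false positive rate, false negative rate and accuracy."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
        "accuracy": (tp + tn) / len(y_true),
    }


# Toy example: 10 transactions, 2 of which are laundering.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
rates = error_rates(y_true, y_pred)
print(rates)
```

In this toy case the accuracy is 0.8 even though half of the laundering transactions are missed, which is why the false positive and false negative rates are worth tracking alongside accuracy.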

The aim of this module is to use IBM’s synthetic dataset of over 10 lakh (one million) financial transactions to create a deep learning model for fraud detection. The dataset, which is available on Kaggle, is based on a virtual world inhabited by individuals, companies, and banks. Individuals interact with other individuals and companies. Likewise, companies interact with other companies and with individuals. These interactions can take many forms, e.g. purchase of consumer goods and services, purchase orders for industrial supplies, payment of salaries, repayment of loans, and more. These financial transactions are generally conducted via banks. Using a combination of supervised learning, deep learning, and GridSearchCV-assisted models, this module aims to achieve at least 90% accuracy in identifying illegal transactions.

Goal 1

To organise the money laundering dataset and test it with basic prediction models

Aim: This dataset, consisting of over 10 lakh banking transactions, is semi-organised. The purpose of this goal is to organise and clean it, understand its general characteristics, deduce new information from the data, and create basic prediction models using supervised learning methods like SVM (Support Vector Machines), Naïve Bayes, Logistic Regression, and Random Forest.

Method: The Python programming language will be used to:

Create bar/pie charts
Create new features of the dataset
Clean and organise the data
Build prediction models
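The steps above can be sketched as follows. This is an illustration under stated assumptions, not the project's actual pipeline: the data is randomly generated toy data, and the column names (`timestamp`, `amount_paid`, `payment_format`, `is_laundering`) are hypothetical stand-ins for whatever the real dataset provides. Logistic Regression is used only as one example of the basic models named in the aim.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy stand-in for the real data; all column names here are assumptions.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "timestamp": pd.Timestamp("2022-09-01") + pd.to_timedelta(np.arange(n), unit="h"),
    "amount_paid": rng.lognormal(mean=6.0, sigma=1.5, size=n),
    "payment_format": rng.choice(["Cheque", "ACH", "Wire"], size=n),
    "is_laundering": rng.integers(0, 2, size=n),
})
df = pd.concat([df, df.iloc[[0]]], ignore_index=True)  # inject one duplicate row
df.loc[5, "amount_paid"] = np.nan                      # inject one missing value

# Clean and organise: drop exact duplicates and rows with a missing amount.
clean = df.drop_duplicates().dropna(subset=["amount_paid"]).copy()

# Create new features: hour of day and a log-scaled amount.
clean["hour"] = clean["timestamp"].dt.hour
clean["log_amount"] = np.log1p(clean["amount_paid"])

# Counts per payment format; these could feed a bar or pie chart,
# e.g. format_counts.plot(kind="bar") when matplotlib is installed.
format_counts = clean["payment_format"].value_counts()

# Build a basic prediction model on one-hot encoded features.
X = pd.get_dummies(
    clean[["hour", "log_amount", "payment_format"]],
    columns=["payment_format"],
    dtype=float,
)
y = clean["is_laundering"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
print(len(clean), round(test_accuracy, 3))
```

Because the toy labels are random, the accuracy printed here carries no meaning; the sketch only shows the order of operations.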

Milestones

Completed

There are no completed milestones.

Pending

Understanding the nature of banks’ financial transactions using Python

1000 words
unpaid
1 day later
No files attached

Using Python to classify financial transaction data according to amount

1000 words
unpaid
2 days later
No files attached

Generating new features in a financial transactions dataset using Python

1000 words
unpaid
2 days later
No files attached

Cleaning financial transactions dataset using Python

1000 words
unpaid
2 days later
No files attached

Support Vector Machine (SVM) model to detect money laundering transactions

1000 words
unpaid
2 days later
No files attached

Naïve Bayes model to detect money laundering transactions

1000 words
unpaid
2 days later
No files attached

Logistic Regression model to detect money laundering transactions

1000 words
unpaid
2 days later
No files attached

KNN model to detect money laundering transactions

1000 words
unpaid
2 days later
No files attached

Random Forest model to detect money laundering transactions

1000 words
unpaid
2 days later
No files attached

Decision Tree model to detect money laundering transactions

1000 words
unpaid
2 days later
No files attached

LSTM model to detect money laundering transactions

1000 words
unpaid
2 days later
No files attached

Bi-LSTM model to detect money laundering transactions

1000 words
unpaid
2 days later
No files attached

CNN model to detect money laundering transactions

1000 words
unpaid
2 days later
No files attached

ANN model to detect money laundering transactions

1000 words
unpaid
2 days later
No files attached

Descriptive Statistics and Correlation Analysis

1000 words
unpaid
1 day later
No files attached

Assumption Testing for Regression Analysis

1000 words
unpaid
1 day later
No files attached

Ordinary Least Squares (OLS) Regression Analysis

1000 words
unpaid
1 day later
No files attached

Principal Component Analysis (PCA)

1000 words
unpaid
1 day later
No files attached

Goal 2

To build an optimised illegal transactions detection model which reduces false positives

Purpose: Many models already exist for detecting illegal or fraudulent banking transactions. However, they suffer from false positives, i.e. legitimate transactions flagged as illegal, and from the converse, false negatives, i.e. illegal transactions identified as legitimate. These errors not only cause financial losses for the bank but also leave the mammoth problem of money laundering unaddressed. The purpose of this goal is to create an optimised illegal transactions detection model by using deep learning methods.

Method: We will use optimised deep learning models, tuned with the GridSearchCV process, in Python (Jupyter Notebook).
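As a rough illustration of the GridSearchCV process, the sketch below tunes a scikit-learn decision tree on synthetic, imbalanced data. The decision tree is a stand-in chosen to keep the example self-contained; a deep learning model would typically need a scikit-learn-compatible wrapper before it could be passed to `GridSearchCV`. The parameter grid, the `f1` scoring choice, and the data are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced data: roughly 10% of samples belong to class 1.
X, y = make_classification(
    n_samples=1000, n_features=8, weights=[0.9, 0.1], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical search space; real values would depend on the model and data.
param_grid = {
    "max_depth": [3, 5, None],
    "min_samples_leaf": [1, 5, 20],
    "class_weight": [None, "balanced"],
}

# F1 balances precision (few false positives) and recall (few false negatives).
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0), param_grid, scoring="f1", cv=5
)
search.fit(X_train, y_train)

# Evaluate the best estimator on held-out data.
tn, fp, fn, tp = confusion_matrix(
    y_test, search.predict(X_test), labels=[0, 1]
).ravel()
print(search.best_params_)
print("false positives:", fp, "false negatives:", fn)
```

Scoring on F1 rather than accuracy is one possible design choice here: with imbalanced classes, accuracy can stay high even when most positive cases are missed.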

Requirement: Advanced-level knowledge of Python is mandatory for attempting any milestone in this goal.

Milestones

Completed

There are no completed milestones.

Pending

Optimised SVM model to detect money laundering transactions

1000 words
unpaid
2 days later
No files attached

Optimised Decision Trees model to detect money laundering transactions

1000 words
unpaid
2 days later
No files attached

Optimised Naïve Bayes model to detect money laundering transactions

1000 words
unpaid
2 days later
No files attached

Optimised Logistic Regression model to detect money laundering transactions

1000 words
unpaid
2 days later
No files attached

Optimised Random Forest model to detect money laundering transactions

1000 words
unpaid
2 days later
No files attached

Optimised KNN model to detect money laundering transactions

1000 words
unpaid
2 days later
No files attached

Optimised LSTM model to detect money laundering transactions

1000 words
unpaid
2 days later
No files attached

Optimised CNN model to detect money laundering transactions

1200 words
unpaid
2 days later
No files attached