Money laundering is a multi-billion-dollar problem, and detecting it is very difficult: banks and regulatory authorities struggle to identify illegal transactions. Laundering not only costs countries billions in unpaid taxes but also promotes crime at the expense of their socioeconomic health.

Many automated algorithms aim to detect illegal transactions, but most have a high false positive rate: legitimate transactions are incorrectly flagged as laundering. The opposite error, false negatives (laundering transactions that go undetected), is also a major problem. Naturally, criminals work hard to cover their tracks.
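
To make the two error types concrete, the following minimal sketch counts them for a handful of made-up labels. The function names and the toy labels are illustrative assumptions, not part of the dataset or of any library; 1 stands for a laundering transaction and 0 for a legitimate one.

```python
def confusion_counts(y_true, y_pred):
    """Count (TP, FP, TN, FN) for binary labels: 1 = laundering, 0 = legitimate."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1 and actual == 1:
            tp += 1  # laundering correctly flagged
        elif predicted == 1 and actual == 0:
            fp += 1  # legitimate transaction wrongly flagged (false positive)
        elif predicted == 0 and actual == 0:
            tn += 1  # legitimate transaction correctly passed
        else:
            fn += 1  # laundering that went undetected (false negative)
    return tp, fp, tn, fn


def error_rates(y_true, y_pred):
    """Return (false positive rate, false negative rate)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


# Hypothetical toy labels: 8 legitimate and 2 laundering transactions.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 1, 0, 0, 0, 0, 0, 1, 0]

fpr, fnr = error_rates(y_true, y_pred)
print(f"False positive rate: {fpr:.3f}, false negative rate: {fnr:.3f}")
```

On these toy labels, 8 of 10 predictions are correct, yet half of the laundering transactions are missed, which is why both error rates are worth tracking alongside accuracy.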

The aim of this module is to use IBM’s synthetic dataset of over 10 lakh (1 million) financial transactions to create a deep learning model for fraud detection. The dataset, which is available on Kaggle, is based on a virtual world inhabited by individuals, companies, and banks. Individuals interact with other individuals and with companies. Likewise, companies interact with other companies and with individuals. These interactions can take many forms, e.g. purchase of consumer goods and services, purchase orders for industrial supplies, payment of salaries, repayment of loans, and more. These financial transactions are generally conducted via banks. Using a combination of supervised learning, deep learning, and GridSearchCV-assisted models, this module aims to achieve at least 90% accuracy in identifying illegal transactions.

Goal 1

To organise the money laundering dataset and test it with basic prediction models

Aim: This dataset, consisting of over 10 lakh banking transactions, is semi-organised. The purpose of this goal is to organise and clean it, understand its general characteristics, deduce new information from the data, and create basic prediction models using supervised learning methods like SVM (Support Vector Machines), Naïve Bayes, Logistic Regression, and Random Forest.

Method: The Python programming language will be used to:

  • Create bar and pie charts
  • Create new features of the dataset
  • Clean and organise the data
  • Build prediction models
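
The steps listed above could be sketched roughly as follows, using pandas and scikit-learn. This is only an illustration under stated assumptions: the column names (`timestamp`, `amount`, `payment_format`, `is_laundering`), the toy records, and the amount bands are all hypothetical and may not match the actual dataset.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical toy records; the real dataset's columns and values may differ.
raw = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2022-09-01 00:20", "2022-09-01 09:15", "2022-09-01 09:15",
        "2022-09-01 13:40", "2022-09-02 02:05", "2022-09-02 10:30",
        "2022-09-02 15:45", "2022-09-03 01:10", "2022-09-03 11:00",
        "2022-09-03 23:50", "2022-09-04 03:30", "2022-09-04 12:20",
    ]),
    "amount": [9500.0, 120.0, 120.0, 45.5, 9800.0, None,
               300.0, 9900.0, 75.0, 9700.0, 60.0, 210.0],
    "payment_format": ["Wire", "Card", "Card", "Card", "Wire", "Cheque",
                       "Card", "Wire", "Cheque", "Wire", "Card", "Cheque"],
    "is_laundering": [1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0],
})

# Clean: drop exact duplicate rows and rows with a missing amount.
clean = raw.drop_duplicates().dropna(subset=["amount"]).reset_index(drop=True)

# New features: hour of day, log-scaled amount, and an amount band.
clean["hour"] = clean["timestamp"].dt.hour
clean["log_amount"] = np.log1p(clean["amount"])
clean["amount_band"] = pd.cut(
    clean["amount"],
    bins=[0, 100, 1000, float("inf")],  # illustrative thresholds only
    labels=["small", "medium", "large"],
)

# Counts per band; band_counts.plot(kind="bar") would chart them
# if matplotlib is installed.
band_counts = clean["amount_band"].value_counts()

# Basic prediction model: one-hot encode the payment format, then fit
# a logistic regression on a stratified train/test split.
X = pd.get_dummies(
    clean[["hour", "log_amount", "payment_format"]],
    columns=["payment_format"],
    dtype=float,
)
y = clean["is_laundering"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"Test accuracy on the toy data: {accuracy:.2f}")
```

A score on a dozen toy rows says nothing about real performance; the sketch only shows how the cleaning, feature creation, and modelling steps fit together. The other supervised models named in this goal could be swapped in for `LogisticRegression` in the same way.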

Milestones

To contribute and publish, select a pending milestone.

Completed

There are no completed milestones.

Pending
Understanding the nature of banks’ financial transactions using Python
Using Python to classify financial transaction data according to amount
Generating new features in a financial transactions dataset using Python
Cleaning financial transactions dataset using Python
Support Vector Machine (SVM) model to detect money laundering transactions
Naïve Bayes model to detect money laundering transactions
Logistic Regression model to detect money laundering transactions
KNN model to detect money laundering transactions
Random Forest model to detect money laundering transactions
Decision Tree model to detect money laundering transactions
LSTM model to detect money laundering transactions
Bi-LSTM model to detect money laundering transactions
CNN model to detect money laundering transactions
ANN model to detect money laundering transactions

Goal 2

To build an optimised illegal transactions detection model that reduces false positives

Purpose: Many models already exist for detecting illegal or fraudulent banking transactions. However, they suffer from false positives (legitimate transactions flagged as illegal) and false negatives (illegal transactions identified as legitimate). These errors not only cause financial losses for the bank but also leave the mammoth problem of money laundering unaddressed. The purpose of this goal is to create an optimised illegal transactions detection model using deep learning methods.

Method: We will use optimised deep learning models, tuned with GridSearchCV, in Python (Jupyter Notebook).
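
As a rough illustration of a GridSearchCV-assisted model, the sketch below tunes a random forest on artificially generated, imbalanced data. Several things here are assumptions rather than the module's prescribed method: the random forest stands in for the deep learning models (which would need a scikit-learn-compatible wrapper to be used with GridSearchCV), the data comes from `make_classification` rather than the real dataset, and the parameter grid and the F1 scoring choice are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data: roughly 10% of samples belong to the positive
# ("laundering") class. This is NOT the real transactions dataset.
X, y = make_classification(
    n_samples=400, n_features=8, weights=[0.9, 0.1], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Illustrative hyperparameter grid; real grids would be chosen per model.
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}

# F1 balances false positives against false negatives; it is an assumed
# choice here, not a requirement stated by the module.
search = GridSearchCV(
    RandomForestClassifier(class_weight="balanced", random_state=0),
    param_grid,
    scoring="f1",
    cv=3,
)
search.fit(X_train, y_train)

# Evaluate the best configuration on held-out data.
tn, fp, fn, tp = confusion_matrix(y_test, search.predict(X_test)).ravel()
print("Best parameters:", search.best_params_)
print(f"False positives: {fp}, false negatives: {fn}")
```

Reporting the false positive and false negative counts from the confusion matrix, rather than accuracy alone, keeps the evaluation aligned with the purpose of this goal.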

Requirement: Advanced knowledge of Python is mandatory for attempting any milestone in this goal.

Milestones

To contribute and publish, select a pending milestone.

Completed

There are no completed milestones.

Pending
Optimised SVM model to detect money laundering transactions
Optimised Decision Trees model to detect money laundering transactions
Optimised Naïve Bayes model to detect money laundering transactions
Optimised Logistic Regression model to detect money laundering transactions
Optimised Random Forest model to detect money laundering transactions
Optimised KNN model to detect money laundering transactions
Optimised LSTM model to detect money laundering transactions
Optimised CNN model to detect money laundering transactions