Investment banks, whose activities span both trading and deal making, are susceptible to information leakage between the two. Traders often have access to material non-public information (MNPI), i.e., insider knowledge, from the deal makers; if used, it can generate significantly higher (illegitimate) profits for the trader or the employer. In the recent past, a few traders from leading investment banks have been found guilty of using MNPI and punished. It benefits banks, both financially and reputationally, to track insider trading activity and take action against offenders.
However, manually reviewing millions of employee trades to flag potentially fraudulent activity is difficult and resource-intensive, both financially and intellectually. What can help is a set of robust logical and statistical filters designed to reduce the number of trades and traders subject to manual review, thereby lowering the overall cost of regulatory compliance.
The approach described above requires a non-trivial up-front effort to acquire, aggregate, and clean the trading data. After cleaning the trade data and imputing missing values for periods with no recorded trades (on the assumption that no trade took place), we build a general understanding of each trader's behavior. We supplement this understanding with logical and statistical filters that combine basic measures (such as the directionality of trading transactions and the means and variances of trade values), trend analysis (such as the correlation between trades and market movements, or deviation from a trader's historical trading pattern), and statistical models (such as time-series models).
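As a rough illustration of the basic measures above, the sketch below fills periods without trades with zero, computes the directionality of a trader's activity, and scores a new trade against the trader's own history. The function names, the sign convention (positive values are buys, negative values are sells), and the z-score threshold of 3 are all assumptions made for the example, not part of any particular production filter.

```python
import math
from statistics import mean, pstdev


def impute_missing_periods(trades_by_period, periods):
    """Net trade value per period, using 0.0 where no trade was recorded."""
    return [trades_by_period.get(p, 0.0) for p in periods]


def directionality(values):
    """Share of non-zero trades that are buys (positive); None if no trades."""
    active = [v for v in values if v != 0]
    if not active:
        return None
    return sum(1 for v in active if v > 0) / len(active)


def deviation_from_history(history, latest):
    """Z-score of the latest trade value against the trader's own history."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # A perfectly flat history: any different value is an unbounded deviation.
        return 0.0 if latest == mu else math.copysign(math.inf, latest - mu)
    return (latest - mu) / sigma


def flag_for_review(history, latest, z_threshold=3.0):
    """True if the latest trade is unusually far from the historical pattern."""
    return abs(deviation_from_history(history, latest)) > z_threshold


# Hypothetical monthly net trade values for one trader; 2020-02 has no trades.
periods = ["2020-01", "2020-02", "2020-03", "2020-04"]
trades = {"2020-01": 100.0, "2020-03": -50.0, "2020-04": 150.0}

history = impute_missing_periods(trades, periods)
print(history)                          # imputed series, zero for 2020-02
print(directionality(history))          # fraction of active periods that were buys
print(flag_for_review(history, 1000.0)) # an unusually large trade
print(flag_for_review(history, 120.0))  # a trade in line with history
```

A filter of this kind only narrows the set of trades sent for manual review; the threshold would need to be tuned with the business team, since a tighter value flags more legitimate trades and a looser one lets more suspect trades through.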

Fig 1: Example of correlation between market movements and trading action
Fig 2: Example of analyzing directionality of trading actions
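One possible way to quantify the relationship between market movements and trading action is a Pearson correlation between a trader's net trades in one period and the market return in a later period. The sketch below is a minimal version under that assumption; the `lead` parameter, the function names, and the sample numbers are hypothetical, and a high correlation is only a reason to look closer, not evidence of wrongdoing.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series; 0.0 if either is constant."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def lead_correlation(net_trades, market_returns, lead=1):
    """Correlation of trades in period t with market returns in period t + lead."""
    if lead < 0:
        raise ValueError("lead must be non-negative")
    if lead == 0:
        return pearson(net_trades, market_returns)
    # Drop the last `lead` trades and the first `lead` returns to align the series.
    return pearson(net_trades[:-lead], market_returns[lead:])


# Hypothetical data: signed net trade sizes and the market return per period.
net_trades = [1.0, 2.0, 3.0, 4.0, 5.0]
market_returns = [0.0, 2.0, 4.0, 6.0, 8.0]

# Trades that consistently anticipate the next period's move correlate strongly.
print(lead_correlation(net_trades, market_returns, lead=1))
```

Comparing the correlation at several leads against the trader's own historical values is one way to separate a persistent strategy from a sudden change in behavior.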
Some of the complexity of this approach lies in the biases inherent in any technique, which both the analytics and business teams should be aware of. Whether an ordinary least squares (OLS) model works as well as a time-series model for identifying suspect transactions, for instance, is a hotly debated question. The key to this problem, as in most advanced analytics problems, is strong coordination between business and analytics. Subject matter expertise in investment banking, trading, stock price movements, etc. must be built into a data-driven approach to solving the overall business problem.