Hundreds of papers in finance claim to have discovered successful new investment strategies. However, when those strategies are deployed in real markets, they often fail to perform as they did in the lab. Some of the major reasons for this failure are as follows:
- Real-life market data tends to differ substantially from the data used in the lab. Lab datasets are complete and corrected, whereas in live settings data may arrive minutes or hours after markets close.
- Data is sometimes revised days, weeks, or months after the period it covers. For instance, a dataset may contain restated earnings figures that were not available at the time.
- The sampling of data may not be random: because the model developer has ex-post knowledge of events, the sample is likely drawn from normal times rather than from periods when markets behaved abnormally.
- Apart from tick data, most datasets, especially fundamentals data, are extremely limited.
- A pattern may be discovered, but without a sound economic foundation it is too risky to deploy in production.
- The model works well on the data used by the modelers but fails to generalize to unseen data (overfitting).
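The restatement problem can be mitigated by handling data in a point-in-time fashion: a backtest should only see the value of each record as it was known on the backtest date. A minimal sketch of this idea follows; the table, its column names, and the `point_in_time` helper are hypothetical, not from any particular data vendor.

```python
import pandas as pd

# Hypothetical earnings table with an "available_at" timestamp recording when
# each figure (original or restated) actually became public.
earnings = pd.DataFrame({
    "period": ["2020Q4", "2020Q4", "2021Q1"],
    "eps": [1.10, 0.85, 1.20],  # 0.85 is a later restatement of the 1.10 figure
    "available_at": pd.to_datetime(["2021-02-01", "2021-06-15", "2021-05-01"]),
})

def point_in_time(df, as_of):
    """Return the latest value per period that was known as of `as_of`."""
    known = df[df["available_at"] <= pd.Timestamp(as_of)]
    return known.sort_values("available_at").groupby("period").last()

# A backtest dated 2021-03-01 must see the original 1.10,
# not the 0.85 restatement published in June.
print(point_in_time(earnings, "2021-03-01")["eps"].to_dict())  # → {'2020Q4': 1.1}
```

Filtering on the availability timestamp rather than the reporting period is what prevents lookahead bias: the restated figure only enters the dataset once its publication date has passed.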
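The overfitting point can be demonstrated with a toy simulation (all parameters below are illustrative): when many candidate strategies are tried on the same data, one of them will almost always look profitable in-sample by chance alone, yet that apparent edge vanishes out of sample.

```python
import numpy as np

rng = np.random.default_rng(0)

# Purely random daily returns: no strategy can have genuine predictive power here.
n_days, n_strategies = 1000, 500
returns = rng.normal(0.0, 0.01, size=n_days)

# Each "strategy" is a random long/short signal per day.
signals = rng.choice([-1, 1], size=(n_strategies, n_days))

# Split the sample and pick the best performer on the first half.
in_sample, out_sample = slice(0, 500), slice(500, 1000)
pnl_in = (signals[:, in_sample] * returns[in_sample]).sum(axis=1)
best = int(np.argmax(pnl_in))

# The same "winning" strategy evaluated on data it was not selected on.
pnl_out = (signals[best, out_sample] * returns[out_sample]).sum()
print(f"best strategy, in-sample P&L:  {pnl_in[best]:.4f}")
print(f"same strategy, out-of-sample:  {pnl_out:.4f}")
```

Selecting the maximum over hundreds of random candidates is exactly the multiple-testing trap: the in-sample winner's performance is an artifact of the search, not evidence of a real pattern.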
For these reasons, it is important that standards be developed for machine learning in finance. AIAI is working to establish such standards.