"The battle is more in the technology world at this point than in just having brilliant traders" - Jamie Dimon
Generating alpha and market-making today require asset managers and securities businesses to think differently. Specifically, they require portfolio managers and traders to think beyond Excel.
While Excel is a useful tool for some use-cases, it is no longer sufficient for investing in public markets or for profitable intermediation. This has been the case for quite a few years now, yet a vast majority of asset managers and market-makers still rely heavily on spreadsheets for analysis.
Generating alpha and market-making today involves terabyte-level thinking. Not megabyte.
For example, successful asset managers no longer rely solely on investor relations, sell-side research or earnings releases for security selection and trading strategies. They have moved on from these small, structured and infrequent data sources to those that are large, unstructured and frequent. Today's best alpha generators are leveraging AI-generated, real-time analytics of earnings calls coupled with sentiment analysis instead of manually reading a transcript hours later.
As asset managers and market-makers move towards this tech-forward paradigm, they run into challenges known as the 3 V’s of Big Data: variety, volume and velocity.
Variety: the top asset managers in the world today rely heavily on a wide range of data sources, from satellite images to weather data. Some hedge funds and investment banks ingest hundreds of alternative data sources.
Volume: they are also analyzing terabytes of data to create, validate and backtest trading strategies. For example, they might look at 10 years of tick-level data to backtest a basket trading strategy. That's a lot of data.
Velocity: the more frequently the data changes, the more important it is to be able to analyze it in real time. Alpha-generating signals are ephemeral. The quicker you can analyze the data, the more valuable it is.
By definition, datasets that are large, unstructured and frequent cannot be effectively handled in Excel.
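To make the scale concrete, here is a rough back-of-the-envelope sketch in Python. The only hard fact in it is the row limit of a modern Excel worksheet (1,048,576 rows). The ticks per day, bytes per tick and basket size are illustrative assumptions, not figures from any particular firm or data vendor.

```python
# Row limit of a single worksheet in modern Excel (.xlsx).
EXCEL_MAX_ROWS = 1_048_576

# Illustrative assumptions only; real tick counts vary widely by security.
TICKS_PER_SYMBOL_PER_DAY = 50_000
TRADING_DAYS_PER_YEAR = 252
BYTES_PER_TICK = 40  # e.g. timestamp, symbol id, price, size


def tick_dataset_size(symbols, years):
    """Return (row count, approximate size in bytes) of a tick history."""
    rows = symbols * years * TRADING_DAYS_PER_YEAR * TICKS_PER_SYMBOL_PER_DAY
    return rows, rows * BYTES_PER_TICK


# A hypothetical 50-stock basket backtested over 10 years.
rows, size_bytes = tick_dataset_size(symbols=50, years=10)

# Ceiling division: how many full worksheets the rows would fill.
worksheets_needed = -(-rows // EXCEL_MAX_ROWS)

print(f"rows: {rows:,}")
print(f"approx size: {size_bytes / 1e9:,.0f} GB")
print(f"Excel worksheets needed: {worksheets_needed:,}")
```

Under these assumptions, a single basket backtest already spans billions of rows, which is thousands of times more than one worksheet can hold, before any alternative data is added.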
The good news is that modern open-source tools exist to solve these problems and win the battle in the technology world.
Spark is software that has become the de facto Big Data and AI engine used by companies all over the world. Spark is how social networks personalize feeds, how ride-sharing apps predict arrival times and how streaming apps recommend what to watch next.
Pandas is a Python package for data manipulation and analysis. It’s a very powerful tool for manipulating time series data at a scale that Excel cannot handle. Koalas is a Python package that scales data and processing across clusters, handling workloads far larger than Pandas can on a single machine. Koalas takes Pandas and distributes the compute.
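As a small illustration of the kind of time series manipulation described above, the sketch below uses Pandas to roll raw ticks up into one-minute bars. The timestamps, prices and sizes are made-up sample data, and `minute_bars` is a hypothetical helper name chosen for this example.

```python
import pandas as pd

# Made-up sample ticks; a real feed would have millions of rows per day.
ticks = pd.DataFrame(
    {
        "price": [100.0, 100.5, 101.0, 100.8, 101.2],
        "size": [200, 100, 300, 150, 250],
    },
    index=pd.to_datetime(
        [
            "2020-01-02 09:30:05",
            "2020-01-02 09:30:40",
            "2020-01-02 09:31:10",
            "2020-01-02 09:31:50",
            "2020-01-02 09:32:20",
        ]
    ),
)


def minute_bars(ticks):
    """Aggregate ticks into 1-minute open/high/low/close bars plus volume."""
    price = ticks["price"].resample("1min").ohlc()
    volume = ticks["size"].resample("1min").sum().rename("volume")
    return price.join(volume)


bars = minute_bars(ticks)
print(bars)
```

The same few lines work unchanged on far more rows than a spreadsheet can hold, as long as the data fits in one machine's memory. Beyond that point, the idea behind Koalas is to keep a Pandas-style interface while Spark distributes the work across a cluster.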
Delta Lake is a storage layer that makes your data ready for advanced analytics and AI. It’s a powerful tool for backtesting strategies and for processing new data in real time.
Again, all of these tools are open source.
Successful trading businesses have embraced these tools, along with the public cloud, and have unlocked insights that would not have been possible without them.
Two examples of this are as follows.
1. I know of an advanced trading firm that employs machine learning to analyze billions of stock market events to predict liquidity for a given security at a specific point in time.
2. A hedge fund analyzes traffic data and anonymized credit card data, along with real-time weather data, to predict foot traffic and same-store sales at a retailer.
An important point is that most of the hedge funds that leverage these tools are long-short funds, not quant funds. These tools are embraced by the top fundamental investors to augment their investing strategies, not to replace them.
Open-source software, data and public cloud are indispensable weapons in the ongoing tech battle on Wall Street. When firms use big data and AI to augment the work of portfolio managers and bankers, it becomes a key competitive advantage.
Junta Nakai is the Global Industry Leader for Financial Services at Databricks, a company founded by the creators of Apache Spark. Databricks pioneered the Unified Data Analytics platform that makes it easier for companies to leverage big data and accelerate their AI initiatives. Databricks recently completed a $400mn fundraising round at a $6.2 billion valuation. Investors include Andreessen Horowitz, BlackRock, T. Rowe Price, Tiger Global, Alkeon and Coatue. Junta is a former head of APAC sales at Goldman Sachs.
Photo by Dan Meyers on Unsplash