
Data sources and machine learning techniques

Finance has always been a data-driven business. Historically, though, the data driving financial markets has been the data produced by those markets and the companies they track: prices, volumes, fundamentals, and estimates. All of it is created as a byproduct of trading or by the financial institutions participating in those markets.

In quantitative research, money managers are searching high and low for new sources of information that may offer an untapped source of alpha, meaning fund outperformance relative to a benchmark. Better information generates alpha, and nowadays this sort of information is supplied by alternative data.

Source: Eagle Alpha (link)

Alternative data refers to data that falls outside traditional sources such as financial statements, SEC filings, management presentations, and press releases. It is used to gain insight during the investment process. Examples of alternative data sets include credit card transaction data, mobile device data, IoT sensor data, social media sentiment, product reviews, web traffic, app usage, ESG (environmental, social, and corporate governance) data, weather patterns, logistics data, and satellite imagery. The last three are the ones preferred by quantitative hedge funds. To make sense of all this data, machine learning techniques can be applied to extract insights from text, assign ranking scores, and predict future returns.
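To make "extract insights from text data" and "assign ranking scores" concrete, here is a minimal sketch in Python. It is not a production method: the word lists, tickers, and reviews are all invented for illustration, and a simple word-count lexicon stands in for the trained language models a real pipeline would likely use.

```python
from __future__ import annotations

# Hypothetical word lists, invented for illustration only.
POSITIVE = {"great", "love", "fast", "reliable"}
NEGATIVE = {"broken", "slow", "refund", "terrible"}


def sentiment_score(text: str) -> float:
    """Return (positive - negative) / matched words, in [-1, 1]; 0.0 if nothing matches."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


def rank_by_sentiment(reviews: dict[str, list[str]]) -> list[tuple[str, float]]:
    """Average the review scores per ticker and sort from most to least positive."""
    averages = {
        ticker: sum(sentiment_score(r) for r in texts) / len(texts)
        for ticker, texts in reviews.items()
        if texts
    }
    return sorted(averages.items(), key=lambda kv: kv[1], reverse=True)


# Made-up product reviews for two made-up tickers.
reviews = {
    "AAA": ["Great product, fast shipping!", "I love it, very reliable."],
    "BBB": ["Arrived broken, want a refund.", "Terrible and slow support."],
}
ranking = rank_by_sentiment(reviews)
print(ranking)
```

The ranking puts the ticker with the more favorable reviews first. In practice the score would be one input among many, not a trading signal on its own.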

The figure below illustrates various categories of machine learning and artificial intelligence, along with potential applications in trading strategies that use different sources of data. The steps shown in grey boxes are provided to the algorithm up front (as part of the training set), while the green boxes are generated by the machine learning algorithm.

Source: JPM (link)
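The split between what is provided to the algorithm and what it generates can be sketched with the simplest supervised model, a one-variable least-squares regression. Everything here is an assumption made for illustration: the feature (a change in web traffic), the label (a next-period return), and the numbers are made up, and the figure does not prescribe this particular model.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Provided to the algorithm (the training set): made-up feature/label pairs.
traffic_change = [1.0, 2.0, 3.0, 4.0]   # hypothetical feature, in percent
next_return = [0.5, 1.0, 1.5, 2.0]      # hypothetical label, in percent

# Generated by the algorithm: the fitted parameters and a new prediction.
slope, intercept = fit_line(traffic_change, next_return)
predicted = slope * 5.0 + intercept
print(slope, intercept, predicted)
```

The training pairs play the role of the inputs supplied up front, while the fitted line and its prediction for an unseen feature value are what the algorithm produces.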
