
The alternative data revolution

The data deluge driven by digitization, networking, and plummeting storage costs has led to profound qualitative changes in the nature of information available for predictive analytics, often summarized by the five Vs:

  • Volume: The amount of data generated, collected, and stored is orders of magnitude larger as the byproduct of online and offline activity, transactions, records, and other sources. Volumes continue to grow with the capacity for analysis and storage.
  • Velocity: Data is generated, transferred, and processed to become available near, or at, real-time speed.
  • Variety: Data is organized in formats no longer limited to structured, tabular forms, such as CSV files or relational database tables. Instead, new sources produce semi-structured formats, such as JSON or HTML, and unstructured content, including raw text, images, and audio or video data, adding new challenges to render data suitable for ML algorithms.
  • Veracity: The diversity of sources and formats makes it much more difficult to validate the reliability of the data's information content.
  • Value: Determining the value of new datasets can be much more time- and resource-consuming, as well as more uncertain than before.

For algorithmic trading, new data sources offer an informational advantage if they provide access to information unavailable from traditional sources or provide access sooner. Following global trends, the investment industry is rapidly expanding beyond market and fundamental data to alternative sources to reap alpha through an informational edge. Annual spending on data, technological capabilities, and related talent is expected to increase from the current $3 billion by 12.8 percent annually through 2020.

Today, investors can access macro or company-specific data in real time that, historically, has been available only at a much lower frequency. Use cases for new data sources include the following:

  • Online price data on a representative set of goods and services can be used to measure inflation.
  • The number of store visits or purchases permits real-time estimates of company- or industry-specific sales or economic activity.
  • Satellite images can reveal agricultural yields, or activity at mines or on oil rigs before this information is available elsewhere.
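
As a rough illustration of the first use case, the following sketch computes an unweighted price index from online prices as the geometric mean of price relatives (a Jevons-style index). The item names, prices, and the function name `jevons_index` are hypothetical and chosen only for illustration; a real inflation measure would also need expenditure weights, quality adjustments, and a far larger, representative basket.

```python
from math import exp, log


def jevons_index(base_prices, current_prices):
    """Geometric mean of price relatives for items observed in both periods."""
    # Only items priced in both periods can contribute a price relative.
    common = sorted(set(base_prices) & set(current_prices))
    if not common:
        raise ValueError("no overlapping items between the two periods")
    log_relatives = [log(current_prices[k] / base_prices[k]) for k in common]
    return exp(sum(log_relatives) / len(log_relatives))


# Hypothetical scraped prices for two periods (assumed data, not real quotes).
base = {"milk": 1.00, "bread": 2.00, "coffee": 4.00}
current = {"milk": 1.10, "bread": 2.00, "coffee": 4.40, "tea": 3.00}

index = jevons_index(base, current)
inflation_pct = (index - 1) * 100
print(f"Estimated price change: {inflation_pct:.2f}%")
```

Items that appear in only one period, such as `tea` here, are ignored because no price relative can be formed for them.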

As the standardization and adoption of big datasets advances, the information contained in conventional data will likely lose most of its predictive value.

Furthermore, the capability to process and integrate diverse datasets and apply ML allows for complex insights. In the past, quantitative approaches relied on simple heuristics to rank companies using historical data for metrics such as the price-to-book ratio, whereas ML algorithms synthesize new metrics and learn and adapt such rules while taking into account evolving market data. These insights create new opportunities to capture classic investment themes such as value, momentum, quality, and sentiment:

  • Momentum: ML can identify asset exposures to market price movements, industry sentiment, or economic factors.
  • Value: Algorithms can analyze large amounts of economic and industry-specific structured and unstructured data, beyond financial statements, to predict the intrinsic value of a company.
  • Quality: The sophisticated analysis of integrated data allows for the evaluation of customer or employee reviews, e-commerce, or app traffic to identify gains in market share or other underlying earnings quality drivers.
  • Sentiment: The real-time processing and interpretation of news and social media content permits ML algorithms to both rapidly detect emerging sentiment and synthesize information from diverse sources into a more coherent big picture.
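
For contrast with these ML-driven approaches, the kind of simple heuristic that quantitative strategies relied on in the past, ranking companies on a single historical metric such as the price-to-book ratio, might look like the sketch below. The tickers, prices, book values, and the function name `rank_by_price_to_book` are hypothetical; this is a minimal illustration of a static ranking rule, not a recommended strategy.

```python
def rank_by_price_to_book(fundamentals):
    """Return tickers sorted from lowest to highest price-to-book ratio."""
    ratios = {}
    for ticker, (price, book_value_per_share) in fundamentals.items():
        if book_value_per_share <= 0:
            # The ratio is not meaningful for non-positive book value.
            continue
        ratios[ticker] = price / book_value_per_share
    return sorted(ratios, key=ratios.get)


# Hypothetical (price, book value per share) pairs; assumed data.
fundamentals = {
    "AAA": (50.0, 25.0),
    "BBB": (30.0, 30.0),
    "CCC": (80.0, 10.0),
    "DDD": (12.0, -3.0),
}

ranking = rank_by_price_to_book(fundamentals)
print(ranking)
```

The rule is fixed in advance and uses one metric only, which is the limitation the text attributes to earlier quantitative approaches.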

In practice, however, data containing valuable signals is often not freely available and is typically produced for purposes other than trading. As a result, alternative datasets require thorough evaluation, costly acquisition, careful management, and sophisticated analysis to extract tradable signals.
