Jan 24, 2024 6 MIN

Why data quality matters

In this post, we delve into the factors influencing the cost of financial market data and discuss the distinctions between personal and professional usage, the impact of time horizons on data needs, and the varying levels of market coverage. We also explain our approach to designing the OpenBB Discord/Telegram Bot.


There’s a saying in life, “You get what you pay for”. Paying, however, does not necessarily constitute a financial transaction. It might look more like elbow grease, time in the trenches.

Sometimes, the time spent is worth more than it would cost to solve the problem at hand instantly. In this case, forking over the cash is a no-brainer.

But what are we really paying for, and why is there such a wide cost discrepancy when it comes to financial market data?

Cost Factors

The price of market data largely reflects the following factors:

  • Market coverage
  • Order book depth
  • Historical data granularity
  • Transmission latency
  • Consumption methods
  • Usage
  • Ownership rights

Generally, usage gets bucketed into two main categories:

  1. Personal
  2. Professional

Polygon outlines some key differences between them in this blog post.

A main factor for these distinctions is accountability - does the information impact anyone other than yourself?

There’s a lot more at stake when managing money on behalf of someone else, especially if the account size in question poses a systemic risk to the global economy (i.e., G-SIBs). An auditable trail of decisions and reasons satisfying the Compliance Department comes with an added cost.

Conversely, conducting trades in a self-directed account does not require a legal team to scrutinize citations and disclosures. Thus, information is priced accordingly.

Time Horizons

The distribution of information among investors boils down to those with means versus those with time on their hands. Even those with time to spare will eventually run up against a wall built of latency.

Firms pay billions of dollars for timely access to new information, all jockeying for that front-runner position, and one can be certain it cannot be had from a Robinhood trading account.

Information may travel at the speed of light, but real estate within close proximity to an exchange’s servers - shaving precious nanoseconds from transmission times - trades at a premium (see this article from the National Bureau of Economic Research).

Getting data before anyone else, that’s edge.

If your investment horizon is long-term, the latency of market data is not a major concern. For those investors, end-of-day summaries of price, trade, and volume levels will likely suffice. The noise of intraday movements adds no new information to a standard deviation measured over a year, and the thesis will rely more on company fundamentals than on price.

Traders, on the other hand, might be structuring entries or exits over the day, executing when a signal is generated. This is where latency plays a factor. The time between the signal being generated and the order making it to market becomes a risk factor. It could be the only difference between getting fills or not.

Market Coverage

Tiers of data subscriptions cater to these timing specialists, but don’t let the allure of an attractively priced service pull the wool over your eyes. Real-time data is not all the same.

Does your real-time market data include the consolidation of all markets?

  • NYSE American, LLC
  • Nasdaq OMX BX, Inc.
  • NYSE National, Inc.
  • FINRA NYSE TRF
  • FINRA Nasdaq TRF Carteret
  • FINRA Nasdaq TRF Chicago
  • FINRA Alternative Display Facility
  • International Securities Exchange, LLC - Stocks
  • Cboe EDGA
  • Cboe EDGX
  • NYSE Chicago, Inc.
  • New York Stock Exchange
  • NYSE Arca, Inc.
  • Nasdaq
  • Long-Term Stock Exchange
  • Investors Exchange
  • Cboe Stock Exchange
  • Nasdaq Philadelphia Exchange LLC
  • Cboe BYX
  • Cboe BZX
  • MIAX Pearl
  • Members Exchange
The maximum US market coverage available from TradingView is approximately 35%.

Data bundles aimed at retail traders might only cover a handful of the venues listed above. If the service provides real-time data for free (or ultra-low cost), it is quite likely that the feeds are coming from one of three venues:

  • Investors Exchange
  • MIAX
  • EDGX

These venues offer discounted or waived exchange fees to individuals and favourable terms for businesses developing products using their feeds, but the data itself represents only a tiny slice of the complete picture, as illustrated below:

Share of market volume (NASDAQ, IEX, MIAX, EDGX). Source: https://www.cboe.com/us/equities/market_statistics/venue/market/all_market/
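To make the idea of a "tiny slice" concrete, here is a minimal sketch of how the share of consolidated volume visible through a subset of venues could be worked out. The venue names and volume figures are hypothetical placeholders chosen for illustration, not real market statistics, and `coverage_share` is a helper invented for this example.

```python
def coverage_share(venue_volume, subscribed):
    """Fraction of total traded volume that occurs on the subscribed venues."""
    total = sum(venue_volume.values())
    if total == 0:
        raise ValueError("total volume is zero, coverage is undefined")
    seen = sum(vol for name, vol in venue_volume.items() if name in subscribed)
    return seen / total


# Hypothetical daily share volume per venue (illustrative numbers only).
venue_volume = {
    "VENUE_A": 600_000,
    "VENUE_B": 250_000,
    "VENUE_C": 120_000,
    "VENUE_D": 30_000,
}

# A feed that only carries VENUE_D sees a small part of the tape.
share = coverage_share(venue_volume, {"VENUE_D"})
print(f"Visible share of volume: {share:.1%}")
```

Running the same calculation against the per-venue figures your provider actually delivers is a quick way to see how much of the market a "real-time" subscription really covers.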

The coverage, depth, and quality of market data should concern all investors. Traders and investors share common ground in another adage, “garbage in, garbage out”.

How effective could a momentum signal be if it is only seeing 3% of the total trading volume?

Posted quotes will be that exchange’s best bid and offer, not the National BBO, and the order book depth does not reflect the overall liquidity of a stock. Because only the primary exchange marks the official opening price of any given stock, even the opening prices will differ by venue.
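The gap between a single venue's quote and the consolidated best bid and offer can be sketched in a few lines. This is a simplified illustration: the `Quote` type, the venue labels, and the prices are all hypothetical, and a real NBBO calculation also has to deal with sizes, timestamps, and quote eligibility rules that are ignored here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    venue: str   # hypothetical venue label
    bid: float   # best bid posted on that venue
    ask: float   # best offer posted on that venue


def best_bid_offer(quotes):
    """Return (highest bid, lowest ask) across the given venue quotes."""
    best_bid = max(q.bid for q in quotes)
    best_ask = min(q.ask for q in quotes)
    return best_bid, best_ask


# Illustrative numbers only, not real market data.
quotes = [
    Quote("VENUE_A", bid=100.00, ask=100.06),
    Quote("VENUE_B", bid=100.02, ask=100.05),
    Quote("VENUE_C", bid=100.01, ask=100.03),
]

single_venue = best_bid_offer([quotes[0]])   # what a one-venue feed shows
consolidated = best_bid_offer(quotes)        # best prices across all venues

single_spread = round(single_venue[1] - single_venue[0], 2)
consolidated_spread = round(consolidated[1] - consolidated[0], 2)

print("Single venue:", single_venue, "spread", single_spread)
print("Consolidated:", consolidated, "spread", consolidated_spread)
```

In this toy example the one-venue feed shows a wider spread than the consolidated view, which is the kind of distortion a strategy built on partial data would inherit.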

Fees for Real-Time Data

As outlined in this article, real-time data is fee-liable. Somewhere along the way, someone is paying for it. Consolidating all exchange data is an additional expense.

Each trading venue requires a separate license and agreement. Data re-distributors can forward this data without an end-user agreement if they self-host, but warehousing it all with 99.99% uptime comes with significant overhead.

Those expenses are all passed through to the user, but the quality of data delivered remains the responsibility of the distributor. You get what you pay for.

OpenBB Bot Data

When OpenBB was designing its Discord/Telegram Bot, we climbed that decision tree.

Costs needed to be managed and minimized, and we carefully weighed wants versus needs.

We wanted the ability to display real-time prices, and EDGX gave the best terms for a display feed. Price levels fit our criteria. Volume levels, however, made it difficult to produce meaningful OHLC+V charts and other technical indicators.

The stock market data for the Bots product is supplied by Polygon.

They understood our concerns and were willing to work with us on a solution.

The solution we came up with was to back-fill the volume data on a delay while maintaining the live market prices. We placed more value on real-time prices than real-time volume.

This compromise saves a significant amount of money on exchange fees and still provides users with 100% market coverage of volume on a fifteen-minute delay.
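As a rough sketch of the idea, and not the actual OpenBB or Polygon implementation, the compromise can be pictured as joining two streams: live prices, and full-market volume that only becomes available once a bar is fifteen minutes old. The function name `merge_bars`, the data shapes, and the sample values below are assumptions made purely for illustration.

```python
from datetime import datetime, timedelta

DELAY = timedelta(minutes=15)  # delay applied to consolidated volume


def merge_bars(live_prices, consolidated_volume, now):
    """Combine live prices with consolidated volume that arrives on a delay.

    live_prices: dict mapping bar timestamp -> last price (real-time feed)
    consolidated_volume: dict mapping bar timestamp -> full-market volume
    Bars newer than `now - DELAY` keep volume=None until the back-fill arrives.
    """
    cutoff = now - DELAY
    merged = []
    for ts in sorted(live_prices):
        volume = consolidated_volume.get(ts) if ts <= cutoff else None
        merged.append({"time": ts, "price": live_prices[ts], "volume": volume})
    return merged


# Illustrative values only.
now = datetime(2024, 1, 24, 10, 30)
live_prices = {
    datetime(2024, 1, 24, 10, 0): 100.10,
    datetime(2024, 1, 24, 10, 15): 100.25,
    datetime(2024, 1, 24, 10, 29): 100.40,
}
consolidated_volume = {
    datetime(2024, 1, 24, 10, 0): 12_000,
    datetime(2024, 1, 24, 10, 15): 9_500,
}

bars = merge_bars(live_prices, consolidated_volume, now)
for bar in bars:
    print(bar["time"].strftime("%H:%M"), bar["price"], bar["volume"])
```

The most recent bar carries a live price but no volume yet. Once it ages past the delay window, the consolidated figure is filled in.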

As an investor, knowing what you don’t know is as important as the facts you know for certain.

Being an informed investor means doing your own research. Knowing your data, understanding the market plumbing, and having an acute awareness of market participants and their behaviours will help you become a more successful investor.

OpenBB is here to supply the tools for anyone to make better, more informed investment decisions.

Join the OpenBB Hub for free and start exploring our products today!
