"Incorrect" Daily Data

Discussion in 'Data Sets and Feeds' started by dholliday, Mar 2, 2021.

  1. dholliday


    Here is an example of data that makes backtesting difficult, followed by a question about other data services (Norgate, Polygon.io, Rithmic, ActiveTick, etc.).
    Monday, March 1st, 2021:
    Tesla's (TSLA) daily data shows an upper wick with a high of 872. This, of course, never happened, at least not as a price you could act on. Tesla's actual high through the close of RTH was 719.
    Yahoo Finance Charts, TradingView, and IQFeed all show a high of 872. Interestingly, IB daily charts show the correct value of 719. IQFeed minute and tick data are correct, so you can build your own correct daily data if need be.
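Building a daily bar from minute data, as mentioned above, could look roughly like the sketch below. It is a minimal illustration only: the `Bar` type, the `daily_from_minutes` helper, and the 09:30 to 16:00 exchange-local session window are assumptions made for the example, not part of any vendor's API.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass
class Bar:
    ts: datetime   # bar start time, assumed to be exchange-local
    open: float
    high: float
    low: float
    close: float
    volume: int


# Assumed regular trading hours for US equities (exchange-local time).
RTH_OPEN = time(9, 30)
RTH_CLOSE = time(16, 0)


def daily_from_minutes(bars, rth_only=True):
    """Aggregate one session's minute bars into (open, high, low, close, volume).

    Returns None when no bars remain after filtering.
    """
    if rth_only:
        bars = [b for b in bars if RTH_OPEN <= b.ts.time() < RTH_CLOSE]
    if not bars:
        return None
    bars = sorted(bars, key=lambda b: b.ts)
    return (
        bars[0].open,                 # first bar's open
        max(b.high for b in bars),    # highest high seen in the session
        min(b.low for b in bars),     # lowest low seen in the session
        bars[-1].close,               # last bar's close
        sum(b.volume for b in bars),  # total volume
    )
```

Because the daily high here is just the maximum of the minute highs, a print that never appeared in the minute data cannot appear in the rebuilt daily bar.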
    Which data services with an API provide both real-time tick data and historical daily data that matches that tick data?
    Thanks
     
  2. Perhaps the 872 high occurred during premarket or after hours, and some data sources only include regular trading hours.
     
  3. guru



    What do you mean by “matching real-time tick data”? They probably all match, so it’s just a question of filtering odd-lot, delayed, out-of-sequence, and dark pool trades, as well as cancellations and corrections. Some people complain that providers filter out too much, others complain that they don’t filter enough, while others don’t specify or don’t know what they really want.
     
    Last edited: Mar 3, 2021
  4. dholliday


    It did not occur in the pre/post-market data. These are trades that occurred off the exchanges but were reported at a later time. The point is that you could not participate in a trade at that price. If you were watching all day (including pre/post-market), you would not see a trade above 719. The daily data is "corrected" at some point after the market close. You will sometimes see this when the high at the end of the day is, for example, 719, and when you come back the next day (or do a refresh) the chart shows a high of 872. They have "corrected" the data with unusable data. I don't care if two mutual funds traded a stock between themselves at some crazy price and reported it after the close. It is not a trade you could participate in. You don't want to backtest with this data. I see this about once a month in my trading. I run a backtest in the evening and compare it with actual trades taken by my systems. If, for example, I had bought TSLA at 700 with a sell limit at 850, my backtest would show a huge profit, but since the price never went above 719, that exit would never have filled.
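The TSLA example above can be reduced to a one-line fill check. The sketch below treats the 850 exit as a resting sell order that fills only if price trades at or above it; the `exit_fills` helper is hypothetical, and the three prices are the ones quoted in this thread.

```python
def exit_fills(exit_price: float, session_high: float) -> bool:
    """A resting sell above the market can only fill if price reaches it."""
    return session_high >= exit_price


vendor_daily_high = 872.0   # high on the "corrected" daily bar
tradable_high = 719.0       # high actually seen in tick/minute data
exit_price = 850.0          # exit level from the example

print(exit_fills(exit_price, vendor_daily_high))  # True: backtest books the profit
print(exit_fills(exit_price, tradable_high))      # False: live system never exits
```

The two answers differ only because of which high is fed to the backtest, which is the whole problem.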
     
  5. dholliday


    My systems watch every tick/trade and every bid/ask change throughout the day for many symbols (stocks).
    I bring this up now because everyone has access to Yahoo Finance charts or TradingView. On either one, bring up the daily chart for Tesla, look at 3/1/21, and see the high of 872. Now switch to 1-minute bars (include pre/post-market if you like). The RTH high was 719, and the post-market high was 720-something. Though you can't test it now, during RTH the daily bar's high was 719. At some time after the close, I came back to my computer and hit refresh, and the daily bar changed to show the 872 high.
    When I download tick and minute data (IQFeed), it is correct (un-corrected). This is the only data you can trade on. You can't trade on "corrected" data.
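One way to catch these days automatically is to compare a vendor's daily high/low against the range rebuilt from tick or minute data. The sketch below is only an illustration: `find_suspect_days`, the dict layout (`date -> (high, low)`), and the 0.5% default tolerance are all assumptions chosen for the example.

```python
def find_suspect_days(vendor, rebuilt, tolerance=0.005):
    """Return days where the vendor range extends beyond the rebuilt range.

    vendor, rebuilt: dicts mapping a date key to a (high, low) tuple.
    tolerance: fractional excess allowed before a day is flagged.
    Each result is (day, high_gap, low_gap), with gaps as fractions.
    """
    suspects = []
    for day, (v_high, v_low) in sorted(vendor.items()):
        if day not in rebuilt:
            continue  # nothing to compare against
        r_high, r_low = rebuilt[day]
        high_gap = (v_high - r_high) / r_high  # vendor high above rebuilt high
        low_gap = (r_low - v_low) / r_low      # vendor low below rebuilt low
        if high_gap > tolerance or low_gap > tolerance:
            suspects.append((day, round(high_gap, 4), round(low_gap, 4)))
    return suspects
```

With a vendor high of 872 against a rebuilt high of 719, the high gap is roughly 21%, far outside any reasonable tolerance.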
    I have observed this problem for at least the last 10 years. So this post is to warn backtesters and to ask whether newer services such as Polygon.io or ActiveTick handle this differently.
    Thanks
     
  6. guru


    Yes, I’ve also seen a lot of bad data, on both daily and x-minute charts from various data providers, especially outside RTH, though it can also happen during RTH. For that reason I also filter and summarize tick data. My summarized minute data doesn’t show that crazy $872 price during the day, so it could be some special trade condition not caught by some systems. I do see an $820 price after hours on March 1st, but my system flagged it as likely invalid because it was out of sequence, an odd lot, and from a dark pool.
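A filter along these lines might be sketched as follows. This is not the poster's actual system: the `Trade` fields, the `suspicion_score` rule (one point each for out-of-sequence, odd lot, and off-exchange), and the threshold are hypothetical, and real feeds encode these properties as vendor-specific condition codes.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    price: float
    size: int
    out_of_sequence: bool  # reported late relative to its execution time
    off_exchange: bool     # e.g. a dark pool print


def suspicion_score(trade: Trade, round_lot: int = 100) -> int:
    """Count how many of the three warning signs a trade carries (0 to 3)."""
    return sum([
        trade.out_of_sequence,
        trade.size < round_lot,  # odd lot
        trade.off_exchange,
    ])


def filtered_high(trades, max_score: int = 1):
    """Highest price among trades with at most `max_score` warning signs."""
    kept = [t.price for t in trades if suspicion_score(t) <= max_score]
    return max(kept) if kept else None
```

Where to set the threshold is exactly the "too much versus not enough filtering" trade-off discussed earlier in the thread.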
     
  7. benwm


    Thanks for this thread, @dholliday, you've raised an important subject for backtesting. I think the implication is that it might be preferable to construct our own daily charts, etc. from minute or tick prices to get the "tradeable" daily OHLC? I wonder how traders who use the prior day's OHLC for pivot points or other indicators based on the high and low handle this situation.

    Perhaps this information could be incorporated into a broader trading system. For example, take all the S&P 500 stocks and count how many highs and lows occur at an "untradeable" price on a particular day. You would get a daily metric ranging from +500 to -500 of "dark pool untradeable activity at extremes", where larger absolute numbers are a potential red flag.
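One possible reading of this metric, sketched under assumptions: each symbol adds +1 when its reported daily high exceeds the high rebuilt from tradable prices and -1 when its reported low undercuts the rebuilt low, so a 500-stock universe yields a value between -500 and +500. The function name, input layout, and scoring rule are illustrative guesses at the idea, not a defined indicator.

```python
def untradeable_extremes_score(reported, tradable, tolerance=0.0):
    """Net count of untradeable highs (+1) and untradeable lows (-1) for one day.

    reported, tradable: dicts mapping symbol -> (high, low).
    tolerance: fractional slack before an extreme counts as untradeable.
    """
    score = 0
    for symbol, (rep_high, rep_low) in reported.items():
        if symbol not in tradable:
            continue  # no rebuilt range to compare against
        trd_high, trd_low = tradable[symbol]
        if rep_high > trd_high * (1 + tolerance):
            score += 1   # reported high never traded
        if rep_low < trd_low * (1 - tolerance):
            score -= 1   # reported low never traded
    return score
```

Note that under this rule a symbol with both an untradeable high and an untradeable low nets to zero; counting the two sides separately would avoid that cancellation.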

    If such an indicator hit a higher level at quarter or year end, maybe this is indicative of some minor manipulation, i.e. more prices occurring that no-one could really trade off? A bit like an institution doing a tiny trade in an illiquid product near the close of day at year end to create a more favourable price to "mark their book".

    Or maybe I'm just reading into this too much.. :)
     
  8. SunTrader


    Yahoo free data (or any other) - ya get what ya pay 4.
     
  9. jharmon


    The $872 print was caused by an out-of-sequence trade reported at 17:51 (i.e. after hours). It was actually executed on Nasdaq, and under the UTP rules it is a valid value to affect the high of the day. Can't even blame a dark pool for this one!

    Thankfully I don't trade outside of regular trading hours.

    UTP means "Unlisted Trading Privileges", which is a very poor name.

    What it really means is:
    Trade aggregator for Nasdaq-listed stocks across all trading venues.

    If you think this is outrageous, contact UTP and tell them that their trade conditions for daily high/low are irrational. https://utpplan.com/support
     
  10. SunTrader


    TradeStation has the after-hours range as $725.44 to $716.82 (and BTW I looked at the tick and one-minute data itself to verify).

    So either it was omitted (curiously?) or, IMO more likely, corrected because $872 was a broken trade.
     
    #10     Mar 3, 2021