Standard deviation for irregularly spaced timeseries

Discussion in 'Automated Trading' started by kroxobor, Jan 11, 2021.

  1. kroxobor

    Hi fellows,

    Suppose I have a collection of irregularly spaced time series (tick data with time_obs(i+1) - time_obs(i) ranging from 2 ms to 20 ms). I want to calculate the 1-minute standard deviation of returns.

    The question is: should the number of observations be the same in every 1-minute sample? For example, the first minute contains 50 observations and the second minute has 100 observations. After calculating the standard deviation for both, is it legitimate to claim that price volatility in the first minute is higher than in the second because std1 > std2?

    Thanks in advance.
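
    A minimal sketch of why the answer depends on what is being measured, assuming tick-to-tick log returns that are roughly uncorrelated and ignoring microstructure noise. The function name `per_minute_volatility` and the millisecond integer timestamps are illustrative assumptions, not something stated in the thread. It reports, for each minute, the number of returns, the standard deviation of the tick returns (a per-tick quantity), and the realized volatility (square root of the sum of squared returns, a per-minute quantity).

```python
import math
import statistics
from collections import defaultdict


def per_minute_volatility(times_ms, prices):
    """Summarise tick-to-tick log returns minute by minute.

    times_ms: increasing integer timestamps in milliseconds (assumed format).
    prices:   positive prices observed at those timestamps.
    Returns {minute_index: (n_returns, per_tick_std, realized_vol)}.
    """
    buckets = defaultdict(list)
    for i in range(1, len(prices)):
        r = math.log(prices[i] / prices[i - 1])
        # Assign each return to the minute in which it ends.
        buckets[times_ms[i] // 60_000].append(r)

    summary = {}
    for minute, rets in sorted(buckets.items()):
        # Population std of tick returns: volatility per tick, not per minute.
        per_tick_std = statistics.pstdev(rets)
        # Realized volatility: sqrt of the sum of squared returns in the minute.
        realized_vol = math.sqrt(sum(r * r for r in rets))
        summary[minute] = (len(rets), per_tick_std, realized_vol)
    return summary
```

    With 4 returns of ±0.001 in one minute and 16 returns of ±0.0005 in the next, the per-tick standard deviation halves while the realized volatility stays about the same. Under these assumptions, std1 > std2 computed on tick returns does not by itself show that the first minute was more volatile.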
     
  2. You don't have a time series. Garbage in, garbage out.
     
    MarkBrown likes this.
  3. https://quant.stackexchange.com/que...of-a-sample-when-points-are-irregularly-space

    GAT
     
    longandshort likes this.
  4. kroxobor

    Yes, and I want to make a time series from this garbage. So, any thoughts?
     
  5. MarkBrown

    Use fill data.

    Repeat the last bar unchanged until new data replaces it. The purpose is to fill every time slot, so you make the data up, but you don't want the filler to influence anything, so you keep it identical to the last real value. Logical?
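
    The fill described above is usually called last-observation-carried-forward. A minimal sketch, assuming sorted integer millisecond timestamps and a fixed grid step; the function name `forward_fill` and its parameters are hypothetical, chosen only for illustration.

```python
from bisect import bisect_right


def forward_fill(times_ms, prices, start_ms, end_ms, step_ms):
    """Sample the last observed price at each grid point in [start_ms, end_ms).

    times_ms must be sorted in increasing order (assumed).
    Returns (grid_times, filled_prices).
    """
    grid, filled = [], []
    for t in range(start_ms, end_ms, step_ms):
        # Index of the last tick at or before grid time t.
        i = bisect_right(times_ms, t) - 1
        if i < 0:
            continue  # no tick seen yet, so there is nothing to carry forward
        grid.append(t)
        filled.append(prices[i])
    return grid, filled
```

    Every slot that repeats the previous price contributes a return of exactly zero, so the filled series is not neutral for a standard deviation: the more slots are filled, the lower the measured value tends to be.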
     
  6. You should not be using time series methods at all. Use the point process paradigm, where the data are described by their inter-event durations.
     
    longandshort likes this.
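
    As a starting point for the point process view, a minimal sketch that turns tick timestamps into inter-event durations and a crude average event rate. The names `inter_event_durations` and `duration_summary` are hypothetical, and a real point process model would go well beyond these summary numbers.

```python
import statistics


def inter_event_durations(times_ms):
    """Durations between consecutive events, in milliseconds."""
    return [b - a for a, b in zip(times_ms, times_ms[1:])]


def duration_summary(times_ms):
    """Basic description of the event stream (needs at least two events)."""
    durations = inter_event_durations(times_ms)
    mean_ms = statistics.fmean(durations)
    return {
        "n_events": len(times_ms),
        "mean_duration_ms": mean_ms,
        # Average arrival rate implied by the mean duration.
        "events_per_second": 1000.0 / mean_ms,
    }
```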
  7. You might be a good trader, Mark, but that's not good advice. The term for this in statistics is censoring.
     
  8. MarkBrown

    Let's say you have one series that updates every minute and another that updates every five minutes. To avoid the gap, use the synthetic data fill method I described. Nothing is "impacting" the price that would censor it.
     
  9. I would be very concerned about a feed that updated at a fixed frequency. See https://vixra.org/abs/1211.0094
    The problem is this: 1) any selection of frequency throws away information.
     
  10. MarkBrown

    Well, that is true, but when you are trying to decipher data, smoothed is sometimes better than raw for observation.
     
    #10     Jan 11, 2021