Acrary's Strategy Formation

Discussion in 'Strategy Building' started by MustPlayOptions, Jan 16, 2007.

  1. Hi everyone. I've collected and tried to summarize my understanding of acrary's posts on how to find strategies. I will post my summarized understanding here; the file with the original collection of posts is attached in my other thread. Here's the link to the collection of many of his posts:

    http://www.elitetrader.com/vb/attachment.php?s=&postid=1328844

    The collection file is very messy and randomly organized and contains much more than just his strategy management posts. I'm not planning on spending any time organizing it more than this - so it's given "as is" and without links to the actual posts.

    Please feel free to correct any mistakes I make or misunderstandings I have and I'll try to edit them into the summary. The summary is attached to this post.

    For a summary of his Mixed Strategy Management posts see my other thread at:

    http://www.elitetrader.com/vb/showthread.php?s=&threadid=84890
     
  2. 1) Be DUM

    "For every system I develop I use DUM.

    D - Define

    All systems are based on finding and pulling a fundamental truth about the market. Define what fundamental truth you'll be going after. Ex. All markets have a tendency to trend beyond random. Now you've got the definition that most technical-based hedge funds are derived from.

    U - Understand

    Determine the conditions under which the defined truth tends to occur. In the case of a trend tendency it could be when does the trend tendency begin beyond random? This will lead you to how do I measure a trend? Since trends can occur randomly, how do I determine if a trend is beyond a confidence level of randomness? Does the trending tendency beyond random exhibit the same degree of persistence beyond one year? two years? 5 years? If not, is there some point at which the persistence beyond random occurs every year? If so, does it also persist at the same frequency for 5, 10, 50 different markets? If so, you've discovered a fundamental truth and you now understand what you need to know about the behavior.

    M - Mine

    Once you understand the conditions under which the behavior occurs, you write the code necessary to map the understanding of the behavior. Is the code going to be all inclusive of many markets? or try to just go after the best of the best? Once mapped it's a mechanical process to determine how well it maps against the behavior. After you're satisfied you've developed a satisfactory method for mining the behavior, you can do an edge test to see if it happens beyond random. If not, use Monte Carlo sims to determine confidence levels for trading the method. Determine at what confidence level you'll stop trading. Examine the drawdown versus the profit. Is it worth risking any money on this? If so, allocate money using a money management scheme.

    After you're done with this, you'll have your first system. Next, develop a complimentary system (non-correlated). Go through the same process for say a range bound system. Once you've gone through the mining stage, use the correlation test to weight the two systems. Apply the weights to the money management scheme and move on to your third system."


    2) Target a behavior (Define)

    "I don't look for a market for a system. Each system I've developed is targeting some behavior. If that behavior is present in multiple markets, then I could test it to see if my system captures the behavior better than random. If so, I'd just trade it on that market and check to make sure the behavior was persistent. For instance I have a volatility breakout model that I've used successfully in the SP market. I tested it against the DAX market and found the edge (ability to capture profits at better than random), was better in the DAX than the SP. I've been trading the DAX with it since then and it's done very well. Only thing I don't like is getting up in the middle of the night to trade.

    Every model I've worked on has gone through the same process. Look at the behaviors present in a market, characterize them by creating a rule and checking the fit until all behaviors are noted. Then start looking to see if there is a component to the behavior that is non-random. If so, develop a system to mine it and create a way to monitor the behavior to ensure it's persistent over time. For example, one of the behaviors widely known is the trend day in the SP market. It can be identified just by visually inspecting a chart. I characterized it as a low/high within 10% of the low/high of the day and the close within 20% of the high/low of the day. With the definition I can see how many of these days have persisted over the years (averages about 25 days per year). Then I can see if there is a way to identify these days in advance (realizing I'm going to also be capturing some false days as well)."
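
    For reference, here is how the trend-day characterization above might be coded against daily OHLC bars. The quoted definition is a little ambiguous, so this sketch reads it as: open within 10% of one extreme of the day's range and close within 20% of the opposite extreme (mirrored for down days); the DataFrame columns are assumptions:

    [code]
    import pandas as pd

    def trend_days(bars: pd.DataFrame) -> pd.Series:
        """Flag trend days on daily bars with columns open, high, low, close."""
        rng = bars["high"] - bars["low"]
        up = ((bars["open"] - bars["low"]) <= 0.10 * rng) & \
             ((bars["high"] - bars["close"]) <= 0.20 * rng)
        down = ((bars["high"] - bars["open"]) <= 0.10 * rng) & \
               ((bars["close"] - bars["low"]) <= 0.20 * rng)
        return up | down

    # Persistence check -- acrary quotes roughly 25 such days per year:
    # trend_days(bars).groupby(bars.index.year).sum()
    [/code]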

    3) Basic data mining process (Mine)

    "First find out what is going on in the market you want to trade in the timeframe you plan on holding a position. If you want to daytrade with one trade per-day then find out all the different ways the day has played out in the past. ex. trend day, two-way day, reversal day, etc.

    Once you've done this you should have an idea of which type of day is most common and which is most profitable. Then define something which could be of value to trade one of the market types. An example might be in a reversal day to find out how often the market makes a low of the day in the first 15 min. of the session. If it happens often enough to be of interest then you go on to the next step.

    Take every period for which the target is found and create a table of outputs with 1 for the target and 0 for non-targets. Then pre-process all the inputs into the target and convert them to binary inputs. (A common mistake is to take open, high, low, and close data -- which are analog -- and assume you can find relationships with the target.) For ex. yesterday's close > day before yesterday's close. If found, mark the input as a 1; if not present, mark it as a 0. Do this for as many identifiers as you can. This may present a hundred or more binary inputs leading to the target for each day of the data.

    Then you'd pass the data into a backprop neural net and have it train on the data (you'll need to set aside some data for out-of-sample testing). Once it's trained to hit at least 90% correctly, test the NN on the out-of-sample data. If you hit at least 85% correctly then you can do one of two things. If you're a discretionary trader, set up the NN and preprocess the inputs every day and use the net to predict whether tomorrow has the target (in this example, the low of the day is within 15 min. of the start of the session). If so, use it to trade to the upside as long as the net remains 85% correct. If you're a systems trader then go back to the net and look at the weights of the net to see which of the binary inputs were most important in hitting the target. Use the inputs to create a backtestable system based on the patterns. A system might be: when xyz pattern exists, buy next bar above the lowest bar as long as the time is within the first 15 min. of the day. Set the stop to one tick below the low.
    If the system tests profitable enough to be of interest then move on to the next step.

    Next, take the trades and test them against random trades pulled from the same year (the edge test). Rank the trades versus random for each year of the backtest. If the trades score consistently above the 70th percentile then you can guess you've found an edge-based system. If not, then you have to assume you've found a temporal characteristic in the data that can be exploited for some period of time.

    If it's edge based then all you need to do is adjust the trades for market volatility and apply a money management strategy. Check the trades on a periodic basis to ensure the edge continues and plan what to do with your next million. If it's not edge based you can still trade it, but you need to set up an objective bailout method, such as running a Monte Carlo sim and determining the bailout point to be, say, the 95% level of the predicted max drawdown point. Your trading would be more defensive using a non-edge based method as well. Maybe you'd split the trade size in half and have a 15 min. or 10% of daily range as a filter for adding the second position (letting the position prove itself) as long as the volatility was large enough to justify the scaled entry."
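
    To make the preprocessing and neural-net steps above concrete, here is a minimal sketch of the pipeline: binarize the inputs, train a small backprop net, and check the 90% in-sample / 85% out-of-sample hit rates. The particular binary conditions and the 'target' column (e.g. "low of the day set within the first 15 min.") are illustrative assumptions, not acrary's actual inputs:

    [code]
    import pandas as pd
    from sklearn.neural_network import MLPClassifier

    def binarize(bars: pd.DataFrame) -> pd.DataFrame:
        """Pre-process analog OHLC data into 0/1 yes-or-no conditions."""
        f = pd.DataFrame(index=bars.index)
        f["close_up"] = (bars["close"] > bars["close"].shift(1)).astype(int)
        f["close_up_2ago"] = (bars["close"].shift(1) > bars["close"].shift(2)).astype(int)
        f["higher_high"] = (bars["high"] > bars["high"].shift(1)).astype(int)
        f["lower_low"] = (bars["low"] < bars["low"].shift(1)).astype(int)
        f["closed_above_open"] = (bars["close"] > bars["open"]).astype(int)
        # ...in practice, a hundred or more such conditions.
        return f.iloc[2:]                      # drop rows without a full look-back

    features = binarize(bars)                  # 'bars' also holds a 0/1 'target' column
    target = bars.loc[features.index, "target"]

    split = int(len(features) * 0.8)           # keep the last 20% out of sample
    net = MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000, random_state=0)
    net.fit(features.iloc[:split], target.iloc[:split])

    print("in-sample hit rate:", net.score(features.iloc[:split], target.iloc[:split]))      # want >= 0.90
    print("out-of-sample hit rate:", net.score(features.iloc[split:], target.iloc[split:]))  # want >= 0.85
    [/code]

    For the systems-trader branch, the trained net's weights (net.coefs_) can point to which binary inputs mattered most before turning them into explicit, backtestable rules.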
     
  3. 4) The Edge Test - What is it?

    "The "Edge Test" is a concept. The concept is to separate luck from skill in determining the backtested results. How you do define the measurement is up to you. If I had a Cray supercomputer and had 50 trades to test for a year I could probably rank all the different combinations of 50 trades within the 250+ trading days. Since I don't have that kind of computing I have to sample the trades from the pool of potential trades. If I had the desire I'd build a minute by minute database and rank each trade against the exact entry and exit time for every day of the year. Then come up with a weighting scheme for the proportion of longs versus shorts. I'm sure there's many other ways to do the same tests. In the end it all comes down to "does the random selection of trades adequately represent the luck component that I'm measuring against?". If so, then it's worth keeping. If not, then try other ways of accomplishing the same goal.

    When I've used the test I've disabled the stops and exit with profit strategies so that all I'm testing are my entry and exit at end of day versus random "luck" entry at open and close at end of day. The only purpose of the test is to determine if my entries are a result of luck or skill."

    5) How to do the Edge Test

    "To do the edge test you use a single method at a time.
    First you backtest on the data you're using to develop the method. Then, when you're satisfied with the overall results you separate the trades by long and short by year.

    It'll look something like this:

    1996

    Long +3.00 hold 1 day
    Short -2.00 hold 2 days
    Long -1.00 hold 2 days
    Short +4.00 hold 1 day
    etc.

    Then you process the year of 1996 and pull out random individual trades with the same length of hold (being careful to avoid reuse of any one day).

    Ex.
    Long -2.00 hold 1 day
    Short +1.00 hold 2 days
    Long -2.00 hold 2 days
    Short +1.00 hold 1 day
    etc.

    When you get done you total up results of the long and short trades for the random pass for the entire period.

    ex.
    Long -4.00
    Short +2.00

    You do this random pass thousands of times (Monte Carlo) and rank each of the passes so that you have a distribution from 1% - 99% for both longs and shorts for each year of the tests.

    Ex.

    Long 1% -16.00
    etc.
    Long 99% +21.00

    Then you compare the total you have for your tested trades versus the distribution to rank where your trades are as compared to the random trades. (Do this for both longs and shorts for each year.) If both longs and shorts rank 70% or better (20%+ better than random) then you might be looking at an edge.

    Do the same test with out-of-sample data and shorten the time period to 3 months (so that you can view multiple forward time periods). If the numbers continue to be 70% or better on both longs and shorts then you probably are trading with an edge.
    You do this test every 3 months after you start trading it to make sure the edge is not deteriorating. If it drops below 70% then stop trading it."

    "I wrote a couple of programs. The first has a test trading system and writes a trade file with the entry date, length of hold in days and profit or loss in pts. with slippage included. The core code has to be changed for each idea that tested well in TS.

    For the second program, I enter the year(s) of the test, the market being tested, and whether it is a test for longs or shorts.
    I also have to enter the total profits generated from the original system (in pts.) for either longs or shorts. It then reads the trade file and selects the trades that are for the test period. It also reads a back-adjusted continuous file for the symbol and loads the data for the test year(s) in memory. Then it just does 5000 passes of random entries within the test period on the continuous data with a hold equal to each of the trade lengths.
    An array is marked to signal that the period was already used, to prevent trade overlap during each pass. After each pass through the trade list, it saves the net profit/loss in an array. After the 5000 tests are done, the array is sorted from lowest to highest profits. Then it goes down the list to find out where the profits from the system fit within the tests. Ex. If the system test made 240 pts. profit in the SP market, it would scan through the 5000 trade runs to find out how many it beat. If it beat 4000 of the trade runs, then the system achieved a result of 80% versus an expected random result of near 50%. I also print out the 50%, 90%, 95%, and 99% ranks to get an idea of the possibilities in the market. If the longs and shorts both test above 70% for 3-5 separate years, I start trading it. It doesn't mean it won't have drawdowns, it just means I'm trading an edge that's likely to have better results than random.

    For daytrading, I only test a model to do the entries and exit at end of day on close (no stops or profit targets). This way, I've been able to find ideas that have an edge versus just mapping to the character of the market."
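
    Putting the two quoted descriptions together, a bare-bones version of the random-trade comparison might look like the sketch below. It assumes a trade list with hold lengths and P/L for one year and one side (longs or shorts), plus daily closes of the back-adjusted continuous contract for that year; this is my reconstruction, not acrary's actual code:

    [code]
    import numpy as np
    import pandas as pd

    def edge_test(trades: pd.DataFrame, closes: pd.Series,
                  n_passes: int = 5000, seed: int = 0) -> float:
        """trades: columns ['hold_days', 'pnl'] for one year, one side.
        closes:  daily closes of the continuous contract for that year.
        Returns the percentile rank of the system's total P/L versus random
        entries with the same hold lengths (no day reused within a pass)."""
        rng = np.random.default_rng(seed)
        system_total = trades["pnl"].sum()
        n_days = len(closes)
        random_totals = np.empty(n_passes)

        for p in range(n_passes):
            used = np.zeros(n_days, dtype=bool)     # days already used this pass
            total = 0.0
            for hold in trades["hold_days"]:
                while True:                         # draw an unused entry day
                    start = rng.integers(0, n_days - hold)
                    if not used[start:start + hold + 1].any():
                        break
                used[start:start + hold + 1] = True
                # long-side P/L; for the short test, negate the price change
                total += closes.iloc[start + hold] - closes.iloc[start]
            random_totals[p] = total

        random_totals.sort()
        # fraction of random passes the system beat, e.g. 0.80 vs. ~0.50 expected
        return np.searchsorted(random_totals, system_total) / n_passes
    [/code]

    Run it separately per year for longs and shorts and look for ranks above 0.70 across 3-5 years, as described above.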

    "When I've used the test I've disabled the stops and exit with profit strategies so that all I'm testing are my entry and exit at end of day versus random "luck" entry at open and close at end of day. The only purpose of the test is to determine if my entries are a result of luck or skill."


    6) How to interpret the Edge Test

    "All of the edge based models are developed based on a perceived structural anomaly. For the edge criteria if the longs or shorts drop below 70% of the random trades then it's no longer traded... I also suspend trading any model if it hits the 95% modeled drawdown or the 5% return level from the Monte-Carlo sims."
     
  4. mbv

    What I don't get about this edge test is what happens with commissions? Let's say you've got 1000 trades that you look at over a very short timeframe. And 80% are losers (thanks to commissions). Well if your system does better than 70% but still is losing money to commissions, that's not a very good test. And this sort of thing is going to be true for any system. I mean it's one thing to test against "randomness" and another to test against "randomness" + commissions. How does one go about quantifying that?
     
  5. He includes it:

    "I wrote a couple of programs. The first has a test trading system and writes a trade file with the entry date, length of hold in days and profit or loss in pts. with slippage included"

    Slippage I assume includes both commissions and actual slippage estimates...
     
  6. GTG

    Thank you very much for that summary!
     
  7. greenmark

    Thanks a lot MPO, very helpful ! :)
     
  8. You're welcome - acrary's really the one to thank though for sharing. Don't forget to look at the other thread about using multiple strategies too.
     
  9. ES335

    From Acrary: "How often should you hit a new equity high? It can be calculated by using the % losing trades. Here's how, take the % of losing trades and multiply it by itself until the number is approx. .01 (meaning 99% chance of seeing a run of however many times you do the mutiplication). For example, if I have a method that loses 40% of the time, then the number will be (.4*.4*.4*.4*.4 = .0124). This means a method with 40% losers will have no more than 5 losers in a row 99% of the time. Next, take the number of consecutive losses and multiply by 3. In this case, the number will be 15. This is called the trading cycle. The cycle is the maximum number of trades that should happen before a new equity high is achieved. Draw a line every 15 trades on your statements and make sure a new equity high is hit within the 15 trade period."

    Hi Mustplayoptions

    I was wondering if you could clarify the above from Acrary. He recommends having at least 100 trades in your sample to start with. Ok, so let's say you win 60%, lose 40% in that 100 trade sample, so 60 winners, 40 losers. The probability of 5 losers in a row is 0.4 raised to the power of 5 which is ~ 1%.

    (i) So does that mean that once out of 100 trades, you can expect to have 5 consecutive losers and that out of 200 trades, you can expect to have a run of 5 consecutive losers occur two times?

    (ii) Also, what would happen if you are using n systems, each with its own win rate and the sum total of all trades from all systems would lead to a given win rate as well. How would you compute the probability of consecutive losers? Could it be that diversification of multiple systems could also lead to a lower probability of consecutive losses? Does anyone know the math for this?

    (iii) Back to Acrary: Why does he multiply the number of consecutive losses by 3 to get to the trading cycle? Does anyone know the reasoning behind this?

    Thx in advance
    ES335
     
  10. Hi,

    First let me state that I'm not speaking for acrary; I just compiled his posts.

    The first two questions are about the mathematics of streaks, which I believe is more difficult than it seems, and I don't know the answers.

    As for (iii), I'm not sure why...

    Sorry I couldn't be more help.
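
    One thing that might help with (i), though: you can at least get an empirical answer by simulation. The sketch below (my own, using your example numbers of 100 trades and a 40% loss rate) just counts how often a run of 5 or more consecutive losers shows up:

    [code]
    import numpy as np

    def prob_loss_streak(n_trades: int = 100, p_loss: float = 0.4, streak: int = 5,
                         n_sims: int = 100_000, seed: int = 0) -> float:
        rng = np.random.default_rng(seed)
        hits = 0
        for _ in range(n_sims):
            losses = rng.random(n_trades) < p_loss   # True = losing trade
            run = longest = 0
            for is_loss in losses:
                run = run + 1 if is_loss else 0      # current losing streak
                longest = max(longest, run)
            hits += longest >= streak
        return hits / n_sims

    # print(prob_loss_streak())  # typically lands in the 0.4-0.5 range
    [/code]

    So a run of 5 losers somewhere in a 100-trade sample is far more likely than the 1%-per-five-trades figure suggests, because there are roughly 96 overlapping places for such a run to start. I still don't know the closed-form math or the reasoning behind the factor of 3, though.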
     
    #10     Jan 21, 2007