C++ concurrent data structure for fast retrieval

Discussion in 'App Development' started by kmiklas, Sep 2, 2020.

  1. kmiklas

    kmiklas

    I have a requirement to create and maintain a ledger of orders. Thousands and of them, with many updates per second. It will operate in a high-speed environment: currently in single-digit mcs range; I'd like to see it in high nanos.

    - Must be thread-safe
    - Multiple threads and processes will be reading and writing concurrently
    - Said ledger will be updated with algos (no GUI order input)

    Noodling on how to best implement this... what data structure makes sense here?

    1. A vector of pointers or references to orders?
    2. A hash table?
    3. Caching? I fear race conditions.
    4. Database of some kind? I fear perf penalties.

    I was thinking a separate thread or process, dedicated to holding the structure in memory, with accessor and mutator methods? Make every effort to stick with atomic operations, to perf penalties associated with locking?

    Feedback and/or advice welcome. Thanks.
     
    Last edited: Sep 2, 2020
  2. thecoder

    thecoder

    As an experienced professional senior C++ dev and SW-arch, I can design and develop for you the fastest possible solution.
    You just need to give me the project, and of course pay me :)
     
  3. kmiklas

    kmiklas

    Oh yeah? How do you reverse a string?
     
  4. gaussian

    gaussian

    std::atomic is probably what you want to look at. You will have to insure atomicity.

    In terms of architecture you will at least need some way to persist orders so that your history can be rebuilt in the event of a crash. I would suggest first looking at the simplest possible solution: a local cache of orders in the current "window" that gets offloaded to a database at some interval. This database could be something as simple as redis, which then itself can replicate over to cold storage system later. A 3 layer caching strategy would guarantee reliability.

    Establishing atomic writes is not terribly difficult using a modern database and good engineering practices. I would suggest you first implement the simplest solution (above), benchmark it, and then make modifications to bring the execution time down to where you need it. Atomicity on the local system's ledger cache will need to be carefully engineered and tested. Consider using the Ravenscar profile of ADA/GNAT as a good baseline. You need a baseline first and reaching sub-microsecond execution times gets into a mixture of:

    1. How you programmed it (C++ is likely still too high level and you'll be looking into ASICs eventually)
    2. How you architected it (do you have fiber pipe going to databases?)
    3. The hardware it is running on (are your structures storeable in the L2 cache of a CPU? What hardware level caching do you need to fit your structures in?)


    This problem is not simple. You will want to tailor it to specific hardware and optimize your structures in-flight to fit inside of the fastest available CPU caches to insure top-tier performance. You will also want to look into the lag caused by branch prediction and several other things and write your make file/compiler flags to optimize for this. Good luck.
     
    Russell Shuffle and kmiklas like this.
  5. thecoder

    thecoder

    Just use this built in function from the std namespace:
    Code:
      reverse(str.begin(), str.end());
    
    Why do you ask such a childish question? Do we already have a contract? I don't think so :)
     
    Last edited: Sep 2, 2020
    taowave likes this.
  6. kmiklas

    kmiklas

    Fully agree. Locking would crush my performance.

    Yes, I was thinking of offloading this to a separate process/core with an absolute minimum of data sent across a socket.

    Yeah no ASIC or FPGA just yet. For now it's C++, C, and perhaps a little assembler in hotspots.[/QUOTE]

    What can't I offload this to another process/core on the same rig?

    This I did not think of. Thank you.

    Thank you. I appreciate the comments.
     
  7. kmiklas

    kmiklas

    NEXT!
     
    taowave likes this.
  8. thecoder

    thecoder

    And why? Do you have a faster algorithm? :)
    In case of multithreading you of course need to exec that piece of code in a locked state. That's so obvious.
     
    Last edited: Sep 2, 2020
  9. gaussian

    gaussian

    You can. I would suggest not doing that to free up resources for your ledger but if you have one of those fancy multi-CPU server boxes perhaps this is possible. I was just considering fault tolerance.

    If a prospective employee gave me your answer I would think it was sarcastic. Just because you can use a string reverse function doesn't mean you know how to reverse a string.

    You could use a stack which is the canonical solution. You could also do it in-place with pointers. Which is faster asymptotically?
     
    Russell Shuffle and kmiklas like this.
  10. thecoder

    thecoder

    @gaussian, the spec was then insufficient. He should clearly say that he wants to see an implementation, ie. the design, of such a function, not it's mere use.
     
    #10     Sep 2, 2020