Welcome to the Traders Laboratory Forums.
Automated Trading Black box systems, strategy automation, algorithmic trading, etc...

Old 07-06-2010, 03:15 PM   #1

Join Date: Sep 2009
Posts: 34

Thanks: 40
Thanked 3 Times in 3 Posts

Tick Data Storage and Relay

I am currently logging tick data into binary files on one computer (Computer A), but I am looking for a database to store the data. I also want to be able to query Computer A to backfill my charting software on another computer (Computer B). After backfilling, I want Computer A to forward every tick it receives for the instrument(s) that Computer B is monitoring. I know that relaying data is not a good idea for a true automated HFT system. However, I am not doing HFT, so the added latency should be OK for now, though I'd like to keep it to a minimum. I am using Linux on both systems. Does anyone know of a good open-source database solution and a method for relaying the ticks? Would master-slave database replication be the way to go? At this point my database would not be much larger than a couple of GB, and I could flush it to binary files at the end of each week to keep it small if necessary.
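
A minimal Python sketch of the backfill-then-relay flow described above. Everything in it is an assumption made for illustration: the names `TickRelay`, `on_tick`, and `subscribe` are hypothetical, ticks are plain `(timestamp, price)` tuples, history is held in memory rather than in files, and the network hop between Computer A and Computer B is left out entirely.

```python
import threading
from collections import defaultdict


class TickRelay:
    """Hypothetical relay living on Computer A: stores ticks and forwards them."""

    def __init__(self):
        self._lock = threading.Lock()
        self._history = defaultdict(list)      # symbol -> ticks received so far
        self._subscribers = defaultdict(list)  # symbol -> callbacks for that symbol

    def on_tick(self, symbol, tick):
        """Called by the data feed for every incoming tick."""
        with self._lock:
            self._history[symbol].append(tick)
            # Only subscribers of this symbol see the tick.
            for callback in self._subscribers[symbol]:
                callback(symbol, tick)

    def subscribe(self, symbol, callback, since=0):
        """Backfill stored ticks with timestamp >= since, then relay live ticks."""
        with self._lock:
            for tick in self._history[symbol]:
                if tick[0] >= since:
                    callback(symbol, tick)
            # Registering under the same lock means no tick can slip in
            # between the end of the backfill and the start of the live feed.
            self._subscribers[symbol].append(callback)
```

Because callbacks run while the lock is held, a callback in a real setup would probably do no more than push the tick onto a queue for a separate sender thread, so that a slow client never holds up the logger.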
SNYP40A1 is offline  
Old 07-06-2010, 08:32 PM   #2

Join Date: Sep 2009
Posts: 34

Thanks: 40
Thanked 3 Times in 3 Posts

Re: Tick Data Storage and Relay

Redundant Post, please delete.
SNYP40A1 is offline  
Old 07-09-2010, 11:09 PM   #3

natedredd10's Avatar

Join Date: Dec 2009
Location: wny
Posts: 111

Thanks: 2
Thanked 38 Times in 25 Posts

Re: Tick Data Storage and Relay

The Hack the Market blog posts on HDF5 are about the only good info I've found on tick DB construction:
Hack the market billions and billions
Hack the market managing tick data with hdf5
Hack the market tick data & hdf5 (part 2)

From what I've found, the biggest factor is how many instruments you want to log.
If you only want to store a few, go with one of the open-source relational packages, but keep in mind that it probably wouldn't be too hard to max out the performance of a non-time-series DB if you start adding instruments down the line.
Trying to roll my own tick DB from parts has been a really demoralizing experience, to be honest. The user base is pretty thin, so there isn't much to go on. Retail is using the commercial solutions that come with the charting software, and institutions are using ultra-expensive time series solutions like KDB+, so you are really on your own in the middle.
natedredd10 is offline  
Old 07-09-2010, 11:26 PM   #4

natedredd10's Avatar

Join Date: Dec 2009
Location: wny
Posts: 111

Thanks: 2
Thanked 38 Times in 25 Posts

Re: Tick Data Storage and Relay

Actually though, if you are OK with flushing to binary files weekly, have you considered not even bothering with a DB? It's hard to see what you would gain from a DB on that time frame, unless these are baby steps toward a much larger idea.
If you search on elitetrader for "tick database" or "tick db" and go back a few years, there are some interesting discussions... In retrospect those discussions boiled down to morons like me trying to figure out how to use HDF5, Berkeley DB... and now MonetDB, although I think that's too new to have come up on Elite a few years ago.
Then there are guys in those discussions who realized this was a waste of time and just went with flat binary files... I don't even want to think about how much analysis they have done vs. the time I've spent on this stuff...
Maybe I'm just hard-headed, but PyTables/HDF5 is my last stand; after that I'm just going with binary files until it's a problem...
This discussion will give you all the leads you want to search on in this area:
Nuclear Phynance
natedredd10 is offline  
The Following User Says Thank You to natedredd10 For This Useful Post:
SNYP40A1 (07-18-2010)
Old 07-12-2010, 03:37 AM   #5

BlowFish's Avatar

Join Date: Mar 2007
Location: In Da House
Posts: 3,292

Thanks: 129
Thanked 1,054 Times in 702 Posts

Re: Tick Data Storage and Relay

Nate has nailed it really: pretty much anything will do unless you are dealing with lots (hundreds or maybe even thousands) of instruments. The key thing is to structure your code properly so that all database work is done through a couple of primitive routines. More sophisticated stuff uses those primitives. If you architect sensibly, you should be able to change the backend at a later stage in hours or days rather than days or weeks. Go with what you know or fancy learning about.
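
A rough Python sketch of that "couple of primitive routines" idea. The names `TickStore`, `MemoryTickStore`, and `vwap` are hypothetical, and the in-memory backend is only a stand-in for whatever storage is actually used (flat files, HDF5, a relational DB). It assumes ticks for each symbol are appended in time order.

```python
from abc import ABC, abstractmethod
from bisect import bisect_left


class TickStore(ABC):
    """The only two routines the rest of the code is allowed to call."""

    @abstractmethod
    def append(self, symbol, timestamp, price, size):
        ...

    @abstractmethod
    def read_range(self, symbol, start_ts, end_ts):
        ...


class MemoryTickStore(TickStore):
    """Stand-in backend; a file or database backend would expose the same two methods."""

    def __init__(self):
        self._ticks = {}  # symbol -> list of (timestamp, price, size), time ordered

    def append(self, symbol, timestamp, price, size):
        self._ticks.setdefault(symbol, []).append((timestamp, price, size))

    def read_range(self, symbol, start_ts, end_ts):
        """Return ticks with start_ts <= timestamp < end_ts."""
        ticks = self._ticks.get(symbol, [])
        # A 1-tuple sorts before any tick with the same timestamp,
        # so the start is inclusive and the end is exclusive.
        lo = bisect_left(ticks, (start_ts,))
        hi = bisect_left(ticks, (end_ts,))
        return ticks[lo:hi]


def vwap(store, symbol, start_ts, end_ts):
    """A 'more sophisticated' routine that touches storage only via the primitives."""
    ticks = store.read_range(symbol, start_ts, end_ts)
    volume = sum(size for _, _, size in ticks)
    if volume == 0:
        return None
    return sum(price * size for _, price, size in ticks) / volume
```

Swapping the backend later then means writing one new `TickStore` subclass; `vwap` and anything else built on the primitives stays untouched.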
BlowFish is offline  
The Following User Says Thank You to BlowFish For This Useful Post:
SNYP40A1 (07-18-2010)
Old 07-17-2010, 05:41 PM   #6

natedredd10's Avatar

Join Date: Dec 2009
Location: wny
Posts: 111

Thanks: 2
Thanked 38 Times in 25 Posts

Re: Tick Data Storage and Relay

Quote:
Originally Posted by BlowFish »
Nate has nailed it really: pretty much anything will do unless you are dealing with lots (hundreds or maybe even thousands) of instruments. The key thing is to structure your code properly so that all database work is done through a couple of primitive routines. More sophisticated stuff uses those primitives. If you architect sensibly, you should be able to change the backend at a later stage in hours or days rather than days or weeks. Go with what you know or fancy learning about.
Forums - How do you guys store tick data?

Threads like that are what keep me searching though... It still strikes me that this decision comes down to: KDB is the obvious choice, HDF5 or Berkeley DB is next up if you want to fudge a KDB-type setup, then flat files if you just don't want to bother...
It depends on a philosophy, I suppose, that you aren't going to out-time-series a single time series...

Last edited by TLadmin; 07-21-2010 at 11:44 PM. Reason: competitor URL removed
natedredd10 is offline  
The Following User Says Thank You to natedredd10 For This Useful Post:
SNYP40A1 (07-18-2010)
Old 07-18-2010, 02:00 PM   #7

Join Date: Sep 2009
Posts: 34

Thanks: 40
Thanked 3 Times in 3 Posts

Re: Tick Data Storage and Relay

Thanks Nate and BlowFish, I appreciate the info. I actually posted a thread over at "that other place" and came to the conclusion that binary files are the absolute fastest way to store tick data. The more I thought about it, the more I realized it's not that hard to write some code that searches the binary files for the range one is seeking. In fact, since the data will be stored in time order anyway, I don't see what value a database would add for what I am considering now. I can always go DB later if the need arises.
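
A minimal sketch of that range search in Python, assuming fixed-size records in a single file. The record layout (timestamp in microseconds, price, size) and the names `append_ticks` and `read_range` are made up for illustration. Because the records are fixed size and already in time order, a binary search over record indices finds the range without reading the whole file.

```python
import os
import struct

# Hypothetical layout: int64 timestamp (microseconds), float64 price, uint32 size.
RECORD = struct.Struct("<qdI")


def append_ticks(path, ticks):
    """Append (timestamp_us, price, size) tuples; ticks must arrive in time order."""
    with open(path, "ab") as f:
        for tick in ticks:
            f.write(RECORD.pack(*tick))


def _timestamp_at(f, index):
    """Read only the timestamp of the record at the given index."""
    f.seek(index * RECORD.size)
    return RECORD.unpack(f.read(RECORD.size))[0]


def _first_index_at_or_after(f, count, ts):
    """Binary search for the first record whose timestamp is >= ts."""
    lo, hi = 0, count
    while lo < hi:
        mid = (lo + hi) // 2
        if _timestamp_at(f, mid) < ts:
            lo = mid + 1
        else:
            hi = mid
    return lo


def read_range(path, start_ts, end_ts):
    """Return all ticks with start_ts <= timestamp < end_ts."""
    count = os.path.getsize(path) // RECORD.size
    with open(path, "rb") as f:
        first = _first_index_at_or_after(f, count, start_ts)
        last = _first_index_at_or_after(f, count, end_ts)
        f.seek(first * RECORD.size)
        data = f.read((last - first) * RECORD.size)
    return [RECORD.unpack_from(data, i * RECORD.size) for i in range(last - first)]
```

With one file per instrument per week, searching "among the files" would just add a step that picks the file by name before calling `read_range`.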

I had actually read all those articles before you posted. If I went with a DB, it would probably be HDF5. Berkeley DB supports concurrency through internal locking (in the Concurrent Data Store version; the plain Data Store version does not support concurrency at all). Most databases might work that way, but I never want the writer to be blocked by a reader. The most important function of my tick data logger is to log data. I was also concerned about the possibility of database corruption with HDF5. Unless the hard drive starts to fail, you can't really corrupt a binary file. So I may revisit this topic later, but for now, simple binary files seem to be the way to go for my purposes. In any case, I appreciate the info!
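
One way the "writer is never blocked by a reader" requirement could look with plain files, sketched in Python. It assumes an append-only file of fixed-size records; the layout and the names `TickWriter` and `read_complete_records` are hypothetical. The reader takes no locks at all: it opens the file separately and reads only whole records, ignoring a trailing record that is still being written.

```python
import os
import struct

# Hypothetical layout: int64 timestamp, float64 price, uint32 size.
RECORD = struct.Struct("<qdI")


class TickWriter:
    """Logger side: append one record per tick and flush it to the OS."""

    def __init__(self, path):
        self._f = open(path, "ab")

    def write(self, timestamp, price, size):
        self._f.write(RECORD.pack(timestamp, price, size))
        self._f.flush()

    def close(self):
        self._f.close()


def read_complete_records(path, start_index=0):
    """Reader side: return fully written records from start_index onward."""
    # Integer division drops any partial record at the end of the file.
    complete = os.path.getsize(path) // RECORD.size
    if start_index >= complete:
        return []
    with open(path, "rb") as f:
        f.seek(start_index * RECORD.size)
        data = f.read((complete - start_index) * RECORD.size)
    n = len(data) // RECORD.size
    return [RECORD.unpack_from(data, i * RECORD.size) for i in range(n)]
```

A reader that remembers how many records it has already seen can poll with `start_index` set to that count and pick up only the new ticks.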
SNYP40A1 is offline  
Old 07-20-2010, 06:54 AM   #8

BlowFish's Avatar

Join Date: Mar 2007
Location: In Da House
Posts: 3,292

Thanks: 129
Thanked 1,054 Times in 702 Posts

Re: Tick Data Storage and Relay

Maybe flat binary files with 'tree'-like pointers into them. So you might have an index of days that points at an index of minutes, which in turn points to an entry point in the flat file. To load from N days back, you simply look at days[N] minutes[zero] to get your entry point into the flat file. Intuitively, that always seemed like a decent way to approach it to me.
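
A small Python sketch of that days-to-minutes-to-offset index, under some assumptions: fixed-size records, timestamps in epoch seconds, and days counted as `timestamp // 86400`. The record layout and the names `build_index`, `entry_point`, and `load_from` are hypothetical. Here the index is rebuilt by scanning the file; a logger could just as well maintain it while writing.

```python
import struct
from collections import defaultdict

# Hypothetical layout: int64 epoch seconds, float64 price, uint32 size.
RECORD = struct.Struct("<qdI")
SECONDS_PER_DAY = 86_400


def build_index(path):
    """Map day number -> {minute of day -> byte offset of first tick in that minute}."""
    index = defaultdict(dict)
    with open(path, "rb") as f:
        offset = 0
        while True:
            raw = f.read(RECORD.size)
            if len(raw) < RECORD.size:
                break
            ts = RECORD.unpack(raw)[0]
            day, second_of_day = divmod(ts, SECONDS_PER_DAY)
            # setdefault keeps the offset of the FIRST tick seen in each minute.
            index[day].setdefault(second_of_day // 60, offset)
            offset += RECORD.size
    return index


def entry_point(index, day):
    """Byte offset of the first tick recorded on the given day, or None."""
    minutes = index.get(day)
    if not minutes:
        return None
    return minutes[min(minutes)]


def load_from(path, index, day):
    """Read every tick from the first tick of `day` to the end of the file."""
    offset = entry_point(index, day)
    if offset is None:
        return []
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    usable = len(data) - len(data) % RECORD.size
    return [RECORD.unpack_from(data, i) for i in range(0, usable, RECORD.size)]
```

Loading from N days back then amounts to computing the day number for that date and calling `load_from` with it.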
BlowFish is offline  
All times are GMT -4.