Welcome to the Traders Laboratory Forums.
Automated Trading Black box systems, strategy automation, algorithmic trading, etc...

Old 08-22-2009, 02:45 AM   #1

Join Date: May 2009
Location: Szczecin
Posts: 33

Thanks: 0
Thanked 3 Times in 3 Posts

Anyone Interested in a Complete CME Feed Sample?

Hello folks

As part of my development process for a trading platform, I am testing my data collection system next week. This is a stress test and will take 1-2 weeks - I cannot rule out that I have to restart or get a crash in the first week, and I want a complete week as a sample.

Anyhow, I will collect a complete data feed from the CME, electronic trading (basically the same as you get via zen-fire, just COMPLETE, not filtered like zen-fire is, and not limited by symbol).

This will include:
* Preopening, opening, closing etc. information
* Trades, obviously
* Best bid, best ask, bids AND asks, including their invalidation (setting the bid / ask volume to 0 when the price moves out and the exchange does not track them).
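
A rough illustration of how the bid / ask lines with the "volume 0" invalidation could be consumed. This is only a sketch, not the author's code: it assumes each Bid / Ask line carries the absolute volume now resting at that price, and the class name `BookSide` is made up for the example. The DDU9 updates are taken from the log sample later in the thread.

```python
from decimal import Decimal


class BookSide:
    """One side (bids or asks) of a price-level book for a single instrument."""

    def __init__(self, is_bid):
        self.is_bid = is_bid
        self.levels = {}  # price -> volume currently shown at that price

    def apply(self, price, volume):
        # Assumption: volume is absolute, and 0 means the level is invalidated.
        if volume == 0:
            self.levels.pop(price, None)
        else:
            self.levels[price] = volume

    def best(self):
        # Highest price on the bid side, lowest on the ask side.
        if not self.levels:
            return None
        price = max(self.levels) if self.is_bid else min(self.levels)
        return price, self.levels[price]


bids = BookSide(is_bid=True)
for volume, price in [(0, "9530"), (0, "9547"), (2, "9546"),
                      (3, "9529"), (0, "9548"), (2, "9547")]:
    bids.apply(Decimal(price), volume)

print(bids.best())  # (Decimal('9547'), 2), matching the "BestBid 2 9547" line
```

Starting from an empty book, replaying those six updates ends with the same best bid the feed itself reports, which is a cheap consistency check for a stress test.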

This will include the following instruments:
* Naturally all futures.
* All options, as published by the exchange. This is the majority of the data.
* Virtual instruments, such as spread trades that are done on the exchange level. Most people won't know that, but you can execute those on the exchange, which then guarantees the spread value. Those CAN be ignored - spread executions also show up as the separate leg trades in the instruments.

The timestamps are in microsecond format from the original data capture equipment. The stream is totally unedited. To my knowledge it does NOT include cancels and corrections. I record it to get an idea of the volume, analyse my data capture approach and get an idea of how to actually process that huge amount of data - as a stress test for my equipment.

Anyone interested in a copy (ONLY of the complete data) can contact me via PM. This is initially a free offer, but if the volume goes too high I may have to ask for a small contribution towards the data transfer. I cannot have 20 people downloading 50 GB or so for free. We talk of SMALL here - basically covering my traffic costs. Not sure how large the archive will be - ask me in 2 weeks. Anyone else is free to put that archive up on peer to peer - I may even do so myself, but with very limited bandwidth.

The file format will be a tab-separated Windows text file (CRLF line endings), naturally compressed, most likely with 7zip. Expect SIGNIFICANT archives. We talk of 30,000+ lines PER SECOND in the hot phases. Even during the night the lines often scroll by too fast to read on a text output, because every trade results in a lot of bid adjustments. It will contain two file sets - one with the exchange prints, one with instrument metadata (name, allowed prices etc.). I may actually have to split the allowed prices out into a third file for technical reasons.
C# code for parsing the files will be available, unless trivial (so basically for the print files, where not all lines will be identical).
At a later stage I will possibly also have a binary format available. Part of the reason for the data capture is to find out the possible range of values for some items I currently get text encoded, so that I can make a proper, efficient binary representation - I plan on storing the complete data stream in a server for real-time retrieval.

The offer is one time. I have no intention of getting in as a data provider with this. I just think it may be valuable for someone working on his own technology to have access to a high-end stream to see what really goes on - and possibly to stress test his technology. And other sources of that data are - hm - hard to find or hard to pay for.
NetTecture is offline  
The Following User Says Thank You to NetTecture For This Useful Post:
flyingdutchmen (08-22-2009)
Old 08-22-2009, 04:54 PM   #2

Join Date: May 2008
Location: amsterdam
Posts: 114

Thanks: 25
Thanked 38 Times in 30 Posts

Re: Anyone Interested in a Complete CME Feed Sample?

I do not have the need for it myself, but I would still like to thank
you for this kind offer.
Data is expensive.
Take it, people, you will not get something like this again.
flyingdutchmen is offline  
Old 08-24-2009, 12:38 AM   #3

Join Date: May 2009
Location: Szczecin
Posts: 33

Thanks: 0
Thanked 3 Times in 3 Posts

Re: Anyone Interested in a Complete CME Feed Sample?

Ok, status update.

Change of heart. Well, not really.

* Data collector running since yesterday around 1900 UTC (for those not knowledgeable: that is Greenwich Mean Time WITHOUT summer/winter time - computers use that internally).
* The log file so far looks like this:

2009-08-24 04:26:24.149830 ZDU9 CME Bid 0 9534
2009-08-24 04:26:24.149830 ZDU9 CME Bid 10 9533
2009-08-24 04:26:24.150741 DDU9 CME Ask 2 9572
2009-08-24 04:26:24.150741 DDU9 CME Bid 0 9530
2009-08-24 04:26:24.150741 DDU9 CME Bid 0 9547
2009-08-24 04:26:24.150741 DDU9 CME Bid 2 9546
2009-08-24 04:26:24.154835 ZDU9 CME Bid 2 9539
2009-08-24 04:26:24.154835 ZDU9 CME Bid 12 9538
2009-08-24 04:26:24.154983 DDU9 CME Bid 3 9529
2009-08-24 04:26:24.155119 YMU9 CME Bid 4 9546
2009-08-24 04:26:24.155385 ZDU9 CME Bid 0 9547
2009-08-24 04:26:24.155385 ZDU9 CME Bid 1 9528
2009-08-24 04:26:24.157930 DDU9 CME Bid 0 9548
2009-08-24 04:26:24.157930 DDU9 CME Bid 2 9547
2009-08-24 04:26:24.157930 DDU9 CME BestBid 2 9547
2009-08-24 04:26:24.158331 YMZ9 CME Bid 2 9487
2009-08-24 04:26:24.158331 YMZ9 CME Bid 3 9486
2009-08-24 04:26:24.158331 YMZ9 CME Bid 0 9384
2009-08-24 04:26:24.158331 YMZ9 CME Bid 1 9487
2009-08-24 04:26:24.158331 YMZ9 CME BestBid 1 9487
2009-08-24 04:26:24.158483 ZDU9 CME Bid 0 9548
2009-08-24 04:26:24.158483 ZDU9 CME Bid 2 9547
2009-08-24 04:26:24.164320 DDU9 CME Bid 0 9529
2009-08-24 04:26:24.172657 ESU9 CME Bid 91 1030.75
2009-08-24 04:26:24.172657 ESU9 CME Bid 100 1030.5
2009-08-24 04:26:24.175702 ESU9 CME Bid 76 1029.75
2009-08-24 04:26:24.175702 ESU9 CME Bid 128 1029.5
2009-08-24 04:26:24.179895 YMZ9 CME Bid 0 9482
2009-08-24 04:26:24.179895 YMZ9 CME Bid 10 9480
2009-08-24 04:26:24.183040 ZDH0 CME Bid 0 9389
2009-08-24 04:26:24.183040 ZDH0 CME Bid 5 9386
2009-08-24 04:26:24.183040 ZDH0 CME BestBid 5 9386
2009-08-24 04:26:24.199314 ZDZ9 CME Bid 0 9483
2009-08-24 04:26:24.199314 ZDZ9 CME Bid 1 9481
2009-08-24 04:26:24.199314 ZDZ9 CME BestBid 1 9481
2009-08-24 04:26:24.200362 6JU9 CME Ask 21 0.01055
2009-08-24 04:26:24.200362 6JU9 CME BestAsk 21 0.01055
2009-08-24 04:26:24.200463 6JU9 CME Bid 18 0.010548
2009-08-24 04:26:24.200463 6JU9 CME BestBid 18 0.010548
2009-08-24 04:26:24.220230 6JU9 CME Ask 1 0.010549
2009-08-24 04:26:24.220230 6JU9 CME Ask 0 0.010554
2009-08-24 04:26:24.220230 6JU9 CME Ask 83 0.010552
2009-08-24 04:26:24.220230 6JU9 CME BestAsk 1 0.010549

It is slightly bad: the hour is in 12-hour format instead of 24-hour. Not too bad for my purposes though. Main problem, though: it is already around 800 MB big. I won't have the space for a whole week on that particular drive.

I will stop collecting at the end of the day when the ES has closed or tomorrow - depending on disc usage today. I will then finish creating the other files needed tomorrow (mostly the descriptions and pricing steps) and make the archive available.

I THEN go on with a binary representation, most likely based on delta storage (after all, things do not change that much between prints in one symbol, so most of the time I can easily put the price in one byte). The idea is to store information "per symbol" in time slices of maybe 5 minutes (or more, depending on the amount of data I really get) in binary fields. Then I will see what loading that into my SQL Server says... (it has a lot more space than the drive set aside NOW for the log - I seriously did not expect THAT much data). The textual log file is simply waaaaayyyy too large. I will possibly reduce the granularity of the timestamp to about millisecond resolution, maybe even coarser. I do not really see a need for a finer granularity than about 25 ms (which is what NxCore uses, too).
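
To make the "price in one byte most of the time" idea concrete, here is one possible sketch. The byte layout is invented for illustration and is not the author's format: prices are assumed to be converted to integer tick counts first (using the allowed price steps from the metadata), small deltas are stored as one signed byte, and a marker byte introduces a full 8-byte value when the delta does not fit.

```python
import struct

ESCAPE = -128  # marker byte: the next 8 bytes hold a full tick count


def encode_ticks(ticks):
    """Delta-encode integer tick prices; one byte per price when the delta is small."""
    out = bytearray()
    previous = 0
    for value in ticks:
        delta = value - previous
        if -127 <= delta <= 127:
            out += struct.pack("b", delta)           # common case: 1 byte
        else:
            out += struct.pack("<bq", ESCAPE, value)  # rare case: 9 bytes
        previous = value
    return bytes(out)


def decode_ticks(data):
    """Inverse of encode_ticks."""
    ticks = []
    previous = 0
    offset = 0
    while offset < len(data):
        (delta,) = struct.unpack_from("b", data, offset)
        offset += 1
        if delta == ESCAPE:
            (previous,) = struct.unpack_from("<q", data, offset)
            offset += 8
        else:
            previous += delta
        ticks.append(previous)
    return ticks


# ES prices 1030.75, 1030.5, 1029.75, 1029.5 expressed in 0.25 ticks
es_ticks = [4123, 4122, 4119, 4118]
encoded = encode_ticks(es_ticks)
print(len(encoded))  # 12 bytes: 9 for the first price, 1 for each of the others
```

Starting each per-symbol time slice with a full value, as the first price does here, would keep every slice decodable on its own.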

I will most likely start collecting again next week (I actually want to make a full collection for my own purposes starting 1st of September), but I may filter the data - I simply do not need data on virtual instruments (such as spreads) and do not exactly have a large need for options, so I may filter out the order book there, keeping only best bid and ask.

For those interested, btw: CPU utilization on my collecting station is really low so far, and network IO is around 1.6 megabyte per minute. That is around 200 kbit per second. I will post another update after market open - that is when it gets interesting.
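
The conversion behind that figure, as a small check. The function name is made up, and it assumes decimal megabytes (1,000,000 bytes) unless told otherwise.

```python
def mb_per_minute_to_kbit_per_second(mb_per_minute, bytes_per_mb=1_000_000):
    # bytes per minute -> bits per minute -> bits per second -> kbit per second
    bits_per_minute = mb_per_minute * bytes_per_mb * 8
    return bits_per_minute / 60 / 1000


rate = mb_per_minute_to_kbit_per_second(1.6)
print(f"{rate:.0f} kbit/s")  # 213 kbit/s
```

With binary megabytes (1,048,576 bytes) the result is about 224 kbit/s, so "around 200 kbit" holds either way.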
NetTecture is offline  
All times are GMT -4.