darthtrader3.0beta

Let's Roll Our Own Software!

Recommended Posts

Agekay,

 

Interesting comments and questions!

 

I would absolutely love to discuss those technical issues further and hear more about what you did for graphics and WPF, but let's take that offline since it's not directly on the topic of this thread, right?

 

Someone else pointed me here after seeing the comments and suggested I respond.

 

This will be my last post in this thread.

 

Wayne

I think we need to put the brakes on "secondary agendas" here. It clearly drove away that poker guy.

I'm very glad to see tickzoom here... Keep in mind I didn't invite him here like I posted earlier in this thread; I guess someone else pointed him here. If you followed the elitetrader threads at all, you quickly see that he just really likes to talk :) If you didn't follow those threads, it will look like aggressive marketing.

I would like to ask James to consider that this is early alpha-stage software that is clearly trying to do something different. I can't see how you can really be a sponsor until you have a first release... In the meantime he seems extremely open to ideas, so bomb away.

 

Actually, Uma was selling nothing; that was the big difference. It was just that a few 'sceptics' thought he might in the future. Completely different.

 

Here there appears to be a product available now for a monthly lease. That seems in contravention of the forum rules, whereas Uma was clearly not breaking any. Not my forum, not my rules. <shrug> Much better to get things on track now than to have to throw people out later. Other great contributors have been asked to cease and desist for much less, btw.

 

The point remains that if you build something on a proprietary engine, whether it is NeoTicker or TickZoom, you are limited by that core engine. It defeats many of the objectives of rolling your own. I am a great believer in using code modules, toolboxes and libraries, but unless you have access to the source you could find yourself hamstrung.

 

Incidentally, NeoTicker will probably do everything you require; of course, knowing what you require is half the battle :) As yet no one has discussed that in anything but the vaguest of terms.

 

Having said that, I agree diverse skills are needed; however, if you don't have those skills your only real options are to learn them (realistically I'd allow 5 years if you have aptitude) or simply use something off the shelf. Isn't SmartQuant open source? Dunno, never looked at it seriously. NeoTicker will do more or less anything you can conceive (no affiliation).

 

Actually the task is not difficult (if you have the skills) but it is far from trivial.

 

Wayne, to answer your questions: no, it makes no sense to do the things you mention unless you have the expertise to do so quickly and efficiently (and as well as or better than has been done before). Actually I do, but I don't have the desire to :D In a previous life I was a software engineer, so these threads interest me; ahh, nostalgia. It makes no sense to roll your own unless you have at least a vague idea of what you want and how you might achieve it.

 

To be honest, for things to move beyond 'wouldn't it be cool' someone needs to sit down, write a brief, and hammer out a suitable architecture. It's not hard, but it would be time consuming; hell, I can't be bothered to trade more than a few hours a week nowadays. Anyway, credit to you mate, as it seems you have done that.

 

Seems there is a niche for a decent, user-friendly open source project, but sadly I can't see it coming from this.

Agekay,

 

This will be my last post in this thread.

 

Wayne

 

I hope that's not on my account. I am just a punter here gently (well, I hope gently... others seem far more gruff :)) saying... well, it doesn't matter, seeing as it was just an opinion :D. Certainly did not want to cause offence!

I think we need to put the brakes on "secondary agendas" here. It clearly drove away that poker guy.

I'm very glad to see tickzoom here...

 

Totally agree. It's a shame that people who share valuable information are discouraged from posting. I really liked hearing first-hand comments from the developer of TickZoom.

 

This will be my last post in this thread.

 

:(

 

My point for this thread was just that everyone has indicated the downsides of buying a commercial app, the downsides of using open source software and also the downsides of rolling your own software.

 

We are both obviously biased here, so let me comment on rolling your own software from the ground up.

 

To totally roll your own and write every line of code yourself as an individual is nearly unthinkable.

 

This is exactly what I do, and it is not as much work as it might seem at first. But let me qualify this a bit. OK, this is my 5th complete rewrite of my own trading software, and I have learned a lot with regard to the architecture and performance of trading applications. So I am building on past ideas and implementations but still completely rewriting my app, since I wanted it to be 'clean' and free of legacy code that could force me into suboptimal implementations. I am also using the latest and greatest technology available to me (best development environment and best APIs), which also speeds up the process enormously. And I consider myself a pretty good developer.

 

For example, if someone rolls their own software, does it make sense to write their own graphics engine with so many commercial and open source ones out there?

 

No, not if it is GDI, which always runs in software. But WPF is free and you get hardware acceleration for free, which you simply can't beat with GDI, and it will only get better in the future since Microsoft is investing heavily in WPF. The WPF 2D graphics API looks very similar to GDI+ but is easier to use, since you don't have to worry about when to invalidate what; WPF does that all for you. So there is really no advantage to using GDI over WPF. Doing graphics with WPF is trivial compared to GDI.
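To make that concrete, here is a minimal sketch (my illustration, not AgeKay's actual code) of the pattern being described: derive from FrameworkElement, draw in OnRender, and only call InvalidateVisual() when the data changes; WPF retains the visual and redraws it with hardware acceleration. The class and member names are hypothetical.

using System.Windows;
using System.Windows.Media;

public class PriceSeriesElement : FrameworkElement
{
    private double[] prices = new double[0];

    // Call this when new data arrives; WPF takes care of re-rendering.
    public void Update(double[] newPrices)
    {
        prices = newPrices;
        InvalidateVisual();
    }

    protected override void OnRender(DrawingContext dc)
    {
        if (prices.Length < 2) return;

        double min = prices[0], max = prices[0];
        foreach (double p in prices) { if (p < min) min = p; if (p > max) max = p; }
        double range = (max - min) == 0 ? 1 : (max - min);

        var pen = new Pen(Brushes.Black, 1.0);
        for (int i = 1; i < prices.Length; i++)
        {
            // Map index/price to pixel coordinates (y axis inverted so higher prices sit higher on screen).
            var p0 = new Point((i - 1) * ActualWidth / (prices.Length - 1),
                               ActualHeight * (1 - (prices[i - 1] - min) / range));
            var p1 = new Point(i * ActualWidth / (prices.Length - 1),
                               ActualHeight * (1 - (prices[i] - min) / range));
            dc.DrawLine(pen, p0, p1);
        }
    }
}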

 

Does it make sense to write a high speed tick processing engine?

 

It doesn't really take that long. All you do is read the trades from disk and put them in your custom objects. Implementing a streaming list is trivial (I used generic lists before and just implemented a streaming list a few minutes ago; it took me less than 30 minutes, including testing that everything still works).
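For what it's worth, a streaming list along those lines can be as small as a fixed-capacity ring buffer that only keeps the most recent trades. A rough C# sketch; the type and member names are mine, not AgeKay's:

public class StreamingList<T>
{
    private readonly T[] buffer;
    private int next;                  // slot where the next item will be written
    public int Count { get; private set; }

    public StreamingList(int capacity) { buffer = new T[capacity]; }

    public void Add(T item)
    {
        buffer[next] = item;
        next = (next + 1) % buffer.Length;
        if (Count < buffer.Length) Count++;
    }

    // this[0] is the latest item, this[1] the one before it, and so on.
    public T this[int stepsBack]
    {
        get { return buffer[(next - 1 - stepsBack + 2 * buffer.Length) % buffer.Length]; }
    }
}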

 

Does it make sense to rewrite quantlib or talib, etc. from scratch?

 

I have never heard of these libraries before, so I guess you don't really need them.

 

Does it make sense to write your own code to communicate with your broker if TradeLink (open source and free) does that already with an easy API?

 

This is a good point, which actually supports my next point. It totally depends on how well designed your broker's APIs are. It can be a piece of cake or it can be a nightmare. For example, TT's API is pretty straightforward.
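One way to keep that choice reversible is to hide broker connectivity behind a thin interface of your own, so the rest of the system doesn't care whether TradeLink, TT, or hand-rolled code sits behind it. A hedged sketch of what that layer might look like; this is purely illustrative and not TradeLink's or TT's actual API:

using System;

public interface IBrokerConnection
{
    void Connect();
    void SendLimitOrder(string symbol, bool buy, int quantity, double limitPrice);

    event Action<string, double, int> TradeReceived;   // symbol, price, size
    event Action<long, int> OrderFilled;               // order id, filled quantity
}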

 

When you roll your own, you have only yourself to turn to for any assistance if things don't work. And your code won't progress or add features or get bugs fixed unless you do it yourself.

 

So a la carte has enormous advantages, since other people are working on, supporting, fixing, and improving the different parts of your system. And you can upgrade any individual part at will.

 

I actually don't trust other people's code. This is not because I think I am better than other developers, but based on experience. You always end up regretting that you are dependent on someone else's code. Bugs, performance problems, missing flexibility: there is always a problem. Sure, you can go in and fix it if it's open source, but it usually takes me less time to just implement the damn thing myself than to read the entire source code and figure out how it works.

Of course, you are always dependent on some API. I chose .NET, in which Microsoft has invested heavily, and they usually develop better code than me, you, or any other open source project's contributors.

 

 

So from my point of view, the advantages of completely rolling your own are complete control over your software and independence from others. I essentially create my own perfect trading environment; everything is exactly the way I want it to be, and no one else can do that for you. I also decide what is most important to me right now, be it performance, that new chart/indicator, visuals, performance analysis of my own trading, etc.

I hope that's not on my account. I am just a punter here gently (well, I hope gently... others seem far more gruff :)) saying... well, it doesn't matter, seeing as it was just an opinion :D. Certainly did not want to cause offence!

 

No offense. I didn't mean it that way.

 

Wayne


Seems there is a niche for a decent, user-friendly open source project, but sadly I can't see it coming from this.

 

Well, first, I respect that Wayne is done with this thread. I just disagree that it's a big deal that they have pricing for alpha-stage software on the site. I mean, while I may very well end up getting a license, I'm certainly not going to pay for alpha-stage software. Just like I wouldn't pay up as a sponsor when I'm only offering alpha-stage software, but I do see how that is a bit unfair to the other sponsors here.

To me the real problem with a great open source C# trading platform is that all you will really end up doing is a lot of grunt work for anyone who has programmers on staff. It's like how I've read that David Shaw's last published paper was on how to connect Unix machines to market data... Then he realized how stupid it was to be doing a lot of work for free for other people and went into business with his ideas.

 

I would actually like to discuss with you real programmers: why not just use MATLAB? It seems like the only stumbling block is getting data into MATLAB... Once you have that, most of the other grunt work is done for you.

The grunt work for almost any database you would want to use is done; plotting/visualization-wise I don't see how anyone can compete with the work that has been put into MATLAB; and you can probably find almost any algorithm already implemented, all wrapped in a deadly programming language. Like I said in that earlier post, being able to build a market profile from tick data with this few lines of code is just off-the-map mind blowing, considering what else it can do.

MATLAB CODE:

 

% clear all
% clc;
% %assuming tick by tick data
% data = load('may01d.mat');
% ESdata = data.ESlast;
ESdata = close;

ESprices = unique(ESdata);
bars = 20;
timertick = 1;
periods = floor(size(ESdata,1)/(timertick*bars));
MP = zeros(size(ESprices,1), periods);

for i = 1:periods
    g = ESdata(timertick*bars*(i-1)+1 : timertick*bars*i, 1);
    for m = 1:size(g,1)
        s = g(m);
        s1 = find(ESprices == s);
        MP(s1,i) = i;
    end
end

B = MP; %A(:,2:end) ;
B(B==0) = inf;
B = sort(B,2);        % sort each row
B(isinf(B)) = 0;

TPO = zeros(size(B));
for l = 1:numel(B)
    if B(l) ~= 0
        TPO(l) = B(l) + 64;
    end
end

 

Also, MATLAB now has an engine that supports Nvidia's CUDA directly for doing computation on the GPU:

http://www.accelereyes.com/

Edited by darthtrader3.0beta

I'm certainly not going to pay for alpha-stage software.

 

Sorry for your confusion. That pricing page was just a draft for discussion. It was never "alpha" pricing. As you observed that defies common sense.

 

The alpha period will end in a matter of days and seems to have been successful.

 

There's not even a checkout page or PayPal setup for this yet, so it's all for discussion right now. I just updated that page to make that clearer.

 

Can we please cease discussion of tickzoom here?

 

FYI, there has been discussion of using TZ Engine to feed data to Matlab and otherwise connecting them. That was somewhere on ET. But don't ask me where.

 

Wayne


I agree that MATLAB is in another dimension; the only hassle is getting the data inside. There are several ways to do that; the matlabtoib plugin is interesting, and another way is free:

 

http://www.maxdama.com/2008/12/interactive-brokers-via-matlab.html

 

http://www.matlabtrader.com/

 

Sadly, there are more programmers into Ninja, and even more into TradeStation, than into MATLAB.


Darth, one of the fundamental decisions to be made is how you are going to store your data. Many people go with SQL; it is not optimal for storing financial time series, but on the flip side, if you choose something like MySQL there is a whole bunch of 'stuff' for getting your data into other 'things' (like MATLAB).

Darth, one of the fundamental decisions to be made is how you are going to store your data. Many people go with SQL; it is not optimal for storing financial time series, but on the flip side, if you choose something like MySQL there is a whole bunch of 'stuff' for getting your data into other 'things' (like MATLAB).

 

I agree; I can't think of a reason one would want to run queries against a database that contains sequential data. I think saving the trades to a file and then reading it back as a stream is the best solution, since that is exactly how you receive real-time data. This also prevents backtesting mistakes where you know more about the future than you would in reality (e.g. accessing future trades). In my current implementation I just have a list that only holds the last 2 trades, so that I can never access more than the last trade and the trade before it, which has the additional benefits of a small memory footprint and very fast insertion and access times.
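A bare-bones C# sketch of that idea (the Trade type and the names here are hypothetical): the backtester replays trades in arrival order, and the strategy callback only ever receives the previous and current trade, so lookahead is impossible by construction.

using System;
using System.Collections.Generic;

public struct Trade
{
    public DateTime Time;
    public double Price;
    public int Size;
}

public static class Backtester
{
    // Feed trades in the same order a live feed would deliver them.
    public static void Run(IEnumerable<Trade> trades, Action<Trade, Trade> onTrade)
    {
        bool havePrevious = false;
        Trade previous = default(Trade);
        foreach (Trade current in trades)
        {
            if (havePrevious) onTrade(previous, current);   // strategy sees only these two trades
            previous = current;
            havePrevious = true;
        }
    }
}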

 

File based storage has the advantages of mobility, easy backup and can be accessed from all sorts of technologies. And again, you are not dependent on other products.

There are a couple of database products that are designed for sequential data... I am having a senior moment, though, and the names escape me.

 

haha, come on old man... you need one last interesting project to work on before the grave :)

As far as data storage goes, it's not that hard to figure out that a relational database is a joke when what we are really talking about is financial TIME SERIES. Why time-series DBs are not a big topic on message boards is part of why I made this thread to start with; it just doesn't make sense.

http://en.wikipedia.org/wiki/Time_series_database

The king there, bang for the buck, is Berkeley DB:

http://en.wikipedia.org/wiki/Berkeley_DB

 

If you're talking massive data storage with a time-series component, though, you're talking HDF5:

http://en.wikipedia.org/wiki/Hierarchical_Data_Format

Obviously you're not going to beat the performance of accessing flat binary files of tick data directly with an engine for backtesting. My problem with that, though, is that I'm looking out 10 years from now. Ten years of tick data as flat binary files is just not logical... there is a reason to have a database.

MATLAB is already integrated with HDF5 directly:

http://www.mathworks.com/access/helpdesk/help/techdoc/ref/hdf5.html

 

Like I said, there is almost nothing you can think of that has not already been thrown at MATLAB... BESIDES retail market data feeds, for which the only current real-world solution seems to be IB data. My only hang-up there is whether IB data is "good" enough, because I would be storing timestamps with "only" 20ms of granularity.

It's absurd... I'm not looking to do high-frequency algo trading. It's just the stupid bias that I'm missing out on something, even though Ninja was "filtering" the tape because I didn't understand how it handled my "true tick datafeed" for basically the past year.


My problem with that, though, is that I'm looking out 10 years from now. Ten years of tick data as flat binary files is just not logical... there is a reason to have a database

 

Why is tick data in files not logical? What do you need a database for? If you were really thinking long-term, then you would realize that your favorite database technology will probably not exist anymore in 10 years. Either that, or you would have to upgrade a few times or migrate to other products. I save each day of ticks in a file and have a hierarchical folder structure for the years, months and contracts. This is infinitely scalable, and I can compress data I don't use that often, while a database of 10 years of tick data would be massive and the sheer size of your tables would make your queries a lot slower.
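A small sketch of the kind of per-day layout described above; the exact folder and file naming (e.g. Ticks\ES\2009\05\ESM9_2009-05-19.csv) is my assumption, not AgeKay's actual scheme.

using System;
using System.IO;

public static class TickStore
{
    // Build (and create, if needed) the folder root\symbol\year\month, then return the day's file path.
    public static string PathFor(string root, string symbol, string contract, DateTime day)
    {
        string dir = Path.Combine(Path.Combine(Path.Combine(root, symbol),
                                               day.Year.ToString("D4")),
                                  day.Month.ToString("D2"));
        Directory.CreateDirectory(dir);   // does nothing if the folder already exists
        return Path.Combine(dir, contract + "_" + day.ToString("yyyy-MM-dd") + ".csv");
    }
}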


Actually, exactly how it is stored is not the big issue (I guess how you 'chunk it up' might be). The big issue is, of course, exactly how you time-index sequential data. Maybe not such a big deal. Chances are, once you decide where you want to read from, you are going to want to read sequentially until you meet some other criterion. So is the smallest-granularity index you maintain seconds, or will minutes suffice?

 

Just thinking as I go along, you might well be able to shoehorn it into a tree-type structure (whilst maintaining sequential tick storage) without too much compromise, because seconds would be indexed by minutes, which would be indexed by hours, etc.
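For illustration, a minute-level index of this sort can be as simple as a map from minute to byte offset in the tick stream: record the offset of the first tick of each minute while writing, then seek straight to the wanted minute when reading. A C# sketch with hypothetical names:

using System;
using System.Collections.Generic;
using System.IO;

public class MinuteIndex
{
    private readonly SortedDictionary<DateTime, long> offsets = new SortedDictionary<DateTime, long>();

    // Call while appending ticks: remember where each new minute starts.
    public void OnTick(DateTime time, long byteOffset)
    {
        var minute = new DateTime(time.Year, time.Month, time.Day, time.Hour, time.Minute, 0);
        if (!offsets.ContainsKey(minute))
            offsets[minute] = byteOffset;
    }

    // Position the tick stream at the start of the last indexed minute at or before 'target'.
    public void SeekTo(Stream tickStream, DateTime target)
    {
        long best = 0;
        foreach (var pair in offsets)
        {
            if (pair.Key > target) break;
            best = pair.Value;
        }
        tickStream.Seek(best, SeekOrigin.Begin);
    }
}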


BlowFish,

 

I think you are making it way too complicated. All you need to do is read your data one trade at a time. There is absolutely no need for databases and therefore no need for indexes (since there are no tables and no relational data). For example, put each trade of a day in one .csv file. Then read that file line by line (each line has all the information of one trade) and do what you need to do with the data. Reading a whole day's worth of trades takes less than one second on my computer, of which 66% is spent splitting each line into its individual values using the built-in System.String.Split(delimiter) (I haven't optimized the code; I am sure you can make it much faster if you need to). Processing the trades into other things like bars takes milliseconds. And you could then even save those bars (seconds, minutes, hours, whatever) to a .csv file to be able to read them even faster in the future.
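A minimal C# sketch of that loop, assuming one trade per line in a "time,price,size" layout (the exact column order is my assumption):

using System;
using System.IO;

public static class TickReader
{
    public static void ReadDay(string path, Action<DateTime, double, int> onTrade)
    {
        using (var reader = new StreamReader(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // String.Split is the slow part AgeKay mentions; fine for a first cut.
                string[] fields = line.Split(',');
                onTrade(DateTime.Parse(fields[0]),
                        double.Parse(fields[1]),
                        int.Parse(fields[2]));
            }
        }
    }
}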


You are probably correct! But I am a great believer in getting the architecture 'right' :D There are charting packages that use simple flat files of tick data in daily chunks, and they handle things reasonably well.

 

Personally, I think you would want a way to get quickly to ES 3/1/98 11:43:21 without reading sequentially from the start. Of course, if you divide files up into daily chunks (as you suggest), that's a kind of proxy for indexing: you could quickly get to the appropriate day (using the file system) and then read that file sequentially until you have the data you want. That might be absolutely fine for your purposes but is not that scalable. If you just collected the S&P constituents, that's errrr over a million files for 10 years of data (though my math may be off).

 

I don't think it would be too complex to maintain time-based indexes as pointers into the data stream(s). Again, it boils down to what you want to do. If I were rolling my own, I would want a schema that could handle thousands of symbols for many years, even if I were only going to trade the ES based off 5-minute candle patterns (or whatever).


Personally, I think you would want a way to get quickly to ES 3/1/98 11:43:21 without reading sequentially from the start.

 

Well, your program would basically read it all in once when loading the chart, so that you can quickly move to any point in time on your chart. You never have to access it again for the day. And a database cannot be faster than a simple file read operation; databases are not magically faster. They still reside on the hard drive somehow and have to read the data from disk before they can do anything.

 

That might be absolutely fine for your purposes but is not that scalable. If you just collected the S&P constituents, that's errrr over a million files for 10 years of data (though my math may be off).

 

What is not scalable about that? The number of files does not matter. I'd rather have millions of small files than one huge file that could easily be corrupted.


Of course, a 'database' works on top of the file system. The question is whether an index designed specifically to get to the data you want will be better than the native file system (let's assume NTFS... actually, I guess a precursor discussion is what platform you are going to run on).

 

With NTFS, file access times can become unacceptably long when dealing with 'large' numbers of files. This can be alleviated somewhat by having a suitable hierarchical directory structure, but the fact remains that NTFS performance is not good with 'lots' of files. Essentially, in this case you are using the file system as your index, I guess. As an aside, WinFS, which was supposed to ship as Vista's file system but was not ready, is based on a relational database! Maybe that will improve things, maybe not.

 

It is interesting to look at the architecture MS adopted for Exchange Server: a completely different application, but one of the issues they needed to face was how to deal with lots of 'files' (emails).

 

If you are not handling a lot of data, all this is a non-issue, which I guess is your point. :)

Why is tick data in files not logical? What do you need a database for? If you were really thinking long-term, then you would realize that your favorite database technology will probably not exist anymore in 10 years. Either that, or you would have to upgrade a few times or migrate to other products. I save each day of ticks in a file and have a hierarchical folder structure for the years, months and contracts. This is infinitely scalable, and I can compress data I don't use that often, while a database of 10 years of tick data would be massive and the sheer size of your tables would make your queries a lot slower.

 

Well, first, you're thinking of a database as a relational database. It's obvious, if you investigate this area, that a relational database is just a bad idea when it comes to market data, exactly because the tables become absurdly large.

I'm a dotcom CS casualty, so I'm talking out of my ass here, but I would think the case for a database instead of flat files is a simple argument once you get into algorithm analysis, as far as sorting and searching with an index vs. without one. From what I understand, HDF5 is basically an indexed version of having flat binary files in a hierarchical tree: "H"ierarchical "D"ata "F"ormat.

Also, from what I understand, saying HDF5 won't be around in 10 years doesn't make any sense. At this point it seems to be a de facto standard in scientific computing on multi-terabyte data. Even if a technology breakthrough occurs in the next 10 years, it will surely support an easy move from HDF5, because that's what the guys at that level use. That's also why I like MATLAB: if someone comes up with the ultimate AI/machine learning algorithm in the next 10 years, it's an easy bet that MATLAB will support it.


With NTFS, file access times can become unacceptably long when dealing with 'large' numbers of files. This can be alleviated somewhat by having a suitable hierarchical directory structure

 

Or the issues with NTFS can be alleviated by moving the kitchen sink to a Unix environment...

but I am a great believer in getting the architecture 'right' :D There are charting packages that use simple flat files of tick data in daily chunks, and they handle things reasonably well.

 

To your first point, without question... That's why I'm waiting for you geeks to knock down my idea that there is no reason not to use MATLAB and HDF5 for this. Whatever can be done on a retail Windows machine with that setup can transparently move to a massively parallel Unix environment 20 years from now.

To your second point, if you have a lot of tick data, building bars in MATLAB goes a little something like this:

 

% numberOfBarsPerDay:   24 = 1 hour bars    (60 minutes)
%                       48 = 30 minute bars (24*60)/30
%                       72 = 20 minute bars (24*60)/20
%                       96 = 15 minute bars (24*60)/15
%                      144 = 10 minute bars (24*60)/10
%                      288 = 5 minute bars  (24*60)/5
%                     1440 = 1 minute bars  (24*60)/1
%%--------------------------------------------------------------------------

numberOfBarsPerDay = 24;
x = Bid;

% Snap each timestamp down to the start of the bar it falls in
% (assuming 'datetime' holds datenums, i.e. fractional days).
datetimeGrid = (floor(datetime .* numberOfBarsPerDay)) ./ numberOfBarsPerDay;

% Indices where the bar changes, i.e. the last tick of each completed bar.
timeChgPointIndex = find(diff(datetimeGrid) ~= 0);

intervalDatetimeStart  = datetimeGrid(timeChgPointIndex);    % bar start times
intervalDatetimeEnd    = datetimeGrid(timeChgPointIndex+1);  % start of the following bar
intervalDatetimeActual = datetime(timeChgPointIndex);        % timestamp of the last tick in each bar
intervalData           = x(timeChgPointIndex);               % bid at the last tick (the bar's close)

 

 

 

I don't currently understand that code, but it's so simple, elegant and deadly...


With NTFS, file access times can become unacceptably long when dealing with 'large' numbers of files. This can be alleviated somewhat by having a suitable hierarchical directory structure, but the fact remains that NTFS performance is not good with 'lots' of files.

 

...

 

If you are not handling a lot of data, all this is a non-issue, which I guess is your point. :)

 

I think your thinking is too theoretical. Realistically, you would not load 10 years of data in one go (and you could not do that with a database either, since you have only a limited amount of RAM); maybe one day of 500 markets. And yes, I do not plan to store data for the individual S&P stocks. If you store futures tick data only, then you'd be fine with a file-based storage solution.

Edited by AgeKay
Added the text in parentheses


Incidentally, NeoTicker will probably do everything you require; of course, knowing what you require is half the battle :)

 

Come on, Grandpa... do you own NeoTicker? Of course not... it's absurd they are still supporting Delphi. Maybe if they started over and converted everything to C# it would be interesting, but in its current incarnation NeoTicker is a scatterbrained joke.

Come on, Grandpa... do you own NeoTicker? Of course not... it's absurd they are still supporting Delphi. Maybe if they started over and converted everything to C# it would be interesting, but in its current incarnation NeoTicker is a scatterbrained joke.

 

No, to be honest I haven't for quite a while, but it really is pretty powerful. It supports a variety of languages and frameworks, including C# and .NET; the only problem is that it has a steep learning curve.

