I figure I'll have some newfound free time in December (I write the first CFA exam on December 4th), so I might as well try to do this. The first step will be to build a simple weighted linear regression machine. If I can implement one in Excel, surely it can't be that hard to implement one anywhere else. After that, figuring out the actual algorithm will take a combination of digging into the R source code, using some common sense, and talking to a friend who has actually built one of these before.
But I'd love help if anyone's game. Even just to answer questions. Like, how do I actually set up a test Hadoop server? Or more importantly, is this a silly exercise?
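For what it's worth, here is a rough sketch of what that first step, the weighted linear regression, might look like outside of Excel. It is only an illustration under some assumptions of mine: a single predictor, the textbook closed-form weighted least squares solution, and plain Python with no libraries. The function name `weighted_linear_fit` and its arguments are made up for this example and are not taken from R or any other package.

```python
def weighted_linear_fit(xs, ys, ws):
    """Fit y = intercept + slope * x by weighted least squares.

    xs, ys, ws are equal-length sequences of numbers; ws holds the
    non-negative weight given to each observation. (Hypothetical helper,
    written for illustration only.)
    """
    if not (len(xs) == len(ys) == len(ws)):
        raise ValueError("xs, ys and ws must have the same length")

    total_w = sum(ws)
    if total_w <= 0:
        raise ValueError("weights must sum to a positive number")

    # Weighted means of x and y.
    x_bar = sum(w * x for x, w in zip(xs, ws)) / total_w
    y_bar = sum(w * y for y, w in zip(ys, ws)) / total_w

    # Weighted sums of squares and cross-products around those means.
    sxx = sum(w * (x - x_bar) ** 2 for x, w in zip(xs, ws))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for x, y, w in zip(xs, ys, ws))
    if sxx == 0:
        raise ValueError("x values have no weighted spread; slope is undefined")

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


if __name__ == "__main__":
    # Points that lie exactly on y = 1 + 2x, with arbitrary weights.
    intercept, slope = weighted_linear_fit([0, 1, 2, 3], [1, 3, 5, 7], [1, 2, 1, 4])
    print(intercept, slope)
```

Because the example points sit exactly on a line, the fit should recover an intercept of 1 and a slope of 2 (up to floating-point rounding) whatever the weights are. With noisy data, the weights decide which observations pull the line hardest.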
2 comments:
Take a look at the Mahout logistic regression code. It is a blazing fast stochastic gradient descent implementation that makes map-reduce implementations much less interesting by being *really* fast.
Thanks Ted. I'll take a look.