Uncategorized – Free Search

MapReduce cookbook for machine learning

July 30, 2007, 3:15 pm

Here’s a paper from Stanford showing how to use MapReduce to implement ten different machine learning algorithms scalably!

Cloud: commodity or proprietary?

April 9, 2008, 9:45 am

A few days ago Google announced App Engine, which lets folks build applications that run in Google’s cloud. For a while now, Amazon has offered a number of services that let folks run applications in Amazon’s...

Hadoop Sorts a Petabyte

May 12, 2009, 10:45 am

Woot! Owen and Arun have posted new Hadoop sort benchmark results. This is a great milestone for both throughput (a petabyte in ~16 hours) and latency (a terabyte in ~1 minute).

Some early Avro benchmarks

May 12, 2009, 1:00 pm

Avro is my current project. It’s a slightly different take on data serialization. Most data serialization systems, like Thrift and Protocol Buffers, rely on code generation, which can be awkward with...

Joining Cloudera

August 10, 2009, 1:01 pm

I will be leaving Yahoo! at the end of this month to join Cloudera. About five years ago I was working with Mike Cafarella on Apache Nutch, an open-source web-search engine. Initially we were able to...
