Serialized data
2014-08-12 20:51:31.995945+00 by Dan Lyke 0 comments
Profiling Hadoop jobs with Riemann:
In almost every job I've profiled, serialization dominates. In fact, it might be safe to say that less than 10% of the compute time in our Hadoop jobs is actually doing real work. The majority is spent parsing serialized data structures and emitting new ones.
A little lesson about distributed compute and frameworks there...
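For a rough feel of how that split can come about, here is a minimal sketch in Python. It is not the Riemann-based profiling from the linked article, and the record shape, the `profile_record_pipeline` name, and the use of JSON are all assumptions made for illustration. It times the parse, "real work", and emit phases of a toy record pipeline and reports each phase's share of the total.

```python
import json
import time


def profile_record_pipeline(n_records=20000):
    """Return the fraction of time spent parsing, working, and emitting.

    Hypothetical stand-in for a map task: JSON lines in, JSON lines out.
    """
    # Made-up input records, serialized up front so that building them
    # is not counted in any of the timed phases.
    lines = [
        json.dumps({"id": i, "tags": ["a", "b", "c"], "value": i * 0.5})
        for i in range(n_records)
    ]

    parse_s = work_s = emit_s = 0.0
    out = []
    for line in lines:
        t0 = time.perf_counter()
        rec = json.loads(line)                # deserialize
        t1 = time.perf_counter()
        rec["value"] = rec["value"] * 2 + 1   # the "real work"
        t2 = time.perf_counter()
        out.append(json.dumps(rec))           # serialize again
        t3 = time.perf_counter()

        parse_s += t1 - t0
        work_s += t2 - t1
        emit_s += t3 - t2

    total = parse_s + work_s + emit_s
    return {
        "parse": parse_s / total,
        "work": work_s / total,
        "emit": emit_s / total,
    }


if __name__ == "__main__":
    for phase, share in profile_record_pipeline().items():
        print(f"{phase:>5}: {share:.1%}")
```

The exact numbers depend on the machine, the record shape, and the serialization format, and the timer calls add overhead of their own, so treat the output as a rough indication rather than a measurement of any real Hadoop job.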