Fast In-Memory MapReduce Using Apache Ignite
Apache Ignite In-Memory MapReduce lets you efficiently parallelize the processing of data stored in any Hadoop file system, including the In-Memory File System provided by Ignite.
It eliminates the overhead associated with the NameNode, JobTracker, and TaskTrackers of a standard Hadoop architecture while providing low-latency distributed processing.
Ignite MapReduce performs much better than standard Hadoop MapReduce for two reasons: resource allocation is push-based (in Hadoop it is pull-based), and computations are collocated in-process with the data they operate on.
In HDFS, the NameNode stores all the metadata and can be a single point of failure. In Ignite, every client can determine which node a key belongs to by passing the key to a hashing function, with no need for special mapping servers or name nodes.
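The idea of locating a key's node with a hash function alone can be illustrated with a minimal sketch of rendezvous (highest-random-weight) hashing. This is not Ignite's implementation: Ignite's real affinity function is more involved (it maps keys to partitions and partitions to nodes), and the class name KeyToNodeSketch, the node IDs, the file-path keys, and the use of CRC32 as the hash are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.zip.CRC32;

/** Illustrative only: a simplified rendezvous hash, not Ignite's affinity function. */
final class KeyToNodeSketch {

    /** Weight of a (node, key) pair. CRC32 is chosen for brevity, not hash quality. */
    public static long weight(String nodeId, String key) {
        CRC32 crc = new CRC32();
        crc.update((nodeId + "|" + key).getBytes(StandardCharsets.UTF_8));
        return crc.getValue(); // always in the range 0 .. 2^32 - 1
    }

    /**
     * Returns the node with the highest weight for the key. Any caller that
     * knows the same set of node IDs computes the same owner locally, so no
     * central lookup service is consulted.
     */
    public static String ownerOf(String key, List<String> nodeIds) {
        if (nodeIds.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        String best = null;
        long bestWeight = -1L;
        for (String nodeId : nodeIds) {
            long w = weight(nodeId, key);
            // Ties are broken by node ID so the result does not depend on list order.
            if (w > bestWeight || (w == bestWeight && nodeId.compareTo(best) < 0)) {
                best = nodeId;
                bestWeight = w;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical node IDs and keys.
        List<String> nodes = List.of("node-a", "node-b", "node-c");
        for (String key : List.of("/logs/part-0001", "/logs/part-0002", "/logs/part-0003")) {
            System.out.println(key + " -> " + ownerOf(key, nodes));
        }
    }
}
```

In this sketch, removing a node that does not own a key leaves that key's owner unchanged, so only the keys of a departed node need to be reassigned.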
Because the Ignite File System (IGFS) does not need a NameNode, Ignite MapReduce jobs that run on IGFS go directly to the IGFS data nodes in a single round-trip.