Apache Ignite enables real-time analytics across Apache Hadoop operational and historical data silos.
Ignite enables low-latency and high-throughput access while Hadoop continues to be used for long-running OLAP workloads.
To accelerate Hadoop-based systems, deploy Ignite as a separate distributed storage layer that maintains the data sets required for your low-latency operations or real-time reports.
Depending on the data volume and available memory capacity, you can enable Ignite native persistence to store historical data sets on disk while dedicating a memory space for operational records.
You can continue to use Hadoop as storage for less frequently used data or for long-running and ad-hoc analytical queries.
Your applications and services should use Ignite native APIs to process the data residing in the in-memory cluster. Ignite provides SQL, compute (also known as map-reduce), and machine learning APIs for various data processing needs.
Consider using Apache Spark DataFrames APIs if an application needs to run federated or cross-database queries across Ignite and Hadoop clusters.
Ignite is integrated with Spark, which natively supports Hive/Hadoop. Cross-database queries should be considered only for a limited number of scenarios when neither Ignite nor Hadoop contains the entire data set.
Use Apache Ignite for tasks that require:
– Low-latency response time (microseconds, milliseconds, seconds)
– High-throughput operations (thousands and millions of operations per second)
– Real-time processing
Continue using Apache Hadoop for:
– High-latency operations (dozens of seconds, minutes, hours)
– Batch processing
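The split above can be sketched as a small routing rule. This is only an illustration: the `WorkloadRouter` class, the `Target` enum, and the 10-second cutoff are assumptions made for this example, since the text only distinguishes "seconds" from "dozens of seconds". It is not part of any Ignite or Hadoop API.

```java
import java.time.Duration;

/** Hypothetical helper: picks a target system from a latency budget and a batch flag. */
final class WorkloadRouter {
    enum Target { IGNITE, HADOOP }

    // Assumed cutoff between "seconds" and "dozens of seconds"; tune for your workloads.
    static final Duration CUTOFF = Duration.ofSeconds(10);

    static Target route(Duration latencyBudget, boolean batchJob) {
        // Batch processing stays on Hadoop regardless of the latency budget.
        if (batchJob) {
            return Target.HADOOP;
        }
        // Tight latency budgets go to Ignite; everything else stays on Hadoop.
        return latencyBudget.compareTo(CUTOFF) < 0 ? Target.IGNITE : Target.HADOOP;
    }

    public static void main(String[] args) {
        System.out.println(route(Duration.ofMillis(5), false));   // interactive lookup
        System.out.println(route(Duration.ofMinutes(30), false)); // long-running report
        System.out.println(route(Duration.ofMillis(5), true));    // batch job
    }
}
```

In practice the decision is made once per operation type during design rather than per request, but writing it down as a rule helps when auditing which workloads to move.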
The best candidates for acceleration with Ignite are operations that require low-latency response time, high throughput, and real-time analytics.
Choose a storage mode: enable Ignite native persistence, use Ignite as a pure in-memory cache, or use it as an in-memory data grid that persists changes to Hadoop or another external database.
Ensure your applications use Ignite native APIs to process Ignite data and Spark for federated queries.
GridGain Data Lake Accelerator
To write changes through to Hadoop directly, implement Ignite's CacheStore interface.
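The write-through idea can be illustrated without any Ignite dependency. In the sketch below, `BackingStore`, `MapBackingStore`, and `WriteThroughCache` are hypothetical names invented for this example. `BackingStore` is a simplified stand-in with a load/write/delete shape, not Ignite's actual `CacheStore` interface. In a real deployment you would implement `org.apache.ignite.cache.store.CacheStore` (commonly by extending `CacheStoreAdapter`), and its write and delete methods would talk to Hadoop rather than to a map.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an external store; NOT Ignite's CacheStore interface. */
interface BackingStore<K, V> {
    V load(K key);
    void write(K key, V value);
    void delete(K key);
}

/** In-memory stand-in for Hadoop-side storage, used only to keep the sketch runnable. */
final class MapBackingStore<K, V> implements BackingStore<K, V> {
    final Map<K, V> rows = new HashMap<>();

    public V load(K key) { return rows.get(key); }
    public void write(K key, V value) { rows.put(key, value); }
    public void delete(K key) { rows.remove(key); }
}

/** Cache that synchronously propagates every update to the backing store. */
final class WriteThroughCache<K, V> {
    private final Map<K, V> memory = new HashMap<>();
    private final BackingStore<K, V> store;

    WriteThroughCache(BackingStore<K, V> store) {
        this.store = store;
    }

    void put(K key, V value) {
        store.write(key, value); // persist first, so memory never holds an unsaved value
        memory.put(key, value);
    }

    V get(K key) {
        V value = memory.get(key);
        if (value == null) {
            // Read-through on a miss: fetch from the store and keep a copy in memory.
            value = store.load(key);
            if (value != null) {
                memory.put(key, value);
            }
        }
        return value;
    }

    void remove(K key) {
        store.delete(key);
        memory.remove(key);
    }
}

final class WriteThroughDemo {
    public static void main(String[] args) {
        MapBackingStore<String, Integer> store = new MapBackingStore<>();
        WriteThroughCache<String, Integer> cache = new WriteThroughCache<>(store);
        cache.put("orders", 42);
        // The update reached the backing store as part of the same put call.
        System.out.println(store.rows.containsKey("orders"));
    }
}
```

Writing to the store before updating memory is a design choice in this sketch: if the external write fails, the cache is left unchanged, and readers never see a value that was not persisted.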