Apache Hadoop Performance Acceleration

Benefits Of Using Apache Ignite

Real-time analytics

Apache Ignite enables real-time analytics across Apache Hadoop operational and historical data silos.

Low-latency and high-throughput operations

Ignite enables low-latency and high-throughput access while Hadoop continues to be used for long-running OLAP workloads.

How Does Apache Ignite Acceleration Work?

There are 3 basic steps:

Depending on the data volume and available memory capacity, you can enable Ignite native persistence to store historical data sets on disk while dedicating a memory space for operational records.

You can continue to use Hadoop as storage for less frequently used data or for long-running and ad-hoc analytical queries.
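The memory-versus-disk decision described above can be sketched as a small sizing helper. This is only an illustration in plain Python: `StoragePlan`, `plan_storage`, and the gigabyte figures are hypothetical names and inputs, not Ignite settings. It relies on one property of native persistence: the full data set lives on disk while a subset is kept in memory.

```python
from dataclasses import dataclass


@dataclass
class StoragePlan:
    """Hypothetical summary of a sizing decision (not an Ignite class)."""
    native_persistence: bool  # True when historical data must live on disk
    in_memory_gb: int         # data served from RAM
    on_disk_gb: int           # data kept in the Ignite disk tier


def plan_storage(operational_gb: int, historical_gb: int,
                 memory_budget_gb: int) -> StoragePlan:
    """Decide whether native persistence is needed for a given memory budget."""
    if operational_gb > memory_budget_gb:
        # Operational records are the ones that must stay in memory.
        raise ValueError("operational data does not fit in the memory budget")
    total_gb = operational_gb + historical_gb
    if total_gb <= memory_budget_gb:
        # Everything fits in RAM, so a pure in-memory deployment is an option.
        return StoragePlan(False, total_gb, 0)
    # With persistence on, disk holds the full data set and RAM caches a subset.
    return StoragePlan(True, memory_budget_gb, total_gb)
```

For example, 100 GB of operational data plus 400 GB of history against a 160 GB memory budget yields a plan with persistence enabled, while 100 GB plus 50 GB fits in memory without it.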

Your applications and services should use Ignite native APIs to process the data residing in the in-memory cluster. Ignite provides SQL, compute (also known as map-reduce), and machine learning APIs for various data processing needs.

Consider using Apache Spark DataFrames APIs if an application needs to run federated or cross-database queries across Ignite and Hadoop clusters.

Ignite is integrated with Spark, which natively supports Hive/Hadoop. Cross-database queries should be considered only for a limited number of scenarios when neither Ignite nor Hadoop contains the entire data set.

How Can You Split Data And Operations Between Ignite And Hadoop?

Use Apache Ignite for tasks that require:
– Low-latency response time (microseconds, milliseconds, seconds)
– High-throughput operations (thousands and millions of operations per second)
– Real-time processing

Continue using Apache Hadoop for:
– High-latency operations (dozens of seconds, minutes, hours)
– Batch processing
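The split above can be expressed as a simple routing rule. The following is a minimal sketch in plain Python, not part of any Ignite or Hadoop API. The function name `choose_engine` and the 10-second cut-off are assumptions, chosen because the lists place "seconds" on the Ignite side and "dozens of seconds" on the Hadoop side.

```python
from datetime import timedelta

# Assumed boundary between "seconds" (Ignite) and "dozens of seconds" (Hadoop).
LATENCY_CUTOFF = timedelta(seconds=10)


def choose_engine(latency_budget: timedelta, batch: bool = False) -> str:
    """Return "ignite" or "hadoop" for a workload (hypothetical helper)."""
    if batch or latency_budget >= LATENCY_CUTOFF:
        # Batch jobs and high-latency operations stay on Hadoop.
        return "hadoop"
    # Low-latency, real-time work goes to the in-memory cluster.
    return "ignite"
```

A dashboard query with a 5-millisecond budget would be routed to Ignite, while a 30-minute report or any batch job would stay on Hadoop. In practice the cut-off depends on your own service-level targets.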

5 Steps To Implement The Architecture In Practice

Download and install Apache Ignite to your system.

Select a list of operations for Ignite.

The best candidates are operations that require low-latency response time, high throughput, and real-time analytics.

Consider enabling Ignite native persistence, or use Ignite as a pure in-memory cache or as an in-memory data grid that persists changes to Hadoop or another external database.

Update your applications.

Ensure they use Ignite native APIs to process Ignite data and Spark for federated queries.

If you need to replicate changes between Ignite and Hadoop clusters, use existing change-data-capture solutions:

Debezium
Kafka
GridGain Data Lake Accelerator
Oracle GoldenGate
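Whichever tool you pick, change-data-capture comes down to replaying an ordered log of changes against the other cluster. The sketch below shows that idea in plain Python. The `ChangeEvent` record and `apply_changes` function are hypothetical and do not reflect the event format of any product listed above.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical change record; real CDC tools define their own formats."""
    seq: int                     # position in the source's change log
    op: str                      # "upsert" or "delete"
    key: str
    value: Optional[Any] = None


def apply_changes(target: Dict[str, Any], events: Iterable[ChangeEvent],
                  last_applied: int = 0) -> int:
    """Apply events to `target` in log order and return the new high-water mark.

    Events at or below `last_applied` are skipped, so replaying a batch after
    a failure does not apply a change twice.
    """
    for event in sorted(events, key=lambda e: e.seq):
        if event.seq <= last_applied:
            continue
        if event.op == "upsert":
            target[event.key] = event.value
        elif event.op == "delete":
            target.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op}")
        last_applied = event.seq
    return last_applied
```

Tracking a high-water mark is one common way to make replay safe. A production pipeline would also need to persist that mark and handle schema changes, which this sketch leaves out.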

To write changes through to Hadoop directly, implement Ignite's CacheStore interface.
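To illustrate the write-through pattern itself, here is a minimal sketch in plain Python. It is not Ignite code. The `Store` protocol is a hypothetical contract whose `load`, `write`, and `delete` methods are loosely modeled on the operations a cache store has to provide, and `DictStore` is a stand-in for Hadoop-backed storage.

```python
from typing import Any, Dict, Optional, Protocol


class Store(Protocol):
    """Hypothetical backing-store contract (not the Ignite interface)."""
    def load(self, key: str) -> Optional[Any]: ...
    def write(self, key: str, value: Any) -> None: ...
    def delete(self, key: str) -> None: ...


class DictStore:
    """Stand-in for external storage; a real store would call Hadoop here."""
    def __init__(self) -> None:
        self.rows: Dict[str, Any] = {}

    def load(self, key: str) -> Optional[Any]:
        return self.rows.get(key)

    def write(self, key: str, value: Any) -> None:
        self.rows[key] = value

    def delete(self, key: str) -> None:
        self.rows.pop(key, None)


class WriteThroughCache:
    """Cache that forwards every update to the backing store synchronously."""
    def __init__(self, store: Store) -> None:
        self._store = store
        self._data: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        self._store.write(key, value)  # persist first, then cache
        self._data[key] = value

    def get(self, key: str) -> Optional[Any]:
        if key not in self._data:
            value = self._store.load(key)  # read-through on a cache miss
            if value is not None:
                self._data[key] = value
            return value
        return self._data[key]

    def remove(self, key: str) -> None:
        self._store.delete(key)
        self._data.pop(key, None)
```

Writing to the store before updating the cache means a failed write never leaves unsaved data in memory, at the cost of added latency on every update. Write-behind (asynchronous, batched) is the usual alternative when that latency matters.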

Ready to Start?

Discover our quick start guide and build your first application in 5-10 minutes.

Want to Learn More?

Read the Apache Spark acceleration article