Accelerate Apache Spark SQL Queries

Running SQL queries through Ignite shared RDDs or DataFrames can be much faster than running Spark SQL over native Spark RDD or DataFrame implementations.

In-Memory Indexes

Spark does not support SQL indexes, so SQL queries are slow because each one performs a full scan of the whole data set. Such full-scan queries in Spark can take minutes and introduce significant wait times, especially when many queries run within the same Spark application.

Apache Ignite, on the other hand, supports SQL with in-memory indexing. Because of its advanced in-memory indexing capabilities, IgniteRDD can execute SQL queries hundreds of times faster than native Spark RDDs or DataFrames.
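
As a rough illustration of the indexed-query path described above, the following Scala sketch declares an index on one field and runs a filtered SQL query through IgniteRDD. The `Person` class, the `persons` cache name, the sample data, and the `local[*]` master are all assumptions made for this example. It also assumes the ignite-spark module is on the classpath and that at least one Ignite server node is already running and discoverable with the default configuration.

```scala
import org.apache.ignite.cache.query.annotations.QuerySqlField
import org.apache.ignite.configuration.{CacheConfiguration, IgniteConfiguration}
import org.apache.ignite.spark.{IgniteContext, IgniteRDD}
import org.apache.spark.{SparkConf, SparkContext}

import scala.annotation.meta.field

// Hypothetical value type. `index = true` asks Ignite to build an index on `age`.
case class Person(
  @(QuerySqlField @field) name: String,
  @(QuerySqlField @field)(index = true) age: Int
)

object IndexedSqlExample {
  def main(args: Array[String]): Unit = {
    // Assumption: a local Spark master, used only to keep the sketch self-contained.
    val sc = new SparkContext(
      new SparkConf().setAppName("ignite-indexed-sql").setMaster("local[*]"))

    // Connect Spark to Ignite using a default Ignite configuration.
    val ic = new IgniteContext(sc, () => new IgniteConfiguration())

    // Register Person as a queryable SQL type in a cache named "persons" (assumed name).
    val cacheCfg = new CacheConfiguration[java.lang.Long, Person]("persons")
      .setIndexedTypes(classOf[java.lang.Long], classOf[Person])

    val persons: IgniteRDD[java.lang.Long, Person] = ic.fromCache(cacheCfg)

    // Load sample data into the Ignite cache through Spark.
    persons.savePairs(
      sc.parallelize(1L to 100000L)
        .map(id => (java.lang.Long.valueOf(id), Person(s"name-$id", (id % 90).toInt))))

    // The SQL is executed by Ignite, which can use the index on `age`;
    // the result comes back as a Spark DataFrame.
    val result = persons.sql("select name, age from Person where age > ?", 80)
    result.show(10)

    ic.close(true)
    sc.stop()
  }
}
```

The filter column is the indexed one on purpose: a predicate on a non-indexed field would still run, but Ignite would have no index to use for it.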

Off-Heap Memory

Ignite stores data and indexes in off-heap memory, which allows Ignite to hold petabytes of data and lets Spark process that data without worrying about JVM garbage collection overhead.
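
How much off-heap memory Ignite may use is a matter of node configuration. The sketch below shows one way this could be expressed in Scala with the data region settings available in Ignite 2.3 and later. The object name, the helper name `igniteConfig`, and the region name are assumptions for illustration, not values taken from this document.

```scala
import org.apache.ignite.configuration.{DataRegionConfiguration, DataStorageConfiguration, IgniteConfiguration}

object OffHeapConfig {
  // Builds a configuration whose default data region may grow
  // up to `maxBytes` of off-heap memory on the node that uses it.
  def igniteConfig(maxBytes: Long): IgniteConfiguration = {
    val region = new DataRegionConfiguration()
      .setName("Default_Region") // assumed region name
      .setMaxSize(maxBytes)

    val storage = new DataStorageConfiguration()
      .setDefaultDataRegionConfiguration(region)

    new IgniteConfiguration().setDataStorageConfiguration(storage)
  }
}
```

For example, `OffHeapConfig.igniteConfig(4L * 1024 * 1024 * 1024)` would cap the default region at 4 GB. Data region limits matter on the nodes that actually store data, so a configuration like this would typically be applied to the Ignite server nodes rather than to client nodes started from Spark.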

Run SQL Queries against Ignite cluster

Speeding up DataFrames access with Ignite