Memory-Centric Storage

Apache Ignite is based on a distributed memory-centric architecture that combines the performance and scale of in-memory computing with disk durability and strong consistency in one system.

The main difference between the memory-centric approach and the traditional disk-centric approach is that memory is treated as fully functional storage, not just as a caching layer, as it is in most databases. For example, Apache Ignite can function in a pure in-memory mode, in which case it can be treated as an In-Memory Database (IMDB) and an In-Memory Data Grid (IMDG) in one.

On the other hand, when persistence is turned on, Ignite functions as a memory-centric system where most of the processing happens in memory, but the data and indexes are persisted to disk. The main difference from a traditional disk-centric RDBMS or NoSQL system is that Ignite is strongly consistent, horizontally scalable, and supports both SQL and key-value processing APIs.

Memory Usage Modes
Mode Description
In-Memory

The whole data set is stored in memory. In this scenario, you can achieve the maximum possible performance because the data is never written to disk. To prevent data loss when a single cluster node fails, it is recommended to configure an appropriate number of backup copies (also known as the replication factor). Swap space can be used to prevent memory overflow.

Use cases: in-memory caches, in-memory data grids, in-memory computations, web-session caching, real-time processing of continuous data streams.

In-Memory + 3rd party database

Ignite can be used as a caching layer (also known as a data grid) above an existing 3rd party database: RDBMS, NoSQL, or HDFS. This mode is used to accelerate the underlying database. Automatic integration is provided with most well-known databases, such as Oracle, MySQL, PostgreSQL, and Apache Cassandra.

Use cases: Ignite as an In-Memory Data Grid that adds acceleration and scale to existing database deployments (RDBMS, NoSQL, etc.).

In-Memory + Full Copy on Disk

The whole data set is stored in memory and on disk. The disk is used for data recovery purposes in case of full cluster crashes and restarts. Ignite native persistence is used to store the data on disk.

Use cases: Ignite as an In-Memory Database that provides SQL, key-value, and collocated processing APIs for in-memory data.

100% on Disk + In-Memory Cache

All data is stored in Ignite native persistence, and a smaller subset of it is cached in memory. The more data is cached in memory, the better the performance. The disk serves as the primary storage that survives any type of cluster failure or restart.

Use cases: Ignite as a Memory-Centric Distributed Database that provides a distributed database with SQL, key-value, and collocated processing APIs.

Ignite Native Persistence

Ignite Persistence is the most flexible, scalable, and convenient way of persisting data in Ignite. It is widely used in scenarios where applications need a distributed memory-centric database.

Ignite native persistence is a distributed, ACID, and SQL-compliant disk store that transparently integrates with Ignite's memory-centric storage. Ignite persistence is optional and can be turned on and off. When it is turned off, Ignite becomes a pure in-memory store.

The following are the advantages and characteristics of Apache Ignite when native persistence is used together with the Ignite in-memory store:

In-Memory
  • Off-Heap memory
  • Removes noticeable GC pauses
  • Automatic Defragmentation
  • Predictable memory consumption
  • Boosts SQL performance
On Disk
  • Optional Persistence
  • Support for flash, SSD, Intel 3D XPoint
  • Stores superset of data
  • Fully Transactional
    • Write-Ahead-Log (WAL)
  • Instantaneous Cluster Restarts
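
Because persistence is optional, turning it on is a configuration choice. The following is a minimal sketch of how that might look from Java, assuming Ignite 2.3 or later (where `DataStorageConfiguration` is available) and the `ignite-core` library on the classpath. The class name `PersistenceExample` and the cache name `demoCache` are illustrative and not part of the original text.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class PersistenceExample {
    public static void main(String[] args) {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        // Turn native persistence on for the default data region.
        // Leaving this at its default (false) gives a pure in-memory store.
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storageCfg);

        try (Ignite ignite = Ignition.start(cfg)) {
            // A cluster with persistence enabled is inactive on first start
            // and has to be activated before caches can be used.
            ignite.cluster().active(true);

            // "demoCache" is a hypothetical cache name used only for this sketch.
            IgniteCache<Integer, String> cache = ignite.getOrCreateCache("demoCache");
            cache.put(1, "kept in memory and persisted to disk");
        }
    }
}
```

Newer Ignite releases deprecate `active(true)` in favor of `state(ClusterState.ACTIVE)`, so the activation call may need adjusting for the version in use.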

3rd Party Persistence

Ignite can be used as a caching layer (also known as a data grid) above an existing 3rd party database: RDBMS, NoSQL, or HDFS. This mode is used to accelerate the underlying database that persists the data. Ignite stores data in memory, distributed across multiple nodes, providing fast data access. It reduces the network overhead caused by frequent data movement between an application and the database. However, there are some limitations in comparison to native persistence. For instance, SQL queries are executed only on the data that is in RAM, so the whole data set must be preloaded from disk into memory beforehand.
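
The preloading limitation can be illustrated with a toy model. The sketch below is plain Java and does not use Ignite's APIs: a `HashMap` stands in for the 3rd party database, a second map stands in for the in-memory layer, and all class and method names are hypothetical. It shows why a query that scans only memory misses rows until the data set has been loaded.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class ReadThroughSketch {
    // Stands in for the 3rd party database that persists the data.
    private final Map<Integer, String> database;
    // Stands in for the in-memory caching layer.
    private final Map<Integer, String> memory = new HashMap<>();

    public ReadThroughSketch(Map<Integer, String> database) {
        this.database = database;
    }

    /** Key lookup: a miss in memory falls through to the database. */
    public String get(int key) {
        return memory.computeIfAbsent(key, database::get);
    }

    /** Write to the database and to memory together. */
    public void put(int key, String value) {
        database.put(key, value);
        memory.put(key, value);
    }

    /** A query-like scan that sees only the entries currently in memory. */
    public List<String> queryInMemory(Predicate<String> filter) {
        List<String> result = new ArrayList<>();
        for (String value : memory.values()) {
            if (filter.test(value)) {
                result.add(value);
            }
        }
        return result;
    }

    /** Preload the entire data set so that queries see every row. */
    public void loadAll() {
        memory.putAll(database);
    }
}
```

With three rows in the database and only one of them touched by `get`, `queryInMemory` returns results from that single row; after `loadAll()` it sees all three.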

Swap Space

If you do not want to use Ignite native persistence or 3rd party persistence, you can enable swapping, in which case Ignite in-memory data will be moved to the swap space located on disk if you run out of RAM. When swap space is enabled, Ignite stores data in memory-mapped files (MMF) whose content will be swapped to disk by the OS depending on the current RAM consumption. The swap space is mostly used to avoid out-of-memory errors (OOME) that might happen if RAM consumption goes beyond its capacity and you need more time to scale the cluster out to redistribute the data sets evenly.
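
As a configuration sketch, swapping might be enabled by giving a data region a swap path. This assumes Ignite 2.3 or later and `ignite-core` on the classpath. The region name, the size, and the directory are placeholders chosen for illustration.

```java
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class SwapSpaceConfig {
    /** Builds a node configuration whose data region is backed by memory-mapped swap files. */
    public static IgniteConfiguration build() {
        DataRegionConfiguration regionCfg = new DataRegionConfiguration();
        regionCfg.setName("swappedRegion");                // hypothetical region name
        regionCfg.setMaxSize(4L * 1024 * 1024 * 1024);     // 4 GB, an arbitrary example value
        regionCfg.setSwapPath("/path/to/swap/directory");  // placeholder directory

        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        storageCfg.setDataRegionConfigurations(regionCfg);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storageCfg);
        return cfg;
    }
}
```

A cache would then be assigned to this region with `CacheConfiguration.setDataRegionName("swappedRegion")`.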

Collocated vs Client-Server Processing

Disk-centric systems, such as RDBMS or NoSQL databases, generally use the classic client-server approach, where the data is brought from the server to the client side, where it gets processed and is then usually discarded. This approach does not scale well because moving the data over the network is the most expensive operation in a distributed system.

A much more scalable approach is collocated processing, which reverses the flow by bringing the computations to the servers where the data actually resides. This approach allows you to execute advanced logic or distributed SQL with JOINs exactly where the data is stored, avoiding expensive serialization and network trips.
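
The difference in network traffic can be made concrete with a toy model. The sketch below is plain Java rather than Ignite code: each inner list stands in for the data held by one server node, and a counter records how many values would have to cross the network. All names are hypothetical.

```java
import java.util.List;

public class CollocationSketch {
    /** The computed answer plus the number of values that crossed the simulated network. */
    public static final class Outcome {
        public final long sum;
        public final long valuesTransferred;

        Outcome(long sum, long valuesTransferred) {
            this.sum = sum;
            this.valuesTransferred = valuesTransferred;
        }
    }

    /** Client-server: every record is shipped to the client, which then does the work. */
    public static Outcome clientServer(List<List<Integer>> nodes) {
        long sum = 0;
        long moved = 0;
        for (List<Integer> node : nodes) {
            for (int value : node) {
                moved++;        // each record travels to the client
                sum += value;   // the client performs the addition
            }
        }
        return new Outcome(sum, moved);
    }

    /** Collocated: each node sums its own data and returns a single partial result. */
    public static Outcome collocated(List<List<Integer>> nodes) {
        long sum = 0;
        long moved = 0;
        for (List<Integer> node : nodes) {
            long partial = 0;
            for (int value : node) {
                partial += value;   // work happens where the data lives
            }
            moved++;                // only the partial sum travels
            sum += partial;
        }
        return new Outcome(sum, moved);
    }
}
```

Both methods return the same sum, but the collocated version transfers one value per node instead of one per record. In Ignite itself, the compute API offers calls such as `affinityRun` for sending a job to the node that owns a given key.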

Partitioning & Replication

Depending on the configuration, Ignite can either partition or replicate data across its memory-centric storage. Unlike REPLICATED mode, where data is fully replicated across all nodes in the cluster, in PARTITIONED mode Ignite splits the data evenly across multiple cluster nodes, allowing for storing terabytes of data both in memory and on disk.
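
To illustrate the two placement strategies, here is a simplified model in plain Java. It assumes a hash-modulo placement for PARTITIONED mode purely for illustration. Ignite's actual affinity function is more elaborate, and the class and method names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class PlacementSketch {
    public enum Mode { PARTITIONED, REPLICATED }

    /** Returns the indexes of the nodes that hold a copy of the given key. */
    public static List<Integer> owners(Object key, int nodeCount, Mode mode) {
        List<Integer> owners = new ArrayList<>();
        if (mode == Mode.REPLICATED) {
            // Every node stores every key.
            for (int node = 0; node < nodeCount; node++) {
                owners.add(node);
            }
        } else {
            // Simplified assumption: one owner chosen by hash modulo node count.
            owners.add(Math.floorMod(key.hashCode(), nodeCount));
        }
        return owners;
    }
}
```

In this model a key has as many owners as there are nodes in REPLICATED mode and exactly one owner in PARTITIONED mode, which is why the partitioned layout lets total capacity grow with the number of nodes.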

Redundancy

Ignite also allows you to configure multiple backup copies to guarantee data resiliency in case of failures.
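
A minimal configuration sketch for a partitioned cache with one backup copy might look as follows, assuming `ignite-core` on the classpath. The cache name and the key and value types are illustrative.

```java
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.configuration.CacheConfiguration;

public class BackupConfig {
    /** Builds a cache configuration that keeps one backup copy of every partition. */
    public static CacheConfiguration<Integer, String> build() {
        // "demoCache" is a hypothetical cache name used only for this sketch.
        CacheConfiguration<Integer, String> cacheCfg = new CacheConfiguration<>("demoCache");
        cacheCfg.setCacheMode(CacheMode.PARTITIONED); // or CacheMode.REPLICATED
        cacheCfg.setBackups(1);                       // one backup copy per partition
        return cacheCfg;
    }
}
```

The resulting configuration can be passed to `Ignite.getOrCreateCache(...)` on a running node.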

Consistency

Regardless of which replication scheme is used, Ignite guarantees data consistency across all cluster members.

More on Memory-Centric Storage

Feature Description
Persistence

Ignite native persistence is a distributed, ACID, and SQL-compliant disk store that transparently integrates with Ignite memory-centric storage.

Partitioning & Replication

Depending on the configuration, Ignite can either partition or replicate data. Unlike REPLICATED mode, where data is fully replicated across all nodes in the cluster, in PARTITIONED mode Ignite splits the data evenly across multiple cluster nodes.

Distributed Database

Apache Ignite can be used as an all-in-one distributed database that supports SQL, key-value, compute, machine learning, and other data processing APIs.

In-Memory Database

Apache Ignite can be used as a distributed and horizontally scalable in-memory database (IMDB).

Data Grid

Ignite can act as a data grid that is a distributed, transactional key-value store. Unlike other in-memory data grids (IMDG), Ignite enables storing data both in memory and on disk, and is therefore able to store more data than can fit in physical memory.

Database Caching

Ignite is used as a caching layer (also known as a data grid) above 3rd party databases such as RDBMS, Apache Cassandra, and MongoDB.