High-Performance Computing With Apache Ignite

Schema-driven colocation and compute-to-data patterns

Compute-To-Data Pattern

Apache Ignite supports the compute-to-data pattern: calculations run on the nodes where the data resides instead of pulling the data to the caller. Colocation is driven by table design, so joins between related tables can run locally on each node without moving rows across the network. Recommendation engines and analytics workloads benefit because the network transfers that dominate distributed-join latency are avoided.

Both Ignite 2 and Ignite 3 support the compute-to-data pattern. Ignite 2 uses the @AffinityKeyMapped annotation (the affinity key) for colocation. Ignite 3 uses colocation keys defined in the table schema. Both versions provide Compute APIs for executing code across cluster nodes.

How Colocation Works

Schema-driven colocation keeps related data on the same nodes for local processing

Schema-Driven Colocation

Define colocation keys in the table schema to control data placement. Related records (orders with order items, users with transactions) are stored in the same partition, so joins between them execute locally without network transfers. This reduces latency significantly compared to distributed joins.
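
To make the placement rule concrete, here is a minimal conceptual sketch in Python. It is not Ignite code and does not use Ignite's affinity function: the partition count, the CRC32 hash, and the sample rows are all assumptions chosen for illustration. It only shows why two tables placed by the same key value end up in the same partition.

```python
import zlib

PARTITIONS = 8  # hypothetical partition count for this sketch


def partition_for(colocation_key) -> int:
    """Map a colocation key value to a partition with a stable hash.

    CRC32 is a stand-in; Ignite uses its own mapping.
    """
    return zlib.crc32(str(colocation_key).encode("utf-8")) % PARTITIONS


# Hypothetical rows: (order_id, customer) and (order_id, item_id, price).
orders = [(101, "alice"), (102, "bob"), (103, "carol")]
order_items = [(101, 1, 9.50), (101, 2, 20.00), (102, 1, 5.25), (103, 1, 12.00)]

# Both tables are placed by order_id, so an order and its items always
# hash to the same partition.
order_partition = {order_id: partition_for(order_id) for order_id, _ in orders}
item_partitions = [(order_id, item_id, partition_for(order_id))
                   for order_id, item_id, _ in order_items]

for order_id, item_id, part in item_partitions:
    print(f"order {order_id} item {item_id} -> partition {part} "
          f"(order row is on partition {order_partition[order_id]})")
```

If order_items were placed by item_id instead, the two partition numbers printed on each line would generally differ, and the join would need rows from other nodes.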

Compute-To-Data Execution

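Execute calculations on the nodes where the data resides. Compute APIs send jobs to cluster nodes, either to the node that owns a given key or to all nodes at once. Custom code then runs locally against colocated data sets, so data-intensive calculations return only their results over the network instead of the underlying rows.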
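
The idea can be sketched as a small in-process model. This is plain Python, not the Ignite Compute API: the node names, the owner_of mapping, and the helper functions run_on_owner and broadcast are hypothetical names introduced only for this illustration.

```python
import zlib
from collections import defaultdict

NODES = ["node-a", "node-b", "node-c"]  # hypothetical cluster


def owner_of(key) -> str:
    """Pick the node that owns a key (stable hash, illustrative only)."""
    return NODES[zlib.crc32(str(key).encode("utf-8")) % len(NODES)]


# Each node keeps only its own rows: node -> list of (user_id, amount).
storage = defaultdict(list)
for user_id, amount in [(1, 10.0), (1, 5.0), (2, 7.5),
                        (3, 1.0), (3, 2.0), (3, 3.0)]:
    storage[owner_of(user_id)].append((user_id, amount))


def run_on_owner(user_id, job):
    """Run the job against the owning node's local rows.

    Only the job's return value would cross the network.
    """
    return job(user_id, storage[owner_of(user_id)])


def broadcast(job):
    """Run the job once per node against that node's local rows."""
    return {node: job(storage[node]) for node in NODES}


def total_spend(user_id, local_rows):
    return sum(amount for uid, amount in local_rows if uid == user_id)


def local_sum(local_rows):
    return sum(amount for _, amount in local_rows)


user_total = run_on_owner(3, total_spend)         # one number comes back
grand_total = sum(broadcast(local_sum).values())  # one number per node
print(user_total, grand_total)
```

In both calls the rows never leave their node in this model; the caller receives a single number per job rather than the data set itself.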

Architecture Pattern

Local Joins With Colocated Data

Define colocation keys in the table schema to ensure that related records reside on the same partitions, enabling local joins without network overhead.

Integration Pattern: Design table schemas with colocation keys that match your join patterns. For example, colocate the orders table with order items using orderId, and the users table with transactions using userId. Joins on those keys then execute within each node, on locally stored rows, without network transfers.

Performance Characteristics: Local joins reduce latency significantly compared to distributed joins because no rows are transferred between nodes for colocated data. Memory-first storage keeps join execution low-latency. Horizontal scalability is maintained through a proper partitioning strategy.

Version Support: Ignite 2 uses the @AffinityKeyMapped annotation for colocation. Ignite 3 uses colocation keys defined in CREATE TABLE statements through the COLOCATE BY clause. Both versions offer the same performance benefits.
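
As a rough illustration of why a colocated join needs no data from other partitions, the following Python sketch joins orders to order items one partition at a time and merges the results. It is a conceptual model rather than Ignite SQL; the partition count, hash function, and sample rows are assumptions.

```python
import zlib
from collections import defaultdict

PARTITIONS = 4  # hypothetical partition count


def partition_for(order_id) -> int:
    return zlib.crc32(str(order_id).encode("utf-8")) % PARTITIONS


# Hypothetical rows: (order_id, customer) and (order_id, price).
orders = [(101, "alice"), (102, "bob"), (103, "carol")]
order_items = [(101, 9.50), (101, 20.00), (102, 5.25), (103, 12.00), (103, 0.75)]

# Place both tables by order_id.
orders_by_part = defaultdict(list)
items_by_part = defaultdict(list)
for row in orders:
    orders_by_part[partition_for(row[0])].append(row)
for row in order_items:
    items_by_part[partition_for(row[0])].append(row)


def local_join_totals(part: int) -> dict:
    """Join orders to items using only rows stored in one partition."""
    totals = {}
    for order_id, customer in orders_by_part[part]:
        amounts = [price for oid, price in items_by_part[part] if oid == order_id]
        totals[order_id] = (customer, sum(amounts))
    return totals


# Each partition joins independently; only the small per-order totals
# are merged at the end.
result = {}
for part in range(PARTITIONS):
    result.update(local_join_totals(part))

print(sorted(result.items()))
```

Because every item row sits in the same partition as its order, the merged result matches what a join over the whole data set would produce.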

Recommendation Engines

Execute recommendation algorithms on nodes where user and product data resides, avoiding network transfers for large feature sets.

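Integration Pattern: Colocate user profiles and purchase history using userId as the colocation key. The product catalog is not keyed by user, so it cannot share that key; keep it available to every node, for example by replicating it. Execute recommendation algorithms through the Compute APIs on the colocated data sets, so results are calculated locally and only the recommendations travel over the network.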

Performance Characteristics: The compute-to-data pattern reduces latency significantly for recommendation calculations. Feature extraction from colocated data avoids network overhead, and recommendations for multiple users execute in parallel across cluster nodes.

Example Use Cases:

  • E-commerce product recommendations based on purchase history and browsing patterns
  • Content recommendations for streaming platforms
  • Personalized search results
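
A per-user recommendation job might look roughly like the sketch below. It is plain Python with a thread pool standing in for cluster nodes, not Ignite code. The catalog, the purchase history, the recommend function, and the simple "most purchased category" rule are all hypothetical, and the sketch assumes the catalog is readable on every node.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical catalog, assumed available on every node: product -> category.
CATALOG = {"tent": "outdoor", "stove": "outdoor", "lamp": "home",
           "rug": "home", "boots": "outdoor"}

# Hypothetical purchase history, colocated by user id.
PURCHASES = {
    "u1": ["tent", "stove"],
    "u2": ["lamp"],
    "u3": [],
}


def recommend(user_id: str, limit: int = 2) -> list:
    """Suggest unbought products from the user's most purchased category."""
    bought = PURCHASES[user_id]
    if not bought:
        return []
    top_category = Counter(CATALOG[p] for p in bought).most_common(1)[0][0]
    candidates = sorted(p for p, c in CATALOG.items()
                        if c == top_category and p not in bought)
    return candidates[:limit]


# One job per user. In a cluster, each job would run on the node that owns
# that user's partition; threads stand in for nodes here.
users = sorted(PURCHASES)
with ThreadPoolExecutor(max_workers=3) as pool:
    recommendations = dict(zip(users, pool.map(recommend, users)))

print(recommendations)
```

Each job reads only one user's history plus the shared catalog, which is what lets the jobs run independently and in parallel.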

Key Benefits

Significant Latency Reduction

Local joins and calculations on colocated data avoid network transfers, which reduces latency significantly compared to distributed joins across nodes. Memory-first storage keeps data access fast for local operations. The effect is most pronounced for join-heavy queries and recommendation algorithms.

Schema-Driven Design

Define colocation keys in the table schema to control data placement. Because placement is declared up front, colocation is known before any query runs, and the query optimizer can use it to plan local execution. Explicit schema design also makes colocation patterns visible in the DDL.

Horizontal Scalability

Add nodes to increase compute capacity while keeping the benefits of colocation. Each partition is processed independently and in parallel, so the pattern scales to large data sets given a sound partitioning strategy. When data and load are spread evenly across partitions, throughput can scale close to linearly with cluster size.
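
One way to see why colocation survives scaling out is to separate the two mappings involved: key to partition, and partition to node. The Python sketch below models this with a fixed partition count and a toy round-robin assignment of partitions to nodes. Both are assumptions for illustration; Ignite's actual partition assignment works differently.

```python
import zlib

PARTITIONS = 16  # fixed, independent of cluster size (hypothetical value)


def partition_for(key) -> int:
    return zlib.crc32(str(key).encode("utf-8")) % PARTITIONS


def node_for(partition: int, node_count: int) -> int:
    """Toy round-robin assignment of partitions to nodes."""
    return partition % node_count


def partitions_per_node(node_count: int) -> list:
    counts = [0] * node_count
    for p in range(PARTITIONS):
        counts[node_for(p, node_count)] += 1
    return counts


# Hypothetical data: user ids and (tx_id, user_id) transactions,
# both placed by user_id.
users = [1, 2, 3, 4, 5]
transactions = [(9001, 1), (9002, 1), (9003, 3), (9004, 5)]


def colocated(node_count: int) -> bool:
    """True when every transaction lands on the same node as its user."""
    user_node = {u: node_for(partition_for(u), node_count) for u in users}
    return all(node_for(partition_for(uid), node_count) == user_node[uid]
               for _, uid in transactions)


for node_count in (2, 4, 8):
    print(node_count, "nodes:", partitions_per_node(node_count),
          "colocated:", colocated(node_count))
```

Adding nodes changes only which node hosts each partition. Rows that share a colocation key still share a partition, so they move together.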

Familiar SQL Patterns

Standard SQL joins work on colocated data without query changes, and no specialized APIs are required for local joins. The query optimizer detects colocation and executes the join locally. Compute APIs are available for custom algorithms on colocated data.

When This Pattern Works

Best For Join-Heavy Workloads

This pattern works well when workloads have predictable join patterns (orders with order items, users with transactions). Schema-driven colocation makes joins between related records local, which reduces latency significantly for join-heavy queries. It works best when the colocation key matches the most frequent join patterns.

Requires Careful Schema Design

Effective colocation requires upfront schema design with appropriate colocation keys. A poor colocation key choice results in distributed joins. Each table has a single colocation key, which limits flexibility when the same table takes part in several join patterns. Analyze query patterns before defining a colocation strategy.
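
The query-pattern analysis can start as a simple frequency count. The Python sketch below assumes a hypothetical log of which column each join on a transactions table uses, and estimates what share of those joins would be local for each candidate colocation key. The log contents and function names are invented for illustration.

```python
from collections import Counter

# Hypothetical query log for a transactions table: the column each join uses.
join_log = ["user_id"] * 70 + ["merchant_id"] * 25 + ["region_id"] * 5


def local_join_share(candidate_key: str, log: list) -> float:
    """Fraction of logged joins that would be local if the table were
    colocated by candidate_key."""
    return sum(1 for col in log if col == candidate_key) / len(log)


def best_colocation_key(log: list) -> str:
    """Pick the join column that appears most often in the log."""
    return Counter(log).most_common(1)[0][0]


for key in sorted(set(join_log)):
    print(f"{key}: {local_join_share(key, join_log):.0%} of joins local")
print("suggested colocation key:", best_colocation_key(join_log))
```

With only one colocation key per table, the remaining joins in this model stay distributed, which is the trade-off to weigh before fixing the schema.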

Example Applications

This pattern applies to:

  • E-commerce platforms with product recommendations based on purchase history
  • Financial applications with account-based analytics requiring local joins
  • Content platforms with personalized recommendations based on user behavior
  • IoT analytics with device-based aggregations on colocated sensor data

Concrete Examples:

  • Order Processing: The orders table is colocated with order_items using orderId. Joins that compute order totals run locally without network transfers, which reduces latency for checkout processing.
  • Recommendation Engine: User profiles are colocated with purchase history using userId. Recommendation algorithms execute locally on the colocated data, in parallel across the cluster for multiple users.