Cross-cluster Replication Extension
Overview
The Cross-cluster Replication Extension module provides the following ways to set up cross-cluster replication based on CDC:
- Ignite2IgniteClientCdcStreamer - streams changes to the destination cluster using the Java Thin Client.
- Ignite2IgniteCdcStreamer - streams changes to the destination cluster using a client node.
- Ignite2KafkaCdcStreamer combined with KafkaToIgniteCdcStreamer - streams changes to the destination cluster using Apache Kafka as a transport.
- Ignite2PostgreSqlCdcStreamer - streams changes to a destination PostgreSQL database.
Note: A conflict resolver should be defined for each cache replicated between the clusters.
Note: All implementations of the cross-cluster replication support replication of BinaryTypes and TypeMappings.
Note: To use SQL queries on the destination cluster over CDC-replicated data, set the same VALUE_TYPE in CREATE TABLE on both source and destination clusters for each table.
Installation
- Build the cdc-ext module with Maven:
    $~/src/ignite-extensions/> mvn clean package -DskipTests
    $~/src/ignite-extensions/> ls modules/cdc-ext/target | grep zip
    ignite-cdc-ext.zip
- Unpack the ignite-cdc-ext.zip archive to the $IGNITE_HOME folder.
You now have an additional binary, $IGNITE_HOME/bin/kafka-to-ignite.sh, and the $IGNITE_HOME/libs/optional/ignite-cdc-ext module.
Note: Enable the ignite-cdc-ext module to be able to run kafka-to-ignite.sh.
Conflict resolution
A conflict resolver should be defined for each cache replicated between the clusters. The cross-cluster replication extension provides a default conflict resolver implementation.
Note: The default implementation only selects the correct entry and never merges entries.
The default resolver implementation is used when a custom conflict resolver is not set.
Configuration
| Name | Description | Default value |
|---|---|---|
| clusterId | Local cluster id. Can be any value from 1 to 31. | null |
| caches | Set of cache names to handle with this plugin instance. | null |
| conflictResolveField | Value field to resolve conflict with. Optional. Field values must implement | null |
| conflictResolver | Custom conflict resolver. Optional. Field must implement | null |
Conflict resolution algorithm
Replicated changes contain some additional data. Specifically, the entry’s version from the source cluster is supplied with the changed data.
The default conflict resolution algorithm is based on the entry version and conflictResolveField.
Conflict resolution based on the entry’s version
This approach provides the eventual consistency guarantee when each entry is updatable only from a single cluster.
Important: This approach does not replicate any updates or removals from the destination cluster to the source cluster.
- Changes from the "local" cluster always win. Any replicated data can be overridden locally.
- If both the old and the new entry are from the same cluster, entry versions are compared to determine the order.
- Otherwise, conflict resolution fails. The update is ignored and the failure is logged.
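The three rules above can be sketched as a single decision function. This is only an illustration under stated assumptions: `VersionBasedResolutionSketch`, `Entry` and `useNew` are hypothetical names, and the entry version is modeled as a plain `long`, which is a simplification and not the extension's actual types.

```java
/** Minimal sketch of the version-based rules; every name here is hypothetical, not the extension's API. */
final class VersionBasedResolutionSketch {
    /** Simplified entry: the id of the cluster that produced it and a version (assumed to be a long). */
    static final class Entry {
        final byte clusterId;
        final long version;

        Entry(int clusterId, long version) {
            this.clusterId = (byte) clusterId;
            this.version = version;
        }
    }

    /** Returns true if the new entry should replace the old one. */
    static boolean useNew(byte localClusterId, Entry oldEntry, Entry newEntry) {
        // Rule 1: a change made on the local cluster always wins.
        if (newEntry.clusterId == localClusterId)
            return true;

        // Rule 2: both entries come from the same cluster, so versions define the order.
        if (oldEntry.clusterId == newEntry.clusterId)
            return newEntry.version > oldEntry.version;

        // Rule 3: resolution failed; the update is ignored and the failure is logged.
        System.err.println("Conflict resolution failed, update ignored");
        return false;
    }

    public static void main(String[] args) {
        byte local = 1;

        // Local change over replicated data (rule 1).
        System.out.println(useNew(local, new Entry(2, 5), new Entry(1, 1)));
        // Newer version from the same remote cluster (rule 2).
        System.out.println(useNew(local, new Entry(2, 5), new Entry(2, 7)));
        // Entries from two different remote clusters (rule 3).
        System.out.println(useNew(local, new Entry(2, 5), new Entry(3, 9)));
    }
}
```

The last call shows why this approach only suits data that is updated from a single cluster: entries written by two different remote clusters cannot be ordered by version alone.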
Conflict resolution based on the entry’s value field
This approach provides the eventual consistency guarantee even when an entry is updatable from any cluster.
Note: The conflict resolution field, specified by conflictResolveField, should contain a user-provided monotonically increasing value such as a query id or timestamp.
Important: This approach does not replicate removals from the destination cluster to the source cluster, because removals cannot be versioned by the field.
- Changes from the "local" cluster always win. Any replicated data can be overridden locally.
- If both the old and the new entry are from the same cluster, entry versions are compared to determine the order.
- If conflictResolveField is provided, field values are compared to determine the order.
- Otherwise, conflict resolution fails. The update is ignored and the failure is logged.
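A rough sketch of how the field comparison slots into the rule order is shown below. The names `FieldBasedResolutionSketch`, `Entry` and `useNew` are hypothetical, the version is simplified to a `long`, and a `null` field value stands in for "conflictResolveField is not provided"; none of this is the extension's actual code.

```java
/** Minimal sketch of field-based resolution; every name here is hypothetical, not the extension's API. */
final class FieldBasedResolutionSketch {
    /** Simplified entry carrying the value of the configured resolve field (null if not provided). */
    static final class Entry<F extends Comparable<F>> {
        final byte clusterId;
        final long version;
        final F resolveField;

        Entry(int clusterId, long version, F resolveField) {
            this.clusterId = (byte) clusterId;
            this.version = version;
            this.resolveField = resolveField;
        }
    }

    /** Returns true if the new entry should replace the old one. */
    static <F extends Comparable<F>> boolean useNew(byte localClusterId, Entry<F> oldEntry, Entry<F> newEntry) {
        // A change made on the local cluster always wins.
        if (newEntry.clusterId == localClusterId)
            return true;

        // Same origin cluster: entry versions define the order.
        if (oldEntry.clusterId == newEntry.clusterId)
            return newEntry.version > oldEntry.version;

        // Different clusters: fall back to the user-provided monotonically increasing field.
        if (oldEntry.resolveField != null && newEntry.resolveField != null)
            return newEntry.resolveField.compareTo(oldEntry.resolveField) > 0;

        // Nothing left to compare: the update is ignored and the failure is logged.
        System.err.println("Conflict resolution failed, update ignored");
        return false;
    }

    public static void main(String[] args) {
        byte local = 1;

        // Entries from clusters 2 and 3 are ordered by the field value (here, a timestamp-like long).
        System.out.println(useNew(local, new Entry<Long>(2, 5, 100L), new Entry<Long>(3, 1, 200L)));
        // A stale replicated update carries a smaller field value and is rejected.
        System.out.println(useNew(local, new Entry<Long>(2, 5, 100L), new Entry<Long>(3, 9, 50L)));
    }
}
```

The field is consulted only after the cluster and version checks, which is why its values need to grow monotonically across all clusters that update the entry.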
Custom conflict resolution rules
You’re able to define your own rules for resolving conflicts based on the nature of your data and operations. This can be particularly useful in more complex situations where the standard conflict resolution strategies do not apply.
Choosing the right conflict resolution strategy depends on your specific use case and requires a good understanding of your data and its usage. You should consider the nature of your transactions, the rate of change of your data, and the implications of potential data loss or overwrites when selecting a conflict resolution strategy.
A custom conflict resolver can be set via conflictResolver. It lets you compare or merge the conflicting data in any required way.
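To illustrate what "merge" can mean in contrast to the default select-only behavior, here is a small, self-contained sketch. The `MergeRule` interface is purely hypothetical and is not the interface that conflictResolver expects. It only shows the idea of producing a combined value instead of picking one side.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Illustration of a merging rule; MergeRule is a hypothetical shape, not the extension's resolver interface. */
final class CustomMergeSketch {
    /** Hypothetical rule: given the existing and the replicated value, return the value to keep. */
    interface MergeRule<V> {
        V resolve(V oldVal, V newVal);
    }

    /** Keeps the union of both tag sets instead of discarding one of them. */
    static final MergeRule<Set<String>> UNION = (oldVal, newVal) -> {
        Set<String> merged = new TreeSet<>(oldVal);

        merged.addAll(newVal);

        return merged;
    };

    public static void main(String[] args) {
        Set<String> local = new TreeSet<>(Arrays.asList("a", "b"));
        Set<String> replicated = new TreeSet<>(Arrays.asList("b", "c"));

        // Both sides contribute to the result; neither update is lost.
        System.out.println(UNION.resolve(local, replicated));
    }
}
```

Whether a merge like this is safe depends on the data: a union works for append-only sets, but it would silently undo removals.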
Configuration example
Configuration is done via an Ignite node plugin:
<property name="pluginProviders">
    <bean class="org.apache.ignite.cdc.conflictresolve.CacheVersionConflictResolverPluginProvider">
        <property name="clusterId" value="1" />
        <property name="caches">
            <util:list>
                <bean class="java.lang.String">
                    <constructor-arg type="String" value="queryId" />
                </bean>
            </util:list>
        </property>
    </bean>
</property>