Interface ComputeTask<T,​R>

  • Type Parameters:
    T - Type of the task argument that is passed into map(List, Object) method.
    R - Type of the task result returning from reduce(List) method.
    All Superinterfaces:
    Serializable
    All Known Implementing Classes:
    ComputeTaskAdapter, ComputeTaskSplitAdapter

    public interface ComputeTask<T,​R>
    extends Serializable
    Defines a task that can be executed on the grid. A grid task is responsible for splitting business logic into multiple grid jobs, receiving results from the individual grid jobs executing on remote nodes, and reducing (aggregating) the received job results into the final grid task result.

    Grid Task Execution Sequence

    1. Upon a request to execute a grid task with a given task name, the system will find the deployed task with that name. A task needs to be deployed prior to execution (see IgniteCompute.localDeployTask(Class, ClassLoader) method); however, if the task does not specify its name explicitly via the @ComputeTaskName annotation, it will be auto-deployed the first time it gets executed.
    2. System will create new distributed task session (see ComputeTaskSession).
    3. System will inject all annotated resources (including task session) into grid task instance. See org.apache.ignite.resources package for the list of injectable resources.
    4. System will apply map(List, Object). This method is responsible for splitting the business logic of the grid task into multiple grid jobs (units of execution) and mapping them to grid nodes. Method map(List, Object) returns a map with grid jobs as keys and grid nodes as values.
    5. System will send mapped grid jobs to their respective nodes.
    6. Upon arrival on the remote node a grid job will be handled by collision SPI (see CollisionSpi) which will determine how a job will be executed on the remote node (immediately, buffered or canceled).
    7. Once job execution results become available, method result(ComputeJobResult, List) will be called for each received job result. The policy returned by this method determines how the task reacts to every job result: wait for more results, reduce the results received so far, or fail the job over to another node (see ComputeJobResultPolicy).
    8. Once all results are received or the result(ComputeJobResult, List) method returns the ComputeJobResultPolicy.REDUCE policy, method reduce(List) is called to aggregate the received results into one final result. Once this method finishes, the execution of the grid task is complete. The result is returned to the user through the ComputeTaskFuture.get() method.
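
The sequence above can be pictured with a small single-process model. This is only a sketch in plain Java and does not use the Ignite API: the class TaskLifecycleSketch, its Policy enum, and the use of node names as strings are all hypothetical stand-ins, and jobs run locally instead of being sent to remote nodes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Simplified, single-process model of the task lifecycle; not the Ignite API. */
class TaskLifecycleSketch {
    /** Hypothetical stand-in for the result policy described in step 7. */
    enum Policy { WAIT, REDUCE }

    /** Step 4: split the argument into one job per word and assign jobs to nodes round-robin. */
    static Map<Supplier<Integer>, String> map(List<String> nodes, String arg) {
        Map<Supplier<Integer>, String> jobs = new LinkedHashMap<>();
        String[] words = arg.split(" ");
        for (int i = 0; i < words.length; i++) {
            String word = words[i];
            jobs.put(() -> word.length(), nodes.get(i % nodes.size()));
        }
        return jobs;
    }

    /** Step 7: called once per job result; this model always waits for the remaining results. */
    static Policy result(int res, List<Integer> received) {
        return Policy.WAIT;
    }

    /** Step 8: aggregate all received results into the final task result. */
    static int reduce(List<Integer> results) {
        int sum = 0;
        for (int r : results)
            sum += r;
        return sum;
    }

    /** Drives steps 4 to 8 in order, running every job locally instead of on a remote node. */
    static int execute(List<String> nodes, String arg) {
        List<Integer> received = new ArrayList<>();
        for (Supplier<Integer> job : map(nodes, arg).keySet()) {
            int res = job.get();                    // Steps 5 and 6: "send" and run the job.
            Policy policy = result(res, received);  // Step 7: ask the task how to react.
            received.add(res);
            if (policy == Policy.REDUCE)
                break;                              // Reduce early with the results so far.
        }
        return reduce(received);                    // Step 8.
    }
}
```

In this model, executing the task with the argument "hello grid world" on two nodes produces three jobs (word lengths 5, 4 and 5) and a reduced result of 14.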

    Continuous Job Mapper

    For cases when the jobs within a split are too large to fit in memory at once, or when not all jobs in a task are known during the map(List, Object) step, use ComputeTaskContinuousMapper to keep streaming jobs from the task even after the map(...) step is complete. With a continuous mapper the number of jobs within a task may grow very large; in that case it may make sense to combine it with the @ComputeTaskNoResultCache annotation.

    Task Result Caching

    Sometimes job results are too large, or a task simply has too many jobs to keep track of, which may hinder performance. In such cases it may make sense to disable task result caching by attaching the @ComputeTaskNoResultCache annotation to the task class and processing all results as they arrive in the result(ComputeJobResult, List) method. When Ignite sees this annotation it disables tracking of job results, and the list of job results passed into the result(ComputeJobResult, List) or reduce(List) methods is always empty. Note that the list of job siblings on ComputeTaskSession is also empty, which prevents the number of job siblings from growing as well.
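
With result caching disabled, the aggregate has to be built up as results arrive, because neither callback receives the earlier results. The following plain-Java sketch models that pattern only; RunningTotalSketch and its method signatures are hypothetical and are not the Ignite API.

```java
import java.util.List;

/** Plain-Java model of a task that aggregates in result() because no results are cached. */
class RunningTotalSketch {
    private long total;  // Sum of all job results seen so far.
    private int count;   // Number of job results seen so far.

    /** Model of result(...): the received list is empty, so fold the value in immediately. */
    void result(long jobResult, List<Long> received) {
        total += jobResult;
        count++;
    }

    /** Model of reduce(...): the list is empty, so answer from the accumulated state. */
    double reduce(List<Long> results) {
        return count == 0 ? 0.0 : (double) total / count;
    }
}
```

The design choice is to keep only constant-size state (a sum and a count) in the task instead of one object per job result.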

    Resource Injection

    Ignite resources can be injected into a grid task implementation using IoC (dependency injection). Both field-based and method-based injection are supported. See the org.apache.ignite.resources package for the list of injectable resources, and refer to the corresponding resource documentation for more information.

    Grid Task Adapters

    ComputeTask comes with several convenience adapters to make the usage easier:
    • ComputeTaskAdapter provides a default implementation of the result(ComputeJobResult, List) method that automatically fails a job over to another node if the remote job failed due to a node crash (detected by ClusterTopologyException) or due to job execution rejection (detected by ComputeExecutionRejectedException). Here is an example of how you would implement your task using ComputeTaskAdapter:
       import java.util.HashMap;
       import java.util.List;
       import java.util.Map;

       import org.apache.ignite.IgniteException;
       import org.apache.ignite.cluster.ClusterNode;
       import org.apache.ignite.compute.ComputeJob;
       import org.apache.ignite.compute.ComputeJobResult;
       import org.apache.ignite.compute.ComputeLoadBalancer;
       import org.apache.ignite.compute.ComputeTaskAdapter;
       import org.apache.ignite.resources.LoadBalancerResource;

       // MyFooBarJob is a user-defined ComputeJob implementation (not shown).
       public class MyFooBarTask extends ComputeTaskAdapter<String, String> {
           // Inject load balancer.
           @LoadBalancerResource
           ComputeLoadBalancer balancer;

           // Map jobs to grid nodes.
           @Override
           public Map<? extends ComputeJob, ClusterNode> map(List<ClusterNode> subgrid, String arg) throws IgniteException {
               Map<MyFooBarJob, ClusterNode> jobs = new HashMap<>(subgrid.size());

               // In more complex cases, you can actually do
               // more complicated assignments of jobs to nodes.
               for (int i = 0; i < subgrid.size(); i++) {
                   MyFooBarJob job = new MyFooBarJob(arg);

                   // Pick the next best balanced node for the job
                   // (null means no nodes are excluded).
                   jobs.put(job, balancer.getBalancedNode(job, null));
               }

               return jobs;
           }

           // Aggregate results into one compound result.
           @Override
           public String reduce(List<ComputeJobResult> results) throws IgniteException {
               // For the purpose of this example we simply
               // concatenate string representation of every
               // job result.
               StringBuilder buf = new StringBuilder();

               for (ComputeJobResult res : results) {
                   // Append string representation of result
                   // returned by every job.
                   buf.append(res.getData().toString());
               }

               return buf.toString();
           }
       }
       
    • ComputeTaskSplitAdapter hides the job-to-node mapping logic from the user and provides a convenient ComputeTaskSplitAdapter.split(int, Object) method for splitting a task into sub-jobs in homogeneous environments. Here is an example of how you would implement your task using ComputeTaskSplitAdapter:
       import java.util.ArrayList;
       import java.util.Collection;
       import java.util.List;

       import org.apache.ignite.IgniteException;
       import org.apache.ignite.compute.ComputeJob;
       import org.apache.ignite.compute.ComputeJobResult;
       import org.apache.ignite.compute.ComputeTaskSplitAdapter;

       // MyFooBarJob is a user-defined ComputeJob implementation (not shown).
       public class MyFooBarTask extends ComputeTaskSplitAdapter<Object, String> {
           @Override
           protected Collection<? extends ComputeJob> split(int gridSize, Object arg) throws IgniteException {
               List<MyFooBarJob> jobs = new ArrayList<>(gridSize);

               for (int i = 0; i < gridSize; i++) {
                   jobs.add(new MyFooBarJob(arg));
               }

               // Node assignment via load balancer
               // happens automatically.
               return jobs;
           }

           // Aggregate results into one compound result.
           @Override
           public String reduce(List<ComputeJobResult> results) throws IgniteException {
               // For the purpose of this example we simply
               // concatenate string representation of every
               // job result.
               StringBuilder buf = new StringBuilder();

               for (ComputeJobResult res : results) {
                   // Append string representation of result
                   // returned by every job.
                   buf.append(res.getData().toString());
               }

               return buf.toString();
           }
       }
       
    • Method Detail

      • map

        @NotNull
        Map<? extends ComputeJob, ClusterNode> map(List<ClusterNode> subgrid,
                                                   @Nullable T arg)
                                            throws IgniteException
        This method is called to map or split grid task into multiple grid jobs. This is the first method that gets called when task execution starts.
        Parameters:
        subgrid - Nodes available for this task execution. Note that the order of nodes is guaranteed to be randomized by the container. This ensures that every time you simply iterate through grid nodes, the order of nodes will be random, which over time should result in all nodes being used equally.
        arg - Task execution argument. Can be null. This is the same argument as the one passed into IgniteCompute.execute(...) methods.
        Returns:
        Map of grid jobs to the subgrid nodes they are assigned to. Unless ComputeTaskContinuousMapper is injected into the task, an exception will be thrown if null or an empty map is returned.
        Throws:
        IgniteException - If mapping could not complete successfully. This exception will be thrown out of ComputeTaskFuture.get() method.
      • result

        ComputeJobResultPolicy result​(ComputeJobResult res,
                                      List<ComputeJobResult> rcvd)
                               throws IgniteException
        Asynchronous callback invoked every time a result from remote execution is received. It is ultimately up to this method to return a policy based on which the system will either wait for more results, reduce the results received so far, or fail this job over to another node. See ComputeJobResultPolicy for more information about result policies.
        Parameters:
        res - Received remote grid executable result.
        rcvd - All previously received results. Note that if task class has ComputeTaskNoResultCache annotation, then this list will be empty.
        Returns:
        Result policy that dictates how to process further upcoming job results.
        Throws:
        IgniteException - If handling a job result caused an error. This exception will be thrown out of ComputeTaskFuture.get() method.
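
The three possible reactions can be modelled as a small decision function. This is a plain-Java sketch under stated assumptions, not the Ignite API or the behavior of any Ignite adapter: ResultPolicySketch, its Policy enum, the NodeFailure marker exception, and the "enough results" threshold are all hypothetical.

```java
/** Plain-Java model of a result-policy decision; not the Ignite API. */
class ResultPolicySketch {
    /** Stand-in for the three reactions described above. */
    enum Policy { WAIT, REDUCE, FAILOVER }

    /** Hypothetical marker for a job lost to a node crash or an execution rejection. */
    static class NodeFailure extends RuntimeException {
        NodeFailure(String msg) { super(msg); }
    }

    /**
     * @param error failure reported by the job, or null if the job succeeded
     * @param alreadyReceived number of results received before this one
     * @param enough number of results after which the task may reduce
     */
    static Policy decide(RuntimeException error, int alreadyReceived, int enough) {
        if (error instanceof NodeFailure)
            return Policy.FAILOVER;   // Retry the job on another node.
        if (error != null)
            throw error;              // Any other failure fails the whole task.
        return alreadyReceived + 1 >= enough ? Policy.REDUCE : Policy.WAIT;
    }
}
```

Returning REDUCE before every job has reported is what lets a task finish early, for example when the first few results already answer the question.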
      • reduce

        @Nullable
        R reduce​(List<ComputeJobResult> results)
          throws IgniteException
        Reduces (or aggregates) results received so far into one compound result to be returned to caller via ComputeTaskFuture.get() method.

        Note that if some jobs did not succeed and could not be failed over, the list of results passed into this method will include the failed results. Otherwise, failed results will not be in the list.

        Parameters:
        results - Received results of broadcast remote executions. Note that if the task class has the ComputeTaskNoResultCache annotation, then this list will be empty.
        Returns:
        Grid task result constructed from the results of remote executions.
        Throws:
        IgniteException - If reduction or results caused an error. This exception will be thrown out of ComputeTaskFuture.get() method.
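
Because the list passed to reduce may contain failed results, a reducer usually has to decide what to do with them. The sketch below models one option, skipping failed results, in plain Java. ReduceFailedSketch and its JobResult class are hypothetical stand-ins rather than Ignite types; in real code the failure would be read from the ComputeJobResult itself.

```java
import java.util.List;

/** Plain-Java model of a reducer that tolerates failed job results; not the Ignite API. */
class ReduceFailedSketch {
    /** Hypothetical stand-in for a job result: either data or an error. */
    static final class JobResult {
        final String data;
        final Exception error;

        JobResult(String data, Exception error) {
            this.data = data;
            this.error = error;
        }
    }

    /** Concatenates the data of successful results and skips failed ones. */
    static String reduce(List<JobResult> results) {
        StringBuilder buf = new StringBuilder();
        for (JobResult res : results) {
            if (res.error != null)
                continue;  // Failed job that could not be failed over.
            buf.append(res.data);
        }
        return buf.toString();
    }
}
```

Throwing from the reducer instead of skipping is an equally valid choice when a partial result would be misleading.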