Evaluation Points
Evaluation points are one of Cyclonetix’s most powerful features, allowing for dynamic decision-making within workflows. This guide explains how to use evaluation points to create adaptive, intelligent workflows.
What Are Evaluation Points?
An evaluation point is a special task that can:
- Make decisions about what happens next in a workflow
- Dynamically modify the execution graph
- Conditionally execute downstream tasks
- Serve as approval gates for human intervention
- Integrate with external systems to determine workflow paths
How Evaluation Points Work
When a task is marked as an evaluation point:
- The task executes normally like any other task
- After completion, its output is analyzed
- Based on the output, the workflow’s execution graph may be modified
- The orchestrator reevaluates the graph with these modifications
Defining an Evaluation Point
To make a task an evaluation point, set the evaluation_point flag to true in the task definition:
id: "evaluate_model"
name: "Evaluate Model Performance"
command: "python evaluate.py --model ${MODEL_PATH} --threshold ${THRESHOLD}"
dependencies:
- "train_model"
evaluation_point: true
parameters:
threshold: 0.85
Implementation Patterns
Exit Code Pattern
The simplest way to implement an evaluation point is using exit codes:
#!/bin/bash
# evaluate_data.sh

# Run validation
./validate_data.py --input ${INPUT_PATH}

# Check validation result
if [ $? -eq 0 ]; then
    # Data is valid, proceed to full processing
    echo "CYCLO_NEXT=process_data" > $CYCLO_EVAL_RESULT
    exit 0
else
    # Data is invalid, go to error handling
    echo "CYCLO_NEXT=handle_invalid_data" > $CYCLO_EVAL_RESULT
    exit 1
fi
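The same pattern can also be written in Python. The sketch below is illustrative only: it relies on the CYCLO_EVAL_RESULT environment variable and the CYCLO_NEXT convention shown above, and validate_input is a hypothetical stand-in for real validation logic.

# evaluate_data.py (illustrative Python equivalent of the shell script above)
import os
import sys

def validate_input(path):
    # Hypothetical placeholder for real validation logic
    return os.path.exists(path)

# INPUT_PATH comes from the task context; the fallback path is just for illustration
valid = validate_input(os.environ.get("INPUT_PATH", "/data/input"))

# Write the next-task hint to the evaluation result file
with open(os.environ.get("CYCLO_EVAL_RESULT", "eval_result.json"), "w") as f:
    f.write("CYCLO_NEXT=process_data\n" if valid else "CYCLO_NEXT=handle_invalid_data\n")

sys.exit(0 if valid else 1)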
JSON Result Pattern
For more complex scenarios, you can output a JSON result:
# evaluate_model.py
import json
import os
import sys

# Perform model evaluation (evaluate_model and model_path are defined elsewhere)
accuracy = evaluate_model(model_path)

# Decide next steps based on accuracy
result = {
    "metrics": {
        "accuracy": accuracy
    }
}

if accuracy >= 0.90:
    result["next_tasks"] = ["deploy_model", "notify_success"]
elif accuracy >= 0.75:
    result["next_tasks"] = ["tune_model", "retry_training"]
else:
    result["next_tasks"] = ["notify_failure"]

# Write result to the evaluation result file
with open(os.environ.get("CYCLO_EVAL_RESULT", "eval_result.json"), "w") as f:
    json.dump(result, f)

# Exit with appropriate code
sys.exit(0 if accuracy >= 0.75 else 1)
Evaluation Results Specification
The evaluation result can specify:
- Next tasks: Which tasks should be executed next
- Parameters: Parameters for those tasks
- Context updates: Changes to the execution context
- Metadata: Additional information about the evaluation
The evaluation result file (specified by the CYCLO_EVAL_RESULT environment variable) should contain a JSON object with the following structure:
{
"next_tasks": ["task_id_1", "task_id_2"],
"parameters": {
"task_id_1": {
"param1": "value1",
"param2": "value2"
},
"task_id_2": {
"param1": "value1"
}
},
"context_updates": {
"variable1": "new_value",
"variable2": "new_value"
},
"metadata": {
"reason": "Model accuracy below threshold",
"metrics": {
"accuracy": 0.82,
"precision": 0.79
}
}
}
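As a rough sketch of how a task script might emit this structure, the helper below assembles a result dictionary and writes it to the path named by CYCLO_EVAL_RESULT. The helper name and example values are illustrative, not part of the Cyclonetix API.

# write_eval_result.py (illustrative helper, not part of Cyclonetix itself)
import json
import os

def write_eval_result(next_tasks, parameters=None, context_updates=None, metadata=None):
    """Write an evaluation result matching the structure described above."""
    result = {
        "next_tasks": next_tasks,
        "parameters": parameters or {},
        "context_updates": context_updates or {},
        "metadata": metadata or {},
    }
    path = os.environ.get("CYCLO_EVAL_RESULT", "eval_result.json")
    with open(path, "w") as f:
        json.dump(result, f, indent=2)

if __name__ == "__main__":
    write_eval_result(
        next_tasks=["tune_model", "retry_training"],
        metadata={"reason": "Model accuracy below threshold", "metrics": {"accuracy": 0.82}},
    )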
Common Use Cases
Conditional Workflow Branching
One of the most common uses is to create branches in your workflow:
id: "check_data_quality"
name: "Check Data Quality"
command: "python scripts/validate_data.py --input ${INPUT_PATH} --output $CYCLO_EVAL_RESULT"
dependencies:
- "ingest_data"
evaluation_point: true
The script might output something like:
{
"next_tasks": ["process_clean_data"],
"metadata": {
"quality_score": 98,
"issues_found": 0
}
}
Or if problems are found:
{
"next_tasks": ["clean_data"],
"metadata": {
"quality_score": 68,
"issues_found": 12
}
}
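A sketch of the decision logic behind scripts/validate_data.py might look like the following; the scoring function and the 90-point threshold are hypothetical.

# scripts/validate_data.py (sketch; the scoring logic and threshold are hypothetical)
import argparse
import json

def score_data(path):
    # Placeholder: a real implementation would inspect the data at `path`
    return 98, 0

parser = argparse.ArgumentParser()
parser.add_argument("--input", required=True)
parser.add_argument("--output", required=True)
args = parser.parse_args()

quality_score, issues_found = score_data(args.input)

result = {
    "next_tasks": ["process_clean_data"] if quality_score >= 90 else ["clean_data"],
    "metadata": {"quality_score": quality_score, "issues_found": issues_found},
}

with open(args.output, "w") as f:
    json.dump(result, f)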
Human Approval Gates
Evaluation points can serve as approval gates requiring human intervention:
id: "approval_gate"
name: "Approve Production Deployment"
command: "python scripts/wait_for_approval.py --model ${MODEL_NAME}"
dependencies:
- "validate_model"
evaluation_point: true
The wait_for_approval.py script might wait for a response via API or UI interaction.
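One way such a script could work is to poll an approvals endpoint until a reviewer records a decision, then write an evaluation result. The sketch below is a hypothetical illustration: the endpoint URL, the response shape, and the downstream task IDs are assumptions, not part of Cyclonetix.

# scripts/wait_for_approval.py (sketch; endpoint, response shape, and task IDs are hypothetical)
import argparse
import json
import os
import time
import urllib.request

parser = argparse.ArgumentParser()
parser.add_argument("--model", required=True)
args = parser.parse_args()

approval_url = f"https://approvals.example.com/api/models/{args.model}"  # hypothetical endpoint

decision = None
while decision not in ("approved", "rejected"):
    with urllib.request.urlopen(approval_url) as resp:
        decision = json.load(resp).get("status")  # assumed response shape
    if decision not in ("approved", "rejected"):
        time.sleep(30)  # poll every 30 seconds until a decision is made

result = {
    "next_tasks": ["deploy_to_production"] if decision == "approved" else ["notify_rejection"],
    "metadata": {"decision": decision, "model": args.model},
}

with open(os.environ.get("CYCLO_EVAL_RESULT", "eval_result.json"), "w") as f:
    json.dump(result, f)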
Dynamic Task Generation
Evaluation points can dynamically determine which tasks to run:
id: "analyze_data_types"
name: "Analyze Data Types"
command: "python scripts/data_analyzer.py --input ${INPUT_PATH} --output $CYCLO_EVAL_RESULT"
dependencies:
- "load_data"
evaluation_point: true
The script might detect different data types and schedule appropriate processing tasks:
{
"next_tasks": ["process_images", "process_text", "process_numerical"],
"parameters": {
"process_images": {
"count": 150,
"path": "/data/images"
},
"process_text": {
"count": 500,
"language": "en"
}
}
}
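A rough sketch of such an analyzer is shown below, assuming a simple extension-to-task mapping; the mapping, task IDs, and emitted parameters are illustrative.

# scripts/data_analyzer.py (sketch; extension-to-task mapping is an illustrative assumption)
import argparse
import json
import os

TASK_FOR_EXTENSION = {
    ".jpg": "process_images",
    ".png": "process_images",
    ".txt": "process_text",
    ".csv": "process_numerical",
}

parser = argparse.ArgumentParser()
parser.add_argument("--input", required=True)
parser.add_argument("--output", required=True)
args = parser.parse_args()

# Count files per downstream task based on file extension
counts = {}
for name in os.listdir(args.input):
    task = TASK_FOR_EXTENSION.get(os.path.splitext(name)[1].lower())
    if task:
        counts[task] = counts.get(task, 0) + 1

result = {
    "next_tasks": sorted(counts),
    "parameters": {task: {"count": count, "path": args.input} for task, count in counts.items()},
}

with open(args.output, "w") as f:
    json.dump(result, f)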
Integration with AI Decision-Making
Evaluation points are particularly powerful when combined with AI for intelligent workflow orchestration:
id: "ai_analyze_results"
name: "AI Result Analysis"
command: "python scripts/ai_analyzer.py --results ${RESULTS_PATH} --output $CYCLO_EVAL_RESULT"
dependencies:
- "run_experiment"
evaluation_point: true
The AI analyzer might make sophisticated decisions about next steps based on experiment results.
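As a very rough sketch, the analyzer below applies a trivial threshold policy as a stand-in for a real model-driven decision; the metric names, threshold, and task IDs are assumptions.

# scripts/ai_analyzer.py (sketch; the decision policy is a trivial stand-in for a real model)
import argparse
import json

parser = argparse.ArgumentParser()
parser.add_argument("--results", required=True)
parser.add_argument("--output", required=True)
args = parser.parse_args()

with open(args.results) as f:
    metrics = json.load(f)  # assumed to be a flat dict of metric name -> value

# Trivial policy: publish if the primary score is good enough, otherwise rerun.
# A real analyzer might call an LLM or a trained policy model here instead.
score = metrics.get("score", 0.0)
next_tasks = ["publish_results"] if score >= 0.8 else ["rerun_experiment"]

result = {
    "next_tasks": next_tasks,
    "metadata": {"reason": f"score={score}", "metrics": metrics},
}

with open(args.output, "w") as f:
    json.dump(result, f)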
Best Practices
- Keep evaluation logic focused - Evaluation points should make decisions, not perform heavy processing
- Prefer declarative over imperative - Specify what should happen, not how it should happen
- Include clear metadata - Document why decisions were made for better observability
- Handle failure gracefully - Ensure evaluation points have clear error paths (see the sketch after this list)
- Test with different scenarios - Verify all possible branches work as expected
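To illustrate the error-handling practice above, here is a minimal sketch that wraps the decision logic so an unexpected exception still yields a well-formed result; decide_next_tasks and the notify_failure task ID are illustrative.

# guarded_evaluation.py (sketch; decide_next_tasks and the notify_failure task ID are illustrative)
import json
import os
import sys

def decide_next_tasks():
    # Placeholder for the real decision logic
    return ["process_data"], {"reason": "validation passed"}

try:
    next_tasks, metadata = decide_next_tasks()
    exit_code = 0
except Exception as exc:
    # Fall back to an explicit error path instead of leaving the graph undecided
    next_tasks, metadata = ["notify_failure"], {"reason": f"evaluation failed: {exc}"}
    exit_code = 1

with open(os.environ.get("CYCLO_EVAL_RESULT", "eval_result.json"), "w") as f:
    json.dump({"next_tasks": next_tasks, "metadata": metadata}, f)

sys.exit(exit_code)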
Debugging Evaluation Points
To debug evaluation points:
- Enable verbose logging in your evaluation scripts
- Check the evaluation result file to ensure it contains valid JSON (see the snippet after this list)
- Review orchestrator logs for decision-making information
- Use the UI visualization to see how the graph changes after evaluation
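For the second point, a quick way to verify the result file is to parse it, for example:

# check_eval_result.py (quick validity check for the evaluation result file)
import json
import os
import sys

path = os.environ.get("CYCLO_EVAL_RESULT", "eval_result.json")
try:
    with open(path) as f:
        result = json.load(f)
except (OSError, json.JSONDecodeError) as exc:
    sys.exit(f"Invalid evaluation result at {path}: {exc}")

print(json.dumps(result, indent=2))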
Next Steps
- Learn about Contexts & Parameters to complement evaluation points
- Explore Git Integration for version-controlled workflows
- See the Troubleshooting & FAQ for common issues