

The validation node checks data against configurable criteria before passing it to downstream nodes. Use it to enforce data quality, ensure required fields are present, verify formats, or run custom validation logic. Five validation modes are available, from simple field rules to AI-powered checks.

When to use validation

  • You want to catch bad extractions before delivery: missing required fields, malformed totals, wrong currency.
  • You want to send runs to review when something looks off. Pair the Warn failure action with a downstream review node.
  • You need to enforce a strict contract with an external system. Use Schema mode with a JSON Schema document.
  • You want a programmable check (regex, cross-field math). Use Script mode for deterministic logic, and AI mode only when the check is genuinely judgment-based.
  • Skip validation when the criteria are simple presence checks already covered by the required setting in the extract node's schema.

Failure actions

Every validation mode uses a failure action to control what happens when validation errors are found:

| Action | Behavior |
| --- | --- |
| Fail | The step fails and the pipeline stops. Errors are reported in the run detail. |
| Warn | The step succeeds and the pipeline continues. Validation errors are converted to warnings. |

Modes

Rules

Define field-level validation rules using a visual rule editor. Each rule specifies a field path (supports dot notation and array indexing, e.g. items[0].name), an operator, and an optional value. You can also set a custom error message per rule.
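To make the field path syntax concrete, here is a minimal sketch of how a path such as items[0].name could be resolved against a payload. The helper name resolvePath and the sample data are hypothetical; this illustrates the notation only and is not the node's actual resolver.

```javascript
// Hypothetical helper: resolves a field path such as "items[0].name".
// It illustrates dot notation and array indexing, nothing more.
function resolvePath(data, path) {
  // Turn "items[0].name" into the segments ["items", "0", "name"].
  const segments = path.replace(/\[(\d+)\]/g, '.$1').split('.').filter(Boolean);
  let current = data;
  for (const segment of segments) {
    if (current === null || current === undefined) {
      return undefined; // The path runs past a missing value.
    }
    current = current[segment];
  }
  return current;
}

// Sample payload (assumed field names).
const sample = { vendor: { name: 'Acme' }, items: [{ name: 'Widget', qty: 2 }] };

console.log(resolvePath(sample, 'items[0].name')); // Widget
console.log(resolvePath(sample, 'vendor.name'));   // Acme
console.log(resolvePath(sample, 'items[3].name')); // undefined
```

A path that points past the end of an array resolves to undefined here, which is the case a presence operator such as required would be expected to catch.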

Operators

| Operator | Group | Description | Value required |
| --- | --- | --- | --- |
| required | Presence | Field must exist | No |
| not_empty | Presence | Field must not be empty or null | No |
| equals | Comparison | Field must equal the value | Yes |
| not_equals | Comparison | Field must not equal the value | Yes |
| gt | Comparison | Field must be greater than the value | Yes |
| gte | Comparison | Field must be greater than or equal to the value | Yes |
| lt | Comparison | Field must be less than the value | Yes |
| lte | Comparison | Field must be less than or equal to the value | Yes |
| contains | String | Field must contain the substring | Yes |
| not_contains | String | Field must not contain the substring | Yes |
| starts_with | String | Field must start with the value | Yes |
| ends_with | String | Field must end with the value | Yes |
| matches_regex | String | Field must match the regular expression | Yes |
| is_email | Format | Field must be a valid email address | No |
| is_number | Format | Field must be a numeric value | No |
| is_date | Format | Field must be a valid date | No |
| is_url | Format | Field must be a valid URL | No |
| in_list | List | Field must be one of the listed values | Yes |
| not_in_list | List | Field must not be one of the listed values | Yes |
| length_min | Length | Field length must be at least the value | Yes |
| length_max | Length | Field length must be at most the value | Yes |
| length_between | Length | Field length must be between two values | Yes (two values) |
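The following sketch shows how a few of these operators could behave when applied to a payload. The rule shape { field, operator, value, message } and the function checkRules are assumptions made for illustration; in the product, rules are configured in the visual editor, and the exact semantics of each operator may differ.

```javascript
// Hypothetical evaluator for a handful of the operators listed above.
const operators = {
  required: (actual) => actual !== undefined,
  not_empty: (actual) => actual !== undefined && actual !== null && actual !== '',
  gte: (actual, expected) => actual >= expected,
  in_list: (actual, expected) => expected.includes(actual),
  matches_regex: (actual, expected) => new RegExp(expected).test(String(actual)),
};

// Runs every rule and collects one error message per failed rule.
function checkRules(data, rules) {
  const errors = [];
  for (const rule of rules) {
    const actual = data[rule.field]; // Top-level fields only, to keep the sketch short.
    if (!operators[rule.operator](actual, rule.value)) {
      // Use the custom message when one is set, otherwise a generic one.
      errors.push(rule.message || `${rule.field} failed ${rule.operator}`);
    }
  }
  return errors;
}

// Assumed rule set and field names.
const rules = [
  { field: 'total', operator: 'gte', value: 0, message: 'Total must not be negative' },
  { field: 'currency', operator: 'in_list', value: ['USD', 'EUR'] },
  { field: 'invoiceNumber', operator: 'matches_regex', value: '^INV-\\d+$' },
];

const ruleErrors = checkRules({ total: -5, currency: 'GBP', invoiceNumber: 'INV-1042' }, rules);
console.log(ruleErrors);
// [ 'Total must not be negative', 'currency failed in_list' ]
```

The first rule shows the effect of a custom error message; the second falls back to a generic message because none was set.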

Schema

Validate data against a JSON Schema document. Paste or write your schema in the editor and the node validates the incoming data against it. Useful when you need to enforce a strict contract on the data shape.
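As an illustration, a schema for an invoice-like payload might look like the sketch below. Every field name is hypothetical, and the source does not state which JSON Schema draft the node supports, so the example sticks to widely supported keywords and omits a $schema declaration. It is written as a JavaScript object so it can be printed as the JSON text to paste into the editor.

```javascript
// Hypothetical invoice contract; all field names are assumptions.
const invoiceSchema = {
  type: 'object',
  required: ['invoiceNumber', 'total', 'currency'],
  properties: {
    invoiceNumber: { type: 'string', pattern: '^INV-[0-9]+$' },
    total: { type: 'number', minimum: 0 },
    currency: { enum: ['USD', 'EUR'] },
    items: {
      type: 'array',
      minItems: 1,
      items: {
        type: 'object',
        required: ['name', 'quantity'],
        properties: {
          name: { type: 'string' },
          quantity: { type: 'integer', minimum: 1 },
        },
      },
    },
  },
  // Reject fields that are not part of the contract.
  additionalProperties: false,
};

// Print the JSON document that would go into the schema editor.
const schemaJson = JSON.stringify(invoiceSchema, null, 2);
console.log(schemaJson);
```

Setting additionalProperties to false is what makes the contract strict: any field not declared under properties fails validation.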

Endpoint

Call an external HTTP endpoint to validate data. The node sends the data to your URL and interprets the response to determine pass or fail.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| URL | string | Yes | The endpoint URL. Supports template substitution. |
| Method | select | Yes | HTTP method (GET, POST, PUT, etc.) |
| Headers | key-value | No | Custom request headers |
| Body | string | No | Request body template |
| Timeout | number | No | Request timeout in seconds (1–120, default 30) |
| Response mapping | object | No | Dot-path to valid and errors fields in the response JSON |

By default, the node looks for valid or isValid (boolean) and errors (string array) at the root of the response body. Use response mapping to point to different paths if your endpoint returns a different shape. Transient errors (HTTP 429, 502, 503, 504) are retried automatically.
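On the service side, the response shapes described above could be produced as in this sketch. The functions validateInvoice and wrapNested, the payload fields, and the nested paths result.ok and result.problems are all hypothetical; only the default root-level shape of a boolean valid plus a string array errors comes from the description above.

```javascript
// Hypothetical validation logic for the service behind the endpoint.
// Returns the default shape the node looks for: { valid, errors }.
function validateInvoice(payload) {
  const errors = [];
  if (typeof payload.total !== 'number' || payload.total <= 0) {
    errors.push('total must be a positive number');
  }
  if (!['USD', 'EUR'].includes(payload.currency)) {
    errors.push('currency must be USD or EUR');
  }
  return { valid: errors.length === 0, errors };
}

// Hypothetical nested shape. A service that answers like this would need
// response mapping, for example valid at "result.ok" and errors at "result.problems".
function wrapNested(result) {
  return { result: { ok: result.valid, problems: result.errors } };
}

const passing = validateInvoice({ total: 120.5, currency: 'USD' });
const failing = wrapNested(validateInvoice({ total: 0, currency: 'GBP' }));

console.log(JSON.stringify(passing));
// {"valid":true,"errors":[]}
console.log(JSON.stringify(failing));
// {"result":{"ok":false,"problems":["total must be a positive number","currency must be USD or EUR"]}}
```

Keeping the default shape at the response root avoids the need for response mapping altogether.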

Script

Write JavaScript validation logic that runs in a sandboxed runtime. The incoming data is available as the data variable.

Runtime limits:

| Limit | Value |
| --- | --- |
| Execution timeout | 5 seconds |
| Max statements | 10,000 |
| Memory | 16 MB |

The editor provides syntax highlighting, autocomplete, and inline error reporting.

Return values:

| Return value | Interpretation |
| --- | --- |
| true | Validation passes |
| false | Validation fails (generic error) |
| "error message" | Validation fails with the given message |
| ["error1", "error2"] | Validation fails with multiple errors |
| undefined / null | Validation passes |

The script must contain a return statement. Object return values other than an array of error strings are not supported.
Example:
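// Fail with a specific message when the email is missing or has no "@".
if (!data.email || !data.email.includes('@')) {
  return 'Invalid email address';
}
// Fail when an items array is present but empty.
if (data.items && data.items.length === 0) {
  return 'At least one item is required';
}
// All checks passed.
return true;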
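For cross-field checks, a script can collect several messages and return them as an array. The sketch below wraps the logic in a function named validate only so it can run outside the node; in the script editor you would use just the function body, with data supplied by the runtime. The field names (invoiceNumber, total, items, quantity, unitPrice) are assumptions.

```javascript
// Wrapper used only so the sketch runs outside the node.
function validate(data) {
  var errors = [];
  var items = Array.isArray(data.items) ? data.items : [];

  // Cross-field check: line items should add up to the stated total.
  var sum = 0;
  for (var i = 0; i < items.length; i++) {
    sum += items[i].quantity * items[i].unitPrice;
  }
  // Allow half a cent of floating-point drift.
  if (Math.abs(sum - data.total) > 0.005) {
    errors.push('Line items sum to ' + sum.toFixed(2) + ' but total is ' + data.total);
  }

  // Format check with a regular expression.
  if (!/^INV-\d+$/.test(String(data.invoiceNumber))) {
    errors.push('Invoice number must look like INV-1234');
  }

  // How an empty array is interpreted is not documented,
  // so return true explicitly when nothing failed.
  return errors.length > 0 ? errors : true;
}

var okResult = validate({
  invoiceNumber: 'INV-1042',
  total: 30,
  items: [{ quantity: 2, unitPrice: 10 }, { quantity: 1, unitPrice: 10 }],
});
var badResult = validate({
  invoiceNumber: '1042',
  total: 25,
  items: [{ quantity: 2, unitPrice: 10 }],
});

console.log(okResult);  // true
console.log(badResult); // an array with two error messages
```

Returning true explicitly on success avoids relying on how the runtime treats an empty array.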

AI

Use an LLM to validate data against natural language instructions. Write a prompt describing what valid data looks like, and the model returns whether the data passes along with any errors.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| Prompt | text | Yes | Natural language validation instructions |
| Precision | select | Yes | Model quality level: Small, Medium, or High |

Higher precision uses a more capable model at a higher credit cost.

Output format

The validation node passes upstream data through and appends a validation object:
{
  "data": { },
  "validation": {
    "isValid": true,
    "errors": [],
    "warnings": [],
    "mode": "rules",
    "failureAction": "fail"
  }
}
| Field | Type | Description |
| --- | --- | --- |
| data | object | The original upstream payload, passed through unchanged |
| validation.isValid | boolean | Whether the data passed validation |
| validation.errors | string[] | Validation error messages (empty when valid) |
| validation.warnings | string[] | Warning messages (populated when failure action is warn) |
| validation.mode | string | The validation mode that was used |
| validation.failureAction | string | The configured failure action (fail or warn) |
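A downstream step that wants to act on warnings could read the validation object as in this sketch. The function needsReview and the sample payload are hypothetical. The helper reads only validation.warnings, because the value of isValid under the Warn action is not stated here.

```javascript
// Hypothetical downstream helper: flags outputs that carry warnings.
function needsReview(output) {
  var warnings = (output.validation && output.validation.warnings) || [];
  return warnings.length > 0;
}

// Sample outputs (assumed data and messages; isValid omitted on purpose).
var flagged = {
  data: { total: -5 },
  validation: {
    errors: [],
    warnings: ['Total must not be negative'],
    mode: 'rules',
    failureAction: 'warn',
  },
};
var clean = {
  data: { total: 42 },
  validation: { errors: [], warnings: [], mode: 'rules', failureAction: 'warn' },
};

console.log(needsReview(flagged)); // true
console.log(needsReview(clean));   // false
```

A check like this is one way to decide whether a run should be sent to a review node.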

Inputs and outputs

Allowed inputs: extract, transform, route, merge, review, parse.

Output: the original data with validation results appended. When the failure action is Fail and validation errors exist, the step fails and downstream nodes do not execute.

Common pitfalls

  • Fail halts the pipeline, and downstream nodes never run. If you want the run to continue while flagging the issue, switch to Warn and consume the validation.warnings array downstream.
  • By default, Endpoint mode looks for valid (or isValid) and errors at the response root. If your service returns a different shape, set Response mapping to point at the correct paths, or every response will be treated as invalid.
  • The script must contain a return statement. A script that finishes without returning yields undefined, which passes validation, so a missing return silently disables the check.
  • AI validation costs LLM credits per run. If the check is “amount must be a positive number” or “currency must be USD or EUR,” use Rules or Script mode for free, deterministic results. Reserve AI mode for genuinely judgment-based checks.
  • A field marked as required in the extract schema already gates that field's presence. Don't repeat the same check in validation; use validation for the harder constraints (formats, ranges, cross-field rules).

Related actions

  • Extract action: extract structured data before validating.
  • Review action: add human review for failed validations.
  • Transform action: transform validated data to a delivery format.
  • Route action: route data based on conditions before validating.