Cascade · detection architecture

One frame is a hypothesis. Evidence over time becomes an incident.

The naive approach is to ask the strongest model to inspect every frame of every camera. That is expensive, slow, and wasteful, because most frames show ordinary traffic. VESU stages the work instead: low-cost checks run on every frame, lightweight perception casts a wide net, temporal reasoning waits for evidence, and strong verification is reserved for credible candidates.

01 · The four stages

Each stage is narrower than the last.

VESU does not spend the strongest model on empty roads. It spends compute in proportion to evidence.

STAGE 0 · Camera health

Deterministic CV: frozen feed, blur, exposure, obstruction, PTZ motion, view mismatch. A blind camera is worse than an empty alert.

freq: every frame · cost: < 1 ms
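As a rough illustration of what deterministic health checks of this kind can look like, the sketch below covers three of the listed faults (frozen feed, exposure, blur) on tiny grayscale frames. The frame representation, function names, and every threshold are assumptions made for the example, not VESU's implementation; a production check would typically use vectorized image operations to stay within a sub-millisecond budget.

```python
from statistics import mean, pvariance
from typing import List, Optional

# Hypothetical frame representation: rows of grayscale pixels, 0-255.
# Frames are assumed to be at least 2 pixels wide.
Frame = List[List[int]]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel absolute difference between two same-sized frames."""
    return mean(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def brightness(frame: Frame) -> float:
    """Mean pixel intensity of the frame."""
    return mean(p for row in frame for p in row)


def sharpness(frame: Frame) -> float:
    """Variance of horizontal neighbor differences; low values suggest blur."""
    diffs = [row[i + 1] - row[i] for row in frame for i in range(len(row) - 1)]
    return pvariance(diffs)


def camera_health(
    prev: Optional[Frame],
    cur: Frame,
    frozen_eps: float = 0.5,   # assumed threshold
    dark: float = 20.0,        # assumed threshold
    bright: float = 235.0,     # assumed threshold
    min_sharp: float = 4.0,    # assumed threshold
) -> List[str]:
    """Return the list of detected faults; an empty list means healthy."""
    faults: List[str] = []
    if prev is not None and mean_abs_diff(prev, cur) < frozen_eps:
        faults.append("frozen_feed")
    level = brightness(cur)
    if level < dark:
        faults.append("underexposed")
    elif level > bright:
        faults.append("overexposed")
    if sharpness(cur) < min_sharp:
        faults.append("blur")
    return faults
```

Each check is a fixed rule over pixel statistics, so the same frame always yields the same verdict. That determinism is what allows a check like this to gate the later, model-based stages.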

STAGE 1 · Frame perception

Lightweight VLM reads sampled frames against the camera's learned scene model. Strict structured output. Hypotheses, not decisions.

freq: every ~5s · cost: low
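One way to read "strict structured output" is that a model reply is either parsed into a fixed record or discarded. The sketch below assumes the model returns JSON; the field names, the allowed classes, and the allowed regions are hypothetical (only `stopped_vehicle` and the shoulder region appear in this document).

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical vocabularies; a real deployment would load these per camera.
ALLOWED_CLASSES = frozenset({"stopped_vehicle", "none"})
ALLOWED_REGIONS = frozenset({"shoulder", "lane"})
FIELDS = frozenset({"camera_id", "t", "cls", "region"})


@dataclass(frozen=True)
class Hypothesis:
    """A single-frame observation. It is a hypothesis, not a decision."""
    camera_id: str
    t: float
    cls: str
    region: str


def parse_hypothesis(raw: str) -> Optional[Hypothesis]:
    """Parse a model reply strictly; return None on any schema violation."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    # Exactly the expected keys: no missing fields, no extras.
    if not isinstance(data, dict) or set(data) != FIELDS:
        return None
    t = data["t"]
    if isinstance(t, bool) or not isinstance(t, (int, float)):
        return None
    if not all(isinstance(data[k], str) for k in ("camera_id", "cls", "region")):
        return None
    if data["cls"] not in ALLOWED_CLASSES or data["region"] not in ALLOWED_REGIONS:
        return None
    return Hypothesis(data["camera_id"], float(t), data["cls"], data["region"])
```

Rejecting anything outside the schema keeps free-form model text from leaking into later stages. Downstream code only ever sees well-typed hypotheses.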

STAGE 2 · Temporal reasoning

Rules and open-set reasoning over a rolling window. Persistence, region stability, frame count, camera-health state: evidence, not confidence.

freq: rolling ~30s · cost: medium
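The rule-based part of this stage can be pictured as a small rolling-window tracker. The sketch below is a minimal version under stated assumptions: the `Observation` fields, the class name, and the default window and frame-count values are illustrative, and it covers only persistence, region stability, frame count, and camera health, not the open-set path.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque


@dataclass(frozen=True)
class Observation:
    """One Stage 1 hypothesis reduced to what the rule needs (hypothetical)."""
    t: float          # seconds
    region: str
    camera_ok: bool


class PersistenceTracker:
    """Promote to candidate only when evidence persists across the window."""

    def __init__(self, window_s: float = 30.0, min_frames: int = 5) -> None:
        # Defaults are illustrative, not VESU's tuned values.
        self.window_s = window_s
        self.min_frames = min_frames
        self._obs: Deque[Observation] = deque()

    def add(self, obs: Observation) -> bool:
        """Record an observation, drop stale ones, and report candidacy."""
        self._obs.append(obs)
        while self._obs and obs.t - self._obs[0].t > self.window_s:
            self._obs.popleft()
        return self.is_candidate()

    def is_candidate(self) -> bool:
        if len(self._obs) < self.min_frames:
            return False  # frame count not met
        same_region = len({o.region for o in self._obs}) == 1
        healthy = all(o.camera_ok for o in self._obs)
        return same_region and healthy
```

The decision depends on counted, checkable facts (how many frames, which region, whether the camera was healthy) rather than on a single model confidence score.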

STAGE 3 · Clip verification

Strong multimodal model on a short clip or keyframe set. Confirms or retracts. Rare by design: only credible candidates reach here.

freq: on escalation · cost: high
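Taken together, the four stages form a chain of gates in which each stage runs only on what the previous one let through. The sketch below shows that control flow with placeholder callables and counts how often each stage is invoked. The function name, the callables, and `sample_every` (counted in frames here, not seconds) are assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple, TypeVar

F = TypeVar("F")  # frame type
H = TypeVar("H")  # hypothesis type


def run_cascade(
    frames: Iterable[F],
    health_ok: Callable[[F], bool],          # Stage 0 stand-in
    perceive: Callable[[F], Optional[H]],    # Stage 1 stand-in
    is_candidate: Callable[[H], bool],       # Stage 2 stand-in
    verify: Callable[[H], bool],             # Stage 3 stand-in
    sample_every: int = 5,                   # hypothetical sampling stride
) -> Tuple[List[H], Dict[str, int]]:
    """Run frames through the gates; return incidents and per-stage call counts."""
    calls = {"stage0": 0, "stage1": 0, "stage2": 0, "stage3": 0}
    incidents: List[H] = []
    for i, frame in enumerate(frames):
        calls["stage0"] += 1                 # cheap check on every frame
        if not health_ok(frame):
            continue
        if i % sample_every:
            continue                         # perception sees sampled frames only
        calls["stage1"] += 1
        hyp = perceive(frame)
        if hyp is None:
            continue
        calls["stage2"] += 1
        if not is_candidate(hyp):
            continue
        calls["stage3"] += 1                 # strong model runs only on candidates
        if verify(hyp):
            incidents.append(hyp)
    return incidents, calls
```

With toy gates over 100 integer "frames" (perceive passes values of 50 or more, candidacy needs 90 or more, verification accepts only 95), the counts come out as 100, 20, 10, and 2 calls for Stages 0 through 3. That is the narrowing the section describes.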

02 · Walkthrough

A stopped vehicle, from first observation to incident.

A stopped vehicle should not page an operator because it appeared in one frame. VESU waits for evidence; only then does it publish. Timings are representative.

t = 0s · STAGE 0 · Camera healthy

Feed is decoding, exposure normal, no PTZ motion. Stage 1 may run.

t = 5s · STAGE 1 · Possible stopped vehicle

Vehicle observed in shoulder region. Hypothesis recorded; no incident yet.

t = 30s · STAGE 1 · Still in shoulder

Same region. Persistence builds. Camera health still OK.

t = 50s · STAGE 2 · Candidate

Persistence threshold met across 8 frames. Region stable. Class-specific rule passes.

t = 51s · STAGE 3 · Verified

Strong model reviews a short clip. Confirms class: stopped_vehicle. Severity: medium.

t = 51.4s · PUBLISH · Incident delivered

Incident package built. Coverage state OK. Posted to ATMS event queue.
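The final publish step can be sketched as building a small record and placing it on a queue. Everything named here is hypothetical: the `IncidentPackage` fields are a guess at what a package containing a clip, a reason, and provenance might hold, a standard-library `queue.Queue` stands in for the ATMS event queue, and refusing to publish when coverage is not OK is an assumption rather than documented behavior.

```python
import json
import queue
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class IncidentPackage:
    """Hypothetical incident package; field names are illustrative."""
    camera_id: str
    cls: str
    severity: str
    coverage_state: str
    clip_ref: str
    reason: str
    provenance: Tuple[str, ...]   # which stages contributed evidence


def publish(pkg: IncidentPackage, event_queue: "queue.Queue[str]") -> bool:
    """Serialize and enqueue the package; return False if it was held back."""
    if pkg.coverage_state != "OK":
        return False  # assumed gate: do not publish from a degraded camera
    event_queue.put(json.dumps(asdict(pkg), sort_keys=True))
    return True
```

Used with the walkthrough's values, a package with `cls="stopped_vehicle"` and `severity="medium"` would be serialized to one JSON message on the queue.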

03 · Incident lifecycle

Six possible states. All of them are explicit outputs.

"Candidate suppressed" and "not enough evidence" are valid outcomes, not silent dropouts.

Candidate

Persistence/evidence threshold met. Not yet verified or published.

Verified

Stage 3 confirmed. Incident package built with clip, reason, provenance.

Published

Delivered to the operator system. Lifecycle tracking begins.

Updated

Severity or extent changed. Update delivered through the same event path.

Resolved

Cleared in the scene. Resolution event sent.

Suppressed

Evidence retracted before publish, or class did not meet a publication gate.
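The six states can be modeled as an explicit state machine so that every outcome, including suppression, is a recorded transition and not a silent drop. The transition table below is one reading of the descriptions above, not a documented specification: it assumes suppression can only happen before publish and that Resolved and Suppressed are terminal.

```python
from enum import Enum
from typing import Dict, FrozenSet


class State(Enum):
    CANDIDATE = "candidate"
    VERIFIED = "verified"
    PUBLISHED = "published"
    UPDATED = "updated"
    RESOLVED = "resolved"
    SUPPRESSED = "suppressed"


# Assumed transitions, inferred from the state descriptions.
TRANSITIONS: Dict[State, FrozenSet[State]] = {
    State.CANDIDATE: frozenset({State.VERIFIED, State.SUPPRESSED}),
    State.VERIFIED: frozenset({State.PUBLISHED, State.SUPPRESSED}),
    State.PUBLISHED: frozenset({State.UPDATED, State.RESOLVED}),
    State.UPDATED: frozenset({State.UPDATED, State.RESOLVED}),
    State.RESOLVED: frozenset(),
    State.SUPPRESSED: frozenset(),
}


def advance(current: State, nxt: State) -> State:
    """Move to the next state, rejecting transitions the table does not allow."""
    if nxt not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {nxt.value}")
    return nxt
```

Because illegal moves raise instead of passing silently, a suppressed candidate can never be published by accident, and each incident's history stays auditable.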

04 · The open-set path

The road will always produce cases that do not fit a closed list.

Stage 2 carries an open-set path for evidence that looks safety-relevant but does not match a named incident type. The goal is not to invent a label; it is to route evidence for review and improve the ontology over time.

Detect

Stage 2 flags persistent visual evidence that the closed ontology cannot explain.

Verify

Stage 3 confirms the evidence is real and safety-relevant; describes it in plain English.

Route

Surfaced for review with full evidence — not auto-published as an invented type.
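The detect, verify, and route steps above amount to a three-way decision once Stage 3 has run. The sketch below is a minimal illustration of that routing. The `VerifiedEvidence` fields, the outcome strings, and the one-entry ontology are assumptions (`stopped_vehicle` is the only incident class named in this document).

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical closed ontology; a real one would list every named incident type.
CLOSED_ONTOLOGY = frozenset({"stopped_vehicle"})


@dataclass(frozen=True)
class VerifiedEvidence:
    """Stage 3 output for one candidate (field names are illustrative)."""
    matched_class: Optional[str]   # None when no named type fits
    is_real: bool                  # Stage 3 confirmed the evidence exists
    safety_relevant: bool
    description: str               # plain-English description from Stage 3


def route(ev: VerifiedEvidence) -> str:
    """Return 'publish', 'review', or 'suppress' for verified evidence."""
    if not ev.is_real:
        return "suppress"
    if ev.matched_class in CLOSED_ONTOLOGY:
        return "publish"
    if ev.safety_relevant:
        # Open-set case: surface for human review, never invent a label.
        return "review"
    return "suppress"
```

The function never assigns a new class name to open-set evidence. Unmatched but safety-relevant evidence keeps its plain-English description and goes to review.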

The cascade is the part that earns operator trust.

We are happy to walk through how a credible candidate moves through the four stages — and how Stage 0 and the suppression path keep noise out of operator queues.
