Evaluation · calibration · improvement

Audit-only first. Activation after evidence.

Tollscopic DOT# can run in a non-actioning mode, collect labelled examples from local traffic, measure performance by condition, and promote downstream uses only when the data supports them.

01 · The improvement loop

Five stages. Run in a circle.

The output of each becomes the input of the next. Activation is the gate, not the start. Real production traffic feeds the labels; replay closes the loop.

01 · AUDIT-ONLY EVENTS

02 · REVIEW · LABELS

03 · CALIBRATION SNAPSHOT

04 · REPLAY · REGRESSION

05 · CONTROLLED ACTIVATION

02 · What gets measured

Categories, not vanity numbers.

The site does not publish exact thresholds or holdout scores. The discipline is the message: each of these is tracked, named, and tied to a decision about activation.

Candidate-generation recall

Is the correct carrier in the shortlist at all?

Top-1 carrier correctness

When the top candidate is chosen, is it right?

Verified precision

For events emitted as verified, how often is the carrier correct?

Verified coverage

What fraction of eligible commercial traffic emits verified?

Abstention rate

The combined share of events classed as ambiguous, unreadable, no_panel, or catalog_miss.

Duplicate-event rate

How often the same physical pass produces more than one event.

Event latency

From vehicle pass to identity event available for downstream use.

Event availability

The share of eligible tracks that produce an event.
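As a rough illustration of how several of these categories could be computed from reviewed events, here is a minimal sketch. The `LabelledEvent` record, its field names, and the `summarize` function are hypothetical and are not the product's schema. The sketch also assumes that every labelled event comes from an eligible pass, and that the duplicate-event rate is counted per pass. The sample data is made up.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# Abstention classes as named on this page.
ABSTENTION_CLASSES = {"ambiguous", "unreadable", "no_panel", "catalog_miss"}


@dataclass
class LabelledEvent:
    """Hypothetical reviewed event; field names are assumptions."""
    track_id: str                     # one physical pass
    event_class: str                  # "verified" or an abstention class
    predicted_carrier: Optional[str]  # None when the system abstained
    true_carrier: Optional[str]       # reviewer label


def summarize(events):
    """Compute a few of the tracked categories from labelled events."""
    total = len(events)
    verified = [e for e in events if e.event_class == "verified"]
    correct = [e for e in verified if e.predicted_carrier == e.true_carrier]
    abstained = [e for e in events if e.event_class in ABSTENTION_CLASSES]
    events_per_pass = Counter(e.track_id for e in events)
    duplicated = sum(1 for n in events_per_pass.values() if n > 1)
    return {
        # Of events emitted as verified, how many name the right carrier.
        "verified_precision": len(correct) / len(verified) if verified else None,
        # Share of (assumed eligible) events that emit verified.
        "verified_coverage": len(verified) / total if total else None,
        # Share of events in any abstention class.
        "abstention_rate": len(abstained) / total if total else None,
        # Share of passes that produced more than one event.
        "duplicate_event_rate": (
            duplicated / len(events_per_pass) if events_per_pass else None
        ),
    }


# Toy data for illustration only.
sample = [
    LabelledEvent("t1", "verified", "A", "A"),
    LabelledEvent("t2", "verified", "B", "C"),
    LabelledEvent("t3", "ambiguous", None, "D"),
    LabelledEvent("t4", "unreadable", None, "E"),
    LabelledEvent("t4", "unreadable", None, "E"),  # duplicate for pass t4
]
report = summarize(sample)
```

Keeping each category as its own named value, rather than folding them into a single score, matches the idea that each one is tied to a separate activation decision.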

03 · Conditions matter

Aggregate accuracy hides a multimodal system.

A system that works in clear daytime and collapses at night should not hide behind one number. Tollscopic DOT# reports performance by condition, not just as one overall figure.

Day / night

lighting affects panel contrast

Rain / fog / glare

reduced visibility degrades readability

Motion blur

speed and shutter interact

Stop-and-go

spacing fragments tracks

High speed

fewer usable frames per pass

Occlusion

trailers, other vehicles, infrastructure

Camera side

left vs right tractor door

Vehicle type

sleeper, day cab, straight truck

Panel state

damage, missing markings, off-spec
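To show why a single number can mislead, the following minimal sketch groups correctness by one condition tag. The row layout, the `lighting` key, and the function name are assumptions for illustration, and the rows are toy values rather than measured results.

```python
from collections import defaultdict


def correctness_by_condition(rows, condition_key):
    """Return the fraction of correct rows within each condition bucket."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for row in rows:
        # Rows missing the tag are grouped under "unknown" instead of dropped.
        bucket = row.get(condition_key, "unknown")
        totals[bucket] += 1
        if row["correct"]:
            hits[bucket] += 1
    return {bucket: hits[bucket] / totals[bucket] for bucket in totals}


# Toy rows for illustration only.
rows = [
    {"lighting": "day", "correct": True},
    {"lighting": "day", "correct": True},
    {"lighting": "day", "correct": True},
    {"lighting": "night", "correct": True},
    {"lighting": "night", "correct": False},
]

overall = sum(1 for r in rows if r["correct"]) / len(rows)
by_lighting = correctness_by_condition(rows, "lighting")
```

In this toy data the overall figure is 0.8, while the night bucket alone is 0.5. The same grouping could be repeated for any of the conditions listed above.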

04 · Hard cases are named

Each one is an instrumented category.

Hard cases are not an apology. They are how Tollscopic DOT# knows what to improve and how to decide which permissions to activate next. Aggregates hide them; categories surface them.

Trailer occludes tractor door

The panel is behind the trailer for part of the track. Frame selection and consensus absorb it most of the time.

Night and rain

Reduced contrast on the panel. Frame selection and preprocessing help. The reported confidence reflects the degradation honestly.

Tractor / trailer markings disagree

Tractor and trailer can show different identities. Tollscopic DOT# is primarily a tractor-door product; mismatch is reported.

Newly registered carrier

The catalog snapshot may be stale. Event class: catalog_miss, flagged for catalog or process review.

Plate path succeeds; Tollscopic DOT# ambiguous

The pass belongs to the existing plate workflow. Tollscopic DOT# reports ambiguous; there is no double-billing risk.

Close-following vehicles

Track fragmentation. The system either reunites tracks or refuses with unreadable.

05 · Replay and regression

Stored evidence. Versioned decisions. No silent drift.

When a model, prompt, scorer, or catalog snapshot changes, historical evidence can be reprocessed to understand how the change would have affected real cases. Regression review gates promotion.

Stored evidence

Selected frames, trajectory summaries, and observations preserved by track.

Versioned decisions

Every event records the model, scorer, prompt, and catalog versions that produced it.

Candidate or shadow

New versions can run against historical evidence in parallel to production.

Promoted on review

Changes are activated only after no-regression review against the replay corpus.
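The replay gate described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the product's implementation: the `Versions` and `ReplayRun` records, the function names, and the definition of a regression (production was right on a labelled track and the candidate is not) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Versions:
    """The four versioned components named on this page."""
    model: str
    scorer: str
    prompt: str
    catalog: str


@dataclass
class ReplayRun:
    """Hypothetical result of processing stored evidence with one version set."""
    versions: Versions
    # track_id -> predicted carrier, or None when the run abstained
    outputs: Dict[str, Optional[str]] = field(default_factory=dict)


def find_regressions(labels, production, candidate):
    """Track ids where production matched the label and the candidate does not."""
    regressions = []
    for track_id, truth in labels.items():
        prod_ok = production.outputs.get(track_id) == truth
        cand_ok = candidate.outputs.get(track_id) == truth
        if prod_ok and not cand_ok:
            regressions.append(track_id)
    return sorted(regressions)


def may_promote(labels, production, candidate):
    """Assumed gate: promote only when the replay corpus shows no regression."""
    return not find_regressions(labels, production, candidate)


# Toy replay corpus for illustration only.
labels = {"t1": "A", "t2": "B", "t3": "C"}
production = ReplayRun(
    Versions("m1", "s1", "p1", "c1"), {"t1": "A", "t2": "X", "t3": "C"}
)
candidate = ReplayRun(
    Versions("m2", "s1", "p1", "c1"), {"t1": "A", "t2": "B", "t3": "D"}
)
regressed = find_regressions(labels, production, candidate)
```

In this toy corpus the candidate fixes `t2` but breaks `t3`, so `may_promote` returns `False` even though the candidate's overall accuracy is unchanged. A real review would likely also weigh abstentions and per-condition results, which this sketch leaves out.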

Audit-only is a real starting point.

It produces useful evidence about your traffic mix before anyone asks Tollscopic DOT# to take stronger action. Activation is the gate at the end of the loop, not at the start.
