Last updated: 2026-04-30 UTC

Public explainer for the DecencyMeter scoring anomaly that motivated experimental v0.2


DecencyMeter Scoring Anomalies

Why experimental v0.2 exists, and why that does not make DBaD responsible for scoring-layer interpretation mistakes.

This page explains the clearest public scoring anomaly: under the current model, a procedurally clean incident trace can present with an advisory score of 100/100 (DecencyMeter v0.1, baseline profile, perfect-looking incident trace case). That anomaly belongs to DecencyMeter interpretation, not to DBaD structural validation.

Boundary Block

  • DBaD validates structure, not truth or morality.
  • DecencyMeter scoring is advisory interpretation only.
  • v0.2 is experimental and contestable.
  • A lower score is not proof of wrongdoing.
  • A higher score is not proof of goodness.

Portable Citation Object

Portable citation object: a reusable advisory notice designed to travel with screenshots, quotes, press references, decks, emails, and regulator or journalist excerpts. It is not a legal disclaimer or enforcement tool; it is a presentation-discipline tool that makes omission of context visibly dishonest.

Short Notice

DecencyMeter Advisory Notice: Advisory only. Not DBaD validation. Not proof of ethical behavior, truth, or outcome quality. Profile, version, and context matter.

Standard Notice

DecencyMeter Advisory Notice: This is a procedural trace-interpretation score only. It does not verify truth, ethics, harm, safety, or real-world outcomes. Scores vary by profile and version and must not be presented as proof of ethical behavior.

Synthetic / Experimental Notice

DecencyMeter Advisory Notice: This synthetic pressure-test output and experimental DecencyMeter output are advisory only. They are not DBaD validation and not proof of ethical behavior, truth, or real-world outcome quality. Profile, version, and case context matter.

Observation Check

The system is now in an observation phase. The links below route reviewers into the existing critique/report path without adding storage or new backend behavior.

Was this helpful?

Confusing?

Misleading?

The v0.1 anomaly

The clearest misread risk found in pressure testing was simple:

A procedurally clean incident trace can still surface a selected advisory score of 100/100 under v0.1 / baseline.

This happens because v0.1 rewards visible process cleanliness but does not directly deduct for outcome_status=incident unless another signal, such as a mismatch or unresolved governance, also fires.

Why the anomaly happens

  • No expectation/outcome mismatch
  • Evidence present
  • No open blind spots
  • Completeness declared complete
  • No closure problem
  • No direct v0.1 incident deduction
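
The conditions above can be sketched as a small deduction-only scorer. This is a minimal illustration, not the DecencyMeter implementation: the field names, the `Trace` class, and every weight in `V01_DEDUCTIONS` are hypothetical. The point it shows is structural: the function never reads `outcome_status`, so a clean-looking trace with an incident outcome keeps the full score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trace:
    """Visible trace fields. Names are illustrative, not the DBaD schema."""
    outcome_status: str        # e.g. "upheld", "unknown", "incident"
    expectation_mismatch: bool
    evidence_present: bool
    open_blind_spots: int
    completeness: str          # e.g. "declared_complete"
    closure_problem: bool


# Hypothetical weights, chosen only to make the sketch runnable.
V01_DEDUCTIONS = {
    "mismatch": 20,
    "missing_evidence": 10,
    "open_blind_spots": 10,
    "incomplete": 10,
    "closure_problem": 10,
}


def score_v01_sketch(trace: Trace) -> tuple[int, list[str]]:
    """Return (score, fired_signals). outcome_status is deliberately unused."""
    fired = []
    if trace.expectation_mismatch:
        fired.append("mismatch")
    if not trace.evidence_present:
        fired.append("missing_evidence")
    if trace.open_blind_spots > 0:
        fired.append("open_blind_spots")
    if trace.completeness != "declared_complete":
        fired.append("incomplete")
    if trace.closure_problem:
        fired.append("closure_problem")
    score = 100 - sum(V01_DEDUCTIONS[name] for name in fired)
    return max(score, 0), fired


# A procedurally clean trace whose recorded outcome is still "incident".
perfect_looking_incident = Trace(
    outcome_status="incident",
    expectation_mismatch=False,
    evidence_present=True,
    open_blind_spots=0,
    completeness="declared_complete",
    closure_problem=False,
)

score, fired = score_v01_sketch(perfect_looking_incident)
print(score, fired)  # 100 []
```

Because no process signal fires, nothing is deducted, and the incident outcome has no path into the score.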

Why it can be misread

A viewer can easily mistake a high advisory score for goodness or safety even when the recorded observed outcome is still incident.

Why this is not a DBaD failure

What DBaD did correctly

DBaD recorded the trace fields, the outcome status, the evidence posture, the blind-spot state, the completeness state, and the closure state without claiming those fields were moral truth.

Where responsibility belongs

DecencyMeter chose how to interpret those visible fields. If the score feels too clean, that is a scoring-layer design problem, not a DBaD protocol failure.

DBaD validates structure, not truth or morality. The protocol can be structurally correct while a downstream score is still contestable or incomplete.

What experimental v0.2 changes

Experimental v0.2 stays downstream from DBaD and adds visible deductions only. It does not change validation, mutate traces, or store scores.

  • Direct outcome-status deductions for visible concerning outcomes such as incident.
  • An extra caution deduction when incident coexists with approved_to_continue.
  • An extra caution deduction when incident coexists with no blind spots and declared_complete completeness.
  • An over-verification / minimal-evidence caution when verification history is heavy but supported transition evidence still remains thin.

v0.2 is experimental and contestable. It exists to make the anomaly visible, not to define final truth.
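
The four v0.2 additions can be sketched as a layer of named deductions applied on top of an existing advisory score. This is a hedged sketch under stated assumptions: `TraceView`, the signal names, the thresholds, and all weights are hypothetical. This page reports only a combined deduction of 18 for the perfect-looking incident trace; the 12 + 6 split used here is an assumption made so the example reproduces that total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceView:
    """Read-only view of visible trace fields. Names are illustrative."""
    outcome_status: str
    approved_to_continue: bool
    open_blind_spots: int
    completeness: str
    verification_events: int     # hypothetical count of verification steps
    supported_transitions: int   # hypothetical count of evidenced transitions


# Hypothetical weights and thresholds (assumptions, not published values).
V02_WEIGHTS = {
    "outcome_status_incident": 12,
    "incident_plus_approved_to_continue": 6,
    "incident_plus_no_blind_spots": 6,
    "over_verification_minimal_evidence": 4,
}
HEAVY_VERIFICATION = 5
THIN_EVIDENCE = 1


def v02_deductions(t: TraceView) -> list[tuple[str, int]]:
    """Return the visible (signal, weight) pairs that fire for this trace."""
    fired = []
    incident = t.outcome_status == "incident"
    if incident:
        fired.append("outcome_status_incident")
    if incident and t.approved_to_continue:
        fired.append("incident_plus_approved_to_continue")
    if incident and t.open_blind_spots == 0 and t.completeness == "declared_complete":
        fired.append("incident_plus_no_blind_spots")
    if (t.verification_events >= HEAVY_VERIFICATION
            and t.supported_transitions <= THIN_EVIDENCE):
        fired.append("over_verification_minimal_evidence")
    return [(name, V02_WEIGHTS[name]) for name in fired]


def apply_v02(baseline_score: int, t: TraceView) -> tuple[int, list[tuple[str, int]]]:
    """Subtract fired deductions from a baseline score; the trace is not mutated."""
    deductions = v02_deductions(t)
    total = sum(weight for _, weight in deductions)
    return max(baseline_score - total, 0), deductions


incident_view = TraceView(
    outcome_status="incident",
    approved_to_continue=False,
    open_blind_spots=0,
    completeness="declared_complete",
    verification_events=1,
    supported_transitions=3,
)
new_score, why = apply_v02(100, incident_view)
print(new_score)  # 82
```

Returning the fired signals alongside the score keeps every deduction visible, which matches the stated intent that v0.2 adds visible deductions only and leaves validation and traces untouched.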

What to look for

  • Cases where the score feels wrong even if the visible logic is working as designed.
  • Cases where profiles disagree strongly and the reason is not obvious at first glance.
  • Cases where a clean trace still feels misleading.
  • Cases where a visibly bad trace still scores unexpectedly high.

Examples

These examples use the current baseline profile. The audit, stress, and perfect-looking incident traces are all now public demo cases. The perfect-looking incident trace remains synthetic pressure-test material rather than real-world evidence.

Presentation rule: no bare score appears below without version, profile, case context, and advisory labeling.

Quote rule: copy one of the portable citation notices above rather than quoting an anomaly score alone.
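
One way to make the presentation and quote rules hard to break is to render a score only through a function that refuses missing context. The sketch below is illustrative: `ScoreCitation` and its field names are hypothetical, and the notice text is the Short Notice from this page.

```python
from dataclasses import dataclass

SHORT_NOTICE = (
    "DecencyMeter Advisory Notice: Advisory only. Not DBaD validation. "
    "Not proof of ethical behavior, truth, or outcome quality. "
    "Profile, version, and context matter."
)


@dataclass(frozen=True)
class ScoreCitation:
    """A score that cannot exist without version, profile, and case context."""
    version: str
    profile: str
    case_context: str
    score: int

    def __post_init__(self):
        # Reject empty or whitespace-only context fields at construction time.
        for name in ("version", "profile", "case_context"):
            if not getattr(self, name).strip():
                raise ValueError(f"refusing to cite a bare score: missing {name}")
        if not 0 <= self.score <= 100:
            raise ValueError("score must be between 0 and 100")

    def render(self) -> str:
        """Return the score with its advisory notice and context attached."""
        return "\n".join([
            SHORT_NOTICE,
            f"Version: {self.version}",
            f"Profile: {self.profile}",
            f"Case: {self.case_context}",
            f"Selected score: {self.score}/100",
        ])


citation = ScoreCitation(
    version="v0.1",
    profile="baseline",
    case_context="Perfect-looking incident trace / synthetic pressure test",
    score=100,
)
print(citation.render())
```

Since the only way to obtain text containing the score is `render()`, the notice and context travel with it by construction.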

Fields per example: Kind, Outcome, v0.1, Experimental v0.2, Delta, New v0.2 signals.

Runtime-audited trace (trc_20260428181140_42396240)

  • Kind: live public demo case
  • Outcome: upheld
  • v0.1:
      ADVISORY ONLY — NOT DBaD VALIDATION — NOT PROOF OF ETHICAL BEHAVIOR
      Profile: baseline
      Case: Runtime-audited trace
      Runtime-audited trace.
      Selected score: 90/100
  • Experimental v0.2:
      ADVISORY ONLY — NOT DBaD VALIDATION — NOT PROOF OF ETHICAL BEHAVIOR
      Profile: baseline / experimental v0.2
      Case: Runtime-audited trace
      Experimental. Runtime-audited trace.
      Selected score: 90/100
  • Delta: +0
  • New v0.2 signals: none
  • Note: Current public proof case. Only missing evidence triggers, so v0.2 leaves the score unchanged. Open this live demo case.

Scoring stress test trace (trc_20260428185100_7f3c9d21)

  • Kind: live public demo case
  • Outcome: unknown
  • v0.1:
      ADVISORY ONLY — NOT DBaD VALIDATION — NOT PROOF OF ETHICAL BEHAVIOR
      Profile: baseline
      Case: Scoring stress test trace / synthetic pressure test
      Synthetic pressure test, not real-world evidence.
      Selected score: 60/100
  • Experimental v0.2:
      ADVISORY ONLY — NOT DBaD VALIDATION — NOT PROOF OF ETHICAL BEHAVIOR
      Profile: baseline / experimental v0.2
      Case: Scoring stress test trace / synthetic pressure test
      Experimental. Synthetic pressure test, not real-world evidence.
      Selected score: 60/100
  • Delta: +0
  • New v0.2 signals: none
  • Note: Current public contrast case. Runtime-normalized outcome remains unknown, so v0.2 currently adds no new outcome deduction. Open this live demo case.

Perfect-looking incident trace (trc_20260428193300_pf67c1e2)

  • Kind: live public demo case
  • Outcome: incident
  • v0.1:
      ADVISORY ONLY — NOT DBaD VALIDATION — NOT PROOF OF ETHICAL BEHAVIOR
      Profile: baseline
      Case: Perfect-looking incident trace / synthetic pressure test
      Synthetic pressure test, not real-world evidence.
      Selected score: 100/100
  • Experimental v0.2:
      ADVISORY ONLY — NOT DBaD VALIDATION — NOT PROOF OF ETHICAL BEHAVIOR
      Profile: baseline / experimental v0.2
      Case: Perfect-looking incident trace / synthetic pressure test
      Experimental. Synthetic pressure test, not real-world evidence.
      Selected score: 82/100
  • Delta: -18
  • New v0.2 signals: outcome status incident, incident plus no-blind-spots confidence caution
  • Note: This is the clearest anomaly: clean process plus incident outcome. It is synthetic pressure-test material, not real-world evidence, and not a DBaD failure. Open this live demo case.

What not to conclude

  • A lower score is not proof of wrongdoing.
  • A higher score is not proof of goodness.
  • A clean DBaD trace does not mean the observed outcome was morally good.
  • A contested DecencyMeter score does not invalidate DBaD structural validation.

Links Back

/decencymeter/demo · /decencymeter/pressure-tests · /decencymeter-bridge · /v2-2-demo · /break-dbad/report