Whitepaper
Auditable Compliance Crosswalks
A Rules-Based Approach to SKOS Mapping in the Era of LLM-Based Ontology Matching
Cross-framework compliance crosswalks – such as DISA STIG to NIST 800-53 or CMMC to NIST 800-171 – are foundational to U.S. federal-government and defense-industrial-base assurance workflows, yet they decay silently as underlying standards re-version, transitive inferences cascade beyond authored intent, and intermediate bridge vocabularies get bypassed.
This paper presents a rules-based authoring framework for SKOS-typed compliance crosswalks, grounded in the W3C SKOS Reference and a research library of twenty-four peer-reviewed and primary-source citations. The framework comprises fourteen numbered rules and six anti-patterns, each linked to cited authorities. It is implemented as a specification maintained with a human in the loop and backed by a Postgres-stored mapping table with row-level provenance.
Three structural failure modes are identified and addressed: standard-revision drift, transitive-inference cascade under SKOS exactMatch semantics, and bridge-layer collapse. Two additional rules, written in response to the 2023–2024 literature on LLM-based ontology matching, demonstrate that the framework remains maintainable as the field evolves. The full framework is published openly as decision record DR-006 of the GRCSchema project.
Key ideas
Silent Decay of Compliance Crosswalks
Compliance crosswalks fail not at the moment of authoring but in the gap between authoring and audit. Underlying standards re-version without automatically invalidating derivative citations, and ambiguity markers accumulate without written justification, causing crosswalks to pass surface inspection while failing under scrutiny.
SKOS Mapping Predicates as a Rigorous Foundation
The W3C SKOS Reference defines five mapping predicates – exactMatch, closeMatch, broadMatch, narrowMatch, and relatedMatch – with formal symmetry, transitivity, and disjointness constraints. Adopting these predicates without accompanying authoring rules produces crosswalks that are syntactically correct but semantically inconsistent under transitive closure.
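To make the transitive-closure concern concrete, the following minimal Python sketch computes every exactMatch pair entailed by symmetry and transitivity from a set of authored pairs. The function name and the concept identifiers (stig:RULE-A, cci:BRIDGE-1, nist:CTRL-1) are hypothetical placeholders for illustration, not entries from any real crosswalk or part of the framework itself.

```python
from itertools import combinations


def exact_match_closure(authored):
    """Return all exactMatch pairs entailed by symmetry and transitivity."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Each authored pair merges two equivalence classes.
    for a, b in authored:
        parent[find(a)] = find(b)

    classes = {}
    for concept in list(parent):
        classes.setdefault(find(concept), set()).add(concept)

    # Every ordered pair inside a class is entailed.
    entailed = set()
    for members in classes.values():
        for a, b in combinations(sorted(members), 2):
            entailed.add((a, b))
            entailed.add((b, a))
    return entailed


# Hypothetical identifiers: two authored mappings through a bridge concept.
authored = {
    ("stig:RULE-A", "cci:BRIDGE-1"),
    ("cci:BRIDGE-1", "nist:CTRL-1"),
}
entailed = exact_match_closure(authored)

# Pairs nobody authored in either direction, yet which SKOS semantics entail.
silent = {
    pair for pair in entailed
    if pair not in authored and (pair[1], pair[0]) not in authored
}
```

Here two authored mappings entail a third relationship (stig:RULE-A to nist:CTRL-1, in both directions) that no author reviewed. That is the kind of silent entailment authoring rules would need to surface.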
Three Structural Failure Modes
The framework identifies and mitigates three recurrent failure modes: standard-revision drift (citations pointing to retired or renumbered controls), transitive-inference cascade (silent entailments produced by the symmetric and transitive exactMatch predicate), and bridge-layer collapse (bypassing known intermediate vocabularies such as the DoD's Control Correlation Identifier (CCI) registry).
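As one illustration of the first failure mode, a drift check could compare the revision each mapping was authored against with the current revision of the target standard. The sketch below is an assumption about how such a check might look. The PinnedMapping fields, the revision labels, and the identifiers are all hypothetical and do not reflect the framework's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PinnedMapping:
    source_id: str
    target_id: str
    predicate: str
    target_revision: str  # revision of the target standard the author cited


def find_drift(mappings, current_revision, current_ids):
    """Return (mapping, reason) for every row that needs re-review."""
    flagged = []
    for m in mappings:
        if m.target_id not in current_ids:
            flagged.append((m, "target retired or renumbered"))
        elif m.target_revision != current_revision:
            flagged.append((m, "authored against older revision"))
    return flagged


# Hypothetical data: the current revision no longer contains nist:CTRL-2.
rows = [
    PinnedMapping("stig:RULE-A", "nist:CTRL-1", "closeMatch", "rev-5"),
    PinnedMapping("stig:RULE-B", "nist:CTRL-2", "closeMatch", "rev-4"),
    PinnedMapping("stig:RULE-C", "nist:CTRL-3", "broadMatch", "rev-4"),
]
flagged = find_drift(rows, "rev-5", {"nist:CTRL-1", "nist:CTRL-3"})
reasons = [reason for _, reason in flagged]
```

Pinning the revision on each row means a re-versioned standard produces a review queue rather than a crosswalk that quietly cites controls that no longer exist.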
Rules-Based Authoring with Research-Grounded Provenance
Fourteen numbered rules and six anti-patterns are each linked to primary-source citations and include worked examples drawn from real compliance vocabulary pairs. A change log records the date, author, and triggering citation for every rule update, creating a provenance audit trail that survives maintainer turnover.
LLM-Proposed Predicates Require a Validation Gate
Recent work shows that few-shot LLM prompting can match specialist ontology-matching systems on benchmark tasks, yet prompt engineering alone fails on real-world alignment workloads. Rule R14 therefore requires that any LLM-proposed predicate pass a set-overlap validation gate (Tversky or Jaccard index), with exactMatch proposals scoring below 0.9 downgraded to closeMatch or escalated for human review.
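A set-overlap gate of this kind can be sketched in a few lines of Python. The sketch assumes that each concept has already been reduced to a set of comparable features (for example, requirement terms). How those sets are built is not specified here and is left as an assumption. The function names and the choice to downgrade rather than escalate are illustrative only.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty sets give no support for a match
    return len(a & b) / len(union)


def tversky(a, b, alpha=0.5, beta=0.5):
    """Tversky index; alpha = beta = 1 reduces to the Jaccard index."""
    a, b = set(a), set(b)
    common = len(a & b)
    denom = common + alpha * len(a - b) + beta * len(b - a)
    return common / denom if denom else 0.0


def gate(proposed, source_terms, target_terms, threshold=0.9):
    """Downgrade an exactMatch proposal whose overlap score is below threshold."""
    score = jaccard(source_terms, target_terms)
    if proposed == "exactMatch" and score < threshold:
        # The alternative outcome is escalation to a human reviewer.
        return "closeMatch", score
    return proposed, score


# Hypothetical feature sets for an LLM-proposed exactMatch.
kept = gate("exactMatch", {"a", "b", "c", "d"}, {"a", "b", "c", "d"})
downgraded = gate("exactMatch", {"a", "b", "c", "d"}, {"a", "b", "c", "e"})
```

In this example the second proposal shares three of five distinct features (a score of 0.6), so it leaves the gate as closeMatch. Proposals for other predicates pass through unchanged because the gate shown here only constrains exactMatch.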
Conjunctive Mappings Expose an Expressivity Gap in SKOS
Compliance requirements are frequently conjunctive – a single NIST control may require multiple conditions each implemented by a separate STIG rule – yet SKOS models only binary mappings. Rule R13 prescribes authoring each binary mapping individually with a shared group identifier, explicitly flagging this as a workaround pending a SKOS extension or domain-specific predicate.
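The grouping workaround might be represented as follows. This is a minimal sketch under stated assumptions: the GroupedMapping fields, the group identifier format, the choice of broadMatch for each member mapping, and the all-members-required satisfaction check are illustrative and are not taken from the text of rule R13.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GroupedMapping:
    source_id: str
    target_id: str
    predicate: str
    group_id: Optional[str] = None  # shared by mappings that are jointly required


def conjunctive_groups(mappings):
    """Collect the source concepts that share a group for each target."""
    groups = defaultdict(set)
    for m in mappings:
        if m.group_id is not None:
            groups[(m.group_id, m.target_id)].add(m.source_id)
    return dict(groups)


def target_satisfied(required_sources, implemented):
    """A conjunctive target holds only if every grouped source is implemented."""
    return set(required_sources) <= set(implemented)


# Hypothetical example: one control whose conditions map to two separate rules.
rows = [
    GroupedMapping("stig:RULE-A", "nist:CTRL-1", "broadMatch", "grp-001"),
    GroupedMapping("stig:RULE-B", "nist:CTRL-1", "broadMatch", "grp-001"),
    GroupedMapping("stig:RULE-C", "nist:CTRL-2", "closeMatch"),
]
groups = conjunctive_groups(rows)
required = groups[("grp-001", "nist:CTRL-1")]
```

Each row remains a valid binary SKOS mapping on its own. The conjunction exists only in the shared identifier, so any tool that ignores the group column sees two independent mappings, which is why the approach is a workaround rather than a fix.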