<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="stratml_AI_Highlight.xsl"?>
<StrategicPlan xmlns="urn:ISO:std:iso:17469:tech:xsd:stratml_core" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Name>A Possible Approach for Evaluating AI Standards Development</Name>
  <Description>A strategic articulation of a theory-of-change-based approach for evaluating whether AI standards achieve goals related to innovation, competition, harm minimization, and public trust.</Description>
  <OtherInformation>Submitter&apos;s Note: This StratML rendition was compiled from the source by ChatGPT and edited in the form at https://stratml.us/forms/Claude/Part1.html^^Scope Note: The report is limited to evaluating the development of AI standards (documentary standards developed by standards development organizations), not evaluating AI systems themselves.^^Intent Note: The document is intended to stimulate discussion with a wide variety of stakeholders about evaluating the effectiveness, utility, and relative value of AI standards development as an intervention.^^Context Note: The report uses data integration, especially entity resolution, as an illustrative application area to make evaluation concepts concrete.</OtherInformation>
  <StrategicPlanCore>
    <Organization>
      <Name>National Institute of Standards and Technology</Name>
      <Acronym>NIST</Acronym>
      <Identifier>401b4307-bc95-4e65-a591-c8d43055426f</Identifier>
      <Description>A U.S. federal agency that advances measurement science, standards, and technology to enhance innovation and trust.</Description>
      <Stakeholder StakeholderTypeType="Person">
        <Name>Julia Lane</Name>
        <Description>Author | NIST Associate | Professor Emerita, Wagner Graduate School of Public Service, New York University</Description>
      </Stakeholder>
    </Organization>
    <Vision>
      <Description>AI standards that measurably advance innovation, trustworthiness, and public benefit.</Description>
      <Identifier>a9a50fdf-8870-46be-be5f-d2491c46d59e</Identifier>
    </Vision>
    <Mission>
      <Description>Define and apply a rigorous evaluation framework to determine whether AI standards achieve intended outcomes and long-term goals.</Description>
      <Identifier>3bb87677-73ec-4fd2-b033-96b655548ef3</Identifier>
    </Mission>
    <Value>
      <Name>Trustworthiness</Name>
      <Description>Increase justified confidence in AI systems by promoting reliable governance, assessment, and transparency practices.</Description>
    </Value>
    <Value>
      <Name>Public Benefit</Name>
      <Description>Evaluate AI standards in terms of their contributions to the public good, including harm reduction and trust-building outcomes.</Description>
    </Value>
    <Value>
      <Name>Innovation</Name>
      <Description>Support innovation by clarifying foundational concepts and enabling interoperability and adoption.</Description>
    </Value>
    <Value>
      <Name>Competition</Name>
      <Description>Promote competitive, widely accessible ecosystems by reducing coordination costs and improving market acceptance of trustworthy AI.</Description>
    </Value>
    <Value>
      <Name>Transparency</Name>
      <Description>Enable meaningful visibility into AI system and data characteristics so that stakeholders can assess reliability and risks.</Description>
    </Value>
    <Value>
      <Name>Privacy</Name>
      <Description>Protect sensitive and personally identifiable information as data and models are shared, combined, and deployed.</Description>
    </Value>
    <Value>
      <Name>Security</Name>
      <Description>Reduce risk and harm by strengthening controls and practices that protect data and AI-enabled processes.</Description>
    </Value>
    <Value>
      <Name>Evidence</Name>
      <Description>Use tested evaluation approaches and measurable indicators to distinguish actual effects from confounding changes in the environment.</Description>
    </Value>
    <Value>
      <Name>Inclusiveness</Name>
      <Description>Involve relevant stakeholders in evaluation design and measurement so that qualitative knowledge informs consequences and tradeoffs.</Description>
    </Value>
    <Value>
      <Name>Consensus</Name>
      <Description>Respect the private sector-led, consensus-based approach to standards while evaluating both positive and negative consequences.</Description>
    </Value>
    <Goal>
      <Name>Evaluation</Name>
      <Description>Establish a coherent framework for evaluating the causal impact of AI standards development.</Description>
      <Identifier>6574cbce-5fe5-47a6-b202-04b7c9949bc2</Identifier>
      <SequenceIndicator>1</SequenceIndicator>
      <OtherInformation>Evaluation is treated as a periodic, objective form of assessment, distinct from a simple before-and-after comparison that lacks a counterfactual. The theory-of-change framing organizes inputs, activities, outputs, outcomes, and goals, and is intended to support learning about what works and why.</OtherInformation>
      <Objective>
        <Name>Results Chain</Name>
        <Description>Define the inputs, activities, outputs, outcomes, and goals associated with AI standards development.</Description>
        <Identifier>9b82b1a5-74c3-4740-83e5-a7b90b2f8fd5</Identifier>
        <SequenceIndicator>1.1</SequenceIndicator>
        <OtherInformation>The report frames standards development as an intervention whose constituent parts should be measured before, during, and after development to enable later causal evaluation.</OtherInformation>
      </Objective>
      <Objective>
        <Name>Counterfactuals</Name>
        <Description>Identify appropriate alternative states of the world against which the impact of AI standards can be assessed.</Description>
        <Identifier>35638d72-497f-40c6-b974-5b7455d400ad</Identifier>
        <SequenceIndicator>1.2</SequenceIndicator>
        <OtherInformation>Without an appropriate comparison group or counterfactual, observed changes cannot be reliably attributed to standards development because the baseline environment may also change over time.</OtherInformation>
      </Objective>
      <Objective>
        <Name>Metrics</Name>
        <Description>Specify qualitative and quantitative measures for assessing adoption, outcomes, and impacts of AI standards.</Description>
        <Identifier>bd091918-4f5e-4697-abdc-2cc2270a1b15</Identifier>
        <SequenceIndicator>1.3</SequenceIndicator>
        <OtherInformation>The report emphasizes that evaluation can be expensive depending on structure and timing; designs using existing data can be more cost-effective than randomized trials, but still require careful attention to confounding factors and baseline measurement.</OtherInformation>
      </Objective>
    </Goal>
    <Goal>
      <Name>Adoption</Name>
      <Description>Support widespread and effective use of AI standards by relevant stakeholders.</Description>
      <Identifier>9775895f-3953-4c44-b0ca-84caec8df679</Identifier>
      <SequenceIndicator>2</SequenceIndicator>
      <OtherInformation>The report highlights foundational areas where standards can accelerate adoption and interoperability, and thereby enable both private and public value creation.</OtherInformation>
      <Objective>
        <Name>Terminology</Name>
        <Description>Promote convergence on shared AI terminology and taxonomies to enable interoperability and reduce coordination costs.</Description>
        <Identifier>5fa58600-fb97-455c-8a1e-914cb1072b38</Identifier>
        <SequenceIndicator>2.1</SequenceIndicator>
        <OtherInformation>Illustrative context: terminology and taxonomy standards can clarify AI techniques and tasks (e.g., classification, named entity recognition, fuzzy matching), improving communication within and across agencies and potentially accelerating deployment of data integration tools.</OtherInformation>
      </Objective>
      <Objective>
        <Name>TEVV</Name>
        <Description>Advance testing, evaluation, verification, and validation methods and metrics for AI systems.</Description>
        <Identifier>05f11989-71f7-456a-919f-2295623b5c95</Identifier>
        <SequenceIndicator>2.2</SequenceIndicator>
        <OtherInformation>Illustrative context: shared TEVV methods and metrics can improve the reliability of AI-assisted data integration, including in demanding environments such as integration of electronic health records across sources.</OtherInformation>
      </Objective>
      <Objective>
        <Name>Training Data Practices</Name>
        <Description>Encourage sound governance and quality management of data used to train AI systems.</Description>
        <Identifier>407f2b75-3f94-4685-b5fc-9c2e7120bafb</Identifier>
        <SequenceIndicator>2.3</SequenceIndicator>
        <OtherInformation>The report notes training data practices that need standardization, including preprocessing technique selection, dataset change management, efficient use of scarce data, management of diverse formats, and identification of data permitted for or excluded from training use.</OtherInformation>
      </Objective>
    </Goal>
    <Goal>
      <Name>Trust</Name>
      <Description>Increase justified confidence in AI systems through consensus-based standards.</Description>
      <Identifier>13adb777-4dbc-4b26-96ef-80129c27e352</Identifier>
      <SequenceIndicator>3</SequenceIndicator>
      <OtherInformation>The report stresses that AI standards can further both private and public good, particularly by increasing trust and reducing harm; it also cautions that standards can have negative consequences, including exclusionary effects such as non-tariff barriers to trade.</OtherInformation>
      <Objective>
        <Name>Governance</Name>
        <Description>Establish norms for accountability, risk management, security, privacy, and transparency in AI development and deployment.</Description>
        <Identifier>b39e883e-2e88-4c72-b16f-f4c157956d64</Identifier>
        <SequenceIndicator>3.1</SequenceIndicator>
        <OtherInformation>The report identifies three high-priority topics for standards development: security, privacy, and transparency among AI actors about system and data characteristics.</OtherInformation>
      </Objective>
      <Objective>
        <Name>Conformity Assessment</Name>
        <Description>Foster mechanisms for assessing and attesting conformity with AI standards.</Description>
        <Identifier>69eb319b-59a8-4789-b46a-31be1470d86a</Identifier>
        <SequenceIndicator>3.2</SequenceIndicator>
        <OtherInformation>Conformity assessment strengthens trust by enabling consistent claims about adherence; it also enables evaluation by making adoption and implementation more observable.</OtherInformation>
      </Objective>
    </Goal>
    <Goal>
      <Name>Learning</Name>
      <Description>Enable continuous improvement of AI standards through evidence and stakeholder-engaged evaluation.</Description>
      <Identifier>7ed24ee5-c0ee-4eb6-92b5-5c9959298795</Identifier>
      <SequenceIndicator>4</SequenceIndicator>
      <OtherInformation>The report’s emphasis on iterative evaluation treats evidence as a mechanism for improving future standards development and dissemination, while recognizing both the costs of evaluation and the need for fit-for-purpose designs.</OtherInformation>
      <Objective>
        <Name>Stakeholder Engagement</Name>
        <Description>Involve relevant stakeholders in evaluation design and measurement from the beginning.</Description>
        <Identifier>ddcab975-0ad9-481f-b17f-856a4439960b</Identifier>
        <SequenceIndicator>4.1</SequenceIndicator>
        <OtherInformation>Stakeholder qualitative knowledge is treated as a key input for describing consequences of developing particular AI standards and for shaping what should be measured.</OtherInformation>
      </Objective>
      <Objective>
        <Name>Iteration</Name>
        <Description>Refine standards and evaluation methods based on observed outputs, outcomes, impacts, and counterfactual comparisons.</Description>
        <Identifier>b79b8c66-4a27-494d-b075-d3ed66127aaf</Identifier>
        <SequenceIndicator>4.2</SequenceIndicator>
        <OtherInformation>The report frames evaluation results as inputs to future standards development, building a body of knowledge about what works and why.</OtherInformation>
      </Objective>
    </Goal>
  </StrategicPlanCore>
  <AdministrativeInformation>
    <PublicationDate>2026-01-01</PublicationDate>
    <Source>https://doi.org/10.6028/NIST.GCR.26-069</Source>
    <Submitter>
      <GivenName>Owen</GivenName>
      <Surname>Ambur</Surname>
      <EmailAddress>Owen.Ambur@verizon.net</EmailAddress>
    </Submitter>
  </AdministrativeInformation>
</StrategicPlan>