AI Governance Framework Comparison Series Methodology
Understanding assessment methodology and limitations. AI Governance Frameworks Comparison post 6 of 6.
Author: Mac Jordan | Post Date: Mar 21, 2025 | Last Update: Mar 21, 2025
This post details the research approach, assessment criteria, and analytical methods for the AIGFC series, along with the limitations we identified in them.
In particular, it outlines what the framework profiles aimed to achieve and how frameworks were assessed against standardized criteria. Documenting these processes provides critical context for interpreting the conclusions drawn in this series.
We aim to be as transparent as possible about our reasoning and processes.
Wider Research
Finding sources: most sources were found through keyword searches or by following references in other sources; while many high-quality sources were identified, the AI Governance field remains fragmented, so many valuable resources, including surveys and models, are likely not accounted for in this report
Quality of evidence analyzed: most evidence for AI and AI Governance adoption trends among businesses comes from survey data, which is useful for understanding how businesses view AI and AI Governance. However, because responses can be mistaken or dishonest, surveys may not reliably capture what businesses are actually doing with AI and AI Governance. In contrast, evidence that more directly gauges what businesses are doing, such as publicly released AI Governance Frameworks and independent audits, currently has limited availability.
Accounting for source dates: some effort was made to discuss survey results within the context of their release date, but we did so inconsistently across analyses. For example, the key barriers to AI adoption section doesn't explicitly consider survey dates when making inferences about the most relevant barriers. Furthermore, there's a lack of historical data for many subjects, which limits what can be said about how they've changed or where they're trending.
Accounting for variance in source methodology: when multiple surveys were discussed as supporting data points for an argument, e.g., what the key barriers to AI adoption are, the variance in sample size, audience, question wording, and other survey features was often not thoroughly accounted for in the inferences made from each source. This issue risks misinterpreting results and making claims not supported by the evidence.
Framework Assessments
Framework comparability: there is high variance in subject matter, format, and intended use among the frameworks investigated; a standardized assessment method was applied that attempted to account for the broad characteristics of each framework, but comparisons between frameworks can reasonably be made only on what they offer for risk mitigation and Data Management
Assessment categories and benchmarks: the two categories assessed don't represent the full range of AI Governance assessment categories. For example, operational tasks and guidance on implementing a framework are essential to AI Governance but couldn't easily be assessed. Risk mitigation and Data Management, by contrast, are both important to AI Governance and to our audience and could be assessed using reputable taxonomies of key practices as benchmarks for classifying guidance, though these taxonomies are not currently used as standard benchmarks for that purpose (see more on assessment validity below).
Assessments focus on describing framework content: we were able to assess the type and amount of guidance offered in frameworks but not its quality; our assessments do not evaluate the accuracy or effectiveness of a framework's guidance and cannot anticipate what guidance most aligns with a business's objectives
Depth of investigations: each framework's breadth, depth, and practicality could only be assessed at a moderate level of investigation; as such, a substantial amount of guidance has likely not been factored into our assessments
Validity of assessment methodology: the assessment methodology used in this report, from the categories it covered to the criteria used to rate guidance, was novel; it was not tested or reviewed by experts before use
Reliability of assessments: assessments were often made in ad-hoc steps, such as how deeply each framework was investigated, how many times a framework was revisited, and when and what type of notes were taken, which did not ensure consistent treatment of each framework; furthermore, each criterion was subjectively assessed by a non-expert, so assessments may not accurately represent a framework's information value or may diverge significantly from how others would assess the same framework
Public release of practice-based frameworks: very few large businesses, including those whose publicly released framework guidance we assessed, have published information on the AI Governance frameworks they use in practice; this means there is a limited pool of resources from which to draw insights
Framework Investigation Methodology
Framework Profiles
Introduction and Content Summary
While the Introduction provides basic background details on a framework, the Content Summary explains its key contents. The key details in the Introduction include who created the framework and when, its aims, its target audience, and plans for its future development. Content Summaries focus on outlining the key content categories and the type of information available in each, e.g., actions, definitions, etc.
Best Practice Highlights
Highlights aim to make useful guidance from each framework highly accessible to businesses and indicate the quality and type of additional guidance a framework may offer. We exclusively identified risk mitigation and Data Management highlights in alignment with our framework assessment methodology (outlined below). The final examples of guidance highlighted are directly taken from each source with minimal editing.
Assessments of Value Potential
In addition to describing specific details for each framework, we assessed several general qualities of its content to indicate how potentially valuable it could be for businesses to investigate it further. Assessments involved subjective ratings for the breadth, depth, and practicality of a framework's guidance on risk mitigation and Data Management during the deployment or operation phase of the AI lifecycle.
We used the AI assistant Claude to help assess the breadth criterion. Specifically, we used Claude to parse each framework and identify any substantive guidance for each assessment subcategory. It worked from the descriptions of and explanations for each subcategory as outlined in NIST's AI RMF and DAMA's DMBOK for risk mitigation and Data Management guidance, respectively. For each suggestion, Claude provided page numbers for the referenced content and a brief explanation of why the suggested guidance qualifies for a given subcategory. The researcher ultimately determined whether each piece of guidance qualified for a particular subcategory.
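For illustration, below is a minimal sketch of what this Claude-assisted breadth check could look like in code. It assumes the Anthropic Python SDK; the model name, prompt wording, and function name are illustrative placeholders rather than the exact setup used in this project, and every suggestion was still reviewed by the researcher before being counted.

```python
# Illustrative sketch of the Claude-assisted breadth check.
# Assumes the Anthropic Python SDK (pip install anthropic) and an
# ANTHROPIC_API_KEY environment variable; the model name, prompt wording,
# and function name are placeholders, not the exact ones used in the series.
import anthropic

client = anthropic.Anthropic()

def suggest_guidance(framework_text: str, subcategory: str, definition: str) -> str:
    """Ask Claude for candidate guidance in one assessment subcategory.

    Each suggestion must cite page numbers and briefly explain why it
    qualifies, so the researcher can accept or reject it afterwards.
    """
    prompt = (
        f"Subcategory: {subcategory}\n"
        f"Definition (from the benchmark taxonomy): {definition}\n\n"
        "From the framework text below, list any substantive guidance that fits "
        "this subcategory. For each item, give the page number(s) and a "
        "one-sentence explanation of why it qualifies. If there is none, say so.\n\n"
        f"--- FRAMEWORK TEXT ---\n{framework_text}"
    )
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",  # placeholder model name
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    # The returned suggestions are reviewed by the researcher, never auto-accepted.
    return response.content[0].text
```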
In addition to these quantitative assessments, we offered high-level guidance on the value potential of supplementary resources and what each framework is best for. Many supplementary resources meaningfully increase a core framework's overall value potential.
Assessment Methodology
Assessment Categories
From among the identified types of practices used in AI Governance, only risk mitigation and Data Management could be effectively operationalized, using reputable taxonomies of practices for each category as benchmarks for assessments.
Risk Mitigation
This assessment category identifies measures explicitly intended to mitigate risks, using the seven subcategories of AI trustworthiness and risk identified in NIST's AI RMF as a benchmark.
Subcategories:
Valid and reliable
Safe
Secure and resilient
Accountable and transparent
Explainable and interpretable
Privacy-enhanced
Fair – with harmful bias managed
Data Management
This assessment category identifies guidance on data practices using the eleven subcategories of Data Management classified in DAMA's DMBOK as a benchmark.
Subcategories:
Data Governance
Data Warehousing & Business Intelligence
Reference & Master Data
Document & Content Management
Data Integration & Interoperability
Data Security
Data Storage & Operations
Data Modeling & Design
Data Architecture
Data Quality
Metadata
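For concreteness, the sketch below shows one way these two benchmark taxonomies could be encoded for use in the breadth assessment. The names mirror the subcategory lists above; grouping them into a single Python dictionary is our own convenience, not a structure prescribed by NIST or DAMA.

```python
# The two benchmark taxonomies encoded as constants for the breadth assessment.
# Names mirror the subcategory lists above; the dictionary grouping is our own
# convenience, not a structure prescribed by NIST or DAMA.
ASSESSMENT_SUBCATEGORIES = {
    "risk_mitigation": [  # NIST AI RMF trustworthiness subcategories (7)
        "Valid and reliable",
        "Safe",
        "Secure and resilient",
        "Accountable and transparent",
        "Explainable and interpretable",
        "Privacy-enhanced",
        "Fair – with harmful bias managed",
    ],
    "data_management": [  # DAMA DMBOK knowledge areas (11)
        "Data Governance",
        "Data Warehousing & Business Intelligence",
        "Reference & Master Data",
        "Document & Content Management",
        "Data Integration & Interoperability",
        "Data Security",
        "Data Storage & Operations",
        "Data Modeling & Design",
        "Data Architecture",
        "Data Quality",
        "Metadata",
    ],
}

# The subcategory counts (7 and 11) feed the breadth normalization
# described under Assessment Criteria below.
```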
Assessment Criteria
Breadth
Definition: rough indication of the range of risk mitigation and Data Management practices that a framework offers guidance on
Operationalization: the total number of assessment subcategories for which a framework includes at least one piece of substantive guidance, i.e., more than a passing mention of the risk or practice. Because the number of subcategories per assessment category is not 10, the raw count is normalized to a score out of 10 for consistency with how depth and practicality are rated. However, should a framework offer no relevant guidance, a score of 0 is assigned, which is then reflected in the depth and practicality ratings.
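As a worked example, the snippet below shows how a raw breadth count might be normalized under this scheme. The counts are hypothetical, and rounding to one decimal place is our illustrative choice rather than a rule stated in the methodology.

```python
# Breadth normalization: raw subcategory counts converted to a 0-10 score.
# The example counts are hypothetical, not ratings from the series.
def breadth_score(covered: int, total_subcategories: int) -> float:
    """Normalize the number of covered subcategories to a score out of 10.

    `covered` is the number of subcategories with at least one piece of
    substantive guidance; a framework with no relevant guidance scores 0.
    """
    if covered == 0:
        return 0.0
    return round(covered / total_subcategories * 10, 1)

# e.g., guidance found in 5 of the 7 risk mitigation subcategories
print(breadth_score(5, 7))   # 7.1
# e.g., guidance found in 6 of the 11 Data Management subcategories
print(breadth_score(6, 11))  # 5.5
```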
Depth
Definition: the specificity and volume of detail of a framework's guidance, e.g., defined terminology and thorough explanations
Operationalization: the level of detail in guidance on risk mitigation and Data Management ranges from very low (1-2) to very high (9-10). A 1 would offer no useful information, and a 10 would provide all the details a business would need to understand best practices. A rating of 0 is possible if and only if breadth is rated as 0. The researcher subjectively applies this rating and briefly shares the reasoning behind the conclusion.
Practicality
Definition: the degree to which a framework's guidance is explicit on how to conduct AI Governance in practice, e.g., specific step-by-step procedures
Operationalization: overall practicality of risk mitigation and Data Management guidance ranges from very low (1-2) to very high (9-10). A 1 offers no guidance that businesses could put into practice as is, while a 10 thoroughly lays out the governance actions businesses need to take and how to take them. A rating of 0 is possible if and only if breadth is rated as a 0. The researcher subjectively applies this rating and briefly shares the reasoning behind the conclusion.
Overall Assessments
Value Assessment Ratings
Because the numbers of risk mitigation and Data Management subcategories used to measure breadth are not 10, whereas depth and practicality are subjectively rated out of 10, the raw breadth rating is normalized to a score out of 10.
Once the breadth rating has been normalized, the mean of all three ratings is taken as the overall assessment rating for each assessment category.
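To make the calculation concrete, here is a minimal sketch with hypothetical numbers; the rounding behavior is our illustrative choice, not a rule stated in the methodology.

```python
# Overall assessment rating: the mean of normalized breadth, depth, and
# practicality, all on a 0-10 scale. The numbers below are hypothetical.
def overall_rating(breadth_normalized: float, depth: float, practicality: float) -> float:
    """Average the three criterion ratings for one assessment category."""
    return round((breadth_normalized + depth + practicality) / 3, 1)

# e.g., breadth 7.1 (5 of 7 risk subcategories), depth 6, practicality 4
print(overall_rating(7.1, 6.0, 4.0))  # 5.7
```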
Best For
Acknowledges the specific use case(s) the framework best suits. This indicator is included to convey the more nuanced value proposition of each framework outside our standardized assessments. A framework's specific aims and strengths may align with a particular business's AI Governance needs even if it rated poorly in our assessments.
Value of Supplementary Resources
This rating of low, moderate, or high roughly indicates how much value supplementary resources are estimated to add to the core framework that was assessed in full. However, the rating may be weighted heavily toward a single, especially high-value resource, as with the NIST and AWS frameworks.
Only shallow research was done into supplementary resources, including a review of their tables of contents, when available, and skim-reading a selection of relevant sections.
AI Assistant Disclaimer
AI was used to help produce this document throughout its development. Below is a brief list of specific ways in which it was used and the measures we took to maximize its reliability:
During research to find sources: requiring that sources be hyperlinked or URLs provided before reviewing the validity of all sources used
Writing: reviewing and editing written outputs for accuracy and quality of writing
Feedback on human-produced writing and ideation: requiring high reasoning transparency for how it came to its conclusions
Finding key details in AI Governance frameworks: requiring page numbers for facts extracted from documents, then verifying the accuracy of those facts
Retroactively organizing endnotes: reviewing and editing endnote details and updating endnotes based on a review of all sources used per section
Conclusion
This post concludes the AIGFC series. The series covers the need for AI Governance among businesses adopting AI before assessing seven frameworks on key criteria for effective AI Governance. The research methodology provides targeted insights on the value potential of leading AI Governance frameworks for data practitioners and C-level executives, as well as an approach for efficiently researching future AI Governance frameworks.
Businesses seeking to implement AI Governance frameworks should consider this project's findings as informative rather than prescriptive, using them as one input among many when determining which approaches best align with their specific needs.
Mac Jordan
Data Strategy Professionals Research Specialist
Mac supports Data Strategy Professionals with newsletter writing, course development, and research into Data Management trends.