School of Medicine Research using Artificial Intelligence
GUIDANCE
Purpose and Applicability
This guidance applies to UW-reviewed human subjects research involving the use of Artificial Intelligence (AI) systems when:
- It is led by School of Medicine Principal Investigators (PIs); AND,
- It involves either the targeted enrollment of UW Medicine patients, OR use of UW Medicine Data. Note that UW Medicine Data is NOT limited to clinical records and includes many types of data stored in any UW Medicine system or application.
This guidance is intended to align with the UW Medicine Policy on Use of Artificial Intelligence in the Healthcare Setting and uses the definitions of AI and UW Medicine Data from the glossary of that policy. It covers both research involving the development of AI systems and the use of AI as a tool to facilitate the administration of a research study (e.g., recruitment, safety monitoring, data analysis). Exception: This guidance does not apply to the use of AI as a tool for the administration of research when: 1) the AI tool(s) have been approved for use in clinical care and UW Medicine business operations as described in UW Medicine’s policy, and 2) they are being used for their approved purpose.
HSD has revised its interpretation of the regulatory definition of a human subject to capture some research that should be reviewed by the IRB to mitigate the risks to subjects that may result from re-identification of their data. This means that IRB review will be required for some research involving AI and the secondary use of de-identified data that did not previously require IRB review. Use the Human Subjects Research Determination worksheet to determine if your research requires IRB review.
For research covered by this guidance, the UW IRB now requires researchers to complete and submit the SUPPLEMENT Artificial Intelligence form with their IRB application. The supplement is designed to be used with this guidance to develop and describe a plan to address the risks associated with research involving the use of AI, and to provide the IRB with the information necessary to complete its review of the risk mitigation plan.
When the research will not involve data collection through interaction with research participants (e.g., it involves only the use of secondary data), the supplement may be submitted in conjunction with the shorter IRB Protocol, No Contact form. Otherwise, the standard IRB Protocol form should be used.
Context
AI systems introduce unique and evolving risks when used in human research, stemming from their complexity, scale, and unpredictability. Unlike traditional technologies, AI can produce outputs that are fabricated or difficult to interpret, or that reflect and amplify societal biases. These systems may also re-identify individuals from datasets previously considered de-identified or reveal sensitive information, raising concerns about equity, participant safety, privacy, and confidentiality. Adaptive AI, which continues to learn and evolve based on new data or interactions, introduces additional challenges, such as performance drift, unpredictable behavior, and difficulty in validating outputs over time. These characteristics can complicate informed consent, challenge participant autonomy, obscure accountability, and increase the likelihood of evolving risks that may not be foreseeable at the outset of a study.
This guidance is designed to establish a standardized, risk-based approach to the review of research involving AI that will help the IRB identify the risks in a study, determine when risks have been appropriately mitigated, and communicate the IRB’s expectations to researchers.
The approach in this guidance is largely based on the white paper A Novel, Streamlined Approach to the IRB Review of Artificial Intelligence Human Subjects Research (AI HSR) and the Multi-Regional Clinical Trials (MRCT) Center’s Framework for Review of Clinical Research Involving Artificial Intelligence. Both the white paper and the MRCT framework call for the IRB to consider the stages of AI development when determining the level of oversight and risk mitigation measures required. The guidance also draws extensively from the Taxonomy of Trustworthiness for Artificial Intelligence and the National Institute of Standards and Technology (NIST) AI Risk Management Framework, as well as relevant FDA guidance and presentations from various IRB forums.
Role of the IRB
The UW IRB ensures that research involving AI adheres to the three fundamental ethical principles described in the Belmont Report: Beneficence, Justice and Respect for Persons. These principles are applied through the established regulatory criteria for IRB approval of research. As the unique and evolving risks introduced by AI technologies present new ethical challenges, the purpose of this section is to explain how the UW IRB interprets and applies both the Belmont Principles and federal human subjects regulations in the context of AI research. The UW IRB will also use the information in the remaining sections of this document to guide its review.
Beneficence.
Regulatory Criteria:
- Risks to subjects are minimized by using procedures that are consistent with sound research design and that do not unnecessarily expose subjects to risk, and whenever appropriate, by using procedures already being performed on the subjects for diagnostic or treatment purposes.
- Risks to subjects are reasonable in relation to anticipated benefits, if any, to subjects, and the importance of the knowledge that may reasonably be expected to result.
- There is an adequate plan for monitoring the data collected to ensure the safety of subjects.
- There is an adequate plan to protect the privacy of subjects and to maintain the confidentiality of data.
IRB review:
- Evaluates whether the research plan includes sufficient measures to reduce harm that may result from AI system issues, such as inaccurate or misleading outputs, performance drift, lack of transparency in decision-making processes, the potential for overreliance on AI, limited explainability of system outputs, and the risk of AI systems retaining or disclosing information beyond their intended scope.
- Evaluates whether the potential benefits of AI research justify the risks, including the potential for AI to cause individual harm in a clinical setting and the potential benefits of AI for individuals in clinical settings and, more broadly, for the improvement and scalability of medical care.
- Evaluates whether there are sufficient data monitoring and feedback systems to detect and mitigate adverse effects that may be caused by errors, biases, or unexpected behaviors (a minimal monitoring sketch follows this list).
- Evaluates whether there are sufficient privacy and security measures built into the AI system design, testing, deployment, and operation.
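To make the monitoring expectation concrete, the sketch below shows one way a study team might watch for performance drift by comparing an adaptive system’s recent accuracy, measured against adjudicated ground truth, to the baseline established during validation. This is a minimal, hypothetical illustration: the window size, baseline, alert threshold, and the `notify_study_team` hook are placeholders, not requirements of this guidance.

```python
# Hypothetical sketch of rolling-window drift monitoring for an adaptive
# AI system. All names and thresholds are illustrative, not prescribed.

from collections import deque

WINDOW = 200               # predictions per evaluation window (illustrative)
BASELINE_ACCURACY = 0.90   # accuracy established during validation (illustrative)
ALERT_DROP = 0.05          # accuracy drop that triggers review (illustrative)

recent = deque(maxlen=WINDOW)  # rolling record of correct/incorrect outcomes

def record_outcome(prediction, ground_truth):
    """Log one adjudicated prediction and check for performance drift."""
    recent.append(prediction == ground_truth)
    if len(recent) == WINDOW:
        accuracy = sum(recent) / WINDOW
        if accuracy < BASELINE_ACCURACY - ALERT_DROP:
            notify_study_team(accuracy)  # hypothetical escalation hook

def notify_study_team(accuracy):
    # In a real study this might file a report to the monitoring committee.
    print(f"ALERT: windowed accuracy {accuracy:.2f} fell below threshold")
```

A real monitoring plan would also specify who adjudicates ground truth, how often windows are evaluated, and how alerts are reported to the study team, a monitoring committee, or the IRB.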
Justice.
Regulatory Criteria:
- Selection of subjects is equitable. In making this assessment the IRB should take into account the purposes of the research and the setting in which the research will be conducted.
IRB review:
- Evaluates whether there are sufficient measures in place to assess and mitigate computational bias (including biased input data and biased model design) so that underrepresented populations are not unfairly excluded, disproportionately affected, or disenfranchised by AI-driven decisions (a minimal subgroup-audit sketch follows this list).
- Encourages equitable distribution of research benefits and burdens across populations.
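The sketch below illustrates one simple form such a bias assessment might take: computing model accuracy separately for each demographic subgroup so that performance disparities are visible before deployment. It is a minimal, hypothetical example; the record fields and group labels are placeholders.

```python
# Hypothetical sketch of a per-subgroup performance audit. Field names
# and groups are illustrative placeholders.

from collections import defaultdict

def subgroup_accuracy(records):
    """records: iterable of dicts with 'group', 'prediction', 'truth' keys.
    Returns accuracy per subgroup so that disparities are visible."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for r in records:
        total[r["group"]] += 1
        correct[r["group"]] += int(r["prediction"] == r["truth"])
    return {g: correct[g] / total[g] for g in total}

# A large gap between subgroups would prompt further investigation of the
# training data and model design before proceeding.
audit = subgroup_accuracy([
    {"group": "A", "prediction": 1, "truth": 1},
    {"group": "B", "prediction": 0, "truth": 1},
])
print(audit)  # {'A': 1.0, 'B': 0.0}
```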
Respect for Persons.
Regulatory Criteria:
- The IRB ensures that informed consent will be obtained from each participant (or their legally authorized representative) before they participate in the research, unless the research qualifies for a waiver of the consent requirements.
- The IRB ensures that consent will be appropriately documented unless the research qualifies for a waiver of the requirement for documented consent.
IRB review:
- Verifies that consent processes are clearly documented and that waivers are justified when applicable.
- Ensures that the consent materials clearly explain the role of AI in the study and how data will be used, and describe the risks and limitations of the AI system, the safeguards that will be in place, and how any results will be returned to participants.
IRB Review and Stage of Development or Use of AI
The questions in the SUPPLEMENT Artificial Intelligence form are structured around the stage of development of the AI system and the use of AI as a tool to facilitate the administration of a research study. Breaking the review process down into the stages of development allows for a more targeted and efficient evaluation of AI-related clinical research by addressing the specific challenges and considerations at each stage. Researchers can use the information and resources provided in the Identifying and Assessing AI-Related Risks section to help them complete the supplement, and the information in the Consent Considerations section to design the consent process.
For research involving the development of an AI system, the IRB will review only one stage at a time. Researchers must submit either a new application or a study modification for each additional stage with an updated supplement.
- As defined in the table, Stage 1 and Stage 2 studies are usually eligible for expedited review. These studies do not directly impact patient healthcare or treatment and tend to involve minimal risk.
- Stage 3 studies are more likely to require review by the convened IRB, particularly when they directly impact patient healthcare or treatment.
- HSD will generally review Stage 2 and 3 studies in consultation with AI subject matter experts.
STAGE/USE | DESCRIPTION |
---|---|
Stage 1 – Discovery | This stage focuses on the conceptual and exploratory development of AI algorithms. It involves gathering and early analysis of training data to explore potential use cases. During this stage, hypotheses are built and tested through iterative algorithm building on retrospective (and sometimes prospective) datasets. The emphasis is on selecting appropriate algorithmic approaches and establishing preliminary associations to inform future development. Stage 1 research must not impact participant or patient healthcare, treatment, or clinical decision-making. Stage 1 research may not release results to the medical records, patients, or providers for clinical care purposes. |
Stage 2 – Translation | This stage involves advancing AI systems in research from conceptual development to validation, emphasizing performance testing and the identification of risks. Stage 2 research must not impact participant or patient healthcare, treatment, or clinical decision-making. |
Stage 3 – Deployment | The use of a tested and validated AI system within a research context to confirm clinical efficacy, safety, and risks. It involves clinical investigation to collect real-world evidence. Stage 3 research has the potential to impact patient healthcare or treatment. |
AI for administration of research | The use of artificial intelligence technologies to facilitate various aspects of the research process. This may include, but is not limited to, recruitment, data analysis, transcription, and patient monitoring. |
Identifying and Assessing AI-Related Risks
The table below describes the primary risks that should be considered when conducting human research involving the use of AI, questions to consider, and resources to assist in the design of a risk mitigation plan. The questions and resources are intended to aid researchers in completing the SUPPLEMENT Artificial Intelligence and the IRB in its review of the study. The relevance of the questions and the applicability of the resources will vary depending on the stage of the study or the use of the AI system.
AI Risk | Questions to consider | Relevant resources |
---|---|---|
**Accuracy and Reliability.** Accuracy refers to the degree to which a model’s outputs are correct when compared to ground truth. AI systems can suffer from accuracy issues due to flawed training data, incomplete information, and limitations in their ability to distinguish between truth and falsehood. These issues can lead to incorrect predictions, biased outputs, and even the generation of fabricated or false information (i.e., hallucinations). In adaptive AI, this variability can be influenced by changes in input data, environmental context, or internal model updates, making it difficult to ensure consistent performance. | | |
**Bias and Equity.** Bias arises in artificial intelligence models in multiple ways. The data sets used to train AI models can reflect the biases that pervade the societies and cultures that produced the data they contain. For example, generative AI models can exhibit bias by reinforcing cultural stereotypes present in their training data. In addition, the design of AI systems reflects the values, assumptions, and experiences of the decision makers responsible for their development. | | |
**Privacy and Security.** AI systems raise significant privacy concerns due to their reliance on vast amounts of personal data for training and operation. This data can be vulnerable to breaches, misuse, unauthorized access, and re-identification of seemingly de-identified data and images, potentially revealing sensitive information and leading to harm. Providing data to third-party AI services for analysis may constitute a breach of participant privacy (a minimal re-identification check is sketched after this table). | | |
**Transparency and Explainability.** Transparency in AI refers to the degree to which an AI system’s operations and decisions are clear, understandable, and accessible for review or scrutiny by users and stakeholders. Explainability refers to the ability to provide a user-friendly explanation of the reasons behind an AI system’s output (e.g., a diagnosis or prediction), to provide an understanding of its decision process. Several factors complicate AI transparency and explainability, such as the complexity of algorithms, limited visibility into training data, and the dynamic and adaptive nature of some models. | | |
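As one concrete way of assessing the re-identification risk described in the Privacy and Security row above, the sketch below performs a simple k-anonymity spot check: it finds the rarest combination of quasi-identifier values in a dataset, since combinations shared by very few individuals are the easiest to re-identify. This is a minimal, hypothetical illustration; the column names and the threshold k are assumed values, not prescribed by this guidance.

```python
# Hypothetical sketch: a k-anonymity spot check on quasi-identifiers, one
# common way to gauge re-identification risk in a dataset believed to be
# de-identified. Column names and the threshold K are illustrative.

from collections import Counter

QUASI_IDENTIFIERS = ("zip3", "birth_year", "sex")  # illustrative columns
K = 5  # each combination should describe at least K individuals

def smallest_equivalence_class(rows):
    """rows: iterable of dicts. Returns the size of the rarest combination
    of quasi-identifier values; sizes below K flag re-identification risk."""
    counts = Counter(tuple(row[c] for c in QUASI_IDENTIFIERS) for row in rows)
    return min(counts.values())

rows = [
    {"zip3": "981", "birth_year": 1980, "sex": "F"},
    {"zip3": "981", "birth_year": 1980, "sex": "F"},
    {"zip3": "982", "birth_year": 1955, "sex": "M"},
]
if smallest_equivalence_class(rows) < K:
    print("Dataset contains combinations rarer than k=5; review before use.")
```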
Consent Considerations
Informed consent in research involving artificial intelligence (AI) must address the unique risks and ethical complexities introduced by these technologies. The consent process should be tailored to the nature of the research, the stage of development, the role and function of the AI system, and the type of data used, and should account for the risks, benefits, and uncertainties specific to the AI system. As AI models evolve, the associated risks and benefits can change; it is important to consider whether these changes could affect a participant’s decision to continue in the study and whether reconsent or ongoing communication is needed.
Most AI research involving only human data does not require direct interaction with participants and uses large scale data sets. This research may qualify for a waiver of the informed consent requirement when it involves secondary use of existing data and poses no more than minimal risk of harm to subjects. In situations where consent must be obtained from research participants, the information below should be included in the consent form in addition to the required elements of consent. Refer to HSD’s Designing the Consent Process guidance for additional information about designing an informed and meaningful consent process.
- Explain the role of AI in the study
- Clearly state that AI is being used in the study and describe its role in the study (e.g., generates predictions, classifications, or decisions).
- Indicate whether the AI system is static (fixed rules and predictable performance) or adaptive (learns and evolves in real time).
- If applicable, explain whether AI outputs will be reviewed by a human before influencing decisions.
- Explain how data will be used
- Explain whether data will be reused, shared, or commercialized, and whether participants will share in any profit.
- Disclose whether data will be retained after participant withdrawal and describe the limits on removing data from the AI model, including both the data entered into the model and the data it generates.
- Describe risks and limitations
- Describe any known and potential risks and limitations. These include but are not limited to:
- Re-identification of de-identified data
- Misclassification or incorrect predictions
- Algorithmic bias and fairness
- Psychological, social, or employment impacts
- Explain that AI systems may evolve over time, and outputs may change as models are updated.
- Explain what safeguards will be in place to mitigate these risks.
- Privacy and Confidentiality
- Describe any potential privacy and confidentiality issues related to the sharing of data and use of AI.
- If applicable, describe safeguards against re-identification, especially when combining datasets.
- Explain how privacy and confidentiality will be protected (e.g., use of encryption, access controls, and secure storage).
- Return of Results
- If AI will generate clinically actionable findings, refer to HSD’s guidance on return of individual results and designing consent.
Related Materials
SUPPLEMENT Artificial Intelligence
References
- Comeau, “Collaborative Ethics: The Role of IRBs in Navigating AI Oversight in Medical Research”. [Conference presentation] Public Responsibility in Medicine & Research, AER, December 2023
- Eto, “Harmonizing Health and AI: Navigating Innovation and Ethics”, [Webinar presentation] Consortium for Applied Research Ethics Quality, February 2024
- Eto, T. (2024) Pre-Print: A Novel, Streamlined Approach to the IRB Review of Artificial Intelligence Human Subjects Research (AI HSR). Version 1. Stanford Digital Repository.
- FDA Guidance, Good Machine Learning Practice for Medical Device Development: Guiding Principles
- FDA Guidance, Software as a Medical Device: Clinical Evaluation
- FDA Guidance, Transparency for Machine Learning-Enabled Medical Devices: Guiding Principles
- Lifson, Loufek, Eto, “A Simplified IRB Review Process: AI HSR in 3 Phases”. [Conference presentation] Public Responsibility in Medicine & Research, AER, January 2025
- Multi-Regional Clinical Trials (MRCT) Center, Framework for Review of Clinical Research Involving Artificial Intelligence
- National Institute of Standards and Technology, AI Risk Management Framework
- Secretary’s Advisory Committee on Human Research Protections (SACHRP), IRB Considerations on the Use of Artificial Intelligence in Human Subjects Research, October 19, 2022
- Silverman, “IRB Review of Research Involving AI”, [Videocast Presentation] Office of Human Subjects Research Protections, Education Series, April 2024
- UC Berkeley Center for Long-Term Cybersecurity (CLTC), Taxonomy of Trustworthiness for Artificial Intelligence: Connecting Properties of Trustworthiness with Risk Management and the AI Lifecycle
- UW Medicine Policy: Use of Artificial Intelligence (AI) in the Healthcare Setting
Version Information
Version History
Version Number | Posted Date | Implementation Date | Change Notes |
---|---|---|---|
1.0 | 08.29.2025 | 08.29.2025 | Newly implemented guidance |
Keywords: Artificial Intelligence; Large Language Models; Deep Learning; Generative AI; Machine Learning.