Researchers from Weill Cornell Medicine have proposed a conceptual framework called “dynamic deployment” for conducting clinical trials of AI systems in healthcare settings, arguing that traditional approaches fail to capture the adaptive, continuously learning nature of modern large language models. The perspective piece, published in npj Digital Medicine in May 2025, outlines how medical AI systems could be evaluated as evolving, complex systems rather than static tools.
Key Points
- This perspective article presents a theoretical framework comparing traditional “linear” AI deployment models with proposed “dynamic deployment” approaches that would allow continuous learning and adaptation.
- The authors cite literature showing that more than 40% of 521 FDA-approved medical AI devices lacked clinical validation data, and that only 86 randomized controlled trials of machine learning interventions had been conducted globally as of 2024.
- The authors propose modifying existing adaptive clinical trial methodologies (used in Phase I trials for over 30 years) to potentially enable continuous monitoring and updating of AI systems during deployment.
- The framework specifically targets large language model implementations where continuous learning capabilities already exist but remain underutilized in clinical settings.
- The authors explicitly note that this framework may not be suitable for high-risk applications such as surgical robotics or fully autonomous systems, where more conservative static deployments may be required.
The proposed dynamic deployment framework represents the authors’ theoretical vision for how medical AI systems might be validated and implemented in the future, potentially bridging the gap between rapid AI development and slow clinical adoption while maintaining safety through continuous monitoring.
The Data
The authors cite the following statistics to contextualize the current state of medical AI implementation:
- A 2023 insurance claims analysis found only 16 medical AI procedures with billing codes, which the authors note indicates “minimal real-world implementation” despite thousands of published AI studies.
- Survey data show that 20% of UK physicians report using consumer generative AI tools in practice despite a lack of medical validation, highlighting what the authors see as an urgent need for proper clinical trials.
- Adaptive continual reassessment methods using Bayesian frameworks have been used successfully in oncology trials for more than 30 years, a precedent the authors suggest could inform dynamic AI deployments (a minimal sketch follows this list).
- The authors acknowledge significant infrastructure, cost, and regulatory challenges, noting most leading LLMs are proprietary systems that don’t allow parameter modification.
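For readers unfamiliar with the cited precedent, the following is a minimal sketch of a Bayesian continual reassessment method of the kind used in early-phase oncology trials. The dose skeleton, target toxicity rate, and prior below are illustrative assumptions, not values from the article.

```python
import numpy as np

# Illustrative skeleton (prior toxicity guesses per dose level) and target
# toxicity rate; these numbers are assumptions, not values from the paper.
skeleton = np.array([0.05, 0.10, 0.20, 0.35, 0.50])
target = 0.25

# One-parameter power model: p(toxicity at dose d) = skeleton[d] ** exp(a),
# with a standard normal prior on a, evaluated on a grid.
a_grid = np.linspace(-3.0, 3.0, 601)
prior = np.exp(-0.5 * a_grid**2)
prior /= prior.sum()

def posterior(doses, outcomes):
    """Posterior over a, given observed (dose index, toxicity 0/1) pairs."""
    log_lik = np.zeros_like(a_grid)
    for d, y in zip(doses, outcomes):
        p = skeleton[d] ** np.exp(a_grid)
        log_lik += np.log(p) if y else np.log(1.0 - p)
    post = prior * np.exp(log_lik - log_lik.max())
    return post / post.sum()

def recommend_dose(doses, outcomes):
    """Next dose: posterior-mean toxicity closest to the target rate."""
    post = posterior(doses, outcomes)
    p_hat = np.array([np.sum(skeleton[d] ** np.exp(a_grid) * post)
                      for d in range(len(skeleton))])
    return int(np.argmin(np.abs(p_hat - target)))

# After three patients (two without toxicity at doses 0 and 1, one with
# toxicity at dose 1), the model re-estimates and recommends the next dose.
print(recommend_dose(doses=[0, 1, 1], outcomes=[0, 0, 1]))
```

Each new (dose, outcome) observation sharpens the posterior and can shift the recommendation; this update-as-you-go logic is what the authors propose adapting from drug dosing to deployed AI systems.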
Industry Context
Dynamic deployments can be used in the context of intervention arms in AI clinical trials to facilitate comparison with control groups and estimation of the causal effect of AI system implementation.
Dr. Jacob T. Rosenthal, Lead Author, Tri-Institutional MD-PhD Program, Weill Cornell Medicine
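As a rough illustration of that point, a dynamic deployment could serve as the intervention arm of a randomized comparison. The sketch below randomizes encounters 1:1 and estimates the causal effect as a difference in outcome means; the allocation scheme, outcome scale, and synthetic data are assumptions for illustration, not the authors’ protocol.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def assign_arm() -> str:
    """Hypothetical 1:1 randomization of a patient encounter to a trial arm."""
    return "dynamic_ai" if rng.random() < 0.5 else "control"

def estimate_effect(outcomes_ai, outcomes_control):
    """Difference-in-means estimate of the causal effect of AI deployment,
    with a normal-approximation 95% confidence interval."""
    ai = np.asarray(outcomes_ai, dtype=float)
    ctrl = np.asarray(outcomes_control, dtype=float)
    diff = ai.mean() - ctrl.mean()
    se = np.sqrt(ai.var(ddof=1) / ai.size + ctrl.var(ddof=1) / ctrl.size)
    return diff, (diff - 1.96 * se, diff + 1.96 * se)

# Example with entirely synthetic outcome scores for the two arms:
ai_scores = rng.normal(0.72, 0.1, size=200)
control_scores = rng.normal(0.68, 0.1, size=200)
print(estimate_effect(ai_scores, control_scores))
```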
The implementation gap between AI research and clinical practice represents one of the healthcare technology sector’s most pressing challenges. Despite the proliferation of medical AI research, with thousands of papers published annually, the translation to patient care remains minimal. The authors’ perspective builds on established adaptive trial methodologies, particularly continual reassessment methods used in early-phase cancer trials since the 1990s.
The authors critically examine why current linear deployment models—where AI systems are developed, frozen, and deployed unchanged—may be incompatible with modern large language models that can learn from user interactions, adapt through reinforcement learning, and evolve through prompt engineering. They argue that treating AI as a static technology ignores fundamental capabilities like in-context learning and continuous adaptation that make these systems valuable. Their proposed framework envisions theoretical feedback mechanisms including patient outcome metrics from electronic health records, workflow efficiency measures, and user satisfaction surveys.
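What one cycle of such a feedback loop might look like is sketched below, assuming hypothetical metric names, thresholds, and decision rules that do not come from the paper.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class FeedbackSignal:
    outcome_score: float     # e.g., a patient outcome metric pulled from the EHR
    workflow_minutes: float  # e.g., documentation time per encounter
    satisfaction: float      # e.g., clinician survey score scaled to 0..1

def monitoring_step(signals: List[FeedbackSignal],
                    baseline_outcome: float,
                    tolerance: float = 0.05) -> str:
    """One cycle of the loop: aggregate the feedback signals, then decide
    whether to keep, update, or roll back the deployed model version."""
    mean_outcome = sum(s.outcome_score for s in signals) / len(signals)
    if mean_outcome < baseline_outcome - tolerance:
        return "rollback"  # safety degradation: revert to last validated version
    mean_satisfaction = sum(s.satisfaction for s in signals) / len(signals)
    mean_workflow = sum(s.workflow_minutes for s in signals) / len(signals)
    if mean_satisfaction < 0.5 or mean_workflow > 10.0:
        return "update"    # queue a prompt or model revision for human review
    return "keep"

# Example: three encounters' worth of synthetic feedback signals.
signals = [FeedbackSignal(0.70, 4.5, 0.8),
           FeedbackSignal(0.66, 5.0, 0.7),
           FeedbackSignal(0.71, 3.8, 0.9)]
print(monitoring_step(signals, baseline_outcome=0.68))
```

The three-way keep/update/rollback decision keeps a human review step between monitoring and any model change, consistent with the paper’s emphasis on maintaining safety through continuous monitoring.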
However, the authors acknowledge significant barriers that would need to be overcome for any practical implementation of their theoretical model. Infrastructure requirements for continuous monitoring and updating would demand substantial investment from healthcare systems already struggling with IT budgets. Privacy concerns, cybersecurity risks, and the proprietary nature of leading AI models present additional challenges. The FDA’s recent guidance on predetermined change control plans represents regulatory progress, but comprehensive frameworks remain underdeveloped. Careful oversight would be necessary to govern appropriate use of such dynamic AI systems, and these decisions would be highly context-dependent.
The perspective piece was authored by Jacob T. Rosenthal (Tri-Institutional MD-PhD Program), Ashley Beecy (Division of Cardiology, Weill Cornell Medicine), and Mert R. Sabuncu (Department of Radiology, Weill Cornell Medicine). The lead author was supported by NIH grant T32GM152349. The authors declared no competing interests.
The perspective article, “Rethinking clinical trials for medical AI with dynamic deployments of adaptive systems,” was published in npj Digital Medicine, May 2025 (DOI: 10.1038/s41746-025-01674-3).