
New research introduces a system designed to determine when predictive AI is likely to be more accurate than a clinician for a given medical case, and when it should instead defer to a human clinician. The work aims to improve safety and reliability in healthcare by clarifying the boundaries of AI usefulness and ensuring that clinicians remain integral to high-stakes interpretations. This initiative emphasizes not only enhancing AI performance but also ensuring transparent, human-centered decision-making in medical imaging tasks. The study, published in Nature Medicine and conducted in collaboration with Google Research, centers on a framework named CoDoC (Complementarity-driven Deferral-to-Clinical Workflow), which seeks to optimize the interaction between predictive AI tools and clinicians to achieve the most accurate interpretations possible. The overarching goal is to identify the contexts in which AI can contribute meaningfully to diagnosis and when it should yield to human judgment, thereby strengthening the accuracy and safety of patient care.

CoDoC: Concept, aims, and significance

CoDoC represents a deliberate shift in how clinical AI tools are integrated into practice. Rather than expecting AI to replace clinician expertise, CoDoC is designed to harness the complementary strengths of both artificial intelligence and human expertise. By focusing on the critical question of when predictive AI is more reliable than a clinician and when the converse is true, the system supports better decision-making in high-stakes imaging scenarios. This approach aligns with a growing recognition in medicine that AI can offer powerful pattern recognition and rapid analysis, while clinicians provide nuanced interpretation, contextual understanding, and ethical judgment that machines alone cannot replicate.

The central aim of CoDoC is to build an AI-enabled workflow that knows when it does not know—that is, when it should defer to a human expert to ensure the highest possible accuracy. In practice, this involves integrating an AI model with a mechanism that assesses confidence, interprets qualitative signals from clinicians, and uses a ground-truth reference to guide decision-making. The result is a collaborative pipeline in which the AI tool and the clinician contribute to the final interpretation, with the system actively guiding when one partner should take the lead. By making this collaboration explicit, CoDoC seeks to improve the reliability of predictions in real-world settings where patient outcomes depend on precise imaging analysis.

In methodological terms, CoDoC is designed to be accessible to a broad range of healthcare providers. The researchers emphasize three foundational criteria that guide its development. First, the system should be deployable by non-machine-learning experts, such as frontline clinicians or radiology staff, on a single standard computer. This criterion is rooted in the practical realities of healthcare settings where specialized engineers may not be available for every site. Second, the training requirements should be modest, typically involving only a few hundred labeled examples. This constraint makes it feasible for institutions with limited data resources to adopt the approach without engaging in large-scale data collection efforts. Third, CoDoC is intended to be compatible with any proprietary AI model and should not require access to the underlying model’s internals or the data on which it was trained. In other words, it can be layered onto existing AI tools without necessitating modification of those tools, preserving the integrity and commercial viability of third-party models.

These guiding principles reflect a commitment to practical applicability, safety, and transparency. By prioritizing ease of deployment, minimal data needs, and compatibility with a wide range of AI systems, CoDoC seeks to lower barriers to adoption and enable healthcare providers to improve the trustworthiness and effectiveness of AI-assisted diagnostics. The emphasis on complementarity—not competition between humans and machines—underpins the theoretical rationale for the framework. The researchers argue that the most accurate interpretations often emerge when the strengths of AI (e.g., rapid pattern recognition across large datasets) are combined with human capabilities (e.g., contextual reasoning, knowledge of patient history, and clinical judgment). The ultimate objective is to deliver a clinically meaningful improvement in diagnostic accuracy while preserving clinician oversight.

The study’s collaboration with multiple healthcare organizations, including the United Nations Office for Project Services’ Stop TB Partnership, underscores the broad relevance of this approach. The partnership signals a shared interest in improving the transparency and safety of AI models used in real-world clinical settings, particularly in domains where accurate imaging interpretation is critical for patient outcomes. The open-source release of CoDoC’s code is highlighted as a mechanism to accelerate progress in the field by enabling researchers and healthcare providers to build on the work and adapt it to diverse contexts. The combination of cross-organizational collaboration and open-source sharing reflects a strategy aimed at advancing the reliability and safety of AI in healthcare across different systems and geographies.

In sum, CoDoC embodies a pragmatic, actionable approach to integrating AI into clinical workflows. It is designed not to supplant clinicians but to augment their capabilities by providing a controlled, transparent mechanism for deferring to human expertise when it enhances accuracy. The framework is anchored in three practical criteria that make it implementable in real-world clinical environments, ensuring that improvements in AI-assisted interpretation are aligned with the needs and constraints of day-to-day healthcare delivery. The promise of CoDoC lies in its potential to reduce diagnostic errors, minimize unnecessary interventions, and promote safer, more trustworthy deployment of predictive AI tools in medicine.

How CoDoC works: inputs, learning signals, and deferral logic

At the heart of CoDoC is a straightforward, yet powerful, concept: for each case, the system is designed to determine whether the predictive AI tool or the clinician should provide the final interpretation in order to achieve the highest accuracy. This notion hinges on the idea that predictive AI systems carry confidence signals about their own assessments, while clinicians bring interpretive wisdom grounded in clinical experience and patient context. CoDoC formalizes these elements into a practical framework that can be trained with a small set of data and then applied to new cases without requiring access to the internal workings of the predictive model.

For each case in the training dataset, CoDoC requires three specific pieces of information. First, the predictive AI tool outputs a confidence score that captures its assessment of the presence or absence of disease. This confidence score is normalized to a range from 0 to 1, where 0 indicates high confidence that no disease is present, and 1 indicates strong confidence that disease is present. This signal provides a quantitative measure of the AI’s certainty about its predictions, which CoDoC uses to gauge when to trust the machine and when to seek human input.

Second, CoDoC takes into account the clinician’s interpretation of the medical image. This input represents the human expert’s judgment, which may reflect radiological findings, contextual clues, and other clinical considerations that influence final decision-making. The clinician’s interpretation serves as a real-world predictor of how a human expert would assess the same imaging data, and it is used in tandem with the AI confidence signal to learn the complementary dynamics between machine and human judgment.

Third, CoDoC relies on ground truth indicating whether disease was truly present, as established through definitive follow-up methods such as biopsy or subsequent clinical outcomes. This ground-truth information provides the objective standard against which both AI predictions and clinician interpretations are evaluated. Importantly, CoDoC operates with the notable caveat that it does not require access to the raw medical images themselves to function, a design choice that can enhance privacy, simplify data handling, and broaden applicability across settings where image-sharing constraints exist.
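To make these three training signals concrete, the following minimal sketch shows one way a per-case training record could be represented. The class and field names are illustrative assumptions made for this article, not identifiers from the released CoDoC codebase.

```python
from dataclasses import dataclass

@dataclass
class CoDoCTrainingCase:
    """One labeled case used to learn the deferral policy (illustrative only)."""
    ai_confidence: float      # predictive AI output, normalized to [0, 1]
    clinician_positive: bool  # clinician's read: disease judged present?
    ground_truth: bool        # definitive outcome, e.g. biopsy or clinical follow-up

# Example: the AI leans towards disease, the clinician read the image as
# negative, and follow-up confirmed that no disease was present.
case = CoDoCTrainingCase(ai_confidence=0.72, clinician_positive=False, ground_truth=False)
```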

The learning objective for CoDoC is to calibrate a deferral rule that maximizes diagnostic accuracy by leveraging the combined strengths of AI and human expertise. Rather than aiming to minimize AI errors in isolation or maximize clinician input in all cases, the system learns when deferring to the clinician is likely to yield a more accurate interpretation and when the AI’s reading is sufficiently reliable to stand on its own. This decision-making process is grounded in empirical signals—the AI confidence score, the clinician’s read, and the ground-truth outcome—that together inform a principled deferral strategy.

To make the deferral decision practical in clinical contexts, CoDoC translates these inputs into a simple, usable rule set. The system evaluates the AI’s confidence level in conjunction with the clinician’s interpretation and cross-references them with known outcomes within the training data. In this way, it learns patterns such as: high AI confidence with concordant clinician interpretation may favor automated final decisions; high AI confidence but discordant clinician input might prompt a deferral to the clinician depending on the historical accuracy of each modality for similar cases; low AI confidence generally favors clinician involvement. The result is a workflow where deferral decisions are not arbitrary but are guided by data-driven insights into which pathway historically yields better results for specific imaging scenarios.
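As a rough illustration of the kind of data-driven deferral rule described above, the sketch below grid-searches a confidence band: cases whose AI confidence falls inside the band are deferred to the clinician, while cases outside it keep the AI's reading. This is a deliberately simplified stand-in for CoDoC's actual learning procedure; the thresholding scheme, function names, and grid are assumptions, and the records are the illustrative ones sketched earlier.

```python
from itertools import product

def accuracy_with_deferral(cases, low, high):
    """Accuracy when cases with AI confidence inside (low, high) are deferred
    to the clinician and all other cases keep the AI's reading."""
    correct = 0
    for c in cases:  # cases: records shaped like CoDoCTrainingCase above
        if low < c.ai_confidence < high:
            decision = c.clinician_positive        # defer to the clinician
        else:
            decision = c.ai_confidence >= 0.5      # keep the AI's reading
        correct += (decision == c.ground_truth)
    return correct / len(cases)

def fit_deferral_band(cases, grid_steps=21):
    """Grid-search a (low, high) deferral band that maximizes accuracy
    on the labeled training cases."""
    grid = [i / (grid_steps - 1) for i in range(grid_steps)]
    candidates = [(lo, hi) for lo, hi in product(grid, grid) if lo <= hi]
    return max(candidates, key=lambda band: accuracy_with_deferral(cases, *band))
```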

A key design feature is the system’s ability to operate without requiring access to the medical images themselves. This aspect has meaningful implications for privacy, data governance, and interoperability across different institutions and electronic health record configurations. By decoupling the deferral mechanism from direct access to imaging data, CoDoC can be layered onto a range of AI tools and clinical workflows without necessitating image-sharing pipelines or workflow modifications that could introduce additional risk or complexity.

The evaluation of CoDoC’s performance centers on its impact on diagnostic accuracy, particularly in terms of false positives and true positives. In reported results, the approach led to a notable reduction in false positives—by 25%—on a large, de-identified UK mammography dataset when compared with conventional clinical workflows. Crucially, this improvement did not come at the cost of missing true positives; the system maintained or improved sensitivity in the tested scenarios. This balance between reducing unnecessary positive calls and preserving diagnostic sensitivity is central to the practical value of CoDoC, as it translates into fewer unnecessary follow-up procedures, reduced patient anxiety, and more efficient use of clinical resources, while preserving the ability to identify actual disease when present.

In practice, deploying CoDoC involves three primary inputs per case in the training data: the AI-provided confidence score, the clinician’s interpretation, and the ground-truth outcome. The simplicity of this structure is deliberate, designed to facilitate adoption in settings where resources are constrained. The training data do not require access to raw imaging files, which lowers the barrier to use and supports broader implementation across diverse healthcare environments. The resulting deferral policy can then be applied to new cases in real time, guiding when to rely on the AI or defer to a clinician to maximize diagnostic accuracy. The practical implication is a more reliable, collaborative workflow that respects the complementary strengths of AI and human decision-makers, while also aligning with privacy and data governance constraints.
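At inference time, a learned policy of this kind reduces to a simple routing function applied to each new case's confidence score. The snippet below is again only an illustrative sketch with assumed threshold values, not the behavior of the published system.

```python
def route_case(ai_confidence, low=0.15, high=0.85):
    """Route a new case given its AI confidence score (thresholds are illustrative)."""
    if low < ai_confidence < high:
        return "defer_to_clinician"
    return "ai_final_read_positive" if ai_confidence >= 0.5 else "ai_final_read_negative"

# A borderline score falls inside the deferral band and goes to the clinician.
assert route_case(0.64) == "defer_to_clinician"
# A confident negative stays with the AI.
assert route_case(0.03) == "ai_final_read_negative"
```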

The overall architecture emphasizes ease of integration, minimal data requirements, and broad compatibility. CoDoC is designed to function as an add-on layer that can be implemented atop existing AI tools, even if those tools operate as black boxes. This modularity ensures that institutions can benefit from improved performance without having to overhaul their current AI investments or disclose proprietary model details. By prioritizing a user-friendly deployment model, the framework aims to be accessible to clinicians and healthcare providers who may not have extensive machine-learning expertise, thereby promoting safer and more transparent AI-assisted care across a range of imaging applications.

Practical deployment and user-centered considerations in clinical settings

The practical deployment of CoDoC in real-world clinics demands careful attention to workflow integration, user experience, and operational constraints. A core premise of the approach is that non-machine-learning experts should be able to set up and run the system on common hardware. This requirement covers radiology departments, primary care imaging facilities, and specialty clinics where clinicians routinely interpret imaging studies. By removing the need for specialized infrastructure or expert technicians, CoDoC aims to democratize access to improved AI-assisted decision-making, enabling a broader range of institutions to benefit from advances in predictive analytics.

Moreover, the modest data requirement—typically a few hundred labeled examples—is crucial for practical uptake. Many healthcare providers face challenges in acquiring large labeled datasets due to privacy restrictions, data governance policies, and the costs associated with manual annotation. The design of CoDoC acknowledges these realities and offers a feasible training pathway that can be adapted to the specific imaging modality and disease domain of a given clinical setting. This adaptability supports use across various imaging tasks beyond mammography, including chest imaging and other radiology workflows, where similar deferral dynamics could prove beneficial.

Compatibility with proprietary AI models is another salient feature. In many healthcare environments, AI tools come from multiple vendors, each with its own data processing pipelines and internal architectures. Because CoDoC does not need access to model internals or training data, it offers a practical workaround that allows healthcare providers to leverage existing AI investments while adding a safety-focused deferral mechanism. This approach reduces vendor lock-in risk and preserves the ability to adopt diverse AI solutions without compromising the integrity or safety of the clinical workflow. It also helps manufacturers maintain their innovation pathways, as CoDoC does not demand invasive changes to established AI systems.

From a user experience perspective, the success of CoDoC depends on the clarity and interpretability of the deferral cues. Clinicians must understand when and why the system is deferring a case to them, and they should be able to trust that the deferral decisions align with observed performance in similar cases. The design thus emphasizes transparent reasoning signals, clear audit trails, and straightforward explanations that clinicians can assess within their routine workflow. Practically, this includes documentation of training data characteristics, deferral thresholds, and the historical performance metrics that underpin the deferral decisions. A robust auditing framework contributes to ongoing safety improvements and fosters clinician trust in AI-assisted care.

Another important consideration is privacy and data governance. Because CoDoC operates without requiring access to raw imaging data, it reduces direct exposure to patient images in the deferral process. This privacy-preserving aspect can be particularly valuable in multi-institution collaborations and research contexts where data-sharing restrictions are stringent. However, when training and validating the model, institutions must still handle imaging-derived data and annotations responsibly, following applicable regulatory and ethical guidelines. The overall approach is designed to minimize data exposure without compromising the quality and relevance of the training signals that guide the deferral policy.

In terms of clinical impact, the potential benefits of CoDoC extend beyond accuracy alone. By reducing false positives, the approach can lower unnecessary follow-up imaging, biopsies, and related patient anxiety, while preserving the ability to detect true disease promptly. This balance can translate into improved patient experiences, better allocation of radiology resources, and more efficient workflows. The approach also aligns with broader healthcare goals of enhancing patient safety, reducing diagnostic variability, and promoting evidence-based use of AI technologies in medicine.

As with any AI-enabled clinical tool, practical deployment will require ongoing monitoring, validation, and governance. Institutions adopting CoDoC should plan for continuous performance assessment, updating deferral strategies as new data become available and as clinical practices evolve. Regular reassessment helps ensure that the system continues to function as intended, maintaining alignment with patient safety standards, regulatory expectations, and the evolving needs of clinicians and patients alike.

Collaboration, transparency, and open-source contribution

A distinctive aspect of the CoDoC project is its emphasis on collaboration and openness. The effort features active collaboration with several healthcare organizations, illustrating a commitment to translating theoretical insights into practical improvements in patient care. The partnership with the Stop TB Partnership, under the United Nations Office for Project Services, highlights a shared priority of strengthening AI transparency and safety in real-world health initiatives. This collaboration signals the potential applicability of the CoDoC framework to global health challenges where accurate imaging interpretation can influence treatment decisions and disease control strategies.

In addition to the collaborative ethos, the project undertakes an open-source release of CoDoC’s code. By making the codebase publicly available, the researchers invite other teams to inspect, validate, and extend the work. Open sourcing supports rigorous scrutiny, reproducibility, and collaborative innovation—critical elements for advancing AI safety and reliability in healthcare. Researchers, clinicians, and developers can adapt CoDoC to new imaging modalities, disease areas, and deployment environments, accelerating the maturation of human-AI collaboration in medicine. The open-source model also fosters a community-driven approach to addressing practical deployment challenges, such as data privacy considerations, regulatory compliance, and integration with diverse clinical workflows.

This openness aligns with the broader movement toward transparent AI in healthcare, where stakeholders seek to understand not only how models perform but also how decisions are made and what data influence outcomes. By sharing the code, the team supports benchmarking, independent validation, and iterative improvements within a diverse ecosystem of researchers and practitioners. The collaborative and open nature of the project is intended to accelerate the development of safer, more reliable AI tools in medicine, ultimately benefiting patients and clinicians who rely on imaging-based diagnoses.

Methodological attributes: training, evaluation, and safety considerations

CoDoC’s methodological backbone emphasizes safety, practicality, and generalizability. The training process utilizes three inputs per case in the training dataset and relies on the relationship among AI confidence, clinician interpretation, and ground-truth outcome to characterize when deferral to a clinician is likely to enhance accuracy. The low data requirement—typically a few hundred labeled cases—reflects a design that prioritizes feasibility in diverse healthcare settings where data availability may be constrained. This approach reduces the burden on institutions seeking to adopt the framework while maintaining the integrity of the deferral mechanism.

Evaluation of CoDoC centers on real-world performance improvements, including the reduction of false positives while preserving true positives. In the reported mammography scenario, the framework achieved a 25% reduction in false positives compared with traditional clinical workflows, without missing any true positives. This result demonstrates the potential for meaningful improvements in diagnostic accuracy through human-AI collaboration, with direct implications for patient safety, resource utilization, and clinical efficiency. The case study underscores the value of grounding AI-assisted decisions in empirically derived deferral rules that reflect observed performance across cases that involve both AI predictions and human interpretations.
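The evaluation quantities discussed here, false positives and true positives, can be computed from final reads and ground-truth labels as in the short sketch below. The toy numbers are invented purely to show the bookkeeping and bear no relation to the study's reported results.

```python
def confusion_counts(decisions, ground_truth):
    """Count false positives and true positives for a set of final reads."""
    fp = sum(d and not g for d, g in zip(decisions, ground_truth))
    tp = sum(d and g for d, g in zip(decisions, ground_truth))
    return fp, tp

# Toy data for illustration only (not the study's data or results).
truth          = [True, False, False, True, False, False]
baseline_reads = [True, True,  True,  True, False, True ]   # 3 false positives
codoc_reads    = [True, True,  False, True, False, False]   # 1 false positive

baseline_fp, baseline_tp = confusion_counts(baseline_reads, truth)
codoc_fp, codoc_tp = confusion_counts(codoc_reads, truth)

# Safety check mirroring the reported criterion: no true positives may be lost.
assert codoc_tp >= baseline_tp
fp_reduction = 1 - codoc_fp / baseline_fp  # fraction of false positives removed
```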

An essential safety aspect is the absence of reliance on raw imaging data by CoDoC itself. This design choice helps mitigate privacy concerns and reduces the complexity of data-sharing arrangements that can impede broader adoption. By not requiring access to the original images or the AI model internals, CoDoC supports a more flexible integration path, enabling healthcare providers to leverage existing tools while introducing a governance-friendly safety mechanism. The approach thereby aligns with healthcare institutions’ needs to balance innovation with patient privacy, regulatory compliance, and risk management.

From a governance perspective, deploying a deferral-based system calls for clear accountability and traceability. Clinicians must be able to understand the circumstances under which CoDoC defers a decision and should have accessible documentation that explains the deferral criteria and historical performance data. Maintaining auditable records of deferral decisions is critical for ongoing quality improvement, regulatory review, and patient safety. The method’s transparency supports clinician confidence and patient trust by clarifying how AI guidance interacts with human oversight and how final interpretations are derived.

The method also invites future enhancements in several directions. Researchers may explore expanding the input modalities to other imaging domains, such as chest radiographs, CT or MRI studies, or even non-imaging clinical data streams. Additional work could investigate dynamic deferral strategies that adapt to changing clinical contexts, patient populations, and disease prevalence. There is also the potential to refine the confidence calibration of AI tools and the interpretation framework used by clinicians, incorporating more nuanced descriptors of uncertainty and more granular scoring systems. While the current design focuses on practicality and broad compatibility, ongoing research can broaden the applicability and robustness of the deferral mechanism in diverse healthcare environments.

Case study focus: mammography and potential implications for breast cancer screening

The practical demonstration of CoDoC in a UK mammography dataset provides a concrete illustration of how the framework can influence diagnostic workflows. In this scenario, the system achieved a reduction in false positives by 25% relative to commonly used clinical workflows. This improvement is clinically meaningful because false positives in mammography can lead to unnecessary biopsies, anxiety, additional imaging, and increased healthcare costs. By reducing these unnecessary follow-ups without compromising the detection of true disease, CoDoC demonstrates the potential for more efficient, patient-centered screening processes.

The UK mammography dataset used in the evaluation was large and de-identified, reflecting real-world conditions in which privacy and data protection remain paramount. The goal was to assess how the deferral mechanism operates when confronted with a representative but anonymized set of images and predictions. The outcome—fewer false alarms without missing cancers—speaks to the practical value of the approach in a screening context. If such gains can be replicated across other populations and imaging modalities, CoDoC could contribute to more targeted screening programs, improved patient experiences, and optimized allocation of radiology resources.

It is important to interpret these results within the scope of the study. While the 25% reduction in false positives is a notable achievement, clinicians and administrators should consider context-dependent factors when generalizing to other settings. Variables such as image quality, prevalence of disease, interpretive standards, local protocols, and the availability of radiologists can influence the effectiveness of any AI-assisted workflow. The CoDoC framework is designed to be adaptable, allowing institutions to calibrate deferral decisions to their specific practice patterns and patient populations. Future work could involve validating the approach across broader data collections, diverse clinical settings, and additional imaging tasks to establish robust, generalizable benefits.

Beyond the immediate clinical implications, the mammography case study highlights broader themes around AI-human collaboration. It demonstrates that thoughtfully designed AI-augmented workflows can reduce error rates while preserving essential human oversight. The success of CoDoC in this context reinforces the view that healthcare systems can benefit from models that explicitly account for the complementary strengths of machines and clinicians. The approach aligns with a broader movement toward integrated decision support that respects clinical expertise and prioritizes patient safety above all.

Open questions, limitations, and pathways for future work

Like any innovative approach, CoDoC raises important questions about applicability, scalability, and long-term impact. One area for further exploration concerns the generalizability of the three-input training schema across different diseases, imaging modalities, and clinical settings. While the design is intentionally minimal and versatile, the degree to which the same training framework will yield consistent improvements outside the mammography context remains an empirical question. Future studies could investigate domain-specific adjustments to the deferral rule and whether certain disease categories or imaging techniques benefit more from defer-to-clinician strategies than others.

Another line of inquiry involves the integration with other decision-support tools and clinical workflows. How CoDoC interacts with multidisciplinary teams, tumor boards, and follow-up protocols could affect its real-world impact. The system’s deferral decisions may need to harmonize with existing workflows, such as second-opinion processes, tumor-board discussions, and imaging-based triage protocols. Understanding how CoDoC harmonizes with these processes will be essential for maximizing its practical benefits without adding complexity to clinicians’ routines.

The privacy-preserving design of CoDoC invites further examination of data governance implications. While the tool itself does not require access to medical images, the training and validation stages involve labeled data that may be derived from sensitive sources. Establishing robust data handling procedures, consent frameworks, and secure data-sharing practices will be important as institutions consider adopting the framework across networks or collaborations. The approach therefore sits at the intersection of clinical innovation, data protection, and ethical practice, requiring thoughtful governance structures to sustain safe deployment.

A related question concerns the long-term sustainability of open-source contributions in the medical AI domain. Maintaining code quality, addressing security concerns, and ensuring compatibility with evolving proprietary AI models will demand ongoing community engagement and stewardship. The open-source nature of CoDoC invites a collaborative ecosystem, but it also places responsibilities on contributors and users to maintain high standards of security, documentation, and verifiability. Active maintenance, transparent governance, and rigorous validation will be necessary to realize the full benefits of the open-source model.

From a clinical safety perspective, ongoing monitoring and post-deployment surveillance are essential. The deferral mechanism should be continuously evaluated to detect any drift in performance, bias, or unintended consequences across patient populations. Establishing feedback loops that capture clinician experiences, diagnostic outcomes, and patient-reported impacts will help ensure that the system remains aligned with safety and quality objectives. The integration of such feedback mechanisms supports iterative improvements and helps sustain trust among clinicians and patients.

In terms of research directions, expanding the scope of CoDoC to include other modalities and diseases could amplify its impact. Investigations into chest imaging for infectious diseases, cardiovascular imaging, or neurologic imaging could reveal additional opportunities for gains in diagnostic accuracy through coordinated human-AI decision-making. Moreover, combining CoDoC with other forms of decision support—such as image-quality assessment tools, risk prediction models, or treatment planning aids—could lead to more holistic, patient-centered workflows that integrate multiple streams of AI-derived insights with human expertise.

Broader implications for clinical practice, policy, and education

The development of CoDoC carries implications for how healthcare systems approach AI adoption, clinical governance, and professional training. By emphasizing complementarity and deferral to clinicians in appropriate contexts, the framework reinforces a philosophy in which AI serves as a valuable ally rather than a replacement for human judgment. This stance can influence policy discussions surrounding AI deployment in medicine, particularly with respect to safety regulations, validation requirements, and the delineation of responsibilities among AI developers, healthcare institutions, and individual clinicians.

For educators and trainees, CoDoC highlights the importance of cultivating skills in human-AI collaboration. Medical education and continuing professional development programs may incorporate modules focused on interpreting AI outputs, understanding when AI guidance should be trusted, and articulating the rationale for deferral decisions. Training clinicians to engage effectively with AI-enabled workflows can enhance acceptance, trust, and confidence in AI-assisted care, ultimately supporting better patient outcomes.

From a policy perspective, the CoDoC framework offers a model for how deferral-based decision-support systems can be evaluated, regulated, and governed. Regulators and accrediting bodies may consider criteria for validating the safety and effectiveness of such systems, including transparency around deferral rules, performance benchmarks for AI and human contributors, and mechanisms to monitor and address potential biases. The balance between patient privacy, data sharing, and the need for robust validation will continue to shape the regulatory landscape as AI-enabled imaging tools become more prevalent.

Clinically, the CoDoC approach has the potential to reduce practice variation by standardizing when AI assistance is applied and when human review is required. Consistency in decision-making can contribute to more predictable diagnostic timelines, improved quality of care, and streamlined workflows. However, it is important to preserve clinical autonomy and avoid rigid one-size-fits-all rules. The design should allow clinicians to exercise judgment and override deferral when necessary, ensuring that patient-specific factors and clinical context remain central to the diagnostic process.

In terms of international impact, the collaboration with global health partners and the open-source dissemination of CoDoC can spur cross-border adoption and adaptation. Different health systems bring diverse imaging practices, regulatory standards, and data governance regimes. The flexibility of CoDoC to operate without access to raw images and its compatibility with various proprietary AI models can facilitate broader uptake, while still requiring context-sensitive validation and local governance to ensure patient safety and quality of care.

Conclusion

The CoDoC framework embodies a principled approach to advancing AI-assisted medical imaging by prioritizing complementary human-AI collaboration over unilateral automation. Through a simple, data-driven deferral mechanism, the system learns when predictive AI should be trusted to interpret an image and when a clinician’s expertise should take the lead to maximize accuracy. The demonstrated potential to reduce false positives by a meaningful margin in a large mammography dataset, without sacrificing the detection of true positives, signals a practical path toward safer, more efficient diagnostic workflows. The framework’s emphasis on accessibility for non-experts, modest training data requirements, and compatibility with existing proprietary AI models positions it as a realistic option for diverse healthcare environments seeking to augment care without overhauling established systems.

CoDoC’s collaboration with global health partners and its open-source release further underscore a commitment to transparency, safety, and collective advancement in AI-enabled medicine. By providing a tool that clinicians can deploy on a single computer, train with a few hundred examples, and integrate with a range of AI tools without accessing their internal mechanisms, CoDoC lowers barriers to adoption while elevating the reliability of diagnostic decisions. The approach aligns with broader goals to reduce unnecessary interventions, promote patient safety, and optimize clinical resources, all while respecting privacy and governance considerations.

Looking ahead, CoDoC invites ongoing validation across diverse diseases and imaging modalities, continuous refinement of deferral policies, and deeper integration with broader decision-support ecosystems. As healthcare systems increasingly rely on AI-enabled tools, the ability to transparently balance machine insight with human expertise will be essential. The CoDoC framework stands as a thoughtful, practical model for achieving safer, more accurate, and ethically grounded AI-assisted care, ultimately contributing to better patient outcomes and higher confidence in predictive analytics in medicine.