Researchers in China are warning of a "model collapse" that threatens the reliability of artificial intelligence systems used in security and defense. Left unmanaged, the phenomenon could reduce the accuracy of strategic assessments and undermine decision-making during military engagements. Experts warn that defense capabilities will remain vulnerable until training methodologies are adapted, with repercussions likely to extend beyond China's borders as the integration of AI into military operations accelerates globally.
Model collapse occurs when models are retrained on their own synthetic outputs over successive generations. The outputs become progressively more homogeneous, biased, and less accurate, which can compromise critical functions from situational analysis to decision support. As military and security applications of AI grow, this drift makes system reliability a central operational concern, especially when decision chains rely heavily on algorithmic outputs.
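The mechanism can be illustrated with a toy simulation. The sketch below is a hypothetical illustration, not any deployed system: a simple categorical "model" over a token vocabulary is refit, generation after generation, only to samples of its own output.

```python
import random
from collections import Counter

def retrain(model, n_samples, rng):
    """'Train' a new categorical model by estimating token frequencies
    from draws of the previous model's own output."""
    tokens = list(model)
    weights = [model[t] for t in tokens]
    draws = rng.choices(tokens, weights=weights, k=n_samples)
    counts = Counter(draws)
    return {t: c / n_samples for t, c in counts.items()}

rng = random.Random(42)
# Generation 0: a vocabulary of 100 tokens with a Zipf-like long tail.
total = sum(1 / (i + 1) for i in range(100))
model = {i: (1 / (i + 1)) / total for i in range(100)}

support = [len(model)]
for _ in range(30):
    # Each generation is fit only to the previous generation's samples.
    model = retrain(model, n_samples=200, rng=rng)
    support.append(len(model))

# Tokens that fail to appear in a finite sample get probability zero and
# can never return: the vocabulary only shrinks, generation after
# generation, and the model's outputs grow more homogeneous.
print("surviving tokens:", support[0], "->", support[-1])
```

The point of the toy is the one-way door: because each generation samples only from the previous generation's support, rare tokens that miss a sampling round are lost permanently, which mirrors the homogenization described above.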
According to assessments cited by specialists, the deepening integration of AI amplifies the impact of any failure linked to this phenomenon. The assessments describe a window of vulnerability while research and development teams adapt their datasets and training protocols to limit self-reference and maintain consistent performance. The consequences would be measured primarily in the accuracy of systems and the robustness of assessments, with a heightened risk of errors during engagements if degradation is not detected in time.
Studies presented at the ICLR 2024 conference by Quentin Bertrand and his co-authors showed that retraining can remain stable if the initial model is sufficiently mature and a significant portion of the training data remains real. However, the authors emphasized that below a certain threshold of real data, generative models risk collapse, producing increasingly poor and inaccurate outputs. These findings provide concrete guardrails for composing the mixed real-and-synthetic datasets used in iterative training.
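The threshold effect described above can be sketched in the same toy categorical setting by mixing a fixed share of draws from the original "real" distribution into every retraining round. The `real_fraction` knob and the setup are illustrative assumptions for this sketch, not the paper's actual experiment:

```python
import random
from collections import Counter

def run(real_fraction, generations=30, n=200, seed=0):
    """Iteratively refit a categorical model, mixing a fixed share of
    draws from the original 'real' distribution into every round."""
    rng = random.Random(seed)
    total = sum(1 / (i + 1) for i in range(100))
    real = {i: (1 / (i + 1)) / total for i in range(100)}
    model = dict(real)
    support = [len(model)]
    for _ in range(generations):
        n_real = int(n * real_fraction)
        # Part of each batch comes from the fixed real distribution,
        # the rest from the current model's own output.
        draws = rng.choices(list(real), weights=list(real.values()), k=n_real)
        draws += rng.choices(list(model), weights=list(model.values()), k=n - n_real)
        counts = Counter(draws)
        model = {t: c / n for t, c in counts.items()}
        support.append(len(model))
    return support

# With no real data, lost tokens are gone for good; with a substantial
# real share, rare tokens can be re-seeded from the real distribution.
print("0% real, final support:", run(0.0)[-1])
print("50% real, final support:", run(0.5)[-1])
```

The qualitative behavior matches the stabilization claim: the real-data share acts as a re-seeding mechanism, so the collapse dynamic is no longer an absorbing process.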
In a similar vein, an article published in the journal Nature concludes that recursive training on uncontrolled synthetic data can lead to the collapse of generative models. Detecting artificial content within datasets becomes harder as such data spreads widely. Without robust countermeasures, models risk reinforcing their own degraded synthetic styles. For defense applications, dataset quality and the traceability of training data therefore become critical technical parameters.
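One minimal way to make such traceability operational is provenance metadata on each training record plus an explicit budget on the synthetic share per batch. The sketch below is a hypothetical illustration; the `Record` type, its field names, and the 50% threshold are assumptions, not a description of any real pipeline:

```python
from dataclasses import dataclass

@dataclass
class Record:
    text: str
    provenance: str  # assumed label: "real" or "synthetic"

def synthetic_share(batch):
    """Fraction of records in a batch flagged as synthetic."""
    return sum(r.provenance == "synthetic" for r in batch) / len(batch)

def accept_batch(batch, max_synthetic=0.5):
    """Reject training batches whose synthetic share exceeds the budget,
    keeping recursive self-training within an explicit, auditable limit."""
    share = synthetic_share(batch)
    return share <= max_synthetic, share

batch = [
    Record("sensor log A", "real"),
    Record("generated summary", "synthetic"),
    Record("sensor log B", "real"),
    Record("sensor log C", "real"),
]
ok, share = accept_batch(batch)
print(ok, share)  # True 0.25
```

The hard part in practice is of course the provenance labels themselves; a gate like this is only as reliable as the detection or documentation of synthetic content feeding it.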
The reliability of models is also a source of concern beyond China. Researchers at Apple indicated last weekend that advanced reasoning models exhibit "fundamental limitations", with accuracy collapsing completely on highly complex problems. They observed that as this complexity threshold approached, some models reduced their reasoning effort, a behavior the researchers found particularly concerning. These results intensify the debate over the actual robustness of current approaches as task complexity increases sharply.
In a related development, Anthropic unveiled this week the Glasswing project, aimed at strengthening AI-assisted defenses before they are overwhelmed by attackers equipped with advanced models. Partners including Amazon Web Services, Apple, Cisco, Google, and Microsoft have access to Claude Mythos Preview, an unreleased model that has already identified thousands of vulnerabilities, including in every major operating system and browser. Anthropic has informed U.S. officials, including the Cybersecurity and Infrastructure Security Agency and the National Institute of Standards and Technology's AI standards center.
In the United States, the Central Intelligence Agency recently used artificial intelligence to produce an intelligence report, a first for the agency. The agency says it tested 300 AI projects in 2025. "In the coming years, we will have AI partners integrated into all CIA analytical platforms," said Michael Ellis. The CIA's Cyber Intelligence Center will oversee the implementation and development of these capabilities.
Meanwhile, the U.S. Department of Defense has confirmed agreements with seven major AI companies to integrate their technologies into classified military networks. The Pentagon justifies this approach by the need to rapidly improve national security capabilities, operational efficiency, and decision-making. Anthropic is excluded from these agreements and remains at odds with the department, fueling debates on the balance between risk control, solution diversity, and the accelerated adoption of AI technologies in sensitive environments.
In Europe, the Swedish government is considering a sovereign cloud platform to host classified data from the armed forces and the SAPO security service. Swedish intelligence services have identified the risk of leaks through foreign platforms and have submitted a report to the government. Defense Minister Paul Johnson pointed to the Ukrainian experience and the mass of battlefield data from aerial sensors, drones, and satellites, arguing that cloud solutions have become essential under these conditions.
These international movements highlight the stakes associated with data quality, infrastructure security, and model reliability. In China, if model collapse is not properly addressed, significant vulnerabilities could emerge in defense capabilities. The implications extend beyond China, with several states signaling the need to re-evaluate their AI strategies in light of documented limits and risks, whether related to data quality, reasoning robustness, or software attack surfaces.
Chinese researchers are thus expected to make methodological adjustments to preserve a sufficient proportion of real data and reinforce safeguards against uncontrolled self-training. Close monitoring of these developments remains relevant for defense actors at a time when the rapid spread of AI in sensitive processes raises the demand for reliability. In the meantime, the window of vulnerability described in the assessments remains open until training approaches capable of preventing performance collapse are consolidated.

