Our data comes from Epoch AI’s AI Models dataset, a superset of our Notable Models dataset that removes the notability requirement. This allowed us to accurately describe the compute trend in models from China, which historically have been less likely to qualify as notable. We used a snapshot of the dataset from 7 January 2025. For each model, the dataset includes a publication date, a training compute estimate, and a country of affiliation, among other fields.
Our data initially lacked training compute estimates for some models of particular interest, such as GPT-4o and Claude 3.5 Sonnet. We estimated the training compute of these models (11 in total) by imputing from benchmark scores. These compute estimates are not in the original dataset because they are more speculative than most other estimates. However, by covering these important models, we believe the estimates lead to a higher-quality analysis overall. In this case, adding these estimates did not noticeably change the bottom-line results.
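One way to picture this imputation step is a simple regression of log-compute on benchmark score, fit on models where both are known. The scores, compute values, and the linear-fit approach below are illustrative assumptions, not Epoch AI's actual procedure:

```python
# Hypothetical sketch of compute imputation from benchmark scores.
# All numbers and the choice of a linear fit are assumptions; the
# source only states that compute was imputed from benchmark scores.
import numpy as np

# Models with both a benchmark score and a known training compute (FLOP).
scores = np.array([62.0, 70.1, 78.5, 81.3, 86.4])
log10_compute = np.array([23.1, 23.8, 24.6, 25.0, 25.4])

# Fit a linear relationship between score and log10(compute).
slope, intercept = np.polyfit(scores, log10_compute, deg=1)

def impute_log10_compute(score: float) -> float:
    """Predict log10(training compute in FLOP) from a benchmark score."""
    return slope * score + intercept

# Impute compute for a model with a known score but no compute estimate.
est = impute_log10_compute(84.0)
print(round(est, 2))
```

Working in log space is the natural choice here, since training compute spans many orders of magnitude and its relationship with benchmark scores is far closer to linear on a log scale.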
Before filtering, the dataset contained 2361 models. We dropped models missing a publication date or training compute (after imputing training compute for the 11 models of particular interest noted above), leaving 2233 models. We then categorized the models as described in the Overview, with 485 models “Developed in China” and 1748 “Not developed in China”. Following previous work, we treated AlphaGo Master and AlphaGo Zero as outliers and excluded them from the data. To avoid double-counting training compute, we also filtered out models that were fine-tuned separately from another model, such as Med-PaLM.
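These preprocessing steps can be sketched in pandas. The column names (`publication_date`, `training_compute_flop`, `country`, `base_model`) and the toy rows are assumptions for illustration; the real dataset's schema may differ:

```python
# Illustrative sketch of the filtering steps described above, using
# assumed column names and a toy dataset (not the real Epoch AI schema).
import pandas as pd

df = pd.DataFrame({
    "model": ["A", "B", "C", "AlphaGo Zero", "Med-PaLM"],
    "publication_date": ["2023-01-01", None, "2024-06-01", "2017-10-18", "2022-12-26"],
    "training_compute_flop": [1e24, 2e23, None, 1.9e23, 2.7e24],
    "country": ["China", "USA", "China", "UK", "USA"],
    "base_model": [None, None, None, None, "PaLM"],  # separately fine-tuned
})

# Drop models missing a publication date or training compute.
df = df.dropna(subset=["publication_date", "training_compute_flop"])

# Categorize by country of development.
df["category"] = df["country"].map(
    lambda c: "Developed in China" if c == "China" else "Not developed in China"
)

# Exclude the known outliers and separately fine-tuned models.
outliers = {"AlphaGo Master", "AlphaGo Zero"}
df = df[~df["model"].isin(outliers) & df["base_model"].isna()]
print(df["model"].tolist())
```

In this toy example, two models fall to the missing-data filter, one to the outlier list, and one to the fine-tune filter, leaving a single model.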
After removing separately fine-tuned models, we filtered to the rolling top-10 models in each country category. Finally, we restricted the rolling top-10 models to the “Language” and “Multimodal” domains. These two filtering steps were applied in this order to focus on language models close to the overall frontier of training compute. The main effect of this ordering is to exclude language models from before 2020 that were still catching up to the frontier, as observed in previous work. After this filtering, we were left with 49 models “Developed in China” and 60 models “Not developed in China”.
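One plausible reading of "rolling top-10" is that a model is kept if, at the time of its release, its training compute ranked among the 10 largest seen so far in its category; the domain filter then runs on the survivors. The function below sketches that interpretation with assumed column names (the exact definition used in the analysis may differ), and the tiny example uses k=2 so the effect is visible:

```python
# Hedged sketch of a rolling top-k filter followed by a domain filter.
# Column names and the precise definition of "rolling top-10" are
# assumptions; the source does not spell out the implementation.
import pandas as pd

def rolling_top_k(df: pd.DataFrame, k: int = 10) -> pd.DataFrame:
    """Keep models whose compute ranked in the top k at release time."""
    df = df.sort_values("publication_date")
    keep = []
    for _, group in df.groupby("category"):
        seen = []
        for idx, row in group.iterrows():
            seen.append(row["training_compute_flop"])
            # k-th largest compute among models released so far.
            kth_largest = sorted(seen, reverse=True)[:k][-1]
            if row["training_compute_flop"] >= kth_largest:
                keep.append(idx)
    return df.loc[keep]

# Tiny illustration with k=2: the third model trails the frontier.
models = pd.DataFrame({
    "model": ["M1", "M2", "M3", "M4"],
    "publication_date": ["2021-01-01", "2021-06-01", "2022-01-01", "2022-06-01"],
    "training_compute_flop": [3e22, 2e22, 1e22, 4e22],
    "category": ["Not developed in China"] * 4,
    "domain": ["Language", "Multimodal", "Language", "Language"],
})

frontier = rolling_top_k(models, k=2)
# Then restrict to language-adjacent domains, as in the text.
frontier = frontier[frontier["domain"].isin(["Language", "Multimodal"])]
print(frontier["model"].tolist())
```

Applying the top-k filter before the domain filter, as the text describes, means the frontier threshold is set by all models, not just language models, which is what drops pre-2020 language models that trailed the overall compute frontier.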
The Epoch AI data team has made a focused effort to increase coverage of models developed in China, but coverage may still be worse for these models. Although it is more difficult for the team to discover models through Chinese-language documents and websites, we believe coverage issues are limited when tracking frontier models.