AI uses extrapolative learning to master materials prediction beyond existing data

April 16, 2025

by Research Organization of Information and Systems

Learning the learning method for extrapolative prediction using the E2T algorithm. Credit: The Institute of Statistical Mathematics

A research group has developed an innovative machine learning technology that enables predictions beyond the distribution of training data and demonstrated its effectiveness in materials research. The team includes Kohei Noda, a researcher at JSR Corporation, and Professor Ryo Yoshida at the Institute of Statistical Mathematics.

The ultimate goal of materials science is to discover new materials in unexplored domains where no data exists. However, predictions made by machine learning are generally interpolative, with their applicability typically limited to regions close to the distribution of existing data. Additionally, in materials research, the high cost of data acquisition makes it difficult to obtain sufficient training data, necessitating exploration beyond the range of available data.

To address this challenge, the research group developed a machine learning algorithm called E2T (extrapolative episodic training). In E2T, a model known as a meta-learner is trained using a large number of artificially generated extrapolative tasks derived from the available dataset. As a result, the model autonomously learns a learning method to perform extrapolative predictions.

In this study, E2T was applied to material property prediction tasks, demonstrating high predictive accuracy even for materials with elemental and structural features not present in the training data. Furthermore, it was revealed that models trained on a large number of extrapolative tasks could rapidly acquire predictive capabilities in unknown domains with only a small amount of additional data.

These research findings were published in Communications Materials on February 22, 2025.

Research outcomes

In recent years, the application of machine learning has led to remarkable progress in the discovery and development of new materials. At the core of this progress lies property prediction technology driven by machine learning. By leveraging predictive models, we can explore millions or even billions of candidate materials to identify those with desired properties from vast search spaces.

However, many studies face the challenge of limited data availability, which restricts the range of applications for machine learning. Furthermore, the ultimate goal of materials science is to uncover unknown materials with groundbreaking properties.

Despite this, machine learning's predictive capabilities are generally confined to regions near the training data, making it difficult to explore uncharted territories. For instance, even generative AI, such as the large language models that have revolutionized AI in recent years, is inherently interpolative: it replicates tasks that humans have encountered before. Developing AI technologies capable of predicting beyond existing data represents a grand challenge not only for materials science but also for advancing next-generation AI.

Band gap prediction of organic-inorganic hybrid perovskites using E2T. Credit: The Institute of Statistical Mathematics

In the field of machine learning, various methodologies have been explored to achieve extrapolative predictions, including:

Domain generalization: Techniques that aim to learn shared feature representations across diverse tasks.
Data augmentation: Methods to enhance model performance by increasing the diversity of training data.
Integration of physical knowledge with machine learning: Approaches that embed prior knowledge, such as physical laws, into machine learning frameworks (e.g., physics-informed neural networks).
Meta-learning: Techniques that train models to acquire generalized learning strategies by exposing them to a diverse range of tasks.

This study introduces a novel meta-learning approach that enables models to directly acquire broadly applicable learning methods for extrapolative predictions.

In this study, a neural network equipped with an attention mechanism was employed to train a model capable of learning the methods required for achieving extrapolative predictions. Specifically, a training dataset $\mathcal{D}$ and an input-output pair $(x, y)$, extrapolatively related to $\mathcal{D}$, were sampled from a given dataset. Here, $x$ represents a material, and $y$ represents its properties. These three components together form an "episode," which can be generated arbitrarily.

Using a large number of artificially generated episodes, a meta-learner $y = f(x, \mathcal{D})$ was trained to predict $y$ from $x$. The trained model learns what function is required to predict $(x, y)$ in an extrapolative relationship with any training dataset. The research group named this novel learning algorithm E2T (extrapolative episodic training).
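
As a rough illustration of how such episodes might be generated, the Python sketch below samples one episode (a small training set plus one query pair) from a dataset of feature vectors and property values. It assumes, purely for illustration, that "extrapolative" means the query's property value falls outside the range covered by the sampled training set. The function name `sample_extrapolative_episode` and its parameters are hypothetical and are not taken from the E2T source code, which may define extrapolation differently (for example, by material class or composition).

```python
import numpy as np

def sample_extrapolative_episode(X, y, n_support=20, holdout_frac=0.2, rng=None):
    """Return (X_support, y_support, x_query, y_query) for one episode.

    Assumption (not from the article): the query is "extrapolative" when its
    property value lies outside the range of the support set's property values.
    """
    rng = np.random.default_rng(rng)
    order = np.argsort(y)                          # rank samples by property value
    n_tail = max(1, int(holdout_frac * len(y)))    # size of the held-out tail
    if rng.random() < 0.5:                         # hold out the low or the high tail
        tail, body = order[:n_tail], order[n_tail:]
    else:
        tail, body = order[-n_tail:], order[:-n_tail]
    # The support (training) set comes from the body, the query from the tail.
    support_idx = rng.choice(body, size=min(n_support, len(body)), replace=False)
    query_idx = rng.choice(tail)
    return X[support_idx], y[support_idx], X[query_idx], y[query_idx]

# Usage on synthetic data: many episodes can be drawn from one dataset.
data_rng = np.random.default_rng(0)
X_all = data_rng.normal(size=(100, 3))
y_all = data_rng.normal(size=100)
episodes = [sample_extrapolative_episode(X_all, y_all, rng=seed) for seed in range(5)]
```

Because each call draws a fresh support set and query, a single dataset can yield an effectively unlimited number of distinct episodes, which is what makes this style of training feasible when data are scarce.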
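
To make the idea of a predictor that takes the training dataset itself as an input more concrete, here is a minimal Python sketch of one possible attention-style form. It is an assumption for illustration, not the published architecture: the query is compared with every training sample in an embedding space, and the resulting ridge-regularised weights are applied to the training labels. The names `embed` and `meta_learner_predict`, the linear embedding matrix `W`, and the `ridge` parameter are all hypothetical; in a real meta-learner the embedding would be a neural network whose parameters are fitted across many episodes.

```python
import numpy as np

def embed(X, W):
    """Hypothetical embedding: a single linear map standing in for a trained network."""
    return X @ W

def meta_learner_predict(x_query, X_support, y_support, W, ridge=1e-2):
    """Predict the query's property from the query and the support (training) set.

    Assumed form: attention weights come from a ridge-regularised solve on
    embedding similarities, so they may be negative or exceed one, and the
    prediction is not confined to the range of the support labels.
    """
    phi_s = embed(X_support, W)                  # (n, d) support embeddings
    phi_q = embed(x_query[None, :], W)[0]        # (d,) query embedding
    gram = phi_s @ phi_s.T                       # similarities within the support set
    sims = phi_s @ phi_q                         # similarities between query and support
    attn = np.linalg.solve(gram + ridge * np.eye(len(y_support)), sims)
    return float(attn @ y_support)

# Usage: a query far outside the cloud of support inputs.
demo_rng = np.random.default_rng(0)
X_support = demo_rng.normal(size=(30, 3))
y_support = X_support @ np.array([1.0, -2.0, 0.5])
prediction = meta_learner_predict(np.array([10.0, -10.0, 10.0]),
                                  X_support, y_support, np.eye(3), ridge=1e-6)
```

A plain softmax-weighted average of the training labels would always stay within their minimum and maximum, which is why this sketch uses unconstrained weights instead.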

The research group applied E2T to over 40 property prediction tasks involving polymeric and inorganic materials to evaluate its performance. The results showed that, in almost all cases, models trained with E2T outperformed conventional machine learning models in terms of extrapolative accuracy. Additionally, in predictive performance near the training data, E2T demonstrated accuracy equivalent to or greater than that of traditional machine learning.

However, the extrapolative performance of E2T did not reach that of an ideal model (called an oracle) trained on the entire dataset including the extrapolative region. In other words, while E2T consistently improved prediction accuracy in extrapolative regions, it fell short of achieving "ultimate extrapolative capability."
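
The comparison with an oracle can be pictured with a small Python sketch. It is not the study's evaluation code: the quadratic ridge regressor is only a stand-in for whatever model is being assessed, and the helper names (`fit_ridge`, `predict`, `mae`, `extrapolation_gap`) are hypothetical. One model is fitted on in-domain data only, the oracle additionally sees half of the extrapolative region, and both are scored on the remaining half.

```python
import numpy as np

def _features(X):
    """Quadratic feature map [1, x, x^2] for a 2-D input array."""
    return np.hstack([np.ones((len(X), 1)), X, X**2])

def fit_ridge(X, y, ridge=1e-3):
    """Closed-form ridge regression on the quadratic features."""
    A = _features(X)
    return np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]), A.T @ y)

def predict(coef, X):
    return _features(X) @ coef

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def extrapolation_gap(X, y, is_extrap, rng=None):
    """Return (baseline_mae, oracle_mae) on held-out extrapolative samples."""
    rng = np.random.default_rng(rng)
    extrap_idx = rng.permutation(np.flatnonzero(is_extrap))
    half = len(extrap_idx) // 2
    seen, test = extrap_idx[:half], extrap_idx[half:]   # oracle sees `seen` only
    in_idx = np.flatnonzero(~is_extrap)
    oracle_idx = np.concatenate([in_idx, seen])
    baseline = fit_ridge(X[in_idx], y[in_idx])          # in-domain data only
    oracle = fit_ridge(X[oracle_idx], y[oracle_idx])    # in-domain + extrapolative data
    return mae(y[test], predict(baseline, X[test])), mae(y[test], predict(oracle, X[test]))

# Usage on a toy problem: y = x^3, with x > 2 treated as the extrapolative region.
x = np.linspace(0.0, 3.0, 301)
baseline_mae, oracle_mae = extrapolation_gap(x[:, None], x**3, x > 2.0, rng=0)
```

On this toy problem the oracle's error in the extrapolative region is lower than the baseline's, so the gap between the two gives a scale against which an extrapolative method can be placed.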

A particularly noteworthy finding was that models trained on a large number of extrapolative tasks demonstrated the ability to quickly adapt to new extrapolative tasks through fine-tuning with a limited amount of data. Remarkably, these models achieved comparable performance to an oracle model trained on extrapolative regions, despite requiring significantly less data.

In humans, rapid adaptability is believed to result not only from innate traits but also from extensive training and experience. This study revealed that a similar phenomenon may occur in the learning processes of AI, where adaptability is enhanced through systematic exposure to diverse tasks.

Future outlook

The ultimate goal of materials research lies in exploring uncharted material spaces where no data currently exists. For instance, researchers aim to investigate the properties of materials formed by combinations of elements or raw materials that have never been tested before, or of materials produced under significantly altered fabrication protocols.

This study began with a fundamental question: Can models trained to achieve extrapolation with existing datasets acquire extrapolative capabilities and adaptability to unknown environments? The researchers presented a remarkably simple solution to this question. While the current evidence is limited to specific cases, if the learning capability of E2T proves to be universal, its impact could extend beyond materials science, influencing a wide range of fields within AI for Science.

One particularly exciting prospect is the application of E2T to the development of foundation models. Foundation models are trained on large-scale, versatile datasets and are expected to exhibit the ability to adapt to a wide variety of downstream tasks.

By fine-tuning these models for specific downstream tasks, it is possible to reduce the amount of data required while achieving high predictive accuracy. The extrapolative performance and domain adaptability of E2T have the potential to drive groundbreaking innovations in the development of foundation models, significantly advancing the broader scientific landscape.

More information: Kohei Noda et al, Advancing extrapolative predictions of material properties through learning to learn using extrapolative episodic training, Communications Materials (2025).

Source code for E2T:

Journal information: Communications Materials

Provided by Research Organization of Information and Systems