Microstructure – Data Science

Contact persons: Lars Griem, Johannes Steinhülb

Research

The research group “Microstructure – Data Science” focuses on the data-driven analysis and optimization of microstructures. To this end, methods for segmentation, characterization, and structure synthesis are developed, alongside data-driven analysis tools that reveal the interplay between microstructural features and macroscopic material behavior. In addition, computer vision techniques are applied to medical imaging data. The methods are designed to transfer across domains, serving both materials science and medical applications. To create digital twins of microstructures, large-scale phase-field simulations are combined with computer vision methods that segment image data from various imaging modalities and reconstruct them into high-resolution 3D models. Building on these representations, generative algorithms and generative AI models are employed to synthesize microstructures with controllable properties, enabling realistic representations of porous systems such as membranes, grain structures, and geological packings. In collaboration with the “Research Data Management” group, workflows are developed for the reproducible and FAIR-compliant analysis of large datasets, so that the described methods are automated, standardized, and reusable across different application contexts. The overarching aim is to bridge length scales by identifying effective structure–property relationships and developing data-driven predictive models that support accelerated and informed materials design.

Computer Vision

In the field of computer vision, versatile methods are developed for the automated analysis of complex image data from a wide range of modalities, including CT, MRI, and confocal microscopy. Core tasks include segmentation, reconstruction, registration, and super-resolution. By employing flexible model architectures, strategies for limited datasets, and the generation of synthetic training data, robust approaches are created that can be reliably transferred across different domains. In this way, both the creation of digital twins in materials science and the precise analysis of medical imaging data are supported.

Segmenting structures of interest is a crucial first step in any quantitative analysis pipeline. In our group, robust and adaptable segmentation approaches are developed for different application domains. While the primary focus lies on material-science and medical imaging data, the methods are designed to be transferable and easily applied to new datasets.

To overcome the limited availability of annotated images, synthetic training data are generated for supervised segmentation tasks. Binary structures are produced using a GAN and subsequently refined with a CycleGAN to add realistic noise and texture. In this way, highly realistic training data are created to support the development of robust segmentation models.

Super-resolution methods are used to increase the resolution of volumetric or two-dimensional image data. In volumetric MRI, for example, low through-plane resolution often limits downstream analysis. To address this issue, a supervised VAE–GAN framework is used to generate realistic high-resolution training data. This enables accurate super-resolution reconstruction and leads to more robust volumetric MRI segmentation.

Event detection is an essential task when working with temporal image sequences. In dynamic contrast-enhanced MRI (DCE-MRI) of the lung, for example, breathing motion must be identified and removed to enable meaningful downstream analysis. To automate this step, learning-based models are employed that reliably detect respiratory motion, even under varying image contrast conditions.

A broad portfolio of analysis methods is available to reliably quantify geometric properties in digital material twins. Metrics such as porosity, wall thickness, and pore radius can be extracted with high accuracy, providing a detailed description of the underlying microstructure. These quantitative descriptors form an essential basis for establishing structure–property relationships and supporting data-driven materials analysis.
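As a minimal sketch of how such a descriptor can be computed, the following NumPy example evaluates the porosity of a binary voxel volume and its slice-wise profile. The function names `porosity` and `porosity_profile`, the convention that `True` marks pore voxels, and the toy volume are assumptions made for illustration; they are not taken from the group's software.

```python
import numpy as np


def porosity(volume: np.ndarray) -> float:
    """Fraction of voxels that belong to the pore phase (nonzero entries)."""
    return float(np.count_nonzero(volume)) / volume.size


def porosity_profile(volume: np.ndarray, axis: int = 0) -> np.ndarray:
    """Porosity of every slice perpendicular to the given axis."""
    other_axes = tuple(i for i in range(volume.ndim) if i != axis)
    return np.mean(volume.astype(bool), axis=other_axes)


# Hypothetical toy twin: a solid 4x4x4 block with a 2x2 channel along axis 0.
twin = np.zeros((4, 4, 4), dtype=bool)
twin[:, 1:3, 1:3] = True

total = porosity(twin)                    # 16 of 64 voxels are pore -> 0.25
profile = porosity_profile(twin, axis=1)  # slices 1 and 2 contain the channel
```

A slice-wise profile of this kind can indicate whether the pore space is distributed homogeneously or concentrated in certain layers of the sample.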

Data-driven analysis methods are used to extract higher-dimensional descriptors from microstructures. Statistical metrics such as two-point correlation functions combined with principal component analysis, as well as neural network–based approaches, enable the identification of latent structural features. These representations support the clustering and comparison of microstructures and provide deeper insights into underlying structure–property relationships.
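The combination of two-point correlation functions and principal component analysis can be sketched as follows. This is an illustrative NumPy example under stated assumptions: periodic boundary conditions, an FFT-based autocorrelation, PCA via singular value decomposition, and random binary images standing in for segmented microstructures. The names `two_point_correlation` and `pca_scores` are hypothetical.

```python
import numpy as np


def two_point_correlation(phase: np.ndarray) -> np.ndarray:
    """Periodic two-point autocorrelation S2(r) of a binary phase indicator.

    S2(r) is the probability that two points separated by r both lie in the
    phase; S2(0) therefore equals the volume fraction of the phase.
    """
    indicator = phase.astype(float)
    spectrum = np.fft.fftn(indicator)
    return np.fft.ifftn(spectrum * np.conj(spectrum)).real / indicator.size


def pca_scores(features: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Project the rows of `features` onto the leading principal components."""
    centered = features - features.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


# Hypothetical stand-in data: random binary images with varying volume fraction.
rng = np.random.default_rng(0)
fractions = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
images = [rng.random((16, 16)) < f for f in fractions]

# One flattened S2 map per image serves as the feature vector for PCA.
s2_maps = np.array([two_point_correlation(img) for img in images])
scores = pca_scores(s2_maps.reshape(len(images), -1))
```

Each row of `scores` is a low-dimensional descriptor of one microstructure, which can then be passed to a clustering or regression method.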

To accelerate the simulation of wetting behavior in porous membranes, pore network models are extracted from digital membrane twins. These models provide a simplified yet informative representation of the pore space, enabling faster simulations and allowing larger material domains to be analyzed efficiently while preserving the essential characteristics of the underlying microstructure.

To quantify perfusion in dynamic contrast-enhanced MRI (DCE-MRI) of the lungs, an automated pipeline is developed. Based on a mathematical model, the residue function is obtained from 3D+t lung measurements by deconvolving the subtraction image with the arterial input function. This residue function is subsequently used to compute quantitative perfusion maps, including parameters such as pulmonary blood flow (PBF) and the percentage of perfusion defects (QDP).
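The deconvolution step can be illustrated for a single voxel. The sketch below assumes the discrete model c = A k, where A is the lower-triangular convolution matrix built from the arterial input function (AIF) and k is the flow-scaled residue function, and it inverts A with a truncated singular value decomposition. Truncated SVD is a common choice for this kind of problem, but it is an assumption here and not necessarily the method used in the pipeline. The function names, the threshold, and the toy curves (which are not physiological) are hypothetical.

```python
import numpy as np


def convolution_matrix(aif: np.ndarray, dt: float) -> np.ndarray:
    """Lower-triangular Toeplitz matrix with A[i, j] = dt * aif[i - j] for i >= j."""
    n = len(aif)
    idx = np.arange(n)
    lag = idx[:, None] - idx[None, :]
    return dt * np.where(lag >= 0, aif[np.clip(lag, 0, n - 1)], 0.0)


def deconvolve_tsvd(tissue, aif, dt, rel_threshold=0.05):
    """Estimate the flow-scaled residue function by truncated SVD.

    Singular values below rel_threshold * (largest singular value) are
    discarded, which limits noise amplification in measured data.
    """
    u, s, vt = np.linalg.svd(convolution_matrix(aif, dt))
    keep = s > rel_threshold * s[0]
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    return vt.T @ (s_inv * (u.T @ tissue))


# Hypothetical noise-free toy curves in arbitrary units.
dt = 1.0
t = np.arange(30) * dt
aif = np.exp(-t / 3.0)          # toy arterial input function
true_flow = 0.5                 # flow used to simulate the tissue curve
residue = np.exp(-t / 5.0)      # toy residue function with R(0) = 1
tissue = convolution_matrix(aif, dt) @ (true_flow * residue)

flow_scaled_residue = deconvolve_tsvd(tissue, aif, dt)
pbf_estimate = flow_scaled_residue.max()  # flow = maximum of the flow-scaled residue
```

In this well-conditioned, noise-free toy case no singular values are discarded and the simulated flow is recovered. With measured data, the threshold trades noise suppression against bias, and the same computation would be repeated for every lung voxel to obtain a perfusion map.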

Characterization

To characterize microstructures, methods are developed for the quantitative description of complex geometries and statistical features. Based on digital material twins, classical descriptors such as porosity, wall thickness, pore-size distributions, and tortuosity are determined, along with higher-dimensional features derived from data-driven approaches such as two-point correlation functions or principal component analysis. In addition, neural networks are employed to identify latent structural patterns. Beyond materials science applications, medical imaging data are also analyzed, for example segmented lungs for perfusion assessment.

Structure Synthesis

Various methodological approaches are employed for the generation of synthetic microstructures. Simulation-based strategies rely on the phase-field method, while geometry-based techniques such as Voronoi constructions or packing algorithms enable the direct parametrization of structural properties. In addition, generative AI models, particularly diffusion models and variational autoencoders, build on reconstructed digital twins to enable rapid and tunable synthesis of realistic variants. In this way, a broad range of microstructures can be generated synthetically.
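A Voronoi construction of a grain-like structure can be sketched in a few lines. The example below is an illustration under stated assumptions: seeds are drawn uniformly at random, every voxel is assigned to its nearest seed by brute-force distance evaluation, and no periodic boundaries are applied. The function name `voronoi_grains` and its parameters are hypothetical.

```python
import numpy as np


def voronoi_grains(shape, n_seeds, seed=0):
    """Label every voxel with the index of its nearest random seed point."""
    rng = np.random.default_rng(seed)
    # One seed per row, drawn uniformly inside the domain.
    seeds = rng.uniform(0, shape, size=(n_seeds, len(shape)))
    axes = [np.arange(s) for s in shape]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    # Squared distance from every voxel to every seed; last axis indexes seeds.
    sq_dist = ((grid[..., None, :] - seeds) ** 2).sum(axis=-1)
    return sq_dist.argmin(axis=-1)


# A 64x64 label image with at most 20 grains; n_seeds controls the grain size.
grains = voronoi_grains((64, 64), n_seeds=20)
```

The number of seeds acts as a direct model parameter for the mean grain size, which is the kind of parametrization the geometry-based techniques rely on. For large 3D volumes, a spatial search structure would be preferable to the brute-force distance array used here.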

Diffusion models are trained on segmented and reconstructed digital material twins to generate synthetic, highly realistic microstructure representations. By conditioning the generation process, the properties of the generated microstructures can be specified in a targeted and controlled manner.

Geometric structure generation methods are employed to synthesize a wide range of microstructures, including membranes, foams, particle packings for geological and battery-related applications, as well as triply periodic minimal surfaces. Algorithmic approaches enable targeted control over structural features, allowing key properties to be adjusted directly through model parameters.
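As an example of parameter-controlled geometric generation, the sketch below voxelizes a gyroid, one of the triply periodic minimal surfaces, from its common level-set approximation sin(x)cos(y) + sin(y)cos(z) + sin(z)cos(x) = level. Which side of the surface counts as solid, the function name `gyroid_volume`, and the chosen resolution are assumptions made for illustration.

```python
import numpy as np


def gyroid_volume(n=32, periods=1, level=0.0):
    """Binary voxel volume of a gyroid; True marks the (assumed) solid phase.

    n       : number of voxels per axis
    periods : number of unit cells per axis
    level   : iso-value; shifting it changes the solid volume fraction
    """
    coords = np.linspace(0.0, 2.0 * np.pi * periods, n, endpoint=False)
    x, y, z = np.meshgrid(coords, coords, coords, indexing="ij")
    field = (np.sin(x) * np.cos(y)
             + np.sin(y) * np.cos(z)
             + np.sin(z) * np.cos(x))
    return field > level


gyroid = gyroid_volume()           # level 0 splits space into two roughly equal halves
denser = gyroid_volume(level=0.5)  # a higher level leaves less solid material
solid_fraction = gyroid.mean()
```

Here the `level` parameter adjusts the solid volume fraction and `periods` sets the cell size, so key structural properties follow directly from the model parameters.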

Workflows

To automate the developed methods, from segmentation and reconstruction through to characterization and structure synthesis, reproducible workflows are developed in collaboration with the “Research Data Management” research group. Using the KadiStudio workflow editor, generic process chains are modeled and made executable in a FAIR-compliant manner. The modular structure of these workflows enables their reuse in different application scenarios and ensures efficient analysis of large and heterogeneous datasets.

Team
Name	Function
Griem, Lars Christoph	Group leader
Steinhülb, Johannes	Group leader
Kocak, Muhammed Saadeddin	Research Assistant

Associated team members
Name	Function