The FAIRmat Concept Presented as a "Perspective" in "Nature"
Since 2020, the NOMAD Laboratory at the Fritz Haber Institute of the Max Planck Society, together with universities and other research institutes, has been developing an AI-supported research database for materials science and solid-state physics. In a recent "Perspective" article in the renowned scientific journal "Nature", the scientists involved present the concept behind their data infrastructure, NOMAD/FAIRmat.
"Nature" is one of the world's most-read and most prestigious multidisciplinary academic journals. As a special component, "Nature" features "Perspectives": forward-looking contributions intended to stimulate discussion and new scientific approaches. After the international Conference on a FAIR Data Infrastructure in summer 2020, the leadership of FAIRmat was invited by the "Nature" board of editors to write a "Perspective" describing its forward-looking concepts on the revolutionary impact of data-centric research in materials science and the chemical physics of solids.
The prosperity and lifestyle of our society depend to a large extent on the achievements of condensed matter physics, chemistry and materials science, because new products for energy, the environment, health, mobility, IT and other areas are largely based on improved or even novel materials. Examples include solid-state lighting, touch screens, batteries, implants and drug delivery. The enormous amounts of research data produced daily in this field represent a "gold mine of the 21st century". However, this "gold mine" is of little value if the data are not comprehensively characterised and made available. This requires an efficient and easy-to-use research database, which is what the NOMAD Laboratory at the Fritz Haber Institute of the Max Planck Society has been developing with its research partners since 2020 in the form of NOMAD/FAIRmat. The acronym FAIRdi stands for Findable, Accessible, Interoperable and Reusable Data Infrastructure. The infrastructure is intended to make it easy to share data that have already been collected and to explore them using data analysis and artificial intelligence (AI) methods.
An important prerequisite for such a database in materials science is that the experimental (or computational) conditions and the results are documented in full detail, so that the studies are reproducible, and that data collection (including the comprehensive characterisation of the experimental set-up) is automated as far as possible. This may sound like a long-standing requirement, but it has not yet been implemented consistently, and it is essential for data-centred science.
Moreover, this requires close cooperation between experts from the fields of data science, IT infrastructure, software engineering and materials science as equal partners. In FAIRmat, this is realised through a central hub of specialists at the Department of Physics at Humboldt-Universität zu Berlin. In addition, hardware for data storage and processing, advanced analyses and high-speed networks is a basic prerequisite for building the data infrastructure described. Furthermore, "middleware", e.g. for the efficient exchange of data generated in or by different digital environments, is needed. Finally, new software tools are also being developed, for example for fitting data, removing noise from data and learning the rules behind patterns in data. With such tools, it will also be possible to identify "material genes", i.e. physical parameters related to the processes that trigger, facilitate or hinder a particular material property or function. FAIRmat will promote the international coordination of such tool developments in the broader materials science community.
FAIRmat will thus fill a digitisation gap in the field of materials science, as has already been done, for example, in the field of life sciences through the introduction of digital libraries.