The Deep-time Digital Earth program: Data-driven discovery in geosciences
DDE aims to harmonize deep-time Earth data based on a knowledge system to investigate the evolution of Earth, including life, Earth materials, geography, and climate. Integrated methods include artificial intelligence (AI), high-performance computing (HPC), cloud computing, the semantic web, natural language processing, and others.
Credit: Science China Press
Humans have long explored three big scientific questions: the evolution of the universe, the evolution of Earth, and the evolution of life. Geoscientists have embraced the mission of elucidating the evolution of Earth and life, both of which are preserved in the information-rich but incomplete geological record that spans more than 4.5 billion years of Earth history. Delving into Earth’s deep-time history helps geoscientists decipher the mechanisms and rates of Earth’s evolution and of climate change, locate natural resources, and envision the future of Earth.
Deductive reasoning and inductive reasoning have been widely employed for studying Earth’s history. In contrast to deduction and induction, abduction is derived from accumulation and analysis of large amounts of reliable data, independently of a premise or generalization. Abduction thus has the potential to generate transformative discoveries in science. With the accumulation of enormous volumes of deep-time Earth data, geoscientists are poised to transform research in deep-time Earth science through data-driven abductive discovery.
However, three issues must be resolved to facilitate abductive discovery using deep-time databases. First, many relevant geodata resources do not comply with the FAIR (findable, accessible, interoperable and reusable) principles for scientific data management and stewardship. Second, concepts and terminologies used in databases are not well defined; thus, the same terms may have different meanings across databases. Without standardized terminology and definitions of concepts, it is difficult to achieve data interoperability and reusability. Third, databases are highly heterogeneous in terms of geographic regions, spatial and temporal resolution, coverage of geological themes, data availability, formats, languages and metadata. Because Earth’s evolution is complex and involves interactions among multiple spheres (e.g., lithosphere, hydrosphere, biosphere and atmosphere) in Earth systems, it is difficult to see the whole picture of that evolution from separate thematic views, each with limited scope.
Big data and artificial intelligence are creating opportunities for resolving these issues. To explore Earth’s evolution efficiently and effectively through deep-time big data, we need FAIR, synthetic and comprehensive databases across all fields of deep-time Earth science, coupled with tailored computation methods. This goal motivates the Deep-time Digital Earth program (DDE), which is the first “big science program” initiated by the International Union of Geological Sciences (IUGS) and developed in cooperation with national geological surveys, professional associations, academic institutions, and scientists around the world. The main objective of DDE is to facilitate deep-time, data-driven discoveries through international and interdisciplinary collaborations. DDE aims to provide an open platform that links existing deep-time Earth data and integrates geological data so that users can interrogate them by specifying time, space, and subject (i.e., a “Geological Google”). The platform is also intended to process data for knowledge discovery using a knowledge engine (Deep-time Earth Engine) that provides computing power, models, methods, and algorithms (Figure 1).
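To make the idea of interrogating linked data by time, space, and subject more concrete, the following is a minimal Python sketch. It is not DDE's actual interface: the `Record` fields, the `query` function, the database names, and the sample values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One harmonized entry; all fields are illustrative assumptions."""
    source: str    # hypothetical originating database
    subject: str   # thematic field, e.g. "paleontology"
    age_ma: float  # age in millions of years before present
    lat: float     # latitude in decimal degrees
    lon: float     # longitude in decimal degrees


def query(records, *, subject=None, age_ma=None, bbox=None):
    """Filter records by subject, time window, and bounding box.

    age_ma is (youngest, oldest) in Ma; bbox is
    (min_lat, min_lon, max_lat, max_lon). Any filter left as None is skipped.
    """
    hits = []
    for r in records:
        if subject is not None and r.subject != subject:
            continue
        if age_ma is not None and not (age_ma[0] <= r.age_ma <= age_ma[1]):
            continue
        if bbox is not None and not (
            bbox[0] <= r.lat <= bbox[2] and bbox[1] <= r.lon <= bbox[3]
        ):
            continue
        hits.append(r)
    return hits


# Made-up sample records standing in for entries from separate databases.
RECORDS = [
    Record("db_a", "paleontology", 252.0, 31.1, 119.7),
    Record("db_b", "geochemistry", 66.0, 47.5, -106.9),
    Record("db_a", "paleontology", 445.0, 30.0, 111.0),
]

if __name__ == "__main__":
    for hit in query(RECORDS, subject="paleontology", age_ma=(200, 300)):
        print(hit)
```

A sketch like this only works once records from different sources share the same field names, units, and vocabulary, which is why standardized terminology and FAIR-compliant metadata come before any such query layer.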