Transitioning Geoscience (Literature) into the Age of AI

I co-authored a paper, presented at the Geological Society of London late last year, on transitioning geoscience (literature) into the age of AI. The aim is to inject more creativity into the way we explore big data geoscience documentation, and to release free, easy-to-use web apps built on techniques enabled by AI. The design follows UNESCO guidelines on AI ethics, so the provenance of the source information used to generate the output is transparent.

Using Natural Language Processing to Discover Analogues and Insights in Unstructured Geoscience Literature

Phoebe McMellon (1), Alistair Reece (2) and Paul Cleverley (3)
(1, 2) GeoScienceWorld; (3) Robert Gordon University

Abstract
GeoScienceWorld (GSW) is a non-profit founded in 2004 by geological societies to advance the dissemination of earth science information, supporting over 100,000 researchers worldwide. The platform includes 210,000+ articles, 39,000+ eBook chapters, and 4.6 million geoscience abstracts and records from GeoRef.

The standard method for searching geoscience literature, via full-text indexing and keyword-based retrieval, has significantly improved information discovery. However, extracting geoscience insights across the corpus remains challenging. Large Language Models (LLMs) are emerging, and useful text and data mining capabilities exist in the geosciences, although most are not easily or freely available to all geoscientists, nor can they be easily applied to large repositories of relevant domain-specific content.

We built a web application using text embeddings (arrays of numbers that represent words based on their co-occurrence), complemented by Natural Language Processing and Knowledge Representations (Cleverley 2017). The latent patterns these techniques surface within texts can enable abductive discovery of geoscientific information that could lead to a less obvious analogue, undiscovered opportunity, or association (Lawley et al. 2022), sparking a new idea or line of thinking.
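
To make the idea of co-occurrence-based embeddings concrete, here is a minimal Python sketch. It is not the method used in the web application described above: it assumes raw sentence-level co-occurrence counts as a stand-in for learned embeddings, and the toy sentences and the function names (cooccurrence_vectors, cosine, nearest) are invented for illustration only. Each word becomes an array of counts, and words used in similar contexts end up with similar arrays, which is the property that lets a search surface a less obvious analogue.

```python
import math
from itertools import combinations

# Toy sentences invented for illustration; they are not GSW content.
SENTENCES = [
    "carbonate reef reservoir porosity",
    "carbonate platform reservoir porosity",
    "sandstone channel reservoir permeability",
    "sandstone delta reservoir permeability",
    "carbonate reef platform",
    "sandstone channel delta",
]


def cooccurrence_vectors(sentences):
    """Map each word to a vector counting the words that share a sentence with it."""
    tokenised = [sentence.split() for sentence in sentences]
    vocab = sorted({word for tokens in tokenised for word in tokens})
    index = {word: i for i, word in enumerate(vocab)}
    vectors = {word: [0] * len(vocab) for word in vocab}
    for tokens in tokenised:
        # Count each unordered pair of distinct words once per sentence.
        for a, b in combinations(sorted(set(tokens)), 2):
            vectors[a][index[b]] += 1
            vectors[b][index[a]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two equal-length count vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def nearest(word, vectors, k=2):
    """Return the k words whose vectors are most similar to that of `word`."""
    scores = [
        (other, cosine(vectors[word], vec))
        for other, vec in vectors.items()
        if other != word
    ]
    # Sort by descending similarity, breaking ties alphabetically.
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))[:k]


if __name__ == "__main__":
    vectors = cooccurrence_vectors(SENTENCES)
    print(nearest("reef", vectors))
```

In this toy corpus, "reef" sits closer to "platform" than to "delta" because the first two share carbonate and porosity contexts. A production system would typically use learned, lower-dimensional embeddings over a far larger corpus, but the similarity lookup works on the same principle.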

A new Digital Assistant, GeoSapien®, powered by GSW content, will be presented to the public. Early feedback from geoscience researchers is encouraging. Dr. Gene Rankey, a senior geoscience researcher, stated, “This innovative tool is like Google Scholar on steroids, extending well beyond simply suggesting documents, it provides a means to explore spatial, temporal, and topical associations among geoscience concepts. Its utility and value for exploratory data analysis, mining data spread across the literature, hypothesis generation and testing, and deeper in-depth analytical analyses are clear.” The goal is to make GeoSapien® freely available to the geoscience community in 2024.
