Generative AI research with Geoscientists

I believe this may be the first published research on what geoscientists think of Generative AI responses. The experiment tested the impact of enriching (tagging) the text chunks generated from 100 public domain geoscience reports in a Retrieval Augmented Generation (RAG) pipeline. The tagging influenced which text chunks ranked as top candidates in the vector database and were therefore used as context for Large Language Model (LLM) answer generation (in this case ChatGPT-3.5 Turbo). All other parameters were held constant, and the temperature parameter was set to zero.
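As a rough illustration of the mechanism (not the code used in the study), the sketch below shows how appending tags to a chunk before it is embedded can change which chunk ranks first for a query. It swaps in a toy bag-of-words cosine similarity for a real embedding model and vector database. The chunk texts, tags, query and function names are all invented for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=1, use_tags=False):
    """Return the ids of the k chunks most similar to the query.

    With use_tags=True, each chunk's tags are appended to its text
    before embedding, which is the enrichment step being illustrated.
    """
    q = embed(query)
    scored = []
    for chunk in chunks:
        text = chunk["text"]
        if use_tags:
            text = text + " " + " ".join(chunk["tags"])
        scored.append((cosine(q, embed(text)), chunk["id"]))
    # Highest similarity first; ties broken by id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [chunk_id for _, chunk_id in scored[:k]]


# Hypothetical chunks and tags, invented for this sketch.
chunks = [
    {"id": "chunk_a",
     "text": "The well encountered sandstone with good porosity.",
     "tags": ["reservoir", "quality"]},
    {"id": "chunk_b",
     "text": "Reservoir studies are important for exploration.",
     "tags": []},
    {"id": "chunk_c",
     "text": "Seismic data were acquired offshore.",
     "tags": ["seismic"]},
]

query = "What is the reservoir quality?"

untagged_top = top_k(query, chunks, k=1, use_tags=False)
tagged_top = top_k(query, chunks, k=1, use_tags=True)
print("untagged:", untagged_top)
print("tagged:  ", tagged_top)
```

In this toy setup the generic sentence wins without tags because it happens to share a word with the query, while the tagged version surfaces the chunk that actually describes the rock. Whatever lands in the top-k is all the LLM sees as context, so the answer changes even with the model and temperature held fixed.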

Experienced geoscientists were asked to rank the responses by how informative they found them, and to explain why, including what they liked and did not like. They did not know how each response was generated, only that all were created using Gen AI.

The untagged response is shown in red, the tagged responses in green and the control (erroneously tagged) in blue. This provides some early evidence that enriching text chunks using Natural Language Processing (NLP) may lead to more informative and useful responses from Generative AI.

It also highlights, for RAG (which, judging by the literature, many companies appear to be deploying), just how sensitive the responses are to the ranking of the text chunks, before the “Gen AI” part even takes place.

So for RAG, the usefulness of Gen AI responses to open questions may not necessarily be directly related to… Gen AI.

Among the underlying themes indicated by geoscientists in this study (see table in the image) were the need for more informative answers and how poor some of the responses were. For open questions that do not have a single right answer, Gen AI does appear to have a tendency to produce truisms: stating the obvious, and repetition. For domain subject matter experts, responses are likely to require more depth to be useful. How we can make Gen AI responses to open questions more useful for geoscientists is an area for further research.

I presented this research yesterday at the Society for Professional Data Managers (SPDM). A big thank you to those who participated.

#geosciences #geoscience #digital #digitalgeoscience #largelanguagemodels #artificialintelligence #naturallanguageprocessing #digitalinnovation #research #subsurface #chatgpt #generativeai
