I tested whether rock types and their associations can be differentiated based on the patterns of words that occur around them in large archives of geological reports. Using a text embedding model generated through unsupervised machine learning from thousands of geological survey reports, I compared approximately 2,000 rock type names to each other. The... Continue Reading →
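To make the comparison idea concrete, here is a minimal sketch of how rock type names might be compared once each has an embedding vector. It assumes cosine similarity as the measure, which the excerpt does not confirm, and the three-dimensional vectors and the helper name `most_similar` are hypothetical values made up for illustration, not output from the actual model.

```python
import math

# Toy 3-dimensional vectors. These are hypothetical values for illustration,
# not taken from any real embedding model.
toy_embeddings = {
    "granite": [0.9, 0.1, 0.2],
    "granodiorite": [0.8, 0.2, 0.3],
    "basalt": [0.1, 0.9, 0.3],
    "limestone": [0.2, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(name, embeddings, top_n=3):
    """Rank every other term by cosine similarity to `name`, highest first."""
    target = embeddings[name]
    scores = [
        (other, cosine_similarity(target, vector))
        for other, vector in embeddings.items()
        if other != name
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


if __name__ == "__main__":
    for rock, score in most_similar("granite", toy_embeddings):
        print(f"{rock}: {score:.3f}")
```

With a real model the dictionary would hold one vector per rock type name, and the same ranking could be repeated for each of the roughly 2,000 names.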
Text Embeddings for Mineral Association Discovery
Data-driven discovery: It may be interesting to compare the similarities of minerals, based on their co-occurring words in large archives of geological reports, with the known mineral occurrences reported in databases such as Mindat. One could perhaps easily automate this comparison, leaving ranked "candidate" mineral associations not present in reference databases. There... Continue Reading →
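A minimal sketch of what such an automated comparison might look like, assuming the embedding similarities and the reference associations have already been extracted. The pair scores, the `known_pairs` set, the threshold, and the function name `candidate_associations` are all hypothetical; nothing here queries Mindat or any real database.

```python
# Hypothetical similarity scores between mineral pairs, standing in for
# values derived from text embeddings.
embedding_pairs = {
    ("galena", "sphalerite"): 0.91,
    ("gold", "arsenopyrite"): 0.84,
    ("quartz", "cassiterite"): 0.62,
    ("galena", "beryl"): 0.20,
}

# Hypothetical associations already recorded in a reference database.
# frozenset makes the pair order irrelevant when checking membership.
known_pairs = {frozenset(("galena", "sphalerite"))}


def candidate_associations(scored_pairs, known, threshold=0.5):
    """Return pairs scoring at or above `threshold` that are absent from
    `known`, ranked from highest to lowest similarity."""
    candidates = [
        (pair, score)
        for pair, score in scored_pairs.items()
        if score >= threshold and frozenset(pair) not in known
    ]
    return sorted(candidates, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for (mineral_a, mineral_b), score in candidate_associations(
        embedding_pairs, known_pairs
    ):
        print(f"{mineral_a} + {mineral_b}: {score:.2f}")
```

The ranked output is only a list of candidates for a geologist to review; a high text similarity does not by itself show that two minerals occur together.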
Misconceptions of LLM Chatbots in Geoscience
Misconceptions of LLM Chatbots: For scientists and business professionals it is critical to know the source of any AI-generated answer or assertion. If we cannot trace the sources accurately, we are unlikely to trust the output. Imagine reading a literature review where no sources were cited. The technique used to provide as accurate as possible... Continue Reading →
Clustermap of Text Embeddings for Critical Minerals
Text embeddings for Critical Minerals: Applying unsupervised machine learning to thousands of geological survey reports to create text embeddings. These are displayed in a clustermap, which is a matrix plot with a heatmap and two clustering dendrograms. The 37 critical minerals for the energy transition (IEA) are on the x-axis and US states on the... Continue Reading →
Over 150 BSc, MSc and PhD geological questions released to help benchmark geological Gen AI
Over 150 BSc, MSc and PhD geological questions released to help benchmark geological Gen AI. These were released by the team at GeologyOracle, the free AI that answers geological questions, extracts data from documents, codes, and interprets sketches and photographs. Hopefully more elements will be open-sourced over the coming months such as the open-access training data... Continue Reading →
Critical Minerals, Artificial Intelligence and the United States Geological Survey (USGS).
A collaboration between the USGS, DARPA, and ARPA-E called CriticalMAAS could deliver AI tools to solve US critical mineral challenges. “Geologists and innovators from the U.S. Geological Survey, the Defense Advanced Research Projects Agency (DARPA), the Advanced Research Projects Agency-Energy (ARPA-E), and other partners came together Jan. 13-17 to collaborate, train, and transition artificial intelligence (AI)... Continue Reading →
Geological Artificial Intelligence: Large image caption models and fossil detection
Experimenting with Florence-2 image captioning. It does a good job in this example, predicting it's a fossil tooth. I would have been even more impressed if it had described it as a mastodon tooth, but that is where domain training may take over. You can try it for yourself on Hugging Face to test the capabilities and understand... Continue Reading →
You Only Look Once (YOLO) Models and Environmental Geoscience
There are so many applications of machine vision YOLO (You Only Look Once) models in the earth sciences. I came across a few environmental examples where YOLO models were used on unmanned boats and drones to detect rubbish (garbage). This might help clean up our rivers to some extent. There are other examples where even recyclables can... Continue Reading →
Using Large Language Models (Google Gemini) to estimate earthquake shaking intensity from social media posts
Using Large Language Models to estimate the intensity of earthquake shaking from multimodal social media posts. Interesting paper from Mousavi et al. (2025) using Google’s Gemini 1.5 Pro LLM to estimate earthquake intensity from social media and CCTV. The authors state: “Our experiments demonstrate that Gemini can estimate ground shaking intensity based on the content of a... Continue Reading →
Querying structured databases in natural language using Large Language Models (OpenAI’s GPT-4) for Geoscience Data Analysis
Open access code: Querying one of the largest mineral databases in the world using natural language for co-occurrence mineral analysis and heat map visualization for geoscience data analysis. Interesting paper from Zhang et al. (2025) from the University of Idaho connecting OpenAI's GPT-4o Large Language Model (LLM) through prompt engineering to the mineral database Mindat... Continue Reading →