There are incredible open-access datasets out there. For example, the GLiM Global Lithological Map from Hartmann and Moosdorf (2012), containing over 1.2 million polygons of rock types, was made interactive through a web viewer in 2017. Link in the comments. Abstract: Lithology describes the geochemical, mineralogical, and physical properties of rocks. It plays a key role in many processes... Continue Reading →
Spatialising word vectors
In areas of sparse data, patterns in text may be a helpful geoscience screening tool. One technique is to build a text embedding model that lets you compare the vectors of target geological concepts with the vectors of location names. Disambiguation here is vitally important. The prototype example shown is for the vector of 'Monzonite' to vectors... Continue Reading →
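The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the model used in the post: it swaps a trained embedding model for raw co-occurrence counts, and the corpus, the place names `ridgeville` and `lakeside`, and the helper names `cooccurrence_vectors`, `cosine` and `rank_places` are all hypothetical. It also does no disambiguation, which a real screening tool would need.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Build a sparse count vector per word from its neighbours within `window` tokens."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_places(vectors, concept, places):
    """Rank place names by similarity of their vector to the concept vector."""
    concept_vec = vectors.get(concept, Counter())
    scores = [(p, cosine(concept_vec, vectors.get(p, Counter()))) for p in places]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy corpus standing in for thousands of geological reports.
corpus = [
    "monzonite intrusion hosts copper near ridgeville",
    "copper bearing monzonite intrusion mapped at ridgeville",
    "limestone quarry with fossils near lakeside",
    "fossils in limestone beds at lakeside",
]

vectors = cooccurrence_vectors(corpus)
ranking = rank_places(vectors, "monzonite", ["lakeside", "ridgeville"])
print(ranking)
```

In this toy corpus the place that shares context words with 'monzonite' ranks first. With a real embedding model the same ranking step would apply, only with dense vectors in place of the counts.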
Combining minerals and lithology text embeddings for data discovery
I've combined text embeddings generated from word co-occurrences within thousands of geological reports for both lithology and minerals in a 3D t-SNE plot. Following on from some recent posts I made, it may be interesting to explore similarity (cosine vector similarity) between lithologies-lithologies, minerals-minerals and lithologies-minerals. This is a technique anyone can conduct on large volumes... Continue Reading →
Text Embeddings for Rock Classifications
I tested whether we can differentiate rock types and their associations based on the patterns of words that occur around them in large archives of geological reports. Using a text embeddings model generated through unsupervised machine learning on thousands of geological survey reports, approximately 2,000 rock type names were compared to each other. The... Continue Reading →
Text Embeddings for Mineral Association Discovery
Data-driven discovery: It may be interesting to compare the similarities of minerals, based on their co-occurring words in large archives of geological reports, with known reported mineral occurrences in databases such as Mindat. One could perhaps easily automate this algorithmic comparison, leaving ranked "candidate" mineral associations not present in reference databases. There... Continue Reading →
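The automated comparison suggested above could look something like the following sketch. It assumes pairwise similarity scores have already been computed from an embedding model and that known associations have been exported from a reference database. The placeholder mineral names, the scores, the `threshold` parameter and the function name `candidate_associations` are all hypothetical, and no real database API is used.

```python
def candidate_associations(similarities, known_pairs, threshold=0.5):
    """Return mineral pairs that score highly in text but are absent from the reference set.

    similarities: dict mapping (mineral, mineral) tuples to a similarity score.
    known_pairs:  iterable of (mineral, mineral) tuples already in the reference database.
    """
    # frozenset makes the lookup order-independent: (a, b) matches (b, a).
    known = {frozenset(pair) for pair in known_pairs}
    candidates = [
        (a, b, score)
        for (a, b), score in similarities.items()
        if score >= threshold and frozenset((a, b)) not in known
    ]
    # Highest-scoring unknown pairs first.
    return sorted(candidates, key=lambda item: item[2], reverse=True)


# Hypothetical scores; placeholder names avoid implying any real association.
similarities = {
    ("mineral_a", "mineral_b"): 0.91,
    ("mineral_a", "mineral_c"): 0.74,
    ("mineral_b", "mineral_d"): 0.62,
    ("mineral_c", "mineral_d"): 0.30,
}
known_pairs = [("mineral_b", "mineral_a")]

candidates = candidate_associations(similarities, known_pairs)
for a, b, score in candidates:
    print(f"{a} + {b}: {score:.2f}")
```

The ranked list is only a set of candidates for a geologist to review. A high text similarity that is missing from the reference database may reflect a genuine gap, or simply how the reports were written.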
Misconceptions of LLM Chatbots in Geoscience
Misconceptions of LLM Chatbots: For scientists and business professionals, it is critical to know the source of any AI-generated answer or assertion. If we cannot trace the sources accurately, we are unlikely to trust the output. Imagine reading a literature review where no sources were cited. The technique used to provide as accurate as possible... Continue Reading →
Clustermap of Text Embeddings for Critical Minerals
Text embeddings for Critical Minerals: Applying unsupervised machine learning to thousands of geological survey reports to create text embeddings. These are displayed in a clustermap, which is a matrix plot with a heatmap and two clustering dendrograms. The 37 critical minerals for the energy transition (IEA) are on the x-axis and US states on the... Continue Reading →
Over 150 BSc, MSc and PhD geological questions released to help benchmark geological Gen AI
Over 150 BSc, MSc and PhD geological questions released to help benchmark geological Gen AI. These were released by the team at GeologyOracle, the free AI for answering geological questions, extracting data from documents, coding, and interpreting sketches and photographs. Hopefully more elements will be open-sourced over the coming months, such as the open-access training data... Continue Reading →
Critical Minerals, Artificial Intelligence and the United States Geological Survey (USGS)
A collaboration between the USGS, DARPA, and ARPA-E called CriticalMAAS could deliver AI tools to solve US critical mineral challenges. “Geologists and innovators from the U.S. Geological Survey, the Defense Advanced Research Projects Agency (DARPA), the Advanced Research Projects Agency-Energy (ARPA-E), and other partners came together Jan. 13-17 to collaborate, train, and transition artificial intelligence (AI)... Continue Reading →
Geological Artificial Intelligence: Large image caption models and fossil detection
Experimenting with Florence-2 image captioning. It does a good job in this example, predicting that it's a fossil tooth. I would have been even more impressed if it had described it as a Mastodon tooth, but that is where domain training may take over. You can try for yourself in Hugging Face to test the capabilities and understand... Continue Reading →