Geological Inference from Textual Data using Word Embeddings.

Geospatial Exploration through Natural Language Processing: Geological Inference from Textual Data using Word Embeddings.

Interesting paper from Linphrachaya et al. (2025). “By automating the initial stages of resource exploration, we aim to provide a tool that can streamline the process of identifying promising locations for further investigation. Given the critical importance of lithium for the future of renewable energy technologies, our research offers timely insights into how data-driven techniques can complement traditional exploration methods.”

This builds on similar research I published in 2023 on my blog.

Abstract
This research explores the use of Natural Language Processing (NLP) techniques to locate geological resources, with a specific focus on industrial minerals. By using word embeddings trained with the GloVe model, we extract semantic relationships between target keywords and a corpus of geological texts. The text is filtered to retain only words with geographical significance, such as city names, which are then ranked by their cosine similarity to the target keyword. Dimensional reduction techniques, including Principal Component Analysis (PCA), Autoencoder, Variational Autoencoder (VAE), and VAE with Long Short-Term Memory (VAE-LSTM), are applied to enhance feature extraction and improve the accuracy of semantic relations. For benchmarking, we calculate the proximity between the ten cities most semantically related to the target keyword and identified mine locations using the haversine equation. The results demonstrate that combining NLP with dimensional reduction techniques provides meaningful insights into the spatial distribution of natural resources. Although the result shows to be in the same region as the supposed location, the accuracy has room for improvement.
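
To make the ranking and benchmarking steps in the abstract a bit more concrete, here is a minimal Python sketch. It is not the authors' code: the function names, the tiny two-dimensional vectors and the place labels are all hypothetical, and real GloVe embeddings would have many more dimensions. It only shows the two formulas the abstract names, cosine similarity for ranking place names against a keyword and the haversine equation for measuring distance to a known mine location.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_places(target_vec, place_vecs, top_k=10):
    """Return the top_k (name, score) pairs, most similar first."""
    scores = [(name, cosine_similarity(target_vec, vec))
              for name, vec in place_vecs.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_k]


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in degrees.

    radius_km is a mean Earth radius (an assumption of this sketch).
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to guard against tiny floating-point overshoot above 1.
    return 2 * radius_km * math.asin(math.sqrt(min(1.0, a)))


# Toy embeddings, made up purely for illustration.
keyword_vec = [1.0, 0.0]
place_vecs = {
    "place_a": [1.0, 0.0],
    "place_b": [0.0, 1.0],
    "place_c": [1.0, 1.0],
}

ranked = rank_places(keyword_vec, place_vecs, top_k=10)
ranked_names = [name for name, _ in ranked]
```

With real data, each ranked place name would be geocoded and its haversine distance to the nearest known mine computed, which is roughly the benchmark the abstract describes.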

https://arxiv.org/pdf/2504.07490
