Text embeddings – baseflow, interflow and runoff

Using word vectors from 1,500 hydrogeology papers in the NERC Open Research Archive (NORA) to automatically compare minerals against the three-component system baseflow-interflow-runoff in a ternary diagram. #hydrology #hydrogeology #geology #naturallanguageprocessing #groundwater #geotechnical #geoenvironmental #mineralogy #artificialintelligence
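One way a word could be placed on a baseflow-interflow-runoff ternary diagram is to take its cosine similarity to each of the three pole words and rescale the three values so they sum to one. The sketch below is an assumption about how this might be done, not the method used in the post. The vectors are made-up three-dimensional toys, and the names `ternary_coordinates`, `POLES` and `TOY_VECTORS` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def ternary_coordinates(word, poles, vectors):
    """Return three proportions (summing to 1) for plotting `word` on a ternary diagram.

    Negative similarities are clipped to zero before rescaling; this is an
    illustrative choice, not something stated in the post.
    """
    sims = [max(cosine(vectors[word], vectors[p]), 0.0) for p in poles]
    total = sum(sims)
    if total == 0:
        raise ValueError(f"'{word}' has no positive similarity to any pole")
    return tuple(s / total for s in sims)


POLES = ("baseflow", "interflow", "runoff")

# Toy vectors invented for illustration only; real ones would come from a
# model trained on the corpus.
TOY_VECTORS = {
    "baseflow": [1.0, 0.0, 0.0],
    "interflow": [0.0, 1.0, 0.0],
    "runoff": [0.0, 0.0, 1.0],
    "calcite": [0.8, 0.3, 0.1],
}

coords = ternary_coordinates("calcite", POLES, TOY_VECTORS)
print(dict(zip(POLES, coords)))
```

With these toy values, "calcite" plots closest to the baseflow corner. With real embeddings the position would depend entirely on the corpus.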

Detecting objects on images in documents

It can be useful to detect objects in images within documents. I labelled borehole/well objects on 30 public domain images to illustrate what results can be achieved on unseen data in less than an hour. There are many other use cases in the subsurface, such as objects on borehole logs, satellite imagery, remote sensing, thin... Continue Reading →

Towards General Geoscience Artificial Intelligence Systems

An interesting article from Zhang and Xu (2023) postulating what geoscience language models may become: multi-disciplinary, with multi-modal inputs and outputs. They state that language models' capability for scenario planning and qualifying uncertainty means they could be a critical tool for addressing important issues such as climate change, natural hazards and sustainable development of natural resources. They describe... Continue Reading →

Generating questions

I've been experimenting with using ChatGPT to generate candidate questions from document text input. The example is on Ground Source Heat Pumps (GSHP) from a British Geological Survey Report in the NORA collection. It might be useful for organisations to store a 'question bank' of such Generative AI outputs (questions) for a corpus, sliced in numerous... Continue Reading →

Text Embeddings – no single truth!

I’ve been experimenting with using text embeddings to identify relative topic emphasis in text corpora, as an example of similarity-based unsupervised machine learning. The examples below show the relative similarity of the word vectors for ‘aquifer’ (top) and ‘groundwater’ (bottom) to word vectors of various forms of contamination, comparing the US Geological Survey public collection... Continue Reading →
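As a rough illustration of comparing relative emphasis between corpora, the sketch below ranks contamination terms by cosine similarity to a target word, separately within each corpus. The vectors are invented two-dimensional toys, and `rank_by_similarity`, `CORPUS_A` and `CORPUS_B` are hypothetical names. Embedding spaces trained on different corpora are not aligned with each other, so only the similarities computed within one space are compared, never raw vectors across spaces.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_by_similarity(target, terms, vectors):
    """Return (term, similarity) pairs sorted from most to least similar to `target`."""
    scored = [(t, cosine(vectors[target], vectors[t])) for t in terms]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors invented for illustration; each dict stands in for a separate
# embedding model trained on one corpus.
CORPUS_A = {"aquifer": [1.0, 0.2], "nitrate": [0.9, 0.3], "arsenic": [0.1, 1.0]}
CORPUS_B = {"aquifer": [0.2, 1.0], "nitrate": [0.9, 0.3], "arsenic": [0.1, 1.0]}

TERMS = ["nitrate", "arsenic"]
ranking_a = rank_by_similarity("aquifer", TERMS, CORPUS_A)
ranking_b = rank_by_similarity("aquifer", TERMS, CORPUS_B)
print("Corpus A:", ranking_a)
print("Corpus B:", ranking_b)
```

In this toy setup the same word, "aquifer", sits nearest to a different contaminant in each corpus, which is the sense in which there is no single truth.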

Text Embeddings App

Using text embeddings for lookbacks. I’m making this app freely available to the Norwegian Petroleum Directorate and the UK North Sea Transition Authority, along with various NLP outputs, for the benefit of the geoscience community. This is an input to the hackathon organised by FORCE, led by Peter Bormann. This particular example uses 800 license relinquishment... Continue Reading →

Discovering topics in text

Discovering topics in text. This is an interactive noun-noun-phrase network of body text within 1,500 UK NERC Open Research Archive (NORA) groundwater hydrology reports related to aquifers. Inductive, statistical techniques of this type can be a useful first pass to assess key topics and trends in a large number of documents. Reference: van Eck, N.J. and... Continue Reading →

Text Embeddings – Analogies

Text embeddings can capture some interesting semantic relationships. Given the analogy “Quartz is to Sandstone as …….. is to Limestone”, vector additions and subtractions (latent trajectories in embedding space) produce “calcite” as the answer. Given enough text, this technique may be capable of producing results that spark new lines of thought in science... Continue Reading →
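The vector arithmetic behind such an analogy can be sketched as follows: compute quartz - sandstone + limestone and return the nearest remaining word by cosine similarity. This is a minimal sketch with hand-made three-dimensional toy vectors chosen so the arithmetic works out. `analogy` and `TOY_VECTORS` are hypothetical names, and real embeddings would need a trained model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def analogy(a, b, c, vectors):
    """Solve 'a is to b as ? is to c' by finding the word nearest to a - b + c.

    The three input words are excluded from the candidates.
    """
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


# Toy vectors invented for illustration. The dimensions loosely stand for
# (mineral, siliciclastic, carbonate).
TOY_VECTORS = {
    "quartz": [1.0, 1.0, 0.0],
    "sandstone": [0.0, 1.0, 0.0],
    "limestone": [0.0, 0.0, 1.0],
    "calcite": [1.0, 0.0, 1.0],
    "feldspar": [1.0, 0.8, 0.1],
    "shale": [0.0, 0.7, 0.2],
}

answer = analogy("quartz", "sandstone", "limestone", TOY_VECTORS)
print(answer)
```

Excluding the input words from the candidates is a common convention for this kind of query, since the nearest vector to the result is often one of the inputs themselves.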
