Another sentiment visualisation using just text embeddings, tracking changes over time. In this case, it uses a subset of North Sea Transition Authority offshore licence reports from 2008 to 2017 for the word vector 'seal'. These types of techniques can support analogues and insights for renewables, carbon capture and storage sites, subsurface radioactive storage, oil & gas exploration, mineral exploration, geohazards... Continue Reading →
It’s all about the data
It's all about the data. There are some fascinating interactive visualisations available from the Organisation for Economic Co-operation and Development (OECD). This chart shows the flow of Venture Capital (VC) investment in data startups by industry, from one country to another in 2022. These can be animated through time, 2023 showing growth in healthcare and... Continue Reading →
Natural Language Processing (NLP) Research Taxonomy
The area of Natural Language Processing (NLP) research has exploded in recent times. Building Large Language Models (LLMs) is a big player within the NLP landscape, but not the only game in town. I would like to point you towards an excellent paper by Schopf et al. (2023), who classified and analysed NLP research papers... Continue Reading →
Using Natural Language Processing (Transformers) for Subsurface Carbon Capture and Storage Site Selection.
Mathur et al. (2023) published an interesting paper recently: "Transformers for Site Assessment for Carbon Capture and Sequestration using Legacy Well Data", Y Mathur, J Chen, I Folmar, Z Dong, Q Su, L Lu, M Sidahmed, Third EAGE Digitalization Conference and Exhibition 2023 (1), 1-5, 2023. Carbon Capture and Sequestration (CCS) is one of the... Continue Reading →
Geoscience Sentiment (Using Text Embeddings)
I've been experimenting with using text embeddings to generate the sentiment of a corpus of documents. In this approach, sentiment is generated by geological age (but other contexts can be used). Taking any input query, e.g. "aquifer", then combining that (adding vectors) with geological age vectors and comparing the result, by cosine similarity, to the vectors of various sentiment themes,... Continue Reading →
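The vector arithmetic described in this excerpt can be sketched in a few lines. This is only an illustration under stated assumptions, not the method used in the post: the three-dimensional toy vectors, the `embeddings` dictionary, the theme words "positive" and "negative", and the `sentiment_score` helper are all hypothetical. Real vectors would come from an embedding model trained on the corpus.

```python
import math

def add(u, v):
    """Element-wise sum of two vectors of equal length."""
    return [a + b for a, b in zip(u, v)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical toy embeddings, invented purely for illustration.
embeddings = {
    "aquifer":    [0.9, 0.1, 0.2],
    "jurassic":   [0.2, 0.8, 0.1],
    "cretaceous": [0.1, 0.2, 0.9],
    "positive":   [0.7, 0.6, 0.1],
    "negative":   [0.1, 0.1, 0.8],
}

def sentiment_score(query, age, positive="positive", negative="negative"):
    """Add the query and age vectors, then compare the sum to two theme vectors.

    The score is (similarity to the positive theme) minus (similarity to the
    negative theme), so values above zero lean positive.
    """
    combined = add(embeddings[query], embeddings[age])
    return cosine(combined, embeddings[positive]) - cosine(combined, embeddings[negative])

for age in ("jurassic", "cretaceous"):
    print(age, round(sentiment_score("aquifer", age), 3))
```

Subtracting the two similarities is one simple way to collapse several themes into a single number; keeping the per-theme similarities separate would be an equally reasonable choice.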
Word Vectors (Embeddings) through (Geological) Time.
I've created text embeddings based on all of Geoscience Australia's Stratigraphic Unit Descriptions (18,500+ data points). The example below is a boxplot of the similarity of 'granite' (as a word vector) on the x-axis through geological time (vectors on the y-axis), with old at the base and young at the top. The further to the right (closer to... Continue Reading →
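As a rough sketch of the data preparation behind a boxplot like this, the snippet below groups cosine similarities to a query vector by geological age and summarises each group. Everything in it is assumed for illustration: the two-dimensional vectors, the age labels attached to each unit, and the `similarity_by_age` helper are hypothetical and are not taken from the Geoscience Australia data.

```python
import math
from collections import defaultdict
from statistics import median

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical query vector standing in for the word vector of 'granite'.
granite = [0.9, 0.3]

# Hypothetical (age, description vector) pairs, one per stratigraphic unit.
units = [
    ("Archean",  [0.8, 0.2]),
    ("Archean",  [0.9, 0.4]),
    ("Archean",  [0.5, 0.5]),
    ("Cenozoic", [0.1, 0.9]),
    ("Cenozoic", [0.3, 0.8]),
    ("Cenozoic", [0.2, 0.7]),
]

def similarity_by_age(query_vec, units):
    """Return {age: [cosine similarity of each unit in that age to the query]}."""
    groups = defaultdict(list)
    for age, vec in units:
        groups[age].append(cosine(query_vec, vec))
    return dict(groups)

groups = similarity_by_age(granite, units)
for age, sims in groups.items():
    # Minimum, median and maximum are the values a boxplot would draw from.
    print(age, round(min(sims), 3), round(median(sims), 3), round(max(sims), 3))
```

The per-age lists in `groups` could be handed directly to a plotting library to draw one box per age.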
Text embeddings – baseflow, interflow and runoff
Using word vectors from 1,500 papers in the NERC Open Research Archive (NORA) on hydrogeology to automatically compare minerals to the three-component system baseflow-interflow-runoff in a ternary diagram. #hydrology #hydrogeology #geology #naturallanguageprocessing #groundwater #geotechnical #geoenvironmental #mineralogy #artificialintelligence
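One way to place a word on a ternary diagram is to take its similarity to each of the three end members and normalise the three values so they sum to one. The sketch below assumes that approach; the post does not say how its coordinates were computed. The toy vectors, the mineral names and the `ternary_coordinates` helper are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def ternary_coordinates(sims):
    """Turn three similarities into ternary fractions that sum to 1.

    Negative similarities are clipped to zero (an assumption made here so
    that every fraction stays inside the triangle).
    """
    clipped = [max(s, 0.0) for s in sims]
    total = sum(clipped)
    if total == 0:
        return (1 / 3, 1 / 3, 1 / 3)  # no signal: place the point at the centre
    return tuple(s / total for s in clipped)

# Hypothetical toy vectors for the three end members and two minerals.
end_members = {
    "baseflow":  [1.0, 0.1, 0.1],
    "interflow": [0.1, 1.0, 0.1],
    "runoff":    [0.1, 0.1, 1.0],
}
minerals = {
    "calcite": [0.8, 0.3, 0.1],
    "quartz":  [0.2, 0.3, 0.9],
}

for name, vec in minerals.items():
    sims = [cosine(vec, end_members[k]) for k in ("baseflow", "interflow", "runoff")]
    print(name, [round(c, 3) for c in ternary_coordinates(sims)])
```

Each tuple gives the fractions along the baseflow, interflow and runoff axes, in that order.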
Detecting objects on images in documents
It can be useful to detect objects on images within documents. I labelled boreholes/well objects on 30 public domain images to illustrate what results can be achieved in less than an hour on unseen data. There are many other use cases in the subsurface such as objects on borehole logs, satellite imagery, remote sensing, thin... Continue Reading →
Academic Publishing on Geoscience Natural Language Processing (NLP)
It's been 10 years since my first academic research into Natural Language Processing (NLP) applied to Geoscience. I've done a quick analysis in Google Scholar for one measure of how the discipline has evolved. It is exciting to see the probable exponential growth in the number of papers about (or referencing) this topic. I've also... Continue Reading →
Towards General Geoscience Artificial Intelligence Systems
Interesting article from Zhang and Xu (2023) postulating what geoscience language models may become: multi-disciplinary, with multi-modal inputs and outputs. They state that language models' capability for scenario planning and qualifying uncertainty means they could be a critical tool to address important issues such as climate change, natural hazards and sustainable development of natural resources. They describe... Continue Reading →