Graph of Text Embeddings

Showing terms similar to a search query, in this case 'Pegmatite', using text embeddings in a graph network. This can support data exploration. The thickness of each edge is related to cosine similarity (the thicker the green line, the more similar the association). For this visualisation I modelled text embeddings into NetworkX to display in... Continue Reading →
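
As a rough illustration of the idea in this excerpt, the sketch below ranks terms by cosine similarity to a query and maps each similarity to an edge width. It is a minimal sketch and not the code behind the original visualisation: the toy three-dimensional vectors, the function names, the `top_n` and `max_width` parameters and the linear width scaling are all assumptions made for illustration. The optional drawing function assumes NetworkX and matplotlib are installed.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_edges(query, vectors, top_n=3, max_width=6.0):
    """Return (query, term, similarity, width) for the top_n most similar terms.

    Width scales linearly with similarity (an assumed mapping);
    negative similarities are drawn with zero width.
    """
    scored = [
        (term, cosine_similarity(vectors[query], vec))
        for term, vec in vectors.items()
        if term != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [
        (query, term, sim, max(sim, 0.0) * max_width)
        for term, sim in scored[:top_n]
    ]


def draw_similarity_graph(edges):
    """Optional: draw the edges with NetworkX (requires networkx, matplotlib)."""
    import matplotlib.pyplot as plt
    import networkx as nx

    graph = nx.Graph()
    for source, target, sim, width in edges:
        graph.add_edge(source, target, weight=sim, width=width)
    pos = nx.spring_layout(graph, seed=42)
    widths = [graph[u][v]["width"] for u, v in graph.edges()]
    nx.draw_networkx_nodes(graph, pos, node_color="lightgrey")
    nx.draw_networkx_labels(graph, pos)
    nx.draw_networkx_edges(graph, pos, width=widths, edge_color="green")
    plt.axis("off")
    plt.show()


# Hypothetical toy embeddings; real word vectors have hundreds of dimensions.
TOY_VECTORS = {
    "pegmatite": [0.9, 0.8, 0.1],
    "granite": [0.8, 0.9, 0.2],
    "lithium": [0.7, 0.5, 0.3],
    "limestone": [0.1, 0.2, 0.9],
}

edges = similarity_edges("pegmatite", TOY_VECTORS, top_n=2)
```

Calling `draw_similarity_graph(edges)` would then render the query node with one green edge per neighbour, thicker for the closer term.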

Radar Plots Using Word Vectors

Radar plots using word vectors can be useful for comparing what 'the text' of many reports (perhaps too many to read) might be saying with quantitative data on aspects such as risk or uncertainty. They may highlight mismatches or contradictions requiring further investigation. The illustrative example is driven from a corpus of millions of words... Continue Reading →

Ternary Plots Using Word Vectors

Ternary plots are used for visualisation and modelling in the geosciences for three-component correlation. Typically they represent the compositions of soils, rocks and minerals. The interactive plots above have been generated just from the patterns of millions of words trained from geoscience literature. Each axis represents cosine similarity, the closer to 1 the more... Continue Reading →
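
A ternary plot needs three components that sum to 1, so three cosine similarities have to be rescaled before they can be placed in the triangle. The sketch below shows one possible way to do that. The excerpt does not say how the original plots were scaled, so the clipping of negative values, the normalisation by the sum, the triangle orientation and both function names are assumptions.

```python
import math


def ternary_fractions(sim_a, sim_b, sim_c):
    """Rescale three cosine similarities so that they sum to 1.

    Negative similarities are clipped to zero first (an assumption).
    """
    clipped = [max(s, 0.0) for s in (sim_a, sim_b, sim_c)]
    total = sum(clipped)
    if total == 0:
        raise ValueError("at least one similarity must be positive")
    return tuple(s / total for s in clipped)


def ternary_to_xy(a, b, c):
    """Map ternary fractions (a + b + c = 1) to 2D plotting coordinates.

    Assumed triangle: A at the top (0.5, sqrt(3)/2), B at the
    bottom-left (0, 0) and C at the bottom-right (1, 0).
    """
    x = 0.5 * a + 1.0 * c
    y = (math.sqrt(3) / 2) * a
    return x, y


# Hypothetical similarities of one term to three axis terms.
fractions = ternary_fractions(0.6, 0.3, 0.1)
point = ternary_to_xy(*fractions)
```

A term that is similar only to the first axis term lands on the top vertex, and a term equally similar to all three lands at the centre of the triangle.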

Data-Driven Discovery in Geosciences: Opportunities and Challenges

Chen et al. (2023) recently published a very interesting special edition editorial for Springer's Mathematical Geosciences. "This special collection explores scientific research related to data-driven discoveries in geosciences and provides a timely presentation of progress in developments and/or applications of AI and big data approaches to multiple aspects of geosciences." I think this next... Continue Reading →

NASA BERT-E Earth Science Large Language Model (LLM)

To understand domain terminology effectively in areas like healthcare and geoscience, domain training has been shown to improve results (https://www.nature.com/articles/s41586-023-06291-2). BERT-E: The NASA IMPACT team published a paper at AGU back in 2021 on BERT-E, an Earth Science trained language model (270k articles), comparing it to Sci-BERT (see screenshot): https://agu2021fallmeeting-agu.ipostersessions.com/default.aspx?s=9D-AC-B5-BA-E8-8D-CE-44-5F-17-8E-3F-B5-16-0E-60 The model may be superseded by... Continue Reading →

3D Word Vector Plots

3D Word Vector Visualisations: The provision of free web tools for all geoscientists to easily explore hidden semantic relations in textual content may increase the chances of abductive discovery in our discipline. Following on from previous posts on word vectors in space and time, in this example the 'first' axis is Lacustrine, 'second' is Evaporite... Continue Reading →
