
Text embeddings for Critical Minerals: Applying unsupervised machine learning to thousands of geological survey reports to create text embeddings. These are displayed in a clustermap, which is a matrix plot combining a heatmap with two clustering dendrograms. The 37 critical minerals for the energy transition (IEA) are on the x-axis and US states are on the y-axis to illustrate the technique, but the entities could be anything of course.
The heatmap is clustered on the distances between the embeddings for these entities, so similar rows and columns sit next to each other. Each cell is shaded by the similarity of the text embeddings (i.e. red=similar, blue=dissimilar). This gives us a data-driven view of which states and minerals are similar, generated from the patterns of words in text.
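As a rough illustration of the clustering step described above, here is a minimal Python sketch. It is not the pipeline used for the plot in this post: the embeddings are random stand-in vectors, the short state and mineral lists are placeholders, and the helper names (cosine_similarity_matrix, clustered_order, plot_clustermap) are hypothetical. It assumes cosine similarity as the measure and average-linkage clustering, which may differ from what was actually used.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list


def cosine_similarity_matrix(row_vecs, col_vecs):
    """Cosine similarity between every row entity and every column entity."""
    rows = row_vecs / np.linalg.norm(row_vecs, axis=1, keepdims=True)
    cols = col_vecs / np.linalg.norm(col_vecs, axis=1, keepdims=True)
    return rows @ cols.T


def clustered_order(matrix):
    """Leaf order from average-linkage hierarchical clustering of the rows."""
    return leaves_list(linkage(matrix, method="average", metric="euclidean"))


# Assumption: random vectors stand in for real text embeddings.
rng = np.random.default_rng(0)
states = ["Alaska", "Arizona", "Nevada", "Utah", "Wyoming"]
minerals = ["lithium", "cobalt", "nickel", "graphite"]
state_vecs = rng.normal(size=(len(states), 50))
mineral_vecs = rng.normal(size=(len(minerals), 50))

similarity = cosine_similarity_matrix(state_vecs, mineral_vecs)
row_order = clustered_order(similarity)    # reorders the states
col_order = clustered_order(similarity.T)  # reorders the minerals
reordered = similarity[np.ix_(row_order, col_order)]


def plot_clustermap(matrix, row_labels, col_labels):
    """Optional: draw the heatmap with both dendrograms (needs pandas, seaborn)."""
    import pandas as pd
    import seaborn as sns

    frame = pd.DataFrame(matrix, index=row_labels, columns=col_labels)
    # The coolwarm colormap shades high values red and low values blue.
    return sns.clustermap(frame, method="average", metric="euclidean", cmap="coolwarm")
```

Calling plot_clustermap(similarity, states, minerals) would draw the figure; seaborn does the row and column reordering itself, so the explicit reordering above is only there to show what happens under the hood.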
Where the volume of geoscience text is too great for us to read, we can use these techniques to explore patterns and relationships for further investigation, perhaps identifying anomalies or entities that cluster with other entities in surprising ways, just from latent patterns of words in text.
Any very large corpus of documents can yield patterns that may lead us to new ideas and theories not present in any single document. Similar techniques using unstructured text for knowledge discovery have already led to new scientific discoveries in the geological sciences, biosciences and materials science.
#geology #geosciences #earthsciences #machinelearning #unstructureddata #naturallanguageprocessing #artificialintelligence #ai #criticalminerals #analytics #energytransition #data