Natural Language Processing: Cross Plots to determine potentially hidden associations

Word embeddings in Natural Language Processing (NLP) represent words as real-valued vectors that encode their meaning. Words that lie closer together in the vector space are likely to be similar in meaning.
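As a rough illustration of "closer in vector space", the sketch below measures closeness with cosine similarity, one common choice (the post does not say which measure was used). The three-dimensional vectors and the mineral names attached to them are invented for the example; real embeddings typically have hundreds of dimensions and are learned from text.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, made up for illustration only (not taken from any trained model).
toy_vectors = {
    "pyrite": [0.9, 0.1, 0.3],
    "chalcopyrite": [0.8, 0.2, 0.4],
    "kaolinite": [0.1, 0.9, 0.2],
}

sim_close = cosine_similarity(toy_vectors["pyrite"], toy_vectors["chalcopyrite"])
sim_far = cosine_similarity(toy_vectors["pyrite"], toy_vectors["kaolinite"])

print(f"pyrite vs chalcopyrite: {sim_close:.3f}")
print(f"pyrite vs kaolinite:    {sim_far:.3f}")
```

In this toy setup the two sulphide vectors point in nearly the same direction, so they score higher than the sulphide and clay pair.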

Built from 5,000 reports, the cross plot above compares the word vectors for 1,500 minerals with the word vectors for "hydrothermal" (x-axis) and "mudrocks" (y-axis).
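One way such a plot could be constructed is to give each mineral an x-coordinate equal to its similarity to one concept vector and a y-coordinate equal to its similarity to the other. The sketch below assumes cosine similarity as the measure and uses invented toy vectors; the function name `cross_plot_coordinates` and all of the data are hypothetical, not the method or data behind the plot in this post.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cross_plot_coordinates(term_vectors, x_concept, y_concept):
    """Map each term to (similarity to x_concept, similarity to y_concept)."""
    return {
        term: (
            cosine_similarity(vec, x_concept),
            cosine_similarity(vec, y_concept),
        )
        for term, vec in term_vectors.items()
    }


# Toy concept and mineral vectors, invented for illustration only.
hydrothermal = [1.0, 0.2, 0.0]
mudrocks = [0.0, 0.3, 1.0]
minerals = {
    "pyrite": [0.9, 0.3, 0.2],
    "illite": [0.1, 0.2, 0.9],
    "quartz": [0.5, 0.5, 0.5],
}

coords = cross_plot_coordinates(minerals, hydrothermal, mudrocks)
for name, (x, y) in coords.items():
    print(f"{name:8s} x={x:.3f} y={y:.3f}")
```

The resulting pairs can be passed to any scatter-plot tool. Points far from the main cloud, or high on both axes, are the candidates worth a closer look.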

By cross plotting numerous concepts in various contexts, it is possible to surface previously unknown associations that may lead to new research questions or discoveries.

See also: https://paulhcleverley.com/2019/06/03/word-embeddings-and-language-models/
