Natural Language Processing: Cross Plots to determine potentially hidden associations

Word embeddings in Natural Language Processing (NLP) represent words as real-valued vectors that encode their meaning. Words that lie closer together in vector space are likely to be similar in meaning.
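
To make "closer in vector space" concrete, here is a minimal sketch using cosine similarity, one common way to compare embeddings. The post does not say which measure was used, so cosine similarity is an assumption here, and the three-dimensional vectors are invented purely for illustration (real embeddings typically have hundreds of dimensions).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, NOT taken from the reports in this post.
toy_vectors = {
    "pyrite": [0.9, 0.1, 0.2],
    "chalcopyrite": [0.8, 0.2, 0.3],
    "shale": [0.1, 0.9, 0.4],
}

sim_sulfides = cosine_similarity(toy_vectors["pyrite"], toy_vectors["chalcopyrite"])
sim_mixed = cosine_similarity(toy_vectors["pyrite"], toy_vectors["shale"])

# With these toy numbers the two sulfide minerals score as more similar
# to each other than pyrite does to shale.
print(sim_sulfides > sim_mixed)
```

With these made-up vectors, the two sulfide minerals point in nearly the same direction, so they score higher than the pyrite and shale pair.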

Drawn from 5,000 reports, the cross plot above plots the word vectors for 1,500 minerals against the word vectors for hydrothermal (x-axis) and mudrocks (y-axis).
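
One way such a cross plot could be built is to score each mineral against the two concept vectors and use the pair of scores as its (x, y) position. The sketch below assumes cosine similarity as the score, which the post does not confirm. The function name `cross_plot_points`, the mineral names, and all vectors are hypothetical, chosen only to show the mechanics.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cross_plot_points(term_vectors, x_concept, y_concept):
    """Map each term to (similarity to x_concept, similarity to y_concept)."""
    return {
        term: (cosine_similarity(vec, x_concept), cosine_similarity(vec, y_concept))
        for term, vec in term_vectors.items()
    }


# Hypothetical 2-dimensional vectors, invented for illustration only.
hydrothermal = [1.0, 0.0]  # x-axis concept
mudrocks = [0.0, 1.0]      # y-axis concept
minerals = {
    "mineral_a": [3.0, 4.0],
    "mineral_b": [1.0, 0.0],
}

points = cross_plot_points(minerals, hydrothermal, mudrocks)
for name, (x, y) in sorted(points.items()):
    print(f"{name}: x={x:.2f}, y={y:.2f}")
```

Each resulting pair can be passed to any scatter-plot tool. A term that scores high on both axes sits in the upper-right corner of the plot.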

By cross plotting numerous concepts in various contexts, it’s possible to surface hitherto unknown associations leading to new research questions or discoveries.
