Contradictions in text
I have been conducting some research on natural language processing, blending textual entailment using probabilistic language models with graph-based associative text extraction. Take the two sentences: 1. Good oil shows and good poroperm values were observed in core x in the Raptor Sandstone. 2. The Raptor Sandstone was water wet. The predictive model... Continue Reading →
Friend Of A Friend (FOAF) Concepts, Rocks and Inference.
Friend Of A Friend (FOAF) concepts in Graph data structures can be useful for detective work. For example: Mr Green "knows" Mr Blue who "knows" Mr Red. Mr Green and Mr Red "stayed at" the same address in 2010. But Mr Green and Mr Red (deny that they) "know" each other. Automatically extracting people's names,... Continue Reading →
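As a rough illustration of the Mr Green / Mr Red example, here is a minimal sketch in plain Python. It assumes the "knows" and "stayed at" relations have already been extracted as triples; the triples, the address label and the function names are all hypothetical and are not taken from the post. It flags pairs of people who share an address and a mutual acquaintance but have no direct "knows" link.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (subject, relation, object) triples mirroring the example above.
triples = [
    ("Mr Green", "knows", "Mr Blue"),
    ("Mr Blue", "knows", "Mr Red"),
    ("Mr Green", "stayed_at", "Address A (2010)"),
    ("Mr Red", "stayed_at", "Address A (2010)"),
]


def build_indexes(triples):
    """Index 'knows' as an undirected adjacency map and 'stayed_at' by address."""
    knows = defaultdict(set)
    stayed = defaultdict(set)
    for subject, relation, obj in triples:
        if relation == "knows":
            knows[subject].add(obj)
            knows[obj].add(subject)
        elif relation == "stayed_at":
            stayed[obj].add(subject)
    return knows, stayed


def candidate_links(triples):
    """Return (person_a, person_b, address, mutual_contacts) for each pair that
    shares an address and a friend of a friend, but has no direct 'knows' edge."""
    knows, stayed = build_indexes(triples)
    results = []
    for address, people in stayed.items():
        for a, b in combinations(sorted(people), 2):
            if b in knows[a]:
                continue  # a direct link is already recorded, nothing to infer
            mutual = knows[a] & knows[b]
            if mutual:
                results.append((a, b, address, sorted(mutual)))
    return results


for a, b, address, mutual in candidate_links(triples):
    print(f"{a} and {b} both stayed at {address} and share contacts: {mutual}")
```

The output is only a lead for further checking: a shared address and a mutual contact suggest a possible link, they do not prove one.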
Question and Answer Digital Assistant
It’s become quite easy to deploy simple question & answer extraction tools to search unstructured text. I posed the question “What share of the UK market do electric cars have in 2022?”. This could be phrased in many ways, such as “In the UK what is the electric car market share?” etc., which returns the... Continue Reading →
Garnet Schist from Wissahickon Valley Philadelphia
At a GeoScienceWorld meeting this week. Some beautiful dark red garnets in a schist (metamorphosed shale) from Wissahickon Valley in Philadelphia.
Searching for information – what’s user satisfaction got to do with it?
Twenty-six business professionals in a multinational corporation were asked to assess their search skills before undertaking two exploratory search tasks (with no single right result) using their enterprise search engine. Task #1 could potentially have many results, Task #2 very few. For each task, four high-value documents were hidden in the search... Continue Reading →
The value of data – through text analytics
I created a vector space model using 700 UK license relinquishment reports, comparing companies to risk (x-axis) and uncertainty (y-axis) using word vectors and cosine similarity. Based on patterns in text, those companies in the top right quadrant have a higher 'similarity' to risk and uncertainty; those in the bottom left, the opposite. The companies... Continue Reading →
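To make the cosine similarity step concrete, here is a minimal sketch in plain Python. The three-dimensional vectors and company names are invented purely for illustration; in practice the vectors would come from a model trained on the reports, and nothing here reproduces the actual model or its results. Each company's similarity to "risk" gives an x coordinate and its similarity to "uncertainty" gives a y coordinate.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors; real word vectors typically have hundreds of dimensions.
concepts = {
    "risk": [1.0, 0.0, 0.0],
    "uncertainty": [0.0, 1.0, 0.0],
}
companies = {
    "Company A": [0.9, 0.8, 0.1],
    "Company B": [0.1, 0.1, 0.9],
}


def cross_plot_coordinates(companies, concepts):
    """Map each company to (similarity to risk, similarity to uncertainty)."""
    return {
        name: (
            cosine_similarity(vector, concepts["risk"]),
            cosine_similarity(vector, concepts["uncertainty"]),
        )
        for name, vector in companies.items()
    }


for name, (x, y) in cross_plot_coordinates(companies, concepts).items():
    print(f"{name}: risk={x:.2f}, uncertainty={y:.2f}")
```

With these toy values, Company A lands towards the top right of the plot and Company B towards the bottom left.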
Natural Language Processing: Cross Plots to determine potentially hidden associations
Word embeddings in Natural Language Processing (NLP) are a representation of words in real-valued vectors that encode the meaning of the word. Words closer in vector space are likely to be similar in meaning. From 5,000 reports, the cross plots above show the word vectors for 1,500 minerals to the word vectors for hydrothermal... Continue Reading →
Geology of Mars by Text Analytics 2
I have been experimenting with text analytics on 500 public Mars Geology documents. Following on from my last post spatialising data on a map, I have also explored multivariate heat map clustering. Recognition to Metsalu and Vilo (2015) for clustering visualisations originally developed for Nucleic Acid research.
Mars Geology by text analytics
#mars #nasa #geology #naturallanguageprocessing