Word embeddings and language models in Geoscience

Understanding a word by the company it keeps (Firth 1957) and the Distributional Hypothesis (Harris 1954) - words that occur in the same contexts tend to have similar meanings - are concepts that have been with us for over half a century. However, in the past few years we have seen a remarkable body... Continue Reading →
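
As a rough illustration of the Distributional Hypothesis, the sketch below counts the words that appear near each term in a tiny corpus and compares the resulting context vectors with cosine similarity. The corpus, the window size and the function names (`context_vectors`, `cosine`) are all invented for this example; it is a toy count-based approach, not a trained word embedding model and not the method used in the post.

```python
from collections import Counter
from math import sqrt

# Toy corpus, invented for illustration only (not taken from the post).
corpus = [
    "the sandstone reservoir holds oil",
    "the limestone reservoir holds oil",
    "the sandstone formation holds gas",
    "the limestone formation holds gas",
    "the geologist maps the outcrop",
]


def context_vectors(sentences, window=2):
    """Count, for every word, the words seen within `window` tokens of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            # Neighbours on both sides, excluding the word itself.
            context = tokens[start:i] + tokens[i + 1:end]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


vectors = context_vectors(corpus)

# Words that keep the same company get similar vectors.
sim_rocks = cosine(vectors["sandstone"], vectors["limestone"])
sim_mixed = cosine(vectors["sandstone"], vectors["geologist"])
print(f"sandstone vs limestone: {sim_rocks:.2f}")
print(f"sandstone vs geologist: {sim_mixed:.2f}")
```

In this toy corpus 'sandstone' and 'limestone' occur in identical contexts, so they score higher than 'sandstone' and 'geologist', which share only the word 'the'. Modern embedding models learn dense vectors rather than raw counts, but they build on the same intuition.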

Appointed Visiting Professor of Information Science & Technology

Delighted to be appointed Visiting Professor of Information Science & Technology at RGU. An exciting time for studying the intersection between search & discovery in the enterprise, advanced analytics and human behaviour.

From Geological Text Mining, Bio-erosion and Oil Exploration to Plate Tectonics, Geothermal Power, Schlumberger and Wine Making!

I was in Healdsburg, California this week with the GeoScienceWorld team. Some very interesting demonstrations from the University of Kansas discussing text mining to support research questions such as "what causes bioerosion fluctuations through geological time?" which is important for oil and gas reservoir quality. Healdsburg is 70 miles north of San Francisco in the Sonoma Valley... Continue Reading →

Shark teeth

Some of the fossil shark teeth I found recently in the Peace River in Florida. Megalodon (large), Lemon Shark (top left), Sand Tiger Shark (bottom middle, narrow); others include Tiger Shark, Snaggletooth Shark and Stingray. Miocene to Early Pliocene age (23 to 5 million years ago), when most of Florida was submerged. You stand in the river... Continue Reading →

Pattern Recognition (Human Based!)

I conduct research on pattern recognition in geoscience unstructured text. But nothing can beat the real thing! A half broken Ichthyosaur Vertebra I found last month from the dark mudstones, clays & marls of the Liassic on the Dorset coast in England. Palaeontology is after all about pattern recognition. Looking for specific patterns (deductive) but... Continue Reading →

Combining meaning and prediction in text analytics

Part-of-speech (POS) tagging is an important technique in Natural Language Processing (NLP). For example, it can differentiate between 'play' (grey) as a noun and 'play' (green) as a verb. Whilst most practitioners use this technique for rule-based NLP approaches, it also has its uses in unsupervised Machine Learning (ML). For example when using vector space/text embeddings... Continue Reading →
