Understanding a word by the company it keeps (Firth 1957) and the Distributional Hypothesis (Harris 1954) - words that occur in the same contexts tend to have similar meanings - are concepts that have been with us for over half a century. However, in the past few years we have seen a remarkable body... Continue Reading →
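To make the Distributional Hypothesis concrete, here is a minimal sketch in Python. It is not taken from the post: the toy corpus, the window size and the function names are all assumptions made for illustration. It counts the words that appear near each target word and compares the resulting context vectors with cosine similarity.

```python
from collections import Counter
from math import sqrt

# Toy corpus, invented purely for illustration (not from the post).
CORPUS = [
    "the sandstone reservoir holds oil",
    "the limestone reservoir holds oil",
    "the sandstone reservoir holds gas",
    "the limestone reservoir holds gas",
    "the geologist drinks coffee",
]


def context_vectors(sentences, window=2):
    """For each word, count the words seen within `window` positions of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


vectors = context_vectors(CORPUS)
sim_rock = cosine(vectors["sandstone"], vectors["limestone"])
sim_other = cosine(vectors["sandstone"], vectors["coffee"])
print(f"sandstone vs limestone: {sim_rock:.2f}")
print(f"sandstone vs coffee:    {sim_other:.2f}")
```

In this toy corpus "sandstone" and "limestone" keep the same company, so their vectors are close to identical, while "sandstone" and "coffee" share no context words at all. Real embedding methods are far more sophisticated, but they rest on the same idea.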
Delighted to be appointed Visiting Professor of Information Science & Technology at RGU. An exciting time for studying the intersection between search & discovery in the enterprise, advanced analytics and human behaviour.
I was in Healdsburg, California this week with the GeoScienceWorld team. Some very interesting demonstrations from the University of Kansas discussing text mining to support research questions such as "what causes bioerosion fluctuations through geological time?" which is important for oil and gas reservoir quality. Healdsburg is 70 miles north of San Francisco in the Sonoma Valley... Continue Reading →
My review of Wu and Liang's book - Mobile Search Behaviors: an in-depth analysis based on contexts, apps and devices (review published in JLIS) is now on OpenAir (RGU's Open Access site). https://openair.rgu.ac.uk/handle/10059/3428
Some of the fossil shark teeth I found recently in the Peace River in Florida. Megalodon (large), Lemon Shark (top left), Sand Tiger Shark (bottom middle, narrow); others include Tiger Shark, Snaggletooth Shark and Stingray. Miocene to Early Pliocene age (23-5 million years ago) when most of Florida was submerged. You stand in the river... Continue Reading →
The write-up from the expert-centric digital technology seminar in January 2019 is now on the Finding Petroleum website: Proceedings. A good write-up of my presentation and feedback.
I conduct research on pattern recognition in unstructured geoscience text. But nothing can beat the real thing! A half-broken ichthyosaur vertebra I found last month in the dark mudstones, clays & marls of the Liassic on the Dorset coast in England. Palaeontology is after all about pattern recognition. Looking for specific patterns (deductive) but... Continue Reading →
Part of Speech (POS) tagging is an important technique in Natural Language Processing (NLP). It can, for example, differentiate between 'play' (grey) as a noun and 'play' (green) as a verb. Whilst most practitioners use this technique for rule-based NLP approaches, it also has its uses in unsupervised Machine Learning (ML). For example, when using vector space/text embeddings... Continue Reading →
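One common way to combine POS tags with embeddings, offered here as an assumption rather than as the approach described in the post, is to append the tag to each token before training, so that 'play' the noun and 'play' the verb become separate vocabulary entries. The minimal Python sketch below uses hand-tagged toy sentences; in practice the tags would come from a POS tagger, and the tag names and the `tag_tokens` helper are hypothetical.

```python
from collections import Counter

# Hand-tagged toy sentences (invented for illustration).
# In practice a POS tagger would supply the (word, tag) pairs.
TAGGED = [
    [("the", "DET"), ("play", "NOUN"), ("has", "VERB"),
     ("good", "ADJ"), ("reservoir", "NOUN"), ("quality", "NOUN")],
    [("children", "NOUN"), ("play", "VERB"), ("outside", "ADV")],
]


def tag_tokens(sentence):
    """Merge each word with its POS tag, e.g. ('play', 'NOUN') -> 'play_NOUN'."""
    return [f"{word}_{tag}" for word, tag in sentence]


# Without tags, both senses of 'play' collapse into one vocabulary entry.
plain_vocab = Counter(word for sentence in TAGGED for word, _ in sentence)

# With tags, the noun and the verb are counted (and later embedded) separately.
tagged_vocab = Counter(tok for sentence in TAGGED for tok in tag_tokens(sentence))

print(plain_vocab["play"])
print(tagged_vocab["play_NOUN"], tagged_vocab["play_VERB"])
```

The tagged token lists can then be passed to whichever embedding method is in use, at the cost of a larger vocabulary and fewer examples per entry.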
Semantics is about meaning. We often use different terms to describe the same thing. This might be because the words are very similar in meaning (synonyms), for example, or because we are choosing different levels of granularity when we describe it (hypernyms). See Table 1 for a more complete list. Table 1 - Some different types of lexical... Continue Reading →
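As a rough illustration of the two relations mentioned above, the Python sketch below uses a tiny hand-built lexicon. The entries, the `relation` function and the "hyponym" label (the inverse of a hypernym) are assumptions for illustration and are not drawn from Table 1.

```python
# Toy lexicon, invented for illustration only.
SYNONYMS = {"car": {"automobile"}, "automobile": {"car"}}
HYPERNYMS = {  # term -> its more general parent term
    "car": "vehicle",
    "automobile": "vehicle",
    "lorry": "vehicle",
    "vehicle": "machine",
}


def hypernym_chain(term):
    """Walk up the toy taxonomy, collecting ever more general terms."""
    chain = []
    while term in HYPERNYMS:
        term = HYPERNYMS[term]
        chain.append(term)
    return chain


def relation(a, b):
    """Describe how term b relates to term a in the toy lexicon."""
    if a == b:
        return "same term"
    if b in SYNONYMS.get(a, set()):
        return "synonym"
    if b in hypernym_chain(a):
        return "hypernym"  # b is a more general term for a
    if a in hypernym_chain(b):
        return "hyponym"   # b is a more specific term than a
    return "unrelated"


print(relation("car", "automobile"))
print(relation("car", "vehicle"))
print(hypernym_chain("car"))
```

Two people describing the same object as a "car" and a "vehicle" are simply working at different levels of this chain.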