Combining meaning and prediction in text analytics

Part-of-speech (POS) tagging is an important technique in Natural Language Processing (NLP). For example, it can differentiate between 'play' (grey) as a noun and 'play' (green) as a verb. Whilst most practitioners use this technique for rule-based NLP approaches, it also has its uses in unsupervised Machine Learning (ML), for example when using vector space/text embeddings... Continue Reading →
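As a rough illustration of the noun/verb distinction described above, here is a minimal sketch in Python. It is a hand-written context rule, not a trained tagger and not the method used in the post; the word lists `DETERMINERS` and `VERB_CUES` and the function `tag_play` are hypothetical names invented for this example.

```python
# Toy illustration only: a single context rule for the word 'play'.
# The cue lists below are assumptions made for this sketch.
DETERMINERS = {"a", "an", "the", "this", "that", "each", "every"}
VERB_CUES = {"to", "i", "we", "you", "they", "will", "can", "should"}


def tag_play(tokens):
    """Return (token, tag) pairs, tagging only the word 'play'.

    'play' is tagged NOUN after a determiner, VERB after a pronoun,
    modal or 'to', and UNKNOWN otherwise. All other tokens get OTHER.
    """
    tagged = []
    for i, token in enumerate(tokens):
        tag = "OTHER"
        if token.lower() == "play":
            previous = tokens[i - 1].lower() if i > 0 else ""
            if previous in DETERMINERS:
                tag = "NOUN"
            elif previous in VERB_CUES:
                tag = "VERB"
            else:
                tag = "UNKNOWN"
        tagged.append((token, tag))
    return tagged


noun_example = tag_play("The play extends offshore".split())
verb_example = tag_play("They play a key role".split())
```

A real tagger would use a trained statistical or neural model over the whole sentence; the rule here only shows why the surrounding words matter.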


What are we missing? The effect of semantics on search results.

Semantics is about meaning. We often use different terms to describe the same thing. This might be because the words are very similar in meaning (synonyms), or because we choose different levels of granularity when we describe something (hypernyms). See Table 1 for a more complete list. Table 1 - Some different types of lexical... Continue Reading →
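To make the effect on search results concrete, the following minimal sketch compares exact keyword matching with a query expanded by synonyms and narrower terms. The mini-lexicon (`SYNONYMS`, `NARROWER`), the sample documents and the function names are all hypothetical and chosen only for illustration; they do not come from the original post.

```python
# Hypothetical mini-lexicon for illustration only.
SYNONYMS = {"car": {"automobile"}, "automobile": {"car"}}
# Maps a broader term (hypernym) to its narrower terms.
NARROWER = {"vehicle": {"car", "truck"}}

DOCUMENTS = [
    "the car was parked outside",
    "an automobile plant opened",
    "a truck blocked the road",
]


def expand(term):
    """Return the term plus its listed synonyms and narrower terms."""
    return {term} | SYNONYMS.get(term, set()) | NARROWER.get(term, set())


def search(query, documents, use_semantics=False):
    """Return documents containing the query term or, optionally, its expansions."""
    targets = expand(query) if use_semantics else {query}
    return [doc for doc in documents if targets & set(doc.lower().split())]


exact_hits = search("car", DOCUMENTS)
expanded_hits = search("car", DOCUMENTS, use_semantics=True)
broad_exact = search("vehicle", DOCUMENTS)
broad_expanded = search("vehicle", DOCUMENTS, use_semantics=True)
```

In this toy setup, an exact search for "vehicle" finds nothing even though two documents are about vehicles, which is the kind of miss the post title asks about.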

Text Analytics – Surprising Sentences in Oil and Gas Exploration Geoscience

Driven by an information need to ‘show me something I don’t already know’, I conducted an exploratory study recently to investigate whether algorithms in general had the potential to suggest ‘surprising sentences’ from geoscience text. Ten geoscientists (consisting of 8 experienced exploration geoscientists and 2 support staff with geoscience backgrounds) each rated 100 test sentences... Continue Reading →

Interview with the AAPG

I was interviewed this week by Dr Susan Nash, Director of Innovation and Emerging Science at the American Association of Petroleum Geologists (AAPG). The topic was how I became involved in innovation, along with some of my recent work on the detection of 'surprise' and the 'potentially surprising' in geoscience texts.... Continue Reading →

Introducing Infoscience Technologies Ltd

I recently founded a new technology start-up focusing on developing Python code, lexicons and training sets to aid the extraction of knowledge from unstructured geoscience text. Target industries range from geological surveys, oil & gas exploration and economic mining through to geo-health and space exploration. The focus is development of Intellectual Property (IP) through Python... Continue Reading →

Detecting surprise in geoscience text

The video and slides from last week's conference in London are online. Click on the article title 'Detecting surprise in geoscience text' to open the full post, where the link to the video is clickable. Click here to watch the talk and slides.

Oil and Gas Taxonomy

Using taxonomies and ontologies to extract knowledge from text. Domain taxonomies can play a crucial role in many automated Machine Learning tasks. However, one study showed that over 34% of concepts in a taxonomy can remain undetected (false negatives) if the taxonomy is only created manually. Augmenting the taxonomy design process with inductive... Continue Reading →

Enterprise search satisfaction

Happy New Year! A nice start to 2019: an academic paper published on 2nd Jan 2019 in Vol 45(1) of the Journal of Information Science, "Enterprise search and discovery capability: The factors and generative mechanisms for user satisfaction". Available in RGU OpenAir here and SAGE Journals (subscription) here.
