Congratulations to Dean Pereira de Melo, Geological Data Manager at Petrobras in Brazil, on his MSc with Distinction award! His dissertation on "Information Culture in Oil & Gas Companies" is an insightful work of value to academics & practitioners. He undertook his Master's in Petroleum Data Management at the University of Aberdeen where I acted... Continue Reading →
Erratics in Central Park, New York
Even amongst the hustle and bustle of New York City, geological marvels can be found. I could not resist taking a photo of these 12-foot-high boulders while visiting this week. Central Park is peppered with huge boulders that look precariously perched on top of the ancient glistening bedrock. These are rounded glacial 'erratics'... Continue Reading →
Transforming Text Extraction in Petroleum Geoscience through Machine Learning: 94.52% Accuracy
One of the key tasks in Natural Language Processing (NLP) for the Petroleum Geoscientist is detecting entities in text, such as 'source rock'. The challenge is that using only the term 'source rock' and its plural form 'source rocks' would miss 22% of all occurrences (false negatives, which lower recall) for 'source' in its word sense of... Continue Reading →
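To illustrate the recall problem in miniature, here is a minimal sketch in Python. The labelled sentences, the `PATTERN` regex and the `recall` helper are all hypothetical, made up for this example; they are not the data or the method behind the figures quoted above.

```python
import re

# Hypothetical hand-labelled sentences: True means the sentence really
# refers to a petroleum source rock (illustrative data, not from the study).
LABELLED = [
    ("The Kimmeridge Clay is the main source rock in the basin.", True),
    ("Several source rocks are present in the Jurassic section.", True),
    ("The shale is a rich, oil-prone source.", True),
    ("Mature source intervals occur below 3000 m.", True),
    ("The data source for this map is a public report.", False),
]

# Exact-term matcher: only 'source rock' and its plural 'source rocks'.
PATTERN = re.compile(r"\bsource rocks?\b", re.IGNORECASE)


def recall(labelled, pattern):
    """Fraction of the true mentions that the pattern actually finds."""
    positives = [text for text, is_entity in labelled if is_entity]
    found = sum(1 for text in positives if pattern.search(text))
    return found / len(positives)


print(f"recall = {recall(LABELLED, PATTERN):.2f}")
```

On this toy set the exact-term matcher finds only the two sentences that literally contain 'source rock(s)' and misses the two where 'source' is used alone in the same sense, which is the kind of false negative described above.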
Machine Learning in Oil & Gas Exploration: Clustering Annotations
I've clustered the labels I annotated recently for 22,528 sentences (extracted from randomly sampled public domain petroleum exploration reports). There are 73 labels; I've shown a subset in the poster above. The labels represent 96,197 label relations (arc edges). The hierarchical cluster heatmap (Metsalu and Vilo 2015) in the poster uses Pearson correlation (rather than... Continue Reading →
Bypassed Information Pay
I have been thinking about conceptual models relating to the vast (and continually growing) unstructured text collections within enterprises, regardless of whether these are in hardcopy form in libraries/archives or digital form on file systems or document management systems. In the oil & gas industry, the concept of 'missed pay' is given to a reservoir zone... Continue Reading →
Ammonite Pavement
It was a privilege this week to see the 'ammonite pavements' in Lyme Regis on the Jurassic Coast in Dorset, UK. Above are some of my photographs of hundreds of large ammonites exposed at low tide. It is thought-provoking to imagine, as you walk over the fossilised sea floor of 200 million years ago, when the UK... Continue Reading →
Finished labelling 25,000 petroleum geoscience sentences for machine learning
It’s taken me six months of elapsed time, but I have finally finished manually labelling 25,000 (yes - twenty-five thousand!) petroleum geoscience sentences from global public domain sources. I’m using these to experiment with training a machine learning classifier which, using deep context, can predict the topics of any passage of geoscience text hitherto unseen by the... Continue Reading →
Introducing the DMA Model for Text Analytics
When presented with large volumes of text, there are a number of text analytics techniques that can be applied. I developed the DMA Model as a simple conceptual way to categorize the main types. Rule-based or machine learning techniques can be used individually or together for each of these 3 areas: Document Centric: This scenario occurs... Continue Reading →
Enterprise Search: A State Of The Art
New research on enterprise search was published this week, with insights from the petroleum, life sciences, aerospace, intelligence services, manufacturing, retail and legal sectors for digital transformation. In many respects Enterprise Search & Discovery may have become part of the Corporate Exobrain, complementing tacit networks and allowing individuals and teams to extend brainpower by searching and exploiting explicit... Continue Reading →
Word embeddings and language models in Geoscience
Understanding a word by the company it keeps (Firth 1957) and the Distributional Hypothesis (Harris 1954) - words that occur in the same contexts tend to have similar meanings - are concepts that have been with us for over half a century. However, in the past few years we have seen a remarkable body... Continue Reading →
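As a rough illustration of the Distributional Hypothesis, the sketch below counts the words that appear around a target word and compares targets by cosine similarity. The toy corpus, the `context_vector` and `cosine` helpers and the window size are assumptions made purely for this example; this is a simple count-based approach, not the word embedding or language model techniques the post goes on to discuss.

```python
from collections import Counter
import math

# Toy corpus (hypothetical sentences, for illustration only).
CORPUS = [
    "the shale is a mature source rock",
    "the mudstone is a mature source rock",
    "the sandstone is a porous reservoir rock",
    "the limestone is a porous reservoir rock",
]


def context_vector(word, sentences, window=4):
    """Count the words appearing within `window` tokens of `word`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i, tok in enumerate(tokens):
            if tok == word:
                lo, hi = max(0, i - window), i + window + 1
                # Add every neighbour in the window except the word itself.
                counts.update(
                    t for j, t in enumerate(tokens[lo:hi], lo) if j != i
                )
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


shale = context_vector("shale", CORPUS)
print("shale vs mudstone :", cosine(shale, context_vector("mudstone", CORPUS)))
print("shale vs sandstone:", cosine(shale, context_vector("sandstone", CORPUS)))
```

In this toy corpus 'shale' and 'mudstone' share all of their context words, so they score higher than 'shale' and 'sandstone', which share only the function words.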