I recently ran 1 million sentences from public domain geoscience articles and reports through the OpportunityFinder® algorithm.
The aim was to detect hydrocarbon exploration play elements, and interesting combinations of them, using Natural Language Processing (NLP) and Machine Learning.
This involved analysing over 2 trillion possible permutations hidden within the text. Through iterative design, I arrived at an optimised method that lets the Python-based algorithm complete this process in under one hour on a modest high street i5 laptop.
The key ingredients were hash tables combined with hierarchical logic. I’m currently experimenting with additional techniques in this area.
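To give a flavour of why hash tables and hierarchical logic help, here is a minimal Python sketch. It is not the OpportunityFinder® code; the tiny lexicon, the function names and the two-level ordering (a cheap dictionary lookup per phrase first, then combinations only for sentences that mention at least two different play elements) are all my own assumptions for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical miniature clue lexicon: phrase -> play element.
# A Python dict is a hash table, so each lookup is constant time on average.
CLUE_LEXICON = {
    "black shale": "source",
    "organic-rich": "source",
    "sandstone": "reservoir",
    "porous": "reservoir",
    "evaporite": "seal",
    "mudstone": "seal",
    "anticline": "trap",
}
MAX_CLUE_WORDS = 2  # longest phrase in the lexicon above, in words


def find_elements(sentence):
    """Return the set of play elements whose clues appear in the sentence."""
    tokens = sentence.lower().replace(",", " ").replace(".", " ").split()
    found = set()
    for n in range(1, MAX_CLUE_WORDS + 1):
        for i in range(len(tokens) - n + 1):
            phrase = " ".join(tokens[i:i + n])
            element = CLUE_LEXICON.get(phrase)  # hash lookup, no lexicon scan
            if element is not None:
                found.add(element)
    return found


def count_combinations(sentences):
    """Count co-occurring pairs of play elements across sentences."""
    counts = Counter()
    for sentence in sentences:
        elements = find_elements(sentence)
        # Hierarchical step (assumed): skip the combination work entirely
        # unless the sentence mentions at least two different elements.
        if len(elements) < 2:
            continue
        for pair in combinations(sorted(elements), 2):
            counts[pair] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Organic-rich black shale underlies a porous sandstone.",
        "The anticline is capped by evaporite.",
        "Rainfall was high.",
    ]
    print(count_combinations(sample))
```

The point of the sketch is that each sentence is matched against the lexicon by direct lookup rather than by scanning every clue, and most sentences drop out before any combinations are generated.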
Sounds promising. How adaptable is it to domain variations, e.g. if you wanted to find mineral sands or something like that?
Hi Richard – Thanks for the question. I have a clue lexicon of 20,000 ways that hydrocarbon play elements may be mentioned in text. Whilst there are some similarities with mining associations (e.g. black shales, hydrothermal) these would need to be tweaked for various mineral exploration targets. The associative clues would be different. It is something on my list to try & experiment with at some point. Paul
Thanks Paul.