Analysing over 2 Trillion Possible Permutations from Unstructured Text in Under 1 Hour on an i5 laptop.

I recently ran 1 million sentences from public domain geoscience literature articles & reports through the OpportunityFinder® algorithm.

The aim was to detect hydrocarbon exploration play elements and interesting combinations using Natural Language Processing (NLP) and Machine Learning.

This involved analysing over 2 trillion possible permutations hidden within the text. Through iterative design, I arrived at an optimised method that lets the Python-based algorithm complete this process in under one hour on a modest high-street i5 laptop.

The key ingredients were hash tables combined with hierarchical logic. I’m currently experimenting with additional techniques in this area.
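To give a flavour of what combining hash tables with hierarchical logic can look like, here is a minimal sketch. It is not the OpportunityFinder® code: the lexicon entries, the element names and the function names are all hypothetical, chosen only for illustration. A Python dict gives constant-time clue lookups, and a simple two-level check skips any sentence that cannot form a combination before the more expensive pairing step runs.

```python
from collections import Counter
from itertools import combinations

# Hypothetical clue lexicon: phrase -> play element (illustrative entries only).
LEXICON = {
    "black shale": "source",
    "organic-rich": "source",
    "porous sandstone": "reservoir",
    "reef": "reservoir",
    "evaporite": "seal",
    "anticline": "trap",
}

# Longest clue length in words, so we never build phrases longer than needed.
MAX_WORDS = max(len(phrase.split()) for phrase in LEXICON)


def find_elements(sentence):
    """Return the set of play elements whose clues appear in the sentence."""
    words = sentence.lower().replace(",", " ").replace(".", " ").split()
    found = set()
    for i in range(len(words)):
        for n in range(1, MAX_WORDS + 1):
            phrase = " ".join(words[i:i + n])
            element = LEXICON.get(phrase)  # hash table lookup
            if element is not None:
                found.add(element)
    return found


def count_combinations(sentences):
    """Count how often each pair of play elements co-occurs in a sentence."""
    counts = Counter()
    for sentence in sentences:
        elements = find_elements(sentence)
        if len(elements) < 2:
            continue  # hierarchical step: no pair possible, skip the pairing work
        for pair in combinations(sorted(elements), 2):
            counts[pair] += 1
    return counts


sentences = [
    "Black shale underlies porous sandstone in the anticline.",
    "The reef is capped by evaporite.",
    "No clues here.",
]
pair_counts = count_combinations(sentences)
```

The point of the design is that most sentences are discarded cheaply at the first level, so only a small fraction of the theoretical permutations is ever enumerated.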

3 thoughts on “Analysing over 2 Trillion Possible Permutations from Unstructured Text in Under 1 Hour on an i5 laptop.”


Richard Scott says:

August 27, 2020 at 12:07 am

Sounds promising – how adaptable is it to domain variations, e.g. if you wanted to find mineral sands or something like that?

1. phcleverley says:
  
  August 27, 2020 at 6:47 am
  
  Hi Richard – Thanks for the question. I have a clue lexicon of 20,000 ways that hydrocarbon play elements may be mentioned in text. Whilst there are some similarities with mining associations (e.g. black shales, hydrothermal) these would need to be tweaked for various mineral exploration targets. The associative clues would be different. It is something on my list to try & experiment with at some point. Paul
  
Richard Scott says:

August 27, 2020 at 8:28 am

Thanks Paul.
