Over the past few years, geoscience and data science expertise has been used to label over one million diverse geoscience sentences from public domain Internet sources (papers, reports, presentations, etc.).
The purpose was to identify clues for source rock, maturation, migration, hydrocarbon occurrence, reservoir, trap and seal as mentioned in unstructured text, in such a way that they could be used for automated inference. This included both explicit terms and phrases and more subtle, non-obvious textual clues.
These labelled data are used with Natural Language Processing, Machine Learning and a first-of-a-kind ‘DNA inspired’ method to create a predictive classifier. The algorithm (OpportunityFinder®) can surface non-obvious patterns of interest that may be useful to an Exploration Geoscientist.
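To make the idea of a predictive classifier trained on labelled sentences concrete, here is a minimal sketch in Python. It is not the OpportunityFinder® algorithm or its ‘DNA inspired’ method, which are not described here; it substitutes a plain multinomial Naive Bayes model with add-one smoothing. The training sentences, label names and function names are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy training sentences (assumption): invented for illustration,
# not taken from the labelled corpus described in the text.
TRAINING = [
    ("organic-rich shale with high total organic carbon", "source_rock"),
    ("black shale with abundant kerogen and organic matter", "source_rock"),
    ("porous sandstone with good permeability", "reservoir"),
    ("fractured carbonate with high porosity", "reservoir"),
    ("thick evaporite acts as a regional top seal", "seal"),
    ("impermeable salt layer caps the structure", "seal"),
]


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count words per label, sentences per label, and the overall vocabulary."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the highest Naive Bayes log-probability."""
    word_counts, label_counts, vocab = model
    total_sentences = sum(label_counts.values())
    scores = {}
    for label, n in label_counts.items():
        score = math.log(n / total_sentences)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                # add-one (Laplace) smoothed log likelihood
                score += math.log((word_counts[label][tok] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAINING)
print(classify("kerogen-rich black shale", model))
```

A model this simple only picks up explicit vocabulary; the more subtle, non-obvious clues mentioned above would need far more labelled data and richer features than this sketch uses.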
These patterns may be contained within any repository of reports, documents or text that is too large for a person ever to read. This may include old hardcopy reports that have since been scanned or digitized, reports in different languages, and material both external and internal to an organization. The resulting patterns, which are surfaced from trillions of permutations, can be displayed in time and space to assist the Geoscientist with discovery and ideation.
Fig 1 below is a simulation of data extracted from a large body of reports and geo-referenced using OpportunityFinder®. The pie charts represent the different play elements that have been discovered in text (e.g. potential trap/seal clues in green).
Fig 1 – Thematic play elements from text (public domain WMS Basins data)
The pie charts allow the Geoscientist to drill down in more detail. This raw ‘DNA’ is used by the data-driven pattern algorithm in OpportunityFinder® to surface potential plays, leads and opportunities that may not be obvious. These may be browsed by the Geoscientist, stimulating lines of thought that might not have occurred without the algorithm.
This may provide a ‘fast start’ to organizations and help companies with geoscience exploration. The algorithm (written in Python) can plug in to the existing search and discovery approaches used by organizations, which can also fork their own version of OpportunityFinder® should that be required.
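As a rough illustration of how the pie charts in Fig 1 could be derived, the sketch below groups classified sentences by location and converts the counts of each play element into fractional shares. The record layout, location names and the `pie_shares` function are assumptions made for this example; the actual geo-referencing and aggregation used by OpportunityFinder® are not described here.

```python
from collections import Counter, defaultdict

# Hypothetical records (assumption): each pairs a location name with the
# play element detected in one sentence. Names and values are invented.
records = [
    ("Basin A", "trap_seal"),
    ("Basin A", "trap_seal"),
    ("Basin A", "reservoir"),
    ("Basin A", "source_rock"),
    ("Basin B", "reservoir"),
    ("Basin B", "migration"),
]


def pie_shares(records):
    """Return, per location, the fraction of sentences for each play element."""
    counts = defaultdict(Counter)
    for location, element in records:
        counts[location][element] += 1
    shares = {}
    for location, counter in counts.items():
        total = sum(counter.values())
        shares[location] = {element: n / total for element, n in counter.items()}
    return shares


shares = pie_shares(records)
for location, parts in shares.items():
    print(location, parts)
```

Each inner dictionary sums to one, so it can be passed directly to a plotting library as the slice sizes of one pie chart placed at that location on the map.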
There are also opportunities to target a variety of geological themes not currently addressed, should that be of interest. More at:
Image credit: Bletchley Park Bombe (replica of the original Bombe) Antoine Taveneaux CC BY-SA 3.0