Friend Of A Friend (FOAF) concepts in Graph data structures can be useful for detective work. For example:
Mr Green “knows” Mr Blue, who “knows” Mr Red. Mr Green and Mr Red “stayed at” the same address in 2010, but Mr Green and Mr Red (deny that they) “know” each other. Automatically extracting people’s names, addresses, dates and the textual relations between them into Graph data structures can help us join the dots and surface interesting insights. These might be oddities, contradictions or outliers. This is especially useful when there is too much document text to ever feasibly read.
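To make the idea concrete, here is a minimal sketch in plain Python. The triples are typed in by hand as if they had already been extracted from text; the address "12 Example Street" and the function names are made up for illustration. It looks for pairs of people who shared an address in the same year, have no direct "knows" link, and yet have an acquaintance in common.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical triples, as if extracted from text: (subject, relation, object, year).
# The address is invented for this example.
people_triples = [
    ("Mr Green", "knows", "Mr Blue", None),
    ("Mr Blue", "knows", "Mr Red", None),
    ("Mr Green", "stayed_at", "12 Example Street", 2010),
    ("Mr Red", "stayed_at", "12 Example Street", 2010),
]


def build_indexes(triples):
    """Index 'knows' as an undirected adjacency map and 'stayed_at' by (address, year)."""
    knows = defaultdict(set)
    stays = defaultdict(set)
    for subj, rel, obj, year in triples:
        if rel == "knows":
            knows[subj].add(obj)
            knows[obj].add(subj)
        elif rel == "stayed_at":
            stays[(obj, year)].add(subj)
    return knows, stays


def hidden_links(triples):
    """Pairs who shared an address in the same year but have no direct 'knows' edge."""
    knows, stays = build_indexes(triples)
    findings = []
    for (address, year), residents in stays.items():
        for a, b in combinations(sorted(residents), 2):
            if b not in knows[a]:
                # Friends of both a and b: the FOAF path that joins the dots.
                mutual = sorted(knows[a] & knows[b])
                findings.append((a, b, address, year, mutual))
    return findings


findings = hidden_links(people_triples)
for a, b, address, year, mutual in findings:
    print(f"{a} and {b} both stayed at {address} in {year}; mutual contacts: {mutual}")
```

On real data the "knows" relation would probably need a direction and a source sentence attached, but the shape of the query stays the same.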
Applying this to geology.
I’ve been automatically extracting Geobodies, Stratigraphic / Petrologic Units and their properties from reports along with levels of speculation (see image). Simple example:
Extracted automatically from text: Lithostratigraphic Unit (LU) Yellow “has age” of Campanian, and LU Yellow “has lithology” of organic-rich black shales. This lithology may give rise to a possible source rock. So although it is never explicitly stated in thousands of reports, we can deduce through transitive inference in the graph that there is a possible Source Rock of Campanian Age. This contradicts the (hypothetical) orthodoxy for the area, which may lead to a new line of thinking.
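The chain of reasoning in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the facts are entered by hand rather than extracted, and the relation name "may_give_rise_to" and the function names are my own labels, not part of any standard vocabulary.

```python
# Hand-entered facts standing in for automatically extracted ones:
# (subject, relation, object). Relation names are illustrative assumptions.
geo_facts = {
    ("LU Yellow", "has_age", "Campanian"),
    ("LU Yellow", "has_lithology", "organic-rich black shale"),
    ("organic-rich black shale", "may_give_rise_to", "source rock"),
}


def objects(facts, subject, relation):
    """All objects linked to `subject` by `relation`."""
    return {o for s, r, o in facts if s == subject and r == relation}


def infer_rock_roles(facts):
    """Chain unit -> lithology -> possible role, and carry the unit's age along."""
    inferred = []
    units = {s for s, r, _ in facts if r == "has_lithology"}
    for unit in sorted(units):
        for lithology in sorted(objects(facts, unit, "has_lithology")):
            for role in sorted(objects(facts, lithology, "may_give_rise_to")):
                for age in sorted(objects(facts, unit, "has_age")):
                    inferred.append({
                        "role": role,
                        "age": age,
                        "via": [unit, lithology],  # the path that supports the inference
                        "certainty": "possible",   # inherited from 'may give rise to'
                    })
    return inferred


inferred = infer_rock_roles(geo_facts)
for item in inferred:
    print(f"Possible {item['role']} of {item['age']} age, via {' -> '.join(item['via'])}")
```

Keeping the supporting path with each inferred statement matters, because a geologist will want to trace it back to the sentences it came from.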
The aim is to highlight the connections that are most interesting: those likely to surprise, which will often mean surfacing a contradiction to an existing mental model or orthodoxy of ‘the way things are’.
This could be achieved by loading existing knowledge representations into the same graph as those automatically extracted from text.
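A rough sketch of that comparison, in Python, might look like the following. The "established" statement is invented purely for illustration (it is not a claim about any real area), and the three labels are a crude heuristic of mine: a different value for the same subject and relation is flagged as a challenge to the orthodoxy, even though in reality both could be true.

```python
# Existing knowledge representation. The Turonian statement is a made-up
# placeholder for 'the orthodoxy', not a real geological claim.
established = {
    ("source rock", "has_age", "Turonian"),
}

# Statements extracted or inferred from text (hand-entered here).
extracted = {
    ("source rock", "has_age", "Campanian"),
    ("source rock", "has_age", "Turonian"),
    ("LU Yellow", "has_lithology", "organic-rich black shale"),
}


def compare_with_orthodoxy(extracted, established):
    """Sort extracted triples into confirms / challenges / new."""
    known_pairs = {(s, r) for s, r, _ in established}
    report = {"confirms": [], "challenges": [], "new": []}
    for triple in sorted(extracted):
        subject, relation, _ = triple
        if triple in established:
            report["confirms"].append(triple)
        elif (subject, relation) in known_pairs:
            # Same subject and relation, different value: worth a closer look.
            report["challenges"].append(triple)
        else:
            report["new"].append(triple)
    return report


report = compare_with_orthodoxy(extracted, established)
for label, triples in report.items():
    print(label, triples)
```

The "challenges" bucket is the one most likely to hold the surprising connections; the other two are useful mainly as context.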
Another method is to linguistically model obvious and non-obvious clues and detect an uncertain or speculative tone around the concepts and entities being extracted from text. Storing these as properties in the graph may also help rank the ‘more interesting’ connections: those which are less likely to be known by a person familiar with the subject area.
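As a very simple stand-in for that linguistic modelling, the sketch below counts hedge words in the sentence each connection came from and stores the count as a property on the edge. The cue list, the example sentences and the choice to rank more speculative edges first are all assumptions made for illustration; a real model would need to handle negation, scope and domain-specific phrasing.

```python
import re

# A small, hand-picked list of hedge cues (an assumption, not a validated lexicon).
HEDGE_CUES = {
    "may", "might", "could", "possible", "possibly",
    "potential", "potentially", "suggests", "speculative", "inferred",
}


def speculation_score(sentence):
    """Count hedge cues in a sentence (case-insensitive, whole words only)."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return sum(1 for token in tokens if token in HEDGE_CUES)


# Edges with the sentence they were extracted from (sentences invented here).
edges = [
    {"subject": "LU Yellow", "relation": "has_age", "object": "Campanian",
     "sentence": "LU Yellow is of Campanian age."},
    {"subject": "LU Yellow", "relation": "may_give_rise_to", "object": "source rock",
     "sentence": "The organic rich black shales of LU Yellow may represent a possible source rock."},
]

# Store the score as an edge property, then rank the most speculative first.
for edge in edges:
    edge["speculation"] = speculation_score(edge["sentence"])

ranked = sorted(edges, key=lambda edge: edge["speculation"], reverse=True)
for edge in ranked:
    print(edge["speculation"], edge["subject"], edge["relation"], edge["object"])
```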
Textual entailment is another method. It operates on the sentences in which the associations are found, to highlight contradiction, neutrality or entailment between them. This is an ongoing area of research for me.