Over 150 BSc, MSc and PhD geological questions released to help benchmark geological Gen AI

More than 150 BSc, MSc and PhD level geological questions have been released to help benchmark geological Gen AI. They were released by the team at GeologyOracle, the free AI tool that answers geological questions, extracts data from documents, writes code, and interprets sketches and photographs.

Hopefully more elements, such as the open-access training data used, will be open-sourced over the coming months. See my previous posts (and research paper) on other areas that need to improve to create a more ethical system.

As with all web-hosted, LLM-based tools, a word of caution to geoscientists: any prompts and documents you upload can be used by vendors, nonprofits and governments for their own purposes. These tools are never ‘free’; they come at the cost of your ‘data’.

Trusted locally hosted solutions that individuals and institutions control themselves are the way to go. This ensures you retain control of your confidential or proprietary geoscientific and personal data. It’s becoming easier to do this by the day.

Also bear in mind some LLMs promoted to the international community (not the case with GeologyOracle) are funded by governments with extremely strict censorship laws.

This means the original training data (and additional reinforcement learning) will manipulate the answers to omit data or present an overly favourable impression of a certain government's policies, organisations and individuals. This can be subtle and change weekly. If in doubt, do your research; there are probably certain LLMs that we, as scientists, may wish to avoid for certain purposes.

There are enough issues with LLMs implicitly producing convincing misinformation, without adding another layer of deliberate government intervention for censorship.

On a separate note, there are also claims being made about “Open-source” in this geoscience space. I advocate the two-question test.

(1) Can you recreate the entire app behind your firewall based on what has been freely released (training methods, LLM model, code for software tools built on top of the LLM model) so you never have to use the vendor/lab’s hosted app?

(2) Can you use what has been released for any purpose (academic and commercial) and create derivatives and release those?

If the answer is not ‘yes’ to both questions, then it is NOT Open-source!!
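For anyone who wants to record the outcome when assessing several tools, the two-question test can be written down as a tiny check. This is only an illustrative sketch: the `Release` class, its field names and the example tool name are hypothetical, made up here for illustration, and the answers to the two questions still have to come from reading the actual licences and released materials yourself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    """Hypothetical record of what a vendor or lab has freely released."""
    name: str
    # Question (1): can the entire app be recreated behind your own firewall?
    can_recreate_behind_firewall: bool
    # Question (2): can it be used for any purpose, with derivatives released?
    free_for_any_use_and_derivatives: bool


def passes_two_question_test(release: Release) -> bool:
    """Return True only if the answer to BOTH questions is 'yes'."""
    return (
        release.can_recreate_behind_firewall
        and release.free_for_any_use_and_derivatives
    )


# Made-up example: everything can be rebuilt locally, but the licence
# restricts commercial use, so the second question fails.
example = Release(
    name="ExampleGeoLLM",
    can_recreate_behind_firewall=True,
    free_for_any_use_and_derivatives=False,
)
print(example.name, "is Open-source:", passes_two_question_test(example))
```

A single ‘no’ is enough to fail, which is the whole point of asking both questions together.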

The link to the questions is below. Kudos to Andrea Baucon, the researcher and developer, and colleagues. I also include my peer-reviewed paper on AI ethical recommendations in this area.

www.geologyoracle.com
