Byte-Sized Chemists: Can AI Think Like a Scientist?

Matthew Smith

3 min read

Artificial intelligence (AI) is making extraordinary breakthroughs in science: AlphaFold, an AI model that predicts 3D protein structures, earned its developers a share of the 2024 Nobel Prize in Chemistry. Today, large language models (LLMs) are increasingly woven into research workflows, from data analysis to literature review. Some go further, facilitating scientific discovery itself. But what does it really mean for AI to ‘do’ science?


LLMs are a type of AI trained on vast amounts of written text and designed to predict the next word (or ‘token’) in a sequence. Recently, a San Francisco start-up, FutureHouse, released a new open-source chemical and molecular reasoning LLM that outperformed other models on scientific tasks. The company is on a mission to automate scientific discovery, and this model, ether0, is its first step towards specialised scientific intelligence.
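To make ‘predicting the next token’ concrete, here is a deliberately tiny Python sketch: a word-level bigram counter that guesses the most likely continuation. Real LLMs do this with neural networks over billions of parameters, but the task is the same in spirit; the miniature corpus below is invented purely for illustration.

```python
from collections import Counter, defaultdict

# Toy next-token prediction (not a real LLM): count which word follows
# which in a tiny corpus, then predict the most frequent continuation.
corpus = "the cell divides the cell grows the cell divides".split()

following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(word: str) -> str | None:
    # Return the word most often seen after `word`, or None if unseen.
    counts = following[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("cell"))  # 'divides' (seen twice, vs 'grows' once)
```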


Ether0 is built on a small open LLM from Mistral AI, fine-tuned with reinforcement learning on almost 600,000 chemistry test questions that require working with chemical structures. As a result, ether0 can return a chemical structure, written in the SMILES text notation, in response to a variety of molecular design tasks. Furthermore, while other LLMs can behave like a ‘black box’, ether0 exposes a glimpse of its reasoning chain (its ‘reasoning tokens’) in natural language before committing to a final answer in the form of a molecule.
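For the curious: SMILES (Simplified Molecular Input Line Entry System) strings are plain text, which is exactly what makes them a natural output format for a language model. The minimal sketch below uses RDKit, a standard open-source cheminformatics toolkit (not part of ether0), to decode one; the aspirin SMILES is just a familiar example.

```python
# Decode a SMILES string with RDKit (open-source cheminformatics
# toolkit; illustrative only, this is not ether0's own code).
from rdkit import Chem
from rdkit.Chem import Descriptors

smiles = "CC(=O)Oc1ccccc1C(=O)O"  # aspirin
mol = Chem.MolFromSmiles(smiles)  # returns None for invalid strings

print(Chem.MolToSmiles(mol))   # canonical SMILES for the same molecule
print(Descriptors.MolWt(mol))  # molecular weight, ~180.16 g/mol
print(mol.GetNumAtoms())       # heavy-atom count: 13
```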


In terms of performance, leading models such as OpenAI’s o3 and Anthropic’s Claude Opus can outscore human experts on chemistry knowledge questions, yet fall short when working with actual chemical structures. Ether0 is the mirror image: its ability to answer knowledge-based questions is limited, but it excels at planning organic synthesis routes and optimising structures against properties such as pKa.


The ultimate dream in AI circles is a standalone ‘AI scientist’ able to carry out the entire research pipeline and arrive at genuine discoveries. The most prominent example in this space is Google’s AI Co-Scientist, released in February 2025. It is a ‘multi-agent’ system built on the Gemini 2.0 LLM, in which each ‘agent’ is a separate AI model with a narrow job: generating, debating or refining research hypotheses and experimental plans. The resulting ideas are fed through a ranking system (another ‘agent’) that prioritises them by scientific quality.
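As a rough feel for that generate-critique-rank loop, here is a heavily simplified Python sketch. Every name below is hypothetical, and `ask_llm` is a canned stub standing in for a real LLM API; this is a reading of the pattern, not Google’s actual implementation.

```python
# A simplified sketch of a multi-agent generate/critique/rank loop.
# All names are hypothetical; ask_llm() is a canned stand-in.
def ask_llm(prompt: str) -> str:
    # Stub for a real LLM call, so the sketch runs end to end.
    if prompt.startswith("Score"):
        return str(len(prompt) % 10)  # fake numeric score
    return f"[model response to: {prompt[:50]}]"

def generate_hypotheses(question: str, n: int = 4) -> list[str]:
    # Generation 'agent': propose candidate hypotheses.
    return [ask_llm(f"Propose hypothesis {i + 1} for: {question}")
            for i in range(n)]

def critique(hypothesis: str) -> str:
    # Reflection 'agent': attack the hypothesis and return a refinement.
    return ask_llm(f"Critique and refine: {hypothesis}")

def rank(hypotheses: list[str]) -> list[str]:
    # Ranking 'agent': score each candidate and sort best-first.
    scored = [(float(ask_llm(f"Score 0-10: {h}")), h) for h in hypotheses]
    return [h for _, h in sorted(scored, reverse=True)]

def co_scientist(question: str) -> str:
    # Pipeline: generate, refine each idea, then keep the top-ranked one.
    refined = [critique(h) for h in generate_hypotheses(question)]
    return rank(refined)[0]

print(co_scientist("Why do some bacteria resist phage infection?"))
```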


The Co-Scientist’s main purpose is to collaborate with researchers, who can propose and interrogate ideas in a peer-review style. Biomedical research groups at Stanford University and Imperial College London have already put the model through its paces, using it to propose novel drug-target ideas for liver fibrosis and to explore antimicrobial resistance mechanisms. It remains debatable, however, whether these discoveries can truly be considered ‘new’.


The progress doesn’t stop here. More ‘co-scientist’ platforms are emerging, including Virtual Lab, which offers a chatbot-style conversation with several LLM-powered characters with different scientific specialisations. Or perhaps the future lies in SandboxAQ’s Large Quantitative Models (LQMs), next-generation AI systems that rely on scientific data and simulations rather than natural language. One thing is clear: AI tools are here to support scientists, not replace them.

This collaborative article was led by Anna Kukushkina, who interned with our Chemistry team last year.
