November Roundup
Welcome to November’s roundup - a monthly newsletter covering papers and other reading I’ve done during the month. I also share on LinkedIn throughout the month.
This month’s roundup looks at three papers that use AI to generate hypotheses for scientific research. Generating hypotheses is a key part of research, and usually demands significant manual time and effort from researchers. AI can help in several ways - some I’ve seen are:
Ingest literature in a research field and suggest directions for work
Automatically analyse a dataset and suggest explanations to explore further
Suggest novel molecules for additional experiments, e.g. in drug discovery or for new materials
Cluster data to identify clusters that might be scientifically relevant
Detect anomalies in data that might be interesting to research further
Use a dimensionality reduction technique or explainable supervised learning to identify features that might be important for a task
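As a toy illustration of the clustering and anomaly-detection avenues above - my own sketch, not taken from any of the papers below - both can be done in a few lines with scikit-learn. The dataset, cluster count and all parameter values here are invented for the example:

```python
# Illustrative sketch: cluster a dataset to surface candidate groups, and flag
# anomalies that might be worth investigating further. Toy data and parameters.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Toy data: two dense groups plus a few scattered outliers.
data = np.vstack([
    rng.normal(0.0, 0.5, size=(50, 2)),
    rng.normal(5.0, 0.5, size=(50, 2)),
    rng.uniform(-10, 15, size=(5, 2)),
])

# Group the points into two clusters for a scientist to inspect.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(data)

# Flag unusual points: fit_predict returns -1 for anomalies, 1 otherwise.
anomaly_flags = IsolationForest(random_state=0).fit_predict(data)
```

In a real setting the interesting part is downstream: a researcher looks at what the clusters or anomalies have in common and asks whether that pattern is scientifically meaningful.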
This month’s papers cover the first three of these avenues.
Papers
Artificial Intelligence, Scientific Discovery, and Product Innovation
Find the paper: https://aidantr.github.io/files/AI_innovation.pdf
Can AI accelerate scientific discovery? This paper looks at the adoption of an AI tool in a commercial materials science research lab of about 1000 scientists. The scientists were given access to an AI tool capable of suggesting novel materials - a task that had, until then, been a key part of their role.
The study found that, on average, scientists using the AI tool discovered 44% more materials. This led to an increase in the number of patents filed, and in the number of new materials incorporated into product prototypes. While using the tool, the scientists’ day-to-day work shifted: they spent less time coming up with novel material ideas, and more time evaluating the potential materials the AI tool had suggested. Scientists who exhibited better judgement when evaluating materials benefited more from the AI tool.
The material ideas generated by AI were more novel than expected, and didn’t compromise on quality. But a survey of attitudes showed a negative impact on scientists’ satisfaction with their roles; common complaints included skill underutilisation and more repetitive, less creative work.
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers
Find the paper: https://www.arxiv.org/abs/2409.04109
Can LLMs generate expert-level research ideas? This paper compares NLP research ideas generated by LLMs with those generated by expert researchers, across a range of subtopics like multilingual NLP, mitigating hallucinations & reducing social bias in LLM outputs.
The authors used a RAG pipeline, where papers relevant to each subtopic were retrieved from Semantic Scholar and automatically ranked. A selection of the retrieved paper abstracts were provided to the LLM via the prompt. The LLM was then prompted to generate many ideas, and duplicate ideas were removed using semantic similarity scores. As a final step, an LLM then reranked all the ideas to find the best among them. This automated reranker was built using publicly available review data. In parallel, expert NLP researchers were asked to propose research ideas in the same subtopics.
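The deduplication step in this pipeline can be sketched roughly as follows. This is my own minimal stand-in, not the authors’ code: the paper uses semantic similarity between ideas, while here TF-IDF cosine similarity, the example ideas and the 0.6 threshold are all assumptions chosen for the toy example:

```python
# Sketch of over-generate-then-deduplicate: keep an idea only if it is not too
# similar to an idea we have already kept. TF-IDF cosine similarity is a crude
# stand-in for the semantic similarity scores used in the paper.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def dedupe_ideas(ideas, threshold=0.6):
    vectors = TfidfVectorizer().fit_transform(ideas)
    sims = cosine_similarity(vectors)
    kept = []
    for i in range(len(ideas)):
        # Keep idea i only if it is below the similarity threshold
        # against every idea kept so far.
        if all(sims[i, j] < threshold for j in kept):
            kept.append(i)
    return [ideas[i] for i in kept]

ideas = [
    "Use retrieval to reduce hallucinations in multilingual QA",
    "Reduce hallucinations in multilingual QA using retrieval",
    "Probe social bias in LLM outputs with contrastive prompts",
]
deduped = dedupe_ideas(ideas)  # drops the near-duplicate second idea
```

The same greedy keep-if-dissimilar structure works with embedding-based similarity; only the vectoriser and threshold would change.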
The written format of each idea was standardised via a template, and an LLM then edited the style of the text so that differences in writing style would not affect people’s judgement. The human- & LLM-generated ideas were manually reviewed using a review template from NLP conferences. Under blind review, LLM-generated ideas were judged as more novel but less feasible than those generated by people.
Hypothesis Generation with Large Language Models
Find the paper: https://aclanthology.org/2024.nlp4science-1.10.pdf
Can LLMs generate human-readable hypotheses from data? Interpreting data & coming up with hypotheses to explain it is a key part of scientific research.
This paper looks at social science datasets, since hypotheses there can be expressed in text rather than mathematical notation, and investigates whether LLMs can generate valid, high-quality hypotheses that explain the data. For example, on a dataset of hotel reviews labelled for honesty, the system was able to generate the hypothesis that reviews mentioning personal experiences or special occasions were more likely to be honest.
The datasets covered the honesty of hotel reviews, and the popularity of tweets & headlines. In the proposed approach, LLMs were first used to generate hypotheses from example datapoints. The LLM was then used to make predictions across the full dataset, to measure how accurate each hypothesis was & to weed out low-quality hypotheses. Using this iterative generate-and-evaluate loop, the LLM was able to suggest hypotheses that had already been proposed in existing literature, and also to propose several new insights about these social science datasets.
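The generate-then-evaluate loop might look roughly like this schematic - my own sketch, not the authors’ code. Both LLM calls are replaced by toy stand-ins: the dataset, `stub_predict` and the 0.66 accuracy threshold are all invented for illustration:

```python
# Schematic of hypothesis generation + evaluation. In the real system, an LLM
# both proposes hypotheses and applies each hypothesis to datapoints; here a
# trivial keyword rule stands in for the prediction step.
def accuracy(hypothesis, dataset, predict):
    """Fraction of datapoints where the hypothesis predicts the label."""
    correct = sum(predict(hypothesis, x) == x["honest"] for x in dataset)
    return correct / len(dataset)

dataset = [
    {"text": "Our anniversary stay was wonderful", "honest": True},
    {"text": "BEST HOTEL EVER!!! AMAZING!!!", "honest": False},
    {"text": "I visited with my family last May", "honest": True},
]

def stub_predict(hypothesis, datapoint):
    # Toy stand-in for an LLM applying "personal details -> honest":
    # predict honest when the review uses a first-person word.
    return any(w in datapoint["text"].lower() for w in ("our", "my"))

hypotheses = ["reviews with personal details are honest"]  # from "generation"
scored = [(h, accuracy(h, dataset, stub_predict)) for h in hypotheses]
kept = [h for h, acc in scored if acc >= 0.66]  # weed out low-quality ones
```

Iterating this - generating fresh hypotheses, scoring them against the data, and keeping only the accurate ones - is the core loop the paper describes.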
Other stuff I read
How a stubborn computer scientist accidentally launched the deep learning boom
OpenAI and others seek new path to smarter AI as current methods hit limitations
It takes two to think (paywalled)
AI-generated shows could replace lost DVD revenue, Ben Affleck says
Work with me
I work with organisations that are building AI - as a technical advisor, coach and speaker. Get in touch if you’d like to talk about working together.