March Roundup
Welcome to March’s roundup - a monthly newsletter covering the papers, events and other reading I’ve come across this month. I also share updates on LinkedIn and Bluesky.
Work with me! I work with organisations who are building AI - as an advisor, coach and consultant. Email me to explore working together.
Need an AI speaker? Check out Women of AI Agency, which launched for International Women’s Day this month and which I’m thrilled to be a part of.
Articles & other Links
We can’t afford to celebrate International Women’s Day in the tech sector
Sakana claims its AI-generated paper passed peer review — but it’s a bit more nuanced than that
I cloned my voice in seconds using a free AI app, and we really need to talk about speech synthesis
Career Spotlight
The first of two new projects this month is a career spotlight series. The goal here is to shine a light on different career paths in AI, and give you some inspiration from the way these brilliant people have navigated their careers:
Svetlana Stoyanchev, Senior Research Engineer at Toshiba
Maria Mestre, Senior AI Engineer at Taylor & Francis
James Leoni, Director of Machine Learning at Papercup
Alessandra Tosi, Senior Scientist at Mind Foundry
Jo Stansfield, Founder and CEO at Inclusioneering
Peter Wooldridge, Director of Machine Learning at Monolith
Simon Fothergill, Senior AI Engineer
Podcast
The second new project this month is a podcast, focused on what it takes to build and scale AI. Each episode focuses on a different aspect of AI. Listen in, and like & share!
The first episode is with Prass Ravishankar from Papercup, talking about his experience building an ML Platform team.
The second episode is with Sarah Coward from In The Room, focused on authenticity in their Conversational AI products.
AI Updates
Music Protest
Musicians have launched an album to protest the UK Government’s proposed copyright laws around generative AI. The album, titled “Is This What We Want?”, features silent recordings from recording studios.
One concern is that generative AI can imitate an artist and directly infringe copyright - think of AI creating an image in the style of a living artist, or a new album that sounds just like your favourite band.
But there’s another, deeper issue: AI-generated content that doesn’t copy anyone directly, yet still takes attention and revenue away from human creators. You and I only have so many hours in the day, and if we’re listening to AI-generated music then we’re not listening to music created by artists we know and love. Read more at The Guardian.
Papers I’ve read
SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?
Find the paper: https://arxiv.org/abs/2502.12115
How good is AI at real-world Software Engineering tasks?
We often hear that AI will replace coders, but the current benchmarks used to evaluate the coding ability of AI models are relatively simple. To create a more complex real-world benchmark, this paper curates 1,488 freelance software engineering tasks from Upwork, with a total value of $1M.
About half of the tasks were individual contributor (IC) coding tasks. These were primarily bug fixes or feature additions, representative of the type of work that freelance software engineers perform. The remainder of the tasks were management tasks, and involved evaluating and choosing between a number of proposals.
To evaluate the AI models’ responses, professional engineers wrote comprehensive tests for each IC task. For the management tasks, the AI’s choice was compared against the selection a human had made.
Overall, the models performed better on the management tasks, with the best models (Claude 3.5 Sonnet & OpenAI’s o1) scoring just over 45% on their first attempt at the tasks. On the IC tasks, first-attempt performance for these models was just above 20% - low enough to show that there’s still a way to go before LLMs replace today’s coders.
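To make the scoring idea concrete, here is a minimal sketch of how a first-attempt pass rate and a dollar total could be tallied for a benchmark like this. It is my own illustration, not the paper's evaluation code: the TaskResult record, the summarise helper and every number in it are made up, and I'm assuming a model only "earns" a task's value when it passes.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    kind: str      # "ic" or "manager" (hypothetical labels)
    payout: float  # dollar value attached to the task (made-up here)
    passed: bool   # IC: tests passed; manager: choice matched the human selection


def summarise(results, kind):
    """Return (pass rate, dollars earned) for first attempts on one task kind."""
    subset = [r for r in results if r.kind == kind]
    if not subset:
        return 0.0, 0.0
    pass_rate = sum(r.passed for r in subset) / len(subset)
    # Assumption: a task's value counts only when the attempt passed.
    earned = sum(r.payout for r in subset if r.passed)
    return pass_rate, earned


# Toy records for illustration only; these are not figures from the paper.
results = [
    TaskResult("ic", 250.0, False),
    TaskResult("ic", 500.0, True),
    TaskResult("ic", 1000.0, False),
    TaskResult("ic", 250.0, False),
    TaskResult("manager", 1000.0, True),
    TaskResult("manager", 500.0, False),
]

for kind in ("ic", "manager"):
    rate, earned = summarise(results, kind)
    print(f"{kind}: pass rate {rate:.0%}, earned ${earned:,.0f}")
```

Reporting dollars alongside the pass rate matters because tasks are not equally valuable: passing one expensive task can outweigh failing several cheap ones.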
Thanks for reading! See you next month,
Catherine.