Study: Developers Using AI Coding Assistants Suffer from 41% Increase in Bugs

Technology October 02, 2024

A recent study by Uplevel, a firm that analyzes coding metrics, has revealed that AI coding assistants like GitHub Copilot are not significantly improving developer productivity or preventing burnout, despite the hype surrounding these tools.

TechSpot reports that the rise of generative AI has led to a surge in AI coding assistants, with tools like GitHub Copilot promising to revolutionize the way developers work. These assistants are designed to make coding faster and easier, with the expectation that they will boost productivity and reduce the risk of burnout among developers. However, Uplevel's study found that these promised benefits are not materializing.

The study, which tracked around 800 developers over three-month periods, compared their output with and without the use of GitHub Copilot. Surprisingly, the results showed no meaningful improvements in key metrics such as pull request cycle time and throughput for those using the AI coding assistants. This finding contradicts the claims made by GitHub and other proponents of AI coding tools, who have touted massive productivity gains.

Matt Hoffman, a data analyst at Uplevel, explains that the team had initially expected developers using AI tools to write more code and introduce fewer defects, as the assistants would help review code before submission. However, the study’s findings defied these expectations. In fact, developers using Copilot were found to introduce 41 percent more bugs into their code compared to those not using the tools. Additionally, Uplevel found no evidence to suggest that AI assistants were helping to prevent developer burnout.

These findings run counter to a GitHub-sponsored study that earlier claimed a 55 percent increase in coding speed for developers using Copilot. Some developers may be seeing positive results, as suggested by reports that nearly 30 percent of new code involves AI assistance. Another possibility is that coders are becoming dependent on these tools and growing lazy.

In the field, experiences with AI coding assistants have been mixed so far. Ivan Gekht, CEO of custom software firm Gehtsoft USA, told CIO that AI-generated code has been challenging to understand and debug, sometimes making it more efficient to rewrite the code from scratch. This observation is backed by a study from last year, which found that ChatGPT gave incorrect answers to more than half of the programming questions it was asked, although the chatbot has since improved with multiple updates.