Get ready for a new kind of 'spy-vs-spy' battle over AI watermarking between developers and bad actors
After a meeting with executives from key AI technology firms, including Amazon, Google, Meta, Microsoft, and OpenAI, President Biden announced that the companies had agreed to four commitments. These range from best practices, such as enhancing system security and product testing, to the ‘moonshot’ goals of watermarking AI content and using AI to solve critical societal challenges in areas like health care.
While solving societal challenges is aspirational, watermarking AI-produced content may prove far more difficult than it sounds. It also raises questions about what constitutes ‘AI-generated’ content and whether the government should push technology providers to label content produced using their tools.
Watermarking is inherently tricky, and the techniques vary by medium. In many cases, they rely on a secret shared between the developer and those who make tools to detect the developer’s watermark, such as a textual pattern, a list of special words, or the location or pattern of the watermark within a file.
If this secret becomes more widely known, whether through a leak or reverse engineering, AI users who wish to remove the watermark can readily do so.
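To see why a leaked secret is fatal, consider a deliberately simplified sketch of a word-list watermark. The synonym list below is hypothetical, not any vendor's actual scheme; real text watermarks bias token probabilities in far subtler ways. But the asymmetry it illustrates is the same: whoever knows the secret can detect the mark, and can just as easily strip it.

```python
# Hypothetical shared secret: "preferred" synonyms the generator favors.
# A detector that knows this list can spot the bias; anyone else who
# learns it can remove the watermark just as easily. Toy example only.
SECRET_SYNONYMS = {"use": "utilize", "help": "assist", "show": "demonstrate"}

def watermark(text: str) -> str:
    """Bias word choice toward the secret synonym list."""
    return " ".join(SECRET_SYNONYMS.get(w, w) for w in text.split())

def detect(text: str, threshold: int = 2) -> bool:
    """Many hits from the secret list suggest watermarked output."""
    hits = sum(1 for w in text.split() if w in SECRET_SYNONYMS.values())
    return hits >= threshold

def strip(text: str) -> str:
    """With the leaked secret, removal is a trivial reverse substitution."""
    reverse = {v: k for k, v in SECRET_SYNONYMS.items()}
    return " ".join(reverse.get(w, w) for w in text.split())

marked = watermark("we use tools to help and show results")
print(detect(marked))         # the detector flags the watermarked text
print(detect(strip(marked)))  # after stripping, the same detector sees nothing
```

Stripping here requires nothing beyond the secret itself, which is exactly why watermark schemes tend to collapse once reverse-engineered.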
For some watermark technologies, simply moving the output to an analog medium and back (such as displaying it on a screen and recording it from there) is all that is required to remove the watermark. Incorporating watermarking into open-source products can be particularly difficult due to the ability of users to remove or disable watermarking functionality in the application’s publicly available code.
While removing some watermarks will require programming knowledge, placing it beyond the reach of average users, it seems only a matter of time until someone, or some group, builds a tool to do this automatically.
State actors will likely be able to source tools that don’t include watermarking functionality – or will have the capability to remove it – limiting the efficacy of including it to identify nation-state-sponsored misinformation campaigns.
An alternative approach has also been proposed: digitally signing content as legitimate at the camera that captures it. However, this is readily overcome too. Not only is there a risk of the signing technology being compromised, but there is also the comparatively easy approach of simply using a signing camera to take a picture or video of content displayed on a screen. Whether vouched for by the absence of a watermark or the presence of a digital signature, this content, which may still be an AI-generated fake, will enjoy greater credibility.
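The re-photographing loophole can be made concrete with a minimal sketch, assuming a camera that signs its output with a keyed hash (real systems would use asymmetric keys in secure hardware; the key and image bytes below are placeholders). The point is that the camera signs whatever pixels reach its sensor, so a photo of a fake on a monitor verifies just as well as a photo of a real scene.

```python
import hashlib
import hmac

# Hypothetical signing key; in practice this would live in camera hardware.
CAMERA_KEY = b"secret-camera-key"

def camera_sign(image_bytes: bytes) -> bytes:
    """The camera signs whatever pixels hit its sensor."""
    return hmac.new(CAMERA_KEY, image_bytes, hashlib.sha256).digest()

def verify(image_bytes: bytes, signature: bytes) -> bool:
    """A verifier accepts any image bearing a valid camera signature."""
    return hmac.compare_digest(camera_sign(image_bytes), signature)

# A genuine photo verifies, as intended...
real_photo = b"pixels of a real scene"
print(verify(real_photo, camera_sign(real_photo)))

# ...but so does a photo of an AI fake displayed on a monitor: the camera
# cannot tell a real scene from a screen, so the fake inherits a valid
# signature and, with it, unearned credibility.
rephotographed_fake = b"pixels of an AI fake shown on a monitor"
print(verify(rephotographed_fake, camera_sign(rephotographed_fake)))
```

The cryptography works perfectly in both cases; the failure is that a valid signature attests only to the capture, not to what was captured.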
In addition to the problems with the watermarking technologies themselves, there are issues on the detector side. One key challenge is that AI content detectors won’t know which AI tool, if any, was used to create a given piece of content. They will have to check it against multiple watermarking schemes and use other techniques to try to identify content generated by AI tools that don’t watermark, or content where the watermark has been removed.
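The detector's dilemma can be sketched as a loop over every known scheme. The vendor names and detection routines below are invented stand-ins, not real algorithms, but they show the structural problem: a negative result from every detector proves nothing.

```python
# Toy detector registry, assuming each vendor publishes a detection routine.
# The detect functions are illustrative stand-ins, not real vendor methods.
def vendor_a_detect(text: str) -> bool:
    return "utilize" in text  # pretend vendor A biases toward this word

def vendor_b_detect(text: str) -> bool:
    return text.endswith("!!")  # pretend vendor B appends a marker

DETECTORS = [("vendor A", vendor_a_detect), ("vendor B", vendor_b_detect)]

def classify(text: str) -> str:
    """Check content against every known watermarking scheme in turn."""
    for name, detect in DETECTORS:
        if detect(text):
            return f"possible AI content ({name} watermark)"
    # No watermark found: the text could be human-written, produced by a
    # tool that does not watermark, or had its watermark stripped. The
    # detector cannot distinguish these cases.
    return "no watermark found (inconclusive)"

print(classify("we utilize tools"))   # flagged by vendor A's scheme
print(classify("a plain sentence"))   # inconclusive, not proof of authenticity
```

The "inconclusive" branch is the crux: as tools without watermarks proliferate, that branch becomes the common case.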
Problematically, these tools can ensnare legitimate content. For example, a recent study showed that they incorrectly detected almost all text written by non-native English speakers as AI-generated.
Watermarking also won’t prevent humans from using an AI tool for background research, or to generate an outline or draft text that a human then paraphrases. And it may mark content co-created by a human and an AI as purely AI-generated, diminishing recognition of the human creator’s contribution to the work.
While it is important to identify fakes – and attempts to manipulate viewers and readers – government regulation in this area is inherently problematic and raises key First Amendment questions.
More practically, it is unlikely that regulations will be able to keep up with technological progress, meaning they may fail to achieve their desired goal while impairing technological innovation. Whether promulgated by legislation, agency rulemaking or technology provider agreement, regulation in this area must be agile to be effective and unharmful. Unfortunately, this is not typically the case.
Source credibility is one potential non-technological solution to this problem. Before social media and ubiquitous camera phones, the public relied on journalists to collect the news and editors to regulate journalists’ conduct. While there have been notable lapses, these are the exception, not the rule. Because content producers rely on public trust, they have an inherent interest in maintaining it by vetting all content they publish.
Despite the auspiciousness of a White House announcement, AI watermarking will be technologically difficult. At best, it will be a spy-versus-spy battle between developing watermarking techniques and getting around them. The president's plan to use AI to tackle societal issues in areas such as medicine and environmental protection seems more achievable than implementing watermarking.
—Jeremy Straub is an associate professor in the North Dakota State University Computer Science Department, a Challey Institute Faculty fellow, and the director of the NDSU Institute for Cyber Security Education and Research. The author's opinions are his own.