Artificial intelligence is now reaching a point where broadly accessible tools allow the synthetic creation of highly realistic materials—images, audio, and, increasingly, video sequences entirely generated by AI.
As the industry leaps forward and the human eye strains to tell real from artificial, some experts and entrepreneurs have scrambled to think up solutions.
The digital space is about to become increasingly treacherous, suggested several experts consulted by The Epoch Times.
High-quality fabrications are already quick and easy to produce. Verification services have emerged, using AI to spot AI. The logical progression, however, will be an arms race between AI generators and AI detectors, leading to increasingly sophisticated fakes.
The result will be a virtual reality in which users largely lose the ability to discern the genuine from the fake by looking at the content itself. Increasingly, they will need to rely on third-party verification services or sources of information that have developed a solid record of authenticity.
The issue is likely to attract the spotlight in the upcoming election season amid heightening concerns over the authenticity of political information.
Power of Video
Generative AI improved in leaps and bounds last year, going from surprisingly adept to eerily realistic.
“We saw from the beginning of 2023 to the end of 2023, images kind of pass the eye test,” said Anatoly Kvitnitsky, founder and chief executive of artificial intelligence detection service AI or Not.
“Our prediction is, video is going to have a similar moment in 2024.”
Realistic AI video is “definitely within our reach,” said Robert Marks, a computer science professor at Baylor University who has significantly contributed to the field of machine learning.
“In the future, you can do it from scratch, and it will be really realistic,” he told The Epoch Times.
Some of the experts pointed out that fake information, political or otherwise, isn’t a new phenomenon, and it has long been possible to fabricate imagery.
“Fake stuff has been around for a long, long time. ... It’s just now that we have an automatic way of generating it,” Mr. Marks said.
Yet producing fake video footage has always been tricky and resource-intensive. Typically, video hoaxes involve the doctoring of real footage.
The possibility of realistic video made from scratch presents danger on a new level, according to John Maly, a computer engineer and intellectual property lawyer who has delved deeply into the issue of AI use and misuse.
“The danger of a video is it’s much more impactful,” he told The Epoch Times.
Just as in past elections, if a “dossier” emerges with allegations about a candidate, the “average person’s not going to read it,” Mr. Maly said. Most people are likely to learn the gist of the allegations through the filter of their preferred media outlet, provided that the outlet chooses to cover it in the first place.
But video is different, he said.
“If there’s a 30-second video of someone using racial slurs or something like that, that’s going to have an immediate impact on everybody,” Mr. Maly said.
“And I think that immediacy is going to be what makes this really a more dangerous tool than the others.”
AI would open powerful disinformation tools to groups and individuals who are too fringe to obtain the resources previously needed to access such capabilities, he said.
“Activists can inexpensively put these things together and leak them and get them circulating,” Mr. Maly said.
AI could also boost the secrecy of disinformation operations, he said.
If one were to stage a fake event, for example, the risk of having it exposed increases with each participant. Creating the entire scene digitally dramatically reduces the number of people aware of the scheme.
“One person can script and create an entire deepfake video, and only that person will know they created it, and it makes it much harder to trace after the fact,” Mr. Maly said.
Time to Maturity
So far, AI has been successfully used to manipulate video, such as by adding or removing objects or manipulating the face of a person in the video. Some of the most recent tools allow the creation of completely synthetic video clips, although so far limited to just a few seconds in length and simple action, such as a person turning his head or walking down the street.
The capabilities are expected to significantly expand over the coming months.
“It will be just in time to create misinformation around politics as we’re going to be heads-down into our election season,” Mr. Kvitnitsky told The Epoch Times.
Mr. Marks wouldn’t go as far as to predict fully realistic AI video by Election Day, but he doesn’t discount the possibility.
“What’s going to happen by October? I don’t know. But the acceleration of artificial intelligence has been astonishing,” he said.
Mr. Marks expects that if the technology is used to affect the election, it will be timed as an “October surprise” so as to cut short the window of opportunity to debunk it before the votes are in.
“Now, we have tools to do October surprises with much more sophistication than we had in the past,” he said, noting that “it’s going to be more difficult to counteract.”
Mr. Maly predicts that realistic AI video technology is still two to three years off.
“I don’t think it’s going to be this coming year, but then again, I think the way we’re going to discover that it’s possible is that the entities with the deepest pockets—so governmental actors from first-world countries—they’re probably going to be one of the first ones that manage to pawn off an undetectable deepfake that will get detected much later,” he said.
AI Detection
Despite technological advances, AI-generated content has been, up until quite recently, detectable with minimal effort. Synthesized voices speak with unnatural inflection, especially when trying to mimic emotion; figures in generated images sport unnatural anatomy, such as missing or extra digits.
Yet over the past several months, these issues have been greatly mitigated. AI synthesizers can now speak in the voices of real people in a lively, even animated manner; AI images now feature persons with natural skin texture and accurate anatomy.
There are still noticeable defects, especially in more complicated scenes with multiple people in the frame. AI seems to still struggle with text in the background. A storefront sign in the background, for instance, often shows nonsensical or garbled text.
“However, if those ‘giveaways’ are not there, it becomes really difficult—especially in the higher-quality images, like in the newer models—to detect,” Mr. Kvitnitsky said.
“In many cases, it’s not detectable by the naked eye,” Mr. Marks said.
In recent years, several companies have developed tools that use AI to detect AI-generated content.
Mr. Kvitnitsky’s AI or Not is one of them.
“We’ve trained our model on millions of images, both real and AI-generated. ... We’ve done the same thing with audio,” he said.
Mr. Kvitnitsky said that “there are always artifacts that each respective AI model leaves behind,” such as a “combination of pixels” or sub-second wavelength patterns in an audio file, which would be imperceptible to the human eye or ear.
“It is that level of granularity that’s required,” he said.
The AI or Not website doesn’t store the images and audio files that it checks and so can’t produce overall statistics on the tool’s reliability, Mr. Kvitnitsky said. He does have information from some of his clients who report about 90 percent reliability.
The Epoch Times tested the tool on a mix of about 20 real photographs and AI-generated images. Most of the real photos were of people, in portrait or photojournalistic style, and edited for lighting, contrast, and color tone. The AI pictures were all of people in candid, selfie, or portrait style, produced by Midjourney V6, a freshly released version of a popular AI-image generator that has garnered rave reviews for realism. AI or Not correctly identified all AI images but mislabeled one of the real photographs as likely AI-generated.
The problem of false positives—mistaking human-made content for artificial—is one of the reasons why Mr. Marks doubts that AI detectors can be an “effective way” to deal with deepfakes. If the tools were to be used to identify fake content, and if such content were to be taken down and users penalized for posting it, many would be punished wrongfully, he said.
“None of the detection techniques is 100 percent effective,” Mr. Marks said.
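The arithmetic behind the false-positive concern can be sketched with a few lines of Python. Every figure here is a hypothetical assumption chosen for illustration, not a number reported by AI or Not, YouTube, or Mr. Marks: a platform with 10,000 fake uploads and 990,000 genuine ones, and a detector that catches 90 percent of fakes while wrongly flagging 5 percent of genuine content. The function name and its parameters are likewise invented for this example.

```python
def expected_flags(fake_uploads, genuine_uploads, detection_rate, false_positive_rate):
    """Estimate what a detector would flag on a platform.

    All inputs are hypothetical assumptions supplied by the caller.
    Returns (fakes caught, genuine items wrongly flagged,
    share of all flags that are actually fake).
    """
    caught_fakes = fake_uploads * detection_rate
    wrongly_flagged = genuine_uploads * false_positive_rate
    total_flagged = caught_fakes + wrongly_flagged
    # Share of flagged items that really are fake (the detector's precision).
    share_correct = caught_fakes / total_flagged if total_flagged else 0.0
    return caught_fakes, wrongly_flagged, share_correct


# Hypothetical scenario: fakes are rare, and the detector is fairly accurate.
caught, wrong, share = expected_flags(
    fake_uploads=10_000,
    genuine_uploads=990_000,
    detection_rate=0.90,
    false_positive_rate=0.05,
)
print(f"Fakes caught: {caught:,.0f}")
print(f"Genuine uploads wrongly flagged: {wrong:,.0f}")
print(f"Share of flags that are truly fake: {share:.1%}")
```

Under these assumed numbers, genuine uploads wrongly flagged (49,500) would outnumber the fakes caught (9,000), so only about 15 percent of flagged items would actually be fake. The exact figures depend entirely on the assumed rates, but the pattern holds whenever genuine content vastly outnumbers fakes.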
The Alphabet-owned YouTube has already announced that it plans to roll out new rules requiring users to self-label videos containing “realistic altered or synthetic material,” such as “an AI-generated video that realistically depicts an event that never happened, or content showing someone saying or doing something they didn’t actually do.”
“Creators who consistently choose not to disclose this information may be subject to content removal, suspension from the YouTube Partner Program, or other penalties,” the Nov. 14, 2023, announcement states.