Apple, Nvidia and Salesforce are using content on YouTube to train their AI.
According to Proof News and Wired, these companies used subtitles from 173,536 YouTube videos, spread across 48,000 channels, as training data despite YouTube's rules against harvesting information.
The dataset, called YouTube Subtitles, includes transcripts from educational channels like Khan Academy, MIT, and Harvard, as well as media outlets such as The Wall Street Journal, NPR, and the BBC.
Late-night shows like The Late Show, Last Week Tonight, and Jimmy Kimmel Live were also used, the report says.
Additionally, Proof News found that popular YouTubers like MrBeast, Marques Brownlee, Jacksepticeye, and PewDiePie had their videos included.
David Pakman, host of The David Pakman Show, which sports more than 2 million subscribers and more than 2 billion views, commented: “No one came to me and said, ‘We would like to use this.’”
“This is my livelihood, and I put time, resources, money, and staff time into creating this content. There’s really no shortage of work,” he added, arguing that if AI companies are paid, he should be compensated for his data.
Dave Wiskus, the CEO of Nebula, didn't mince words: “It’s theft. Will this be used to exploit and harm artists? Yes, absolutely.”