According to a report by The New York Times, OpenAI was aware of potential legal concerns but believed its actions constituted fair use. The report also claims that OpenAI president Greg Brockman was directly involved in the video collection process. The development comes a few days after YouTube CEO Neal Mohan said in an interview that scraping YouTube videos to train AI models would be a breach of the platform's rules.
What OpenAI, YouTube have to say
OpenAI spokesperson Lindsay Held told The Verge that the company uses “numerous sources including publicly available data and partnerships for non-public data” to maintain its global research competitiveness, and that it curates “unique” datasets for each of its models to “help their understanding of the world.”
Meanwhile, Google, which owns YouTube, said it has “seen unconfirmed reports” of OpenAI’s activity.
“Both our robots.txt files and Terms of Service prohibit unauthorised scraping or downloading of YouTube content,” Google spokesperson Matt Bryant was quoted as saying. Bryant said Google takes “technical and legal measures” to prevent such unauthorised use “when we have a clear legal or technical basis to do so.”
The report also noted that Google itself gathered transcripts from YouTube. The spokesperson said that the company has trained its models “on some YouTube content, in accordance with our agreements with YouTube creators.”