Apple, Nvidia and other companies reportedly used controversial YouTube resources to train AI models

Publisher: 诗意世界 Latest update time: 2024-07-17 Source: IT之家 Keywords: Apple

July 17 news: the nonprofit news studio Proof News published a blog post on July 16 stating that large technology companies, including Apple, Nvidia, Salesforce and Anthropic, all used material from YouTube videos when training their AI models.

According to the report, these companies trained their AI models on a dataset called YouTube Subtitles, which is 5.7 GB in size (489 million words).

The dataset was created by EleutherAI and was first released in 2020. It contains subtitles from 173,536 YouTube videos across more than 48,000 channels, including more than 12,000 videos that have since been deleted from the platform.

The YouTube Subtitles dataset draws mainly on popular YouTube channels. IT Home lists some examples below:

  • MrBeast (289 million subscribers, 2 videos used for training)

  • Marques Brownlee (19 million subscribers, 7 videos)

  • Jacksepticeye (nearly 31 million subscribers, 377 videos)

  • PewDiePie (111 million subscribers, 337 videos)

The YouTube Subtitles dataset is part of a collection of datasets called "The Pile," which includes several other training datasets. Most of the datasets in The Pile are accessible to anyone with enough storage space and computing power.




