Foreword: I’m a software engineer and have no need or desire to make YouTube videos or spam. I’m a nerd, and seeing this fascinated me. This is the first time I’ve seen a possible deep-learning algorithm implemented at such scale, and it’s one of the most successful spam bots I’ve seen in years.
TL;DR: Weird network of Russian/Arabic/English children’s videos. Many separate channels, some of which are being shut down by YouTube, but most of which are able to pull 20 million views in under a week. Most of the videos are of the same people, and the comments sections are FULL of AI-generated comments. There are a few things in particular that stood out to me:
1. I think they may be using large amounts of data to make and place the videos. Each channel presents them in a different way, with different thumbnails. They’re all targeted toward children, who most people know are a high-view, repeat-click audience. I think they may be using deep learning to find other successful videos matching X criteria. Some of the channels only have the Russians; others have a large variety of children’s videos. The thumbnails are a perfect fit for the kind of videos 5-to-13-year-olds would click on. The video content is usually the same if it’s the Russians, but in different orders and times.
2. Some have comments disabled; some have comments sections full of AI. Most comments seem really similar to the ones in the Computerphile video (/watch?v=XyMdpcAPnZc). It seems really obvious to me that this is an AI, and not just a simple crawling bot posting comments copied and pasted from other videos. A lot of AI-type errors show up, with occasional garbage comments: a lot of similar emojis, comments in five or so languages, the English ones being kind of out of order, the same user posting five or so really weird comments, and users actually conversing with each other in really strange ways.
3. The fact that this appeared in my recommendations means some sophisticated SEO work. SEO was easy long ago, but it was mostly associated with spammers trying to game the "system". Today it’s more synonymous with making sure YouTube’s algorithm understands what your content is about, and that it’s presented in the right way. Once you click on one of the videos, all the recommendations on the right turn into the many, many channels they use. Maybe the change in language makes it more difficult for YouTube’s algorithms to pinpoint? Either way, it has perfected whatever brings views. I’m not sure if it’s bot views causing it to register as "viral", or if it has really "cornered" YouTube’s algorithms in a way that hasn’t been done in some time.
4. The sheer scale of these videos. So many channels, videos, comments, and accounts. This makes me wonder how they get through anti-bot measures, but more on that in 5. Obviously these videos are designed with a specific purpose in mind: weird, foreign, bright, high-contrast, appealing thumbnails, a lot of stuff that people share with their friends with a "wtf" attached. The foreign languages and the comments praising the videos further this. Even so, 20 million views in a matter of a week: how is something that huge not red-flagged? I did some internet marketing back in college, never did much with it, mostly just read a lot on blackhatworld and whatnot. From what I know, YouTube tracks where users come from, what brought them there, levels of normal engagement, comments, likes/dislikes, what kind of content, location of content, location of viewers, and virality, among many other things.
5. How can so many bots bypass so many different types of security? The captchas, IP tracking, and so many other things. The fact that some of these channels are more than a year old means that the bots consistently keep engagement at a level that’s acceptable to Google’s algorithms. Google does not release any of this data. So this leads me back to number 1: that they have pulled massive data sets to implement these.
I am specializing in machine learning; I begin in the fall. I’ve been saying for years that this territory is so much more than people imagine. I read the other day about the AI fake-porn subreddit that’s become so popular, which seems like it’s straight out of an episode of Black Mirror. So I’m curious, does this mean that machine learning is now heading for the masses? Spamming usually has a very low cost of entry. But running a network this large seems to me like it would require a decent amount of money, and perhaps even a large organization? It just seems like a lot of data collection, processing, and "faking" of human behavior. I also have a hunch that it’s an evolutionary neural net: there is a certain degree of polymorphism between all the content, and if the data collection from number 1 is true, I’m sure the success of each different implementation is fed back into the next generation. I.e., kill off the bottom-performing 50% and all channels shut down in under a week, breed the top 50%, with fitness judged on longevity, views, income, and reach.
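To make that hunch concrete, here is a minimal Python sketch of the kind of selection step I’m imagining. Everything in it is my own assumption, not anything I know about how this network actually works: the `Channel` fields, the `genome` list of content settings, the fitness weights, and the crossover and mutation choices are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Channel:
    genome: list      # hypothetical content settings (thumbnail style, clip order, ...)
    days_alive: int   # longevity: days the channel lasted, or has lasted so far
    views: int
    income: float
    reach: int


def fitness(ch):
    # Arbitrary illustrative weights; the only given is the four factors themselves.
    return ch.days_alive + ch.views / 1e6 + ch.income / 100 + ch.reach / 1e5


def next_generation(population, rng, min_days=7, mutation_scale=0.1):
    """Return the genomes for the next generation of channels."""
    # Cull every channel that was shut down in under a week.
    lasted = [ch for ch in population if ch.days_alive >= min_days]
    # Of the rest, keep at most the top-performing half of the original population.
    lasted.sort(key=fitness, reverse=True)
    parents = lasted[: len(population) // 2]
    if len(parents) < 2:
        raise ValueError("need at least two surviving channels to breed")

    children = []
    while len(children) < len(population):
        a, b = rng.sample(parents, 2)
        # Uniform crossover: each setting is copied from one parent at random.
        child = [rng.choice(pair) for pair in zip(a.genome, b.genome)]
        # Small Gaussian mutation keeps some variation between channels.
        child = [g + rng.gauss(0, mutation_scale) for g in child]
        children.append(child)
    return children
```

Calling `next_generation(population, random.Random(0))` on a list of `Channel` records would give back one new genome per original channel. The new channels would then have to run for a while before they get their own longevity, views, income, and reach numbers for the next round.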
So does AI like this render things like captchas useless? Even the tracking behind the "I’m not a robot" checkbox? I feel like this is a big deal, more so than just a bunch of spammers flooding junk. Now it’s AI vs. AI, the same way Wall Street is algorithm vs. algorithm. I’m curious to see any other thoughts on this, and perhaps this network I’ve stumbled upon isn’t as sophisticated as I think it is? Perhaps it’s a hybrid of mixed user and bot interaction? Or a large network of Mechanical Turk-type "human-bots"? That only explains some of it though, because I have a pretty good grasp of what AI-generated text looks like, and those comments REALLY point in that direction to me.
I really don’t want to share these videos, as it just adds to the problem, but it’s kind of hard to talk about without a source. So here are some of the ones I’ve found. These are links to the channels themselves, so you don’t actually give them any views if you stay on them. /channel/UC5SK4xqbIdgvW84SmTmpEug/featured /channel/UCEREgnbM4H1S6f5JG3IfMUw/featured /channel/UCBn0LGNo4e3EYucsEH9rg4g/featured /channel/UCRzo63EHjjE_PNrtAlmje-A /channel/UCMMIpxNLCGiRNyEtvkwV2IQ
And here are some reddit posts I found on them, though they’re not very useful: https://www.reddit.com/r/ElsaGate/comments/7dq9r4/did_toy_freaks_turn_into_freaks_crazzy_a_youtube/ https://www.reddit.com/r/youtube/comments/5zefuo/did_something_change_with_youtubes_recommended/