Community analytics in the age of bots

Friday, May 29th 2020

Let me explain where the numbers came from before bots, why they changed after bots, and why the post-bots numbers were misleading.

  • How I compiled this data before bots were introduced
  • How the data changed when bots were introduced
  • How the bot changes affected the data being displayed

Long story short: each match now has fewer human players in it than before. This means I would need to process a lot more matches to get a more accurate approximation of the player base.
Processing the same number of matches pre-bots vs post-bots would give the appearance of a huge player drop. This is not an accurate representation of the player base and I’ll explain why. The thing I built fundamentally changed after bots were released and I did not provide this information soon enough.

What I can say with absolute certainty: the previous numbers, and any future ones, do not directly indicate a smaller or larger player base.


How I compiled community analytics data before bots

This data is pulled from match reports that were viewed on PUBG Lookup. This is a very important distinction. This is not "sample data" or necessarily even randomized data. I would track which match reports were viewed, put those in a queue for processing, and then cycle through each map to count up the statistics that were displayed, including player counts.

I rounded this data because it isn't exact: it depends on how much traffic goes to PUBG Lookup and how many match reports are viewed on a given day. For example, if 10,123 match reports were viewed on a Monday, I would display 10,000 matches and a rounded count of the players in those matches.
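
To make the rounding concrete, here is a minimal Python sketch. It is not the actual PUBG Lookup code: the function name round_to_nearest and the choice of rounding to the nearest thousand are assumptions made for illustration, since the only thing stated above is that 10,123 viewed reports were displayed as 10,000.

```python
def round_to_nearest(count, step=1000):
    """Round a non-negative count to the nearest multiple of step (halves round up).

    The step of 1000 is an assumption; the post does not say which
    granularity was actually used.
    """
    return (count + step // 2) // step * step


viewed_reports = 10_123
print(round_to_nearest(viewed_reports))  # 10000
```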

Before bots, every player in a match was represented in the API under a variable called "Participants". This made it easy to see how many players were in a match. If you looked at 10,000 matches, every single player in them was represented in the "Participants" variable, so you could expect these numbers to be reasonably reliable. If there was a game with 100 players, it would count all 100 players. If there was an FPP solo match with 43 players, it would count those 43.

Pre-bots, every match had more human players in it on average, because every player was human. This means that, pre-bots, looking at 10,000 matches would display more players because they were all human and were all represented in the "Participants" variable. This changed when bots were released.
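
The pre-bots counting described above boils down to adding up the length of each match's participant list. A minimal sketch, assuming each match report has already been loaded into a dict with a "participants" list (that shape and the name count_players are hypothetical, not the real API response format):

```python
def count_players(matches):
    """Total number of entries across every match's participant list."""
    return sum(len(match["participants"]) for match in matches)


# Two made-up matches: a full 100-player game and a 43-player FPP solo match.
matches = [
    {"participants": [f"player_{i}" for i in range(100)]},
    {"participants": [f"solo_{i}" for i in range(43)]},
]
print(count_players(matches))  # 143
```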

How the data changed when bots were released

From a programming standpoint, the information that is passed to us via the official API changed a bit. The "Participants" variable now only shows human players. It does not show bot players. The only way to get bot players is to cycle through the Telemetry file to get a list of both bot and human players, then do the math.
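
Here is a rough sketch of "doing the math". It assumes you have already extracted two lists of player names for one match: the names in "Participants" and every name that shows up anywhere in the Telemetry file. Both inputs, and the function name split_humans_and_bots, are hypothetical. The real Telemetry file has its own event format, which this sketch does not try to reproduce.

```python
def split_humans_and_bots(participant_names, telemetry_names):
    """Return (human_count, bot_count) for a single match.

    participant_names: names listed under "Participants" (humans only).
    telemetry_names:   every player name seen in the telemetry, humans and
                       bots alike, possibly repeated across many events.
    Both inputs are assumed to be pre-extracted lists of strings.
    """
    humans = set(participant_names)
    everyone = set(telemetry_names)  # a set drops the repeated names
    bots = everyone - humans         # whoever is not a known human
    return len(humans), len(bots)


participants = ["alice", "bob", "carol"]
telemetry = ["alice", "bot_1", "bob", "bot_2", "alice", "carol", "bot_1"]
print(split_humans_and_bots(participants, telemetry))  # (3, 2)
```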

This means that, since bots were released, every public match has fewer human players to be calculated. I want to repeat that... every match has fewer human players to be calculated.
At this point, the maximum number of human players that can possibly be counted in a match is 80, even if there are 100 players: 80 humans and 20 bots. The lowest percentage of bots I have seen in any public match is 20%. I believe that PUBG might have a hard minimum of 20% bots, no matter what the queues look like.

I hope they change this soon. The highest percentage of bots I've seen is 99%, meaning 1 human player and 99 bots. So each match contributes fewer players to the count, BUT more matches are being played.
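
For reference, the bot percentages quoted above work out like this. The function name bot_percentage is just for illustration; the two example matches are the extremes mentioned in the text.

```python
def bot_percentage(humans, bots):
    """Share of a match's players that are bots, as a percentage."""
    total = humans + bots
    if total == 0:
        raise ValueError("match has no players")
    return 100 * bots / total


print(bot_percentage(80, 20))  # 20.0  (fewest bots seen)
print(bot_percentage(1, 99))   # 99.0  (most bots seen)
```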

How the bot changes affect the data being displayed

Back to our example of 10,000 matches. If 10,000 matches are viewed and we now have a hard cap of 80 humans at most and 1 human at least per match, you'll see a DRAMATIC decrease in the "player count" that was displayed. The same 10,000 matches now contain significantly fewer human players, which shows up as a dramatic decrease in HUMAN player count.
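
A small worked example of that effect. The per-match limits (100 players pre-bots, at most 80 and at least 1 human post-bots) come from this post. The post-bots average of 40 humans per match is a made-up number purely for illustration, and the function names are hypothetical.

```python
MATCHES = 10_000


def humans_counted(matches, humans_per_match):
    """Humans that show up in the count for a given per-match figure."""
    return matches * humans_per_match


def matches_needed(target_humans, humans_per_match):
    """Matches required to count at least target_humans (ceiling division)."""
    return -(-target_humans // humans_per_match)


pre_bots_max = humans_counted(MATCHES, 100)     # every slot is a human
post_bots_max = humans_counted(MATCHES, 80)     # best case: 20% bots
post_bots_min = humans_counted(MATCHES, 1)      # worst case: 99% bots
assumed_post_bots = humans_counted(MATCHES, 40) # hypothetical average

print(pre_bots_max, post_bots_max, post_bots_min, assumed_post_bots)
# 1000000 800000 10000 400000

# Matches needed to count as many humans as 10,000 full pre-bots matches,
# at the hypothetical post-bots average of 40 humans per match.
print(matches_needed(pre_bots_max, 40))  # 25000
```

Even in the best case, the same 10,000 matches can show at most 800,000 humans where they could previously show up to 1,000,000, whether or not the player base itself changed.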

This is what everybody was reading and misunderstanding. That's not your fault, it's mine. I should have removed that page as soon as I realized how the change affected the data, because it's hard to understand unless you know how the data was being gathered and how the API changed.

To sum it up

  • The matches that were analyzed were only matches that were viewed by a user on PUBG Lookup. If fewer people go to PUBG Lookup, the numbers go down; if more people go to PUBG Lookup, the numbers go up. That applies both to the number of matches analyzed and to the number of players analyzed. If I had to guess, I would say that pre-bots the average match had somewhere around 85 players or so. Maybe more. After bots, especially in the very first few days, the average human player count was significantly lower. This is why so many of you noticed a high percentage of bots in your matches.
  • When the average number of human players per match decreases, but we are counting the same number of matches (or even more matches), the page displays what appears to be a drop in the player base. That is not necessarily the case. I'm not saying with 100% certainty that there isn't a drop in the player base. Only PUBG knows that. What I am saying, though, is that the numbers displayed here did not paint the full picture and misrepresented the player base after bots were released (at least compared to the previous implementation).

Let me be clear: the player base may have dropped since bots were released. I don't have 100% verifiable information on that and PUBG doesn't release that data. It's also against the “terms of service” of the PUBG API to mine that much data to try to get a full player base count. What I can say is that I haven't seen a significant drop in players from the data I'm looking at.


How can PUBG Lookup fix this to show data that is more accurate or more easily understood?

That's actually a tough one. As I write in the HUGE disclaimer at the top of the page: "There are three kinds of lies: lies, damned lies, and statistics." Statistics are hard and, if not represented clearly, can be used to confirm whatever bias you might have going in. Some will look at these numbers and say "SEE. FEWER PEOPLE ARE PLAYING". Others will look at the same numbers, say "SEE. WAY MORE MATCHES ARE BEING PLAYED", and come to a different conclusion.

I'm working on how I'm going to gather matches to process. This page was specifically about cross play between PS4 and XBOX when I originally released it. That's changed quite a bit with Stadia and with bots being introduced and I'm going to have to figure out a more reliable way to represent this data. I would like to include bot data in these analytics as well.

I’ll keep everyone posted on progress with a new community analytics dashboard.


TL;DR - The community information is and was being perceived incorrectly and that's my fault. I should have removed the page sooner.

  • There are fewer human players in every match, which means the same number of matches counted pre-bots would show a significantly higher "player base".
  • The same number of matches counted post-bots would show a significantly lower "player base".
  • This alone does not mean that the player base is decreasing.
  • If the same people are going to PUBG Lookup post-bots and viewing their matches, then naturally it's going to look like the player base dropped because there are fewer humans in those matches.
  • I’m working on displaying new data but it’s difficult to represent in a way that won’t be easily misconstrued.