= University of Konstanz, Department of Politics and Public Administration, PhD Student

x University of Amsterdam, Amsterdam School of Communication Research, PhD Student

+ Penn State, Department of Political Science, Assistant Professor

All authors contributed equally to this project.

What is this?

We are conducting academic research on TikTok Politics. In light of the ongoing controversies involving the geopolitics of TikTok, we decided to share some preliminary results about how TikTok has actually been used to discuss US politics over the past two and a half years.

The metadata analyzed here—219,787 tiktoks from 1,767 distinct accounts—is only a small fraction of our full dataset. It only includes tiktoks created by accounts that we are very confident are “political.”

What’s the takeaway?

Among these 1,767 dedicated political TikTok accounts, there has been a massive increase in production since the onset of the coronavirus pandemic. Although our time frame ranges from January 2018 to the end of June 2020, the majority of tiktoks in our analysis were created in just the final three months.

We divide our analysis into left-leaning and right-leaning accounts. The first step was to distinguish between “political” and “non-political” content with the help of supervised machine learning. After an iterative process of hand-coding accounts, evaluating the out-of-sample accuracy of our classifier, and coding more edge cases, our present classifier achieves 84% out-of-sample accuracy at the account level (see note 1). Our goal here is to minimize “false positives” so that non-political accounts do not bias the results, so we set a high threshold for an account to be classified as “political”. To be included in this analysis, an account must have at least 90% of its tiktoks predicted to be political by our classifier.
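A minimal sketch of this inclusion rule, written in Python for illustration and not the code used for this analysis. The function name `political_accounts`, the input format (one pair of account id and per-tiktok prediction) and the toy data are assumptions; only the 90% threshold comes from the text above.

```python
from collections import defaultdict


def political_accounts(predictions, threshold=0.9):
    """Return the accounts whose share of tiktoks predicted political
    is at least `threshold`.

    `predictions` is an iterable of (account_id, is_political) pairs,
    one per tiktok (hypothetical input format).
    """
    totals = defaultdict(int)
    political = defaultdict(int)
    for account_id, is_political in predictions:
        totals[account_id] += 1
        political[account_id] += int(bool(is_political))
    return {a for a in totals if political[a] / totals[a] >= threshold}


# Toy data (not from our dataset): "a" is 10/10 political, "b" 9/10, "c" 5/10.
toy = (
    [("a", True)] * 10
    + [("b", True)] * 9 + [("b", False)]
    + [("c", True)] * 5 + [("c", False)] * 5
)
included = political_accounts(toy)
```

Raising `threshold` trades coverage for fewer false positives, which is the trade-off described above.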

After this aggressive step, it is straightforward to code the remaining accounts as left- or right-leaning. Many of the more occasionally political tiktok accounts have less consistent or more esoteric ideological perspectives, posing a challenge for future analysis.

Here, however, we identify 794 right-leaning and 973 left-leaning accounts. These two ideological clusters follow a similar trajectory in tiktok production until the beginning of widespread lockdowns in March 2020, at which point the left cluster begins to take off. The Black Lives Matter protests that began in May 2020 reinforced this trend; the final week of May saw roughly twice as many left-leaning tiktoks as right-leaning tiktoks. The two trends have since drawn closer together.
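The weekly comparison can be sketched as below. This is an illustration under stated assumptions, not the dashboard's code: we assume each tiktok comes with a creation date and a cluster label, and that weeks start on Monday; the names `week_start` and `weekly_counts` and the toy rows are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def week_start(day):
    """Return the Monday of the week containing `day` (assumed week definition)."""
    return day - timedelta(days=day.weekday())


def weekly_counts(tiktoks):
    """Count tiktoks per (week start, cluster).

    `tiktoks` is an iterable of (created_on, cluster) pairs, where
    created_on is a datetime.date and cluster is "left" or "right".
    """
    return Counter((week_start(d), c) for d, c in tiktoks)


# Toy rows (not real data): three tiktoks in the week of May 25, one in the next.
toy = [
    (date(2020, 5, 25), "left"),
    (date(2020, 5, 27), "left"),
    (date(2020, 5, 31), "right"),
    (date(2020, 6, 1), "right"),
]
counts = weekly_counts(toy)
```

Dividing the left count by the right count for a given week gives the kind of ratio quoted above for the final week of May.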

However, there is a much smaller disparity in the consumption of tiktoks by these accounts. The “Weekly Plays” tab shows the right-leaning cluster lagging the left-leaning cluster only slightly; the two groups saw almost exactly the same number of plays in the final week of May, and the right-leaning cluster has in fact proven more popular since.

We also provide descriptive analysis of the adoption of various political hashtags, the words used in the account bios (captured in late June), and the words in the tiktok descriptions. These are useful for checking the face validity of our results, and they track the major trends in US politics during this time period. It is noteworthy that political hashtags with a clear partisan leaning seem to occur in both a cross-partisan and a purely partisan way. In the purely partisan category, hashtags like #kag (Keep America Great) are mostly used by right-leaning accounts, whereas hashtags like #progressive or #leftist are predominantly used by left-leaning accounts. However, there are also partisan hashtags that are used by both left- and right-leaning accounts, such as #democrat(s) or #biden and, to a lesser degree, #maga (Make America Great Again) and #trump. Upon further inspection there seem to be two reasons for this: 1. Tiktokers hope to increase their visibility in the algorithm and fit as many hashtags as they can under their videos, a phenomenon which can be dubbed hashtag stacking. 2. Political tiktokers intentionally target the hashtags that signal opposing viewpoints because they want to expose people from the other political side to their content, often in videos that ridicule them or ask rhetorical questions. This is an interesting phenomenon more specific to TikTok, as you would not expect a left-leaning Twitter user to make heavy use of hashtags like #trumptrain or #trump2020, because it would be understood as expressing support for President Trump. On (political) TikTok, using various hashtags is just part of the game to gain ever more exposure through the algorithm.
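One way to separate purely partisan from cross-partisan hashtags is to compute, for each hashtag, the share of its uses that come from left-leaning accounts. The sketch below is illustrative only: the function names, the 0.25 margin, and the toy counts are assumptions and do not reproduce our figures.

```python
from collections import defaultdict


def hashtag_left_share(uses):
    """Map each hashtag to the share of its uses made by left-leaning accounts.

    `uses` is an iterable of (hashtag, cluster) pairs, one per hashtag use.
    """
    left = defaultdict(int)
    total = defaultdict(int)
    for tag, cluster in uses:
        total[tag] += 1
        if cluster == "left":
            left[tag] += 1
    return {tag: left[tag] / total[tag] for tag in total}


def label(share, margin=0.25):
    """Label a hashtag from its left share; `margin` is an arbitrary cutoff."""
    if share >= 1 - margin:
        return "mostly left"
    if share <= margin:
        return "mostly right"
    return "cross-partisan"


# Toy counts chosen for illustration, not taken from our data.
toy = (
    [("#kag", "right")] * 9 + [("#kag", "left")]
    + [("#leftist", "left")] * 10
    + [("#biden", "left")] * 6 + [("#biden", "right")] * 4
)
shares = hashtag_left_share(toy)
labels = {tag: label(s) for tag, s in shares.items()}
```

Because the two clusters differ in size and output, a real comparison would likely normalize by each cluster's total number of tiktoks before computing shares.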

There is also a “Mentions Network” that plots duets and tiktok mentions. Fine-grained inference is difficult with this graph, but the overall network structure indicates a higher level of cross-ideological contact than is generally observed on other social media platforms (cross-ideological contact is colored in purple). This reflects the way that political tiktokers take advantage of the platform’s affordances (like the duet function) to argue with or “dunk on” their ideological opponents. The lower-left discourse cluster that emerges in the network graph consists of the so-called political hype houses, associations of political tiktokers who decide to produce content together and often engage in “debates” with rival hype houses holding opposing political views. The upper-right cluster in the network graph comprises many users who are people of color, from both the left and the right, often talking about issues around the recent wave of Black Lives Matter protests.
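A simple summary of cross-ideological contact is the share of mention or duet edges that connect accounts with different leanings. The following is a minimal sketch, assuming an edge list of (source, target) account pairs and a dictionary of account leanings; both the data layout and the function name `cross_ideological_share` are hypothetical.

```python
def cross_ideological_share(edges, lean):
    """Share of mention/duet edges linking accounts with different leanings.

    `edges` is an iterable of (source, target) account pairs and `lean`
    maps account -> "left" or "right". Edges with an unlabeled endpoint
    are skipped.
    """
    labeled = [(s, t) for s, t in edges if s in lean and t in lean]
    if not labeled:
        return 0.0
    cross = sum(1 for s, t in labeled if lean[s] != lean[t])
    return cross / len(labeled)


# Toy network: account "x" has no label, so the edge ("b", "x") is ignored.
lean = {"a": "left", "b": "left", "c": "right", "d": "right"}
edges = [("a", "c"), ("c", "a"), ("a", "b"), ("d", "c"), ("b", "x")]
share = cross_ideological_share(edges, lean)
```

In the toy network, two of the four labeled edges cross the ideological divide; these correspond to the edges colored purple in the graph.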

Cross-ideological contact is often seen as normatively desirable (in opposition to the dreaded “echo chamber”), but we caution that these conversations are often far from deliberative. Toxicity scoring of the transcripts (obtained via Mozilla’s DeepSpeech) with the help of Google’s Perspective API reveals that right-leaning videos are somewhat more toxic on average when they mention left-leaning accounts. But tiktokers do not just use speech to communicate their support or disdain for other users and (opposing) ideas. This is illustrated in the “Top TikTok Music” plot. These “sounds” are a novel technological affordance. They represent “meme formats” encoded directly into the metadata of the platform. Each user can choose a “sound” from TikTok’s library and instantiate the associated meme through a combination of external images and their own bodily performance. These “embodied memes” are rich with social information but light on deliberative, reasoned discussion. Perusing the tiktoks using the sound from Tekashi 6ix9ine’s song (?) “Gooba” is informative.
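Returning to the toxicity comparison: once each transcript has a toxicity score between 0 and 1, the averages can be grouped by who is speaking and who is being mentioned. The sketch below assumes the scores are already computed (it does not call DeepSpeech or the Perspective API), and the record layout, function name, and toy values are all hypothetical.

```python
from statistics import mean


def mean_toxicity_by_direction(videos, lean):
    """Average toxicity grouped by (author leaning, mentioned account leaning).

    `videos` is an iterable of dicts with keys "author", "mentions"
    (a list of account ids) and "toxicity" (a precomputed score in [0, 1]).
    Mentions of, or by, unlabeled accounts are skipped.
    """
    groups = {}
    for video in videos:
        author_lean = lean.get(video["author"])
        for mentioned in video["mentions"]:
            target_lean = lean.get(mentioned)
            if author_lean is None or target_lean is None:
                continue
            groups.setdefault((author_lean, target_lean), []).append(video["toxicity"])
    return {direction: mean(scores) for direction, scores in groups.items()}


# Toy records with made-up scores, for illustration only.
lean = {"l1": "left", "r1": "right"}
videos = [
    {"author": "r1", "mentions": ["l1"], "toxicity": 0.5},
    {"author": "r1", "mentions": ["l1"], "toxicity": 0.25},
    {"author": "l1", "mentions": ["r1"], "toxicity": 0.25},
]
averages = mean_toxicity_by_direction(videos, lean)
```

Comparing `averages[("right", "left")]` with `averages[("left", "right")]` mirrors the comparison reported above.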

Where can I learn more?

We are giving a presentation of our expanded results at the 2020 PACSS Conference hosted by Northeastern University on August 13; we encourage interested readers to attend. We have also shared a condensed version of our theoretical framework for understanding the affordances of TikTok.

Descriptives

Number of Tiktoks in Sample: 219,787

Number of Unique TikTok Users in Sample: 1,767

Time Series (cumulative): Total Tiktoks, Total Diggs, Total Plays, Total Comments

Time Series (Weekly Sum): Weekly Tiktoks, Weekly Diggs, Weekly Plays, Weekly Comments

Hashtags Over Time (cumulative and Weekly Sum): #biden, #bernie, #blacklivesmatter, #justiceforgeorgefloyd, #alllivesmatter, #bluelivesmatter, #maga, #kag, #trump, #lgbt, #transrights, #corona, #pandemic, #liberal, #progressive, #conservative, #leftist, #democrat, #republican, #capitalism, #communism, #socialism, #impeachment

Political Duets: Mentions Network, Lower-Left Cluster, Upper-Right Cluster, Toxicity Scores of Transcripts

Bios & Descriptions: Top Words in Account Bios, Top Words in Tiktok Descriptions

Top TikTok Music

Cumulative Tiktoks per User: Distribution, Log-Scaled


  1. At the beginning of the classification process, we set aside 20% of the accounts so that we could use them to evaluate the “out-of-sample” performance of the classifier. Labeling occurred at the account level, which is why, even though we are classifying individual tiktoks, we performed the train-test split and report the performance metrics at the account level.
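The account-level split described in this note can be sketched as follows. This is an illustration, not the code used for the analysis: the function names, the fixed random seed, and the toy account ids are assumptions; only the 20% hold-out share comes from the note.

```python
import random


def split_accounts(account_ids, test_share=0.2, seed=0):
    """Hold out `test_share` of the accounts for evaluation.

    Splitting on accounts (not on individual tiktoks) keeps all tiktoks
    of an account on the same side of the split.
    """
    ids = sorted(set(account_ids))
    random.Random(seed).shuffle(ids)
    n_test = round(len(ids) * test_share)
    return set(ids[n_test:]), set(ids[:n_test])


def account_accuracy(true_labels, predicted_labels):
    """Share of held-out accounts whose predicted label matches the hand-coded one."""
    hits = sum(predicted_labels[a] == y for a, y in true_labels.items())
    return hits / len(true_labels)


# Toy account ids for illustration.
accounts = [f"acct{i}" for i in range(10)]
train, test = split_accounts(accounts)
```

The classifier is then fit on tiktoks from the `train` accounts only, and `account_accuracy` is computed on the `test` accounts.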