Edited by Arild Underdal, University of Oslo, Oslo, Norway, and approved January 14, 2021 (received for review November 15, 2020)
Author contributions: M.C., G.D.F.M., A.G., W.Q., and M.S. designed research, performed research, contributed new reagents/analytic tools, analyzed data, and wrote the paper.
We explore the key differences between the main social media platforms and how they are likely to influence information spreading and the formation of echo chambers. To assess the different dynamics, we perform a comparative analysis on more than 100 million pieces of content concerning controversial topics (e.g., gun control, vaccination, abortion) from Gab, Facebook, Reddit, and Twitter. The analysis focuses on two main dimensions: 1) homophily in the interaction networks and 2) bias in the information diffusion toward like-minded peers. Our results show that the aggregation in homophilic clusters of users dominates online dynamics. However, a direct comparison of news consumption on Facebook and Reddit shows higher segregation on Facebook.
Social media may limit the exposure to diverse perspectives and favor the formation of groups of like-minded users framing and reinforcing a shared narrative, that is, echo chambers. However, the interaction paradigms among users and feed algorithms greatly vary across social media platforms. This paper explores the key differences between the main social media platforms and how they are likely to influence information spreading and the formation of echo chambers. We perform a comparative analysis of more than 100 million pieces of content concerning several controversial topics (e.g., gun control, vaccination, abortion) from Gab, Facebook, Reddit, and Twitter. We quantify echo chambers over social media by two main ingredients: 1) homophily in the interaction networks and 2) bias in the information diffusion toward like-minded peers. Our results show that the aggregation of users in homophilic clusters dominates online interactions on Facebook and Twitter. We conclude the paper by directly comparing news consumption on Facebook and Reddit, finding higher segregation on Facebook.
Social media radically changed the mechanism by which we access information and form our opinions (1–5). We need to understand how people seek or avoid information and how those decisions affect their behavior (6), especially when the news cycle, dominated by the disintermediated diffusion of information, alters the way information is consumed and reported on. A recent study (7) limited to Twitter claimed that fake news travels faster than real news. However, a multitude of factors affects information spreading on social media platforms. Online polarization, for instance, may foster misinformation spreading (1, 8). Our attention span remains limited (9, 10), and feed algorithms might limit our selection process by suggesting contents similar to the ones we are usually exposed to (11–13). Furthermore, users show a tendency to favor information adhering to their beliefs and join groups formed around a shared narrative, that is, echo chambers (1, 14–18). We can broadly define echo chambers as environments in which the opinion, political leaning, or belief of users about a topic gets reinforced due to repeated interactions with peers or sources having similar tendencies and attitudes. Selective exposure (19) and confirmation bias (20) (i.e., the tendency to seek information adhering to preexisting opinions) may explain the emergence of echo chambers on social media (1, 17, 21, 22).
According to group polarization theory (23), an echo chamber can act as a mechanism to reinforce an existing opinion within a group and, as a result, move the entire group toward more extreme positions. Echo chambers have been shown to exist in various forms of online media such as blogs (24), forums (25), and social media sites (26–28). Some studies point out echo chambers as an emerging effect of human tendencies, such as selective exposure, contagion, and group polarization (13, 23, 29–31). However, recently, the effects and the very existence of echo chambers have been questioned (2, 27, 32). This issue is also fueled by the scarcity of comparative studies on social media, especially concerning news consumption (33). In this context, the debate around echo chambers is fundamental to understanding social media’s influence on information consumption and public opinion formation. In this paper, we explore the key differences between social media platforms and how they are likely to influence the formation of echo chambers. As recently shown in the case of selective exposure to news outlets, studies considering multiple platforms can offer a fresh view on long-debated problems (34). Different platforms offer different interaction paradigms to users, ranging from retweets and mentions on Twitter to likes and comments in groups on Facebook, thus triggering very different social dynamics (35). We introduce an operational definition of echo chambers to provide a common methodological ground to explore how different platforms influence their formation. In particular, we operationalize the two common elements that characterize echo chambers into observables that can be quantified and empirically measured, namely, 1) the inference of the user’s leaning for a specific topic (e.g., politics, vaccines) and 2) the structure of their social interactions on the platform.
Then, we use these elements to assess echo chambers’ presence by looking at two different aspects: 1) homophily in interactions concerning a specific topic and 2) bias in information diffusion from like-minded sources. We focus our analysis on multiple platforms: Facebook, Twitter, Reddit, and Gab. These platforms present similar features and functionalities (e.g., they all allow social feedback actions such as likes or upvotes) and design (e.g., Gab is similar to Twitter) but also distinctive features (e.g., Reddit is structured in communities of interest called subreddits). Reddit is one of the most visited websites worldwide (https://www.alexa.com/siteinfo/reddit.com) and is organized as a forum to collect discussions on a wide range of topics, from politics to emotional support. Gab claims to be a social platform aimed at protecting freedom of speech. However, low moderation and regulation on content has resulted in widespread hate speech. For these reasons, it has been repeatedly suspended by its service provider, and its mobile app has been banned from both App and Play stores (36). Overall, we account for the interactions of more than 1 million active users on the four platforms, for a total of more than 100 million unique pieces of content, including posts and social interactions. Our analysis shows that platforms organized around social networks and news feed algorithms, such as Facebook and Twitter, favor the emergence of echo chambers.
We conclude the paper by directly comparing news consumption on Facebook and Reddit, finding higher segregation on Facebook than on Reddit.
To explore the key differences between social media platforms and how they influence echo chambers’ formation, we need to operationalize a definition for them. First, we need to identify the attitude of users at a microlevel. On online social media, the individual leaning of a user toward a specific topic,
This section explains how we implement the operational definitions given above on different social media. For each medium, we detail 1) how we quantify users’ leaning, and 2) how we reconstruct how information spreads.
We consider the set of tweets posted by user
We quantify the individual leaning of users considering endorsements in the form of likes to posts. Posts are produced by pages that are labeled in a certain number of categories, and, to each category, we assign a numerical value (e.g., Anti-Vax [+1] or Pro-Vax [–1]). Each like to a post (only one like per post is allowed) represents an endorsement for that content, which is assumed to be aligned with the leaning associated with the page. Thus, the user’s leaning is defined as the average of the content leanings of the posts liked by the user, according to Eq. 1.
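The averaging step described above can be illustrated with a minimal sketch. The data layout (a list of `(user, post)` likes, a `post_page` lookup, and a `page_leaning` lookup) and all names are assumptions made for illustration; they are not taken from the paper's pipeline.

```python
from collections import defaultdict
from statistics import mean


def user_leanings(likes, page_leaning, post_page):
    """Return {user: mean leaning of the posts that user liked}.

    likes        -- iterable of (user, post) pairs, one like per user and post
    page_leaning -- hypothetical mapping page -> numeric leaning (e.g., +1 or -1)
    post_page    -- hypothetical mapping post -> page that produced it
    """
    liked = defaultdict(list)
    for user, post in likes:
        page = post_page[post]
        if page in page_leaning:  # skip posts from pages without a label
            # A like is treated as an endorsement of the page's leaning.
            liked[user].append(page_leaning[page])
    return {user: mean(values) for user, values in liked.items()}


# Toy data: page_a is labeled +1 (Anti-Vax), page_b is labeled -1 (Pro-Vax).
page_leaning = {"page_a": 1.0, "page_b": -1.0}
post_page = {"p1": "page_a", "p2": "page_a", "p3": "page_b"}
likes = [("u1", "p1"), ("u1", "p2"), ("u2", "p1"), ("u2", "p3"), ("u3", "p3")]

leanings = user_leanings(likes, page_leaning, post_page)
```

In this toy example, a user who likes only posts from one category ends up at that category's value, while a user who likes one post from each category ends up at 0.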
We analyze three different datasets collected on Facebook regarding a specific topic of discussion: vaccines, science versus conspiracy, and news. The interaction network is defined by considering comments. In such an interaction network, two users are connected if they cocommented on at least one post. Henceforth, we focus on the dataset about vaccines and news, and others are shown in SI Appendix.
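As a rough illustration of the cocommenting rule, the sketch below links two users whenever they commented on the same post. The input format (a list of `(user, post)` comment records) and the function name are hypothetical; the resulting network is undirected and unweighted, which is one possible reading of the rule.

```python
from collections import defaultdict
from itertools import combinations


def cocomment_edges(comments):
    """Return the set of undirected edges (u, v), with u < v, linking users
    who commented on at least one common post."""
    commenters = defaultdict(set)
    for user, post in comments:
        commenters[post].add(user)  # repeated comments count once

    edges = set()
    for users in commenters.values():
        # Every pair of distinct commenters on the same post is connected.
        for u, v in combinations(sorted(users), 2):
            edges.add((u, v))
    return edges


comments = [
    ("u1", "p1"), ("u2", "p1"), ("u3", "p1"),
    ("u3", "p2"), ("u4", "p2"),
    ("u1", "p1"),  # a second comment by u1 on p1 adds no new edge
]
edges = cocomment_edges(comments)
```

Posts with many commenters produce a number of edges that grows quadratically with the number of commenters, so a real dataset may call for a sparser representation than a plain set of pairs.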
The individual leaning of users is quantified similarly to Twitter by considering the links to news organizations in the content produced by the users, submissions, and comments. We build the interaction network considering comments and submissions. There exists a direct link from node
We analyze three datasets collected on different subreddits: the_donald, Politics, and News. In the following, we focus on the dataset collected on the Politics and the News subreddits, and others are shown in SI Appendix.
The political leaning
In the following, we perform a comparative analysis of four different social media. We select one dataset for each social medium: Abortion (Twitter), Vaccines (Facebook), Politics (Reddit), and Gab as a whole. Results for other datasets for the same medium are qualitatively similar, as shown in SI Appendix. We first characterize echo chambers in the networks’ topology, and then look at their effects on information diffusion. Finally, we directly compare news consumption on Facebook and Reddit.
The network’s topology can reveal echo chambers, where users are surrounded by peers with similar leanings, and thus they get exposed, with a higher probability, to similar contents. In network terms, this translates into a node


Joint distribution of the leaning of users
The presence of homophilic interactions can be confirmed by the community structure of the interaction networks. We detected communities by applying the Louvain algorithm (41), removing singleton communities with only one user. Then, we computed each community’s average leaning, determined as the average of individual leanings of its members. Fig. 2 shows the communities emerging for each social medium, arranged by increasing average leaning on the


Size and average leaning of communities detected in different datasets. A and C show the full spectrum of leanings related to the topics of abortions and vaccines with regard to communities in B and D, where the political leaning is less sparse.
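The post-processing of detected communities (dropping singletons, averaging member leanings, ordering by average leaning) can be sketched as follows. The partition is assumed to be given as a list of sets of users, for instance the output of any Louvain implementation; the function and variable names are illustrative only.

```python
from statistics import mean


def community_summary(communities, leaning):
    """Return (size, average leaning) per community, sorted by average leaning.

    communities -- list of sets of users (assumed output of community detection)
    leaning     -- mapping user -> individual leaning
    """
    rows = []
    for members in communities:
        if len(members) < 2:
            continue  # remove singleton communities
        scored = [leaning[u] for u in members if u in leaning]
        if scored:  # ignore communities with no classified member
            rows.append((len(members), mean(scored)))
    return sorted(rows, key=lambda row: row[1])


communities = [{"a", "b", "c"}, {"d"}, {"e", "f"}]
leaning = {"a": 0.5, "b": 0.5, "c": 0.5, "d": 0.0, "e": -1.0, "f": -0.5}

summary = community_summary(communities, leaning)
```

Here the singleton `{"d"}` is discarded and the two remaining communities are listed from the lowest to the highest average leaning.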
Simple models of information spreading can gauge the presence of echo chambers: Users are expected to be more likely to exchange information with peers sharing a similar leaning (18, 42, 43). Classical epidemic models such as the susceptible–infected–recovered (SIR) model (44) have been used to study the diffusion of information, such as rumors or news (4546–47). In the SIR model, each agent can be in any of three states: susceptible (unaware of the circulating information), infectious (aware and willing to spread it further), or recovered (knowledgeable but not ready to transmit it anymore). Susceptible (unaware) users may become infectious (aware) upon contact with infected neighbors, with a specific transmission probability
The set of nodes in a recovered state at the end of the dynamics started with user
Fig. 3 shows the average leaning


Average leaning
Again, one can observe a clear distinction between Facebook and Twitter, on one side, and Reddit and Gab on the other side. For the topics of vaccines and abortion, on Facebook and Twitter, respectively, users with a given leaning are much more likely to be reached by information propagated by users with similar leaning, that is,
These results indicate that information diffusion is biased toward individuals who share a similar leaning in some social media, namely Twitter and Facebook. In contrast, in others—Reddit and Gab in our analysis—this effect is absent. Such a latter configuration may depend upon two factors: 1) Gab and Reddit are not bursting the echo chamber effects, or 2) we are observing the dynamic inside a single echo chamber.
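To make the diffusion experiment concrete, here is a minimal sketch of a discrete-time SIR process started from a single seed user, followed by the average leaning of the users it reaches. The update rule (each infected node tries each susceptible neighbor once per step with probability `beta`, then recovers with probability `nu`) and the parameter names are assumptions for illustration, not necessarily the exact scheme used in the analysis.

```python
import random
from statistics import mean


def sir_reached(adj, seed_node, beta, nu, rng):
    """Run one SIR realization and return the set of recovered nodes.

    adj  -- mapping node -> iterable of neighbors (undirected adjacency)
    beta -- assumed per-contact transmission probability
    nu   -- assumed per-step recovery probability (must be > 0 to terminate)
    """
    if not 0 < nu <= 1:
        raise ValueError("nu must be in (0, 1]")
    infected, recovered = {seed_node}, set()
    while infected:
        newly_infected = set()
        for u in infected:
            for v in adj.get(u, ()):
                susceptible = (
                    v not in infected
                    and v not in recovered
                    and v not in newly_infected
                )
                if susceptible and rng.random() < beta:
                    newly_infected.add(v)
        newly_recovered = {u for u in infected if rng.random() < nu}
        recovered |= newly_recovered
        infected = (infected - newly_recovered) | newly_infected
    return recovered  # includes the seed itself


# Toy network: a path a - b - c plus an isolated node d.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": []}
leaning = {"a": 1.0, "b": 0.5, "c": 0.0, "d": -1.0}
rng = random.Random(0)

reached_all = sir_reached(adj, "a", beta=1.0, nu=1.0, rng=rng)
reached_none = sir_reached(adj, "a", beta=0.0, nu=1.0, rng=rng)
mu = mean(leaning[u] for u in reached_all)  # average leaning of reached users
```

In practice one would average over many realizations and over many seeds, then compare the average leaning of the reached users with the leaning of the seed.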
Our results are robust for different values of the effective infection ratio
The striking differences observed across social media, in terms of homophily in the interaction networks and information diffusion, could be attributed to the different topics taken into account. For this reason, here we compare Facebook and Reddit on a common topic, news consumption. Facebook and Reddit are particularly apt to a cross-comparison since they share the definition of individual leaning (computed by using the classification provided by mediabiasfactcheck.org; see Materials and Methods for further details) and the rationale in creating connections among users that is based on an interaction network. Fig. 4 shows a direct comparison of news consumption on Facebook and Reddit along the metrics used in the previous sections to quantify the presence of echo chambers: 1) the correlation between the leaning of a user


Direct comparison of news consumption on (A) Facebook and (B) Reddit. Joint distribution of the leaning of users
Social media platforms provide direct access to an unprecedented amount of content. Platforms originally designed for user entertainment changed the way information spreads. Indeed, feed algorithms mediate and influence the content promotion accounting for users’ preferences and attitudes. Such a paradigm shift affected the construction of social perceptions and the framing of narratives; it may influence policy making, political communication, and the evolution of public debate, especially on polarizing topics. Indeed, users online tend to prefer information adhering to their worldviews, ignore dissenting information, and form polarized groups around shared narratives. Furthermore, when polarization is high, misinformation quickly proliferates.
Some have argued that the veracity of the information might be used as a determinant for information spreading patterns. However, selective exposure dominates content consumption on social media, and different platforms may trigger very different dynamics. In this paper, we explore the key differences between the leading social media platforms and how they are likely to influence the formation of echo chambers and information spreading. To assess the different dynamics, we perform a comparative analysis on more than 100 million pieces of content concerning controversial topics (e.g., gun control, vaccination, abortion) from Gab, Facebook, Reddit, and Twitter. The analysis focuses on two main dimensions: 1) homophily in the interaction networks and 2) bias in the information diffusion toward like-minded peers. Our results show that the aggregation in homophilic clusters of users dominates online dynamics. However, a direct comparison of news consumption on Facebook and Reddit shows higher segregation on Facebook. Furthermore, we find significant differences across platforms in terms of homophilic patterns in the network structure and biases in the information diffusion toward like-minded users. A clear-cut distinction emerges between social media having a feed algorithm tweakable by the users (e.g., Reddit) and social media that do not provide such an option (e.g., Facebook and Twitter). Our work provides important insights into the understanding of social dynamics and information consumption on social media. The next envisioned step addresses the temporal dimension of echo chambers, to understand better how different social feedback mechanisms, specific to distinct platforms, can impact their formation.
Here we provide details about the labeling of news outlets and the datasets considered.
The labeling of news outlets is based on the information reported by Media Bias/Fact Check (MBFC) (https://mediabiasfactcheck.com), an independent fact-checking organization that rates news outlets on the basis of the reliability and of the political bias of the contents they produce and share. The labeling provided by MBFC, retrieved in June 2019, ranges from Extreme Left to Extreme Right for political bias. The total number of media outlets for which we have a political label is 2,190. A detailed description of the source labeling process and political bias distribution can be found in SI Appendix.
For Gab, all data are available on the Pushshift public repository (https://pushshift.io/what-is-pushshift-io/) at this link: https://files.pushshift.io/gab/. Reddit data are available on the Pushshift public repository at this link: https://search.pushshift.io/reddit/. For Facebook and Twitter, we provide data according to their Terms of Service on the corresponding author’s institutional page at this link: https://walterquattrociocchi.site.uniroma1.it/ricerca. For news outlet classification, we used data from MBFC (https://mediabiasfactcheck.com), an independent fact-checking organization. Anonymized data have been deposited in Open Science Framework (10.17605/OSF.IO/X92BR) (49). For further details about data, refer to the following section.
Table 1 reports summary statistics of the datasets under consideration. Due to the structural differences among platforms, each dataset has different features. For Twitter, we used tweets regarding three topics collected by Garimella et al. (16), namely gun control, Obamacare, and abortion. Tweets linking to a news source with a known bias are classified based on MBFC. Facebook datasets were created by using the Facebook Graph API and were previously explored in refs. 50 (Science and Conspiracy), 51 (Vaccines), and 11 (News). For the two datasets Science and Conspiracy and Vaccines, data were labeled in a binary way (proscience/conspiracy and provaccines/antivaccines, respectively), based on the page where they were posted. Posts in the dataset News were instead classified based on MBFC labeling. Reddit datasets have been obtained by downloading comments and submissions posted in the subreddits Politics, the_donald, and News and labeled according to the classification obtained from MBFC. The Gab dataset has been collected from https://files.pushshift.io/gab and contains posts, replies, and quotations. Posts were labeled according to MBFC classification. Further details can be found in SI Appendix.

| Media | Dataset | Start date | Time span | Contents | Users | Coverage |
|---|---|---|---|---|---|---|
| Twitter | Gun control | June 2016 | 14 d | 19 million | 3,963 | 0.93 |
| Twitter | Obamacare | June 2016 | 7 d | 39 million | 8,703 | 0.90 |
| Twitter | Abortion | June 2016 | 7 d | 34 million | 7,401 | 0.95 |
| Facebook | Sci/Cons | January 2010 | 5 y | 75,172 | 183,378 | 1.00 |
| Facebook | Vaccines | January 2010 | 7 y | 94,776 | 221,758 | 1.00 |
| Facebook | News | January 2010 | 6 y | 15,540 | 38,663 | 1.00 |
| Reddit | Politics | January 2017 | 1 y | 353,864 | 240,455 | 0.15 |
| Reddit | the_donald | January 2017 | 1 y | 1.234 million | 138,617 | 0.16 |
| Reddit | News | January 2017 | 1 y | 723,235 | 179,549 | 0.20 |
| Gab | Gab | November 2017 | 1 y | 13 million | 165,162 | 0.13 |
For each dataset, we report the starting date of collection
We thank Fabiana Zollo and Antonio Scala for precious insights for the development of this paper. We are grateful to Geronimo Stilton and the Hypnotoad for inspiring the data analysis and result interpretation.