Abstract
The recent growth of social media has made individual investors more reliant on online media for information. This trend significantly affects investor behavior and information disparity. For instance, social media can give rise to “meme stocks,” whose prices rapidly rise and fall. Despite increasing interest in meme stocks, few studies have focused on information disparity among individual investors. In this study, we model meme stocks with a focus on investors who are influenced by information on their social network and who in turn spread that information. We combine the Susceptible-Infectious-Recovered (SIR) model with a network model in an artificial market simulation to examine information disparity among individual investors. Our findings show that conventional investors generally profit, whereas social media investors incur losses. Additionally, the profits of social media investors are heavily affected by their network structure and their network distance from the information source. The results highlight the issue of information inequality in financial markets and offer important guidance for policymakers.
Introduction
In recent years, social media has emerged as a significant new source of information in financial markets. The spread of social media has increased the number of individual investors using social media as an information source to make investments. In addition to traditional company and trend analyses, individual investors increasingly use information from online media for financial decisions [14]. The U.S. Internet forum Reddit is a particular example of this trend. Reddit is a giant message board, and its sub-channel, WallStreetBets (WSB), is a thread about investing with approximately 16 million registered users as of January 2023. This community is said to impact the U.S. stock and cryptocurrency markets significantly, becoming one of the most important information sources for individual investors. While traditional analysis methods are mainly based on financial reports, market trends, and economic indicators, online communities such as Reddit provide information based on the exchanges of opinions, rules of thumb, and emotional reactions among individual investors. This phenomenon shows one aspect of the impact of the social media diffusion on financial markets.
As the proliferation of social media continues, the phenomenon of “meme stocks” has emerged. Meme stocks are overvalued relative to their fundamentals, with prices rising sharply as they gain more attention in online communities [9]. One of the most famous meme-stock cases is GameStop, a U.S. video game retailer, whose stock became a popular topic on WSB in January 2021. Its stock price skyrocketed and hedge funds suffered heavy losses [4, 29]. This case has been widely noted as an example of how individual investors can be sufficiently influential to compete with hedge funds. Other examples include the movie chain AMC Entertainment Holdings and smartphone manufacturer Blackberry Limited.
Meme stocks have been studied in relation to their connections with social media [1, 4, 15, 16, 31], but information disparity among individual investors in the context of meme stocks remains under-researched, despite its importance. The disparity in access to information, which can lead to unequal opportunities among investors and may destabilize financial markets, has become a focal point in financial policy aimed at stabilizing markets. Some examples, such as the SEC’s Regulation Fair Disclosure (Footnote 1) [10] and ESMA’s guidelines on the disclosure of inside information under the Market Abuse Regulation (MAR) (Footnote 2) [6], illustrate that resolving information disparity is not merely a theoretical concern but a practical priority in financial policy. Information disparity can disadvantage some investors. In fact, not all individual investors profit from meme stocks. According to a report that examined Swedish individual investors, the total realized losses of Swedish individual investors who traded GameStop stock reached USD 7.92 million [20]. Additionally, 63% of investors who sold all their holdings in GameStop incurred losses. Furthermore, while the overall result of trading AMC stock was positive, more individuals incurred losses than made profits. Many individuals were positioned at the extreme ends of the profit distribution, having incurred either substantial gains or significant losses. Even among individual investors, characteristics may vary depending on whether they use social media and on their position within social media networks. However, no available data link profits to whether investors use social media or to their position on social media, making it difficult to analyze this information disparity empirically.
In this study, we use an artificial market simulation to analyze how two types of information asymmetry in meme stocks affect profit: whether individual investors use social media, and their position within the social media network. An artificial market simulation is a multi-agent simulation (MAS) that reproduces a financial market. MAS, a virtual environment populated by agents created on a computer, makes it possible to observe macro phenomena as the outcome of interactions among the agents’ micro behaviors. For this reason, MAS has attracted attention owing to its ability to deal with extreme events, such as meme stocks [13, 28]. An artificial market is appropriate for this study because it allows us to examine the impact of individual investors’ behavior on the market as a whole. In addition, by narrowing down the targets for observation and modeling them, MAS facilitates the analysis of the mechanism and allows us to perform present-day experiments. Furthermore, artificial markets do not require empirical data for analysis. Therefore, we hypothesize that artificial markets are effective for modeling meme stocks.
We develop a new investor model, separate from conventional investor models, to replicate investors whose investment decisions are driven by local information on their social network and who thereby cause information propagation (social media investors). To do so, we integrate the Susceptible-Infectious-Recovered (SIR) model and network models into an artificial market simulation. We use the SIR model to express the behavior of social media investors. The SIR model [12], which describes the epidemic process of infectious diseases, divides the population into three categories according to infection status: susceptible, infectious, and recovered. Thus, the SIR model helps analyze the rapid diffusion of specific information in a part of the market. Network models are used to model the connections among people on social media. We use them to analyze how the structure of the network formed by social media investors, and each investor’s position within it, affect profits.
Our results show that, overall, social media investors incur losses, while conventional investors make profits. The size of these losses depends on the network structure. Our results also reveal that, apart from the initial sources of information, most social media investors incur losses. Only highly rational individuals (information disseminators) consistently make profits. This study not only enhances our understanding of the meme stock phenomenon but also highlights the significance of the information disparity among individual investors and contributes to policy recommendations aimed at addressing this issue.
The remainder of this paper is organized as follows: Sect. 2 introduces related works. Section 3 explains the model. Section 3.1 describes the market, and Sect. 3.2 explains the FCN agent, which models the conventional investor. Section 3.3 explains the SIR agent, which models the social media investor, and is divided into order behavior, network formation, and state transition process. Section 4 explains the experiments, including the validation of the model. Section 5 presents the experimental results and discusses them. Section 6 concludes.
Related works
Meme stocks
Meme stocks are driven by the actions of individual investors influenced by social media; thus, meme stocks are studied in relation to their connections with social media. One primary research topic is the relationship between posts on online media and short squeezes. Lyócsa et al. [31] examined whether posts on WSB are related to the daily price fluctuations of four meme stocks and confirmed that increased discussion on WSB leads to greater price volatility for these four stocks. Anand and Pathak [1] demonstrated that there is a significant relationship between the tone of posts mentioning GameStop stock on WSB and the stock’s volatility or returns. Another primary research topic is the network of social media. Lucchini et al. [15] analyzed the propagation within WSB and the changes in the WSB network over time. Initially, WSB users who posted evidence of trading GameStop stock were at the center of the network, but at a certain point, users around them began to post, leading to a sudden change in the network structure. Mancini et al. [16] analyzed how the network of discussions on WSB grew to examine how consensus regarding GameStop was formed within the WSB community. Overall, previous research on meme stocks has not paid sufficient attention to the disparity among individual investors.
Artificial market simulation
Kikuchi et al. [13] developed an agent-based simulation platform that examines how current systems management regulations affect corresponding bankruptcies. The analysis of this platform focuses on macro-level failures owing to shocks caused by common market risk factors. Westphal & Sornette [28] used an agent-based model to examine how policymakers prevent bubbles generated by interactions between agents. Our study is positioned within this field to understand extreme events in the financial market using MAS.
SIR model
The SIR model has been applied to model not only infectious diseases but also various other phenomena, such as the spread of rumors and ideas [11, 22]. Wang and Wood [25] modeled the propagation of memes by modifying the SIR model to consider the possibility that even people who have lost interest in a meme may become interested again.
Network model
Network models are used to model the connections of people on social media. Several studies have combined network models with artificial market simulations. Ponta and Cincotti [21] investigated the impact of agents’ sentiment networks on market structure using directed graphs. Wang et al. [26] used simulations with a heterogeneous agent model to investigate the influence of social network structure on stock price disclosure. Hoffman et al. [8] created an agent model of investors in a social network using empirical data from a survey of individual investors’ decisions and social interactions.
Simulation model
In this study, we employ the platform for artificial market simulations (PAMS) as our artificial market platform [7]. PAMS has basic stock market mechanisms in a Python-based environment. PAMS inherits its design philosophy from the artificial market simulation platform Plham [24].
Figure 1 illustrates the overarching framework of the artificial market model used in the study. We design this model to simulate a market with a single-brand stock and include two distinct types of agents. At each step, an agent is randomly selected to place an order, which is then recorded in the market’s order books. The market processes these orders and attempts to execute them using the transaction mechanism (refer to Sect. 3.1) at every step.
The primary objective of this study is to examine the impact on individual investors’ profits of two types of information gaps in meme stocks: whether the investors use social media and their position within the social media network. To achieve this objective, we construct a generalized single-brand stock market model and an agent model that reproduces conventional investor behaviors (the FCN agent). Additionally, we introduce a new agent model (the SIR agent) to represent social media investors. To effectively evaluate the first hypothesis, we categorize agents into two types: those who incorporate social media information into their decision-making processes and those who do not. Consequently, we define conventional investors as those who do not use social media information in their investment decisions, and we define social media investors as those whose market participation is influenced by local information on their social media, causing information propagation. For the second hypothesis, we locate social media investors within the network. Our research focuses on the dissemination of information through social media, as opposed to other sources such as direct communication and information exchange between conventional investors and social media investors. This focus is due to the nature of meme stocks, which are typically overvalued relative to their fundamentals and experience rapid price increases driven by social media attention. Therefore, conventional investors are not placed within any network, and our simulation does not establish a direct connection between these two types of agents.
Market
For simplicity, we assume the presence of a single continuous double auction market. The continuous double auction market is a type of transaction mechanism in stock markets whereby investors continuously place buy or sell orders. The market has two order books: a sell order book and a buy order book. A new order is executed against an existing one on the opposite side of the order book, adhering to the principles of price and time priorities. If a new order does not find a match, it is added to the order book on the corresponding side. Subsequently, the best bid and ask prices are updated using the same process as in an actual continuous double auction market. An example order book is shown in Fig. 2. When a buy order is submitted at a price exceeding the current best ask price (Footnote 3; Fig. 2 (i)), it is executed at the best ask price. Conversely, if a sell order is placed at a price below the current best bid price (Footnote 4; Fig. 2 (ii)), it is executed at the best bid price. The moment a trade is executed, the assets and cash holdings of the agents who placed the buy and sell orders are adjusted in accordance with the specific details of the transaction. The order volume per order is set to 1 for all agents, except for SIR agents in the R-1 state, which attempt to sell off all their holdings. Orders that do not meet these criteria are maintained on the order book pending future transactions.
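The matching rule described above can be sketched as follows. This is a minimal illustration and not the PAMS implementation: the class name `OrderBook`, its methods, and the use of `>=`/`<=` for crossing orders are assumptions made for this example, and all orders are taken to have unit volume.

```python
import heapq
from itertools import count


class OrderBook:
    """Toy continuous double auction with price-time priority (unit-volume orders)."""

    def __init__(self):
        self._bids = []  # max-heap via negated price: (-price, seq, agent_id)
        self._asks = []  # min-heap: (price, seq, agent_id)
        self._seq = count()  # arrival counter, used as the time-priority tiebreak

    def best_bid(self):
        return -self._bids[0][0] if self._bids else None

    def best_ask(self):
        return self._asks[0][0] if self._asks else None

    def submit(self, agent_id, side, price):
        """Submit a one-unit limit order.

        Returns (buyer, seller, trade_price) if the order executes, else None.
        """
        if side == "buy":
            # A buy at or above the best ask executes at the resting ask price.
            if self._asks and price >= self._asks[0][0]:
                ask_price, _, seller = heapq.heappop(self._asks)
                return (agent_id, seller, ask_price)
            heapq.heappush(self._bids, (-price, next(self._seq), agent_id))
        elif side == "sell":
            # A sell at or below the best bid executes at the resting bid price.
            if self._bids and price <= -self._bids[0][0]:
                neg_bid, _, buyer = heapq.heappop(self._bids)
                return (buyer, agent_id, -neg_bid)
            heapq.heappush(self._asks, (price, next(self._seq), agent_id))
        else:
            raise ValueError("side must be 'buy' or 'sell'")
        return None
```

Unmatched orders stay in the heaps, so the best bid and ask are always available at the top of each heap after every submission.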
This simulation defines one step as the period during which a single agent performs an order action and completes the associated execution process. The fundamental price (Footnote 5) is set to 300 and remains constant throughout all steps. The market contains \(N_{FCN}\) FCN agents and \(N_{SIR}\) SIR agents.
FCN agent
We model the behavior of a conventional investor who does not use information from social media in their investment decisions as an FCN agent, similar to Chiarella and Iori [3]. We posit that investor heterogeneity in the stock market arises from the different allocations of the three components of their expected prices: fundamental, chart, and noise factors. Hence, each agent has a unique allocation of these factors.
At step t, an FCN agent i determines its order action by computing the values of the fundamental, chart, and noise factors (\(F_{t}^{i}\), \(C_{t}^{i}\), \(N_{t}^{i}\)). The fundamental factor \(F_{t}^{i}\) reflects the idea that if the market price is above (below) the fundamental price, the stock is overvalued (undervalued) and will fall (rise) in the future. This factor is calculated as follows:
\(\tau ^{*}_{i}\) represents the mean reversion time parameter for agent i, \(p_{f}\) denotes the fundamental price (set to 300), and \(p_t\) is the market price at step t. The mean reversion time is the average time taken to regress from the current price to the fundamental price. Chart factor \(C_{t}^{i}\) indicates that recent price movements influence expected future prices for investors. This factor is expressed as follows:
\(\tau _{i}\) indicates the time window size for agent i and \(r_t\) is the log return at step t. The noise factor \(N_{t}^{i}\) captures fluctuations in investors’ decision-making that stem from random judgments, psychological factors, and misinformation, rather than from analyses based on fundamental prices or chart patterns (Eq. (3)).
Next, agent i calculates the expected price return \(\widehat{r_{t}^{i}}\) based on these three factors (Eq. (4)).
Here, \(w_{F}^{i}, w_{C}^{i}\), and \(w_{N}^{i}\) represent the weights of the fundamental, chart, and noise factors, respectively. Based on these weights, agent i calculates the expected prices \(\widehat{p_{t}^{i}}\) as follows:
This expected price assumes that agent i holds the stock for a duration of \(\tau _{i}\). Then, if \(\widehat{p_{t}^{i}} > p_{t}\), agent i places a one-unit buy order at the following price:
If \(\widehat{p_{t}^{i}} < p_{t}\), agent i places a one-unit sell order at the following price:
The term \(k^i \in [k_{\text {min}}, k_{\text {max}}]\) denotes a fixed order margin. The symbols \(p_{t}^{\text {ask}}\) and \(p_{t}^{\text {bid}}\) represent the best ask and bid prices in the market, respectively.
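The decision rule of the FCN agent can be sketched in code. The displayed forms of Eqs. (1)–(7) are not reproduced in this text, so the sketch below assumes the forms commonly used for this family of agents: a log mean-reversion term for the fundamental factor, the mean log return over the last \(\tau_i\) steps for the chart factor, zero-mean Gaussian noise with a hypothetical standard deviation `sigma`, a weight-normalized expected return, and order prices that mirror Eqs. (10) and (13). The function name and all parameter names are illustrative.

```python
import math
import random


def fcn_order(prices, p_f, w_f, w_c, w_n, tau, tau_star, sigma, k,
              best_ask, best_bid, rng):
    """Return ("buy" | "sell", limit_price) for one FCN agent, or None."""
    p_t = prices[-1]
    # Fundamental factor: assumed log mean reversion toward p_f over tau_star steps.
    fundamental = math.log(p_f / p_t) / tau_star
    # Chart factor: assumed mean log return over the last tau steps.
    window = min(tau, len(prices) - 1)
    chart = math.log(p_t / prices[-1 - window]) / window if window > 0 else 0.0
    # Noise factor: assumed zero-mean Gaussian with standard deviation sigma.
    noise = rng.gauss(0.0, sigma)
    # Expected return: weighted average of the three factors (normalization assumed).
    total = w_f + w_c + w_n
    expected_return = (w_f * fundamental + w_c * chart + w_n * noise) / total
    # Expected price after holding the stock for tau steps.
    expected_price = p_t * math.exp(expected_return * tau)
    if expected_price > p_t:
        limit = expected_price * (1 - k)
        return ("buy", limit if best_ask is None else min(limit, best_ask))
    if expected_price < p_t:
        limit = expected_price * (1 + k)
        return ("sell", limit if best_bid is None else max(limit, best_bid))
    return None
```

For example, with only the fundamental weight active and a market price below the fundamental price, the agent expects a rise and submits a buy order capped at the best ask.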
SIR agent
The SIR agent represents social media investors whose market participation is influenced by local information on their social media, causing information propagation. The meme stock phenomenon is attributed to the following mechanism [19].
1. Some investors start to buy and the price rises slowly.
2. Then, as the stock price starts to rise in the market, individual investors notice the increase in the stock price and volume, leading more people to buy, causing the stock price to skyrocket.
3. Information about the stock spreads to social media and online communities, prompting more investors to participate owing to “fear of missing out” (FOMO).
4. Eventually, purchases peak, and those participating in the buying and selling game begin to sell, leading to a sharp fall in the stock price.
Based on the above mechanism, this phenomenon is considered to heat up owing to a series of actions by social media investors who lack knowledge and are easily influenced by information on social media. They buy stocks expecting the stock price to rise, and when the price soars, they sell and exit the market. To model these actions, each SIR agent has an “infection state” that determines its order actions in the market. The infection state transitions from S to I to R and is influenced by adjacent agents. In this transition, we use the parameters \(\lambda\) and \(\mu\), which represent the infection rate per unit time and the recovery rate per unit time, respectively [12]. In the context of this study, \(\lambda\) corresponds to the duration of the FOMO-driven steps 1 to 3 of the mechanism above, and \(\mu\) corresponds to the duration of step 4. We model this process by integrating the SIR model with a network model, referring to Yuasa et al. [30].
Order action of each infection state
This model has three infection states, each reflecting a different stock market participation condition. These states signify non-participation (S), market entry (I), and market exit (R). Social media investors rely solely on the sentiment of surrounding investors, which is conveyed through the network, to make their trading decisions. Therefore, to eliminate other factors, the expected price of the states I and R is determined using a normal random variable. By using these states, this study illustrates how individuals initially inexperienced in the stock market can be driven by surrounding rumors to buy or sell stocks heavily and eventually withdraw from the market.
- S (Susceptible state): S denotes the phase before an investor enters the market. An SIR agent in the S state does not perform any market actions; thus, it places an empty order in the simulation.
- I (Infectious state): I depicts the action of purchasing many shares, influenced by surrounding rumors, without detailed analysis. Consequently, the expected price is determined solely by the noise factor (Eq. (3)). The expected return and the expected price are calculated as follows:
$$\begin{aligned} \widehat{r_{t}^{i}}=\; & N_{t}^{i} \end{aligned}$$ (8)
$$\begin{aligned} \widehat{p_{t}^{i}}=\; & p_{t} \exp \left( \widehat{r_{t}^{i}} \tau _{i}\right) \end{aligned}$$ (9)
Subsequently, agent i places a one-unit buy order at the following price:
$$\begin{aligned} \text {min}\bigg \{\widehat{p_{t}^{i}}(1-k^i), p_{t}^{\textrm{ask}}\bigg \} \end{aligned}$$ (10)
- R (Recovered state): R denotes the situation where, after acquiring a substantial quantity of stock influenced by rumors, the investor sells their holdings without thorough analysis and eventually exits the market. To differentiate between selling stocks and exiting the market, R is divided into two states:
  - R-1: R-1 expresses the attempt to sell off holdings. As in state I, the expected price here is influenced only by the noise factor (Eq. (3)). The expected return and the expected price are specified as follows:
$$\begin{aligned} \widehat{r_{t}^{i}}=\; & N_{t}^{i} \end{aligned}$$ (11)
$$\begin{aligned} \widehat{p_{t}^{i}}=\; & p_{t} \exp \left( \widehat{r_{t}^{i}} \tau _{i}\right) \end{aligned}$$ (12)
Agent i then places sell orders for all units possessed at the following price:
$$\begin{aligned} \text {max}\bigg \{\widehat{p_{t}^{i}}(1+k^i), p_{t}^{\textrm{bid}}\bigg \} \end{aligned}$$ (13)
Furthermore, agent i cancels all unexecuted buy orders placed while in the I state.
  - R-2: R-2 denotes the withdrawal from the market. Thus, the agent places an empty order, akin to the S state.
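The state-dependent order actions in Eqs. (8)–(13) can be sketched as a single function. This is an illustration under assumptions: Eq. (3) is taken to be zero-mean Gaussian noise with a hypothetical standard deviation `sigma`, the function and parameter names are invented for this example, and the cancellation of resting buy orders on entering R-1 is left to the caller.

```python
import math
import random


def sir_order(state, p_t, tau, sigma, k, best_ask, best_bid, holdings, rng):
    """Return (side, limit_price, volume) for an SIR agent, or None for an empty order."""
    if state in ("S", "R-2"):
        return None  # not yet in the market, or already withdrawn
    if state not in ("I", "R-1"):
        raise ValueError(f"unknown state: {state!r}")
    # Expected return is pure noise (Eqs. (8) and (11)); Gaussian form is assumed.
    noise = rng.gauss(0.0, sigma)
    # Expected price after tau steps (Eqs. (9) and (12)).
    expected_price = p_t * math.exp(noise * tau)
    if state == "I":
        # One-unit buy order at min{expected_price * (1 - k), best ask} (Eq. (10)).
        limit = expected_price * (1 - k)
        price = limit if best_ask is None else min(limit, best_ask)
        return ("buy", price, 1)
    # state == "R-1": sell all holdings at max{expected_price * (1 + k), best bid} (Eq. (13)).
    if holdings <= 0:
        return None  # nothing to sell
    limit = expected_price * (1 + k)
    price = limit if best_bid is None else max(limit, best_bid)
    return ("sell", price, holdings)
```

Setting `sigma` to zero makes the expected price equal to the current price, which is a convenient way to check the order-price formulas in isolation.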
Network model of SIR agent
Each SIR agent is assigned to a node of the network model. Agents can observe the infection states of their adjacent agents, and adjacent agents influence each other. In this study, the following three networks are used:
- The ER model (Erdős-Rényi model) [5] is a random network, created by establishing edges between nodes based on a fixed probability.
- The BA model (Barabási-Albert model) [2] is a representative scale-free network model. A scale-free network has a power-law degree distribution.
- The WS model (Watts-Strogatz model) [27] exhibits the small-world property, characterized by the principle that any two vertices can be connected through only a small number of intermediary vertices.
The networks analyzed in this study are all undirected and unweighted simple graphs. They were designed such that the average degree of the network model is approximately four. Networks were generated using the NetworkX package (Version 3.1) in Python (Version 3.9.16).
We selected the BA and WS models for this study because they exemplify the scale-free and small-world properties, respectively. These characteristics are common in social and economic networks [17]. We also chose the ER model as a comparatively simple network model for our analysis because it lacks explicit clustering or small-world characteristics.
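As a rough illustration of how three such networks with an average degree of about four might be generated with NetworkX, consider the sketch below. The specific parameter choices are assumptions, not values taken from the study: an ER edge probability of 4/(n−1), two edges per new node in the BA model, and four nearest neighbors with a hypothetical rewiring probability of 0.1 in the WS model.

```python
import networkx as nx


def build_networks(n, seed=0, ws_rewire=0.1):
    """Return ER, BA, and WS graphs on n nodes with an average degree of about four."""
    return {
        # ER: each possible edge exists with probability p, so the mean degree is p * (n - 1).
        "ER": nx.erdos_renyi_graph(n, 4 / (n - 1), seed=seed),
        # BA: each new node attaches to m = 2 existing nodes, giving a mean degree near 2m.
        "BA": nx.barabasi_albert_graph(n, 2, seed=seed),
        # WS: ring lattice with k = 4 neighbors per node; rewiring keeps the edge count fixed.
        "WS": nx.watts_strogatz_graph(n, 4, ws_rewire, seed=seed),
    }


def average_degree(graph):
    """Mean degree of an undirected simple graph."""
    return 2 * graph.number_of_edges() / graph.number_of_nodes()
```

The ER graph only matches the target degree in expectation, so individual realizations will deviate slightly from four.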
Process of infection
Each agent updates its infection state based on the infection states of its adjacent agents. This update occurs before an order is placed when an agent is randomly selected to order.
1. If the infection state of the selected agent is S, its state changes into I with a probability of \(\lambda \frac{n_i(I)}{n_i(S)+n_i(I)+n_i(R)}\) (Fig. 3a). Here, \(\lambda\) is a fixed number within the \(0 \le \lambda \le 1\) range. In the SIR model, \(\lambda\) represents the infection rate per unit time [12], and in this study, it is related to the number of agents in state I as well as the rate at which this number increases when the price rises. The terms \(n_i(S),\) \(n_i(I),\) and \(n_i(R)\) represent the number of adjacent agents whose infection states are S, I, and R (with R accounting for the total of R-1 and R-2), respectively.
2. If the infection state of the selected agent is I, its state changes into R-1 with a probability of \(\mu \frac{n_i(R)}{n_i(S)+n_i(I)+n_i(R)}\) (Fig. 3b). \(\mu\) is a fixed number, where \(0 \le \mu \le 1\). In the SIR model, \(\mu\) represents the recovery rate per unit time [12]. In this study, it is related to the rate at which the number of agents in state I increases, as well as the rate at which the number of agents in state I decreases when the price falls.
3. If the infection state of the selected agent is R-1, its state changes into R-2 with certainty (Fig. 3c).
4. If the infection state of the selected agent is R-2, its state remains unchanged, that is, R-2 (Fig. 3d).
Here, the agent in state I at the beginning of the simulation first transitions to state R-1 when the market price exceeds twice the fundamental price. Subsequently, the update process is resumed from process 2 onwards. This setting is intended to prevent the initiation of a sell-off when the market price is low (Footnotes 6–9).
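The four transition rules and the special treatment of the initially infected agent can be sketched as one update function. This is an illustrative reading rather than the authors' code: the function name, the state labels as strings, the handling of an agent with no neighbors, and the `is_initial` flag (which applies the price threshold instead of rule 2) are assumptions.

```python
import random

RECOVERED = ("R-1", "R-2")


def next_state(state, neighbor_states, lam, mu, rng,
               is_initial=False, p_t=None, p_f=None):
    """Return the next infection state of the selected SIR agent."""
    if state == "R-1":
        return "R-2"  # rule 3: withdraw with certainty after the sell-off attempt
    if state == "R-2":
        return "R-2"  # rule 4: absorbing state
    if state == "I" and is_initial:
        # One reading of the special rule: the initially infected agent moves to R-1
        # only once the market price exceeds twice the fundamental price.
        return "R-1" if p_t > 2 * p_f else "I"
    n = len(neighbor_states)
    if n == 0:
        return state  # assumed: an isolated agent never changes state
    if state == "S":
        # Rule 1: infection probability is lam times the share of neighbors in I.
        share = sum(s == "I" for s in neighbor_states) / n
        return "I" if rng.random() < lam * share else "S"
    if state == "I":
        # Rule 2: recovery probability is mu times the share of neighbors in R-1 or R-2.
        share = sum(s in RECOVERED for s in neighbor_states) / n
        return "R-1" if rng.random() < mu * share else "I"
    raise ValueError(f"unknown state: {state!r}")
```

In a full simulation, this function would be called for the randomly selected agent immediately before it places its order.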
Experiments
Experimental settings
In the market, there are \(N_{FCN}\) FCN agents (modeling conventional investors) and \(N_{SIR}\) SIR agents (modeling social media investors on a social network). To analyze the information disparity caused by the presence or absence of social media usage among individual investors, we evaluate the profit as the ratio of SIR agents to FCN agents increases. Holding \(N_{FCN}+N_{SIR}=400\) constant, we increase \(N_{SIR}\) in increments of 10, from 10 to 200, and calculate evaluation indicators for each \(N_{SIR}\). For each \(N_{SIR}\) and each set of parameters \(\lambda\) and \(\mu\), 50 trials were performed. Table 1 lists the simulation parameters, which were mainly determined based on Torii et al. [24].
Among the SIR agents, the agent with the highest degree is considered the hub. According to Lucchini et al. [15], the posts of investors who are central to the network initiated the information spread in the meme stock phenomenon. In this experiment, we assume that information propagation in a social network starts from the hub. Therefore, in the network created for each trial, the agent with the highest degree is initially in the “infected” (I) state, and all other agents are in the “susceptible” (S) state. Henceforth, the agent with the highest degree is denoted as the initial infected agent (II agent) in this study (Footnote 10).
Validation of the model
Reproducing representative characteristics is important from a validity perspective. This study validates, both quantitatively and qualitatively, that the model can reproduce the meme stock phenomenon. Quantitatively, following Costola et al. [4], we confirm that the price and the number of infected agents are cointegrated and that, at the same time, the trading volume and the number of infected agents are cointegrated. Specifically, for each pair of time series, a cointegration test is conducted to examine whether there is a period during which the number of cointegration vectors is one for both pairs simultaneously. In this study, the models with at least 40 SIR agents (\(N_{SIR} \ge 40\)) fulfilled this condition. When the model had only one SIR agent, it did not fulfill the condition at any time. Therefore, SIR agents (i.e., social media investors) can be said to cause the meme stock phenomenon, and this model successfully reproduces the meme stock phenomenon. For details, refer to Appendix A.
Furthermore, qualitatively, following Lyócsa et al. [31], we confirm that volatility increases as the number of SIR agents rises. For details, refer to Appendix B.
Evaluation
We focus on the profit or loss \(PL^{i}\) of each agent i. We define \(PL^{i}\) as follows:
Here, \(v_{T}^i\) and \(c_{T}^i\) denote the asset volume and cash amount of agent i at T, respectively. T denotes the number of steps in the simulation \(^{6}\). \(PL^{i}\) represents the difference between the assets of the agent at the beginning and end of the simulation. A positive value of \(PL^{i}\) indicates a gain, whereas a negative value indicates a loss.
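The displayed form of Eq. (14) is not reproduced in this text. As a minimal sketch of the verbal definition (the difference between an agent's assets at the end and at the beginning of the simulation), the function below assumes that stock holdings are valued at the market price at each of the two points in time; that valuation choice and all names are assumptions.

```python
def profit_and_loss(v_0, c_0, v_T, c_T, p_0, p_T):
    """Wealth at the final step minus wealth at the initial step.

    v_0, v_T: asset volume at the start and end; c_0, c_T: cash at the start and end;
    p_0, p_T: prices used to value the holdings (assumed to be market prices).
    """
    return (v_T * p_T + c_T) - (v_0 * p_0 + c_0)
```

For instance, an agent that starts with 10 units and 1000 in cash at a price of 300, and ends with 8 units and 1700 in cash at a price of 320, has a profit of 260 under this valuation.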
Results and discussion
In this investigation, to analyze the impact of information asymmetry regarding the use of social media on profits in meme stocks, we investigate the profits of conventional investors and social media investors as the proportion of social media investors changes.
We focus on the profit or loss \(PL^{i}\) of each agent i (Eq. (14)). Figure 4a, b, and c show the relationship between \(PL^{i}\) and \(N_{SIR}\) in the experiments with three different network models when both parameters \(\lambda\) and \(\mu\) are set to 0.8. They are plotted with a standard deviation of 50 trials, as indicated by the error bars.
From these three figures, as an overall trend, as \(N_{SIR}\) increases, the \(PL^{i}\) of the SIR agent tends to be negative, whereas the \(PL^{i}\) of the FCN agent becomes positive. However, as \(N_{SIR}\) increases, the standard deviation of \(PL^{i}\) for both agents also increases, and the difference in \(PL^{i}\) between the agents becomes more pronounced. This result remained consistent even with varying values of the parameters \(\lambda\) and \(\mu\) (see Appendix C for details). This result suggests that there are both profitable and unprofitable agents among both types. Next, we discuss the factors that caused the differences in profit or loss between the agents in more detail.
Discussion about SIR agents’ profit
The heterogeneity of the SIR agent is determined by its position in the network. We hypothesize that the distance from the II agent (the agent with the highest degree) affects an agent’s profit. The closer an agent is to the II agent, the more central it is in the network. We assume that agents closer to the II agent in the network respond better to price fluctuations, likely resulting in a positive \(PL^{i}\). Conversely, those farther from the II agent tend to miss these fluctuations, leading to a negative \(PL^{i}\). Therefore, we analyze the relationship between centrality and \(PL^{i}\).
Figure 5 shows the total number of agents at each distance from the II agent. We use data from 50 trials for all three network types with \(N_{FCN} = N_{SIR} = 200\) agents and all parameters. Figure 5 illustrates the differences in the distribution of agents at each distance between the different networks. It also shows that, even within a single network, agents are not uniformly distributed at all distances. This variability limits quantitative comparisons and analyses across models and distances.
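The distance of each agent from the II agent can be obtained with a breadth-first search from the highest-degree node. The sketch below uses a plain adjacency dictionary so that it stays independent of any graph library; the function name and the tie-breaking rule (the first node with the maximum degree is chosen) are assumptions.

```python
from collections import Counter, deque


def hub_distances(adjacency):
    """Return (hub, distances) where distances maps each reachable node to its hop count.

    adjacency: dict mapping each node to an iterable of its neighbors (undirected graph).
    """
    # The hub is the node with the highest degree (first one wins on ties).
    hub = max(adjacency, key=lambda node: len(adjacency[node]))
    distances = {hub: 0}
    queue = deque([hub])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return hub, distances


def count_by_distance(distances):
    """Number of agents at each distance from the hub."""
    return Counter(distances.values())
```

Nodes that cannot be reached from the hub are simply absent from the result, which matters for sparse random graphs that may be disconnected.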
Therefore, using the results of these trials, the relationship between centrality and \(PL^{i}\) is represented by a box-and-whisker diagram (Fig. 6). In this analysis, the maximum distance was set to four, which is the maximum value in the BA model, and data up to a distance of four are used. The x-axis represents the distance from the II agent, and the y-axis represents the agent’s \(PL^{i}\). The box-and-whisker plot indicates that \(PL^{i}\) tends to be positive for agents with a high network centrality. In addition, the median value of \(PL^{i}\) decreases with distance in all networks. Thus, we can conclude that high-centrality agents quickly obtain information and adapt to price changes. By contrast, low-centrality agents, delayed and isolated, are likely to see a negative \(PL^{i}\). These results suggest that network location significantly affects agents’ profits. However, there is a marked difference between the distance of 0 and distances 1–4. The agent at a distance of 0 (II agent) has a positive average of \(PL^{i}\) and is significantly more profitable. However, for agents at distances 1–4, there is no significant difference in \(PL^{i}\) values among these distances in any of the models. Although some agents show positive \(PL^{i}\) values, the average value is negative. This implies that only information senders tend to profit significantly, with non-senders’ earnings being similar regardless of their proximity to the sender. Thus, in real markets, using social media information may cause significant losses for less knowledgeable individual investors (Footnote 11).
Fig. 6 Box-and-whisker diagram showing the distribution of \(PL^{i}\) by distance to the initially infected (II) agent. The lower and upper limits of the box represent the first (Q1) and third (Q3) quartiles of the data, respectively, and the center line of the box represents the median. The edges of the whiskers represent the maximum and minimum values that do not exceed the 1.5-fold interquartile range (IQR) from Q1 and Q3. Results from different models are color-coded.
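The box and whisker limits described in the caption can be computed as follows. This is a minimal sketch under stated assumptions: the sample values are made up for illustration, the helper names are hypothetical, and the quartiles use linear interpolation, which may differ from the convention of the plotting library used for the figure.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list (0 <= q <= 1)."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def box_plot_summary(values):
    """Quartiles, whisker ends, and outliers under the 1.5 x IQR rule."""
    data = sorted(values)
    q1, median, q3 = (quantile(data, q) for q in (0.25, 0.5, 0.75))
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],    # smallest value still inside the fences
        "whisker_high": inside[-1],  # largest value still inside the fences
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Made-up profit-and-loss values, for illustration only.
sample_pl = [-800.0, -400.0, -300.0, -200.0, -100.0, 0.0, 100.0, 200.0, 5000.0]
summary = box_plot_summary(sample_pl)
print(summary)
```

With these values the single large gain falls outside the upper fence and is reported as an outlier, so it would not stretch the whisker.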
The relationship between centrality and \(PL^{i}\) can also be interpreted by comparing network models. We observe the mean and standard deviation of \(PL^{i}\) for the BA and WS models at \(N_{FCN} = N_{SIR} = 200\) (see Fig. 4b and c) and the distribution of the number of agents at each distance from the II agent (see Fig. 5). The mean values of \(PL^{i}\) calculated from the data of all sets of parameters \(\lambda\) and \(\mu\) are -476.180 for the BA model and -728.644 for the WS model. These results indicate that the smaller the overall distance from the II agent, the larger the mean value of \(PL^{i}\). The standard deviation of \(PL^{i}\) calculated from the data of all sets of parameters \(\lambda\) and \(\mu\) is 1100.607 for the BA model and 1560.118 for the WS model. Overall, these results suggest that the wider the range of distances from the II agent, the larger the standard deviation of \(PL^{i}\).
These trends can be explained by the different properties of the network models. In the BA model, the scale-free property confers high centrality on some agents. These agents reduce the overall distance between the II agent and the other agents, facilitating rapid information diffusion. This is evident from the small median and narrow width of the distribution of the number of agents at each distance. Consequently, the BA model agents respond better to price changes, showing higher \(PL^{i}\) mean values and lower standard deviations than the other models. This result underscores the significant role of hub nodes in information propagation and aggregation across the network.
Conversely, the WS model, characterized by small-world network properties and high local density, likely increases the overall distance between the II agent and the other agents. This is indicated by the large median value and wide range of the distribution of the number of agents at each distance. Therefore, it can be inferred that the high clustering coefficients in the WS model slow down information propagation. Consequently, WS model agents struggle to react to price changes, which leads to lower \(PL^{i}\) means and higher standard deviations. These outcomes highlight the significant influence of network geometry on information propagation efficiency and its correlation with agent profits.
These results suggest that, in real markets, an investor’s network position and information access can significantly impact their profits. Network model comparisons show that a network’s structure crucially influences information flow and participant earnings. Therefore, fair information access and regulations tailored to the network’s features are essential to maintain market stability.
Even if investors have access to social media as an information source, if that information is delayed, it is akin to investors receiving incorrect information. Consequently, having social media as a source of information can be detrimental.
Discussion about FCN agents’ profits
We examine which information conventional investors prioritize when it comes to price information and how that impacts their profits. The heterogeneity of FCN agents is determined by the fundamental factor weight \(w_{F}^{i}\), chart factor weight \(w_{C}^{i}\), and noise factor weight \(w_{N}^{i}\). Therefore, differences in these weights may cause variations in \(PL^{i}\). Specifically, we assume that a higher chart factor weight often leads to order placement being more affected by price changes, potentially leading to a negative \(PL^{i}\). To investigate this in detail, we analyze the relationship between the weights of the three factors and \(PL^{i}\).
We conduct 50 trials across the three networks, each with 200 agents (\(N_{FCN} = N_{SIR} = 200\)). We then perform multiple regression analysis. In this analysis, \(PL^{i}\) of FCN agents is the dependent variable. The independent variables are the weights of the three factors, the mean reversion time, and the time window size. Table 2 presents the results of the multiple regression analysis for all three networks when both parameters \(\lambda\) and \(\mu\) are set to 0.8. The sample size for this multiple regression analysis is 30,000. The analysis indicates that the chart factor does not contribute significantly to \(PL^{i}\). Although the other weights contribute, the explanatory power of the regression itself is poor (the coefficient of determination is 0.036 and the adjusted coefficient of determination is 0.035). The variables used in this analysis are therefore considered insufficient to explain \(PL^{i}\). This result remained consistent even with varying values of the parameters \(\lambda\) and \(\mu\) (see Appendix C for details). The heterogeneity of FCN agents stems from the variables in this analysis. This implies that \(PL^{i}\) variations among FCN agents are less about heterogeneity and more about random factors, such as order placement timing. The randomness in \(PL^{i}\) for FCN agents arises because market price fluctuations are affected by many uncertainties. This makes the effectiveness of individual agent strategies and positions inconsistent. Specifically, in our simulation, short-term market behavior is greatly influenced by SIR agents’ activities. This observation also holds for real markets, indicating that specific strategies do not always lead to consistent outcomes.
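The adjusted coefficient of determination mentioned above follows the standard formula \(\bar{R}^{2} = 1 - (1 - R^{2})(n - 1)/(n - p - 1)\). The sketch below only illustrates how small the adjustment is for a sample of this size. It assumes that the sample of 30,000 arises as 50 trials times 3 networks times 200 FCN agents and that the regression has five predictors (three weights, mean reversion time, time window size). The input value of \(R^{2}\) is a hypothetical example, not the unrounded value from Table 2.

```python
def adjusted_r_squared(r_squared, n_samples, n_predictors):
    """Standard adjusted R^2 for a linear model with an intercept."""
    return 1.0 - (1.0 - r_squared) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Assumption: sample size = trials x networks x FCN agents per trial.
n_samples = 50 * 3 * 200
# Assumption: three factor weights, mean reversion time, time window size.
n_predictors = 5

example_r_squared = 0.0355  # hypothetical unrounded value, for illustration only
example_adjusted = adjusted_r_squared(example_r_squared, n_samples, n_predictors)
print(n_samples, round(example_adjusted, 4), round(example_r_squared - example_adjusted, 5))
```

With a sample this large the penalty for five predictors is on the order of \(10^{-4}\), so a low adjusted value reflects weak explanatory power rather than overfitting.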
Conclusion
We investigated the impact of information asymmetry on profits in meme stocks. We integrated the SIR model with a network model for an artificial market simulation. This approach allowed us to model meme stocks in which social media investors make investment decisions driven by local information on their social network and thereby propagate information. We confirmed that our model reproduced the meme stock phenomenon by observing cointegration and an increase in volatility.
In the experiment, we analyzed investors’ profits by increasing the number of social media investors. We observed that conventional investors tend to gain profit, whereas social media investors incur losses. High-centrality agents quickly adapt to price changes, increasing their profits. By contrast, low-centrality agents, who are more exposed to price volatility, are likely to incur losses. We conducted the experiment with three types of networks and found that the shape of networks impacts the speed of information propagation, affecting both the average profit and the profit disparity among investors.
These results indicate that only information disseminators and a few individuals who are near the information disseminators by chance can gain profits, while almost all social media investors suffer losses. Meme stocks are influenced by the buzz on social media, but if information from social media is received with a delay, it can lead to incorrect investment actions. Therefore, there are instances in which not using social media information could have resulted in better outcomes. Overall, conventional investors profited while social media investors incurred losses, because many social media investors received information late. These findings suggest that the information disparity among individual investors can lead to significant differences in profits in meme stocks.
Based on our results, the following research directions can be considered. A potential future research avenue is exploring network models beyond the ER, BA, and WS models used here. Investigating real-life social media network structures could provide deeper market insights. Furthermore, previous research has shown that the network structure of social media changes when the meme stock phenomenon occurs [15]. Considering these changes in network structure will be important for future research.
Data availability
Our research code is available at https://github.com/Mm678/pams_meme.git.
Notes
1. The SEC (Securities and Exchange Commission) is the U.S. regulatory body overseeing financial markets to protect investors and ensure fair and efficient markets.
2. ESMA (European Securities and Markets Authority) is the EU regulatory authority responsible for enhancing investor protection and promoting stable and orderly financial markets in Europe.
3. The best ask price is the lowest (or best) price at which someone is willing to sell a stock.
4. The best bid price is the highest (or best) price at which someone is willing to buy a stock.
5. The fundamental value is the value of the stock price calculated based on economic factors, including companies’ financial condition and performance.
6. The notation \(Ex(\lambda )\) indicates that the values are randomly selected according to an exponential distribution. The expected value corresponds to \(\lambda\).
7. The notation [a, b] indicates that the values are randomly selected according to a uniform distribution in the interval a to b.
8. This value refers to the number of remaining steps in the simulation, excluding the first 100 steps in which no execution is processed.
9. Values for \(\lambda\) and \(\mu\) were set at 0, 0.2, 0.4, 0.6, 0.8, and 1 to evaluate the model’s ability to represent the FOMO mechanism in the meme stock phenomenon. The model used was the ER model, chosen as a comparatively simple network. The key criterion was whether the model could successfully transition social media investors from state S to state R. Within the number of simulation steps T, the simulation successfully captured the concept of FOMO when the parameter values were set to either 0.8 or 1.0, as indicated by an increase in the number of agents in state R.
10. In our model, if there are several nodes with the same largest degree, one II agent is randomly chosen from among them.
11. Outliers are omitted from this figure. Outliers are defined as values lower than the first quartile (Q1) minus 1.5 times the interquartile range (IQR) or higher than the third quartile (Q3) plus 1.5 times the IQR. Omitting them gives a clearer view of the central features and distribution of the data.
12. When the first difference of the unit root process \(X_t\), \(\Delta X_t = X_t - X_{t-1}\), becomes a stationary process, \(X_t\) is called an integrated process of order one (I(1)).
13. When there are no SIR agents, infection data cannot be collected, and thus, testing cannot be performed. Therefore, testing was conducted with one SIR agent instead.
14. There are various types of volatility depending on the application. This study uses realized volatility, a model-independent volatility estimator. Realized volatility is the standard deviation of log returns when the drift term is assumed to be zero.
15. Here, we refer to the number of steps remaining in the simulation, excluding the first 100 steps where no execution is processed.
References
Anand, A., & Pathak, J. (2022). The role of reddit in the gamestop short squeeze. Economics Letters, 211, 110249. https://doi.org/10.1016/j.econlet.2021.110249
Barabási, A.-L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286(5439), 509–512. https://doi.org/10.1126/science.286.5439.509
Chiarella, C., & Iori, G. (2002). A simulation analysis of the microstructure of double auction markets. Quantitative Finance, 2(5), 346–353. https://doi.org/10.1088/1469-7688/2/5/303
Costola, M., Iacopini, M., & Santagiustina, C. R. (2021). On the “mementum” of meme stocks. Economics Letters, 207, 1–6. https://doi.org/10.1016/j.econlet.2021.110021
Erdös, P., & Rényi, A. (1959). On random graphs I. Publicationes Mathematicae Debrecen, 6, 290.
European Securities and Markets Authority (2024). Mar guidelines: Delay in disclosure of inside information. Retrieved from https://www.esma.europa.eu/document/mar-guidelines-delay-in-disclosure-inside-information
Hirano, M., Takata, R., & Izumi, K. (2023). Pams: Platform for artificial market simulations. SSRN.
Hoffmann, A. O. I., Jager, W., & Von Eije, J. H. (2007). Social simulation of stock markets: Taking it to the next level. Journal of Artificial Societies and Social Simulation, 10(2), 7.
Investopedia (2023). What are meme stocks, and are they real investments? Retrieved from https://www.investopedia.com/meme-stock-5206762
Investopedia (2024). Regulation fair disclosure (Reg FD). Retrieved from https://www.investopedia.com/terms/r/regulationfd.asp
Kawachi, K. (2008). Deterministic models for rumor transmission. Nonlinear Analysis: Real World Applications, 9(5), 1989–2028. https://doi.org/10.1016/j.nonrwa.2007.06.004
Kermack, W. O., & McKendrick, A. G. (1927). A contribution to the mathematical theory of epidemics. Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical and Physical Character, 115(772), 700–721.
Kikuchi, T., Kunigami, M., Yamada, T., Takahashi, T., & Terano, T. (2016). Agent-based simulation analysis on investment behaviors of financial firms related to bankruptcy propagations and financial regulations. Journal of the Japanese Society for Artificial Intelligence. https://doi.org/10.1527/tjsai.AG-G. [in Japanese].
Liu, W., Yang, J., Chen, J., & Xu, L. (2023). How social-network attention and sentiment of investors affect commodity futures market returns: New evidence from china. SAGE Open, 13(1), 1–18. https://doi.org/10.1177/21582440231152131
Lucchini, L., Aiello, L. M., Alessandretti, L., De Francisci Morales, G., Starnini, M., & Baronchelli, A. (2022). From reddit to wall street: The role of committed minorities in financial collective action. Royal Society Open Science, 9(4), 211488. https://doi.org/10.1098/rsos.211488
Mancini, A., Desiderio, A., Di Clemente, R., & Cimini, G. (2022). Self-induced consensus of reddit users to characterise the gamestop short squeeze. Scientific Reports, 12(1), 1–11. https://doi.org/10.1038/s41598-022-17925-2
Masuda, N., & Nakamaru, M. (2006). Complex networks - a review with implications for ecological researches. Japanese Journal of Ecology, 56(3), 219–229. https://doi.org/10.18960/seitai.56.3_219. [in Japanese].
Okada, Y., Sakaki, T., Toriumi, F., Shinoda, K., Kazama, K., Noda, I., & Kurihara, S. (2013). False rumor diffusion analysis based on the SIR-extended information diffusion model [in Japanese]. In Proceedings of the 27th annual conference of the Japanese Society for Artificial Intelligence.
Oshio, K. (2022). Studies on legal regulations of meme stocks by behavioral finance approach [in Japanese]. Journal of legal studies, the Graduate School of Meiji University, 57, 79–101.
Oskarsson, M. L. (2023). Winners and losers in the trading of meme stocks. Fi analysis.
Ponta, L., & Cincotti, S. (2018). Traders’ networks of interactions and structural properties of financial markets: An agent-based approach. Complexity, 2018, 1–9. https://doi.org/10.1155/2018/9072948
Santonja, F. J., Tarazona, A. C., & Villanueva, R. J. (2008). A mathematical model of the pressure of an extreme ideology on a society. Computers & Mathematics with Applications, 56(3), 836–846. https://doi.org/10.1016/j.camwa.2008.01.001
The tail wagging the dog: How do meme stocks affect market efficiency? (2023). International Review of Economics & Finance, 87, 68–78. https://doi.org/10.1016/j.iref.2023.04.019
Torii, T., Kamada, T., Izumi, K., & Yamada, K. (2017). Platform design for large-scale artificial market simulation and preliminary evaluation on the k computer. Artificial Life and Robotics, 22(3), 301–307. https://doi.org/10.1007/s10015-017-0368-z
Wang, L., & Wood, B. C. (2011). An epidemiological approach to model the viral propagation of memes. Applied Mathematical Modelling, 35(11), 5442–5447. https://doi.org/10.1016/j.apm.2011.04.035
Wang, Z., Liu, S., & Yang, H. (2019). The influence of social network structure on stock price disclosure. Physica A: Statistical Mechanics and its Applications, 533, 1–11. https://doi.org/10.1016/j.physa.2019.122064
Watts, D. J., & Strogatz, S. H. (1998). Collective dynamics of ‘small-world’ networks. Nature, 393(6684), 440–442. https://doi.org/10.1038/30918
Westphal, R., & Sornette, D. (2023). How market intervention can prevent bubbles and crashes: An agent based modelling approach. Computational Economics. https://doi.org/10.1007/s10614-023-10462-8
Yousaf, I., Pham, L., & Goodell, J. W. (2023). The connectedness between meme tokens, meme stocks, and other asset classes: Evidence from a quantile connectedness approach. Journal of International Financial Markets, Institutions and Money, 82, 11–14. https://doi.org/10.1016/j.intfin.2022.101694
Yuasa, T., & Shirayama, S. (2010, November). Influence of network structure on prediction of epidemic outbreaks [in Japanese]. In The 24th annual conference of the Japanese Society for Artificial Intelligence.
Lyócsa, Štefan, Baumöhl, E., & Výrost, T. (2022). Yolo trading: Riding with the herd during the gamestop episode. Finance Research Letters, 46, 102359. https://doi.org/10.1016/j.frl.2021.102359
Funding
Open Access funding provided by The University of Tokyo.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A Quantitative validation of the model
In the quantitative validation of the model, we focus on cointegration, referring to Costola et al. [4].
When the linear combination \(aX_t + bY_t\) of two integrated of order one (I(1); see Footnote 12) processes \(X_t\) and \(Y_t\) becomes a stationary process, \(X_t\) and \(Y_t\) are said to be cointegrated. Costola et al. introduced the term “Mementum” and showed that meme stocks have experienced Mementum at least once. The definition of Mementum is as follows:
- Condition 1: Coordinated buying signals originating in social media, which are proxied through count time-series, affect the cointegration between (i) price and tweet series, and (ii) volumes and tweet series. This implies that two cointegration relationships occur contemporaneously.
- Condition 2: There is synchronicity in the timing when the regime switches to cointegration. The change of just one cointegration relationship does not reflect a common, structural event on social media activity affecting both the price and the trading volume.
- Condition 3: For both series, the cointegration regime is persistent, as well as the regime before cointegration.
Fig. 7 An example result of the cointegration test (Johansen procedure). The number of cointegrating vectors is plotted for each day. A value of 1 indicates cointegration, and it must persist for 2 or more consecutive days to meet the condition. In the figure, the sections that meet this condition are marked with orange squares.
Costola et al. examined whether cointegration between tweets and prices, and between tweets and volumes, persisted simultaneously for 2 consecutive days. They found that meme stocks satisfied this condition at least once, whereas normal stocks never satisfied it. Therefore, this study verifies whether cointegration between the number of infected agents and prices, and between the number of infected agents and volumes, persists simultaneously for two consecutive days.
In this study, we do not have tweet data, and thus, we use the number of infected agents instead. Okada et al. [18] modeled false rumor diffusion on Twitter (now X) using an extended SIR model. S denotes not seeing the false rumor tweet, I denotes posting the false rumor tweet, and R denotes posting correction information. Referring to this previous study, we use the number of infected agents instead of tweet data.
We define a set of simulation steps as consisting of 5 steps, and 80 sets are considered as 1 day; hence, we use prices and the number of infected agents every 5 steps, as well as the total trading volumes for those 5 steps. These 80 sets comprise 1 day, forming one time series, and we perform the Johansen procedure for every day. Since the total number of simulation steps is 10,000, we perform the Johansen procedure for 25 days. In the Johansen test, when the number of cointegrating vectors is 1, it indicates that the two time series are cointegrated. Using Fig. 7 as an example, we explain how we confirm that the conditions are met. The upper half of Fig. 7 shows the number of cointegrating vectors between the number of infected agents and the price, while the lower half shows the number of cointegrating vectors between the number of infected agents and the trading volume. In Fig. 7, during days 1–7 and days 15–19, the number of cointegrating vectors is 1 for both pairs of time series for two or more consecutive days, thus meeting the condition (indicated by the orange rectangles). By contrast, on day 12, although the number of cointegrating vectors is 1 for both pairs, it does not continue for 2 days, thus not meeting the condition. Additionally, on days 20–22, the lower cointegrating vector remains at 1 for two or more consecutive days, but the upper cointegrating vector is 0, thus failing to meet the condition.
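The day segmentation and the persistence check described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the daily numbers of cointegrating vectors are made-up values that mimic the textual description of Fig. 7, the function name `qualifying_windows` is hypothetical, and the Johansen test itself is not run here.

```python
STEPS_PER_SET = 5
SETS_PER_DAY = 80
TOTAL_STEPS = 10_000

steps_per_day = STEPS_PER_SET * SETS_PER_DAY  # 400 simulation steps per day
n_days = TOTAL_STEPS // steps_per_day         # 25 days


def qualifying_windows(rank_price, rank_volume, min_days=2):
    """Return 1-indexed (first_day, last_day) runs in which both daily numbers of
    cointegrating vectors equal 1 for at least min_days consecutive days."""
    windows = []
    start = None
    # A trailing (0, 0) sentinel closes a run that lasts until the final day.
    paired = list(zip(rank_price, rank_volume)) + [(0, 0)]
    for day, (rp, rv) in enumerate(paired, start=1):
        both_cointegrated = rp == 1 and rv == 1
        if both_cointegrated and start is None:
            start = day
        elif not both_cointegrated and start is not None:
            if day - start >= min_days:
                windows.append((start, day - 1))
            start = None
    return windows


# Made-up daily values that follow the description of Fig. 7:
# infected agents vs. price (upper panel) and vs. volume (lower panel).
rank_price = [1] * 7 + [0] * 4 + [1] + [0] * 2 + [1] * 5 + [0] * 6
rank_volume = [1] * 7 + [0] * 4 + [1] + [0] * 2 + [1] * 8 + [0] * 3

windows = qualifying_windows(rank_price, rank_volume)
print(n_days, windows)
```

Day 12 is dropped because the joint run lasts only one day, and days 20–22 are dropped because only the volume series stays at 1.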
As a result, all models with 40 or more SIR agents (\(N_{SIR} \geq 40\)) fulfilled the abovementioned conditions. When the model had only one SIR agent, it did not fulfill the conditions at any time (see Footnote 13). This indicates that SIR agents (i.e., social media investors) give rise to the meme stock phenomenon in the simulation and that the model reproduces this phenomenon.
Appendix B Qualitative validation of the model
In the qualitative validation of the model, we focus on volatility. Lyócsa et al. [31] examined whether posts on WSB are related to the daily price fluctuations of four meme stocks, and confirmed that increased discussion on WSB leads to greater price volatility for these four stocks. Other studies have also shown that volatility increases when the meme stock phenomenon occurs [1, 23]. Therefore, we analyze volatility changes when the number of SIR agents increases. Volatility \(\sigma\) is defined as the standard deviation of the log return when the drift term is assumed to be zero (see Footnote 14).
The number of steps in the simulation is denoted by T (see Footnote 15). The parameter \(\sigma\) represents the extent of past price fluctuations. The larger the value of \(\sigma\) (which is greater than or equal to 0), the higher the risk of price fluctuations (indicating lower safety); conversely, the smaller the value of \(\sigma\), the lower the risk of price fluctuations (indicating higher safety).
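The verbal definition of \(\sigma\) (standard deviation of log returns with zero drift) can be sketched as the root mean square of the log returns. The normalization by the number of returns and the function name `realized_volatility` are assumptions made for this illustration. The exact formula used in the study is not reproduced in this text.

```python
import math


def realized_volatility(prices):
    """Root mean square of log returns, i.e. their standard deviation
    when the mean (drift) is assumed to be zero."""
    log_returns = [math.log(later / earlier) for earlier, later in zip(prices, prices[1:])]
    return math.sqrt(sum(r * r for r in log_returns) / len(log_returns))


# Made-up price paths, for illustration only.
flat_path = [100.0, 100.0, 100.0]
swinging_path = [100.0, 110.0, 100.0]

print(realized_volatility(flat_path), realized_volatility(swinging_path))
```

A flat price path gives \(\sigma = 0\), while the path that rises by 10% and falls back gives \(\sigma = \ln 1.1\), which matches the reading that a larger \(\sigma\) means larger price fluctuations.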
Figure 8 plots the mean values of volatility \(\sigma\) for each \(N_{SIR}\). The error bars represent the standard deviation of 50 trials. The overall trend was as hypothesized: the larger \(N_{SIR}\), the larger \(\sigma\).
Appendix C Results for different values of \(\lambda\) and \(\mu\)
In Sect. 5, we focus on the relationship between the profit or loss, \(PL^{i}\), and \(N_{SIR}\) in the experiments with three different network models. Across all values of parameters \(\lambda\) and \(\mu\), as \(N_{SIR}\) increases, the \(PL^{i}\) of the SIR agent tends to be negative, whereas the \(PL^{i}\) of the FCN agent becomes positive. However, as \(N_{SIR}\) increases, the standard deviation of \(PL^{i}\) for both agents also increases, and the difference in \(PL^{i}\) between the agents becomes more pronounced. Figures 9, 10, and 11 show the relationship between \(PL^{i}\) and \(N_{SIR}\) in the experiments with the three different network models. The results are for cases other than when both parameters are 0.8.
In Sect. 5.2, we investigate how conventional investors prioritize different types of information for pricing and how this affects their profits. That section focuses on the heterogeneity of FCN agents, determined by the weights of the fundamental, chart, and noise factors, and how these affect their profits (\(PL^{i}\)). A multiple regression analysis is performed for all values of parameters \(\lambda\) and \(\mu\) over 50 trials, each involving 200 agents. The dependent variable in this analysis is \(PL^{i}\), while the independent variables include the weights of the three factors, the mean reversion time, and the time window size. The results show that the chart factor weight has an insignificant impact on \(PL^{i}\), and the regression model’s explanatory power is low, indicating that the chosen variables do not adequately explain \(PL^{i}\). Variations in \(PL^{i}\) are attributed more to random factors such as order timing than to agent heterogeneity. Market price fluctuations, influenced by uncertainties and SIR agents’ activities, make the effectiveness of strategies inconsistent, a finding that is applicable to real markets as well. Tables 3, 4, and 5 show cases other than when both parameters are 0.8.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Matsumoto, M., Hashimoto, R., Suzuki, M. et al. Impact of information disparity between individual investors on profits of meme stocks using an artificial market simulation approach. J Comput Soc Sc 8, 25 (2025). https://doi.org/10.1007/s42001-024-00355-7
DOI: https://doi.org/10.1007/s42001-024-00355-7