Fuzhong Nian (年福忠), Jingzhou Li (李经洲), and Xin Guo (郭鑫)
School of Computer & Communication, Lanzhou University of Technology, Lanzhou 730050, China
Keywords: Weibo, top news, role, interest degree
“Weibo top news” is a Chinese internet buzzword that mainly refers to the formation of top news through the combined action of numerous forces. Making top news is a form of public opinion orientation. In a sense, top news may act as a bellwether, with a significant impact on people’s lives and even their values.
In the real world, the process of information dissemination is similar to the process of virus transmission. Nian et al. defined an epidemic growth index through a comprehensive analysis of factors including the number of infected people coming from Wuhan, China, the geographic distance from Wuhan, and the GDP of each province. Combining the transmission characteristics of COVID-19 and taking high-risk groups as the research objects, they constructed a dynamic network of high-risk groups and redefined the infection rate, latent rate, and withdrawal rate in the SEIR model.[1] Therefore, we can learn from the classic propagation models of complex networks in the research process. In social networks, people tend to pay attention to hot topics, and some super communicators emerge in the process. For this reason, Liu et al. proposed a SAIR model based on an epidemic model.[2] Considering the selective dissemination of nodes, Liu et al. established the SEIR rumor dissemination model to discuss the dissemination dynamics of rumor information on Weibo.[3] Zhu et al. proposed a social network rumor propagation model with a nonlinear function and time delay.[4] References [5,6] proposed an improved online social network communication model based on the SIR and SIS models. Cheng et al. established an improved rumor dissemination model based on the time delay of the interactive system and discussed the new characteristics of the information dissemination process.[7] Based on our previous research on propagation dynamics, we study the dissemination of news.[8,9] Reference [9] established a new human flesh search (HFS) model based on the SIS and SIR models and proposed four effects: the impulse effect, thermal effect, herd effect, and coupling effect. Similarly, most of the early message propagation models were also based on virus propagation models, such as the classic SI epidemic model[10] and the SIR epidemic model.
For the study of message dissemination, Daley and Kendall proposed the DK model in the early days,[11] and Maki et al. later modified the DK model to develop the MK model.[12] However, these traditional propagation models are difficult to adapt to increasingly complex social networks, so they can hardly describe the spread of messages accurately. Therefore, a large number of researchers began to study the laws of information propagation further by improving the traditional models. For example, Wang et al. proposed a multi-message propagation model on the basis of examining the relationship between different messages.[13] Zhou et al. proposed an adaptive weighted network model based on the SIS model to study the dynamics of infectious diseases.[14] Zhao et al. proposed a new model to analyze the process and extent of rumor spreading from both micro and macro perspectives.[15] Reference [16] proposed a rumor propagation model with impulse vaccination and time delay. Zan et al. studied a dynamic model of the concurrent propagation of double rumors in a complex environment.[17] Fan et al. proposed two spatio-temporal popular network models based on popularity and similarity optimization (PSO).[18] From the perspective of rumor dissemination, Huo et al. proposed an analysis method for the interaction of the two processes.[19] Considering the subjective judgment and diverse characteristics of individuals, Ma et al. proposed a new rumor spreading model, introducing two probability distribution functions to characterize the individual’s knowledge of a specific rumor field and the individual’s rationality.[20] In order to analyze the influence of the distributed infection rate in a complex network on the transmission behavior of infectious diseases, Ref. [21] proposed an improved SIS model. Reference [22] proposed a new rumor spreading model with a hesitation mechanism. The above research does not consider the complexity of the formation process of real social networks.
Based on this consideration, we propose a novel preprocessing scheme for traditional small-world networks and scale-free networks, which provides three role states (fans, passers-by, and anti-fans) to better describe information dissemination on Weibo.
In this paper, we mainly study the laws and characteristics of the top news of internet events and propose a more reliable scheme for making top news, which has positive significance for research on online public opinion orientation. It is therefore extremely important to study the propagation characteristics of messages and the dynamics of message propagation in complex networks.[23–26] In real life, when different message sources transmit multiple messages to the network, we find that, as the messages spread, the spread of information among nodes is essentially a competition for node interest degree. Whether a message is accepted therefore depends on how interested the node is in the information. To get closer to reality, we add a node interest attribute to each node in the network. The node interest degree is defined in terms of the node aggregation degree, and a network evolution model based on node interest degree is established. Based on the above, we focus in this paper on the impact of node interest on message dissemination.
In the real world, individuals have different character types with respect to different message hosts, such as fans, passers-by, and anti-fans, which lead to different individual behaviors, and an individual's character type may also change over time. Based on this, and combined with an analysis of the real Weibo environment, the recovered nodes are classified as anti-fans. If the original type of an infected node is passerby, it can only become a fan. That is, passers-by turn into fans, and fans turn into anti-fans. The modeling process is divided into three stages.
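The role rules stated above can be written down as a small state machine. The following is only an illustrative sketch: the names `Role`, `on_infected`, and `on_recovered` are hypothetical, and leaving non-passerby roles unchanged on infection (and non-fan roles unchanged on recovery) is an assumption, since the text only specifies the passerby-to-fan and fan-to-anti-fan transitions.

```python
from enum import Enum


class Role(Enum):
    PASSERBY = "passerby"
    FAN = "fan"
    ANTI_FAN = "anti-fan"


def on_infected(role: Role) -> Role:
    """Role of a node after it accepts (is infected by) a message."""
    # A passerby who accepts the message becomes a fan.
    # Assumption: any other role is left unchanged.
    return Role.FAN if role is Role.PASSERBY else role


def on_recovered(role: Role) -> Role:
    """Role of a node after it recovers (stops spreading the message)."""
    # Recovered nodes are classified as anti-fans, so a fan becomes an anti-fan.
    # Assumption: only infected nodes (fans) can recover, as in S -> I -> R.
    return Role.ANTI_FAN if role is Role.FAN else role


if __name__ == "__main__":
    role = Role.PASSERBY
    role = on_infected(role)
    print(role)
    role = on_recovered(role)
    print(role)
```

Keeping the two transitions in separate functions mirrors the two events of the SIR process (infection and recovery), so the same rules can be reused in all three stages.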
The first stage: the accumulation stage.
Scenario: Suppose a celebrity has just joined Weibo and needs to accumulate fans.
Step: Establish the SI model. N pieces of unrelated information are randomly transmitted in the network (to facilitate observation, we spread two pieces of information), and the susceptible nodes and the infected nodes represent passers-by and fans, respectively. Figure 1(a) is a schematic diagram of the fan interest value. As can be seen from Fig. 1(a), the normal curve shows that the interest value is largest at round 4, so the first stage lasts 4 propagation rounds. At this stage, the infected nodes are the fans.
Fig.1. Schematic diagram of the interest value of passersby and fans.
The second stage: the preparation stage.
Scenario: The celebrity has a certain fan base and posts a message to make top news.
Step: The SIR model is established. The initially infected node of the first stage is taken as the node that initially publishes the message, and the fan result obtained in the first stage is taken as the node classification in the network. Recovered nodes are classified as anti-fans, and infected nodes become fans if they are passers-by. That is, passers-by turn into fans, and fans turn into anti-fans.
The third stage: the stimulation stage.
Scenario: Building on the second stage, stimulus information is released at a certain time point. Figure 2 is a schematic diagram of the stimulation stage.
Steps: Select four time points (i.e., four stimulation schemes): when the infection density rises at the maximum rate, when it is the largest, when it is decreasing, and when it is stable. Add stimulus information at each of these four time points, and analyse and compare the results.
Strategy: Passers-by are not affected by the stimulus, but their interest value fluctuates over time, as shown in Fig. 1(b). If a fan is affected, the interest value is restored to its original value, that is, the fan's interest value at t = 0, and then decreases with time, as shown in Fig. 1(a).
Fig.2. Schematic diagram of the stimulation stage.
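The four release times of the stimulation stage can be read off a simulated infection density curve. The sketch below is one possible reading, not the authors' procedure: the function name `stimulation_times` and the tolerance `tol` are hypothetical, Scheme III is taken to be the steepest decline after the peak, and "stable" is taken to be the first later round whose change falls below `tol`. Both choices are assumptions.

```python
def stimulation_times(density, tol=1e-3):
    """Return candidate release rounds (indices into `density`) for Schemes I-IV."""
    if len(density) < 3:
        raise ValueError("need at least three samples")
    # First differences approximate the rate of change between consecutive rounds.
    diffs = [density[t + 1] - density[t] for t in range(len(density) - 1)]
    # Scheme I: infection density rises at the maximum rate.
    t_rise = max(range(len(diffs)), key=lambda t: diffs[t])
    # Scheme II: infection density is the largest.
    t_peak = max(range(len(density)), key=lambda t: density[t])
    # Scheme III (assumption): steepest decline at or after the peak.
    after_peak = range(t_peak, len(diffs))
    t_fall = min(after_peak, key=lambda t: diffs[t]) if len(after_peak) else t_peak
    # Scheme IV (assumption): first round after the steepest decline whose
    # change is smaller than `tol`; fall back to the last round otherwise.
    t_stable = next(
        (t for t in range(t_fall + 1, len(diffs)) if abs(diffs[t]) < tol),
        len(density) - 1,
    )
    return {"I": t_rise, "II": t_peak, "III": t_fall, "IV": t_stable}


if __name__ == "__main__":
    # Toy infection density curve, invented purely for illustration.
    curve = [0.01, 0.05, 0.20, 0.45, 0.60, 0.55, 0.35, 0.25, 0.22, 0.22, 0.22]
    print(stimulation_times(curve))
```

In an experiment, the same function would be applied to the infection density of the unstimulated run, and the stimulus would then be injected at the returned round for each scheme.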
The purpose of this article is to study the formation mechanism of Weibo top news, to predict the next top news, and to examine how top news can be created artificially. We introduce two concepts: interest degree and interest value. Interest value expresses the degree of an individual's interest in a certain event or message. The main research environment of this paper is Weibo. We therefore take fans and passers-by as the main groups on Weibo, and we consider that individual interest changes over time and that the interest curves of fans and passers-by differ markedly.
For fans, the initial interest value is very high and then gradually decreases over time. We find that the Ebbinghaus forgetting curve is highly similar to this interest value curve. Therefore, we extend the forgetting curve and apply it to describe how the interest value of fans changes. We define the fan interest value function as
where k is a constant. Figure 1(a) is a schematic diagram of the fans’ interest value. When the interest value of fans drops to a certain critical value, the subject of the event can release some information to stimulate the interest of fans, thereby promoting the formation of top news.
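To make the decay-and-stimulate mechanism concrete, the sketch below assumes a simple exponential forgetting curve, exp(-k·t), for the fan interest value. This functional form, the values k = 0.3 and critical = 0.1, and the names `fan_interest` and `interest_trace` are all assumptions chosen for illustration; they are not taken from the definition above.

```python
import math


def fan_interest(t, k=0.3, last_stimulus=0.0):
    """Assumed forgetting-curve form: interest decays as exp(-k * elapsed)."""
    elapsed = t - last_stimulus
    if elapsed < 0:
        raise ValueError("t must not precede the last stimulus")
    return math.exp(-k * elapsed)


def interest_trace(rounds, k=0.3, critical=0.1):
    """Interest per round, releasing a stimulus whenever it drops below `critical`."""
    last_stimulus = 0.0
    trace, stimuli = [], []
    for t in range(rounds):
        value = fan_interest(t, k, last_stimulus)
        if value < critical:
            # A stimulus restores the interest to its t = 0 value,
            # after which the decay starts again.
            last_stimulus = t
            stimuli.append(t)
            value = fan_interest(t, k, last_stimulus)
        trace.append(value)
    return trace, stimuli


if __name__ == "__main__":
    trace, stimuli = interest_trace(20)
    print("stimulus rounds:", stimuli)
    print("interest at round 8:", trace[8])
```

With these illustrative parameters the interest first falls below the critical value at round 8, so the stimulus is released there and the curve restarts from its initial value.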
For passers-by, the initial interest value is relatively small, and it gradually decreases over time until it reaches the critical value. We extend the normal curve to describe this behavior. When a passerby receives information, the node can change from the passerby role to the fan role. The passerby interest value function is defined as
where η, ε, and µ are constants. Combining the above, suppose there are n pieces of information in the network, and let Γ_j denote the node’s interest value for information (event) j,
Interest degree refers to the tendency of an individual to be interested in a certain event or message. The higher the interest degree, the more the individual identifies with the information (event) and the more willing the individual is to receive related information. Social networks usually have multiple pieces of information spreading at once. Assuming that node i holds n pieces of information at time t, the total interest value of the node is as follows:
People’s willingness to accept information is positively related to their interest degree in that information. Based on the above analysis, the dynamic propagation probability in the model is defined as
Similarly, people’s willingness to refuse information is negatively related to their interest degree in that information. Therefore, the dynamic recovery probability in the model is defined in Eq. (7), where a is the recovery coefficient,
In Eq. (8), s(t), i(t), and r(t) are the proportions of susceptible, infected, and recovered nodes in the population, respectively, and s(t) + i(t) + r(t) ≡ 1 (s > 0, i, r ≤ 1). The first term on the right side of the second equation describes the susceptible nodes that become infected with probability β(t). The second term on the right side of the second equation describes the infected nodes that become recovered nodes with probability γ(t),
Although this integral has no explicit solution, the evolution of the solution of the SIR model can be characterized by numerical calculation. In fact, for a given set of parameter values, we can obtain the steady-state value of the recovered nodes by setting dr/dt = 0 as follows:
Fig.3. Effective transmission rate λ curve.
For large-scale networks, it is usually assumed that only one or a few individuals are infected at the initial moment and that there are no recovered individuals, so that s_0 ≈ 1, i_0 ≈ 0, and r_0 ≈ 0. Let λ = β/γ, so we have
where λ = 1 is the propagation critical value of the improved SIR model. If λ < 1, then r = 0, which means that the information cannot propagate. If λ > 1, then r > 0, and as the value of λ increases, the value of r also increases, which means that the information spreads more widely in the network. An intuitive interpretation of the parameter λ is that it represents the average number of susceptible individuals that an infected individual can infect before being immunized. Figure 3 shows the effective transmission rate curves. The curves BAλ and WSλ represent the trends of the effective transmission rate of Scheme II in scale-free networks and small-world networks, respectively. The effective transmission rate λ is far greater than 1 in both networks, which means that the information spreads rapidly in the network with high probability and supports the effectiveness of Scheme II.
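The threshold behavior at λ = 1 can be checked numerically. The sketch below is an illustration under explicit assumptions: it uses the standard mean-field SIR equations (ds/dt = -β·s·i, di/dt = β·s·i - γ·i, dr/dt = γ·i) with constant β and γ rather than the time-dependent β(t) and γ(t) of the model, and the textbook final-size relation r = 1 - s_0·exp(-λ·r). The function names and parameter values are hypothetical.

```python
import math


def final_size(lam, s0=1.0, iterations=1000):
    """Solve r = 1 - s0 * exp(-lam * r) by fixed-point iteration."""
    r = 1.0  # start from the upper bound so the largest root is found
    for _ in range(iterations):
        r = 1.0 - s0 * math.exp(-lam * r)
    return r


def simulate_sir(beta, gamma, i0=1e-4, dt=0.01, t_end=200.0):
    """Forward-Euler integration of the constant-rate mean-field SIR model."""
    s, i, r = 1.0 - i0, i0, 0.0
    for _ in range(int(t_end / dt)):
        new_infected = beta * s * i * dt   # susceptible -> infected
        new_recovered = gamma * i * dt     # infected -> recovered
        s -= new_infected
        i += new_infected - new_recovered
        r += new_recovered
    return s, i, r


if __name__ == "__main__":
    for beta, gamma in [(0.1, 0.2), (0.4, 0.2)]:
        lam = beta / gamma
        _, _, r_sim = simulate_sir(beta, gamma)
        print(f"lambda={lam:.1f}  simulated r={r_sim:.4f}  "
              f"final-size r={final_size(lam, s0=1.0 - 1e-4):.4f}")
```

Below the threshold the recovered fraction stays near zero, while above it both the simulation and the fixed-point relation give a finite spread that grows with λ.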
This section presents simulation results on the small-world network and the scale-free network to verify the theory in the previous section. Following the modeling ideas in the previous section, we chose four time points (i.e., four stimulation schemes): when the infection density rises at the maximum rate, when it is the largest, when it is decreasing, and when it is stable. Stimulus information is added at each of these four time points, and the results are analysed and compared. Each scheme is one group of experiments. To reflect both the generality and the particularity of the experiment, each group of experiments releases two different pieces of information, info0 and info1, at the same time. Here, info0 is the experimental object: in the simulation, the four interest stimulation schemes are applied to info0 in turn, while info1 receives no intervention, so that the effects of the schemes can be compared and analysed. The experimental parameters are shown in Table 1.
Table 1. Experimental parameter description list.
Table 2. Radar chart of fans density.
Table 2 lists the fan density radar charts, which clearly show the fan density comparison results of our four experimental schemes under different initial conditions. Here, fans0, fans1, fans2, and fans3 represent the stable values of the fan density of the subject info0 in the four experimental schemes, and fans represents the stable value of the fan density in the original experiment (the experiment without any scheme), which serves as the control group. It can be seen that the fan density of fans1 (Scheme II) exceeds 0.8 at its maximum, which is much larger than the fan density of the other schemes. Considering the “celebrity with large traffic amount” effect in the real environment, it is reasonable to infer that Scheme II is the best scheme for making top news on Weibo.
Table 3 shows the evolution of the average interest degree and the infection density. The four groups of pictures show the comparison results of the four interest stimulation schemes under different initial conditions. The curves infection0, infection1, infection2, and infection3 represent the infection density curves of the four schemes and show how the infection density changes with time. The infection curve represents the infection density trend of the original experiment, interestdegree0–3 represent the trends of the average interest degree over time under the four experimental schemes, and the interestdegree curve represents the average interest degree trend of the original experiment.
It can be seen from Table 3 that the average interest degree curve and the infection density curve follow the same trend. The infection curve is used as the control group. The infection3 curve of Scheme IV fits the control group relatively well, and the corresponding average interest degree curve behaves similarly. This indicates that Scheme IV has no obvious effect and is therefore ineffective, whereas the other three schemes produce obvious results. We find that this is due to a limitation of the SIR model: the time point of Scheme IV (when the infection density is stable) nearly coincides with the infection density threshold of the SIR model, at which the infected nodes in the network tend to be stable, so the effect of Scheme IV is not obvious. By contrast, owing to the characteristics of the SIR model and the fan interest value curve in Definition 1, when the infection density is the largest (Scheme II), that is, at t = 15, the fan node interest value approaches 0 and the node interest degree is affected as well. The average interest degree evolution graph shows that the average interest degree begins to decline at this point; because the interest degree is an important factor in determining whether information is accepted, this is the best time to apply the stimulus. In all four groups of experiments, the average interest degree curve of Scheme II (interestdegree1) rises quickly to about 0.9 shortly after t = 15, and the infection density, which is closely related to the interest degree, also rises substantially, as high as 37.3%. However, Schemes I and III do not make full use of the characteristics of the SIR model: the node interest degree is overdrawn before the infection density threshold is reached, so the best effect is not achieved.
We find that the threshold of Scheme II can reach about 0.9 regardless of changes in the initial conditions, while the other schemes are strongly affected by them. This indicates that Scheme II has higher stability, which has been confirmed in other experiments.
Figure 4 shows the fan density, average interest degree, and infection density curves of Scheme II under different networks and initial conditions, based on the improved SIR network evolution model. The curves BA0 and WS0 represent the trends of fan density, average interest degree, and infection density over time in the scale-free network and the small-world network, respectively, with different initial interest values (0.6 for the experimental group and 0.8 for the control group) under the interest degree stimulation Scheme II. The curves BA1 and WS1 show the corresponding trends over time when the initial infection rate differs (0.01 for the experimental group and 0.02 for the control group). The curves in the figure show the data of the experimental group, and the initial values of the experiment are randomly selected empirical values.
Comparing the time required to reach the peak in each experiment, we find that the different schemes reach the peak almost simultaneously under the same initial conditions, but the experiments in which the initial infection rate is the varied initial condition reach the peak first. Since the main environment studied in this paper is Weibo, a specific analysis of this environment suggests that the information publishing subject (initial infection rate) has a more positive impact on whether information can make top news than the initial interest value of the information. It is not difficult to imagine that the “celebrity with large traffic amount” effect makes the related information more likely to make top news.
Table 3. List of average interest degree and infection density curves.
Fig. 4. Infection density, average interest degree, and fans density curve of Scheme II.
Figure 5 is a radar chart of the peak network infection density, based on the network evolution model, in the scale-free network and the small-world network. In the figure, Info1 to Info7 represent the peak infection density of each piece of information, with initial infection rates from 0.01 to 0.07, respectively. Scheme II is applied to Info1 to compare the effects. The results show that Scheme II works better in the scale-free network. This is determined by the power-law characteristics of the scale-free network, which further supports the effectiveness and feasibility of Scheme II for real social networks.
Figure 6 shows the effect of Scheme II in the scale-free network for initial infection nodes of different degrees. Based on the above analysis, Scheme II is applied to the scale-free network with the same infection rate but different degrees of the initial infection nodes. The degree of the initial infection node of info0 is less than 5, and the degree of the initial infection node of info1 is greater than 30. Figure 6(a) shows the infection curves under normal conditions. In Fig. 6(b), infection0 is the infection density trend under interest degree stimulation Scheme II, and infection1 is the infection density trend without stimulation. The figure shows that the degree of the initial infection node is one of the underlying factors that affect whether information makes top news, and Fig. 6(b) shows the effectiveness of Scheme II.
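The choice of low-degree and high-degree initial nodes can be illustrated with a small preferential-attachment generator. This is a sketch only: the generator `barabasi_albert`, the helper `seeds_by_degree`, and the sizes n = 2000 and m = 3 are hypothetical and are not the network parameters used in the experiments; only the degree cut-offs (less than 5, greater than 30) come from the text.

```python
import random


def barabasi_albert(n, m, seed=0):
    """Grow a scale-free graph by preferential attachment; returns adjacency sets."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # Start from a small complete graph on m + 1 nodes.
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            adj[u].add(v)
            adj[v].add(u)
    # Each node appears in `stubs` once per incident edge, so sampling uniformly
    # from it selects targets with probability proportional to degree.
    stubs = [u for u in range(m + 1) for _ in adj[u]]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            stubs.extend((new, t))
    return adj


def seeds_by_degree(adj, low=5, high=30):
    """Split candidate initial nodes into low-degree and high-degree groups."""
    low_seeds = [v for v, nb in adj.items() if len(nb) < low]
    high_seeds = [v for v, nb in adj.items() if len(nb) > high]
    return low_seeds, high_seeds


if __name__ == "__main__":
    graph = barabasi_albert(2000, 3)
    low_seeds, high_seeds = seeds_by_degree(graph)
    print(len(low_seeds), "candidate seeds with degree < 5")
    print(len(high_seeds), "candidate seeds with degree > 30")
```

An initial spreader for info0 would then be drawn from the low-degree group and one for info1 from the high-degree group before running the propagation model.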
Fig.5. Radar chart of peak infection density.
Fig.6. The effect diagram of Scheme II in a scale-free network with different degrees of initial infection nodes.
Fig.7. Comparison of Scheme II and the real case.
In October 2015, Tu Youyou won the Nobel Prize in Physiology or Medicine. In the same month, Chinese actor and singer Huang Xiaoming got married. These two events occurred almost simultaneously, but the attention they attracted on Weibo was very different. Figure 7 compares the simulation of Scheme II in the scale-free network with the real cases of Weibo headlines. Real case1 and real case2 represent the attention trend curves of these two events on Weibo, as measured by the comments they triggered from netizens. BA Scheme II is the infection density trend curve of Scheme II proposed in this article. It can be seen that the curve of the proposed scheme follows, on the whole, the same trend as the curves of the actual top news events, which supports the reliability and feasibility of Scheme II.
This paper took Weibo as the research subject to discuss the process of top news formation in social networks and the competition for node interest degree. We proposed a novel preprocessing scheme for traditional small-world networks and scale-free networks and, for the first time, introduced three role types (fans, passers-by, and anti-fans) to classify nodes. To further describe how messages compete for node interest degree, the node interest degree was defined in terms of node role and node aggregation degree. A network evolution model based on node interest degree was established, and four interest degree stimulation schemes were proposed.
Simulation experiments were carried out on the small-world network and the scale-free network. The experimental results showed that Scheme II, which releases the stimulus information when the density of information receivers is the highest, is more conducive than the other schemes to competing for node interest degree and thus to making top news. The reliability and effectiveness of the scheme and model were further verified by a comparative experiment with actual cases.