Fuzhong Nian (年福忠), Xiaochen Yang (杨晓晨), and Yayong Shi (师亚勇)
1 Lanzhou University of Technology, School of Computer & Communication, Lanzhou 730050, China
2 University of Science and Technology Beijing, School of Computer & Communication Engineering, Beijing 100083, China
Keywords: COVID-19, basic reproduction number, gross domestic product (GDP), geographic distance, cross-regional spread
At the beginning of 2020, the outbreak of COVID-19 hit the majority of countries and regions worldwide.[1] On January 30, 2020, the World Health Organization (WHO) announced that a “public health emergency of international concern” had occurred.[2] In the face of such a serious situation, countries took many measures, such as cutting off transportation networks, closing public places, and reducing people’s travel.[3,4] Although these measures have slowed the spread of the virus to a certain extent, they have also caused great inconvenience to people’s work and life. In the future, they may also trigger a global economic crisis and a serious economic recession.[5,6]
Understanding the spread characteristics of COVID-19 and identifying the factors that affect its spread is a serious and challenging problem that has attracted attention in various fields, including biology, sociology, physics, and computer science.[7–11] Infectious disease models provide a powerful framework for investigating the spread of infectious diseases in populations.[12,13] The classical susceptible–infected–recovered (SIR) model[14] divides the total population into three classes: the susceptible class, denoted by S, which represents individuals who are not yet infected by the disease; the infected class, denoted by I, which represents those who have been infected with the disease; and the recovered class, denoted by R, which represents individuals who have recovered from the disease. Similar infectious disease models include the susceptible–infected (SI) model, the susceptible–infected–recovered–susceptible (SIRS) model, the susceptible–exposed–infected–recovered (SEIR) model, and so on.[15–17] Such models have been widely used to simulate the spread of COVID-19 and to predict the trend in the number of infected cases.[18] Most of these models are simulated on small-world networks,[19] scale-free networks,[20] or other real networks. Small-world and scale-free networks, as homogeneous networks, are often used to simulate the spread of a virus within a local region.[21]
When the virus spreads across regions, its spread speed is affected by differences between regions, such as the network structure of social relationships, contact frequency, and protective measures, and it cannot be described by a homogeneous network. Many scholars have studied the impact of regional differences on the spread of the virus from a macro perspective. COVID-19 is a highly contagious disease that can be transmitted through respiratory droplets, airborne transmission, and contact, and the virus has an incubation period of 0–21 days.[22] Patients in the incubation period show no symptoms, but they can still spread the virus to other people. Therefore, population movement between regions is an important factor affecting the cross-regional spread of the virus.[23]
In addition to population movement, factors such as climate and air pollution also influence the spread of the virus.[24,25] Tian et al. found that temperature and humidity are negatively correlated with daily cases of COVID-19.[26] In northern China, the negative effects of rising temperature on COVID-19 were counteracted by aggravated air pollution. In southern cities, rising temperature restrained the facilitating effects of air pollution on COVID-19. Furthermore, factors such as occupation, individual immune responses, attitudes towards vaccination, underlying diseases, age, and prevention and control measures can all affect the spread of the epidemic.[27–32] Because the factors affecting the spread of COVID-19 are extremely complex, it is not possible to consider them all comprehensively.[33,34] Some studies have shown that gross domestic product (GDP)[35,36] and distance[37,38] are among the more important factors affecting COVID-19 spread, so in this article we consider GDP and distance together and focus on their relationship with COVID-19 spread.
In our previous study,[35] the correlation between GDP and the number of confirmed cases in a region was confirmed, and an epidemic growth index based on factors such as GDP was proposed to quantify the virus spread in a region. Later, Jayson et al.[36] also found a strong correlation between the confirmed cases in the provinces and cities of China and each region’s GDP. As the virus has continued to spread, varying numbers of infected people have been found in most countries. To ascertain the differences in the infectivity of the virus between countries, the basic reproduction number of COVID-19 is calculated for different countries. Then, the number of confirmed cases is analyzed in relation to GDP and geographic location for the United States, the European Union member states, and China.
The rest of this paper is organized as follows. In Section 2, the infectivity of COVID-19 in different countries is analyzed through the basic reproduction number. In Section 3, the impact of economic and geographic factors on the spread speed is analyzed for the cross-regional spread of COVID-19, and the epidemic disease spread index is defined to measure the spread speed of COVID-19 from high-risk regions to other regions. Section 4 concludes the paper.
As of 4 Aug 2020, more than 200 countries had been affected to varying degrees, with the total number of confirmed cases exceeding 18 million. Among them, the infections in China, Germany, Britain, France, Spain, and the United States are among the most serious. To study the factors influencing the spread speed of COVID-19 across regions, this paper separately collects data on confirmed cases, GDP, and geographic location for the US states, the EU member states, and Chinese provinces and cities.
China. Infection data for China were obtained from the National Health Commission of the People’s Republic of China and the health commissions of the provinces.[39] They include the numbers of confirmed cases for 22 provinces, 5 autonomous regions, and 4 municipalities (excluding Hong Kong, Macao, Taiwan province, and Hainan province), and span from Jan 22 to July 17. China’s population data were from the World Population Network’s Seventh National Population Census.[40] GDP data for China were obtained from the National Bureau of Statistics of China.[41] Geolocation data for China were obtained from Google Maps.
United States. Infection data for America were obtained from the United States of America Facts (USA FACTS).[42] They include the numbers of confirmed cases for 48 states, and span from Feb 15 to July 17. US population data were from the World Population Network.[43] GDP data for America were obtained from the US Bureau of Economic Analysis.[44] Geolocation data for America were obtained from Google Maps.
European Union. Infection data for member states of the European Union were obtained from the European Centre for Disease Prevention and Control (ECDC).[45] They include the numbers of confirmed cases for 27 countries, and span from Feb 15 to July 20. Population data for EU countries were from Eurostat.[46] GDP data for all the countries of the European Union were obtained from Statista.[47] Geolocation data for the European Union were obtained from Google Maps.
The COVID-19 spread is described by the susceptible–exposed–infected–recovered (SEIR) model. In the SEIR model, if an infected or exposed individual comes into contact with a susceptible individual, the susceptible individual is infected with probability β (the infection rate). At each time step, an exposed individual becomes infected with probability γ1, and an infected individual recovers with probability γ2.
The spread process of the SEIR model can be described by the following four differential equations:
where S(t), E(t), I(t), and R(t) represent the numbers of individuals who are susceptible, exposed, infected, and recovered in the network at time t. N represents the total number of individuals in the network, and N = S(t) + E(t) + I(t) + R(t). At t = 0, S(t) = N, I(t) → 0, E(t) → 0, and R(t) = 0.
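As a minimal sketch of the dynamics described above, the following Python code integrates an SEIR system with a forward-Euler scheme. The specific form of the four equations (given in the docstring) is an assumption of this sketch: it follows the verbal description in which both exposed and infected individuals can infect susceptible ones, and it may differ from the equations used by the authors. The function name and all parameter values are hypothetical.

```python
def simulate_seir(n, beta, gamma1, gamma2, e0=1.0, i0=0.0, days=200, dt=0.1):
    """Forward-Euler integration of an ASSUMED SEIR system.

    Assumed form (both E and I are infectious, as the text describes):
        dS/dt = -beta * S * (E + I) / N
        dE/dt =  beta * S * (E + I) / N - gamma1 * E
        dI/dt =  gamma1 * E - gamma2 * I
        dR/dt =  gamma2 * I
    Returns a list of (t, S, E, I, R) tuples.
    """
    s, e, i, r = n - e0 - i0, e0, i0, 0.0
    history = [(0.0, s, e, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        # Flows between compartments during one time step of length dt.
        new_exposed = beta * s * (e + i) / n * dt
        new_infected = gamma1 * e * dt
        new_recovered = gamma2 * i * dt
        s -= new_exposed
        e += new_exposed - new_infected
        i += new_infected - new_recovered
        r += new_recovered
        history.append((k * dt, s, e, i, r))
    return history


if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    hist = simulate_seir(n=1_000_000, beta=0.3, gamma1=0.2, gamma2=0.3)
    peak_t, _, _, peak_i, _ = max(hist, key=lambda row: row[3])
    print(f"infected peak of {peak_i:.0f} at t = {peak_t:.1f}")
```

Because every flow leaves one compartment and enters another, the sum S + E + I + R stays equal to N throughout the run, which matches the constraint stated above.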
The basic reproduction number R0[48] is the number of infections caused by a single source of infection in a susceptible population, and it is widely used to assess the ability of an infectious disease to spread in a population. The greater the basic reproduction number, the greater the ability of the infection to spread. If R0 < 1, the size of the outbreak gradually decreases and the disease eventually disappears. If R0 > 1, the disease may cause large-scale infections in the population if no measures are taken in time. Based on the model proposed in Eq. (1), the basic reproduction number can be expressed as follows:
where λ = ln Y(t)/t is the growth rate when the epidemic is in the early stages of spreading, and Y(t) is the number of confirmed cases that have already exhibited symptoms of infection at time t since the virus was detected. With the incubation period TE = 1/γ1, the infection period TI = 1/γ2, the generation time Tg = TE + TI, and the ratio of the incubation period to the generation time ρ = TE/Tg, the basic reproduction number can be expressed as follows:
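The substitution just described can be written out step by step. As a sketch, assume the early-growth relation R0 = (1 + λ/γ1)(1 + λ/γ2) that is commonly used with SEIR-type models; the exact form of Eqs. (2) and (3) is an assumption here. Using TE = ρTg and TI = (1 − ρ)Tg, which follow from the definitions above, gives:

```latex
\begin{aligned}
R_0 &= \left(1+\frac{\lambda}{\gamma_1}\right)\left(1+\frac{\lambda}{\gamma_2}\right)
     = (1+\lambda T_E)(1+\lambda T_I) \\
    &= 1 + \lambda\,(T_E+T_I) + \lambda^{2}\,T_E\,T_I \\
    &= 1 + \lambda T_g + \rho\,(1-\rho)\,(\lambda T_g)^{2}.
\end{aligned}
```

In this form R0 depends only on the growth rate λ, the generation time Tg, and the ratio ρ.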
As can be seen in Eq. (3), λ, ρ, and Tg determine the size of R0. Initial reports (on 5 January 2020)[49,50] mentioned that of the 59 suspected cases, 41 were confirmed, which gives a probability q = 41/59 = 0.695 that a suspected case is confirmed. Considering the delay caused by limited testing conditions and medical resources, as well as the possibility of potential cases not seeking medical treatment,[51–54] we estimate the confirmation rate to be 0.5 in this article, so q = 0.5 serves as the lower limit of the daily probability of confirming a diagnosis, while q = 1 serves as the upper limit. Although this estimate is not accurate and has certain limitations, it still has some practical significance. In addition, the time at which the first case was detected in each country is set as the initial time of the outbreak, t = 0. The ρ value of the SARS virus is between 0.6 and 0.8.[55] According to a previous study,[56] COVID-19 is highly similar to the SARS virus, so ρ for COVID-19 is taken to be 0.6–0.8. Let ρe = ρ(1−ρ); then, for all possible values of ρ, the maximum value of ρe is 0.51 and the minimum value of ρe is 0.43. The maximum and minimum values of ρe are used as the upper and lower bounds, respectively, in the calculation of the basic reproduction number R0. The value of Tg is 8.4.[57]
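To make the roles of λ, ρe, and Tg concrete, the following sketch evaluates an expression of the form R0 = 1 + λTg + ρe(λTg)², which is assumed here to be the form of Eq. (3). The bounds on ρe and the value of Tg are the ones quoted above. The function names and the case count in the example are hypothetical, and the confirmation rate q is not modeled.

```python
import math


def growth_rate(confirmed, t_days):
    """Early-stage growth rate, lambda = ln Y(t) / t."""
    return math.log(confirmed) / t_days


def r0_estimate(lam, t_g, rho_e):
    """Assumed form of Eq. (3): R0 = 1 + lam*Tg + rho_e*(lam*Tg)^2."""
    x = lam * t_g
    return 1.0 + x + rho_e * x * x


def r0_bounds(confirmed, t_days, t_g=8.4, rho_e_low=0.43, rho_e_high=0.51):
    """Lower and upper R0 estimates using the rho_e bounds quoted in the text."""
    lam = growth_rate(confirmed, t_days)
    return r0_estimate(lam, t_g, rho_e_low), r0_estimate(lam, t_g, rho_e_high)


if __name__ == "__main__":
    # Hypothetical input: 1000 confirmed cases 40 days after first detection.
    low, high = r0_bounds(confirmed=1000, t_days=40)
    print(f"R0 between {low:.2f} and {high:.2f}")
```

Since ρe multiplies a non-negative term, the larger ρe always yields the larger estimate, so the two values bracket R0 for a given growth rate.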
Based on time series data of the infected cases in multiple countries, this paper studies the differences in the infectivity of COVID-19 between countries and regions through the basic reproduction number, as shown in Fig. 1.
Fig. 1. Basic reproduction number by country.
As can be seen from Fig. 1, at the beginning of the COVID-19 spread, the basic reproduction number varied greatly between countries. For example, the basic reproduction number of COVID-19 in China is greater than 3, while that in Spain, Canada, the USA, and Brazil is around 1. The differences in the early basic reproduction number between countries can be attributed to differences in prevention awareness and non-pharmacological interventions. However, the increasing number of confirmed cases gradually attracted the attention of governments, and each country suppressed the COVID-19 spread by drawing on the successful experience of other countries. Therefore, during the middle stages of the COVID-19 pandemic, the basic reproduction numbers of the various countries stabilized at around 2. Research shows that the basic reproduction number of the Zika virus is 1.4–6.6,[58] that of smallpox is 3.5–6.0,[59] and that of SARS-CoV is 2.2–3.7.[60] Comparing these values shows that COVID-19 is an infectious disease with medium infectious capacity. Given the history of fighting infectious diseases, the COVID-19 spread can be prevented and controlled by taking effective intervention measures.
COVID-19 is spread through person-to-person contact, so geographical location affects the COVID-19 spread in different regions.
Figures 2(a) and 2(d) show the distribution of the infection density in US states and EU member states, respectively. Different colors represent the level of infection in a region: red regions are the most seriously infected, and blue regions are the least infected. Figures 2(b) and 2(e) show the distribution of GDP for US states and EU member states, respectively. Red regions have the highest GDP and blue regions have the lowest GDP.
It can be seen that the GDP of most countries and regions shows a pattern similar to that of the infection density (regions with high infection density also have high GDP values). According to the statistics, Zhejiang, Guangdong, Hubei, and Anhui have higher GDP, and their infection density is also higher than in other regions. The western and northern regions of China have slower economic development, lower GDP, and fewer infections. For example, in Xinjiang, Tibet, Inner Mongolia, and other regions, there are only sporadic infections, and no cluster infections have been found. Comparing Fig. 2(a) with Fig. 2(b) and Fig. 2(d) with Fig. 2(e) reveals a similar pattern.
Florida, Texas, California, and the northeastern United States, as regions with higher GDP, have higher infection density than states with lower GDP. The GDP of the central and northwestern United States is relatively low, and so is the infection density. Similarly, Germany, France, Spain, and Italy, as the countries with the highest GDP among the EU member states, have more than 50000 confirmed cases, far exceeding other EU member states. On the other hand, Estonia, Latvia, Lithuania, Slovakia, Hungary, Slovenia, Croatia, Bulgaria, and Greece have relatively low GDP among the EU member states, and their numbers of confirmed cases are relatively low, at less than 1000.
Fig. 2. The relationship between the infection density and factors such as GDP and geographic location in the US and the European Union. Panels (a)–(c) show the United States; data were collected on April 5th, 2020. Panels (d)–(f) show the European Union; data were collected on April 1st, 2020. The radius of a node in panels (c) and (f) represents the normalized GDP value of that country or region, and its color represents the infection density in that country or region.
Figures 2(c) and 2(f) show the relationship between the infection density, GDP, and geographic location in each region. Each node represents a city, province, state, or country. The relative positions of the nodes represent their true geographic locations. The radius of a node represents the GDP of the region, and its color represents the magnitude of the infection density in the region.
Combining Figs. 2(c) and 2(f), it can be seen that the degree of infection in each region is related not only to its GDP but also to its geographical location. If a region is closer to a high-risk region (one where the outbreak occurred earlier and the number of infections is larger), it has more frequent contact with people in the high-risk region, and the virus spreads faster locally.
As can be seen from Fig. 2(a), infection in the northeastern states of the United States is more serious than in other regions. Among them, the infection in New York is the most serious, and the outbreak in New York also occurred earlier than in other states. Among states with similar GDP, those closer to New York, such as Michigan, Pennsylvania, and New Jersey, are more severely infected, while states such as Wisconsin, North Dakota, Iowa, and Missouri are slightly farther from New York and have a relatively low infection density. Florida, Texas, and California are far from New York, but because of their higher GDP, their infections are also more serious than those in their surrounding states. The other states in the western United States are farther from New York and have lower GDP, so they are less affected by New York.
As can be seen from Fig. 2(f), among all member states of the European Union, infection in the southwestern countries is more serious than in the east and north. Among them, Italy experienced large-scale infections earlier. With only a small difference in GDP, countries such as Belgium, Austria, and Portugal, because of their proximity to Italy, are also more severely infected than more distant countries such as Poland, Sweden, Romania, and Bulgaria.
In summary, GDP is a monetary measure of the market value of all the final goods and services produced in a specific time period, and it reflects the trade and market size of a country (or region). People in regions with large trade and markets tend to communicate and cooperate more frequently, which implicitly increases the frequency of people’s contact and creates opportunities for the spread of the virus. However, this is often limited by distance: for the same trade and market size, it is often easier for two closer regions to communicate and cooperate. Nevertheless, when the trade and market scale of a region is large, the region is often irreplaceable in economic activities, and even if it is farther away than other regions, there will still be frequent exchanges and cooperation with it. As shown in Fig. 3, assuming that there are four regions A, B, C, and D, that their GDPs satisfy A = D > B = C, and that their distances from region A satisfy B = D > C, then the frequency of contact between region A and regions B, C, and D satisfies D > B > C.
Fig. 3. Relationships between regions. The size of a node represents the GDP of the region, the length of an edge represents the distance between the two regions, and the thickness of an arrow represents the frequency of contact between the two regions.
Based on the above analysis, the following definition is made.
Definition 1 Θ is the epidemic disease spread index, which measures the spread speed of the virus from high-risk region j to region i,
where Yj is the GDP of the high-risk region j, Yi is the GDP of region i, and dij is the geographic distance between high-risk region j and region i.
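As an illustration of how an index built from these three quantities could be evaluated, the following sketch uses a gravity-type form, Θ = Yi·Yj/dij. This form is purely an assumption of the sketch and is not taken from Definition 1; it is chosen only because it increases with both GDP values and decreases with distance, which are the qualitative properties stated in the text. The great-circle (haversine) distance, the Earth radius constant, and all function names are likewise assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def spread_index(gdp_i, gdp_j, distance_km):
    """HYPOTHETICAL gravity-type index: Y_i * Y_j / d_ij (not the paper's formula)."""
    if distance_km <= 0:
        raise ValueError("distance must be positive")
    return gdp_i * gdp_j / distance_km


if __name__ == "__main__":
    # Hypothetical normalized GDP values and coordinates, for illustration only.
    d = great_circle_km(40.0, -75.0, 42.0, -71.0)
    print(spread_index(gdp_i=0.4, gdp_j=1.0, distance_km=d))
```

Under this assumed form, doubling the GDP of either region doubles the index, while doubling the distance halves it.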
Fig. 4. Radiation pattern of COVID-19. Arrows indicate the direction of COVID-19 spread, and the thickness of the arrows indicates the extent to which high-risk regions are affecting the area, determined by the epidemic disease spread index. The size of the node represents the value of the GDP of the country or region. The color of the nodes represents the number of confirmed cases in the region.
Fig. 5. The relationship between the epidemic disease spread index and the number of confirmed cases.
Figure 4 shows the cross-regional spread of COVID-19 from high-risk regions to other regions. The direction of an arrow indicates the direction of the virus spread, and its thickness represents the extent to which the high-risk region has affected the target region. It can be seen from Fig. 5 that the epidemic disease spread index is positively correlated with the number of confirmed cases in each region: the larger the index, the greater the number of confirmed cases. When the index of a region is less than 2, the number of confirmed cases increases rapidly as the index increases; when the index is greater than 2, the number of confirmed cases increases slowly. This shows that the degree of infection in a region is affected not only by economic and geographical factors but also by medical conditions, coping measures, and the age structure of the population, which are beyond the scope of this discussion.
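One way to check a relationship of this kind, which is monotone but not linear (fast growth below an index of 2, slow growth above it), is a rank correlation. The sketch below computes the Spearman coefficient in plain Python. The choice of Spearman correlation is made here for illustration and is not stated in the text, and the numbers in the example are synthetic placeholders rather than data from Fig. 5.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Synthetic placeholder values: a saturating, monotone relationship.
    index_values = [0.5, 1.0, 2.0, 4.0, 8.0]
    confirmed_cases = [10, 200, 900, 1000, 1050]
    print(spearman(index_values, confirmed_cases))
```

A rank-based measure is insensitive to the change of slope around an index of 2, so it tests only whether a larger index goes with more confirmed cases.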
In this paper, the spread dynamics of COVID-19 in local regions are simulated by the SEIR model, and the basic reproduction number of COVID-19 in different countries is estimated from time series data of confirmed cases in China, the USA, Brazil, France, and other countries. The results show that COVID-19 is similar to SARS in infectivity, slightly less infectious than the Zika virus and smallpox, and is a moderately transmissible disease. The infectivity of COVID-19 does not differ significantly between countries, so effective protective measures can be used to control the spread. This paper then examines the impact of regional differences on the spread speed when COVID-19 spreads across regions. The analysis shows that when COVID-19 spreads from a high-risk region to other regions, the spread speed is positively correlated with the GDP of the two regions and negatively correlated with the distance between them. In other words, countries and regions with high GDP are more likely to infect each other. Similarly, if a region is closer to a high-risk region, COVID-19 spreads more rapidly in that region. In addition, an epidemic disease spread index based on GDP and geographical distance is defined to measure the spread speed of COVID-19 from high-risk regions to other regions. The reliability of the index is verified by comparing it with the number of confirmed cases: the results show a positive correlation between the epidemic disease spread index and the number of confirmed cases, so a region with a high index also has more confirmed cases than other regions. This finding offers a practical suggestion for epidemic control in countries around the world: when a region has a high epidemic disease spread index, stricter quarantine measures and more medical resources are needed to control the COVID-19 spread.
Acknowledgments
Project supported by the National Natural Science Foundation of China (Grant Nos. 62266030 and 61863025), International S&T Cooperation Projects of Gansu province (Grant No. 144WCGA166), and Longyuan Young Innovation Talents and the Doctoral Foundation of LUT.