LUO Chongliang,FENG Qunqiang,ZHANG Shuguang
(Department of Statistics and Finance,School of Management,University of Science and Technology of China,Hefei 230026,China)
A topological index is a map from the set of chemical compounds to the set of real numbers.Most topological indices are closely correlated with some physico-chemical characteristics of the underlying compounds.Many chemists who derive and use molecular topological indices employ concepts and terminology of graph theory in their work (see,for example,Refs.[1-2]).Here,graphs are generated from molecules by replacing atoms with vertices and bonds with edges.
Many chemical compounds have a tree-like structure.For example,The entire family of alkanes (saturated hydrocarbons) have tree graphs.For a treeTof sizen,Ref.[13] studies the extreme values of the first Zagreb index ofT:If and only ifTis a path onnnodes it reaches the minimum value 4n-6,and just whenTis a star onnnodes it reaches the maximum valuen(n-1).Ref.[14] gets similar results on the second Zagreb index of trees.
The sequel is organized as follows.In Section 1,the Zagreb index and a modified version of it are introduced.Regardless of the probability model,a basic property of binary trees is also mentioned.Section 2 is dedicated to binary search trees.We first set a relation between the Zagreb index and the number of leavesLn,and a stochastic recurrence equation forLn.The recurrence is a transparent tool for the computation of exact moments,which is shown in Section 2.1.And it is also directly amenable to the univariate contraction method,as is demonstrated in Section 2.2.Section 3 is dedicated to Catalan trees,where a known result of the number of leaves is used.The exact mean of the Zagreb index is also given.In Section 4,briefly random digital search trees are discussed.The Gordon-Scantlebury and Platt indices,two other topological indices,are mentioned in passing in Section 5.Finally,a short comment on such three random tree classes via the Zagreb index is given in the last section.
whered(v) denotes the degree of the vertexvinG.
A binary tree is either empty,or has left and right subtrees that are themselves recursively binary trees.Connections of chemistry to random binary trees have been noted,for example in Ref.[15],where they studied molecules with a binary tree structure,where nodes of valence 3 are saturated and the unsaturated nodes have an affinity that is inversely proportional to their valance.
The following lemma plays an important role in our study.
Lemma1.1[16]If a binary tree of sizenhasl(1≤l It is easier to work with a modified Zagreb index that does not get affected by non-root nodes,but rather only requires changes based on the degree of the root.Such a modified index is defined in exactly the same way as the standard index,except that the degreed(r) of the rootrof the whole tree is always enhanced tod(r)+1. (1) A binary search tree is constructed from the permutation (π1,π2,…,πn) of the integer set {1,2,…,n} by a standard search algorithm.The first element of the permutation is inserted in an empty tree as its root node.A subsequent elementπj(withj≥2) is guided to the left subtree ifπj<π1,otherwise it is taken into the right subtree.In the recipient subtreeπjis treated recursively by the same insertion algorithm,until it is inserted in an empty subtree,at which point a node is allocated for it and adjoined appropriately as a left (right) child if its label is less than (more than) that of the last node on the search path. In the probability model induced by random permutations we assume that the tree is built from permutations of {1,2,…,n},where a uniform probability is imposed on the permutations rather than on the trees.When alln! permutations are uniformly random,occurring equally likely,binary search trees are not equally likely.Several permutations give rise to the same tree,favoring shorter balanced trees to tall linear shapes[17].Henceforth the term random binary search tree will refer to a binary search tree built from a random permutation. LetLnbe the number of leaves in a random binary search tree of sizen.From Lemma 1.1,we have (2) We shall use an alternative method to treatLn.Through a distributional equality below,one can compute the explicit expressions for the mean and variance ofLndirectly and get the asymptotic normality for it by the contraction method. LetRnbe the root of a random binary search tree of sizen.ThenRnis distributed uniformly on {1,2,…,n},and the size of the left subtree isRn-1 and that of the right one isn-Rn.We have,forn≥2, (3) The initial values forLnis as follows: L0=0,L1=L2=1; The relation (3) is useful for the direct computation of moments.We shall illustrate the procedure on the mean and variance. Take the difference between a version of the recurrence forE[Ln] from that forE[Ln-1].Observe that because the recurrence (3) is valid forn≥2,we must also taken-1≥2,to get valid forn≥3.Thus, Toward the variance calculation,minusE[Ln] from both sides of the relation (3),square and take expectation.Employing several symmetries we have,forn≥3, Note that the last term in the above equation is 0 ifn≥4,and 2/(9n) ifn=3.Then,applying the initial values Var[L0]=Var[L1]=Var[L2]=0,one can get which gives that Var[L4]=Var[L3]=2/9.Taking the difference between a version of the recurrence forn-1 from that forn,we simplify the recurrence to This recurrence has the solution Applying the explicit expressions ofE[Ln] and Var[Ln] to (2),the process of calculation of the first two moment ofZnis straightforward. Proposition2.1LetZnbe the Zagreb index of a random binary search tree of sizen.Then E[Z0]=0,E[Z1]=1, and ProofIt is easy to check the initial values forE[Zn] and Var[Zn].From the relation (2),ifn≥2, Let1εbe an indicator of the eventεthat assumes the value 1,ifεoccurs,and assumes the value 0 otherwise.Since the root of a binary search tree has only one child only when it is 1 onn,we have by (1) (4) which implies that forn≥2, For the variance ofZn,the relation (4) also gives that forn≥4, or Thus,forn≥4, □ We observe that the distribution ofZnis concentrated around its mean,since the variance is smaller than the square of the mean. Corollary2.1Asn→∞, As expected,a random binary search tree of sizenis much “closer” to a path of lengthn(with the Zagreb index 4n-6) than a star of sizen(with the Zagreb indexn(n-1)),whennis large. In principle,one can compute the higher moments by the methods utilized for the mean and variance,and appeal to the method of recursive moments[21].However,the explosive complexity is forbidding. For the asymptotic distribution ofLnandZn,the contraction method provides a shortcut.The method was crafted by Rösler[22]to deal with the distribution of Quicksort.Ref.[23] added several extensions.Recently general contraction theorems and multivariate extensions have been added by Refs.[24-26].Ref.[27] provides a lucid survey,and Ref.[20] also provides a survey in Chinese.What makes the contraction method appealing for an application like the one we have at hand is that now we have in our possession several general theorems that greatly reduce the effort in applying it. The contraction method used here starts with the recursive distribution equality (3).The rates of growth of the mean and variance ofLncomport with the rates required for an application of Corollary 5.2 in Ref.[25]. Takingf(n)=g(n)=nin Corollary 5.2 of Ref.[25],we see that (Rn-1)/g(n) converges inL3toU,a standard Uniform (0,1) random variable,and (n-Rn)/g(n) converges inL3to 1-U; here the toll function is nothing,which,of course,vanishes inL3,as is required for the convergence (5) Theorem2.1LetZnbe the Zagreb index of a random binary search tree of sizen.Then ProofFrom (2) and (5),we have □ where the toll function h(n)=4·1{Rn∈{1,n}}+9·1{Rn∉{1,n}} is clearlyO(1).It is not hard to see that the conditions in Corollary 5.2 of Ref.[25] are also satisfied. In the Catalan probability model all binary trees of sizenare considered equally likely.It is so called because when research started on binary trees,it was natural to assume a uniform model,and binary trees are counted by Catalan numbers:The numberBnof binary trees of sizenis given by the Catalan number[28]: Using the method of singularity analysis of generating functions,the number of leaves in binary Catalan trees is considered by Ref.[29,Example Ⅲ.14 and Ⅸ.25].We also denote byLnthe number of leaves in a random binary Catalan tree of sizen.LetB(z,u) be a bivariate generating function whereC(n,m) denotes the number of binary Catalan trees of sizen,which contain exactlymleaves.Then (6) from which Ref.[29] shows and By simple algebra with the relation (2),the first two moments and asymptotic normality forZn,the Zagreb index of a random binary Catalan tree of sizen,can be obtained. Theorem3.1LetZnbe the Zagreb index of a random binary Catalan tree of sizen.Then and The structure of the mean and variance of the Zagreb index in random binary Catalan trees is similar to that in the binary search tree case (only the coefficients of linearity are different).So,one also gets a concentration law here,too (cf.Corollary 2.1),by simply altering the specification of the constant. From the explicit formula forB(z,u) one can easily obtain an explicit result forE[Zn].We shall illustrate the procedure on the mean for the sake of completeness.Forn≥1,we have which implies that forn≥1, The relations (2) and (4) thus give that We also letLnandZnbe the numbers of leaves and the Zagreb index of a random digital search tree withnnodes,respectively.From the construction of DST,it is obvious that,forn≥2, (7) For the asymptotic normality for the Zagreb index of a random digital search tree,the contraction method is also applied.We first argue heuristically the existence of a limit for The argument elicits the nature of the limit. (8) Theorem4.1LetZnbe the Zagreb index of a random digital search tree withnnodes.Then and For a simple graphG,the degree of an edge is the number of its adjacent edges.The Gordon-Scantlebury index is defined as the number of paths of length two inG,and Platt index is the total number of the degrees of all edges inG.LetZG,SGandPGbe the Zagreb index,the Gordon-Scantlebury index and the Platt index ofG,according to Ref.[12], ZG=2(SG+E(G)), and PG=2SG, whereE(G) denotes the number of edges inG,Since for a tree ofnnodes,E(G)=n-1,we can directly establish parallel results for the Gordon-Scantlebury and Platt indices of the three kinds of random trees in this paper.We omit the details. The Zagreb index induces the topological property of random trees (graphs).We observed that all the Zagreb indices of the three random binary tree classes that we studied here are close and have asymptotic normality.However,as Tab.1 shows that,on the average,a random binary search tree is a bit closer to a star than a random binary Catalan tree,and is a bit closer to a path than a random digital search tree,when their sizes grow. Tab.1 The first two asymptotic moments of the Zagreb index of a tree with large size n [1] Harary F.Graph Theory[M].2nd ed.Reading,MA:Addison-Wesley,1972. [3] Christophi C,Mahmoud H.The oscillatory distribution of distances in random tries[J].The Annals of Applied Probability,2005,15:1 536-1 564. [4] Devroye L,Neininger R.Distances and finger search in random binary search trees[J].SIAM Journal on Computing,2004,33:647-658. [5] Janson S.The Wiener index of simply generated random trees[J].Random Structures and Algorithms,2003,22:337-358. [6] Janson S,Chassaing P.The center of mass of the ISE and the Wiener index of trees[J].Electronic Communications of Probability,2004,9:178-187. [7] Mahmoud M,Neininger R.Distribution of distances in random binary search trees[J].The Annals of Applied Probability,2002,13:253-276. [8] Neininger R.The Wiener index of random trees[J].Combinatorics,Probability and Computing,2002,11:587-597. [13] Li X,Li Z,Wang L.The inverse problems for some topological indices in combinatorial chemistry[J].Journal of Computational Biology,2003,10:47-55. [14] Lang R,Li X,Zhang S.Inverse problem for Zagreb index of molecular Graph[J].Applied Mathematics A Journal of Chinese Universities,2003,18:487-493. [15] Quintas L,Szymanski J.Nonuniform random recursive trees with bounded degree[J].Sets,Graphs,and Numbers:Colloquia Mathematica Societas Janos Bolyai,1992,60:611-620. [16] Siltaneva J,Mäkinen E.A note on the expected distribution of degrees in random binary trees[J].ACM SIGCSE Bulletin,2000,32:32-33. [17] Mahmoud H.Evolution of random search trees[J].New York:Wiley,1992. [18] Devroye L.Limit laws for local counters in random binary search trees[J].Random Structures and Algorithms,1991,2:303-315. [19] Mahmoud H.Pólya urn models[M].New York:Chapman-Hall,2008. [20] Su C,Feng Q,Liu J.Modern Limit Theory and Its Applications to Random Structures[M].Beijing:Higher Education Press,2010.(In Chinese) [21] Chern H,Hwang H,Tsai T.An asymptotic theory for Cauchy-Euler differential equations with applications to the analysis of algorithms[J].Journal of Algorithms,2002,44:177-225. [22] Rösler U.A limit theorem for “Quicksort”[J].RAIRO Informatiques Theoriques et Applications,1991,25:85-100. [23] Rachev S,Rüschendorf L.Probability metrics and recursive algorithms[J].Advances in Applied Probability,1995,27:770-799. [24] Neininger R.On a multivariate contraction method for random recursive structures with applications to Quicksort[J].Random Structures Algorithms,2001,19:498-524. [25] Neininger R,Rüschendorf L.A general limit theorem for recursive algorithms and combinatorial structures[J].The Annals of Applied Probability,2004,14:378-418. [26] Rösler U.On the analysis of stochastic divide and conquer algorithms[J].Algorithmica,2001,29:238-261. [27] Rösler U,Rüschendorf L.The contraction method for recursive algorithms[J].Algorithmica,2001,29:3-33. [28] Knuth D.The Art of Computer Programming,Vol.3:Sorting and Searching[M].2nd ed.Reading,MA:Addison-Wesley,1998. [29] Flajolet P,Sedgewick R.Analytic Combinatorics[M].Cambridge:Cambridge University Press,2009. [30] Coffman E G Jr,Eve J.File structures using hashing functions[J].Commun ACM,1970,13:427-432. [31] Flajolet P,Sedgewick R.Digital search trees revisited[J].SIAM Journal on Computing,1986,15:748-767. [32] Devillersand J,Balaban A T.Topological Indices and Related Descriptors in QSAR and QSPR[M].Amsterdam:Gordon & Breach,1999.2 Random binary search trees
2.1 The moments
2.2 Asymptotic normality
3 Random binary Catalan trees
4 Random digital search trees
5 The Gordon-Scantlebury and Platt indices
6 Conclusion