Diego García-Zamora, Álvaro Labella, Weiping Ding, Rosa M. Rodríguez, and Luis Martínez
Abstract—Society in the era of digital transformation demands new decision schemes, such as e-democracy or decisions based on social media. Such novel decision schemes require the participation of many experts, decision makers, or stakeholders in the decision process. As a result, large-scale group decision making (LSGDM) has attracted the attention of many researchers in the last decade, and many studies have been conducted to face the challenges associated with the topic. This paper therefore aims at reviewing the most relevant studies on LSGDM, identifying the most fruitful research trends, and analyzing them from a critical point of view. To do so, the Web of Science database was consulted using different searches. These searches returned a total of 241 contributions, to which a selection process regarding language, type of contribution, and actual relation with the studied topic was then applied. The 87 contributions finally selected for this review have been analyzed from four points of view that are prominent in the topic: the preference structure in which decision-makers’ opinions are modeled, the group decision rules used to define the decision-making process, the techniques applied to verify the quality of these models, and their application to solving real-world problems. Afterwards, a critical analysis of the main limitations of the existing proposals is developed. Finally, taking these limitations into account, new research lines for LSGDM are proposed and the main challenges are highlighted.
These days, decision-making processes entirely guided by data and quantitative modeling are widely used, and the participation of human experts, who usually manage qualitative information, is either ignored or relegated to second place [1], [2]. However, considering the commitment, cost and relevance of human stakeholders in economic, social or learning research, the use of expert-guided decision-making methods, known as group decision making (GDM) models in the specialized literature, is still essential in several areas [3], [4], especially when agreed solutions are required [5], [6].
On the other hand, the digitization era and the application of novel technologies to all human tasks have implied a transition towards new ways of solving real-world problems. In the decision-making field, GDM problems have evolved from involving a few decision-makers (DMs) in the solving process to involving many of them, giving rise to large-scale group decision making (LSGDM) [7], [8]. E-democracy technologies [9], [10], e-marketplaces [11], [12], social media [13], earthquake shelter selection [14] or water resource management [15], [16] are just a few examples of new decision-making situations involving an increasing number of DMs, which has made LSGDM an important topic in recent years [17]. This emergence has caused multiple changes and challenges in the approach to solving these new types of GDM problems. Initially, four research trends were pointed out and mainly developed in LSGDM [18]:
1) Clustering Methods in LSGDM: Dimension reduction plays a key role in the resolution of LSGDM problems, since managing large decision groups may be difficult or even impossible because of resource limitations. Clustering large groups into smaller and more manageable subgroups, usually based on the similarity between DMs’ preferences, has been used as a satisfactory solution to this dimension reduction issue [18], [19].
2) Large-Scale Consensus Reaching Processes (LSCRPs): Conflicting and polarized opinions are even more common in LSGDM than in classical GDM due to the participation of many DMs. If the conflicts are not addressed, the decision process may fail and negatively affect society. LSCRPs are applied to smooth out disagreements and increase the level of agreement in the group [18], [20], [21].
3) LSGDM Methods: The resolution of decision problems is usually carried out by ad hoc decision methods. These methods aim, in general terms, at obtaining a ranking of the alternatives of the problem by applying a set of algorithmic or optimization steps. LSGDM methods introduce new features with respect to the classical decision methods for GDM in order to meet the new challenges posed by large-scale problems [22], [23].
4) LSGDM Support Systems: The enormous complexity associated with LSGDM problems makes them difficult for DMs to solve. LSGDM support systems are software tools that aim at helping DMs throughout the decision process by providing additional information and reducing the uncertainty related to LSGDM problems [24], [25].
The recent impact of LSGDM in the specialized literature has given rise to many proposals, which have been reviewed by several authors. For instance, Labella et al. [20] analyze the performance in LSGDM problems of classical consensus models designed for GDM processes with a few experts, concluding that these models are not able to deal with the challenges related to the large-scale context. Zhang et al. [26] review the consensus models with feedback mechanisms based on minimum adjustments proposed in the literature for two different contexts: classical GDM problems and complex GDM problems, which include large-scale contexts. Ding et al. [17] develop a taxonomy for the existing literature and discuss future research directions from a perspective based on Artificial Intelligence, whereas Tang and Liao [27] analyze the state of the art from a Big Data point of view.
However, although these reviews propose some classifications for the existing literature from different points of view, the necessary critical analysis of the existing literature is usually neglected, which has implied a deviation from the original purpose of the topic, namely applying decision models to groups with a huge number of stakeholders. For instance, the classical definition of LSGDM itself (GDM with more than 20 DMs [27]) may be inadequate for current societal demands because it assumes that just 20 DMs form a large group, whereas nowadays real-world decision situations may require much bigger groups (for instance, the Netflix recommendation system deals with more than 200 million users). In this regard, it is usual to find many LSGDM proposals in the literature which test the performance of their methods by using toy examples in which just 20-50 DMs are considered. Undoubtedly, this is a prominent source of papers, but it is far from solving real-world problems, which should be the main goal of research in a purely applied area like this.
Hence, the main motivation of this survey is to analyze the current state of the art about the existing trends obtained from our literature analysis, but also to provide a comprehensive view about LSGDM and a critical discussion about the main limitations of present proposals, in order to redirect current research towards new trends which face the real world needs demanded by large-scale contexts.
Therefore, this contribution is devoted to answering the following research questions:
Q1: What are the most relevant studies addressing LSGDM?
Q2: What is the current state-of-the-art regarding LSGDM?
Q3: What are the limitations of the current contributions?
Q4: What are the most promising new trends in LSGDM for future research?
Consequently, the four major contributions of this proposal are summarized as follows. First, a systematic review of the current state of the art of LSGDM is performed in order to point out the most relevant papers and trends in the area, which are then studied from different points of view according to the different steps that make up a classical LSGDM process, namely i) the preference structure used to model DMs’ opinions, ii) the internal group decision rules used to model the decision process, iii) the mechanisms to evaluate the quality of the proposed model and, finally, iv) the application of the models to solve real-world LSGDM problems. Subsequently, a deep critical analysis based on these four perspectives is provided regarding the way in which researchers have so far developed their methods and faced the different challenges posed by LSGDM problems. Finally, future research lines on LSGDM are discussed, keeping this critique in mind and pointing out how to overcome the identified limitations.
The remainder of this contribution is organized as follows. In Section II, the main concepts related to LSGDM are introduced. Section III describes the search process adopted to identify relevant studies on the topic. Section IV introduces the results obtained from the search process related to LSGDM. Afterwards, Section V presents a critical view of the current research on LSGDM. Additionally, Section VI provides a discussion about the future challenges and trends in LSGDM. Finally, Section VII draws some conclusions.
A GDM problem is a decision situation in which several DMs are required to decide on one or several alternatives as the solution to a given problem [28], [29]. Formally, such problems are modeled by a pair (D, X), in which D is a finite set of DMs who are asked to judge a finite set of alternatives X with the aim of choosing the best solution for the problem. Traditionally, the resolution process of these problems mainly consists of two steps [30] (see Fig. 1):
1) Aggregation: The DMs’ preferences are fused, by using an aggregation operator, into a single collective preference that represents the overall group’s opinion.
2) Exploitation: One or several alternatives are selected as the solution of the problem.
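As a minimal illustration of this two-step scheme, the following Python sketch assumes that each DM scores every alternative with a utility value in [0, 1] and that the aggregation operator is a weighted arithmetic mean. Both choices, as well as the function names `aggregate` and `exploit`, are assumptions made for illustration and are not prescribed by [30].

```python
def aggregate(preferences, weights=None):
    """Fuse one utility vector per DM into a single collective vector.

    Assumption: the aggregation operator is a weighted arithmetic mean;
    equal weights are used when none are given.
    """
    n_dms = len(preferences)
    if weights is None:
        weights = [1.0 / n_dms] * n_dms
    n_alts = len(preferences[0])
    return [sum(w * p[j] for w, p in zip(weights, preferences))
            for j in range(n_alts)]


def exploit(collective):
    """Return the alternative indices ordered from best to worst."""
    return sorted(range(len(collective)), key=lambda j: -collective[j])


# Three DMs (rows) assess three alternatives (columns).
prefs = [[0.9, 0.4, 0.1],
         [0.6, 0.7, 0.2],
         [0.8, 0.3, 0.5]]
collective = aggregate(prefs)
ranking = exploit(collective)
```

Any other aggregation operator can be swapped in without changing the exploitation step, which only needs the collective vector.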
Formerly, Butler and Rothstein [31] introduced several rules to guide the resolution process, such as majority, minority, or Borda count. However, when using these kinds of rules, some DMs may not feel satisfied with the chosen solution because their opinions may not have been sufficiently considered in the final collective choice. To deal with these discrepancies among DMs’ opinions, CRPs were added as an additional phase in the resolution process of a GDM problem. A CRP is a dynamic and iterative process in which DMs discuss with each other and change their initial opinions in order to bring their views closer and increase the agreement within the group. These processes are usually supervised by a moderator, who is responsible for providing DMs with proper feedback about the state of the negotiation. In broad terms, a CRP consists of four steps [32] (see Fig. 2):
Fig. 1. Scheme of a GDM problem.
Fig. 2. Scheme of a CRP.
1) Gathering Preferences: DMs provide their assessments over the alternatives by using preference structures.
2) Consensus Measuring: The current level of agreement in the group is derived by using consensus measures [33].
3) Consensus Control: The current level of agreement is compared with a predefined desired level of consensus for the group. If the group achieves such a desired level, the CRP finishes and the process to select the best alternative starts; otherwise, another consensus round is carried out. In order to avoid endless processes, the number of rounds is limited.
4) Feedback Generation: The moderator identifies the DMs whose opinions are the furthest away from the group and recommends that they change them.
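The four steps above can be sketched as a simple loop. The following Python sketch is illustrative only: it assumes utility vectors as preference structure, the arithmetic mean as collective opinion, a consensus measure equal to one minus the mean absolute distance to that collective opinion, and a DM who always accepts the moderator's advice by moving a fraction `step` towards the group. None of these choices, nor the names `run_crp`, `threshold`, `max_rounds` and `step`, come from [32] or [33].

```python
def collective_opinion(prefs):
    """Arithmetic mean of the DMs' utility vectors (assumed operator)."""
    n = len(prefs)
    return [sum(p[j] for p in prefs) / n for j in range(len(prefs[0]))]


def distance(p, q):
    """Mean absolute difference between two utility vectors."""
    return sum(abs(a - b) for a, b in zip(p, q)) / len(p)


def consensus_level(prefs):
    """Consensus measuring: 1 minus the mean distance to the collective opinion."""
    c = collective_opinion(prefs)
    return 1.0 - sum(distance(p, c) for p in prefs) / len(prefs)


def run_crp(prefs, threshold=0.9, max_rounds=10, step=0.5):
    """Return (final preferences, consensus level, feedback rounds used)."""
    prefs = [list(p) for p in prefs]              # gathering preferences
    for round_no in range(max_rounds + 1):
        level = consensus_level(prefs)            # consensus measuring
        # Consensus control: stop on agreement or when rounds are exhausted.
        if level >= threshold or round_no == max_rounds:
            return prefs, level, round_no
        # Feedback generation: advise the DM furthest from the group.
        c = collective_opinion(prefs)
        far = max(range(len(prefs)), key=lambda i: distance(prefs[i], c))
        prefs[far] = [a + step * (b - a) for a, b in zip(prefs[far], c)]


initial = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
final_prefs, level, rounds = run_crp(initial)
```

In a real CRP the DMs may reject or only partially follow the recommendations, so the deterministic update used here is a simplification.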
Classically, GDM problems and their CRPs have considered just a small number of DMs. However, new technological advances such as Big Data [34] or e-commerce [35], together with emerging societal demands to deal with problems like emergency situations [36] or sustainability [37], have given rise to new large-scale contexts requiring the participation of more DMs in the decision process, which has attracted the attention of many researchers. In this context, LSGDM has arisen to denote those GDM problems in which 20 or more DMs take part in the decision process [17].
According to Tang and Liao [27] and Labella et al. [20], the involvement of numerous DMs with different views and preferences inevitably requires considering new aspects in the general resolution scheme of GDM problems (see Fig. 3):
1) Dimension Reduction: These models usually include mechanisms to manage the large amount of information.
2) Weighting and Aggregation of Information: The importance of the DMs participating in the process must be properly determined and their opinions fused.
3) Behavior Management: A mechanism to detect and manage uncooperative DMs should be considered to prevent these DMs from harming the decision process.
4) Cost Management: The human, economic and time resources required for developing models which aim at managing hundreds, thousands, or millions of DMs must be taken into account.
Fig. 3. Scheme of a LSGDM problem.
Fig. 4. Study selection process.
5) Social Network Analysis (SNA): When large groups are considered, it is necessary to take into account how the relationships among DMs (trust or reputation) influence the decision process.
6) Consensus: The larger the number of DMs, the greater the probability of disagreement. Therefore, new consensus models dealing with large groups are key to reaching agreed solutions.
Consequently, LSGDM inherits two phases of the classic scheme of GDM (see Fig. 1), namely the gathering of preferences and the exploitation phase, but the aggregation phase becomes much more complex because several of the aforementioned aspects may be taken into account in the fusion of the original values of DMs’ opinions. The combination of these techniques allows proposing a huge variety of LSGDM schemes. Some of the most relevant ones are listed as follows:
● Palomares et al. [7] and Dong et al. [38] propose consensus models which take into account the management of uncooperative behaviors and use dimension reduction to weight and aggregate the information.
● Zhang et al. [8] deal with multi-attribute LSGDM problems by using a linguistic aggregation process.
● Xu et al. [14], and Wu and Xu [39] introduce consensus models which also perform a dimension reduction to weight and aggregate the original preferences.
● Liu et al. [40] develop a consensus model which includes mechanisms to control the cost of moving DMs’ opinions and uses SNA to derive the importance of the DMs.
● Lu et al. [41] present a CRP which combines SNA and clustering to perform the dimension reduction and determine the influence of DMs in a decision process which also takes into account the cost of moving DMs’ preferences.
● Shi et al. [42] apply behavior and cost management techniques in a consensus model, which also performs a dimension reduction with adaptive weights.
This study aims at reviewing the main concepts regarding LSGDM, by showing the relations among them and their future perspectives. To do so, the guidelines proposed by Kitchenham and Charters [43] to develop a systematic review in Software Engineering have been taken into consideration and adapted to our topic.
To obtain the documents that make up the state of the art of LSGDM, we selected the WoS database as data source because it is arguably the most prestigious scientific bibliographic database. Even though others such as Scopus are also relevant, we chose WoS because, when comparing the results from both databases, the extra results obtained from Scopus were marginal with regard to our aim. Our search strategy consisted of performing two different queries. In the first one, the keywords “Large-scale” and “Group Decision Making” were used as topic, whereas the second one used the keywords “Group Decision Making” as topic and also required the words “Large-scale” to appear in the title of the papers. As a result of these searches, which were done on 29th April 2021, a collection of 241 papers was found.
After that, a study selection process (see Fig. 4) was carried out in order to discard non-relevant proposals. The contributions which either were written in a language other than English or were not published in peer-reviewed indexed journals were excluded. In addition, we also discarded those contributions unrelated to the topic, as well as those which evaluated the quality of the proposed models by using examples whose number of DMs does not qualify them as LSGDM problems at all. The number of papers which passed this filter was 87.
These 87 contributions have been published in 33 different journals, most of them belonging to the Computer Science & Artificial Intelligence category. The journals in which most contributions have been published are Knowledge-Based Systems (12), Information Fusion (11), IEEE Transactions on Fuzzy Systems (8) and Information Sciences (7). Fig. 5 illustrates the complete journal distribution of the reviewed papers. Table I shows the 5 most highly cited papers found in our search. The temporal distribution of the reviewed papers is shown in Fig. 6. The first proposal in our database related to LSGDM, according to the interpretation adopted in this review, was published in 2011. In that contribution, Carvalho et al. [24] proposed a decision support system for LSGDM contexts and defined “large groups” as those groups with 10-20 individuals. In the subsequent years, just a few contributions were published until 2017, and the majority of papers have been published between 2018 and 2021, making LSGDM a hot topic in recent years.
Fig. 5. Journal distribution of the reviewed contributions.
In order to identify the most relevant keywords in the topic for the data extraction process, a bibliography visualization tool has been applied to our database. On the one hand, Fig. 7 shows the main keywords used in the selected LSGDM literature and their connections so that the size of each node represents its occurrence.
These keywords are also classified into several colored categories so that those with a closer connection are represented with the same color. This figure makes it easy to identify the most prolific trends related to LSGDM. For instance, it can be seen that consensus is one of the most important research lines within LSGDM and also one of the trends with the most links to other keywords, such as feedback mechanism or consensus level. In addition, this figure also shows some keywords involving weighting and dimension reduction techniques, such as cluster or clustering method and social networks. The distinction between the terms expert and DM should be highlighted, because the former is related to GDM, whereas the latter is more related to LSGDM. The proper use of these terms may be key to differentiating GDM from LSGDM, though many researchers use both interchangeably.
On the other hand, Fig. 8 shows the mean publication year for several key topics. According to this figure, the most recent interest in LSGDM seems to be the validity and quality of the proposed models, related to terms such as comparative analysis, feasibility or validity.
In addition to this automatized review of keywords, a manual abstract analysis was performed in order to provide a more comprehensive view of the current state of the art. From this manual research, we have identified some other keywords which have been used as complement to the ones obtained in the automatized search. Finally, to synthesize all the information, the resulting list of keywords has been organized in four blocks (see Fig. 9) according to the step of the model resolution process that these keywords belong to:
1) Preference Structures: This block includes the keywords related to the modelling of DMs’ preferences and their characteristics.
2) Group Decision Rules: This block is related to the different formal processes applied to solve an LSGDM problem.
3) Evaluation of Quality: This block groups those keywords regarding the measurement of the quality and validity of the proposed LSGDM approaches by means of metrics, comparative analysis or the use of datasets.
4) Application to Real-World Contexts: The keywords in this block deal with the applicability of the proposed LSGDM approaches to real-world LSGDM situations and the use of LSGDM support systems.
This section analyzes the main research trends related to LSGDM to provide a clear view of the topic by providing a taxonomy of the studied contributions according to the aforementioned four points of view, namely Preference Structure, Group Decision Rules, Evaluation of Quality and Real World Problems (see Fig. 9). To do so, each block is first introduced by providing a detailed description of its main specificities, and then the 87 contributions obtained from our search in the WoS database are classified according to these four points of view. By using the results obtained in this section, a critical analysis of the studied contributions and several possible new research trends will be provided in Sections V and VI, respectively.
This block classifies the studied contributions according to the way in which the information is elicited from DMs [44]. The concept of Preference Structure, in decision making in general and in LSGDM in particular, refers to the format in which DMs give their opinions. DMs could be asked to provide their opinions by following different formal rules which, in turn, give rise to several preference structures. Since the chosen preference structure determines the nature of the input of any GDM model, it is key to properly select these structures according to the decision situation at hand.
There are several relevant features related to preference structures to keep in mind:
1) Type of Information: One of the most relevant features of preference structures is the type of information in which DMs are allowed to give their opinions, which may be of different natures. The analyzed proposals from the 87 papers essentially use three types of information when modelling DMs’ preferences, namely numeric, linguistic and heterogeneous:
TABLE I HIGHLY CITED PAPERS
Fig. 6. Temporal distribution of the reviewed contributions (April 2021).
Fig. 7. Keywords related to LSGDM.
Fig. 8. Mean per year publications related to LSGDM keywords.
i) Numeric: Some proposals consider that the information is given numerically by using preference structures such as fuzzy preference relations (FPRs) [45], multiplicative preference relations (MPRs) [46], hesitant fuzzy preference relations (HFPRs) and so on. Apart from these, there are other numeric structures such as preference orderings [47] or utility functions [48].
ii) Linguistic: Other contributions allow DMs to express their preferences by using linguistic information, which is very useful to model the uncertainty inherent in LSGDM problems due to their complexity. In this sense, there are many types of linguistic preference structures, such as linguistic preference relations (LPRs) or hesitant fuzzy linguistic preference relations (HFLPRs), whose elements are represented by linguistic terms belonging to a predefined linguistic term set.
iii) Heterogeneous: Finally, some papers consider situations in which DMs may provide their opinions by using different types of preference structures, numeric or linguistic. By using heterogeneous information, each DM may use the most suitable preference structure according to her/his needs, which provides more flexibility to the elicitation task.
2) Personalized Semantics: On the other hand, especially in large-scale contexts, the DMs participating in the decision may have different backgrounds or use different scales to express their preferences. Therefore, an interesting research area related to preference structures is the management of DMs’ personalized individual semantics, which deals with the different knowledge and subjectivity of DMs when expressing their opinions [49]-[51].
Fig. 9. Found keywords classified according to their relations.
3) Consistency: Another research line studies the consistency of the DMs’ preferences [22], [52]-[54], since sometimes the information provided by these DMs may be contradictory and lead to unreliable results.
4) Incomplete Information: The last identified research line focuses on dealing with incomplete preference structures, since limited knowledge about the alternatives or time pressure could lead to circumstances in which DMs may not provide all the necessary preference values [53], [55]-[57].
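To make the notions of consistency and incompleteness more concrete, the following Python sketch works on an FPR stored as a square matrix. It relies on the definitions commonly used for FPRs, which are stated here as assumptions because the text above does not spell them out: values lie in [0, 1], additive reciprocity requires p_ij + p_ji = 1, and additive transitivity requires p_ik = p_ij + p_jk - 0.5. Missing values are encoded as `None`, and all function names are hypothetical.

```python
from itertools import permutations


def is_reciprocal(fpr, tol=1e-9):
    """Check p_ij + p_ji = 1 for every known pair of entries."""
    n = len(fpr)
    for i in range(n):
        for j in range(n):
            a, b = fpr[i][j], fpr[j][i]
            if a is None or b is None:
                continue  # unknown values cannot violate reciprocity
            if abs(a + b - 1.0) > tol:
                return False
    return True


def inconsistency(fpr):
    """Mean absolute violation of p_ik = p_ij + p_jk - 0.5 over known triples."""
    errors = []
    for i, j, k in permutations(range(len(fpr)), 3):
        p_ij, p_jk, p_ik = fpr[i][j], fpr[j][k], fpr[i][k]
        if p_ij is None or p_jk is None or p_ik is None:
            continue
        errors.append(abs(p_ij + p_jk - 0.5 - p_ik))
    return sum(errors) / len(errors) if errors else 0.0


def missing_entries(fpr):
    """List the (row, column) positions the DM did not provide."""
    return [(i, j) for i, row in enumerate(fpr)
            for j, v in enumerate(row) if v is None]


consistent = [[0.50, 0.65, 0.80],
              [0.35, 0.50, 0.65],
              [0.20, 0.35, 0.50]]
contradictory = [[0.5, 0.9, 0.1],   # x1 over x2 and x2 over x3, yet x3 over x1
                 [0.1, 0.5, 0.9],
                 [0.9, 0.1, 0.5]]
incomplete = [[0.50, 0.65, None],
              [0.35, 0.50, 0.65],
              [0.20, 0.35, 0.50]]
```

A threshold on `inconsistency` could be used to flag DMs whose preferences need revision, although where to set it depends on the model.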
Table II classifies the reviewed contributions according to the type of preference structures used to model the DMs’ preferences and their type of information, and Table III shows the acronyms of such preference structures.
The Group Decision Rules block analyzes the contributions according to the internal workings of the models. Initially, GDM was based on using certain classic rules [31], such as the Majority Rule, Borda Count, or Unanimity, in order to fuse the individual preferences of the respective DMs into one single collective opinion. Nowadays, these few methods have evolved into many rules, which provide several frameworks to achieve the same goal. These rules cover a wide spectrum of possibilities, such as methods to reach agreed solutions obtained by simulating a discussion process or proposals which evaluate alternatives taking into consideration different conflicting criteria. Therefore, when designing an LSGDM model, it is essential to carefully select these rules according to the needs of the problem at hand.
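As a small illustration of two of these classic rules, the Python sketch below applies the majority rule and the Borda count to preference orderings. It assumes that every DM ranks all alternatives without ties and uses the usual Borda scoring of n - 1 points for the first position down to 0 for the last; the function names are hypothetical and the example data are invented.

```python
from collections import Counter


def majority_winner(rankings):
    """Alternative ranked first by more than half of the DMs, else None."""
    firsts = Counter(r[0] for r in rankings)
    alt, votes = firsts.most_common(1)[0]
    return alt if votes > len(rankings) / 2 else None


def borda_scores(rankings):
    """Sum of positional points: n - 1 for first place, 0 for last place."""
    scores = Counter()
    for r in rankings:
        n = len(r)
        for pos, alt in enumerate(r):
            scores[alt] += n - 1 - pos
    return dict(scores)


# Five DMs rank three alternatives from most to least preferred.
rankings = [["a", "b", "c"],
            ["a", "c", "b"],
            ["b", "c", "a"],
            ["c", "b", "a"],
            ["b", "a", "c"]]
scores = borda_scores(rankings)
borda_winner = max(scores, key=scores.get)
```

In this example no alternative obtains a strict majority of first places, while the Borda count still produces a single winner, which shows how different rules can lead to different collective outcomes.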
1) Aggregation Operators: The importance of selecting an adequate aggregation operator cannot be neglected [105], since the main differences among GDM models are usually related to the way in which the information is combined. In spite of this, just a few articles in our database [44], [106] focus exclusively on proposing new aggregation operators for large-scale contexts.
2) Multi-Criterion Decision Making: The analysis of the proposals in our database reveals that the use of classic multi-criterion group decision methods to solve LSGDM problems is widespread. Among these approaches, one of the most common is the technique for order of preference by similarity to ideal solution (TOPSIS) [51], [57], based on the idea that the chosen alternative for a decision problem should have the shortest geometric distance to the ideal solution and the largest geometric distance to the anti-ideal solution. The ideal solution is the one that maximizes benefit criteria and minimizes cost criteria, and the anti-ideal solution is the one that maximizes cost criteria and minimizes benefit criteria. There are also approaches that use the multiobjective optimization on the basis of a ratio analysis plus the full multiplicative form (MULTIMOORA) [37], which obtains a final ranking by aggregating the results of three ranking methods (the Ratio System, the Reference Point approach and the Full Multiplicative Form), or the ELimination Et Choix Traduisant la REalité (ELECTRE) III [89], an outranking method based on pairwise comparisons (every option is compared to all other options) which is able to provide a total or partial order of the alternatives by using pseudo-criteria and outranking degrees.
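The distance-based idea behind TOPSIS can be sketched in a few lines of Python. The version below follows the textbook formulation, which is an assumption here since the reviewed papers [51], [57] use their own variants: columns are vector-normalized, multiplied by criterion weights, and each alternative receives a closeness score d⁻ / (d⁺ + d⁻), where d⁺ and d⁻ are its Euclidean distances to the ideal and anti-ideal solutions. The decision matrix, weights and the name `topsis` are illustrative.

```python
import math


def topsis(matrix, weights, benefit):
    """Return one closeness score per alternative (higher is better).

    matrix[i][j]: value of alternative i on criterion j.
    benefit[j]:   True for benefit criteria, False for cost criteria.
    Assumes no criterion column is all zeros and alternatives are not all equal.
    """
    n_alt, n_crit = len(matrix), len(matrix[0])
    # Vector normalization of each criterion column, then weighting.
    norms = [math.sqrt(sum(matrix[i][j] ** 2 for i in range(n_alt)))
             for j in range(n_crit)]
    v = [[weights[j] * matrix[i][j] / norms[j] for j in range(n_crit)]
         for i in range(n_alt)]
    cols = list(zip(*v))
    # Ideal: best value per criterion; anti-ideal: worst value per criterion.
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_plus = math.dist(row, ideal)    # distance to the ideal solution
        d_minus = math.dist(row, anti)    # distance to the anti-ideal solution
        scores.append(d_minus / (d_plus + d_minus))
    return scores


# Three alternatives, two benefit criteria and one cost criterion.
matrix = [[7, 9, 300],
          [8, 7, 200],
          [9, 6, 400]]
scores = topsis(matrix, weights=[0.4, 0.3, 0.3], benefit=[True, True, False])
best = max(range(len(scores)), key=scores.__getitem__)
```

In a group setting, the decision matrix would typically be the result of aggregating the DMs' individual assessments beforehand.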
TABLE II PROPOSALS CLASSIFIED ACCORDING TO THE PREFERENCE STRUCTURE USED IN LSGDM
TABLE III ACRONYM FOR THE IDENTIFIED PREFERENCE STRUCTURES USED IN LSGDM
3) Weighting and Dimension Reduction Techniques: In large-scale contexts, it is essential to be able to manage thousands of DMs’ opinions at the same time to achieve a solution. Therefore, it is necessary to use dimension reduction techniques to reduce the resource consumption or specific weighting processes to determine the importance of each DM. Several dimension reduction techniques have been identified in the analyzed contributions:
i) Clustering: This technique consists of reducing the dimension of the group of DMs by grouping those with similar preferences into the same subgroups or clusters. In the literature, we can find well-known clustering methods such as fuzzy C-means [7], [15] or K-means [39], [41], but also other novel clustering methods such as grey clustering, fuzzy equivalence and other techniques.
ii) SNA: Another widely accepted method is the use of tools to reduce the data sparsity related to DMs’ preferences through SNA techniques [40], [60]. These kinds of proposals are based on graph theory and allow weighting DMs by taking into account human factors such as the trust relations among them.
iii) Clustering and SNA: Some proposals combine clustering and SNA to produce several independent subnetworks of DMs according to the relations among them [86], [93].
iv) Others: Besides clustering and SNA, it is possible to find other weighting and dimension reduction techniques in the literature, which are usually based on mathematical programming [57], [73].
The main contributions related to weighting and dimension reduction techniques are shown in Table IV.
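As an illustration of clustering-based dimension reduction, the following self-contained Python sketch groups DMs with K-means and then represents each subgroup by its centroid. Several choices are assumptions made only for the example: preferences are utility vectors, the centroids are initialized with the first k DMs so that the run is deterministic, and each cluster is weighted by its relative size when the collective opinion is computed. The reviewed proposals [39], [41] may use different initializations and weighting schemes.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two preference vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100):
    """Plain K-means; returns (cluster label per DM, centroids)."""
    centroids = [list(p) for p in points[:k]]  # assumed deterministic init
    labels = None
    for _ in range(iters):
        # Assignment step: each DM joins the closest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def group_opinion(labels, centroids):
    """Fuse subgroup centroids, weighting each cluster by its relative size."""
    n = len(labels)
    weights = [labels.count(c) / n for c in range(len(centroids))]
    return [sum(w * cen[j] for w, cen in zip(weights, centroids))
            for j in range(len(centroids[0]))]


# Five DMs assessing two alternatives; two opposing subgroups are visible.
dms = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2], [0.2, 0.8], [0.85, 0.15]]
labels, centroids = kmeans(dms, k=2)
opinion = group_opinion(labels, centroids)
```

After this step, the decision process operates on k subgroup opinions instead of on every individual DM, which is where the reduction in resource consumption comes from.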
4) Consensus Models: Some real-world situations require an agreement among a large number of DMs. Traditionally, researchers have faced these situations by proposing consensus models for a few DMs in GDM. However, these models have proven to be inappropriate for LSGDM problems [20] because of the peculiarities of these contexts. The main consensus models identified in the analyzed proposals are shown in Table V.
i) Feedback: Even though classical consensus models rely on a moderator to analyze the state of the consensus process and provide recommendations to the DMs, in contexts in which hundreds or thousands of DMs take part both the moderator figure and feedback mechanisms [16], [18], [39], [42] are no longer viable, because they are too time-consuming to be feasible in practice. Therefore, large-scale consensus proposals aim to replace both with automatic mechanisms that provide recommendations and analyze the level of consensus achieved. The use of mathematical optimization techniques for this purpose is widespread in the literature.
ii) Behavior management: The large number of DMs in large-scale contexts increases the probability of dealing with DMs who refuse to adjust or change their preferences. For this reason, it is necessary to include mechanisms to handle these uncooperative behaviors in order to prevent the failure of the consensus process.
TABLE IV MAIN PROPOSALS ACCORDING TO THEIR WEIGHTING AND DIMENSION REDUCTION TECHNIQUES
TABLE V MAIN CONSENSUS MODELS
iii) Cost: Cost refers to the price (economic or attitudinal) of changing DMs’ opinions [40]-[42]. For instance, some widely used consensus models are the so-called minimum cost consensus models [26], whose aim is to provide a feasible consensual solution by changing the initial DMs’ opinions as little as possible.
iv) Minority opinions: The formation of large coalitions in large-scale contexts may cause minority opinions, which are just as valid as those of the majority, to be ignored. Even though these differing opinions are often regarded as obstacles to decision making, several proposals study how to properly manage the importance given to these minority opinions [21], [84], [86].
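For reference, a minimum cost consensus model of the kind mentioned above is often written as the following optimization problem. The notation is an assumption introduced here and is not taken from [26]: o_i is the original opinion of the i-th DM, \bar{o}_i the adjusted one, c_i the unit cost of moving that opinion, w_i the weight of the DM, \bar{o} the collective opinion, \varepsilon the maximum admissible deviation from it, and m the number of DMs.

```latex
\begin{aligned}
\min_{\bar{o}_1,\dots,\bar{o}_m} \quad & \sum_{i=1}^{m} c_i \,\lvert \bar{o}_i - o_i \rvert \\
\text{s.t.} \quad & \lvert \bar{o}_i - \bar{o} \rvert \le \varepsilon, \qquad i = 1,\dots,m, \\
& \bar{o} = \sum_{i=1}^{m} w_i \,\bar{o}_i .
\end{aligned}
```

The objective captures the idea of changing the initial opinions as little as possible, while the constraints guarantee that every adjusted opinion lies within \varepsilon of the collective one. Concrete proposals differ in the distance, the aggregation operator and the consensus constraint they adopt.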
5) Optimization Models: Due to their flexibility, mathematical programming techniques are also quite popular among researchers. Therefore, it is usual to find models which rely on optimization to complete missing information [53], manage individual semantics [49], translate preference structures [15], determine weights [52], [54], [57], define groups [73], support SNA [97], or build consensus models [22], [36], [41].
After designing the rules which define an LSGDM method, it is necessary to test the feasibility of the proposal when dealing with a specific decision problem. Consequently, the Evaluation of Quality block studies the reviewed papers according to the mechanisms used by researchers to show the feasibility of their models. In the studied literature, there are essentially three kinds of methods to show the good performance of the proposed models (see Table VI).
TABLE VI MAIN EVALUATION TECHNIQUES
TABLE VII MAIN APPLICATIONS TO REAL-WORLD PROBLEMS
1) Experimental Comparisons:The majority of the consulted references use experimental comparisons, consisting of testing the performance of the proposed models by comparing them with other techniques through different simulations [14], [39], [51]. However, there are not widely extended metrics to compare these models, on the contrary, a huge number of different measures can be found in the reviewed literature, such as the final ranking of the alternatives, the cost incurred to achieve a solution, consensus degree, number of discussion rounds, and so on.
2) Theoretical Comparisons: Other authors propose theoretical comparisons in which the advantages of their models over others are discussed [8], [38], [50].
3) Datasets: Finally, other proposals just provide the results of testing their models on a certain dataset, which may be obtained from real DMs or created manually by the authors [68], [79], [103].
Decision making is a natural activity in human life and covers multiple disciplines in society, such as management, education, or healthcare. Therefore, the Application block analyzes the proposals from the point of view of their implementation to solve concrete problems. Consequently, this subsection reviews how the different LSGDM proposals in the specialized literature are applied by taking into account two main groups of applications:
1) Real-World Problems: This group comprises the applications of LSGDM models to real-world situations. The flexibility of LSGDM techniques to deal with all kinds of situations has allowed researchers to provide solutions for many problems (see Table VII). For instance, they have been applied to health-related problems [90] such as the COVID-19 pandemic [96], [100] and other emergency situations [14], [36], [54]. In addition, society’s recent interest in sustainability problems has led to studies related to green supplier selection [37], [57], energy [86], [107], or water management [15], [16], [89]. Furthermore, it is also possible to find applications of LSGDM in technological environments [34], [79], [108].
2) Decision Support Systems: A decision support system is a software application whose aim is to assist DMs in making proper choices when facing decision situations. Several LSGDM support systems have been found in the review, such as LaSca [24], which stands out because of the flexibility with which DMs can “decide how to decide”; MENTOR [25], which is a graphical tool to study the evolution of the preferences during an LSGDM process; and DeciTrustNET [109], which takes into account trust and reputation in social networks.
Once we have a clear view of the current state of the art of LSGDM, it is necessary to devote one section to a critical analysis of it. First, a general critique regarding the vagueness of several notions related to LSGDM is provided. Afterwards, the main trends related to LSGDM identified in the bibliographic analysis are discussed according to the four blocks considered in Section IV.
Undoubtedly, LSGDM is today a hot topic among researchers in the Computer Science area. In spite of this, the main notions regarding this topic lack theoretical or practical support; they are based on assumptions that have been inherited over the years because of their widespread use, which, in the end, has implied a deviation from the initial purpose of LSGDM. Consequently, this subsection is devoted to discussing these definitions and redirecting them to face the new challenges demanded by society.
1) Definition of LSGDM: Even though LSGDM should deal with decision situations in which thousands or millions of DMs take part, the analysis of the existing literature shows that researchers have abused the “20 or more experts” definition [17] to publish papers on the topic that are not necessarily focused on solving any real-world problem or societal demand.
According to Carvalho et al. [24], the oldest reference found in our search, this definition seems to be motivated by the fact that finding 20 experts in a certain area who are willing to participate in the decision process is difficult in practice, especially if they are expected to meet in the same room. However, the origin of this boundary of 20 DMs is not clear. When justifying the number of DMs that bounds the notion of LSGDM, some proposals refer to even older works from the early 2000s, which are usually hard to retrieve because they were published in nonindexed research journals, and others do not provide any justification or citation. As a consequence, the vast majority of the reviewed papers validate their approaches by using examples with 50 or fewer experts, referring to this definition (see Fig. 10).
However, new technological advances allow us to consider the preferences of a huge number of DMs, and this former definition of LSGDM seems to be inadequate for the current situation. Furthermore, this definition introduces a certain ambiguity, since it treats a model whose performance is limited to 50 DMs and another proposal that can deal with 500 DMs as the same. On the one hand, the formal aspects of both problems are not necessarily similar, and neither are the methods and techniques used to properly model these decision situations. On the other hand, this ambiguity may result in redundant proposals, in which a GDM model that considers 19 DMs could easily be transformed into an LSGDM model by applying the same proposal to a problem that requires 20 DMs.
Fig. 10. Contributions are classified according to the number of DMs used in their examples.
In order to overcome this problem, we propose the use of the following definition:
Definition 1 (m-Large-Scale Group Decision Making Model): An m-large-scale group decision making (m-LSGDM) model is a method which has proven to be able to efficiently manage LSGDM situations involving m DMs.
Remark 1: It should be noted that, to consider a model as an m-LSGDM model, the respective authors must provide sustained proof of its good performance when dealing with these kinds of problems.
This nomenclature not only provides a clear vision of what authors intend with their proposals, but also a taxonomy regarding the performance threshold of each contribution. In addition, it makes it easy to identify the most suitable models to solve a specific LSGDM problem.
2) Ambiguity in the Notion of Expert: Another controversial piece of terminology is the use of the term expert to name the participants of an LSGDM problem, because it does not seem reasonable to ask a million people to be experts in a concrete area. In spite of this, many contributions use the terms expert/decision maker/stakeholder interchangeably. Therefore, the term “expert” should be replaced by other terms such as stakeholder or DM when dealing with large-scale decision situations, especially those in which hundreds or thousands of DMs are required.
3) Consensus in LSGDM: The notion of consensus in large-scale contexts involving millions of DMs seems to be unclear. Classic literature states that a fundamental assumption for CRPs is that all the DMs agree to change their preferences in order to reach a collective agreement [110]. However, this collective agreement may not be the goal of the DMs who participate in large-scale decision situations, and making the same assumption could be too optimistic. Therefore, in large-scale situations, the philosophy behind the idea of consensus should not assume a will for agreement, but a personal interest in achieving a collective solution which harms each DM as little as possible: when millions of DMs take part in a decision situation in which consensus is desired (for example, e-democracy), this will for consensus should be understood as a will to maximize the personal satisfaction of each individual with respect to the desired consensual solution.
1) Elevated Number of Preference Structures: The most remarkable feature is that there are too many preference structures proposed in the literature. For the sake of providing more flexibility for DMs, researchers have developed different types of preference structures. However, even though this purpose is noble, we have found no proposals comparing the performance of the different preference structures in LSGDM contexts, which could lead to imprecise results or to redundant proposals in which the only variation is the type of preference structure used. To overcome this drawback, rigorous studies are necessary to decide which preference structure is most suitable for a certain problem.
2) Heterogeneous Knowledge: It should also be highlighted that, in problems in which thousands of DMs take part, the differences in their knowledge could be considerable. However, because the majority of the reviewed proposals consider toy examples (fewer than 50 experts) to validate the proposed model, this issue is often neglected, and these differences are not considered. When dealing with LSGDM problems in which a larger number of DMs are involved, they should be allowed to express their preferences by using flexible expression domains, and their influence in the decision process must be related to their degree of knowledge about the topic.
3) Inconsistency: Another key aspect related to the preference structures is the consistency of the information given by DMs. However, this issue is usually not considered in the reviewed proposals, which could lead to contradictory results. To avoid this, it is necessary to evaluate the consistency of DMs’ opinions (before and after the decision process) to guarantee reliable solutions, especially in real LSGDM problems, in which high complexity and uncertainty may make inconsistency more likely.
4) Incompleteness: Finally, because of a lack of knowledge, time limitations, or simply human error, some preference values may be missing, especially in LSGDM problems in which complexity is high and hundreds of DMs, usually not experts in the topic, take part in the decision process. Although this fact is rarely taken into account by researchers, new nontrivial mechanisms to manage these missing values should be proposed to generate preferences that are as complete as possible.
1) Extension of Classic GDM Techniques to LSGDM: Among the reviewed proposals, no studies have been found on the performance of classic multi-criteria GDM methods (TOPSIS, AHP, PCA, ...), weighting mechanisms, or dimension reduction techniques in large-scale contexts in which hundreds or thousands of DMs take part. Even though they have proved to be effective when dealing with 20-50 DMs, there is no guarantee of their good performance for larger groups [27], and it seems that these methods have been directly imported into LSGDM contexts without any proof of their feasibility. It has already been proved that classic CRPs are not suitable for dealing with LSGDM problems [20], because these techniques do not reduce the dimension of the problem and neglect DMs’ behaviors. Therefore, to guarantee the good performance of other classic GDM techniques in large-scale contexts, it is necessary to first carry out an in-depth study of the feasibility of these models in several scenarios with different numbers of DMs and, in case they are not suitable, to study the possibility of extending these methods to LSGDM.
2) Feedback and Moderator in Large-Scale Consensus: Regarding consensus models, the use of the term process when referring to consensus models seems to be obsolete. On the one hand, the role of the human moderator is unfeasible in large-scale contexts due to time and resource limitations. On the other hand, simulating different discussion rounds in which feedback is provided to the DMs to influence their opinions could lead to endless situations. However, some reviewed contributions inherit the original concept of CRP and apply these ideas to propose consensus models which consider either the moderator figure or feedback mechanisms. This could be feasible when dealing with 20-50 DMs, but it makes no sense when considering thousands of them. Therefore, the classic idea of a consensus model as an iterative discussion process should be replaced by automatic algorithms which do not necessarily involve discussion rounds, human moderators, or the approval of DMs to change their opinions.
3) Non-Cooperative Behaviors in Consensus Models: In addition, when thousands or millions of DMs take part in a decision problem, it should not be supposed that all of them are willing to reach a collective agreement, because they may have different interests and, consequently, form groups according to their personal profits. According to our bibliographic analysis, some proposals already include techniques to detect and manage these uncooperative behaviors, but their use is not widespread, and those authors who take such mechanisms into account usually apply them to solve simple problems involving 50 or fewer DMs.
1) Toy Examples: The main critique in this subsection is related to the widespread use of toy examples to study the performance of the proposed models (see Fig. 11). The majority of the reviewed papers claim to propose LSGDM models, but just solve cases in which fewer than 50 DMs are considered, and there is no information about the performance of these models when thousands of DMs are required (see Fig. 10). In this regard, it is necessary to be more demanding with the conditions in which the validity of a method is tested. Solving a problem from a concrete dataset is not enough to guarantee the good performance of the proposals in any context. Since models dealing with 20-50 DMs do not have to be similar to those which deal with several millions, researchers should clarify from the beginning the volume of DMs which their proposal is able to manage (see Definition 1) and also make sure that the models are stable by carrying out several simulations with different values of the preferences.
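The kind of stability simulation suggested above could be sketched as follows. This is only an illustrative harness under stated assumptions: preferences are random utility vectors in [0, 1], the model under test is a hypothetical stand-in that selects the alternative with the highest mean score, and stability is measured as the fraction of simulated problems whose winning alternative survives a small random perturbation of every preference value. Any real LSGDM model with the same call signature could be plugged in instead of mean_model.

```python
import random
from typing import Callable, List

Preferences = List[List[float]]


def mean_model(prefs: Preferences) -> int:
    """Hypothetical stand-in model: index of the alternative with the highest mean score."""
    n = len(prefs[0])
    totals = [sum(p[j] for p in prefs) for j in range(n)]
    return max(range(n), key=lambda j: totals[j])


def stability(model: Callable[[Preferences], int], m: int, n_alternatives: int,
              runs: int = 50, noise: float = 0.05, seed: int = 0) -> float:
    """Fraction of runs whose winner survives a small random perturbation."""
    rng = random.Random(seed)  # seeded so that the experiment is repeatable
    stable = 0
    for _ in range(runs):
        prefs = [[rng.random() for _ in range(n_alternatives)] for _ in range(m)]
        # Perturb every value and clip it back into [0, 1].
        noisy = [[min(1.0, max(0.0, x + rng.uniform(-noise, noise))) for x in p]
                 for p in prefs]
        stable += model(prefs) == model(noisy)
    return stable / runs


# Repeat the experiment for increasing numbers of DMs.
for m in (20, 200, 2000):
    print(m, stability(mean_model, m, n_alternatives=4))
```

Reporting such a figure for each tested value of m is one possible way of documenting the number of DMs for which a proposal has actually been checked.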
Fig. 11. Contributions are classified according to their evaluation technique.
2) Global Metrics: Regarding the simulations from the previous paragraph, there is no universal way to carry them out. Usually, researchers use a convenient measure to highlight the best properties of their models when making comparisons with others, but there are no global metrics which allow researchers to make a fair assessment by showing both positive and negative aspects of the models. Recently, a metric with this property was proposed [111] for consensus models, but it is key to introduce new ones for other problems in order to analyze different features of LSGDM methods, such as the proper selection of the preference structures according to the problem to solve and the DMs, the robustness of the final ranking of alternatives, or the degree to which the results can be understood.
3) Accessibility to the Existing Models: Currently, there is no easy way to get access to the models proposed by other authors, since there are no common repositories in which authors can upload their proposals, making it quite complex to make comparisons among several approaches. To facilitate comparisons among different models, a common platform should be developed to allow researchers to test and upload their proposals.
1) Real World Problems: The majority of the reviewed studies are oriented towards introducing abstract methods, and the proposed models are used to solve simple toy examples of no interest to society. Especially in a purely applied area like LSGDM, the main purpose of research should be facing real world problems instead of being diverted towards publication goals.
2) LSGDM Support Systems: Finally, the inherent complexity of real world LSGDM problems makes them difficult to solve for users who are not experts in the area. Under these circumstances, the use of LSGDM support systems is mandatory to facilitate the entire decision process. However, there is an evident lack of such systems, and appropriate user-friendly software should be developed.
Section IV was devoted to analyzing the current state of the art of LSGDM, and in Section V we developed a critical analysis of the main drawbacks in the area. In this critique, several limitations regarding research on the topic have been highlighted, which must be addressed for the sake of the quality of current and future research. Therefore, this section provides a discussion about the future challenges and trends in LSGDM according to our bibliographic and critical analysis. The remainder of this section is based on the four-block scheme shown in Section IV.
Regarding preference structures, the main issue which is usually neglected in the literature is that there are too many of them. A deep analysis is required of whether some of them are redundant and of which ones are better for representing DMs’ opinions in a certain LSGDM problem, especially taking into account that some preference structures, such as FPRs, add more variables to the LSGDM problem, which implies more complexity and resource consumption.
Besides, the reviewed proposals consider preferences modeled by using linear scales. However, a recent study [112] has shown that, when nonlinear scales are used to remap the DMs’ preferences, the consensus models improve and the collective solution obtained for the decision problem is also more realistic from a psychological point of view. Therefore, further studies regarding the impact of these nonlinear scales in LSGDM would be desirable.
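To give an intuition of what remapping preferences onto a nonlinear scale can mean, the sketch below applies a simple symmetric power function to values in [0, 1]. The function remap and its parameter alpha are assumptions chosen purely for illustration; this is not the scale studied in [112]. The function keeps 0, 0.5 and 1 fixed and is increasing, so the order of the preferences is preserved while the distances between them change.

```python
import math


def remap(x: float, alpha: float) -> float:
    """Symmetric power remap of [0, 1] around 0.5 (hypothetical example scale).

    alpha > 1 stretches distances near the extremes 0 and 1;
    0 < alpha < 1 stretches distances near the midpoint 0.5.
    """
    centred = 2.0 * x - 1.0  # move the scale to [-1, 1]
    return 0.5 + 0.5 * math.copysign(abs(centred) ** alpha, centred)


# Distance between two DMs' opinions before and after the remap.
a, b = 0.9, 1.0
print(abs(a - b), abs(remap(a, 2.0) - remap(b, 2.0)))
```

Because the distances between opinions feed directly into consensus measures, the choice of scale can change which DMs are considered to be far from the group.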
When dealing with the internal performance of the reviewed models, the most remarkable critique is related to the absence of studies that guarantee the good performance of classic GDM techniques in large-scale contexts in which hundreds or thousands of DMs are involved [27]. Researchers have been directly applying these methods in contexts in which 50 or fewer DMs are considered, but there is no proof that they will also perform well when more DMs are involved in the decision situation. Rigorous studies about the feasibility of these techniques in large-scale contexts are required and, if necessary, these proposals should be extended to deal with LSGDM problems.
Additionally, an interesting research line for this block could be proposing hybrid models that combine the knowledge of a group of DMs with the information obtained from large databases of users’ preferences, Internet of Things (IoT) devices, and so on, in order to provide realistic solutions for real world problems.
In order to prove the validity of the reviewed techniques, authors usually test their models by using toy examples which consider fewer than 50 DMs. Although this matches the original definition of LSGDM [17], this way of evaluating the performance of a proposal does not seem appropriate for today’s society, in which some problems require taking into account the preferences of millions of users. In this contribution, we have proposed the definition of m-LSGDM, addressing those models which are able to manage decision situations in which m DMs are required. This notion makes it easy to classify both existing and new proposals according to the number of DMs they are designed to deal with. In this regard, it is necessary to test classical models in more demanding contexts, which requires standard datasets with hundreds or thousands of DMs, in order to avoid ambiguous proposals whose performance in contexts with more than 50 experts is unclear. In addition, global metrics that allow comparing models (none were found in our search) should be proposed and used by researchers to show the quality of their models. Furthermore, it would be interesting to develop a universal research platform comprising the different existing LSGDM models in order to facilitate access to these proposals and the comparisons among them. Therefore, a new research line focused on the performance analysis of LSGDM models and their validity from an objective point of view seems to be essential.
Finally, even though GDM is a purely applied topic, the reviewed proposals usually consist of theoretical models which are applied to solve easy examples. The main interest of the area should be devoted to solving real world problems, instead of proposing more models whose performance is studied only for 50 or fewer DMs. Using LSGDM models in Big Data environments or designing new LSGDM support systems devoted to e-democracy could be prominent research lines regarding this issue. In addition, it would be interesting to consider the application of other Artificial Intelligence tools to LSGDM. Examples include applying Natural Language Processing methods to improve the modeling of DMs’ preferences when they are obtained from social networks in which millions of users take part, or developing Group Recommendation Systems for millions of users which provide recommendations by taking into account a certain consensus degree when fusing the preferences of other users with similar profiles.
The main aim of this review is to become a turning point for researchers, helping them to better understand the concept of LSGDM, to introduce proposals that explore new challenges in the area related to new technological developments such as Big Data or social media, and to pay more attention to the validity of their models in these contexts.
This contribution has performed a systematic review of the existing literature regarding LSGDM. To do so, we have followed the guidelines for developing bibliographic analyses in Software Engineering proposed by Kitchenham and Charters [43]. By using this methodology, the existing proposals have been reviewed from four different points of view, namely Preference Structure, Group Decision Rules, Evaluation of Quality, and Applications, which contain the most relevant keywords in the LSGDM literature and represent the different steps to consider when proposing LSGDM models. Since the analysis has revealed several major drawbacks regarding the current research on the topic, this contribution also provides a deep critical analysis of these bad habits found in the literature and some indications about how to redirect future investigation towards the original purpose of LSGDM, namely, proposing frameworks to face decision situations involving a large number of DMs.
It should be highlighted that defining theoretical models and testing their performance in toy examples, in which 20-50 DMs are considered, may be a profitable source of content from the point of view of publishing interests, but such models would be hard to apply in practical situations if they do not explicitly specify the number of DMs that they are able to manage and prove their good performance in these contexts. In a purely applied area like this, researchers should focus future studies on dealing with real world problems involving a large group of DMs (for instance, Netflix manages 209 million paid memberships) instead of proposing more “large-scale” models which work with just 20 DMs.
ACKNOWLEDGMENT
We thank Javier Andreu Pérez for his support in the bibliographic analysis.
IEEE/CAA Journal of Automatica Sinica, 2022, Issue 6