Consensus measurement in Delphi studies: Review and implications for future quality assurance

https://doi.org/10.1016/j.techfore.2012.04.013

Abstract

Consensus measurement plays an important role in Delphi research. Although consensus is not the technique's aim, its measurement has to be considered an important component of Delphi analyses and data interpretation. During the past 60 years, the Delphi multi-round survey procedure has been widely and successfully used to aggregate expert opinions on future developments and incidents. This paper is dedicated to how consensus (and dissent) has been measured since the technique's emergence in the 1960s and which criteria have been used. The review also includes a description of its relationship with the measurement of stability over Delphi rounds, although the major focus lies on the concept of consensus. In an extensive literature review, 15 types of measure were identified and classified for measuring consensus (and/or stability) in detail. The research reveals that there are obvious deficits in the practice and rigour of consensus measurement for Delphi research: mistakes have even been made in statistical tests or their premises. This article gives a broad understanding of the consensus concept, shows strengths and weaknesses as well as premises of different types of measure, and concludes with lessons learned. Its major contribution therefore lies in improving the future quality of consensus-oriented Delphi studies.

Highlights

► Research reveals deficits in the practice and rigour of consensus measurement for Delphi research.
► Qualitative criteria, descriptive and inferential statistics are used for consensus measurement.
► Consensus is not a suitable stopping criterion in Delphi research.
► Consensus measurement is an important component for Delphi analysis and interpretation.

Introduction

Mankind has always desired to know what the future will be like. Throughout history, people have consulted chosen individuals who were said to be able to envision the future. Among the most famous prophesiers are Michel de Notredame (1503–1566) and Marie Anne Adélaide Lenormand (1772–1843). For over a thousand years, oracles were part of the lives of Romans and Greeks. From the eighth century B.C. until the third century A.D., people primarily consulted oracles regarding fortune, success, marital affairs, professional advancement, and judicial disputes [1], [2]. In these times, oracular sites were spread all over Greece. The two greatest were in Delphi, associated with Apollo, and in Dodona, associated with Zeus [3]. The Greek word Delphoi means “hollow” or “womb”. Historians interpret it as a reference to Gaia, the great mother of all creatures on Earth or the primordial Earth goddess, in the Ancient Greek religion.

In the 1950s, the term Delphi was adopted by the U.S. RAND Corporation for its research purposes. The RAND Corporation was a research institution that initially focused on national security issues and later concentrated on scientific, educational, and charitable endeavours for public welfare. Within the scope of “Project Delphi”, RAND researchers developed a structured written survey in order to estimate bombing requirements. For confidentiality reasons, the contents of the experiment were first published 10 years later by Dalkey and Helmer [4] in their article “An Experimental Application of the Delphi Method to the Use of Experts”. Project Delphi was sponsored by the United States Air Force and included the application of “expert opinion to the selection, from the viewpoint of a Soviet strategic planner, of an optimal U.S. industrial target system and to the estimation of the number of A-bombs required to reduce the munitions output by a prescribed amount” [4, p.458]. The expert panel consisted of seven specialists in the areas of economics, physics, systems analysis, and electronics. Dalkey and Helmer [4] reported that the experts' first evaluation of possible industry targets did not result in consensus. However, in a second estimate, consensus was achieved and the procedure was said to have yielded more reliable results than comparable techniques. Shortly after the technique's introduction to the public in 1963, various studies using the technique on non-military issues followed.

Since the 1950s, the usage of the Delphi survey method has undergone different stages of development, which were described by Rieger [5]:

  • 1. Secrecy/obscurity (1950s): exclusive application in the military context

  • 2. Novelty (1960s): declassification by the U.S. military and introduction to the public

  • 3. Popularity (1970–1975): spread to Western Europe, Eastern Europe, and Asia; major forecasting tool in business

  • 4. Scrutiny (1975–1980): critical evaluation of the technique's reliability and validity

  • 5. Continuity (1980–1986): acceptance in science and practice; stable application patterns

After a time of stagnation in the 1980s, the Delphi technique again received increasing interest in the early 1990s. As an extensive literature review by Landeta [6] shows, this trend prevails. In total, 414 Delphi-related articles were published in the two major databases “Science Direct” and “ABI/Inform” between 1995 and 1999 [6]. This number increased to 677 articles in the period between 2000 and 2004. Similar patterns can be observed for book publications. Google Ngram viewer displays graphs showing how often certain phrases have occurred over time in a collection of books (generated in July 2009) (see [7]). For the period from 1950 until 2008, a bigram search for “Delphi study” in a sample of more than one million English books reveals a strong increase from almost zero in 1960 to a peak in 1980, followed by a decrease to about half of the 1980 frequency around 1990. Since then, the term “Delphi study” has been used increasingly, and after 2005 it reached a level higher than in all previous years.

More recent applications concentrate on the web-based implementation of the Delphi procedure, but still follow the technique's fundamental rationale and consider consensus measurement to be a key component of analysis [8], [9]. The facts presented here underline that the Delphi technique is widely accepted as a research technique today and that its value has been scientifically and practically proven.

This paper contributes to research on the Delphi technique in three ways. First, it identifies and explains 15 different types of measures related to the field of consensus measurement and/or stability over Delphi rounds. Second, it gives an overview of the criteria that have been defined for these 15 different types of measurement. Third, an overall assessment is presented that provides guidelines for future Delphi work. The overall research questions are therefore: (1) how has consensus been measured in Delphi studies since its emergence in the 1960s until today?; (2) which criteria have been used to define consensus?; and (3) which implications should be considered for future quality assurance in Delphi research? The research presented here is conceptual in nature and represents a comprehensive literature review across multiple disciplines and research strands.

The remainder of the paper is organized as follows: Section 2 summarizes the fundamental characteristics and rationale of Delphi in order to set the basis for the following core sections on consensus measurement. Section 3 is dedicated to general definitions of “consensus”, its relationship with stability and the differentiation from dissent-oriented Delphi studies. Sections 4 (the use of subjective criteria and descriptive statistics) and 5 (inferential statistics for consensus measurement) provide an overview of different types of consensus measurement and the defined consensus criteria. This includes subjective and descriptive types of measurement as well as inferential statistics. Conclusions will be stated in Section 6; limitations and future research will be addressed in Section 7.

Section snippets

Fundamental characteristics of the Delphi method

The Delphi technique is a survey technique designed to facilitate an efficient group dynamic process. This is done in the form of an anonymous, written, multi-stage survey process, where feedback of group opinion is provided after each round. In most cases such studies further focus on opinion building, usually consensus, among experts. Although there are differences in the focus of definitions and the procedure of the technique, four distinct characteristics of Delphi usually remain the same [10]

The concept of consensus in Delphi studies

The efficient structuring of a group communication process can be considered the primary goal of a Delphi study. Consensus measurement in turn should be considered a valuable component of data analysis and interpretation in Delphi research. Nevertheless, many researchers have used it as a sole stopping criterion of rounds, which does not match the original idea of Delphi and is not recommended. In fact, it is important to distinguish between the two different concepts “consensus/agreement” and

The use of subjective criteria and descriptive statistics

Many Delphi studies have used subjective criteria or descriptive statistics for the determination of consensus and the quantification of its degree. The criteria have, however, sometimes been chosen rather arbitrarily. The literature review revealed that researchers have actually used all kinds of descriptive statistics in order to measure consensus. One can find applications of measures of association as well as measures of central tendency and dispersion. Table 1 summarizes the research
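As a rough illustration of the kind of descriptive statistics mentioned above, the following sketch summarises one Delphi item by a measure of central tendency (median), a measure of dispersion (interquartile range) and a simple agreement share. The function name `describe_consensus`, the 5-point scale, the choice of levels 4 and 5 as "agreement" and the sample ratings are all hypothetical assumptions made for this example; they are not criteria taken from the review.

```python
from statistics import median, quantiles


def describe_consensus(ratings, agree_levels=(4, 5)):
    """Summarise one Delphi item rated on an ordinal scale.

    `agree_levels` is an assumed definition of "agreement"
    (here: the top two points of a hypothetical 5-point scale).
    """
    # Quartiles with the 'inclusive' method, which treats the
    # ratings as the full population of panel answers.
    q1, _, q3 = quantiles(ratings, n=4, method="inclusive")
    share = sum(r in agree_levels for r in ratings) / len(ratings)
    return {
        "median": median(ratings),   # central tendency
        "iqr": q3 - q1,              # dispersion
        "agreement": share,          # share of panel in agree_levels
    }


# Hypothetical second-round ratings from a ten-member panel.
round2 = [4, 4, 5, 4, 3, 4, 5, 4, 4, 2]
summary = describe_consensus(round2)
print(summary)
```

Whether a given median, interquartile range or agreement share counts as consensus depends on the criterion a study defines in advance; the sketch only computes the quantities and deliberately leaves that threshold open.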

Inferential statistics for consensus measurement

Inferential statistics are statistics that help to establish relationships among variables and draw conclusions therefrom [86]. The application of such statistical tests depends on the level of data and whether this data conforms approximately to a normal distribution [83]. If the latter is the case and the data is interval/ratio-scaled, parametric tests can be used. Nonparametric tests, on the other hand, can be used on nominal- or ordinal-scaled data not conforming to a normal frequency
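The decision rule described above can be sketched as a small helper. This is only an illustration of the stated premises: the function name `suggest_test_family` and its arguments are hypothetical, and routing interval/ratio data that is not approximately normal to the nonparametric family is an assumption of this sketch rather than a statement from the text.

```python
VALID_SCALES = {"nominal", "ordinal", "interval", "ratio"}


def suggest_test_family(scale: str, approx_normal: bool) -> str:
    """Return 'parametric' or 'nonparametric' for a given data situation.

    scale         -- level of measurement of the Delphi responses
    approx_normal -- whether the data conforms approximately to a
                     normal distribution (to be checked separately)
    """
    scale = scale.lower()
    if scale not in VALID_SCALES:
        raise ValueError(f"unknown scale: {scale!r}")
    # Parametric tests require interval/ratio data AND approximate normality.
    if scale in {"interval", "ratio"} and approx_normal:
        return "parametric"
    # Assumption of this sketch: every other case falls back to
    # the nonparametric family.
    return "nonparametric"


print(suggest_test_family("interval", True))
print(suggest_test_family("ordinal", False))
```

The helper does not test normality itself; it only makes explicit that both premises have to be checked before a parametric test is chosen.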

Conclusion

The previous sections have presented the results of an extensive literature review on consensus measurement in Delphi studies including accompanying tests for stability. This review revealed that a general standard of how to measure consensus in Delphi studies does not yet exist. Researchers have applied subjective criteria as well as descriptive and inferential statistics to measure consensus and convergence. Especially in the case of the latter, violations of basic assumptions have been found

Limitations and future research

As with any research, the discussions within this article have some limitations and generate questions for future research. First, the review within this article strongly focused on consensus measurement, which is by far the most often considered concept in Delphi studies since its development. However, the author is aware of the Delphi research that focusses on dissent rather than consensus, e.g. policy Delphi or emerging dissensus-based Delphi methods. Recently, there have been an increasing

Acknowledgments

The content of this publication is partly based on work of the joint research project “Competitiveness Monitor”, funded by the German Federal Ministry of Education and Research (project reference number: 01IC10L18 A) in the course of its leading-edge cluster initiative. I would like to thank all involved in the leading-edge cluster as well as its partners and sponsors. Furthermore, I would like to thank Janice Magel, Dr. Tobias Gnatzy, Philipp Ecken and Dr. Inga-Lena Darkow for their valuable

References (114)

  • M. Turoff. The design of a policy Delphi. Technol. Forecast. Soc. Chang. (1970)
  • W. Rauch. The decision Delphi. Technol. Forecast. Soc. Chang. (1979)
  • M. Steinert. A dissensus based online Delphi approach: an explorative research tool. Technol. Forecast. Soc. Chang. (2009)
  • M. Nowack et al. Review of Delphi-based scenario studies: quality and design considerations. Technol. Forecast. Soc. Chang. (2011)
  • V.M.R. Tummala et al. Applying a risk management process (RMP) to manage cost risk for an EHV transmission line project. Int. J. Proj. Manag. (1999)
  • T. Addison. E-commerce project development risks: evidence from a Delphi survey. Int. J. Inf. Manag. (2003)
  • E. van de Linde et al. The Delphi method as early warning: linking global societal trends to future radicalization and terrorism in The Netherlands. Technol. Forecast. Soc. Chang. (2011)
  • P. Tapio. Disaggregative policy Delphi: using cluster analysis as a tool for systematic scenario formation. Technol. Forecast. Soc. Chang. (2003)
  • J. Landeta et al. People consultation to construct the future: a Delphi application. Int. J. Forecast. (2011)
  • F. Hasson et al. Enhancing rigour in the Delphi technique research. Technol. Forecast. Soc. Chang. (2011)
  • A.K. Chakravarti et al. Modified Delphi methodology for technology forecasting case study of electronics and information technology in India. Technol. Forecast. Soc. Chang. (1998)
  • D.P. Sharma et al. Analytical search of problems and prospects of power sector through Delphi study: case study of Kerala State, India. Energy Policy (2003)
  • M.R. Rogers et al. Identifying critical cross-cultural school psychology competencies. J. Sch. Psychol. (2002)
  • E.R. Doke et al. Decision variables for selecting prototyping in information systems development: a Delphi study of MIS managers. Inf. Manag. (1995)
  • H. von der Gracht et al. Scenarios for the logistics services industry: a Delphi-based analysis for 2025. Int. J. Prod. Econ. (2010)
  • G. Rowe et al. The Delphi technique as a forecasting tool: issues and analysis. Int. J. Forecast. (1999)
  • J. Rohrbaugh. Improving the quality of group judgment: social judgment analysis and the Delphi technique. Organ. Behav. Hum. Perform. (1979)
  • F. Munier et al. The role of knowledge codification in the emergence of consensus under uncertainty: empirical analysis and policy implications. Res. Policy (2001)
  • M.J. Bardecki. Participants' response to the Delphi method: an attitudinal perspective. Technol. Forecast. Soc. Chang. (1984)
  • C.P. Ferri et al. Global prevalence of dementia: a Delphi consensus study. Lancet (2005)
  • W.W. Cooper et al. A Delphi study of goals and evaluation criteria of state and privately owned Latin American airlines. Socioecon. Plann. Sci. (1995)
  • S. Hakim et al. The Delphi process as a tool for decision making
  • O. Strathern. A Brief History of the Future. How Visionary Thinkers Changed the World and Tomorrow's Trends are ‘Made’ and Marketed (2007)
  • N.C. Dalkey et al. An experimental application of the Delphi method to the use of experts. Manag. Sci. (1963)
  • J.-B. Michel et al. Quantitative analysis of culture using millions of digitized books. Science (2011)
  • G. Rowe et al. Expert opinions in forecasting: the role of the Delphi technique
  • R.G. Fischer. The Delphi method: a description. Rev. Critic. J. Acad. Librariansh. (1978)
  • M. Häder. Delphi-Befragungen. Ein Arbeitsbuch (2002)
  • W.N. Dunn. Public Policy Analysis. An Introduction (2004)
  • J.F. Coates. An overview of futures methods
  • M. Scheibe et al. Experiments in Delphi methodology
  • J. Crisp et al. The Delphi method? Nurs. Res. (1997)
  • M.K. Rayens et al. Building consensus using the policy Delphi method. Policy Polit. Nurs. Pract. (2000)
  • Y.N. Yang. Testing the stability of experts' opinions between successive rounds of Delphi studies
  • V.W. Mitchell. The Delphi technique: an exposition and application. Technol. Anal. Strateg. Manag. (1991)
  • A. Fink et al. Consensus methods: characteristics and guidelines for use. Am. J. Public Health (1984)
  • American Heritage Dictionary of the English Language (1994)
  • J.S. Armstrong. Principles of Forecasting: A Handbook for Researchers and Practitioners
  • J. Hall. Decisions, decisions, decisions. Psychol. Today (1971)
  • M. Turoff. The Policy Delphi

Dr. Heiko A. von der Gracht is Founder and Director of the Institute for Futures Studies and Knowledge Management (IFK) and post-doctoral researcher at EBS Business School in Wiesbaden, Germany. His research interests are corporate foresight, Delphi and scenario techniques, decision support, and quality in futures research. His works have been published in several books and in peer-reviewed journals, among them Technological Forecasting & Social Change, Futures, and International Journal of Production Economics.