Quantitative and Qualitative Approaches to Generalization and Replication–A Representationalist View

In this paper, we provide a re-interpretation of qualitative and quantitative modeling from a representationalist perspective. In this view, both approaches attempt to construct abstract representations of empirical relational structures. Whereas quantitative research uses variable-based models that abstract from individual cases, qualitative research favors case-based models that abstract from individual characteristics. Variable-based models are usually stated in the form of quantified sentences (scientific laws). This syntactic structure implies that sentences about individual cases are derived using deductive reasoning. In contrast, case-based models are usually stated using context-dependent existential sentences (qualitative statements). This syntactic structure implies that sentences about other cases are justifiable by inductive reasoning. We apply this representationalist perspective to the problems of generalization and replication. Using the analytical framework of modal logic, we argue that the modes of reasoning are often applied not only to the context that has been studied empirically, but also at the between-contexts level. Consequently, quantitative researchers mostly adhere to a top-down strategy of generalization, whereas qualitative researchers usually follow a bottom-up strategy of generalization. Depending on which strategy is employed, the role of replication attempts is very different. In deductive reasoning, replication attempts serve as empirical tests of the underlying theory. Therefore, failed replications imply a faulty theory. From an inductive perspective, however, replication attempts serve to explore the scope of the theory. Consequently, failed replications do not question the theory per se, but help to shape its boundary conditions. We conclude that quantitative research may benefit from a bottom-up generalization strategy as it is employed in most qualitative research programs. Inductive reasoning forces us to think about the boundary conditions of our theories and provides a framework for generalization beyond statistical testing. In this perspective, failed replications are just as informative as successful replications, because they help to explore the scope of our theories.

Introduction

Qualitative and quantitative research strategies have long been treated as opposing paradigms. In recent years, there have been attempts to integrate both strategies. These “mixed methods” approaches treat qualitative and quantitative methodologies as complementary, rather than opposing, strategies (Creswell, 2015). However, although this view acknowledges that both strategies have their benefits, the “integration” remains purely pragmatic. Hence, mixed methods methodology does not provide a conceptual unification of the two approaches.

Lacking a common methodological background, qualitative and quantitative research methodologies have developed rather distinct standards with regard to the aims and scope of empirical science (Freeman et al., 2007). These different standards affect the way researchers handle contradictory empirical findings. For example, many empirical findings in psychology have failed to replicate in recent years (Klein et al., 2014; Open Science Collaboration, 2015). This “replication crisis” has been discussed on statistical, theoretical, and social grounds and continues to have a wide impact on quantitative research practices, for example open science initiatives, pre-registered studies, and a re-evaluation of statistical significance testing (Everett and Earp, 2015; Maxwell et al., 2015; Shrout and Rodgers, 2018; Trafimow, 2018; Wiggins and Chrisopherson, 2019).

However, qualitative research seems hardly affected by this discussion. In this paper, we argue that this is a direct consequence of how the concept of generalizability is conceived in the two approaches. Whereas most of quantitative psychology is committed to a top-down strategy of generalization based on the idea of random sampling from an abstract population, qualitative studies usually rely on a bottom-up strategy of generalization that is grounded in the successive exploration of the field by means of theoretically sampled cases.

Here, we show that a common methodological framework for qualitative and quantitative research methodologies is possible. We accomplish this by introducing a formal description of quantitative and qualitative models from a representationalist perspective: both approaches can be reconstructed as special kinds of representations for empirical relational structures. We then use this framework to analyze the generalization strategies used in the two approaches. These turn out to be logically independent of the type of model. This has wide implications for psychological research. First, a top-down generalization strategy is compatible with a qualitative modeling approach. This implies that mainstream psychology may benefit from qualitative methods when a numerical representation turns out to be difficult or impossible, without the need to commit to a “qualitative” philosophy of science. Second, quantitative research may exploit the bottom-up generalization strategy that is inherent to many qualitative approaches. This offers a new perspective on unsuccessful replications by treating them not as scientific failures, but as a valuable source of information about the scope of a theory.

The Quantitative Strategy–Numbers and Functions

Quantitative science is about finding valid mathematical representations for empirical phenomena. In most cases, these mathematical representations have the form of functional relations between a set of variables. One major challenge of quantitative modeling consists in constructing valid measures for these variables. Formally, to measure a variable means to construct a numerical representation of the underlying empirical relational structure (Krantz et al., 1971). For example, take the behaviors of a group of students in a classroom: “to listen,” “to take notes,” and “to ask critical questions.” One may now ask whether it is possible to assign numbers to the students, such that the relations between the assigned numbers are of the same kind as the relations between the values of an underlying variable, like, e.g., “engagement.” The observed behaviors in the classroom constitute an empirical relational structure, in the sense that for every student-behavior tuple, one can observe whether it is true or not. These observations can be represented in a person × behavior matrix 1 (compare Figure 1). Given this relational structure satisfies certain conditions (i.e., the axioms of a measurement model), one can assign numbers to the students and the behaviors, such that the relations between the numbers resemble the corresponding empirical relations. For example, if there is a unique ordering in the empirical observations with regard to which person shows which behavior, the assigned numbers have to constitute a corresponding unique ordering as well. Such an ordering coincides with the person × behavior matrix forming a triangle-shaped relation and is formally represented by a Guttman scale (Guttman, 1944). There are various measurement models available for different empirical structures (Suppes et al., 1971). In the case of probabilistic relations, Item-Response models may be considered as a special kind of measurement model (Borsboom, 2005).

Figure 1. Constructing a numerical representation from an empirical relational structure. Due to the unique ordering of persons with regard to behaviors (indicated by the triangular shape of the relation), it is possible to construct a Guttman scale by assigning a number to each of the individuals, representing the number of relevant behaviors shown by the individual. The resulting variable (“engagement”) can then be described by means of statistical analyses, like, e.g., plotting the frequency distribution.
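
As a concrete illustration of this construction, the following minimal sketch (Python, with invented observations that merely mimic the classroom example) checks whether a boolean person × behavior matrix has the triangular structure of a Guttman scale and, if so, assigns each person a scale score. All data and names in the snippet are assumptions made for illustration; it is not the scaling procedure of any particular software package.

```python
# Minimal sketch: testing for a Guttman scale in a boolean person x behavior matrix
# and assigning scale scores. The data are invented for illustration only.
import numpy as np

# Rows = students A-E; columns = "listen", "take notes", "ask critical questions".
matrix = np.array([
    [1, 1, 1],   # A shows all three behaviors
    [1, 1, 0],   # B listens and takes notes
    [1, 1, 0],   # C listens and takes notes
    [1, 0, 0],   # D only listens
    [0, 0, 0],   # E shows none of the behaviors
])

def guttman_scores(m):
    """Return per-person scores if m forms a perfect Guttman scale, else None."""
    # Sort behaviors from most to least frequent and persons from most to fewest behaviors.
    sorted_m = m[np.ix_(np.argsort(-m.sum(axis=1)), np.argsort(-m.sum(axis=0)))]
    # In a perfect Guttman scale every sorted row is a block of ones followed by zeros
    # (the triangular shape described in the text).
    for row in sorted_m:
        k = int(row.sum())
        if row[:k].sum() != k:
            return None
    # The scale score is simply the number of relevant behaviors a person shows.
    return m.sum(axis=1)

print(guttman_scores(matrix))  # [3 2 2 1 0] -> a valid "engagement" scale
```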

Although essential, measurement is only the first step of quantitative modeling. Consider a slightly richer empirical structure, where we observe three additional behaviors: “to doodle,” “to chat,” and “to play.” As above, one may ask whether there is a unique ordering of the students with regard to these behaviors that can be represented by an underlying variable (i.e., whether the matrix forms a Guttman scale). If this is the case, we may assign corresponding numbers to the students and call this variable “distraction.” In our example, such a representation is possible. We can thus assign two numbers to each student, one representing his or her “engagement” and one representing his or her “distraction” (compare Figure 2). These measurements can now be used to construct a quantitative model by relating the two variables by a mathematical function. In the simplest case, this may be a linear function. This functional relation constitutes a quantitative model of the empirical relational structure under study (like, e.g., linear regression). Given the model equation and the rules for assigning the numbers (i.e., the instrumentations of the two variables), the set of admissible empirical structures is limited from all possible structures to a rather small subset. This constitutes the empirical content of the model 2 (Popper, 1935).

Figure 2. Constructing a numerical model from an empirical relational structure. Since there are two distinct classes of behaviors that each form a Guttman scale, it is possible to assign two numbers to each individual, correspondingly. The resulting variables (“engagement” and “distraction”) can then be related by a mathematical function, which is indicated by the scatterplot and red line on the right hand side.
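
The two-variable construction can be sketched in the same style. The snippet below derives an “engagement” and a “distraction” score for each student and relates the two constructed variables by a linear function; the behavior names follow the example above, while the observations and the assumption that each behavior class is scalable are invented for illustration.

```python
# Minimal sketch: two scale scores related by a linear function.
# Invented observations; columns 0-2 and 3-5 are assumed to form separate Guttman scales.
import numpy as np

behaviors = ["listen", "take notes", "ask questions", "doodle", "chat", "play"]
observations = np.array([
    # listen, notes, ask | doodle, chat, play
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
])

engagement = observations[:, :3].sum(axis=1)    # score on the first behavior class
distraction = observations[:, 3:].sum(axis=1)   # score on the second behavior class

# Relate the two constructed variables by a linear function (cf. linear regression).
slope, intercept = np.polyfit(engagement, distraction, deg=1)
print(f"distraction ≈ {slope:.2f} * engagement + {intercept:.2f}")  # here: a decreasing line
```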

The Qualitative Strategy–Categories and Typologies

The predominant type of analysis in qualitative research consists in category formation. By constructing descriptive systems for empirical phenomena, it is possible to analyze the underlying empirical structure at a higher level of abstraction. The resulting categories (or types) constitute a conceptual frame for the interpretation of the observations. Qualitative researchers differ considerably in the way they collect and analyze data (Miles et al., 2014). However, despite the diverse research strategies followed by different qualitative methodologies, from a formal perspective, most approaches build on some kind of categorization of cases that share some common features. The process of category formation is essential in many qualitative methodologies, like, for example, qualitative content analysis, thematic analysis, and grounded theory (see Flick, 2014 for an overview). Sometimes these features are directly observable (like in our classroom example), sometimes they are themselves the result of an interpretative process (e.g., Scheunpflug et al., 2016).

In contrast to quantitative methodologies, there have been few attempts to formalize qualitative research strategies (compare, however, Rihoux and Ragin, 2009). However, there are several statistical approaches to non-numerical data that deal with constructing abstract categories and establishing relations between these categories (Agresti, 2013). Some of these methods are very similar to qualitative category formation on a conceptual level. For example, cluster analysis groups cases into homogeneous categories (clusters) based on their similarity according to a distance metric.

Although category formation can be formalized in a mathematically rigorous way (Ganter and Wille, 1999 ), qualitative research hardly acknowledges these approaches. 3 However, in order to find a common ground with quantitative science, it is certainly helpful to provide a formal interpretation of category systems.

Let us reconsider the above example of students in a classroom. The quantitative strategy was to assign numbers to the students with regard to variables and to relate these variables via a mathematical function. We can analyze the same empirical structure by grouping the behaviors to form abstract categories. If the aim is to construct an empirically valid category system, this grouping is subject to constraints, analogous to those used to specify a measurement model. The first and most important constraint is that the behaviors must form equivalence classes, i.e., within categories, behaviors need to be equivalent, and across categories, they need to be distinct (formally, the relational structure must obey the axioms of an equivalence relation). When objects are grouped into equivalence classes, it is essential to specify the criterion for empirical equivalence. In qualitative methodology, this is sometimes referred to as the tertium comparationis (Flick, 2014). One possible criterion is to group behaviors such that they constitute a set of specific common attributes of a group of people. In our example, we might group the behaviors “to listen,” “to take notes,” and “to doodle,” because these behaviors are common to the cases B, C, and D, and they are also specific to these cases, because no other person shows this particular combination of behaviors. The set of common behaviors then forms an abstract concept (e.g., “moderate distraction”), while the set of persons that show this configuration forms a type (e.g., “the silent dreamer”). Formally, this means to identify the maximal rectangles in the underlying empirical relational structure (see Figure 3). This procedure is very similar to the way we constructed a Guttman scale, the only difference being that we now use different aspects of the empirical relational structure. 4 In fact, the set of maximal rectangles can be determined by an automated algorithm (Ganter, 2010), just like the dimensionality of an empirical structure can be explored by psychometric scaling methods. Consequently, we can identify the empirical content of a category system or a typology as the set of empirical structures that conform to it. 5 Whereas the quantitative strategy was to search for scalable sub-matrices and then relate the constructed variables by a mathematical function, the qualitative strategy is to construct an empirical typology by grouping cases based on their specific similarities. These types can then be related to one another by a conceptual model that describes their semantic and empirical overlap (see Figure 3, right hand side).

Figure 3. Constructing a conceptual model from an empirical relational structure. Individual behaviors are grouped to form abstract types based on them being shared among a specific subset of the cases. Each type constitutes a set of specific commonalities of a class of individuals (this is indicated by the rectangles on the left hand side). The resulting types (“active learner,” “silent dreamer,” “distracted listener,” and “troublemaker”) can then be related to one another to explicate their semantic and empirical overlap, as indicated by the Venn-diagram on the right hand side.
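
The grouping step can be made concrete as well. The sketch below enumerates all maximal rectangles (formal concepts) of a small person × behavior relation by brute force. The relation is invented so that cases B, C, and D share the behaviors discussed above; the snippet illustrates the idea behind the automated algorithms mentioned in the text (e.g., Ganter, 2010) but is not an implementation of them.

```python
# Minimal sketch: brute-force enumeration of maximal rectangles (formal concepts)
# in a person x behavior relation. Invented data; not Ganter's Next-Closure algorithm.
from itertools import combinations

behaviors = ["listen", "take notes", "ask questions", "doodle", "chat", "play"]
relation = {
    "A": {"listen", "take notes", "ask questions"},
    "B": {"listen", "take notes", "doodle"},
    "C": {"listen", "take notes", "doodle", "chat"},
    "D": {"listen", "take notes", "doodle", "chat"},
    "E": {"doodle", "chat", "play"},
}

def extent(bs):
    """All persons showing every behavior in bs."""
    return {p for p, shown in relation.items() if bs <= shown}

def intent(ps):
    """All behaviors shared by every person in ps."""
    shared = set(behaviors)
    for p in ps:
        shared &= relation[p]
    return shared

# Every attribute set bs yields the formal concept (extent(bs), intent(extent(bs))).
concepts = set()
for r in range(len(behaviors) + 1):
    for bs in combinations(behaviors, r):
        ext = extent(set(bs))
        concepts.add((frozenset(ext), frozenset(intent(ext))))

for ext, shared in sorted(concepts, key=lambda c: -len(c[0])):
    print(sorted(ext), "<->", sorted(shared))
# Among the output: ['B', 'C', 'D'] <-> ['doodle', 'listen', 'take notes']  ("silent dreamer")
```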

Variable-Based Models and Case-Based Models

In the previous section, we have argued that qualitative category formation and quantitative measurement can both be characterized as methods to construct abstract representations of empirical relational structures. Instead of focusing on different philosophical approaches to empirical science, we tried to stress the formal similarities between both approaches. However, it is also worth exploring the dissimilarities from a formal perspective.

Following the above analysis, the quantitative approach can be characterized by the use of variable-based models, whereas the qualitative approach is characterized by case-based models (Ragin, 1987 ). Formally, we can identify the rows of an empirical person × behavior matrix with a person-space, and the columns with a corresponding behavior-space. A variable-based model abstracts from the single individuals in a person-space to describe the structure of behaviors on a population level. A case-based model, on the contrary, abstracts from the single behaviors in a behavior-space to describe individual case configurations on the level of abstract categories (see Table 1 ).

Table 1. Variable-based models and case-based models.

From a representational perspective, there is no a priori reason to favor one type of model over the other. Both approaches provide different analytical tools to construct an abstract representation of an empirical relational structure. However, since the two modeling approaches make use of different information (person-space vs. behavior-space), this comes with some important implications for the researcher employing one of the two strategies. These are concerned with the role of deductive and inductive reasoning.

In variable-based models, empirical structures are represented by functional relations between variables. These are usually stated as scientific laws (Carnap, 1928). Formally, these laws correspond to logical expressions of the form

∀i: y_i = f(x_i)
In plain text, this means that y is a function of x for all objects i in the relational structure under consideration. In the classroom example above, one may formulate the following law: for all students in the classroom it holds that “distraction” is a monotone decreasing function of “engagement.” Such a law can be used to derive predictions for single individuals by means of logical deduction: if the above law applies to all students in the classroom, it is possible to calculate the expected distraction from a student's engagement. An empirical observation can now be evaluated against this prediction. If the prediction turns out to be false, the law can be refuted based on the principle of falsification (Popper, 1935). If a scientific law repeatedly withstands such empirical tests, it may be considered to be valid with regard to the relational structure under consideration.
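
As a minimal illustration of this deductive logic, the sketch below encodes a hypothesized law as a Python function, deduces a prediction for every observed case, and reports a counterexample if one exists. The law and the observations are invented, and the deterministic check merely stands in for the statistical evaluation one would use with probabilistic models.

```python
# Minimal sketch: deductive testing of a law of the form ∀i: y_i = f(x_i).
# Invented law and data; in the deterministic case one counterexample refutes the law.

def law(engagement):
    """Hypothesized law: distraction is a monotone decreasing function of engagement."""
    return 3 - engagement

observations = [
    {"student": "A", "engagement": 3, "distraction": 0},
    {"student": "B", "engagement": 2, "distraction": 1},
    {"student": "C", "engagement": 1, "distraction": 2},
]

def first_counterexample(f, data):
    """Return the first observation contradicting the deduced prediction, or None."""
    for obs in data:
        if obs["distraction"] != f(obs["engagement"]):
            return obs
    return None

print(first_counterexample(law, observations))  # None -> the law withstood this test
```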

In case-based models, there are no laws about a population, because the model does not abstract from the cases but from the observed behaviors. A case-based model describes the underlying structure in terms of existential sentences. Formally, this corresponds to a logical expression of the form

∃i: XYZ_i
In plain text, this means that there is at least one case i for which the condition XYZ holds. For example, the above category system implies that there is at least one active learner. This is a statement about a singular observation. It is impossible to deduce a statement about another person from an existential sentence like this. Therefore, the strategy of falsification cannot be applied to test the model's validity in a specific context. If one wishes to generalize to other cases, this is accomplished by inductive reasoning instead. If we observe one person who fulfills the criteria for being called an active learner, we can hypothesize that there may be other persons that are identical to the observed case in this respect. However, we do not arrive at this conclusion by logical deduction, but by induction.

Despite this important distinction, it would be wrong to conclude that variable-based models are intrinsically deductive and case-based models are intrinsically inductive. 6 Both types of reasoning apply to both types of models, but on different levels. Because a variable-based model is based on a person-space, one can use deduction to derive statements about individual persons from abstract population laws. There is an analogous way of reasoning for case-based models: because they are based on a behavior-space, it is possible to deduce statements about singular behaviors. For example, if we know that Peter is an active learner, we can deduce that he takes notes in the classroom. This kind of deductive reasoning can also be applied on a higher level of abstraction to deduce thematic categories from theoretical assumptions (Braun and Clarke, 2006). Similarly, there is an analog for inductive generalization from the perspective of variable-based modeling: since the laws are only quantified over the person-space, generalizations to other behaviors rely on inductive reasoning. For example, it is plausible to assume that highly engaged students tend to do their homework properly; however, in our example this behavior has never been observed. Hence, in variable-based models we usually generalize to other behaviors by means of induction. This kind of inductive reasoning is very common when empirical results are generalized from the laboratory to other behavioral domains.

Although inductive and deductive reasoning are used in both qualitative and quantitative research, it is important to stress the different roles of induction and deduction when models are applied to cases. A variable-based approach implies drawing conclusions about cases by means of logical deduction; a case-based approach implies drawing conclusions about cases by means of inductive reasoning. In the following, we build on this distinction to differentiate between qualitative (bottom-up) and quantitative (top-down) strategies of generalization.

Generalization and the Problem of Replication

We will now extend the formal analysis of quantitative and qualitative approaches to the question of generalization and replicability of empirical findings. To this end, we have to introduce some concepts of formal logic. Formal logic is concerned with the validity of arguments. It provides conditions to evaluate whether certain sentences (conclusions) can be derived from other sentences (premises). In this context, a theory is nothing but a set of sentences (also called axioms). Formal logic provides tools to derive new sentences that must be true, given the axioms are true (Smith, 2020). These derived sentences are called theorems or, in the context of empirical science, predictions or hypotheses. On the syntactic level, the rules of logic only state how to evaluate the truth of a sentence relative to its premises. Whether or not sentences are actually true is formally specified by logical semantics.

On the semantic level, formal logic is intrinsically linked to set theory. For example, a logical statement like “all dogs are mammals” is true if and only if the set of dogs is a subset of the set of mammals. Similarly, the sentence “all chatting students doodle” is true if and only if the set of chatting students is a subset of the set of doodling students (compare Figure 3). Whereas the first sentence is analytically true due to the way we define the words “dog” and “mammal,” the latter can be either true or false, depending on the relational structure we actually observe. We can thus interpret an empirical relational structure as the truth criterion of a scientific theory. From a logical point of view, this corresponds to the semantics of a theory. As shown above, variable-based and case-based models both give a formal representation of the same kinds of empirical structures. Accordingly, both types of models can be stated as formal theories. In the variable-based approach, this corresponds to a set of scientific laws that are quantified over the members of an abstract population (these are the axioms of the theory). In the case-based approach, this corresponds to a set of abstract existential statements about a specific class of individuals.
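
The set-theoretic truth criterion can be written down directly. In the toy snippet below (with invented sets of students), the universal sentence “all chatting students doodle” is evaluated as a subset test against one observed relational structure.

```python
# Minimal sketch: truth of "all chatting students doodle" in one empirical structure.
# Invented sets of students.
chatting = {"D", "E"}
doodling = {"B", "C", "D", "E"}

is_true = chatting <= doodling   # subset test = truth condition of the universal sentence
print(is_true)                   # True here; the same sentence may be false in another classroom
```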

In contrast to mathematical axiom systems, empirical theories are usually not considered to be necessarily true. This means that even if we find no evidence against a theory, it is still possible that it is actually wrong. We may know that a theory is valid in some contexts, yet it may fail when applied to a new set of behaviors (e.g., if we use a different instrumentation to measure a variable) or a new population (e.g., if we draw a new sample).

From a logical perspective, the possibility that a theory may turn out to be false stems from the problem of contingency. A statement is contingent if it is both possibly true and possibly false. Formally, we introduce two modal operators: □ to designate logical necessity, and ◇ to designate logical possibility. Semantically, these operators are very similar to the existential quantifier, ∃, and the universal quantifier, ∀. Whereas ∃ and ∀ refer to the individual objects within one relational structure, the modal operators □ and ◇ range over so-called possible worlds: a statement is possibly true if and only if it is true in at least one accessible possible world, and a statement is necessarily true if and only if it is true in every accessible possible world (Hughes and Cresswell, 1996). Logically, possible worlds are mathematical abstractions, each consisting of a relational structure. Taken together, the relational structures of all accessible possible worlds constitute the formal semantics of necessity, possibility and contingency. 7

In the context of an empirical theory, each possible world may be identified with an empirical relational structure like the above classroom example. Given the set of intended applications of a theory (the scope of the theory, one may say), we can now construct possible world semantics for an empirical theory: each intended application of the theory corresponds to a possible world. For example, a quantified sentence like “all chatting students doodle” may be true in one classroom and false in another one. In terms of possible worlds, this would correspond to a statement of contingency: “it is possible that all chatting students doodle in one classroom, and it is possible that they don't in another classroom.” Note that in the above expression, “all students” refers to the students in only one possible world, whereas “it is possible” refers to the fact that there is at least one possible world for each of the specified cases.
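
These possible world semantics can likewise be illustrated with a small sketch. Each “world” below is one invented classroom, and the modal operators □ and ◇ are evaluated by quantifying over the worlds rather than over the students within one world.

```python
# Minimal sketch: possible-world semantics for □ (necessity) and ◇ (possibility).
# Each world is one intended application (one classroom); membership data are invented.
worlds = {
    "classroom_1": {"chatting": {"D", "E"}, "doodling": {"B", "C", "D", "E"}},
    "classroom_2": {"chatting": {"A", "B"}, "doodling": {"B"}},
}

def holds(world):
    """Truth of 'all chatting students doodle' within a single world."""
    return world["chatting"] <= world["doodling"]

necessarily = all(holds(w) for w in worlds.values())  # true in every accessible world
possibly = any(holds(w) for w in worlds.values())     # true in at least one world

print(f"necessarily: {necessarily}, possibly: {possibly}")  # here: contingent
```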

To apply these possible world semantics to quantitative research, let us reconsider how generalization to other cases works in variable-based models. Due to the syntactic structure of quantitative laws, we can deduce predictions for singular observations from an expression of the form ∀i: y_i = f(x_i). Formally, the logical quantifier ∀ ranges only over the objects of the corresponding empirical relational structure (in our example this would refer to the students in the observed classroom). But what if we want to generalize beyond the empirical structure we actually observed? The standard procedure is to assume an infinitely large, abstract population from which a random sample is drawn. Given the truth of the theory, we can deduce predictions about what we may observe in the sample. Since usually we deal with probabilistic models, we can evaluate our theory by means of the conditional probability of the observations, given the theory holds. This concept of conditional probability is the foundation of statistical significance tests (Hogg et al., 2013), as well as Bayesian estimation (Watanabe, 2018). In terms of possible world semantics, the random sampling model implies that all possible worlds (i.e., all intended applications) can be conceived as empirical sub-structures from a greater population structure. For example, the empirical relational structure constituted by the observed behaviors in a classroom would be conceived as a sub-matrix of the population person × behavior matrix. It follows that, if a scientific law is true in the population, it will be true in all possible worlds, i.e., it will be necessarily true. Formally, this corresponds to an expression of the form

□(∀i: y_i = f(x_i))
The statistical generalization model thus constitutes a top-down strategy for dealing with individual contexts that is analogous to the way variable-based models are applied to individual cases (compare Table 1 ). Consequently, if we apply a variable-based model to a new context and find out that it does not fit the data (i.e., there is a statistically significant deviation from the model predictions), we have reason to doubt the validity of the theory. This is what makes the problem of low replicability so important: we observe that the predictions are wrong in a new study; and because we apply a top-down strategy of generalization to contexts beyond the ones we observed, we see our whole theory at stake.

Qualitative research, on the contrary, follows a different strategy of generalization. Since case-based models are formulated by a set of context-specific existential sentences, there is no need for universal truth or necessity. In contrast to statistical generalization to other cases by means of random sampling from an abstract population, the usual strategy in case-based modeling is to employ a bottom-up strategy of generalization that is analogous to the way case-based models are applied to individual cases. Formally, this may be expressed by stating that the observed qualia exist in at least one possible world, i.e., the theory is possibly true:

◇(∃i: XYZ_i)
This statement is analogous to the way we apply case-based models to individual cases (compare Table 1 ). Consequently, the set of intended applications of the theory does not follow from a sampling model, but from theoretical assumptions about which cases may be similar to the observed cases with respect to certain relevant characteristics. For example, if we observe that certain behaviors occur together in one classroom, following a bottom-up strategy of generalization, we will hypothesize why this might be the case. If we do not replicate this finding in another context, this does not question the model itself, since it was a context-specific theory all along. Instead, we will revise our hypothetical assumptions about why the new context is apparently less similar to the first one than we originally thought. Therefore, if an empirical finding does not replicate, we are more concerned about our understanding of the cases than about the validity of our theory.

Whereas statistical generalization provides us with a formal (and thus somehow more objective) apparatus to evaluate the universal validity of our theories, the bottom-up strategy forces us to think about the class of intended applications on theoretical grounds. This means that we have to ask: what are the boundary conditions of our theory? In the above classroom example, following a bottom-up strategy, we would build on our preliminary understanding of the cases in one context (e.g., a public school) to search for similar and contrasting cases in other contexts (e.g., a private school). We would then re-evaluate our theoretical description of the data and explore what makes cases similar or dissimilar with regard to our theory. This enables us to expand the class of intended applications alongside the theory.

Of course, neither of these strategies is superior per se. Nevertheless, they rely on different assumptions and may thus be more or less adequate in different contexts. The statistical strategy relies on the assumption of a universal population and invariant measurements. This means we assume that (a) all samples are drawn from the same population and (b) all variables refer to the same behavioral classes. If these assumptions are true, statistical generalization is valid and therefore provides a valuable tool for the testing of empirical theories. The bottom-up strategy of generalization relies on the idea that contexts may be classified as being more or less similar based on characteristics that are not part of the model being evaluated. If such a similarity relation across contexts is feasible, the bottom-up strategy is valid as well. Depending on the strategy of generalization, replication of empirical research serves two very different purposes. Following the (top-down) principle of generalization by deduction from scientific laws, replications are empirical tests of the theory itself, and failed replications question the theory on a fundamental level. Following the (bottom-up) principle of generalization by induction to similar contexts, replications are a means to explore the boundary conditions of a theory. Consequently, failed replications question the scope of the theory and help to shape the set of intended applications.

We have argued that quantitative and qualitative research are best understood by means of the structure of the employed models. Quantitative science mainly relies on variable-based models and usually employs a top-down strategy of generalization from an abstract population to individual cases. Qualitative science prefers case-based models and usually employs a bottom-up strategy of generalization. We further showed that failed replications have very different implications depending on the underlying strategy of generalization. Whereas in the top-down strategy, replications are used to test the universal validity of a model, in the bottom-up strategy, replications are used to explore the scope of a model. We will now address the implications of this analysis for psychological research with regard to the problem of replicability.

Modern-day psychology almost exclusively follows a top-down strategy of generalization. Given the quantitative background of most psychological theories, this is hardly surprising. Following the general structure of variable-based models, the individual case is not the focus of the analysis. Instead, scientific laws are stated on the level of an abstract population. Therefore, when applying the theory to a new context, a statistical sampling model seems to be the natural consequence. However, this is not the only possible strategy. From a logical point of view, there is no reason to assume that a quantitative law like ∀i: y_i = f(x_i) implies that the law is necessarily true, i.e., □(∀i: y_i = f(x_i)). Instead, one might just as well define the scope of the theory following an inductive strategy. 8 Formally, this would correspond to the assumption that the observed law is possibly true, i.e., ◇(∀i: y_i = f(x_i)). For example, we may discover a functional relation between “engagement” and “distraction” without referring to an abstract universal population of students. Instead, we may hypothesize under which conditions this functional relation may be valid and use these assumptions to inductively generalize to other cases.

If we take this seriously, this would require us to specify the intended applications of the theory: in which contexts do we expect the theory to hold? Or, equivalently, what are the boundary conditions of the theory? These boundary conditions may be specified either intensionally, i.e., by giving external criteria for contexts being similar enough to the ones already studied to expect a successful application of the theory. Or they may be specified extensionally, by enumerating the contexts where the theory has already been shown to be valid. These boundary conditions need not be restricted to the population we refer to, but include all kinds of contextual factors. Therefore, adopting a bottom-up strategy, we are forced to think about these factors and make them an integral part of our theories.

In fact, there is good reason to believe that bottom-up generalization may be more adequate in many psychological studies. Apart from the pitfalls associated with statistical generalization that have been extensively discussed in recent years (e.g., p-hacking, underpowered studies, publication bias), it is worth reflecting on whether the underlying assumptions are met in a particular context. For example, many samples used in experimental psychology are not randomly drawn from a large population, but are convenience samples. If we use statistical models with non-random samples, we have to assume that the observations vary as if drawn from a random sample. This may indeed be the case for randomized experiments, because all variation between the experimental conditions apart from the independent variable will be random due to the randomization procedure. In this case, a classical significance test may be regarded as an approximation to a randomization test (Edgington and Onghena, 2007). However, if we interpret a significance test as an approximate randomization test, we test not for generalization but for internal validity. Hence, even if we use statistical significance tests when assumptions about random sampling are violated, we still have to use a different strategy of generalization. This issue has been discussed in the context of small-N studies, where variable-based models are applied to very small samples, sometimes consisting of only one individual (Dugard et al., 2012). The bottom-up strategy of generalization that is employed by qualitative researchers provides such an alternative.
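
To illustrate the reading of a significance test as an approximate randomization test, the sketch below runs a simple permutation test on invented outcome data from two randomly assigned conditions. Under random assignment the resulting p-value speaks to internal validity, not to generalization beyond the observed sample.

```python
# Minimal sketch: a permutation (randomization) test for a mean difference between
# two randomly assigned conditions. Outcome data are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
treatment = np.array([7.1, 6.8, 7.5, 8.0, 6.9])
control = np.array([6.2, 5.9, 6.5, 6.1, 6.7])

observed = treatment.mean() - control.mean()
pooled = np.concatenate([treatment, control])

n_perm = 10_000
extreme = 0
for _ in range(n_perm):
    shuffled = rng.permutation(pooled)
    diff = shuffled[:len(treatment)].mean() - shuffled[len(treatment):].mean()
    if abs(diff) >= abs(observed):
        extreme += 1

print(f"observed difference = {observed:.2f}, randomization p ≈ {extreme / n_perm:.4f}")
```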

Another important issue in this context is the question of measurement invariance. If we construct a variable-based model in one context, the variables refer to those behaviors that constitute the underlying empirical relational structure. For example, we may construct an abstract measure of “distraction” using the observed behaviors in a certain context. We will then use the term “distraction” as a theoretical term referring to the variable we have just constructed to represent the underlying empirical relational structure. Let us now imagine we apply this theory to a new context. Even if the individuals in our new context are part of the same population, we may still get into trouble if the observed behaviors differ from those used in the original study. How do we know whether these behaviors constitute the same variable? We have to ensure that in any new context, our measures are valid for the variables in our theory. Without a proper measurement model, this will be hard to achieve (Buntins et al., 2017 ). Again, we are faced with the necessity to think of the boundary conditions of our theories. In which contexts (i.e., for which sets of individuals and behaviors) do we expect our theory to work?

If we follow the rationale of inductive generalization, we can explore the boundary conditions of a theory with every new empirical study. We thus widen the scope of our theory by comparing successful applications in different contexts and unsuccessful applications in similar contexts. This may ultimately lead to a more general theory, maybe even one of universal scope. However, unless we have such a general theory, we might be better off if we treat unsuccessful replications not as a sign of failure, but as a chance to learn.

Author Contributions

MB conceived the original idea and wrote the first draft of the paper. MS helped to further elaborate and scrutinize the arguments. All authors contributed to the final version of the manuscript.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

We would like to thank Annette Scheunpflug for helpful comments on an earlier version of the manuscript.

1 A person × behavior matrix constitutes a very simple relational structure that is common in psychological research. This is why it is chosen here as a minimal example. However, more complex structures are possible, e.g., by relating individuals to behaviors over time, with individuals nested within groups etc. For a systematic overview, compare Coombs ( 1964 ).

2 This notion of empirical content applies only to deterministic models. The empirical content of a probabilistic model consists in the probability distribution over all possible empirical structures.

3 For example, neither the SAGE Handbook of qualitative data analysis edited by Flick ( 2014 ) nor the Oxford Handbook of Qualitative Research edited by Leavy ( 2014 ) mention formal approaches to category formation.

4 Note also that the described structure is empirically richer than a nominal scale. Therefore, it is not adequate to reduce qualitative category formation to a special (and somewhat trivial) kind of measurement.

5 It is possible to extend this notion of empirical content to the probabilistic case (this would correspond to applying a latent class analysis). But, since qualitative research usually does not rely on formal algorithms (neither deterministic nor probabilistic), there is currently little practical use of such a concept.

6 We do not elaborate on abductive reasoning here, since, given an empirical relational structure, the concept can be applied to both types of models in the same way (Schurz, 2008 ). One could argue that the underlying relational structure is not given a priori but has to be constructed by the researcher and will itself be influenced by theoretical expectations. Therefore, abductive reasoning may be necessary to establish an empirical relational structure in the first place.

7 We shall not elaborate on the metaphysical meaning of possible worlds here, since we are only concerned with empirical theories [but see Tooley ( 1999 ), for an overview].

8 Of course, this also means that it would be equally reasonable to employ a top-down strategy of generalization using a case-based model by postulating that □(∃i: XYZ_i). The implications for case-based models are certainly worth exploring, but lie beyond the scope of this article.

  • Agresti A. (2013). Categorical Data Analysis, 3rd Edn. Wiley Series in Probability and Statistics. Hoboken, NJ: Wiley.
  • Borsboom D. (2005). Measuring the Mind: Conceptual Issues in Contemporary Psychometrics. Cambridge: Cambridge University Press. 10.1017/CBO9780511490026
  • Braun V., Clarke V. (2006). Using thematic analysis in psychology. Qual. Res. Psychol. 3, 77–101. 10.1191/1478088706qp063oa
  • Buntins M., Buntins K., Eggert F. (2017). Clarifying the concept of validity: from measurement to everyday language. Theory Psychol. 27, 703–710. 10.1177/0959354317702256
  • Carnap R. (1928). The Logical Structure of the World. Berkeley, CA: University of California Press.
  • Coombs C. H. (1964). A Theory of Data. New York, NY: Wiley.
  • Creswell J. W. (2015). A Concise Introduction to Mixed Methods Research. Los Angeles, CA: Sage.
  • Dugard P., File P., Todman J. B. (2012). Single-Case and Small-N Experimental Designs: A Practical Guide to Randomization Tests, 2nd Edn. New York, NY: Routledge. 10.4324/9780203180938
  • Edgington E., Onghena P. (2007). Randomization Tests, 4th Edn. Hoboken, NJ: CRC Press. 10.1201/9781420011814
  • Everett J. A. C., Earp B. D. (2015). A tragedy of the (academic) commons: interpreting the replication crisis in psychology as a social dilemma for early-career researchers. Front. Psychol. 6:1152. 10.3389/fpsyg.2015.01152
  • Flick U. (Ed.). (2014). The Sage Handbook of Qualitative Data Analysis. London: Sage. 10.4135/9781446282243
  • Freeman M., Demarrais K., Preissle J., Roulston K., St. Pierre E. A. (2007). Standards of evidence in qualitative research: an incitement to discourse. Educ. Res. 36, 25–32. 10.3102/0013189X06298009
  • Ganter B. (2010). Two basic algorithms in concept analysis, in Lecture Notes in Computer Science. Formal Concept Analysis, Vol. 5986, eds Hutchison D., Kanade T., Kittler J., Kleinberg J. M., Mattern F., Mitchell J. C., et al. (Berlin; Heidelberg: Springer), 312–340. 10.1007/978-3-642-11928-6_22
  • Ganter B., Wille R. (1999). Formal Concept Analysis. Berlin; Heidelberg: Springer. 10.1007/978-3-642-59830-2
  • Guttman L. (1944). A basis for scaling qualitative data. Am. Sociol. Rev. 9:139. 10.2307/2086306
  • Hogg R. V., Mckean J. W., Craig A. T. (2013). Introduction to Mathematical Statistics, 7th Edn. Boston, MA: Pearson.
  • Hughes G. E., Cresswell M. J. (1996). A New Introduction to Modal Logic. London; New York, NY: Routledge. 10.4324/9780203290644
  • Klein R. A., Ratliff K. A., Vianello M., Adams R. B., Bahník Š., Bernstein M. J., et al. (2014). Investigating variation in replicability. Soc. Psychol. 45, 142–152. 10.1027/1864-9335/a000178
  • Krantz D. H., Luce D., Suppes P., Tversky A. (1971). Foundations of Measurement Volume I: Additive and Polynomial Representations. New York, NY; London: Academic Press. 10.1016/B978-0-12-425401-5.50011-8
  • Leavy P. (2014). The Oxford Handbook of Qualitative Research. New York, NY: Oxford University Press. 10.1093/oxfordhb/9780199811755.001.0001
  • Maxwell S. E., Lau M. Y., Howard G. S. (2015). Is psychology suffering from a replication crisis? What does “failure to replicate” really mean? Am. Psychol. 70, 487–498. 10.1037/a0039400
  • Miles M. B., Huberman A. M., Saldaña J. (2014). Qualitative Data Analysis: A Methods Sourcebook, 3rd Edn. Los Angeles, CA: Sage.
  • Open Science Collaboration (2015). Estimating the reproducibility of psychological science. Science 349:aac4716. 10.1126/science.aac4716
  • Popper K. (1935). Logik der Forschung. Wien: Springer. 10.1007/978-3-7091-4177-9
  • Ragin C. (1987). The Comparative Method: Moving Beyond Qualitative and Quantitative Strategies. Berkeley, CA: University of California Press.
  • Rihoux B., Ragin C. (2009). Configurational Comparative Methods: Qualitative Comparative Analysis (QCA) and Related Techniques. Thousand Oaks, CA: Sage. 10.4135/9781452226569
  • Scheunpflug A., Krogull S., Franz J. (2016). Understanding learning in world society: qualitative reconstructive research in global learning and learning for sustainability. Int. J. Dev. Educ. Glob. Learn. 7, 6–23. 10.18546/IJDEGL.07.3.02
  • Schurz G. (2008). Patterns of abduction. Synthese 164, 201–234. 10.1007/s11229-007-9223-4
  • Shrout P. E., Rodgers J. L. (2018). Psychology, science, and knowledge construction: broadening perspectives from the replication crisis. Annu. Rev. Psychol. 69, 487–510. 10.1146/annurev-psych-122216-011845
  • Smith P. (2020). An Introduction to Formal Logic. Cambridge: Cambridge University Press. 10.1017/9781108328999
  • Suppes P., Krantz D. H., Luce D., Tversky A. (1971). Foundations of Measurement Volume II: Geometrical, Threshold, and Probabilistic Representations. New York, NY; London: Academic Press.
  • Tooley M. (Ed.). (1999). Necessity and Possibility. The Metaphysics of Modality. New York, NY; London: Garland Publishing.
  • Trafimow D. (2018). An a priori solution to the replication crisis. Philos. Psychol. 31, 1188–1214. 10.1080/09515089.2018.1490707
  • Watanabe S. (2018). Mathematical Foundations of Bayesian Statistics. CRC Monographs on Statistics and Applied Probability. Boca Raton, FL: Chapman and Hall.
  • Wiggins B. J., Chrisopherson C. D. (2019). The replication crisis in psychology: an overview for theoretical and philosophical psychology. J. Theor. Philos. Psychol. 39, 202–217. 10.1037/teo0000137

Quantitative and Qualitative Research Methods

Andrew England

Quantitative research uses methods that seek to explain phenomena by collecting numerical data, which are then analysed mathematically, typically by statistics. With quantitative approaches, the data produced are always numerical; if there are no numbers, then the methods are not quantitative. Many phenomena lend themselves to quantitative methods because the relevant information is already available numerically. Qualitative methods provide a mechanism for answering questions based on the collection of non-numerical data (i.e., words, actions, behaviours). Both quantitative and qualitative methodologies are important in medical imaging and radiation therapy. In some instances, quantitative and qualitative approaches can be combined into a mixed-methods approach. This chapter discusses these methodological approaches to research from both medical imaging and radiation therapy perspectives.

  • Quantitative research
  • Qualitative research
  • Mixed methods research
  • Experimental studies
  • Randomised controlled trials
  • Quasi-experimental studies
  • Thematic analysis
  • Statistical analysis



Author information

Andrew England, Discipline of Medical Imaging, School of Medicine, University College Cork, Cork, Ireland. Correspondence to Andrew England.


About this chapter

England, A. (2021). Quantitative and Qualitative Research Methods. In: Seeram, E., Davidson, R., England, A., McEntee, M.F. (eds) Research for Medical Imaging and Radiation Sciences. Springer, Cham. https://doi.org/10.1007/978-3-030-79956-4_5

Published: 03 January 2022. Print ISBN: 978-3-030-79955-7. Online ISBN: 978-3-030-79956-4. © 2021 Springer Nature Switzerland AG.


International Journal of Quantitative and Qualitative Research Methods (IJQQRM)


The International Journal of Quantitative and Qualitative Research Methods is run by the European Centre for Research, Training and Development, United Kingdom. The journal publishes academic, theoretical and methodological articles relating to quantitative and qualitative research in professional and service settings. It covers issues addressed by researchers within academic and independent research organizations in different areas. The scope of the journal focuses on ongoing and emerging methodological debates across a variety of methods, including mixed and comparative methods relating to philosophical, theoretical, ethical, political and practical issues. It is also an international medium for the publication of social research methodology and practices across a wide range of disciplines, and an avenue for researchers in different sectors to consider and evaluate methods as they relate to research practice. It also publishes book reviews of potential interest to readers. The journal is published in both print and online versions; the online version is free to access and download.



Qualitative vs. Quantitative Research | Differences, Examples & Methods

Published on April 12, 2019 by Raimo Streefkerk . Revised on June 22, 2023.

When collecting and analyzing data, quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings. Both are important for gaining different kinds of knowledge.

Common quantitative methods include experiments, observations recorded as numbers, and surveys with closed-ended questions.

Quantitative research is at risk for research biases including information bias, omitted variable bias, sampling bias, or selection bias.

Qualitative research

Qualitative research is expressed in words. It is used to understand concepts, thoughts or experiences. This type of research enables you to gather in-depth insights on topics that are not well understood.

Common qualitative methods include interviews with open-ended questions, observations described in words, and literature reviews that explore concepts and theories.

Table of contents

  • The differences between quantitative and qualitative research
  • Data collection methods
  • When to use qualitative vs. quantitative research
  • How to analyze qualitative and quantitative data
  • Other interesting articles
  • Frequently asked questions about qualitative and quantitative research

Quantitative and qualitative research use different research methods to collect and analyze data, and they allow you to answer different kinds of research questions.

Qualitative vs. quantitative research

Quantitative and qualitative data can be collected using various methods. It is important to use a data collection method that will help answer your research question(s).

Many data collection methods can be either qualitative or quantitative. For example, in surveys, observational studies or case studies , your data can be represented as numbers (e.g., using rating scales or counting frequencies) or as words (e.g., with open-ended questions or descriptions of what you observe).

However, some methods are more commonly used in one type or the other.

Quantitative data collection methods

  • Surveys :  List of closed or multiple choice questions that is distributed to a sample (online, in person, or over the phone).
  • Experiments : Situation in which different types of variables are controlled and manipulated to establish cause-and-effect relationships.
  • Observations : Observing subjects in a natural environment where variables can’t be controlled.

Qualitative data collection methods

  • Interviews : Asking open-ended questions verbally to respondents.
  • Focus groups : Discussion among a group of people about a topic to gather opinions that can be used for further research.
  • Ethnography : Participating in a community or organization for an extended period of time to closely observe culture and behavior.
  • Literature review : Survey of published works by other authors.

A rule of thumb for deciding whether to use qualitative or quantitative data is:

  • Use quantitative research if you want to confirm or test something (a theory or hypothesis )
  • Use qualitative research if you want to understand something (concepts, thoughts, experiences)

For most research topics you can choose a qualitative, quantitative or mixed methods approach . Which type you choose depends on, among other things, whether you’re taking an inductive vs. deductive research approach ; your research question(s) ; whether you’re doing experimental , correlational , or descriptive research ; and practical considerations such as time, money, availability of data, and access to respondents.

Quantitative research approach

You survey 300 students at your university and ask them questions such as: “on a scale from 1-5, how satisfied are you with your professors?”

You can perform statistical analysis on the data and draw conclusions such as: “on average students rated their professors 4.4”.
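
A back-of-the-envelope version of this analysis can be scripted. The sketch below is a minimal illustration in Python, assuming a short list of invented 1-5 ratings rather than real survey data:

```python
import statistics

# Hypothetical 1-5 satisfaction ratings collected from a student survey
ratings = [5, 4, 4, 3, 5, 4, 5, 2, 4, 5, 4, 3, 4, 5, 4]

# Average rating across respondents
mean_rating = statistics.mean(ratings)
print(f"Average satisfaction: {mean_rating:.1f}")
```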

Qualitative research approach

You conduct in-depth interviews with 15 students and ask them open-ended questions such as: “How satisfied are you with your studies?”, “What is the most positive aspect of your study program?” and “What can be done to improve the study program?”

Based on the answers you get you can ask follow-up questions to clarify things. You transcribe all interviews using transcription software and try to find commonalities and patterns.

Mixed methods approach

You conduct interviews to find out how satisfied students are with their studies. Through open-ended questions you learn things you never thought about before and gain new insights. Later, you use a survey to test these insights on a larger scale.

It’s also possible to start with a survey to find out the overall trends, followed by interviews to better understand the reasons behind the trends.

Qualitative or quantitative data by itself can’t prove or demonstrate anything, but has to be analyzed to show its meaning in relation to the research questions. The method of analysis differs for each type of data.

Analyzing quantitative data

Quantitative data is based on numbers. Simple math or more advanced statistical analysis is used to discover commonalities or patterns in the data. The results are often reported in graphs and tables.

Applications such as Excel, SPSS, or R can be used to calculate things like:

  • Average scores ( means )
  • The number of times a particular answer was given
  • The correlation or causation between two or more variables
  • The reliability and validity of the results
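
These calculations can also be scripted rather than done in a spreadsheet. The following is a minimal sketch in Python using pandas and SciPy; the variables and values are invented placeholders, not output from any real study:

```python
import pandas as pd
from scipy import stats

# Hypothetical survey responses: satisfaction rating (1-5) and weekly study hours
df = pd.DataFrame({
    "satisfaction": [4, 5, 3, 4, 2, 5, 4, 3, 5, 4],
    "study_hours":  [12, 15, 8, 11, 5, 16, 10, 9, 14, 12],
})

# Average score
print("Mean satisfaction:", df["satisfaction"].mean())

# How often each answer was given
print(df["satisfaction"].value_counts().sort_index())

# Correlation between the two variables
r, p = stats.pearsonr(df["satisfaction"], df["study_hours"])
print(f"Pearson r = {r:.2f} (p = {p:.3f})")
```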

Analyzing qualitative data

Qualitative data is more difficult to analyze than quantitative data. It consists of text, images or videos instead of numbers.

Some common approaches to analyzing qualitative data include:

  • Qualitative content analysis : Tracking the occurrence, position and meaning of words or phrases
  • Thematic analysis : Closely examining the data to identify the main themes and patterns
  • Discourse analysis : Studying how communication works in social contexts
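
The simplest form of qualitative content analysis — counting how often chosen words or phrases occur — can be partly automated. The sketch below is a rough illustration in Python; the interview excerpts and keyword list are invented for the example, and in practice such counts only support, never replace, the researcher's interpretation:

```python
from collections import Counter
import re

# Hypothetical interview excerpts
transcripts = [
    "I feel supported by my professors, but the workload is very high.",
    "The workload is manageable, and the feedback from professors helps a lot.",
    "More feedback would be great; sometimes the workload leaves no time to reflect.",
]

# Keywords chosen by the researcher as indicators of candidate themes
keywords = ["workload", "feedback", "support"]

counts = Counter()
for text in transcripts:
    tokens = re.findall(r"[a-z]+", text.lower())
    for kw in keywords:
        # Count tokens sharing the keyword stem (e.g. "supported" counts for "support")
        counts[kw] += sum(1 for t in tokens if t.startswith(kw))

print(counts)  # e.g. Counter({'workload': 3, 'feedback': 2, 'support': 1})
```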

If you want to know more about statistics , methodology , or research bias , make sure to check out some of our other articles with explanations and examples.


Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses . Qualitative methods allow you to explore concepts and experiences in more detail.

In mixed methods research , you use both qualitative and quantitative data collection and analysis methods to answer your research question .

The research methods you use depend on the type of data you need to answer your research question .

  • If you want to measure something or test a hypothesis , use quantitative methods . If you want to explore ideas, thoughts and meanings, use qualitative methods .
  • If you want to analyze a large amount of readily-available data, use secondary data. If you want data specific to your purposes with control over how it is generated, collect primary data.
  • If you want to establish cause-and-effect relationships between variables , use experimental methods. If you want to understand the characteristics of a research subject, use descriptive methods.

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organizations.

There are various approaches to qualitative data analysis , but they all share five steps in common:

  • Prepare and organize your data.
  • Review and explore your data.
  • Develop a data coding system.
  • Assign codes to the data.
  • Identify recurring themes.

The specifics of each step depend on the focus of the analysis. Some common approaches include textual analysis , thematic analysis , and discourse analysis .
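
Software can help with the bookkeeping in steps 3-5, although the coding itself remains an interpretive act. The sketch below is a minimal illustration in Python, assuming an invented code book (code names and trigger phrases) and invented data segments:

```python
from collections import defaultdict

# Hypothetical code book: code -> trigger phrases the researcher has defined
code_book = {
    "TIME_PRESSURE": ["no time", "deadline", "rushed"],
    "PEER_SUPPORT":  ["study group", "classmates helped", "friends"],
}

# Hypothetical interview segments to be coded
segments = [
    "I always feel rushed before a deadline.",
    "My classmates helped me get through the first year.",
    "There is simply no time left after lectures.",
]

# Assign codes to segments whenever a trigger phrase appears
assigned = defaultdict(list)
for seg in segments:
    for code, triggers in code_book.items():
        if any(t in seg.lower() for t in triggers):
            assigned[code].append(seg)

# Tally codes so recurring themes become visible
for code, coded_segments in assigned.items():
    print(f"{code}: {len(coded_segments)} segment(s)")
```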

A research project is an academic, scientific, or professional undertaking to answer a research question . Research projects can take many forms, such as qualitative or quantitative , descriptive , longitudinal , experimental , or correlational . What kind of research approach you choose will depend on your topic.

Cite this Scribbr article


Streefkerk, R. (2023, June 22). Qualitative vs. Quantitative Research | Differences, Examples & Methods. Scribbr. Retrieved April 6, 2024, from https://www.scribbr.com/methodology/qualitative-quantitative-research/


Article Contents

  • Introduction
  • When to use qualitative research
  • How to judge qualitative research
  • Conclusions
  • Authors' roles
  • Conflict of interest


Qualitative research methods: when to use them and how to judge them


K. Hammarberg, M. Kirkman, S. de Lacey, Qualitative research methods: when to use them and how to judge them, Human Reproduction , Volume 31, Issue 3, March 2016, Pages 498–501, https://doi.org/10.1093/humrep/dev334


In March 2015, an impressive set of guidelines for best practice on how to incorporate psychosocial care in routine infertility care was published by the ESHRE Psychology and Counselling Guideline Development Group ( ESHRE Psychology and Counselling Guideline Development Group, 2015 ). The authors report that the guidelines are based on a comprehensive review of the literature and we congratulate them on their meticulous compilation of evidence into a clinically useful document. However, when we read the methodology section, we were baffled and disappointed to find that evidence from research using qualitative methods was not included in the formulation of the guidelines. Despite stating that ‘qualitative research has significant value to assess the lived experience of infertility and fertility treatment’, the group excluded this body of evidence because qualitative research is ‘not generally hypothesis-driven and not objective/neutral, as the researcher puts him/herself in the position of the participant to understand how the world is from the person's perspective’.

Qualitative and quantitative research methods are often juxtaposed as representing two different world views. In quantitative circles, qualitative research is commonly viewed with suspicion and considered lightweight because it involves small samples which may not be representative of the broader population, it is seen as not objective, and the results are assessed as biased by the researchers' own experiences or opinions. In qualitative circles, quantitative research can be dismissed as over-simplifying individual experience in the cause of generalisation, failing to acknowledge researcher biases and expectations in research design, and requiring guesswork to understand the human meaning of aggregate data.

As social scientists who investigate psychosocial aspects of human reproduction, we use qualitative and quantitative methods, separately or together, depending on the research question. The crucial part is to know when to use what method.

The peer-review process is a pillar of scientific publishing. One of the important roles of reviewers is to assess the scientific rigour of the studies from which authors draw their conclusions. If rigour is lacking, the paper should not be published. As with research using quantitative methods, research using qualitative methods is home to the good, the bad and the ugly. It is essential that reviewers know the difference. Rejection letters are hard to take but more often than not they are based on legitimate critique. However, from time to time it is obvious that the reviewer has little grasp of what constitutes rigour or quality in qualitative research. The first author (K.H.) recently submitted a paper that reported findings from a qualitative study about fertility-related knowledge and information-seeking behaviour among people of reproductive age. In the rejection letter one of the reviewers (not from Human Reproduction ) lamented, ‘Even for a qualitative study, I would expect that some form of confidence interval and paired t-tables analysis, etc. be used to analyse the significance of results'. This comment reveals the reviewer's inappropriate application to qualitative research of criteria relevant only to quantitative research.

In this commentary, we give illustrative examples of questions most appropriately answered using qualitative methods and provide general advice about how to appraise the scientific rigour of qualitative studies. We hope this will help the journal's reviewers and readers appreciate the legitimate place of qualitative research and ensure we do not throw the baby out with the bath water by excluding or rejecting papers simply because they report the results of qualitative studies.

In psychosocial research, ‘quantitative’ research methods are appropriate when ‘factual’ data are required to answer the research question; when general or probability information is sought on opinions, attitudes, views, beliefs or preferences; when variables can be isolated and defined; when variables can be linked to form hypotheses before data collection; and when the question or problem is known, clear and unambiguous. Quantitative methods can reveal, for example, what percentage of the population supports assisted conception, their distribution by age, marital status, residential area and so on, as well as changes from one survey to the next ( Kovacs et al. , 2012 ); the number of donors and donor siblings located by parents of donor-conceived children ( Freeman et al. , 2009 ); and the relationship between the attitude of donor-conceived people to learning of their donor insemination conception and their family ‘type’ (one or two parents, lesbian or heterosexual parents; Beeson et al. , 2011 ).

In contrast, ‘qualitative’ methods are used to answer questions about experience, meaning and perspective, most often from the standpoint of the participant. These data are usually not amenable to counting or measuring. Qualitative research techniques include ‘small-group discussions’ for investigating beliefs, attitudes and concepts of normative behaviour; ‘semi-structured interviews’, to seek views on a focused topic or, with key informants, for background information or an institutional perspective; ‘in-depth interviews’ to understand a condition, experience, or event from a personal perspective; and ‘analysis of texts and documents’, such as government reports, media articles, websites or diaries, to learn about distributed or private knowledge.

Qualitative methods have been used to reveal, for example, potential problems in implementing a proposed trial of elective single embryo transfer, where small-group discussions enabled staff to explain their own resistance, leading to an amended approach ( Porter and Bhattacharya, 2005 ). Small-group discussions among assisted reproductive technology (ART) counsellors were used to investigate how the welfare principle is interpreted and practised by health professionals who must apply it in ART ( de Lacey et al. , 2015 ). When legislative change meant that gamete donors could seek identifying details of people conceived from their gametes, parents needed advice on how best to tell their children. Small-group discussions were convened to ask adolescents (not known to be donor-conceived) to reflect on how they would prefer to be told ( Kirkman et al. , 2007 ).

When a population cannot be identified, such as anonymous sperm donors from the 1980s, a qualitative approach with wide publicity can reach people who do not usually volunteer for research and reveal (for example) their attitudes to proposed legislation to remove anonymity with retrospective effect ( Hammarberg et al. , 2014 ). When researchers invite people to talk about their reflections on experience, they can sometimes learn more than they set out to discover. In describing their responses to proposed legislative change, participants also talked about people conceived as a result of their donations, demonstrating various constructions and expectations of relationships ( Kirkman et al. , 2014 ).

Interviews with parents in lesbian-parented families generated insight into the diverse meanings of the sperm donor in the creation and life of the family ( Wyverkens et al. , 2014 ). Oral and written interviews also revealed the embarrassment and ambivalence surrounding sperm donors evident in participants in donor-assisted conception ( Kirkman, 2004 ). The way in which parents conceptualise unused embryos and why they discard rather than donate was explored and understood via in-depth interviews, showing how and why the meaning of those embryos changed with parenthood ( de Lacey, 2005 ). In-depth interviews were also used to establish the intricate understanding by embryo donors and recipients of the meaning of embryo donation and the families built as a result ( Goedeke et al. , 2015 ).

It is possible to combine quantitative and qualitative methods, although great care should be taken to ensure that the theory behind each method is compatible and that the methods are being used for appropriate reasons. The two methods can be used sequentially (first a quantitative then a qualitative study or vice versa), where the first approach is used to facilitate the design of the second; they can be used in parallel as different approaches to the same question; or a dominant method may be enriched with a small component of an alternative method (such as qualitative interviews ‘nested’ in a large survey). It is important to note that free text in surveys represents qualitative data but does not constitute qualitative research. Qualitative and quantitative methods may be used together for corroboration (hoping for similar outcomes from both methods), elaboration (using qualitative data to explain or interpret quantitative data, or to demonstrate how the quantitative findings apply in particular cases), complementarity (where the qualitative and quantitative results differ but generate complementary insights) or contradiction (where qualitative and quantitative data lead to different conclusions). Each has its advantages and challenges ( Brannen, 2005 ).

Qualitative research is gaining increased momentum in the clinical setting and carries different criteria for evaluating its rigour or quality. Quantitative studies generally involve the systematic collection of data about a phenomenon, using standardized measures and statistical analysis. In contrast, qualitative studies involve the systematic collection, organization, description and interpretation of textual, verbal or visual data. The particular approach taken determines to a certain extent the criteria used for judging the quality of the report. However, research using qualitative methods can be evaluated ( Dixon-Woods et al. , 2006 ; Young et al. , 2014 ) and there are some generic guidelines for assessing qualitative research ( Kitto et al. , 2008 ).

Although the terms ‘reliability’ and ‘validity’ are contentious among qualitative researchers ( Lincoln and Guba, 1985 ) with some preferring ‘verification’, research integrity and robustness are as important in qualitative studies as they are in other forms of research. It is widely accepted that qualitative research should be ethical, important, intelligibly described, and use appropriate and rigorous methods ( Cohen and Crabtree, 2008 ). In research investigating data that can be counted or measured, replicability is essential. When other kinds of data are gathered in order to answer questions of personal or social meaning, we need to be able to capture real-life experiences, which cannot be identical from one person to the next. Furthermore, meaning is culturally determined and subject to evolutionary change. The way of explaining a phenomenon—such as what it means to use donated gametes—will vary, for example, according to the cultural significance of ‘blood’ or genes, interpretations of marital infidelity and religious constructs of sexual relationships and families. Culture may apply to a country, a community, or other actual or virtual group, and a person may be engaged at various levels of culture. In identifying meaning for members of a particular group, consistency may indeed be found from one research project to another. However, individuals within a cultural group may present different experiences and perceptions or transgress cultural expectations. That does not make them ‘wrong’ or invalidate the research. Rather, it offers insight into diversity and adds a piece to the puzzle to which other researchers also contribute.

In qualitative research the objective stance is obsolete, the researcher is the instrument, and ‘subjects’ become ‘participants’ who may contribute to data interpretation and analysis ( Denzin and Lincoln, 1998 ). Qualitative researchers defend the integrity of their work by different means: trustworthiness, credibility, applicability and consistency are the evaluative criteria ( Leininger, 1994 ).

Trustworthiness

A report of a qualitative study should contain the same robust procedural description as any other study. The purpose of the research, how it was conducted, procedural decisions, and details of data generation and management should be transparent and explicit. A reviewer should be able to follow the progression of events and decisions and understand their logic because there is adequate description, explanation and justification of the methodology and methods ( Kitto et al. , 2008 ).

Credibility

Credibility is the criterion for evaluating the truth value or internal validity of qualitative research. A qualitative study is credible when its results, presented with adequate descriptions of context, are recognizable to people who share the experience and those who care for or treat them. As the instrument in qualitative research, the researcher defends its credibility through practices such as reflexivity (reflection on the influence of the researcher on the research), triangulation (where appropriate, answering the research question in several ways, such as through interviews, observation and documentary analysis) and substantial description of the interpretation process; verbatim quotations from the data are supplied to illustrate and support their interpretations ( Sandelowski, 1986 ). Where excerpts of data and interpretations are incongruent, the credibility of the study is in doubt.

Applicability

Applicability, or transferability of the research findings, is the criterion for evaluating external validity. A study is considered to meet the criterion of applicability when its findings can fit into contexts outside the study situation and when clinicians and researchers view the findings as meaningful and applicable in their own experiences.

Larger sample sizes do not produce greater applicability. Depth may be sacrificed to breadth or there may be too much data for adequate analysis. Sample sizes in qualitative research are typically small. The term ‘saturation’ is often used in reference to decisions about sample size in research using qualitative methods. Emerging from grounded theory, where filling theoretical categories is considered essential to the robustness of the developing theory, data saturation has been expanded to describe a situation where data tend towards repetition or where data cease to offer new directions and raise new questions ( Charmaz, 2005 ). However, the legitimacy of saturation as a generic marker of sampling adequacy has been questioned ( O'Reilly and Parker, 2013 ). Caution must be exercised to ensure that a commitment to saturation does not assume an ‘essence’ of an experience in which limited diversity is anticipated; each account is likely to be subtly different and each ‘sample’ will contribute to knowledge without telling the whole story. Increasingly, it is expected that researchers will report the kind of saturation they have applied and their criteria for recognising its achievement; an assessor will need to judge whether the choice is appropriate and consistent with the theoretical context within which the research has been conducted.

Sampling strategies are usually purposive, convenient, theoretical or snowballed. Maximum variation sampling may be used to seek representation of diverse perspectives on the topic. Homogeneous sampling may be used to recruit a group of participants with specified criteria. The threat of bias is irrelevant; participants are recruited and selected specifically because they can illuminate the phenomenon being studied. Rather than being predetermined by statistical power analysis, qualitative study samples are dependent on the nature of the data, the availability of participants and where those data take the investigator. Multiple data collections may also take place to obtain maximum insight into sensitive topics. For instance, the question of how decisions are made for embryo disposition may involve sampling within the patient group as well as from scientists, clinicians, counsellors and clinic administrators.

Consistency

Consistency, or dependability of the results, is the criterion for assessing reliability. This does not mean that the same result would necessarily be found in other contexts but that, given the same data, other researchers would find similar patterns. Researchers often seek maximum variation in the experience of a phenomenon, not only to illuminate it but also to discourage fulfilment of limited researcher expectations (for example, negative cases or instances that do not fit the emerging interpretation or theory should be actively sought and explored). Qualitative researchers sometimes describe the processes by which verification of the theoretical findings by another team member takes place ( Morse and Richards, 2002 ).

Research that uses qualitative methods is not, as it seems sometimes to be represented, the easy option, nor is it a collation of anecdotes. It usually involves a complex theoretical or philosophical framework. Rigorous analysis is conducted without the aid of straightforward mathematical rules. Researchers must demonstrate the validity of their analysis and conclusions, resulting in longer papers and occasional frustration with the word limits of appropriate journals. Nevertheless, we need the different kinds of evidence that is generated by qualitative methods. The experience of health, illness and medical intervention cannot always be counted and measured; researchers need to understand what they mean to individuals and groups. Knowledge gained from qualitative research methods can inform clinical practice, indicate how to support people living with chronic conditions and contribute to community education and awareness about people who are (for example) experiencing infertility or using assisted conception.

Each author drafted a section of the manuscript and the manuscript as a whole was reviewed and revised by all authors in consultation.

No external funding was either sought or obtained for this study.

The authors have no conflicts of interest to declare.

Beeson D , Jennings P , Kramer W . Offspring searching for their sperm donors: how family types shape the process . Hum Reprod 2011 ; 26 : 2415 – 2424 .


Brannen J . Mixing methods: the entry of qualitative and quantitative approaches into the research process . Int J Soc Res Methodol 2005 ; 8 : 173 – 184 .

Charmaz K . Grounded Theory in the 21st century; applications for advancing social justice studies . In: Denzin NK , Lincoln YS (eds). The Sage Handbook of Qualitative Research . California : Sage Publications Inc. , 2005 .


Cohen D , Crabtree B . Evaluative criteria for qualitative research in health care: controversies and recommendations . Ann Fam Med 2008 ; 6 : 331 – 339 .

de Lacey S . Parent identity and ‘virtual’ children: why patients discard rather than donate unused embryos . Hum Reprod 2005 ; 20 : 1661 – 1669 .

de Lacey SL , Peterson K , McMillan J . Child interests in assisted reproductive technology: how is the welfare principle applied in practice? Hum Reprod 2015 ; 30 : 616 – 624 .

Denzin N , Lincoln Y . Entering the field of qualitative research . In: Denzin NK , Lincoln YS (eds). The Landscape of Qualitative Research: Theories and Issues . Thousand Oaks : Sage , 1998 , 1 – 34 .

Dixon-Woods M , Bonas S , Booth A , Jones DR , Miller T , Shaw RL , Smith JA , Young B . How can systematic reviews incorporate qualitative research? A critical perspective . Qual Res 2006 ; 6 : 27 – 44 .

ESHRE Psychology and Counselling Guideline Development Group . Routine Psychosocial Care in Infertility and Medically Assisted Reproduction: A Guide for Fertility Staff , 2015 . http://www.eshre.eu/Guidelines-and-Legal/Guidelines/Psychosocial-care-guideline.aspx .

Freeman T , Jadva V , Kramer W , Golombok S . Gamete donation: parents' experiences of searching for their child's donor siblings or donor . Hum Reprod 2009 ; 24 : 505 – 516 .

Goedeke S , Daniels K , Thorpe M , Du Preez E . Building extended families through embryo donation: the experiences of donors and recipients . Hum Reprod 2015 ; 30 : 2340 – 2350 .

Hammarberg K , Johnson L , Bourne K , Fisher J , Kirkman M . Proposed legislative change mandating retrospective release of identifying information: consultation with donors and Government response . Hum Reprod 2014 ; 29 : 286 – 292 .

Kirkman M . Saviours and satyrs: ambivalence in narrative meanings of sperm provision . Cult Health Sex 2004 ; 6 : 319 – 336 .

Kirkman M , Rosenthal D , Johnson L . Families working it out: adolescents' views on communicating about donor-assisted conception . Hum Reprod 2007 ; 22 : 2318 – 2324 .

Kirkman M , Bourne K , Fisher J , Johnson L , Hammarberg K . Gamete donors' expectations and experiences of contact with their donor offspring . Hum Reprod 2014 ; 29 : 731 – 738 .

Kitto S , Chesters J , Grbich C . Quality in qualitative research . Med J Aust 2008 ; 188 : 243 – 246 .

Kovacs GT , Morgan G , Levine M , McCrann J . The Australian community overwhelmingly approves IVF to treat subfertility, with increasing support over three decades . Aust N Z J Obstetr Gynaecol 2012 ; 52 : 302 – 304 .

Leininger M . Evaluation criteria and critique of qualitative research studies . In: Morse J (ed). Critical Issues in Qualitative Research Methods . Thousand Oaks : Sage , 1994 , 95 – 115 .

Lincoln YS , Guba EG . Naturalistic Inquiry . Newbury Park, CA : Sage Publications , 1985 .

Morse J , Richards L . Readme First for a Users Guide to Qualitative Methods . Thousand Oaks : Sage , 2002 .

O'Reilly M , Parker N . ‘Unsatisfactory saturation’: a critical exploration of the notion of saturated sample sizes in qualitative research . Qual Res 2013 ; 13 : 190 – 197 .

Porter M , Bhattacharya S . Investigation of staff and patients' opinions of a proposed trial of elective single embryo transfer . Hum Reprod 2005 ; 20 : 2523 – 2530 .

Sandelowski M . The problem of rigor in qualitative research . Adv Nurs Sci 1986 ; 8 : 27 – 37 .

Wyverkens E , Provoost V , Ravelingien A , De Sutter P , Pennings G , Buysse A . Beyond sperm cells: a qualitative study on constructed meanings of the sperm donor in lesbian families . Hum Reprod 2014 ; 29 : 1248 – 1254 .

Young K , Fisher J , Kirkman M . Women's experiences of endometriosis: a systematic review of qualitative research . J Fam Plann Reprod Health Care 2014 ; 41 : 225 – 234 .

  • conflict of interest
  • credibility
  • qualitative research
  • quantitative methods


  • Open access
  • Published: 26 March 2024

Predicting and improving complex beer flavor through machine learning

  • Michiel Schreurs   ORCID: orcid.org/0000-0002-9449-5619 1 , 2 , 3   na1 ,
  • Supinya Piampongsant 1 , 2 , 3   na1 ,
  • Miguel Roncoroni   ORCID: orcid.org/0000-0001-7461-1427 1 , 2 , 3   na1 ,
  • Lloyd Cool   ORCID: orcid.org/0000-0001-9936-3124 1 , 2 , 3 , 4 ,
  • Beatriz Herrera-Malaver   ORCID: orcid.org/0000-0002-5096-9974 1 , 2 , 3 ,
  • Christophe Vanderaa   ORCID: orcid.org/0000-0001-7443-5427 4 ,
  • Florian A. Theßeling 1 , 2 , 3 ,
  • Łukasz Kreft   ORCID: orcid.org/0000-0001-7620-4657 5 ,
  • Alexander Botzki   ORCID: orcid.org/0000-0001-6691-4233 5 ,
  • Philippe Malcorps 6 ,
  • Luk Daenen 6 ,
  • Tom Wenseleers   ORCID: orcid.org/0000-0002-1434-861X 4 &
  • Kevin J. Verstrepen   ORCID: orcid.org/0000-0002-3077-6219 1 , 2 , 3  

Nature Communications, volume 15, Article number: 2368 (2024)


  • Chemical engineering
  • Gas chromatography
  • Machine learning
  • Metabolomics
  • Taste receptors

The perception and appreciation of food flavor depends on many interacting chemical compounds and external factors, and therefore proves challenging to understand and predict. Here, we combine extensive chemical and sensory analyses of 250 different beers to train machine learning models that allow predicting flavor and consumer appreciation. For each beer, we measure over 200 chemical properties, perform quantitative descriptive sensory analysis with a trained tasting panel and map data from over 180,000 consumer reviews to train 10 different machine learning models. The best-performing algorithm, Gradient Boosting, yields models that significantly outperform predictions based on conventional statistics and accurately predict complex food features and consumer appreciation from chemical profiles. Model dissection allows identifying specific and unexpected compounds as drivers of beer flavor and appreciation. Adding these compounds results in variants of commercial alcoholic and non-alcoholic beers with improved consumer appreciation. Together, our study reveals how big data and machine learning uncover complex links between food chemistry, flavor and consumer perception, and lays the foundation to develop novel, tailored foods with superior flavors.


Introduction

Predicting and understanding food perception and appreciation is one of the major challenges in food science. Accurate modeling of food flavor and appreciation could yield important opportunities for both producers and consumers, including quality control, product fingerprinting, counterfeit detection, spoilage detection, and the development of new products and product combinations (food pairing) 1 , 2 , 3 , 4 , 5 , 6 . Accurate models for flavor and consumer appreciation would contribute greatly to our scientific understanding of how humans perceive and appreciate flavor. Moreover, accurate predictive models would also facilitate and standardize existing food assessment methods and could supplement or replace assessments by trained and consumer tasting panels, which are variable, expensive and time-consuming 7 , 8 , 9 . Lastly, apart from providing objective, quantitative, accurate and contextual information that can help producers, models can also guide consumers in understanding their personal preferences 10 .

Despite the myriad of applications, predicting food flavor and appreciation from its chemical properties remains a largely elusive goal in sensory science, especially for complex food and beverages 11 , 12 . A key obstacle is the immense number of flavor-active chemicals underlying food flavor. Flavor compounds can vary widely in chemical structure and concentration, making them technically challenging and labor-intensive to quantify, even in the face of innovations in metabolomics, such as non-targeted metabolic fingerprinting 13 , 14 . Moreover, sensory analysis is perhaps even more complicated. Flavor perception is highly complex, resulting from hundreds of different molecules interacting at the physiochemical and sensorial level. Sensory perception is often non-linear, characterized by complex and concentration-dependent synergistic and antagonistic effects 15 , 16 , 17 , 18 , 19 , 20 , 21 that are further convoluted by the genetics, environment, culture and psychology of consumers 22 , 23 , 24 . Perceived flavor is therefore difficult to measure, with problems of sensitivity, accuracy, and reproducibility that can only be resolved by gathering sufficiently large datasets 25 . Trained tasting panels are considered the prime source of quality sensory data, but require meticulous training, are low throughput and high cost. Public databases containing consumer reviews of food products could provide a valuable alternative, especially for studying appreciation scores, which do not require formal training 25 . Public databases offer the advantage of amassing large amounts of data, increasing the statistical power to identify potential drivers of appreciation. However, public datasets suffer from biases, including a bias in the volunteers that contribute to the database, as well as confounding factors such as price, cult status and psychological conformity towards previous ratings of the product.

Classical multivariate statistics and machine learning methods have been used to predict flavor of specific compounds by, for example, linking structural properties of a compound to its potential biological activities or linking concentrations of specific compounds to sensory profiles 1 , 26 . Importantly, most previous studies focused on predicting organoleptic properties of single compounds (often based on their chemical structure) 27 , 28 , 29 , 30 , 31 , 32 , 33 , thus ignoring the fact that these compounds are present in a complex matrix in food or beverages and excluding complex interactions between compounds. Moreover, the classical statistics commonly used in sensory science 34 , 35 , 36 , 37 , 38 , 39 require a large sample size and sufficient variance amongst predictors to create accurate models. They are not fit for studying an extensive set of hundreds of interacting flavor compounds, since they are sensitive to outliers, have a high tendency to overfit and are less suited for non-linear and discontinuous relationships 40 .

In this study, we combine extensive chemical analyses and sensory data of a set of different commercial beers with machine learning approaches to develop models that predict taste, smell, mouthfeel and appreciation from compound concentrations. Beer is particularly suited to model the relationship between chemistry, flavor and appreciation. First, beer is a complex product, consisting of thousands of flavor compounds that partake in complex sensory interactions 41 , 42 , 43 . This chemical diversity arises from the raw materials (malt, yeast, hops, water and spices) and biochemical conversions during the brewing process (kilning, mashing, boiling, fermentation, maturation and aging) 44 , 45 . Second, the advent of the internet saw beer consumers embrace online review platforms, such as RateBeer (ZX Ventures, Anheuser-Busch InBev SA/NV) and BeerAdvocate (Next Glass, inc.). In this way, the beer community provides massive data sets of beer flavor and appreciation scores, creating extraordinarily large sensory databases to complement the analyses of our professional sensory panel. Specifically, we characterize over 200 chemical properties of 250 commercial beers, spread across 22 beer styles, and link these to the descriptive sensory profiling data of a 16-person in-house trained tasting panel and data acquired from over 180,000 public consumer reviews. These unique and extensive datasets enable us to train a suite of machine learning models to predict flavor and appreciation from a beer’s chemical profile. Dissection of the best-performing models allows us to pinpoint specific compounds as potential drivers of beer flavor and appreciation. Follow-up experiments confirm the importance of these compounds and ultimately allow us to significantly improve the flavor and appreciation of selected commercial beers. Together, our study represents a significant step towards understanding complex flavors and reinforces the value of machine learning to develop and refine complex foods. In this way, it represents a stepping stone for further computer-aided food engineering applications 46 .
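
The study's full pipeline is not reproduced here, but the general shape of the approach — fitting a gradient-boosting model to chemical measurements and inspecting which features it relies on — can be sketched in Python with scikit-learn. The data below are randomly generated placeholders with the same dimensions (250 beers, 226 properties), not the paper's measurements, and the hyperparameters are illustrative guesses rather than the authors' settings:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Placeholder data: 250 beers x 226 chemical properties, plus a synthetic appreciation score
X = rng.normal(size=(250, 226))
y = 3.5 + 0.4 * X[:, 0] - 0.2 * X[:, 1] + rng.normal(scale=0.3, size=250)

model = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05, max_depth=3)

# Cross-validated R^2 as a rough measure of predictive performance
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print("Mean cross-validated R^2:", scores.mean().round(2))

# Fit on all data and inspect which features the model relies on most ("model dissection")
model.fit(X, y)
top = np.argsort(model.feature_importances_)[::-1][:5]
print("Most influential feature indices:", top)
```

Cross-validation guards against overfitting on such a wide feature matrix, and the feature importances are where model dissection would start; the actual study compares ten different algorithms and validates candidate compounds experimentally.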

To generate a comprehensive dataset on beer flavor, we selected 250 commercial Belgian beers across 22 different beer styles (Supplementary Fig.  S1 ). Beers with ≤ 4.2% alcohol by volume (ABV) were classified as non-alcoholic and low-alcoholic. Blonds and Tripels constitute a significant portion of the dataset (12.4% and 11.2%, respectively) reflecting their presence on the Belgian beer market and the heterogeneity of beers within these styles. By contrast, lager beers are less diverse and dominated by a handful of brands. Rare styles such as Brut or Faro make up only a small fraction of the dataset (2% and 1%, respectively) because fewer of these beers are produced and because they are dominated by distinct characteristics in terms of flavor and chemical composition.

Extensive analysis identifies relationships between chemical compounds in beer

For each beer, we measured 226 different chemical properties, including common brewing parameters such as alcohol content, iso-alpha acids, pH, sugar concentration 47 , and over 200 flavor compounds (Methods, Supplementary Table  S1 ). A large portion (37.2%) are terpenoids arising from hopping, responsible for herbal and fruity flavors 16 , 48 . A second major category are yeast metabolites, such as esters and alcohols, that result in fruity and solvent notes 48 , 49 , 50 . Other measured compounds are primarily derived from malt, or other microbes such as non- Saccharomyces yeasts and bacteria (‘wild flora’). Compounds that arise from spices or staling are labeled under ‘Others’. Five attributes (caloric value, total acids and total ester, hop aroma and sulfur compounds) are calculated from multiple individually measured compounds.

As a first step in identifying relationships between chemical properties, we determined correlations between the concentrations of the compounds (Fig.  1 , upper panel, Supplementary Data  1 and 2 , and Supplementary Fig.  S2 . For the sake of clarity, only a subset of the measured compounds is shown in Fig.  1 ). Compounds of the same origin typically show a positive correlation, while absence of correlation hints at parameters varying independently. For example, the hop aroma compounds citronellol, and alpha-terpineol show moderate correlations with each other (Spearman’s rho=0.39 and 0.57), but not with the bittering hop component iso-alpha acids (Spearman’s rho=0.16 and −0.07). This illustrates how brewers can independently modify hop aroma and bitterness by selecting hop varieties and dosage time. If hops are added early in the boiling phase, chemical conversions increase bitterness while aromas evaporate, conversely, late addition of hops preserves aroma but limits bitterness 51 . Similarly, hop-derived iso-alpha acids show a strong anti-correlation with lactic acid and acetic acid, likely reflecting growth inhibition of lactic acid and acetic acid bacteria, or the consequent use of fewer hops in sour beer styles, such as West Flanders ales and Fruit beers, that rely on these bacteria for their distinct flavors 52 . Finally, yeast-derived esters (ethyl acetate, ethyl decanoate, ethyl hexanoate, ethyl octanoate) and alcohols (ethanol, isoamyl alcohol, isobutanol, and glycerol), correlate with Spearman coefficients above 0.5, suggesting that these secondary metabolites are correlated with the yeast genetic background and/or fermentation parameters and may be difficult to influence individually, although the choice of yeast strain may offer some control 53 .
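
Computing such pairwise Spearman correlations is straightforward once a concentration table is available. The sketch below, in Python with pandas, uses a few invented concentration values for three of the compounds mentioned above; it shows the mechanics only, not the paper's data:

```python
import pandas as pd

# Placeholder concentration table: rows are beers, columns are compounds
conc = pd.DataFrame({
    "citronellol":     [1.2, 0.8, 2.5, 0.3, 1.9, 0.6],
    "alpha_terpineol": [0.9, 0.7, 2.1, 0.4, 1.5, 0.5],
    "iso_alpha_acids": [35, 20, 15, 40, 10, 25],
})

# Spearman rank correlation matrix between all compound pairs
rho = conc.corr(method="spearman")
print(rho.round(2))
```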

Figure 1: Spearman rank correlations are shown. Descriptors are grouped according to their origin (malt (blue), hops (green), yeast (red), wild flora (yellow), Others (black)), and sensory aspect (aroma, taste, palate, and overall appreciation). Please note that for the chemical compounds, for the sake of clarity, only a subset of the total number of measured compounds is shown, with an emphasis on the key compounds for each source. For more details, see the main text and Methods section. Chemical data can be found in Supplementary Data 1, correlations between all chemical compounds are depicted in Supplementary Fig. S2 and correlation values can be found in Supplementary Data 2. See Supplementary Data 4 for sensory panel assessments and Supplementary Data 5 for correlation values between all sensory descriptors.

Interestingly, different beer styles show distinct patterns for some flavor compounds (Supplementary Fig.  S3 ). These observations agree with expectations for key beer styles, and serve as a control for our measurements. For instance, Stouts generally show high values for color (darker), while hoppy beers contain elevated levels of iso-alpha acids, compounds associated with bitter hop taste. Acetic and lactic acid are not prevalent in most beers, with notable exceptions such as Kriek, Lambic, Faro, West Flanders ales and Flanders Old Brown, which use acid-producing bacteria ( Lactobacillus and Pediococcus ) or unconventional yeast ( Brettanomyces ) 54 , 55 . Glycerol, ethanol and esters show similar distributions across all beer styles, reflecting their common origin as products of yeast metabolism during fermentation 45 , 53 . Finally, low/no-alcohol beers contain low concentrations of glycerol and esters. This is in line with the production process for most of the low/no-alcohol beers in our dataset, which are produced through limiting fermentation or by stripping away alcohol via evaporation or dialysis, with both methods having the unintended side-effect of reducing the amount of flavor compounds in the final beer 56 , 57 .

Besides expected associations, our data also reveals less trivial associations between beer styles and specific parameters. For example, geraniol and citronellol, two monoterpenoids responsible for citrus, floral and rose flavors and characteristic of Citra hops, are found in relatively high amounts in Christmas, Saison, and Brett/co-fermented beers, where they may originate from terpenoid-rich spices such as coriander seeds instead of hops 58 .

Tasting panel assessments reveal sensorial relationships in beer

To assess the sensory profile of each beer, a trained tasting panel evaluated each of the 250 beers for 50 sensory attributes, including different hop, malt and yeast flavors, off-flavors and spices. Panelists used a tasting sheet (Supplementary Data  3 ) to score the different attributes. Panel consistency was evaluated by repeating 12 samples across different sessions and performing ANOVA. In 95% of cases no significant difference was found across sessions ( p  > 0.05), indicating good panel consistency (Supplementary Table  S2 ).
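
This consistency check amounts to a one-way ANOVA per repeated sample, with tasting session as the factor. A minimal sketch in Python with SciPy, using invented panel scores for a single repeated beer:

```python
from scipy import stats

# Placeholder: panel scores for one repeated beer, grouped by tasting session
session_1 = [3.0, 3.5, 4.0, 3.0, 3.5]
session_2 = [3.5, 3.0, 4.0, 3.5, 3.0]
session_3 = [3.0, 4.0, 3.5, 3.0, 3.5]

# One-way ANOVA across sessions; p > 0.05 suggests no session effect,
# i.e. the panel scored this sample consistently across sessions
f_stat, p_value = stats.f_oneway(session_1, session_2, session_3)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")
```

With 12 repeated samples, the same test would be run once per sample, and panel consistency reported as the share of samples showing no significant session effect.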

Aroma and taste perception reported by the trained panel are often linked (Fig.  1 , bottom left panel and Supplementary Data  4 and 5 ), with high correlations between hops aroma and taste (Spearman’s rho=0.83). Bitter taste was found to correlate with hop aroma and taste in general (Spearman’s rho=0.80 and 0.69), and particularly with “grassy” noble hops (Spearman’s rho=0.75). Barnyard flavor, most often associated with sour beers, is identified together with stale hop flavor (Spearman’s rho=0.97), as aged hops are commonly used in these beers. Lactic and acetic acid, which often co-occur, are correlated (Spearman’s rho=0.66). Interestingly, sweetness and bitterness are anti-correlated (Spearman’s rho = −0.48), confirming the hypothesis that they mask each other 59 , 60 . Beer body is highly correlated with alcohol (Spearman’s rho = 0.79), and overall appreciation is found to correlate with multiple aspects that describe beer mouthfeel (alcohol, carbonation; Spearman’s rho= 0.32, 0.39), as well as with hop and ester aroma intensity (Spearman’s rho=0.39 and 0.35).

Similar to the chemical analyses, sensorial analyses confirmed typical features of specific beer styles (Supplementary Fig.  S4 ). For example, sour beers (Faro, Flanders Old Brown, Fruit beer, Kriek, Lambic, West Flanders ale) were rated acidic, with flavors of both acetic and lactic acid. Hoppy beers were found to be bitter and showed hop-associated aromas like citrus and tropical fruit. Malt taste is most detected among Scotch, stout/porters, and strong ales, while low/no-alcohol beers, which often have a reputation for being ‘worty’ (reminiscent of unfermented, sweet malt extract), appear in the middle. Unsurprisingly, hop aromas are most strongly detected among hoppy beers. Like its chemical counterpart (Supplementary Fig.  S3 ), acidity shows a right-skewed distribution, with the most acidic beers being Krieks, Lambics, and West Flanders ales.

Tasting panel assessments of specific flavors correlate with chemical composition

We find that the concentrations of several chemical compounds strongly correlate with specific aromas or tastes, as evaluated by the tasting panel (Fig.  2 , Supplementary Fig.  S5 , Supplementary Data  6 ). In some cases, these correlations confirm expectations and serve as a useful control for data quality. For example, iso-alpha acids, the bittering compounds in hops, strongly correlate with bitterness (Spearman’s rho=0.68), while ethanol and glycerol correlate with tasters’ perceptions of alcohol and body, the mouthfeel sensation of fullness (Spearman’s rho=0.82/0.62 and 0.72/0.57, respectively), and darker color from roasted malts is a good indicator of malt perception (Spearman’s rho=0.54).

Figure 2. Heatmap colors indicate Spearman’s rho. Axes are organized according to sensory categories (aroma, taste, mouthfeel, overall), chemical categories and chemical sources in beer (malt (blue), hops (green), yeast (red), wild flora (yellow), Others (black)). See Supplementary Data  6 for all correlation values.

Interestingly, for some relationships between chemical compounds and perceived flavor, correlations are weaker than expected. For example, the rose-smelling phenethyl acetate only weakly correlates with floral aroma. This hints at more complex relationships and interactions between compounds and suggests a need for a more complex model than simple correlations. Lastly, we uncovered unexpected correlations. For instance, the esters ethyl decanoate and ethyl octanoate appear to correlate slightly with hop perception and bitterness, possibly due to their fruity flavor. Iron is anti-correlated with hop aromas and bitterness, most likely because it is also anti-correlated with iso-alpha acids. This could be a sign of metal chelation of hop acids 61 , given that our analyses measure unbound hop acids and total iron content, or could result from the higher iron content in dark and Fruit beers, which typically have less hoppy and bitter flavors 62 .

Public consumer reviews complement expert panel data

To complement and expand the sensory data of our trained tasting panel, we collected 180,000 reviews of our 250 beers from the online consumer review platform RateBeer. This provided numerical scores for beer appearance, aroma, taste, palate, and overall quality, as well as the average overall score.

Public datasets are known to suffer from biases, such as price, cult status and psychological conformity towards previous ratings of a product. For example, prices correlate with appreciation scores for these online consumer reviews (rho=0.49, Supplementary Fig.  S6 ), but not for our trained tasting panel (rho=0.19). This suggests that price affects consumer appreciation, as has been reported for wine 63 , while blind tastings remain unaffected. Moreover, we observe that some beer styles, like lagers and non-alcoholic beers, generally receive lower scores, reflecting that online reviewers are mostly beer aficionados with a preference for specialty beers over lager beers. In general, we find a modest correlation between our trained panel’s overall appreciation score and the online consumer appreciation scores (Fig.  3 , rho=0.29). Apart from the aforementioned biases in the online datasets, serving temperature, sample freshness and surroundings, which are all tightly controlled during the tasting panel sessions, can vary tremendously across online consumers and can further contribute to differences (in appreciation, among other aspects) between the two categories of tasters. Importantly, in contrast to the overall appreciation scores, for many sensory aspects the results from the professional panel correlated well with results obtained from RateBeer reviews. Correlations were highest for features that are relatively easy to recognize even for untrained tasters, like bitterness, sweetness, alcohol and malt aroma (Fig.  3 and below).

Figure 3. RateBeer text mining results can be found in Supplementary Data  7 . Rho values shown are Spearman correlation values, with asterisks indicating significant correlations ( p  < 0.05, two-sided). All p values were smaller than 0.001, except for Esters aroma (0.0553), Esters taste (0.3275), Esters aroma—banana (0.0019), Coriander (0.0508) and Diacetyl (0.0134).

Besides collecting consumer appreciation from these online reviews, we developed automated text analysis tools to gather additional data from review texts (Supplementary Data  7 ). Processing review texts from the RateBeer database yielded results comparable to the scores given by the trained panel for many common sensory aspects, including acidity, bitterness, sweetness, alcohol, malt, and hop tastes (Fig.  3 ). This is in line with what would be expected, since these attributes require less training for accurate assessment and are less influenced by environmental factors such as temperature, serving glass and odors in the environment. Consumer reviews also correlate well with our trained panel for 4-vinyl guaiacol, a compound associated with a very characteristic aroma. By contrast, more specific aromas like ester, coriander or diacetyl are underrepresented in the online reviews, and their correlations with panel scores are correspondingly weaker, underscoring the importance of using a trained tasting panel and standardized tasting sheets with explicit factors to be scored for evaluating specific aspects of a beer. Taken together, our results suggest that public reviews are trustworthy for some, but not all, flavor features and can complement or substitute for taste panel data for these sensory aspects.

Models can predict beer sensory profiles from chemical data

The rich datasets of chemical analyses, tasting panel assessments and public reviews gathered in the first part of this study provided us with a unique opportunity to develop predictive models that link chemical data to sensorial features. Given the complexity of beer flavor, basic statistical tools such as correlations or linear regression may not always be the most suitable for making accurate predictions. Instead, we applied different machine learning models that can model both simple linear and complex interactive relationships. Specifically, we constructed a set of regression models to predict (a) trained panel scores for beer flavor and quality and (b) public reviews’ appreciation scores from beer chemical profiles. We trained and tested 10 different models (Methods), 3 linear regression-based models (simple linear regression with first-order interactions (LR), lasso regression with first-order interactions (Lasso), partial least squares regressor (PLSR)), 5 decision tree models (AdaBoost regressor (ABR), extra trees (ET), gradient boosting regressor (GBR), random forest (RF) and XGBoost regressor (XGBR)), 1 support vector regression (SVR), and 1 artificial neural network (ANN) model.

To compare the performance of our machine learning models, the dataset was randomly split into a training and test set, stratified by beer style. After a model was trained on the training set, its performance was evaluated by its ability to predict the test set, quantified as the coefficient of determination (R 2 ) of the multi-output models (see Methods). Additionally, individual-attribute models were ranked per descriptor and the average rank was calculated, as proposed by Korneva et al. 64 . Importantly, both ways of evaluating the models’ performance generally agreed. Performance of the different models varied (Table  1 ). It should be noted that all models perform better at predicting RateBeer results than results from our trained tasting panel. One reason could be that sensory data is inherently variable, and this variability is averaged out by the large number of public reviews from RateBeer. Additionally, all tree-based models perform better at predicting taste than aroma. Linear models (LR) performed particularly poorly, with negative R 2 values, due to severe overfitting (training set R 2  = 1). Overfitting is a common issue in linear models with many parameters and limited samples, especially when interaction terms further amplify the number of parameters. L1 regularization (Lasso) successfully overcomes this overfitting, out-competing multiple tree-based models on the RateBeer dataset. Similarly, the dimensionality reduction of PLSR avoids overfitting and improves performance to some extent. Still, tree-based models (ABR, ET, GBR, RF and XGBR) show the best performance, out-competing the linear models (LR, Lasso, PLSR) commonly used in sensory science 65 .
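The evaluation loop behind this comparison can be illustrated with a small scikit-learn sketch; the feature matrix, target and style labels below are placeholders rather than the authors’ actual variables or settings.

```python
# Minimal sketch: stratified 70/30 split by beer style, fit a gradient boosting
# regressor on chemical features, and score it on the held-out test set.
# X (chemical features), y (one sensory attribute) and styles are placeholders.
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=styles, random_state=0
)
gbr = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)
print("held-out R2:", r2_score(y_test, gbr.predict(X_test)))
```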

GBR models showed the best overall performance in predicting sensory responses from chemical information, with R 2 values up to 0.75 depending on the predicted sensory feature (Supplementary Table  S4 ). The GBR models predict consumer appreciation (RateBeer) better than our trained panel’s appreciation (R 2 value of 0.67 compared to R 2 value of 0.09) (Supplementary Table  S3 and Supplementary Table  S4 ). ANN models showed intermediate performance, likely because neural networks typically perform best with larger datasets 66 . The SVR shows intermediate performance, mostly due to the weak predictions of specific attributes that lower the overall performance (Supplementary Table  S4 ).

Model dissection identifies specific, unexpected compounds as drivers of consumer appreciation

Next, we leveraged our models to infer important contributors to sensory perception and consumer appreciation. Consumer preference is a crucial sensory aspect, because a product that shows low consumer appreciation scores often does not succeed commercially 25 . Additionally, the requirement for a large number of representative evaluators makes consumer trials one of the more costly and time-consuming aspects of product development. Hence, a model for predicting chemical drivers of overall appreciation would be a welcome addition to the available toolbox for food development and optimization.

Since GBR models on our RateBeer dataset showed the best overall performance, we focused on these models. Specifically, we used two approaches to identify important contributors. First, rankings of the most important predictors for each sensorial trait in the GBR models were obtained based on impurity-based feature importance (mean decrease in impurity). High-ranked parameters were hypothesized to be either the true causal chemical properties underlying the trait, to correlate with the actual causal properties, or to take part in sensory interactions affecting the trait 67 (Fig.  4A ). In a second approach, we used SHAP 68 to determine which parameters contributed most to the model for making predictions of consumer appreciation (Fig.  4B ). SHAP calculates parameter contributions to model predictions on a per-sample basis, which can be aggregated into an importance score.
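The second approach can be sketched with the SHAP library as follows, assuming a fitted gradient boosting model `gbr` and a held-out feature matrix `X_test` as in the earlier sketch; this is an illustration of the method, not the authors’ exact analysis code.

```python
# Minimal sketch of the SHAP-based dissection: per-sample contributions are
# computed with a tree explainer and aggregated into per-feature importances.
import numpy as np
import shap

explainer = shap.TreeExplainer(gbr)
shap_values = explainer.shap_values(X_test)      # samples x features
importance = np.abs(shap_values).mean(axis=0)    # mean |SHAP| per feature

shap.summary_plot(shap_values, X_test)           # beeswarm view, as in Fig. 4B
```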

Figure 4. A The impurity-based feature importance (mean decrease in impurity, MDI) calculated from the Gradient Boosting Regression (GBR) model predicting RateBeer appreciation scores. The top 15 highest ranked chemical properties are shown. B SHAP summary plot for the top 15 parameters contributing to our GBR model. Each point on the graph represents a sample from our dataset. The color represents the concentration of that parameter, with bluer colors representing low values and redder colors representing higher values. Greater absolute values on the horizontal axis indicate a higher impact of the parameter on the prediction of the model. C Spearman correlations between the 15 most important chemical properties and consumer overall appreciation. Numbers indicate the Spearman rho correlation coefficient, and the rank of this correlation compared to all other correlations. The top 15 important compounds were determined using SHAP (panel B).

Both approaches identified ethyl acetate as the most predictive parameter for beer appreciation (Fig.  4 ). Ethyl acetate is the most abundant ester in beer with a typical ‘fruity’, ‘solvent’ and ‘alcoholic’ flavor, but is often considered less important than other esters like isoamyl acetate. The second most important parameter identified by SHAP is ethanol, the most abundant beer compound after water. Apart from directly contributing to beer flavor and mouthfeel, ethanol drastically influences the physical properties of beer, dictating how easily volatile compounds escape the beer matrix to contribute to beer aroma 69 . Importantly, it should also be noted that the importance of ethanol for appreciation is likely inflated by the very low appreciation scores of non-alcoholic beers (Supplementary Fig.  S4 ). Despite not often being considered a driver of beer appreciation, protein level also ranks highly in both approaches, possibly due to its effect on mouthfeel and body 70 . Lactic acid, which contributes to the tart taste of sour beers, is the fourth most important parameter identified by SHAP, possibly due to the generally high appreciation of sour beers in our dataset.

Interestingly, some of the most important predictive parameters for our model are not well-established as beer flavors or are even commonly regarded as being negative for beer quality. For example, our models identify methanethiol and ethyl phenyl acetate, an ester commonly linked to beer staling 71 , as key factors contributing to beer appreciation. Although there is no doubt that high concentrations of these compounds are considered unpleasant, the positive effects of modest concentrations are not yet known 72 , 73 .

To compare our approach to conventional statistics, we evaluated how well the 15 most important SHAP-derived parameters correlate with consumer appreciation (Fig.  4C ). Interestingly, only 6 of the properties derived by SHAP rank amongst the top 15 most correlated parameters. For some chemical compounds, the correlations are so low that they would likely have been considered unimportant. For example, lactic acid, the fourth most important parameter, shows a bimodal distribution for appreciation, with sour beers forming a separate cluster that is missed entirely by the Spearman correlation. Additionally, the correlation plots reveal outliers, emphasizing the need for robust analysis tools. Together, this highlights the need for alternative models, like the Gradient Boosting model, that better grasp the complexity of (beer) flavor.

Finally, to observe the relationships between these chemical properties and their predicted targets, partial dependence plots were constructed for the six most important predictors of consumer appreciation 74 , 75 , 76 (Supplementary Fig.  S7 ). One-way partial dependence plots show how a change in concentration affects the predicted appreciation. These plots reveal an important limitation of our models: appreciation predictions remain constant at ever-increasing concentrations. This implies that once a threshold concentration is reached, further increasing the concentration does not affect appreciation. This is false, as it is well-documented that certain compounds become unpleasant at high concentrations, including ethyl acetate (‘nail polish’) 77 and methanethiol (‘sulfury’ and ‘rotten cabbage’) 78 . The inability of our models to grasp that flavor compounds have optimal levels, above which they become negative, is a consequence of working with commercial beer brands where (off-)flavors are rarely too high to negatively impact the product. The two-way partial dependence plots show how changing the concentration of two compounds influences predicted appreciation, visualizing their interactions (Supplementary Fig.  S7 ). In our case, the top 5 parameters are dominated by additive or synergistic interactions, with high concentrations for both compounds resulting in the highest predicted appreciation.
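Partial dependence plots of this kind are available directly in scikit-learn; the sketch below assumes the fitted `gbr` model and a DataFrame `X_train` from the earlier sketches, with illustrative feature names.

```python
# Minimal sketch: one-way and two-way partial dependence for selected features.
# Feature names are illustrative placeholders for columns of X_train.
from sklearn.inspection import PartialDependenceDisplay

PartialDependenceDisplay.from_estimator(
    gbr,
    X_train,
    features=["ethyl_acetate", "ethanol", ("ethyl_acetate", "ethanol")],
)
```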

To assess the robustness of our best-performing models and model predictions, we performed 100 iterations of the GBR, RF and ET models. In general, all iterations of the models yielded similar performance (Supplementary Fig.  S8 ). Moreover, the main predictors (including the top predictors ethanol and ethyl acetate) remained virtually the same, especially for GBR and RF. For the iterations of the ET model, we did observe more variation in the top predictors, which is likely a consequence of the model’s inherent random architecture in combination with co-correlations between certain predictors. However, even in this case, several of the top predictors (ethanol and ethyl acetate) remain unchanged, although their rank in importance changes (Supplementary Fig.  S8 ).
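Such a robustness check amounts to refitting the same model under different random seeds and tracking both performance and the top-ranked predictors, as in the hedged sketch below (variable names follow the earlier placeholders).

```python
# Minimal sketch of the stability analysis: repeated refits with different
# seeds, collecting held-out R2 and the five highest-ranked features per fit.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score

r2s, top5 = [], []
for seed in range(100):
    model = GradientBoostingRegressor(random_state=seed).fit(X_train, y_train)
    r2s.append(r2_score(y_test, model.predict(X_test)))
    top5.append(np.argsort(model.feature_importances_)[::-1][:5])

print("R2 mean, sd:", np.mean(r2s), np.std(r2s))
```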

Next, we investigated whether combining the RateBeer and trained panel data into one consolidated dataset would lead to stronger models, under the hypothesis that such a model would suffer less from bias in the individual datasets. A GBR model was trained to predict appreciation on the combined dataset. This model underperformed compared to the RateBeer-only model (R 2  = 0.67), both without (R 2  = 0.26) and with (R 2  = 0.42) a dataset identifier. For the latter, the dataset identifier is the most important feature (Supplementary Fig.  S9 ), while most of the feature importance remains unchanged, with ethyl acetate and ethanol ranking highest, as in the original model trained only on RateBeer data. It seems that the large variation in the panel dataset introduces noise, weakening the models’ performance and reliability. In addition, it seems reasonable to assume that the two datasets are fundamentally different, with the panel dataset obtained through blind tastings by a small trained panel and the RateBeer dataset reflecting non-blind scores from a large number of untrained consumers.

Lastly, we evaluated whether adding beer style identifiers would further enhance the model’s performance. A GBR model was trained with parameters that explicitly encoded the styles of the samples. This did not improve model performance (R 2  = 0.66 with style information vs. R 2  = 0.67 without). The most important chemical features are consistent with the model trained without style information (e.g., ethanol and ethyl acetate), and with the exception of the most preferred (strong ale) and least preferred (low/no-alcohol) styles, none of the styles were among the most important features (Supplementary Fig.  S9 , Supplementary Table  S5 and S6 ). This is likely due to a combination of style-specific chemical signatures, such as iso-alpha acids and lactic acid, that implicitly convey style information to the original models, as well as the low number of samples belonging to some styles, making it difficult for the model to learn style-specific patterns. Moreover, beer styles are not rigorously defined, with some styles overlapping in features and some beers being misattributed to a specific style, all of which leads to more noise in models that use style parameters.
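Encoding the style labels for such a comparison can be done with a simple one-hot expansion of the feature matrix, as in the sketch below (again using the placeholder variables from the earlier sketches).

```python
# Minimal sketch: append one-hot style indicators to the chemical features and
# refit the same model to compare held-out performance with and without styles.
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score

style_dummies = pd.get_dummies(styles, prefix="style")
X_train_s = pd.concat([X_train, style_dummies.loc[X_train.index]], axis=1)
X_test_s = pd.concat([X_test, style_dummies.loc[X_test.index]], axis=1)

gbr_style = GradientBoostingRegressor(random_state=0).fit(X_train_s, y_train)
print("R2 with style info:", r2_score(y_test, gbr_style.predict(X_test_s)))
```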

Model validation

To test whether our predictive models give insight into beer appreciation, we set up experiments aimed at improving existing commercial beers. We specifically selected overall appreciation as the trait to be examined because of its complexity and commercial relevance. Beer flavor comprises a complex bouquet rather than single aromas and tastes 53 . Hence, adding a single compound to the extent that a difference is noticeable may lead to an unbalanced, artificial flavor. Therefore, we evaluated the effect of combinations of compounds. Because Blond beers represent the largest style group in our dataset, we selected a beer from this style as the starting material for these experiments (Beer 64 in Supplementary Data  1 ).

In the first set of experiments, we adjusted the concentrations of compounds that made up the most important predictors of overall appreciation (ethyl acetate, ethanol, lactic acid, ethyl phenyl acetate) together with correlated compounds (ethyl hexanoate, isoamyl acetate, glycerol), bringing them up to 95th percentile ethanol-normalized concentrations (Methods) within the Blond group (‘Spiked’ concentration in Fig.  5A ). Compared to controls, the spiked beers were found to have significantly improved overall appreciation among trained panelists, with panelists noting increased intensity of ester flavors, sweetness, alcohol, and body fullness (Fig.  5B ). To disentangle the contribution of ethanol to these results, a second experiment was performed without the addition of ethanol. This resulted in a similar outcome, including increased perception of alcohol and overall appreciation.

Figure 5. Adding the top chemical compounds, identified as best predictors of appreciation by our model, into poorly appreciated beers results in increased appreciation from our trained panel. Results of sensory tests between base beers and those spiked with compounds identified as the best predictors by the model. A Blond and Non/Low-alcohol (0.0% ABV) base beers were brought up to 95th-percentile ethanol-normalized concentrations within each style. B For each sensory attribute, tasters indicated the more intense sample and selected the sample they preferred. The numbers above the bars correspond to the p values that indicate significant changes in perceived flavor (two-sided binomial test: alpha 0.05, n  = 20 or 13).

In a last experiment, we tested whether using the model’s predictions can boost the appreciation of a non-alcoholic beer (beer 223 in Supplementary Data  1 ). Again, the addition of a mixture of predicted compounds (omitting ethanol, in this case) resulted in a significant increase in appreciation, body, ester flavor and sweetness.

Predicting flavor and consumer appreciation from chemical composition is one of the ultimate goals of sensory science. A reliable, systematic and unbiased way to link chemical profiles to flavor and food appreciation would be a significant asset to the food and beverage industry. Such tools would substantially aid in quality control and recipe development, offer an efficient and cost-effective alternative to pilot studies and consumer trials, and ultimately allow food manufacturers to produce superior, tailor-made products that better meet the demands of specific consumer groups.

A limited set of studies have previously tried, with varying degrees of success, to predict beer flavor and beer popularity based on (a limited set of) chemical compounds and flavors 79 , 80 . Current sensitive, high-throughput technologies allow measuring an unprecedented number of chemical compounds and properties in a large set of samples, yielding a dataset that can train models that help close the gap between chemistry and flavor, even for a complex natural product like beer. To our knowledge, no previous research gathered data at this scale (250 samples, 226 chemical parameters, 50 sensory attributes and 5 consumer scores) to disentangle and validate the chemical aspects driving beer preference using various machine-learning techniques. We find that modern machine learning models outperform conventional statistical tools, such as correlations and linear models, and can successfully predict flavor appreciation from chemical composition. This could be attributed to the natural incorporation of interactions and non-linear or discontinuous effects in machine learning models, which are not easily grasped by the linear model architecture. While linear models and partial least squares regression represent the most widespread statistical approaches in sensory science, in part because they allow interpretation 65 , 81 , 82 , modern machine learning methods allow for building better predictive models while preserving the possibility to dissect and exploit the underlying patterns. Of the 10 different models we trained, tree-based models, such as our best performing GBR, showed the best overall performance in predicting sensory responses from chemical information, outcompeting artificial neural networks. This agrees with previous reports for models trained on tabular data 83 . Our results are in line with the findings of Colantonio et al. who also identified the gradient boosting architecture as performing best at predicting appreciation and flavor (of tomatoes and blueberries, in their specific study) 26 . Importantly, besides our larger experimental scale, we were able to directly confirm our models’ predictions in vivo.

Our study confirms that flavor compound concentration does not always correlate with perception, suggesting complex interactions that are often missed by more conventional statistics and simple models. Specifically, we find that tree-based algorithms may perform best in developing models that link complex food chemistry with aroma. Furthermore, we show that massive datasets of untrained consumer reviews provide a valuable source of data, that can complement or even replace trained tasting panels, especially for appreciation and basic flavors, such as sweetness and bitterness. This holds despite biases that are known to occur in such datasets, such as price or conformity bias. Moreover, GBR models predict taste better than aroma. This is likely because taste (e.g. bitterness) often directly relates to the corresponding chemical measurements (e.g., iso-alpha acids), whereas such a link is less clear for aromas, which often result from the interplay between multiple volatile compounds. We also find that our models are best at predicting acidity and alcohol, likely because there is a direct relation between the measured chemical compounds (acids and ethanol) and the corresponding perceived sensorial attribute (acidity and alcohol), and because even untrained consumers are generally able to recognize these flavors and aromas.

The predictions of our final models, trained on review data, hold even for blind tastings with small groups of trained tasters, as demonstrated by our ability to validate specific compounds as drivers of beer flavor and appreciation. Since adding a single compound to the extent of a noticeable difference may result in an unbalanced flavor profile, we specifically tested our identified key drivers as a combination of compounds. While this approach does not allow us to validate if a particular single compound would affect flavor and/or appreciation, our experiments do show that this combination of compounds increases consumer appreciation.

It is important to stress that, while it represents an important step forward, our approach still has several major limitations. A key weakness of the GBR model architecture is that, amongst co-correlating variables, the largest main effect is consistently preferred for model building. As a result, co-correlating variables often have artificially low importance scores, both for impurity and SHAP-based methods, as we observed in the comparison to the more randomized Extra Trees models. This implies that chemicals identified as key drivers of a specific sensory feature by GBR might not be the true causative compounds, but rather co-correlate with the actual causative chemical. For example, the high importance of ethyl acetate could be (partially) attributed to the total ester content, ethanol or ethyl hexanoate (rho=0.77, rho=0.72 and rho=0.68), while ethyl phenylacetate could hide the importance of prenyl isobutyrate and ethyl benzoate (rho=0.77 and rho=0.76). Expanding our GBR model to include beer style as a parameter did not yield additional power or insight. This is likely due to style-specific chemical signatures, such as iso-alpha acids and lactic acid, that implicitly convey style information to the original model, as well as the smaller sample size per style, limiting the power to uncover style-specific patterns. This can be partly attributed to the curse of dimensionality, where the high number of parameters results in the models mainly incorporating single parameter effects, rather than complex interactions such as style-dependent effects 67 . A larger number of samples may overcome some of these limitations and offer more insight into style-specific effects. On the other hand, beer style is not a rigid scientific classification, and beers within one style often differ substantially, which further complicates the analysis of style as a model factor.

Our study is limited to beers from Belgian breweries. Although these beers cover a large portion of the beer styles available globally, some beer styles and consumer patterns may be missing, while other features might be overrepresented. For example, many Belgian ales exhibit yeast-driven flavor profiles, which is reflected in the chemical drivers of appreciation discovered by this study. In future work, expanding the scope to include diverse markets and beer styles could lead to the identification of even more drivers of appreciation and better models for special niche products that were not present in our beer set.

In addition to the inherent limitations of GBR models, there are also some limitations associated with studying food aroma. Even though our chemical analyses measured most of the known aroma compounds, the total number of flavor compounds in complex foods like beer is still larger than the subset we were able to measure in this study. For example, hop-derived thiols, which influence flavor at very low concentrations, are notoriously difficult to measure in a high-throughput experiment. Moreover, consumer perception remains subjective and prone to biases that are difficult to avoid. It is also important to stress that the models are still immature and that more extensive datasets will be crucial for developing more complete models in the future. Besides more samples and parameters, our dataset does not include any demographic information about the tasters. Including such data could lead to better models that grasp external factors like age and culture. Another limitation is that our set of beers consists of high-quality end-products and lacks beers that are unfit for sale, which limits the current models’ ability to accurately predict products that are poorly appreciated. Finally, while the models could be readily applied in quality control, their use in sensory science and product development is restrained by their inability to discern causal relationships. Given that the models cannot distinguish compounds that genuinely drive consumer perception from those that merely correlate, validation experiments are essential to identify true causative compounds.

Despite the inherent limitations, dissection of our models enabled us to pinpoint specific molecules as potential drivers of beer aroma and consumer appreciation, including compounds that were unexpected and would not have been identified using standard approaches. Important drivers of beer appreciation uncovered by our models include protein levels, ethyl acetate, ethyl phenyl acetate and lactic acid. Currently, many brewers already use lactic acid to acidify their brewing water and ensure optimal pH for enzymatic activity during the mashing process. Our results suggest that adding lactic acid can also improve beer appreciation, although its individual effect remains to be tested. Interestingly, ethanol appears to be unnecessary to improve beer appreciation, both for blond beer and alcohol-free beer. Given the growing consumer interest in alcohol-free beer, with a predicted annual market growth of >7% 84 , it is relevant for brewers to know what compounds can further increase consumer appreciation of these beers. Hence, our model may readily provide avenues to further improve the flavor and consumer appreciation of both alcoholic and non-alcoholic beers, which is generally considered one of the key challenges for future beer production.

Whereas we see a direct implementation of our results for the development of superior alcohol-free beverages and other food products, our study can also serve as a stepping stone for the development of novel alcohol-containing beverages. We want to echo the growing body of scientific evidence for the negative effects of alcohol consumption, both on the individual level by the mutagenic, teratogenic and carcinogenic effects of ethanol 85 , 86 , as well as the burden on society caused by alcohol abuse and addiction. We encourage the use of our results for the production of healthier, tastier products, including novel and improved beverages with lower alcohol contents. Furthermore, we strongly discourage the use of these technologies to improve the appreciation or addictive properties of harmful substances.

The present work demonstrates that despite some important remaining hurdles, combining the latest developments in chemical analyses, sensory analysis and modern machine learning methods offers exciting avenues for food chemistry and engineering. Soon, these tools may provide solutions in quality control and recipe development, as well as new approaches to sensory science and flavor research.

Beer selection

250 commercial Belgian beers were selected to cover the broad diversity of beer styles and corresponding diversity in chemical composition and aroma. See Supplementary Fig.  S1 .

Chemical dataset

Sample preparation.

Beers within their expiration date were purchased from commercial retailers. Samples were prepared in biological duplicates at room temperature, unless explicitly stated otherwise. Bottle pressure was measured with a manual pressure device (Steinfurth Mess-Systeme GmbH) and used to calculate CO 2 concentration. The beer was poured through two filter papers (Macherey-Nagel, 500713032 MN 713 ¼) to remove carbon dioxide and prevent spontaneous foaming. Samples were then prepared for measurements by targeted Headspace-Gas Chromatography-Flame Ionization Detector/Flame Photometric Detector (HS-GC-FID/FPD), Headspace-Solid Phase Microextraction-Gas Chromatography-Mass Spectrometry (HS-SPME-GC-MS), colorimetric analysis, enzymatic analysis, Near-Infrared (NIR) analysis, as described in the sections below. The mean values of biological duplicates are reported for each compound.

HS-GC-FID/FPD

HS-GC-FID/FPD (Shimadzu GC 2010 Plus) was used to measure higher alcohols, acetaldehyde, esters, 4-vinyl guaiacol, and sulfur compounds. Each measurement comprised 5 ml of sample pipetted into a 20 ml glass vial containing 1.75 g NaCl (VWR, 27810.295). 100 µl of 2-heptanol (Sigma-Aldrich, H3003) (internal standard) solution in ethanol (Fisher Chemical, E/0650DF/C17) was added for a final concentration of 2.44 mg/L. Samples were flushed with nitrogen for 10 s, sealed with a silicone septum, stored at −80 °C and analyzed in batches of 20.

The GC was equipped with a DB-WAXetr column (length, 30 m; internal diameter, 0.32 mm; layer thickness, 0.50 µm; Agilent Technologies, Santa Clara, CA, USA) to the FID and an HP-5 column (length, 30 m; internal diameter, 0.25 mm; layer thickness, 0.25 µm; Agilent Technologies, Santa Clara, CA, USA) to the FPD. N 2 was used as the carrier gas. Samples were incubated for 20 min at 70 °C in the headspace autosampler (flow rate, 35 cm/s; injection volume, 1000 µL; injection mode, split; Combi PAL autosampler, CTC Analytics, Switzerland). The injector, FID and FPD temperatures were kept at 250 °C. The GC oven temperature was first held at 50 °C for 5 min, then raised to 80 °C at a rate of 5 °C/min, followed by a second ramp of 4 °C/min to 200 °C, held for 3 min, and a final ramp of 4 °C/min to 230 °C, held for 1 min. Results were analyzed with the GCSolution software version 2.4 (Shimadzu, Kyoto, Japan). The GC was calibrated with a 5% EtOH solution (VWR International) containing the volatiles under study (Supplementary Table  S7 ).

HS-SPME-GC-MS

HS-SPME-GC-MS (Shimadzu GCMS-QP-2010 Ultra) was used to measure additional volatile compounds, mainly comprising terpenoids and esters. Samples were analyzed by HS-SPME using a triphase DVB/Carboxen/PDMS 50/30 μm SPME fiber (Supelco Co., Bellefonte, PA, USA) followed by gas chromatography (Thermo Fisher Scientific Trace 1300 series, USA) coupled to a mass spectrometer (Thermo Fisher Scientific ISQ series MS) equipped with a TriPlus RSH autosampler. 5 ml of degassed beer sample was placed in 20 ml vials containing 1.75 g NaCl (VWR, 27810.295). 5 µl internal standard mix was added, containing 2-heptanol (1 g/L) (Sigma-Aldrich, H3003), 4-fluorobenzaldehyde (1 g/L) (Sigma-Aldrich, 128376), 2,3-hexanedione (1 g/L) (Sigma-Aldrich, 144169) and guaiacol (1 g/L) (Sigma-Aldrich, W253200) in ethanol (Fisher Chemical, E/0650DF/C17). Each sample was incubated at 60 °C in the autosampler oven with constant agitation. After 5 min equilibration, the SPME fiber was exposed to the sample headspace for 30 min. The compounds trapped on the fiber were thermally desorbed in the injection port of the chromatograph by heating the fiber for 15 min at 270 °C.

The GC-MS was equipped with a low polarity RXi-5Sil MS column (length, 20 m; internal diameter, 0.18 mm; layer thickness, 0.18 µm; Restek, Bellefonte, PA, USA). Injection was performed in splitless mode at 320 °C, a split flow of 9 ml/min, a purge flow of 5 ml/min and an open valve time of 3 min. To obtain a pulsed injection, a programmed gas flow was used whereby the helium gas flow was set at 2.7 mL/min for 0.1 min, followed by a decrease in flow of 20 ml/min to the normal 0.9 mL/min. The temperature was first held at 30 °C for 3 min and then allowed to rise to 80 °C at a rate of 7 °C/min, followed by a second ramp of 2 °C/min till 125 °C and a final ramp of 8 °C/min with a final temperature of 270 °C.

Mass acquisition range was 33 to 550 amu at a scan rate of 5 scans/s. Electron impact ionization energy was 70 eV. The interface and ion source were kept at 275 °C and 250 °C, respectively. A mix of linear n-alkanes (from C7 to C40, Supelco Co.) was injected into the GC-MS under identical conditions to serve as external retention index markers. Identification and quantification of the compounds were performed using an in-house developed R script as described in Goelen et al. and Reher et al. 87 , 88 (for package information, see Supplementary Table  S8 ). Briefly, chromatograms were analyzed using AMDIS (v2.71) 89 to separate overlapping peaks and obtain pure compound spectra. The NIST MS Search software (v2.0 g) in combination with the NIST2017, FFNSC3 and Adams4 libraries were used to manually identify the empirical spectra, taking into account the expected retention time. After background subtraction and correcting for retention time shifts between samples run on different days based on alkane ladders, compound elution profiles were extracted and integrated using a file with 284 target compounds of interest, which were either recovered in our identified AMDIS list of spectra or were known to occur in beer. Compound elution profiles were estimated for every peak in every chromatogram over a time-restricted window using weighted non-negative least square analysis after which peak areas were integrated 87 , 88 . Batch effect correction was performed by normalizing against the most stable internal standard compound, 4-fluorobenzaldehyde. Out of all 284 target compounds that were analyzed, 167 were visually judged to have reliable elution profiles and were used for final analysis.

Discrete photometric and enzymatic analysis

Discrete photometric and enzymatic analysis (Thermo Scientific TM Gallery TM Plus Beermaster Discrete Analyzer) was used to measure acetic acid, ammonia, beta-glucan, iso-alpha acids, color, sugars, glycerol, iron, pH, protein, and sulfite. 2 ml of sample volume was used for the analyses. Information regarding the reagents and standard solutions used for analyses and calibrations is included in Supplementary Table  S7 and Supplementary Table  S9 .

NIR analyses

NIR analysis (Anton Paar Alcolyzer Beer ME System) was used to measure ethanol. Measurements comprised 50 ml of sample, and a 10% EtOH solution was used for calibration.

Correlation calculations

Pairwise Spearman Rank correlations were calculated between all chemical properties.

Sensory dataset

Trained panel.

Our trained tasting panel consisted of volunteers who gave prior verbal informed consent. All compounds used for the validation experiment were of food-grade quality. The tasting sessions were approved by the Social and Societal Ethics Committee of the KU Leuven (G-2022-5677-R2(MAR)). All online reviewers agreed to the Terms and Conditions of the RateBeer website.

Sensory analysis was performed according to the American Society of Brewing Chemists (ASBC) Sensory Analysis Methods 90 . 30 volunteers were screened through a series of triangle tests. The sixteen most sensitive and consistent tasters were retained as taste panel members. The resulting panel was diverse in age [22–42, mean: 29], sex [56% male] and nationality [7 different countries]. The panel developed a consensus vocabulary to describe beer aroma, taste and mouthfeel. Panelists were trained to identify and score 50 different attributes, using a 7-point scale to rate attributes’ intensity. The scoring sheet is included as Supplementary Data  3 . Sensory assessments took place between 10 a.m. and 12 p.m. The beers were served in black-colored glasses. Per session, between 5 and 12 beers of the same style were tasted at 12 °C to 16 °C. Two reference beers were added to each set and indicated as ‘Reference 1 & 2’, allowing panel members to calibrate their ratings. Not all panelists were present at every tasting. Scores were scaled by standard deviation and mean-centered per taster. Values are represented as z-scores and clustered by Euclidean distance. Pairwise Spearman correlations were calculated between taste and aroma sensory attributes. Panel consistency was evaluated by repeating samples across different sessions and performing ANOVA to identify differences, using the ‘stats’ package (v4.2.2) in R (for package information, see Supplementary Table  S8 ).
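The per-taster standardization described here corresponds to a simple grouped z-score; the sketch below assumes a long-format score table with illustrative column names, not the actual data files.

```python
# Minimal sketch: mean-center and scale panel scores per taster, then average
# the resulting z-scores per beer and attribute. Column names are illustrative.
import pandas as pd

panel = pd.read_csv("panel_scores.csv")   # hypothetical long-format file
panel["z"] = panel.groupby("taster")["score"].transform(
    lambda s: (s - s.mean()) / s.std()
)
z_per_beer = panel.groupby(["beer", "attribute"])["z"].mean().unstack()
```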

Online reviews from a public database

The ‘scrapy’ package in Python (v3.6) (for package information, see Supplementary Table  S8 ) was used to collect 232,288 online reviews (mean=922, min=6, max=5343) from RateBeer, an online beer review database. Each review entry comprised 5 numerical scores (appearance, aroma, taste, palate and overall quality) and an optional review text. The total number of reviews per reviewer was collected separately. Numerical scores were scaled and centered per rater, and mean scores were calculated per beer.

For the review texts, the language was estimated using the packages ‘langdetect’ and ‘langid’ in Python. Reviews that were classified as English by both packages were kept. Reviewers with fewer than 100 entries overall were discarded. 181,025 reviews from >6000 reviewers from >40 countries remained. Text processing was done using the ‘nltk’ package in Python. Texts were corrected for slang and misspellings; proper nouns and rare words that are relevant to the beer context were specified and kept as-is (‘Chimay’, ‘Lambic’, etc.). A dictionary of semantically similar sensorial terms, for example ‘floral’ and ‘flower’, was created, and such terms were collapsed into a single term. Words were stemmed and lemmatized to avoid identifying words such as ‘acid’ and ‘acidity’ as separate terms. Numbers and punctuation were removed.
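A stripped-down version of this normalization step might look as follows with nltk; the full pipeline described above additionally handles slang, protected proper nouns and the custom synonym dictionary, which are omitted here.

```python
# Minimal sketch of the text normalization: tokenize, keep alphabetic tokens,
# lemmatize and stem. The slang/synonym handling described above is omitted.
import nltk
from nltk.stem import PorterStemmer, WordNetLemmatizer
from nltk.tokenize import word_tokenize

nltk.download("punkt")
nltk.download("wordnet")

lemmatizer, stemmer = WordNetLemmatizer(), PorterStemmer()

def normalize(review_text):
    tokens = [t.lower() for t in word_tokenize(review_text) if t.isalpha()]
    return [stemmer.stem(lemmatizer.lemmatize(t)) for t in tokens]

print(normalize("Lovely floral hop aroma with a hint of acidity."))
```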

Sentences from up to 50 randomly chosen reviews per beer were manually categorized according to the aspect of beer they describe (appearance, aroma, taste, palate, overall quality—not to be confused with the 5 numerical scores described above) or flagged as irrelevant if they contained no useful information. If a beer contained fewer than 50 reviews, all reviews were manually classified. This labeled data set was used to train a model that classified the rest of the sentences for all beers 91 . Sentences describing taste and aroma were extracted, and term frequency–inverse document frequency (TFIDF) was implemented to calculate enrichment scores for sensorial words per beer.
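The enrichment scores mentioned here correspond to a standard TF-IDF weighting over the classified taste and aroma sentences; a hedged sketch with scikit-learn is shown below, where `taste_aroma_texts` is an assumed mapping from each beer to its concatenated relevant sentences.

```python
# Minimal sketch: TF-IDF enrichment scores for sensory words per beer.
# `taste_aroma_texts` is an assumed dict {beer_id: concatenated sentences}.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(taste_aroma_texts.values())

enrichment = pd.DataFrame(
    tfidf.toarray(),
    index=list(taste_aroma_texts.keys()),
    columns=vectorizer.get_feature_names_out(),
)
```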

The sex of the tasting subject was not considered when building our sensory database. Instead, results from different panelists were averaged, both for our trained panel (56% male, 44% female) and the RateBeer reviews (70% male, 30% female for RateBeer as a whole).

Beer price collection and processing

Beer prices were collected from the following stores: Colruyt, Delhaize, Total Wine, BeerHawk, The Belgian Beer Shop, The Belgian Shop, and Beer of Belgium. Where applicable, prices were converted to Euros and normalized per liter. Spearman correlations were calculated between these prices and mean overall appreciation scores from RateBeer and the taste panel, respectively.

Pairwise Spearman Rank correlations were calculated between all sensory properties.

Machine learning models

Predictive modeling of sensory profiles from chemical data.

Regression models were constructed to predict (a) trained panel scores for beer flavors and quality from beer chemical profiles and (b) public reviews’ appreciation scores from beer chemical profiles. Z-scores were used to represent sensory attributes in both data sets. Chemical properties with log-normal distributions (Shapiro-Wilk test, p  <  0.05 ) were log-transformed. Missing chemical measurements (0.1% of all data) were replaced with mean values per attribute. Observations from 250 beers were randomly separated into a training set (70%, 175 beers) and a test set (30%, 75 beers), stratified per beer style. Chemical measurements (p = 231) were normalized based on the training set average and standard deviation. In total, three linear regression-based models: linear regression with first-order interaction terms (LR), lasso regression with first-order interaction terms (Lasso) and partial least squares regression (PLSR); five decision tree models, Adaboost regressor (ABR), Extra Trees (ET), Gradient Boosting regressor (GBR), Random Forest (RF) and XGBoost regressor (XGBR); one support vector machine model (SVR) and one artificial neural network model (ANN) were trained. The models were implemented using the ‘scikit-learn’ package (v1.2.2) and ‘xgboost’ package (v1.7.3) in Python (v3.9.16). Models were trained, and hyperparameters optimized, using five-fold cross-validated grid search with the coefficient of determination (R 2 ) as the evaluation metric. The ANN (scikit-learn’s MLPRegressor) was optimized using Bayesian Tree-Structured Parzen Estimator optimization with the ‘Optuna’ Python package (v3.2.0). Individual models were trained per attribute, and a multi-output model was trained on all attributes simultaneously.
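The preprocessing and tuning described above can be condensed into a hedged sketch like the one below; the parameter grid, transformation choices and variable names are illustrative simplifications, not the exact settings used in this study.

```python
# Minimal sketch of the preprocessing and hyperparameter search: log-transform
# skewed features, mean-impute, standardize on training statistics, then tune a
# GBR with five-fold cross-validated grid search. Grid values are illustrative.
import numpy as np
from scipy.stats import shapiro
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

for col in X_train.columns:
    if shapiro(X_train[col].dropna()).pvalue < 0.05:   # treated as log-normal here
        X_train[col] = np.log1p(X_train[col])
        X_test[col] = np.log1p(X_test[col])

train_means = X_train.mean()
X_train = X_train.fillna(train_means)
X_test = X_test.fillna(train_means)
mu, sd = X_train.mean(), X_train.std()
X_train, X_test = (X_train - mu) / sd, (X_test - mu) / sd

search = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    {"n_estimators": [100, 500], "max_depth": [3, 5], "learning_rate": [0.05, 0.1]},
    cv=5,
    scoring="r2",
).fit(X_train, y_train)
```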

Model dissection

GBR was found to outperform other methods, resulting in models with the highest average R 2 values in both trained panel and public review data sets. Impurity-based rankings of the most important predictors for each predicted sensorial trait were obtained using the ‘scikit-learn’ package. To observe the relationships between these chemical properties and their predicted targets, partial dependence plots (PDP) were constructed for the six most important predictors of consumer appreciation 74 , 75 .

The ‘SHAP’ package in Python (v0.41.0) was implemented to provide an alternative ranking of predictor importance and to visualize the predictors’ effects as a function of their concentration 68 .

Validation of causal chemical properties

To validate the effects of the most important model features on predicted sensory attributes, beers were spiked with the chemical compounds identified by the models and descriptive sensory analyses were carried out according to the American Society of Brewing Chemists (ASBC) protocol 90 .

Compound spiking was done 30 min before tasting. Compounds were spiked into fresh beer bottles, that were immediately resealed and inverted three times. Fresh bottles of beer were opened for the same duration, resealed, and inverted thrice, to serve as controls. Pairs of spiked samples and controls were served simultaneously, chilled and in dark glasses as outlined in the Trained panel section above. Tasters were instructed to select the glass with the higher flavor intensity for each attribute (directional difference test 92 ) and to select the glass they prefer.

The final concentration after spiking was equal to the within-style average, after normalizing by ethanol concentration. This was done to ensure balanced flavor profiles in the final spiked beer. The same methods were applied to improve a non-alcoholic beer. Compounds were the following: ethyl acetate (Merck KGaA, W241415), ethyl hexanoate (Merck KGaA, W243906), isoamyl acetate (Merck KGaA, W205508), phenethyl acetate (Merck KGaA, W285706), ethanol (96%, Colruyt), glycerol (Merck KGaA, W252506), lactic acid (Merck KGaA, 261106).

Significant differences in preference or perceived intensity were determined by performing the two-sided binomial test on each attribute.
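A minimal version of this test, assuming illustrative counts rather than the actual tasting results, is shown below.

```python
# Minimal sketch: two-sided binomial test for a paired preference /
# directional difference test. Counts are illustrative, not actual results.
from scipy.stats import binomtest

n_tasters = 20          # tasters in the session
n_chose_spiked = 15     # tasters selecting the spiked sample (hypothetical)

result = binomtest(n_chose_spiked, n_tasters, p=0.5, alternative="two-sided")
print("p value:", result.pvalue)
```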

Reporting summary

Further information on research design is available in the  Nature Portfolio Reporting Summary linked to this article.

Data availability

The data that support the findings of this work are available in the Supplementary Data files and have been deposited to Zenodo under accession code 10653704 93 . The RateBeer scores data are under restricted access; they are not publicly available as they are property of RateBeer (ZX Ventures, USA). Access can be obtained from the authors upon reasonable request and with permission of RateBeer (ZX Ventures, USA). Source data are provided with this paper.

Code availability

The code for training the machine learning models, analyzing the models, and generating the figures has been deposited to Zenodo under accession code 10653704 93 .

References

Tieman, D. et al. A chemical genetic roadmap to improved tomato flavor. Science 355, 391–394 (2017).

Plutowska, B. & Wardencki, W. Application of gas chromatography–olfactometry (GC–O) in analysis and quality assessment of alcoholic beverages – A review. Food Chem. 107 , 449–463 (2008).

Legin, A., Rudnitskaya, A., Seleznev, B. & Vlasov, Y. Electronic tongue for quality assessment of ethanol, vodka and eau-de-vie. Anal. Chim. Acta 534 , 129–135 (2005).

Loutfi, A., Coradeschi, S., Mani, G. K., Shankar, P. & Rayappan, J. B. B. Electronic noses for food quality: A review. J. Food Eng. 144 , 103–111 (2015).

Ahn, Y.-Y., Ahnert, S. E., Bagrow, J. P. & Barabási, A.-L. Flavor network and the principles of food pairing. Sci. Rep. 1 , 196 (2011).

Bartoshuk, L. M. & Klee, H. J. Better fruits and vegetables through sensory analysis. Curr. Biol. 23 , R374–R378 (2013).

Piggott, J. R. Design questions in sensory and consumer science. Food Qual. Prefer. 6, 217–220 (1995).

Kermit, M. & Lengard, V. Assessing the performance of a sensory panel-panellist monitoring and tracking. J. Chemom. 19 , 154–161 (2005).

Cook, D. J., Hollowood, T. A., Linforth, R. S. T. & Taylor, A. J. Correlating instrumental measurements of texture and flavour release with human perception. Int. J. Food Sci. Technol. 40 , 631–641 (2005).

Chinchanachokchai, S., Thontirawong, P. & Chinchanachokchai, P. A tale of two recommender systems: The moderating role of consumer expertise on artificial intelligence based product recommendations. J. Retail. Consum. Serv. 61 , 1–12 (2021).

Ross, C. F. Sensory science at the human-machine interface. Trends Food Sci. Technol. 20 , 63–72 (2009).

Chambers, E. IV & Koppel, K. Associations of volatile compounds with sensory aroma and flavor: The complex nature of flavor. Molecules 18 , 4887–4905 (2013).

Pinu, F. R. Metabolomics—The new frontier in food safety and quality research. Food Res. Int. 72 , 80–81 (2015).

Danezis, G. P., Tsagkaris, A. S., Brusic, V. & Georgiou, C. A. Food authentication: state of the art and prospects. Curr. Opin. Food Sci. 10 , 22–31 (2016).

Shepherd, G. M. Smell images and the flavour system in the human brain. Nature 444 , 316–321 (2006).

Meilgaard, M. C. Prediction of flavor differences between beers from their chemical composition. J. Agric. Food Chem. 30 , 1009–1017 (1982).

Xu, L. et al. Widespread receptor-driven modulation in peripheral olfactory coding. Science 368 , eaaz5390 (2020).

Kupferschmidt, K. Following the flavor. Science 340 , 808–809 (2013).

Billesbølle, C. B. et al. Structural basis of odorant recognition by a human odorant receptor. Nature 615 , 742–749 (2023).

Smith, B. Perspective: Complexities of flavour. Nature 486 , S6–S6 (2012).

Pfister, P. et al. Odorant receptor inhibition is fundamental to odor encoding. Curr. Biol. 30 , 2574–2587 (2020).

Moskowitz, H. W., Kumaraiah, V., Sharma, K. N., Jacobs, H. L. & Sharma, S. D. Cross-cultural differences in simple taste preferences. Science 190 , 1217–1218 (1975).

Eriksson, N. et al. A genetic variant near olfactory receptor genes influences cilantro preference. Flavour 1 , 22 (2012).

Ferdenzi, C. et al. Variability of affective responses to odors: Culture, gender, and olfactory knowledge. Chem. Senses 38 , 175–186 (2013).

Lawless, H. T. & Heymann, H. Sensory evaluation of food: Principles and practices. (Springer, New York, NY). https://doi.org/10.1007/978-1-4419-6488-5 (2010).

Colantonio, V. et al. Metabolomic selection for enhanced fruit flavor. Proc. Natl. Acad. Sci. 119 , e2115865119 (2022).

Fritz, F., Preissner, R. & Banerjee, P. VirtualTaste: a web server for the prediction of organoleptic properties of chemical compounds. Nucleic Acids Res 49 , W679–W684 (2021).

Tuwani, R., Wadhwa, S. & Bagler, G. BitterSweet: Building machine learning models for predicting the bitter and sweet taste of small molecules. Sci. Rep. 9 , 1–13 (2019).

Dagan-Wiener, A. et al. Bitter or not? BitterPredict, a tool for predicting taste from chemical structure. Sci. Rep. 7 , 1–13 (2017).

Pallante, L. et al. Toward a general and interpretable umami taste predictor using a multi-objective machine learning approach. Sci. Rep. 12 , 1–11 (2022).

Malavolta, M. et al. A survey on computational taste predictors. Eur. Food Res. Technol. 248 , 2215–2235 (2022).

Lee, B. K. et al. A principal odor map unifies diverse tasks in olfactory perception. Science 381 , 999–1006 (2023).

Mayhew, E. J. et al. Transport features predict if a molecule is odorous. Proc. Natl. Acad. Sci. 119 , e2116576119 (2022).

Niu, Y. et al. Sensory evaluation of the synergism among ester odorants in light aroma-type liquor by odor threshold, aroma intensity and flash GC electronic nose. Food Res. Int. 113 , 102–114 (2018).

Yu, P., Low, M. Y. & Zhou, W. Design of experiments and regression modelling in food flavour and sensory analysis: A review. Trends Food Sci. Technol. 71 , 202–215 (2018).

Oladokun, O. et al. The impact of hop bitter acid and polyphenol profiles on the perceived bitterness of beer. Food Chem. 205 , 212–220 (2016).

Linforth, R., Cabannes, M., Hewson, L., Yang, N. & Taylor, A. Effect of fat content on flavor delivery during consumption: An in vivo model. J. Agric. Food Chem. 58 , 6905–6911 (2010).

Guo, S., Na Jom, K. & Ge, Y. Influence of roasting condition on flavor profile of sunflower seeds: A flavoromics approach. Sci. Rep. 9 , 11295 (2019).

Ren, Q. et al. The changes of microbial community and flavor compound in the fermentation process of Chinese rice wine using Fagopyrum tataricum grain as feedstock. Sci. Rep. 9 , 3365 (2019).

Hastie, T., Friedman, J. & Tibshirani, R. The Elements of Statistical Learning. (Springer, New York, NY). https://doi.org/10.1007/978-0-387-21606-5 (2001).

Dietz, C., Cook, D., Huismann, M., Wilson, C. & Ford, R. The multisensory perception of hop essential oil: a review. J. Inst. Brew. 126 , 320–342 (2020).

CAS   Google Scholar  

Roncoroni, Miguel & Verstrepen, Kevin Joan. Belgian Beer: Tested and Tasted. (Lannoo, 2018).

Meilgaard, M. Flavor chemistry of beer: Part II: Flavor and threshold of 239 aroma volatiles. in (1975).

Bokulich, N. A. & Bamforth, C. W. The microbiology of malting and brewing. Microbiol. Mol. Biol. Rev. MMBR 77 , 157–172 (2013).

Dzialo, M. C., Park, R., Steensels, J., Lievens, B. & Verstrepen, K. J. Physiology, ecology and industrial applications of aroma formation in yeast. FEMS Microbiol. Rev. 41 , S95–S128 (2017).

Article   PubMed   PubMed Central   Google Scholar  

Datta, A. et al. Computer-aided food engineering. Nat. Food 3 , 894–904 (2022).

American Society of Brewing Chemists. Beer Methods. (American Society of Brewing Chemists, St. Paul, MN, U.S.A.).

Olaniran, A. O., Hiralal, L., Mokoena, M. P. & Pillay, B. Flavour-active volatile compounds in beer: production, regulation and control. J. Inst. Brew. 123 , 13–23 (2017).

Verstrepen, K. J. et al. Flavor-active esters: Adding fruitiness to beer. J. Biosci. Bioeng. 96 , 110–118 (2003).

Meilgaard, M. C. Flavour chemistry of beer. part I: flavour interaction between principal volatiles. Master Brew. Assoc. Am. Tech. Q 12 , 107–117 (1975).

Briggs, D. E., Boulton, C. A., Brookes, P. A. & Stevens, R. Brewing 227–254. (Woodhead Publishing). https://doi.org/10.1533/9781855739062.227 (2004).

Bossaert, S., Crauwels, S., De Rouck, G. & Lievens, B. The power of sour - A review: Old traditions, new opportunities. BrewingScience 72 , 78–88 (2019).

Google Scholar  

Verstrepen, K. J. et al. Flavor active esters: Adding fruitiness to beer. J. Biosci. Bioeng. 96 , 110–118 (2003).

Snauwaert, I. et al. Microbial diversity and metabolite composition of Belgian red-brown acidic ales. Int. J. Food Microbiol. 221 , 1–11 (2016).

Spitaels, F. et al. The microbial diversity of traditional spontaneously fermented lambic beer. PLoS ONE 9 , e95384 (2014).

Blanco, C. A., Andrés-Iglesias, C. & Montero, O. Low-alcohol Beers: Flavor Compounds, Defects, and Improvement Strategies. Crit. Rev. Food Sci. Nutr. 56 , 1379–1388 (2016).

Jackowski, M. & Trusek, A. Non-Alcohol. beer Prod. – Overv. 20 , 32–38 (2018).

Takoi, K. et al. The contribution of geraniol metabolism to the citrus flavour of beer: Synergy of geraniol and β-citronellol under coexistence with excess linalool. J. Inst. Brew. 116 , 251–260 (2010).

Kroeze, J. H. & Bartoshuk, L. M. Bitterness suppression as revealed by split-tongue taste stimulation in humans. Physiol. Behav. 35 , 779–783 (1985).

Mennella, J. A. et al. A spoonful of sugar helps the medicine go down”: Bitter masking bysucrose among children and adults. Chem. Senses 40 , 17–25 (2015).

Wietstock, P., Kunz, T., Perreira, F. & Methner, F.-J. Metal chelation behavior of hop acids in buffered model systems. BrewingScience 69 , 56–63 (2016).

Sancho, D., Blanco, C. A., Caballero, I. & Pascual, A. Free iron in pale, dark and alcohol-free commercial lager beers. J. Sci. Food Agric. 91 , 1142–1147 (2011).

Rodrigues, H. & Parr, W. V. Contribution of cross-cultural studies to understanding wine appreciation: A review. Food Res. Int. 115 , 251–258 (2019).

Korneva, E. & Blockeel, H. Towards better evaluation of multi-target regression models. in ECML PKDD 2020 Workshops (eds. Koprinska, I. et al.) 353–362 (Springer International Publishing, Cham, 2020). https://doi.org/10.1007/978-3-030-65965-3_23 .

Gastón Ares. Mathematical and Statistical Methods in Food Science and Technology. (Wiley, 2013).

Grinsztajn, L., Oyallon, E. & Varoquaux, G. Why do tree-based models still outperform deep learning on tabular data? Preprint at http://arxiv.org/abs/2207.08815 (2022).

Gries, S. T. Statistics for Linguistics with R: A Practical Introduction. in Statistics for Linguistics with R (De Gruyter Mouton, 2021). https://doi.org/10.1515/9783110718256 .

Lundberg, S. M. et al. From local explanations to global understanding with explainable AI for trees. Nat. Mach. Intell. 2 , 56–67 (2020).

Ickes, C. M. & Cadwallader, K. R. Effects of ethanol on flavor perception in alcoholic beverages. Chemosens. Percept. 10 , 119–134 (2017).

Kato, M. et al. Influence of high molecular weight polypeptides on the mouthfeel of commercial beer. J. Inst. Brew. 127 , 27–40 (2021).

Wauters, R. et al. Novel Saccharomyces cerevisiae variants slow down the accumulation of staling aldehydes and improve beer shelf-life. Food Chem. 398 , 1–11 (2023).

Li, H., Jia, S. & Zhang, W. Rapid determination of low-level sulfur compounds in beer by headspace gas chromatography with a pulsed flame photometric detector. J. Am. Soc. Brew. Chem. 66 , 188–191 (2008).

Dercksen, A., Laurens, J., Torline, P., Axcell, B. C. & Rohwer, E. Quantitative analysis of volatile sulfur compounds in beer using a membrane extraction interface. J. Am. Soc. Brew. Chem. 54 , 228–233 (1996).

Molnar, C. Interpretable Machine Learning: A Guide for Making Black-Box Models Interpretable. (2020).

Zhao, Q. & Hastie, T. Causal interpretations of black-box models. J. Bus. Econ. Stat. Publ. Am. Stat. Assoc. 39 , 272–281 (2019).

Article   MathSciNet   Google Scholar  

Hastie, T., Tibshirani, R. & Friedman, J. The Elements of Statistical Learning. (Springer, 2019).

Labrado, D. et al. Identification by NMR of key compounds present in beer distillates and residual phases after dealcoholization by vacuum distillation. J. Sci. Food Agric. 100 , 3971–3978 (2020).

Lusk, L. T., Kay, S. B., Porubcan, A. & Ryder, D. S. Key olfactory cues for beer oxidation. J. Am. Soc. Brew. Chem. 70 , 257–261 (2012).

Gonzalez Viejo, C., Torrico, D. D., Dunshea, F. R. & Fuentes, S. Development of artificial neural network models to assess beer acceptability based on sensory properties using a robotic pourer: A comparative model approach to achieve an artificial intelligence system. Beverages 5 , 33 (2019).

Gonzalez Viejo, C., Fuentes, S., Torrico, D. D., Godbole, A. & Dunshea, F. R. Chemical characterization of aromas in beer and their effect on consumers liking. Food Chem. 293 , 479–485 (2019).

Gilbert, J. L. et al. Identifying breeding priorities for blueberry flavor using biochemical, sensory, and genotype by environment analyses. PLOS ONE 10 , 1–21 (2015).

Goulet, C. et al. Role of an esterase in flavor volatile variation within the tomato clade. Proc. Natl. Acad. Sci. 109 , 19009–19014 (2012).

Article   ADS   CAS   PubMed   PubMed Central   Google Scholar  

Borisov, V. et al. Deep Neural Networks and Tabular Data: A Survey. IEEE Trans. Neural Netw. Learn. Syst. 1–21 https://doi.org/10.1109/TNNLS.2022.3229161 (2022).

Statista. Statista Consumer Market Outlook: Beer - Worldwide.

Seitz, H. K. & Stickel, F. Molecular mechanisms of alcoholmediated carcinogenesis. Nat. Rev. Cancer 7 , 599–612 (2007).

Voordeckers, K. et al. Ethanol exposure increases mutation rate through error-prone polymerases. Nat. Commun. 11 , 3664 (2020).

Goelen, T. et al. Bacterial phylogeny predicts volatile organic compound composition and olfactory response of an aphid parasitoid. Oikos 129 , 1415–1428 (2020).

Article   ADS   Google Scholar  

Reher, T. et al. Evaluation of hop (Humulus lupulus) as a repellent for the management of Drosophila suzukii. Crop Prot. 124 , 104839 (2019).

Stein, S. E. An integrated method for spectrum extraction and compound identification from gas chromatography/mass spectrometry data. J. Am. Soc. Mass Spectrom. 10 , 770–781 (1999).

American Society of Brewing Chemists. Sensory Analysis Methods. (American Society of Brewing Chemists, St. Paul, MN, U.S.A., 1992).

McAuley, J., Leskovec, J. & Jurafsky, D. Learning Attitudes and Attributes from Multi-Aspect Reviews. Preprint at https://doi.org/10.48550/arXiv.1210.3926 (2012).

Meilgaard, M. C., Carr, B. T. & Carr, B. T. Sensory Evaluation Techniques. (CRC Press, Boca Raton). https://doi.org/10.1201/b16452 (2014).

Schreurs, M. et al. Data from: Predicting and improving complex beer flavor through machine learning. Zenodo https://doi.org/10.5281/zenodo.10653704 (2024).



Collaborative design of a health research training programme for nurses and midwives in Tshwane district, South Africa: a study protocol

  • Rodwell Gundo (http://orcid.org/0000-0002-8761-2055)
  • Mavis Fhumulani Mulaudzi
  • Department of Nursing Science, University of Pretoria, Pretoria, South Africa
  • Correspondence to Dr Rodwell Gundo; rodwell.gundo@up.ac.za

Introduction Nurses are essential for implementing evidence-based practices to improve patient outcomes. Unfortunately, nurses often lack knowledge about research and do not always understand research terminology. This study aims to develop an in-service health research training programme for nurses and midwives in the Tshwane district of South Africa.

Methods and analysis This protocol outlines a codesign study guided by the five stages of design thinking proposed by the Hasso-Plattner Institute of Design at Stanford University. The participants will include nurses and midwives at two hospitals in the Tshwane district, Gauteng Province. The five stages will be implemented in three phases. Phase 1: Stage 1—empathise and Stage 2—define; exploratory sequential mixed methods, including focus group discussions with nurses and midwives (n=40), face-to-face interviews (n=6) and surveys (n=330), will be used in this phase. Phase 2: Stage 3—ideate and Stage 4—prototype; a team of research experts (n=5) and nurses and midwives (n=20) will develop the training programme based on the identified learning needs. Phase 3: Stage 5—test; the programme will be delivered to clinical nurses and midwives (n=41). The training programme will be evaluated through pretraining and post-training surveys and face-to-face interviews (n=4) following training. SPSS V.29 will be used for quantitative analysis, and content analysis will be used to analyse the qualitative data.

Ethics and dissemination The protocol was approved by the Faculty of Health Sciences Research Ethics Committee of the University of Pretoria (reference number 123/2023). The protocol is also registered with the National Health Research Database in South Africa (reference number GP_202305_032). The study findings will be disseminated through conference presentations and publications in peer-reviewed journals.

  • Nursing Care
  • Patient-Centered Care
  • Patient Satisfaction

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:  http://creativecommons.org/licenses/by-nc/4.0/ .

https://doi.org/10.1136/bmjopen-2023-076959


STRENGTHS AND LIMITATIONS OF THIS STUDY

This study will be strengthened through the use of quantitative and qualitative methods to understand the research problem.

The inclusion of two hospitals and the participation of nurses and midwives from different cadres will enhance the credibility of the findings.

Local research experts, nurses and midwives will collaborate to develop a training programme appropriate to the context of the setting.

The findings will be limited to two hospitals; therefore, the findings may not be generalisable to other hospitals.

Introduction

Evidence-based practice (EBP) has gained prominence in health services internationally over the past three decades. 1 EBP integrates individual clinical expertise with clinical evidence generated from systematic research. 2 EBP aims to deliver appropriate, efficient patient care. 3 Consequently, generating evidence that informs care delivery has become increasingly important for improving patient-centred care, patient safety, patient outcomes and the healthcare system. 1 3 In healthcare, nurses are well positioned to implement EBP because they constitute the largest proportion of the health workforce. 1 4 Nurses thus have to be proactive in acquiring, synthesising and using research knowledge and the best evidence to inform their practice and decision-making. 3 4

Recognising the need for EBP, many nursing organisations worldwide have developed best practice guidelines for patient-care decision-making. 4 In South Africa, the roadmap for strengthening nursing and midwifery acknowledges that nurses are vital for providing safe and effective patient care. Strategically, investing in nurse-led research will help develop nurse-led models of care. 5 Similarly, the South African Nursing Council expects nurses to actively participate in research activities, including academic writing, reading and reviewing, as part of continuing professional development. 6 Training nurses and midwives can enhance their research capacity and enable them to use available resources for research, ultimately leading to changes in EBP in clinical settings.

Nurses need to gain research knowledge and become comfortable with research terminology. 7 8 Although undergraduate nursing training includes a research component, this training does not always translate into a strong understanding of research. 7 Consequently, nurse-led, patient-centred research remains limited. A recent review of nursing research from 2000 to 2019 showed that most nursing research is conducted by nurses working at higher education institutions. Research output and collaboration are also disproportionately concentrated in high-income countries across North America, Europe and Oceania rather than in low-income and middle-income countries. 9 Other challenges that affect health research include limited time and a lack of research facilities, research culture, access to mentors and workforce capacity. 10

Little is known about the research literacy of nurses and midwives and research training programmes for practicing nurses and midwives in South Africa. Therefore, we developed a protocol to develop a research training programme for nurses and midwives in the Tshwane district of South Africa. This protocol is guided by the following research questions: (a) what are the levels of nurses’ and midwives’ knowledge, attitudes and involvement in research?; (b) what are the learning needs of nurses and midwives regarding research design and implementation?; (c) what content should be included in a research training programme for nurses and midwives?; (d) how does the developed training programme impact nurses’ knowledge about research?

Theoretical framework

The principles of constructivism learning theory will guide this study. This theory is rooted in the work of Piaget and Vygotsky. 11 The paradigm explains how people might acquire and retain knowledge. 12 Through the lens of constructivism learning theory, adult educators acknowledge learners’ previous experiences, appreciate multiple perspectives and embed learning in social contexts. The instructor is a mentor who helps learners understand new information. Constructivism learning theory has three dimensions, namely, individual constructivism, social constructivism and contextualism. In individual constructivism, learners are self-directed and construct knowledge via personal experience. Social constructivism assumes that learning is socially mediated and that knowledge is constructed through social interaction. In contextualism, learning should be tied to real-life contexts. 13 One benefit of constructivism theory is that learners enjoy learning because they are actively engaged and have ownership over what they learn. 12 The theory was considered appropriate because the study will be conducted at two research-intensive hospitals, where nurses and midwives are already familiar with the research process.

Methods and analysis

Research design

We will use a codesign approach guided by the stages of design thinking proposed by the Hasso-Plattner Institute of Design at Stanford University. 14 15 The design originated from participatory research and involves actively engaging participants to identify needs and collaboratively propose solutions. 14 16 The approach is considered appropriate because it ensures meaningful involvement of end-users, thereby creating meaningful benefits. 17 A codesign approach also reduces implementation challenges because stakeholders are fully engaged throughout the process. 14 Underpinned by the African philosophy of Ubuntu, the process will promote a culture of working together and collective solidarity. 18

The study will be guided by the five stages of design thinking: empathise, define, ideate, prototype and test. Empathise aims to understand the deeper issues, needs and challenges needed to solve the problem. Define involves data analysis and prioritising the needs of the end users of the training programme. Ideate includes brainstorming for innovative solutions to address the identified needs. In the prototype stage, the idea or innovation is shown to the end users and other stakeholders. Finally, testing involves checking what works in a real-world setting. 14 15

Study setting

The study will be conducted at two public hospitals in the Tshwane district of Gauteng Province in South Africa. The province has the highest population density, the most hospitals and the greatest number of nurses and midwives. 19 According to a 2016 community survey, Gauteng has a population of 13.4 million people. 20 Tshwane is one of the five districts in the province and the third most populous district, accounting for 24% of the population in the province. 21 There are four district hospitals, namely, Tshwane, Pretoria West, Jubilee and ODI; one regional hospital, Mamelodi; and three tertiary hospitals, namely, Steve Biko Academic Hospital, Dr George Mukhari Hospital and Khalafong Hospital. The two hospitals were selected due to their proximity to the University of Pretoria. One of the hospitals is a tertiary hospital with 800 beds. The second hospital is a 240-bed district hospital linked to the University of Pretoria’s Faculty of Health Sciences. 22

Target population

The population will comprise nurses and midwives working at the two hospitals. In South Africa, there are six categories of nurses and midwives based on qualifications as follows: registered auxiliary nurse (higher certificate), registered general nurse (diploma in nursing), registered midwife (advanced diploma), registered professional nurse and midwife (bachelor’s degree), nurse specialist or midwife specialist (postgraduate diploma), advanced specialist nurse (master’s degree) and those with doctorate degrees. 5 Nurses working at academic hospitals are expected to engage in research activities, including academic writing, reading and reviewing, as part of continuing professional development. 6 A preliminary audit revealed 1900 nurses and midwives working at the two hospitals.

Inclusion and exclusion criteria

Participation will be limited to registered auxiliary nurses, registered general nurses, registered midwives, and registered professional nurses and midwives who are older than 18 years, registered with the South African Nursing Council and have more than 3 months of experience. In South Africa, all people older than 18 years can legally give consent. Nurses with less than 3 months of experience or undergoing orientation will be excluded from the study.

As illustrated in table 1, the study will be implemented in three phases and five stages to address the four objectives. Stage 1 is currently underway: collection of the qualitative data started in December 2023 at one of the two hospitals and will continue at the second hospital until April 2024. The whole study is expected to be completed by September 2024.


Table 1 Illustration of the research process guided by the stages of design thinking

Phase 1: empathise and define

In this phase, we aim to understand nurses’ and midwives’ perceived knowledge, attitudes and involvement in research, as well as their learning needs. This phase corresponds to the empathise and define stages. An exploratory sequential mixed methods design will be used. This design begins with collecting and analysing qualitative data; the qualitative findings are then used to develop quantitative measures or instruments to test the identified variables. 23 In this study, the qualitative findings will be used to revise a questionnaire for the subsequent quantitative strand.

Strand 1—qualitative study

Qualitative methods are appropriate for investigating the who, what and where of events or experiences when a phenomenon is poorly understood. 24 25

Sample size and sampling

Forty-six participants (n=46) will be selected from nurses and midwives working at the two hospitals. The sample size was pragmatically determined according to the mode of data collection and the volume of data to be collected. However, the final sample size will be determined by data saturation.

We will purposively sample nurses and midwives from the following cadres: registered auxiliary nurses, registered general nurses, registered midwives, and registered professional nurses and midwives. As presented in table 2, two focus group discussions (FGDs) will be held at each hospital, each involving 10 participants. Because power differences can cause a halo effect among participants, 26 one FGD will include senior professional nurses and midwives, while the other FGD will include junior nurses and midwives with either diplomas or certificates. For the individual interviews, three participants at each hospital (one registered auxiliary nurse, one registered general nurse with a diploma and one professional nurse with either a bachelor’s or postgraduate qualification) will be invited to participate. The participants will be expected to share their knowledge of the competencies needed for conducting health research.

Table 2 Sampling plan for the qualitative strand

Data collection

The study information will be communicated through nursing and midwifery managers, and participation will be voluntary. Nurses and midwives willing to participate will be invited to either FGDs or individual interviews and will be given the details of the study and a consent form. The interviews will be conducted in English, in private settings at the hospitals, at times that are most convenient for participants. The participants will be requested to use pseudonyms during the interviews. A semistructured interview guide will be used (refer to online supplemental file 1). The interviews will be audiotaped and later transcribed verbatim in English.

Supplemental material

Data analysis

The data will be analysed manually using conventional content analysis as described by Hsieh and Shannon. 27 The steps of the analysis will be as follows: (a) repeatedly reading the data to achieve immersion and a sense of the whole; (b) deriving and labelling codes by highlighting the words that capture critical thoughts and concepts; (c) sorting the related codes into categories; (d) organising numerous subcategories into fewer categories; (e) defining each category; and (f) identifying the relationships among the categories in terms of their concurrence, antecedents or consequences. To ensure the reliability of the qualitative coding, the two researchers will code the first transcript independently. The online Coding Analysis Toolkit 28 will be used to calculate intercoder reliability. The two researchers will discuss differences and agree on the coding before proceeding to the next transcript.
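The protocol leaves the choice of agreement statistic to the online Coding Analysis Toolkit. As a minimal illustration of the underlying idea, the sketch below computes Cohen’s kappa between two coders’ category assignments for the same transcript segments; the labels, the data and the use of scikit-learn are assumptions for illustration only, not part of the protocol.

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical category assignments by two independent coders for the same
# transcript segments (labels are illustrative only).
coder_1 = ["barrier", "barrier", "motivation", "skills", "motivation", "skills", "barrier"]
coder_2 = ["barrier", "motivation", "motivation", "skills", "motivation", "barrier", "barrier"]

kappa = cohen_kappa_score(coder_1, coder_2)
print(f"Cohen's kappa: {kappa:.2f}")
# Disagreements flagged by a low kappa would be discussed and resolved
# before the researchers proceed to the next transcript.
```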

Methodological rigour

Trustworthiness will be achieved through credibility, transferability, dependability and confirmability. 24 29 Credibility will be achieved through spatial and personal triangulation. Spatial triangulation refers to collecting data on the same phenomenon from multiple sites, while personal triangulation refers to collecting data from different types and levels of people. 29 This study will collect data from different cadres of nurses and midwives at two hospitals. Transferability will be enhanced by providing sufficient study details. Dependability and confirmability will be achieved by establishing an audit trail describing the procedures and processes. Additionally, reflexivity will be used to ensure the transparency and quality of the study. 29 30 Reflexivity is where researchers critique, appraise and evaluate the influence of subjectivity and context on the research process. 30 In some branches of qualitative inquiries, researchers use reflexive bracketing to prevent subjective influences. However, Olmos-Vega et al 30 observed that this approach is no longer favoured in modern qualitative research because setting aside certain aspects of subjectivity is problematic. In this study, reflexivity will be ensured by keeping memos and field notes to document interpersonal dynamics and critical decisions made throughout the study.

Strand 2—quantitative study

A cross-sectional survey will be used to assess nurses’ and midwives’ perceived knowledge, attitudes and involvement in research.

The sample size was calculated using Yamane’s formula 31 as follows: n = N / (1 + N e²), where n is the sample size, N is the population size and e is the level of precision. Assuming a 95% CI and an estimated attribute proportion of p=0.5, the calculated sample size for a population of N=1900 with ±5% precision is 330. A convenience sampling technique will be used to select participants.
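As a quick check of this calculation, the short sketch below applies Yamane’s formula to the audited population of 1900 nurses and midwives; only the formula and the figures come from the protocol, while the function itself is our illustration.

```python
def yamane_sample_size(population: int, precision: float = 0.05) -> int:
    """Yamane's formula: n = N / (1 + N * e**2)."""
    return round(population / (1 + population * precision ** 2))

print(yamane_sample_size(1900, 0.05))  # -> 330, matching the reported sample size
```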

The researchers will brief nurse managers about the study. Furthermore, posters inviting nurses and midwives to participate in the study will be placed in each department. The poster will include details of the study and relevant contact details. The nurses and midwives willing to participate will be given an information sheet, consent form and questionnaire. They will be requested to leave the completed questionnaire in a designated box in the unit manager’s office.

Data collection instrument

The data will be collected using the Edmonton Research Orientation Survey (EROS). The EROS was developed in Canada and is a valid and reliable self-reported instrument for measuring perceived knowledge, attitudes and involvement in research. The tool has four subscales with 43 items. The four subscales are the value of research, value of innovation, research involvement and research utilisation (EBP). Valuing research is a positive attitude towards research; the value of innovation refers to being on the leading edge or keeping up to date with information; research involvement relates to active participation in research; and research utilisation (EBP) pertains to whether respondents use research to guide their day-to-day practice. Additionally, there is a category for the barriers and support for research. 32–34

The EROS items are measured using a 5-point Likert scale ranging from 1—strongly disagree to 5—strongly agree. The maximum score is 215. Higher overall scores indicate a stronger research orientation. The scores will be categorised into high (between 143 and 215 points), medium (73–142 points) or low (0–72 points). 32 33 The tool has been extensively used to assess the research orientation of health professionals, including physiotherapists, 35 midwives, 36 occupational therapists, 33 academics 32 and undergraduate students. 34 Previous studies reported high internal reliability with Cronbach’s alpha coefficients of 0.95 37 and 0.92. 34
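To make the scoring rules concrete, the following sketch totals the 43 Likert items, bands the overall score into the low/medium/high categories described above, and computes Cronbach’s alpha with the standard formula. The simulated responses and function names are illustrative assumptions; only the item count, the score bands and the use of alpha come from the EROS literature cited here.

```python
import numpy as np

def eros_total(item_responses) -> int:
    """Sum the 43 Likert items (1 = strongly disagree ... 5 = strongly agree); maximum 215."""
    return int(np.sum(item_responses))

def eros_category(total: int) -> str:
    """Band an overall EROS score into the reported research-orientation categories."""
    if total >= 143:
        return "high"    # 143-215 points
    if total >= 73:
        return "medium"  # 73-142 points
    return "low"         # 0-72 points

def cronbach_alpha(items) -> float:
    """Cronbach's alpha for a (respondents x items) matrix of Likert responses."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)

# Hypothetical responses: 10 respondents x 43 items, each scored 1-5.
rng = np.random.default_rng(1)
responses = rng.integers(1, 6, size=(10, 43))
total = eros_total(responses[0])
print(total, eros_category(total))
print(f"Cronbach's alpha: {cronbach_alpha(responses):.2f}")
```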

Although the tool has previously been used among South African occupational therapists, 33 the copyright author noted that it was developed at a time when there was no internet access to information, hence the need to incorporate such developments. This study will therefore use the qualitative findings to identify items that are not included in the tool but are relevant to the South African context.

The quantitative data will be entered into Microsoft Excel and imported into IBM SPSS Statistics V.29. Descriptive statistics will be used to summarise demographic characteristics and questionnaire scores. Means and SDs will be calculated for individual items, subgroup scores and overall scores. Independent-samples t-tests, Mann-Whitney U tests and multiple regression will be used to compare the scores of different groups of nurses and midwives. The assumptions for each test will be assessed before analysis. The level of significance will be set at 0.05.
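The protocol specifies SPSS for these analyses; purely as an illustration of the planned group comparisons, the sketch below runs an independent-samples t-test and a Mann-Whitney U test on simulated overall EROS scores for two hypothetical groups of respondents.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Simulated overall EROS scores for two illustrative groups of respondents.
group_a = rng.normal(loc=120, scale=20, size=40)   # e.g. junior nurses and midwives
group_b = rng.normal(loc=135, scale=20, size=35)   # e.g. senior professional nurses

t_stat, t_p = stats.ttest_ind(group_a, group_b)        # independent-samples t-test
u_stat, u_p = stats.mannwhitneyu(group_a, group_b)     # non-parametric alternative
print(f"t = {t_stat:.2f} (p = {t_p:.3f}); U = {u_stat:.0f} (p = {u_p:.3f})")
# As in the protocol, results would be judged at a 0.05 significance level
# after checking the assumptions of each test.
```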

Phase 2: ideate and prototype

During this phase, we will develop the training programme based on the learning needs identified in Phase 1. Research experts (n=5) will participate in one design studio workshop to brainstorm the content to be included in the training programme. Although there is limited literature on the definition and characteristics of an expert, Bruce et al 38 defined an expert as a person who is knowledgeable or informed in a particular discipline and observed that maximum variation or heterogeneity in sampling experts yields rich information. This study will select experts based on the criteria proposed by Davis 39 and Rubio et al. 40 These criteria include clinical experience in the setting, professional certification in a related area, research experience, work experience, conference presentations and publications in the topic area.

A design studio workshop is a process in which participants create and critique proposed interventions. 16 The researcher will share the findings of Phase 1 and explain the workshop’s goal to the participants. Participants will be provided with pens, sticky notes and flip-chart paper. The researcher will facilitate the discussion and capture feedback. At the end of the workshop, the researcher will consolidate the ideas, create a more detailed programme design and communicate it to the participants.

Next, we will develop a prototype to be discussed in a consultative meeting and validation meeting. An iterative process will be used to validate the developed training programme. The consultative meeting will be held with research experts (n=5). A validation exercise will also be conducted with nurses and midwives (n=20), the programme’s end-users. The nurses and midwives will be identified in consultation with nurse managers at the two hospitals to avoid disruption of services. During the validation exercise, the participants will be grouped into smaller idea groups to review and discuss the developed programme. Each group will be requested to identify a representative to report on behalf of the group. The feedback from the consultative and validation meeting will help to improve the developed programme.

Phase 3: test

The purpose of this phase is to assess the impact of the developed training programme. The training will be delivered to 41 nurses and midwives in the Tshwane district. The sample size is based on similar studies that implemented interventions for health professionals. For example, a study by Gundo et al 41 used G*Power software 42 to calculate the sample size based on a conservative effect size of d=0.5, a power of 80% and an alpha of 0.05. The calculated sample size was 34, but 41 participants were invited to allow for a dropout rate of at most 20%. The identification and invitation of participants will be negotiated with the nurse managers at the two hospitals to avoid service disruptions, and the selection process will ensure representation of the different cadres of nurses and midwives. We will invite a team of research experts to facilitate the training. The impact of the training will be assessed by comparing pre-survey and post-survey EROS scores, through FGDs with participants, and through evaluations at the end of the training. A paired-sample t-test will be used to compare the pretest and post-test scores.
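Both the a priori sample-size reasoning and the planned pre/post comparison can be reproduced in a few lines. The sketch below solves for the required number of pairs (d=0.5, power 0.80, alpha 0.05) with statsmodels, which gives roughly 34 before the dropout allowance, and runs a paired t-test on simulated pre- and post-training scores; the scores themselves are illustrative assumptions.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.power import TTestPower

# A priori sample size for a paired design: d = 0.5, power = 0.80, alpha = 0.05.
required_pairs = TTestPower().solve_power(effect_size=0.5, power=0.8, alpha=0.05)
print(f"required pairs: {np.ceil(required_pairs):.0f}")  # about 34; 41 are invited to allow for dropout

# Hypothetical pre- and post-training EROS scores for the 41 trainees.
rng = np.random.default_rng(3)
pre = rng.normal(loc=115, scale=20, size=41)
post = pre + rng.normal(loc=10, scale=15, size=41)
t_stat, p_value = stats.ttest_rel(pre, post)             # paired-samples t-test
print(f"paired t = {t_stat:.2f}, p = {p_value:.3f}")
```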

Discussion

This protocol aims to develop a research training programme for nurses and midwives in the Tshwane district of South Africa. We will first investigate the learning needs of nurses and midwives; these learning needs will then inform a training programme to improve research capacity. As observed by Hines et al, 7 implementing a training programme can improve nurses’ research knowledge, critical appraisal ability and research efficacy. Building capacity for health research in Africa will also strengthen local ownership of research activities that target relevant topics.

Furthermore, findings relevant to local populations will be communicated in a culturally acceptable manner. Research recommendations may also resonate better and have a better uptake among African policymakers than research produced by internationally led teams. 43–45 This research training programme could be used in other hospitals with similar contexts and other categories of healthcare professionals. However, this will require a larger, multicentre validation study. Our findings will be limited to the two hospitals; therefore, the findings may not be generalisable to other hospitals.

Ethics and dissemination

The protocol was approved by the Research Ethics Committee, Faculty of Health Sciences at the University of Pretoria (reference number: 123/2023). The protocol is registered with the National Health Research Database in South Africa (reference number GP_202305_032). The two hospitals also provided permission for the study. Permission to use the EROS was obtained from the copyright authors, Dr Kerrie Pain and Dr Paul Hagler.

The participants will receive an information leaflet and be required to provide written informed consent. The researcher will ensure that the participants’ personal information is anonymised. Participants can give the researcher written permission to share their personal information. During the FGDs and individual interviews in Phase 1, the participants will be asked to use pseudonyms of their choice. In Phases 2 and 3, anonymity will not be possible because the meetings will be in person. However, the participants will be requested to maintain confidentiality. The data will be stored in compliance with the research ethics committee’s guidelines. The findings of the study will be disseminated through conference presentations and publications in peer-reviewed journals. The preparation of this manuscript followed the standards for reporting qualitative research 46 and the guidelines for reporting observational studies. 47

Ethics statements

Patient consent for publication.

Not applicable.

Acknowledgments

The manuscript was written during a writing retreat that was funded by the National Research Foundation through the Ubuntu Community Model of Nursing Project at the University of Pretoria in South Africa. We also thank Dr Cheryl Tosh for editing the manuscript.

References

The protocol’s numbered reference list appears only as truncated author names in the source; the complete list is available with the published protocol at https://doi.org/10.1136/bmjopen-2023-076959.

Contributors RG and MFM conceptualised the study, developed the proposal, drafted and revised the manuscript.

Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

Competing interests None declared.

Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

Provenance and peer review Not commissioned; externally peer reviewed.

Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.


Title: Does COVID-19 affect non-performing loans at commercial banks in Vietnam?

Authors: Nguyen Kim Quoc Trung

Addresses: Faculty of Accounting – Auditing, University of Finance – Marketing, Vietnam

Abstract: The paper investigates the direct connection between COVID-19 and non-performing loans in Vietnamese commercial banks from 2011 to 2021. The article uses qualitative methods (expert interviews, namely surveys of senior credit officers and credit managers to find the expert consensus coefficient) and quantitative methods (the generalised method of moments, to address endogeneity) to show that COVID-19 is a statistically significant factor that positively affects non-performing loans. In line with financial accelerator theory, the author emphasises the significant role of the economic shock in amplifying non-performing loans at commercial banks in Vietnam during the COVID-19 pandemic. The paper also provides managerial implications for controlling and mitigating non-performing loans in the credit-granting process.
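For readers unfamiliar with the estimation approach named in the abstract, the sketch below shows a GMM-style instrumental-variable regression in Python with the linearmodels package, where a lagged dependent variable is instrumented by a deeper lag. The simulated panel, the variable names and the single-equation setup are illustrative assumptions only; they are not the author’s specification, which applies a dynamic panel GMM estimator to Vietnamese bank data.

```python
import numpy as np
import pandas as pd
from linearmodels.iv import IVGMM

# Simulated bank-year data (purely illustrative).
rng = np.random.default_rng(0)
n = 200
npl_lag2 = rng.normal(0.03, 0.01, n)                        # deeper lag, used as instrument
npl_lag = 0.6 * npl_lag2 + rng.normal(0, 0.005, n)          # lagged NPL ratio (treated as endogenous)
covid = rng.integers(0, 2, n)                               # COVID-19 period dummy
size = rng.normal(10, 1, n)                                 # bank size (log assets)
npl = 0.5 * npl_lag + 0.004 * covid + rng.normal(0, 0.005, n)

data = pd.DataFrame({"npl": npl, "npl_lag": npl_lag, "npl_lag2": npl_lag2,
                     "covid": covid, "size": size, "const": 1.0})

# GMM estimation: the endogenous lagged NPL is instrumented by its deeper lag.
model = IVGMM(data["npl"], data[["const", "covid", "size"]],
              data[["npl_lag"]], data[["npl_lag2"]])
results = model.fit()
print(results.summary)
```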

Keywords: COVID-19; commercial banks; non-performing loans; NPLs; generalised method of moments; GMM; Vietnam.

DOI: 10.1504/IJPM.2024.137792

International Journal of Procurement Management, 2024 Vol.20 No.1, pp.33–46

Received: 06 Jun 2022; Accepted: 18 Feb 2023; Published online: 05 Apr 2024


