Likert Scale: Definition, Examples & How to Use

Likert Scale: A Psychometric Measurement Tool

Core Definition and Distinctions

The Likert scale is a psychometric tool widely used in questionnaires and survey research to measure attitudes, opinions, and perceptions. At its core, it is a method designed to capture the intensity of a respondent’s feelings toward a specific statement. Although the term is often used loosely to refer to any rating scale, the true Likert scale is defined by a specific methodology: respondents are presented with a series of declarative statements and asked to indicate their level of agreement or disagreement on a symmetric, balanced scale. This structured approach lets researchers quantify subjective experiences, turning attitudes into numerical data suitable for statistical analysis.

It is crucial to distinguish between a Likert item and the Likert scale itself. A Likert item refers to the single statement and its corresponding response options, such as “I am satisfied with this service,” followed by options ranging from “Strongly Disagree” to “Strongly Agree.” Conversely, the true Likert scale is the composite score derived from summing the responses across several related Likert items. This summation is performed under the assumption that these items collectively measure a single underlying construct or latent variable. Therefore, while individual items provide limited information, the summated scale is designed to offer a robust and reliable measure of the psychological construct being studied, making it a powerful instrument in social science research.

Historical Development and Origin

The scale is named after its inventor, the American psychologist Rensis Likert, who first published his methodology in 1932 in his doctoral dissertation, “A Technique for the Measurement of Attitudes.” Likert developed this technique as an improvement over existing, often cumbersome, methods of attitude measurement, such as the Thurstone scale. His goal was to create a simple, reliable, and easily administered method for quantifying complex human attitudes that could be applied efficiently in large-scale surveys and organizational research. The immediate acceptance and widespread adoption of the Likert method solidified its place as the dominant scaling method in behavioral and social research throughout the 20th century.

Likert’s innovation was predicated on the principle of summated ratings. Instead of relying on judges to determine the “value” of each statement (as in the Thurstone method), Likert proposed that the total score derived from a set of statements, all related to the same attitude, would provide a more accurate and reliable indicator of the individual’s underlying psychological state. This emphasis on internal consistency and the combination of multiple items to measure a single construct became a cornerstone of modern psychometrics. The historical context of its development—during a period of burgeoning survey science and psychological testing—underscores its significance as a tool that democratized and standardized attitude measurement.
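The emphasis on internal consistency can be made concrete with a small calculation. The sketch below uses Cronbach's alpha, a statistic that the text above does not name but that is commonly used for this purpose. The response table is hypothetical, and the function is an illustration rather than a validated implementation.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items table of scores."""
    k = len(rows[0])  # number of items
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(col) for col in zip(*rows)]
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: five respondents answering three related items (1-5).
responses = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [3, 3, 3],
    [1, 2, 2],
]
alpha = cronbach_alpha(responses)
```

Values closer to 1 indicate that the items vary together, which is what a summated scale assumes when it treats them as measuring one construct.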

Structure of a Likert Item and Scale

A fundamental characteristic of the Likert item is its symmetric or balanced structure: the response options include an equal number of positive (agreement) and negative (disagreement) choices, usually flanking a neutral midpoint. While the most common format uses five response levels, psychometric research suggests that scales with seven or nine points may offer slightly greater sensitivity and statistical power, though the practical differences are often minimal. The choice between an odd number of options (which includes a neutral midpoint) and an even number (which omits the midpoint and forces respondents to lean one way or the other, often called a forced-choice scale) depends on the research objective and on whether the researcher wants to let respondents avoid taking a side.

The standard five-level format is widely recognized and gives respondents a clear spectrum of intensity. It lets researchers capture variations in attitude that a simple binary “yes/no” choice would miss. The verbal anchors must be clear, unambiguous, and ideally perceived as equidistant by respondents, though whether the categories are truly equidistant remains a subject of debate in statistical analysis.

The format of a typical five-level Likert item is:

  • Strongly disagree
  • Disagree
  • Neither agree nor disagree
  • Agree
  • Strongly agree
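
For analysis, the five labels above are usually stored as ordered integer codes. The following is a minimal sketch of one way to do that. The 1-to-5 numbering is a common convention, and the names `Agreement`, `LABEL_TO_CODE`, and `encode` are illustrative rather than taken from any particular survey tool.

```python
from enum import IntEnum


class Agreement(IntEnum):
    """Five-level agreement codes; the 1-5 numbering is a convention."""
    STRONGLY_DISAGREE = 1
    DISAGREE = 2
    NEITHER = 3
    AGREE = 4
    STRONGLY_AGREE = 5


# Hypothetical mapping from the verbal anchors to their codes.
LABEL_TO_CODE = {
    "Strongly disagree": Agreement.STRONGLY_DISAGREE,
    "Disagree": Agreement.DISAGREE,
    "Neither agree nor disagree": Agreement.NEITHER,
    "Agree": Agreement.AGREE,
    "Strongly agree": Agreement.STRONGLY_AGREE,
}


def encode(label: str) -> int:
    """Return the numeric code for a response label (case-insensitive)."""
    normalized = label.strip().capitalize()
    return int(LABEL_TO_CODE[normalized])
```

Using an ordered type keeps the rank order explicit (for example, `Agreement.AGREE > Agreement.DISAGREE`) without asserting anything about the distance between categories.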

When multiple items are used to create a summated scale, researchers must often employ balanced keying. This involves including both positively and negatively phrased statements relating to the same construct. For instance, if measuring job satisfaction, one item might be “I enjoy coming to work” (positive keying), and another might be “The workload is overwhelming” (negative keying). When scoring, the responses to the negatively keyed items must be reversed before summation. This technique is vital for mitigating a common response bias known as acquiescence, where respondents tend to agree with statements regardless of their true attitude.
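
The reversal step can be sketched as follows, assuming responses coded 1 to 5. On that coding, reversing a response means subtracting it from 6, so 1 becomes 5 and 3 stays 3. The item names and the respondent's answers are hypothetical.

```python
def reverse_score(value: int, low: int = 1, high: int = 5) -> int:
    """Mirror a response around the scale midpoint (1<->5, 2<->4, 3 stays)."""
    if not low <= value <= high:
        raise ValueError(f"response {value} outside {low}-{high}")
    return low + high - value


def summated_score(responses: dict[str, int], reverse_keyed: set[str]) -> int:
    """Sum item responses after reversing the negatively keyed items."""
    return sum(
        reverse_score(v) if item in reverse_keyed else v
        for item, v in responses.items()
    )


# Hypothetical respondent; item names are illustrative only.
answers = {"enjoy_work": 4, "workload_overwhelming": 2, "proud_of_team": 5}
total = summated_score(answers, reverse_keyed={"workload_overwhelming"})
```

Here disagreeing (2) with the negatively keyed workload item counts as a favorable 4, so the total is 4 + 4 + 5 = 13.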

Real-World Application and Example

The utility of the Likert scale is vast, spanning fields from clinical psychology to market research and organizational behavior. A common practical example involves assessing employee engagement within a large corporation. The company wants to measure the degree to which employees feel connected to their work and the organization’s mission. Instead of relying on subjective interviews, a Likert-scaled survey provides quantifiable metrics.

The application of the principle proceeds through several steps. First, the researcher develops a set of ten to fifteen statements specifically designed to capture various facets of engagement, such as identification with company values, willingness to put in extra effort, and perception of management support. Second, these statements are presented to the employees, who rate each one using a standard five-point Likert scale. Third, the numerical values assigned to each response (e.g., 1 for Strongly Disagree, 5 for Strongly Agree) are summed after ensuring negative items are reversed. This summation yields a total engagement score for each employee.

Finally, the resulting data are analyzed. Suppose the survey has ten items, so that each employee’s total can range from 10 to 50. If the average summed score for a department is 45, management can infer high engagement. Conversely, an average of 20 sits much closer to the bottom of the range and suggests that intervention is needed. This step-by-step process shows how the Likert scale turns the abstract concept of “engagement” into a tangible, actionable metric, allowing managers to track changes over time and compare organizational units on a common footing.
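
The department-level comparison might look like the sketch below. It assumes a ten-item, five-point survey (totals from 10 to 50) whose per-employee totals have already been reverse-scored where needed. The department names and scores are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-employee totals on a ten-item, five-point survey
# (possible range 10-50), already reverse-scored where needed.
totals = [
    ("Engineering", 46), ("Engineering", 44), ("Engineering", 45),
    ("Support", 22), ("Support", 18), ("Support", 20),
]


def department_means(rows):
    """Group summed scores by department and return each group's mean."""
    grouped = defaultdict(list)
    for department, score in rows:
        grouped[department].append(score)
    return {dept: mean(scores) for dept, scores in grouped.items()}


def percent_of_range(score, n_items=10, low=1, high=5):
    """Rescale a summed score to 0-100 % of the attainable range."""
    minimum, maximum = n_items * low, n_items * high
    return 100 * (score - minimum) / (maximum - minimum)


means = department_means(totals)
```

Rescaling to the attainable range is a design choice, not part of the Likert method itself. It makes clear that a mean of 20 is 25 % of the way up a 10-to-50 range, not 40 % as the raw fraction 20/50 might suggest.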

Challenges and Response Biases

While highly popular, Likert scales are susceptible to several forms of systematic distortion, known as response biases, which can compromise the validity of the results. One prevalent issue is central tendency bias, where respondents consistently avoid using the extreme categories (Strongly Agree or Strongly Disagree), clustering their responses around the middle or neutral option. This may occur if the respondent lacks strong feelings, perceives the statements as non-applicable, or seeks to appear moderate.

Another significant distortion is social desirability bias, where respondents attempt to portray themselves or their organization in a more favorable light than reality dictates. For instance, in a survey about ethical behavior, an individual might be reluctant to “Strongly Disagree” with a statement regarding workplace honesty, even if their actions suggest otherwise. This bias is particularly strong when the survey topic is sensitive or carries social judgment.

Finally, acquiescence bias, or “yea-saying,” describes the tendency of some respondents to agree with statements regardless of their content. As noted earlier, this bias can be partially mitigated through the strategic inclusion of balanced keying (mixing positively and negatively worded statements). Addressing these biases often requires careful scale design, precise instructions, and sometimes the use of advanced statistical modeling to account for systematic response patterns that do not reflect the true attitude being measured.
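
Simple descriptive screens can help flag two of these patterns before any modeling is attempted. The sketch below is an assumption-laden illustration. The function names and example respondents are hypothetical, and there is no standard cut-off for when a share counts as biased.

```python
def midpoint_share(responses, midpoint=3):
    """Fraction of a respondent's answers that sit on the midpoint."""
    return sum(r == midpoint for r in responses) / len(responses)


def raw_agreement_share(responses, agree_from=4):
    """Fraction of raw (un-reversed) answers that are Agree or above."""
    return sum(r >= agree_from for r in responses) / len(responses)


# Hypothetical raw answers to a balanced set: half the items positively
# keyed, half negatively keyed, so a consistent attitude should NOT
# produce agreement with nearly everything.
yea_sayer = [5, 4, 5, 4, 4, 5]
fence_sitter = [3, 3, 3, 2, 3, 3]

flags = {
    "yea_sayer_agreement": raw_agreement_share(yea_sayer),
    "fence_sitter_midpoint": midpoint_share(fence_sitter),
}
```

The agreement share is computed before reverse-scoring, because agreeing with both a statement and its opposite is the signature of acquiescence. A high share is only a prompt for closer inspection, since some respondents genuinely hold moderate or uniformly favorable views.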

Scoring, Analysis, and Level of Measurement

The scoring and subsequent analysis of Likert data are subjects of considerable debate within statistics and psychometrics, primarily centered on the appropriate level of measurement. A core question is whether individual Likert items should be treated as ordinal data or interval-level data. When treated strictly as ordinal data, the categories possess a rank order (Agree is higher than Disagree), but the distance between adjacent categories is not assumed to be equal. In this case, researchers should summarize responses with the median or mode rather than the mean, and compare groups with non-parametric tests such as the Mann–Whitney U test.
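
An ordinal treatment of a single item might be sketched as follows. The two groups of responses are hypothetical. The U statistic is computed directly from its pairwise definition (ties count one half), and no p-value is produced, since that would normally come from published tables or a statistics library.

```python
from statistics import median, mode


def mann_whitney_u(x, y):
    """U statistic for sample x versus y; tied pairs count one half."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical responses to one five-point item from two groups.
group_a = [4, 5, 4, 3, 5, 4]
group_b = [2, 3, 3, 1, 4, 2]

summary = {
    "median_a": median(group_a),
    "mode_a": mode(group_a),
    "median_b": median(group_b),
    "u_a": mann_whitney_u(group_a, group_b),
    "u_b": mann_whitney_u(group_b, group_a),
}
```

The two U values always add up to the number of pairs (here 6 × 6 = 36). The more lopsided the split, the stronger the evidence that one group tends to give higher responses.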

However, when responses to several Likert items are summed to form a composite scale, the result is often treated as an approximation of interval-level data, especially when each item offers five or more categories and the distribution of totals is reasonably normal. This treatment is often justified by an informal appeal to the Central Limit Theorem: a total built from many item responses tends to be more nearly normally distributed than any single item, even though the individual items are ordinal. When the summed scores are treated as interval data, parametric tests such as the t-test or analysis of variance (ANOVA) can be applied, allowing for more detailed comparison and modeling of relationships between variables.
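
As an illustration of the parametric route, the sketch below computes a two-sample t statistic on summed scores. It uses Welch's unequal-variance form, which is a choice made here and not something the text prescribes, and the scores for the two teams are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic and approximate degrees of freedom."""
    # Squared standard error contributed by each sample.
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# Hypothetical summed scores (ten items, range 10-50) for two teams.
team_a = [42, 45, 39, 44, 40]
team_b = [35, 38, 33, 36, 38]

t_stat, dof = welch_t(team_a, team_b)
```

Turning the statistic into a p-value requires the t distribution, which the standard library does not provide. In practice a statistics package would handle that step.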

Many researchers justify treating the summated scale as interval data because the verbal anchors (e.g., “Strongly Agree” to “Strongly Disagree”) are carefully chosen to imply a symmetry and equal spacing around the neutral midpoint. To treat the summated score as merely ordinal data would result in a significant loss of statistical information and predictive power. Consequently, in applied research, the practice of treating a well-constructed, multi-item Likert scale as interval data is common, provided the researchers acknowledge the underlying assumptions and limitations regarding the true metric properties of the scale.

Advanced Analysis: Rasch Modeling

For researchers aiming for the highest level of measurement precision, the polytomous Rasch model provides an advanced framework for analyzing Likert scale data. The Rasch model is an item response theory (IRT) approach that attempts to transform the ordinal raw scores into true interval-level estimates along a continuum. This is achieved by testing whether the data rigorously conforms to a set of strict formal axioms, ensuring that the intervals between scale points truly correspond to empirical observations in a metric sense, thereby validating the assumption of equidistance.

One of the critical benefits of applying the polytomous Rasch model is its ability to test the hypothesis that the statements reflect increasing levels of the attitude or trait being measured. Often, application of the model reveals that the neutral category (“Neither agree nor disagree”) does not function as a true intermediate level between disagreement and agreement but may instead operate as a category for non-response or indifference. While powerful, the Rasch model requires extensive data checking to ensure the Likert items fit the stringent requirements of the model, meaning not every set of Likert data is suitable for this type of transformation.
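
To make the idea of category thresholds concrete, the sketch below computes response-category probabilities under Andrich's rating scale formulation of the polytomous Rasch model. The text above does not say which parameterization it has in mind, so this choice is an assumption. The parameter names `theta` (person location), `delta` (item location), and `thresholds` follow common notation, and all numeric values are hypothetical.

```python
from math import exp


def category_probabilities(theta, delta, thresholds):
    """Probability of each response category under the rating scale model.

    theta: person location; delta: item location; thresholds: the m
    step parameters separating the m + 1 ordered categories.
    """
    # Cumulative sums of (theta - delta - tau_j); category 0 has sum 0.
    exponents = [0.0]
    for tau in thresholds:
        exponents.append(exponents[-1] + (theta - delta - tau))
    weights = [exp(e) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical, ordered thresholds for a five-category item.
ordered = category_probabilities(0.0, 0.0, [-1.5, -0.5, 0.5, 1.5])

# Same item with the two middle thresholds swapped (disordered).
disordered = category_probabilities(0.0, 0.0, [-1.5, 0.5, -0.5, 1.5])
```

With ordered thresholds, the middle category is the most likely response for a person located at the item's position. With the middle thresholds swapped, the neutral category is less likely than its neighbors, which is one way a midpoint can fail to behave as a true intermediate level.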

Connections to Other Psychometric Concepts

The Likert scale originated in social psychology, where its original purpose was the measurement of social attitudes, but its methodology is firmly rooted in psychometrics, the science of measuring mental capacities and processes. It is a specific type of summated rating scale, placing it in a broader category of scaling techniques used in survey design.

One related concept is the semantic differential scale, developed by Charles Osgood. Unlike the Likert scale, which asks for agreement with a statement, the semantic differential scale asks respondents to rate a concept (e.g., “My Job”) on a continuum between two bipolar adjectives (e.g., Good [7] to Bad [1] or Strong [7] to Weak [1]). Both scales are designed to capture the direction and intensity of feelings, but the Likert scale focuses on a specific behavioral or attitudinal statement, whereas the semantic differential scale focuses on the connotative meaning of a concept along evaluation, potency, and activity dimensions.

Another connection exists with Guttman scaling (or cumulative scaling). Guttman scaling posits that items can be arranged in a hierarchy such that agreeing with a high-ranking item implies agreement with all lower-ranking items. The Likert scale is generally less restrictive than the Guttman scale, as Likert items are not necessarily assumed to be cumulative or perfectly hierarchical. While the Guttman approach is complex and less frequently used in general survey research, the Likert scale’s simplicity and robustness have ensured its continued dominance as the primary method for reliably quantifying attitudes and perceptions across nearly all domains of the behavioral and social sciences.
