Handling Data

When you access data in jamovi via self$data, you aren’t just getting a standard R data frame. jamovi uses a sophisticated system to ensure data integrity while remaining flexible for the user.

1. The “Dual Nature” of Variables

In jamovi, variables have a Dual Nature: they exist as both Factors (labels) and Numeric values simultaneously.

There are six primary variable types:

Type               R Representation
Nominal (Text)     Factor
Nominal (Integer)  Factor (with underlying Numeric attributes)
Ordinal (Text)     Factor
Ordinal (Integer)  Factor (with underlying Numeric attributes)
Continuous         Integer or Numeric
ID                 Integer or Character

Why this matters

Users often want to treat the same variable differently depending on the context. For example, a “Likert Scale” (Ordinal) might be used as a Grouping Factor in a t-test, but as a Numeric Score when calculating a mean.
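
As a minimal base R sketch of this idea (the variable names and data are invented for illustration and are not part of jamovi's API), the same Likert responses can serve as a grouping factor in one computation and as numeric scores in another:

```r
# Hypothetical Likert responses (1-5) stored as an ordered factor
satisfaction <- factor(c(4, 2, 5, 4, 2, 5), levels = 1:5, ordered = TRUE)
reaction_time <- c(310, 450, 280, 330, 470, 300)

# Used as a grouping factor: mean reaction time per observed response category
group_means <- tapply(reaction_time, droplevels(satisfaction), mean)

# Used as a numeric score: convert through the labels, then average
scores <- as.numeric(as.character(satisfaction))
mean_score <- mean(scores)
```

Neither use is more "correct" than the other; which one applies depends on the role the user gives the variable in the analysis.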


2. Best Practice: Explicit Conversion

Do not infer how a variable should be treated from its data type alone. Instead, ask the user explicitly by providing separate input slots.

jamovi follows the tradition of statistics software where measure type is only a guide — users routinely ignore it, and requiring them to set the correct type on potentially hundreds of columns before running an analysis creates unnecessary friction.

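For example, in an ANCOVA there are two ways to decide which variables are factors and which are covariates:

Approach 1 (recommended): provide separate input slots, one for factors and one for covariates, and convert each variable according to the slot the user placed it in.

Approach 2 (avoid): provide a single input slot and infer the role from the measure type, treating Nominal and Ordinal variables as factors and Continuous variables as covariates.

The second approach will surprise users whose variables have the “wrong” type set, and it removes their ability to override that decision without editing their data.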

By default, jamovi provides Nominal and Ordinal variables as Factors. If you need to use them as numbers, you must explicitly convert them.

Important

The Golden Rule of Conversion: always perform your data conversions early in your .run() function, and always before using functions like na.omit() or subset(). These base R functions often strip away the special attributes jamovi uses to track numeric values.
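
The attribute loss can be reproduced in plain R. The sketch below is illustrative only: the column, its data, and the attribute name "values" are assumptions standing in for whatever jamovi attaches, not a statement of jamovi's internals.

```r
# Hypothetical data frame with one factor column containing a missing value
data <- data.frame(
    rating = factor(c("Low", "High", NA, "Low"), levels = c("Low", "High"))
)

# Attach an extra attribute to the column (assumed name, for illustration)
attr(data$rating, "values") <- c(1L, 5L)

# Before cleaning, the attribute is still attached
before <- attr(data$rating, "values")

# na.omit() drops the NA row by subsetting the column, which keeps the
# factor's levels and class but discards other attributes
cleaned <- na.omit(data)
after <- attr(cleaned$rating, "values")

print(before)  # the attached values
print(after)   # NULL: the attribute did not survive
```

Once the attribute is gone, a later conversion has nothing to recover the numeric values from, which is why conversion belongs before cleaning.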

Implementation Example (ANCOVA)

If you are building an ANCOVA that requires a numeric dependent variable, factors for your groups, and numeric covariates, follow these steps:

.run = function() {
    
    # 1. READ option values into short names
    dep  <- self$options$dep
    facs <- self$options$factors
    covs <- self$options$covs
    
    # 2. GET the raw data
    data <- self$data
    
    # 3. CONVERT to the required types explicitly
    
    # Ensure Dependent is numeric
    data[[dep]] <- jmvcore::toNumeric(data[[dep]])
    
    # Ensure Factors are factors
    for (fac in facs)
        data[[fac]] <- as.factor(data[[fac]])
        
    # Ensure Covariates are numeric
    for (cov in covs)
        data[[cov]] <- jmvcore::toNumeric(data[[cov]])
    
    # 4. CLEAN the data (now that attributes aren't needed)
    data <- na.omit(data)
    
    # 5. PERFORM calculations...
}

Why use jmvcore::toNumeric()?
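
One motivation can be shown with base R alone: calling as.numeric() directly on a factor returns the internal level codes, not the numbers the labels represent. The sketch below uses invented data and only base R functions. It illustrates the pitfall rather than how jmvcore::toNumeric() is implemented.

```r
# Hypothetical factor whose labels are the numbers 10, 20 and 30
dose <- factor(c("10", "30", "20", "10"), levels = c("10", "20", "30"))

# as.numeric() on a factor yields the level codes (positions in levels())
codes <- as.numeric(dose)

# Converting through the labels recovers the intended numbers
values <- as.numeric(as.character(dose))

print(codes)   # 1 3 2 1
print(values)  # 10 30 20 10
```

Inside an analysis, prefer jmvcore::toNumeric() as in the example above, since going through the labels only works when every label happens to parse as a number.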

Next Step: Now that you can handle data safely, let’s look at how to manage complex analysis state.