How to Find and Count Missing Values in R DataFrame

Last Updated : 13 Jan, 2026

In R programming, missing values are represented using NA. Before analysis, it is important to identify where missing values occur and how many are present. R provides simple built-in functions like is.na(), which() and sum() to handle this task.

Consider a small data frame containing player statistics, where some values are missing:

R
df <- data.frame(
  player = c("A", "B", "C", "D"),
  runs = c(100, 200, NA, 408),
  wickets = c(17, NA, 20, 5)
)
df
which(is.na(df))
sum(is.na(df))

Output:

  player runs wickets
1      A  100      17
2      B  200      NA
3      C   NA      20
4      D  408       5
[1]  7 10
[1] 2

Functions Used

which(is.na(data))
sum(is.na(data))

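Where:

  • is.na(data): Identifies missing values and returns TRUE for each NA. For a data frame, the result is a logical matrix with the same rows and columns.
  • which(is.na(data)): Returns the positions of the missing values. When data is a data frame, each position is a single index counted down the columns, from left to right.
  • sum(is.na(data)): Calculates the total number of missing values.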
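
The three calls can be traced one step at a time. The following is a minimal sketch on a made-up frame (scores, name and marks are hypothetical names used only for this illustration) showing what each function returns.

```r
# Hypothetical three-row frame, used only for this illustration
scores <- data.frame(
  name  = c("X", "Y", "Z"),
  marks = c(90, NA, 75)
)

# Step 1: is.na() gives a logical matrix with the same shape as the frame
na_mask <- is.na(scores)
na_mask

# Step 2: which() gives the positions of TRUE, counted down each column.
# The three cells of `name` come first, so row 2 of `marks` is position 5.
na_index <- which(na_mask)
na_index

# Step 3: sum() treats TRUE as 1, so it returns the number of NAs
na_total <- sum(na_mask)
na_total
```

Because the positions run down the columns, the same NA would get a different index if the columns were reordered.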

Find and Count Missing Values in the Entire Data Frame

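We create a data frame named df and use which(is.na()) to get the positions of missing values and sum(is.na()) to get the total number. Because is.na() converts the data frame into a logical matrix, each position is a single index counted down the columns: in a 4-row data frame, index 8 is row 4 of the second column.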

  • data.frame: creates tabular data from vectors.
  • is.na: checks whether a value is missing (NA).
  • which: returns the positions of TRUE values in a logical vector.
  • sum: counts TRUE values by summing them (as TRUE = 1, FALSE = 0).
R
df <- data.frame(player = c('A', 'B', 'C', 'D'),
                 runs = c(100, 200, 408, NA),
                 wickets = c(17, 20, NA, 5))

print("Position of missing values")
which(is.na(df))

print("Count of total missing values")
sum(is.na(df))

Output:

[1] "Position of missing values"
[1]  8 11
[1] "Count of total missing values"
[1] 2
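
Single indices such as 8 and 11 are hard to map back to a cell by eye. As a sketch of one alternative, which() accepts arr.ind = TRUE and then returns row and column numbers instead. The names na_cells and na_report are illustrative, not part of the original example.

```r
df <- data.frame(player = c("A", "B", "C", "D"),
                 runs = c(100, 200, 408, NA),
                 wickets = c(17, 20, NA, 5))

# arr.ind = TRUE asks which() for row/column pairs instead of single indices
na_cells <- which(is.na(df), arr.ind = TRUE)
na_cells

# Translate the column numbers into column names for readability
na_report <- data.frame(
  row    = unname(na_cells[, "row"]),
  column = names(df)[na_cells[, "col"]]
)
na_report
```

Here na_report lists row 4 of runs and row 3 of wickets, which matches the missing cells in df.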

Count Missing Values Using summary()

We use summary() to get statistical details of each column, including the number of missing values.

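  • summary: gives descriptive statistics per column and reports the number of NAs for numeric columns that contain them (character columns are summarised only by length, class and mode).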
R
df <- data.frame(player = c('A', 'B', 'C', 'D'),
                 runs = c(NA, 200, 408, NA),
                 wickets = c(17, 20, NA, 8))

summary(df)

Output:

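In the printed summary, the runs column ends with an NA's entry of 2 and the wickets column with an NA's entry of 1. In R 4.0 and later, player is stored as a character column, so its summary lists only length, class and mode, with no NA count.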
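
When only the NA count is needed, it can be read out of the summary of a single column. This is a sketch under the assumption that the column is numeric; runs_summary and runs_na are illustrative names. It also shows that the "NA's" entry is absent when a column has no missing values.

```r
df <- data.frame(player = c("A", "B", "C", "D"),
                 runs = c(NA, 200, 408, NA),
                 wickets = c(17, 20, NA, 8))

# summary() of a numeric vector is a named vector; "NA's" is one of its names
runs_summary <- summary(df$runs)
runs_na <- unclass(runs_summary)[["NA's"]]
runs_na

# Cross-check against the direct count
sum(is.na(df$runs))

# A column without missing values has no "NA's" entry at all
"NA's" %in% names(summary(c(1, 2, 3)))
```

For anything beyond a quick look, sum(is.na()) is the more direct route, since it works the same way for every column type.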

Count Missing Values Using colSums()

We use colSums() with is.na() to count NA values in each column.

  • colSums: computes the sum of each column, here to count NAs.
  • is.na: checks for missing values.
R
df <- data.frame(player = c('A', 'B', 'C', 'D'),
                 runs = c(NA, 200, 408, NA),
                 wickets = c(17, 20, NA, 8))

colSums(is.na(df))

Output:

 player    runs wickets
      0       2       1
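
The same logical matrix can be summarised in other directions. As a small sketch, rowSums() counts the NAs in each row and colMeans() gives the share of each column that is missing. The names na_per_row and na_share are illustrative.

```r
df <- data.frame(player = c("A", "B", "C", "D"),
                 runs = c(NA, 200, 408, NA),
                 wickets = c(17, 20, NA, 8))

na_mask <- is.na(df)

# Number of missing values in each row
na_per_row <- rowSums(na_mask)
na_per_row

# Fraction of each column that is missing (the mean of TRUE/FALSE values)
na_share <- colMeans(na_mask)
na_share
```

Only the second row is complete here, and half of the runs column is missing.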

Find and Count Missing Values in a Single Column

We check the missing values in specific columns using dataframe$column.

  • $ operator: accesses a specific column from a data frame.
R
df <- data.frame(player = c('A', 'B', 'C', 'D'),
                 runs = c(NA, 200, 408, NA),
                 wickets = c(17, 20, NA, 8))

print("Location of missing values in runs column")
which(is.na(df$runs))

print("Count of missing values in wickets column")
sum(is.na(df$wickets))

Output:

[1] "Location of missing values in runs column"
[1] 1 4
[1] "Count of missing values in wickets column"
[1] 1
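
Once the missing positions in a column are known, they can be used to pull out the affected rows. The following is a minimal sketch; missing_runs and affected_players are illustrative names.

```r
df <- data.frame(player = c("A", "B", "C", "D"),
                 runs = c(NA, 200, 408, NA),
                 wickets = c(17, 20, NA, 8))

# Logical vector: TRUE where runs is missing
missing_runs <- is.na(df$runs)

# Rows of the data frame whose runs value is missing
df[missing_runs, ]

# Players affected by the missing runs
affected_players <- as.character(df$player[missing_runs])
affected_players

# Quick yes/no check for a column
anyNA(df$wickets)
```

anyNA() answers only whether a column has any missing value, which is enough when the exact positions are not needed.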

Find and Count Missing Values in All Columns

We use sapply() to apply functions column-wise and identify NA positions and counts.

  • sapply: applies a function to each column; it returns a vector when every result has length one and a list when the results differ in length.
  • function(x): defines an anonymous function to check NAs per column.
  • which: returns positions of missing values.
  • sum: counts total missing values in each column.
R
df <- data.frame(player = c('A', 'B', 'C', 'D'),
                 runs = c(100, 200, 408, NA),
                 wickets = c(17, 20, NA, 5))

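print("Position of missing values, column-wise")
sapply(df, function(x) which(is.na(x)))

print("Count of missing values, column-wise")
sapply(df, function(x) sum(is.na(x)))

Output:

[1] "Position of missing values, column-wise"
$player
integer(0)

$runs
[1] 4

$wickets
[1] 3

[1] "Count of missing values, column-wise"
 player    runs wickets
      0       1       1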

The output shows the position and count of missing values in each column. The runs column has a missing value at position 4 and wickets has one at position 3, while player has no missing values. This helps quickly locate and quantify missing data column-wise.
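
Since sapply() can change its return type depending on the results, a stricter variant may be preferable in scripts. This sketch swaps in vapply(), which fails unless each column yields exactly one integer, and then keeps only the columns that have missing values. The names na_counts and cols_with_na are illustrative.

```r
df <- data.frame(player = c("A", "B", "C", "D"),
                 runs = c(100, 200, 408, NA),
                 wickets = c(17, 20, NA, 5))

# vapply() checks that every column produces a single integer
na_counts <- vapply(df, function(x) sum(is.na(x)), integer(1))
na_counts

# Names of the columns that contain at least one NA
cols_with_na <- names(na_counts)[na_counts > 0]
cols_with_na
```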
