CSV (Comma-Separated Values) files are widely used to store tabular data. Each line in a CSV file corresponds to a data record, and each record consists of one or more fields separated by commas. In this article, you’ll learn how to extract specific columns from a CSV file and convert them into Python lists.
We’ll explore two common methods:
- Using the Pandas library
- Using Python’s built-in csv module
The examples below read a sample file named company_sales_data.csv, which is assumed to be in the current working directory.
Method 1: Using Pandas
Pandas is a powerful library for data manipulation. The read_csv() function reads the CSV file into a DataFrame. Selecting a single column from the DataFrame gives a Series, and its tolist() method converts the values into a Python list.
Approach:
- Import pandas.
- Use read_csv() to load the file.
- Access specific columns.
- Convert each column to a list using .tolist().
- Print the lists.
Example:
import pandas as pd
# Reading the CSV file
data = pd.read_csv("company_sales_data.csv")
# Converting specific columns into lists
month = data['month_number'].tolist()
fc = data['facecream'].tolist()
fw = data['facewash'].tolist()
tp = data['toothpaste'].tolist()
sh = data['shampoo'].tolist()
# Printing the lists
print("Month:", month)
print("Facecream:", fc)
print("Facewash:", fw)
print("Toothpaste:", tp)
print("Shampoo:", sh)
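Output: each print statement displays its label followed by that column's values as a Python list, with one entry per data row in the file.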
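When only a few columns are needed, read_csv() can be told to skip the rest. The sketch below is a minimal illustration, not part of the original example: it assumes a small in-memory sample in place of company_sales_data.csv (the rows are made up for demonstration), uses the usecols parameter to load two columns, and calls to_dict(orient="list") to get every selected column as a list in one step.

```python
import io

import pandas as pd

# Hypothetical sample standing in for company_sales_data.csv
sample = io.StringIO(
    "month_number,facecream,facewash,toothpaste\n"
    "1,2500,1500,5200\n"
    "2,2630,1200,5100\n"
    "3,2140,1340,4550\n"
)

# usecols limits parsing to the listed columns
data = pd.read_csv(sample, usecols=["month_number", "facecream"])

# to_dict(orient="list") maps each column name to a plain Python list
columns = data.to_dict(orient="list")

print("Month:", columns["month_number"])
print("Facecream:", columns["facecream"])
```

This approach may be convenient for wide files, since columns that are never used are not loaded into the DataFrame.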
Method 2: Using csv.DictReader
If you prefer using built-in libraries or want more control over parsing, the csv module is a good alternative. DictReader treats the first line of the file as the header and yields each following row as a dictionary whose keys are the column names. Note that every value is returned as a string.
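To make the row structure concrete, here is a small sketch. It assumes an in-memory sample with made-up rows rather than the real company_sales_data.csv, and simply inspects the header and the first row that DictReader produces.

```python
import csv
import io

# Hypothetical sample standing in for company_sales_data.csv
sample = io.StringIO(
    "month_number,moisturizer,total_units\n"
    "1,1500,21100\n"
    "2,1200,23400\n"
)

reader = csv.DictReader(sample)

# fieldnames holds the column names taken from the header line
print(reader.fieldnames)

# Each row is a dictionary keyed by column name
first_row = next(reader)
print(first_row)

# Values are strings, even when they look like numbers
print(type(first_row["moisturizer"]).__name__)
```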
Approach:
- Import the csv module.
- Open the file in read mode.
- Use csv.DictReader() to parse rows.
- Create empty lists for desired columns.
- Append values to lists inside a loop.
- Print the lists.
Example:
import csv
# Open the CSV file (the csv module docs recommend newline='' for file objects)
with open('company_sales_data.csv', mode='r', newline='') as file:
    reader = csv.DictReader(file)
    # Empty lists to store column values
    month = []
    moisturizer = []
    total_units = []
    # Iterating over each row; every value is read as a string
    for row in reader:
        month.append(row['month_number'])
        moisturizer.append(row['moisturizer'])
        total_units.append(row['total_units'])
# Printing the results
print("Month:", month)
print("Moisturizer:", moisturizer)
print("Total Units:", total_units)
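Because the csv module returns strings, numeric columns usually need an explicit conversion before they can be used in calculations. The following is a minimal sketch, assuming the selected columns contain only whole numbers and using a made-up in-memory sample in place of the real file. It builds the lists with list comprehensions instead of append() calls.

```python
import csv
import io

# Hypothetical sample standing in for company_sales_data.csv
sample = io.StringIO(
    "month_number,moisturizer,total_units\n"
    "1,1500,21100\n"
    "2,1200,23400\n"
    "3,1340,26670\n"
)

# Read all rows once so several columns can be extracted from them
rows = list(csv.DictReader(sample))

# int() converts each string value; it raises ValueError on blank or
# non-numeric cells, so real data may need extra checks
month = [int(row["month_number"]) for row in rows]
total_units = [int(row["total_units"]) for row in rows]

print("Month:", month)
print("Total Units:", total_units)
print("Sum of units:", sum(total_units))
```

For columns with decimal values, float() could be used in place of int().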