Data extraction and transformation convert raw, unstructured data into meaningful insights. In Power BI these tasks are performed using Power Query which allows analysts to collect data from multiple sources and clean, reshape and prepare it for accurate analysis and reporting.
- Supports data extraction from databases, files, APIs and cloud services
- Enables data cleaning, filtering, merging and reshaping
- Ensures accurate, consistent and analysis‑ready datasets
Preparing and Cleaning the Dataset
Data transformation process starts by loading the dataset into Power BI. The goal of this stage is to clean and structure the data so it is consistent and ready for analysis.
- Removing Unnecessary Rows: Extra rows such as top headers, blank rows and duplicate records are removed to keep only relevant and meaningful data.
- Changing Data Types: Columns are assigned appropriate data types such as converting the date column to date formatting, revenue as currency and setting sales values as numeric to enable accurate calculations.
- Transforming Data: The column is cleaned by removing non-numeric characters and extra spaces, then converted into a numeric data type so it can be reliably used for calculations and aggregations such as sum and average.
Extracting Numeric Values from Text Data
In many datasets, numeric values are often mixed with descriptive text, making them unsuitable for direct analysis. Extracting the numeric portion helps convert such fields into a structured format that supports calculations and comparisons.
You can download dataset from here
Step 1: Uploading the Store Dataset
- Open Power Query Editor by selecting Transform Data.
- Click on New Source and choose the CSV file containing the store dataset.

- Load the dataset and ensure the column headers are correctly identified.

Step 2: Handling and Extracting Text Data
The store size column contains values like “19 meter square” or “28 meter square”. To make this data usable:

- Use the Extract feature in Power Query.

- Apply Text Before Delimiter using a space as the delimiter.
- This extracts only the numeric part of the value.
Step 3: Converting Extracted Values
- Convert the extracted values to a Whole Number data type.
- The column is now ready for numeric analysis and comparisons.

Handling Combined Text Fields Using Delimiters
Another common challenge occurs when multiple pieces of information are stored in a single column. For example, state abbreviation, state name and city name may appear together in one field, separated by a delimiter.
Understanding Delimiters
Delimiters are characters used to separate different parts of data within a column.
- Common delimiters include hyphens, commas and spaces.
- In above dataset the state and city information is separated using a hyphen (-).

Using the Extract Function
Power Query provides an Extract option that allows you to pull text before or after a specified delimiter. This approach works best when the data follows a consistent pattern across all rows. By extracting text around a hyphen (-) you can isolate specific parts of the value.
Here we use the hyphen as the delimiter to extract the state abbreviation and city information from the column.

However when the data pattern varies across rows this method may produce inaccurate results and can lead to the loss of important information.
Splitting Data into Multiple Columns
If the extract function does not return all required information or behaves inconsistently, a better approach is to use the Split Column feature. The Split by Delimiter option allows you to divide a single column into multiple columns.
- Select the column containing combined data.
- Choose Split Column by Delimiter and specify the hyphen as the delimiter.

- Power Query splits the data into separate columns such as state abbreviation, state name and city name.

Renaming the Columns
After splitting the column rename the newly created columns to make them clear and meaningful.
