2.2 Resolving Errors, Nulls, and Data Quality Issues
Key Takeaways
- Errors in Power Query are caused by type mismatches, formula failures, or invalid data and must be resolved before loading.
- The Remove Errors and Replace Errors steps handle error values at the column level.
- Null values can be filled using Fill Down, Fill Up, Replace Values, or conditional logic.
- Data type mismatches are the most common source of import errors — always set explicit types early in the query.
- The Keep Errors step isolates error rows for investigation before removal.
Resolving Errors, Nulls, and Data Quality Issues
Quick Answer: Common data quality issues include type mismatch errors, null values, duplicates, and inconsistent formatting. Use Remove Errors to discard error rows, Replace Errors to substitute default values, Fill Down/Up for nulls in hierarchical data, and Replace Values for standardizing inconsistent entries.
Understanding Errors in Power Query
Errors in Power Query appear as red [Error] cells and prevent data from loading correctly. Common causes:
| Error Type | Cause | Example |
|---|---|---|
| Type mismatch | Text in a numeric column | "N/A" in a Price column typed as Decimal |
| Formula error | Division by zero, invalid function args | Dividing revenue by zero units |
| Missing reference | Column rename broke a downstream step | Step references "Sales" but column renamed to "Revenue" |
| Connection error | Source unavailable or authentication failed | Database server offline |
| Overflow | Value exceeds data type range | Number too large for Int32 |
Resolving Errors
Option 1: Remove Errors — Deletes entire rows containing errors
Right-click column header → Remove Errors
M code: Table.RemoveRowsWithErrors(previousStep, {"ColumnName"})
Option 2: Replace Errors — Substitutes a default value for errors
Right-click column header → Replace Errors → Enter replacement value
M code: Table.ReplaceErrorValues(previousStep, {{"ColumnName", 0}})
Option 3: Keep Errors — Isolates error rows for investigation
Right-click column header → Keep Errors
This is useful for debugging — create a reference query that keeps only errors, investigate the patterns, then handle them in the main query.
Option 4: Try-Otherwise in M — Custom error handling in formulas
try [Price] * [Quantity] otherwise 0
Handling Null Values
Null values represent missing or unknown data. The appropriate handling depends on context.
Fill Down / Fill Up
Fill Down copies the last non-null value downward — ideal for hierarchical or grouped data where headers appear only once:
Right-click column → Fill → Down
Example before Fill Down:
| Region | Sales |
|---|---|
| East | 100 |
| null | 200 |
| null | 150 |
| West | 300 |
After Fill Down:
| Region | Sales |
|---|---|
| East | 100 |
| East | 200 |
| East | 150 |
| West | 300 |
Fill Up works in the reverse direction.
Replace Values
Replace specific values (including nulls) with a standard value:
Right-click column → Replace Values → Value to Find: null → Replace With: "Unknown"
M code: Table.ReplaceValue(previousStep, null, "Unknown", Replacer.ReplaceValue, {"ColumnName"})
Conditional Column
Create logic-based replacements:
Add Column → Conditional Column →
If [Status] is null Then "Pending" Else [Status]
Remove Null Rows
When entire rows are null (common in Excel imports):
Home → Remove Rows → Remove Blank Rows
Resolving Data Quality Issues
Inconsistent Values
Text data often has inconsistencies that must be standardized:
| Issue | Solution | M Function |
|---|---|---|
| Mixed case | Capitalize Each Word, lowercase, UPPERCASE | Text.Proper(), Text.Lower(), Text.Upper() |
| Leading/trailing spaces | Trim | Text.Trim() |
| Extra spaces | Clean | Text.Clean() |
| Inconsistent naming | Replace Values | Table.ReplaceValue() |
Example: Standardizing city names:
- "new york", "New York", "NEW YORK", " New York " → all should become "New York"
Steps:
- Trim to remove leading/trailing spaces
- Capitalize Each Word to standardize case
- Replace Values for known variations
Duplicate Rows
Remove exact duplicates:
Home → Remove Rows → Remove Duplicates
Remove duplicates based on specific columns:
- Select the columns that define uniqueness
- Right-click → Remove Duplicates
Warning: Remove Duplicates keeps the first occurrence. Sort the data first if you need to keep the most recent or specific record.
Data Type Corrections
Setting correct data types is critical and should be done early:
Select column → Transform tab → Data Type → Select correct type
Common type corrections:
| Source Data | Default Type | Correct Type |
|---|---|---|
| "2024-01-15" | Text | Date |
| "42.50" | Text | Decimal Number |
| "TRUE" | Text | True/False |
| "00123" | Whole Number (loses leading zeros) | Text |
| "1,234.56" | Text (with locale issues) | Decimal Number |
Exam Tip: Always set data types as an early step in the query. If you change types after other transformations, Power Query may insert an automatic "Changed Type" step that can conflict with your manual type assignments.
Resolving Import Errors
Import errors occur when data cannot be loaded into the model:
| Error | Cause | Solution |
|---|---|---|
| Data type conflict | Column has mixed types | Set explicit type early or split column |
| Circular dependency | Query references itself | Restructure the query chain |
| Timeout | Source query takes too long | Optimize source query or increase timeout |
| Memory exceeded | Dataset too large for capacity | Filter rows, aggregate, or use DirectQuery |
| Authentication failed | Expired or incorrect credentials | Update in Data Source Settings |
On the Exam
The PL-300 frequently tests:
- Choosing between Remove Errors and Replace Errors for different scenarios
- Using Fill Down to handle null values in hierarchical data
- Identifying common causes of data import errors
- Setting appropriate data types to prevent downstream errors
- Understanding the order of operations in data cleaning steps
A Price column in Power Query contains some rows with the text "N/A" which cause errors when the column type is set to Decimal Number. You want to replace these errors with 0. What should you do?
An Excel import has a Region column where the region name only appears in the first row of each group, with null values for subsequent rows in that group. How should you fill in the missing region values?
You want to investigate which rows in a dataset have errors before deciding how to handle them. Which Power Query operation should you use?