3.3 Merge Queries, Append Queries, and Relationship Keys
Key Takeaways
- Merge Queries joins two tables horizontally based on matching columns (like SQL JOIN).
- Append Queries stacks two tables vertically with matching column structures (like SQL UNION).
- Six join types are available: Left Outer, Right Outer, Full Outer, Inner, Left Anti, Right Anti.
- Reference queries create a pointer to an existing query without duplicating data; Duplicate queries create an independent copy.
- Relationship keys must be unique in at least one table (the 'one' side) and share the same data type.
Merge Queries, Append Queries, and Relationship Keys
Quick Answer: Merge = horizontal join (adds columns from another table based on matching keys). Append = vertical stack (adds rows from another table with the same structure). Use Left Outer Join when you want all rows from the first table plus matching data from the second. Always verify that key columns have the same data type before merging.
Merge Queries
Merge Queries combines two tables horizontally by matching rows based on one or more key columns. This is equivalent to a SQL JOIN.
Home tab → Merge Queries (or Merge Queries as New)
Join Types
| Join Type | Description | Returns |
|---|---|---|
| Left Outer | All rows from left table + matching rows from right | Left table complete, nulls for non-matches from right |
| Right Outer | Matching rows from left + all rows from right table | Right table complete, nulls for non-matches from left |
| Full Outer | All rows from both tables | Both tables complete, nulls where no match |
| Inner | Only matching rows from both tables | Only rows with matches in both |
| Left Anti | Rows from left table with NO match in right | Left-only rows (useful for finding missing data) |
| Right Anti | Rows from right table with NO match in left | Right-only rows |
Merge Process
- Select the first (left) table
- Select the matching column(s) in the first table
- Select the second (right) table
- Select the matching column(s) in the second table
- Choose the join type
- Click OK → a new column with nested tables appears
- Expand the nested column to select which fields to include
Merge Queries vs. Merge Queries as New
| Option | Behavior |
|---|---|
| Merge Queries | Adds the merged data to the current query |
| Merge Queries as New | Creates a new query with the merged result |
Best Practice: Use "Merge Queries as New" to keep original queries unchanged and create a clean combined query.
Merge Gotchas
- Column data types must match between the two tables
- Case sensitivity matters for text matches (use Lowercase/Uppercase transform first)
- Trailing spaces can cause false non-matches (use Trim first)
- Multi-column matches are supported (hold Ctrl to select multiple columns)
Append Queries
Append Queries stacks tables vertically — combining rows from two or more tables with similar structures.
Home tab → Append Queries (or Append Queries as New)
Append Options
| Option | Use Case |
|---|---|
| Two Tables | Combine exactly two queries |
| Three or More Tables | Combine multiple queries at once |
How Append Works
- Columns are matched by name (not position)
- If a column exists in one table but not the other, nulls are added
- Column data types from the first table take precedence
- Row order: first table's rows, then second table's rows
Example: Appending monthly sales tables:
Table: JanSales + Table: FebSales → Combined: AllSales
Append vs. Merge
| Feature | Merge | Append |
|---|---|---|
| Direction | Horizontal (adds columns) | Vertical (adds rows) |
| SQL Equivalent | JOIN | UNION ALL |
| Key Required | Yes (matching columns) | No (matched by name) |
| Use Case | Combining related data | Combining same-structure data |
Reference vs. Duplicate Queries
Reference Query
Creates a pointer to an existing query — changes to the source query propagate to the reference:
Right-click query → Reference
- Uses the output of the source query as its starting point
- Maintains a dependency (changes flow through)
- Does not duplicate the data transformation work
- Ideal for creating different views of the same base data
Duplicate Query
Creates an independent copy of the query — changes to the source do NOT propagate:
Right-click query → Duplicate
- Copies all steps from the source query
- No dependency — completely independent
- Changes to one query do not affect the other
- Ideal when you need a starting point but will diverge significantly
When to Use Each
| Scenario | Use |
|---|---|
| Creating a filtered subset of a base query | Reference |
| Creating a staging query that feeds multiple outputs | Reference |
| Isolating error rows for investigation | Reference |
| Creating an independent query starting from similar logic | Duplicate |
Creating Appropriate Relationship Keys
For the data model, relationships require properly formatted keys:
Key Requirements
- Unique values on at least one side (the "one" side of a one-to-many)
- Same data type in both columns
- No null values in the key column (nulls don't match)
- Consistent formatting (same case, no extra spaces)
Creating Composite Keys
When no single column uniquely identifies rows, create a composite key:
// In Power Query, combine columns
Table.AddColumn(previousStep, "CompositeKey",
each [StoreID] & "-" & [ProductID])
Creating Surrogate Keys
When natural keys are unreliable, add an index column:
Add Column → Index Column → From 1
Configuring Data Loading
Control which queries are loaded into the data model:
| Action | How |
|---|---|
| Load to model | Right-click query → Enable Load (checked) |
| Staging only (not loaded) | Right-click query → Enable Load (unchecked) |
| Load to both model and Excel | Power Query → Close & Load To → select destinations |
Best Practice: Disable loading for intermediate/staging queries that exist only to feed other queries. This reduces model size and refresh time.
On the Exam
The PL-300 frequently tests:
- Choosing the correct join type for a Merge scenario
- Understanding the difference between Merge and Append
- Knowing when to use Reference vs. Duplicate queries
- Creating composite keys for relationships
- Configuring which queries load into the data model
You have a Sales table and a Products table. You want to add the product name to each sales row. Some sales reference products that no longer exist in the Products table. You want to keep ALL sales rows. Which join type should you use?
You have 12 monthly CSV files (Jan.csv through Dec.csv) with identical column structures. You need to combine them into one table. Which Power Query operation should you use?
What is the key difference between a Reference query and a Duplicate query in Power Query?