6.2 Duplicate Management, Validation, and Data Quality Controls
Key Takeaways
- Matching rules define how potential duplicates are identified, while duplicate rules define what happens when a match is found.
- Validation rules enforce business requirements at save time and affect imports, integrations, quick actions, APIs, and user edits.
- Data quality controls should combine prevention, detection, stewardship queues, reporting, and exception handling.
- The best rule is not always the strictest rule; admins must balance clean data with legitimate business exceptions.
Prevention versus correction
Data quality is a lifecycle discipline. Salesforce gives administrators several control points, but each one solves a different problem. Duplicate management helps identify records that appear to represent the same person or company. Validation rules stop a record from saving when it violates a defined business condition. Picklists reduce free-text variation. Required fields force capture of critical values. Reports, dashboards, list views, and exception queues help data stewards find and fix problems that prevention controls cannot handle safely.
Matching rules and duplicate rules are related but not interchangeable. A matching rule defines the comparison logic, such as exact match on email or fuzzy match on account name and billing address. A duplicate rule uses one or more matching rules and decides the behavior, such as alerting the user, blocking the save, or allowing the save while reporting the duplicate. Standard rules exist for common objects, and custom rules can be tailored when the business has a defensible matching strategy.
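The split between comparison logic and save behavior can be modeled outside Salesforce. The Python below is a minimal sketch, not Salesforce configuration or an official API. The names `exact_email_match`, `fuzzy_name_match`, and `DuplicateRule`, the dictionary field names, and the 0.85 similarity threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]


def exact_email_match(a: Record, b: Record) -> bool:
    """Matching logic: exact, case-insensitive comparison on email."""
    email_a = a.get("email", "").strip().lower()
    email_b = b.get("email", "").strip().lower()
    return bool(email_a) and email_a == email_b


def fuzzy_name_match(a: Record, b: Record, threshold: float = 0.85) -> bool:
    """Matching logic: approximate comparison on name (illustrative threshold)."""
    name_a = a.get("name", "").strip().lower()
    name_b = b.get("name", "").strip().lower()
    if not name_a or not name_b:
        return False  # blank names are never treated as a match
    return SequenceMatcher(None, name_a, name_b).ratio() >= threshold


@dataclass
class DuplicateRule:
    """Behavior: which matchers to run and what to do when one of them hits."""
    matchers: List[Callable[[Record, Record], bool]]
    action: str  # "alert", "block", or "report"

    def evaluate(self, new: Record, existing: List[Record]) -> Tuple[bool, List[Record]]:
        """Return (save_allowed, matched_records)."""
        matches = [r for r in existing if any(m(new, r) for m in self.matchers)]
        save_allowed = not (matches and self.action == "block")
        return save_allowed, matches
```

Swapping `action` from `"block"` to `"alert"` changes what happens to the save without touching the matchers, which is the same separation the two rule types provide.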
| Control | Best use | Common trap |
|---|---|---|
| Matching rule | Identify likely duplicate records | Treating fuzzy matches as perfect proof. |
| Duplicate rule | Alert, block, or report when matches occur | Blocking legitimate branch offices or household members. |
| Validation rule | Enforce record-level business logic | Forgetting imports and integrations must also pass the rule. |
| Required field | Make a field mandatory in the data model or page flow | Using page layout requiredness when API enforcement is needed. |
| Picklist | Standardize values for reporting and automation | Leaving old free-text values unmapped during migration. |
| Exception report | Find records needing cleanup | Treating reports as prevention when bad data must be blocked. |
Scenario judgment matters. A nonprofit may want to warn on contacts with the same email because spouses sometimes share an email address. A B2B company may want to block duplicate accounts when the same tax ID already exists. A call center may need to create a lead quickly during a phone call, then route possible duplicates to a steward for review. The strongest answer is the one that fits the cost of duplicates, the likelihood of false positives, and the user workflow.
Setup paths are practical anchors. Duplicate rules and matching rules live in Setup > Duplicate Management. Validation rules are managed from Object Manager > Object > Validation Rules. Field requiredness can be configured at field definition, page layout, Dynamic Forms, flow screens, and validation rules, but these choices do not behave identically. Requiredness set on a page layout, Dynamic Forms component, or flow screen applies only to users working through that interface, while a field-level required setting or a validation rule also applies to API and import saves. If the value must be required regardless of entry point, a field-level required setting or validation rule is stronger than a page layout-only setting.
Designing rules that users can live with
Validation rules should be written as business policy, not as punishment. A good rule has a clear condition, a user-readable error message, and a predictable exception path. For example, an opportunity moving to Closed Won might require Contract Signed Date, Primary Contact, and Close Reason. The admin should confirm that renewals, amendments, partner deals, and integration-created opportunities have a valid way to satisfy the rule. Otherwise users will find workarounds that reduce data quality more than the rule improves it.
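The Closed Won example can be expressed as plain logic to make the three parts of a good rule visible: the condition, the message, and the fields involved. This is a hedged Python sketch rather than Salesforce formula syntax. The function name `closed_won_errors` and the snake_case field keys are assumptions for illustration.

```python
from typing import Dict, List

# Fields the example policy requires once an opportunity reaches Closed Won.
REQUIRED_AT_CLOSED_WON = ("contract_signed_date", "primary_contact", "close_reason")


def closed_won_errors(opportunity: Dict[str, object]) -> List[str]:
    """Return one user-readable message per missing field, or an empty list."""
    if opportunity.get("stage") != "Closed Won":
        return []  # the policy only applies at this stage
    missing = [f for f in REQUIRED_AT_CLOSED_WON if not opportunity.get(f)]
    return [
        f"Complete {field.replace('_', ' ').title()} before moving to Closed Won."
        for field in missing
    ]
```

Each message names the field to fix, so the user is told what to do rather than only that the save failed.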
A strong validation design checklist:
- Define the business event that should trigger the rule, such as status change, stage change, record type, or ownership change.
- Confirm the rule applies to the right users, profiles, permission sets, record types, and automation contexts.
- Test creates and edits, not only the happy path.
- Include API, Data Loader, Flow, and integration scenarios in testing.
- Write an error message that tells the user what to fix and, where possible, which field needs attention.
- Document any bypass, such as a custom permission for a controlled integration user.
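Several checklist items (trigger event, create versus edit, and a documented bypass) can be exercised together in a small model. The sketch below is Python, not Salesforce formula syntax, and the permission name `Bypass_Closed_Won_Validation`, the function `rule_fires`, and the field keys are hypothetical.

```python
from typing import Dict, Optional, Set

# Hypothetical custom permission granted only to a controlled integration user.
BYPASS_PERMISSION = "Bypass_Closed_Won_Validation"


def rule_fires(
    record: Dict[str, str],
    prior: Optional[Dict[str, str]],
    user_permissions: Set[str],
) -> bool:
    """Return True when the save should be rejected.

    `prior` is None on create and holds the previous values on edit.
    """
    if BYPASS_PERMISSION in user_permissions:
        return False  # documented, controlled bypass
    # Trigger event: the stage was just set (create) or just changed (edit).
    stage_changed = prior is None or prior.get("stage") != record.get("stage")
    return (
        stage_changed
        and record.get("stage") == "Closed Won"
        and not record.get("close_reason")
    )
```

A test matrix for a rule like this would cover at least four cases: create directly at the target stage, edit into the stage, edit an unrelated field on a record already in the stage, and a save by the user holding the bypass.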
Duplicate rules need similar care. Blocking duplicates may be correct for a unique external ID or a regulated identifier, but warnings are often better when data is messy or partial. If the sales team imports conference leads, blocking every similar name may stop legitimate prospects. A duplicate rule that allows the record but reports it can create a review queue for a data steward. The steward can merge records where appropriate, keep legitimate separate records, and tune the matching rule when false positives become obvious.
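The allow-and-report pattern for a lead import can be sketched as follows. This is an illustrative Python model, not how Salesforce stores duplicate record sets. The function `import_leads`, the `is_match` callback, and the queue entry shape are assumptions.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]


def import_leads(
    incoming: List[Record],
    existing: List[Record],
    is_match: Callable[[Record, Record], bool],
) -> Tuple[List[Record], List[dict]]:
    """Save every incoming lead, and queue possible duplicates for a steward."""
    saved: List[Record] = []
    review_queue: List[dict] = []
    for lead in incoming:
        # Compare against existing records and leads saved earlier in this import.
        matches = [r for r in existing + saved if is_match(lead, r)]
        saved.append(lead)  # allow the save: nothing is blocked
        if matches:
            review_queue.append({"lead": lead, "possible_duplicates": matches})
    return saved, review_queue
```

No prospect is lost at import time, and the steward works from `review_queue` to merge, keep separate, or feed false positives back into the matching logic.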
Merging is not always neutral. Merging accounts, contacts, or leads can change activity history, campaign influence, account hierarchy, ownership, sharing, related lists, and reports. The admin should train stewards to pick the master record deliberately and review field values before merge. In regulated or integrated environments, merging may also affect downstream systems, so ownership of the cleanup process should be clear.
Data quality also depends on metadata choices. Restricted picklists prevent unexpected values through the UI and API. State and country picklists improve address consistency. External IDs give imports and integrations a stable key to match on, which avoids repeated fuzzy matching. Help text and field descriptions reduce ambiguity. Formula fields can expose quality flags, such as missing industry on strategic accounts. Reports can group exceptions by owner or region so managers can correct data as part of normal operations.
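The quality flag and grouped exception report described above amount to a boolean per record plus a count per owner. The Python below is a minimal sketch of that idea, with the `tier`, `industry`, and `owner` keys and both function names assumed for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def missing_industry_flag(account: Dict[str, str]) -> bool:
    """Quality flag: a strategic account with no industry recorded."""
    return account.get("tier") == "Strategic" and not account.get("industry")


def exceptions_by_owner(accounts: Iterable[Dict[str, str]]) -> Counter:
    """Exception report: count of flagged accounts grouped by owner."""
    return Counter(a["owner"] for a in accounts if missing_industry_flag(a))
```

Grouping by owner puts the cleanup in front of the person who can fix it, without blocking any save.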
Study trap: do not solve every data problem with a validation rule. If users need coaching but should not be blocked, use guidance, Path key fields, screen flow prompts, duplicate warnings, or exception reports. If the data must never be saved incorrectly, use a hard control. If the source is an integration, make sure the integration user has exactly the bypass or error handling the business approved. A serious admin explains the tradeoff instead of simply making the rule stricter.
What is the main difference between a matching rule and a duplicate rule?
A validation rule blocks a Data Loader update that an integration team expected to pass. What should the admin evaluate?
A business has many legitimate contacts that share household email addresses. Which duplicate approach is usually safer than blocking every same-email contact?