Free exam prep

CPA Certified Public Accountant

Practice Questions Study Guide Cheat Sheet Flashcards

16.3 Data Governance, SQL, and Analytics

Key Takeaways

Data governance assigns ownership, quality expectations, security requirements, retention rules, and permitted uses across the entire data life cycle.
Candidates should distinguish data warehouses, data lakes, data marts, star schemas, snowflake schemas, data dictionaries, and normalized relational tables.
Relational integrity depends on primary keys, foreign keys, referential integrity, validation rules, and metadata describing tables and fields.
SQL questions test whether a query retrieves a relevant and complete data set for the objective, not whether a candidate can build an application.
Analytics are reliable only when source data is sound, transformations are understood, and results reconcile back to control totals and the business question.

Last updated: June 2026

Data Governance and the Data Life Cycle

Data governance is the set of roles, policies, standards, and monitoring activities that keep data reliable and usable. It answers practical questions: Who owns the customer master file? Which fields are required? How long is payroll data retained? Who may extract data? What quality checks run before a report is trusted?

ISC Area I includes the data life cycle. A common five-stage model is definition, capture (creation), use, archival/storage, and destruction (disposition). Controls follow the data:

Definition: standards, data dictionary, ownership, and classification are set.
Capture: validation and authorization at entry.
Use: integrity, access restriction, and change history.
Storage/archival: security, encryption, backup, retention, and retrieval.
Destruction: approved deletion or anonymization consistent with policy and law.

Data quality is usually described along dimensions such as completeness, accuracy, consistency, timeliness, validity, and uniqueness. A fact pattern naming one dimension is steering you to the control that protects it.

Governance also assigns clear roles. A data owner (often a business executive) is accountable for the data's classification, quality, and permitted uses; a data steward enforces standards day to day; a data custodian (typically IT) safeguards storage and access. The exam tests whether the right role is performing a control. For example, IT custodians should not unilaterally approve who may see payroll data; the business data owner sets that policy. Retention and disposition tie back to legal and regulatory drivers such as records-retention statutes and privacy laws, so deletion must be approved and documented, never ad hoc.

Storage and Schema Terms

Term	Meaning	CPA use case
Data warehouse	Curated, structured data optimized for reporting	Financial dashboards and trend analysis
Data lake	Broad store of structured and unstructured raw data	Exploratory analytics before modeling
Data mart	Subject-area subset of a warehouse	Sales, payroll, or procurement reporting
Star schema	One fact table linked to denormalized dimension tables	Fast reporting by product, customer, date
Snowflake schema	More normalized dimensions (more tables, more joins)	Reduced redundancy at cost of query complexity
Data dictionary	Metadata about fields, formats, and owners	Understanding definitions before extraction
ETL / ELT	Extract, transform, load processes	Moving and reshaping data into the warehouse

A relational database uses primary keys to uniquely identify records, foreign keys to link related tables, and referential integrity rules to prevent orphan records. Normalization stores each fact once to reduce redundancy and update anomalies; the tradeoff is that reporting then requires joins or a warehouse layer.

SQL Review for Completeness and Relevance

Candidates need not memorize every function. They must read standard SQL structure: SELECT lists fields, FROM names tables, JOIN connects tables, WHERE filters rows, GROUP BY summarizes, HAVING filters groups, and aggregates SUM, COUNT, AVG, MIN, MAX compute totals. Join type drives population:

INNER JOIN keeps only rows matching in both tables, so unmatched records vanish.
LEFT (OUTER) JOIN keeps all left-table rows even when no match exists on the right.

Worked example. A query to find all vendor payments uses payments p INNER JOIN vendors v ON p.vendor_id = v.id. Any payment with a null or invalid vendor_id is silently dropped, understating disbursements and masking exactly the unauthorized payments an auditor wants to see. A LEFT JOIN preserves them.

A query can be syntactically valid yet wrong for the objective. A revenue query selecting only posted invoices omits unbilled shipments. A date filter on invoice_date rather than shipment_date misses cutoff issues. Always confirm the population matches the objective before trusting the output.

Integrating Data for Analytics

Analytics often combine ERP tables, bank files, payroll systems, ticketing data, and external sources. Integration creates risk because systems use different keys, date formats, currencies, time zones, or definitions. Before relying on analytics, reconcile record counts, hash totals, control totals, and key fields back to source reports, and confirm transformations did not duplicate or drop rows.

High-value analytics for accountants:

Duplicate vendor payments matched by invoice number, amount, and vendor.
Round-dollar journal entries posted late at night or by unusual users.
Sales orders shipped before credit approval.
Payroll payments to terminated employees (joining payroll to the HR termination file).
Inventory items with negative quantities or stale movement dates.
Gaps or duplicates in sequential document numbers (completeness of transactions).

Exam Focus

When a question shows a query or report extract, ask three things: Does the data population match the objective? Are filters excluding needed records? Are joins or grouping creating duplicates or omissions? The best answer usually protects completeness first, then relevance, accuracy, and proper interpretation.

Finally, judge the analytic technique against the question. Descriptive analytics summarize what happened; diagnostic analytics explain why; predictive analytics estimate what will happen; and prescriptive analytics recommend action. A 100% transaction test (rather than a sample) is feasible when full populations are available, but it raises the stakes on data reliability: testing every row of a flawed extract simply produces precise-looking nonsense. Document the data source, extraction logic, and reconciliation so the work is reproducible and defensible.

Test Your Knowledge

An analyst wants to identify all shipments made before billing. The query selects from the invoice table and filters invoice_date before year-end, but it does not join to the shipment table. What is the main problem?

The query may be incomplete for the objective because unbilled shipments would not appear in the invoice table.

The query is automatically invalid because SQL cannot use dates.

The query will always overstate shipments because all invoices are duplicates.

The analyst should remove all filters to make the report more precise.

Test Your Knowledge

Which database design feature most directly prevents a sales-order line from referencing a product ID that does not exist in the product master table?

A pie chart in the sales dashboard

A wider monitor for the order-entry clerk

A foreign key with referential integrity

A monthly management meeting about sales trends

Test Your Knowledge

An auditor runs payments INNER JOIN vendors to total disbursements, but some payments have a null vendor_id. What is the most likely effect?

Payments with a null vendor_id are dropped, understating total disbursements.

The query duplicates every payment, overstating disbursements.

The join converts the data into a star schema automatically.

Referential integrity is permanently disabled by the join.

Up Next

16.4 Availability, Backup, and Recovery Controls

Continue learning

Uniform CPA Examination

1Chapter 1: CPA Exam Structure and Licensure Rules

2Chapter 2: CPA Scoring, Interface, and Exam Strategy

3Chapter 3: AUD Ethics, Independence, and Engagement Acceptance

4Chapter 4: AUD Risk Assessment, Controls, and Evidence

5Chapter 21: AUD Sampling, Analytics, and Accounting Estimates

6Chapter 22: AUD Integrated Audits, IT Controls, and SOC Reliance

7Chapter 5: AUD Reporting, Attestation, and Review Engagements

8Chapter 36: AUD Workbook - From Risk to Report

9Chapter 6: FAR Financial Statements and Reporting Framework

10Chapter 7: FAR Balance Sheet Accounts and Rollforwards

11Chapter 8: FAR Selected Transactions and Measurement Models

12Chapter 32: FAR Revenue, Leases, and Income Taxes Deep Dive

13Chapter 31: FAR Financial Instruments, Fair Value, and Derivatives

14Chapter 24: FAR Consolidations, Equity Method, and Intercompany Workpapers

15Chapter 9: FAR Governmental Accounting and Fund Reporting

16Chapter 23: FAR Not-for-Profit and Special Reporting Topics

17Chapter 37: FAR Workbook - Journal Entries, Rollforwards, and Reconciliations

18Chapter 10: REG Tax Procedure, Ethics, and Business Law

19Chapter 11: REG Individual Federal Tax Workflows

20Chapter 12: REG Property Transactions and Entity Taxation

21Chapter 25: REG Entity Choice, Basis, and Owner Tax Consequences

22Chapter 26: REG Property Tax, Character, and Deferral Transactions

23Chapter 33: REG Trusts, Estates, Gifts, and Exempt Organizations

24Chapter 38: REG Workbook - Tax Return Schedules and Basis Workpapers

25Chapter 13: BAR Business Analysis, Economics, and Forecasting

26Chapter 14: BAR Technical Accounting and Advanced Reporting

27Chapter 15: BAR State and Local Government Analysis

28Chapter 16: ISC Information Systems and Data Management

29Chapter 17: ISC Security, Privacy, and SOC Engagements

30Chapter 18: TCP Individual Tax Planning and Personal Financial Issues

31Chapter 19: TCP Entity Compliance and Planning

32Chapter 34: TCP International and Advanced Entity Tax Topics

33Chapter 35: Choosing and Preparing for BAR, ISC, or TCP

34Chapter 39: CPA TBS Research, Authoritative Literature, and Document Review

35Chapter 20: CPA Simulations, Retakes, and Final Review

36Chapter 40: Discipline Workbook Drills for BAR, ISC, and TCP

CPA Certified Public Accountant

16.3 Data Governance, SQL, and Analytics

Key Takeaways

Data Governance and the Data Life Cycle

Storage and Schema Terms

SQL Review for Completeness and Relevance

Integrating Data for Analytics

Exam Focus