2.4 Common Traps in "Implement and manage an analytics solution"

Key Takeaways

  • Do not confuse a domain (governance grouping) with a capacity (compute) or a workspace (collaboration) — each sets different controls.
  • Git integration binds one workspace to one branch and stores definitions, not data; it is not a backup of your tables.
  • Workspace roles do not by themselves grant raw OneLake file access once OneLake security is enabled.
  • Promoted is self-service; Certified needs an authorized reviewer — never swap them.
  • A shortcut references external data without copying it; deleting the shortcut does not delete the source data.

Last updated: June 2026

Trap 1: Mixing Up the Object Layers

The single most common mistake is answering at the wrong layer. A capacity controls compute and throttling; a domain controls governance and discovery; a workspace controls collaboration, roles, and ALM binding; an item holds data and item-level permissions. When a stem says "free viewers can't open content," that is a capacity problem (F64 threshold), not a role or domain problem. When it says "Finance should manage its own assets," that is a domain problem. Always restate which layer the symptom lives at before scanning the options.
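The layer check can be sketched as a small lookup. This is a study aid only, not a Fabric API: the `LAYERS` dictionary and the `layer_for` helper are hypothetical names, and the concern keywords are taken from the paragraph above.

```python
# Hypothetical study aid: map a concern named in a question stem to the
# Fabric layer that controls it. Not part of any Fabric SDK.
LAYERS = {
    "capacity": {"compute", "throttling"},
    "domain": {"governance", "discovery"},
    "workspace": {"collaboration", "roles", "alm binding"},
    "item": {"data", "item-level permissions"},
}


def layer_for(concern: str) -> str:
    """Return the layer that owns the given concern, or raise if unknown."""
    key = concern.strip().lower()
    for layer, concerns in LAYERS.items():
        if key in concerns:
            return layer
    raise KeyError(f"no layer mapped for concern: {concern!r}")


print(layer_for("throttling"))  # capacity
print(layer_for("governance"))  # domain
```

Restating the stem as a single concern first, and only then looking up its layer, mirrors the habit the paragraph recommends.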

Trap 2: Assuming Git or Pipelines Move Data

Neither Git integration nor deployment pipelines move table data. Git stores item definitions as text (a warehouse becomes a SQL database project; a lakehouse stores shortcuts and OneLake-security definitions, not rows). Deployment pipelines copy item definitions and, for a lakehouse, optionally schema and shortcuts — but table data stays stage-specific. Treating Git as a data backup is wrong: it version-controls structure, not the contents of your Delta tables.

Mechanism               Moves data?                   Moves definitions?
Git integration         No                            Yes (as text)
Deployment pipeline     No                            Yes (optionally schema/shortcuts)
Shortcut                No (references in place)      N/A
Pipeline Copy activity  Yes (this is the data mover)  No
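A toy model may make the "definitions, not data" point concrete. It assumes a made-up `ToyLakehouse` class and a hypothetical `definition_for_git` helper; the JSON shape is purely illustrative and is not the format Fabric actually writes to a repository.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ToyLakehouse:
    """Illustrative stand-in for a lakehouse; not a Fabric object."""
    name: str
    shortcuts: list = field(default_factory=list)
    rows: list = field(default_factory=list)  # table data, never committed


def definition_for_git(item: ToyLakehouse) -> str:
    """Serialize only the definition; the rows are deliberately left out."""
    return json.dumps(
        {"name": item.name, "shortcuts": item.shortcuts}, sort_keys=True
    )


lakehouse = ToyLakehouse("sales", shortcuts=["raw_orders"],
                         rows=[{"id": 1}, {"id": 2}])
committed = definition_for_git(lakehouse)
print(committed)  # {"name": "sales", "shortcuts": ["raw_orders"]}
```

Restoring from `committed` would recreate the structure and the shortcut, but the two rows would be gone, which is why Git is not a backup of table contents.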

Trap 3: Over-Trusting Workspace Roles for Data

A workspace Viewer can consume reports and query a lakehouse through the SQL analytics endpoint, but once OneLake security is enforced, the role alone does not grant raw file access to OneLake folders. The correct fix is an explicit OneLake data-access role, not promoting the user to Contributor. Conversely, do not over-grant: if someone only needs to read one item, share that item rather than adding them to the workspace. Least privilege is almost always the intended answer.

Related trap: assuming Admin is required for routine editing. Editing content needs only Contributor; sharing the whole workspace and managing permissions needs Member or Admin; only Admin manages the Git connection and deployment settings. Match the verb in the stem (view, edit, share, manage) to the lowest sufficient role.
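The verb-to-role matching can be written down as a short sketch. The role names and the verb mapping come from the text above; the `lowest_role` and `is_sufficient` helpers are hypothetical and exist only to illustrate the reasoning, not to model every Fabric permission.

```python
# Workspace roles ordered from least to most privileged.
ROLE_ORDER = ["Viewer", "Contributor", "Member", "Admin"]

# Lowest sufficient role per verb, as described in the text above.
LOWEST_ROLE = {
    "view": "Viewer",
    "edit": "Contributor",
    "share": "Member",
    "manage": "Admin",  # Git connection and deployment settings
}


def lowest_role(verb: str) -> str:
    """Return the least-privileged role that can perform the verb."""
    return LOWEST_ROLE[verb.strip().lower()]


def is_sufficient(role: str, verb: str) -> bool:
    """True if the given role is at or above the lowest sufficient role."""
    return ROLE_ORDER.index(role) >= ROLE_ORDER.index(lowest_role(verb))


print(lowest_role("edit"))                    # Contributor
print(is_sufficient("Contributor", "share"))  # False
print(is_sufficient("Admin", "edit"))         # True
```

An answer that passes `is_sufficient` but sits above `lowest_role` is usually the over-granting distractor.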

Trap 4: Endorsement and Shortcut Confusions

Promoted is self-service by item owners; Certified requires an authorized reviewer designated by an admin. If the stem stresses "no special authorization," it is Promoted; if it stresses "officially vetted," it is Certified. Finally, a shortcut is a pointer: it surfaces external data (other OneLake, ADLS Gen2, S3, GCS) without copying it, and deleting the shortcut removes only the reference, never the underlying source data. Expect a distractor claiming a shortcut duplicates or migrates data — it does not.
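The pointer behaviour of a shortcut can be illustrated with two plain dictionaries. Everything here is a toy stand-in: `external_store`, `shortcuts`, the helper functions, and the `s3://example-bucket/orders/` path are hypothetical and do not call any Fabric or AWS API.

```python
# Toy stand-ins: an external store and a lakehouse's shortcut table.
external_store = {
    "s3://example-bucket/orders/": ["part-0.parquet", "part-1.parquet"],
}
shortcuts = {"orders": "s3://example-bucket/orders/"}


def read_via_shortcut(name: str) -> list:
    """Follow the pointer and return the files at the source location."""
    return external_store[shortcuts[name]]


def delete_shortcut(name: str) -> None:
    """Remove only the reference; the source store is never touched."""
    del shortcuts[name]


files_before = read_via_shortcut("orders")
delete_shortcut("orders")

print("orders" in shortcuts)                                           # False
print(external_store["s3://example-bucket/orders/"] == files_before)  # True
```

Nothing was copied when the shortcut existed, and nothing was removed from the source when it was deleted.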

Trap 5: Capacity, Throttling, and Pause Misconceptions

Several capacity myths show up as distractors. First, pausing a capacity does not delete data — data lives durably in OneLake; pausing only stops compute billing, and the workspaces become unusable until you resume. Second, autoscale is not the universal fix for slowness: it adds CUs for occasional spikes, but if a capacity is throttled every day, the correct action is to scale up the SKU, not to lean on autoscale, which costs more when it fires constantly.

Third, throttling follows a deliberate order: interactive requests are delayed first, then interactive requests are rejected, and only after sustained overload are background jobs rejected. A stem describing "reports slow at peak but overnight jobs still run" is therefore an early-stage throttling signal pointing at capacity sizing.
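The escalation order can be sketched as a function of how long the capacity has been over its limit. This is a study sketch, not a Fabric API: `throttling_stage` is a hypothetical helper, and the 10-minute, 60-minute, and 24-hour thresholds, along with the intermediate interactive-rejection stage, are assumptions based on Microsoft's published throttling policy that should be checked against current documentation.

```python
def throttling_stage(overage_minutes: float) -> str:
    """Classify the throttling stage from minutes of accumulated overage.

    Thresholds are assumptions for illustration; verify before relying on them.
    """
    if overage_minutes < 0:
        raise ValueError("overage_minutes must be non-negative")
    if overage_minutes <= 10:
        return "no throttling (overage protection)"
    if overage_minutes <= 60:
        return "interactive delay"
    if overage_minutes <= 24 * 60:
        return "interactive rejection"
    return "background rejection"


print(throttling_stage(5))     # no throttling (overage protection)
print(throttling_stage(30))    # interactive delay
print(throttling_stage(2000))  # background rejection
```

A stem where reports are slow but overnight jobs still run corresponds to the "interactive delay" branch: background work is the last thing to be rejected.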

Trap 6: ALM Provider and Scope Errors

Do not accept a distractor naming GitLab, Bitbucket, or Subversion — only Azure DevOps Git and GitHub are supported. Do not accept "a workspace can sync to multiple branches at once"; the binding is one workspace to one branch. And do not accept "deployment pipelines need Git" — the two mechanisms are independent; you can run a deployment pipeline with no Git connection at all.

A final subtle trap: deployment pipelines promote forward by default (Dev -> Test -> Prod), and the comparison view shows what differs between stages. Credentials and table data never travel, however, so a "clean" promotion can still leave a downstream item pointing at the wrong source until you set a deployment rule or parameter. When two answers look right, choose the one that respects the correct provider, the one-branch rule, and the least-privileged access path.
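Why a deployment rule matters can be shown with a toy promotion function. The `promote` helper, the item dictionary, and the server names are all hypothetical; real deployment rules are configured in the pipeline UI or API, not in Python.

```python
from typing import Optional


def promote(definition: dict, rules: Optional[dict] = None) -> dict:
    """Copy a definition to the next stage, applying any per-stage overrides."""
    promoted = dict(definition)   # the definition travels as-is
    promoted.update(rules or {})  # a rule rebinds stage-specific settings
    return promoted


dev_item = {"name": "sales_model", "source": "dev-sql-server"}

without_rule = promote(dev_item)
with_rule = promote(dev_item, {"source": "test-sql-server"})

print(without_rule["source"])  # dev-sql-server
print(with_rule["source"])     # test-sql-server
```

Without a rule, the promoted item still points at the Dev source even though the promotion itself succeeded.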

Trap 7: Confusing Item Capabilities

A last cluster of distractors swaps one item's job for another. A lakehouse SQL analytics endpoint is read-only T-SQL over Delta tables, so any answer that has you running inserts or updates against the endpoint is wrong — write operations belong to a warehouse or to Spark in a notebook. A dataflow Gen2 is low-code, Power Query-based shaping; it is not the right answer when the stem demands custom Python libraries or distributed Spark, which is a notebook. A semantic model serves analytics to reports; it is not a transformation engine and not an orchestrator.
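The read-only rule for the SQL analytics endpoint can be sketched as a keyword check. This is a deliberately naive illustration: `is_table_write` is a hypothetical helper that only inspects the first keyword of a statement, and it is not how Fabric parses or authorizes T-SQL.

```python
# Leading T-SQL keywords that modify table rows.
WRITE_KEYWORDS = {"INSERT", "UPDATE", "DELETE", "MERGE"}


def is_table_write(statement: str) -> bool:
    """Return True if the statement starts with a row-modifying keyword."""
    stripped = statement.strip()
    if not stripped:
        raise ValueError("empty statement")
    return stripped.split(None, 1)[0].upper() in WRITE_KEYWORDS


def suggested_target(statement: str) -> str:
    """Suggest where the statement belongs, per the rule in the text."""
    if is_table_write(statement):
        return "warehouse or Spark notebook"
    return "SQL analytics endpoint"


print(suggested_target("SELECT * FROM orders"))               # SQL analytics endpoint
print(suggested_target("update orders set qty = 1"))          # warehouse or Spark notebook
```

Any answer option that pairs a write statement with the lakehouse SQL analytics endpoint fails this check.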

And a domain governs and groups — it never runs compute. When the stem names a capability (transactional writes, distributed code, low-code shaping, scheduling, governance grouping), match it to the one item that actually provides it rather than to a familiar-sounding neighbor. Mapping capability-to-item is the antidote to this whole family of traps, and it reinforces the layer map from earlier: pick the object whose job exactly matches the verb in the question.

Test Your Knowledge

A team assumes that committing their lakehouse to Git also backs up the table data. Why is this assumption wrong?


A user only needs read access to a single lakehouse and is not part of its workspace. What is the least-privileged way to grant this?


A developer deletes a OneLake shortcut that pointed to an Amazon S3 bucket. What happens to the source data in S3?
