DP-600 - Semantic Models and Data Modeling

Quick Navigation

Star Schema Design
Normalization (1NF / 2NF / 3NF)
Relationships
Storage Modes
Direct Lake Mode
Row-Level Security (RLS)
Object-Level Security (OLS)
Composite Models

Star Schema Design

A star schema is the recommended pattern for Power BI semantic models. It has:

Fact tables: contain measurable events (sales, orders, transactions). Typically narrow columns: foreign keys + numeric measures.
Dimension tables: contain descriptive attributes (Date, Customer, Product, Geography). Usually wider with text and categorical columns.

Why star schema? VertiPaq (the in-memory engine) excels with wide, denormalized dimension tables joined to narrow fact tables via integer foreign keys. Snowflake schemas (normalized dimensions) require more joins and are slower in DAX.

Date dimension requirements: A dedicated Date table must cover all dates in the fact table, have no gaps, and be marked as a Date table (via the Modeling tab). This is required for time intelligence functions to work correctly.

Normalization Forms

Form	Rule	Example
1NF	Atomic values - no repeating groups or arrays in columns	City and Country in separate columns, not "Seattle, USA" in one
2NF	1NF + every non-key attribute depends on the FULL primary key (no partial dependency)	CustomerName removed from Orders table - depends only on CustomerID, not the full {OrderID, ProductID} key
3NF	2NF + no transitive dependencies (non-key depends on non-key)	City and Region in a separate Geography table keyed by ZipCode rather than stored in Orders

For Power BI, dimension tables are often denormalized (1NF, not 2NF/3NF) for performance, but source systems should be normalized to 3NF.

Relationships

Cardinality	Description	Example
One-to-Many (1:*)	Most common - one row in dimension, many in fact	One Customer -> many Orders
Many-to-One (*:1)	Same as 1:* viewed from the fact side	Many Orders -> one Customer
One-to-One (1:1)	Rare - merging is usually better	Employee ? EmployeeDetail
Many-to-Many (:)	Supported natively (Power BI >= 2019); use bridge table for explicit control	Students ? Courses

Cross-filter direction: Single (default) - filter flows from 1 side to * side. Both - filter flows both ways (use carefully; can cause ambiguity). In DAX use CROSSFILTER() to override per-measure.

Active vs Inactive relationships: Only one relationship between two tables can be active at a time. Additional relationships are inactive and must be activated via USERELATIONSHIP() inside CALCULATE.

Storage Modes

Mode	Data Location	Refresh Required	Best For
Import	Compressed in-memory (VertiPaq)	Yes - manual or scheduled	Fast query performance, data < ~10GB
DirectQuery	Source system - queries sent live	No - always current	Very large data, strict data freshness needed
Dual	Both imported AND DirectQuery	Yes for import portion	Dimension tables in composite models
Direct Lake	OneLake Delta files (Fabric only)	No - reads Delta files on demand	Large Fabric Lakehouse/Warehouse tables with fresh data

Direct Lake is unique to Microsoft Fabric. It reads Delta/Parquet files from OneLake directly using the VertiPaq engine's framing process, providing Import-speed performance without full data copies. It falls back to DirectQuery automatically if data is too large to frame.

Direct Lake Mode - Deep Dive

When you create a semantic model from a Fabric Lakehouse or Warehouse, the default mode is Direct Lake. Key concepts:

Framing: The process of loading Delta column segments into VertiPaq memory. Happens when a report is first queried after data changes.
Fallback: If framing cannot complete (data too large for capacity), the query falls back to DirectQuery against the Lakehouse SQL endpoint.
No gateway required: Data lives in OneLake, not on-premises.
Requirements: Fabric capacity (F-SKU or P-SKU); tables must be Delta format in the Tables section of a Lakehouse.
Direct Lake on SQL endpoint: Queries the SQL analytics endpoint of a Lakehouse rather than reading raw Delta files - used when V-Order optimization applies.

Row-Level Security (RLS)

RLS restricts which data rows a user can see:

In Power BI Desktop: Modeling tab -> Manage Roles -> create role -> add DAX filter expression
Publish to the Service
In the Service: dataset Settings -> Row-level security -> assign users/groups to roles

// Static RLS filter  -  East region role
[Region] = "East"

// Dynamic RLS  -  user sees only their own data
[Email] = USERPRINCIPALNAME()

Dynamic RLS uses USERPRINCIPALNAME() or USERNAME() to filter rows based on the logged-in user's email. No separate role assignment needed per user - one role, one filter expression handles all users via a lookup table.

Object-Level Security (OLS)

OLS restricts visibility of entire tables or columns. Users assigned to a role with OLS restrictions cannot see those objects in reports at all - the field is hidden from the field list and any query referencing it returns an error.

OLS is configured in Tabular Editor (not directly in Power BI Desktop UI). Set the table or column permission to None for the role.

RLS vs OLS: RLS filters rows (user sees the table but fewer rows). OLS hides objects entirely (user cannot see the column or table exists). Both can be used together.

Composite Models

A composite model mixes Import and DirectQuery tables in the same semantic model. Use cases:

Import small dimension tables for speed; DirectQuery large fact tables from a data warehouse
Connect a Power BI dataset (another shared semantic model) alongside local Import tables
Combine Fabric Lakehouse (Direct Lake) with an on-premises DirectQuery source

Aggregations: Composite models support automatic aggregations - a summarized Import table speeds up common queries, with detailed DirectQuery as fallback. Query matching decides which layer to use based on the requested granularity.

← Take DP-600 Practice Tests | Back to Study Topics

ï¿½ï¿½

Search Tutorials