DP-600 - KQL and Eventhouse
Eventhouse Overview
An Eventhouse in Microsoft Fabric is a workspace item that hosts one or more KQL (Kusto Query Language) databases. It is built on the same engine as Azure Data Explorer (ADX) and is optimized for:
- High-throughput real-time data ingestion (IoT, telemetry, logs, clickstreams)
- Time-series analytics with sub-second query performance
- Column-store compressed storage for efficient time-window scans
- Free-text and regex search on log data
Choose Eventhouse over Lakehouse/Warehouse when data is:
- High frequency (millions of events per second)
- Time-indexed (queries almost always include time filters)
- Append-only (no updates to historical data)
- Requiring streaming analytics or near real-time dashboards
KQL Query Structure
KQL queries are written as a pipeline of tabular operators connected by the pipe character |:
TableName | operator1 ... | operator2 ... | operator3 ...
Every operator takes a table as input and produces a table as output; the result flows down the pipe. KQL is case-sensitive for table, column, and operator names. String equality with == is case-sensitive (use =~ for case-insensitive equality); text operators such as has and contains are case-insensitive by default, with _cs variants (has_cs, contains_cs) for case-sensitive matching.
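A quick illustration of the case-sensitivity rules, using the StormEvents sample table from the examples below:

```kusto
StormEvents
| where State == "TEXAS"       // == is case-sensitive: matches "TEXAS" only
| where State =~ "texas"       // =~ is case-insensitive: matches "Texas", "TEXAS", ...
| where EventType has "flood"  // has is case-insensitive; has_cs requires exact case
```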
Performance tip: put a time filter early in the pipeline (e.g., | where Timestamp > ago(1d)) to leverage the time index and avoid full table scans.
where, project, take, top, order by
// Filter rows - equivalent to SQL WHERE
StormEvents
| where StartTime > ago(365d)
| where State == "TEXAS" and EpisodeId > 100
// Select columns - equivalent to SQL SELECT
| project EventType, StartTime, DamageProperty
// project-away: keep everything except listed columns
| project-away EndTime, Source
// project-rename: rename columns
| project-rename EventDate = StartTime
// Limit rows (arbitrary, no order)
| take 10
// Top n rows by a column
| top 5 by DamageProperty desc
// Sort (order by)
| order by StartTime asc
summarize and Aggregations
// Group by State with count and sum
StormEvents
| summarize EventCount = count(), TotalDamage = sum(DamageProperty)
by State
| order by TotalDamage desc
// Aggregate without grouping (single row result)
StormEvents | summarize TotalEvents = count()
// Bin for time-series grouping (bin = time window)
StormEvents
| summarize Events = count() by bin(StartTime, 1d)
| render timechart
| Aggregate Function | Description |
|---|---|
| count() | Count all rows |
| sum(col) | Sum of values |
| avg(col) | Average of values |
| min(col) / max(col) | Minimum / maximum |
| dcount(col) | Distinct count (approximate) |
| countif(condition) | Count rows where condition is true |
| make_list(col) | Collect values into a dynamic array |
| percentile(col, p) | p-th percentile |
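Several of these aggregates can be computed in a single summarize pass, for example:

```kusto
StormEvents
| summarize Events = count(),
            DistinctStates = dcount(State),
            P95Damage = percentile(DamageProperty, 95)
  by EventType
```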
extend - Add Calculated Columns
StormEvents
| extend DamageMillions = DamageProperty / 1000000.0
| extend Duration = EndTime - StartTime
| extend IsMajorEvent = DamageProperty > 1000000
| extend StateUpper = toupper(State)
extend keeps all existing columns and adds new computed ones. Use project to select which columns to keep after extending.
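Combining the two, a minimal sketch (column names follow the StormEvents examples above):

```kusto
StormEvents
| extend DamageMillions = DamageProperty / 1000000.0  // add the computed column
| project EventType, State, DamageMillions            // then keep only what is needed
```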
String Functions
| Function | Description |
|---|---|
| strlen(str) | Length of string |
| toupper(str) | Uppercase conversion |
| tolower(str) | Lowercase conversion |
| substring(str, start, len) | Extract substring (0-based index) |
| strcat(s1, s2, ...) | Concatenate strings |
| trim(pattern, str) | Remove leading/trailing matches |
| split(str, delimiter) | Split into dynamic array |
| replace_string(str, old, new) | Replace substring |
| indexof(str, find) | First position of substring (-1 if not found) |
| has | Whole-word match using index (fast) |
| contains | Substring match (slower, no index) |
| startswith / endswith | Prefix / suffix check |
| matches regex | Regular expression match |
StormEvents
| where EventType has "flood"             // word match (fast)
| where Description contains "hurricane"  // substring match
| where State startswith "T"
| extend EventTypeLower = tolower(EventType)
Datetime and Time Functions
// Current time
now() // current UTC datetime
ago(1d) // 1 day ago
ago(7d) // 7 days ago (d=day, h=hour, m=minute, s=second)
// Date parts
year(datetime_col)
month(datetime_col)
dayofmonth(datetime_col)
hourofday(datetime_col)
// Date arithmetic
datetime_add('day', 7, datetime_col) // add 7 days
datetime_diff('hour', end_dt, start_dt) // difference in hours
// Format
format_datetime(datetime_col, 'yyyy-MM-dd')
// Bin (group by time window)
bin(Timestamp, 1h) // 1-hour time bins
bin(Timestamp, 15m) // 15-minute bins
Timespans (duration values): 1d, 2h, 30m, 10s. Use timespan type for duration arithmetic.
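As an illustration of timespan arithmetic, print is a KQL operator that evaluates scalar expressions without a source table:

```kusto
print FullWindow = 1d + 12h,   // timespan + timespan -> timespan (1.12:00:00)
      BinsPerDay = 1d / 15m    // timespan / timespan -> real (96)
```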
join and union
// Inner join - match on key column
FactSales
| join kind=inner (
Products | project ProductID, ProductName, Category
) on ProductID
// Left outer join
Employees
| join kind=leftouter (
Departments | project DeptID, DeptName
) on DeptID
// Union - combine rows from two tables
union SalesQ1, SalesQ2, SalesQ3, SalesQ4
| summarize Total = sum(Amount)
// Union with source column
union withsource=Quarter SalesQ1, SalesQ2
| project Quarter, CustomerID, Amount
Join kinds: inner, leftouter, rightouter, fullouter, leftanti, rightanti, leftsemi, rightsemi.
For small lookup tables, use lookup instead of join for better performance.
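For instance, the inner join above could be rewritten with lookup, which behaves like a left outer join optimized for small dimension tables:

```kusto
FactSales
| lookup kind=leftouter (
    Products | project ProductID, ProductName, Category
) on ProductID
```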
Conditionals: iff() and case()
// iff - inline if-then-else
StormEvents
| extend SeverityLabel = iff(DamageProperty > 1000000, "Major", "Minor")
// case - multi-branch conditional
StormEvents
| extend SeverityLabel = case(
DamageProperty > 10000000, "Catastrophic",
DamageProperty > 1000000, "Major",
DamageProperty > 100000, "Moderate",
"Minor"
)
// isempty / isnotempty
| where isnotempty(Description)
| extend HasDesc = iff(isempty(Description), false, true)
Eventhouse and OneLake Integration
KQL database tables in Eventhouse can be mirrored to OneLake by enabling OneLake availability on the table. This:
- Continuously backs up KQL table data as Delta/Parquet files in OneLake
- Makes the data accessible as a Lakehouse shortcut
- Enables joining KQL data with Lakehouse/Warehouse data using Fabric cross-item queries
- Allows Power BI Direct Lake to read KQL data via the resulting Delta path
This is a key DP-600 pattern: ingest streaming real-time data into Eventhouse -> enable OneLake availability -> expose as Delta for batch analytics and Power BI reporting.