
DP-600 KQL and Eventhouse | Microsoft Fabric Analytics Engineer | JavaInUse

DP-600 - KQL and Eventhouse

Eventhouse Overview

An Eventhouse in Microsoft Fabric is a workspace item that hosts one or more KQL (Kusto Query Language) databases. It is built on the same engine as Azure Data Explorer (ADX) and is optimized for:

  • High-throughput real-time data ingestion (IoT, telemetry, logs, clickstreams)
  • Time-series analytics with sub-second query performance
  • Column-store compressed storage for efficient time-window scans
  • Free-text and regex search on log data

Choose Eventhouse over Lakehouse/Warehouse when data is:

  • High frequency (millions of events per second)
  • Time-indexed (queries almost always include time filters)
  • Append-only (no updates to historical data)
  • Feeding streaming analytics or near real-time dashboards

KQL Query Structure

KQL queries are written as a pipeline of tabular operators connected by the pipe character |:

TableName
| operator1 ...
| operator2 ...
| operator3 ...

Every operator takes a table as input and produces a table as output. The result flows down the pipe. Table and column names are case-sensitive. String equality with == is also case-sensitive (use =~ for case-insensitive equality), while term operators such as has, contains, and startswith are case-insensitive by default; their _cs variants (has_cs, contains_cs) force case-sensitive matching.
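A quick sketch of these case-sensitivity rules, assuming the StormEvents sample table used elsewhere on this page:

StormEvents
| where State == "TEXAS"            // case-sensitive equality: "Texas" would not match
| where State =~ "texas"            // case-insensitive equality
| where EventType has "wind"        // term operator, case-insensitive by default
| where EventType has_cs "Wind"     // _cs variant: case-sensitive word match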

KQL is optimized for time-filtered queries. Always include a time filter first in your pipeline (e.g., | where Timestamp > ago(1d)) to leverage the time index and avoid full table scans.

where, project, take, top, order by

// Filter rows  -  equivalent to SQL WHERE
StormEvents
| where StartTime > ago(365d)
| where State == "TEXAS" and EpisodeId > 100

// Select columns  -  equivalent to SQL SELECT
| project EventType, StartTime, DamageProperty

// project-away: keep everything except listed columns
| project-away EndTime, Source

// project-rename: rename columns
| project-rename EventDate = StartTime

// Limit rows (arbitrary, no order)
| take 10

// Top n rows by a column
| top 5 by DamageProperty desc

// Sort (order by)
| order by StartTime asc

summarize and Aggregations

// Group by State with count and sum
StormEvents
| summarize EventCount = count(), TotalDamage = sum(DamageProperty)
    by State
| order by TotalDamage desc

// Aggregate without grouping (single row result)
StormEvents | summarize TotalEvents = count()

// Bin for time-series grouping (bin = time window)
StormEvents
| summarize Events = count() by bin(StartTime, 1d)
| render timechart

Aggregate Function       Description
count()                  Count all rows
sum(col)                 Sum of values
avg(col)                 Average of values
min(col) / max(col)      Minimum / maximum
dcount(col)              Distinct count (approximate)
countif(condition)       Count rows where condition is true
make_list(col)           Collect values into a dynamic array
percentile(col, p)       p-th percentile
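Several of these aggregates can be combined in a single summarize. A sketch, assuming the StormEvents sample schema (State, EventType, DamageProperty columns):

StormEvents
| summarize
    Events = count(),                                // rows per group
    States = dcount(State),                          // approximate distinct states
    MajorEvents = countif(DamageProperty > 1000000), // conditional count
    P95Damage = percentile(DamageProperty, 95)       // 95th percentile of damage
    by EventType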

extend - Add Calculated Columns

StormEvents
| extend DamageMillions = DamageProperty / 1000000.0
| extend Duration = EndTime - StartTime
| extend IsMajorEvent = DamageProperty > 1000000
| extend StateUpper = toupper(State)

extend keeps all existing columns and adds new computed ones. Use project to select which columns to keep after extending.

String Functions

Function                        Description
strlen(str)                     Length of string
toupper(str)                    Uppercase conversion
tolower(str)                    Lowercase conversion
substring(str, start, len)      Extract substring (0-based index)
strcat(s1, s2, ...)             Concatenate strings
trim(pattern, str)              Remove leading/trailing matches
split(str, delimiter)           Split into dynamic array
replace_string(str, old, new)   Replace substring
indexof(str, find)              First position of substring (-1 if not found)
has                             Whole-word match using index (fast)
contains                        Substring match (slower, no index)
startswith / endswith           Prefix / suffix check
matches regex                   Regular expression match

StormEvents
| where EventType has "flood"               // word match (fast)
| where Description contains "hurricane"    // substring match
| where State startswith "T"
| extend EventTypeLower = tolower(EventType)

Datetime and Time Functions

// Current time
now()                                   // current UTC datetime
ago(1d)                                 // 1 day ago
ago(7d)                                 // 7 days ago (d=day, h=hour, m=minute, s=second)

// Date parts
year(datetime_col)
month(datetime_col)
dayofmonth(datetime_col)
hourofday(datetime_col)

// Date arithmetic
datetime_add('day', 7, datetime_col)    // add 7 days
datetime_diff('hour', end_dt, start_dt) // difference in hours

// Format
format_datetime(datetime_col, 'yyyy-MM-dd')

// Bin (group by time window)
bin(Timestamp, 1h)                      // 1-hour time bins
bin(Timestamp, 15m)                     // 15-minute bins

Timespans (duration values): 1d, 2h, 30m, 10s. Use timespan type for duration arithmetic.
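A sketch of timespan arithmetic in a pipeline, assuming StormEvents has StartTime and EndTime datetime columns (subtracting two datetimes yields a timespan, and dividing a timespan by a timespan literal such as 1h yields a plain number):

StormEvents
| where StartTime > ago(30d)
| extend Duration = EndTime - StartTime        // timespan value
| where Duration > 2h                          // compare against a timespan literal
| summarize AvgHours = avg(Duration) / 1h      // timespan / timespan = real number
    by bin(StartTime, 1d)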

join and union

// Inner join  -  match on key column
FactSales
| join kind=inner (
    Products | project ProductID, ProductName, Category
) on ProductID

// Left outer join
Employees
| join kind=leftouter (
    Departments | project DeptID, DeptName
) on DeptID

// Union  -  combine rows from two tables
union SalesQ1, SalesQ2, SalesQ3, SalesQ4
| summarize Total = sum(Amount)

// Union with source column
union withsource=Quarter SalesQ1, SalesQ2
| project Quarter, CustomerID, Amount

Join kinds: inner, leftouter, rightouter, fullouter, leftanti, rightanti, leftsemi, rightsemi.
For small lookup tables, use lookup instead of join for better performance.
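A minimal lookup sketch, reusing the hypothetical FactSales/Products tables from the join example above. lookup is optimized for enriching a large fact table from a small dimension table and supports only the leftouter (default) and inner kinds:

FactSales
| lookup kind=leftouter (
    Products | project ProductID, ProductName, Category
) on ProductID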

Conditionals: iff() and case()

// iff  -  inline if-then-else
StormEvents
| extend SeverityLabel = iff(DamageProperty > 1000000, "Major", "Minor")

// case  -  multi-branch conditional
StormEvents
| extend SeverityLabel = case(
    DamageProperty > 10000000, "Catastrophic",
    DamageProperty > 1000000, "Major",
    DamageProperty > 100000, "Moderate",
    "Minor"
)

// isempty / isnotempty
| where isnotempty(Description)
| extend HasDesc = iff(isempty(Description), false, true)

Eventhouse and OneLake Integration

KQL database tables in Eventhouse can be mirrored to OneLake by enabling OneLake availability on the table. This:

  • Continuously backs up KQL table data as Delta/Parquet files in OneLake
  • Makes the data accessible as a Lakehouse shortcut
  • Enables joining KQL data with Lakehouse/Warehouse data using Fabric cross-item queries
  • Allows Power BI Direct Lake to read KQL data via the resulting Delta path

This is a key DP-600 pattern: ingest streaming real-time data into Eventhouse -> enable OneLake availability -> expose as Delta for batch analytics and Power BI reporting.
