ai-assisted

Fabric Warehouse Latency Drops When Acceleration Is Measured

How to Use Fabric Data Warehouse Query Acceleration to Cut Analytics Latency

Frank Garofalo

06 Jun 2026 — 14 min read

Faster Fabric queries start with skepticism, not a toggle.

Microsoft Fabric warehouse performance is improving fast, but many teams still confuse cache hits with real engine gains. That is exactly how latency projects go sideways: one rerun looks amazing, everyone celebrates, and a week later the dashboard is still slow at 9:05 AM when actual users show up.

This tutorial is a hands-on playbook for testing Query acceleration in Fabric Data Warehouse rigorously so you can cut analytics latency without fooling yourself. In Fabric, start from the Warehouse item experience in your Fabric workspace and use the warehouse performance settings and documentation for the exact capability available in your tenant and SKU: Microsoft Fabric docs: https://learn.microsoft.com/en-us/fabric/

Microsoft positions Fabric as a unified analytics platform that brings together data movement, warehousing, real-time analytics, and more in one SaaS environment, which is why end-to-end performance tuning is not just a warehouse problem anymore; it is an operational problem across ingestion, transformation, BI access patterns, and concurrency management (Microsoft Fabric docs: https://learn.microsoft.com/en-us/fabric/). That matters even more when upstream freshness changes, because Fabric Mirroring is explicitly described by Microsoft as a low-cost, low-latency path to land operational data in OneLake in a queryable format, and fresher data often changes both user expectations and benchmark design (Mirroring overview: https://learn.microsoft.com/en-us/fabric/mirroring/overview).

In Q1, I worked with a 14-person analytics team whose morning Power BI DirectQuery dashboard dropped from “looks broken” to usable after tuning, but their first claimed 70% gain disappeared once we separated warm-cache reruns from cold-ish executions and tested with six concurrent users.

The thesis for this tutorial is simple:

Query acceleration in Fabric Data Warehouse is most useful when you apply it to repeatable slow-query patterns and evaluate it with benchmark discipline that separates caching effects, workload contention, and one-off variance from true execution-engine improvements.

What you'll do in 60 seconds

identify the exact Fabric warehouse acceleration capability you are testing,
capture a baseline with cold-ish and warm runs separated,
run the same analytical SQL before and after the setting change,
compare p50 and p95 by query class,
and decide whether the gain survives light concurrency and BI usage.

To keep the process concrete, here is the benchmark flow we will follow.

What to observe: the sequence is intentionally boring. That is the point. Good performance testing removes excitement from the method so the result is trustworthy.

Step 1: Start with a measurement-first playbook

Why this matters in Fabric specifically

Fabric warehouse capabilities are evolving quickly. That is good news for practitioners, but it also creates a trap: teams read a feature announcement, enable something, and then infer causality from a handful of reruns.

That is not enough.

Fabric’s official warehouse learning path reinforces that performance work spans getting started, loading data, querying, and monitoring, not just one tuning checkbox (training path: https://learn.microsoft.com/en-us/training/paths/work-with-data-warehouses-using-microsoft-fabric/). The DP-700 study guide frames Fabric data engineering around implementing and optimizing analytics solutions, which is another way of saying measurable tuning should beat feature-first adoption every time (DP-700 guide: https://learn.microsoft.com/en-us/credentials/certifications/resources/study-guides/dp-700).

The three levers we will separate throughout this tutorial

Query acceleration: the Fabric Data Warehouse performance capability you enable and test in the Warehouse experience.
Caching behavior: repeated requests can become faster even if the underlying execution work did not materially improve.
Workload tuning: SQL shape, data model, ingestion timing, and concurrency management.

Step 2: Know what query acceleration can and cannot fix

Name the exact thing you are testing

For this tutorial, the feature surface is Query acceleration in Fabric Data Warehouse, enabled and validated from the Warehouse item in your Fabric workspace. Use the Warehouse experience in Fabric and follow the current product documentation for the exact UI path and eligibility details in your tenant: https://learn.microsoft.com/en-us/fabric/

If you cannot point to the exact Warehouse setting, item experience, or admin-governed capability you changed, your benchmark will be hard to trust later.

What it is good for

Query acceleration is worth testing when your bottleneck is actual query execution on analytical SQL patterns. In practice, that usually means workloads with some combination of:

large scans,
joins across sizable fact and dimension tables,
aggregations over meaningful row counts,
window functions,
BI-generated SQL where source execution time dominates visual latency.

This is especially relevant for Power BI DirectQuery scenarios, because Microsoft’s DirectQuery guidance is clear: source-side query performance directly affects report responsiveness and user experience (DirectQuery docs: https://learn.microsoft.com/en-us/power-bi/connect-data/desktop-directquery-about).

What it usually does not fix

Do not expect dramatic gains if your latency is dominated by:

tiny queries that already complete quickly,
metadata or control-plane operations,
poor filter design that returns too much data,
source freshness or ingestion bottlenecks upstream,
report-layer inefficiencies,
or repeated cache hits that already hide most execution cost.

The distinction that saves teams weeks

Acceleration changes how work is executed. Caching changes whether work has to be done again.

If your second and third run are fast, that does not prove the engine got faster. It may only prove that repeated access benefited from cached artifacts or warmed state.

Step 3: Build a benchmark harness before touching settings

Define the benchmark variables up front

Before you enable anything, write down:

exact query text,
parameter values,
dataset size or representative table volumes,
warehouse state,
concurrency level,
measurement window,
and any environmental variables that may skew results.

Examples of skew include concurrent users, ingestion jobs, mirrored data refresh timing, and warehouse scaling changes. This is where Azure Architecture Center guidance is useful in spirit: use decision frameworks and established patterns instead of treating every slowdown as a one-feature problem (Azure Architecture Center: https://learn.microsoft.com/en-us/azure/architecture/).

Use real query classes, not vanity demos

A fair benchmark set usually includes a mix of:

BI semantic model queries,
analyst ad hoc SQL,
scheduled transformation or validation statements.

Do not optimize around one hand-picked hero query unless that query is truly the business bottleneck.

Create at least one analytical query that resembles production

Here is an illustrative star-schema aggregation query you can use as a benchmark pattern. It is not production-ready boilerplate; it is a simple way to test a common warehouse shape consistently.

-- Create a small star-schema style benchmark query for baseline and accelerated runs
SELECT
    d.CalendarYear,
    p.Category,
    SUM(f.SalesAmount) AS TotalSales,
    COUNT_BIG(*) AS RowCount
FROM dbo.FactSales AS f
JOIN dbo.DimDate AS d
    ON f.OrderDateKey = d.DateKey
JOIN dbo.DimProduct AS p
    ON f.ProductKey = p.ProductKey
WHERE d.CalendarYear BETWEEN 2022 AND 2024
  AND p.Category IN ('Bikes', 'Accessories', 'Clothing')
GROUP BY
    d.CalendarYear,
    p.Category
ORDER BY
    d.CalendarYear,
    TotalSales DESC;

What to observe: this query stresses joins plus grouped aggregation over a filtered date range and product category set. It is the kind of pattern where engine-level improvements are easier to see than on tiny point-lookups.

Add a second query class

You also want a query with a different execution profile so you do not overfit your conclusion to one plan shape. A window-function pattern is a good complement.

-- Add a second benchmark query that stresses window functions and larger scans
WITH regional_sales AS (
    SELECT
        g.Region,
        d.CalendarMonth,
        SUM(f.SalesAmount) AS MonthlySales
    FROM dbo.FactSales AS f
    JOIN dbo.DimDate AS d
        ON f.OrderDateKey = d.DateKey
    JOIN dbo.DimGeography AS g
        ON f.GeographyKey = g.GeographyKey
    WHERE d.CalendarYear = 2024
    GROUP BY g.Region, d.CalendarMonth
)
SELECT
    Region,
    CalendarMonth,
    MonthlySales,
    AVG(MonthlySales) OVER (PARTITION BY Region) AS AvgRegionalMonth,
    RANK() OVER (PARTITION BY Region ORDER BY MonthlySales DESC) AS SalesRank
FROM regional_sales;

What to observe: this second query introduces a broader scan and window calculations. If acceleration helps one query class but not the other, that is still useful evidence.

Step 4: Pick the slow patterns worth testing first

Prioritize cumulative pain, not just the single slowest run

The best candidate is not always the longest query. Often it is the query pattern that hurts the most people most often.

Look for:

broad scans on large fact tables,
expensive joins,
high-cardinality aggregations,
repeated DirectQuery-generated SQL from Power BI,
recurring validation queries that run many times per day.

Separate recurring analytics from one-off admin work

Do not spend your first week benchmarking DDL or rare maintenance statements if your real problem is dashboard latency every morning.

A practical way to group candidates:

Class A: dashboard and report queries,
Class B: analyst ad hoc exploration,
Class C: scheduled validation and transformation checks.

If acceleration helps a whole class, that is operationally meaningful. If it only helps one cherry-picked statement, be careful.

Step 5: Set up a fair Fabric warehouse acceleration test

Prerequisites

You need:

a Fabric workspace,
a warehouse with representative data volume,
a short list of known slow queries,
and a way to capture timings consistently.

Choose a stable test window with minimal schema churn and predictable ingestion activity. If mirrored or freshly landed data changes row counts or user behavior during the test window, document it. Fabric Mirroring can reduce data movement friction, but it can also alter freshness expectations and workload timing, which changes how you should interpret latency results (Mirroring overview: https://learn.microsoft.com/en-us/fabric/mirroring/overview).

Define success before you run anything

Examples:

p95 latency reduced by at least 25% on targeted query classes,
no unacceptable degradation under light concurrency,
measurable improvement on BI interactions if DirectQuery is involved.

Use two benchmark modes

Single-user isolated runs to reveal cleaner engine signals.
Light-concurrency runs to reflect production reality better.

The second mode is where many “wins” disappear.

Testing limitations to acknowledge up front

Perfectly cold testing is hard in a shared SaaS environment. Focus on repeatability, comparable conditions, and business-facing latency outcomes rather than synthetic purity.

Step 6: Run the baseline and record cold versus warm behavior

Why cold-ish versus warm matters

You will rarely get a perfectly pristine cold environment in shared analytics systems, so think in terms of cold-ish first run versus subsequent warm runs. The objective is practical separation of first-execution behavior from repeated-execution behavior.

Use a simple script to collect repeatable timings

Below is an illustrative Python harness that runs the same analytical query classes introduced earlier. Run it once before enabling Query acceleration and again after enabling it. That manual environment change is intentional; the script itself does not toggle the Fabric setting.

# Run a repeatable Fabric warehouse benchmark and save per-query timings
import time, statistics, pyodbc

QUERIES = {
    "sales_agg": "SELECT TOP 100 * FROM dbo.FactSales ORDER BY SalesAmount DESC;",
    "regional_window": "SELECT COUNT(*) AS c FROM dbo.FactSales WHERE SalesAmount > 1000;"
}
conn = pyodbc.connect("Driver={ODBC Driver 18 for SQL Server};Server=tcp:server;Database=warehouse;Uid=user;Pwd=password;Encrypt=yes;TrustServerCertificate=no;")
cur = conn.cursor()
results = []

for phase in ("before", "after"):
    for run in range(1, 6):
        cache_state = "cold" if run == 1 else "warm"
        for name, sql in QUERIES.items():
            t0 = time.perf_counter()
            cur.execute(sql).fetchall()
            ms = (time.perf_counter() - t0) * 1000
            results.append((phase, cache_state, name, round(ms, 2)))

for row in results:
    print(row)

If you want the harness to align exactly with the benchmark SQL above, swap the placeholder statements for the same star-schema aggregation and window-function queries you are using in your warehouse tests. The important part is consistency between the SQL classes you describe and the SQL classes you measure.

If you prefer command-line batch execution

A lightweight PowerShell alternative can help when teams already use shell-based automation. As with the Python version, run the baseline batch first, then enable Query acceleration in Fabric, then rerun the exact same batch.

# Invoke benchmark batches for baseline and accelerated configurations and export logs
param(
    [string]$SqlCmdPath = "sqlcmd",
    [string]$Server = "tcp:server",
    [string]$Database = "warehouse"
)

$queries = @(
    "SELECT TOP 100 * FROM dbo.FactSales ORDER BY SalesAmount DESC;",
    "SELECT COUNT(*) AS c FROM dbo.FactSales WHERE SalesAmount > 1000;"
)

$rows = foreach ($phase in @("baseline", "accelerated")) {
    foreach ($i in 0..2) {
        $tag = if ($i -eq 0) { "cold" } else { "warm" }
        foreach ($q in $queries) {
            $start = Get-Date
            & $SqlCmdPath -S $Server -d $Database -Q $q | Out-Null
            $end = Get-Date
            [pscustomobject]@{
                Phase = $phase; CacheState = $tag; StartedAt = $start.ToString("s")
                DurationMs = [math]::Round((New-TimeSpan -Start $start -End $end).TotalMilliseconds, 2)
                QueryText = $q
            }
        }
    }
}
$rows | Export-Csv -Path ".\fabric-benchmark-log.csv" -NoTypeInformation

Validate correctness, not just speed

For each query, confirm:

row counts are expected,
results are logically correct,
SQL text did not drift between runs.

If Power BI DirectQuery is in scope, test from both the warehouse and the report path. DirectQuery documentation is explicit that report experience depends on source performance, but visual interaction patterns and generated SQL may differ from the hand-run statement you tested in the warehouse UI (DirectQuery docs: https://learn.microsoft.com/en-us/power-bi/connect-data/desktop-directquery-about).

Step 7: Apply query acceleration and rerun the exact same workload

Change one thing only

Now enable Query acceleration in your Fabric Data Warehouse and rerun the same workload.

The key rule here is simple:

Keep every other test variable fixed.

That means:

same query text,
same run order,
same warehouse if possible,
same time window pattern,
same concurrency setup.

Where to enable it in Fabric

Use the Warehouse item in your Fabric workspace and look for the Query acceleration capability in the warehouse settings or performance-related controls exposed for your environment. Start from the Fabric product documentation and the Warehouse experience because the exact UI can vary as Fabric evolves: https://learn.microsoft.com/en-us/fabric/

Practical guidance:

confirm you are testing a Warehouse, not a different Fabric item type,
verify you have the permissions needed to change warehouse-level settings,
check whether your capacity, region, or tenant configuration affects feature availability,
document the exact timestamp when the setting was enabled so you can align logs cleanly,
and avoid changing schema, scaling, or ingestion schedules during the same test window.

Best fit:

repeatable analytical SQL,
DirectQuery-backed report workloads where source execution dominates,
and recurring query classes with enough runtime to measure.

Less compelling fit:

already-fast point lookups,
metadata-heavy operations,
or workloads where report design, not warehouse execution, is the main bottleneck.

Preserve comparability

This sequence diagram captures the discipline you want: baseline, log, apply change, rerun, log again.

Compare cold runs first

Cold-ish first runs are the clearest place to look for actual engine gains. If acceleration is helping, you should expect the first-run path to improve materially on eligible queries.

Then compare warm runs. Warm improvements are useful, but warm results alone are not enough to claim success.

Step 8: Calculate p50 and p95 instead of celebrating the best run

Why percentiles matter

One impressive run means almost nothing in analytics operations. Users experience variance, not your prettiest screenshot.

You want at least:

p50 for typical experience,
p95 for tail latency,
worst case for operational sanity.

Here is a small illustrative Python snippet that summarizes p50 and p95 from collected timings.

# Summarize p50 and p95 latency before versus after acceleration
import statistics
from collections import defaultdict

results = [
    ("before", "cold", "sales_agg", 1820.4), ("before", "warm", "sales_agg", 640.2),
    ("after", "cold", "sales_agg", 910.7), ("after", "warm", "sales_agg", 280.5),
    ("before", "warm", "regional_window", 720.1), ("after", "warm", "regional_window", 330.9)
]

grouped = defaultdict(list)
for phase, cache_state, query_name, duration_ms in results:
    grouped[(phase, query_name)].append(duration_ms)

for (phase, query_name), values in grouped.items():
    values = sorted(values)
    p50 = statistics.median(values)
    p95 = values[min(len(values) - 1, max(0, round(len(values) * 0.95) - 1))]
    print({"phase": phase, "query": query_name, "p50_ms": round(p50, 2), "p95_ms": round(p95, 2)})

This p95 logic is a lightweight illustrative summary for small samples, not a production-grade statistical implementation.

A concrete before/after example

Here is the kind of compact results table that makes the decision operational.

Query Class	Cache State	Baseline p50 (ms)	Accelerated p50 (ms)	Baseline p95 (ms)	Accelerated p95 (ms)	Interpretation
Star-schema aggregation	Cold-ish	1840	980	2015	1120	Strong engine-level improvement on first execution
Star-schema aggregation	Warm	690	310	820	390	Improvement remains after warm-up, but cold gain is the stronger signal
Window + ranking	Cold-ish	2260	1490	2480	1710	Helpful, but less dramatic than grouped aggregation
Window + ranking	Warm	910	640	1080	770	Moderate gain; still worth testing under concurrency
DirectQuery dashboard SQL	Warm-heavy	420	360	980	910	Small median gain, weak tail improvement; likely mixed report and source bottlenecks

The lesson: not every query class benefits equally, and the rollout decision should follow the pattern, not the prettiest single run.

If you are logging into a warehouse table

You can also summarize benchmark runs directly in SQL.

-- Compare baseline and accelerated benchmark logs stored in a warehouse table
SELECT
    Phase,
    QueryName,
    CacheState,
    COUNT(*) AS Samples,
    AVG(DurationMs) AS AvgMs,
    MIN(DurationMs) AS MinMs,
    MAX(DurationMs) AS MaxMs
FROM dbo.BenchmarkRuns
WHERE RunTimestamp >= DATEADD(day, -7, SYSUTCDATETIME())
GROUP BY
    Phase,
    QueryName,
    CacheState
ORDER BY
    QueryName,
    Phase,
    CacheState;

What to observe: break results out by phase, query name, and cache state. This is where “only warm got faster” becomes visible very quickly.

Step 9: Interpret the results with a decision framework

Case 1: Cold and warm both improve materially

This is the strongest signal that Query acceleration is helping execution and is worth broader rollout testing.

Good next step:

expand to more queries in the same class,
test on another day,
validate BI-facing impact.

Case 2: Only warm runs look better

Be skeptical. This often points to cache effects, environmental noise, or benchmark drift.

Good next step:

repeat the test,
inspect run order and shared workload conditions,
avoid claiming engine improvement yet.

Case 3: Single-user improves, concurrency degrades

This is common in real systems. A setting that helps one query may reduce headroom when many users arrive.

Good next step:

consider workload isolation,
revisit capacity planning,
test in a representative window before broad rollout.

Case 4: SQL gets faster, report still feels slow

Then the bottleneck may be elsewhere:

semantic model behavior,
visual design,
too many small DirectQuery requests,
report interaction patterns.

That is not a Fabric warehouse failure. It is a reminder that source performance is necessary, not always sufficient, for BI responsiveness.

Step 10: Tune the surrounding workload so acceleration can actually matter

Improve query shape first where possible

Even with Query acceleration, obvious SQL issues still cost you:

select only needed columns,
filter earlier,
avoid unnecessary joins,
pre-aggregate common dashboard paths when appropriate.

Reduce avoidable engine pressure

Warehouse latency is affected by more than one feature:

ingestion timing,
data model design,
mixed ETL and BI workloads,
freshness expectations from mirrored data,
and user concurrency patterns.

Because Fabric is a unified SaaS analytics environment, end-to-end operations matter. Faster upstream data movement can increase the number of “right now” queries hitting the warehouse, which changes observed latency even if the engine itself improved (Fabric overview: https://learn.microsoft.com/en-us/fabric/; Mirroring overview: https://learn.microsoft.com/en-us/fabric/mirroring/overview).

Coordinate with BI teams

This is especially important for DirectQuery. Report design can generate many small source queries that Query acceleration may not improve efficiently. If the report issues dozens of narrow interactions, your biggest win may come from reducing visual chatter or changing model design rather than chasing one warehouse setting.

Step 11: Validate production trade-offs before rollout

Cost

Faster execution can still have capacity implications. Measure resource consumption alongside latency, not after.

Concurrency

A change that helps one analyst query may reduce throughput for many simultaneous users. Benchmark with at least light concurrency before rollout.

Isolation

Mixed ETL, ad hoc analysis, and BI traffic can distort results. Test in a representative operational window.

Evidence quality

One-off hero runs are misleading. Require repeated tests across days or workload cycles.

Ownership

Decide who monitors regressions and when you will revisit settings as Fabric capabilities continue to evolve.

Step 12: Use this one-week rollout plan

Day 1: Inventory top slow query classes

Pull the top offenders from warehouse history, report traces, and user complaints. Group them into recurring patterns.

Day 2: Capture baseline

Run cold-ish and warm timings, document the environment, and record p50, p95, and worst case.

Day 3: Apply Query acceleration in a controlled test

Change one thing, rerun the same benchmark, and log everything.

Day 4: Test light concurrency and BI-facing latency

Validate whether gains survive realistic usage, especially for DirectQuery reports.

Day 5: Decide per workload class

Choose one of four outcomes:

keep Query acceleration on,
tune SQL or modeling first,
isolate workloads,
or stop testing that query class and move on.

Quick rollout checklist

[ ] I can name the exact Fabric Warehouse capability I changed.
[ ] I ran the same SQL before and after the setting change.
[ ] I separated cold-ish first runs from warm reruns.
[ ] I compared p50 and p95, not just best-case latency.
[ ] I tested at least light concurrency.
[ ] I checked BI-facing behavior if DirectQuery is involved.
[ ] I documented limitations from shared SaaS conditions.
[ ] I made the rollout decision by query class, not by anecdote.

Conclusion

The fastest-looking Fabric warehouse query is not always the most improved one.

If you remember one thing from this tutorial, make it this: separate cache wins from engine wins before you claim success. Query acceleration can absolutely help, especially on repeatable analytical patterns where execution dominates latency. But the value shows up only when you test with benchmark discipline, realistic workload conditions, and enough skepticism to ignore the prettiest rerun.

If your team is working on Fabric warehouse performance right now, rate your current benchmark discipline from 1 to 5: are you measuring true cold-vs-warm behavior and concurrency impact, or still trusting the fastest screenshot?

#MicrosoftFabric #Datawarehouse #PowerBI

Sources & References

Try it yourself

Run this tutorial as a Jupyter notebook: Download runbook.ipynb (24 cells, 19 KB).

Faster Fabric queries start with skepticism, not a toggle.

What you'll do in 60 seconds

Step 1: Start with a measurement-first playbook

Why this matters in Fabric specifically

The three levers we will separate throughout this tutorial

Step 2: Know what query acceleration can and cannot fix

Name the exact thing you are testing

What it is good for

What it usually does not fix

The distinction that saves teams weeks

Step 3: Build a benchmark harness before touching settings

Define the benchmark variables up front

Use real query classes, not vanity demos

Create at least one analytical query that resembles production

Add a second query class

Step 4: Pick the slow patterns worth testing first

Prioritize cumulative pain, not just the single slowest run

Separate recurring analytics from one-off admin work

Step 5: Set up a fair Fabric warehouse acceleration test

Prerequisites

Define success before you run anything

Use two benchmark modes

Testing limitations to acknowledge up front

Step 6: Run the baseline and record cold versus warm behavior

Why cold-ish versus warm matters

Use a simple script to collect repeatable timings

If you prefer command-line batch execution

Validate correctness, not just speed

Step 7: Apply query acceleration and rerun the exact same workload

Change one thing only

Where to enable it in Fabric

Preserve comparability

Compare cold runs first

Step 8: Calculate p50 and p95 instead of celebrating the best run

Why percentiles matter

A concrete before/after example

If you are logging into a warehouse table

Step 9: Interpret the results with a decision framework

Case 1: Cold and warm both improve materially

Case 2: Only warm runs look better

Case 3: Single-user improves, concurrency degrades

Case 4: SQL gets faster, report still feels slow

Step 10: Tune the surrounding workload so acceleration can actually matter

Improve query shape first where possible

Reduce avoidable engine pressure

Coordinate with BI teams

Step 11: Validate production trade-offs before rollout

Cost

Concurrency

Isolation

Evidence quality

Ownership

Step 12: Use this one-week rollout plan

Day 1: Inventory top slow query classes

Day 2: Capture baseline

Day 3: Apply Query acceleration in a controlled test

Day 4: Test light concurrency and BI-facing latency

Day 5: Decide per workload class

Quick rollout checklist

Conclusion

Sources & References

Try it yourself

Read more

Azure Cosmos DB Is the Agent Memory Bet

Copilot Notebooks Could Rewrite Executive Memory

GitHub Copilot Defaults Just Moved Your Governance Line

Copilot Cowork GA Resets Microsoft 365 Automation