We are implementing an enterprise-scale Power BI solution on Microsoft Fabric using DirectLake semantic models over Lakehouse Delta tables for an insurance analytics use case.
During implementation, we are observing multiple non-obvious behaviors where official documentation is limited.
Sometimes data appears only after a manual dataset refresh; other times it appears after several minutes without any action.
Note: No incremental refresh is configured (not supported in DirectLake)
Please clarify the above observations and answer the questions below.
1) How does DirectLake detect new Delta commits internally?
2) Is there a way to programmatically force DirectLake to re-read the latest Delta snapshot?
Hi @Savir,
When using DirectLake semantic models in Microsoft Fabric over Lakehouse Delta tables, the behavior you’re observing is expected and stems from how DirectLake controls data visibility. DirectLake does not continuously monitor Delta tables for changes. Instead, it only becomes aware of new data during a framing operation.
During framing, the semantic model reads the Delta transaction log (_delta_log) and records which Parquet files make up the latest committed snapshot of each table. All queries run strictly against that recorded snapshot. If new Delta commits occur after framing, they remain invisible until framing happens again. This explains why data sometimes appears only after a manual refresh, or “magically” after a few minutes—Fabric is performing an automatic background framing cycle, but the timing is asynchronous and capacity-dependent.
If you need deterministic control, you can explicitly force DirectLake to re-read the latest Delta snapshot. This is done by triggering a semantic model refresh in Power BI, which for DirectLake does not reload data but forces a new framing operation. You can do this manually, on a schedule via the model’s refresh settings, or programmatically using the Power BI REST API or a Fabric pipeline activity.
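For the REST route, a minimal sketch using the documented "Refresh Dataset In Group" endpoint is shown below. The workspace ID, dataset ID and token are placeholders you must supply yourself (for example via MSAL or a service principal); for a DirectLake model this call triggers reframing rather than a data import.

```python
import requests

# Placeholders - substitute your own workspace ID, dataset ID and AAD token
WORKSPACE_ID = "00000000-0000-0000-0000-000000000000"
DATASET_ID = "11111111-1111-1111-1111-111111111111"
TOKEN = "<bearer-token>"

def build_refresh_request(workspace_id: str, dataset_id: str) -> tuple[str, dict]:
    """Return the URL and body for the Power BI 'Refresh Dataset In Group' call."""
    url = (
        "https://api.powerbi.com/v1.0/myorg/"
        f"groups/{workspace_id}/datasets/{dataset_id}/refreshes"
    )
    return url, {"notifyOption": "NoNotification"}

def trigger_framing(workspace_id: str, dataset_id: str, token: str) -> int:
    """POST the refresh request; the service answers 202 Accepted when queued."""
    url, body = build_refresh_request(workspace_id, dataset_id)
    resp = requests.post(url, json=body, headers={"Authorization": f"Bearer {token}"})
    return resp.status_code
```

The same endpoint backs the scheduled refresh UI, so scripted and scheduled framing behave identically.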
If data in your lakehouse changes frequently via pipelines, you can use the Semantic model refresh pipeline activity to trigger the framing process after each load.
In a notebook, you can use the Semantic Link library to trigger the framing process for either the full model or specific tables.
import sempy.fabric as fabric
# Define the dataset and workspace
dataset = "YourDatasetName"
workspace = "YourWorkspaceName"
# Objects to refresh
objects_to_refresh = [
{"table": "YourTableName"}
]
# Refresh the dataset
fabric.refresh_dataset(workspace=workspace, dataset=dataset, objects=objects_to_refresh)
# List the refresh requests
fabric.list_refresh_requests(dataset=dataset, workspace=workspace)
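If you need your notebook or pipeline to block until framing is done, one approach is to poll the refresh status. This is a sketch: `get_status` is a placeholder for however you look up the request's status (for example, a lookup over the output of `fabric.list_refresh_requests`); in the Power BI refresh history an in-progress request reports the status "Unknown".

```python
import time

def wait_for_refresh(get_status, timeout_s=600, poll_s=5):
    """Poll a status callable until the refresh request finishes.

    get_status should return 'Unknown' while the refresh is in progress,
    and 'Completed' or 'Failed' once it has finished."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = get_status()
        if status in ("Completed", "Failed"):
            return status
        time.sleep(poll_s)
    raise TimeoutError("refresh did not finish in time")
```

Keep the timeout generous on smaller capacities, since framing requests are queued and capacity-dependent.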
Hope this helps. If so, please give kudos 👍 and mark as Accepted Solution ✔️ to help others.
Hi @Savir,
Checking in to see if your issue has been resolved. Let us know if you still need any assistance.
Thank you.
Hi @Savir,
Have you had a chance to review the solutions shared by @cengizhanarslan and @nielsvdc? If the issue persists, feel free to reply so we can help further.
Thank you.
In your semantic model options, if you enable "Keep your Direct Lake data up to date", the model will always use the latest version of the Delta table. If you disable it, the model will only use the exact version that was current when you last refreshed the semantic model.
If you enable this option, please make sure your data writing job to OneLake happens in one go without any significant delays. Otherwise you might be exposing incomplete data to the data consumers.
For example, if you write data to the header table, but writing data to the details table has a significant delay after the header table, the consumer will see the header information without any existing details. If this might be the case, explicitly invoking a refresh is the best solution instead of using the automatic refresh.
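The safe ordering can be sketched as below. The `write_header`, `write_details` and `refresh_model` callables are hypothetical stand-ins for your actual pipeline or notebook steps; the point is simply that all related tables are written before the single explicit framing call.

```python
def load_and_publish(write_header, write_details, refresh_model):
    """Write all related Delta tables first, then frame once, so that
    consumers never see header rows without their matching details."""
    write_header()
    write_details()
    refresh_model()  # single explicit framing after all writes finish

# Usage sketch with stub steps that just record the order of execution
events = []
load_and_publish(
    write_header=lambda: events.append("header"),
    write_details=lambda: events.append("details"),
    refresh_model=lambda: events.append("refresh"),
)
```

With automatic updates disabled, consumers keep seeing the previous snapshot throughout the load, and the new snapshot becomes visible atomically at the final refresh.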
Hope this helps. If so, please give kudos 👍 and mark as Accepted Solution ✔️ to help others. If you resolved your question, let us know what worked for you.