caching: do not try to fill the gap in log results cache when the new query interval does not overlap the cached query interval (#9757)
**What this PR does / why we need it**:
Currently, when we find a relevant cached negative response for a logs
query, we do the following:
* If the cached query completely covers the new query:
  * return an empty response.
* else:
  * fill the gaps on either/both sides of the cached query.
The problem with filling the gaps is that when the cached query does not
overlap at all with the new query, we have to extend the query beyond
what was requested. However, a logs query has a limit on the number of
lines that can be sent back in the response. So, the extended query
could return logs which were not requested, which then get filtered out
by the [response
extractor](https://github.com/grafana/loki/blob/b78d3f05525d8bcab13e621bc2e5851aadc8fc91/pkg/querier/queryrange/log_result_cache.go#L299),
unexpectedly resulting in an empty response. For example, suppose the
query was cached for start=15, end=20 and we get a `backwards` query for
start=5, end=10. To fill the gap, the query would be executed for
start=5, end=15. Now, if the range 10-15 has more logs than the query
`limit`, we would filter out all the data in the response extractor
and send back an empty response to the user.
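The failure described above can be reproduced with a small, self-contained simulation. This is only a sketch and not Loki code: the `entry` type and the `queryBackward` and `extract` helpers are hypothetical stand-ins, and half-open `[start, end)` intervals are an assumption made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// entry is a hypothetical log line identified only by its timestamp.
type entry struct{ ts int64 }

// queryBackward returns up to limit entries with start <= ts < end,
// newest first, mimicking how a backwards logs query fills its limit.
func queryBackward(store []entry, start, end int64, limit int) []entry {
	var out []entry
	for _, e := range store {
		if e.ts >= start && e.ts < end {
			out = append(out, e)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].ts > out[j].ts })
	if len(out) > limit {
		out = out[:limit]
	}
	return out
}

// extract keeps only the entries inside the originally requested
// interval, standing in for the response extractor.
func extract(entries []entry, start, end int64) []entry {
	var out []entry
	for _, e := range entries {
		if e.ts >= start && e.ts < end {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	// Two lines in the requested range 5-10, four lines in the gap 10-15.
	store := []entry{{6}, {7}, {11}, {12}, {13}, {14}}
	const limit = 3

	// Old behaviour: extend the query to [5, 15) to reach the cached [15, 20).
	extended := extract(queryBackward(store, 5, 15, limit), 5, 10)
	// New behaviour: run the query exactly as requested.
	direct := queryBackward(store, 5, 10, limit)

	fmt.Println(len(extended), len(direct)) // prints "0 2"
}
```

The extended query spends its whole limit on the lines at 12, 13 and 14, so nothing from the requested range survives extraction, while the query run as requested returns both matching lines.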
This PR fixes the issue by making the following changes when handling a
cache hit:
* If the cached query completely covers the new query:
  * return an empty response [_existing_].
* else if the cached query does not overlap with the new query:
  * do the new query as requested.
  * If the new query results in an empty response and has a higher
    interval than the cached query:
    * update the cache
* else:
  * query the data for missing intervals on both/either side [_existing_]
  * update the cache with extended intervals if the new queries resulted
    in an empty response [_existing_]
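The three branches above can be sketched as a small classification function. This is an illustrative sketch rather than the code in this PR: `interval`, `action`, `decide` and `shouldReplaceCache` are hypothetical names, intervals that merely touch are treated as overlapping (an assumption), and "higher interval" is read here as "longer duration".

```go
package main

import "fmt"

// interval is a hypothetical query time range.
type interval struct{ start, end int64 }

// action names the branch taken on a cache hit.
type action string

const (
	returnEmpty action = "return empty response"
	queryAsIs   action = "run query as requested"
	fillGaps    action = "query missing intervals"
)

// decide picks the branch for a new request given the cached
// (negative) query interval.
func decide(cached, req interval) action {
	switch {
	case cached.start <= req.start && req.end <= cached.end:
		// Cached query completely covers the new query.
		return returnEmpty
	case req.end < cached.start || cached.end < req.start:
		// No overlap: do not extend the query to reach the cache.
		return queryAsIs
	default:
		// Partial overlap: only the missing sides are queried.
		return fillGaps
	}
}

// shouldReplaceCache reports whether a non-overlapping query that came
// back empty should replace the cached entry, assuming "higher
// interval" means a longer time range.
func shouldReplaceCache(cached, req interval, respEmpty bool) bool {
	return respEmpty && req.end-req.start > cached.end-cached.start
}

func main() {
	cached := interval{15, 20}
	fmt.Println(decide(cached, interval{16, 19}))
	fmt.Println(decide(cached, interval{5, 10}))
	fmt.Println(decide(cached, interval{10, 18}))
	fmt.Println(shouldReplaceCache(cached, interval{0, 10}, true))
}
```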
**Special notes for your reviewer**:
We could further improve the handling of queries that do not overlap
with the cached query by selectively extending them based on the query
direction and on whether the cached query lies before or after the new
query. For example, if the new query is a `backwards` query and
`cachedQuery.End` < `newQuery.Start`, it should be okay to extend the
query to run from `cachedQuery.End` to `newQuery.End` to fill the cache,
since the query would first fill in the most relevant data before
hitting the limit. I did not want to complicate the fix, so I went
without implementing this approach. We can revisit it later if we feel
we need to improve our caching.
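A rough sketch of what such selective extension might look like is given below. It is not implemented in this PR. The `extendToCache` helper and its types are hypothetical, and the `forward` case is an assumption obtained by mirroring the `backwards` example above.

```go
package main

import "fmt"

// direction is a hypothetical stand-in for the query direction.
type direction int

const (
	backward direction = iota
	forward
)

// interval is a hypothetical query time range.
type interval struct{ start, end int64 }

// extendToCache returns the interval to execute and whether it was
// extended towards the cached interval. Extension is only allowed when
// the requested range is read before the gap, so the limit cannot be
// used up by lines outside the request.
func extendToCache(cached, req interval, dir direction) (interval, bool) {
	// Backwards query with the cache lying before it: the newest
	// (requested) data is read first, then the gap.
	if dir == backward && cached.end < req.start {
		return interval{cached.end, req.end}, true
	}
	// Assumed mirror case: forward query with the cache lying after it.
	if dir == forward && req.end < cached.start {
		return interval{req.start, cached.start}, true
	}
	// Otherwise run the query exactly as requested.
	return req, false
}

func main() {
	got, extended := extendToCache(interval{0, 5}, interval{10, 20}, backward)
	fmt.Println(got, extended)
}
```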
**Checklist**
- [x] Tests updated
- [x] `CHANGELOG.md` updated
---------
Co-authored-by: Travis Patterson <travis.patterson@grafana.com>
CHANGELOG.md: 1 addition & 0 deletions

@@ -56,6 +56,7 @@
 * [9495](https://github.com/grafana/loki/pull/9495) **thampiotr**: Promtail: Fix potential goroutine leak in file tailer.
 * [9650](https://github.com/grafana/loki/pull/9650) **ashwanthgoli**: Config: ensure storage config defaults apply to named stores.
 * [9629](https://github.com/grafana/loki/pull/9629) **periklis**: Fix duplicate label values from ingester streams.
+* [9757](https://github.com/grafana/loki/pull/9757) **sandeepsukhani**: Frontend Caching: Fix a bug in negative logs results cache causing Loki to unexpectedly send empty/incorrect results.
 * [9754](https://github.com/grafana/loki/pull/9754) **ashwanthgoli**: Fixes an issue with indexes becoming unqueriable if the index prefix is different from the one configured in the latest period config.
 * [9763](https://github.com/grafana/loki/pull/9763) **ssncferreira**: Fix the logic of the `offset` operator for downstream queries on instant query splitting of (range) vector aggregation expressions containing an offset.
 * [9773](https://github.com/grafana/loki/pull/9773) **ssncferreira**: Fix instant query summary statistic's `splits` corresponding to the number of subqueries a query is split into based on `split_queries_by_interval`.
 				return fmt.Errorf("unexpected response type %T", resp)
-			}
-			return nil
-		})
-	}
 
-	if err := g.Wait(); err != nil {
-		return nil, err
-	}
+	updateCache := false
+	// if the query does not overlap cached interval, do not try to fill the gap since it requires extending the queries beyond what is requested in the query.
+	// Extending the queries beyond what is requested could result in empty responses due to response limit set in the queries.