Conversation
b8b2af8 to
24947d9
Compare
|
Interesting, I like where this is going but I find the query building code a bit difficult to follow, how did you determine the number of preloads in your assessment? I understand that for "time:minute" and "time:hour" we don't need them at all, but otherwise aren't preloads idempotent at Ecto level? |
|
Another minor optimization we could also do there is, we don't need to query for |
I noticed that loading the /sites page resulted in a lot of duplicate DB queries in local server logs but it was pretty hard to parse out what was going on. I captured the logs from a single page refresh and asked claude to figure out what's going on. My previous PR was found, this PR is the second issue. Manually verified with logs of a single sparkline request.
Note that there are 5 queries to this branch: There is 1 query to
They are but only in the case that the result is stored and re-used. We often don't. For example: site = ...
Repo.preload(site, :completed_imports) # Fires DB query. Returns site with completed_imports preloaded
Repo.preload(site, :completed_imports) # Fires DB query again because the preload from last line was discarded
site = Repo.preload(site, :completed_imports) # Preload completed imports and store the site with preloaded data in variable
Repo.preload(site, :completed_imports) # Does not fire DB query since the site variable now has preloaded data
Nice! I hadn't considered that. I don't think it's that minor because it means we'll avoid JOINing with sessions in clickhouse queries for site cards. That would be a significant win. |
We can include it in this PR if you like via #6109 |
Currently a call to
Query.buildwill result in reduntantRepo.preload(site, :completed_imports)calls:This PR makes two changes to preloading
completed_imports::unsupported_intervalor:unsupported_query. The biggest win here is that for daily stats (interval == "hour") we will do 0 preloads as opposed to 2 or 3 postgres roundtrips./sites sparklines
I looked into this because the separate sparkline graph queries on
/sitescurrently result in 5 identical preloads per site card: 3 preloads forquery_24h_statsand 2 forquery_24h_intervals. With default page_size of 24 this will result in 120 postgres queries per page load when one per site would suffice.With this PR, this will cut it down from 5 identical redundant queries per site to 0 because with
interval == "hour"the preloads are not needed. For longer time ranges in works by @aerosol, it will be 2 queries per site. It would be possible to cut it down to 1 per site at the cost of some complexity. For that theSparklines.overview_24hfunction would have to figure out whether imports are supported for the query and if so, run the preload before callingquery_24h_statsandquery_24h_intervals.However, this would require exposing some of the Query internals and I felt like it isn't worth it at the moment. One duplicated (surely cached on postgres side) query per site is not terrible.