The fundamental issue here is that variables defined with "label_values" return every value of the label that Prometheus still has data for, i.e. everything within its retention period, which IIRC defaults to 15 days. So if a bucket existed at any point in that window, it will be offered as a candidate (and similarly for nodes, clusters...).
I imagine we could switch to using "query_result" with $__range to constrain the variable to the time range selected in the dashboard, so that it only picks up time series that existed in that period. My concern is that this would increase load on Prometheus, because I imagine /api/v1/label/$label/values is considerably faster than parsing and executing PromQL. It could also lead to confusing cases where a node becomes unavailable to select, e.g. if that node goes down for whatever reason and Prometheus doesn't scrape it at all during the $__range period.
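To make the comparison concrete, here is a rough sketch of what the two variable definitions might look like, written as small Python helpers that just build the strings you would paste into Grafana. The metric name "kv_ops" is made up for illustration, and the sample result line is my assumption about how "query_result" renders a series (labels, value, timestamp), so treat the regex as a starting point rather than a tested dashboard change.

```python
import re

# Hypothetical names, for illustration only.
METRIC = "kv_ops"
LABEL = "bucket"


def label_values_query(label):
    """The current style of definition: a plain label_values() variable."""
    return f"label_values({label})"


def query_result_query(metric, label):
    """Proposed alternative: only series with samples inside $__range."""
    return f"query_result(count by ({label}) (count_over_time({metric}[$__range])))"


def variable_regex(label):
    """Value for the variable's Regex field, to pull the label out of each result."""
    return f'/{label}="([^"]+)"/'


def extract_value(result_line, label):
    """Apply the same capture in Python, to sanity-check the regex."""
    match = re.search(rf'{re.escape(label)}="([^"]+)"', result_line)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(label_values_query(LABEL))
    print(query_result_query(METRIC, LABEL))
    print(variable_regex(LABEL))
    # Assumed shape of one query_result entry.
    print(extract_value('{bucket="travel-sample"} 42 1700000000000', LABEL))
```

The "count by" wrapper is there so each label value shows up once, however many series carry it. Note that this is exactly the PromQL evaluation I'm worried about cost-wise, and a node with no samples in $__range simply won't appear.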
There are no good solutions here, just multiple bad ones, and we need to pick the least bad of them.