Hello,
I've been using the Metrics platform in production for about 2 months and have quite a few dashboards set up in Grafana. Since about a week ago, nothing works anymore.
At first it reported "unauthorized access" in the UI for some queries only. Then the proxy option for the backend stopped working, and now (I suppose after the latest Grafana update) even direct access to the Prometheus backend is broken.
In my admin panel, I can't even see any of the usual backend URLs under "platforms".
Can someone check what is wrong?
[Metrics] Grafana completely broken (fixed)
Hello @JonathanTron,
you should not see 'unauthorized access' anymore. If it still occurs often, can you check your tokens?
For your broken dashboards, we have an issue in the latest version of our query proxy, which responds to CORS requests with an invalid __Access-Control-Allow-Origin__ header.
We are on it and will fix it quickly.
About the manager: the backends list doesn't handle switching between Metrics projects well. Refreshing the page will show you the backends. We are forwarding this issue to the team in charge.
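For context, the validity rule for that header can be sketched in a few lines (this is an illustrative check, not OVH's proxy code): a valid `Access-Control-Allow-Origin` value is `*`, `null`, or a single `scheme://host[:port]` origin with no path, query, or fragment.

```python
from urllib.parse import urlsplit

def is_valid_allow_origin(value: str) -> bool:
    """Return True if a value is acceptable for the
    Access-Control-Allow-Origin response header: "*", "null",
    or a single scheme://host[:port] origin with no path."""
    if value in ("*", "null"):
        return True
    parts = urlsplit(value)
    return (
        parts.scheme in ("http", "https")
        and parts.hostname is not None
        and parts.path == ""
        and parts.query == ""
        and parts.fragment == ""
    )
```

A browser rejects the response (and Grafana sees a CORS error) when the header carries anything else, such as an origin with a path appended.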
Hi @miton18,
The "unauthorized access" is fixed when going directly to the new URL (i.e. https://grafana.metrics.ovh.net), but the OVH Admin link still points to the old one (i.e. https://grafana.tsaas.ovh.com/login), which then doesn't let you log in properly: no matter what the message at the bottom of the auth page says, it still redirects you to the old URL.
Yes, the dashboard queries spit out a lot of errors because of CORS.
Thanks for the answer.
Yes, an update of the manager will set the right link to Grafana.
The CORS patch has been deployed to production. Do you still have issues with your dashboards?
Thanks, CORS issues are gone now, but the dashboards are still broken.
I'm using a pretty standard templating variable with a Prometheus backend (Query: `label_values(alias)`), which results in a query being sent to the backend at `https://prometheus.gra1.metrics.ovh.net/api/v1/label/alias/values`, but the response is a `400 Bad Request` with the message: `Unprocessable Entity: label`.
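For reference, Grafana's `label_values(<label>)` template query maps to Prometheus's label-values HTTP API endpoint, `/api/v1/label/<label>/values`. A minimal sketch of the URL construction (the host is taken from this thread; the helper name is mine):

```python
from urllib.parse import quote

BASE = "https://prometheus.gra1.metrics.ovh.net"

def label_values_url(label: str) -> str:
    """Build the Prometheus HTTP API URL that Grafana's
    label_values(<label>) template query is translated to."""
    return f"{BASE}/api/v1/label/{quote(label, safe='')}/values"
```

`label_values_url("alias")` yields exactly the URL quoted above.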
Hi Jonathan,
We found an issue with the api/v1/label call that is now resolved. Is it working better now?
Hi @PierreZ,
yes, `label_values(alias)` works again.
Now I have a lot of queries that don't return anything anymore, even though there is data for them when I query from another datasource type (e.g. Graphite).
Here's an example:
Query: `pg_stat_database_tup_fetched{datname=~"$datname", instance=~"$instance"} != 0`
generated request: `https://prometheus.gra1.metrics.ovh.net/api/v1/query_range?query=pg_stat_database_tup_fetched%7Bdatname%3D~%22%22%2C%20instance%3D~%22%22%7D%20!%3D%200&start=1519664444&end=1519750844&step=240`
Response: `{"status":"success","data":{"resultType":"matrix","result":null}}`
Those were obviously working fine 2 weeks ago.
URL-decoding your request gave me this:
pg_stat_database_tup_fetched{datname=~"", instance=~""} != 0
It seems that your Grafana variables $datname and $instance are not set. How do you populate them?
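The decoding step can be reproduced with Python's standard library (the encoded string is copied verbatim from the `query=` parameter of the request above):

```python
from urllib.parse import unquote

# Encoded PromQL query, as seen in the query_range request.
encoded = "pg_stat_database_tup_fetched%7Bdatname%3D~%22%22%2C%20instance%3D~%22%22%7D%20!%3D%200"

# unquote turns %7B/%7D/%3D/%22/%2C/%20 back into { } = " , and spaces.
decoded = unquote(encoded)
print(decoded)
```

The output makes the empty `=~""` matchers immediately visible.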
Yes, the variables are not there, but it should then return any `pg_stat_database_tup_fetched` series; at least that's what it did before.
Even after removing the variables, the series returns nothing. I checked using the Warp10 Quantum UI and there is data in there: `[ $TOKEN 'pg_stat_database_tup_inserted' {} NOW 30 s ] FETCH` returns some data with the expected `instance` tag.
Hi Jonathan!
> 'pg_stat_database_tup_inserted' {}
and
> 'pg_stat_database_tup_fetched' { 'datname' '' 'instance' '' }
are different:
1. pg_stat_database_tup_**inserted** is different than pg_stat_database_tup_**fetched**
2. The first case matches every GTS with a classname equal to 'pg_stat_database_tup_inserted', whatever the labels. The second case is a valid GTS selector but will not match anything, because label values cannot be empty.
Can you confirm that you have the right time series, using a query like this one:
> [ $TOKEN '~pg_stat_database_tup_.*' {} ] FIND
?
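For what it's worth, a FIND like the one above can also be run outside Quantum by POSTing WarpScript to Warp10's `/api/v0/exec` endpoint. A sketch of building that request (the host and token are placeholders, and the request is only constructed here, not sent):

```python
import urllib.request

HOST = "https://warp10.example.net"  # placeholder, not the real endpoint
TOKEN = "READ_TOKEN"                 # placeholder read token

# WarpScript FIND over every series whose classname matches the regex.
script = f"[ '{TOKEN}' '~pg_stat_database_tup_.*' {{}} ] FIND"

req = urllib.request.Request(
    f"{HOST}/api/v0/exec",
    data=script.encode("utf-8"),
    method="POST",
)
# urllib.request.urlopen(req) would return the matching GTS descriptors as JSON.
```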
Hi @PierreZ,
yes, my mistake for using a different name; I've already checked with a lot of them to ensure my data was ingested correctly, and yes, the time series are returned as expected when using the Warp10 backend directly.
As far as I understand, you've developed an API compatibility layer on top of your Warp10 backend for other protocols (Graphite, Prometheus, etc.).
I'm using the Prometheus backend, and my mention of Warp10 was just to make sure I wasn't hitting a bug in the other backend integration.
Given that all my dashboards were working 2 weeks ago but some queries no longer work, all I can say is that something changed in the backend... maybe empty label values from a Prometheus query were previously just ignored when translating to Warp10...
Thanks for the verification @JonathanTron !
The selector is wrong, so it will not return anything.
You are correct. In our core, we run the distributed version of Warp10. When you use another protocol, we translate the queries into the Warp10 format.
As PromQL support is in preview, we are constantly improving the translation to WarpScript. I'm currently digging into our code to see what went wrong.
That is one possibility; I am checking it right now. I need to check how Prometheus behaves in this case.
Thanks @PierreZ !
@JonathanTron, we are also available on Gitter.im/ovh-metrics, if you prefer chat-based discussion. It may make resolving your issue easier!
The issue is fixed, thanks to @PierreZ and @miton18.