Hello,
I've been using the Metrics platform in production for about 2 months and have quite a few dashboards set up in Grafana. Since about a week ago, nothing works anymore.
At first it reported "unauthorized access" in the UI for some queries only. Then the proxy option for the backend stopped working, and now (I suppose after the latest Grafana update) even direct access to the Prometheus backend is broken.
In my admin panel, I can't even see any of the usual backend URLs under "platforms".
Can someone check what is wrong?
[Metrics] Grafana completely broken (fixed)
Hello @JonathanTron,
you should not see 'unauthorized access' anymore. If it still occurs often, can you check your tokens?
For your broken dashboards, we have an issue in the latest version of our query proxy, which responds to CORS requests with an invalid __Access-Control-Allow-Origin__ header.
We are on it and will fix it quickly.
About the manager: the backends list doesn't handle switching between Metrics projects well. Refreshing the page will show you the backends. We are forwarding this issue to the team in charge.
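For context, the validity rule for that header can be sketched in a few lines (this is an illustrative check, not OVH's proxy code): a valid `Access-Control-Allow-Origin` value is `*`, `null`, or a single `scheme://host[:port]` origin with no path, query, or fragment.

```python
from urllib.parse import urlsplit

def is_valid_allow_origin(value: str) -> bool:
    """Return True if a value is acceptable for the
    Access-Control-Allow-Origin response header: "*", "null",
    or a single scheme://host[:port] origin with no path."""
    if value in ("*", "null"):
        return True
    parts = urlsplit(value)
    return (
        parts.scheme in ("http", "https")
        and parts.hostname is not None
        and parts.path == ""
        and parts.query == ""
        and parts.fragment == ""
    )
```

A browser rejects the response (and Grafana sees a CORS error) when the header carries anything else, such as an origin with a path appended.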
Hi @miton18,
The "unauthorized access" is fixed when going directly to the new URL (i.e. https://grafana.metrics.ovh.net), but the OVH Admin link still points to the old one (i.e. https://grafana.tsaas.ovh.com/login), which then doesn't let you log in properly: no matter what the message at the bottom of the auth page says, it still redirects you to the old URL.
Yes, the dashboard queries spit out a lot of errors because of CORS.
Thanks for the answer.
Yes, an update of the manager will set the right link to Grafana.
The CORS patch has been deployed to production. Do you still have issues with your dashboards?
Thanks, CORS issues are gone now, but the dashboards are still broken.
I'm using a pretty standard templating variable with a Prometheus backend (Query: `label_values(alias)`), which results in a query being sent to the backend at `https://prometheus.gra1.metrics.ovh.net/api/v1/label/alias/values`, but the response is a `400 Bad Request` with the message: `Unprocessable Entity: label`.
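For reference, Grafana's `label_values(<label>)` template query maps to Prometheus's label-values HTTP API endpoint, `/api/v1/label/<label>/values`. A minimal sketch of the URL construction (the host is taken from this thread; the helper name is mine):

```python
from urllib.parse import quote

BASE = "https://prometheus.gra1.metrics.ovh.net"

def label_values_url(label: str) -> str:
    """Build the Prometheus HTTP API URL that Grafana's
    label_values(<label>) template query is translated to."""
    return f"{BASE}/api/v1/label/{quote(label, safe='')}/values"
```

`label_values_url("alias")` yields exactly the URL quoted above.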
Hi Jonathan,
We found an issue with the api/v1/label call that is now resolved. Is it working better now?
Hi @PierreZ,
yes, `label_values(alias)` works again.
Now I have a lot of queries that don't return anything anymore, even though there is data for them when I query from another datasource type (e.g. Graphite).
Here's an example:
Query: `pg_stat_database_tup_fetched{datname=~"$datname", instance=~"$instance"} != 0`
generated request: `https://prometheus.gra1.metrics.ovh.net/api/v1/query_range?query=pg_stat_database_tup_fetched%7Bdatname%3D~%22%22%2C%20instance%3D~%22%22%7D%20!%3D%200&start=1519664444&end=1519750844&step=240`
Response: `{"status":"success","data":{"resultType":"matrix","result":null}}`
Those were obviously working fine 2 weeks ago.
URL-decoding your request gave me this:
pg_stat_database_tup_fetched{datname=~"", instance=~""} != 0
It seems that your Grafana variables $datname and $instance are not set. How do you populate them?
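The decoding step can be reproduced with Python's standard library (the encoded string is copied verbatim from the `query=` parameter of the request above):

```python
from urllib.parse import unquote

# Encoded PromQL query, as seen in the query_range request.
encoded = "pg_stat_database_tup_fetched%7Bdatname%3D~%22%22%2C%20instance%3D~%22%22%7D%20!%3D%200"

# unquote turns %7B/%7D/%3D/%22/%2C/%20 back into { } = " , and spaces.
decoded = unquote(encoded)
print(decoded)
```

The output makes the empty `=~""` matchers immediately visible.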
Yes, the variables are not there, but it should then return any `pg_stat_database_tup_fetched` series; at least that's what it did before.
Even after removing the variables, the series returns nothing. I checked using the Warp10 Quantum UI and there is data in there: `[ $TOKEN 'pg_stat_database_tup_inserted' {} NOW 30 s ] FETCH` returns some data with the expected `instance` tag.
Hi Jonathan!
> 'pg_stat_database_tup_inserted' {}
and
> 'pg_stat_database_tup_fetched' { 'datname' '' 'instance' '' }
are different:
1. pg_stat_database_tup_**inserted** is different than pg_stat_database_tup_**fetched**
2. The first case matches every GTS with a classname equal to 'pg_stat_database_tup_inserted', whatever the labels. The second case is a valid GTS selector but will not match anything, because label values cannot be empty.
Can you confirm that you have the right time series, using a query like this one:
> [ $TOKEN '~pg_stat_database_tup_.*' {} ] FIND
?
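For what it's worth, a FIND like the one above can also be run outside Quantum by POSTing WarpScript to Warp10's `/api/v0/exec` endpoint. A sketch of building that request (the host and token are placeholders, and the request is only constructed here, not sent):

```python
import urllib.request

HOST = "https://warp10.example.net"  # placeholder, not the real endpoint
TOKEN = "READ_TOKEN"                 # placeholder read token

# WarpScript FIND over every series whose classname matches the regex.
script = f"[ '{TOKEN}' '~pg_stat_database_tup_.*' {{}} ] FIND"

req = urllib.request.Request(
    f"{HOST}/api/v0/exec",
    data=script.encode("utf-8"),
    method="POST",
)
# urllib.request.urlopen(req) would return the matching GTS descriptors as JSON.
```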
Hi @PierreZ,
yes, my mistake for using a different name; I've already checked with a lot of them to ensure my data was ingested correctly, and yes, the time series are returned as expected when using the Warp10 backend directly.
As far as I understand, you've developed an API compatibility layer on top of your Warp10 backend for other protocols (Graphite, Prometheus, etc.).
I'm using the Prometheus backend, and my mention of Warp10 was just to make sure I wasn't hitting a bug in the other backend integration.
Given that all my dashboards were working 2 weeks ago but some queries no longer work, all I can say is that something changed in the backend... maybe empty label values from a Prometheus query were previously just ignored when translating to Warp10...
Thanks for the verification @JonathanTron !
The selector is wrong, so it will not return anything.
You are correct. In our core, we run the distributed version of Warp10. When you use another protocol, we translate the queries into the Warp10 format.
As PromQL support is in preview, we are constantly improving the translation to WarpScript. I'm currently digging into our code to see what went wrong.
That is one possibility; I am checking it right now. I need to check how Prometheus behaves in this case.
Thanks @PierreZ !
@JonathanTron, we are also available on Gitter.im/ovh-metrics, if you prefer chat-based discussion. It may make resolving your issue easier!
The issue is fixed, thanks to @PierreZ and @miton18.