
Performance Analytics

I was thinking that administrators (I love statistics, lol) might be interested in analytics and statistics on how well, or how badly, the process is handling requests.

There could be two parts:

Part 1 - A simple timer which averages the duration of every request. This way, we can pick up on times when Gosora is running particularly slowly without relying on the performance probe, which may have CDN latency obscuring Gosora's true performance.

It will also gather data over a wider period of time, so temporary blips are more likely to show up.

Part 2 - One that fires a request once every fifteen seconds at its own domain to test the latency from the server to the CDN (if any) and back to the server. This was briefly mentioned in https://gosora-project.com/topic/performance-probe-analytics.93

This will be particularly useful for separating CDN latency from origin server latency, and for catching other slowdowns which arise outside Gosora proper, such as in the server's TCP stack or a little further out on the network, depending on how far the request travels before being routed back.


The following is a list of possible implementations for the request duration averager. This is mainly for developers and might go completely over your head if you don't have a technical background.

Option 1 - We could run a database query for every request. This would be so slow that it's not even funny. I've only included it here to show that I've considered it... before dismissing it. The last thing we need is a performance feature which does more harm than it prevents.

Option 2 - We could have an int64 counter and add the duration of each request to it with an atomic add. Once every fifteen minutes, the sum could be divided by the number of requests, that chunk written to the database, and the counter zeroed for the next chunk.

This would take two atomic instructions per request (one for the sum and one for the request count, although we might be able to borrow a different counter's data for the total number of requests).

We may require a big mutex for quickly swapping data out of the counters when co-ordinating writes to the database, but it shouldn't have much of an impact on individual requests.

This would be fairly fast, although without sufficient sharding there may be some degree of cacheline contention on 20-core processors handling an astronomical number of requests. This probably isn't relevant here, although if it becomes a problem, we could split the counter into several.

Also, the counter could in theory overflow. However, if we count one microsecond as 1 and every request took 100 seconds (astronomically slow), each request would add 100,000,000 to the counter, so it would still take around 92 billion requests within a single fifteen-minute chunk (over 100 million requests per second) to overflow a single int64 counter.
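
For anyone who wants to check the overflow headroom, here is a rough worked calculation in Go. It assumes the counter holds microseconds, every request takes 100 seconds, and the counter is zeroed every fifteen minutes as described for Option 2. requestsToOverflow is just a made-up helper name.

```go
package main

import (
	"fmt"
	"math"
)

// requestsToOverflow returns how many requests of the given duration
// (in microseconds) fit into an int64 sum before it overflows.
func requestsToOverflow(perRequestUs int64) int64 {
	return math.MaxInt64 / perRequestUs
}

func main() {
	const perRequestUs = 100 * 1_000_000 // 100 seconds in microseconds
	const chunkSeconds = 15 * 60         // counter is zeroed every 15 minutes

	n := requestsToOverflow(perRequestUs)
	fmt.Println(n)                // 92233720368 requests per chunk
	fmt.Println(n / chunkSeconds) // 102481911 requests per second
}
```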

DoS attacks aren't a concern here either, as the default timeouts for requests should be sufficient to lower the possible durations below ten seconds.

Option 3 - Like Option 2, but using mutexes instead of atomics. We avoid building up too much lock contention here by using multiple counters, possibly one for each core, with one chosen at random for every request.

Edit: I've made a number of little edits to help clarify various bits and pieces.

Edit 2: Option 3 is back.

Edit 3: Adding musings as I ponder over the best implementation for this thing.