[personal profile] robbat2

As a few folks are aware, I've been working on the new Gentoo Bugzilla. While I have it up now (bugstest.gentoo.org), I'm not sure how the new system will cope under load: I've never managed a Bugzilla installation before, nor had anything to do with the existing Gentoo Bugzilla instance.

This of course necessitates a performance analysis of the existing Bugzilla 8-). Read on for statistics on the matter, including the top 3 most active users in the Gentoo Bugzilla.

I got a copy of the logs from the main Gentoo Bugzilla, and I've been getting some statistics out of them. The logs span the two weeks from 2006/10/26 through 2006/11/08, inclusive (I actually have part of 2006/11/09 as well, but I've excluded it since it isn't a complete day).

Unique IPs (daily average)
  • 4211 unique IPs perform a GET each day.
  • 427 unique IPs perform a POST each day.

However, there is something odder. Over the complete 2 weeks, there are 4079 unique IP addresses that performed a POST and 38890 unique IP addresses that performed a GET. Also weirdly, 207 unique IP addresses did a POST, but not any GETs.
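A minimal sketch of how figures like these could be pulled out of an access log. It assumes Apache "combined"-style lines (client IP first, request line in quotes); the post does not say what format the real logs were in, and the names `LINE_RE`, `unique_ips_by_method` and `post_only_ips` are made up for illustration.

```python
import re

# Assumed log layout: IP, two ignored fields, [timestamp], "METHOD /path ...".
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(GET|POST) (\S+)')


def unique_ips_by_method(lines):
    """Return {'GET': set_of_ips, 'POST': set_of_ips} for the given log lines."""
    ips = {"GET": set(), "POST": set()}
    for line in lines:
        m = LINE_RE.match(line)
        if m:  # silently skip other methods and malformed lines
            ip, method, _target = m.groups()
            ips[method].add(ip)
    return ips


def post_only_ips(ips):
    """IPs that issued at least one POST but never a GET."""
    return ips["POST"] - ips["GET"]


sample = [
    '10.0.0.1 - - [26/Oct/2006:00:00:01 +0000] "GET /show_bug.cgi?id=1 HTTP/1.1" 200 512',
    '10.0.0.1 - - [26/Oct/2006:00:00:05 +0000] "POST /process_bug.cgi HTTP/1.1" 200 128',
    '10.0.0.2 - - [26/Oct/2006:00:01:00 +0000] "POST /post_bug.cgi HTTP/1.1" 200 128',
]
ips = unique_ips_by_method(sample)
print(len(ips["GET"]), len(ips["POST"]), sorted(post_only_ips(ips)))
```

Running the same function once per day and averaging the set sizes would give the daily figures, while running it over the whole period gives the two-week totals.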

Now let's focus on the interesting pages only. By interesting, I'm excluding all pages that are seldom hit (like the custom graphing, or the activity log), or are static pages that don't need any processing before the server can send them.

POST queries (daily average):
  • 222 logins.
  • 139 new bugs (post_bug.cgi).
  • 818 changes to existing bugs (process_bug.cgi).
  • 102 attachments (attachment.cgi).

Top three Bugzilla users, by percentage of daily changes to bugs:
  1. 13.50% - jakub
  2. 1.85% - vapier
  3. 1.18% - flameeyes

(Now I see why it seems that jakub complains so often about Bugzilla being slow - he uses it significantly more than anybody else!).

GET queries (daily average):
  • 8929 loads of a specific bug (show_bug.cgi).
  • 1852 loads of an attachment (attachment.cgi).
  • 6123 loads of the list of bugs (buglist.cgi).
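The per-script daily averages, for both the POST and GET lists, could be computed along these lines. This is only a sketch under the assumption of Apache "combined"-style log lines; `daily_average_hits` is a hypothetical helper, and it divides by the number of distinct dates actually seen in the input.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed layout: ... [DD/Mon/YYYY:HH:MM:SS zone] "METHOD /path?query ..."
LINE_RE = re.compile(r'^\S+ \S+ \S+ \[([^:\]]+):[^\]]+\] "(GET|POST) (\S+)')


def daily_average_hits(lines):
    """Map (method, script) to the average number of hits per logged day."""
    hits = Counter()
    days = set()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        day, method, target = m.groups()
        days.add(day)
        # Strip the query string and any directory prefix: keep e.g. show_bug.cgi
        script = urlsplit(target).path.rsplit("/", 1)[-1]
        hits[(method, script)] += 1
    n_days = len(days) or 1  # avoid dividing by zero on empty input
    return {key: count / n_days for key, count in hits.items()}
```

Filtering the result to the handful of scripts of interest (`show_bug.cgi`, `buglist.cgi`, `attachment.cgi`, and so on) leaves out the seldom-hit and static pages.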

Next up: a time-series distribution of buglist.cgi requests, with a curve fitted to it.

Why buglist.cgi? Because depending on the query, it is very heavy on the database. (TODO: include an actual graph here. I did graph it, but my graph is ugly, so I'm not including it right now).

Conclusions from the graph:
  • The daily minimum is around 03h30 UTC, when there are 5 requests per minute.
  • The daily maximum is around 17h30 UTC, when there are 10 requests per minute.
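The minimum and maximum above come from a graph that isn't shown, so here is a rough sketch of one way to get a comparable time-of-day profile: bin the request timestamps into fixed slots and average over the days covered. The 30-minute slot width and the function name `rate_by_time_of_day` are assumptions for illustration, not what was actually used, and no curve fitting is attempted.

```python
from collections import Counter
from datetime import datetime


def rate_by_time_of_day(timestamps, slot_minutes=30):
    """Average requests per minute for each time-of-day slot.

    Keys are the slot start in minutes after midnight (e.g. 1050 for 17h30).
    Slots with no requests at all are simply absent from the result.
    """
    counts = Counter()
    days = set()
    for ts in timestamps:
        days.add(ts.date())
        counts[(ts.hour * 60 + ts.minute) // slot_minutes] += 1
    n_days = len(days) or 1
    return {
        slot * slot_minutes: count / (n_days * slot_minutes)
        for slot, count in counts.items()
    }


# Tiny synthetic example: 60 requests in the 17h30 slot, 30 in the 03h30 slot.
stamps = [datetime(2006, 10, 26, 17, 30 + i % 30) for i in range(60)]
stamps += [datetime(2006, 10, 26, 3, 30 + i % 30) for i in range(30)]
rates = rate_by_time_of_day(stamps)
busiest = max(rates, key=rates.get)
print(f"{busiest // 60:02d}h{busiest % 60:02d}", rates[busiest])
```

The timestamps are assumed to already be in UTC and restricted to buglist.cgi requests before being passed in.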

I should bring in queue simulation stuff here now to continue this analysis, but it's after 6am, and I'm tired of this.

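My intermediate conclusion is that we need to ensure buglist.cgi returns a result in under 6 seconds, to avoid going unbounded: at the peak of 10 requests per minute, a new request arrives every 6 seconds on average, so anything slower lets the backlog grow without limit. This is, however, a gross over-simplification of the matter, since we only had a single read-slave before, and we now have two read-slaves to distribute queries to.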
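The arithmetic behind the 6-second figure can be written down as a small sketch. It only checks the basic stability condition (arrival rate times service time, divided by the number of servers, must stay below 1) and assumes queries are spread evenly over identical read-slaves; it is not the queue simulation mentioned above, and both function names are hypothetical.

```python
def max_service_time_seconds(requests_per_minute, servers=1):
    """Mean service time at which utilization reaches 1 (the stability limit)."""
    return servers * 60.0 / requests_per_minute


def utilization(requests_per_minute, service_time_seconds, servers=1):
    """Fraction of server capacity in use; at 1.0 or above the queue grows unbounded."""
    return requests_per_minute * service_time_seconds / (60.0 * servers)


print(max_service_time_seconds(10))             # peak load, one read-slave
print(max_service_time_seconds(10, servers=2))  # peak load, two read-slaves
print(utilization(10, 3.0, servers=2))          # e.g. 3-second queries
```

In practice waiting times get long well before utilization reaches 1, since arrivals and query times vary, so these limits are upper bounds rather than targets.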
