mirror of https://github.com/haproxy/haproxy.git (synced 2026-04-21 22:28:41 -04:00)
Reinhard Vicinus reported that the reported average response times cannot be larger than 16s, because the double multiply performed by swrate_add() overflows very quickly. Indeed, with N=512, the highest average value is 16448.

One solution proposed by Reinhard is to switch to long long, but this involves 64x64 multiplies and 64->32 divides, which are extremely expensive on 32-bit platforms.

There is in fact another way to avoid the overflow without using larger integers: avoid the multiply altogether by using the fact that x*(N-1)/N = x-(x/N). It then becomes possible to store average values as large as 8.4 million, which is around 2h18mn.

Interestingly, this improvement also makes the code cheaper to execute on both 32-bit and 64-bit platforms:

Before:

    00000000 <swrate_add>:
       0:   8b 54 24 04             mov    0x4(%esp),%edx
       4:   8b 0a                   mov    (%edx),%ecx
       6:   89 c8                   mov    %ecx,%eax
       8:   c1 e0 09                shl    $0x9,%eax
       b:   29 c8                   sub    %ecx,%eax
       d:   8b 4c 24 0c             mov    0xc(%esp),%ecx
      11:   c1 e8 09                shr    $0x9,%eax
      14:   01 c8                   add    %ecx,%eax
      16:   89 02                   mov    %eax,(%edx)

After:

    00000020 <swrate_add>:
      20:   8b 4c 24 04             mov    0x4(%esp),%ecx
      24:   8b 44 24 0c             mov    0xc(%esp),%eax
      28:   8b 11                   mov    (%ecx),%edx
      2a:   01 d0                   add    %edx,%eax
      2c:   81 c2 ff 01 00 00       add    $0x1ff,%edx
      32:   c1 ea 09                shr    $0x9,%edx
      35:   29 d0                   sub    %edx,%eax
      37:   89 01                   mov    %eax,(%ecx)

This fix may be backported to 1.6.
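The two forms compared in the commit message can be sketched in C as follows. This is a minimal illustration reconstructed from the two disassembly listings, not a copy of the HAProxy source: the names `swrate_add_mul`, `swrate_add_sub` and the helper `avg_after` are hypothetical, and the sum is assumed to be an unsigned 32-bit integer. The subtract form rounds `sum/N` up, which makes it return exactly the same value as the multiply form whenever the multiply does not overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply-first form, matching the "Before" listing (shl, sub, shr, add).
 * The product sum * (n - 1) wraps modulo 2^32 once sum exceeds
 * 2^32 / (n - 1), which caps the average that can be reported. */
static uint32_t swrate_add_mul(uint32_t *sum, uint32_t n, uint32_t v)
{
    return *sum = *sum * (n - 1) / n + v;
}

/* Subtract form, matching the "After" listing (add, add 0x1ff, shr, sub).
 * floor(sum * (n - 1) / n) equals sum - ceil(sum / n), so no intermediate
 * value grows much beyond the sum itself. */
static uint32_t swrate_add_sub(uint32_t *sum, uint32_t n, uint32_t v)
{
    return *sum = *sum - (*sum + n - 1) / n + v;
}

/* Hypothetical helper for experimenting: feed `count` identical samples
 * of value `v` into a fresh sum and return the resulting average. */
static uint32_t avg_after(int use_mul, uint32_t n, uint32_t v, unsigned count)
{
    uint32_t sum = 0;

    assert(n > 0);
    while (count--) {
        if (use_mul)
            swrate_add_mul(&sum, n, v);
        else
            swrate_add_sub(&sum, n, v);
    }
    return sum / n;
}
```

For example, calling `avg_after(1, 512, 60000, 20000)` and `avg_after(0, 512, 60000, 20000)` feeds a steady 60000 into both forms. The multiply form cannot report anything close to 60000, because a wrapped 32-bit product divided by 512 is at most 8388607, while the subtract form settles just below the sample value.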
| Name |
|---|
| .. |
| common |
| import |
| proto |
| types |