2013-08-20

Busy wait, Sleep(0), Sleep(1), Sleep(1000)

* Busy wait, or spin wait: not required in most cases
* Sleep(0): fairly good choice
  * if you don't know how long to "sleep", or need to "sleep" less than 1ms
  * if it's not constrained to a single CPU/core
  * if you don't care about the "high" CPU usage
  * if you need an immediate response
* Sleep(1): "sleep" a little bit, but normally more than 1ms
  * if you don't need an immediate response
  * if you'd like to give up the time slice anyway
  * if you want a "lower" CPU usage than Sleep(0)
* Sleep(1000): just an arbitrarily selected number
  * if the event you are waiting for is not likely to happen soon
  * if you are waiting for another time-consuming counterpart
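
A minimal sketch of the choices above, in portable C++ rather than the Win32 API. It rests on assumptions: `std::this_thread::yield()` only stands in for Sleep(0), `sleep_for` stands in for Sleep(1) and Sleep(1000), and the names `WaitStyle` and `wait_until` are made up for this note. The real scheduler behaviour on Windows may differ.

```cpp
#include <chrono>
#include <thread>

// Hypothetical names, for illustration only.
enum class WaitStyle { Spin, Yield, SleepShort, SleepLong };

// Polls done() until it returns true, pausing between polls according to
// `style`. Returns how many times the condition was found to be false.
template <class Pred>
unsigned long long wait_until(Pred done, WaitStyle style) {
    unsigned long long polls = 0;
    while (!done()) {
        ++polls;
        switch (style) {
            case WaitStyle::Spin:        // busy wait: keep the core
                break;
            case WaitStyle::Yield:       // stand-in for Sleep(0)
                std::this_thread::yield();
                break;
            case WaitStyle::SleepShort:  // stand-in for Sleep(1)
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
                break;
            case WaitStyle::SleepLong:   // stand-in for Sleep(1000)
                std::this_thread::sleep_for(std::chrono::milliseconds(1000));
                break;
        }
    }
    return polls;
}
```

In practice `done` would typically read a `std::atomic<bool>` that another thread sets. The poll count gives a rough idea of how much CPU each style burns while waiting.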

"Sleep()" is not sleep, but tells the scheduler "Don't bother me for at least some time".

Sleep(0) has a special meaning and is handled differently from other values.
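
The "at least some time" part is easy to observe. The sketch below uses `std::this_thread::sleep_for` and `std::chrono::steady_clock` as portable stand-ins for Sleep() and a timer. `measured_sleep_us` is a hypothetical helper name, not anything from the Win32 API.

```cpp
#include <chrono>
#include <thread>

// Requests a sleep of `requested` and returns the wall time that actually
// passed, in microseconds. The result is normally larger than the request.
inline long long measured_sleep_us(std::chrono::milliseconds requested) {
    const auto start = std::chrono::steady_clock::now();
    std::this_thread::sleep_for(requested);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start)
        .count();
}
```

Calling `measured_sleep_us(std::chrono::milliseconds(1))` a few times and printing the results should show how far above 1ms the real wait lands on a given machine.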

WaitOnAddress() seems promising, but compared to my Sleep(0) version it:
* seems to cause more CPU usage
* takes more wall time

I will check the testing/benchmark code later, but that is how it looks for now.
And the documentation on MSDN, at the very least, does not seem to reflect the implementation.

I have already sent comments and questions about the documentation and the underlying tech by email, via the link at the bottom of the page.
I don't expect any (immediate) response. But until I get one, I will stop here and not use or test WaitOnAddress any further.

Some obvious questions:
* Parameter dwMilliseconds
  * What does "indefinite" mean, and what happens if 0 is given?
  * Is it an optional parameter?
* The example code
  * Why does it not check the return value?
  * Have you noticed the race condition in how CapturedValue is used?

Sleep(0) might be further improved, in both performance/speed and power consumption. I'm not sure about the first claim, nor do I have any knowledge of power consumption.
The implementation of ReaderWriterLockSlim (C#) raises some questions for me:
* Why not use the existing Win32 API, like AcquireSRWLockExclusive?
* Why busy wait, then Sleep(0), then Sleep(1)?
* And how were the arbitrary numbers for busy wait and Sleep(0) chosen?
* Does it give the best performance, or does it make some compromises?
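
To make the "busy wait, then Sleep(0), then Sleep(1)" pattern concrete, here is a rough sketch in portable C++. It is not the ReaderWriterLockSlim code. The thresholds `kSpinLimit` and `kYieldLimit` are numbers I made up, all names are hypothetical, and `yield()` / `sleep_for` only approximate Sleep(0) / Sleep(1).

```cpp
#include <chrono>
#include <thread>

// Hypothetical thresholds, chosen only for illustration.
constexpr unsigned kSpinLimit = 10;   // attempts 0..9: busy wait
constexpr unsigned kYieldLimit = 15;  // attempts 10..14: yield, then sleep

enum class BackoffStep { Spin, Yield, Sleep };

// Decides which kind of pause to use for a given attempt number.
inline BackoffStep step_for(unsigned attempt) {
    if (attempt < kSpinLimit) return BackoffStep::Spin;
    if (attempt < kYieldLimit) return BackoffStep::Yield;
    return BackoffStep::Sleep;
}

// Performs the pause chosen by step_for().
inline void back_off(unsigned attempt) {
    switch (step_for(attempt)) {
        case BackoffStep::Spin:
            break;  // busy wait: just retry
        case BackoffStep::Yield:
            std::this_thread::yield();  // stand-in for Sleep(0)
            break;
        case BackoffStep::Sleep:        // stand-in for Sleep(1)
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
            break;
    }
}

// Retries done() with escalating pauses; returns the number of failed polls.
template <class Pred>
unsigned wait_with_backoff(Pred done) {
    unsigned attempt = 0;
    while (!done()) back_off(attempt++);
    return attempt;
}
```

The idea behind the escalation is that a short wait never leaves the core, while a long wait ends up costing little CPU. Where to put the two thresholds is exactly the open question in the list above.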

Anyway, it seems to be the same approach the guys from Intel took.
I will give it a try when I have time.
But, actually, I don't care about performance too much. So, unless there is a big improvement (like 10x), I will stick with the simple Sleep(0).

So far, it has only been tested against wall time (CPU instruction).
Maybe I'll also test it against wall time (performance counter), and analyze the stdev besides just the avg.
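
For the avg/stdev part, something like the following would do. It is a small sketch with hypothetical names (`Stats`, `summarize`), and it assumes the sample standard deviation (dividing by n - 1) is what is wanted.

```cpp
#include <cmath>
#include <vector>

struct Stats {
    double mean;
    double stdev;
};

// Computes the mean and the sample standard deviation (n - 1 denominator)
// of a set of timing samples. With fewer than two samples, stdev stays 0.
inline Stats summarize(const std::vector<double>& samples) {
    Stats s{0.0, 0.0};
    if (samples.empty()) return s;

    double sum = 0.0;
    for (double x : samples) sum += x;
    s.mean = sum / static_cast<double>(samples.size());

    if (samples.size() < 2) return s;
    double squared = 0.0;
    for (double x : samples) {
        const double d = x - s.mean;
        squared += d * d;
    }
    s.stdev = std::sqrt(squared / static_cast<double>(samples.size() - 1));
    return s;
}
```

Feeding it the per-run wall times of each variant would show not only which one is faster on average, but also which one is more stable.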