So here’s a bit of Windows Vista Ultimate 64-bit arcana for you… I was researching the performance and efficiency of different cluster sizes, so I wanted to know how many files of certain sizes were on my disk. I ran a series of searches, one for each cluster size I was considering, hoping to collect some data points for statistical analysis. Here’s what I ended up with, running Vista’s file search in “non-indexed” mode with hidden and system files included:
File Size | File Count |
<64KB | 72,781 |
<16KB | 53,480 |
<8KB | 42,696 |
<4KB | 31,542 |
<2KB | 15,822 |
<1KB | 19,528 |
<0.5KB | 10,058 |
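Did you notice something odd? That’s right, the number of files <1KB in size is greater than the number of files <2KB in size! This is mathematically impossible, of course: every file smaller than 1KB is also smaller than 2KB, so the count can only grow or stay the same as the size limit increases.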
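If you want to cross-check counts like these without relying on the search UI, here is a minimal Python sketch. It is not what I ran at the time. It assumes "<1KB" means strictly fewer than 1024 bytes, and the function names (`collect_file_sizes`, `count_below`, `is_monotonic`) are made up for illustration.

```python
import os
from bisect import bisect_left


def collect_file_sizes(root):
    """Return a sorted list of file sizes in bytes for every file under root.

    Entries that cannot be read (permissions, files that vanish mid-walk)
    are skipped, so the totals may be slightly lower than the true count.
    """
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                sizes.append(os.lstat(os.path.join(dirpath, name)).st_size)
            except OSError:
                continue
    sizes.sort()
    return sizes


def count_below(sorted_sizes, limit):
    """Number of files strictly smaller than limit bytes (assumed meaning of '<')."""
    return bisect_left(sorted_sizes, limit)


def is_monotonic(counts):
    """True if counts never decrease as the size limit increases."""
    return all(a <= b for a, b in zip(counts, counts[1:]))


# The <0.5KB, <1KB and <2KB counts from the table above, smallest limit first.
print(is_monotonic([10058, 19528, 15822]))  # False
```

To run it against a real drive, call `collect_file_sizes` once on the root folder and then `count_below` for each limit (512, 1024, 2048, and so on). Counts produced this way cannot go down as the limit goes up, which makes them a handy baseline to compare the search results against.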
Using a manual binary search, I finally arrived at the magic point: something weird happens between the 1272- and 1273-byte marks, as the following two screen shots illustrate (look at the upper right and lower left of each).
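For the curious, the manual binary search amounts to something like the sketch below. I did this by hand, re-running the search each time, so the `is_anomalous` callback is purely a stand-in for "run the search at this size and see whether the count looks inflated". The function name `find_boundary` and the assumption that the behaviour flips exactly once between the two starting points are mine.

```python
def find_boundary(is_anomalous, lo, hi):
    """Narrow down where a search result flips from anomalous to normal.

    lo must be a size limit known to give an anomalous result and hi one
    known to give a normal result. Assumes there is a single flip between
    them. Returns the adjacent pair (last anomalous, first normal).
    """
    if not is_anomalous(lo) or is_anomalous(hi):
        raise ValueError("lo must be anomalous and hi must be normal")
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_anomalous(mid):
            lo = mid  # the flip is above mid
        else:
            hi = mid  # the flip is at or below mid
    return lo, hi


# Stand-in predicate reproducing what I observed: inflated counts at 1272
# bytes and below (within this range), normal counts at 1273 and above.
print(find_boundary(lambda limit: limit <= 1272, 1024, 2048))  # (1272, 1273)
```

Starting from 1KB and 2KB, this takes ten probes, which is roughly how many searches it takes by hand as well.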
Logically, the second search should yield slightly fewer results, assuming there are a couple of files on the drive that are exactly 1273 bytes (in reality, there are exactly 15 files of 1273 bytes, and that should be the delta between the two searches). In fact, the second search yields more than twice as many!
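The expected delta is easy to state in code. This sketch assumes the size limit is inclusive (files of at most N bytes), which is what makes the 1273-byte files the difference between the two searches. The list of sizes is invented for the example; only the figure of 15 files at 1273 bytes comes from my drive.

```python
from collections import Counter


def expected_delta(sizes, n):
    """Difference between 'at most n bytes' and 'at most n - 1 bytes' counts."""
    at_most_n = sum(1 for s in sizes if s <= n)
    at_most_prev = sum(1 for s in sizes if s <= n - 1)
    return at_most_n - at_most_prev


# Made-up sample: a few 1272-byte files, fifteen 1273-byte files, some larger.
sizes = [1272] * 3 + [1273] * 15 + [2000] * 4

# The delta is always the number of files of exactly n bytes.
print(expected_delta(sizes, 1273))   # 15
print(Counter(sizes)[1273])          # 15
```

Whatever the mix of files, the smaller limit can never return more results than the larger one, let alone twice as many.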
I was hoping I could narrow down what was going on by searching for specific file types instead of the *.* pattern, but as soon as I did that, everything seemed to work. Interestingly, when I then went back to the *.* pattern, the 1272B search produced a correct (lower) number! However, when I then ran a 1KB search I got the higher number again, and when I repeated the 1272B search I again got the higher number.
Pretty strange, huh?
In case you’re wondering:
Intel E8400
8GB DDR800
Windows Vista Ultimate, 64-bit, SP1 and all “important” updates applied
Seagate 1TB SATA at default cluster size