Category Archives: geek-out

What’s Magical About 1272 Bytes?

So here’s a bit of Windows Vista Ultimate 64-bit arcana for you… I was researching the performance and efficiency of different cluster sizes, so I wanted to know how many files of certain sizes were on my disk. I ran some searches, using the cluster sizes I was considering as size thresholds, hoping to get some data points for statistical analysis. Here’s what I ended up with, running Vista’s file search in “non-indexed” mode and choosing to include hidden and system files:

File Size    File Count
<64KB        72,781
<16KB        53,480
<8KB         42,696
<4KB         31,542
<2KB         15,822
<1KB         19,528
<0.5KB       10,058

Did you notice something odd? That’s right, the number of files <1KB in size is greater than the number of files <2KB in size! This is impossible, of course: every file smaller than 1KB is also smaller than 2KB, so the count can never go down as the threshold goes up.
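Here is a minimal sketch of that sanity check in JavaScript. It assumes 1KB means 1024 bytes, which is my assumption and not something Vista's search documents. The function name `findViolations` is just for illustration.

```javascript
// Cumulative counts from the table above: [upper bound in bytes, files found].
// Assumption: 1KB = 1024 bytes.
const results = [
  [512, 10058],
  [1024, 19528],
  [2048, 15822],
  [4096, 31542],
  [8192, 42696],
  [16384, 53480],
  [65536, 72781],
];

// Return every adjacent pair where the larger threshold reports fewer files.
function findViolations(rows) {
  const sorted = rows.slice().sort((a, b) => a[0] - b[0]);
  const violations = [];
  for (let i = 1; i < sorted.length; i++) {
    if (sorted[i][1] < sorted[i - 1][1]) {
      violations.push({ smaller: sorted[i - 1], larger: sorted[i] });
    }
  }
  return violations;
}

const violations = findViolations(results);
console.log(violations); // one entry: the <1KB / <2KB pair
```

Only the 1KB-to-2KB step breaks the rule. Every other pair of adjacent rows is consistent.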

Using a manual binary search, I finally arrived at the magic point: something weird happens between the 1272 and 1273 byte count, as the following two screenshots illustrate (look at the upper right and lower left of each).

[Screenshot: search for files <1273 bytes in size]

[Screenshot: search for files <1272 bytes in size]

Logically, the second search should yield slightly fewer results. With a strict less-than comparison, the delta between the two searches should be the number of files that are exactly 1272 bytes long; for scale, there are exactly 15 files of 1273 bytes on the drive, so a difference of a few files is what I’d expect. In fact, the second search yields more than twice as many!
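The manual binary search is easy to sketch in code. This is an illustration under assumptions, not a tool that drives Vista's search. `looksInflated` is a hypothetical callback standing in for "run the search by hand and see whether the count is higher than the known-good <2KB count". It also assumes the behaviour flips exactly once between the two starting limits.

```javascript
// Bisect between a limit known to give an inflated count (lo) and one known
// to give a sane count (hi). Precondition: looksInflated(lo) is true and
// looksInflated(hi) is false, with a single flip in between (an assumption).
function findFlipPoint(looksInflated, lo, hi) {
  let probes = 0;
  while (hi - lo > 1) {
    const mid = Math.floor((lo + hi) / 2);
    probes += 1;
    if (looksInflated(mid)) {
      lo = mid; // still weird: the flip is above mid
    } else {
      hi = mid; // sane again: the flip is at or below mid
    }
  }
  return { lastInflated: lo, firstSane: hi, probes: probes };
}

// Simulated stand-in for the manual searches, matching what I observed.
const simulated = (limit) => limit <= 1272;
const flip = findFlipPoint(simulated, 1024, 2048);
console.log(flip); // { lastInflated: 1272, firstSane: 1273, probes: 10 }
```

Starting from the 1KB and 2KB searches, ten probes are enough to pin the boundary down to a single byte.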

I was hoping I could narrow down what was going on by searching for specific file types instead of the *.* pattern, but as soon as I did that, everything seemed to work. Interestingly, if I then went back to the *.* pattern, the 1272B search produced a correct (lower) number! However, if I then ran a 1KB search I got the higher number again, and if I repeated the 1272B search I again got a higher number.
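One way to take the search UI out of the equation is to count file sizes directly. The sketch below uses Node.js, which is my choice for illustration and not what I used for the numbers above. The helper names `collectFileSizes` and `countBelow` are hypothetical. It skips directories it cannot read, so its totals may differ from a search that includes hidden and system files.

```javascript
const fs = require("fs");
const path = require("path");

// Recursively collect the size in bytes of every regular file under root.
// Unreadable directories and files are skipped rather than aborting the walk.
function collectFileSizes(root) {
  const sizes = [];
  const pending = [root];
  while (pending.length > 0) {
    const dir = pending.pop();
    let entries;
    try {
      entries = fs.readdirSync(dir, { withFileTypes: true });
    } catch (err) {
      continue; // e.g. permission denied
    }
    for (const entry of entries) {
      const full = path.join(dir, entry.name);
      if (entry.isDirectory()) {
        pending.push(full);
      } else if (entry.isFile()) {
        try {
          sizes.push(fs.statSync(full).size);
        } catch (err) {
          // file vanished or is unreadable: ignore it
        }
      }
    }
  }
  return sizes;
}

// Count files strictly smaller than limitBytes.
function countBelow(sizes, limitBytes) {
  return sizes.filter((size) => size < limitBytes).length;
}
```

With the sizes collected once, `countBelow(sizes, 1273) - countBelow(sizes, 1272)` is by construction the number of files that are exactly 1272 bytes, so the two counts can never cross.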

Pretty strange, huh?

In case you’re wondering:

  • Intel E8400
  • 8GB DDR800
  • Windows Vista Ultimate, 64-bit, SP1 and all “important” updates applied
  • Seagate 1TB SATA at default cluster size

Return a Record for Each Date Between Two Dates in SQL Server >= 2005

Blogging this so I don’t forget it…

It used to require some fairly ugly, resource-intensive hacks (cursors, temp tables, etc.) to emit an inclusive list of values between two end points when the source data might not include an entry for every point (for example, a calendar, where not every day contains an event). In SQL Server 2005 and above, this is trivially easy with a recursive Common Table Expression (CTE). To emit one record for every date from 1/1/2008 through 1/31/2008, you do this:


WITH datecte(anydate) AS (
    -- Anchor member: the first date in the range
    SELECT CAST('1/1/2008' AS datetime) AS anydate
    UNION ALL
    -- Recursive member: add one day at a time, stopping at 1/31/2008
    SELECT anydate + 1 AS anydate
    FROM datecte AS datecte_1
    WHERE anydate < CAST('2/1/2008' AS datetime) - 1
)
SELECT anydate
FROM datecte AS datecte_2
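To make the anchor/recursive-step mechanics concrete, here is a rough JavaScript emulation of what the query above computes. It is only a sketch of the semantics, not of how SQL Server evaluates the CTE internally. The `maxRecursion` guard is my approximation of the MAXRECURSION behaviour (fail once more than that many recursive steps are needed), and `dateSeries` is a made-up name.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Emulates the recursive CTE: one anchor row, then repeated "anydate + 1"
// steps for as long as the WHERE condition (anydate < lastDay) holds.
// Dates are handled in UTC so daylight saving changes cannot skew the steps.
function dateSeries(firstDay, lastDay, maxRecursion = 100) {
  const rows = [new Date(firstDay.getTime())]; // anchor member
  let current = firstDay.getTime();
  let level = 0;
  while (current < lastDay.getTime()) { // recursive member's WHERE clause
    level += 1;
    if (level > maxRecursion) {
      throw new RangeError("maximum recursion " + maxRecursion + " exhausted");
    }
    current += DAY_MS;
    rows.push(new Date(current));
  }
  return rows;
}

const january = dateSeries(
  new Date(Date.UTC(2008, 0, 1)),
  new Date(Date.UTC(2008, 0, 31))
);
console.log(january.length); // 31
```

Note that a range of n dates needs n - 1 recursive steps, because the first row comes from the anchor.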

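If you need more than about 100 days (the default recursion limit is 100 levels), add a MAXRECURSION hint to the end. The hint accepts values from 0 to 32767, where 0 removes the limit entirely: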

OPTION (MAXRECURSION 1000)

The fact that they stop recursion short at 100 by default would seem to indicate that this is an expensive procedure, but even if you’re just using this to produce a dummy table with all the dates for several years, it’s a nice shortcut.

I just tried the following query, which emits a record for every day between 1/1/2000 and 12/31/2020:


WITH datecte(anydate) AS (
    -- Anchor member: the first date in the range
    SELECT CAST('1/1/2000' AS datetime) AS anydate
    UNION ALL
    -- Recursive member: add one day at a time, stopping at 12/31/2020
    SELECT anydate + 1 AS anydate
    FROM datecte AS datecte_1
    WHERE anydate < CAST('1/1/2021' AS datetime) - 1
)
SELECT anydate
FROM datecte AS datecte_2
OPTION (MAXRECURSION 10000)

On my P4-641+ the script emits 7671 records in 0 (that’s zero) seconds and “spikes” the processor to all of 3%. Granted this is not a complex query, but at least we know the recursion (if it really is recursion internally, which I doubt) isn’t expensive by itself.
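The 7671 figure is easy to cross-check by hand: 21 calendar years, six of them leap years. A small JavaScript sketch of that arithmetic (helper names are mine):

```javascript
// Gregorian leap year rule.
function isLeapYear(year) {
  return (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
}

// Total number of days in the whole calendar years firstYear..lastYear.
function daysInYears(firstYear, lastYear) {
  let total = 0;
  for (let year = firstYear; year <= lastYear; year++) {
    total += isLeapYear(year) ? 366 : 365;
  }
  return total;
}

const rows = daysInYears(2000, 2020); // 21 * 365 + 6 leap days
const recursionLevels = rows - 1;     // the first row comes from the anchor
console.log(rows, recursionLevels);   // 7671 7670
```

So the query needs 7670 recursive steps, which is well past the default of 100 and comfortably under the 10000 I allowed.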

Vista… 64-bit… Where’s My Headroom?

Other than a couple of virtual machine beta builds, I had managed to stay out of Vista entirely until the last month or so. Since then I’ve tried to install on three machines–a client’s Dell Optiplex, which never was able to boot after install, and two home-built systems. This weekend I built a brand new system out of all Vista-logo components. It booted; Vista reported the hardware compatible; it even got a 5.8 experience index score. But I had continuous crashing of both IE and Windows Explorer. Also, on what should have been basically the fastest hardware available, the Vista with SP1 install took over 90 minutes.

I’m walking away. My current approach for my development machine is going to be Windows Server 2008 Standard 64-bit. Again I have certified components, but 64-bit in itself represents a struggle in terms of driver and application compatibility. We’ve had 64-bit CPUs in our machines for going on 5 years, and 64-bit Windows options for almost as long, and yet you still cannot run common programs and drivers in the environment: Flash, TWAIN, most VPN software. The list of things you can’t do (or do well) is astoundingly comprehensive.

We’re heading toward a very real wall here: 32-bit versions of Vista (as with other flavors of Windows) are limited to 4GB of RAM. Yet that is simply not enough for Vista plus any serious suite of applications. At the same time, 64-bit Windows still isn’t a truly viable desktop for most users. Out of necessity, I’ll compromise on many fronts–multimedia capability, peripheral compatibility, native software availability–but some of this stuff isn’t easily virtualizable, so I’m looking at the possibility of having to keep 32-bit systems around (for example for scanning, connecting to client VPNs, etc.). I’m really starting to feel hemmed in.
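For anyone wondering where the 4GB figure comes from, it is simply what a 32-bit address can reach. A back-of-the-envelope sketch (it ignores everything else that shares the address space, so usable RAM is typically somewhat less):

```javascript
// Number of distinct byte addresses a 32-bit pointer can express.
const addressBits = 32;
const addressableBytes = Math.pow(2, addressBits);
const addressableGiB = addressableBytes / Math.pow(2, 30);

console.log(addressableBytes); // 4294967296
console.log(addressableGiB);   // 4
```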

I guess I could take a step back here and look at it from the Mac perspective. It works because it’s broken; it’s broken because it works. That is, by forcing a switch to 64-bit Server I’m pruning the 16- and 32-bit dead wood that’s keeping me in the 4-gigabyte sandbox. Apple users long ago embraced obsolescence as a feature. Vista and 64-bit computing may (finally) force the Windows side of the PC world to wake up to this. Or maybe not. There’s a downside to the Mac example: “performance” in an absolute sense is to some extent irrelevant, and scaling up doesn’t necessarily have to be as smooth or cheap as we’re used to, as long as the chrome is shiny and doesn’t peel off too obviously.

This issue bears similarities to the current Internet Explorer 8 web standards argument–do we break the web (force IE8 standards mode, cripple billions of web pages) to move toward the Platonic ideal of standards? Do we break the PC ecosystem (Vista, 64-bit) for the hope of increased functionality and capacity in the next generation of platforms (available today, but considered unusable by the consumer)? You know it’s a tough question when even Joel Spolsky can’t tell you the answer. But generally, culturally, we’re not long-term investors, certainly not when the benefits are nebulous and far off and the pain points are obvious and immediate. As Joel argues, for web standards under IE this is a late-bound issue–they can throw the switch any time to go back to a more relaxed mode. But the issue of Vista running out of memory and 64-bit versions not being ready for prime time is a lot harder to resolve.

Dynamically Adding Option Elements to Select Objects… The Real Story

I don’t normally blog about pure code subjects, so this is going to be way too technical/boring for the general readership of this blog. However, I wasn’t able to find the definitive answer to this question after searching around with Google, so I figured I’d contribute a little. This is also a wiki-style reminder to myself the next time I need to do this.

Mostly you’ll be interested in this post if you’re trying to programmatically add options to a select list using JavaScript. Of all the ways to do this, there are a few that work and many that don’t. The catch seems to be that IE is picky about how and when the data “inside” the option tag (the visible text in the option list) is set. According to the documentation there should be any number of ways to set this–.text, .innerHTML, .innerText. Of those, the only one that seems to be broadly compatible is .text. .innerText is IE-only, so it’s right out. .innerHTML is valid across platforms, but it messes with the object model in IE6 and IE7 (and IE8 beta 1, now that I look) in such a way that if you use it, you have to add the option element to the select element before setting the .innerHTML property. See below for a breakdown of the methods, plus test code. Blogger might break this, so be sure to look at the plain test page (“view source” is going to be a lot friendlier on that page as well).

In any case, here’s the take-away when adding options to select elements:

  • Always use the .text property of the newly-created option object to set the visible text for the option.
  • As a precaution, add the new option object to the select object’s options collection before setting other properties of the option.
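Put together, the two rules above suggest a helper along these lines. This is a sketch. `fillSelect` and its `doc` parameter are my own inventions. The parameter exists only so the function can be exercised outside a browser, and it falls back to the global `document`. I have not run this exact helper through the browser matrix below.

```javascript
// Append one option per { value, text } item to an existing select element.
// doc is optional (hypothetical, for testing); in a browser it defaults to document.
function fillSelect(selectEl, items, doc) {
  doc = doc || document;
  for (var i = 0; i < items.length; i++) {
    var o = doc.createElement("OPTION");
    selectEl.options.add(o);  // attach to the select first...
    o.value = items[i].value;
    o.text = items[i].text;   // ...then set the visible text via .text
  }
  return selectEl.options.length;
}
```

In a page you would call it as `fillSelect(document.getElementById('sel5'), [{ value: '1', text: 'added value 1' }])`.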



create option, set .value, set .text, add to options list (generally compatible)


// Build the option completely (value and text), then add it to the select.
function createSetTextAddOption(selectEl, val, displayText) {
  var o = document.createElement("OPTION");
  o.value = val;
  o.text = displayText;
  selectEl.options.add(o);
  return false;
}




create option, set .value, set .innerHTML, add to options list (fails IE6, IE7, IE8-‘invalid argument’)


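// Set .innerHTML before the option is attached to the select.
// This is the ordering that fails in IE with 'invalid argument'.
function createSetInnerHTMLAddOption(selectEl, val, displayText) {
  var o = document.createElement("OPTION");
  o.value = val;
  o.innerHTML = displayText;
  selectEl.options.add(o);
  return false;
}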




create option, add option to list, set .value, set .innerHTML (generally compatible)


// Attach the option to the select first, then set .innerHTML.
function createAddOptionSetInnerHTML(selectEl, val, displayText) {
  var o = document.createElement("OPTION");
  o.value = val;
  selectEl.options.add(o);
  o.innerHTML = displayText;
  return false;
}




create option, add option to list, set .value, set .text (generally compatible)


// Attach the option to the select first, then set .text.
function createAddOptionSetText(selectEl, val, displayText) {
  var o = document.createElement("OPTION");
  o.value = val;
  selectEl.options.add(o);
  o.text = displayText;
  return false;
}




Attempt to add new options/selects inline (option 2 and select 2 fail in IE, either silently or with the “invalid argument” error)

According to the Microsoft documentation, this shouldn’t work at all. Here’s what they say about the options.add() method:

This method can be used to add elements only after the page loads.

If the method is applied inline, a run-time error occurs.

And yet, it mostly does work, aside from the innerHTML limitation…


// generally compatible inline adds
createSetTextAddOption(document.getElementById('sel5'),'1','added value 1');
createAddOptionSetInnerHTML(document.getElementById('sel5'),'3','added value 3');
createAddOptionSetText(document.getElementById('sel5'),'4','added value 4');

addNewSelect(document.getElementById('sel5'),'1','added value 1',createSetTextAddOption);
addNewSelect(document.getElementById('sel5'),'1','added value 3',createAddOptionSetInnerHTML);
addNewSelect(document.getElementById('sel5'),'1','added value 4',createAddOptionSetText);

// these two fail in IE6, IE7, IE8
createSetInnerHTMLAddOption(document.getElementById('sel5'),'2','added value 2');
addNewSelect(document.getElementById('sel5'),'1','added value 2',createSetInnerHTMLAddOption);
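Because some of these failures are silent and some throw, a tiny harness makes it easier to compare browsers. The sketch below is not the code from the test page. `tryStrategies` and the `makeSelect` factory are hypothetical names, and the harness only checks that exactly one option with the expected text ended up in the list.

```javascript
// Run each option-adding strategy against a fresh select and record the outcome.
// makeSelect is a hypothetical factory returning a new, empty select element.
function tryStrategies(makeSelect, strategies) {
  var report = {};
  for (var name in strategies) {
    if (!Object.prototype.hasOwnProperty.call(strategies, name)) continue;
    var sel = makeSelect();
    try {
      strategies[name](sel, "1", "probe");
      var opts = sel.options;
      var ok = opts.length === 1 && opts[0].text === "probe";
      report[name] = ok ? "ok" : "silent failure";
    } catch (e) {
      report[name] = "threw: " + (e && e.message);
    }
  }
  return report;
}

// Browser usage (not executed here):
//   tryStrategies(
//     function () { return document.body.appendChild(document.createElement("select")); },
//     { createSetTextAddOption: createSetTextAddOption,
//       createSetInnerHTMLAddOption: createSetInnerHTMLAddOption }
//   );
```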

Again, I encourage you to view the plain test page to avoid any Blogger-induced weirdness.

This Meme Needs a Name

All of a sudden (though probably not–I’m just catching on) there’s a rash of services that crunch plain text or data keys and return parsed, formatted, drilled-down usable data. For example…

Tripit will digest all your travel confirmations and produce a rich itinerary.

Google SMS will take a zip code, flight number, etc., from your cell phone, and return a rich data node.

Opencalais is a platform for doing stuff like this.

So what do we call it?

Mailinator: Free, Instant, Completely Unsecure Email

What if you’re downloading a product demo, need a unique address to get the key, and never want to hear from that company again? Good luck! Actually, you don’t need good luck, just mailinator.

Send to any address at mailinator.com (or one of several other domains) and then go to the site and check your email. But make it a long, complex address, because there’s no password. In fact, anyone can check “your” email if they know the email address (including the person/company who sends email to you).

If you completely understand the limitations, this is a wonderful service.
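Since the address is effectively the password, it is worth generating one instead of inventing it. A minimal Node.js sketch follows. The function name is made up, and I have not checked what length limits the service puts on inbox names, so treat the 32-character local part as an assumption.

```javascript
const crypto = require("crypto");

// Build a hard-to-guess throwaway address: 16 random bytes become 32 hex characters.
function throwawayAddress(domain = "mailinator.com") {
  const localPart = crypto.randomBytes(16).toString("hex");
  return localPart + "@" + domain;
}

console.log(throwawayAddress());
```

Anyone who learns the address can still read the inbox, including the sender, so this only protects against guessing.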

When You’re an End Node, It Doesn’t Pay to Ask Why

I think what he’s saying here is that where you look for answers as a developer is heavily influenced by the domain in which you’re operating. Yes, you need to consider “best practices” (groan), and sometimes it’s a good idea to “think outside the box” (retch), but most of the time you really need to concentrate on what is possible and efficient and makes sense in the current context. This is why when you want to learn about a technology you can read a book, but when you actually have to implement it you end up sorting through a lot of discussion groups and blog posts, and especially blog comments, the ultimate end nodes of the infocloud.

Just a neat blog I stumbled on in an otherwise anxious, code-heavy week of integrating things that were never meant to work together.

Cheap Router as Wireless Bridge

I recently had a desperate need for a wireless bridge device. The need has passed, but I finally figured out a way to do it without spending $60+ for a dedicated (one port) bridge. The main goal here is to provide a physical Ethernet jack somewhere out on your wireless network for a device or devices that can’t connect to wireless directly. I was able to get this working just now using a $25 refurbished Netgear WGT624v3 from Fry’s. I followed (and interpreted, because it’s not as step-by-step as it could be) these instructions, supplemented with information from this thread.

What’s really amazing about this is that you end up using a shell session on the router, without having to hack the firmware (though you are exploiting a disabled interface and a NetGear diagnostic tool that turns it back on). It’s pretty strange.

In any case, I can now put four physical Ethernet ports anywhere within range of my wireless network. The bridge is effectively dumb and invisible–DHCP, DNS, etc. all come from the access point.

Not directly needed, but here’s some interesting background on hacking NetGear equipment.

The best part about this? I found the original thread, and the fact that this was all possible, using my Sprint phone while standing in Fry’s staring at the blank brown box of the refurbished WGT624 wondering “WTF is this?” (iPhone? We don’t need no stinking iPhone.)

There are some potential limitations that may reduce the usefulness of this. I don’t know whether they are actual limitations, because I haven’t tested beyond my own setup:

  • Tested only bridging to NetGear access point (potential issues with other brands?)
  • Tested only 64-bit WEP encryption (some of the comments mentioned problems with WPA)
  • Tested only with published SSID at the AP
  • Possible wireless saturation/interference: when I tried this with the bridge a few inches from my Thinkpad, the internal Centrino wireless could no longer connect, and I’ve read that some firmware versions of this router produce illegally strong radio signals