MacGyver
This is one of several stories about cool stuff I’ve done. See the Portfolio Intro post for more info.
This is from 2009. Make allowances. :)
I recently saw a job posting that called for “MacGyver”-ish trouble-shooting skills. It’d take more chutzpah than I’ve got to lay claim to that title, but let me share a little story.
I was recently trying to look at a friend’s web site, and ran into a weird problem. I typed the URL into Safari, got the “Contacting…” message, and then it just hung. It eventually gave up and said “Failed to open page.” Hmm. Tried a few other URLs: The rest of the internet is working fine. Switched to Firefox; same problem. Tried my girlfriend’s laptop; same thing.
Popped open a terminal: “host” resolves fine; “ping” fails. “traceroute” gets through a couple of major service providers before it stops returning useful info: It’s not just dying inside my provider’s network. I have a shell account with my web hosting provider, so I ssh there: “ping” works fine, and I’m able to grab my friend’s web page using “curl”. I emailed my friend, on the off chance that there’s some known weirdness about his machine. He’s a sysadmin, so he may also have some ideas about how to debug this. Quick back and forth; no dice.
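For illustration, the "does the name resolve, and can I actually reach it?" part of that triage can be sketched in Python. This is not what was run at the time (that was plain `host`, `ping`, `traceroute`, and `curl`). The function name `probe` is made up for this sketch, and it tries a TCP connection rather than an ICMP ping, so it is only a rough stand-in.

```python
import socket


def probe(host: str, port: int = 80, timeout: float = 3.0) -> dict:
    """Report whether `host` resolves and whether a TCP connection succeeds.

    Rough analogue of running `host` and then `ping`, except that it uses a
    TCP connect instead of ICMP (an assumption of this sketch).
    """
    result = {"resolves": False, "connects": False}
    try:
        # DNS step: roughly what `host` checks.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return result
    result["resolves"] = True

    # Reachability step: try each resolved address until one accepts.
    for family, socktype, proto, _canonname, addr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(addr)
            result["connects"] = True
            break
        except OSError:
            # Timeouts and refused connections both land here.
            continue
    return result
```

Running the same probe from two vantage points (home and the shell account) and comparing the results is the whole trick: "resolves but does not connect" from one side only points at the path between them, not at the site.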
At this point, it’s pretty clear this is a network issue, and that it’s only between my ISP and his hosting service. Even if I got ahold of somebody competent at my provider, they’d say, “Well, you can get to everywhere else; it must be their problem.” And if I got ahold of someone at his hosting service, they’d say, “You can get here from everywhere else; it must be their problem.” That’s never going to get sorted out.
What did I do? I wrote a custom proxy server.
I’ve never written a proxy server, but I know the basic idea. I shelled back into my hosting service and hacked together a Perl CGI script. There’s probably a Perl library to do most of the work for me, but I was in more of a mood to tinker than research. My script just takes a relative URL, and uses “curl” to grab the corresponding page from my friend’s site (his front page by default). Then it does a couple of regex search-and-replaces on the contents: It rewrites all of the href and src attributes that point to his site so that they go through the proxy script instead.
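The original Perl script is not reproduced here, but the rewriting idea can be sketched in Python. Everything named below is an assumption for illustration: `SITE` stands in for the friend's site, `PROXY` for wherever the CGI script lives, and the `path` query parameter is invented. The fetch itself (the `curl` call) is left out.

```python
import re
from urllib.parse import quote

SITE = "http://friend.example"        # hypothetical: the site being proxied
PROXY = "/cgi-bin/proxy.cgi?path="    # hypothetical: URL of the proxy script

# Matches href="..." or src='...' and captures name, quote character, and URL.
ATTR_RE = re.compile(r"""\b(href|src)\s*=\s*(["'])(.*?)\2""",
                     re.IGNORECASE | re.DOTALL)


def upstream_url(path: str) -> str:
    """Map the proxy's path parameter to a URL on the site (front page by default)."""
    if not path:
        path = "/"
    if not path.startswith("/"):
        raise ValueError("path must be site-relative")  # basic input checking
    return SITE + path


def rewrite_links(html: str) -> str:
    """Point href/src attributes that target the site back through the proxy."""
    def replace(match):
        attr, quote_char, url = match.groups()
        if url == SITE or url.startswith(SITE + "/"):
            path = url[len(SITE):] or "/"
        elif url.startswith("/") and not url.startswith("//"):
            path = url  # site-relative link
        else:
            return match.group(0)  # off-site or page-relative: leave untouched
        return f"{attr}={quote_char}{PROXY}{quote(path, safe='/')}{quote_char}"

    return ATTR_RE.sub(replace, html)
```

Regexes are a blunt tool for HTML, and this sketch ignores page-relative links and URLs inside CSS or scripts. For one known site, though, that is about where an 80/20 stopping point falls.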
It took me about half an hour to get to a good 80/20 stopping point (a broken Flickr badge and a non-fatal JavaScript error). The script weighed in around 20 lines, including some basic input checking. That was good enough, but the JavaScript error was nagging at me, so I spent another half-hour poking at that. I used curl to grab the JavaScript files included by that page, and dug through them for the error message. It turned out to be some sort of security check: It had a list of approved domains, and going through the proxy caused a mismatch. So I added a line to the proxy to edit the JavaScript as it came through, disabling the check. I mentioned this later to a friend who does security work. Apparently, this is known as a client-side proxy, and it’s a known technique for breaking JavaScript security. He was impressed that I’d figured it out on my own.
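The actual check is not shown here, so the following Python sketch is purely illustrative. It assumes a hypothetical guard of the form `if (!isApprovedDomain(...))` in the script and neutralizes it with one substitution as the file passes through the proxy. The function name `isApprovedDomain` and the shape of the guard are invented for the example.

```python
import re

# Hypothetical guard: if (!isApprovedDomain(<anything without a closing paren>))
CHECK_RE = re.compile(r"if\s*\(\s*!\s*isApprovedDomain\s*\([^)]*\)\s*\)")


def disable_domain_check(js: str) -> str:
    """Rewrite the guard's condition to a constant false so its body never runs."""
    return CHECK_RE.sub("if (false)", js)
```

The point is less the regex than the position: whoever sits between the server and the browser gets to edit the code before it runs, so a check that ships inside that code cannot be trusted to protect itself.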
So, a weird problem, some quick analysis, and a minimal and specific solution. As gravy, a little insight into JavaScript security. Maybe not MacGyver, but maybe not far off?