Last time a new version of Mac OS X came out, I upgraded right away. This turned out to be a bit of a mistake— the .0 release of Snow Leopard froze up a lot and even KP’d on me a couple times. Everything was cleared up with the .1 release not even two weeks later, but I vowed never to use a .0 release of OS X again. Now Lion is out and— damn!— I’ve gone and done it again. Except, not really. This time I have Jake as my guinea pig. He’s a registered developer and so had access to the beta, and assured me it was stable as could be. So far, he’s right. If you’re holding off on Lion for that reason alone, go ahead and take the plunge.

I’ve got a few general comments I want to zoom through before I get to the real meat of why I’m writing this post. Let’s do them in my patented bullet-list style so I can spare you my engineer’s purple prose.

  • Apple has messed with the appearance again. The changes are so slow and so gradual that it’s easy to fool yourself into thinking that OS X has remained visually constant over its eight releases, but nothing could be further from the truth. If you compare OS X 10.0 to today, it’s shocking how subdued the appearance has become. Do you remember the pinstripes everywhere? Ick! It’s almost like Apple is moving in the opposite direction to Microsoft in terms of visual evolution, and the trend continues with Lion: the famous capsule-shaped buttons are now square, gradients have been replaced by solid colors, and blue has often been replaced by grey. I like it, though a few of the choices are irksome: the “stoplight” buttons in the upper-left of windows have been inexplicably shrunk and are smaller click targets. Why? Who knows. The glaring exception is iCal and the Address Book, which are now hideously ugly. I never used those to begin with, so it doesn’t bother me, but what the hell went wrong here? Apple, we need an option to turn that monstrosity off. Happily, there are also changes to the Mail interface, and those are very much for the better. Three-pane browsing, yay!
  • Apple is determined to bring as much of the iPad experience to the desktop as they can. One of the most evident places is the abundance of new trackpad gestures: swipe to change screens, swipe to bring up the application launcher, etc. If you haven’t been using the trackpad on your fancy Mac for navigation and other tasks yet, I can’t recommend it enough: for tasks that don’t involve typing, keeping your fingers on the trackpad instead of reaching for the mouse is a measurable productivity boost. My one issue was that I had already been doing this sort of thing with jiTouch, so there were a bunch of gesture conflicts. Check for application compatibility before you upgrade! (TrueCrypt breaks too; you’ll need an updated version of MacFUSE before you can proceed.)
  • Performance seems about the same. Maybe slightly faster. Though in their never-ending quest to provide more eye candy, Apple has now made Finder windows expand to their full size with a quick animation when you open them. Why? Who knows. I don’t think this has caused opening a Finder window to take any more time than it used to in absolute terms, but it feels slower, and that’s all users care about.
  • Support for full-screen apps is awesome on so many levels. I’ve often been working that way anyway— my Chrome window is always fullscreened, and I’ve applied various hacks to get Aquamacs and the Terminal close to that too— but having every application be able to take up the whole screen will rule, as soon as third-party developers get the necessary code in place. Having the operating system fully get out of your way so you can focus on what you’re doing is an absolute win.

So there’s that. Now let’s talk about the biggest set of changes in Lion, and the one that’s most likely to baffle and rile people.

I shouldn’t have to tell you that it’s Apple’s mission to make interacting with the computer as easy as possible, even if that comes at the expense of fine-grained control over what’s happening. Lion continues this trend in a couple of ways.

First, autosaving. Third-party programs like Office have often rolled their own autosave functionality, but there are now APIs so that any application can do it with the operating system’s help. The autosave and versioning functionality is so constant and so pervasive that there is literally no need to manually save your documents anymore— Lion is doing it for you. Open up TextEdit and you’ll see that there is no Save option anymore. It has been replaced with “Save a Version”, which essentially forces the autosave feature to keep that particular version no matter what (normally it eventually stops saving versions). As computer users, we’ve been trained so thoroughly to periodically save our work that it’s second nature. It’s practically part of the standard procedure of using a computer, like clicking “Shut Down” when you’re ready to power down. Apple has gone and said, “Wait, since everyone does this, and everyone should be doing it, why do they have to do it themselves? Why can’t it be part of the operating system?” The idea is that isolating you from accidental deletions and application crashes should be the operating system’s responsibility, just like isolating you from the details of the TCP/IP stack when you browse the Web. It does make sense if you think about it (I promise!), but it flies so firmly in the face of how we’ve been using computers up until now that it may take some getting used to. You don’t have to hit Save! It’ll be OK!

Second, and this is the one that’s really going to drive computer geeks up the wall, Apple has now decided to start shielding you from whether or not applications are running, and when they start and stop. What does that mean? It means that Lion can quit your applications whenever it wants, if it decides it needs the additional resources (note: just like iOS). I can already see many of you starting to hyperventilate at this possibility, so let me explain why this isn’t as crazy as it sounds. First, because of the autosave functionality, and because Lion can restore applications to their exact state— right down to what text was highlighted when you quit— if Lion quits an application and you start it back up, it will be exactly as if it had never quit at all. Second, Lion is pretty conservative about when it kills a program: it won’t kill one that has open windows, or that is blocked waiting for data from the disk or network. Here, Apple has said, “As far as the user is concerned, what does it actually matter whether the application is running or not? If the state is always preserved (and it is), the user shouldn’t care about whether there’s a CPU process for that program. Quitting a program to free resources is something the OS should handle.” Bottom line? You don’t have to quit programs anymore. Just keep opening them up, and when Lion needs the resources, it’ll kill a program that isn’t in use. When you need that program again, open it up, and the state restoration will guarantee that it hasn’t changed.

So: you don’t have to Save, and you don’t have to Quit. Two of the most fundamental operations in how we interact with computers— these predate the GUI, for heaven’s sake— deprecated in one stroke. They’re still there, for the moment, but don’t be surprised if there comes a time in the future when OS X doesn’t have them at all. This is the future, folks. Swallow your fears and climb aboard.

When you graduate from college with a CS degree, potential employers aren’t terribly interested in the knowledge you gained from your classes. I mean, they are, but what they care about more is how well you can meet project deadlines, how quickly you can start working with their codebase/algorithms/platforms when you start the job, whether you can communicate well with your teammates and managers, how fast you can pick up new knowledge, that sort of thing. These things are difficult to learn in class; doing projects helps, but really the best way to improve is to have actual job experience. This is true of a lot of majors, but, in my opinion, CS more than most. Thus, during the fall semester, most of us spend as much time worrying about getting a summer internship as we do about our classes. Don’t get one, and you’re behind. I actually applied to Microsoft last year, and made it to their second-round interviews. In case you’re not familiar with Microsoft’s interview process: they do first rounds over the phone or in person at your school, and if they like what they see, they fly you out to Redmond (or occasionally other places, depending on what group you’re interviewing with) and subject you to the infamous back-to-back-to-back 45-minute interviews. Almost immediately after those, you find out whether you got the job or not.

Last year, as it turned out, the answer for me was “not”; I ended up spending the summer at Cisco, helping port some security appliance code to 64-bit (good times, hopefully more about that in a later post). This year my fortunes were significantly better: I landed the job. Starting at the very end of May next year, I’m going to be working in the online services division, specifically Bing. Hooray! Now I get to pass the secrets of my success unto you. There are already literally dozens of blog posts about how to prepare for, and what to expect in, Microsoft interviews, but I still felt it was a good idea to throw in my two cents, for a few reasons. First, because other blog writers helped me, and I wanted to keep up the chain of good will (giving back, y’know?). Second, a lot of those posts are written for people applying for full-time positions, and things are a little easier for potential interns. I mean, it’s not a cakewalk, but they know you don’t have years of work experience to draw on, and they’re fair about assessing you accordingly. Third, because I want to boil everything down to the few really vital tips I think you should know for the final-round questions. In fact, I’m going to do that right off the bat, before I describe the interview process and the kinds of questions I got, because remembering these things is so important.

The Three Things You Must Remember at a Microsoft Interview

  1. Ask Your Interviewer Clarifying Questions – This is so important it should be 1 and 2. No matter what kind of question you get asked— whether it’s “code this” or “test this” or “design this” or “how would you do this”— ask your interviewer questions to be sure you understand exactly what it is they want. Sometimes they ask questions that are vague on purpose to see whether you’ll get them to clarify, and sometimes it’s totally unintentional, but either way you will not get all the information you need the first time around, I guarantee it.

    Asking questions not only helps you get closer to the right answer, but even more importantly, people who ask questions and know precisely what it is they’re supposed to be doing, instead of jumping in and making mistakes that have to be corrected at great expense later, make much better software engineers.

  2. Show Your Work – And by this I don’t just mean write your code down on the board (that’s sort of a requirement). I mean, talk about your thought process, all of it. Everything that crosses your mind as you work out a solution to the problem should be discussed with your interviewer. For one question, when I was asked “How would you code this?”, I actually responded, “Well, the obvious way is to…”, and then tossed out a very bad, exponential-time answer, because that’s where my train of thought started. Of course, I immediately clarified that it was a bad way, and then explained how to go from my answer to a much better one. Even if you don’t go that far, always talk about what you’re doing and thinking.

    The reason for this is, again, they’re not that interested in the extent of your programming knowledge or skills— they already know you’re good enough on that score, otherwise you wouldn’t have come out to the final-round interviews in the first place. What they’re trying to figure out is how you think when presented with a problem; that’s how they find out if you’ll be an effective employee. Language syntax and even algorithms can be looked up on Wikipedia; bad thought processes can’t be corrected so easily. It has a more immediate advantage too: if you start going down a dead end on your solution to a problem, your interviewer might actually help you and say, “Is that such a great idea?” If you silently puzzle away, they can’t assist you. Don’t be stoic; help them help you.

  3. Be Thorough – When you get asked to write down a piece of code, or even a test specification, you can be sure that your interviewer is going to try to come up with situations where it’ll break or fail. Microsoft writes code for the real world, not academic projects. You can stay one step ahead by making your solutions as robust as possible. Try to think of everything. What if a malloc() fails? If you’re iterating over an array, what if that array has so many elements that your loop index overflows? Does your string handling cope with Unicode? What if the network connection goes down? If you’re using a random number generator, sometimes they block if there’s not enough entropy; what will your code do then? This can sound a little overwhelming, but you don’t have to think of all this stuff on the first draft of your code— just get there eventually. More than once I wrote down embarrassingly broken code at first (even though I had followed the first two steps), then corrected it to be bombproof over the course of the interview, and they like to see that: can you spot your mistakes and fix them?

If you remember these three things, of which 1) and 2) are the most important, you’re in solid shape. The overwhelming temptation when interviewing with Microsoft, or for any software job really, is to cram all manner of computer science information into your brain. This is a mistake, I think. Like I said, if you make it to the final-round interviews, they already pretty much know you’re technically proficient, so don’t try to squeeze in CS esoterica to get an advantage; it won’t help. What will help is problem-solving skills, not trivia. That being said, here’s the kind of beforehand preparation I think will help the most:

  • You only need to know basic data structures, but you need to know them backwards and forwards. Lists (singly- and doubly-linked, circular, etc.), list derivatives like queues/stacks, and trees are the must-haves. What you can probably count on is that you’re only going to be asked questions about simple data structures, but the questions themselves will be tough. For instance, any decent CS student can write a recursive function to visit every element in a tree, but can you write such a function iteratively? Pre-order and in-order traversals are a bit tricky; post-order traversal done iteratively is genuinely difficult (at least under the time and pressure constraints of an interview). I wasn’t asked this question in particular, but another interviewee in my group was. (There’s a sketch of an iterative traversal right after this list.)

    In addition to making you write familiar functions in unfamiliar ways, Microsoft also likes to make you write functions with no special cases. Can you insert an element into a doubly-linked list at a given index with no special cases at all? (One classic approach is sketched after this list.)

    Doing lots of problems like this beforehand will help a great deal. Write functions like the two described above, and think of other brain-twisters on lists and trees (I’m not going to give you any; coming up with them is as helpful as solving them!). Questions about string manipulation seem to be a favorite too: write a function to test whether a string contains another string, or whether two strings are anagrams of each other, things like that. Do as much practice as you can for things like this.

  • Practice testing things. This is a no-brainer if you’re applying for SDET, but dev applicants get testing questions too. You can be sure that you’ll be asked to test the code you just wrote, and you’ll want to know how to do that, but more abstract questions are equally common. For example, if your interviewer were to hand you the stapler on her desk and say, “Test this”, what would you do? You’ll want to be sure that you’re keeping my three guidelines above in mind, so I think a good start would go something like this:

    “Who’s using this stapler?” – Remember, ask clarifying questions. “Who is the target audience?” is a great one that they love to see, as it indicates that you’re thinking about target markets and how to meet their needs. In this case, a stapler for use in a kindergarten might have very different requirements, a very different construction, and hence different test procedures than one in a university library. Some staplers are huge things that can staple hundreds of sheets; is the stapler one of those? If so, how should its tests differ?

    “Well, since the stapler is for X, we should probably start by Y…” – Show your work. Before you even start coming up with test cases, talk about how you’re going to come up with tests. Are you going to divvy it up by feature, or by type of test (fuzz testing, performance testing, security testing— I’m not sure how well these apply to a stapler, but you get the idea)?

    This goes for actual applications too. Sometimes they sit you down in front of an application and ask you to test it (it happened to me last year, but not this year). If it does happen, not only will you want to explore every nook and cranny of whatever app they put in front of you, but you’ll want to think like a programmer: if I had been the one to write this app, what mistakes might I have made? Put negative values, or strings, in boxes that take positive numbers. Rapidly click the “Save” button a dozen times to see if it breaks the write to disk. Turn off the computer’s wireless card while the application is making a network query. Be Thorough.
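
Since iterative traversals come up so often, here’s a minimal sketch of an iterative in-order traversal in C, using an explicit stack in place of the recursion (the node type and the fixed-size stack are just my choices for illustration, not anything Microsoft hands you):

#include <stdio.h>

struct node {
    int value;
    struct node *left, *right;
};

void inorder_iterative(struct node *root) {
    struct node *stack[64];            /* assume tree depth <= 64 for this sketch */
    int top = 0;
    struct node *cur = root;

    while (cur != NULL || top > 0) {
        while (cur != NULL) {          /* slide as far left as possible... */
            stack[top++] = cur;        /* ...remembering the path as we go */
            cur = cur->left;
        }
        cur = stack[--top];            /* leftmost unvisited node */
        printf("%d\n", cur->value);    /* "visit" it */
        cur = cur->right;              /* then handle its right subtree */
    }
}

Once you can write that cold, try reworking it into a pre-order traversal, and then take a crack at post-order, which needs noticeably more bookkeeping.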
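
And here’s one classic way to get the no-special-cases list insertion: a circular doubly-linked list with a sentinel node, so that inserting at the head, the tail, or the middle (even of an empty list) is the same four pointer assignments. Again, a sketch with names of my own choosing:

struct dnode {
    int value;
    struct dnode *prev, *next;
};

/* The list is represented by its sentinel, initialized so that
 * sentinel->prev == sentinel->next == sentinel. Index 0 inserts
 * at the head. */
void insert_at(struct dnode *sentinel, int index, struct dnode *n) {
    struct dnode *after = sentinel;
    while (index-- > 0)
        after = after->next;           /* walk to the node we insert after */
    n->prev = after;
    n->next = after->next;
    after->next->prev = n;
    after->next = n;
}

No check for an empty list, no check for head or tail: the sentinel absorbs them all.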

The last thing I’ll do is talk quickly about some of the questions I got, and how I responded to them. These were last year’s questions (I think individual interviewers come up with their own questions rather than use anything company-prescribed, but even so, they might be a little miffed if I talk about the questions I got this year), and I’m going to speak in very general terms rather than give a total play-by-play, but they should help you understand the kinds of things you’re likely to see.

Write and test a function to see if a string contains another string

IIRC, this was part of my first interview of the four, really a warm-up question. I ended up coding it in Java (choice of language doesn’t really matter), using the straightforward solution of nested loops: match the first character of the string to be found against successive characters in the enclosing string, then enter an inner loop if they match. Nothing too difficult here; just don’t rush and make silly mistakes. For questions like this, make sure your loop indices are right, that you’re handling the edge cases, stuff like that. In this case, the testing involved coming up with lots of string pairs and checking whether my code returned the correct result. Again, at every stage in the process (maybe for every line of code you write), talk about what you’re doing and why.
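
For reference, here’s roughly what that looks like. I wrote mine in Java, but a C sketch of the same nested-loop idea (the function name and signature are just my choices) goes like this:

#include <string.h>

/* Returns 1 if needle occurs somewhere in haystack, 0 otherwise. */
int contains(const char *haystack, const char *needle) {
    size_t hlen = strlen(haystack);
    size_t nlen = strlen(needle);
    if (nlen == 0)
        return 1;                      /* edge case: empty needle always matches */
    if (nlen > hlen)
        return 0;                      /* edge case: needle longer than haystack */
    for (size_t i = 0; i + nlen <= hlen; i++) {
        size_t j = 0;
        while (j < nlen && haystack[i + j] == needle[j])
            j++;                       /* inner loop: match successive characters */
        if (j == nlen)
            return 1;                  /* matched every character of needle */
    }
    return 0;
}

Notice how much of the function is edge-case handling: exactly the sort of thing an interviewer will poke at.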

*hands me her ID badge to get into the building* “Test this”

I badly botched this one, because this was my first time around and I wasn’t aware of my three guidelines. What I assumed she meant was that she wanted me to test the badge in isolation; it took a while before I realized she meant the entire system— card, card reader, the door-locking mechanism connected to the card reader, the server the card reader connects to, the personnel database backing that server, and so on. She would’ve gladly mentioned this to me if I had asked, but I didn’t. I also just jumped in and started vomiting out a big list of test cases, another mistake. The way to go here would’ve been to draw (show your work!) a diagram of the whole system and how the components interact, and then talk about testing each component in isolation, enumerating for each one the various kinds of tests (again: security, performance, fuzz testing, stress testing, usability testing, failure cases) and what cases would be necessary to perform them.

You’ve founded a company that sells pens. How would you design your website?

I still chuckle a little in admiration when I remember this, because it’s such a great question. “How would you design your website?” is just so vast, and is exactly the sort of question that can get you bogged down in a maze of unimportant details if you don’t handle it correctly. Again, the way to go is to ask questions first.

Who am I selling pens to? Home shoppers will probably only be buying one, looking at pretty pictures, and paying with PayPal; corporate users might buy ten thousand, want to know the pen’s specifications, and need your website to be able to charge to their expense account. Am I a local company selling to my county, or a multinational? This affects decisions like how robust your underlying application and database layers need to be, what kinds of third-party services you might need to tie into, and so on. Which parts of my company (shipping, billing, etc.) are being handled in-house, and which are being handled by other companies? If you make assumptions about the answers to these questions, and it turns out your assumptions are not the assumptions your interviewer is making, you’re going to design entirely the wrong site!

After that, I broke the website into parts (presentation, application, database), broke each of those parts into components, and then outlined a development strategy for each of those components. At one point I was actually asked how I would divvy up developer time among these various components, which surprised me a little as it seemed like more of a PM question. I took my best shot at it though, and that’s something you should also be ready for.

At the end of the day, though, the thing that’ll help you the most is if you just relax. The interviewers don’t grill you relentlessly: they’re easy-going and want you to chat with them. If you do a little beforehand preparation, take it easy and act normally, while remembering my tips above, you’re in a solid position. Good luck!

What kind of person would I be if I didn’t leave you with a brain-twisting logic puzzle to help work your neurons and burn off all those calories from your Halloween candy? A bad person, that’s what. And I’m not a bad person.

So, let’s say you’ve heard of the Monty Hall problem. And you’re not like the uncultured masses, you know that the probability of getting the car if you switch is 2/3, not 1/2. And you feel so smart, don’t you, knowing the fundamentals of probability better than Joe Schmoe down the street. Well, try this one, tough guy:

Suppose you pick four cards out of a standard deck: the Ace of Spades, the Ace of Hearts, the 2 of Spades, and the 2 of Hearts. You shuffle these four cards very well, and then hand two of them to me (we both know what the four cards are). So now I’ve got two of the four cards, and it’s your job to figure out the probability that I have both aces.

At the moment, the probability is clear. There are six possible card combinations that I could have:

A♠A♡   A♠2♠   A♠2♡

A♡2♠   A♡2♡   2♠2♡

and since they occur with equal probability, I’ve got a 1/6 chance of having both aces.

But now I say, “I have at least one ace” (and we will assume I’m not lying). Now we want to compute the probability of my having both aces, given that I have at least one. In math terms, this would be P(both aces | at least one ace).

Well, there is one combination out of the six that does not have any aces (2♠2♡), and we can eliminate that possibility. So now my chance of having both aces rises to 1/5.

I don’t stop there, though. Now I say “I have the ace of spades”, and again, I’m not lying. What does this do to the probability? Clearly, it eliminates all of the possible combinations that don’t have the ace of spades, leaving

A♠A♡   A♠2♠   A♠2♡

and so the probability is 1/3. Notice, though, that we could apply exactly the same reasoning if I had said “I have the ace of hearts”:

A♠A♡   A♡2♠   A♡2♡

and the probability would again be 1/3.

But wait a minute. We said before that if I announce, “I have at least one ace”, the probability that I have both aces is 1/5. But because we both know what the four cards are, if you know that I have at least one ace, you know it must be either the ace of spades or the ace of hearts, and either way, the probability of my having both aces is 1/3.

So which is it? If I say, “I have at least one ace”, do I have a one-in-five chance of having both aces, or a one-in-three chance? They can’t both be right. More importantly, explain why this is the case, and why the wrong answer is wrong (it’s very easy to be right for the completely wrong reason with this puzzle).

Now, this is not a trick question. It’s like the xkcd Blue Eyes puzzle: there’s no lying or guesswork involved, and the solution is not some stupid out-of-the-blue side-attack that will make you groan. It takes some careful deliberation to work out the right answer.
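
If you’d like to sanity-check the raw numbers above by brute force, here’s a quick C sketch that enumerates all six equally likely hands and filters. It confirms the 1/5 and 1/3 figures stated above, but it won’t resolve the paradox for you, so it spoils nothing.

#include <stdio.h>

int main(void) {
    /* cards: 0 = ace of spades, 1 = ace of hearts, 2 = 2 of spades, 3 = 2 of hearts */
    int at_least_one_ace = 0, both_and_ace = 0;
    int has_spade_ace = 0, both_and_spade = 0;

    for (int a = 0; a < 4; a++) {
        for (int b = a + 1; b < 4; b++) {      /* the six possible hands */
            int aces = (a < 2) + (b < 2);      /* cards 0 and 1 are the aces */
            if (aces >= 1) {
                at_least_one_ace++;
                if (aces == 2) both_and_ace++;
            }
            if (a == 0) {                      /* a < b, so the ace of spades can only be 'a' */
                has_spade_ace++;
                if (aces == 2) both_and_spade++;
            }
        }
    }
    printf("P(both aces | at least one ace) = %d/%d\n", both_and_ace, at_least_one_ace);
    printf("P(both aces | ace of spades)    = %d/%d\n", both_and_spade, has_spade_ace);
    return 0;
}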

I can’t claim credit for this question: Cornell’s Professor Halpern used it in a recent Decision Theory lecture to help illuminate the dangers of improperly constructed state spaces for us. I did a little Googling and couldn’t find it on the internet, though, so I figured I’d share it for your enjoyment.

I’ve tried to make the question as clear and unambiguous as possible, but if you need clarification on something, use the comments or you can hit me up on Twitter. Have fun!

I’ve been putting this off for several months now (meant to do it way back at the end of April), but I’m finally getting my butt in gear and shutting down my Facebook page. When I say “shutting down”, I mean total deletion, not just ceasing to use it. I’ve mentioned this plan to a couple of friends, some of whom responded apprehensively or negatively, so I just wanted to write this for those people, to explain my rationale and why this doesn’t mean you can’t get in touch with me anymore. There are two key reasons why I’m bailing out:

  • I never use it anymore. I’ve just sorta stopped. If you look at my wall, on most days it’s big blocks of my status posts, with an occasional comment from someone else. I don’t upload pictures, don’t really look at anyone else’s status or what they’re doing, etc. I’m not totally sure why I’ve gradually drifted away from Facebook, but I have a few theories (indented bulleted list, yay!):
    • All I used it for was status updates. Sometime around the spring of 2009, I realized that pretty much the only thing I was using Facebook for was just broadcasting statuses: what I was doing, or what Slashdot article I was reading at the time. This is what my professors would call a “heavyweight solution to a lightweight problem”, which is why I signed up for Twitter around the same time: same task, without the overhead. For a while I’ve just been linking my Facebook status to my Twitter feed, but that just presents a different problem: since I essentially ignore Facebook now, it means I’m shooting statuses off into the ether. People occasionally comment on them, and then I just look like a jerk for not responding.
    • Facebook and I don’t really mesh. I’m going to be blunt here: I don’t have a lot of friends. I’m one of those people who has just a few really good friends, and I have other means besides Facebook for keeping in touch with them. This is sort of the opposite of the typical “Facebook model”, where people have a lot of friends and use the site as a hub to coordinate communication with all of them. The other things people mention using Facebook for, like connecting with people from high school and so forth, don’t really apply to me either. I’ve simply run out of any good reason to keep using it, and having essentially not used it at all since April, I’ve found that I haven’t missed it.
  • Their privacy policy bugs me. Like clockwork, every six months Facebook makes some change to their privacy policy, and it’s inevitably a turn for the worse. There was the Beacon debacle way back in the ancient days of 2007, but more recently there was the dustup where some pieces of information, like your Likes and friends list, not only weren’t hidden by default, but couldn’t be hidden from the general public no matter what. It’s true that Facebook has generally backed down whenever the anger over these changes gets out of hand, but these are clearly reluctant acts from a company that doesn’t really care, one that treats user annoyance at privacy changes as an oddity instead of an entirely justified reaction. And even after Facebook “backs down”, the anti-privacy creep doesn’t seem to stop; the main cause of my anger is not what Facebook is doing now, but what such a clearly apathetic company might do in the future. If I were really invested in Facebook, I might be willing to overlook these things (I admit I give Google some leeway because I rely on them for so much), but I’m not about to tolerate this kind of behavior from a service I don’t care about.

Basically I’ve decided that Facebook is more trouble than it’s worth. I’m not getting anything out of it now, so I don’t see why I should bother myself with either the time investment in it, or the concern about whether potential employers, sketchy advertisers, etc. are looking at it.

If you want to keep in touch with me, Facebook isn’t going to be an option any more, but there are plenty of other ways.

If you don’t mind trying out Twitter, I actually highly recommend it. Twitter sort of has a reputation for being a place where self-obsessed narcissists post every little detail of their drab lives, and there are plenty of people like that, but really it’s whatever you make of it. Like I was saying earlier, I just use it for posting anything I think people who are interested in me might want to see. (That’s one of the reasons I like Twitter: if you care about what I’m saying, follow me; if you don’t, don’t. Compare this to Facebook, where everyone I’ve friended sees my status whether they want to or not.) The number of people I know who use Twitter has been steadily increasing, so it’s actually as useful as Facebook ever was for sending quick bursts of communication to people. If you don’t know many people using it, its usefulness might be limited, but I encourage you to give it a try.

If you don’t feel like doing that, I’ve got e-mail, I have a cellphone with this fancy newfangled thing called texting (see the contact tab), or if you’re proximate to me in the real world, you could even come up to me and talk ;) Point is, I’m not dropping off the face of the Earth just because I’m getting rid of Facebook, and I don’t want to lose touch with anyone who still wants to talk to me. I’ll keep checking Facebook regularly for the next week or so: if you want to make plans to establish a communication link, that’s the time to do it.


[Screenshot: Google Docs’ new Drawing utility, showing basic effects, text/photo insertion, and transparency]

A few days ago, Google announced an updated version of Google Docs, their cloud-based collaborative document-editing software. There are essentially three major components to Google’s announcement: architecture, features, and Drawing.

From Google’s point of view, the architecture change is the biggest difference. Google says that the underlying software powering Google Docs (whatever it was) has been completely replaced, and that the new system will allow them to introduce features and roll out updates more quickly than they could before. They also say it has significantly better performance. What went unsaid, but is almost certainly going on, is that this “new architecture” is able to do all this because it scraps support for IE6, as Google warned they would do, and is now fully built with modern web technologies that IE6 doesn’t support (HTML5, CSS3, etc.). I can’t emphasize enough what a good thing this is: I can say from experience (from my time at NextMark) that it takes almost ten times as much effort to get a website working on IE6 as on every other browser put together, and ditching it will free up a ton of development time that Google can spend on other things. The best part is that the rest of us, who are smart enough not to use IE6, see the benefits too, since we’re now running code that has been optimized for our browsers, not IE6.

Users are probably going to be more interested in the new features, though. One of the coolest is real-time simultaneous editing: while Docs has always allowed multiple people to edit a document at the same time, one person’s changes typically took several minutes to become visible to the others working on the document. With the new Docs, Google has put in a system very similar to what Wave has: when someone else is editing a document at the same time as you, you see a cursor with their name, and you see their changes in real time. I’ve tried this several times over the past few days, and it works great, even when the people working are separated by considerable physical distance. Other helpful new features include improved equation and macro capabilities in Spreadsheets, a Print Preview function, “snap grids” in Presentations to help line things up, and improved formatting when importing from or exporting to Office. The full list of new features is available here, and is impressively long: it won’t be long before Google Docs can do everything Microsoft Office can do, and in a more convenient (and less expensive) fashion.

Another addition is Drawing, which now joins the ranks of Documents, Spreadsheets, and Presentations. You can see a screenshot of Drawing in action above. Like the others, it has real-time collaborative editing, so you can immediately see the changes others are making. Feature-wise, it’s basically equivalent to Windows’ Paint utility: nothing too sophisticated. It lets you draw lines, add text and pictures, and apply some simple fill options and cheesy effects. Like Paint, it’s meant to be a utility for banging out quick sketches and explanatory diagrams, not for editing your photos. For that job, it works very well.

Overall, the updates to Google Docs have taken a great thing and made it even better. Microsoft isn’t taking this lying down: they’re working on web-based versions of all their Office programs (I’ve used the beta and it’s actually quite good), but it’s hard to see them posing a real threat given Google’s huge head start.

Last week my assignment in Systems Programming was to take a vulnerable program (we were given only the executable binary, not the source code) and figure out how to hack it and make it do what we want. Specifically, the program, called “server”, would ask us to put in our NetID (Cornell’s equivalent of a username), and it would spit it back out and terminate, like so:
[alex@linus ~]$ server
What is your NetID? ais46
Goodbye, ais46!
[alex@linus ~]$

Because there is an environment variable on Cornell’s Linux machines that is set to your NetID when you log in, the program could detect and respond accordingly if you entered something that was not your NetID. Behold:
[alex@linus ~]$ server
What is your NetID? ahc45
Nice try ais46, but you can't fool me!
Goodbye, ais46!
[alex@linus ~]$

My mission was to get server to print something else, namely, “All your base are belong to ais46”, proving that old internet memes never die, they just get recycled into college assignments by bored grad students.

How to go about this? The string “All your base are belong to” doesn’t appear anywhere in the program’s code, and it’s not as if the string I enter can contain hidden code, because all it does is go into a variable which is compared for equality with “ais46”… right?

Well, in a perfect world, yes. We don’t live in a perfect world, though: we live in a world where it is sometimes possible to exploit a buffer overflow. A buffer overflow occurs when data is placed in a variable that is too small to contain it, and the data “overflows” into other portions of memory. For example, if I were to declare an array big enough to hold ten characters, and then put the string “My name is Alex Slover” into that array, I would be causing a buffer overflow. What happens to the part of memory that is overflowed into? Much of the time it will contain another variable in use by your program, which will now be changed to some new (essentially random) value, causing unexpected program behavior or a crash. This in itself is bad enough, but it is sometimes possible to use a buffer overflow not only to make a program crash, but to take it over entirely.
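
Here’s the standard textbook illustration of the bug in C (this is not the assignment’s code, just a toy): buffer holds ten bytes, but strcpy() will happily write past the end of it, clobbering whatever happens to sit next to it on the stack.

#include <stdio.h>
#include <string.h>

int main(void) {
    int sentinel = 42;                 /* an innocent neighboring variable */
    char buffer[10];

    /* 22 characters plus a terminating NUL, crammed into a 10-byte
     * array: undefined behavior. Depending on how the compiler lays
     * out the stack, 'sentinel' may be among the memory overwritten. */
    strcpy(buffer, "My name is Alex Slover");

    printf("sentinel is now %d\n", sentinel);
    return 0;
}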

(You may be wondering why buffer overflows are even possible, and why computers aren’t designed to simply keep track of how big a program’s variables are and refuse to allow writing data beyond the boundary of a variable. The answer is that this is possible, and in fact it’s standard in the newer programming languages: Java, Python, C#, etc. The problem is that enforcing variable boundaries (called bounds checking) takes extra time and extra memory, which is not acceptable in situations where code needs to run as fast as possible, such as in the kernel of an operating system. Thus, older languages (like C) and languages that are newer but designed for low-level systems programming (like C++) do not require bounds checking, and instead place the onus on the programmer to be extremely careful not to allow buffer overflows to occur. This does not mean that all programs written in C and C++ are vulnerable, of course, just that a great deal of extra caution is required; it would’ve been pretty easy to write the server program so that it did not create a weak spot and allow someone to break in, but that would’ve made for a rather unfair assignment.)

The next question is: how can a buffer overflow let you control a program and make it do whatever you want? Surely variables and executable code are stored in distant enough locations in memory as to make manipulating the executable code just by changing variables impossible. The answer is that code and data are indeed stored far, far apart; in fact, changing the executable data of a program once it has started running is impossible: that segment of memory is always marked as read-only and any attempt to write to it will cause a program to crash. (You may remember, back in the bad old days of Windows, getting that inexplicable error message “This program has performed an illegal operation and will be terminated”, and wondering whether your software had broken the law. What was actually going on was that the program, probably due to a careless programmer oversight, had attempted to write into a segment of memory it was not allowed to write to, and the operating system had killed it for security reasons.) So if we can’t change the executable program, how can we inject our own code? The answer lies in the stack, which is how all (or virtually all) programs running on a computer are organized in memory.

To understand the stack, remember that programs running on a computer typically consist of functions that call other functions. In C, for example, all programs start at the main() function. If you were in the main function and wanted to print something to the screen, you’d call the printf() function, which would cause the program flow to jump to that function, execute whatever instructions form the printf() function, and then jump back to main() when finished. But how does the program know where to jump back to when finished? The answer is that this value is stored on the stack, which is a special part of program memory. Every time a function is called, that function gets its own stack frame, which is a self-contained chunk of memory on the stack. So if I call the function printf(), then printf() will get its own stack frame, which is created when printf() starts and destroyed when it ends. The stack frame for a function contains all the local variables which are used by that function and that function alone (global variables, accessible between functions, are handled in a different manner), and it also contains the return address, which is where program flow jumps back to once the function is finished. For example, if I am in the main() function, and I call printf() when I am at location 0x84ae0000, the return address would be set to (for example) 0x84ae0004, which is where main() picks up and resumes executing once printf() is finished. (If the preceding notation is unfamiliar to you, seek guidance on Wikipedia.)
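
You can make stack frames tangible with a few lines of C: each call to the function below gets its own frame, and printing the address of a local variable in successive calls shows the frames stacked up in memory (on typical x86 systems, at successively lower addresses, since the stack grows downward).

#include <stdio.h>

void frame(int depth) {
    int local = depth;                 /* lives in this call's stack frame */
    printf("depth %d: local at %p\n", depth, (void *)&local);
    if (depth < 3)
        frame(depth + 1);              /* each deeper call pushes a new frame */
}

int main(void) {
    frame(0);
    return 0;
}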

So: return addresses (where the program will jump when a function has finished) are stored on the stack, and local variables, which can overflow in some cases, are also stored on the stack. Are you seeing the answer? The trick to hijacking the server program is to put so much data into the variable which is supposed to contain my NetID that it overflows onto the return address. For example, if I were to make my injected data just a long string of zeroes, then instead of the real return address, the program would try to jump to address 0x00000000. Now, forcing the program to jump to address 0x00000000 isn’t very useful: I can’t write any evil code there, and it would cause a program crash identical to the one described two paragraphs earlier. What to do? Well, through careful analysis and use of a debugger, I can figure out the memory address corresponding to the beginning of the variable that I am overflowing. Why not use that as the return address? Thus the answer becomes clear: turn the evil code to be executed into a series of bytes, use those bytes themselves as the padding that fills up the stack, and overwrite the existing return address with the address of the beginning of my code. It basically loops back on itself!

Because my systems programming class may use this assignment in the future and probably wouldn’t take kindly to me giving away the answer, and because I think figuring things out for yourself is always more fun and useful than having someone else feed it to you, I’m not going to give a play-by-play of what I did. Suffice to say, I carefully crafted some data corresponding to the evil instructions I wanted to execute, then appended a return address to the end of that data. I then passed in that data where the program was expecting my NetID, and bam: my data overflowed its variable, spilling onto the rest of the stack, causing it to be corrupted. Nothing happened immediately, but when the current function ended, it did its job and looked at the return address on the stack to know where to jump back to, which had been replaced with the address of my malicious code. It dutifully jumped to that location, began executing the instructions it saw, and that was that. Of course, the only thing my instructions did was cause server to print a goofy message, but in the real world, it would not have been difficult to craft code that did something genuinely dangerous: such as getting “root” privileges to have full control over the entire computer.

This sort of attack was only made possible because of the artificial nature of the assignment; in the real world, things are not so simple. Nevertheless, buffer overflow attacks, even if they are more complicated than the one I have just described, are among the most common entry vectors for a hacker with bad intentions to gain control of a computer system. We’re doing assignments like this not because Cornell wants to secretly train a private army of hackers, but because by understanding exactly how these attacks work, we’re better equipped to stop them.

I’ll probably post some more things along these lines in the future, but there’s a vast quantity of resources available if you want to learn more. I especially recommend The Art of Exploitation, a no-nonsense book that teaches you exactly how stacks, programs, and computer networks work, and how to look for ways to exploit them. This book is particularly good because it’s not a “cookbook” that says, “Do this to hack into a computer” (such a book would be useless anyway, as whatever vulnerabilities it explained would quickly be patched into irrelevance). Rather, it explains how computers work at a fundamental level, why vulnerabilities occur, and where to find them. The book is useful even if you’ve never done a day of programming in your life: the first chapter consists of enough C lessons to get you moving. (Just to be perfectly clear: this is an academic interest on my part. I would never do anything illegal with the knowledge I have, and neither should you. Hacking is like the Force: it should be used for knowledge and defense, never for attack.)

It’s really too bad to see that Palm isn’t doing so well, since they have one of the best smartphone OSes on the market (in many respects, the best). Despite all the excitement in the days immediately before and after the launch of the Pre, sales have not been particularly good, and Palm stock is right back where it was before they dropped the webOS bombshell on the unsuspecting world. Ars has a good article on why that is, and it boils down to two things:

  • Apps – The first rule when trying to compete with Apple on their own territory is: don’t. The second rule, though, is that if you are going to compete with Apple on their own territory, you have to observe the things people are complaining about vis-a-vis Apple and improve on them; don’t make them worse. (For example, this is the reason even fewer people care about the Dell Adamo than about the MacBook Air: the MacBook Air sells for highway-robbery prices as it is, and the Adamo is even more expensive. Epic fail in every sense of the term.) Palm made the exact same mistake: when people were (and still are) complaining about Apple’s restrictive and obscure app approval process, Palm could’ve swept in to save the day by making webOS a completely open development platform (it is based on Linux, after all), but they decided to make the process even worse than Apple’s: the APIs available to developers were lousy, the app approval process was achingly slow, and there was just nothing to get developers excited about a platform that was already way behind the iPhone in terms of installed user base. Talk about shooting yourself in the foot. Then Android came along and actually delivered on the promise of a platform without development restrictions, and that was that. To this day, there are barely 2,000 apps available for webOS, slightly over a hundredth (1%) of what you can get on the iPhone. That ain’t gonna cut it.
  • Palm’s bizarre advertising – It’s a shame that all it takes to kill the chances of a promising product is crappy advertising, but that’s the way of the world. This is especially frustrating because most of Palm’s work was already done for them: in terms of visuals, the Pre looked better than anything on the market when it was released, including the iPhone. They just had to show the phone off and let the money roll in. Instead, they opted for those mildly disconcerting ads with the featureless woman emotionlessly doing… well, no one was really sure what she was doing, which was the problem. Palm only just realized this, and fired their ad agency, but way too late.

With Palm’s revenue continuing to spiral downwards, the rumors inevitably started to swirl that someone would have to buy them out. For a while, Jon Rubinstein denied these and insisted that everything was fine, but as of this morning the reports coming out say they’ve reached the end of the line: Palm is looking for offers.

Naturally, there’s a great deal of speculation as to who might decide to snap Palm up, assuming the report is even accurate. The current frontrunner seems to be HTC, the Taiwanese company which has exploded out of nowhere (shifting its focus from Windows Mobile to Android was probably the best business decision of 2009) to become the leading manufacturer of non-Apple desire-inducing smartphones, at least in the US. I see the logic in this, but I also don’t think it’s a sure thing: HTC seems to be doing just fine manufacturing phones for Android and Windows Mobile. Nevertheless, it is possible: HTC has been making add-ons and skins for both Android and WinMo, so they clearly have both good programmers and a desire to tune the user experience to be exactly what they want, and having their own operating system would give them just that.

Some people have suggested that buyers might come from companies that already have mobile operating systems: RIM, Apple, Google, and Microsoft have all been put forward as possibilities. I don’t think any of these are very likely: RIM’s all-work-no-play corporate culture doesn’t mesh with Palm at all, and Mike Lazaridis seems devoted to his current platform; Apple (and Steve Jobs in particular) would probably rather watch Palm’s bankruptcy in glee than spend a penny on them, even to get their patent portfolio; Android is doing well enough that I doubt Google needs the help; and Microsoft just makes no sense: webOS’s Linux codebase would be of almost no use to Redmond.

To me, the more likely possibilities are companies that want to branch into the smartphone market but don’t have an operating system of their own and don’t feel like adopting Android. Dell and Lenovo both spring to mind: both are computer manufacturers looking enviously at the exploding smartphone market, and both are utterly lacking in the means to develop a nice smartphone operating system themselves. (Can you imagine an operating system designed by Dell? Are you picturing a great deal of grey, and a user experience that drains your soul? If not, you’re doing it wrong.) Lenovo is apparently already working on a mobile operating system, or at least a “platform”, but I can’t see it impressing people the way webOS can.

One intriguing possibility to me is Nokia. On the one hand, Nokia is doing just fine in the cellphone market as a whole: they’re the world’s top manufacturer, after all. On the other hand, their mindshare and market share in the United States are next to nonexistent, and they could really use a smartphone OS with a little more kick. I have never been all that impressed with Maemo: it’s fine for tinkerers and hackers, and for people who need very basic smartphone functionality, but its interface is not designed for large touchscreen devices (which are, like it or not, the way of the future) and it’s just plain ugly. Its connectivity can’t match up to Android or the iPhone either. No matter how hard Nokia tries, they can’t seem to break into the American market, but I think if they bought webOS, either scrapped Maemo altogether or integrated it into Palm’s offering (they’re both Linux-based, which makes that easier), and offered some genuinely attractive phones, they might have better luck. Now, this too seems like a long shot: like RIM, Nokia has a corporate culture not conducive to buying a smaller company and using that company’s IP to scrap everything they’ve developed. But webOS is a great operating system that just needs a company with deep pockets and an ounce of sense to bring it to the masses, and Nokia has both. And they need to scrap Maemo. They need to scrap Maemo so bad.

Of course, the entire story about Palm looking for a buyer could just be hearsay, but SEC filings don’t lie: unless something turns around fast, buyout or bankruptcy look to be the only options. Keep your eye on this one as it develops.

