I was once on cleanup duty, in a particularly unfortunate situation.
I was once on cleanup duty, in a particularly unfortunate situation.
I got a chance to work for a long-time hero of mine: someone who was influential in computer science circles in the 60s and 70s, who (while fairly non-technical himself) has a lot of ideas that haven’t really been given the chance they deserve. Along with a friend I had roped in, we embarked upon trying to ‘finish’ a ‘prototype’ that had been provided several years before by a self-taught programmer who had never worked on a large project before. He had told the man we were working for that he had gotten it very nearly finished, and it did demo nicely, but he burned out so badly that he ended up (from what I understand) in a mental hospital for a while, and had been out of touch and unable to work on that or any code for several years by the time we saw it.
Initially, the problem was that we couldn’t find any of the code. We were told that it was a combination of C++ and python, wherein python was being used as a plug-in language; we found no C++ or python code in any of the “source” zip files we were given, and while revision control had been provided, this guy never used it except once, right before he quit, to check in all the windows binaries and a whole bunch of screenshots and videos of various prototypes, along with a couple pieces of out of date documentation and a pdf copy of Dive Into Python.
Eventually, after much digging around on the part of various parties who at one time or another had copies of this, we got a large zip file that contained a nested series of smaller zip file copies of the same directory structure with progressively earlier dates. On the third or fourth level down, we found a single C++ file.
We discovered that this single C++ file contained all of the C++ code for the entire project. Nearly all of it was commented out, using line (not block) comments. It wouldn’t build on any platform — it had a bunch of typos and wasn’t actually valid C++; in other words, the only copy we had of the C++ source was a messy, old, in-between version. But, inspecting it, we discovered that it had partial C++ implementations of several functions that were supposedly “done” (such as loading and parsing a proprietary file format), which were mostly disabled. It was difficult to determine which of these were disabled, because there would be several functions with almost the same name that had nearly exact duplicates of the same code, and various caller functions would use various version. Often we discovered that some function that was closest to complete-looking couldn’t possibly work, only to discover later that the only call to the whole chain of operations had been disabled and replaced with some hard-coded value. This single C++ file was several megabytes in size.
Eventually, in order to make it easier to debug, we separated this file into about ten, by categories laid out in this (obsolete) documentation, and determined which blocks of code were definitely not close to functional, removing them. After this streamlining we got the size down quite a bit, but discovered that nearly all of the functionality was missing. Complicated mechanics behind drawing, file parsing, object placement, and structure were nowhere to be found, but it built and worked — on windows, at least. (It had also been sold as cross-platform despite being build in visual studio; it turned out that it was windows-only, but mostly because the author had used a windows-specific sound library to play notification sounds instead of using the one that came with SDL. We quickly fixed that.)
Combing through the code, we discovered that there was a single line early in execution that loaded a hard-coded arbitrarily-named text file (“abiowy2222.txt” or something) as a python script. We found a directory full of strangely named text files, and while some of them were full of junk (copied and pasted pieces of documentation or forum discussions, lists of error messages), about half of the 100+ text files were partially overlapping versions of a big chunk of python code.
It turns out that this text file contained a bunch of python code that performed a bunch of calls back into C++ to perform draw calls on some large chunk of hard-coded data. This programmer hadn’t bothered to write code for loading that ugly file format he designed; he hardcoded the content of the one file he was using for the demo. He had skipped writing the logic for determining layout, and instead had hard-coded the positions for the objects described in this file. And then, this python file had its own main loop and exited at the end of it — in other words, nearly all of the C++ code was entirely disabled.
We endevoured to rewrite pretty much all of this logic, and we did, at least twice. We wrote an actual implementation of the file format loader (and discovered that most of the examples we had were subtly corrupted) and an actual implementation of the layout logic, both in C++. While trying to debug a (semi-independent) module that implemented a kind of non-relational database based on a graph of arrays of pointers, we decided to attempt a pure python implementation, in the hopes that it might be fast enough to be a good comparison. (This original author was obsessed with premature optimization and with using obscure features of C++, and had comments next to each function calculating — typically totally incorrectly — how many bytes per object were being transferred. The database was implemented in an overly complicated way for documented speed-related reasons, but it turned out to be both slower and more bug-prone than the straightforward and naive approach we took in python, across many varied tests.) Having determined that this database in its C++ form was essentially beyond salvaging, we used the python version instead, and spent a great deal of time trying to square the fact that initial draw time was so fast with the extreme slowness of interactivity. We ended up rewriting most of that draw code, before rewriting all of it from scratch in python in a single all-nighter. This all-nighter occurred about two years after we initially started working on this project.
We did this for free, since we were doing it for a mutual hero, but it really put us off the idea of playing the role of code doctor in the future.