Code Editors

I recently got serious about golf again, after a near-decade layoff. Although I played on my high school’s JV golf team (and even managed to win a few matches), I’d never had any formal instruction. I was only able to muddle by with my substandard gear and hopeless swing because I was playing every day. I’ve played only sporadically since then, maybe one or two rounds a year, so my makeshift ability to get around the course has evaporated with disuse.

In order to become a better golfer I decided to start from scratch, beginning with lessons from a PGA pro and purchasing a set of clubs suited to my ‘experienced beginner’ game. And in the process I have found some interesting parallels between learning to golf (again) and learning to use a code editor.

Building a bicycle-built-for-two isn’t a great metaphor for designing multi-threaded software, but I needed a catchy title, so there you have it.

Few companies and few developers like to talk about the warts in their software, but let’s face it: for decades, SlickEdit has been a single-threaded application, nothing but a bicycle-built-for-one. That single thread rode high and strong and did what it wanted, with all the powers and dangers of the world granted to it to do as it pleased. Let’s call this thread George, King George, as it was.

Then we decided to let another thread into our world. Well, needless to say, King George did not like this newcomer in his kingdom. Suddenly, he had to synchronize his pedaling and braking and balancing with another thread, and let me tell you, he was not used to doing so much paperwork. Sorry, George, you have to acquire a Mutex before you can use the memory allocator. Sorry, George, you have to release the Mutex when you are done. George was angry, frustrated, and not accustomed to waiting in line behind these mere peasants. They were supposed to make his life easier. Instead, they pervasively filled his kingdom with landmines. George had to step carefully from now on. His smallest comfort was that those little threads also had to step carefully, very carefully. The King was also comforted to know that there still were places that only he could step.

Converting a single-threaded piece of software into multi-threaded software is a task that requires vision. It also requires strictness and thorough investigation of what-calls-what and what-uses-what. The vision comes from finding ways to utilize asynchronous processing and deciding what parts of the application to make thread safe first. The vision also comes from learning new ways to design less stateful APIs and data structures so that you can minimize access to shared data, or at least make the access thread-safe.

It can’t be done all at once.

The techniques for writing thread-safe software and multi-threaded software in general are well documented. It is certainly not a new science, but each thing you convert poses new challenges.

My challenge in SlickEdit 2010 and SlickEdit 2011 was to make it possible for a thread to parse code and insert indexing information into a tag database. Well, that certainly can’t be real hard. That is, until you meet King George.

Step 1: A thread that parses code

SlickEdit has great parsers for many, many languages; the problem was, none of them were written in a thread-safe manner. The lexical analyzer framework was completely stateful and also lacking in flexibility and power. Every parser would need a new lexer, and it had to perform well. The invocation mechanism depended on the Slick-C interpreter, which is still far away from being a thread-safe component. The parsers used global data when they wanted to and communicated directly with the database. To make a long story short, a lot of code had to change.
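To make the "stateful lexer" problem concrete, here is a minimal sketch of the alternative. It is not SlickEdit's lexer framework; `Lexer` and `tokenize` are hypothetical names invented for illustration, and the token rules are deliberately trivial. The only point is that every bit of state lives in the object rather than in globals, so each thread can own its own lexer.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical tokenizer: the cursor is a member, not a global, so two
// threads can lex two different buffers at the same time.
class Lexer {
public:
    // The text must outlive the lexer; only a reference is stored.
    explicit Lexer(const std::string& text) : text_(text) {}

    // Returns the next identifier-like token, or an empty string at end of input.
    std::string next() {
        while (pos_ < text_.size() && !isWordChar(text_[pos_])) ++pos_;
        const std::size_t start = pos_;
        while (pos_ < text_.size() && isWordChar(text_[pos_])) ++pos_;
        return text_.substr(start, pos_ - start);
    }

private:
    static bool isWordChar(char c) {
        return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
    }

    const std::string& text_;
    std::size_t pos_ = 0;
};

// Collects every token from one buffer. It touches only its argument and
// its locals, so concurrent calls do not interfere with each other.
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> tokens;
    Lexer lexer(text);
    for (std::string tok = lexer.next(); !tok.empty(); tok = lexer.next())
        tokens.push_back(tok);
    return tokens;
}
```

A parser built on something like this can hand its results back to the caller instead of writing to the database directly, which is what makes it possible to run on a thread in the first place.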

Step 2: A tagging job queuing framework

Once we could parse code, we couldn’t just start creating threads all willy-nilly. We needed a block of threads to do the work and pass the results forward to the editor. We needed to define what a tagging job was and how the indexing information was collected. We needed to reconsider every place where the editor would launch tagging jobs and see if those jobs could be done on a thread. As it turned out, the answer was “nearly everywhere”. Oh, by the way, King George liked that.
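For readers who have not built one, a "block of threads" fed by a queue can be sketched in a few dozen lines of standard C++. This is an illustration under my own assumptions, not the SlickEdit framework: `JobQueue` is a hypothetical name, and a "job" here is just a callable rather than a real tagging job.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical fixed-size worker pool that pulls jobs from a shared queue.
class JobQueue {
public:
    explicit JobQueue(std::size_t workerCount) {
        for (std::size_t i = 0; i < workerCount; ++i)
            workers_.emplace_back([this] { run(); });
    }

    // Lets the workers drain any remaining jobs, then joins them.
    ~JobQueue() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (std::thread& worker : workers_) worker.join();
    }

    JobQueue(const JobQueue&) = delete;
    JobQueue& operator=(const JobQueue&) = delete;

    void submit(std::function<void()> job) {
        assert(job && "a job must be callable");
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        wake_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // only reachable once stopping_ is set
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run outside the lock so other workers can keep dequeuing
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};
```

The critical section covers only the queue itself. The job runs with no lock held, which keeps the critical sections small and short-lived.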

Step 3: A thread-safe tag database

This is one of the big steps forward in SlickEdit 2011. A thread can write to the tag database, making it possible to do everything required to build or update a tag database completely in the background. This was no easy task, because the tag database is a sophisticated component that was built specifically for King George. We had to rethink how we traversed through items in the database. We had to get rid of shared global variables. We had to refine the database block cache so it could be shared by threads.
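As a rough picture of what "a block cache that can be shared by threads" might look like, here is a small sketch. Everything in it is an assumption made for illustration: `BlockCache`, its `Loader` callback, and the use of a string as a stand-in for a database block are hypothetical and say nothing about how the real tag database is laid out.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical cache of database blocks shared by several threads.
// Blocks are handed out as shared_ptr<const Block>, so a reader keeps its
// block alive and immutable without holding the cache lock.
class BlockCache {
public:
    using Block = std::string;  // stand-in for a fixed-size database block
    using Loader = std::function<Block(std::size_t)>;

    explicit BlockCache(Loader loader) : loader_(std::move(loader)) {}

    std::shared_ptr<const Block> get(std::size_t blockId) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = blocks_.find(blockId);
        if (it == blocks_.end()) {
            // Loading under the lock keeps the sketch short; a real cache
            // would probably avoid holding a mutex across disk I/O.
            it = blocks_.emplace(blockId,
                                 std::make_shared<Block>(loader_(blockId))).first;
        }
        return it->second;
    }

private:
    std::mutex mutex_;
    Loader loader_;
    std::unordered_map<std::size_t, std::shared_ptr<const Block>> blocks_;
};
```

Note that there is no "current block" or cursor stored in the cache. The caller holds its own position, which is the kind of less stateful API that replaces shared global variables.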

Step 4: Getting a list of files to tag on a thread

The final step was to scale the threading up from allowing the main thread to schedule jobs one at a time to having a thread schedule a list of files to be tagged. We needed this thread to find all the files in your workspace, check dates, figure out what language the files were, and finally schedule anything that was new or out-of-date to be tagged. This meant rewriting a lot of single-threaded Slick-C code as thread-safe C++. The speed gains from this change were significant.
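The "check dates and schedule anything new or out-of-date" part boils down to a comparison like the one sketched here. The names `FileInfo` and `findFilesToTag` are hypothetical, timestamps are plain integers, and language detection is left out. The real code has to deal with workspaces, file systems, and the tag database itself.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical record for one workspace file.
struct FileInfo {
    std::string path;
    std::int64_t modifiedTime;  // for example, seconds since the epoch
};

// Returns the paths that are new or have changed since they were last tagged.
// taggedTimes maps a path to the modification time recorded at its last tagging.
std::vector<std::string> findFilesToTag(
    const std::vector<FileInfo>& workspaceFiles,
    const std::unordered_map<std::string, std::int64_t>& taggedTimes) {
    std::vector<std::string> toTag;
    for (const FileInfo& file : workspaceFiles) {
        const auto it = taggedTimes.find(file.path);
        const bool isNew = (it == taggedTimes.end());
        if (isNew || it->second < file.modifiedTime)
            toTag.push_back(file.path);
    }
    return toTag;
}
```

Because the function reads only its arguments and returns a fresh list, it can run on a background thread and hand the result to a job queue without touching anything the main thread owns.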

The final result

SlickEdit 2011 is a huge improvement over SlickEdit 2010 with respect to its handling of background tagging. King George is beginning to appreciate what these little threads are doing for him and how they are making his job easier now.

The worst problem: Mutex acquisition order

This is what I regard as the worst synchronization problem in thread-safe code. Now, if you are fortunate enough to be writing thread-safe code from scratch, or you have existing, cleanly modularized code whose critical sections are small, short-lived, and well encapsulated, it is not a problem likely to hurt you unless you do something silly. But when you have a large base of single-threaded code with large amounts of shared data, and you have to make access to that data thread-safe, things get ugly really fast. Here is the classic deadlock condition.

    VSMUTEX mutex1;
    VSMUTEX mutex2;

In the main thread, King George does this.

    mutex1.lock();
    ...                        // I own mutex1 and rule supremely
    ...                        // except that I don't own mutex2 yet
    mutex2.lock();

In another thread, some peasant does this.

    mutex2.lock();
    ...                        // I own mutex2, nobody else can have it,
    ...                        // not even my revered King
    mutex1.lock();

When King George has mutex1 and is trying to get mutex2, and simultaneously another thread has mutex2 and is trying to get mutex1, they will deadlock forever. There are a few ways around this, but the best way is the classic solution: don’t do that. The problem is that it is hard to see how pervasive this can be when some special case, such as error handling, reverses everything you think you know about the order in which you acquire mutexes. This is a problem that can only be solved with thorough analysis and testing and fundamentally sound designs. But, given the circumstances surrounding migrating older single-threaded code to support threading, such designs are harder to come by than one would hope.
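For completeness, here is a sketch of one of those ways around it, written with standard C++ `std::mutex` rather than `VSMUTEX` (whose interface beyond `lock()` I am not going to guess at here). Instead of taking the two locks one after the other, both threads ask for them in a single `std::lock` call, which is specified to avoid deadlock no matter what order the arguments are given in. `Shared`, `touchBoth`, and `runBothThreads` are hypothetical names for illustration only.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical shared state guarded by two mutexes, mirroring mutex1 and mutex2.
struct Shared {
    std::mutex mutex1;
    std::mutex mutex2;
    int counter1 = 0;
    int counter2 = 0;
};

// Acquires both mutexes as one operation. std::lock uses a deadlock-avoidance
// algorithm, so naming the mutexes in either order is safe.
void touchBoth(Shared& s, bool reverseArgs) {
    if (reverseArgs)
        std::lock(s.mutex2, s.mutex1);
    else
        std::lock(s.mutex1, s.mutex2);
    // adopt_lock: the guards take over mutexes that are already locked
    // and release them when this function returns.
    std::lock_guard<std::mutex> guard1(s.mutex1, std::adopt_lock);
    std::lock_guard<std::mutex> guard2(s.mutex2, std::adopt_lock);
    ++s.counter1;
    ++s.counter2;
}

// Runs a "king" and a "peasant" that name the mutexes in opposite orders.
void runBothThreads(Shared& s, int iterations) {
    std::thread king([&s, iterations] {
        for (int i = 0; i < iterations; ++i) touchBoth(s, false);
    });
    std::thread peasant([&s, iterations] {
        for (int i = 0; i < iterations; ++i) touchBoth(s, true);
    });
    king.join();
    peasant.join();
}
```

This only helps when both locks are taken at the same point in the code. When the second lock is acquired deep inside some other call, as tends to happen in converted single-threaded code, you are back to enforcing a single acquisition order by design and by review.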

Summary

If you read this whole thing looking for deep insight on how to write solid thread-safe code, well, you probably finished sadly disappointed. If you are planning on renting a bicycle-built-for-two with your spouse and were looking for riding tips, then you really, really read the wrong article. If you came here because you always thought SlickEdit should do tagging in the background and wondered why it took us so long to implement it that way, then maybe now you understand the obstacles and hurdles a little better. We are still iteratively improving the system and working to improve the tagging throughput. You can look forward to seeing even more performance gains and scaling in the first update (16.0.1).

SlickEdit is moving forward, and finally, George isn’t the only one pedaling the bike.

SlickEdit 2011 is an unusual release. Typically, a release contains a good number of new features that enhance your ability to edit source code. This year, the words “updated” and “enhancements” play more prominently in the list:

  • 64-bit Versions for Linux and Windows
  • Multithreading the Context Tagging Engine and Auto-Reload
  • Support for Ruby Debugging
  • Support for Git Version Control
  • Dynamic Debugger Enhancements
  • Updated Microsoft Visual Studio 2010 Support
  • Updated JUnit Support
  • SlickEdit License Manager

So what happened? Choosing what goes into a release is the toughest job for a product manager; there is never enough time to develop everything we’d like to get done in a given year. So we have to make hard choices.

We look at our customer base as a set of constituencies, each with different needs and change requests. Each language we support represents a different constituency with different needs, likewise for each platform. Fixing a tagging problem in C++ does little to help a Python programmer and vice versa. Some features, like Backup History, introduced in SlickEdit v9.0, are useful no matter what language or platform you are using.

Another way to divide constituents is into existing customers and new customers. Generally speaking, new features are considered more helpful in going after new customers, while bug fixes are aimed at existing customers. One consistent piece of feedback from existing customers is that they don’t really want new features; they just want the existing features to work better. In each release, we try to strike a balance between features to lure new customers and bug fixes for existing customers.

When we made the feature plan for this year, it became clear that there were parts of SlickEdit that really needed updating. As a product with a very long history—the first version of SlickEdit was released 23 years ago—we have seen some dramatic changes in the platforms we run on and the expectations of our customers.

In the early versions, resources were scarce, so you needed to be as lean as possible and use as little memory and CPU as you could. This also made your program very fast, which is one of our top goals. Now, a typical development machine has 4 cores and 4GB of memory or more. In this environment it’s frustrating to wait for an answer while the program is only using 25% of your available resources. That’s why the multithreading work was so important.

Don’t get me wrong. We’re not out to become a resource hog. We do believe that, for programmers, coding is the most important thing they are doing and that sufficient resources should be brought to bear.

As code bases grow, it’s even more important to have an editor, like SlickEdit, that knows the location and type of your symbols. Being able to generate that information efficiently and access it quickly is always our top priority. SlickEdit 2011 is a big step in giving you the fastest possible code navigation.
