Categories

Let's Debate Coding Style

This might be somewhat of rant but what the hell. Let’s discuss coding style. We each probably have our personal preferences. Tabs, Spaces, 2 spaces, 3 spaces, 4 spaces, tabs set to 4, braces on lines by themselves or not, etc, etc. Until I started working at my present company I’ve never in 25 years worked at a company that had any real official coding style and things worked mostly fine. There might have been some un-official style or informal discussions but mostly people just did their own thing at it worked.

At my present company though they have a STRICT with a capital S style guide. I’m mixed on it. On the one hand it’s nice to agree on a style. On the other hand, I hate Hate HATE some of their decisions.

Let me say first off there are 2 classes of style. Functional style and Formatting style. For functional ones I have few if any beefs with my current company’s decisions. One good example, according to the style guide any constructor that takes just 1 argument must be declared explicit.

class Foo {
 public:
  explicit Foo(const Bar& b);   // 'explicit' is required here.
  ...
};

That’s a functional style issue. The reason for this rule is that single argument constructors act as implicit conversions. With out the explicit this code would compile

   ...
   Bar bar;
   Foo foo;

   foo = bar;   // WTF!?
   ...

Of course there are places where that kind of conversion would be useful but in general it can cause more trouble than it’s worth. Adding the explicit keyword prevents this kind of error and training coders to look for it is a good thing.

Formatting style on the other hand is often comes down to personal preference. It’s possible that if my company’s style guide said “this is just our personal preference and we aren’t going to justify it” I’d have less issues with it (doubtful) but it really irks me that they claim their style choices make code ‘more readable’ and that they value ‘consistency’ since many of their choices are neither.

So some examples with I hope some objective arguments in favor of my preferred styles

Braces

There’s 2 major brace styles I’ve seen (there are more but I’ve never run into them). The first is sometimes called K&R style since it was used in the original C book

// Style 1
void SomeFunction(int a) {
  if (a > 10) {
    printf("Yea!\n");
  } else {
    printf("Boo!\n");
  }
}

The other common style

// Style 2
void SomeFunction(int a) 
{
  if (a > 10) 
  {
    printf("Yea!\n");
  } 
  else 
  {
    printf("Boo!\n");
  }
}

I prefer style 2. Style 1′s only good argument is that it’s more vertically compact letting more code appear on the screen at once. That’s certainly a valid argument. For 25 years I used style 2. I never once thought “If only I could see more lines I’d be more productive” so that argument doesn’t hold a lot of water for me.

On the other hand, style 2 makes it very clear there are 3 blocks and those blocks can easily be copied, pasted, duplicated, and moved around with. The blocks are

{
  printf("Yea!\n");
}

and

{
  printf("Boo!\n");
}

And the outer block

{
   if (...) 
     block
   else
     block
}

Style 2 is also more consistent. In the example above braces appear in only 2 columns instead of 5. In style 2 there’s no where in the code where anything comes after a closing brace. In style 1 there’s the exception, else. Even better, it’s easier to comment style 2.

  // If were going too fast apply the brakes.
  if (velocity > max_velocity) 
  {
     ApplyBrakes():
  }
  // Otherwise if we're going too slow accelerate.
  elif (velocity < min_velocity)
  {
     StepOnTheGas();
  }

With style 1 where does the second comment go?

  // If were going too fast apply the brakes.
  if (velocity > max_velocity)  {
     ApplyBrakes():
  } elif (velocity < min_velocity) {
     // Otherwise if we're going too slow accelerate
     StepOnTheGas();
  }

There the comment is too late. All other comments come before the code they are describing. But doing that is just as bad

  // If were going too fast apply the brakes.
  if (velocity > max_velocity)  {
     ApplyBrakes():
  // Otherwise if we're going too slow accelerate
  } elif (velocity < min_velocity) {
     StepOnTheGas();
  }

Now the comment is inside the wrong code block.

Preprocessor #if/#else/#elif/#endif indentation

The Google style guide says always put preprocessor definitions flush left like this.

#if ENABLED

  DoIt();

#if SOME_FEATURE

  DoThis();

#if SOME_SUB_FEATURE

  DoThat();

#endif  // SOME_SUB_FEATURE

  DoSomethingElse();

#endif  // SOME_FEATURE

#endif  // ENABLED

I don't understand that decision at all. (1) It's inconsistent with normal if/else formatting. (2) If not indenting preprocessor #if why indent regular if/else statements? What's the argument for why indenting code blocks is good like this

  if (something)
  {
    doSomething();
    if (something_extra) 
    {
      DoSomethingExtra();
    }
  }

and indenting preprocessor blocks like this is not good?

#if ENABLED

  DoIt();

  #if SOME_FEATURE

    DoThis();

    #if SOME_SUB_FEATURE

      DoThat();

    #endif  // SOME_SUB_FEATURE

    DoSomethingElse();

  #endif  // SOME_FEATURE

#endif  // ENABLED

I prefer to indent preprocessor blocks like this bottom version. It's consistent. The same reasons that make normal block indentation more readable apply just as much to nested preprocessor directive blocks.

Formatting Function Arguments

There are a ton of ways to list arguments to a function. The most common is just across the line as in

  void SomeFunction(int arg1, float arg2, char arg3) 

The debate usually comes up when you have many arguments or arguments with long names or types such that you'd like to use multiple lines. My company prefers these styles. First do this

  void SomeFunction(int arg1, 
                    float arg2, 
                    char arg3) 

If things are still to long then do this

  void SomeFunction(
      int arg1, 
      float arg2, 
      char arg3) 

I generally jump straight to the second format. Why? Because the first style is inconsistent

void RunInPlace(int speed,
                float duration);

void Sprint(int speed,
            float duration);

void SwimAcrossOcean(int speed,
                     float duration);

void CrawlOnBack(int speed,
                 float duration);

vs

void RunInPlace(
    int speed,
    float duration);

void Sprint(
    int speed,
    float duration);

void SwimInOcean(
    int speed,
    float duration);

void CrawlOnBack(
    int speed,
    float duration);

The first one arguments appear in 4 different columns. In the second they are all in the same column. Another reason is the first style often ends up generating extra work. Let's stay I rename the functions using search and replace or refactoring tools.

void Run(int speed,
                float duration);

void Sprint(int speed,
            float duration);

void Swim(int speed,
                     float duration);

void Crawl(int speed,
                 float duration);

vs

void Run(
    int speed,
    float duration);

void Sprint(
    int speed,
    float duration);

void Swim(
    int speed,
    float duration);

void Crawl(
    int speed,
    float duration);

The first style is now broken and someone has to go back and manually fix the lines. The second style just works whether it's a function declaration or calling the function.

Variable Declaration

When declaring variables, some developers prefer

{
   var speed, hitpoints, damage;
}

or possibly

{
   var speed, 
       hitpoints, 
       damage;
}

I prefer

{
   var speed;
   var hitpoints;
   var damage;
}

Why? One reason is consistency. The first 2 styles are arguably less consistant than the 3rd. In the first 2, 2 variables have commas after them, 1 has a semicolon. 1 variable is preceded by 'var' and 2 are not. In the 3rd style every variable is treated consistently.

Another reason is ease of editing. Delete the first or last variable from the first 2 styles and they break and have to be edited further. With the 3rd style that doesn't happen. Let's say I want to search and replace to get rid of 'speed'. With the first 2 styles I need to know the order of declaration to do the search and replace. It might be "speed, " or it might be "speed;". With the 3rd style without looking I know exactly what will work. "var speed;"

Where to put *

You've got 2 options

  int *ptr;

or

  int* ptr;

Even K from K&R prefers the second in C++ but i'd argue even in C style 2 is more consistent.

  int   speed;
  float duration;
  char  *name;

vs

  int   speed;
  float duration;
  char* name;

With style 1, sometimes a variable name is by itself with its complete type on the left. Other times it's not with part of its type on the left and another part with part of its type connected to it. Style 2 is always the same. Type on the left, name by itself.

Constructors

Google prefers

SomeClass::SomeClass()
    : field1(value1),
      anotherField(value2),
      field3(value3),
      yetAnotherField(value4) {
}

I prefer

SomeClass::SomeClass()
    : field1(value1)
    , anotherField(value2)
    , field3(value3)
    , yetAnotherField(value4) 
{
}

Again it comes down to consistency and easy of editing as well as easy of checking. On consistency the first style has 3 different cases; 1 for the first field, 1 for the last field, and 1 for the fields in between. The 2nd style only has 2 cases. 1 for the first field, 1 for the rest. On editing, with the 2nd style adding or moving fields is easy. Except for the first line adding or removing a new field doesn't effect any other lines. With the second style, if you add a field to the end you have to edit 2 lines instead of just 1. Finally, which one of these is easier to spot the error?

SomeClass::SomeClass()
    : field1(value1),
      anotherField(value2),
      field3(value3)
      yetAnotherField(value4) {
}

or

SomeClass::SomeClass()
    : field1(value1)
    , anotherField(value2)
      field3(value3)
    , yetAnotherField(value4) 
{
}

BTW WebKit prefers the second style here.

Splitting Expressions

Slightly similar to the last one is how to split long expressions. Google prefers operators at the end of lines

  damage = speed +
           weight +
           strength;

I prefer something along the lines of

  damage = speed
         + weight
         + strength;

I think this one could maybe be argued either way. Style 2 has the advantage that if you can't see the end of the line you know that the 2nd and 3rd lines are continuations of the first since they start with an operator that requires or at least suggests they are a continuation. Not so with the first style. On the other hand, Google style has an 80 column limit. If you accept that the 80 column limit is a good thing (I don't) then maybe style 1 has no cons since you'll likely always be able to see the end of the line.

Lining things up

When it's not too inconvenient I like to line things up. Some examples

   struct Creature 
   {
     int         speed;
     float       endurance;
     const char* name;
   };

   Creature creatures[] = 
   {
      // spd  endure  name
      {  75,  1.075f, "Blarg",     },
      {   6,  0.500f, "Gern",      },
      { 103, 13.500f, "Nargblat",  },
   };

Google requires no padding so

   struct Creature  {
     int speed;
     float endurance;
     const char* name;
   };

   Creature creatures[] = {
      // spd, endure, name
      { 75, 1.075f, "Blarg", },
      { 6, 0.5f, "Gern", },
      { 103, 13.5f, "Nargblat", },
   };

The best argument I can think of for padding is email, spreadsheets, mp3 players and file explorers. Padding is basically turning stuff into tables. Would you rather see

Artist Album Song Duration
Madonna Erotica Rain 3:54
U2 The Joshua Tree In God's Country 2:57
Lady Gaga The Fame Poker Face 3:35
Red Hot Chili Peppers Mother's Milk Higher Ground 3:23

or this?

ArtistAlbumSongDuration
MadonnaEroticaRain3:54
U2The Joshua TreeIn God's Country2:57
Lady GagaThe FamePoker Face3:35
Red Hot Chili PeppersMother's MilkHigher Ground3:23

Some people, especially in JavaScript or Python, would be happy with this style

var creatures = [
  {  
    speed: 75, 
    endurance: 1.075, 
    name: "Blarg"  
  },
  {  
    speed: 6,  
    endurance: 0.5, 
    name: "Gern"
  },
  { 
    speed: 103, 
    endurance: 13.5, 
    name: "Nargblat"
  }
];

But I'd argue it's still often better in table form.

var creatures = [
  { speed:  75, endurance:  1.075, name: "Blarg"    },
  { speed:   6, endurance:  0.500, name: "Gern"     },
  { speed: 103, endurance: 13.500, name: "Nargblat" }
];

It makes it far easier to compare them. Which one has the fastest or slowest speed or which one has the most or least endurance is very clear in table for and not clear at all in non-table form. Of course if the tables are giant or edited by non-engineers it might be better to look into editing them outside of code.

(I'd also like to point out that any language designer that doesn't allow trailing comma's in lists needs to be shot. I'm taking to you JavaScript!)

Arguably formatting items in a table makes them more readable otherwise we wouldn't do it. Therefore formatting code in what is effectively a table format is often more readable. This includes expressions. I'll often do something like this

  uint32_t rgba8888 = ((red   & 0xFF) << 24) 
                    | ((green & 0xFF) << 16)
                    | ((blue  & 0xFF) <<  8)
                    | ((alpha & 0xFF)      );

I might even shift by 0 for more consistency

  uint32_t rgba8888 = ((red   & 0xFF) << 24) 
                    | ((green & 0xFF) << 16)
                    | ((blue  & 0xFF) <<  8)
                    | ((alpha & 0xFF) <<  0);

where as Google requires this

  uint32_t rgba8888 = ((red & 0xFF) << 24) | 
                      ((green & 0xFF) << 16) |
                      ((blue & 0xFF) << 8) |
                      (alpha & 0xFF);

I think the padded style is objectively more readable. Google thinks so to because they request that comments be aligned as in

  int speed;       // Speed in units per frame
  float duration;  // Duration in seconds

Why the inconsistency? If alignment is helps readability in one place it arguably can help it in many others.

Braces Even on Single Line Blocks

For a single line block styles can often be this

  if (condition)
     code

or

  if (condition) 
  {
     code
  }

Google's style lets you choose. WebKit's style requires the first. Several people would argue the second is objectively better. For one, it prevents this error

  if (condition)
     code
     more code

If braces are required that error is impossible. Another is consistency. Having braces sometimes and not others is inconsistent. Having them always is consistent.

I'm sure some of you are thinking, hey, python solves this problem. No, python's solution sucks even though I love python otherwise. It's often useful to insert extra lines for debugging and it's nice to make them stick out but. Example

    def SomeFunction():
        DoSomething()
print "here1"
        DoSomethingElse()
        DoAnotherthing()
print "here2"
        DoMore()

This makes it easy to go back and delete the debugging lines as they stick out. But I can't do that in python. I have to make the debugging lines blend in with everything else. If the debugging lines are more complicated, say a loop to validate something, they start looking like the real code making it harder to find them later. Yes there are ways around that. End every line with #DEBUG or put # -- DEBUG -- before and after the debug block. If python had real code blocks like most languages I wouldn't have to find ways to work around it's formatting limits.

Line Length Limits

Google requires an 80 character line limit. More than 80 characters and you must split the line following several rules. After 25 years of living without a line length limit I found this to be one of the hardest to accept. In particular, the Google Style guide talks about "readability" and requires not abbreviating names unless their super common. (src, dst, num, len are ok. mgr, rcvr, sgnl are probably not ok.) But these 2 conflicting rules leads to incredibly split lines.

Assume the limit was 40 columns. Which is more readable

  int w = desired_width 
      * scale_factor + padding;
  int h = desired_height
      * scale_factor + padding;

  glBindTexture(
      GL_TEXTURE_2D, someTexture);
  glTexImage2D(
      GL_TEXTURE_2D, level, format, 
      w, h, border, format, type, data);
  glTexParameter(
      GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, 
      GL_REPEAT);

vs

  int w = desired_width  * scale_factor + padding;
  int h = desired_height * scale_factor + padding;

  glBindTexture(GL_TEXTURE_2D, someTexture);
  glTexImage2D(GL_TEXTURE_2D, level, format, w, h, border, format, type, data);
  glTexParameter(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);

I find the second far more readable. The function of the code flows better. I see

compute size
bind texture
upload data
set parameters

where as in the first I see

compute width
bla bla bla
compute height
bla bla bla
bind texture
bla bla bal
upload data
bla bla bla
set parameters
bla bla bla

80 columns is left over from the 1980s and earlier computers with text displays. Nowadays we have 24 inch or 30 inch monitors that can easily display 320 columns or more depending on the font size. It seems silly to hold us back to a 1980s standard. Just as silly as it would seem to send a fax today instead of an email or other form of electronic sharing.

In fairness I've found the 80 column limit can come in handy. It kind of comes in handy when doing side by side diffs as all of both the old code and new code fit on the screen. On the other hand, for the previous 25 years of coding I never felt like "if only I could see the end of every line this diff wouldn't be so hard". Most modern visual diff programs will scroll both views together, or show the code in other ways (top, bottom), etc.

The other plus to 80 columns is it makes code snippets easier to paste into web pages ;-) Of course that's rare enough that I can hand format those cases and write samples that fit.

Strict Naming Rules

Many companies have rules for naming. For example Google's is constants are kCamelCase, functions, classes, structs and typedefs are CamelCase, variables, arguments, and fields are under_score, macros are ALLCAPS. Example:

class GameObject {
 public:
  enum MoveState {
    kWalk,
    kJog,
    kRun,
  };

  static const int kStartSpeed = 10;

  void MoveObject(float elasped_time);

 private:
  int max_speed_;
  int current_speed_;
};

That's all fine. But sometimes those rules should be broken. Upper and Lower case are a western language feature. Languages like Korean, Japanese, Chinese, don't have upper and lower case. (which suggests that languages and frameworks that use upper and lower case to select features might be culturally insensitive design).

The issue comes up with code generation. Sometimes you'd like to auto generate code but if you have to derive Uppercase, lowercase, CamelCase and under_score styles from the same data you're wasting your time just for the sake of style.

Edit: April 2013

Actually I now think naming should be more consistent. Google's style is listed above. Specifically, variables, getters, and setters are under_score and most everything else is CamelCase. The problem with this style is that it's not consistent. I'm not sure I can come up with a good example but...

class ThingAmaJigManager
{
public:
  int get_num_thing_ama_jigs() const 
  {
    return m_num_thing_ama_jigs;
  }

  void SetNumThingAmaJigs(int num);

  ...

private:
   ThingAmaJig* m_thing_ama_jigs;
   int          m_num_thing_ama_jigs;  
};

First rule is, setters and getters are under_score unless they are slow. In other words, if you wouldn't call the function in a tight loop then name it GetNumThingAmaJigvs not get_num_thing_ama_jigs. That's arguably inconsistent. Why, as a user of the class, should I have to guess what the name of the function is. I get that it's nice to know it's inexpensive but if that's the rule then it should be any function which is inexpensive follows this rule, not just setters and getters.

Second problem, it makes code search worse. I was recently trying to add a simple flag to some Chrome/WebKit code. There were 6+ classes involved. Most of the classes had functions like setEnableThisFeature and getEnableThisFeature. Some were virtual so they had no field named m_enable_this_feature. Searching the entire code base for "enableThisFeature" only finds some of the references. You also have to search for "enable_this_feature" to find everywhere that stuff is referenced. It seems arguable that having only to search for one of the 2 styles instead of both styles is better. For me, I searched for "enableThisFeature" and found all the classes. I edited the classes and saw the ones that had a field "enable_this_feature" and fixed those clases. What I didn't find was there was some marshaling code directly referencing "enable_this_feature" with no reference to "enableThisFeature". I spent 4 hours tracking that down. A more consistent naming convention would have avoided that. By that argument the consistent WebKit naming convention is better than the inconsistent Google one.

Member names

Lots of project require member names to be named differently than local variables. Microsoft and WebKit usually use m_name. I've seen some projects that use mName. Google uses name_, the name followed by an underscore. I hate that decision. The whole point of given them a different name is to make them stick out clearly. Underscore is not enough to do that in my opinion. Which of these is easier to spot the error.

void SomeClass::ProcessVeloctiy(float velocity) 
{
   velocity_  = velocity_ + accleration_;
   position_ += velocity_;
}   

vs

void SomeClass::ProcessVeloctiy(float velocity) 
{
   m_velocity  = m_velocity + m_accleration;
   m_position += m_velocity;
}   

For me at least the second one is far more clear that I never used 'velocity'. Maybe that's not a good example since the compiler would warn me that velocity is unused but a few times a year I've used name_ where I meant name or visa-versa. I feel like using m_name I'd be less likely to make that mistake.

I like that python and JavaScript mostly solve this problem as they require self.field or this.field which forces the issue. Of course even in those language you need to distinguish global variables, class variables, privately scoped variables, etc. Choose a naming convention that makes things clear and helps you prevent errors.

Functional Style

A few minor functional style preferences.

Google's style guide lets you use pre-increment or post-increment for simple types. I prefer pre-increment always when it's appropriate. If it's not clear pre-increment

   ++index;

means "increment index" where as post increment

   index++;

means "temp = index; ++index; return temp;". If your iterator is complex that's important since making a temporary copy of the iterator might actually do something. For a simple integer the compiler will of course throw away the temp part but from my point of view it's better to say what you mean. If you mean increment write "++index". If you mean "temp = index; ++index; return index;" then write "index++".

Use Unique Names

I'm really surprised when I come on a new project and people have chosen global variables or more often preprocessor defines with very simple names. Example DEBUG, OS_WIN, WIN, LINUX, MAC, OSX, LOGGING, ENABLE_LOGGING, etc.. The problem with using these kinds of names is you eventually want to use some third party code and they also picked similar names and you end up having conflicts. Choosing names that are unlikely to ever conflict like MYPROJECTNAME_PLATFORM_WIN or MYPROJECTNAME_ENABLE_LOGGING etc..., means you won't run into that problem.

Don't make single line functions

Some style guides, including Google's, allow 1 line functions as in

class Foo 
{
 public:
   int get_width() const { return m_width; }
   void set_width(int width) { m_width = width; }
};

The problem with these kinds of lines is they are often un-debuggable. The debugger won't stop on the line or if it does it's not possible to inspect 'this' or local fields or arguments. Where as this

class Foo 
{
 public:
   int get_width() const
   { 
     return m_width; 
   }
   void set_width(int width) 
   { 
     m_width = width; 
   }
};

Has no such problem. I've run into this issue on gdb, cgdb and VC++ so it's no specific to one platform's debugger.

I'd even go so far to say that I wouldn't be against a style guide that required values to be calculated outside calls. For example.

   ...
   drawCircle(x, y, radius * scaleFactor * fudge);
   ...

vs

   ...
   float effectiveRadius = radius * scaleFactor * fudge;
   drawCircle(x, y, effectiveRadius);
   ...

I've run into several situations where I needed to change the code from the first example into the second just so I could debug. I'm sure you're thinking I could step into 'drawCircle' to find out what the calculation is but that's not always the case. Often you can't debug into certain functions easily. If this is JavaScript in a browser and we're calling 'context2d.arc(...)' there is no stepping into 'context2d.arc()' as it's a native function. Similarly in C/C++ calling system functions or libstdc functions might not be easily debugged. There's no efficiency issue as the compiler will or at least should generate the optimal code when optimization is turned on but following this kind of style it will make debugging much easier.

Conclusion

The funny thing is, I've been at my present employer for over 4 years now and so I've had to follow their style for 4 years. That's made it so it's hard for me to write the style I prefer. Even in personal code I'm so used to doing it their way that I just continue to do it. That doesn't mean it doesn't still bother me though. Anytime I know something would be more readable if I could align it like a table or anytime I rename some method and then have to manually walk through all the files and changes and fix their formatting it can be frustrating. We also have to torture new hires as we beat them into style submission from whatever their previous style was. They run into similar issues where the style guide is arbitrary and not well thought out.

sigh...

If you've got some particular style and objective reasons for them I'd love to hear about them.

  • http://twitter.com/bionicbeagle Stefan Boberg

    Good points, all of them! I agree with many of your choices and that’s how I tend to code at work.

    I have one preference which relates to “lining up” which you didn’t mention: return values. I.e, in the cpp file, I write the return value on a separate line for member functions:

    const Foobar*
    Foobar::getBestest()
    {
    return nullptr;
    }

    To me this makes the code easier to scan, as the function names will always line up, and the return value stands out.

  • momo

    Where to put the star *:

    “int* var” is fully wrong to me, since int is not a pointer, var is the pointer. And it makes more sense when declaring multiple variable on the same line:

    int* var1, var2; -> var1 is a pointer, but not var2
    int *var1, *var2; -> both are pointers

  • ExCoder

    I prefer a single exit point to all functions.

    I HATE nesting function calls on a single line:
    X = FP( FD ( FQ(123)));

    Also, I have rules on where I put spaces:
    Do use a single space after a comma between function arguments.
    Right: Console.In.Read(myChar, 0, 1);
    Wrong: Console.In.Read(myChar,0,1);
    Do not use a space after the parenthesis and function arguments
    Right: CreateFoo(myChar, 0, 1)
    Wrong: CreateFoo( myChar, 0, 1 )
    Do not use spaces between a function name and parenthesis.
    Right: CreateFoo()
    Wrong: CreateFoo ()
    Do not use spaces inside brackets.
    Right: x = dataArray[index];
    Wrong: x = dataArray[ index ];
    Do use a single space before flow control statements
    Right: while (true)
    Wrong: while(false)
    Do use a single space before and after comparison and arithmetic operators
    Right: if (x == y)

  • C

    Style wars will never end, but I like that you’re trying to make it as objective as possible.

  • Dev

    I wish everyone followed these conventions! But maybe they just need to go through the hassle of debugging and maintaining for a little longer before they realize it’s worth it.

    Regarding the 80 character limit: I agree that it’s stupid, but there are consequences for readability if lines are too long and the cost of jumping back and down to the start of the next line is too high. This is probably infrequent enough that it doesn’t matter though.

    As for tabs vs. spaces, I use tabs for indentation (1 tab = 4 spaces) and spaces for delimiting. I still don’t know why anyone would prefer spaces.

    The only thing I disagree with is the table vs list thing; it’s generally inconvenient to edit the padded rows. If my editor let me switch between the forms automatically, then I would write in the list way and autoconvert it to the padded table form for reading.

  • http://greggman.com greggman

    spaces > tabs because they always look the same everywhere. Tabs break when one person has them set to 8 and another to 4 and another to 2 or 3. For example, AFAIK you can’t set the tabs in a textarea in html. They are 8 spaces. So you paste in some code from your editor and it’s too bad for you.

    Most good editors make the editing feel the same with tabs or spaces so there isn’t much reason to prefer tabs over spaces, at least for me.

  • http://greggman.com greggman

    I agree with most of your spacing rules. I think both the Google and WebKit rules for spaces are the same. As for the single exit point I’m not so strict. In facf I generally find code that requires a single exit point to be more convoluted than code that ignores that rule. Example:

    bool DoSomething(int thing_id)
    {
    Thing* t = GetSomething(thing_id);
    if (!t)
    {
    SetError(“Couldn’t get something”);
    return false;
    }

    if (!t->IsValid())
    {
    SetError(“Thing not valid”);
    return false;
    }

    return t->DoSomething();
    }

    Seems much better than

    bool DoSomething(int thing_id) { bool success = false; Thing* t = GetSomething(thing_id); if (!t) { SetError(“Couldn’t get something”); }
    else if (!t->IsValid()) { SetError(“Thing not valid”); }
    else { succees = t->DoSomething();
    } return success;}

    Maybe that’s not a good example but I’ve run into many cases where the single exit path seemed really convoluted and made the code harder to understand. Of course maybe I just didn’t see the clean way of doing it.

  • http://greggman.com greggman

    Putting more than one variable declaration per line is an anti-pattern IMO.
    But this is one of those subjective things. To me since you can’t do this

    int a, float b; // declaring 2 types, one int, one float

    You shouldn’t be able to do this

    int a, *b; // declaring 2 types, one int, one pointer to int,

    Seems like it’s more consistent to disallow both.

  • Dev

    Makes sense. I haven’t run into issues like that but I can see how that would be annoying. In my experience, most tab vs. space issues can be easily/automatically resolved if tabs are used (especially if the code is annotated with the author’s tab width), but it’s harder when they are spaces. Regarding tabs in textarea/pre/code/etc, client side replacement of tabs with spaces seems like it’d be pretty feasible, especially if you’re already using a syntax highlighting plugin. Or if you’re talking about tabular data I would just use tables. Hopefully css3 adds the tab-size or tab-width property though.

  • Axio

    Very interesting. I’ll probably change how I format my code now!
    There is still one point where I disagree with you : brackets.

    > On the other hand, style 2 makes it very clear there are 3 blocks and
    those blocks can easily be copied, pasted, duplicated, and moved around
    with.

    I assume that *any* text editor worth this name easily enables you to
    select text within brackets (brackets included or not) for copying,
    cutting, and moving around.

    Vim does it, I’m pretty sure emacs does it, and hopefully the other
    editors do it as well. If not, they probably shouldn’t be used for code
    edition.

  • Coder

    Fantastic article. Agreed with every single point, and for most of the same reasons my coworkers and I insist on most of these styles.

    Personally I insist on prefix notation for punctuation, like in your Constructors and Splitting Expressions section. It especially helps for compound conditional statements too.
    Extreme example of nested conditions:
    // style 1
    if (condA && ((cond B &&
    condC) || condD))
    {
    // style 2
    if ( condA
    && ( ( condB
    && condC
    )
    || condD
    )
    )
    {
    [it lines up better in fixed width]
    While the line count went up, I would argue style 2 is instantly understandable AND much easier to refactor and visualize in a line-based diff.
    You can shorten it up by opting for LISP-parens))))))) too.
    // style 3
    if ( condA
    && ( ( condB
    && condC)
    || condD))
    {

    The point about debuggable code is also huge.
    In modern C++ an inline function is always better than a macro, when possible, because you’ll actually get a callstack entry for it in debug builds.

    And lastly, code-browsing. Macros that do token pasting are evil, because the ‘calling’ code does not contain any symbol that is known to the symbol data base. I’m talking about stuff like this.
    #define SOME_CONSTANT_aaa 0×180
    #define SOME_CONSTANT_bbb 0x0c0

    #define GETC(name) SOME_CONSTANT_##name
    // later
    int x = 10 * GETC(aaa);
    int y = 10 * GETC(bbb);
    FIrst time looking at the code, you would say “go to declaration of aaa” and at best you get no result. Even worse you get taken to some other symbol having the name ‘aaa’.

  • yy

    The reason this works the way it works is that it makes clear you are declaring a and *b as int. You may find this article by Rob Pike interesting:
    http://golang.org/doc/articles/gos_declaration_syntax.html

  • http://greggman.com greggman

    no, I’m declaring a as in “int” and b as “pointer to an int”. Nothing in that article linked contradicted that. In fact for example it talks about ‘fp’ being a ‘pointer to a function’ not ‘*fp’ being a function. It also shows Go puts all the type declaration on the left but still points out that it’s a, b and c are different types. If they were the same type you could say a=b but Go and C and C++ will all fail that assignment

  • bquenin

    Hey, I like this coding style. Is there any IDE with a code formatter providing this template ?

  • oldcoder

    Are you a Windows programmer?
    Windows programs usually contain more line noises and long methods. Making all your function code inside one page is very important – so many styles you talking here are good for those. But if you prefer most methods only has at most 20 lines, you would prefer K&R style.

  • http://greggman.com greggman

    No, I’m not a windoes programmer. I don’t agree with the logic that scrunching things up = better since by that logic this

    void RemoveCharInPlace(char* s; char c) { char* d = s; while (*s) { if (d != c) { *d++ = *s; } ++s; } *d = ''; }
    

    would be better than this

    void RemoveCharInPlace(char* s; char c){  char* d = s;   while (*s)   {     if (d != c)     {       *d++ = *s;     }     ++s;   }   *d = ''; }
  • Walker

    Although I am sure you have an array of styles floating around in your head, and you’ve probably seen most- but your debate about braces is incomplete:

    // Style 3
    void FeatherweightFunction(bool a)
    {
    // if input true, success!
    if (a) {printf(“Yea!n”)}
    // else, we failed
    else {printf(“Boo!n”)}
    }

    // Style 4
    void AverageFunction(int a)
    {
    // if input is small, multiply a by itself & half!
    if (a = 10 && < 100)
    {
    printf("Smoothn");
    // else, we failed ..
    }else{
    printf("Boo!n");
    }
    }

    Style 4's trailing "}else{" with the comment following it, will have a full return separating itself between "}else if{"s code block to maintain readability (assuming such code block is much phatter)- this is my personal preference and I find it to be the most intuitive.

  • http://greggman.com greggman

    Nice.

    PS: If you add

       ... 

    around your sample I think disqus will format it as code

    This() 
       is
       a
    Test
    

    Did it work?

  • http://www.facebook.com/tony.rubolotta Tony Rubolotta

    I know I’m a bit late finding this post and making this comment but here it is anyway. My programming background is quite different but the styles I have adopted almost fit to a tee those you prefer and mostly for the exact same reason. If I may add, some of those preferences also allow quick modifications for testing/debugging using comments. For example

    damage = speed

    + mass
    //+ direction
    + somethingelse
    :

    It’s a minor thing but commenting out a single line is much easier than some of the other contortions I have seen.

    On naming conventions, I favor clarity over brevity. An old habit I have is to also include the data type in the name though I suppose in Javascript it might be frowned upon. I use a reverse notation however since I want to see related names listed contiguously when sorted by an IDE, such as

    starChartAddSP
    starChartDelSP
    starChartQRY
    starChartTBL

    or

    MemberIDINT
    MemberVCH32

    Oh well, just my 2 cents worth but a very interesting discourse on your part on an important topic.

  • Walker

    My current if-statement style, built off of ternary format:

    ((x + vX) > borderX) ? ( //if
        x+=(borderX - x),
        rotation=(-rotation)
    ) : ( //else
        x+=vX
    ); //end
    

    Extremely simple; NOT one-line (I am not single-line cowboy coder, sounds like
    backwards logic), faster to type than typical if statements, and still
    able to include multiple single-line statements rather than writing an
    annoying compound statement under the impression that ternary only supports single-statements..

    It does not, though!

    (expression) ? (statement, statement, statement) : (statement, statement);
    

    Therefore making this possible:

    
    (expression) ? (
      statement,
      statement,
      statement
    ) : (
      statement,
      statement
    );
    
  • http://www.facebook.com/jacev Jason Varlet

    Great points! I agree with all of them, and probably already use 80% of them in my day to day programming. However, I’d never used the formatting in the “Splitting Expressions” style, but I really like it! I’ll have to work that into my style.

  • http://tshepang.net/ Tshepang Lekhonkhobe

    I find Formatting Function Arguments the most interesting, followed by Splitting Expressions.