Things Every CS Student Needs To Know

I’ve interviewed and hired new computer science grads. I’ve helped a few prepare for interviews as they approached the end of school. You’d think after 4 years of learning, they’d know all the basic skills they need to do the job of a software engineer.

But it isn’t so.

So as a small service to humanity, here’s my list.

Source Control

If I were a professor, my students would turn in all of their assignments via Git, because in the real world, that’s how they will turn in all of their work.

It boggles my mind that CS students haven’t had to use and become proficient in some form of source control. It isn’t like the concept is new; I was working on Microsoft’s SourceSafe for the Mac in the 1990s. The dominant software in that area, like all areas of development, changes every half decade, but the concept is always there.

They should also have encountered Git on their own while working on open source projects, but that lack is for another point.

Agile/Scrum

I say Agile/Scrum, but I mean the Scrum framework. Agile is a philosophy, and students should be exposed to it. In 80% of cases, the first day they start a programming job they are going to be on a Scrum team. It could be another kind of Agile framework, like Kanban, but most likely it will be Scrum. If you are hired by a company that doesn’t use some Agile methodology, it’s a red flag.

Scrum is also a great system to use in those group projects people dread so much. Its transparency and daily accountability can help keep everyone working.

Tactical Programming

Tactical Programming is how you write a line of code, name a variable, or structure a function. I’ve written an article about it and this blog has a whole category (Tactical) dedicated to it. But most schools don’t seem to talk about it at all.

Initiative

When I went to the interview for my first job, I went with source code to a program I had written on the side while in school.

When I interviewed programmers for my video game company, I asked them to show me something they had created. Only one of the candidates could do that. He got the job.

People want to hire people who love to code. Who want to do it so much they will do it on their own time.

They want to hire people who they know can do the job because they have already done it.

Many companies also ask for your GitHub account ID so they can see whether and what you’ve contributed to open source projects.

Truth be told, there are a lot of programmers who have been doing this for a while and still lack these skills. Those folks need to up their game, but I kind of feel it is the schools letting down the new grads.

Tips For Lowering Indentation Levels

Linus Torvalds, the creator of Linux, sets his tabs/indentations to 8 spaces. Personally, that seems a little wide, but he says some interesting things about why shorter isn’t necessarily better. Me, I’ll stay with 4 because it just looks right. But you still shouldn’t have too many indent levels.

“If you need more than 3 levels of indentation, you’re screwed anyway, and should fix your program.” – Linus Torvalds, Linux Kernel Coding Style

Use Continue

Example:

for item in sequence:
    if is_valid(item):
        if not is_inconsequential(item):
            item.do_something()

Becomes:

for item in sequence:
    if not is_valid(item):
        continue
    if is_inconsequential(item):
        continue
    item.do_something()

Notice that you have to invert the conditions of your if statements when making this change. This only works inside a loop, and only if your language has continue.

Factor Out A New Method

This is true for a lot of things. As I recently heard it put: if a method starts to get over 10-15 lines, it starts to feel like maybe I’m doing something wrong.

If your indents are starting to get deep, maybe some of those sub-levels should just be put in their own routine.
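
As a minimal sketch of that idea (the data layout and the names order_total and grand_total are hypothetical, made up for illustration), the inner levels of a nested loop can move into a helper so the outer routine stays shallow:

```python
def order_total(order):
    """Sum the prices of the in-stock items in one order."""
    total = 0
    for item in order["items"]:
        if item["in_stock"]:
            total += item["price"]
    return total


def grand_total(orders):
    """The outer loop stays shallow because the inner levels live in order_total()."""
    total = 0
    for order in orders:
        total += order_total(order)
    return total


# Hypothetical data: each order is a dict holding a list of item dicts.
orders = [
    {"items": [{"price": 5, "in_stock": True}, {"price": 7, "in_stock": False}]},
    {"items": [{"price": 3, "in_stock": True}]},
]
print(grand_total(orders))
```

Written as one routine this would need an extra indent level; split in two, each piece can also be read and tested by itself.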

On a Python-related note, Mr. Rhodes points out that if you need a new routine and you are in a class, your natural thought is to make a new method on the object. But if you aren’t using self – referring to the object in your new method – you can just make a function. Functions like these are easier to test and easier to move.
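
Here is a small sketch of that distinction. The Article class and normalize_tag function are hypothetical names, not from the talk:

```python
def normalize_tag(tag):
    """Plain function: it never needs the object, so it lives outside the class."""
    return tag.strip().lower()


class Article:
    def __init__(self):
        self.tags = []

    def add_tag(self, tag):
        # This one reads and writes self.tags, so it stays a method.
        self.tags.append(normalize_tag(tag))


article = Article()
article.add_tag("  Python ")
print(article.tags)
```

Because normalize_tag takes only its argument, it can be tested without building an Article first, and it can move to another module without dragging the class along.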

Factor Out An Iterator

Mr. Rhodes says iterators are a Python superpower. I don’t understand Python iterators yet and didn’t even understand his example, but in the interest of completeness, here it is.

Example:

for item in sequence:
    for widget in item:
        for bitmap in widget:
            for pixel in bitmap:
                pixel.align()
                pixel.darken()
                pixel.draw()

Becomes:

def widget_pixels(sequence):
    for item in sequence:
        for widget in item:
            for bitmap in widget:
                for pixel in bitmap:
                    yield pixel

for pixel in widget_pixels(sequence):
    pixel.align()
    pixel.darken()
    pixel.draw()
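
For anyone else new to this: a function that contains yield is a generator, and calling it returns an iterator that hands back one value each time the for loop asks for the next one. The sketch below runs the same widget_pixels function over hypothetical nested lists, with plain strings standing in for pixel objects:

```python
def widget_pixels(sequence):
    """Yield every pixel, hiding the four levels of nesting from the caller."""
    for item in sequence:
        for widget in item:
            for bitmap in widget:
                for pixel in bitmap:
                    yield pixel


# Hypothetical data: one item holding one widget that has two bitmaps.
sequence = [[[["p1", "p2"], ["p3"]]]]

for pixel in widget_pixels(sequence):
    print(pixel)
```

The calling loop sits at one indent level no matter how deeply the data is nested, and the traversal can be reused wherever the pixels are needed.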

Lastly he talked about going to an argument per line for function calls. I want to write a full post on this idea because it has some cool implications in version control. The short version is once a function call gets to 79 characters, just go right to putting each parameter on its own line.
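
A rough sketch of what that layout looks like. The draw_string function here is a hypothetical stand-in that just returns its arguments:

```python
def draw_string(x, y, message, font_name, font_size, color):
    """Hypothetical stand-in for a drawing call; it just returns its arguments."""
    return (x, y, message, font_name, font_size, color)


# One argument per line, with a trailing comma after the last one.
result = draw_string(
    10,
    20,
    "Please press Enter",
    "Helvetica",
    12,
    "black",
)
print(result)
```

One likely version-control benefit of this layout is that adding or changing an argument touches exactly one line, so the diff shows only what actually changed.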

Shortening Lines & Binary Operations

One of the core values of Python is readability.

I think it’s cool a computer language has values.

To encourage this value it has a set of guidelines on how to format your code called PEP 8. Most of these points are applicable to all languages, so even if you aren’t blessed to use Python you still might want to check it out.

One of the PEP 8 rules is lines shouldn’t be longer than 79 characters. This is also a rule in typography. That’s the length of line that doesn’t require extra mental effort to move back to the beginning of the next line.

Much of this post and the next one are from a talk by Brandon Rhodes entitled A Python Aesthetic: Beauty and Why I Python and available on YouTube. It got me inspired to think about this stuff again and motivated to make these posts.

Here are a few tips on how to shorten lines effectively.

Shorten Lines By Adding Variables

As Mr. Rhodes pointed out in the source video, it’s a feature of math that you can assign some part of an operation to a variable and use that to pass on to another operation. Why do programmers seem to want to avoid this?

I don’t know, probably because programmers are lazy. Well, some programmers. The same kind of programmers who use one-character variable names and magic numbers.

But not you, right?

Example:

canvas.drawString(x, y, 'Please press {}'.format(key))

Our first thought is to shorten it with a line break.

canvas.drawString(x, y,
    'Please press {}'.format(key))

But if it has to be two lines anyway….

message = 'Please press {}'.format(key)
canvas.drawString(x, y, message)

Adding the message variable not only shortens the line, it gives us a variable that tells us what it is. The message is now easily changed, printed for debugging or logged later.

Shortening Long Formulas

In math, a binary operation is one where the two operands sit on either side of the operator, like 1 + 1. In a way this is a function, just written differently than a normal function would be, i.e., add(1, 1).
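
Python happens to expose this idea directly: the standard library’s operator module provides add as an ordinary function. A tiny sketch:

```python
import operator

infix_result = 1 + 1                   # binary operator, written between the operands
function_result = operator.add(1, 1)   # the same operation, written as a function call

print(infix_result == function_result)
```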

Sometimes you have a big mathematical formula that stretches past 79 characters, especially if you use good variable names. Here Mr. Rhodes disagrees with PEP 8 as it stood at the time of his talk, and I agree with him.

Example:

adjusted_income = (gross_wages + taxable_interest + (dividends - qualified_dividends) - ira_deductions - student_loan_interest)

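PEP 8 used to say to divide binary operations after the operator. This is bad. (PEP 8 has since been revised and now suggests breaking before the operator in new code.)

Old PEP 8 badness: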

adjusted_income = (gross_wages + 
     taxable_interest + 
     (dividends - qualified_dividends) - 
     ira_deductions - 
     student_loan_interest)

Donald Knuth says divide before the operator.

Knuth Goodness:

adjusted_income = (gross_wages 
      + taxable_interest 
      + (dividends - qualified_dividends) 
      - ira_deductions 
      - student_loan_interest)

This seems much clearer, and puts the operators front and center. Subtractions actually look like the variable is negative.

Source: A Python Aesthetic: Beauty and Why I Python by Brandon Rhodes

Image: Photographer Ron Davis (Me). Model Virginia McConnell.

Remove Comments By Making Them Code

I like comments, probably too much. Lately I’ve been rethinking some comments because they just don’t make sense or work in the real world. But not till I watched this video did I realize you could change them into code.

Make A Comment Into A Variable

Variable names – good variable names – are self-commenting. Extreme Programming uses the phrase “Destroy All Comments”.

Example:

# is window too tall
if window.height > 100:
    fix_window_height()

Becomes:

is_window_too_tall = window.height > 100
if is_window_too_tall:
    fix_window_height()

Example:

widget.reset(True)  # forces re-draw

becomes:

force_redraw = True
widget.reset(force_redraw)

This also makes it easier to put in debugging code later, because you already have a name for the value to print. True is also a magic number in a sense, because at the call site we don’t know what it means.
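
In Python specifically, a keyword argument is another way to give the value a name right at the call site. This sketch assumes reset() has a parameter called force_redraw; the post doesn’t show the real signature, so the Widget class here is hypothetical:

```python
class Widget:
    """Hypothetical widget; the real reset() signature is not shown in the post."""

    def __init__(self):
        self.redraw_count = 0

    def reset(self, force_redraw=False):
        # Count forced redraws so the effect of each call is visible.
        if force_redraw:
            self.redraw_count += 1


widget = Widget()

force_redraw = True
widget.reset(force_redraw)        # named variable, as in the example above
widget.reset(force_redraw=True)   # keyword argument names the value in place

print(widget.redraw_count)
```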

Make Code Section Comments Into Functions

Example:

# open the barn
barn = code.Barn.get()
barn.unlock()
barn.open()
 
# saddle the horse
saddle = code.Saddle.get()
saddle.install(horse)

Becomes:

def open_barn():
    barn = code.Barn.get()
    barn.unlock()
    barn.open()

def saddle_horse(horse):
    saddle = code.Saddle.get()
    saddle.install(horse)

open_barn()
saddle_horse(horse)

I once asked a boss if we could get people to add a few more comments so it would be easier to tell what the code did. His reply was that consultant programmers, who work on lots of code they didn’t write, run a script that removes all comments when they come into a new project. He said comments are often misleading. I think that’s a little extreme, but if you turn comments into code, that documentation will always be there.

Source: A Python Aesthetic: Beauty and Why I Python by Brandon Rhodes

Tactical Coding

When programmers are taught, they are taught how programs work. They are also taught how to structure large projects so the parts work together – architecture or strategic programming.

But I don’t ever remember learning tactical programming. This is how to write code at the function and line level to decrease errors and increase things like understanding and readability.

I remember reading the book Code Complete after just a few years as a programmer and being blown away. Why didn’t anybody tell me this stuff? You mean my variable names shouldn’t just be one letter? I should check parameters before using them? If I write the shell of a function first, I probably won’t forget to close a bracket or return a value.

I’ve decided to write a series of blog posts about tactics of coding. They have their own category on the blog now – Coding – Tactical.

Of course that means that all our code must be black, because everything tacticool is black.

Footnote: I like the first edition of Code Complete better than the newest one, but it is still good stuff.

Verbose Switch: A Cleaning Up Code Example

There are a few principles we here at Reactuate Software try to keep in mind when writing code. One is DRY, or “Don’t Repeat Yourself”. Another is “Readability Counts”. So when I see code that looks like this, I cringe.

- (IBAction)dayButtonPressed:(id)sender {
    UIButton *b = (UIButton*)sender;
    theDay = b.tag;
    switch (b.tag) {
        case 2:
            monButton.selected = YES;
            tueButton.selected = NO;
            wedButton.selected = NO;
            thButton.selected = NO;
            friButton.selected = NO;
            satButton.selected = NO;
            sunButton.selected = NO;
            break;
        case 3:
            monButton.selected = NO;
            tueButton.selected = YES;
            wedButton.selected = NO;
            thButton.selected = NO;
            friButton.selected = NO;
            satButton.selected = NO;
            sunButton.selected = NO;
            break;
        case 4:
            monButton.selected = NO;
            tueButton.selected = NO;
            wedButton.selected = YES;
            thButton.selected = NO;
            friButton.selected = NO;
            satButton.selected = NO;
            sunButton.selected = NO;
            break;
        case 5:
            monButton.selected = NO;
            tueButton.selected = NO;
            wedButton.selected = NO;
            thButton.selected = YES;
            friButton.selected = NO;
            satButton.selected = NO;
            sunButton.selected = NO;
            break;
        case 6:
            monButton.selected = NO;
            tueButton.selected = NO;
            wedButton.selected = NO;
            thButton.selected = NO;
            friButton.selected = YES;
            satButton.selected = NO;
            sunButton.selected = NO;
            break;
        case 7:
            monButton.selected = NO;
            tueButton.selected = NO;
            wedButton.selected = NO;
            thButton.selected = NO;
            friButton.selected = NO;
            satButton.selected = YES;
            sunButton.selected = NO;
            break;
        case 1:
            monButton.selected = NO;
            tueButton.selected = NO;
            wedButton.selected = NO;
            thButton.selected = NO;
            friButton.selected = NO;
            satButton.selected = NO;
            sunButton.selected = YES;
            break;
        default:
            break;
    }
}

This code is in an iOS app I inherited, and it handles the day-of-the-week frequency control.

To me this violates both the concept of DRY and the concept of Readability. It is pretty obvious what the routine does and you can even guess how it works. But the devil is in the details. Yes, you know it is turning off all the buttons but the day that is selected. What if one of those case statements was messed up and accidentally had two buttons set to YES? The sheer number of repeated, very similar lines would make it a pain to find.

Plus it’s just ugly and inelegant.

Here’s the 30 second fixed version:

 
- (IBAction)dayButtonPressed:(id)sender {
    UIButton *b = (UIButton*)sender;
    theDay = b.tag;
 
    monButton.selected = NO;
    tueButton.selected = NO;
    wedButton.selected = NO;
    thButton.selected = NO;
    friButton.selected = NO;
    satButton.selected = NO;
    sunButton.selected = NO;
 
    switch (b.tag) {
        case 2:
            monButton.selected = YES;
            break;
        case 3:
            tueButton.selected = YES;
            break;
        case 4:
            wedButton.selected = YES;
            break;
        case 5:
            thButton.selected = YES;
            break;
        case 6:
            friButton.selected = YES;
            break;
        case 7:
            satButton.selected = YES;
            break;
        case 1:
            sunButton.selected = YES;
            break;
        default:
            break;
    }
}

See how much cleaner that is? There are situations where setting all the controls one way and then immediately turning one back on might cause a flicker, but iOS isn’t one of them.

Since I’m in this code anyway….

Here’s an even more cleaned up version:

 
/* --------------------------------------------------------------------------------------------
 
	dayButtonPressed
 
	Description: 
		Action method for the day of the week frequency control.
 
	-------------------------------------------------------------------------------------------- */
- (IBAction)dayButtonPressed:(id)sender
{
    UIButton*   button = (UIButton*)sender;
    theDay = button.tag;
 
    monButton.selected = NO;
    tueButton.selected = NO;
    wedButton.selected = NO;
    thButton.selected = NO;
    friButton.selected = NO;
    satButton.selected = NO;
    sunButton.selected = NO;
 
    switch (button.tag)
    {
        case SUNDAY:
            sunButton.selected = YES;
            break;
        case MONDAY:
            monButton.selected = YES;
            break;
        case TUESDAY:
            tueButton.selected = YES;
            break;
        case WEDNESDAY:
            wedButton.selected = YES;
            break;
        case THURSDAY:
            thButton.selected = YES;
            break;
        case FRIDAY:
            friButton.selected = YES;
            break;
        case SATURDAY:
            satButton.selected = YES;
            break;
        default:
            NSAssert(NO, @"Day button tag has unknown value %ld", (long)button.tag);
            break;
    }
}

I had to define some constants but it is much more readable now. I also hate one letter variables, so the button variable got renamed as well.

I added an assert for something that should never happen, namely the value of the tag not being between 1 and 7. NSAsserts are compiled out when NS_BLOCK_ASSERTIONS is defined, which is the usual setting for release builds, so there isn’t a chance of it crashing the shipping app.
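
The constants themselves aren’t shown here. Purely as an illustration, and in Python rather than Objective-C, named day values and a guard for the should-never-happen case might look like the sketch below. The names Day and select_day are hypothetical, and the numeric values are taken from the tags in the switch above (Sunday is 1, Saturday is 7).

```python
from enum import IntEnum


class Day(IntEnum):
    # Values follow the button tags: Sunday is 1 through Saturday is 7.
    SUNDAY = 1
    MONDAY = 2
    TUESDAY = 3
    WEDNESDAY = 4
    THURSDAY = 5
    FRIDAY = 6
    SATURDAY = 7


def select_day(tag):
    """Return a dict mapping each day to whether its button is selected."""
    # Guard for the case that should never happen: a tag outside 1..7.
    assert tag in list(Day), f"Day button tag has unknown value {tag}"
    # Every day is off except the one whose value matches the tag.
    return {day: (day == tag) for day in Day}


selected = select_day(3)
print([day.name for day, is_on in selected.items() if is_on])
```

Like NSAssert, a Python assert is a development-time check: it is skipped when the interpreter runs with the -O flag.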

Also added is my “short” routine header for Objective-C code. It just tells what the method does. It doesn’t bother to tell what the parameters or return values are, because saying it is an “action method” tells an experienced iOS/MacOS programmer all that information.

Variable Names – Coding Style

At Reactuate Software one of our main goals when creating software is making the code readable. Readable code makes everyone’s life easier. It is less prone to mistakes. It’s easier for new developers to learn. It’s easier for a developer to understand when they go back to code they wrote a long time ago.

Here’s how I do it. These practices were developed over decades of coding, and are heavily influenced by Objective-C/Cocoa conventions. Some may actually vary based on the language we’re coding in, but the principles apply. For example, CamelCase is the norm in Objective-C, but frowned on in Python. So if you see inTagArray below in Objective-C, that would be in_tag_array in Python.

Full Words Please

Single letter variables are of the devil.

Never use them.

About the only place I accept a single letter variable is a simple loop. ‘for (int i = 0; i < 10; i++)’ is pretty understandable (see footnote 1), but such loops are actually getting to be rare in the real world. Single letter loop variables can get confusing quickly if you have loops inside of loops.

Another exception is the special x, y, z variables when plotting points. Those are the actual names of the axes you are referencing, so go ahead and use them. That is also a good reason not to use x in other places.

No abbreviations or acronyms either.

They may be obvious to you today, but not so for other readers of the code. I don’t know why coders feel they have to save a few keystrokes by abbreviating variable names, especially in the age of editors with autocompletion.

The other day I was looking at a tutorial and the writer’s sample started with something like this:

NSString* rhs = @"abcd";
NSString* lhs = @"efg";

if ( [rhs isEqualToString:lhs] )

It’s pretty easy to understand what the code is doing: comparing two strings. But what do rhs and lhs mean? The sample code was discussing how lhs == rhs is done with string routines. Then I figured it out. They meant Right Hand Side and Left Hand Side. OK, why not name them that? rightHandSide would be perfectly valid and more readable.

Actually I would probably have named them rightString and leftString. This tells what they do/mean, and it tells me their type. Which brings me to my first style guide.

Always Name The Variable By What It Represents

Needless to say…well I guess it’s not needless to say or I wouldn’t be writing this.

Give variables meaningful names. Later in the code, the name is all you are going to have when reading.

I also like to include the type in the name, though it isn’t always needed. As in the above example, adding String to the end lets the reader know the variable holds a string and not a number. Yeah, you could look back at the comment on the definition if you get confused, but why not make the name self-documenting?

When I name my outlets for UI objects in iOS/MacOS, I name them with the type of object they are, for example usernameTextField, and passwordTextField. Because later I may see those variables, and while I know they handle text because that’s what a username is, knowing it is a text field tells me a lot more about that text. It’s a single line not multiple lines. It isn’t rich text. It’s probably editable or I’d have labeled it usernameLabel.

Windows programmers have long used Hungarian notation, where the first character or characters represent the type of the variable. I used to think this was dumb because you had to learn a new convention when you could just look at the definition. I still think having to learn a cryptic set of rules is a pain, but having the type in the name is very useful.

Meaning Can Be Relevant to Role in Code

Never use any variable for more than one thing.

Variables don’t cost you anything but keystrokes. If you want to do something else, make a new variable; don’t reuse one you think is “done”. Reusing one is asking for disaster.

Temporary Variables

Some variables don’t have a global meaning. For instance, a variable that is just there to hold a value while you do something to it. If it is really, really temporary, I’ll just name it ‘temp’. But if it is only kind of temporary, meaning it stays around for more than a couple of lines, or there is more than one temporary variable, then I’ll name it based more on what it holds: tempString, tempSortValue, etc.

Parameters

Another special naming convention I use is in and out prefixes on parameters. Parameters of a method/function/procedure always get prefixed with ‘in’. If they are being used to send some value back to the caller via reference or pointer, they get prefixed with ‘out’.

Why? Because you should never alter parameters that are passed in. In C++ you could define these as const and the compiler wouldn’t let you change them, but not in Objective-C. If you need to change something in an ‘in’ variable, make a copy. And think about what it means to make a copy.

Here’s an example method:

- (void)addTagsFromArray:(NSArray*)inTagArray
{
    for (NSString* curString in inTagArray)
    {
        [self addTag:curString];
    }
}

The routine adds an array of strings to an object’s tag list. The parameter sent in is named inTagArray. ‘in’ because it is a parameter, ‘Tag’ because that’s what it contains, ‘Array’ because that’s its type.
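
To make “think about what it means to make a copy” a little more concrete, here is a hedged sketch in Python, using the same ‘in’ and ‘return’ prefixes in snake_case. The function add_default_tag and its data are hypothetical:

```python
import copy


def add_default_tag(in_tag_list):
    """Return a new list with 'untagged' appended; the caller's list is untouched."""
    return_tag_list = list(in_tag_list)  # shallow copy: a new list, same elements
    return_tag_list.append("untagged")
    return return_tag_list


original_tags = ["python", "style"]
new_tags = add_default_tag(original_tags)

# A shallow copy still shares nested objects; a deep copy duplicates them too.
nested = [["a"], ["b"]]
shallow = list(nested)
deep = copy.deepcopy(nested)
nested[0].append("changed")

print(original_tags, new_tags)
print(shallow[0], deep[0])
```

The shallow copy sees the change made through the original nested list, while the deep copy does not. That difference is the thing to think about before copying an ‘in’ parameter.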

Loop Variables

The addTagsFromArray method shows how to handle an enumerator variable. Prefix the variable that contains the current element with ‘cur’. The variable probably should have been ‘curTag’, since it contains a tag, and a tag just happens to be a string. But in this case it works, because in the future I may change a tag to be something other than a string, and when I do I’ll rename the variable to indicate it contains a Tag object.

Return Values

When I write a new method that is going to return something, the first code I write will look like this:

- (int)someRoutine
{
    int returnInt = 0;

    return returnInt;
}

I learned this from the book Code Complete: A Practical Handbook of Software Construction, which is a great resource to learn about writing tactically defensive code.

I’m doing three things here.

1) I’m declaring a value that will hold what’s being returned. If that value isn’t set, the routine shouldn’t be returning yet. When you’ve figured out what to return, this is where it goes.

2) I’m giving the return value a useful default value. If an unset return is an error, then set the error value here. For an int, you probably want to initialize to zero just to be sure.

3) I’m actually doing the return. First this will get rid of a compiler warning that you don’t have a return value. Second, you’ll always return something, which is good, because just exiting a routine that expects a return value is undefined.
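
The same three steps carry over to other languages. Here is a minimal sketch in Python with the shell filled in. The function count_long_words is hypothetical, and the names follow the ‘in’, ‘cur’ and ‘return’ prefixes in snake_case:

```python
def count_long_words(in_words, in_minimum_length):
    """Count the words that are at least in_minimum_length characters long."""
    return_count = 0  # steps 1 and 2: declare the return value with a useful default

    for cur_word in in_words:
        if len(cur_word) >= in_minimum_length:
            return_count += 1

    return return_count  # step 3: one return, written before the body was filled in


print(count_long_words(["a", "tactical", "code"], 4))
```

An empty list simply falls through the loop and returns the default of zero, which is the point of picking the default first.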

There are rare occasions when readability has to take a back seat to performance, and sometimes what the code does is just too complex to easily understand at a glance. But the readability of a variable name never impacts performance. Variable names are for the programmer, not the compiler.

Summary

  • Always give variables meaningful names.
  • Always use complete words. No abbreviations.
  • Include variable type in the name.
  • Parameter names are prefixed with ‘in’ or ‘out’.
  • Create a variable prefixed with ‘return’ and a return statement when creating a new routine.


Footnotes:

  1. Of course 10 is a “magic number” which will be the topic of another post.