A day at Hunter Industries

About a year or so ago, I had the chance to go on a Hunter Industries virtual tour. Basically, I spent a day working with them. I had been looking forward to this ever since I learned about mob programming, as that’s the place where it was born. It was amazing. I’ve had enough time to reflect on that experience, so I want to share what I believe are the most outstanding traits of that culture.

Experiment on everything

I mean, EVERYTHING. The team was working on some JS code and didn’t know how to use the IDE to debug it. The reason was that they were experimenting with that new IDE that very week! They had no prior experience with it (and neither did I), so we were stumbling along trying to figure out what was going on at runtime (console.log, anyone?). It was kind of frustrating. But that’s how it is when learning something new.

The main point is that nothing is written in stone. They experiment with team configurations, tooling, work organization, techniques, and everything in between. This is probably the thing I loved the most. There is not a “we do things this way here” mentality but instead a curious, open-minded one. If you have a proposal to do things a different way, the team can give it a try for a time and see how it works. I just wish more organizations were this flexible. After all, all improvement is a deviation from the standard.

Mobbing

So I just spoke about flexibility. As the place where mob programming was invented, I expected the practice to be mandatory. It certainly is the default way of working, but people can pair or go solo as needed. It’s just that flexible.

I have to be honest: the main reason I took the tour was that I wanted to learn firsthand how to do mob programming. I had tried before, but it didn’t turn out as I expected. I made many mistakes while trying to implement it. Having the chance to mob with a team experienced in the practice was quite instructive. It seems to me that a lot of the mistakes I made came down to deviating from the simple rules at one point or another. It’s about discipline.

Rest often

The reason we are told to hydrate continuously is not to quench thirst, but to avoid feeling thirsty in the first place. One of the practices we followed strictly was to rest for 10 minutes every 40, with a one-hour lunch. Even if we were in the middle of something, we would stop and take a break. The end result was that I didn’t feel tired even after hours of deep-concentration work. It was amazing. But it did require discipline.

Rotate often

This is one of the mistakes I made in my earlier attempts: the lack of some sort of control to keep track of who was next and when. We would often get carried away and forget to rotate who was typing and who was thinking until much later. The effect of this was disengagement from some of the team members. So rotating often is actually very important. We were rotating every 4-5 minutes, but it can vary depending on the team.

Cross-pollination

This is one of those things that sounds too radical to most organizations. It turns out that every so often, the teams switch members for a week or so. Since the teams work as mobs, getting someone up to speed is actually very fast. Heck, even I could start being productive after an hour or so, and I had never seen that codebase before. This way the knowledge of the different products is spread among all the teams. Something to note as well is that no person can remain on the same team for more than 2 years. You can switch either before or at the 2-year limit.

Mini-retros

Something to take into account is the constant feedback received from the environment. You receive feedback from your peers as you bounce your ideas off them. You receive feedback from the unit tests about your code. And then we have the mini-retro at the end of the day. I didn’t expect that, but it was actually very cool. One of the things we discussed was the need to learn how to set up the tool for debugging, and we decided to do that the next day. This was cool because I have been in similar situations before, but usually the momentum just leads me to stick with whatever hack I’ve been doing, even if it’s not optimal. It’s like trying to take down a tree with a blunt axe and deciding there’s no time to sharpen it. By having a small retro at the end of the day, we allow ourselves to pause and reconsider whether there are better ways to continue… to sharpen the axe.

Small teams

I was told that the average team size is 4. That’s another of those practices that seems irrelevant at first. But as time goes on, I’ve come to discover that having 4 people on a team is enough to challenge any idea and see if we can’t come up with something better, without getting into a deadlock of opinions. Having too many people makes reaching a consensus hard. Having too few makes it hard to have a diversity of ideas. I suspect around 4 may be the sweet spot. Coincidence? A deliberate choice? An unconscious choice? You’ll have to ask them πŸ˜‰

TDD everywhere

I love TDD. I find it the most effective way to work. If you want to give it a try, I suggest breaking everything into small tasks. So I found it refreshing to be able to work with other people this way. They were all seasoned practitioners, and I picked up a few things from them. I believe you can go a long way just working in a mob, but bringing TDD into the scene helps to focus everyone’s attention on a single task: make the specification pass. Refactoring can happen later.

Closing words

A technically sound culture doesn’t happen by accident. It requires deliberate actions. I believe that the cornerstone of the success of the team at Hunter Industries is their relentless pursuit of a better way through constant experimentation. The disposition to try, fail, and try something else has undoubtedly shaped the practices mentioned here. But since this implies throwing away current processes, practices, and policies, and considering that it’s human nature to fight change, I’m afraid it will take a long time before we start to see more companies adopt these ideas.

Peeling the onion: understanding encapsulation

Encapsulation vs Information Hiding

Information hiding, as defined by Parnas, is the act of hiding design decisions from other modules, so that no impact is felt when you change those decisions. Encapsulation says nothing about such things. It only cares about grouping related things inside a capsule. It says nothing about the transparency level of such a capsule.

Encapsulation vs access modifiers

Oftentimes I’m asked to interview candidates for software development positions, and when encapsulation comes up, 9 out of 10 will refer to access modifiers. They will talk about public, private, and other keywords and access levels. Actually, access modifiers are more related to a concept called data hiding.

Encapsulation vs Data Hiding

If you are reading this, chances are that you have already found and read other articles on the topic. And probably you found a reference to data hiding, saying that encapsulation is a way to achieve data hiding or something along those lines. But that doesn’t really help you get a clearer picture, right?

Let’s try to look at it from another angle. What is data hiding about? As the name implies, it’s about hiding data from the world. Hiding it how? Well, you hide it behind a boundary. Inside that boundary, a piece of data is well known. But outside of it, it’s non-existent. So data hiding is all about, well, data.

This raises a question: how do you define a boundary for a piece of data?

Encapsulation and Cohesiveness

Well, there’s a well-known principle for creating boundaries: cohesiveness. Cohesiveness is about putting together all things related in some way. The “some way” part can change depending on the scope, but it is usually about behavior.

It means taking the data and the operations that work upon it and drawing a boundary around them. Sound familiar?

So what’s encapsulation?

According to Vladimir Khorikov:

Encapsulation is the act of protecting against data inconsistency

Both data hiding and cohesiveness are guides we use to avoid ending in an inconsistent state. First, you put a boundary around data and the operations that act upon it, and then you step into the data/information hiding domain by making the data invisible (at some level) outside that boundary.

Easy peasy, right? One interesting thing about encapsulation is that it results in abstractions all over the code.

Encapsulation and Abstraction

Abstraction and encapsulation share an intimate relationship.

Simply put, encapsulation leads to the discovery of new abstractions. This is the reason why refactoring works.

Let’s look at some definitions:

In computing, an abstraction layer or abstraction level is a way of hiding the working details of a subsystem, allowing the separation of concerns to facilitate interoperability and platform independence.

Wikipedia

In other words, the level of abstraction is the opposite of the level of detail: the higher the abstraction level, the lower the level of detail, and vice versa.

The essence of abstraction is preserving information that is relevant in a given context, and forgetting information that is irrelevant in that context.

– John V. Guttag[1]

So an abstraction (noun) is just the representation of a concept in a way that is relevant to a given context (the abstraction level).

Which is precisely what we do when we encapsulate.

Encapsulation: First abstraction level

Look at the following code:

using System;
using System.Collections.Generic;
    
public class Test
{
    public static void Main()
    {
        decimal total = 0;
        decimal tax = 0;
        var order = new Order();

        order.Lines.Add(new OrderLine{ Item = new Item{Name = "Shampoo",Price = 12.95m}, Quantity = 2});
        order.Lines.Add(new OrderLine{ Item = new Item{Name = "Soap",Price = 8m}, Quantity = 5});
        
        foreach(var line in order.Lines)
        {
            total += line.Quantity * line.Item.Price;
        }
        
        tax = total * 0.16m;
        Console.WriteLine( total + tax);
    }    
}

public class Order
{
    public Order(){
        Lines = new List<OrderLine>();
    }
    
    public List<OrderLine> Lines{get;set;}
    
}

public class OrderLine
{
    public Item Item{get;set;}
    public int Quantity {get;set;}
}


public class Item {
    public string Name {get;set;}
    public decimal Price {get;set;}
}

Let’s try to draw some boundaries.
First, let’s look at the variables in the Main method. There are 3: order, total, and tax.
The trick here is to find where these variables are being used.

Let’s start with order.

So, in a nutshell, order is being used to calculate the value of total. Let’s take a look at that variable then.

Here we can see that total is in reality a subtotal (before taxes), and it’s used to calculate both the tax and the real total, which is implicit in the expression total + tax. So let’s fix that: let’s rename the total variable to subtotal, and let’s make the implicit real total explicit by assigning it to a variable called total.

using System;
using System.Collections.Generic;
    
public class Test
{
    public static void Main()
    {
        decimal subtotal = 0;
        decimal tax = 0;
        var order = new Order();

        order.Lines.Add(new OrderLine{ Item = new Item{Name = "Shampoo",Price = 12.95m}, Quantity = 2});
        order.Lines.Add(new OrderLine{ Item = new Item{Name = "Soap",Price = 8m}, Quantity = 5});        
     
        foreach(var line in order.Lines)
        {
            subtotal += line.Quantity * line.Item.Price;
        }
        
        tax = subtotal * 0.16m;

        var total = subtotal + tax;

        Console.WriteLine( total );
    }    
}

Good, now let’s move to the next variable, tax.

As you can see, tax is used to calculate the new total. Notice again that we have an implicit piece of data in there, 0.16m, so let’s make it explicit. Remember, you’re looking for data and the code that acts upon that data, so try to make the data easy to spot. Let’s rename tax to taxes and put the tax percentage into a variable called tax.

using System;
using System.Collections.Generic;
    
public class Test
{
    public static void Main()
    {
        decimal subtotal = 0;
        decimal taxes = 0;
        var order = new Order();

        order.Lines.Add(new OrderLine{ Item = new Item{Name = "Shampoo",Price = 12.95m}, Quantity = 2});
        order.Lines.Add(new OrderLine{ Item = new Item{Name = "Soap",Price = 8m}, Quantity = 5});        
     
        foreach(var line in order.Lines)
        {
            subtotal += line.Quantity * line.Item.Price;
        }
        
        var tax = 0.16m;
        taxes = subtotal * tax;

        var total = subtotal + taxes;

        Console.WriteLine( total );
    }    
}

Mmmm… I think we can now encapsulate some things. First, that tax variable and the operation used to calculate the value of taxes belong together, so let’s draw a boundary around them.

At the lowest abstraction level, the encapsulation boundary is a function.

using System;
using System.Collections.Generic;
    
public class Test
{
    public static void Main()
    {
        decimal subtotal = 0;
        decimal taxes = 0;
        var order = new Order();

        order.Lines.Add(new OrderLine{ Item = new Item{Name = "Shampoo",Price = 12.95m}, Quantity = 2});
        order.Lines.Add(new OrderLine{ Item = new Item{Name = "Soap",Price = 8m}, Quantity = 5});        
     
        foreach(var line in order.Lines)
        {
            subtotal += line.Quantity * line.Item.Price;
        }
        
       
        taxes = CalculateTaxes(subtotal);

        var total = subtotal + taxes;

        Console.WriteLine( total );
    }

    static decimal CalculateTaxes(decimal amount)
    {
       var tax = 0.16m;
       return amount * tax;
    }
}

Easy peasy, right? Let’s move backward and encapsulate the subtotal calculation.

using System;
using System.Collections.Generic;
    
public class Test
{
    public static void Main()
    {
        decimal subtotal = 0;
        decimal taxes = 0;
        var order = new Order();

        order.Lines.Add(new OrderLine{ Item = new Item{Name = "Shampoo",Price = 12.95m}, Quantity = 2});
        order.Lines.Add(new OrderLine{ Item = new Item{Name = "Soap",Price = 8m}, Quantity = 5});        
     
        subtotal = CalculateSubtotal(order);        
       
        taxes = CalculateTaxes(subtotal);

        var total = subtotal + taxes;

        Console.WriteLine( total );
    }

    static decimal CalculateTaxes(decimal amount)
    {
       var tax = 0.16m;
       return amount * tax;
    }

    static decimal CalculateSubtotal(Order order)
    {
       decimal subtotal = 0;
       
       foreach(var line in order.Lines)
       {
           subtotal += line.Quantity * line.Item.Price;
       }
        
       return subtotal;
    }
}

Something funny is going on. Why do we still have data (variables) lying around even after we encapsulated it? It should be non-existent to the world outside of the boundary, right? Well, there are ways to go about this. First, since the taxes variable gets used only once, we can replace it with a call to the function.

using System;
using System.Collections.Generic;
    
public class Test
{
    public static void Main()
    {
        decimal subtotal = 0;       
        var order = new Order();

        order.Lines.Add(new OrderLine{ Item = new Item{Name = "Shampoo",Price = 12.95m}, Quantity = 2});
        order.Lines.Add(new OrderLine{ Item = new Item{Name = "Soap",Price = 8m}, Quantity = 5});        
     
        subtotal = CalculateSubtotal(order);        
               
        var total = subtotal + CalculateTaxes(subtotal);

        Console.WriteLine( total );
    }

    static decimal CalculateTaxes(decimal amount)
    {
       var tax = 0.16m;
       return amount * tax;
    }

    static decimal CalculateSubtotal(Order order)
    {
       decimal subtotal = 0;
       
       foreach(var line in order.Lines)
       {
           subtotal += line.Quantity * line.Item.Price;
       }
        
       return subtotal;
    }
}

Now, let’s talk about the subtotal variable. This one is used several times, so we can’t replace it as we did previously (well, we could, but it would recalculate the same value twice). But this should pique our interest. Whenever we find this situation, it means there are operations related to this data, and we need to encapsulate further! After looking carefully, we notice we haven’t done anything about the total variable! Are there any operations related to it? Let’s pack ’em up!

using System;
using System.Collections.Generic;
    
public class Test
{
    public static void Main()
    {
        var order = new Order();

        order.Lines.Add(new OrderLine{ Item = new Item{Name = "Shampoo",Price = 12.95m}, Quantity = 2});
        order.Lines.Add(new OrderLine{ Item = new Item{Name = "Soap",Price = 8m}, Quantity = 5});       
            
        Console.WriteLine(CalculateTotal(order));
    }

    static decimal CalculateTaxes(decimal amount)
    {
       var tax = 0.16m;
       return amount * tax;
    }

    static decimal CalculateSubtotal(Order order)
    {
       decimal subtotal = 0;
       
       foreach(var line in order.Lines)
       {
           subtotal += line.Quantity * line.Item.Price;
       }
        
       return subtotal;
    }

    static decimal CalculateTotal(Order order)
    {
      decimal subtotal = 0;    

      subtotal = CalculateSubtotal(order);        
               
      var total = subtotal + CalculateTaxes(subtotal);
      
      return total;
    }
}

Let’s review this. Now we have an order with some data and the display of the total amount of money to pay for such an order. How the total is calculated is irrelevant in this context. We switched from the how (lowest abstraction level, high level of detail) to the what (higher abstraction level, lower level of detail). And this happens every time we encapsulate code.

Ok. Now we’re ready for the next step.

Encapsulation: Second abstraction level

If you have read up to this point, you must be tired. I know I am just from writing it. So let’s make this fast.

Let me ask you something. Do you see a piece of data and a piece of code that acts upon it? What’s that? Correct, order! It’s just that this time we are dealing with a data structure rather than primitives. In this situation, where you are using data from inside a data structure, encapsulating it behind a function won’t solve the problem. You need to turn the data structure into a full-fledged object by moving the functionality (behavior) into it.

At the second abstraction level, the encapsulation boundary is an object.

using System;
using System.Collections.Generic;
    
public class Test
{
    public static void Main()
    {
        var order = new Order();

        order.Lines.Add(new OrderLine{ Item = new Item{Name = "Shampoo",Price = 12.95m}, Quantity = 2});
        order.Lines.Add(new OrderLine{ Item = new Item{Name = "Soap",Price = 8m}, Quantity = 5});       
            
        Console.WriteLine(order.CalculateTotal());
    }   
}

public class Order
{
    public Order(){
        Lines = new List<OrderLine>();
    }
    
    public List<OrderLine> Lines{get;set;}

    static decimal CalculateTaxes(decimal amount)
    {
       var tax = 0.16m;
       return amount * tax;
    }

    decimal CalculateSubtotal()
    {
       decimal subtotal = 0;
       
       foreach(var line in Lines)
       {
           subtotal += line.Quantity * line.Item.Price;
       }
        
       return subtotal;
    }

    public decimal CalculateTotal()
    {
      decimal subtotal = 0;    

      subtotal = CalculateSubtotal();        
               
      var total = subtotal + CalculateTaxes(subtotal);
      
      return total;
    }
    
}

public class OrderLine
{
    public Item Item{get;set;}
    public int Quantity {get;set;}
}


public class Item {
    public string Name {get;set;}
    public decimal Price {get;set;}
}

There you go! How’s that? Do you still see more members of the order being acted upon by a piece of code outside the order? Of course, the Lines collection! The Main method is still manipulating the Lines collection of the order! Let’s fix that!

using System;
using System.Collections.Generic;
    
public class Test
{
    public static void Main()
    {
        var order = new Order();

        order.AddLine(new Item{Name = "Shampoo",Price = 12.95m}, 2);
        order.AddLine(new Item{Name = "Soap",Price = 8m}, 5);       
            
        Console.WriteLine(order.CalculateTotal());
    }   
}

public class Order
{
    public Order(){
        Lines = new List<OrderLine>();
    }
    
    List<OrderLine> Lines{get;set;}

    public void AddLine(Item item, int qty)
    {
      Lines.Add(new OrderLine {Item = item, Quantity = qty});
    }

    static decimal CalculateTaxes(decimal amount)
    {
       var tax = 0.16m;
       return amount * tax;
    }

    decimal CalculateSubtotal()
    {
       decimal subtotal = 0;
       
       foreach(var line in Lines)
       {
           subtotal += line.Quantity * line.Item.Price;
       }
        
       return subtotal;
    }

    public decimal CalculateTotal()
    {
      decimal subtotal = 0;    

      subtotal = CalculateSubtotal();        
               
      var total = subtotal + CalculateTaxes(subtotal);
      
      return total;
    }
    
}

public class OrderLine
{
    public Item Item{get;set;}
    public int Quantity {get;set;}
}


public class Item {
    public string Name {get;set;}
    public decimal Price {get;set;}
}

Voilà! And while I was at it, since no one else was playing around with the lines collection, I did some data hiding and made it private, so it became non-existent outside of the boundary it lives in, in this case, the object. What’s the point, I can hear you say? You are just replacing a data member with a method member, so what? Well, what if you wanted to change the Lines definition from List<OrderLine> to Dictionary<string, OrderLine> for performance reasons? How many places would you have to change before encapsulating that? How many after? And what if you wanted to add a validation to check that you are not inserting 2 lines with the same product, and instead just increase the quantity?
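
For instance, that last change now has exactly one obvious home. Here’s a sketch of how it might look (my version, not part of the challenge code):

    public void AddLine(Item item, int qty)
    {
      // A sketch of the validation discussed above: AddLine is now the only
      // door into the Lines collection, so the rule lives in exactly one place.
      var existing = Lines.Find(line => line.Item.Name == item.Name);

      if (existing != null)
          existing.Quantity += qty; // same product: just increase the quantity
      else
          Lines.Add(new OrderLine { Item = item, Quantity = qty });
    }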

By encapsulating the code, effectively raising the abstraction level, you start to deal with concepts in terms of what instead of how. And if you only show the what to the outside, you can always change the how on the inside.

Anyway, now let’s go back to the CalculateSubtotal function. Can you see a piece of data that is being used by a piece of code? πŸ˜‰
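
In case you want to check your answer, here’s a sketch of where that hint might lead: the quantity and the item’s price live in the order line, so the operation that multiplies them arguably belongs there too.

public class OrderLine
{
    public Item Item { get; set; }
    public int Quantity { get; set; }

    // A possible next step (my sketch): the line total moves into OrderLine...
    public decimal Total => Quantity * Item.Price;
}

// ...so CalculateSubtotal stops reaching into the line's internals.
decimal CalculateSubtotal()
{
    decimal subtotal = 0;

    foreach (var line in Lines)
        subtotal += line.Total;

    return subtotal;
}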

So there you have it. This is becoming too long, so I’ll wrap it up here. You can encapsulate forever at different abstraction levels: namespace, module, API, service, application, system, etc. It’s turtles all the way down!

Meanwhile, I challenge you to try this on your own codebase. Let me know about your experience in the comments! Have a good day!

P.S. The code in this post is part of a challenge I put out for the devs at my current job. Wanna give it a try? You can find it here: https://github.com/unjoker/CoE_Challenge. Send me a PR to take a look at your solution πŸ˜‰

How I escaped the trap of the table-object binding

I really can’t recall the moment when I started thinking that an object was to be persisted as a table (hence we have a table for every object). I think it was on my first job, using an ORM, but I’m not sure. So what? Mine is not a unique story: almost every software developer I know has been through this. I blame schools for this.

So where was I? Oh! The table-per-object idea. That’s absurd. But I digress, let’s move forward. Once upon a time, I worked for a lending company. I was trying to model a loan application process. Something like:

So here it is. Imagine you have to code this. In case you’re not familiar with state machine diagrams, the lines are actions and the circles are states. Before moving forward, get a piece of paper and design a system to automate this process. Just outline the objects and their interactions. Go on, I’ll wait here.

Now, there are 2 different approaches here that will yield very different designs. Let’s review them.

Data centric

The data-centric approach, aka database first, is probably the most common one. Basically, you use the relational paradigm to define objects… and this has just one caveat: the result is not objects but tables. You create tables and represent them in code. This was my first approach, and I ended up with something like:

Good. Does your design look like this? If so, you’re in for a fun ride. Let me show you what I mean. Suppose you want to implement this design. It’ll probably look like:

class LoanApplication
{
   ...
   public string Status {get; private set;}

   public void StartReview()
   {
     // guard clause: the action is only valid in one specific state
     if(Status != "Awaiting Review") throw new InvalidOperationException();
     //do something here
     Status = "Reviewing";
   }

   public void SubmitDocs()
   {
     if(Status != "Docs Required") throw new InvalidOperationException();
     //do something here
     Status = "Reviewing";
   }

   public void Withdraw()
   {
     if(Status != "Awaiting Review" && Status != "Reviewing" && Status != "Approved")
           throw new InvalidOperationException();
     //do something here
     Status = "Cancelled";
   }
   ...
}

You can see where this is going. The actions depend on the current state of the process, so with every new state, you will have to modify the guard clauses of the actions that you want to enable. But if you follow a data-centric approach, you probably won’t be bothered by this. You will assume that’s the only way. Luckily for me, the fact that I had to change several methods every time I introduced a new state led me to further experimentation.

The behavior centric approach

As the name implies, this approach focuses on object behavior. That means that the organizing principle is behavior rather than data. The result is rather radical:
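
(A minimal sketch of the shape of that design; the class names and a couple of the transitions are my reconstruction from the state machine described above, not the original code.)

// Each state is its own class, and each class exposes
// only the actions that are legal in that state.
public class AwaitingReview
{
    public Reviewing StartReview() => new Reviewing();
    public Cancelled Withdraw() => new Cancelled();
}

public class Reviewing
{
    public DocsRequired RequireDocs() => new DocsRequired(); // guessing at the
    public Approved Approve() => new Approved();             // remaining transitions
    public Cancelled Withdraw() => new Cancelled();
}

public class DocsRequired
{
    public Reviewing SubmitDocs() => new Reviewing();
}

public class Approved
{
    public Cancelled Withdraw() => new Cancelled();
}

public class Cancelled { }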

So now you have objects that represent the states in the loan application process. The main benefit is that the code is simplified, as only the actions allowed in the current state are advertised. See the difference? The behavior associated with each state is what drives the design. I loved this design. It was simpler. I no longer had to check the state before sending a message to the object. The only problem left was: how do you persist this?

Interaction between paradigms

So now, you’re probably wondering if you have to create many tables to store the data for each object. I have talked about having different models for an application before. This basically means that we can use a data-centric approach for the database design, which is basically the relational paradigm. At the same time, we can use a behavior-centric approach to design the objects; this behavior-centric approach is at the heart of the object-oriented paradigm. Having different models interact with each other is just part of dealing with the impedance mismatch problem. The way I dealt with it was by using the repository pattern. Something like:
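
(A rough sketch of the idea rather than the original code: an in-memory dictionary stands in here for the single LoanApplications table, and the state classes are the ones sketched above.)

using System;
using System.Collections.Generic;

public class LoanApplicationRepository
{
    // Stand-in for a LoanApplications table with Id and Status columns.
    readonly Dictionary<Guid, string> _table = new Dictionary<Guid, string>();

    public void Save(Guid id, object state) =>
        _table[id] = state.GetType().Name; // the object's type becomes the Status column

    public object Load(Guid id)
    {
        switch (_table[id]) // the Status column decides which class to rebuild
        {
            case nameof(AwaitingReview): return new AwaitingReview();
            case nameof(Reviewing):      return new Reviewing();
            case nameof(DocsRequired):   return new DocsRequired();
            case nameof(Approved):       return new Approved();
            default:                     return new Cancelled();
        }
    }
}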

This way I had several objects being stored in and loaded from a single table. The result was nice relational and object-oriented designs that were able to evolve independently.

Closing thoughts

Probably the most important idea I learned from this experience is that a concept in the problem space can have several representations in the solution space.

Open your mind to this idea and try experimenting with it on your own project. As always, let me know your thoughts.

Liskov’s substitution explained like I’m five

Liskov’s substitution is one of those topics that are self-evident once you get it. However, explaining it can be tricky since the concept itself relies on other ideas. Besides, there are different aspects you can dig deeper into, so you can always find something new to learn about it. Anyway, let’s talk about the main ideas and leave the minutiae for another day.

The piano metaphor

I’ll assume that everyone knows what a piano is. If you have seen a piano, you will recognize one even if it is a grand piano, a digital piano, or even a virtual piano! Likewise, if you know how to play the piano, you can play any kind of piano as long as it behaves as such. Let’s break this down.

The shape

How is it that once you have seen a piano, you have seen them all? I mean, a digital piano is made in a very different way than a grand piano; how come you can identify both? Well, it turns out all the pianos in the world share 2 kinds of keys (naturals & flats/sharps) that are set up in a very specific order. No matter what, if you see these keys arranged in that way, you know it is a piano. Even if you can’t see under the hood!

The behavior

The reason a piano player can play any piano is that he knows that a key in a given position will produce a note in a given tone. Since this is the same on every piano, he can confidently play a piece on any kind of piano. There may be some differences in the strength needed to hit the keys, but that is irrelevant. The fact that the same key on any piano will give the same tone is a guarantee.

The contract metaphor

Bertrand Meyer laid out the ideas upon which Liskov’s substitution was built. The main one is the idea of a contract. It basically says that 2 pieces of code can collaborate safely by doing it through a contract. The contract specifies what services are available, what is required to make use of them (pre-conditions), and what you can expect from them (post-conditions). As you will soon see, this idea of a contract has 2 different instances in OOP: at the class level and at the method level.
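
Most mainstream languages don’t have first-class support for contracts, but the idea can be sketched with plain assertions. A minimal, hypothetical example:

using System.Diagnostics;

public class Account
{
    public decimal Balance { get; private set; }

    public void Withdraw(decimal amount)
    {
        // Pre-condition: what the caller must guarantee before using the service
        Debug.Assert(amount > 0 && amount <= Balance);

        var balanceBefore = Balance;
        Balance -= amount;

        // Post-condition: what the service guarantees in return
        Debug.Assert(Balance == balanceBefore - amount);
    }
}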

Class contracts

To better understand Liskov’s substitution principle, you have to understand the idea of contracts, applied to classes.

Simply put, it is a contract around the “shape”: it specifies the services provided by objects of a certain class. Back to the piano analogy, it states that all pianos have keys in a given order, no matter the size, color, or shape of the body. In code, this kind of contract is expressed as interfaces, classes, or abstract classes.

Another aspect of a class contract is around its state: invariants. Invariants are the rules that must be followed by all instances of the class for them to be valid, e.g. an hour object cannot have more than 60 minutes nor fewer than 0.
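
For example, here’s what that hour invariant could look like in code (a sketch):

using System;

public class Hour
{
    public int Minutes { get; }

    public Hour(int minutes)
    {
        // Invariant: minutes must stay between 0 and 60,
        // so no instance can ever hold an invalid value.
        if (minutes < 0 || minutes > 60)
            throw new ArgumentOutOfRangeException(nameof(minutes));

        Minutes = minutes;
    }
}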

Method contracts

On the other hand, a contract applied to a method is a contract around “behavior”. It states that shape alone is not enough for a piano to be a piano: whenever you hit the C key, it has to make a sound in the C tone. The tricky part is that in most languages there are no artifacts (like interfaces) that enforce this kind of contract. Even worse, we ourselves are very lousy at defining these kinds of contracts.

Imagine the following code:

int Add2(int plus){...}

What would you expect of the following expression?

int result = Add2(plus: 3);

What would you think if the result was anything other than 5, or if it threw an exception? It makes no sense, right? The function says that it is going to add 2 (name) to an integer (parameter) and return the resulting integer (return type). So given that we provide it with a valid value (pre-condition), we would expect a valid value in return (post-condition). There is more to say about this, but I’ll stop here.
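
For reference, an implementation that honors that implied contract is exactly as boring as you’d hope:

// The obvious implementation that meets the contract implied by the signature.
int Add2(int plus) => plus + 2;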

The Substitution principle vs the Liskov’s substitution principle

A common confusion when trying to understand Liskov’s substitution principle is with regard to the substitution principle. Let’s fix that.

The substitution principle basically states that a class contract must be respected by its subclasses.

Liskov’s substitution principle states that the invariants and method contracts of a class must be respected by its subclasses.

Let’s look at the following example:

public class Piano
{
    protected Dictionary<string, Action> Keys;

    public Piano()
    {
        Keys = new Dictionary<string, Action>
        {
            {"C", makeSound(tone:"C")},
            {"C#", makeSound(tone:"C#")},
            {"D", makeSound(tone:"D")},
            {"D#", makeSound(tone:"D#")},
            {"E", makeSound(tone:"E")},
            {"F", makeSound(tone:"F")},
            {"F#", makeSound(tone:"F#")},
            {"G", makeSound(tone:"G")},
            {"G#", makeSound(tone:"G#")},
            {"A", makeSound(tone:"A")},
            {"A#", makeSound(tone:"A#")},
            {"B", makeSound(tone:"B")}
        };
    }

    public void Play(string key) => Keys[key].Invoke();

    protected Action makeSound(string tone)
    {
        //do some magic and return an action that sounds the given tone
        return () => Console.WriteLine(tone);
    }
}

So the class contract of this Piano class says that there is a Play method. The pre-condition is that it receives a valid key, and the post-condition is that some sound is made. Now check this class:

public class BadPiano : Piano
{
    public BadPiano()
    {
        Keys["C#"] = () => throw new NotImplementedException();
        Keys["A"] = makeSound("E");
    }
}

This class violates the Play method’s contract. If we try to play the tone C#, it is going to throw an exception, and if we try to play the tone A, it’s going to play E instead, whereas its parent won’t. Get it? It’s not just that the behavior is different from its parent’s; it’s actually outside the expectations of the client. Using this would at the very least produce a weird interpretation, and at worst it will blow up the whole program. Again, there’s more to say here, but I’ll refrain.

Closing thoughts

I hope this helps you grasp the ideas behind Liskov’s substitution principle. While there’s more to be said about the topic, I wanted this to be an introduction. If you’re interested in learning more, let me know in the comments and I will try to follow up later. Enjoy!

Using metaphors to make the code easy to understand

I mentioned this before, but to me, high-quality code has 3 attributes: it’s easy to understand, easy to change, and correct. I always start by trying to make any piece of code as easy to understand as possible. If you make it easy to understand, even if it’s not easy to change or correct yet, you are in a much better position than otherwise. Whenever I’m mentoring, I always explain it like this: if you can understand the “what”, you can change the “how”.

So, making the “what” explicit, that’s the challenge.

The socio-technical space

There’s this idea in DDD circles called the socio-technical space. The way I like to think of it is as a continuum that has technical issues/solutions on one side and social issues/solutions on the other.

When you start looking at social issues, the concepts and their interactions provide you with a nice framework in which you can reason about the problem. Often, your design will take after these concepts as the building pieces of your solution. That means that if you are working on a system for a banking domain, you will likely have objects like accounts, money, and credit.

But what about when you are solving a highly technical problem where the concepts are too vague, too abstract, or too low-level? Well, you can try defining your own concepts to reason about the problem and to explain your solution. Or you could use a metaphor.

A technical challenge for you

Exercism.io is a platform for practicing problem-solving in a programming language. I recommend it to any developer who takes pride in his/her craft. So I was solving the Spiral Matrix problem (log in/sign up to access the problem). Before you continue reading, I challenge you to solve it. Go on, I’ll wait for you.

So the problem states that given a size, you have to create a Matrix[size, size] and fill it with numbers starting from 1 up to the last element. Suppose you have a Matrix[5, 5]; then you would have to fill all the slots with numbers 1 to 25. The tricky part is that you have to follow an inward spiral pattern. Are you interested now? Try solving it!

Metaphors to the rescue!

The first time I heard about metaphors in the software development realm was in relation to XP. The idea is simple: use a metaphor to drive the system design. Kent Beck used this at the overall system design level (architecture). But this time I’ll apply it on a smaller scale: the Spiral Matrix solution.

Each XP software project is guided by a single overarching metaphor… The metaphor
just helps everyone on the project understand the basic elements and their relationships.

-Kent Beck, Extreme Programming Explained

Patterns, patterns everywhere!

There are many ways to solve the Spiral Matrix problem. The most obvious solution is to sense the surrounding cells as you move. However, as I was looking at the numbers, I found a pattern in them. It turns out that you can calculate the turning points.

Here I marked all the turning points for a 3×3 matrix. If you lay out the numbers, the pattern makes itself visible.

So going from right to left, you’ll notice that the distance between the first 2 turning points is 1 (where distance is how many spots you have to traverse before finding the next turning point). After those 2 turning points, the distance increases by 1. And the sequence goes on: every 2 turning points, the distance increases by 1, until it reaches size-1. I’ll leave it to you to come up with an algorithm to take advantage of this. By the way, the number of turning points is equal to (size * 2) – 2.

Enough talk, show me the code!

So I wanted to make this pattern as obvious as I could, but after the first implementation, it was everything but obvious. After looking closely, I noticed there were several things happening at the same time: keeping track of the corresponding number, moving on the grid, and knowing when to turn. So I decided to create objects to handle those responsibilities… but what should I call them?

Sure, you can call your objects whatever you want, but I wanted to make everything as clear as possible. Easy to understand, remember? So one of the responsibilities was to “navigate” the matrix. This led me to think of a map. A map helps you navigate, right? Who would use a map? An explorer, right? After some iterations I ended up with something like:

public static int[,] GetMatrix(int size)
{
    var terrain = new int[size, size];

    var compass = new Compass(size);

    new Explorer().ExploreTerrain(terrain, compass);

    return terrain;
}

So imagine you were tasked with starting the numbering at 3 instead of 1. You come and find this code. You’ll probably be puzzled, but the objects make sense to you, because you understand the relationship between an explorer and a compass. You understand how a compass is used by an explorer. And knowing that, it makes sense to you that the explorer would use a compass to explore a terrain. Actually, it would be weird if he didn’t. But all of this happens in the back of your mind in a fraction of a second, without you really noticing it. So you go and check the ExploreTerrain method.

public void ExploreTerrain(int[,] terrain, Compass compass)
{
    while (_stepsTaken <= terrain.Length)
    {
        mapCurrentPosition(terrain);
        adjustDirection(compass);
        advance();
    }
}

Again, this code is taking advantage of your existing knowledge on the matter of exploration. Wait, what is this mapCurrentPosition doing? I think I know, but let’s confirm it.

 void mapCurrentPosition(int[,] terrain) => 
    terrain[_currentPosition.Y, _currentPosition.X] = _stepsTaken;

Oh! So it’s putting a number in there… given what we know, this should be the corresponding number… so that is referenced as _stepsTaken! OK, let’s go back. Wait, how is adjustDirection accomplished?

void adjustDirection(Compass compass)
{
    if (compass.IsTurningPoint(_stepsTaken))
        _currentPosition.TurnRight();
}

So if the compass says that I need to turn at the current step, I turn right (notice how this didn’t puzzle you, because using a compass to figure out whether you need to turn is something you understand, maybe have even experienced before). Maybe we should rename that _stepsTaken variable to _currentStep? Let’s go back and figure out what the advance method does.

void advance()
{
    _currentPosition.Forward();
    _stepsTaken++;
}

Well, yeah, as expected. I wonder: how does the _currentPosition move forward? (Notice we are questioning the “how”, not the “what”. We understand what “moving forward” means when exploring.) But hold on! Where is that _stepsTaken initialized?

class Explorer
{
    int _stepsTaken = 1;
    ...
}

Bingo! Let’s initialize this variable to 3 instead of 1, and presto!

class Explorer
{
    int _stepsTaken = 3;
    ...
}
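
By the way, if you’re curious how the compass knows where the turning points are, here’s a sketch of how it could precompute them from the pattern we found earlier. This is my own take on it, not necessarily what’s in the full code linked below:

using System.Collections.Generic;

class Compass
{
    readonly HashSet<int> _turningPoints = new HashSet<int>();

    public Compass(int size)
    {
        var position = 1;         // the starting cell holds the number 1
        var distance = size - 1;  // the first runs are size-1 steps long
        var runsLeft = 3;         // size-1 occurs 3 times; every shorter distance, twice

        while (distance > 0)
        {
            position += distance;
            _turningPoints.Add(position); // a turn happens at the end of every run

            if (--runsLeft == 0)
            {
                distance--;
                runsLeft = 2;
            }
        }

        _turningPoints.Remove(position); // the last cell ends the spiral; no turn there
    }

    public bool IsTurningPoint(int stepsTaken) => _turningPoints.Contains(stepsTaken);
}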

I think you got the idea. If you want to check the details, you can find the whole code here.

Closing thoughts

Hopefully, at this point, the advantages of using a metaphor have become evident (especially in an object-oriented system).

Another benefit of using a metaphor is communication. Good metaphors are based on everyday experiences that a lot of people can relate to. This will allow you to convey ideas about the system design/architecture to non-technical people, which becomes increasingly important in agile settings, where the customer is part of the team.

I hope this piques your curiosity about using metaphors in code. We already use them to explain our ideas in other settings, so why not in our code too? I challenge you to do it!

How to improve the signal to noise ratio of your code

As I have previously shared, code quality can be summarized along 3 axes: it is easy to understand, easy to change, and correct.
Today I want to talk about a trait that indicates how easy a codebase is to understand: the signal to noise ratio.

What is signal to noise ratio?

Signal-to-noise ratio (SNR or S/N) is a measure used in science and engineering that compares the level of a desired signal to the level of background noise

https://en.wikipedia.org/wiki/Signal-to-noise_ratio

In software development, this means how much of your code expresses your intentions/ideas/knowledge vs how much doesn’t.

Why is signal to noise ratio important?

Well, as mentioned before, this is an indicator of how easy your code is to understand. That means how much time and mental effort are required to understand what the code does and, more importantly, why it does it that way. Understanding these 2 facts is a requirement before changing how the code works. There’s no workaround for that.

What is the most influential factor on the signal to noise ratio?

If I were to pick the single attribute of a codebase that most changes its signal to noise ratio, it would be the abstraction level. You see, in my experience, a poor signal to noise ratio comes from either under abstraction (too much detail) or over abstraction (too many layers of artifacts, too much indirection).

Under abstraction and its effect on the signal to noise ratio

How many times have you been tasked to make a little, tiny change in behavior, only to find yourself facing a 200-line function (… I just had a PTSD episode)? The problem with a 200-line function is that there’s too much detail to easily figure out the what and the why.

This detail overload doesn’t happen just at the level of huge functions, but also at the level of language constructs. Take a look:

decimal orderTotal = 0;
foreach(var line in orderLines)
{
    orderTotal += line.Total;
}

So as you can see, the idea here is that the order total is the sum of the order lines’ totals. So what code here isn’t relevant to that idea? Think about it for a moment. Done?

decimal orderTotal = 0;
foreach(var line in orderLines)
{
    orderTotal += line.Total;
}

Surprise! I bet a lot of you didn’t see that coming! This is because sometimes we get so used to the language that we take those things for granted. I know I did. It took me a lot of effort learning Smalltalk (and banging my head against the wall every time I tried to do something new) to rewire some parts of my brain. But you can’t deny it: iterating over the lines is just a detail of summing up the lines’ totals. It does not help convey the main idea. It’s noise. How would you fix that? Actually, there are several ways.

decimal sumLinesTotal(){
    decimal linesTotal = 0;
    foreach(var line in orderLines)
    {
        linesTotal += line.Total;
    }
    return linesTotal;
}
...
decimal orderTotal = sumLinesTotal();

How’s that? Not a big deal, right? But now there’s no doubt about the code’s intention. I know, some of you may think this is dumb. The code itself wasn’t that complex to start with, so why should we create a new function just for this? Well, what do you think would happen to a 200-line function if you started doing this? Not only for loops, but every place where implementation details (the how) appear. I dare you to try it. Now, if you are using C#, there are other ways to be explicit about this:

 decimal orderTotal = orderLines.Sum(orderLine=>orderLine.Total);

Over abstraction and its effect on the signal to noise ratio

Over abstraction happens when we add unnecessary artifacts to a codebase. This is a prime example of accidental complexity. A very common cause of this is speculative generality: the idea that someday we may need to do something, and preparing the code to handle such cases even when we don’t have the need right now. But there are more common, more subtle cases.

So let’s say we have a report API to which we make requests:

public EmployeeData GetEmployeeData(Guid id);

public class EmployeeData
{
    Guid Id;
    ...
}

public ManagerData GetManagerData(Guid id);

public class ManagerData
{
    Guid Id;
    ...
}

So our relational mindset tells us that we are duplicating data here (Id) and that we should remove that duplication.

public class ReportData
{
    Guid Id;
}

public EmployeeData GetEmployeeData(Guid id);

public class EmployeeData : ReportData
{
    ...
}

public ManagerData GetManagerData(Guid id);

public class ManagerData : ReportData
{
    ...
}

Great! Duplication removed! But wait! We can go even further! Isn’t everything we’re returning just report data? Let’s make that explicit!

public class ReportData
{
    Guid Id;
}

public ReportData GetEmployeeData(Guid id);

public class EmployeeData : ReportData
{
    ...
}

public ReportData GetManagerData(Guid id);

public class ManagerData : ReportData
{
    ...
}

But now the client code needs to cast the result to the concrete type. Maybe we can make the ReportData object accommodate different sets of data?

public class ReportData
{
    Guid Id;
    Dictionary<string, object> Data;
}

public ReportData GetEmployeeData(Guid id);

public ReportData GetManagerData(Guid id);

So now let’s say you are given a ReportData object. How can you know if you are dealing with an employee’s or a manager’s data? You could query the data dictionary for a particular key that represents a property available only on employees (or managers), or worse, you could introduce a key in the dictionary that says which type of data is contained in it, moving from strongly typed to stringly typed. This is all noise. The signal has been effectively diluted.
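
To make the noise tangible, here’s roughly what the client code ends up looking like (the key names are hypothetical, but representative):

var report = GetManagerData(id);

// Stringly typed: the compiler can't help us anymore, so we interrogate
// the dictionary with magic strings and cast on faith.
if ((string)report.Data["Type"] == "Manager")
{
    var teamSize = (int)report.Data["TeamSize"]; // hope the key exists and the cast holds
}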

Some guidelines to improve your signal to noise ratio

By this point, I hope it is clear to you that using the right abstraction level is key to improving your signal to noise ratio. So I’ll share with you some of my observations on the abstraction process.

Step 1: remove noise by encapsulating details away into functions

Encapsulation and abstraction are closely related. I’ll talk about that in another post. Suffice it to say that as you encapsulate details away, you’re also raising the abstraction level. The trick to avoid going overboard is to think about what you want to express: the signal. Is it clear enough? A good rule of thumb is trying to keep your functions at 5 lines or less.

Step 2: uncover the objects

You will find that some functions act upon the same set of data. Those are objects hidden in the mist. Move both the data and the functions that act upon it into a class. Naming the class will have an impact on the clarity of your signal, but don’t worry about getting it right the first time; you can rename it (and you will) as your understanding increases.

Step 3: wash, rinse and repeat

Repeat the 2 previous steps over and over. If the idea you want to convey is still not clearly expressed by the code, go to step 4.

Step 4: select a metaphor

To be discussed in the next post. πŸ™‚

A quick comment on comments

As I began writing, I mentioned that you need to understand the what as well as the why of the code. The former can clearly be expressed by the code; if that’s not the case, you haven’t reached the right abstraction level yet. As for the latter, this is the only situation in which I find comments justifiable. Explain the constraints, or whatever it is that led you to choose the current solution.

Closing thoughts

Man, that was longer than I expected! I hope this gives you some hints on what to look for the next time you are doing a code review (of your code or someone else’s). As always, if you have any comments, doubts, or whatever, leave them below. Good coding!

Problem space, solution space, and complexity explained with pictures

For the last couple of years, my work can be described as nothing but refactoring. And I like it. It’s like taking away the mist surrounding the forest. As you move forward, you start to gain a better sense of the code’s intention and to detect places where complexity has made its nest.

Complexity is a strange beast. According to Ward Cunningham, there are 2 kinds of complexity: empowering complexity (“Well, that’s an interesting problem. Let me think about that for a while”) and difficulties (blockages from progress). Does this sound familiar? Where do complexity and difficulties come from? To answer this, let’s take a look at the idea of problem space and solution space.

Problem and solution space

The problem space

As depicted in the picture, the problem space is a conceptual space delimited by some rules and constraints. More importantly, it includes the current state of affairs and the desired state. It is inside the boundaries of this space that solutions are born.

The solution space

As you can see, the solutions are not all equal. Obviously, solution 2 is better than solution 1. This leads me to Ward’s definition of simplicity: simplicity is the shortest path to a solution. Or, in the context of our drawing, the shortest path to the desired state. By the same token, we could say complexity is any path that’s longer than necessary.

Now, this may be tricky. It is possible that solution 2 in our example/drawing requires a kind of knowledge that we don’t currently possess. In that case, we can’t even think of that solution. Or we can’t understand it when it is presented to us. It would take us extra effort to acquire that knowledge before we could find solution 2 in our problem space. Hence why it is important that we try to have a breadth of knowledge of the (mostly thinking) tools out there. But I digress.

So the shortest path, huh? Well, “shortest” is not the same for everyone.

Essential complexity

In this picture, the solution in problem space 2 is more complex than the one in problem space 1, not because of the solution itself but because of the problem space.

This “distance” between the initial and desired states is known as essential complexity. No matter what you do, the solutions in problem space 2 will be more complex than most of the solutions in problem space 1. The problem is simply more complex.

Accidental complexity

But what about this?

Clearly, problem space 2 is more complex than problem space 1. Still, the solution in problem space 1 is more complex than the one in problem space 2!

This is known as accidental complexity. It’s the complexity that comes from the solution we chose. Accidental complexity is our fault and is ours to solve.

And what about difficulties?

Well, now we have found where the complexity comes from. But what about difficulties? Let’s review the definition:

A difficulty is just a blockage from progress.

Mmmm… from progress? That implies we are already on the path to our destination. Is it the path of solution 1 or solution 2? It doesn’t matter. What matters is that a solution has been selected and we are traversing it. Keep this in mind as Ward enlightens us once again:

The complexity that we despise is the complexity that leads to difficulty.

That is accidental complexity!
The difficulty is born out of accidental complexity!

Final thoughts

So there you have it. I’ve been thinking about this stuff for a while. I still do.

As I continue to refactor code, I find myself understanding more about the solution, and about the problem space itself. I believe the main difference between a programmer and a consultant is that consultants start in the problem space, meaning they have the autonomy to explore and select solutions, whereas programmers are tasked to work in the solution space from the get-go. That being said, most of the time we don’t know how good a solution is until we code it.

This leads me to the tip of the day: if it’s hard, that is, if the way is full of difficulties, maybe you are taking the long path. Try stepping back and asking yourself: “is there another way to accomplish my objective?”

Depending upon abstractions is not about interfaces, it’s about roles

I recently stumbled upon some code where someone took an object and extracted an interface from its methods, something like:

class Parent : IParent
{
    public void Teach(){}
    public void Work(){}
}

interface IParent
{
    void Teach();
    void Work();
}

I’ve seen many people (including myself, tons of times) do this and think: “There. Now we are depending upon abstractions“. The truth is, we are depending on an interface, but depending on abstractions is way more than that.

An object design guideline

All objects have a raison d’être: to serve. They serve other objects, systems, or users. Although that may seem obvious, I’ve found that it’s something often overlooked.

Warning: Rant ahead.

I have mentioned this before, but I believe the main reason object-oriented programming is often criticized is that it is not well understood.

The idea of an object as an abstract concept that can represent either code or data has not reached enough people to change the overall perception.

A lot of the people I have seen complaining about OOP are doing structured programming. They still tend to separate the data from the operations that are done upon it. Basically, structs and modules. It’s sad because this yields software that is hard. Hard to change, hard to understand, hard to correct. It is not soft (as in soft-ware). I blame schools for this. At least in my particular experience, OOP is often taught as an extension of structured programming, much like C++ is often seen as an extension of C.

We need to reeducate ourselves on the way we think: OOP is not about using object-oriented technology but about thinking in an object-oriented fashion.

This is the reason I started this blog.

End of Rant πŸ˜›

So thinking of objects as either data bags or function bags is the result of ignoring a fundamental design question: whom does this object serve?

To answer this question, you have to start with the client’s (object, system, user) needs. This lends itself to a top-down analysis/design approach. But a lot of us are trained to start a system design by thinking about the structure of a relational database, which is a bottom-up approach. Let’s see how they differ from each other.

The Database first approach

When designing a relational database, the thinking tools available are entities and the relationships between them, often displayed in an ER diagram. So we start with entities taken from the nouns of the domain: Parent, Teacher, Student, Child, Class, Course, and so on. I’m pretty sure you can think of a domain just by looking at these concepts.

Now that you have these entities, you have to think about the processes that interact with them. How do we create a new student? How do we update some of its data? How do we delete it? If you look closely, you will find that almost everything is modeled as CRUD operations around the entities. In this scenario, the entities are your abstractions.

The Objects first approach

In this case, you would start by thinking about the needs of the user. These are often expressed as tasks. We usually discover and document them in the form of user stories or use cases. This initial set of needs will serve as the basis for the features of the system. We can now start creating the objects to fulfill these needs. Often these objects will represent the tasks expressed by the user. This is what is known as the application layer in DDD.

From here on things start to get interesting. Pick one of these task objects. What do you need to accomplish this particular task? These are the needs of the object. Now here comes the trick: define an interface/abstract class that fulfills one specific need and name it as such. By doing this we force ourselves to define a specific concept for a specific need in a specific operation. We call this kind of concept a Role.

I love the naming schema that Udi Dahan uses for Roles: IDoSomething / ICanDoSomething. In this approach, roles are your abstractions.
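Here’s a minimal sketch of how this could look (EnrollStudent, ICanRecordEnrollments, and ICanNotifyParents are illustrative names, each role named after one specific need of the task):

// A hypothetical task object from the application layer.
public class EnrollStudent{
    private readonly ICanRecordEnrollments enrollments;
    private readonly ICanNotifyParents notifications;

    public EnrollStudent(ICanRecordEnrollments enrollments, ICanNotifyParents notifications){
        this.enrollments = enrollments;
        this.notifications = notifications;
    }

    public void Execute(string studentName, string courseName){
        enrollments.Record(studentName, courseName);
        notifications.Notify(studentName, courseName);
    }
}

// Each role captures one capability this task needs, nothing more.
public interface ICanRecordEnrollments{
    void Record(string studentName, string courseName);
}

public interface ICanNotifyParents{
    void Notify(string studentName, string courseName);
}

Notice that any object able to fulfill a role can step into it: the task doesn’t care whether the notifier is an email gateway or a test double.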

Entity vs Role

Let us go back to the original issue: what does it mean to depend on abstractions?
To answer that we need to answer another question first: what is an abstraction?

Let’s consider the difference between the 2 kinds of abstraction we’ve seen so far: Entity and Role.

First, let’s clarify something: Entities as we have discussed them so far don’t belong to the OOP paradigm, they belong to the Relational paradigm. We have discussed before that the needs addressed by a model in the relational paradigm are geared toward disk space optimization, whereas the needs of an object model, particularly an object domain model, are about representing business concepts and interactions in a way that is easy to change and understand.

Side note: There’s actually an Entity concept in DDD.
An Entity is an object with a unique id. Often, DDD Entity objects overlap with their counterparts in the relational world, because both represent business concepts, but restricting the domain entities to the relational ones greatly caps our thinking and designing ability.

And here we come to the big idea: an Entity (or any object for that matter) can take on many roles.

This is because roles and entities are different kinds of abstraction. Entities represent a thing/idea whereas roles represent a capability.

And often, depending on abstraction means depending on a role.

A (silly) code example

Let us review our previous code:

class Parent : IParent{
    public void Teach(){}
    public void Work(){}
}

interface IParent{
    void Teach();
    void Work();
}

A lot of people are OK with creating this interface before figuring out which services are going to be provided to which client. This is a leaky abstraction. It’s weak and ambiguous in its intention. Can you tell what the purpose of an IParent is at a glance?

Let’s now review the client code. Let’s say a basic math class can be taught by a teacher, but given the COVID-19 situation it can also be taught by a parent at home:

public class BasicMathClass{
    public BasicMathClass(Teacher teacher){
        teacher.Teach();
    }

    public BasicMathClass(Parent parent){
        parent.Teach();
    }
}

public class Teacher{
    public void Teach(){}
}

class Parent : IParent{
    public void Teach(){}
    public void Work(){}
}

interface IParent{
    void Teach();
    void Work();
}

When we look at the client code it’s obvious why the parent teaches. But since we extracted the interface without even checking who was using it before, we are now in a dilemma. One way to solve this could be:

public class BasicMathClass{
    public BasicMathClass(IParent parent){
        parent.Teach();
    }
}

public class Teacher : IParent{
    public void Teach(){}
    public void Work(){}
}

class Parent : IParent{
    public void Teach(){}
    public void Work(){}
}

interface IParent{
    void Teach();
    void Work();
}

Solved. I know, this is silly, but if you think about it, all teachers also work, so it’s not so crazy to have a Work method in there.
But not all of them are parents. So what then? Should we revert the interface?

public class BasicMathClass{
    public BasicMathClass(ITeacher teacher){
        teacher.Teach();
    }
}

public class Teacher : ITeacher{
    public void Teach(){}
}

class Parent : ITeacher{
    public void Teach(){}
    public void Work(){}
}

interface ITeacher{
    void Teach();
}

Well, this reads better, right? All parents teach, so they are teachers, right? Well, that’s not necessarily true either. They can teach, but not because they studied to do so, and they cannot teach in a school either.

The problem is in the role conceptualization: we are talking about what something is, instead of what it does.

public class BasicMathClass{
    public BasicMathClass(IEducate educator){
        educator.Teach();
    }
}

public class Teacher : IEducate{
    public void Teach(){}
}

class Parent : IEducate{
    public void Teach(){}
    public void Work(){}
}

interface IEducate{
    void Teach();
}

The change is a subtle one but important nonetheless: instead of depending on an entity (some thing/idea) we are now depending on a role (a capability). The mental model implications are not to be taken lightly. Once you start depending on roles, you’ll start to think more in terms of them.

So here’s the tip of the day: If you want to talk about what something is, use a class. If you want to convey what it does, use an interface.

Objects are meant to act, not to be acted upon

One of the most common issues I find when mentoring people on object-oriented design has to do with the mentality that many people bring when moving from other paradigms, particularly the ones coming from the structured programming paradigm. Let’s clear that up.

Paradigm abstraction levels

To simplify, abstraction level = level of detail. Now imagine a map application, something like Google Maps: if you zoom out you can see more terrain and, at the same time, you lose sight of some information like store and street names. This is the idea behind an abstraction level. As you go up, the detail level goes down and vice versa. Now, how does this relate to programming paradigms?

I often explain paradigms as tinted glasses. You put on some red-tinted glasses and everything looks reddish. If you put on amber-tinted glasses everything looks brighter, but if you put on some dark-tinted glasses everything looks darker. So it is with paradigms: like tinted glasses, they affect the way we look at the world. Programming paradigms in particular provide some constructs to represent the world. So, every time you try to explain a world phenomenon you do it using the constructs provided by the paradigm you’re currently using.

So, we can classify a programming paradigm’s abstraction level by its number of constructs: the more it has, the more details you are dealing with, and hence the lower the abstraction level you’re at.

So here’s a brief table showing some paradigms ranked by this criteria:

Paradigm                  Constructs
Functional                Function + Types
OOP                       Object + Message
Structured Programming    Procedures, Data Structures, Blocks, Basic Data Types

This is by no means an exhaustive table, but you get the idea. So you can see that OOP and Functional are paradigms at a higher level of abstraction, whereas Structured Programming operates at a lower level of abstraction.

So you see, OOP abstracts both data and code under one concept: an object. Just as important, it also abstracts the control flow under the concept of the message. Those are the tools available to you in this paradigm.

The root of all Evil

Well, maybe not of all evil, but surely it has brought a lot of problems. And that is: to believe that you are working in the OOP paradigm because you have an OOP-compliant language while keeping a structured programming mindset. There, I said it. I know this will irk some people, but there’s no way around it. Let me show you.

var range = Utils.GenerateSequence(from:1, to:7);

So I think that’s a pretty straightforward OO snippet, right? Except it isn’t. Let’s see how it would look if it truly were OO.

var range = 1.To(7);
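If you’re wondering how that second snippet could even compile, extension methods make it possible. Here’s a minimal sketch, assuming we want the sequence back as an IEnumerable&lt;int&gt; (the To name and the return type are my assumptions):

using System.Collections.Generic;

public static class Int32Extensions{
    // Lets the number act itself: 1.To(7) yields 1, 2, ..., 7.
    public static IEnumerable<int> To(this int from, int to){
        for (var i = from; i <= to; i++)
            yield return i;
    }
}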

So let’s review the differences. This may be a little tricky as the differences I am referring to are not in the code itself but in the mindset that generates it. Let’s start with the code and see if we can identify the mind patterns that generate it.

Differences between the Structured Programming and Object-Oriented mindsets

The main problem I find with people I coach or work with is the idea that object == data structure + procedures. The problem with this is that it becomes a limitation. So, in the statement:

var number = 1;

People tend to think of ‘number’ as data since that’s what we are assigning to it. This distinction between objects and data throws people off in the wrong direction. Remember that there is no such thing as ‘data’ in OOP, just objects and messages. You should think of ‘number’ as an object.
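In fact, C# already treats it that way; these messages exist on System.Int32, no helpers required:

var number = 1;
var text = number.ToString();     // ask the object for its representation
var order = number.CompareTo(2);  // ask the object to compare itself: negative here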

On the other hand, something like:

Func<int, int, IEnumerable<int>> generateSequence = Utils.GenerateSequence; // a delegate type matching GenerateSequence's assumed signature

It’s an object that represents code. But most people use the concept of a pointer as a way to explain C# delegates. Why? Because to them object == data structure + procedure. Anything outside of that definition is no object to them. By the way, this is what an actual pointer looks like in C#:

int* ptr1 = &x;

So the main question is: are you treating a variable as a data structure that needs to be passed around to functions in order to do something with it (it is acted upon)? If that’s the case, you are (most likely) working in the structured programming paradigm. The Math class in the .NET framework is a prime example of this.

On the other hand, do you send messages (‘invoke a method’ in C#/Java lingo… I don’t really like the term) to the variable to do something that requires little to no external help (it acts itself)? Congratulations, that’s exactly what OOP is about.
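To make the contrast concrete, here’s a small sketch of the same operation under both mindsets (Math.Abs is the real framework call; the Abs extension method is hypothetical, written just for illustration):

using System;

var balance = -42.50m;

// Acted upon (structured mindset): the value is handed to an external procedure.
var viaProcedure = Math.Abs(balance);

// Acts itself (OO mindset): send the message to the object.
var viaMessage = balance.Abs();

Console.WriteLine($"{viaProcedure} == {viaMessage}");

public static class DecimalExtensions{
    // Hypothetical extension method, sketched for illustration only.
    public static decimal Abs(this decimal value) => value < 0 ? -value : value;
}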

Conclusion

It’s not my intention to trash any paradigm out there. Every paradigm is useful in the right context. It’s just that there is so much confusion about them that I often find myself explaining this stuff over and over. So I hope this makes it clearer for you. If you ever find yourself struggling with OOP, try taking a step back and checking whether you are really operating in the OOP paradigm. Who knows, you may be surprised at your discoveries (as some of my mentees have been). See you in the next post!

Quality code pillars: a guide to better code reviews

As a code mentor and member of a software development team, I’m subject to and carry out code reviews. However, I have noticed that many times the people doing the review don’t have a clear idea of what to look for. This leads to discussions on stuff like style and micro-optimizations. Having been there myself, I would like to offer some ideas on things that you could look for when doing a code review. I want to share my personal quality standard. To me, high-quality code is easy to understand, easy to change, and correct. In that order.

Easy to understand

I have found that communication is the main idea here: does the code communicate the ideas succinctly?

When reviewing, I look for code that is poorly encapsulated or named. I also pay attention to the semantic distance between a concept and the symbol in the code that represents it; it may reveal leaky abstractions. For example, say you want to represent a money amount, so you code something like:

var debtInUSD = 200.15;

So if you’re familiar with USD you know that the decimal part refers to cents. How do you know we are dealing with USD? Because the variable says so. Imagine if it were only something like:

var debt = 200.15;

Could you tell what the currency is? Is it USD or Euro? You would probably have to hunt down the code to figure that out. So you see, naming is very important when you try to make your code easy to understand. Don’t be lazy. Use meaningful names. Now consider the following example:

var debt = Money.USD(units:200, cents:15);

In this case, you know you are dealing with USD, at least at this point. If you find this variable later down the road you will probably have to hunt for the definition to see what we are dealing with. However, if you don’t care about the type, this should be enough (in the example used here, you can think of USD and Euro as a kind of logical type, even if they’re only instances of the Money type). Imagine the following:

var debt = Money.USD(units:200, cents:15);
debt = debt.AsEuro();

In this scenario, including USD in the variable name would be misleading.
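For reference, here’s a minimal sketch of what such a Money type could look like (the whole API, the Currency enum, and the conversion rate are illustrative assumptions, not a real library):

public enum Currency{ USD, EUR }

public class Money{
    public Currency Currency { get; }
    public decimal Amount { get; }

    private Money(Currency currency, decimal amount){
        Currency = currency;
        Amount = amount;
    }

    // Named factory: the call site reads as Money.USD(units:200, cents:15).
    public static Money USD(int units, int cents){
        return new Money(Currency.USD, units + cents / 100m);
    }

    public Money AsEuro(){
        // Made-up fixed rate, for illustration only; a real implementation
        // would get the rate from somewhere else.
        return new Money(Currency.EUR, Amount * 0.85m);
    }
}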

Easy to change

Code should be easy to change. You can think of it as a platform to add new features on: the code should make it evident where to introduce a new feature. This requires constant refactoring to reflect our new knowledge in the codebase; the codebase itself should be a reflection of our current knowledge. There are many things that make code hard to change; Uncle Bob classifies them in the following categories: rigidity, fragility, immobility, and viscosity. At the heart of them lies the idea of coupling.

I often look for references that couple objects, modules, and projects unnecessarily.

Correct

What I mean by this is code that is as free of errors as possible.

Typically, we deal with 3 types of errors: syntactic, semantic, and runtime.
Since the compiler usually handles syntactic errors, let’s focus on semantic and runtime errors.

Semantic errors are related to the business logic. This is a moving target since the rules the software tries to model tend to change over time (at least this is true for line-of-business applications). We usually detect them using unit testing and acceptance/functional testing.
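As an illustration, a unit test that pins a business rule down could look like this (xUnit-style, reusing the illustrative Money type and made-up rate from the sketch above):

using Xunit;

public class MoneyTests{
    [Fact]
    public void ConvertingUsdToEuroAppliesTheAgreedRate(){
        var debt = Money.USD(units:200, cents:15);

        var inEuro = debt.AsEuro();

        // If someone changes the conversion rule, this semantic check fails.
        Assert.Equal(Currency.EUR, inEuro.Currency);
        Assert.Equal(200.15m * 0.85m, inEuro.Amount);
    }
}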

Runtime errors are usually related to resources used by the application. You can detect these using integration, load, and any other kind of test that exercises the application’s resources.

If the tests related to the piece of code I’m reviewing are not present, I ask the developer for them.

Closing thoughts

So, there you have it. I would like to point out that the order in which these appear is my priority order, i.e., I’ve found that if I start by trying to create an easy-to-change codebase, I tend to end up with code that is hard to understand. The reason for this, in my experience, is that we often introduce new levels of indirection in order to decouple the code, which in turn makes it harder to understand. So, by focusing on making the code easy to understand first, I can introduce indirection levels as they become necessary and still have a codebase that a new developer can pick up rather quickly. And if you have code that is easy to understand and easy to change, you can easily correct it.

By the way, TDD promotes all of these, but that’s a post for another time πŸ˜‰

So, what do you look for when doing a code review? Leave your comments below!