How to improve the signal to noise ratio of your code

As I have previously shared, code quality can be summarized along 3 axes: It is easy to understand, easy to change, and correct.
Today I want to talk about a trait that indicates how easy a codebase is to understand: its signal-to-noise ratio.

What is signal to noise ratio?

Signal-to-noise ratio (SNR or S/N) is a measure used in science and engineering that compares the level of a desired signal to the level of background noise

https://en.wikipedia.org/wiki/Signal-to-noise_ratio

In software development, this means how much of your code expresses your intention/ideas/knowledge vs how much doesn’t.

Why is signal to noise ratio important?

Well, as mentioned before, this is an indicator of how easy your code is to understand. That is, how much time and mental effort are required to understand what the code does and, more importantly, why it does it that way. Understanding these 2 facts is a requirement before changing how the code works. There’s no workaround for that.

What is the most influential factor on the signal to noise ratio?

If I were to pick a single attribute of a codebase to change its signal-to-noise ratio, it would be the abstraction level. You see, in my experience, a poor signal-to-noise ratio comes from either under-abstraction (too much detail) or over-abstraction (too many layers of artifacts, too much indirection).

Under abstraction and its effect on the signal to noise ratio

How many times have you been tasked with making a little, tiny change in behavior, only to find yourself facing a 200-line function? (… I just had a PTSD episode.) The problem with a 200-line function is that there’s too much detail to easily figure out the what and the why.

This detail overload doesn’t happen just at the level of huge functions, but also at the level of language constructs. Take a look:

decimal orderTotal = 0;
foreach(var line in orderLines)
{
    orderTotal += line.Total;
}

So as you can see, the idea here is that the order total is the sum of the order lines’ totals. So, what code here isn’t relevant to that idea? Think about it for a moment. Done?

decimal orderTotal = 0;
foreach(var line in orderLines)
{
    orderTotal += line.Total;
}

Surprise! I bet a lot of you didn’t see that coming! This is because sometimes we get so used to the language that we take those things for granted. I know I did. It took me a lot of effort learning Smalltalk (and banging my head against the wall every time I tried to do something new) to rewire some parts of my brain. But you can’t deny it. Iterating over the lines is just a detail of summing up the lines’ totals. It does not help convey the main idea. It’s noise. How would you fix that? Actually, there are several ways.

decimal SumLinesTotal(){
    decimal linesTotal = 0;
    foreach(var line in orderLines)
    {
        linesTotal += line.Total;
    }
    return linesTotal;
}
...
decimal orderTotal = SumLinesTotal();

How’s that? Not a big deal, right? But now there’s no doubt about the code’s intention. I know, some of you may think this is dumb. The code itself wasn’t that complex to start with, so why should we create a new function just for this? Well, what do you think would happen to a 200-line function if you started doing this? Not only for loops, but every place where implementation details (the how) appear. I dare you to try it. Now, if you are using C# there are other ways to be explicit about this:

decimal orderTotal = orderLines.Sum(orderLine => orderLine.Total);

Over abstraction and its effect on the signal to noise ratio

Over-abstraction happens when we add unnecessary artifacts to a codebase. This is a prime example of accidental complexity. A very common cause of it is speculative generality: preparing the code to handle cases we think we may someday need, even though we don’t need them right now. But there are more common, more subtle cases.

So let’s say we have a report API to which we make requests:

public EmployeeData GetEmployeeData(Guid id);

public class EmployeeData
{
    Guid Id;
    ...
}

public ManagerData GetManagerData(Guid id);

public class ManagerData
{
   Guid Id;
   ...
}

So our relational mindset tells us that we are duplicating data here (id) and that we should remove that duplication.

public class ReportData
{
    Guid Id;
}

public EmployeeData GetEmployeeData(Guid id);

public class EmployeeData : ReportData
{   
    ...
}

public ManagerData GetManagerData(Guid id);

public class ManagerData : ReportData
{  
   ...
}

Great! Duplication removed! But wait! We can go even further! Isn’t everything we’re returning just report data? Let’s make that explicit!

public class ReportData
{
    Guid Id;
}

public ReportData GetEmployeeData(Guid id);

public class EmployeeData : ReportData
{   
    ...
}

public ReportData GetManagerData(Guid id);

public class ManagerData : ReportData
{  
   ...
}

But now the client code needs to cast the result to the concrete type. Maybe we can make the ReportData object accommodate different sets of data?

public class ReportData
{
    Guid Id;
    Dictionary<string, object> Data;
}

public ReportData GetEmployeeData(Guid id);

public ReportData GetManagerData(Guid id);

So now let’s say you are given a ReportData object. How can you know if you are dealing with an employee’s or a manager’s data? You could query the data dictionary for a particular key that represents a property available only on employees (or managers), or worse, you could introduce a key in the dictionary that says which type of data is contained in it, moving from strongly typed to stringly typed. This is all noise. The signal has been effectively diluted.

Some guidelines to improve your signal to noise ratio

By this point I hope it’s clear to you that using the right abstraction level is key to improving your signal-to-noise ratio. So I’ll share with you some of my observations on the abstraction process.

Step 1: remove noise by encapsulating details away into functions

Encapsulation and abstraction are closely related. I’ll talk about that in another post. Suffice it to say that as you encapsulate details away, you’re also raising the abstraction level. The trick to avoid going overboard is to think about what you want to express: the signal. Is that clear enough? A good rule of thumb is to keep your functions to 5 lines or less.
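
For instance, here’s a small before-and-after sketch of the idea (all names are invented for illustration):

// Before: the business rule is buried in the details of the check
void ApplyDiscounts(Order order)
{
    if (order.Total > 1000 && order.Customer.YearsActive >= 2 && !order.Customer.HasOverduePayments)
    {
        ApplyLoyaltyDiscount(order);
    }
}

// After: the detail is encapsulated; the signal is the rule itself
void ApplyDiscounts(Order order)
{
    if (QualifiesForLoyaltyDiscount(order))
    {
        ApplyLoyaltyDiscount(order);
    }
}

bool QualifiesForLoyaltyDiscount(Order order) =>
    order.Total > 1000
    && order.Customer.YearsActive >= 2
    && !order.Customer.HasOverduePayments;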

Step 2: uncover the objects

You will find that some functions act upon the same set of data. Those are objects hidden in the mist. Move both the data and the functions that act upon it into a class. Naming the class will have an impact on the clarity of your signal, but don’t worry about getting it right the first time; you can rename it (and you will) as your understanding increases.
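
For example, a minimal sketch of that move (hypothetical names, assuming the usual System.Linq usings):

using System.Collections.Generic;
using System.Linq;

class OrderLine
{
    public decimal Total { get; set; }
    public bool IsTaxable { get; set; }
}

// Before: free functions that all act upon the same order lines
static class OrderCalculations
{
    public static decimal SumLinesTotal(List<OrderLine> lines) => lines.Sum(line => line.Total);
    public static int CountTaxableLines(List<OrderLine> lines) => lines.Count(line => line.IsTaxable);
}

// After: the data and the functions that act upon it live together in one object
class Order
{
    private readonly List<OrderLine> _lines;

    public Order(IEnumerable<OrderLine> lines) => _lines = lines.ToList();

    public decimal Total => _lines.Sum(line => line.Total);
    public int TaxableLineCount => _lines.Count(line => line.IsTaxable);
}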

Step 3: wash, rinse and repeat

Repeat the 2 previous steps over and over. If the idea you want to convey is still not clearly expressed by the code, go to step 4.

Step 4: select a metaphor

To be discussed in the next post. 🙂

A quick comment on comments

As I began writing, I mentioned that you need to understand the what as well as the why of the code. The former can clearly be expressed by the code. If that’s not the case, you haven’t reached the right level of abstraction yet. As for the latter, this is the only situation in which I find comments justifiable. Explain constraints or whatever it is that led you to choose the current solution.
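
For example, a comment like this one (a made-up scenario) earns its keep because the code alone cannot tell you the constraint:

// We retry 3 times because the payment provider's sandbox drops roughly 1 in 10 requests.
// Don't raise this without checking the rate limit agreed upon in the contract.
const int MaxRetries = 3;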

Closing thoughts

Man, that was longer than I expected! I hope this gives you some hints on what to look for the next time you are doing a code review (yours or someone else’s). As always, if you have any comments, doubts or whatever, leave them below. Good coding!

Problem space, solution space, and complexity explained with pictures

For the last couple of years, my work can be described as nothing but refactoring. And I like it. It’s like taking away the mist surrounding the forest. As you move forward you start to gain a better sense of the code intention and start to detect places where complexity has made its nest.

Complexity is a strange beast. According to Ward Cunningham there are 2 kinds of complexity: empowering complexity (“Well, that’s an interesting problem. Let me think about that for a while”) and difficulties (blockage from progress). Does this sound familiar? Where do complexity and difficulties come from? To answer this, let’s take a look at the idea of problem space and solution space.

Problem and solution space

The problem space

As depicted in the picture, the problem space is a conceptual space delimited by some rules and constraints. More importantly, it includes the current state of affairs and the desired state. It is inside the boundaries of this space that solutions are born.

The solution space

As you can see, not all solutions are equal. Obviously, solution 2 is better than solution 1. This leads me to Ward’s definition of simplicity: simplicity is the shortest path to a solution. Or, in the context of our drawing, the shortest path to the desired state. By the same token, we could say complexity is any path that’s longer than necessary.

Now, this may be tricky. It is possible that solution 2 in our example/drawing requires a kind of knowledge that we don’t currently possess. In that case, we can’t even think of that solution. Or we can’t understand it when it’s presented to us. It would take us extra effort to acquire that knowledge before we could find solution 2 in our problem space. Hence why it’s important that we try to have a breadth of knowledge of the (mostly thinking) tools out there. But I digress.

So the shortest path, huh? Well, “shortest” is not the same for everyone.

Essential complexity

In this picture, the solution in problem space 2 is more complex than the one in problem space 1, not because of the solution itself but because of the problem space.

This “distance” between the initial and desired state is known as essential complexity. No matter what you do, the solutions in problem space 2 will be more complex than most of the solutions in problem space 1. It’s just that the problem is more complex.

Accidental complexity

But what about this?

Clearly, problem space 2 is more complex than problem space 1. Still, the solution in problem space 1 is more complex than the one in problem space 2!

This is known as accidental complexity. It’s the complexity that comes from the solution we chose. Accidental complexity is our fault and is ours to solve.

And what about difficulties?

Well, now we have found where the complexity comes from. But what about difficulties? Let’s review the definition:

A difficulty is just a blockage from progress.

Mmmm… from progress? That implies we are already on the path to our destination. Is it the path of solution 1 or solution 2? It doesn’t matter. What matters is that a solution has been selected and we are traversing it. Keep this in mind as Ward enlightens us once again:

The complexity that we despise is the complexity that leads to difficulty.

That is accidental complexity!
The difficulty is born out of accidental complexity!

Final thoughts

So there you have it. I’ve been thinking about this stuff for a while. I still do.

As I continue to refactor code, I find myself understanding more about the solution, and the problem space itself. I believe the main difference between a programmer and a consultant is that consultants start in the problem space; this means they have the autonomy to explore and select solutions, whereas programmers are tasked to work in the solution space from the get-go. That being said, most of the time we don’t know how good a solution is until we code it.

This leads me to the tip of the day: if it’s hard, that is, if the way is full of difficulties, maybe you are taking the long path. Try stepping back and asking yourself, “is there another way to accomplish my objective?”

Depending upon abstraction is not about interfaces, it’s about roles

I recently stumbled upon some code where someone took an object and extracted an interface from its methods, something like:

class Parent: IParent{
    public void Teach(){}
    public void Work(){}
}

interface IParent{
    public void Teach();
    public void Work();
}

I’ve seen many people (including myself, tons of times) do this and think: “There. Now we are depending upon abstractions”. The truth is, we are depending on an interface, but depending on abstraction is way more than that.

An object design guideline

All objects have a raison d’etre: to serve. They serve other objects, systems, or users. Although that may seem obvious, I’ve found that’s something often overlooked.

Warning: Rant ahead.

I have mentioned this before, but I believe the main reason object-oriented programming is often criticized is that it is not well understood.

The idea of an object as an abstract concept that can represent either code or data has not reached enough people to change the overall perception.

A lot of the people I have seen complaining about OOP are doing structured programming. They still tend to separate the data from the operations that are done upon it. Basically structs and modules. It’s sad because this yields software that is hard. Hard to change, hard to understand, hard to correct. It’s not soft (as in soft-ware). I blame schools for this. At least in my particular experience, OOP is often delivered as an extension of structured programming, much like C++ is often seen as an extension of C.

We need to reeducate ourselves on the way we think: OOP is not about using object-oriented technology but about thinking in an object-oriented fashion.

This is the reason I started this blog.

End of Rant 😛

So thinking of objects as either data bags or function bags is the result of ignoring a fundamental design question: whom does this object serve?

To answer this question you have to start with the needs of the client (object, system, user). This lends itself to a top-down analysis/design approach. But a lot of us are trained to start a system design by thinking about the structure of a relational database, which is a bottom-up approach. Let’s see how they differ from each other.

The Database first approach

When designing a relational database, the thinking tools available are Entities and the Relationships between them, often displayed in an ER diagram. So we start with Entities from the nouns on the domain: Parent, Teacher, Student, Child, Class, Course, and so on. I’m pretty sure you can think of a domain just by looking at these concepts.

Now that you have these Entities, you have to think about the processes that interact with them. How do we create a new student? How do we update some of its data? How do we delete it? If you look closely you will find that most everything is modeled as CRUD operations around the Entities. In this scenario, the entities are your abstractions.

The Objects first approach

In this case, you would start by thinking about the needs of the user. These are often expressed as tasks. We usually discover and document them in the form of user stories or use cases. This initial set of needs will serve as the basis for the features of the system. We can now start creating the objects to fulfill these needs. Often these objects will represent the tasks expressed by the user. This is what is known as the application layer in DDD.

From here on things start to get interesting. Pick one of these task objects. What do you need to accomplish this particular task? These are the needs of the object. Now here comes the trick: define an interface/abstract class that fulfills one specific need and name it as such. By doing this we force ourselves to define a specific concept for a specific need in a specific operation. We call this kind of concept a Role.

I love the naming scheme that Udi Dahan uses for Roles: IDoSomething/ICanDoSomething. In this approach, roles are your abstractions.
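
As a small sketch (all names invented for illustration, and the Order type is assumed), a task object declares the role it needs instead of the entity that happens to fulfill it:

// The role: a capability, named after what it does
public interface ICalculateShipping
{
    decimal ShippingCostFor(Order order);
}

// The task object depends on the role, not on a concrete entity
public class PlaceOrder
{
    private readonly ICalculateShipping _shipping;

    public PlaceOrder(ICalculateShipping shipping)
    {
        _shipping = shipping;
    }

    public void Execute(Order order)
    {
        var shippingCost = _shipping.ShippingCostFor(order);
        // ... the rest of the task
    }
}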

Entity vs Role

Let us go back to the original issue: what does it mean to depend on abstractions?
To answer that we need to answer another question first: what is an abstraction?

Let’s consider the difference between the 2 kinds of abstraction we’ve seen so far: Entity and Role.

First, let’s clarify something: Entities as we have discussed them so far don’t belong to the OOP paradigm; they belong to the Relational paradigm. We have discussed before that the needs addressed by a model in the relational paradigm are geared toward disk space optimization, whereas the needs of an object model, particularly an object domain model, are about representing business concepts and interactions in a way that is easy to change and understand.

Side note: There’s actually an Entity concept in DDD.
An Entity is an object with a unique id. Often, DDD Entity objects overlap with their counterparts in the relational world, because both represent business concepts, but restricting the domain entities to the relational ones greatly caps our thinking and design ability.

And here we come to the big idea: an Entity (or any object for that matter) can take on many roles.

This is because roles and entities are different kinds of abstraction. Entities represent a thing/idea whereas roles represent a capability.

And often, depending on abstraction means depending on a role.

A (silly) code example

Let us review our previous code:

class Parent: IParent{
    public void Teach(){}
    public void Work(){}
}

interface IParent{
    public void Teach();
    public void Work();
}

A lot of people are OK with creating this interface before figuring out which services are going to be provided to which client. This is a leaky abstraction. It’s weak and ambiguous in its intention. Can you tell what the purpose of an IParent is at a glance?

Let’s now review the client code. Let’s say a basic math class can be taught by a teacher, but given the COVID-19 situation it can also be taught by a parent at home:

public class BasicMathClass{
        public BasicMathClass(Teacher teacher){
             teacher.Teach();
       }

        public BasicMathClass(Parent parent){
             parent.Teach();
       }
}

public class Teacher{
       public void Teach(){}
}

class Parent: IParent{
    public void Teach(){}
    public void Work(){}
}

interface IParent{
    public void Teach();
    public void Work();
}

When we look at the client code it’s obvious why the parent teaches. But since we extracted the interface without even checking who was using it before, we are now in a dilemma. One way to solve this could be:

public class BasicMathClass{
        public BasicMathClass(IParent parent){
             parent.Teach();
       }
}

public class Teacher : IParent{
       public void Teach(){}
       public void Work(){}
}

class Parent: IParent{
    public void Teach(){}
    public void Work(){}
}

interface IParent{
    public void Teach();
    public void Work();
}

Solved. I know, this is silly, but if you think about it, all teachers also work, so it’s not so crazy to have a work method in there.
But not all of them are parents. So what then? Should we revert the interface?

public class BasicMathClass{
        public BasicMathClass(ITeacher teacher){
             teacher.Teach();
       }
}

public class Teacher : ITeacher{
       public void Teach(){}
       
}

class Parent: ITeacher{
    public void Teach(){}
    public void Work(){}
}

interface ITeacher{
    public void Teach();
}

Well, this reads better, right? All parents teach, so they are teachers, right? Well, that’s not necessarily true either. They can teach, but not because they studied to do so, and they cannot teach in a school either.

The problem is in the role conceptualization: we are talking about what something is, instead of what it does.

public class BasicMathClass{
        public BasicMathClass(IEducate educator){
             educator.Teach();
       }
}

public class Teacher : IEducate{
       public void Teach(){}
      
}

class Parent: IEducate{
    public void Teach(){}
    public void Work(){}
}

interface IEducate{
    public void Teach();
}

The change is a subtle one, but it is important nonetheless: instead of depending on an entity (some thing/idea) we are now depending on a role (a capability). The mental model implications are not to be taken lightly. Once you start depending on roles, you’ll start to think more in terms of them.

So here’s the tip of the day: If you want to talk about what something is, use a class. If you want to convey what it does, use an interface.

Objects are meant to act, not to be acted upon

One of the most common issues I find when mentoring people on object-oriented design has to do with the mentality that many people bring when moving from other paradigms, particularly the ones coming from the structured programming paradigm. Let’s clear that up.

Paradigm abstraction levels

To simplify, abstraction level = level of detail. Now imagine a map application, something like Google Maps: if you zoom out you can see more terrain and, at the same time, you lose sight of some information like store and street names. This is the idea behind an abstraction level. As you go up, the detail level goes down and vice versa. Now, how does this relate to programming paradigms?

I often explain paradigms as tinted glasses. You put on some red-tinted glasses and everything looks reddish. If you put on amber-tinted glasses everything looks brighter, but if you put on some dark-tinted glasses everything looks darker. So it is with paradigms: like tinted glasses, they affect the way we look at the world. Programming paradigms in particular provide some constructs to represent the world. So, every time you try to explain a world phenomenon you do it using the constructs provided by the paradigm you’re currently using.

So, we can classify a programming paradigm’s abstraction level by its number of constructs: the more it has, the more details you are dealing with, and hence the lower the abstraction level you’re at.

So here’s a brief table showing some paradigms ranked by this criterion:

Paradigm                 | Constructs
Functional               | Function + Types
OOP                      | Object + Message
Structured Programming   | Procedures, Data Structures, Blocks, Basic Data Types

This is by no means an exhaustive table, but you get the idea. So you can see that OOP and Functional are paradigms at a high level of abstraction, whereas Structured Programming operates at a lower level of abstraction.

So you see, OOP abstracts both data and code under one concept: an object. Just as important, it also abstracts the control flow under the concept of the message. Those are the tools available to you in this paradigm.

The root of all Evil

Well, maybe not of all evil, but surely it has brought a lot of problems. And that is: believing that you are working in the OOP paradigm because you have an OOP-compliant language, while keeping a Structured Programming mindset. There, I said it. I know this will irk some people, but there’s no way around it. Let me show you.

var range = Utils.GenerateSequence(from:1, to:7);

So I think that’s a pretty straightforward OO snippet, right? Except it isn’t. Let’s see how it would look if it truly were OO.

var range = 1.To(7);
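
In case you’re wondering how 1.To(7) could even compile in C#: a minimal sketch using an extension method (my own illustration, not part of the original snippet) would be:

using System.Collections.Generic;
using System.Linq;

static class IntExtensions
{
    // Lets any int answer the "To" message, producing the sequence from itself up to 'end'
    public static IEnumerable<int> To(this int start, int end)
    {
        return Enumerable.Range(start, end - start + 1);
    }
}

// usage: var range = 1.To(7); // 1, 2, 3, 4, 5, 6, 7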

So let’s review the differences. This may be a little tricky as the differences I am referring to are not in the code itself but in the mindset that generates it. Let’s start with the code and see if we can identify the mind patterns that generate it.

Differences between the Structured Programming and Object-Oriented mindsets

The main problem I find with people I coach or work with is the idea that object == data structure + procedures. The problem with this is that it becomes a limitation. So, in the statement:

var number = 1;

People tend to think of ‘number’ as data since that’s what we are assigning to it. This difference between objects and data is throwing people off in the wrong direction. Remember that there is no such thing as ‘data’ in OOP, just objects and messages. You should think of ‘number’ as an object.
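
And C# already lets you treat it exactly that way; even a literal responds to messages:

var number = 1;
var asText = number.ToString();        // the object answers the message itself
var comparison = number.CompareTo(7);  // no external helper function needed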

On the other hand, something like:

Func<int, int, IEnumerable<int>> action = Utils.GenerateSequence;

It’s an object that represents code. But most people use the concept of a pointer as a way to explain C# delegates. Why? Because to them object == data structure + procedures. Anything outside of that definition is no object to them. By the way, this is what a pointer looks like in C#:

int* ptr1 = &x;

So the main question is: are you treating a variable as a data structure that needs to be passed around to functions in order to do something with it (it is acted upon)? If that’s the case, you are (most likely) working in the structured programming paradigm. The Math class in the .NET Framework is a prime example of this.

On the other hand, do you send messages (‘invoke a method’ in C#/Java lingo… don’t really like the term) to the variable to do something that requires little to no external help (acts itself)? Congratulations, that’s exactly what OOP is about.
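
To put the two mindsets side by side in one small, hypothetical illustration (GeometryUtils and DistanceTo are invented for the example):

// Structured mindset: the point is a data structure acted upon by an external function
var distanceA = GeometryUtils.Distance(pointA, pointB);

// OO mindset: the point acts itself when it receives the message
var distanceB = pointA.DistanceTo(pointB);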

Conclusion

It’s not my intention to trash any paradigm out there. Every paradigm is useful in the right context. It’s just that there is so much confusion about them that I often find myself explaining this stuff over and over. So I hope this makes it clearer for you. If you ever find yourself struggling with OOP, try taking a step back and see if you are really operating in the OOP paradigm. Who knows, you may be surprised at your discoveries (as some of my mentees have been). See you in the next post!

Quality code pillars: a guide to better code reviews

As a code mentor and member of a software development team, I’m subject to and carry out code reviews. However, I have noticed that many times people doing the review don’t have a clear idea of what to look for. This leads to discussions on stuff like style and micro-optimizations. Having been there myself, I would like to offer some ideas on things that you could look for when doing a code review. I want to share my personal quality standard. To me, high-quality code is easy to understand, easy to change, and correct. In that order.

Easy to understand

I have found that communication is the main idea here: does the code communicate the ideas succinctly?

When reviewing, I look for code that is poorly encapsulated or named. Also pay attention to the semantic distance between a concept and the symbol in the code that represents it; it may reveal leaky abstractions. For example, say you want to represent a money amount, so you code something like:

var debtInUSD = 200.15;

So if you’re familiar with USD you know that the decimal part refers to cents. How do you know we are dealing with USD? Because the variable says so. Imagine if it were only something like:

var debt = 200.15;

Could you tell what the currency is? Is it USD or Euro? You probably would have to hunt down the code to figure that out. So you see, naming is very important when you try to make your code easy to understand. Don’t be lazy. Use meaningful names. Now consider the following example:

var debt = Money.USD(units:200, cents:15);

In this case, you know you are dealing with USD. At least at this point. If you find this later down the road you will probably have to hunt for the definition to see what we are dealing with. However, if you don’t care about the type, this should be enough (in the example being used here, you can think of USD and Euro as some kind of logical type, even if they’re only instances of the Money type). Imagine the following:

var debt = Money.USD(units:200, cents:15);
debt = debt.AsEuro();

In this scenario, including USD in the variable name would be misleading.
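
For reference, a minimal sketch of what such a Money type could look like (my own illustration; the original only shows the call sites, and the conversion rate is obviously made up):

public class Money
{
    public string Currency { get; }
    public decimal Amount { get; }

    private Money(string currency, decimal amount)
    {
        Currency = currency;
        Amount = amount;
    }

    public static Money USD(int units, int cents) => new Money("USD", units + cents / 100m);
    public static Money Euro(int units, int cents) => new Money("EUR", units + cents / 100m);

    // Hard-coded rate, purely for illustration
    public Money AsEuro() => Currency == "EUR" ? this : new Money("EUR", Amount * 0.9m);
}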

Easy to change

Code should be easy to change. You can think of this as a platform to add new features. The code should make it evident where to introduce a new feature. This requires constant refactoring to reflect our new knowledge in the codebase. The codebase itself should be a reflection of our current knowledge. There are many things that make code hard to change; Uncle Bob classifies them into the following categories: rigidity, fragility, immobility, and viscosity. At the heart of them lies the idea of coupling.

I often look for references that couple objects, modules, and projects unnecessarily.

Correct

What I mean by this is as free of errors as possible.

Typically, we deal with 3 types of errors: syntactic, semantic, and runtime.
Since the compiler usually handles syntactic errors, let’s focus on semantic and runtime errors.

Semantic errors are related to the business logic. This is a moving target since the rules the software tries to model tend to change over time (at least this is true for line-of-business applications). We usually detect them using unit testing and acceptance/functional testing.

Runtime errors are usually related to resources used by the application. You can detect these using integration, load, and any other kind of tests that exercise the application’s resources.

If the tests related to the piece of code I’m reviewing are not present, I ask the developer for them.

Closing thoughts

So, there you have it. I would like to say that the order in which these appear is the priority order for me, i.e., I’ve found that if I start by trying to create an easy-to-change codebase, I tend to end up with code that is hard to understand. The reason for this, in my experience, is that we often introduce new levels of indirection in order to decouple the code, which in turn makes it harder to understand. So, by focusing on making the code easy to understand first, I can begin introducing indirection levels as they become necessary and still have a codebase that a new developer can pick up rather quickly. And if you have code that is easy to understand and easy to change, you can easily correct it.

By the way, TDD promotes all of these, but that’s a post for another time 😉

So, what do you look for when doing a code review? Leave your comments below.

Models and business logic patterns

In the last post, we talked about the different models that co-exist in an application. However, some of these models may or may not exist in a given application at a given time. To get a better understanding of this effect, let’s review 3 different configurations and the way they affect the application models mentioned before.

Transaction script

A transaction script, as described by Martin Fowler, is a procedural (often structured programming) approach to organizing an application’s logic. It contains all the steps to fulfill a request/use case/scenario/story. In my experience, this kind of application often uses the persistence model directly. This means there is no domain model; the business logic is embedded in the transaction script itself. Often the transaction script returns a different model as the result, effectively using a presentation model.
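
A rough sketch of the shape I mean (names are mine; the data access assumes Dapper-style extension methods on IDbConnection):

using System;
using System.Data;
using Dapper;

// All the steps of the use case live in one procedure,
// working directly against the persistence model
public class TransferFundsScript
{
    private readonly IDbConnection _db;

    public TransferFundsScript(IDbConnection db) { _db = db; }

    public TransferResultDto Execute(Guid fromAccount, Guid toAccount, decimal amount)
    {
        var balance = _db.QuerySingle<decimal>(
            "SELECT Balance FROM Accounts WHERE Id = @id", new { id = fromAccount });

        if (balance < amount)
            return new TransferResultDto { Succeeded = false, Reason = "Insufficient funds" };

        _db.Execute("UPDATE Accounts SET Balance = Balance - @a WHERE Id = @id", new { a = amount, id = fromAccount });
        _db.Execute("UPDATE Accounts SET Balance = Balance + @a WHERE Id = @id", new { a = amount, id = toAccount });

        // the caller gets a different model back: the presentation model
        return new TransferResultDto { Succeeded = true };
    }
}

public class TransferResultDto
{
    public bool Succeeded { get; set; }
    public string Reason { get; set; }
}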

Table module

A table module is a different way to organize an application. It’s based on the idea of having a structure that represents a table, one row at a time. The object itself is just an access mechanism to the data of the underlying table. It allows you to move to any given record and provides access to all of its fields. You can put the business logic on this object, making it instantly available to all of the records. Often this object is directly bound to the controls of the UI. This means that the persistence, domain, and presentation models are exactly the same.

Domain model

A domain-model-centric configuration works by separating the domain layer from everything else. The resulting objects are focused on one thing and one thing only: hosting the business logic and rules. By doing this (keeping the domain model ignorant of persistence and presentation needs) we are forced to create a persistence and a presentation model. These may initially closely resemble the domain model, but they can change independently to accommodate the needs of the layer they belong to (presentation, persistence). Usually, the domain model is developed using the OOP paradigm, whereas the persistence model is often developed under a relational paradigm (if using a relational DB) and the presentation model follows an action- or object-based representation approach.
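
A bare-bones sketch of that separation (the names are mine, not from the original):

using System;
using System.Collections.Generic;
using System.Linq;

// Domain model: business rules only, no persistence or presentation concerns
public class Invoice
{
    private readonly List<InvoiceLine> _lines = new List<InvoiceLine>();

    public Guid Id { get; } = Guid.NewGuid();

    public void AddLine(InvoiceLine line)
    {
        if (line.Amount <= 0)
            throw new ArgumentException("Invoice lines must have a positive amount");
        _lines.Add(line);
    }

    public decimal Total => _lines.Sum(line => line.Amount);
}

public class InvoiceLine
{
    public decimal Amount { get; set; }
}

// The persistence model is reached only through an abstraction
public interface IInvoiceRepository
{
    Invoice FindById(Guid id);
    void Save(Invoice invoice);
}

// The presentation model is shaped for the screen, not for the business rules
public class InvoiceSummaryViewModel
{
    public string InvoiceNumber { get; set; }
    public string FormattedTotal { get; set; }
}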

Closing thoughts

So there you have it. Go and check your codebase again. Which pattern does it follow? Can you pinpoint the different models being used? Leave your comments below…

Not all models are the same

Some time ago I used to work for a company that made gelatin capsules used for drug administration. The process was a tricky one: there were several machines that would mix the gelatin, and then you had to wait a cooldown time before you could start using it. There were different kinds of gelatin with different mix and cooldown times. My job was to create a simulator that would calculate the optimal use of the machines given the requirements for different kinds of gelatin. I was given access to a database that had everything I needed. So my initial domain model was based on the DB structure. As time passed, however, it was clear that the current model was lacking a lot for my purposes. So I just dropped the whole thing and started anew. Using a repository, I then mapped the domain model to the DB model. Then something funny happened: my manager could not understand my domain model since he could not reason outside of the DB one. It took me some time to figure out what was confusing for him…

Not too long ago I heard something similar. I was helping a friend at work to distill a domain model that was based on a DB. I didn’t do much, just gave some pointers on how to apply OOD principles and helped him find out how to allocate responsibilities to the right objects. As the model became clearer he came to the conclusion that some of the DB design decisions were getting in the way, so he decided to create his own domain model. Once he did that, the code became so much clearer.

The fallacy of one model to rule them all

Unfortunately, a lot of people tend to start by modeling the DB and then create a domain model that mimics that structure. That works for a relatively simple system, but it won’t stand up to a more complex one. The reason is simple: the DB model serves a different purpose than solving the problem. Actually, we deal with several models in a system, each one serving a different purpose. Let’s go over them.

The persistence model

I’ll start with this one because it is often the starting point when designing a system. The persistence model actually has the purpose of storing data efficiently. That’s it. We often use E/R diagrams as a tool to understand the concepts of the domain and the relationships between them. One problem with this is that often these concepts are related only in a given context and these relationships are not valid outside that scope. A very experienced developer can avoid that, but I argue there are better tools for analysis than E/R diagrams. Usually, the persistence model is very granular.

The domain model

The domain model is the one responsible for solving a very specific problem. Hence a domain model should be very specific. Designing a domain model requires you to have an understanding of the problem. It is part of the solution space, not the problem space. I believe this model should be created before any other one. If you are using OOP, this model comprises your objects and the interactions between them. It is often more coarse-grained than the persistence model.

The presentation model

The purpose of the presentation model is to allow the user to interact with the system.

When dealing with an object-oriented system, there are 2 schools of thought regarding the user interface: task-based and object-based.

Task-based user interfaces are geared towards a task that involves several objects interacting together. It’s like a script that a set of objects has to follow to accomplish something on behalf of the user. This often results in a more coarse-grained model that aggregates several domain objects. Objects in this model are often called view-model objects.

Object-based user interfaces are predicated on the idea that the user should be able to manipulate the objects as they see fit to accomplish whatever they want. This means exposing the underlying domain model directly to the user. Patterns and frameworks such as Naked Objects are examples of this idea.

Traps and tricks

One of the problems I often encounter comes from the use of ORMs. I’m not saying that using them is bad, but you should use them carefully. They often introduce constraints from the underlying persistence mechanism, forcing us to concern ourselves with stuff other than solving the problem at hand. They also somewhat promote coupling, so it’s not easy to switch the underlying persistence technology, e.g., from a relational to a document DB.

Another problem arises when you try to expose your domain objects in a task-based UI. In my experience they become intermingled with UI logic, making them a mess that is hard to maintain. Often you end up with additional data that has nothing to do with the object’s original purpose.

Eric Evans figured this out long ago. That’s why in DDD he provides us with patterns to isolate the domain model from any external influence. Repositories allow the domain model to be completely independent of the persistence model, whereas the application API isolates it from the UI. Unfortunately, the intent behind these patterns has been forgotten, and thus misuse and abuse of them arise.

Closing thoughts

Next time you start a new project, put mechanisms in place to keep your models separated.

Whenever you find yourself struggling to accomplish something because your model (be it persistence, domain or presentation) cannot accommodate that change without creating a ripple effect to the other models, take a look at the patterns mentioned here and look for ways to isolate the model. This may take some time in the present, but it will pay handsomely in the future.

Good luck out there!

What’s holding you (or your organization) back from being Agile?

So you got yourself a scrum manager, had a meeting with the team, explained the scrum practices, and wrote a product backlog. 4 months later things aren’t going as you expected… this Agile talk is all nonsense – you say as you walk away disappointed – we were supposed to be able to ship faster, to fix bugs faster, to add new features faster… Before throwing the baby out with the bathwater, let’s consider some of the possible causes.

Your codebase is not Agile

This is by far the most common reason I have found in my experience. You have code that breaks every time you introduce a change (fragile), or that makes you change a lot of places every time you add a new feature (rigid). You cannot be agile with a codebase that fights you every step of the way. Focusing on processes and ignoring the codebase is often the reason why organizations fail when trying to implement Agile methodologies.

Your mindset is not Agile

If you think that a scrum master is a manager, you’re not Agile.
If you think that a backlog is like a Gantt chart, you’re not Agile.
If you think that you need a separate team (or phase) for testing, you’re not Agile.
If you think that story points are a unit of time rather than effort, you’re not Agile.
If you think that value is determined by someone else than the end user, you’re not Agile.

Your feedback loop is too loose

To me, Agile means feedback. I remember that one of the things that surprised me the most in a Scrum training was an exercise where we got to create something physical, present it, get feedback on it, and turn that into a user story/task. The trainer then proceeded to explain that the sooner we got the feedback, the sooner we would be able to adjust to get on the right track. He talked about how a sprint should have several opportunities to get feedback so that by the end we get the right product and not only the product right.

Lack of enough experienced developers

This one is actually kind of logical. If you don’t have enough experienced developers, how do you expect to have a flexible, high-quality codebase? Having enough experienced developers that you can pair with less senior ones helps you improve the overall team level, whereas having just a few of them tends to create a bottleneck for the whole team since everyone depends on them somehow.

Closing words

I am, by no means, an expert on Agile. These are just my observations on some of the most common errors I’ve seen in my professional career.

Do you think I’m missing one? Leave your comments below.

How to make your c# code more OOP with delegates pt 2

Implement the strategy pattern with delegates

Changing the default behavior of a method under test (or in any other specific circumstance)

Given the following code:

class EmailSender{
    
    public void Send (string recipient, string subject, string body) { /* invoke 3rd party */ }
    
}

class Email{
    
    public EmailSender _sender = new EmailSender();

    string recipient;
    string subject;
    string body;

    public void Send(){ _sender.Send(recipient, subject, body); }
    
}

Imagine that you cannot change the Email class. How would you unit test it without making a call to a 3rd party service?

Answer: inject a delegate with the desired behavior.

class EmailSender{
    
    Action<string,string,string> _sendAction; //default action, assigned in the constructor
    
    public EmailSender(){
        _sendAction = _send;
    }
    
    public void Send (string recipient, string subject, string body) {
        _sendAction.Invoke(recipient, subject, body);
    }
    
    void _send (string recipient, string subject, string body) { /* invoke 3rd party */ }

    internal void ActivateTestMode(Action<string,string,string> testAction){
        _sendAction = testAction;
    }

}

 

Specializing the rules of a domain object without inheritance

Given the following code:

public class BonusCalculator
{
  List<Bonus> bonuses = new List<Bonus>();

  public BonusCalculator(ICollection<Bonus> bonus)
  {
    bonuses.AddRange(bonus);
  }

  public decimal CalcBonus(Vendor vendor)
  {
   decimal amount = 0;
   bonuses.ForEach(bonus => amount += bonus.Apply(vendor, amount));
   return amount;
  }

}

public class BonusCalculatorFactory
{

   public BonusCalculator GetSouthernBonusCalculator()
   {
    var bonuses = new List<Bonus>();
    bonuses.Add(new WashMachineSellingBonus()); 
    bonuses.Add(new BlenderSellingBonus ()); 
    bonuses.Add(new StoveSellingBonus ());

    return new BonusCalculator(bonuses);    
   }

}

If we want to add a new bonus that adds 15%, we would have to create a new class just to do that multiplication… So let’s try something different.

public class BonusCalculator
{
  List<Func<Vendor, decimal, decimal>> bonuses = new List<Func<Vendor, decimal, decimal>>();

  public BonusCalculator(ICollection<Func<Vendor, decimal, decimal>> bonus)
  {
    bonuses.AddRange(bonus);
  }

  public decimal CalcBonus(Vendor vendor)
  {
   decimal amount = 0;
   bonuses.ForEach(bonus => amount += bonus(vendor, amount));
   return amount;
  }

}

Now we have to modify the factory

public class BonusCalculatorFactory
{

   public BonusCalculator GetSouthernBonusCalculator()
   {
    var bonuses = new List<Func<Vendor, decimal, decimal>>();
    bonuses.Add(new WashMachineSellingBonus().Apply); 
    bonuses.Add(new BlenderSellingBonus().Apply); 
    bonuses.Add(new StoveSellingBonus().Apply);
    bonuses.Add((vendor, amount) => amount * 1.15m); 
    return new BonusCalculator(bonuses);    
   }
}

 

Easy peasy. Now depending on how it is implemented, we could start thinking about turning some of the rules into singletons.

Moving the control flow into objects

How many times have you started an operation where you want to know 1) whether the operation was successful and 2) the return value? A lot of times this leads to code like:

class OperationResult{
    public bool IsSuccess{get;set;}
    public object ResultValue {get;set;}
}

interface IDataGateway{
    OperationResult UpdateName(string name);
}

class NameUpdaterCommand{
    string _name;
    IDataGateway _data;
    Log _log;

    public NameUpdaterCommand(string name, IDataGateway data, Log log){
       _data = data;
       _name = name;
       _log = log;
    }
    
    public void Execute(){
        var result = _data.UpdateName(_name);

        if(result.IsSuccess)
            _log.Write("Name updated to:" + Result.Value.ToString());
        else
            _log.Write("Something went wrong:" + + Result.Value.ToString());
    }
}

Come on, don’t be shy about it. I’ve done it myself too…

So what’s wrong with it?

Let’s see: the intention behind this code is to decide on a course of action based on the result of an operation. In order to carry out these actions, we need some additional info for each situation. A problem with this code is that you can’t handle an additional scenario. For that to happen, instead of a boolean IsSuccess you would have to create an enum of sorts. Like:

enum ResultEnum{
    FullNameUpdated,
    FirstNameUpdated,
    UpdateFailed
}

class OperationResult{
    public ResultEnum Result {get;set;}
    public object ResultValue {get;set;}
}

interface IDataGateway{
    OperationResult UpdateName(string name);
}

class NameUpdaterCommand{
    string _name;
    IDataGateway _data;
    Log _log;

    public NameUpdaterCommand(string name, IDataGateway data, Log log){
       _data = data;
       _name = name;
       _log = log;
    }
    
    public void Execute(){
        var result = _data.UpdateName(_name);

        switch(result.Result){
             case ResultEnum.FullNameUpdated:
               _log.Write("Full name updated to:" + Result.Value.ToString());
               break;
             case ResultEnum.FirstNameUpdated:
               _log.Write("First name updated to:" + Result.Value.ToString());
               break;
             case ResultEnum.UpdateFailed:
               _log.Write("Something went wrong:" + + Result.Value.ToString());
               break;
        }  
    }
}

So now every time you want to add a new scenario you have to add a new enum value and a new case in the switch. This is more flexible than before but a little more laborious than it should be. Let’s try to replace this enum-based code with objects that represent each case:

interface IDataGateway{
    void UpdateName(string name, Action<string> firstNameUpdated, Action<string> fullNameUpdated, Action<string> updateFailed);
}

class NameUpdaterCommand{
    string _name;
    IDataGateway _data;
    Log _log;

    public NameUpdaterCommand(string name, IDataGateway data, Log log){
       _data = data;
       _name = name;
       _log = log;
    }
    
    public void Execute(){
       _data.UpdateName(_name,
                     fullNameUpdated: name  => _log.Write("Full name updated to: " + name),
                    firstNameUpdated: name  => _log.Write("First name updated to: " + name),
                        updateFailed: error => _log.Write("Something went wrong: " + error )
        );
    }
}

So now we have shorter code. We have also moved the responsibility for controlling the flow to the object implementing IDataGateway. How it does it is just an implementation detail. We don’t care whether it’s using an enum or any other mechanism, as long as it works.
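
For completeness, one possible (entirely hypothetical) implementation of that gateway could look like this; the command above never needs to know:

class InMemoryDataGateway : IDataGateway
{
    public void UpdateName(string name,
                           Action<string> firstNameUpdated,
                           Action<string> fullNameUpdated,
                           Action<string> updateFailed)
    {
        try
        {
            var parts = name.Split(' ');
            // ... persist the name somewhere ...
            if (parts.Length > 1)
                fullNameUpdated(name);
            else
                firstNameUpdated(name);
        }
        catch (Exception ex)
        {
            updateFailed(ex.Message);
        }
    }
}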

Phew! I think that’s enough for now. Now go improve your code!

 

How to make your c# code more OOP with delegates pt 1

Since I became a codementor, a recurrent theme has been handling delegates. I’ll try to clarify this once and for all. This ended up as a long post, so I have decided to break it into 2 parts: in this one, we get a feeling for what delegates are. The next one will deal with how and when to use them.

Extending the C# type system

The C# type system is often divided into reference and value types. I won’t go into the difference between these 2 since there’s a lot of information about this topic out there. Typically a developer starts an application by extending this basic type system to better model a solution for the problem he is solving (in effect, he’s creating a DSL). There are several ways to do this: if you want to extend the value type system you usually use structs, whereas for the reference type system classes and interfaces are the default way.

However, there’s a 3rd way to declare a new type: delegates.

Understanding delegates

Given that the delegate type syntax is different from the other ways to declare a new type, a lot of developers never realize that they are indeed declaring a new type. I mean, consider the following:

class StockItem {
    public int SKU {get; set;}
    public string Description{get; set;}
    public decimal Price{get; set;}
}

struct Point{
   public int X {get; set;}
   public int Y {get; set;}
}

interface IValidate{
    bool IsValid(object value);
}

Somehow they feel alike, right? Now check this out:

delegate string CallWebService(string url);

It feels odd, right? It looks nothing like the other type definitions we have seen so far. It doesn’t have attributes or methods. Just what is this?!?!
Calm down. First of all, a delegate is an object that holds code declared somewhere else, in contrast with classes, which define the behavior of their instances inside themselves. With that in mind, what the delegate definition is saying is what kind of code it will contain: which values it can accept and return. The “method name” would be the delegate name. Now that we have a type, we can create instances of it!

delegate string CallWebService(string url);

class WebServiceUtils {
   public string MakeCall(string url){...}
}

public class Test{ 
            public static void Main(){ 
                string anUrl = "..."; 
                var caller = new CallWebService(new WebServiceUtils().MakeCall);
                caller.Invoke(anUrl); //or it could be just caller(anUrl);
             }
}

So far so good. Actually, the C# team saw the potential of delegates, and in C# version 2 they decided to bring in one of the best things that could happen to the language: anonymous methods. Anonymous methods are an incarnation of closures, a very powerful concept. Unluckily for us, they decided to reuse the delegate keyword for this.

delegate string CallWebService(string url);

class WebServiceUtils {
   public string MakeCall(string url){ ... }
}

public class Test{ 
 public static void Main(){ 
 string anUrl = "..."; 
     CallWebService caller = delegate(string url) { return new WebServiceUtils().MakeCall(url); };
     caller.Invoke(anUrl); //it could be just caller(anUrl);
 }
}

I can only imagine that the C# team was thinking that since anonymous methods were only going to be used with a delegate, it made sense to use the delegate keyword to declare not only a delegate type but a delegate instance as well. Unfortunately, this leads to further confusion since now when someone talks about a delegate, he could either be talking about a delegate type or an anonymous method.

Even worse! From MSDN:

There is one case in which an anonymous method provides functionality not found in lambda expressions. Anonymous methods enable you to omit the parameter list. This means that an anonymous method can be converted to delegates with a variety of signatures. This is not possible with lambda expressions.

Basically, it means that you can write code like:

delegate string CallWebService(string url);

class WebServiceUtils {
   public string MakeCall(string url){ ... }
}

public class Test{
   public static void Main(){
     string anUrl = "...";
     CallWebService caller = delegate { //<-- no parameters at all!!
            //you can have access to the variables on the same 
            //scope as where the anonymous method was declared
          return new WebServiceUtils().MakeCall(anUrl);
     };
     caller.Invoke(anUrl); 
   }
}

So now you have anonymous methods that don’t conform to the delegate definition but are still regarded as valid.

As if this wasn’t enough, the 3rd version of C# brought another way to declare anonymous methods: lambda expressions.

delegate string CallWebService(string url);

class WebServiceUtils {
   public string MakeCall(string url){ ... }
}

public class Test{
   public static void Main(){
     string anUrl = "...";
      //this is an anonymous method too
     CallWebService caller = url => new WebServiceUtils().MakeCall(url); //or MakeCall(anUrl)
     caller.Invoke(anUrl); 
   }
}

Phew! This was a lot for a single post. The next post will see delegates in action. Stay tuned!