The Problem With Design and Implementation

I originally posted this story on osnews.com.

I'm editing it a bit to clear up some of the points.
As I've done more reading and reflection on this topic, there's a better read from someone more authoritative than my simple self:

Martin Fowler
http://www.martinfowler.com/articles/newMethodology.html

It's a pretty good site in general for anything related to the software process...
I've also done a decent diagram:
http://yaminb.blogspot.ca/2014/06/summary-design-and-implementation.html


The Problem with Design and Implementation
I've been developing software for quite a few years. One of the issues that comes up again and again in my work is this concept of design and implementation. I recall it being a significant part of my education in the University of Waterloo's Computer Engineering program as well. The message was always the same. Never write code first. First you must design the software by writing a design document, flow charts, pseudo-code, timing charts... then it's merely a trivial matter of implementing it. Note the attitude towards implementation here. The real work is in the design; implementing it is trivial. It sounds so simple, doesn't it? Now, how often does this work out in real life?

Before I attempt to counter this attitude, let me first say that I am probably the last person to just start hacking away at code. I really do believe in design and in properly thinking about a problem first. So keep this in mind for the rest of this article: I am in no way advocating hacking. Requirements gathering, functional specifications, high-level design documents, specific algorithm choices and all the other wonderful process-oriented documents are absolutely essential.

However, if I were to summarize the article in one phrase it would be this:  The final DESIGN is code. That is ultimately what matters.

To people in software, most of what I say here is going to seem obvious. Why am I spending so much space writing about something so obvious? Because many people, even many in software, have missed this point. Let me also suggest that you read this article focusing on the mentality of the people in the business and how it works out in practice. Many of the terms I take issue with may seem reasonable from an abstract academic point of view, but this article is written from a practical perspective.
The Origin
I believe the origin of this attitude stems from older fields of engineering. The civil engineer designs a bridge and the construction workers build (implement) the bridge. The mechanical engineers design a car and the autoworkers assemble (implement) the car. So it was natural to try to overlay this well-known idea onto the field of software: one designs the software and then another implements it. You can see the constant theme here. The implementers are thought to be like robots. All they need to do is follow the instructions in the design and you will end up with a good product.
The Philosophical Problem
The major problem with this is that ALL of software is design. 100% of software is design, from the high-level architect-like design to the low-level design of a for-loop. The implementers of software are not human! I know you suspected as much, given how odd many programmers are. No, the implementers of software are actually 'perfect' machines. They are the compilers (interpreters, preprocessors... are all included in the generic use of this word). For almost all purposes, the compiler is perfect.

It is rather strange actually. It is as if people do not recognize the very thing the computer brings to the table. It gets rid of human implementers. It makes them obsolete. Thus, it can do the same task perfectly over and over again without error. Software is like a civil engineer having an army of robots capable of reading his design and perfectly building the bridge every time. What an amazing world that would be. Every screw is tightened to the exact specification. Every weld done perfectly. Every piece of steel cut to the exact precision. That is what we have in the world of software. The perfection of implementation. Yet, we do not recognize this. Rather, we have decided we cannot live in a world without human implementers. So we erroneously transferred this concept over to the field of software and created the notion of implementation where none is needed.
What is Design?
Yes, all of software is design.  There is no implementation.  Pardon me as I stress this over and over.
There is only high-level and low-level design.
To mirror other fields of engineering:
A civil engineer also has high level design, such as choosing the type and shape of bridge.
A civil engineer also has lots of low level design, such as choosing the kind of screws, where they go, where to weld...
All parts of the design are essential and are 100% design. So it is in software. The high-level architecture (choosing components, designing interfaces...) is essential. So is the low-level design of individual for loops, error checking... I dare suggest most of the problems I deal with on a day-to-day basis are problems in the low level. Low-level software should not be dismissed as dummy work. This is the guts of a program.
The Real World Problem
Every line of source code is design. Source code is the equivalent of the blueprints for a bridge. The only complete design is actual source code. Once you realize this, you will begin to understand why so many software projects go wrong. It is not enough to hand over a design document or specification to a code-monkey and expect everything to come out okay. The key issue here is to understand that no traditional design document or specification is complete. If it were complete, you would have been better off just writing the source code yourself.
After all, what is source code, but the specification of what the program should do? Most modern programming languages resemble English and written language enough that a well-written program reads as well as a specification. Modern programming languages are not cryptic or needlessly verbose like assembler. Programming languages have gotten so good and have gotten rid of so much of the fluff that they essentially have become a very good way to represent algorithms. As I look through my source code today, about the only fluff are the import or include statements at the top of the file. Well written libraries and proper design abstract the rest of the fluff.
Suppose I were to write a function that divides two numbers. I could either write a specification as follows:
Program Inputs:  32 bit signed integer A, 32 bit signed integer B
Program Outputs: A/B as an integer ignoring any fractional component
Error checking:  If B is 0 then the program will throw an exception.

That is what a properly written specification would be. I could of course have just written the source code myself as follows.
int DivideNumbers(int a, int b)
{
    // Per the specification: reject a zero divisor before dividing.
    if (b == 0) throw new Exception("Illegal divide by zero");
    return a / b;
}
Is the English specification any clearer than the actual code? I highly doubt it. You might as well just look at the source. Does the English specification provide anything of value that the source code does not? Combine this with the fact that if you make changes while relying on the specification, you run the risk of the specification falling out of date with the source. Now imagine a program with thousands upon thousands of functions. Have you ever worked at a place where every function is defined in a specification to the detail above BEFORE any code is written? Of course not! Everyone recognizes the insanity that would be. It would be mindlessly repetitive to fully specify something in English and then translate that into source code.
Of course, this even assumes it is reasonable to think a design can be perfected when written once. Indeed, software is an iterative process, as are most designs. You design something, test it out, make changes, rinse and repeat. Software makes this process amazingly simple, given our debuggers which act as simulators. Certain fields of engineering have repeated their designs so often that the standard designs seem to possess some inherent quality. We certainly aren't building many new styles of bridges, for example. Yet given any new problem, all fields of engineering face an iterative process filled with bugs. We've been doing civil engineering for centuries; somehow things like the Big Dig in Boston still cause lots of problems.
Source code is a valid specification on its own.
Yet, we seem to cling to this notion of the specification that 'just needs to be implemented'. Let us expand this little example a bit. Pardon the extreme simplification here. Much more likely than the full specification you see above is something like the following, or even no specification at all:
Program Inputs:   A,  B
Program Outputs: A/B

This leaves the 'code-monkey' needing to fill in the blanks, as the specification is not complete. The programmer might choose to return -1 instead of throwing an exception if B is 0. He might not even bother to do error checking and just rely on the language itself to handle the issue. He might choose to use long, double, or float data types instead of integers. No one knows. And this is for a very simple function. Imagine the blanks the programmer has to fill in for anything more complex. Never is this more obvious than in any application that needs a graphical user interface (GUI). The impact of this is immediately seen by users. Don't blame the programmer here. I highly doubt the GUI was specified completely before they started coding. Then when things go wrong you have people wondering what went wrong. Why couldn't that programmer just implement what we wrote in the spec? The programmers say the spec was incomplete. Again, the only complete specification is really the source code itself. Everything else is really just an incomplete specification. No one is really to blame except the broken process itself.
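To make the ambiguity concrete, here is a minimal sketch of two implementations that both satisfy the two-line specification above. The method names and the -1 convention are my own illustration, not anything from a real spec:

// Reading 1: integer division, signal failure with a sentinel value.
static int DivideA(int a, int b)
{
    if (b == 0) return -1;  // chose a sentinel over an exception
    return a / b;
}

// Reading 2: floating-point division, let the language handle b == 0.
static double DivideB(double a, double b)
{
    return a / b;  // b == 0 yields Infinity or NaN, not an exception
}

Both produce 'A/B', yet they disagree on data types, error behavior and results. Only the source code tells you which one you actually got.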
The Academic Problem
This is not just a problem in industry with the stereotyped MBAs not understanding software. Often, you hear from engineers that universities should not be teaching software in a specific language, but should instead be teaching abstract notions of algorithms and data structures. I agree, but how do you propose students express algorithms or data structures? Yes, this is why programming languages were invented: to allow us to express algorithms and data structures in a human-readable format! You have to learn the language to express your ideas. How do you test your algorithms and data structures? By writing them in a specific language and running them. The power of a good programming language is essential to your learning. You can set breakpoints, inspect variables, make changes and see the results immediately.
Indeed, by insisting that specific programming languages are not 'valid' ways to specify something, academia only reinforces this notion of design and then implementing it. While this might be true in some abstract notion within academia it has disastrous effects when carried over into the rest of the world.
Let's have a look at a simple algorithm one might encounter in academia. Here is Wikipedia's description of the Euclidean algorithm for finding the Greatest Common Divisor (GCD).
Euclid's GCD Algorithm
The Euclidean algorithm is iterative, meaning that the answer is found in a series of steps; the output of each step is used as an input for the next step. Let k be an integer that counts the steps of the algorithm, starting with zero. Thus, the initial step corresponds to k = 0, the next step corresponds to k = 1, and so on.
Each step begins with two nonnegative remainders r_{k−1} and r_{k−2}. Since the algorithm ensures that the remainders decrease steadily with every step, r_{k−1} is less than its predecessor r_{k−2}. The goal of the kth step is to find a quotient q_k and remainder r_k that satisfy the equation
r_{k−2} = q_k r_{k−1} + r_k
where r_k < r_{k−1}. In other words, multiples of the smaller number r_{k−1} are subtracted from the larger number r_{k−2} until the remainder is smaller than r_{k−1}.
In the initial step (k = 0), the remainders r_{−2} and r_{−1} equal a and b, the numbers for which the GCD is sought. In the next step (k = 1), the remainders equal b and the remainder r_0 of the initial step, and so on. Thus, the algorithm can be written as a sequence of equations
a = q_0 b + r_0
b = q_1 r_0 + r_1
r_0 = q_2 r_1 + r_2
r_1 = q_3 r_2 + r_3
…
If a is smaller than b, the first step of the algorithm swaps the numbers. For example, if a < b, the initial quotient q_0 equals zero, and the remainder r_0 is a. Thus, r_k is smaller than its predecessor r_{k−1} for all k ≥ 0.
Since the remainders decrease with every step but can never be negative, a remainder r_N must eventually equal zero, at which point the algorithm stops. The final nonzero remainder r_{N−1} is the greatest common divisor of a and b. The number N cannot be infinite because there are only a finite number of nonnegative integers between the initial remainder r_0 and zero.
Brings back memories of university, doesn't it? Now, you read that and yes, it can be implemented in any programming language. Here is one of the implementations from the same Wikipedia article. It's a simplified version and might not match up exactly with the English/mathematical description above. Go with the flow here.

function gcd(a, b)
    if a = 0
        return b
    while b ≠ 0
        if a > b
            a := a − b
        else
            b := b − a
    return a
Granted, it is in pseudo-code, but it wouldn't take much effort to convert it to C# or any other language. If I were writing a program, what would be the point of spending several paragraphs describing the procedure in mathematical and English language when I could just express it directly as source code? Can you imagine handing off that written mathematical specification to a 'code monkey' to just implement? He'd have more trouble understanding what you wrote than if you had just written the code yourself.
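To show just how little effort, here is a minimal sketch of that conversion to C#; the method name Gcd and the assumption of nonnegative inputs are mine:

// Subtraction-based Euclidean algorithm, translated directly
// from the pseudo-code above. Assumes a and b are nonnegative.
static int Gcd(int a, int b)
{
    if (a == 0)
        return b;
    while (b != 0)
    {
        if (a > b)
            a -= b;  // subtract the smaller from the larger
        else
            b -= a;
    }
    return a;
}

The translation is nearly mechanical, because the pseudo-code was already the design.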
And this is the same problem in every other realm from network protocol specifications to HTML standards. Every specification that is complete is better off just written as source code directly. I can almost guarantee you that it will be more understandable as well.
Small Tangent
Let's go off on a tangent here, as this is one of the reasons people find it so hard to implement specs and standards. What is the correct HTML standard? The answer really is whatever reference renderer you use. Remember again that to fully specify the HTML standard, it would have to be long and detailed enough to essentially be source code. A small example I recently ran into is read-only text boxes. Firefox renders them the way I like: a read-only text box is greyed out. In IE, it looks the same as a writable text box. Not being a web programmer, this was news to me. I suddenly feel old and outdated, as I am sure many of you are thinking that is so 1999. The HTML specification of course says nothing of the 'right' way to do this. This is natural, of course, as to make the HTML render exactly the same everywhere, the specification would have to be enormously long, detailing colors, border widths, bevels, how round corners should be, gradients... I'd hate to be on that committee.
These days, it seems WebKit is becoming one of the 'standard' renderers. You must behave as WebKit behaves. So why not use WebKit directly? Hence, we see many more browsers moving towards it rather than playing a game of constant catch-up. Have a look at the alternative if you wish. Microsoft has released many specifications of its formats. They are insanely long, and I doubt they are fully specified. I'd be more than willing to wager that to actually implement any of their standards, you would have to launch their renderer (MS Word), see how it renders something, and then copy it. I do not fault Microsoft here. It's just the reality of writing specifications. Having worked in the networking field, we did this all the time. Some part of the specification is vague or strangely worded or missing. What do we do? We hook it up to the 'standard' Cisco box and see how it behaves. Then we make sure we behave the same way. Sometimes you are lucky in that, as more and more people do this, the 'specification' is updated to include all the little vagaries people ran into along the way. We're actually quite fortunate in networking, as the protocols themselves are fairly straightforward and they rarely change. Case in point: IPv4. Despite all its problems, it is still the dominant protocol. Contrast that with other fields where change flows much quicker.
Granted, oftentimes companies don't want to just give you the source code, as they spent time and money developing it. They might want your software to be incompatible, so that they can claim your software is broken and less reliable than theirs. One of the beauties of open source is that you don't need to read the specification. You can just link the source directly.
My only point here is to suggest that to FULLY specify something, the actual source code is often the most concise and best method of doing it.


Implementation-Only Activities
Now, are there any implementation-only activities in the software realm?  I suppose porting an application between similar languages using similar libraries would be implementation only.  Some aspects of GUI design are also implementation only.  However, most of these are disappearing, as they are simply areas where compilers are lacking, and boy do we like writing compilers to get rid of these tedious tasks.  A lot of GUI design is moving towards some form of specification (XAML...) that can then be read or compiled into a program itself without you needing to touch 'code'.

This progression has been constant.  We keep moving up the chain, if you will, removing all the redundant aspects of software and leaving only 'pure' design.   Decades ago, if you programmed in assembler, there might have been an 'implementation' stage where you were simply doing the repetitive task of transferring the design of a for loop into assembler instructions.  However, it is the compiler that has relieved you of these tedious, repetitive tasks.  If you are coding in assembler today, it is only because you lack a compiler for a higher-level language or you are doing optimization work.



Solutions
I believe the solution to this problem lies in changing attitudes. First and foremost, we need to change the language used by us and by the academic world. I never want to see the phrase 'design and implementation' mentioned again.  Even if 'we' understand the academic distinction that a design can be conceived independently of a programming language, when this reaches the wider management and business community, it is reduced to the very problem discussed here. We should speak only of high-level and low-level design.

Next, we need to recognize source code itself as a valid specification. That is exactly what it is. You don't see civil engineers trying to write an essay to describe a bridge when an AutoCAD blueprint is the real design. We need to treat our source code as the specification that it is. That means writing it neatly and cleanly. Putting separate objects in separate files. Just as the civil engineer doesn't scribble over his designs, neither should we leave our source code mangled with undecipherable variable names. You should be able to read it with ease. I suggest we as software engineers bear a great deal of responsibility here. Oftentimes our source code resembles a blueprint with eraser marks, numbers scratched out, pages duct-taped together, big arrows leading to other documents and a big TODO that leaves out critical functionality... We should also try to use programming languages that make being a valid specification easy. C#, for example, has the ability to carry a lot of meta-information and attributes within the source code itself. I hope this process continues and is expanded in more languages. The more expressive the language, the more we can offload onto the compiler.
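As a small sketch of what I mean, here is the divide example from earlier dressed up with the kind of metadata C# supports; the class name MathSpec is just my own illustration:

using System;
using System.Diagnostics.Contracts;

public static class MathSpec
{
    /// <summary>
    /// Divides two 32-bit signed integers, discarding any fractional part.
    /// </summary>
    /// <exception cref="DivideByZeroException">Thrown when b is zero.</exception>
    [Pure]  // attribute declaring the method has no side effects
    public static int DivideNumbers(int a, int b)
    {
        if (b == 0) throw new DivideByZeroException("Illegal divide by zero");
        return a / b;
    }
}

The XML comments and attributes travel with the code and can be extracted by tools, so this part of the specification cannot drift away from the source.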

We should use valid design documents where possible. They should aid the source code specification, but should not be thought of as the specification itself, as they are not what the implementers (compilers) use to produce the final result. Putting emphasis on useful design documents will allow more man-hours to be spent here instead of on useless items. I tire of seeing someone copy the header file of a C++ class, paste it into a Word document and claim that is valid documentation. Designing GUI prototypes, timing diagrams for network protocols, class and interface maps, overall design choices... are all essential in guiding the source code.

Last, but not least, is something I haven't touched on much, but it is implied. Software is not trivial work. As such, getting good people and training them is essential. You cannot separate the specification or design or knowledge from the code. The latest fad is something called a 'subject matter expert' (SME). I say this is a fad because it assumes that you can separate the person doing the coding from the person that 'knows stuff'. In the networking world, this is the person that knows the protocols and specifications down to a tee. No worries though. They don't program. That is just the trivial matter left to the mundane implementers. I had such an experience recently at a company I worked for. They had a PhD who was supposed to be a subject matter expert. Of course, there were the programmers actually writing the code, debugging issues, working on interoperability, coding which bits need to be set, what values go in what fields, and so on. The programmers ended up having to know more about the specification than the SME himself. Do I really need to repeat this article's argument about the futility of separating the SME from the programmer? In the end, it is your programmer who will debug issues, solve problems... They need to be experts in the subject matter. I suppose you could have an SME who does not code. It's just that the programmer would have to bother them for every line of code. In the end, the SME would probably be better off just writing the code themselves.


If you are hesitant to make the leap of realizing that to fully specify something you essentially need to code it, or that a programmer needs to be a subject matter expert... let me summarize this article with an old philosophical question.

Can you have thought without language?
Some of the best languages we have found for precisely specifying many routines are programming languages. No other written language offers as much clarity or conciseness as a modern programming language.
