Confusions about abstraction

What does the concept "abstract" really mean?

The term abstract is widely used, but its meaning is in many cases vague or fuzzy. To calibrate the term, let's try to define exactly what we mean by the concept (the word).

The following is a quote from the English Wikipedia, defining the word abstraction:

Abstraction is the process or result of generalization by reducing the information content of a concept or an observable phenomenon, typically in order to retain only information which is relevant for a particular purpose. For example, abstracting a leather soccer ball to a ball retains only the information on general ball attributes and behavior.

Abstraction uses a strategy of simplification, wherein formerly concrete details are left ambiguous, vague, or undefined; thus effective communication about things in the abstract requires an intuitive or common experience between the communicator and the communication recipient.

Abstraction is reducing the information content

So here the point is simplification by reducing the amount of detail. The purpose of this is to emphasize the important aspects from the chosen point of view.

To say that abstraction is simplification is true, but this expression can be interpreted too strongly. The simplification is in most cases achieved by reduction of details, so there is still room to move within the simplification. The amount and nature of the simplification depend on the situation and change case by case.

From a purely theoretical (mathematical) point of view we can say that abstraction is a mapping f: L → R, where L is the source set (in our case a subset of reality) and R is the result set (the model of reality). The mapping f abstracts (simplifies) the complexity of L if the result set is simpler than the source set. In its most general form this means a reduction in the number of details. This implies that the mapping f is a homomorphism but not an isomorphism; otherwise the reduction would be a zero-reduction. Being a homomorphism but not an isomorphism means that many source points map to one result point. For this reason a reverse mapping is not possible: the mapping genuinely destroys information, and this lost information cannot be recovered in any way from the model alone.
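A minimal worked example of such an information-destroying mapping (my own illustration, using the floor function as the abstraction):

    f : \mathbb{R} \to \mathbb{Z}, \qquad f(x) = \lfloor x \rfloor, \qquad f(2.1) = f(2.9) = 2 .

Many source points collapse onto one result point, so from the model value 2 alone the original x can never be reconstructed; the detail is genuinely gone.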

A terrain and a map of that terrain are a good example of this kind of modeling.

Simplification is a double-edged sword. When the point of view and the usage are straightforward and simple, there are normally no problems. Consider my geographical map example: if we need driving directions in southern Finland, or a street map for walking in the Helsinki city center, then the scaling of the map is no big issue. The situation changes a lot when the "map" (the model) is used from several different points of view and for several different interests.

Common confusions about the concept "abstract"

Non-concrete is abstract
The concept "abstract" is often used fuzzily as a synonym for non-concrete and/or difficult. Sometimes people say that mathematics or geometry is "abstract" as such. This does not, however, conform to the previous definition: neither simplification nor reduction of detail takes place. Both of these mainstream fields of mathematics are axiomatic systems with rules of manipulation. A completely different story is the fact that mathematics and geometry can be used as instruments in abstraction, but in such a situation the mathematics is only a vehicle or tool for creating the mapping between the sets. In the same way we could say that the game of chess is abstract. This is false again: the game is among the most concrete things in the world, with its board, its chess pieces, and its rules. Following the same deduction, programming languages are not abstract either, but well-defined games in their own world.

Programming is abstract

The world of computer programming is defined by the physical structure of the von Neumann machine. The next question is then: do programming languages form layers of abstraction above each other? Many people say that Cobol is more abstract than assembler. My answer is NO in my strict use of the concept abstract. My argument is the following. As our current programming languages are all deterministic, the code cannot truly simplify anything that affects the decisions made along the path of execution. This implies that all deterministic programs are isomorphic with each other. This means that when we have a function f from programming language a to programming language b which gives the mapping, then there always exists a reverse function f⁻¹ which gives the reverse mapping from the program in b back to the original program in a. The real thing that some call abstraction is only compression! Here is a small, simple example of that compression.

Example: the for loop in Java 4 and Java 5

// Java 1.4 style: explicit Iterator (written here with Java 5 generics for clarity)
void cancelAll(Collection<TimerTask> c) {
    for (Iterator<TimerTask> i = c.iterator(); i.hasNext(); )
        i.next().cancel();
}

and

// Java 5 style: the enhanced for loop compresses the very same iteration
void cancelAll(Collection<TimerTask> c) {
    for (TimerTask t : c)
        t.cancel();
}

Simplification always boils down to reducing the number of elements, starting with the less important ones. When the program is deterministic, all those elements (attributes and their values) that control the flow of execution must be present regardless of the chosen programming language. So in every equivalent program exactly the same if statement must be present in one form or another.
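As a small illustration of this claim (my own sketch; the method and its expiry condition are invented for the example), the very same condition survives whether it is written as a classic if statement or moved into a Java 8 stream filter:

// Classic style: the controlling condition is an explicit if statement
void cancelExpired(Collection<TimerTask> c, long now) {
    for (TimerTask t : c)
        if (t.scheduledExecutionTime() < now)  // the condition that steers execution
            t.cancel();
}

// Java 8 style: the very same condition reappears inside filter(...)
void cancelExpired2(Collection<TimerTask> c, long now) {
    c.stream()
     .filter(t -> t.scheduledExecutionTime() < now)  // same condition, new clothes
     .forEach(TimerTask::cancel);
}

The notation can be compressed, but the controlling logic itself is present in both programs in one form or another.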

UML is abstract

UML is a one-to-one mapping between a well-defined set of concepts and their corresponding graphical signs. This mapping is isomorphic between the graph and the graph's "verbal structure", which can be any programming language. In this way UML is a transformation algorithm between a diagram and a description (which can, for instance, use the Java language).
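For instance (a sketch of my own; the Customer/Order model is hypothetical), the diagram fragment in the comment and the Java text below it carry the same information, and either one can be mechanically derived from the other:

import java.util.ArrayList;
import java.util.List;

// UML: [Customer] 1 ----- * [Order]   (one Customer is associated with many Orders)
class Customer {
    private final List<Order> orders = new ArrayList<>();  // the 1..* association end
    List<Order> getOrders() { return orders; }
}

class Order {
    private Customer customer;  // the association end navigable back to Customer
    Customer getCustomer() { return customer; }
}

Nothing is lost in either direction, which is exactly why the mapping is a transformation rather than an abstraction.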

About real simplification

Let's return to real simplification. When we model reality, we face all the time the question of how much to simplify. When we are making the mapping between reality and the model, we have to choose the scale. The following diagram illustrates the two sides of the decision. The y-axis is the amount of abstraction (or simplification) and the x-axis describes the amount of semantics within a concept. When you pick a point on the y-axis and thereby decide the level of abstraction, you at the same time get the amount of semantic value, which is the width of the triangle at that point. So the higher we go, the less semantics the concept carries.

[Diagram: Abstraction Triangle]

There is a shaded area in the middle of the y-axis. This shows the optimal (read: the best possible) level of abstraction, and the red thread indicates that different individual concepts (read: classes) can be at different levels of abstraction. This actually means that the benefit we get from abstraction increases to a maximum somewhere in the middle between zero abstraction and total abstraction, which is a single point with no content. See the next diagram:

A common myth is that the higher the level of abstraction, the better. As this discussion has shown, this is not the case; rather the contrary. As we can see from the parabola, at first there is a clear increase in clarity, but then at some point raising the level starts to corrupt the most essential parts of the information, and finally the mapping collapses to zero. So both ends of this graph are danger areas.
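Purely as an illustration (a toy formula, with no claim that abstraction is really measurable on a single axis), the shape of the curve can be sketched as a concave function of the degree of abstraction a:

    B(a) = a\,(1 - a), \qquad a \in [0, 1],

which is zero at a = 0 (nothing has been simplified), zero again at a = 1 (the model has collapsed to a single contentless point), and peaks in between at a = 1/2.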

The first example of a big collapse at the left end of the graph was IBM's ambitious attempt with objects: a project called San Francisco (3200 classes, 24600 methods) at the end of the previous millennium. The attempt was more or less to produce a model that would cover all possible businesses. As you can see from the figures above, the level of abstraction was far too low. The model was finally constructed with a huge effort. The trouble was that it turned out to be totally useless with that amount of information. I still quite frequently run into the attitude that really big (read: detailed) models are something to be proud of, but sadly the truth is almost the opposite.

The second extreme is of course at the other end of the function. These are models with only a few, very general concepts. Usually these models are technically correct but semantically completely empty. So they look nice but don't contain any real value for developing applications.

This is the point where I can return to Grady Booch and the question he asked his audience: "When is a domain model ready?" His answer was that it is not ready when all the possible classes have been added to the model; it is ready when you cannot remove a single class from the model without totally collapsing it!

My experience is that such a model typically consists of 30 to 60 classes. So even here the famous rule of Albert Einstein, "Everything should be made as simple as possible, but not simpler," is completely valid!

By the way, the modeler can decide the number of classes in the model even without knowing anything about the target reality. This is of course done by either lifting or lowering the level of abstraction of several classes in the model.

Abstraction within programming

The level of abstraction of the classes is not directly reflected in the absolute number of classes but rather in the relative numbers of classes and methods together. This means that the higher the level of abstraction in the model, the more complicated the implemented methods, and vice versa.
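A small sketch of this trade-off (the account classes are hypothetical, invented only for the illustration): the low-abstraction model spreads the logic over many simple classes, while the high-abstraction model has one general class whose method implementation must absorb all the complexity:

// Low abstraction: many specific classes, each with a trivial method
class SavingsAccount  { double balance; double interest() { return balance * 0.02; } }
class CheckingAccount { double balance; double interest() { return 0.0; } }

// High abstraction: one general class; the variation moves into the method body
class Account {
    String kind;    // "SAVINGS", "CHECKING", ...
    double balance;

    double interest() {
        // fewer classes, so a more complicated method
        switch (kind) {
            case "SAVINGS":  return balance * 0.02;
            case "CHECKING": return 0.0;
            default:         throw new IllegalArgumentException(kind);
        }
    }
}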

In this way the logical 3-tier architecture can lift the abstraction level for the GUI programmer by encapsulating the lower-level details behind the object boundary, inside the method implementations.
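For example (a hypothetical sketch; CustomerService and its JDBC plumbing are invented here), the GUI tier sees a single high-level call while the SQL detail stays behind the object boundary in the middle tier:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Middle tier: encapsulates the lower-level persistence details
class CustomerService {
    private final Connection db;
    CustomerService(Connection db) { this.db = db; }

    String customerName(int id) throws SQLException {
        // low-level detail, invisible to the GUI programmer
        try (PreparedStatement ps =
                 db.prepareStatement("SELECT name FROM customer WHERE id = ?")) {
            ps.setInt(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("name") : null;
            }
        }
    }
}

// GUI tier: one call at the lifted level of abstraction
// nameLabel.setText(service.customerName(42));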


Aged but not outdated remarks on the meaning and importance of behaviour

About the difference: Yes, I meant what I wrote. This is the root cause of the birth of Object-Oriented Analysis (OOA).

Here are the references:

Grady Booch: Object-Oriented Analysis and Design with Applications, 2nd edition, pp. 16-20

http://www.amazon.com/Object-Oriented-Analysis-Applications-Addison-Wesley-Technology/dp/020189551X/ref=sr_1_7?ie=UTF8&s=books&qid=1241502775&sr=1-7

Here is the summary from Booch’s book:

Actually, this is a trick question, because the right answer is that both views are important: the algorithmic view highlights the ordering of events, and the object-oriented view emphasizes the agents that either cause action or are the subjects upon which these operations act. However, the fact remains that we cannot construct a complex system in both ways simultaneously, for they are completely orthogonal views. We must start decomposing a system either by algorithms or by objects, and then use the resulting structure as the framework for expressing the other perspective. Our experience leads us to apply the object-oriented view first because this approach is better at helping us organize the inherent complexity of software systems, just as it helped us to describe the organized complexity of complex systems. Object-oriented decomposition yields smaller systems through the reuse of common mechanisms, thus providing an important economy of expression. Object-oriented systems are also more resilient to change and thus better able to evolve over time, because their design is based upon stable intermediate forms. Indeed, object-oriented decomposition greatly reduces the risk of building complex software systems, because they are designed to evolve incrementally from smaller systems in which we already have confidence. Furthermore, object-oriented decomposition directly addresses the inherent complexity of software by helping us make intelligent decisions regarding the separation of concerns in a large state space.

 and

Peter Coad: Object-Oriented Analysis, chapter 1.3 "Analysis Methods" (pp. 18-36)

http://www.amazon.com/Object-Oriented-Analysis-Yourdon-Computing/dp/0136299814/ref=sr_1_1?ie=UTF8&s=books&qid=1241502828&sr=1-1

If all this is completely new to you, I suggest you read Coad's book in its entirety and the first half of Booch's book.

As Peter Coad strongly emphasizes in his book, here we face a paradigm change. The crux of a paradigm is that you cannot mix two paradigms. In his class training Peter Coad used digital and mechanical watches as an example of two paradigms. The point is that you have to choose, for better or for worse. You cannot have both, or even a mix.

To conclude, I enclose a short email from Peter Coad from 1996:

   Issue 34

Category: Use-Case

Title: "Use-Cases Considered Harmful When…"

Date: Tuesday, December 3, 1996

Dear Friend,

A "use-case" describes how an actor interacts with other people and with automated systems, to achieve some business purpose. Use-cases are a helpful way of specifying functional requirements (use simple statements; state requirements from a user's perspective; establish system context; define who is interacting with whom). For each use-case, one works out dynamics with scenarios. Each scenario is a time-ordered sequence of object interactions, showing what it takes to carry out a use-case (or a variation within a use-case).

How do object models fit into all of this? Object models provide a stable organizational framework (that is to say, problem-domain classes), so changes in use-cases or features can be more graciously accommodated.

It’s very important to make this point:

Use-Cases Considered Harmful … when used to drive an object model’s shape.

Perhaps it’s even a bit silly to write that. After all, if I made the statement:

Functional Specs Considered Harmful … when used to drive an object model’s shape

Nearly everyone would agree (with the exception of die-hard data-flow diagrammers and hard-core functional-spec writers ;-).

Allowing use-cases to be the driving force in shaping effective object models is a major mistake [1,2]. Far too often the resulting object models look like data flow diagrams, dressed in object notation. (When you see an object model and can classify its classes into controllers, function blobs, and data blobs, you know you are in trouble!)

A Better Way

A better way? Here’s a recommended approach:

  1. With your client, develop a list of features (desired outcomes, things your client will “vote with his wallet” to get).
  2. With your client, prioritize your features into three major categories: A (must have), B (nice to have), C (next time).
  3. With your client, group related features together. These groupings are use-cases, defined at an appropriate level of abstraction.
  4. With your client, create an initial object model, from classes named in the use-cases and features.
  5. With your client, proceed feature-by-feature and:
    1. Work out dynamics with scenarios.
    2. Establish responsibilities, along the way.
    3. And, at times, discover additional classes of objects, ones you had not previously considered.

When someone hands me a book of use-cases, do I use them? Definitely. When someone hands me a book of functional requirements, do I use them? Definitely. How? In both cases, I use such sources for content, not for insights on how to shape an effective object model.

Pete