Abstractions -- The Building Substance of Programming Languages

Analysis Requirements & Principles architecture:
Paradigms
substance:
Abstractions
structure:
Domains
building blocks: Features surface:
Syntactics
Defining PLs Language
List
foundational safety flexibilized

  1. Forms of Abstraction
  2. Procedural Abstraction
  3. What is a Type?
  4. Data Abstraction, Objects, Types, (Co)Algebras -- Hiding The Representation of Data

 
If you know further references, please don't hesitate to tell me (Ulf Schünemann), likewise if you have any suggestions.


[CoAlg] Horst Reichel: An Approach to Object Semantics based on Terminal Co-algebras; Math Struct in Comp Sci (1993/11).
[DT/PD] Pg 13/14,21 in John C Reynolds: User-Defined Types and Procedural Data Structures as Complementary Approaches to Data Abstraction; In [TAOOP] (first 1975).
[TCACI] Bart Jacobs and Jan Rutten: A Tutorial on (Co)Algebras and (Co)Induction; 1997.
[OCCoa] Bart Jacobs: Objects and classes, co-algebraically; In: Object-Orientation with Parallelism and Persistence; Kluwer 1996.
[OO/AD] William R Cook: Object-Oriented Programming Versus Abstract Data Types. In: Foundations of Object-Oriented Languages; Springer, 1991.
[OOPUA] Guiseppe Castagna: Object-Oriented Programming: A Unified Approach; Birkhäuser, 1997.
[PBL] Christophe Dony, Jacques Malenfant, Pierre Cointe: Prototype-Based Languages: From a New Taxonomy to Constructive Proposals and Their Validation; OOPSLA'92.
[ThO] Martin Abadi, Luca Cardelli: A Theory of Objects; Springer 1996.
[WOMB] Ole Lehrmann Madsen, Birger M\/oller-Pedersen: What object-oriented programming may be - and what it does not have to be; ECOOP'88.


Forms of Abstraction

«Concepts are created by abstraction, focussing on similar properties of phenomena and discarding differences. Three well-known sub-functions of abstraction have been identified» [WOMB]:
Abstraction Examples Techniques of Support
1. aggreggation /
decomposition
  1. base-level: expression, functional, algorithmic, procedural abstraction, overloaded function (aka CLOS's generic function), modular abstraction, data-structure abstraction, ...
  2. type-level: opaque types (eg. C/C++'s "incomplete types"), record types, union types, intersection types [>], ...
  1. Naming: use a name for a compound subterm
  2. Opacity: hide what the name stands for
  3. Ensure some unifying property of the aggregate, like encapsulation
2. classification /
exempliciation
  1. abstract types: function types, interfaces, object types, signatures, type classes [>], ...
  2. concrete types: datatypes, object classes, range types, ...
  1. Type defined as those values with certain properties
  2. as those values created through certain generators
3. generalization /
specialization
  1. base-level: subclassing
  2. type-level: subtyping, hash-types [>]
  1. Structural: recognized by comparing properties
  2. Inheritance, extension, reduction
Additionally, in programing languages one also says "abstraction" to parameterization, which does not seem to fit into this classification of abstractions (it contains aspects of each of them), and might therefore be considered a fourth subfunction of abstraction:
4. parameterization
  1. base-level: functions
  2. type-level: (predicative) parametric polymorphism [>]
     
But maybe parameterization is better considered the technique which supports (the fourth subfunction, or a combination of the three original subfunctions) of abstraction. In particular so, because parameterization's similarity with the technique of naming, to which is related by Schmidt's
Correspondence Principle. Naming and parameterization where E is a value term comes in several different flavours, may have slighly differing meanings, depending what exactly it is which is abstracted from and which x therefore names, denotes, refers to, is bound to:
  1. Syntactic abstraction (naming/parameterization): x denotes the term E. Therefore, E is evaluated each time that x is used in the execution of P or G, resp. That is, execution follows the normal evaluation strategy aka lazy aka, in case of parameterization, call-by-name [<].
  2. Semantic abstraction (naming/parameterization): x denotes the value v to which E evaluates. Therefore, E is evaluated when, resp., the definition x=E or the application G(E) are evaluated; x is then bound to the resulting value v.

Procedural Abstraction

«Procedural Abstraction Naming a segment of code [executable code, not declarations] so that it can be manipulated (usually just executed) by giving its name.» [DAOOC++ 30]

«Sie [die Prozedur] ermöglicht es dem Programmierer, eine Folge von Aktionen als eine einzige Aktion zu betrachten und die Wechselwirkunge dieser Aktionen mit anderen Teilen des Programms zu kontrollieren.» [Louden 11].


What is a Type?

«As soon as we start working in an untyped universe, we begin to organize it in different ways for different purposes. Types arise informally in any domain to categorize objects according to their usage and behavior. The classification of objects in terms of the purposes for which they are used eventually results in a more or less well-defined type system. Types arise naturally, even starting from untyped universes» [UTDAP].

The term "type" is used in several combinations with other words and used with different meanings. (mathematical concept of types). An incomplete overview:

Also it can be attempted to express dimensioned values by types. ... [TODO]
See also the classifications of types in the CORBA specification section 1.2.4.

What is a Datatype?

«Despite the apparent derivation of the word, [a datatype] is not simply "a type of data", but a concept in its own right [emphasis added]. (This is the main reason for removing the space in "data type".) Things that have a datatype may not themselves be an item of data (at least of that datatype; it may be a value of some other datatype). An integer variable is not itself an integer. ...
The next thing is to abandon any representational view of a datatype. Data is an abstraction, capable of representation in many forms, and "datatype" is a more abstract still; it is disastrous to link the concept to particular representational forms. This applies as much to second-order representational assumptions (e.g. a complex value as an ordered pair of real values, or an array as a contiguous block of values) as it does to first-order assumptions such as bit patterns. ...
Other things that have to go from the definition of datatype are the operations that can be performed on the data values. ... To [many languages people], the values and the operations on them are inseparable. ... [This attitude] overlooks two things. ... [Y]ou can distinguish between static and dynamic aspects ... For example, a data transmission channel operates with the data (dynamic) but not on the data, which is unchanged (if the channel is working correctly!). Operations such as addition and subtraction are an irrelevance here, and can even be a nuisance, for example when talking of conformity to a standard. The other overlooked aspect is that it is not always obvious what the operations should be. ...
I am not saying you could not include operations in a taxonomy, but to do it properly needs the full machinery of object-orientation.» [TaxDT] Brian Meek: A taxonomy of datatypes; 159-167 in: SIGPLAN Notices 29/9; ACM 1994.

Data Abstraction, Objects, Types, (Co)Algebras -- Hiding The Representation of Data

                           Data Abstraction
                            << abstract >>
                                  A
                                  |
         .------------------------+------------------------.
         |                        |                        |
 Data Type Abstraction     Object Abstraction     2-D Data Abstraction
 (abstract data types)     (object-based OO)
                           has dynamic dispatch
     A           A             A       A
     :           |             |       |
     :           `-----. .-----'       `------. + cloning
     :   + dyn overload| |  + generators      |
     :  (+ receiver    | | (+ method          |
     :      syntax)    | |    suites)         |
     :                 | |              Prototype-based
     :            Class Instances    (e.g. actor systems?)
     :                  A                     A
     :                  |                     |
     :                  |                     | + delegation
     :           ``classical OO''             |
     :                                  Delegation-based
     :................................. (with traits)

1. Data Abstraction

``Information hiding is a technique for handling complexity. By hiding implementation details, programs can be understood and developed in distinct modules and the effects of a change can be localized. One technique for information hiding is data abstraction.'' [
LCLint]

``Data Abstraction refers to a range of techniques for defining and manipulating data in an abstract fashion.'' [OO/AD, 155]

``A general view of abstract data is based on the notion of abstract constructors and abstract observations. The constructors create or instantiate abstract data, while the observations retrieve information about abstract data'' [OO/AD, 155]. ``[T]he empty list constructor nil and the prefix operation cons generate all finite[!] lists. And ... the head and tail operations tell us all about infinite[!] lists'' [TCACI, 9].
``The behavior of a [immutable] data abstraction is specified by giving the value of each observation applied to each constructor. This information is naturally organized as a matrix with the constructors on one axis and the observers on the other. Each cell in the matrix defines the behavior of the abstraction for a given observer/constructor pair'' [OO/AD, 155].

Besides general data abstract, abstract data types and object abstractions are ``two distinct techniques for implementing abstract data'' [OO/AD, 152]. There is a ``duality [in the categorial sense] between abstract object types on the one hand and abstract data types on the other hand'' [CoAlg, 4]: ``Whereas abstract data types in the initial approach may be formalized as minimal soulutions (fixed points) of recursive type equations, object types may be understood as maximal solutions (fixed points) of recursive type equations'' [CoAlg, 2].
More generally: ``The distinction between algebra and coalgebra pervades computer science and has been recognized by many people in many situations, usually in terms of data versus machines. A modern, mathematical precise way to express the difference is in terms of algebras and coalgebras. The basic dichotomy may be described as construction versus observation'' [TCACI, 4].
``The initial algebras and terminal coalgebras ... can be described in a canonical way: an initial algebra can be obtained from the closed terms (i.e. from those terms which are generated by iteratively appluing the algebra's constructor operations), and the terminal coalgebra can be obtained from the pure observations'' [TCACI, 3].

Arguments:
Abstract Data Types: Object Abstraction:
Pro: ``[I]t is possible to improve the efficiency, because the representations of both arguments to the [observation operation] may be inspected'' [OO/AD, 166]. Con: ``[A] primitive operation can have access to the representation of only [the] data item ... of which the operation is a component'' [DT/PD].
Pro: ``[A]ddition of a specialized representation can also be viewed as a form of optimization'' [OO/AD, 166].
Con: ``Adding a new constructor to an abstract data type involves extending its concrete representation. This, in turn, requires that all of the case statements on the operations be extended to cover the new representation variant'' [OO/AD, 162]. Pro: ``[E]asier extensibility'' [DT/PD]: ``[T]he objects act as clients among themselves, and so are encapsulated from each other'' [OO/AD, 161]. Adding a new constructor for an existing object type ``is easy'' because ``[t]he representation is decentralized'' [OO/AD, 162].
Con?: ``Adding new observations ... is complicated by the fact that the hidden representation must be shared by all operations. In some way the new observation operation must be inserted within the scope of the representation type'' [OO/AD, 165]. Con: To add a new observation to an existing object type ``it is necessary to add it to each constructor'' [OO/AD, 165].

2. Classes, their Instances, and Class-based OO

The class abstraction can be explained from both main directions of data abstraction:
ADT-like classes can be made more like providing object abstractions by adding a receiver distinguished syntax as syntactic sugar. See [OOPUA] for a comparison of the ``overloaded functions model'' of ADTs with the ``record-[of-procedures]-based model'' of object abstraction.
Implementation of object-based classes usually do not follow the naive storage model. ``Instead, for space efficiency, [the methods] are factored into method suites that are shared by objects of the same class.'' [ThO. 13] When a method is called, it is looped up in the receiver's shared method suite. ``Although the description and implementation of method lookup can be quite complex, it is usually dsigned to produce the illusion that methods are, after all, embedded directly into objects, as in the naive storage model ... It is interesting to notice that some features that would distinguish between embedding and delegation [to the method suite] do not appear in class-based languages. Typical of class-based languages is the fact that methods, unlike fields, cannot be extracted from objects as functions, and cannot be updated within objects (by replacing them with other methods). ... With the embedding implementation, a method update would affect a single object. With the delegation implementation, however, it would affect all the objects of that class. '' [ThO, 14]

The class concept allows to extend each direction of data abstraction by aspects of the other one [OO/AD]:
Abstract Data Types: Object Abstraction:
  • Specialization of the state representation: The state is defined independently in a subclass, and then ``translated'' to the superclasses state representation by a mapping.
    Examples: List comprehension and recursive data values in Haskell and Gofer.
  • Intermixing of objects from different classes: Allow sorts from different implementations to be the same type or subtype and select implementation of observation operation based on type tags attatched to the abstract values (``single dispatch'').
    Examples: only subtypes and only in combination with inheritance: Ada95, KOOL in [OOPUA].
Sharing of representation among objects is achieved by making the encapsulation of the representation class-based, not object-based. Then object types are just partially abstract interfaces [OO/AD, 170].
As a compromise with object-based encapsulation, parts of the representation might be object-wide and another part might be class-wide accessable.
Example Eiffel: object-based ``export NONE'' and (sub)class-based ``export MyClass.''

3. Prototype-based OO

«The most important idea of the prototype-based approach is that concrete objects are the only mean to model applications. Thre are no classes ... Prototypes are not meant to be abstractions, as classes are, and they are not linked in any way yo other objects that would describe them, as in the class-based approach. A prototype is a self-sufficient entity with its own state and behavior, and capable of answering messages. This means that the normal way to speak about a concept in these languages is to provide a concrete example for this concept.» [PBL, 202]
Reuse is provided by a primitive cloning operation creating a new object as (shallow) copy of any old object (the prototype); (Additionally there may also be a primitive for creating objects ex nihilo) [PBL].

4. ``Classical'' OO

And finally full OOP is reached if
inheritance, also called subclassing, is provided (don't confuse this operation on abstractions with subtyping).

5. Delegation

See also
variations of the delegation mechanisms.

«Delegation was introduced [by Lieberman 1981] as a message forwarding mechanism. The basic idea of delegation is to forward messages that cannot be handled by an object to another object called its parent in Self or proxy in Act1 ... Indeed, the key-point of delegation is that the pseudo-variable "self" still points to the original receiver of the message, even if the method used to answer the message is found in one of its parent[!]» [PBL, 203]

«More than message forwarding, delegation can also be interpreted as an extension mechanism. An object B that delgates to another object A, can be viewed as an extension of A.» [PBL, 203]
In implicit delegation all non-implemented messages are delegated to declared parent objects; in explicit delegation each to-be delegated messages is named with the object handeling it [PBL].
Examples: Self, ObjectLisp, Act1, Exemplars, Garnet [PBL].
Analysis Requirements & Principles Paradigms Abstractions Domains Features foundational safety flexible typing Syntactics Defining PLs Language List
Ulf Schünemann 120698, 040701