Type Systems -- The Safety Features of Programming Languages

Analysis Requirements & Principles architecture:
Paradigms
substance:
Abstractions
structure:
Domains
building blocks: Features surface:
Syntactics
Defining PLs Language
List
foundational safety flexibilized

  1. Typed Languages?
  2. Typing: Implicit vs. Explicit
  3. Typing Features for Various Safety Aspects
See also special pages -> static & dynamic types
safety from execution errors

 
This page looks at type systems from the perspective of typing expressions to ensure execution properties, not on how to define & construct type terms and the values of a type. Adding only this aspect of a type system to a language does not allow to define new computations; in the contrary, it serves to exclude some dubious computations.

If you know further references, please don't hesitate to tell me (Ulf Schünemann), likewise if you have any suggestions.

[RLM] W M Waite: Relationship of Languages to Machines; In: Compiler Construction (LNCS 21); 1974.
[TySys] Luca Cardelli: Type Systems; Handbook of Computer Science and Engineering (Chapter 103); CRC Press 1997.
[TiC] Xavier Leroy: Introduction; In: Types in Compilation '98 (LNCS 1473)


Typed Languages?

Nota bene that "type" may mean different things.
a. Types of values - the mode view of "type" b. Types of variables - the range-of-variable view of "type"

1. Typed vs. Typeless/Untyped Languages

"Typed" or not depends on how the interpretation of data representations is determined:
«When an operator is applied to data objects, the interpretation of their values could be determined either by the operator or by the data objects themselves» [RLM, 172].
«Languages in which the interpretation of a data object is determined by the operator applied to it are called typeless languages; those in which the interpretation is determined by the data object itself are called typed languages. The attribute of a data object which specifies its interpretation in a typed language is called its mode. If the mode of a particular object can be deduced solely by examination of the program text, it is termed a manifest mode. Latent modes, on the other hand, cannot be deduced until the program is actually executed. An object whose mode is latent must therefore carry an explicit mode indication during execution» [RLM, 174].
"Typed" or not depends on whether the variables are restricted to contain data objects (values) only of a certain type.
«A program variable can assume a range of values during the execution of a program. An upper bound of such a range is called a type of the variable. ... Languages where variables can be given (nontrivial) types are called typed languages. Languages that do not restrict the range of variables are called untyped languages: they do not have types or, equivalently, have a single universal type that contains all values» [TySys].

2. Weakly Typed vs. Strongly Typed Languages

"Strongly typed" or not has to do with whether it is checked (by the "type system" aspect of the language) that the data objects do indeed determine their interpretation for all operations in the program. The case where a data object does not determine an interpretation is that of a data-operation combination which ``does not make sense'' [execution error kind #1 (untrapped)]
«In a strongly typed language, the language implementation [compiler, interpreter] is required to check the types of operands in order to ensure that nonsensical operations, like diving the integer 5 by the string "hello", are not performed» [TyOOL].
"Strongly typed" or not has to do with whether it is checked that a variable cannot contain values not belonging to the range (type) determined for it [execution error kind #2 (untrapped)]

Type System

«A type system is that component of a typed language that keeps track of variables and, in general, of the types of all expression in a program. Type systems are used to determine whether programs are well behaved ...» [TySys].

Static vs. Dynamic Typing

has to do with when types are checked:
See also special page -> static & dynamic types

Implicit vs. Explicit Typing - Type Annotations (redundant info)

«A language is typed by virtue of the existence of a type system for it, whether or not types actually appear in the syntax of programs. Typed languages are explicitly typed if types are part of the syntax, and implicitly typed otherwise. No mainstream language is purely implicitly typed, but languages such as ML and Haskell support writing large program fragments where type information is omitted; the type systems of those languages automatically assign types to such program fragments» [TySys].

Type annotation is-a expression of invariants (make explicit) => enables checking
«Programmers know strong invariants about their programs, and it is widely accepted by practitioners that such invariants should be automatically, statically checked to the extent possible [Mag93]. However, except for static type systems, modern programming languages provide little or no support for expressing such invariants. In our view, the problem is not a lack of proposals for expressing invariants; the research community, and especially the verification community, has proposed many mechanisms for specifying and proving properties of programs. Rather, the problem lies in gaining widespread acceptance in practice. A central issue is what sort of invariants programmers would be willing to write down.
In this paper we consciously seek a conservative framework that minimizes the unfamiliar machinery programmers must learn while still allowing interesting program invariants to be expressed and checked. One kind of programming annotation that is widely used is a type qualifier. Type qualifiers are easy to understand, yet they can express strong invariants. The type system guarantees that in every program execution the semantic properties captured by the qualifier annotations are maintained» [TTQ].

For a further discussion on implicit vs. explicit see the following section.


Typing: Implicit vs. Explicit

Feature Description Argument
type-equivalence:
a. structural (implicit)
b. structural with optional brands (Strongtalk, Modula-3)
c. by-name (explicit/intensional)
d. behavioral (by specification)
e. type constructs with explicit + with structural subtyping

Pg. 110 in: Mitchell Wand: Type Inference for Objects with Instance Variables and Inheritance; In [TAOOP].

a. «the type of an object is a signature» [TAOOP]
b. «Brands are merely tags added to types to distinguish them from one another» [Strongtalk]
c. «each class constitutes a type» [TAOOP]
d. In literature on ADTs «a class is seen as some data ... along with operations ...» and a representation invariant. «the intensional type of a class consists of the ... structural type ... plus the invariants that are intended to hold» [TAOOP]
e. Java + structural tupletypes
a. «A disadvantage of structural typing is the risk of secmantically incompatible objects sharing a syntactic type, and not being distinguished by the type system» [Strongtalk]
b. ...
c. «Name equivalence may be regarded as a fail-safe approximation to typing using explicit invariants» [TAOOP]
d. users make assumptions about behavior, not just signature
subtyping:
a. based on signature (structural/implicit subtyping)
b. based on inheritance (explicit)
c. based on behavior (specification)
d. explicit subtypes + explicit supertypes (halotypes)

Garry T Leavens, William E Weihl: Reasoning about Object-Oriented Programs that use Subtypes; 212-223: OOPSLA/ECOOP'90.
[FCCA] Michael C Feathers: Factoring Class Capabilities with Adjectives; JOOP Mar/Apr 1999.

d. halotypes: interface H = { A, B } is a structural supertype of all types conforming to A and B: X < A & X < B => X < H

  • Pro d. It seems necessary to design a type with its subtypes in mind to prepare for save (i.e. behavioral) subtypes. «Since [behavioral] subtyping imposes strong conditions on the behavior of the types involved, it seems necessary to design subtypes[!] with subtyping in mind. Such strong conditions also seem necessary for the soundness of modular program verification, so that one can reason about subtypes implicitly. Hence, we suggest that subtype relationships should be declared, rather than inferred on the basis of structural information such as signatures ... or sublcass (inheritance) relationships among implementations» [ROOPS]
  • Pro d. Decoupling: classes M can be written with explicit implements list X,Y,Z,A,B,C,D,E,F, and do not need to anticipate (by writing this as X,Y,Z,H,D,E,F) that somewhere else there are functions for all objects (not only M objects) with the A+B+C combination of types.
  • implicit conversions

    Pg. 25 in: John C Reynolds: Using Category Theory to Design Implicit Conversions and Generic Operators; In [TAOOP] (first 1980).

    E.g. integer to real to complex ``A failure to provide implicit conversions can degrate the conciseness and readability of a language. OTOH, unless great care is taken in the design ... the resulting language can exhibit anonalies that will be a rich source of programming errors.''
    type inference

    in ML

    ... (-) ``[A]t best type inference is little more than a convenience when writing a program, and at worst a program for which many types are infered can be harder to read, since type annotations clarify the source text. Programs are more often read than written.'' [Strongtalk]
    (-) Can be difficult to find the reason for a type error from the error message my personal experience with Standard ML of New Jersey (0.93) in a 3600loc project

    Typing Features for Various Safety Aspects

    See also: Datatypes, and subrange embedding vs. enumeration/two-valued datatype

    Special forms of types:
    Feature Description Argument Reference
    type qualifiers «We show that the handling of type qualifiers can be separated from the standard type system of the language. That is, while the augmented type system includes rules for manipulating and checking qualifiers, in fact the computation of qualifiers can be isolated in a separate phase after standard typechecking has been performed.»
    «We introduce a natural notion of qualifier polymorphism that allows types to be polymorphic in their qualifiers.» «The technical observation behind our framework is that a type qualifier q introduces a simple form of subtyping: For all types T, either T < q T or q T < T.»
    «Type qualifiers are easy to understand, yet they can express strong invariants.» [TTQ] Jeffrey S Foster, Manuel F&aauml;hndrich, Alexander Aiken: A Theory of Type Qualifiers; PLDI'99.
    communication type types processes by the messages received and sent Allows static checking of communication between process variables Chul-Doo Jung, Silbert: Communication Types; SIGPLAN Notices 25/9, 1990.
    descriptive
    object types
    expresses object protocol
    coordinates clients
    guarantees implementation and proper use of protocol F Puntigam: Coordination Requirements Expressed in Types for Active Objects; In: ECOOP'97; LNCS 1241; Springer, 1997.
    F Puntigam: Types that Reflect Changes of Object Usability; In: Joint Modular Languages Conference JMLC'97; LNCS 1204; Springer, March 1997.
    regular
    types
    expresses object protocol guarantees implementation of protocol [RegTy] O Nierstrasz: Regular Types for Active Objects; In: Object-Oriented Software Composition; Prentice Hall, 1995. This is a revised and corrected version of paper published in OOPSLA'93; ACM, Oct 1993.
    (noname) expresses object lifecycle by a state machine Class hierarchy to be based on semantics (behavior) H Lecouche: Extending the Object-Oriented Languages Type Notion In: Object Currents 1/2; SIGS, 1996.

    Additional attributes for existing types:
    Feature Description Argument Reference
    resource and effect annotations (for function types). [for intermideate languages] Resource: e.g. whether a function's activation record can be on the stack or needs to be on the heap. Effect: e.g. whether the application of the function may result in the capture of a continuation. Allows compilation of intermideate code mixed from languages with different parameter-passing styles and closure/continuation capabilities (e.g. ML with C) with less runtime overhead. Z Shao, V Trifonov: Type-Directed Continuation Allocation; In [TiC].

    Alias control:
    Feature Description Argument Reference
    nonaliased pointers prohibits alias of pointer-arguments (+) allow optimization (like in Fortran)
    (-) Dennis Ritchie's 1988 argument against noalias (& const)
    Two proposals to ISO C: noalias
    restricted
    limited types
    (in Ada)
    limited types are not assignable prevents inadvertently losing file descriptors and tasks see [UseOnce] Ada's limited (non-assignable) types
    linear objects
    and types
    Linear objects are ``read-once.'' There exists only one possible access-path to one at a time. Linear types provide an additional dimension to the programmer's type system--whether the object is shared. prevent sharing [UseOnce] H G Baker: 'Use-Once' Variables and Linear Objects ...; SIGPLAN Notices Jan 1995.
    H G Baker: Linear Logic and Permutation Stacks ...
    unique pointers, consumable parameters can only be moved not copied improves encapsulation, reasoning, automatic memory management N H Minsky: Towards Alias-Free Pointers; ECOOP/OOPSLA'90.
    free/unique pointers and bridges to islands only destructive read of free/unique pointers improves encapsulation, localizes reasoning J Hogg: Islands: Aliasing Protection In Object-Oriented Languages; OOPSLA'91.
    balloon types prohibits alias of part objects control side-effects of sharing P S Almeida: Balloon Types: Controlling Sharing of State in Data Types; ECOOP'97.

    Operations on types:
    Feature Description Argument Reference
    reduction reductions are the inverse of extensions leading to supertypes
  • to convert old values to our new type
  • supports explicit abstraction from old type
  • (in the Alberich PL) Paakki, Karhinen, Silander: Orthogonal Extensions and Reductions; SIGPLAN Notices 25/7; 1990.


    Analysis Requirements & Principles Paradigms Abstractions Domains Features foundational safety flexible typing Syntactics Defining PLs Language List
    Ulf Schünemann 191100