Nominal Subtying Considered Bad: Findler, Flatt and Felleisen

March 14, 2006

I came across the following words in a paper about using casts:

“Nominal subtyping forces programmers to explicitly state all of the subtyping relationships in the program. This limits component reuse, because programmers cannot anticipate all of the contexts in which a particular class might be used. In contrast, structural subtyping implicitly allows any type with appropriate structure to be used in a given context.

Language designers choose nominal type systems because they are easy to understand and easy to implement. A programmer doesn’t need to investigate the structure of an interface I to find out whether an instance o of a class C can have type I; it suffices to check whether the definition of C mentions I as an implemented interface (or whether the superclasses and superinterfaces mention I). A compiler writer, in turn, can build a class graph and an interface graph and type check expressions and statements by comparing points in a graph.

Nominal typing, however, is also a known obstacle to software reuse. In particular, a programmer can only compose two objects if the creators of the two respective classes used the same (nominal) types. Unfortunately, in a world of software components where third-party programmers compose existing pieces of software, the implementor of a class cannot possibly anticipate all possible types for an object. Hence, programmers resort to casts and have invented adapter patterns to bridge the gap between third- party components.

One way to overcome this problem is to switch to a structural type system. The research community has long recognized this shortcoming of nominal subtype systems and that structural subtype systems do not suffer from this flaw. Some modern research languages like LOOM , OCaml, OML, PolyTOIL, and Moby adopt structural subtype systems. Their designs demonstrate how their structural subtype systems empower their user communities to reuse classes in unanticipated situations.”

This is very well stated. The authors go on to describe an implementation of a limited form of subtyping via contracts for an enriched Java.

Read the rest of their paper here: Semantic Casts, Contracts and Structural Subtyping in a NominalWorld, Robert Bruce Findler, Matthew Flatt, and Matthias Felleisen.


Deep Typing for Plain Old Lists – A new kind of LISP?

March 10, 2006

Genyris is an attempt to provide languages with dispatch on type
features and structural subtyping. The intent is to provide programmers with
a way to formally define the grammar of their list
structures, and to add ‘generic’ or polymorphic
functions on lists. The main difference between Genyris and
CLOS is that the you get dispatch on type for Plain Old Lists (POLs).

It is also designed to give you ambiguous data types.
So you can enter a list at the prompt and Genyris will
tell you what types it conforms to:

>- (types-of ‘(2134 . 986))

(complex point rational vector cons)

Here’s a familiar example:

Consider the following code fragment from Paul
Graham
‘s implementation of John McCarthy’s LISP:

(defun eval. (e a)
(cond
((atom e) (assoc. e a))
((atom (car e))
(cond
((eq (car e) ‘quote) (cadr e))
((eq (car e) ‘atom) (atom (eval. (cadr e) a)))
((eq (car e) ‘eq) (eq (eval. (cadr e) a)
(eval. (caddr e) a)))

To rewrite this using Genyris, first we need to define some list types:

(define-type atom symbol)

(define-type quote-form ‘quote thing) ; e.g. ‘(quote fred …)
(define-type atom-form ‘atom thing) ; e.g. ‘(atom z)
(define-type eq-form ‘eq thing) ; e.g. ‘(eq x y)

Then we write individual generic functions for eval.
which apply to the identified types:

(define-generic g-eval. ((atom e) (alist a)) (assoc. e a))

(define-generic g-eval. ((atom-form e) (alist a)) (atom (eval. (cadr e) a)))

(define-generic g-eval. ((quote-form e) (alist a)) (cadr e))

(define-generic g-eval. ((eq-form e) (alist a))
(eq (g-eval. (cadr e) a) (g-eval. (caddr e) a)))

With source code in this form we can add new evaluation rules to
g-eval. without needing to modify existing code. Whereas in the
traditional eval function there is a single cond expression which
must be maintained. We can also over-ride an existing definition
with a more specific data type. The dispatcher will always call the
most specific available function. For example we could add integers
to g-eval:

(define-generic g-eval. ((fixum e) a) e)

There’s more features than I can put in this post. An essay describing
the approach is available here.
Of course I will be grateful to anyone who reads it. A couple of Lisp people have reviewed it here in Australia (thanks guys) so it isn’t _too_ ordinary. I’m reasonably
confident that someone will point out a way to do this in CLOS or
Scheme. I looked but couldn’t see it.


We need structural subtyping NOW!

March 9, 2006

I’m a developer working in industry. From many years in the business I can tell you that perhaps 30% of application code is devoted to interpreting, validating input data and outputting it to the next program. Especially now we have middleware messaging and a proliferation of layering.

The average programming language (e.g. Java) has zero ability to automatically classify input data so we are left to churn out code which looks at data and figures out what class it belongs to. Add to this is the long list of programming languages used per system (JavaScript, HTML, XML, XSLT, Java, JSF, SQL) and the large number of class libraries which require interfacing glue code to hand data between them. In a typical J2EE application information passes through tens if not hundreds of forms as it traverses the system from user to database. At each hop it must pass from one encapsulated form to another.

Nominal typing systems are holding us back.

If programming languages could allow us to simply declare the data structures and associate classifications we could eliminate billions of lines of code. Preferrably the same declarations would work in all the different languages. Nirvana would be a VM which has a (read) method which automatically classifies the data input.

Structural Subtyping can help because it allows programs to operate on unfamiliar data without adapters. There are a handful of languages past and present which attempt to use it. Xen/Cowmega, Mozzie, MotionTypes. Apparently LOOM, Modular-3, OCaml, Brew and others can do it also but I have yet to explore these languages. And I am not the only one looking for this.
The challenge is to build this into languages we use every day.