Type Inference and Subtyping for Higher-Order Generative Communication

Models for generative communication use a common space, where concurrent agents put or retrieve data; data extraction is controlled by various kinds of filtering techniques, usually based on the structure of the data itself. However, for higher-order communication, i.e. when the data consists of software fragments, such techniques are not appropriate: structural information does not reflect the behaviour of a software fragment. This paper studies a filtering mechanism based instead on type information, so that software components can be retrieved according to type specifications. A rich type system with polymorphism and subtyping gives enough flexibility to formulate type specifications at various levels of detail.


Introduction
Coordination models were recently proposed as a new approach to concurrency and distribution, based on the key notions of shared dataspace and generative communication. A shared dataspace is a common repository for data, which multiple agents can use for communication across space and time. Generative communication is a form of data exchange which is no longer based on names or channels, as in traditional concurrent systems, but on the data itself, together with some filtering mechanism to select data. Typical examples are the Linda model [12], which uses pattern matching, and Gamma [5], which uses so-called reaction conditions, i.e. predicates over the data. The kind of data exchanged in such shared spaces is rarely defined explicitly, but is usually assumed to be first-order (atomic values, or structured values built from some predefined constructors). Here we consider some aspects of moving towards higher-order coordination models, i.e. models in which agents can also exchange software components. The potential benefits are increased flexibility and dynamism: support for reflexivity, mobile agents, and dynamic construction of agents. This seems to have been acknowledged by the market, as a number of products for sending executable code across a network have recently been released [25,16]. However, such products rely on traditional message passing models, while we are interested in higher-order data combined with generative communication.
We distinguish two main issues involved in exchanging higher-order data between agents. The first is to manage the links between higher-order software components and the agents which create or load them. Components moving from one agent to another may need to keep some links to the first agent, but also to establish links with the new agent. Since they typically have no precise knowledge of this new agent, there is no obvious way to start interacting with it, in other words to "bootstrap" the communication. A fixed protocol based on traditional parameter passing mechanisms could of course be envisaged, but it would drastically limit the system's ability to evolve: all agents would have to obey the same protocol, supplying the same resources in the same order. Fortunately, message-passing mechanisms found in actor or object-oriented systems provide one well-known solution to this kind of problem: a newly loaded software component can send messages to its environment in order to import or export resources. Messages may be interpreted differently at different sites, but they constitute a primary language on which more complex communication protocols can be bootstrapped. Message passing is not necessarily the sole solution, however: more generally, what is needed is a global framework supporting both static and dynamic binding operations, in which various forms of communication protocols between components and agents can be embedded.
The second issue is to adapt generative communication to higher-order data. Usual filtering mechanisms such as pattern matching or reaction conditions, which are based on the structure of the data, are not sufficient for describing software components: the syntactic structure of a software fragment is not very informative about its functionality. As a consequence, filtering mechanisms based on structural information would often fail to identify the appropriate software components. Filtering based on formal semantic descriptions would be an ideal solution, but since comparing such descriptions is most often undecidable, this choice would be impractical. An intermediate solution is provided by type information: the type of a software component tells something about its functionality, and it can be decidably compared against other type specifications. Of course this is only valid if the type system is flexible and informative enough, and is able to cope with higher-order software components. The type systems of modern functional languages such as ML [19] or Haskell [15] are equipped with such capabilities; in particular, they support parametric polymorphism, i.e. multiple applications of the same function to parameters of different types, which is a key factor for flexible assembly of higher-order software components. Furthermore, types in these languages are automatically inferred by the compiler, instead of requiring explicit specifications from the programmer: as a result, very expressive type specifications can be used without imposing any additional burden on the programming task.
In this paper, we study the feasibility of a language for the exchange of higher-order data, combining dynamic binding features and type-based generative communication. Dynamic binding comes from $\lambda N$, a $\lambda$-calculus extended with the notion of name-based interaction. As in the standard $\lambda$-calculus, $\lambda N$ functions are data, which provides immediate support for higher-order operations.
Generative communication comes from a variant of the Linda primitives in, read and out, where the usual Linda "templates" are replaced by type specifications. A type inference system controls the interaction between these language features. The resulting language shows the feasibility of a filtering mechanism based on types for higher-order software components. Here we only consider filtering based on types, which is the most novel (and probably most difficult) aspect; since first-order Linda templates, which will still be needed for ordinary first-order data, are missing, some of our examples may seem rather academic. It should be clear, however, that the present work is part of a research effort in which the two aspects will later be combined. This more general direction, as well as the specific design choices and technical issues considered in this work, are discussed in further detail in the rest of this introduction.

Coordination Models
Under the denomination coordination models we understand a family of models for concurrent and/or distributed computation, which use a common data repository as communication medium, and use some data selection mechanisms as communication primitives. We will briefly review some of these models, in order to provide a global view and to justify our choice.
The earliest coordination model is Linda [12]. The dataspace in Linda is a persistent multiset of tuples. Concurrent agents, possibly written in different languages, exchange tuples through the dataspace. There is one output primitive out, for writing a persistent tuple, and two input primitives in and rd, for reading values from the tuple space; the former is a destructive read (i.e. the corresponding value is removed from the tuple space), while the latter is non-destructive. Communication is based on pattern matching, so for example a request like in(c, x, ?y) issues a template composed of two actual parameters c and x, and one formal parameter y; it looks for triples in which the first component is the constant c, the second component is equal to the current value of variable x, and the third component is any value; variable y will be bound to that value. In addition, Linda is equipped with an eval primitive for generating "active tuples", i.e. forking off a new concurrent agent which eventually evaluates to a tuple; however, this feature is not essential for the current discussion, and will not be considered further.
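The matching behaviour of in and rd described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the names `TupleSpace`, `Formal`, `rd` and `in_` are hypothetical, the space is a plain in-memory list, and a failed lookup returns `None` where a real Linda agent would block.

```python
class Formal:
    """A formal parameter such as ?y: matches any value and binds it to `name`."""
    def __init__(self, name):
        self.name = name


class TupleSpace:
    def __init__(self):
        self.tuples = []  # the multiset of tuples, kept as a list

    def out(self, tup):
        """Write a persistent tuple into the space."""
        self.tuples.append(tup)

    def _match(self, template, tup):
        """Return the bindings of formals if `tup` matches `template`, else None."""
        if len(template) != len(tup):
            return None
        bindings = {}
        for pattern, value in zip(template, tup):
            if isinstance(pattern, Formal):
                bindings[pattern.name] = value   # formal: accept any value
            elif pattern != value:
                return None                      # actual: must be equal
        return bindings

    def rd(self, template):
        """Non-destructive read: the matching tuple stays in the space."""
        for tup in self.tuples:
            bindings = self._match(template, tup)
            if bindings is not None:
                return bindings
        return None  # a real Linda agent would block here

    def in_(self, template):
        """Destructive read: the matching tuple is removed from the space."""
        for i, tup in enumerate(self.tuples):
            bindings = self._match(template, tup)
            if bindings is not None:
                del self.tuples[i]
                return bindings
        return None  # a real Linda agent would block here


space = TupleSpace()
space.out(("c", 1, "hello"))
x = 1
print(space.rd(("c", x, Formal("y"))))   # tuple is still there afterwards
print(space.in_(("c", x, Formal("y"))))  # tuple is removed
print(space.in_(("c", x, Formal("y"))))  # nothing left to match
```

The template `("c", x, Formal("y"))` plays the role of in(c, x, ?y): two actual parameters that must be equal to the stored values, and one formal parameter that receives whatever value sits in the third position.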
Gamma [5] is a quite different coordination model. There, no structure is imposed on the data: the dataspace is merely a multiset of any kind of values; however, it is generally implicitly assumed that such values are homogeneous. The primitive mechanism for communication in Gamma is a reaction condition, which consists of a list of variables, a predicate over these variables, and an output action. If a subset of values for which the predicate is true can be found in the multiset, then these values are taken from the multiset, and are replaced by the output action; all this is performed in one single, atomic step. The output action may consist of duplicating data, deleting data, or "transmuting" it (performing some functional transformation). Reaction conditions are active as long as there are subsets of values in the multiset which satisfy the predicate; hence, the reaction rules are implicitly looping. An essential aspect of Gamma is the assumption that global termination is observable, i.e. that it is possible to detect a global state of the multiset where no reaction rules are applicable. At this point, the current active rules are cancelled, and may be sequentially replaced by another set of rules.
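As a rough illustration of a single reaction rule looping until global termination, here is a minimal sequential Python sketch. The function name `gamma` and its parameters are hypothetical, and the real model applies reactions concurrently and atomically; this sketch simply tries candidate tuples one at a time.

```python
from itertools import permutations


def gamma(multiset, arity, condition, action):
    """Apply one reaction rule until no `arity` values satisfy `condition`.

    `condition` is the predicate over the selected values; `action` returns
    the list of values that replaces them in the multiset.
    """
    pool = list(multiset)
    while True:
        for idx in permutations(range(len(pool)), arity):
            values = [pool[i] for i in idx]
            if condition(*values):
                # Remove the selected values and add the products in one step.
                pool = [v for i, v in enumerate(pool) if i not in idx]
                pool.extend(action(*values))
                break
        else:
            # No applicable reaction: global termination is observed.
            return pool


# Keep the larger of any two values until a single maximum remains.
print(gamma([3, 1, 4, 1, 5], 2, lambda x, y: x >= y, lambda x, y: [x]))
```

The `for ... else` branch is where termination is detected: it runs only when the loop found no tuple of values satisfying the predicate.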
Finally, LO [4] is a model based on broadcasting, rather than on a shared dataspace. Each agent has its own multiset of resources, and data output by an agent is broadcast to all other living agents (excluding the sender); each receiver gets a separate copy of the resource, which it can use independently. Communication is controlled by the names of logical predicates exchanged between agents.
The Linda model was chosen for our study of higher-order generative communication, because it appeared to be the most appropriate for controlling the exchange of software components: Linda is language-independent, but has been designed in the spirit of traditional sequential languages; communication control is distributed amongst the participating agents, through simple IO primitives; finally, the pattern-matching mechanism can be easily replaced by another one more suited to higher-order data, without changing the basic model. By contrast, Gamma and LO have a centralised, rule-based mechanism for communication control, which is not suitable for decomposition into separate software components.

Higher-order coordination and dynamic binding
To avoid any confusion, we should first state that we use the term "higher-order" to denote systems in which program fragments are data; this corresponds to the usual understanding of the term in functional languages or process calculi. By contrast, the recent work on "Higher-Order Gamma" [18] proposes a system in which passive or active multisets are data: this introduces powerful mechanisms for structuring the dataspace, but does not correspond to the definition above, because multisets do not contain Gamma programs. The main paradigms supporting higher-order programming are i) functional programming, in which functions are data, and can be passed to other functions, ii) some object-oriented languages such as Smalltalk or Self, in which fragments of code are encapsulated as "block objects", which can then receive messages or be passed as arguments to other objects, and iii) symbolic programming (languages of the LISP family), in which some mechanisms are available to treat a sequence of instructions as data or, conversely, to interpret data in a list as a sequence of instructions.
The idea of exchanging software components among active agents is being actively pursued in various contexts, ranging from remote control/maintenance to distributed artificial intelligence. All face a common problem, which is to handle the relationship between mobile components and their surrounding environments. In order to be able to send a software component to another agent, this component must be self-contained; however, it is often desirable that the component not only keeps links to the agent which created it, but is also able to dynamically establish links with the agent to which it is sent. These are contradictory requirements, which can hardly be satisfied all at the same time. Therefore existing solutions typically choose to emphasize one aspect over the others, yielding very different systems. Probably the most illustrative examples of such differences are the following:
In PostScript [1], programs sent by a computer to a remote printer or window system keep no link to their originating site. On the other hand, they can very flexibly establish links with their receiving site through the notion of dictionaries: all free names occurring in the PostScript program are interpreted locally, in the dictionaries of the receiving site. This is convenient for various forms of context change: a program can either carry with it some resources (like fonts for example), or it can expect to find such resources at a given name in the environment in which it will be executed.
The Obliq language [7] is lexically scoped, which means that all free identifiers in a higher-order component sent over the network remain bound to their originating site. Therefore migration is transparent: the meaning of a computation does not change when moving to another site. On the other hand, it makes it more difficult to dynamically link with a new environment. For example, a moving component may wish to print on its local site, not on the printer of its originating site; but it cannot gain access to the local printer unless this is specified in a pre-established protocol.
The difference between the two approaches is essentially the difference between dynamic and static binding. In the dynamic camp, we may also cite LISP's quoting mechanism, or the dynamic interpretation of messages in object-oriented systems. In the static camp, there is the example of "blocks" in Smalltalk or in Self, which may be passed between objects but still remain bound to the environment of their originating site. Some languages combine both kinds of mechanisms: we just mentioned some static binding aspects in Smalltalk, Self, and Obliq, but in fact all of them also have message passing as a dynamic binding mechanism.
In the context of coordination and generative communication for higher-order components, dynamic binding is an essential feature. To obtain interesting interactions between agents, the exchanged entities should be able to dynamically quit an agent and make connections to a new one. Unfortunately, there are very few theoretical tools to study dynamic binding at an abstract level. Advanced theories have been developed for object-oriented systems (see the collection edited by Gunter and Mitchell [14]), but most often the semantics of dynamic binding is still treated at a rather concrete level, mimicking the implementation strategies for method lookup found in object-oriented programming languages. We will go to a more abstract level through $\lambda N$ (lambda-calculus with Names), a compact extension of the usual $\lambda$-calculus proposed in the author's PhD thesis [8]. The main idea of this extension is that variables can simultaneously carry several values: each value is independently accessible at a specific name (label, channel). Another way of understanding this is that each function is an abstraction over an environment (a partial mapping from names to values). Conversely, when using a function, one has to build such an environment, i.e. specify the names at which parameters are passed to the function. Environments can be built incrementally, in a modular fashion. This provides basic support for name-based interaction, and hence for records, objects, first-class environments, and of course dynamic binding.

Type information for controlling communication
Types are usually rather weak specifications, so it may seem questionable whether type information is expressive enough for controlling communication. For example, if the functional type $String \to Int$ is the only known information about a component in the shared dataspace, there is hardly any sensible way in which an agent may make use of it. However, if type information is more accurate, in a more expressive type system, then the approach becomes feasible. Consider for example record types of shape $\{l_1 : T_1, \ldots, l_n : T_n\}$, where $l_1 \ldots l_n$ are field names (labels): such types can express useful filtering constraints. For example, an agent may restrict its attention to records providing information at name $l_i$, using this name as a single communication channel. Moreover, type variables in polymorphic types carry additional information about some internal dependencies: for example the type $\{fun : X \to Y, arg : X\}$ can be used to filter records in which field fun contains a function, and field arg contains an argument compatible with that function, because type variable $X$ expresses the dependency between the two fields. Similarly, the type system of $\lambda N$ is based on arrow types (functional types) of the form $(l_1 : T_1, \ldots, l_n : T_n) \to T$, in which the right-hand side is a usual type, while the left-hand side is a finite mapping from names to types. This means that the names appearing in the type of a component can be used for selecting that component in a shared dataspace.
The idea of using types for retrieving software components was recently advocated in the context of software library management [9,28]. There the problem is simpler than with coordination models, since the comparison between a given type specification and the candidate components from a library is totally static. Yet the technical problems are non-trivial: a given functionality may be encoded in several different ways, yielding different types, so in order not to miss relevant software components it is necessary to consider types modulo an isomorphism relationship (two types $T$ and $U$ are isomorphic iff there are two conversion functions $f : T \to U$ and $g : U \to T$ such that $\forall x \in T,\ g(f\,x) = x$ and $\forall y \in U,\ f(g\,y) = y$). Di Cosmo [9] goes through deep theoretical studies to derive isomorphisms in various families of type systems. An implementation of these results for the ML language gives encouraging feedback; however, it is noted that to make the approach fully practical, much further work is required. In particular, for a given type request it may be useful to consider not only the types which are isomorphic to it, but also the types which are less general or more general in the instantiation ordering. To illustrate this point with a trivial example, a request for a function of type $Int \to Int$ may be satisfied by the identity function of type $\forall X.X \to X$, which can be instantiated to the particular type of the request. The current state of the art, as pictured in Di Cosmo's thesis, does not yet support this kind of type comparison.
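The instantiation ordering mentioned above can be made concrete with a small one-way matching procedure. The Python sketch below is an assumption-laden illustration, not the mechanism studied by Di Cosmo or in this paper: types are encoded as hypothetical tuples (`("var", name)`, `("con", name)`, `("fun", domain, codomain)`), and every variable of the more general type is taken to be universally quantified.

```python
def match(general, specific, subst=None):
    """Return a substitution making `general` equal to `specific`, or None.

    Only variables of `general` may be instantiated (one-way matching).
    """
    subst = {} if subst is None else subst
    if general[0] == "var":
        # A variable must be instantiated consistently at every occurrence.
        bound = subst.setdefault(general[1], specific)
        return subst if bound == specific else None
    if general[0] != specific[0]:
        return None
    if general[0] == "con":
        return subst if general[1] == specific[1] else None
    # Function type: match the domain, then the codomain, sharing `subst`.
    subst = match(general[1], specific[1], subst)
    return None if subst is None else match(general[2], specific[2], subst)


INT = ("con", "Int")
BOOL = ("con", "Bool")
identity = ("fun", ("var", "X"), ("var", "X"))   # forall X. X -> X

print(match(identity, ("fun", INT, INT)))    # X can be instantiated to Int
print(match(identity, ("fun", INT, BOOL)))   # no consistent instantiation
print(match(("fun", INT, INT), identity))    # Int -> Int is not more general
```

A request for $Int \to Int$ is thus satisfied by the identity function, while the converse comparison fails, which is the asymmetry the instantiation ordering captures.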
When similar ideas are applied to generative communication, the difficulties become even more acute. First, there is a problem in managing partial type information, while Di Cosmo's work always considers full type specifications. In a Linda variant based on types, the type request generated by an in or read statement should ideally be inferred from local information; however, type inference is typically context dependent. For example an expression like $\lambda f.in(x).f\,x$ receives a function $f$ at the abstraction, then grabs a suitable argument $x$ for $f$ in the dataspace, applies the function to the argument and returns the result. The corresponding type is $\forall X, Y.(X \to Y) \to Y$, but then the value fetched by the in statement should be of type $X$.
The problem is that type $X$ cannot be determined statically without knowing at which type the argument $f$ will be instantiated. This means that the types involved in in or read statements either depend on the whole program, or should be restricted to closed types, in which all type variables are quantified. Enforcing that kind of constraint without restricting programming flexibility too much is a delicate language design issue.
A second difficulty is to also handle subtyping, which is closely related to the question of naming discussed above. With subtyping, various levels of partial specifications can be formulated in type requests. For example, an agent interested in sending message $l$ will typically ignore all other possible messages understood by the corresponding object. In other words, the type specification generated by the agent will only mention message $l$. Obviously, the search for corresponding objects in the dataspace should match not only objects having exactly the same type, but also objects having compatible subtypes (i.e. understanding more messages). We do not know of any work in the spirit of Di Cosmo [9] which incorporates subtyping. Even worse, parametric polymorphism and subtyping are not easily merged in a type inference system: despite a number of research efforts, surveyed by Fisher and Mitchell [11], subtyping has not yet been integrated into languages based on type inference, such as ML or Haskell.
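For orientation, the notion of a subtype "understanding more messages" is usually written as the textbook width rule for record types, shown below. This is the standard formulation and is given only as a reminder; it is an assumption that it reflects the intuition used here, and it is not the inference system developed in this paper.

```latex
\[
\frac{m \ge n}
     {\{\, l_1 : T_1, \ldots, l_m : T_m \,\} \;\le\; \{\, l_1 : T_1, \ldots, l_n : T_n \,\}}
\]
```

Read from right to left, a request mentioning only the labels $l_1, \ldots, l_n$ is satisfied by any component whose type offers those labels and possibly more.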

Contributions
The present paper starts with a type inference system for $\lambda N$ addressing the problem just mentioned, i.e. integrating subtyping with parametric polymorphism. The algorithm is proved to be sound and to infer principal types; it is an adaptation of Thatte's algorithm for so-called partial types [26], i.e. it is based on sets of recursive type constraints. A similar approach, but based on different sources, was taken recently by the Hopkins group [10] to infer types for an object calculus, and is currently being investigated by Rémy for records [24]: this seems to indicate that recursive type constraints are likely to become an important tool for subtyping in the near future. Then we extend $\lambda N$ with Linda generative communication primitives, and use the type inference algorithm as a mechanism for communication control. The resulting system deals with subtyping, but does not deal with type isomorphisms. However, it should be noted that some of the main isomorphisms of Di Cosmo are not always relevant in a context which supports name-based interaction: for example the isomorphism which allows permutation of the first two arguments of a function is not necessary if the two arguments are supplied under different names at the same abstraction level, because the corresponding $\lambda N$ types are equivalent:
$$(l_1 : T,\ l_2 : U) \to V \;\equiv\; (l_2 : U,\ l_1 : T) \to V$$
A particular aspect of the present work is that type inference is used not only statically, but also dynamically to control the evolution of a system of agents. A set of type constraints is associated with each active agent, and is dynamically updated as communication events occur. A higher-order component is input by an agent only if this dynamic type-checking phase can ensure that no type error would be introduced.
The paper is organised as follows. The next section presents the untyped $\lambda N$ calculus. Section 3 discusses the addition of untyped Linda primitives to $\lambda N$; despite the absence of a mechanism for communication control, it allows us to already illustrate the approach with some examples, and to investigate various design issues. We then proceed with the type inference algorithm for $\lambda N$. Finally, the results are brought together in a system with typed, higher-order generative communication.

Background: the untyped $\lambda N$ calculus
The $\lambda N$ calculus is a minimal extension of the usual lambda calculus, in which functions can receive several parameters at different names. The presentation below is fairly condensed and assumes knowledge of the traditional $\lambda$-calculus.

Syntax and reduction rules
The calculus is constructed from a set $V$ of variables and a set $N$ of names (or labels). Letters $x, y, z$ range over $V$; letter $l$ ranges over $N$; letters $a, b, c, \ldots$ range over the set $\Lambda_N$ of terms, which are built from the following abstract syntax:
$$a ::= x_l \mid \lambda x.a \mid a(l = b) \mid a! \mid err$$
As mentioned in the introduction, variables carry several values at different names, so an expression of the form $x_l$ corresponds to the value carried by variable $x$ at name $l$. Lambda abstractions are exactly like in the standard lambda calculus, and the notions of free and bound variables are also the same (see Barendregt [6]). We write $FV(a)$ for the set of free variables occurring in $a$, and $FN(a, x)$ for the set of names which index free occurrences of $x$ in $a$; so if $x_l$ occurs free in $a$ then $x \in FV(a)$ and $l \in FN(a, x)$. A term is closed iff it has no free variables, and the set of closed terms is denoted by $\Lambda^0_N$. Usual application is split into two different parts: an expression of the form $a(l = b)$ (called a bind expression) passes value $b$ under name $l$ to abstraction $a$; an expression of the form $a!$ (called a close expression) ends a sequence of bind expressions. Finally, $err$ is a constant representing run-time errors, i.e. the well-known "message not understood" error of object-oriented systems. Errors are generated when trying to access a variable under a name for which that variable has no value (because there was no corresponding bind expression on the same name). Usual syntactic conventions apply, i.e. abstractions extend to the right as far as possible, and multiple abstractions of the form $\lambda x_1. \ldots .\lambda x_n.a$ are abbreviated as $\lambda x_1 \ldots x_n.a$.
$\lambda x.err \to_N err$ (lambda-err)
Figure 1: Reduction Rules
The capture-avoiding substitution of $b$ for all occurrences of $x_l$ in $a$ is written $a[x_l := b]$. Similarly, $a[x := b]$ denotes the substitution of $b$ for all occurrences of variable $x$ in $a$, whatever their label index may be. Avoidance of variable capture is handled as in the standard lambda calculus [6], by considering equivalence classes of $\lambda N$ terms under $\alpha$-substitution (renaming of bound variables).
One-step reduction, written $\to_N$, is defined by the rules in Figure 1. The usual $\beta$-reduction rule of standard $\lambda$-calculus is split into bind-reduction and close-reduction rules, with three additional rules for propagation of run-time errors. Notice how the lambda-bind rule performs a substitution without removing the outermost $\lambda$, while the lambda-close rule removes the $\lambda$ and substitutes any remaining occurrence of the corresponding variable by $err$. By contrast, $\beta$-reduction in the standard lambda calculus substitutes the variable and removes the $\lambda$ in one single step. Following common conventions, the reflexive, transitive closure of $\to_N$ is written $\to_N^*$, and $\leftrightarrow_N$ is its symmetric closure.
Proof. Like Barendregt [6] for the standard $\lambda$-calculus, using a substitution lemma and a strip lemma. The full development can be found in the author's thesis [8].
Assume an "invisible name" $inv \in N$, and let $\Lambda$ denote the set of traditional lambda-terms. These can be embedded in $\Lambda_N$ by the translation function $LN[\cdot]$ below:
So in the following we will freely use traditional $\lambda$-calculus syntax whenever convenient, assuming this translation to be implicit. The translation exactly preserves usual $\beta$-equality:
Proof.
1. Induction on $a$.
2. Let $(\lambda x.a_1)a_2$ be the redex involved in the reduction step $a \to b$, with contractum $a_1[x := a_2]$. After a bind reduction and a close reduction we get $LN[a_1][x_{inv} := LN[a_2]][x := err]$. By 1), this is equivalent to $LN[a_1[x := a_2]]$.
3. Every initial redex in $LN[a]$ comes from some redex $(\lambda x.a_1)a_2$ in $a$, and therefore is necessarily of shape $(\lambda x.LN[a_1])(inv = LN[a_2])!$. Hence the first reduction step must be a bind reduction, yielding a new redex $(\lambda x.LN[a_1][x_{inv} := LN[a_2]])!$. After performing the close reduction, we get the exact image of the contractum $a_1[x := a_2]$.
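To make the bind and close reductions and the embedding of traditional terms more tangible, here is a small Python sketch of an evaluator. Several points are assumptions rather than statements from the text: the class and function names are hypothetical; the two error-propagation rules for `err(l = b)` and `err!` are guesses at the rules not reproduced here; the `embed` function is reconstructed from the shape of the redexes in the proof above; and substitution skips renaming, which is safe only because the sketch reduces head redexes of closed terms, so every substituted term is closed.

```python
from dataclasses import dataclass


class Term:
    pass

@dataclass(frozen=True)
class Var(Term):      # x_l : the value carried by variable x at name l
    name: str
    label: str

@dataclass(frozen=True)
class Lam(Term):      # lambda x. a
    param: str
    body: Term

@dataclass(frozen=True)
class Bind(Term):     # a(l = b)
    fun: Term
    label: str
    arg: Term

@dataclass(frozen=True)
class Close(Term):    # a!
    fun: Term

@dataclass(frozen=True)
class Err(Term):      # run-time error constant
    pass


def subst(a, x, label, b):
    """a[x_label := b]; label=None replaces x at every label. Assumes b is closed."""
    if isinstance(a, Var):
        hit = a.name == x and (label is None or a.label == label)
        return b if hit else a
    if isinstance(a, Lam):
        # An inner binder for the same variable shadows x.
        return a if a.param == x else Lam(a.param, subst(a.body, x, label, b))
    if isinstance(a, Bind):
        return Bind(subst(a.fun, x, label, b), a.label, subst(a.arg, x, label, b))
    if isinstance(a, Close):
        return Close(subst(a.fun, x, label, b))
    return a  # Err


def step(t):
    """One head reduction step, or None if no rule applies."""
    if isinstance(t, Lam) and isinstance(t.body, Err):
        return Err()                                   # lambda-err
    if isinstance(t, Bind):
        if isinstance(t.fun, Lam):                     # bind: the lambda is kept
            return Lam(t.fun.param, subst(t.fun.body, t.fun.param, t.label, t.arg))
        if isinstance(t.fun, Err):                     # assumed propagation rule
            return Err()
        inner = step(t.fun)
        return None if inner is None else Bind(inner, t.label, t.arg)
    if isinstance(t, Close):
        if isinstance(t.fun, Lam):                     # close: leftovers become err
            return subst(t.fun.body, t.fun.param, None, Err())
        if isinstance(t.fun, Err):                     # assumed propagation rule
            return Err()
        inner = step(t.fun)
        return None if inner is None else Close(inner)
    return None


def normalize(t, fuel=1000):
    """Apply `step` repeatedly, giving up after `fuel` steps."""
    for _ in range(fuel):
        nxt = step(t)
        if nxt is None:
            return t
        t = nxt
    raise RuntimeError("no normal form found within the step budget")


def embed(t):
    """Embed a traditional term ("var", x) | ("lam", x, a) | ("app", a, b)."""
    if t[0] == "var":
        return Var(t[1], "inv")
    if t[0] == "lam":
        return Lam(t[1], embed(t[2]))
    return Close(Bind(embed(t[1]), "inv", embed(t[2])))


identity = ("lam", "x", ("var", "x"))
print(normalize(embed(("app", identity, ("lam", "y", ("var", "y"))))))
# Accessing a name that was never bound yields the error constant.
print(normalize(Close(Bind(Lam("x", Var("x", "unknown")), "true", embed(identity)))))
```

An ordinary application is thus evaluated in two steps, one bind on the invisible name followed by one close, which mirrors item 2 of the proof.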

Example: Boolean values and extensibility
To give some intuition about the calculus, we show a way to encode boolean values. No recoding of true and false is needed, while in the standard Church encoding it would be necessary to recode them as functions with three abstraction levels instead of two. Furthermore, notU is defined incrementally as an extension of the previous not function.
To illustrate the reduction rules, here is a "standard reduction" (reducing the leftmost outermost redex first) of the expression not true:
$not\ true = (\lambda x.x_{inv}(true = \lambda x.x_{false})(false = \lambda x.x_{true})!)(inv = (\lambda x.x_{true}))!$
$\to_N (\lambda x'.(\lambda x.x_{true})(true = \lambda x.x_{false})(false = \lambda x.x_{true})!)!$
$\to_N (\lambda x.x_{true})(true = \lambda x.x_{false})(false = \lambda x.x_{true})!$
$\to_N (\lambda x'.\lambda x.x_{false})(false = \lambda x.x_{true})!$
$\to_N (\lambda x'.\lambda x.x_{false})!$
$\to_N \lambda x.x_{false}$
so the result is indeed false. Similarly, it can be verified easily that notU unknown yields unknown, or that notU false yields true. By contrast, consider what happens with an erroneous expression like not unknown:
$not\ unknown = (\lambda x.x_{inv}(true = \lambda x.x_{false})(false = \lambda x.x_{true})!)(inv = (\lambda x.x_{unknown}))!$
$\to_N (\lambda x'.(\lambda x.x_{unknown})(true = \lambda x.x_{false})(false = \lambda x.x_{true})!)!$
$\to_N (\lambda x.x_{unknown})(true = \lambda x.x_{false})(false = \lambda x.x_{true})!$
$\to_N (\lambda x.x_{unknown})(false = \lambda x.x_{true})!$
$\to_N (\lambda x.x_{unknown})!$
$\to_N err$

Linda primitives in $\lambda N$
In order to motivate the choice of $\lambda N$ for supporting higher-order coordination, we will start with an informal example, where some potential uses of Linda primitives within $\lambda N$ are shown. Type expressions are not displayed here, but their use is discussed; technical details will be given in the following sections.

Example: distributed servers
Consider a distributed system, in which each site may give access to some of its local resources (for example myscreen, myfilesystem, etc.) either by exporting them to remote sites, or by importing foreign programs and executing them locally. We will show how the basic primitives of the language, together with higher-order components, can implement several protocols to do so.
A first possibility is that some site S1 issues a command like:
This puts a component in the dataspace which is ready to get a request at name S1, and executes this request by giving it access to several resources of site S1. We will call this component an "exporter" of S1. If another site S2 wants to use the resources, for example for displaying a videoclip on the screen and speaker of site S1, it will typically use a command like
$in(x).x(S1 = \lambda y.showVideo[y_{screen}, y_{speaker}])!$
which grabs the exporter from the dataspace and binds it to a program using part of the resources. Because of the chosen protocol between the two partners, the type specification generated from this command will identify that S2 specifically wants to access the site named S1, and that it only needs the screen and speaker resources; but by virtue of subtyping the exporter displayed above, exporting more resources, matches the type specification and therefore can be grabbed by the in command. If, on the other hand, site S2 wanted to display an interactive videoclip, with a command of the form
$in(x).x(S1 = \lambda y.showVideoInteractive[y_{screen}, y_{speaker}, y_{mouse}])!$
then the type specification would express that an additional resource is needed at name mouse; since the exporter displayed above does not supply this resource, it would not be grabbed, and site S2 would block until another exporter from S1 supplies the needed resources.
Issues of mutual exclusion or sharing can be handled in various ways, depending on the protocol chosen between partners. In the example above, if site S1 only generates one exporter, then that exporter acts as a global semaphore on all resources of S1, and consumers like site S2 should put it back into the dataspace in order to release the resources. By contrast, it would obviously be possible to allow sharing by generating several exporters, by exporting individual resources separately instead of all together, or by using the read construct instead of in.
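The exporter protocol can be mimicked with a dataspace whose in and read operations filter on types instead of templates. The Python sketch below is a deliberately simplified illustration: the names `TypedSpace` and `satisfies` are hypothetical, types are hand-written nested dictionaries rather than inferred $\lambda N$ types, the resource type names are placeholders, and a failed lookup returns `None` where an agent would block.

```python
def satisfies(offered, required):
    """True if `offered` supplies every label of `required`, possibly more."""
    if isinstance(required, dict):
        return isinstance(offered, dict) and all(
            label in offered and satisfies(offered[label], t)
            for label, t in required.items()
        )
    return offered == required


class TypedSpace:
    def __init__(self):
        self.items = []  # list of (type, component) pairs

    def out(self, typ, component):
        self.items.append((typ, component))

    def in_(self, required):
        """Destructive retrieval of the first component matching `required`."""
        for i, (typ, component) in enumerate(self.items):
            if satisfies(typ, required):
                del self.items[i]
                return component
        return None  # a real agent would block here

    def read(self, required):
        """Non-destructive retrieval: the component stays available to others."""
        for typ, component in self.items:
            if satisfies(typ, required):
                return component
        return None


# Site S1 exports three resources under its own name.
exporter_type = {"S1": {"screen": "Screen", "speaker": "Speaker", "printer": "Printer"}}
video_request = {"S1": {"screen": "Screen", "speaker": "Speaker"}}
interactive_request = {"S1": {"screen": "Screen", "speaker": "Speaker", "mouse": "Mouse"}}

space = TypedSpace()
space.out(exporter_type, "exporter of S1")
print(space.in_(interactive_request))  # no exporter supplies a mouse
print(space.in_(video_request))        # the exporter offers more than required
print(space.in_(video_request))        # single exporter acts as a semaphore
```

Using `read` instead of `in_` in the last two calls would leave the exporter in the space, which corresponds to the sharing variant mentioned above.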
The example above could as well be implemented using traditional message-passing techniques: the name S1 is used as a target identifier, and names screen and speaker are used as message selectors. However, a simple modification of the protocol can take advantage of generative communication to establish more flexible exchanges between agents. For example if site S1 issues the command
$out(\lambda x.x(screen = myscreen)(printer = myprinter)!)$
then here variable $x$ of the exporter is no longer accessed on the specific name S1, but on the "invisible name" inv. Now we can imagine a customer of the shape
$in(x).x(\lambda y.printSomething[y_{printer}])$
which will grab in the dataspace the first printer available to complete a printing job, without specifying which particular site should satisfy the request.
It is also possible to design a totally different protocol, in which sites do not put "exporters" into the dataspace, but rather act as servers which repeatedly grab programs from the dataspace and execute them while giving them some access to local resources. With such a scheme, site S1 would now loop with a command like $in(x).x(screen = myscreen)(printer = myprinter)!$ and customers wanting to print something just have to issue the command

$out(\lambda x.printSomething[x_{printer}])$
The protocol here is simpler, but has no control over site names, so it is not possible to specify which particular site we want to communicate with.
Finally, the distinction between bind and close operations in $\lambda N$ can be used to design collaborative protocols, in which several sites supply partial resources which are brought together for a common computation. For example, site S1 could loop with a command like $in(x).out(x(printer = myprinter))$ which repeatedly grabs programs from the dataspace, supplies them with the printer resource without closing the list of bindings, and puts them back into the dataspace. Any program can take advantage of this to access the printer, and can then go to some other site to access other needed resources. For example, the command $out(\lambda x.doSomething[x_{printer}, x_{filesystem}])$ outputs a program P which needs both a printer and a filesystem. Suppose the filesystem is made available at another site S2 with the following protocol: $in(x).x(filesystem = myfilesystem)(screen = myscreen)(speaker = myloudspeaker)!$
Program P does not match the input request, since it needs a resource which is not supplied at S2. However, P matches the input request of site S1; therefore it can be loaded there, gain access to the printer, and come back to the dataspace as a new program $P'$ which only requires the filesystem resource. At this point, it matches the input request of site S2, and can be loaded and executed there.
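The collaborative protocol above can be illustrated with a small simulation. This is only a sketch under strong simplifying assumptions: resource requirements are modelled as plain sets of names, and matching against a closing request is modelled as set inclusion, a crude stand-in for the subtyping check. The names `Program`, `closing_site_accepts` and `bind_without_closing` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Program:
    """A program in the dataspace; `needs` holds the resource names still unbound."""
    name: str
    needs: set
    bound: dict = field(default_factory=dict)


def closing_site_accepts(program, supplies):
    """A closing request such as in(x).x(...)! is assumed to match only
    programs whose remaining needs are all supplied by the site."""
    return program.needs <= set(supplies)


def bind_without_closing(program, supplies):
    """Model in(x).out(x(l = r)): bind what the site has, leave the rest open,
    and return the new program that goes back into the dataspace."""
    usable = program.needs & set(supplies)
    new_bound = dict(program.bound)
    for name in usable:
        new_bound[name] = supplies[name]
    return Program(program.name + "'", program.needs - usable, new_bound)


# Site resources (hypothetical values).
s1 = {"printer": "myprinter"}
s2 = {"filesystem": "myfilesystem", "screen": "myscreen", "speaker": "myloudspeaker"}

p = Program("P", {"printer", "filesystem"})
p_after_s1 = bind_without_closing(p, s1)
```

In this model P is rejected by S2 while it still needs the printer, and is accepted once S1 has bound that resource.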

Design of Linda primitives
The examples above were merely intended as an illustration, with many implicit assumptions.
We shall now explore in more detail how to extend $\lambda N$ with Linda primitives. One issue immediately comes forward: $\lambda N$ is a purely functional model with referential transparency, but the Linda primitives in, read and out, as any other form of communication, have side-effects. This creates a dilemma which is the same as for more usual input/output operations in functional programming: either one has to sacrifice the purity of the language, admitting basic operations with side-effects which suppress referential transparency, or one has to push the side-effects out of the program, to the operating system level. The latter approach has been adopted in lazy functional languages like Haskell [15]; it has the great advantage of preserving good properties within the language, but at the cost of some constraints on the programming style.
Great progress has been made on various mechanisms to encapsulate such constraints ("lazy streams", "continuations", "monads", see Gordon [13] for an overview); nevertheless, these are still complex to use and to reason with in some situations. Since referential transparency is not an essential issue within the scope of this work, we preferred to adopt a more simple-minded approach à la ML, in which the Linda primitives are just operations with side-effects. So the extended syntax with untyped Linda primitives is: $a ::= x_l \mid \lambda x.a \mid a(l = b) \mid a! \mid err \mid in(x).a \mid read(x).a \mid out(a)$. The input primitives in (destructive input) and read (non-destructive input) are syntactically very similar to abstractions. The variable x is bound within the body a, and is accessed through name indices $x_{l_1}, x_{l_2}, \ldots$ as usual. This means that an input request may fetch several values from the dataspace. It evaluates to expression a in which all occurrences of the x variable have been substituted by values from the dataspace. The output primitive out just evaluates to the identity function.
The meaning of these Linda extensions is formally given by a structured operational semantics, where statements of the form $(S, a) \Downarrow (S', v)$ should be read as: "program a in dataspace S evaluates to value v and modified dataspace $S'$". The dataspace is a multiset of values, and $\uplus$ stands for multiset union. In order to be able to reason about side-effects, we use a call-by-value semantics, as in ML, in which arguments are evaluated before they are passed to functions. The rules are given in Figure 2, as a collection of rules specifying the behaviour of a single agent in a multiset, together with a global rule (system) specifying the collective behaviour of a system of agents $\{a_1 \ldots a_n\}$.
This semantics calls for several comments. First, there are currently no types, so there is no control over communication. This means that input primitives are likely to fetch unsuitable values from the dataspace, producing run-time errors when pursuing evaluation. Solving this problem will be the subject of the next sections. Second, it should be noticed that the variable bound by an in or read primitive is accessed on several names, which means that it simultaneously inputs several values. This is a radical departure from the standard Linda model, in which only one tuple is fetched at a time. As a result, we can have input requests for mutually constrained values. Consider for example the expression $in(x).out(x_{func}\, x_{farg})$. Here the input request simultaneously grabs a function at name func and an argument at name farg, applies the function to the argument and puts the result back into the dataspace. Type checking will of course make sure that the function and the argument are type compatible.
The fact that several values are input simultaneously in an atomic operation allows us to mimic the "reaction conditions" of the Gamma model of communication. A reaction condition consists of a request for a finite number of values, a predicate C over these values, and an action A to be executed if the predicate is true. A similar behaviour is obtained by the following expression:

$in(x).C(x_{l_1}, \ldots, x_{l_n})\,(true = A(x_{l_1}, \ldots, x_{l_n}))\,(false = out(x_{l_1}) \ldots out(x_{l_n}))!$

where a finite number of values are input at names $l_1 \ldots l_n$, the condition C is checked, and then either action A is executed or the values are all output again into the dataspace. Here the operation is executed only once, but by putting it into a loop it could operate as long as there is appropriate data in the dataspace. However, this does not mean that the current framework completely implements the Gamma model of communication. This is because there is no means here to encode the "global termination condition" of Gamma, i.e. to atomically check that no subset of values in the dataspace can satisfy the reaction condition. Such a property is essential for sequential composition of Gamma programs, but relies heavily on the fact that Gamma is a centralised model, in which reaction rules are grouped together, and therefore can be checked globally in an atomic operation. By contrast, the Linda model is decentralised, so any operation checking the input filters of all active agents in one atomic step would be contrary to the basic Linda philosophy: this explains why Gamma cannot be embedded in the current framework.
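The single reaction step described above (input n values atomically, test the condition, then either run the action or put the values back) can be sketched as follows. This is an illustrative model only: the dataspace is a Python list standing in for a multiset, values are picked at random, and the function name `reaction_step` is hypothetical. As in the text, nothing here detects Gamma's global termination condition.

```python
import random


def reaction_step(dataspace, n, condition, action, rng=random):
    """Try one reaction: remove n values, test `condition` on them, then
    either add the results of `action` or put the values back unchanged.
    Returns True when the action was executed."""
    if len(dataspace) < n:
        return False  # a real `in` request would block here
    picked = [dataspace.pop(rng.randrange(len(dataspace))) for _ in range(n)]
    if condition(*picked):
        dataspace.extend(action(*picked))
        return True
    dataspace.extend(picked)  # the `false` branch: output the values again
    return False


# Usage: the classic Gamma-style maximum, where a pair (x, y) with x >= y
# reacts by keeping only x. The loop count is arbitrary because the
# termination condition cannot be checked atomically.
space = [3, 1, 4, 1, 5]
rng = random.Random(0)
for _ in range(200):
    reaction_step(space, 2, lambda x, y: x >= y, lambda x, y: [x], rng)
```

Whatever order the values are picked in, the largest value is never removed, which is the only property this sketch guarantees.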
Finally, we would like to stress the asymmetry between input and output primitives: the out statement outputs one single value, while the in and read statements atomically input several values. This mimics the asymmetry already encountered with $\lambda N$ abstractions. The output primitive plays a role similar to the bind operation, but performs no "closing"; closing is implicitly done in the dataspace, when atomically selecting a set of values suitable for a given input statement. Another important difference between the new Linda primitives and the original constructs of $\lambda N$ is that interaction through the dataspace does not directly use names: this is a deliberate design decision. An alternative possibility would have been to consider a dataspace partitioned by names, instead of a single multiset. The output primitive would then be of the form $out(l = a)$, putting value a in partition l of the dataspace; input primitives would fetch values from the specific partitions corresponding to the names they use. Technical treatment of this model would not be significantly different, but it would have a strong impact on programming flexibility. Our decision was motivated by an intuition that this form of named control was too strict for generative communication. However, this remains to be confirmed by further studies.

Filtering by types, and filtering by templates
As explained in section 1.1, the original Linda model supports input requests of the form $in(c, x, ?y)$ where the argument to the in primitive is a template composed of a constant c, of the current value of a variable x, and a formal parameter ?y which should assign a value to variable y through the input request. Values c and x are used as a constraint over the search space, so that only relevant tuples (i.e. triples with first and second component equal to c and x) are retrieved from the Linda space. By contrast, the filtering mechanism studied here, based on type specifications, does not support the notion of templates containing particular values. Since this seems to be much less expressive, we will attempt to give some arguments to justify our approach.
First of all, filtering by types is by no means opposed to usual filtering by templates. We argued in the introduction that templates are not well-suited for higher-order components, but on the other hand types are not generally well-suited for ordinary, first-order values. So a powerful environment for generative communication should ideally support both forms of filtering. The present work concentrates on types, the most novel aspect, but this is part of a more general research program in which both aspects will be combined. Second, even if integration with ordinary templates is not likely to cause any severe difficulties, it may not be very useful in a framework implementing subtyping. Again consider records as an example. We could in principle write a template like $\{l_1 = c, l_2 = x, l_3 = ?y\}$ and, according to intuition, this should match any record with at least fields $l_1$, $l_2$ and $l_3$ (but possibly others), in which the first two fields contain values which match c and x. The problem is that this scheme provides no handle to talk about the other fields of the record, which are immediately forgotten. So an agent like $in(\{l_1 = c, l_2 = x, l_3 = ?y\}).out(\{l_1 = c, l_2 = x, l_3 = y + 1\})$ would not only increment the $l_3$ field in records matching the template: it would also implicitly erase all fields different from $l_1$, $l_2$ or $l_3$! Exactly the same problem appears in functional languages with pattern matching, and no satisfying solution has been implemented so far by any of the major languages in the area. Therefore this is a research topic in itself, and has been left out of the present work.
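The field-erasing problem can be reproduced in a few lines. The sketch below models records as Python dictionaries and the agent as a function; the name `template_agent` and the extra field `l4` are illustrative assumptions, not part of the paper.

```python
def template_agent(record, c, x):
    """Mimic in({l1 = c, l2 = x, l3 = ?y}).out({l1 = c, l2 = x, l3 = y + 1}).
    Returns the record that is output, or None when the template does not match."""
    if record.get("l1") == c and record.get("l2") == x and "l3" in record:
        y = record["l3"]
        # Only the fields named in the template can be rebuilt here.
        return {"l1": c, "l2": x, "l3": y + 1}
    return None


wide_record = {"l1": "c", "l2": "x", "l3": 41, "l4": "extra payload"}
result = template_agent(wide_record, "c", "x")
```

The record matches because it has at least the three template fields, but the output has lost `l4`, which is exactly the issue raised above.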
To conclude this discussion, it is worth pointing out that type expressions can in some cases be surprisingly expressive, alleviating the need for filtering by templates. Consider for example Boolean values. The standard way to deal with them is to have two constants true and false, together with a bool type. With that scheme, the type abstracts information from the two values. If, however, Boolean values are just encoded as lambda expressions, as shown in Section 2.2, then there are two distinct types for the two values: $\lambda xy.x : X \to Y \to X$ and $\lambda xy.y : X \to Y \to Y$, so in that case type information is expressive enough to filter on exactly the value true, for example. Similarly, by encoding lists as $\lambda$-expressions, we could have a type specification filtering for example all lists of more than three elements in which the third element is the value true. This shows that filtering by types can sometimes be pushed further than one would intuitively expect.

Type inference for $\lambda N$
In the rest of this paper, the issues discussed above will be treated in full technical detail. This section comes back to plain $\lambda N$, without any Linda primitives, and gives a type inference algorithm for it. The next section will integrate the algorithm with typed Linda primitives.

Simple type assignment
First we consider an adaptation of Curry's simple type assignment system to $\lambda N$. The idea is to illustrate the basic intuitions underlying the calculus, before going to more complex type systems involving various technical complications to obtain principal types. The syntax of simple types is: $T ::= \top \mid X \mid P \to T$ and $P ::= (l_1 : T_1, \ldots, l_n : T_n)$ (all $l_i$ distinct), where $\top$ is a type constant (the type of anything, including errors), X is a type variable, and $P \to T$ is an arrow type mapping a parameter type to a type. Parameter types are finite associations of names to types. Any name not explicitly mentioned in the set is implicitly associated with $\top$; therefore the empty parameter type, written (), maps every name to $\top$. $P(l)$ denotes the type associated to name l in parameter type P, and $P \backslash l$ denotes the parameter type P in which name l has been remapped to $\top$; more formally: $P(l) = T$ if $(l : T) \in P$, and $\top$ otherwise; $P \backslash l = \{(l' : T) \mid (l' : T) \in P, l' \neq l\}$. Parameter types are treated modulo a syntactic equivalence relationship, which says that declarations in parameter types can be arbitrarily permuted, and that declarations of the form $l : \top$ can be added or removed.
We will use the letters $T, U, V$ for types, $P, Q$ for parameter types, and $X, Y, Z$ for type variables. Furthermore, we adopt a syntactic sugar convention for types which corresponds to the similar convention for terms in section 2.1: type expressions of the usual $\lambda$-calculus, of form $T \to U$, do not formally belong to the syntax of types, but they will be used as syntactic sugar. Types are ordered through a subtyping relationship given in Figure 3. Observe that the rule for arrow types is covariant on the right and contravariant on the left of the arrow, as usual in type systems with subtyping. Parameter types are ordered by comparing them on all possible names; even if the name space may be infinite, the subset $\{l \mid (P_1(l) \neq \top) \lor (P_2(l) \neq \top)\}$ is always finite, so the comparison is decidable.

Figure 3 (subtyping rules) includes: $T \le \top$ (top), $\top \le P \to \top$ (top-arrow), $X \le X$ (tvar).

Example types: $true : (true : X) \to X$, $false : (false : X) \to X$, $not : ((false : (true : X) \to X, true \ldots$
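A possible reading of the subtyping relation on simple types is sketched below. It relies on the rules named in the text (top, top-arrow, tvar, and the arrow rule that is covariant on the right and contravariant on the left); the tuple encoding of types and the helper names `var`, `arrow`, `param`, `is_toplike` and `subtype` are assumptions made for illustration. In particular, treating every arrow whose result is top-like as a supertype of everything is derived here from top-arrow plus covariance, and is not quoted from Figure 3.

```python
TOP = ("top",)


def var(name):
    return ("var", name)


def arrow(params, result):
    """Arrow type P -> T. Entries mapped to TOP are dropped, so that
    equivalent parameter types get the same representation."""
    kept = tuple(sorted((l, t) for l, t in params.items() if t != TOP))
    return ("arrow", kept, result)


def param(t, label):
    """P(l) for the parameter type of arrow t; unmentioned names map to TOP."""
    return dict(t[1]).get(label, TOP)


def is_toplike(t):
    """True when TOP <= t is derivable: t is TOP, or an arrow with a top-like result."""
    return t == TOP or (t[0] == "arrow" and is_toplike(t[2]))


def subtype(t, u):
    """Decide t <= u by comparing parameter types on the finitely many
    names that are not mapped to TOP."""
    if is_toplike(u):
        return True
    if t[0] == "var" or u[0] == "var":
        return t == u
    if t[0] == "arrow" and u[0] == "arrow":
        labels = {l for l, _ in t[1]} | {l for l, _ in u[1]}
        return subtype(t[2], u[2]) and all(
            subtype(param(u, l), param(t, l)) for l in labels  # contravariant
        )
    return False  # t is TOP and u is not top-like


X, Y, Z = var("X"), var("Y"), var("Z")
needs_one = arrow({"l1": X}, Y)
needs_two = arrow({"l1": X, "l2": Z}, Y)
```

With this encoding, a function requiring fewer named parameters is a subtype of one requiring more, which is the direction used when an exporter supplying extra resources matches a narrower request.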

Lemma 4.1 The subtyping relation is reflexive and transitive.
Proof. Easy induction on the structure of types.
A basis $\Gamma$ is a finite association of variables to parameter types; the set of variables for which a parameter type is associated in $\Gamma$ is denoted $dom(\Gamma)$. If $x : P \in \Gamma$, then $\Gamma$ associates type $P(l)$ to labelled variable $x_l$; this is sometimes written $\Gamma(x_l)$. Furthermore, $\Gamma, x : P$ denotes the extension of basis $\Gamma$ with association $x : P$ (assuming $x \notin dom(\Gamma)$). Typing judgements for the Curry simple type system are of the form $\Gamma \vdash_S a : T$, saying that "a has type T in basis $\Gamma$". Such judgements are derived from the rules of Figure 4. This type system has an unusual aspect in comparison with many other systems, where each type constructor has one introduction and one elimination rule. Here the arrow type is introduced by rule abs, but is eliminated in several steps: a type $(l_1 : T_1, \ldots, l_n : T_n) \to T$ is progressively reduced to $() \to T$ through multiple invocations of the bind rule; only then can it be eliminated through the close rule. This is obviously related to the asymmetry between lambda abstractions, which introduce several named parameters at the same time, and the bind and close constructs, which supply parameters in several steps. Like Curry's original system, this type assignment system assigns many types to any given term, which is inconvenient for practical purposes. Typed functional languages overcome the difficulty by using a different type system, known as Hindley-Milner type inference. This is a deterministic algorithm, directed by the syntax of terms, which yields principal types. From a principal type, all other possible types of the same term can be generated by substitution of type variables. The purpose of the next subsections is to design a similar system for $\lambda N$.

Figure 4: Typing Rules. (top) $\Gamma \vdash_S a : \top$. (subs) from $\Gamma \vdash_S a : T$ and $T \le U$, infer $\Gamma \vdash_S a : U$. (var) $\Gamma, x : P \vdash_S x_l : P(l)$. (abs) from $\Gamma, x : P \vdash_S a : T$, infer $\Gamma \vdash_S \lambda x.a : P \to T$. The bind rule has premises $\Gamma \vdash_S a : P \to T$ and $\Gamma \vdash_S b : P(l)$.
Two intermediate steps are needed: recursive constraints and families of type variables. For simplicity, we will not consider type schemes and let-polymorphism; this dimension is essential for practical languages, but is orthogonal to the issues investigated here, and the technical aspects of type schemes with recursive constraints have already been studied in several places [26,2,10].

Recursive constraints
In the presence of subtyping, principal type inference becomes considerably more difficult, because subsumption (the subs rule of the previous section) can in principle be applied at any point: as a result, typing proofs are not unique. Fortunately, the work of Thatte on so-called partial types [26], which was motivated by considerations totally different from ours, can be easily adapted to $\lambda N$. Thatte introduced a type constant $\top$ to denote absence of typing information, and then studied a type system with the usual contravariant/covariant subtyping rule on arrow types. In order to be able to infer principal types, he restricted applications of subsumption to some specific places, and used sets of recursive constraints to keep track of the subtyping assumptions involved in a typing proof. Independently of Thatte, a similar approach was taken by Aiken and Wimmers [2] for constraints on sets, and their work was later adapted by Eifrig, Smith and Trifonov [10] for typing object calculi.
A set of constraints C is a set of pairs of types or pairs of parameter types, where pairs are written with the symbol $\le$, i.e.
$C \equiv \{T_1 \le U_1, \ldots, T_n \le U_n, P_1 \le P'_1, \ldots, P_k \le P'_k\}$. A constraint written $T = U$ is an abbreviation for the pair of constraints $(T \le U, U \le T)$.
Constraints are typically introduced by a type inference algorithm, as will be seen in Section 4.6; for the moment, they are just propagated along typing judgements, which therefore have the new form $\Gamma; C \vdash_{rc} a : T$, where $\vdash_{rc}$ stands for "recursive constraints". The new typing rules of Figure 5 are exactly as before, except for propagation of recursive constraints. The important difference comes with subtyping, as displayed in Figure 6. In addition to the previous rules, we have a rule constr for extracting one of the constraints explicitly listed in set C, and a collection of rules trans, break-right, break-left, break-ptype for also deriving subtyping judgements implicitly contained in the same set, by breaking arrow types and enforcing transitivity. It can be seen easily that Lemma 4.1 (reflexivity and transitivity) is still valid. The new system allows more subtyping judgements to be derived, and therefore a richer use of subsumption, which gives more typing power. For example we can derive $\{X \to Y \le X\} \vdash_{rc} \Delta \equiv \lambda x.xx : (X \to Y) \to Y$ whereas in the simple type assignment system only type $\top$ can be assigned to $\Delta$.

Figure 5 (typing rules with recursive constraints): (top) $\Gamma; C \vdash_{rc} a : \top$. (subs) from $\Gamma; C \vdash_{rc} a : T$ and $C \vdash_{rc} T \le U$, infer $\Gamma; C \vdash_{rc} a : U$. (var) $\Gamma, x : P; C \vdash_{rc} x_l : P(l)$. (abs) from $\Gamma, x : P; C \vdash_{rc} a : T$, infer $\Gamma; C \vdash_{rc} \lambda x.a : P \to T$. (bind) from $\Gamma; C \vdash_{rc} a : P \to T$ and $\Gamma; C \vdash_{rc} b : P(l)$, infer $\Gamma; C \vdash_{rc} a(l = b) : P \backslash l \to T$.

Figure 6: Subtyping with recursive constraints. (tvar) $C \vdash_{rc} X \le X$. (ptype) from $\forall l.\ C \vdash_{rc} P_1(l) \le P_2(l)$, infer $C \vdash_{rc} P_1 \le P_2$. (break-left) from $C \vdash_{rc} P \to T \le P' \to T'$, infer $C \vdash_{rc} P' \le P$. (break-ptype) from $C \vdash_{rc} P \le P'$, infer $\forall l.\ C \vdash_{rc} P(l) \le P'(l)$.
Various sets of constraints may generate the same subtyping relationship, even if they are syntactically different. To capture this notion, we introduce an ordering on sets of constraints, where $C \le C'$ is to be read: "the set C implies stronger constraints than set $C'$".

Definition 4.2
The and relations on sets of constraints are defined as

Subject Reduction and Soundness
When working with sets of constraints it is usually important to check consistency, i.e. that the constraints are not self-contradictory. Here, there is an obvious solution to any set of constraints, which is to map every type variable to $\top$; however, this is not an interesting solution since we cannot then guarantee absence of errors, and therefore cannot prove soundness of the type system. One approach is to show the existence of a non-trivial solution, which typically requires sophisticated mathematical tools like limits of contractive maps in an ideal model [2,17]. However, we found it simpler to follow the approach of Eifrig, Smith and Trifonov [10], who show soundness at an operational level through the subject reduction property. So we show that types are preserved by reductions; we show that the types which can be assigned to err are all of a given shape (trivial types); and we conclude that terms having non-trivial types cannot reduce to err.

Lemma 4.3 (Generation lemma) Suppose $C \vdash_{rc} T \not\ge \top$. Then 1. $\Gamma; C \vdash_{rc} x_l : T \Leftrightarrow \exists U. \ldots$

Proof. The $\Leftarrow$ direction is direct from the typing rules; the $\Rightarrow$ direction requires induction on type derivations. Without rule subs, the system would be entirely syntax-directed; but since subtyping is reflexive and transitive (Lemma 4.1), any proof of $\Gamma; C \vdash_{rc} a : T$ can be rewritten as a proof which finishes by the rule corresponding to the structure of a, followed by a single application of subs. Then the induction is straightforward.
Proof. Induction on the generation of $\Gamma, x : P; C \vdash_{rc} a : T$.
Proof. Induction on the generation of $a \to_N a'$. If $C \vdash_{rc} \top \le T$, then the result is trivial. If not, we use the generation lemma in inspecting the prime cases for one-step reduction:

case $a \equiv (\lambda x.b)(l = c)$ and $a' \equiv (\lambda x.b[x_l := c])$. By the generation lemma, we obtain types P and U such that $\lambda x.b$ has type $P \to U$, c has type $P(l)$, and $C \vdash_{rc} P \backslash l \to U \le T$. Again by the same lemma, $\exists P', U'.\ [\Gamma, x : P'; C \vdash_{rc} b : U'] \land [C \vdash_{rc} P' \to U' \le P \to U]$. The condition $C \vdash_{rc} P' \to U' \le P \to U$ is equivalent to $C \vdash_{rc} U' \le U \land C \vdash_{rc} P \le P'$, so $c : P'(l)$ by subsumption. On the other hand, since substitutions are capture-free, c does not contain any occurrence of variable x (if it did, then $\lambda x.b$ would be $\alpha$-converted), so we are entitled to extend the basis in the typing proof of c, yielding $\Gamma, x : P' \backslash l; C \vdash_{rc} c : P'(l)$. Collecting these results, we get $[\Gamma, x : P'; C \vdash_{rc} b : U'] \land [\Gamma, x : P' \backslash l; C \vdash_{rc} c : P'(l)]$ where the conditions are met to apply the substitution lemma 4.4, with result $\Gamma, x : P' \backslash l; C \vdash_{rc} b[x_l := c] : U'$. By application of typing rule abs, we derive $\Gamma; C \vdash_{rc} \lambda x.b[x_l := c] : P' \backslash l \to U'$, where the subject is $a'$. Since $C \vdash_{rc} P \le P'$ and $C \vdash_{rc} U' \le U$ and $C \vdash_{rc} P \backslash l \to U \le T$, we have $C \vdash_{rc} P' \backslash l \to U' \le T$, so $\Gamma; C \vdash_{rc} a' : T$ follows by subsumption.
case $a \equiv (\lambda x.b)!$ and $a' \equiv b[x := err]$. By two applications of the generation lemma, we have: $\exists U.\ [\Gamma; C \vdash_{rc} \lambda x.b : () \to U \land C \vdash_{rc} U \le T]$ and $\exists P, U'.\ [\Gamma, x : P; C \vdash_{rc} b : U' \land C \vdash_{rc} P' \to U' \le () \to U]$ with $P' \equiv P$. Now $P \to U' \le () \to U$ iff $P \ge ()$ and $U' \le U$, which implies $\Gamma, x : P \backslash l \vdash_{rc} err : P(l)$ for any l. Hence we can apply the substitution lemma, yielding $\Gamma, x : (); C \vdash_{rc} b[x := err] : U'$ where the subject is $a'$. Since $a'$ cannot contain free occurrences of x, the assumption $x : ()$ can be removed from the basis, and then again the result follows by subsumption.

cases $a \equiv err(l = b)$, $a \equiv err!$, $a \equiv \lambda x.err$ are trivial.
The subject reduction theorem will prove soundness of type assignment, but only when the hypothesis is satisfied, i.e. $\neg(C \vdash_{rc} \top \le T)$. Verification of this proposition is not immediate, since some subtyping assumptions may be hidden in the structure of types: for example $\{(l_1 : X1, l_2 : X2) \to Y \le (l_1 : Z) \to Y\} \vdash_{rc} \top \le X2$ because of contravariance of subtyping on left-hand sides of arrow types. So to be able to check the condition easily, we introduce a normal form for sets of constraints.

Definition 4.6 A set of constraints C is in normal form iff all constraints are of the form $X \le T$ or $T \le X$, where X is a type variable, and T is either a type variable or an arrow type.

This notion of normal form is rather weak: it does not imply uniqueness of normal form, and further simplifications could be done, like performing the substitution $[X := T]$ if C contains a pair of constraints $(X \le T), (T \le X)$ and X is not a free type variable of T. Such aspects are important for practical implementation; however, the definition above is sufficient for proving soundness of type assignment.

Lemma 4.9 For every set of constraints C and associated type T, there are $C'$, $T'$ such that $C'$ is in normal form, and $\Gamma; C \vdash_{rc} a : T \Leftrightarrow \Gamma; C' \vdash_{rc} a : T'$.

Proof. Iterate over the following steps, until none of them is applicable. This will be denoted as the norm algorithm.
- remove all constraints in C of form $T \le \top$.
- replace constraints of form $\top \le P \to T$ by $\top \le T$.
- remove constraints of form $\top \le X$, and perform corresponding substitutions $[X := \top]$ on C and T.
- replace constraints of form $P \to U \le P' \to U'$ by constraints $U \le U'$ and $P' \le P$.
- replace constraints of form $P \le P'$ (constraints on parameter types) by the following set of individual type constraints: $\{P(l) \le P'(l) \mid P(l) \neq \top \lor P'(l) \neq \top\}$.
- transitively add the constraint $T \le U$ for any pair of constraints $T \le X$, $X \le U$.

Theorem 4.10 (Soundness) If $\Gamma; C \vdash_{rc} a : T$, $norm(C, T) = (C', T')$, $T' \notin Triv$, then $\neg(a \to_N err)$.
Proof. By the subject reduction theorem, if $a \to_N err$, then $\Gamma; C \vdash_{rc} err : T$. But Lemma 4.8 states $\neg(\Gamma; C' \vdash_{rc} err : T')$, which implies $\neg(\Gamma; C \vdash_{rc} err : T)$. So we have a contradiction, and a cannot reduce to err.
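The normalization steps listed in the proof of Lemma 4.9 can be sketched as a worklist procedure. This is an illustrative reading, not the paper's implementation: types are encoded as tuples, the sugar $T \to U$ is read as an arrow with a single parameter at the invisible name inv, a `seen` set is added so that recursive constraints cannot be decomposed forever, and all function names (`var`, `arrow`, `fn`, `param`, `substitute_top`, `normalize`) are hypothetical.

```python
TOP = ("top",)


def var(name):
    return ("var", name)


def arrow(params, result):
    """Arrow type P -> T; parameters mapped to TOP are dropped."""
    kept = tuple(sorted((l, t) for l, t in params.items() if t != TOP))
    return ("arrow", kept, result)


def fn(arg, result):
    """Sugar T -> U, read here as (inv : T) -> U."""
    return arrow({"inv": arg}, result)


def param(t, label):
    return dict(t[1]).get(label, TOP)


def substitute_top(t, x):
    """Apply the substitution [x := TOP] to type t."""
    if t == x:
        return TOP
    if t[0] == "arrow":
        return arrow({l: substitute_top(p, x) for l, p in t[1]},
                     substitute_top(t[2], x))
    return t


def normalize(constraints, ty):
    """Return (normal-form constraints, substituted type)."""
    constraints = set(constraints)
    while True:
        seen, normal, todo, to_top = set(), set(), list(constraints), None
        while todo and to_top is None:
            t, u = todo.pop()
            if (t, u) in seen or t == u or u == TOP:
                continue                      # drop T <= TOP and reflexive pairs
            seen.add((t, u))
            if t == TOP:
                if u[0] == "arrow":
                    todo.append((TOP, u[2]))  # TOP <= P -> T becomes TOP <= T
                else:
                    to_top = u                # TOP <= X: substitute X := TOP
            elif t[0] == "arrow" and u[0] == "arrow":
                todo.append((t[2], u[2]))     # results: covariant
                for l in {l for l, _ in t[1]} | {l for l, _ in u[1]}:
                    todo.append((param(u, l), param(t, l)))  # contravariant
            else:
                normal.add((t, u))
                for a, b in list(normal):     # transitivity through a variable
                    if u[0] == "var" and a == u:
                        todo.append((t, b))
                    if t[0] == "var" and b == t:
                        todo.append((a, u))
        if to_top is None:
            return normal, ty
        constraints = {(substitute_top(a, to_top), substitute_top(b, to_top))
                       for a, b in constraints}
        ty = substitute_top(ty, to_top)


X1, X2, Y, Z = var("X1"), var("X2"), var("Y"), var("Z")
hidden = {(arrow({"l1": X1, "l2": X2}, Y), arrow({"l1": Z}, Y))}
normal_form, final_type = normalize(hidden, X2)
```

On the example given before Definition 4.6, the hidden assumption $\top \le X2$ is discovered by breaking the arrow types, so a term of type X2 ends up with type $\top$ after normalization.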

Families of type variables
With sets of constraints we have solved one difficulty for uniqueness of typing. The bind construct creates another difficulty. If a sequence of bind statements is immediately followed by a close, then it is straightforward to adapt well-known techniques. For example, a function like $\lambda xyz.x(l_1 = y)(l_2 = z)!$ will have type $((l_1 : X1, l_2 : X2) \to Y) \to X1 \to X2 \to Y$. However, if the close operation is removed, as in $\lambda xyz.x(l_1 = y)(l_2 = z)$, then possible types could be

$((l_1 : X1, l_2 : X2) \to Y) \to X1 \to X2 \to (() \to Y)$
$((l_1 : X1, l_2 : X2, l_3 : X3) \to Y) \to X1 \to X2 \to ((l_3 : X3) \to Y)$
$((l_1 : X1, l_2 : X2, l_3 : X3, \ldots, l_n : Xn) \to Y) \to X1 \to X2 \to ((l_3 : X3, \ldots, l_n : Xn) \to Y)$

and there is no way to generate all such types by substitution of type variables and subsumption. Therefore we need to consider a more complex system involving families of type variables, which are indexed by names. Capital letters $X, Y, Z$ or $X1, X2, \ldots$ now denote such families, and individual type variables are now of the form $X_l$. For convenience, we adopt the same convention as in the term syntax, namely that for a specific name inv the index can be omitted.
The principal type of $\lambda xyz.x(l_1 = y)(l_2 = z)$ now contains an association $\ast : X3_\ast$, which states that any name l different from $l_1$ and $l_2$ is implicitly mapped to type variable $X3_l$. This is an infinite mapping, but arguments to the function above can only use a finite number of names, so most instances of the pattern $\ast : X3_\ast$ will be unified with $\top$ during the constraint normalization phase. The type expression specifies that names $l_1$ and $l_2$ have been bound, and therefore are remapped to $\top$, but that all other names stay as they were. A very similar mechanism called row variables has been used in the context of record calculi [27,22].
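A last technical point is to adapt the constraint normalization algorithm to open parameter types. The idea is exactly the same: a constraint on parameter types is decomposed into several individual constraints on types. However, the formal definition is more complex, because we need to consider all possible combinations of closed and open parameter types. The table below details how to decompose a constraint $P \le P'$:

$$
\begin{array}{l|l|l}
 & P \equiv (\bar{l} : \bar{T}) & P \equiv (\bar{l} : \bar{T}, \ast : X_\ast) \\ \hline
P' \equiv (\bar{l}' : \bar{T}') & \{P(l) \le P'(l) \mid l \in \bar{l} \cup \bar{l}'\} & \{P(l) \le P'(l) \mid l \in \bar{l} \cup \bar{l}'\},\ (X_\ast \le \top,\ \ast \notin \bar{l} \cup \bar{l}') \\
P' \equiv (\bar{l}' : \bar{T}', \ast : X'_\ast) & \{P(l) \le P'(l) \mid l \in \bar{l} \cup \bar{l}'\},\ (\top \le X'_\ast,\ \ast \notin \bar{l} \cup \bar{l}') & \{P(l) \le P'(l) \mid l \in \bar{l} \cup \bar{l}'\},\ (X_\ast \le X'_\ast,\ \ast \notin \bar{l} \cup \bar{l}')
\end{array}
$$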
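The four cases of this decomposition can be sketched in a few lines. The encoding is an assumption made for illustration: a parameter type is a pair of explicit entries and an optional family name (None for a closed parameter type), types are plain strings, and the constraint on all remaining names is returned as a single pair of family names rather than as an infinite set. The names `lookup` and `decompose` are hypothetical.

```python
TOP = "top"


def lookup(p, label):
    """P(l): the explicit entry if present, otherwise the family variable
    indexed by l (open parameter type) or TOP (closed parameter type)."""
    entries, family = p
    if label in entries:
        return entries[label]
    return f"{family}_{label}" if family else TOP


def decompose(p, q):
    """Split the constraint p <= q into per-name constraints on the names
    mentioned explicitly, plus one constraint covering every other name."""
    labels = sorted(set(p[0]) | set(q[0]))
    per_name = [(lookup(p, l), lookup(q, l)) for l in labels]
    rest = None
    if p[1] or q[1]:
        # Relates the two defaults on all names outside `labels`.
        rest = (p[1] or TOP, q[1] or TOP)
    return per_name, rest


closed_p = ({"l1": "T1"}, None)
open_p = ({"l1": "T1"}, "X")
closed_q = ({"l2": "U2"}, None)
open_q = ({"l2": "U2"}, "X'")
```

For two closed parameter types no default constraint is produced; as soon as one side is open, the remaining names are related through the families, with $\top$ standing in for the closed side.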

Type inference
We now have all ingredients in place for a type inference algorithm. The rules are given in Figure 7. The system is entirely syntax-directed, so it is deterministic, and it always terminates since terms are finite. The bind rule shows why it was necessary to introduce the notions of families of type variables and open parameter types: the algorithm can cancel one specific name l, while leaving all other names untouched.
The two theorems below show that the inference system is correct and generates principal types, in the sense that all other possible types of the same term can be obtained by a combination of type substitution and subsumption.

Theorem 4.12 (Completeness) If the inference algorithm yields $\Gamma; C \vdash_{inf} a : T$, then for any type assignment $\Gamma'; C' \vdash_{rc+f} a : T'$, there is a type substitution $\sigma$ such that $\Gamma\sigma = \Gamma'$, $C' \le C\sigma$, and $T\sigma \le T'$.

Proof. Induction on the derivation $\Gamma'; C' \vdash_{rc+f} a : T'$, following the same line as Thatte [26].
The error cases are trivial, the case var is immediate, and cases abs, close and subs are easy.
Figure 7 (type inference rules): (var) $\Gamma; C \vdash_{inf} x_l : \Gamma(x_l)$. (err) $\Gamma; C \vdash_{inf} err : \top$. (abs) from $\Gamma, x : P; C \vdash_{inf} a : T$, infer $\Gamma; C \vdash_{inf} \lambda x.a : P \to T$, where $P \equiv (l_1 : X_{l_1}, \ldots, l_n : X_{l_n})$ and $\{l_1, \ldots, l_n\} \equiv FN(a, x)$. The bind rule adds the constraint set $C_3 = \{T_1 \le (l : T_2, \ast : X_\ast) \to Y\}$, where X and Y are new.

In the bind case, the assigned type is $P \backslash l \to T''$ and the antecedents of the rule are $\Gamma'; C' \vdash_{rc+f} b : P \to T''$ and $\Gamma'; C' \vdash_{rc+f} c : P(l)$. By induction hypothesis, there exist type substitutions $\sigma_b$ and $\sigma_c$ satisfying the theorem for b and c. Clearly, $\sigma_b$ and $\sigma_c$ must agree on all type variables in $\Gamma$. Moreover, since the inference algorithm systematically generates new type variables, the only type variables in common between, on the one hand, $C_b$ and $T_b$, and, on the other hand, $C_c$ and $T_c$, are exactly those type variables already in $\Gamma$. Finally, we can define a substitution $\sigma_X$ on type variables of the X family, such that $(\ast : X_\ast)\sigma_X \equiv P$; this substitution does not conflict with the previous ones because X is a new type variable. In consequence, $\sigma$ is a well-defined type substitution, for which $T\sigma \le T'$. We still have to verify that $C' \le C\sigma$. Since $C_b\sigma \equiv C_b\sigma_b$ and $C_c\sigma \equiv C_c\sigma_c$, we already know that $C' \le (C_b \cup C_c)\sigma$, so we just want to prove the remaining constraint, namely $C' \vdash_{rc} (T_b \le (l : T_c, \ast : X_\ast) \to Y)\sigma$. Define $P'$ such that $P'$ equals P on any name, except l for which $P'(l) = T_c\sigma$. Since $T_c\sigma_c \le P(l)$, we have $P' \le P$, and hence $P \to T'' \le P' \to T'' \equiv ((l : T_c, \ast : X_\ast) \to Y)\sigma$. But $T_b\sigma_b \le P \to T''$ by induction hypothesis, so the result follows by transitivity of subtyping.

Typed LINDA primitives
Finally it is time to integrate type inference and Linda primitives. We want to ensure that values output by out statements are not err, and that values input by a statement $in(x).a$ (or $read(x).a$) are chosen in such a way that substituting them for occurrences of x in the body a does not yield err.
As explained in the introduction, the difficulty is that type correctness typically depends on global type information, so it is not possible to rely only on the local structure of a program fragment. However, once the type inference algorithm has been executed on a whole program, the set of type constraints does contain such global type information. So the idea is the following: during the static type inference phase, all input or output statements are statically decorated with type variables (with systematic generation of new type variables). We obtain a type T for the whole program and a global set of constraints C, which can be normalized to $C'$ and $T'$. If $T'$ is a trivial type, then an error may occur and the program is rejected. Moreover, if, during the normalization process, one of the type variables X associated with an output statement has been substituted by a trivial type, then that output statement may write an error into the dataspace, and again the program is rejected. Hence, a program accepted by the static type system does not produce errors.
It remains to check that the values input from the dataspace by a given program correspond to the type assumptions of that program. This is a more delicate point, since these assumptions depend on global constraints, and moreover cannot be entirely determined statically: again the example $in(f).in(x).fx$ shows that the type of x depends on which function f was fetched from the dataspace.
The solution is to dynamically keep evolving type information associated with each agent in the system. Initially, the type information is exactly the output of the type inference algorithm. As the computation progresses, additional constraints are added at each input statement, in order to check that the values fetched from the dataspace are type compatible with the program. An input communication event is only allowed if normalization of the extended set of constraints still yields non-trivial types. In the example above, f is initially assigned a type variable $X$, and x is assigned a type variable $Y$, together with the constraint $X \le Y \to Z$ for some type variable $Z$. Suppose the dataspace contains a function $\lambda x.\, x\ 0$, of type $(Int \to Z') \to Z'$. Then, before accepting this function at the in(f) request, the type system will check whether the existing constraints are consistent with the new constraint $(Int \to Z') \to Z' \le X$. This is indeed the case, but then by transitivity of constraints on type variable $X$, we derive new constraints $Z' \le Z$ and $Y \le Int \to Z'$. As a result, type variable $Y$ now has an upper bound which states that any value entered at the in(x) request should be a function of type $Int \to Z'$ for some $Z'$. In other words, the interaction between the two in requests has been properly captured by dynamic adaptation of the type constraints.
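The propagation in this example can be replayed with a small sketch. It is not the normalization procedure of the paper: it assumes plain arrow types instead of parameter types over names, it has no top type, and it treats any clash between a base type and an arrow type as an inconsistency. The names `Var`, `Base`, `Arrow` and `close` are hypothetical, and `Z0` stands for the result variable in the type of the fetched function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Base:
    name: str


@dataclass(frozen=True)
class Arrow:
    arg: object
    res: object


def close(constraints):
    """Close (lower, upper) subtyping pairs under decomposition and transitivity.

    Returns the closed set, or None if a structural clash is found.
    """
    closed, todo = set(), list(constraints)
    while todo:
        lo, hi = todo.pop()
        if lo == hi or (lo, hi) in closed:
            continue
        closed.add((lo, hi))
        if isinstance(lo, Arrow) and isinstance(hi, Arrow):
            todo.append((hi.arg, lo.arg))  # contravariant argument
            todo.append((lo.res, hi.res))  # covariant result
        elif not isinstance(lo, Var) and not isinstance(hi, Var):
            return None  # e.g. a base type against an arrow type
        for l2, h2 in list(closed):
            if isinstance(hi, Var) and l2 == hi:
                todo.append((lo, h2))  # lo <= hi <= h2
            if isinstance(lo, Var) and h2 == lo:
                todo.append((l2, hi))  # l2 <= lo <= hi
    return closed


X, Y, Z, Z0 = Var("X"), Var("Y"), Var("Z"), Var("Z0")
Int = Base("Int")

agent = [(X, Arrow(Y, Z))]                # X <= Y -> Z
fetched = (Arrow(Arrow(Int, Z0), Z0), X)  # (Int -> Z0) -> Z0 <= X
accepted = close(agent + [fetched])       # input event allowed
rejected = close(agent + [(Int, X)])      # an integer offered for f
```

Under these assumptions, `accepted` contains the two derived constraints of the text (the upper bound on `Y` and the bound relating `Z0` to `Z`), while offering an integer at the in(f) request yields `None`, so that communication event would be refused.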

Static phase
For a given program a, the static phase consists of the following steps:
1. Decorate each output statement in the program with a new type variable, i.e. replace out(b) by $out(b : X_{inv})$, where $X$ is new.
2. Decorate each input statement in the program with a new parameter type, i.e. replace in(x).b by $in(x : (\ast : X_\ast)).b$, where $X$ is new, and similarly for read statements.
3. Apply the type inference algorithm, with the extensions displayed in Figure 8.
4. Let $(T; C)$ be the output of the inference algorithm. Annotate the whole program with the expression $\langle \mathcal{T} \setminus C \rangle$, where $\mathcal{T}$ is the finite set of types containing $T$ and all type annotations for out primitives.
5. Normalize the global set of constraints $C$. Any type substitutions generated during the normalization process should be applied to members of $\mathcal{T}$, and also to all type annotations for in and read primitives. Let $a'\langle \mathcal{T}' \setminus C' \rangle$ be the result of this normalization process.
6. Reject the program if $\mathcal{T}' \cap Triv \neq \emptyset$.
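The two decoration steps can be illustrated with a minimal sketch. The term representation (`Out`, `In`, `App`, with variables as plain strings) and the function `decorate` are assumptions made for this illustration only; annotations are reduced to variable names, whereas the paper attaches a whole parameter type to each input statement.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Out:  # out(b)
    body: object
    ann: object = None


@dataclass(frozen=True)
class In:  # in(x).b
    var: str
    body: object
    ann: object = None


@dataclass(frozen=True)
class App:  # application of one term to another
    fun: object
    arg: object


def decorate(term, fresh):
    """Attach a new type variable name to every out and in node."""
    if isinstance(term, Out):
        body = decorate(term.body, fresh)
        return Out(body, f"X{next(fresh)}")
    if isinstance(term, In):
        body = decorate(term.body, fresh)
        return In(term.var, body, f"X{next(fresh)}")
    if isinstance(term, App):
        return App(decorate(term.fun, fresh), decorate(term.arg, fresh))
    return term  # variables are left unchanged


# in(f).in(x).out(f x), decorated with three distinct variables
program = In("f", In("x", Out(App("f", "x"))))
decorated = decorate(program, count(1))
```

Drawing every annotation from a single counter is what keeps the decorations independent of each other, which is the property the later steps rely on.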

Dynamic phase
Sets of typing constraints are kept together with programs in the dynamic phase, so we consider evaluation rules of the form $(S; a\langle \mathcal{T} \setminus C \rangle) \downarrow (S'; v\langle \mathcal{T}' \setminus C' \rangle)$. The evaluation rules for $\lambda N$ constructs are as in section 3.2; the new rules for Linda primitives are given in Figure 9. The behaviour of a system of concurrent agents is now given by the rule

$$\frac{(S; a_i\langle \mathcal{T}_i \setminus C_i \rangle) \downarrow (S'; v\langle \mathcal{T}'_i \setminus C'_i \rangle)}{(S; \{a_1\langle \mathcal{T}_1 \setminus C_1 \rangle, \ldots, a_n\langle \mathcal{T}_n \setminus C_n \rangle\}) \downarrow (S'; \{a_1\langle \mathcal{T}_1 \setminus C_1 \rangle, \ldots, v\langle \mathcal{T}'_i \setminus C'_i \rangle, \ldots, a_n\langle \mathcal{T}_n \setminus C_n \rangle\})} \quad \text{(system)}$$

and we can statically ensure that no agent will ever evaluate to err, and that no error value will ever be output into the dataspace $S$.

(out) $\dfrac{\Gamma; C \vdash_{inf} b : T}{\Gamma; C \cup \{X = T\} \vdash_{inf} out(b : X) : Y \to Y}$ where $Y$ is new

$(S \uplus \{v_1, \ldots, v_n\}; a[x_{l_1} := v_1] \ldots [x_{l_n} := v_n]\langle \mathcal{T}' \setminus C' \rangle)$ where $(C'; \mathcal{T}')$ are as above

Figure 9: Reduction rules of typed Linda primitives

Conclusion
We have proposed the idea of generative communication controlled by type information, and we have demonstrated feasibility of the approach in a context which combines static type inference, dynamic binding, and Linda primitives. The major technical steps have been shown, and some of them have already been put into practice: the type inference algorithm of section 4.6 has been implemented in a prototype interpreter based on the $\lambda N$ calculus. However, a number of issues were left out of the present paper, and will need further consideration before building a practical coordination system with higher-order generative communication. Most of them are quite simple extensions, which do not cause important technical difficulties, but would have made this presentation even more complex:
adding primitive constants and primitive types, possibly with a primitive subtyping relationship (e.g. $Int \le Real$). Such an extension is totally compatible with recursive type constraints.
adding type schemes and let-polymorphism. In the presence of recursive type constraints, type schemes are more complex than in the original Hindley-Milner system: they require simultaneous quantification over a set of type variables, and inclusion of the constraints within the type scheme. However, as already mentioned, the technical details have already been sorted out in several places [26,2,10].
considering dynamic agent configurations, in which agents can be dynamically created, probably through a variant of the Linda eval primitive.Since we already assumed that type inference is performed within the dataspace, this is not likely to make any significant difference as far as technical support is concerned.
In addition, some more important issues deserve further study. The first will be to integrate filtering by types with a more usual mechanism for filtering by templates: the difficult problem, as discussed in Section 3.3, will be to extend pattern matching to deal with subtyping in a proper way. Another important issue, related to evaluation strategies, is to consider in more detail the substitution operation on terms. We have seen how the static typing phase decorates each input statement with type information. These types are carefully kept independent, by systematic generation of new type variables. However, it may happen that such an input statement is part of an expression which, after a $\beta$-reduction, is copied several times within another expression. In that case, each copy will be decorated with the same type. The result is still type-correct, but overly constrained: it is as if a value input at one copy had to simultaneously satisfy the type constraints of all copies. Therefore the type checker will forbid some communication events that would not have generated run-time errors, and the system may get into deadlock. In order to solve that problem, we need a notion of substitution which automatically generates new type variables when producing multiple copies of a given term.
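The renaming that such a substitution would have to perform on each copy can be sketched as follows. This only illustrates the idea and is not a construction from the paper: `Var`, `Arrow` and `freshen` are hypothetical names, types are plain arrows, and the constraints mentioning the renamed variables, which would have to be copied as well, are left out.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Arrow:
    arg: object
    res: object


def freshen(ty, fresh, renaming):
    """Copy a type, replacing each type variable consistently by a new one."""
    if isinstance(ty, Var):
        if ty not in renaming:
            renaming[ty] = Var(f"{ty.name}_{next(fresh)}")
        return renaming[ty]
    if isinstance(ty, Arrow):
        return Arrow(freshen(ty.arg, fresh, renaming),
                     freshen(ty.res, fresh, renaming))
    return ty  # anything else is shared between copies


fresh = count(1)
annotation = Arrow(Var("X"), Var("X"))
copy1 = freshen(annotation, fresh, {})  # one renaming table per copy
copy2 = freshen(annotation, fresh, {})
```

Using a separate renaming table for each copy keeps the variables inside one copy linked to each other while making the two copies independent.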
Finally, for embedding this approach into a programming language, efficiency is likely to be a problem. The set of constraints generated for a given program may be large, so input operations will be very costly if each of them requires a new inspection of the constraints. However, the introduction of type schemes (with universal quantification of type variables) may partition the space of constraints in such a way that only a small subset of constraints needs to be inspected for a given input event. Furthermore, recent work by Pottier [21] gives encouraging results for simplifying sets of recursive type constraints and keeping them of manageable size.

Theorem 2.1 (Confluence) $a \to_N b \wedge a \to_N c \Rightarrow \exists d.\ b \to_N d \wedge c \to_N d$.
$(true = false)(false = true)! = \lambda x.\, x_{inv}(true = \lambda x.\, x_{false})(false = \lambda x.\, x_{true})!$

The standard $\lambda$-calculus has another way of encoding boolean values (Church encoding), based on positions of parameters: true is $\lambda xy.\, x$, false is $\lambda xy.\, y$, and not is $\lambda bxy.\, b\ y\ x$. By contrast, the $\lambda N$ encoding uses only one abstraction level (one single $\lambda$), but accesses the corresponding variable through different names. The advantage is extensibility: additional names can be used for additional values, without changing the basic protocol. For example, a three-valued logic, with an additional unknown value and a corresponding redefinition of the not operation, is obtained as follows:

$(x(unknown = unknown)) = \lambda x.\, not(inv = x_{inv}(unknown = unknown))!$

Figure 5: Typing rules with recursive constraints

Definition 4.7 The set Triv of trivial types is defined inductively as:
1. $\top \in Triv$
2. $T \in Triv \Rightarrow \forall P.\ (P \to T) \in Triv$

Lemma 4.8 If $\Gamma; C \vdash_{rc} err : T$ and $C$ is in normal form, then $T \in Triv$.

Proof. Proofs of $err : T$ can only use the top axiom and subsumption. If $C$ is in normal form, it contains no constraint of the form $\top \le U$ for some $U$. Therefore the only possible use of subsumption is rule top-arrow; but then $T$ is necessarily trivial by the antecedent of the rule.
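The inductive definition of Triv translates into a simple membership test. The sketch below assumes a small representation of types (`Top`, `TVar`, `Arrow`) introduced only for this illustration, and leaves the parameter type on the left of an arrow opaque, since the definition places no condition on it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Top:  # the top type
    pass


@dataclass(frozen=True)
class TVar:
    name: str


@dataclass(frozen=True)
class Arrow:
    param: object   # parameter type P, not inspected here
    result: object


def is_trivial(t):
    """True if t is the top type or an arrow whose result is trivial."""
    while isinstance(t, Arrow):
        t = t.result
    return isinstance(t, Top)


def reject(types):
    """Static check: reject as soon as one recorded type is trivial."""
    return any(is_trivial(t) for t in types)
```

For instance, `reject([Arrow("P", Arrow("Q", Top()))])` holds, while `reject([Arrow("P", TVar("X"))])` does not.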

Theorem 4.11 (Correctness) $\Gamma; C \vdash_{inf} a : T \Rightarrow \Gamma; C \vdash_{rc+f} a : T$.

Proof. Easy induction on the structure of $a$.

Figure 8: Extensions to the type inference algorithm

These are arrow types in which the left-hand side is a parameter type mapping all names to $\top$, except for the invisible name inv. Thanks to this convention, the types of usual lambda terms (i.e. terms which do not contain names other than inv) look exactly like in the usual lambda calculus. Examples of types are given below, for some of the boolean values of section 2.2: $(inv : T) \to U$.