Expectations of a type system
Posted: Wed Aug 28, 2013 5:31 pm
The following is the note I put together when compiling Ex10 in the example forum. As I am having some difficulty in getting to grips with the type model in TTM/Tutorial D/Rel, I have listed some of the behaviour I expect from a type system. Some of this behaviour is present in Rel and some not.
This is not a full exposition of the nature of a type system, but rather describes some behaviour I expect of any type system I use.
Practical type system are often two tier systems consisting of base data types (BDTs) and user defined types (UDTs), defined in terms of the BDTs. UDTs form subtypes of BDTs, the subtypes being defined by constraint. BDT)s are defined as part of the type system. A type system should allow for expansion of types within a multi-tier system, to enable complex structures of UDTs to be defined in terms of other UDTs. It should also allow for the expansion of the system to allow multiple inheritance of data types. So far there is no need for multiple inheritance support.
A type system should provide facilities to define, implement and provide new data types and operators on these new data types for use by wider systems. These wider systems are defined (in part) by strict typing provided by the type system. While the type system should attempt to be complete, consistent, and, if possible, compatible with the principles set out in The Third Manifesto (TTM), there is no need for these aspects to be proven.
Every data item is associated with a data type, which can be a BDT or a UDT. A type defines the range of values permitted. A type is derived from a BDT, or another UDT, by constraint. The operators permitted on the type are defined separately from the data type definition, by means of a calling signature and a return value.. A data item of a particular type must not be used in a context where a different type is expected. To do so is considered to be a compile time error. The rule, "a data item of a particular type must not be used in a context where a different type is expected", is relaxed in that a base type can always be used in a context where a constrained version of that type is expected.
A type system must allow for different types, derived from different base types, to have the same name. It must also allow for there to be multiple name spaces, so that the same name can be defined in more than one name space (eg definition of identifier and sys.identifier).
Types and operators defined within the type system have the following behaviour:
If type A is defined as a supertype of a hierarchy of data types (it does not matter whether A is a UDT or a BDT).
B and C are subtypes of A
D is a subtype of B
In addition the following operators are defined (no return type is given, as it is irrelevant):
P ( A )
Q ( B )
R ( B )
R ( D )
S ( C )
T ( D )
then all the following are legitimate operator invocations in the language (and should be generated by the type system or language?):
P ( A ), P ( B ), P ( C ), P ( D )
Q ( B ), Q ( D )
R ( B ), R ( D ) - but gives different results to R ( B )
S ( C )
T ( D )
but the following are not legitimate and should generate an error in the language:
T ( C ), T ( B ), T ( A )
S ( B ), S ( A )
R ( C ), R ( A )
Q ( A )
The following operators in REL etc.; THE_, selector, TREAT_AS_, and IS_A are each defined at the appropriate level of the type hierarchy to give the following results:
legitimate:
THE_A ( A ), THE_A ( B ), THE_A ( C ), THE_A ( D ), THE_B ( B ), THE_B ( D ), THE_C ( C ), THE_D ( D )
A ( A ), A ( B ), A ( C ), A ( D ), B ( B ), B ( D ), C ( C ), D ( D )
TREAT_AS_A ( A ), TREAT_AS_A ( B ),, TREAT_AS_A ( C ),, TREAT_AS_A ( D ), TREAT_AS_B ( B ), TREAT_AS_B ( D ), TREAT_AS_C ( C ), TREAT_AS_D ( D )
and
IS_A ( A ) returns TRUE, IS_A ( B ) returns TRUE, IS_A ( C ) returns TRUE, IS_A ( D ) returns TRUE,
IS_B ( A ) returns FALSE, IS_B ( B ) returns TRUE, IS_B ( C ) returns FALSE, IS_B ( D ) returns TRUE,
IS_C ( A ) returns FALSE, IS_C ( B ) returns FALSE, IS_C ( C ) returns TRUE, IS_C ( D ) returns FALSE,
IS_D ( A ) returns FALSE, IS_D ( B ) returns FALSE, IS_D ( C ) returns FALSE, IS_D ( D ) returns TRUE,
illegitimate:
THE_B ( A ), THE_B ( C ), THE_C ( A ), THE_C ( B ), THE_C ( D ), THE_D ( A ), THE_D ( B ), THE_D ( C )
B ( A ), B ( C ), C ( A ), C ( B ), C ( D ), D ( A ), D ( B ), D ( C ),
TREAT_AS_B ( A ), TREAT_AS_B ( C ), TREAT_AS_C ( A ), TREAT_AS_C ( B ), TREAT_AS_C ( D ), TREAT_AS_D ( A ), TREAT_AS_D ( B ), TREAT_AS_D ( C )
The type system must not unduly constrain the definition of data types. In particular there must be a mechanism that specifies that a default value is required (and provided?) for a data type. It must also be possible for the type system to define types that require a particular collate sequence (and provide the means to define that sequence); and to define types that specify case sensitivity of their values.
This is not a full exposition of the nature of a type system, but rather describes some behaviour I expect of any type system I use.
Practical type system are often two tier systems consisting of base data types (BDTs) and user defined types (UDTs), defined in terms of the BDTs. UDTs form subtypes of BDTs, the subtypes being defined by constraint. BDT)s are defined as part of the type system. A type system should allow for expansion of types within a multi-tier system, to enable complex structures of UDTs to be defined in terms of other UDTs. It should also allow for the expansion of the system to allow multiple inheritance of data types. So far there is no need for multiple inheritance support.
A type system should provide facilities to define, implement and provide new data types and operators on these new data types for use by wider systems. These wider systems are defined (in part) by strict typing provided by the type system. While the type system should attempt to be complete, consistent, and, if possible, compatible with the principles set out in The Third Manifesto (TTM), there is no need for these aspects to be proven.
Every data item is associated with a data type, which can be a BDT or a UDT. A type defines the range of values permitted. A type is derived from a BDT, or another UDT, by constraint. The operators permitted on the type are defined separately from the data type definition, by means of a calling signature and a return value.. A data item of a particular type must not be used in a context where a different type is expected. To do so is considered to be a compile time error. The rule, "a data item of a particular type must not be used in a context where a different type is expected", is relaxed in that a base type can always be used in a context where a constrained version of that type is expected.
A type system must allow for different types, derived from different base types, to have the same name. It must also allow for there to be multiple name spaces, so that the same name can be defined in more than one name space (eg definition of identifier and sys.identifier).
Types and operators defined within the type system have the following behaviour:
If type A is defined as a supertype of a hierarchy of data types (it does not matter whether A is a UDT or a BDT).
B and C are subtypes of A
D is a subtype of B
In addition the following operators are defined (no return type is given, as it is irrelevant):
P ( A )
Q ( B )
R ( B )
R ( D )
S ( C )
T ( D )
then all the following are legitimate operator invocations in the language (and should be generated by the type system or language?):
P ( A ), P ( B ), P ( C ), P ( D )
Q ( B ), Q ( D )
R ( B ), R ( D ) - but gives different results to R ( B )
S ( C )
T ( D )
but the following are not legitimate and should generate an error in the language:
T ( C ), T ( B ), T ( A )
S ( B ), S ( A )
R ( C ), R ( A )
Q ( A )
The following operators in REL etc.; THE_, selector, TREAT_AS_, and IS_A are each defined at the appropriate level of the type hierarchy to give the following results:
legitimate:
THE_A ( A ), THE_A ( B ), THE_A ( C ), THE_A ( D ), THE_B ( B ), THE_B ( D ), THE_C ( C ), THE_D ( D )
A ( A ), A ( B ), A ( C ), A ( D ), B ( B ), B ( D ), C ( C ), D ( D )
TREAT_AS_A ( A ), TREAT_AS_A ( B ),, TREAT_AS_A ( C ),, TREAT_AS_A ( D ), TREAT_AS_B ( B ), TREAT_AS_B ( D ), TREAT_AS_C ( C ), TREAT_AS_D ( D )
and
IS_A ( A ) returns TRUE, IS_A ( B ) returns TRUE, IS_A ( C ) returns TRUE, IS_A ( D ) returns TRUE,
IS_B ( A ) returns FALSE, IS_B ( B ) returns TRUE, IS_B ( C ) returns FALSE, IS_B ( D ) returns TRUE,
IS_C ( A ) returns FALSE, IS_C ( B ) returns FALSE, IS_C ( C ) returns TRUE, IS_C ( D ) returns FALSE,
IS_D ( A ) returns FALSE, IS_D ( B ) returns FALSE, IS_D ( C ) returns FALSE, IS_D ( D ) returns TRUE,
illegitimate:
THE_B ( A ), THE_B ( C ), THE_C ( A ), THE_C ( B ), THE_C ( D ), THE_D ( A ), THE_D ( B ), THE_D ( C )
B ( A ), B ( C ), C ( A ), C ( B ), C ( D ), D ( A ), D ( B ), D ( C ),
TREAT_AS_B ( A ), TREAT_AS_B ( C ), TREAT_AS_C ( A ), TREAT_AS_C ( B ), TREAT_AS_C ( D ), TREAT_AS_D ( A ), TREAT_AS_D ( B ), TREAT_AS_D ( C )
The type system must not unduly constrain the definition of data types. In particular there must be a mechanism that specifies that a default value is required (and provided?) for a data type. It must also be possible for the type system to define types that require a particular collate sequence (and provide the means to define that sequence); and to define types that specify case sensitivity of their values.