Rel options in the future

Posted: Tue Mar 07, 2006 1:34 pm
by Phmb
I'm really quite excited to see this new DB being set up! I think it an excellent idea to start from scratch and avoid the compromises that SQL has forced on the world - I've always thought SQL an amazingly clumsy and old-fashioned construct, a bit like a modern COBOL.

I've quite a few questions about this really, and some ideas. Since there aren't many people here yet, I'll put them in one post rather than lots of different ones - feel free, of course, to discuss them in different threads if you're interested or find it useful!

1. I understand the decision to use Java. I don't think it'll fly for an enterprise-wide system, no matter how Java evolves. Maybe I'm wrong, but simply avoiding what you call 'sharp edges' is fine to start with; it just means that they're lurking there for later. What about starting a modular, rock-solid re-write in Ada? That'd knock the socks off enterprises in a few years' time when it finally worked, and be streets more solid than Oracle and the rest. If carefully designed it might even be faster.

2. When starting a new project, you can make bold generalising design decisions - or you can lock yourself into assumptions that make sense at the time, but limit your final results considerably some years later. Has this design question been considered? I don't just mean being ready for 64 or 128 bit machines; I mean, for example, how adaptable is this model to a distributed DB? How well is on-line backup during normal operation designed into Rel? How adaptable is Rel to multiple locking strategies being in place at the same time? How easily extensible is the set of language primitives?

3. I've been interested in database design (that is, designing database engines, not designing databases) for over twenty-five years. I haven't done that much about it; everything was well "SQLed" fairly early on! How general is the object model, or could the object model be? Clearly objects need to include methods and many of these need to be standard, like existing DBs with their triggers and so forth. Could various Rel objects contain code for execution in different languages? Could, on the other hand, Rel insist on Ada (for example) being used to encourage consistent coding? Alternatively, are there sufficiently general rule models to enable useful objects to be developed?

4. Since this is early days, a couple more extreme questions. A distributed relational object-orientated DB is a general platform model. How easy would it be to transform Rel into an e-mail server, for example, to compete with Exchange? How would object addressing, ownership, security, integrity and so forth be treated in such a distributed environment? Could Rel easily handle voice, video and so forth in the future?

5. Another application style question. If Rel were to be used as a document management system (or one of the many related things), how well would it manage (and how would it be designed to manage) the volumes, the need for versioning, secure storage and retrieval etc. etc.?

6. In designing DB objects, has something like UML been considered? Could somebody design an application based on a future version of Rel by defining how the objects kept within it are defined in time, space and ownership using something like UML, so that it would be the basis of an enterprise-wide workflow engine?

Some of the above might well, at this stage, seem to you quite irrelevant to the design of Rel and its implementation. I'd like to suggest that they aren't. If the design considers these possible uses at an early stage, then you don't end up with these things added as cobbled-on extras later; you can at least build proper hooks for them early on and consider language extensions to include them.

Ok, so that is far too much to take on at the start, it probably all sounds like wild pipe-dreams now. But, is it worth at least thinking about these?

Re: Rel options in the future

Posted: Thu Mar 09, 2006 1:02 pm
by Dave
Phmb wrote: 1. I understand the decision to use Java. I don't think it'll fly for an enterprise-wide system, no matter how Java evolves. Maybe I'm wrong, but simply avoiding what you call 'sharp edges' is fine to start with; it just means that they're lurking there for later. What about starting a modular, rock-solid re-write in Ada? That'd knock the socks off enterprises in a few years' time when it finally worked, and be streets more solid than Oracle and the rest. If carefully designed it might even be faster.
Considering the current popularity of Java/J2EE in enterprise systems, I'm not sure I'd discount Java outright. However, Java-based DBMSes like HSQLDB, though popular in their own right, aren't exactly a threat to Oracle. Yet.

I think it's worth pointing out that Rel, in its present form, is primarily a teaching tool. As a teaching tool, I want it to be able to run on as many platforms as possible, without having to recompile for and test on multiple platforms for every minor release. Java is the only popular, statically-typed language that permits one compilation to be deployed to as many platforms.

Rel has a secondary purpose as a test environment for what will eventually become an industrial-strength DBMS. At present, it is likely that Industrial Rel will be implemented in a LISP-like, set-oriented, persistent, transactional language that I am currently designing. I might call it SETP, i.e., SET Processor, as opposed to LISP's LISt Processing. SETP will probably be implemented in C, as I will essentially be using C as a generic assembly language to construct the necessary primitives. Then Industrial Rel will be written in SETP.

I shall continue my response in a subsequent post.

Dave...

Re: Rel options in the future

Posted: Thu Mar 09, 2006 3:44 pm
by Dave
Phmb wrote: 2. When starting a new project, you can make bold generalising design decisions - or you can lock yourself into assumptions that make sense at the time, but limit your final results considerably some years later. Has this design question been considered? I don't just mean being ready for 64 or 128 bit machines; I mean, for example, how adaptable is this model to a distributed DB? How well is on-line backup during normal operation designed into Rel? How adaptable is Rel to multiple locking strategies being in place at the same time? How easily extensible is the set of language primitives?
In general, I am cautious about trying to anticipate and design for all reasonable eventualities. I believe it easiest, in the long run, to design for all current requirements, solidly refactor the resulting code into the simplest, most elegant form, and then wait for and adapt to new requirements, including re-writing components if necessary.

I believe that to be considerably easier, in the long run, than trying to anticipate future requirements, designing and implementing complex code to handle them, and then discovering that they're not needed and that the extra code makes it harder to adapt to the new requirements that are needed.

That said, in answer to the individual questions:

How adaptable is this model to a distributed DB?

There is currently no support for distributed databases. This would require an additional layer between the relational algebra layer and the storage engine layer.
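
To make the idea of an extra layer concrete, here is a minimal Java sketch of how one could sit between the relational algebra layer and the storage engines. It is not Rel's actual code: the names StorageEngine, LocalEngine and DistributedEngine are hypothetical, the key/tuple representation is simplified to strings, and hash-based routing is just one assumed placement strategy.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DistributionLayerSketch {
    /** Hypothetical view of a storage engine as seen by the relational algebra layer. */
    interface StorageEngine {
        void put(String key, String tuple);
        String get(String key);
    }

    /** Stand-in for a single-node engine; a real one would persist to disk. */
    static final class LocalEngine implements StorageEngine {
        private final Map<String, String> data = new HashMap<>();
        public void put(String key, String tuple) { data.put(key, tuple); }
        public String get(String key) { return data.get(key); }
        int size() { return data.size(); }
    }

    /** The additional layer: routes each key to one node, but looks like one engine. */
    static final class DistributedEngine implements StorageEngine {
        private final List<StorageEngine> nodes;

        DistributedEngine(List<? extends StorageEngine> nodes) {
            if (nodes.isEmpty()) {
                throw new IllegalArgumentException("at least one node is required");
            }
            this.nodes = new ArrayList<>(nodes);
        }

        // Assumed placement rule: pick a node from the key's hash code.
        private StorageEngine nodeFor(String key) {
            return nodes.get(Math.floorMod(key.hashCode(), nodes.size()));
        }

        public void put(String key, String tuple) { nodeFor(key).put(key, tuple); }
        public String get(String key) { return nodeFor(key).get(key); }
    }

    public static void main(String[] args) {
        LocalEngine a = new LocalEngine();
        LocalEngine b = new LocalEngine();
        StorageEngine db = new DistributedEngine(List.of(a, b));
        db.put("S1", "Smith, London");
        db.put("S2", "Jones, Paris");
        System.out.println(db.get("S1"));
    }
}
```

Because DistributedEngine implements the same interface as a single engine, the layer above it would not need to know how many nodes exist. Distributed transactions, replication and failure handling are deliberately left out of this sketch.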

How well is on-line backup during normal operation designed into Rel?

Transaction isolation is provided by the storage engine, which currently uses the Berkeley DB for Java. On-line backups while other transactions are ongoing should be possible, though I haven't tried this.

How adaptable is Rel to multiple locking strategies being in place at the same time?

Locking is internal to the Berkeley DB. The Berkeley DB will be replaced with a custom storage engine at some point, but due to the tight coupling between locking strategies and the physical storage model, I doubt it will be possible to employ multiple locking strategies at the same time.

How easily extensible is the set of language primitives?

Extending the language is relatively easy, but not within the database language itself. It is straightforward to change the grammar and interpreter to add new language constructs.

Dave...

Re: Rel options in the future

Posted: Thu Mar 09, 2006 4:09 pm
by Dave
Phmb wrote: 3. I've been interested in database design (that is, designing database engines, not designing databases) for over twenty-five years. I haven't done that much about it; everything was well "SQLed" fairly early on! How general is the object model, or could the object model be? Clearly objects need to include methods and many of these need to be standard, like existing DBs with their triggers and so forth. Could various Rel objects contain code for execution in different languages? Could, on the other hand, Rel insist on Ada (for example) being used to encourage consistent coding? Alternatively, are there sufficiently general rule models to enable useful objects to be developed?
Sorry, I do not understand the question.
Phmb wrote: 4. Since this is early days, a couple more extreme questions. A distributed relational object-orientated DB is a general platform model. How easy would it be to transform Rel into an e-mail server, for example, to compete with Exchange? How would object addressing, ownership, security, integrity and so forth be treated in such a distributed environment? Could Rel easily handle voice, video and so forth in the future?
The language of the Rel DBMS, which is essentially an implementation of Tutorial D, is a general purpose programming language. With the addition of appropriate system libraries for handling network sockets, regular expressions, etc., there is no reason why it couldn't be used to develop an e-mail server or any other application for which you might use C, Java, C#, Python, Perl, Ruby, Ada, FORTH, LISP, Smalltalk, etc.

The difference between these and Rel or Tutorial D is the presence of relational operators, constraints, and persistent relation-valued variables (roughly equivalent to "tables" in SQL) in the language itself, though there are projects afoot to add relational operators (or some query language equivalent) to some of the above. For example, the LINQ project adds native query capability to .NET languages.

There is no reason why Rel could not be used to store rich data types. The relational model imposes no restriction on data types (other than values must be testable for equality), so these may be as rich as any general purpose programming language will allow.
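
The equality requirement can be illustrated with a small Java sketch. A relation is a set of tuples, so the only thing it needs from an attribute type is a way to tell whether two values are the same. The AudioClip and Message classes below are hypothetical examples, not Rel types, and a Java HashSet stands in for a relation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class RichTypeRelation {
    /** Hypothetical rich value type: all the "relation" needs from it is equality. */
    static final class AudioClip {
        private final String codec;
        private final byte[] samples;

        AudioClip(String codec, byte[] samples) {
            this.codec = codec;
            this.samples = samples.clone();
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof AudioClip)) return false;
            AudioClip other = (AudioClip) o;
            return codec.equals(other.codec) && Arrays.equals(samples, other.samples);
        }

        @Override public int hashCode() {
            return 31 * codec.hashCode() + Arrays.hashCode(samples);
        }
    }

    /** Hypothetical tuple type with one plain attribute and one rich attribute. */
    static final class Message {
        final String sender;
        final AudioClip clip;

        Message(String sender, AudioClip clip) {
            this.sender = sender;
            this.clip = clip;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Message)) return false;
            Message other = (Message) o;
            return sender.equals(other.sender) && clip.equals(other.clip);
        }

        @Override public int hashCode() {
            return Objects.hash(sender, clip);
        }
    }

    /** Builds a small relation; the second insert is an exact duplicate of the first. */
    static Set<Message> sample() {
        Set<Message> relation = new HashSet<>();
        relation.add(new Message("phmb", new AudioClip("pcm", new byte[] {1, 2, 3})));
        relation.add(new Message("phmb", new AudioClip("pcm", new byte[] {1, 2, 3})));
        relation.add(new Message("dave", new AudioClip("pcm", new byte[] {4, 5})));
        return relation;
    }

    public static void main(String[] args) {
        System.out.println(sample().size()); // prints 2: the duplicate tuple is absorbed
    }
}
```

Without a usable equality test on AudioClip, the set could not recognise the duplicate tuple, which is why equality is the one thing the relational model asks of a type.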

As for permissions, ownership, and similar issues, the existing Rel catalog provides some skeleton support for these, but the implementation is incomplete.

However, none of the above are distributed capabilities per se. Rel was not (originally, at least) designed to be a distributed database, so I cannot yet speak of these things in distributed terms. However, from a user's point of view a distributed database should be indistinguishable from a non-distributed database, so you can think of the above as applying equally to some future implementation of Rel that is distributed.
Phmb wrote: 5. Another application style question. If Rel were to be used as a document management system (or one of the many related things), how well would it manage (and how would it be designed to manage) the volumes, the need for versioning, secure storage and retrieval etc. etc.?
I can certainly appreciate the need for these things, but as my primary focus has been on creating a test implementation of the language as opposed to a full-blown, industrial strength DBMS, I have not given sufficient thought to these issues to provide a useful answer.
Phmb wrote: 6. In designing DB objects, has something like UML been considered? Could somebody design an application based on a future version of Rel by defining how the objects kept within it are defined in time, space and ownership using something like UML, so that it would be the basis of an enterprise-wide workflow engine?
I've been trying to convince one or more of my students to start work on a graphical diagram-based Rel user interface, but so far without success. Lazy students! :wink: I have a number of ideas for how this might work, but I'll have to address that in a separate (and no doubt lengthy) post.

Dave...