This article (and the accompanying comments) was originally posted on Nov 10, 2009. Because I think many of the comments are as useful as the initial article, I'm including them as addenda to the article. I hope to have more information on Xorion shortly, as it's developing rather nicely. -- Kurt Cagle
This particular posting is meant to solicit what people would see as general requirements for the Xorion XRX CMS system. While this touches on architecture as well, this document is intended more to cover what particular cases the system should satisfy, either through its own native capabilities or through extensions. I'll be listing my own thoughts on this below, but I welcome others.
- While this is both broad and simplistic, my goal with Xorion is to create something very much analogous to Drupal on an XRX platform. What this means in practical terms is that I want to create a community content management system in which content is stored as XML within an XML database repository, that utilizes XQuery and related XML standards, and that provides an extensible community framework in which community developers can create specialized components, queries and views for web, syndication and publication purposes.
- Xorion will build an abstraction layer, an API, on top of the XQuery APIs currently exposed by various XML database systems so that Xorion can be implemented on different systems. While many of these APIs will probably echo the API for the eXist-db database, they will make use of a separate namespace.
- Unlike with Drupal, the core document type is known as a Resource. Each resource has a corresponding abstract type known as a Resource Type that can nonetheless be validated by one or more schemas for that resource. Additionally, each resource type exposes one or more transformations (written in XQuery, though with the possibility of invoking XSLTs from within the XQuery) that can be used to generate different Representations of the resource type. The collection of resources, representations, views (described in #5) as well as specialized stylesheets and resources are contained within a Resource Portfolio.
- All resources are structured to contain both an envelope (which is system dependent) and a payload (which is system independent). The envelope will likely be built using an AtomPub <entry> element. This means that while resources can have their own individual representation constructors, if a resource portfolio doesn't contain a constructor, then a generic constructor for handling Atom deployment can be used instead. In this sense, each resource inherits from the Atom resource type.
- Each resource portfolio defines a set of URLs associated with the resource, with the idea that the base URL for a resource portfolio is a feed of some type consisting of links to a set of resources of the same type, with a given representation (such as an Atom feed, an HTML list or "teaser" set, or some other formal output). In general, these are somewhat analogous to Drupal views.
- Resource URLs internally are bound to pipelines associated either with the resource type or defined generically (i.e., of type AtomResource). Such pipelines perform any number of tasks, from performing transformations upon resource content (either XML or definable via a regex supported filtration set) to saving data resources to performing messaging tasks. Ultimately some form of visual pipeline tool will make it possible to create specialized pipelines per resource type and URL.
- Xorion themes differ fairly dramatically from their Drupal counterparts. Each Xorion theme consists of an XML document that defines different regions on the page using named <region> elements (i.e., regions with specific xml:ids) along with a core set of "required" regions (in some cases, the regions in question may be analogous to HTML header elements which are consequently invisible, such as code blocks or metadata/SEO information). Each region in turn is bound to one or more feed views with different (appropriate) representations; while it's not completely one to one, each view is basically a type of Drupal block. In Drupal, because of legacy systems, this direct association between feed view and block isn't always true, but in Xorion it's central. When a new theme is introduced, it will receive a regional binding document that is essentially that of the current theme, though it's possible for the superuser to alter that.
- The Xorion core API is a set of XQuery functions and a set of basic XProc components that can be invoked in the construction of any new resource classes. Additional modules within Xorion fall into one of several possible areas:
- Core XQuery API - this expands the definition of functions that have broad applicability across all resource portfolios.
- Core XProc Pipes - this expands the set of steps that can be invoked globally within XProc stacks.
- Resource XQuery API - these functions have scope only for a given resource portfolio, and are usually in the portfolio namespace. Most will be implementations of global processing APIs, but some will also be helper functions that are unique to the resource. Resource constructors fall into this category.
- Resource Pipes API - these are XProc pipes that are valid only within the context of a specific resource.
- Resource Portfolio - Each resource portfolio can in fact be thought of as being analogous to a Drupal module, though not all modules are directly tied to resource portfolios.
- Because of the presence of larger scale documents than you're likely to find in Drupal, there may be one or more representations for most resources that utilize XForms to incorporate content. It's likely that the first versions of Xorion will make use of XSLTForms as the preferred XForms engine, with Orbeon likely second. One impact of this is that there will be less need for modules such as Drupal's CCK, though at the same time, code that can generate XForms from schema (with minor tweaking) would likely need to be a key part of the application.
- This shift towards resource portfolios suggests that such portfolios could in fact be generated a priori, at least for most of the standard representations. This in turn means that a web tool that can be used to aggregate, generate, and publish such resource portfolios should be considered a high priority in the development of Xorion.
- Caching will be a critical part of any architecture, and this in fact suggests that one particular pipeline that could prove useful would be explicit cache pipes (or steps) that could be used to persist document changes and cache them in the right contexts. Such staggered microcaching also offers the potential of creating multiple cache gates that could persist content at various points in the pipeline, dramatically reducing computation. Similarly, logging and debugging integration needs to be built into the process from the outset.
- One thing that should be explored is the degree to which JavaScript and some type of jQuery (or other AJAX library) integration should be made within the application. The assumption of a jQuery layer makes some activities - most notably client side updates of content - far easier, and certainly provides richer visual experiences. On the downside, you lose portability to non-JavaScript enabled clients. I'd especially be interested in hearing thoughts on this aspect of the application development.
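The envelope/payload split and the generic Atom fallback described in the list above can be sketched roughly as follows. This is an illustration in Python rather than XQuery, and every name in it (`wrap_resource`, `ResourcePortfolio`, `html_teaser`, the `html-teaser` representation key) is hypothetical rather than part of any Xorion API.

```python
import html
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("atom", ATOM_NS)


def wrap_resource(resource_id, title, payload):
    """Wrap a system-independent payload element in an Atom <entry> envelope."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = resource_id
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    content = ET.SubElement(
        entry, f"{{{ATOM_NS}}}content", {"type": "application/xml"}
    )
    content.append(payload)
    return entry


def generic_atom_constructor(entry):
    """Fallback representation: serialize the Atom entry unchanged."""
    return ET.tostring(entry, encoding="unicode")


def html_teaser(entry):
    """Example of a portfolio-specific constructor (hypothetical)."""
    title = entry.findtext(f"{{{ATOM_NS}}}title") or ""
    return f'<li class="teaser">{html.escape(title)}</li>'


class ResourcePortfolio:
    """Hypothetical container mapping representation names to constructors."""

    def __init__(self, resource_type, constructors=None):
        self.resource_type = resource_type
        self.constructors = dict(constructors or {})

    def represent(self, entry, representation):
        # Use the portfolio's own constructor when one is registered,
        # otherwise fall back to the generic Atom constructor.
        constructor = self.constructors.get(representation, generic_atom_constructor)
        return constructor(entry)


payload = ET.fromstring("<article><body>Hello</body></article>")
entry = wrap_resource("urn:example:1", "First post", payload)
portfolio = ResourcePortfolio("article", {"html-teaser": html_teaser})
print(portfolio.represent(entry, "html-teaser"))
print(portfolio.represent(entry, "atom"))  # no such constructor: Atom fallback
```

The fallback in `represent` is the "inherits from the Atom resource type" idea in miniature: a portfolio only has to supply the constructors it wants to specialize.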
I'd be interested in people adding their own thoughts on both requirements documents and use cases - so that we can better figure out which capabilities should be first integrated and which can be left for later (or not at all).
I think it would be worthwhile as a metadata layering of content. One idea that I've been playing with is a modification of one of the editor libraries in order to support RDFa embedding, as well as possibly building an inline, schema-driven IntelliSense-style editor. Think, for instance, of such an editor in conjunction with DocBook, so that you could actually create a DocBook editor that would let you bring up the relevant sub-tags with a right click to expose the appropriate menu. Then it would be a simple modality switch to go from DocBook tags to RDFa attribute bindings.
One additional benefit of such a pipeline is that it becomes possible to create intermediate services between the retrieval of the raw data and the output of content. For instance, imagine what could be done with document enrichment such as OpenCalais, which could either enrich the raw text before passing it into the output, or, as a function within the editor, transmit the raw text to OpenCalais and place the result back into the edit pane with full enrichment provided (including, ultimately, RDFa). My own belief is that most RDFa will likely NOT be manually entered, but will be determined contextually, and the architecture needs to support it.
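As a rough sketch of an intermediate enrichment step sitting between retrieval and output, the Python below chains plain functions as pipeline steps. The enricher is a local stand-in driven by a lookup table, not a call to OpenCalais or any real service, and the names `run_pipeline`, `make_enricher` and the example URI are assumptions made for illustration.

```python
import re
from typing import Callable, Iterable, Mapping

Step = Callable[[str], str]


def run_pipeline(source: str, steps: Iterable[Step]) -> str:
    """Pass the raw text through each step in order."""
    result = source
    for step in steps:
        result = step(result)
    return result


def make_enricher(known_entities: Mapping[str, str]) -> Step:
    """Build a stand-in enrichment step (hypothetical, no external service).

    Known terms are wrapped in a span carrying an RDFa `about` attribute.
    Assumes the input is plain text containing no markup characters.
    """
    if not known_entities:
        return lambda text: text
    # One regex pass, longest terms first, so inserted markup is never rescanned.
    terms = sorted(known_entities, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in terms))

    def enrich(text: str) -> str:
        return pattern.sub(
            lambda m: f'<span about="{known_entities[m.group(0)]}">{m.group(0)}</span>',
            text,
        )

    return enrich


def to_html_paragraph(text: str) -> str:
    """Final output step: wrap the (possibly enriched) text in a paragraph."""
    return f"<p>{text}</p>"


enricher = make_enricher({"eXist": "http://example.org/entity/exist-db"})
print(run_pipeline("Kurt wrote about eXist.", [enricher, to_html_paragraph]))
```

Because the enricher is just another step, the same function could in principle be called from an editor on demand or left in the output pipeline, which is the either/or described above.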
Jeni Tennison recently wrote up a check list of features I'd like to see in any content management system like this: http://www.jenitennison.com/blog/node/105
I think it would be good to make XSLT 2.0 and XProc first class citizens as scripting languages alongside XQuery.
I'm in a quandary on both, if only because of the question about support for these languages on the server. eXist of course can handle both, but Mark Logic doesn't have an XSLT2 layer (it doesn't even have a transformation capability at the moment, beyond XQuery), and despite Norm Walsh's presence there, Calabash is still largely a Java application. Creating a basic XProc implementation in Mark Logic wouldn't be THAT hard, though not trivial, so I'm actually leaning more towards making XProc be a core capability requirement (that gives us eXist, Mark Logic, and EMC's xdb9), and I suspect that we could do something similar with XML DB.
It gives us a somewhat backwards implementation - XProc natively supports XSLT and only optionally supports XQuery - but it may be a doable 1.0 solution for now until all of the XML databases mature to the point where we have the necessary XML infrastructure to be able to support these capabilities.
Kurt,
Because you mentioned XProc and xDB, I would like to add some more information to the discussion: Calumet, EMC's XProc processor, comes with a plug-in that provides a tight integration with the database. At its core, xDB is implemented as a persistent DOM, and Calumet can make direct use of it. This gives you some nice benefits like support for large documents, transaction control, indexes etc. (Like Mark Logic, xDB does not support XSLT 2.0 directly either, but as you said, you can get it indirectly via XProc.)
So, if Xorion uses XProc and/or XQuery as its XML processing layer, I think it should be relatively straightforward to use xDB with it.
Regarding the XForms front-end, I know that everybody is talking mostly about XSLTForms these days (and rightly so; it is good), but our Formula engine may also be worth looking at. It is written in Google Web Toolkit (Java sources compiled to JavaScript; but you don't really have to be aware of that) and it runs completely client-side in the browser. In your initial post, you expressed your wish to support non-JavaScript environments as well. The nice thing about Formula is that it makes it possible to implement UIs for renditions other than HTML/JavaScript: internally, we have done that for Swing, Flex or, for instance, SharePoint.
Vojtech
Vojtech,
This is good news. There's no question in my mind that XProc and XQuery will play the dominant role in Xorion.
I'm close to getting a working core in place for the app running on eXist, which I'm using primarily to test assumptions about architecture. One thing I'm planning on once I get the pieces working properly is to start determining what will make for a good set of "starter" XProc steps for the app. I think once I get past that point, I'll set up an xdb server and see what it would take to get the same system implemented there.
Kurt,
Great stuff.
Regarding #12, I would think that if XForms is to be used for the user interface, direct use of JavaScript / jQuery or other client-side libraries becomes much less of a necessity, since XForms can be used to create and script complex user interfaces. And a server-side XForms solution can also target clients that don't support JavaScript.
-Erik
One of the things that I'm beginning to think about is the idea of doing an XQuery based XForms implementation once things get far enough down the road. I still like Alain's solution on the XSLTForms client side (and really think that's where XForms should be), and I don't really see any solution with XForms that doesn't incorporate some JavaScript, if only for AJAX XMLHttpRequest support, but there's a mountain of difference between implementing callbacks to the server and building comprehensive solutions on the client.
Kurt,
Thank you again for your interest in XSLTForms.
Just in case XPath 2.0 compatibility might be important because of the Orbeon alternative, I have now added prefix support for XPath function calls in the latest XSLTForms version.
Xorion is a very interesting project!
-Alain
Ooohh ... this is seriously good news. I'd played with some namespace alternatives by overriding the core library, but that seemed to me to be a fairly risky proposition.
Feel free to write up a blog post on it here if you'd like - I'm working on an XSLTForms article now, but as I find with David Lee's pieces, I'm more of an analyst anymore than a heavy duty programmer, so am likely to miss some of the cooler features of the later incarnations.
-- Kurt
I think it would be better to use RDF instead of Atom.
What about access from more than just the browser (for example, sharing via Samba/FTP/NFS, or access from other applications)?
Content should be presentable via an ODF or MS OOXML container.
A CMS can host many sites in one database, so it needs an XML description for them.
In that case it becomes possible to query across the public data of all sites (an interesting feature).
I think it needs a presentation layer like the one in the OpenLaszlo project (possibly using that one or another, ZK for example, as a front end).
One more feature: a history tracker/logger for resources/collections/permissions, though this depends on the environment implementation. I have implemented this for our system via eXist's triggers.
---------------------
Best regards, Evgeny Gazdovsky, aka "gev"
Evgeny,
> I think it would be better to use RDF instead of Atom.
I have to admit that RDF may not be a bad idea as an internal envelope. I'm more comfortable working with Atom, so haven't really put a lot of thought into an RDF store, but I think there are some definite possibilities there.
> Content should be presentable via an ODF or MS OOXML container.
I'm kind of assuming that there will be pipes for transforming content into a large number of formats - with ODF and OOXML near the top of the list. I also see pipes that would output to PDF, LaTeX, various eBook readers, JSON and similar formats. The central challenge, as always, is assuring that we have some internal format (most likely XSL-FO) that would act as a bridge for these formats.
> What about access from more than just the browser (for example, sharing via Samba/FTP/NFS, or access from other applications)?
> I think it needs a presentation layer like the one in the OpenLaszlo project (possibly using that one or another, ZK for example, as a front end).
We're dealing with streams here. This was actually my one reservation with jQuery-heavy sites doing heavy lifting on the data side (outside of an XForms renderer). The central idea here is that the Xorion server is just that - a server of streams. Some of those will be functionality rich, some of them will be almost pure data, but ultimately what a RESTful approach buys us is the capability to be client agnostic.
The implications of this are actually pretty profound. It means that we aren't specifically tied into a given user interface, and as such can support a number of them, so long as we stay RESTful in our architecture. I'm figuring that XHTML+jQuery will be the "default" user interface - but keep in mind that we're dealing with pipelines here. A Laszlo interface might prove useful, though anymore I see Laszlo as being just another AJAX variant that can also work with Flash. An Adobe Air interface or a Microsoft Silverlight interface is another possibility.
All of these, however, are downstream pipes - we've processed the original data streams, we've handled any inclusions or view aggregations, and at some point rather than pointing to the XHTML+jQuery interface, we have a pipe which does mapping to Air or Silverlight or Laszlo. That's why I would see these as community modules that could be developed by third party devs.
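A minimal sketch of that "downstream pipe" arrangement, in Python for brevity: the shared upstream work runs once, and only the last step is chosen per client. The registry, the `register_renderer` hook and the renderer names are hypothetical; a real Air, Silverlight or Laszlo mapping would be considerably more involved.

```python
import html
import json
from typing import Callable, Dict

Renderer = Callable[[dict], str]

# Registry of downstream pipes; community modules would add their own entries.
RENDERERS: Dict[str, Renderer] = {}


def register_renderer(name: str, renderer: Renderer) -> None:
    """Hypothetical extension hook for third-party output pipes."""
    RENDERERS[name] = renderer


def upstream(resource: dict) -> dict:
    """Shared processing (inclusions, aggregation) that is client agnostic.

    Here it only normalizes the title, as a placeholder for real work.
    """
    return {**resource, "title": resource["title"].strip()}


def render_xhtml(doc: dict) -> str:
    return f"<h1>{html.escape(doc['title'])}</h1>"


def render_json(doc: dict) -> str:
    return json.dumps({"title": doc["title"]})


register_renderer("xhtml", render_xhtml)
register_renderer("json", render_json)


def serve(resource: dict, client: str = "xhtml") -> str:
    """Run the shared pipeline, then hand off to the client-specific pipe."""
    doc = upstream(resource)
    renderer = RENDERERS.get(client, RENDERERS["xhtml"])  # XHTML is the default
    return renderer(doc)


print(serve({"title": " Hello & welcome "}))
print(serve({"title": " Hello & welcome "}, client="json"))
```

Adding a new client type then means registering one more renderer, without touching `upstream`.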
> A CMS can host many sites in one database, so it needs an XML description for them.
This is a very important point, and one I'm glad you made. In some databases (Mark Logic is a good example) it is possible to set up different web servers and internal database partitions, while others (such as eXist) would need to manually differentiate between these. I'd put multi-site support as a key aspect of Xorion development.
> In that case it becomes possible to query across the public data of all sites (an interesting feature).
It is an interesting feature. A similar need will be for parameterized external feed parsing, which I feel has huge value, to the extent that this may be a core feature. If you assume that query results of open data can be expressed as feeds of some sort (RDF or Atom, presumably, or possibly with more difficulty raw XML) then the two features are isomorphic.
> One more feature: a history tracker/logger for resources/collections/permissions, though this depends on the environment implementation. I have implemented this for our system via eXist's triggers.
Actually that's several more features ;-) however, you're right to include them. I've deliberately not touched on the permissions model, though I'm leaning toward a Drupal model there - it is, for the most part, surprisingly elegant. Collections are central to the application, but because different xdbs handle them in different ways, I see the need for a virtual collection layer within the CMS itself that can then map to the right internal implementations. As to the history piece, we're going to need to put some serious thought into intelligent logging; I see this as being one of the central pieces not only for debugging but for database platform independence.
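One way to picture the virtual collection layer is as a thin resolver in front of interchangeable back-end adapters. The Python sketch below is purely illustrative: the two adapters are generic stand-ins (one hierarchical, one label based) and do not reproduce any particular database's API, and all class names and the example prefixes are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import List


class CollectionBackend(ABC):
    """Adapter interface: turn virtual path segments into a native location."""

    @abstractmethod
    def locate(self, segments: List[str]) -> str:
        ...


class HierarchicalBackend(CollectionBackend):
    """Stand-in for a store that nests collections under a root path."""

    def __init__(self, root: str):
        self.root = root.rstrip("/")

    def locate(self, segments: List[str]) -> str:
        return "/".join([self.root, *segments])


class LabelBackend(CollectionBackend):
    """Stand-in for a store that tags documents with flat collection labels."""

    def __init__(self, prefix: str):
        self.prefix = prefix

    def locate(self, segments: List[str]) -> str:
        return self.prefix + ".".join(segments)


class VirtualCollections:
    """CMS-level layer: callers only ever see virtual paths."""

    def __init__(self, backend: CollectionBackend):
        self.backend = backend

    def resolve(self, virtual_path: str) -> str:
        segments = [part for part in virtual_path.split("/") if part]
        if not segments:
            raise ValueError("virtual path must name at least one collection")
        return self.backend.locate(segments)


print(VirtualCollections(HierarchicalBackend("/db/xorion")).resolve("articles/2009"))
print(VirtualCollections(LabelBackend("xorion:")).resolve("articles/2009"))
```

The CMS code would depend only on `VirtualCollections`, so swapping databases would mean swapping the adapter.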
Nice ideas! Keep them coming!
-- Kurt
Re: Xorion General Requirements
I suggest that Xorion take XMPP into account, for instance BOSH. Real-time messaging support over XML streams would be a distinctive advantage.
Does RDF(a) have a role here?
I'm reading about the difficulties of arriving at shared URL conventions for complex document sets (legislation) and I'm wondering why RDF is not getting more attention generally in those discussions. (Thanks for the reminder, jpcs, about Jeni's CMS wish list.)
Should Xorion have a set of RDFa features that would allow document editors to express "non-tree" relationships among documents?