Improving Drupal's content workflow

At DrupalCon Mumbai I sat down for several hours with the Drupal team at Pfizer to understand the work they have been doing on improving Drupal's content management features. They built a set of foundational modules that help advance Drupal's content workflow capabilities: content staging, multi-site content staging, better auditability, offline support, and several key user experience improvements like full-site preview. In this post, I want to shine a spotlight on some of Pfizer's modules and kick off an initial discussion around the usefulness of including some of these features in core.

Use cases

Before jumping to the technical details, let's talk a bit more about the problems these modules are solving.

  1. Cross-site content staging: In this case you want to synchronize content from one site to another. The first site may be a staging site where content editors make changes. The second site may be the live production site. Changes are previewed on the staging site and then pushed to the production site. More complex workflows could involve multiple staging environments, such as multiple sites publishing into a single master site.
  2. Content branching: For a new product launch you might want to prepare a version of your site with a new section featuring the new product. The new section would introduce several new pages, updates to existing pages, and new menu items. You want to be able to build out the updated version in a self-contained 'branch' and merge all the changes as a whole when the product is ready to launch. In an election scenario, you might want to prepare multiple sections, one for each candidate that could win.
  3. Preview your site: When you're building out a new section on your site for launch, you want to preview your entire site as it will look on the day it goes live. This is effectively content staging on a single site.
  4. Offline browse and publish: Here is a use case that Pfizer is trying to solve. A sales rep goes to a hospital and needs access to information when there is no wi-fi or only a slow connection. The site should be fully functional in offline mode, and any changes made or forms submitted should automatically sync and resolve conflicts when the connection is restored.
  5. Content recovery: Even with confirmation dialogs, people delete things they didn’t want to delete. This case is about giving users the ability to “undelete” or recover content that has been deleted from their database.
  6. Audit logs: For compliance reasons, some organizations need all content revisions to be logged, with the ability to review content that has been deleted and connect each action to a specific user so that employees are accountable for their actions in the CMS.

Technical details

All these use cases share a few key traits:

  1. Content needs to be synchronized from one place to another, e.g. from workspace to workspace, from site to site or from frontend to backend
  2. Full revision history needs to be kept
  3. Content revision conflicts need to be tracked

Much of this started as a single module: Deploy. The Deploy module was first created by Greg Dunlap for Drupal 6 in 2008 for a customer of Palantir. In 2012, Greg handed over maintainership to Dick Olsson. Dick continued to improve the Deploy module for Al Jazeera while working at Node One. Later, Dave Hall created a second Drupal 7 version with more significant improvements based on feedback from different users. Today, both Dick and Dave work for Pfizer and have continued to include lessons learned in the Drupal 8 version of the module. After years of experience working on the Deploy module and various redesigns, the team has extracted the functionality into a set of modules:

Pfizer content workflow improvements modules

Multiversion

This module does three things: (1) it adds revision support for all content entities in Drupal, not just nodes and block content as provided by core, (2) it introduces the concept of parent revisions so you can create different branches of your content or site, and (3) it keeps track of conflicts in the revision tree (e.g. when two revisions share the same parent). Many of these features complement the ongoing improvements to Drupal's Entity API.
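To make the parent-revision idea concrete, here is a minimal sketch in Python. It is not Multiversion's actual code or data model: the `Revision` class, the `find_conflicts` function and the revision IDs are all hypothetical names chosen for illustration. The only thing it takes from the description above is that every revision points at a parent, and that two revisions sharing the same parent indicate a conflict.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Revision:
    # Hypothetical structure; a real revision record carries much more.
    rev_id: str
    parent_id: Optional[str]  # None marks the root revision


def find_conflicts(revisions):
    """Group revisions by parent and return the parents with more than one child."""
    children = defaultdict(list)
    for rev in revisions:
        if rev.parent_id is not None:
            children[rev.parent_id].append(rev.rev_id)
    return {parent: sorted(kids) for parent, kids in children.items() if len(kids) > 1}


history = [
    Revision("1-a", None),
    Revision("2-b", "1-a"),
    Revision("3-c", "2-b"),  # edit made in one branch
    Revision("3-d", "2-b"),  # concurrent edit made in another branch
]

print(find_conflicts(history))  # {'2-b': ['3-c', '3-d']}
```

A linear history produces an empty result, so the same check works whether or not any branching has happened.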

Replication

Built on top of the Multiversion module, this lightweight module reads the revision information stored by Multiversion, uses it to determine which revisions are missing from a given location, and lets you replicate content between a source and a target location. The next two modules, Workspace and RELAXed Web Services, depend on the Replication module.
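The "what is missing from the target" step can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the Replication module's implementation: each location is modeled as a plain dictionary mapping an entity UUID to its list of revision IDs, and the function names `missing_revisions` and `replicate` are made up for the example.

```python
def missing_revisions(source, target):
    """Return, per entity UUID, the revision IDs present in source but not in target."""
    missing = {}
    for uuid, revs in source.items():
        known = set(target.get(uuid, []))
        absent = [rev for rev in revs if rev not in known]
        if absent:
            missing[uuid] = absent
    return missing


def replicate(source, target):
    """Copy only the missing revisions from source into target; return what was copied."""
    diff = missing_revisions(source, target)
    for uuid, revs in diff.items():
        target.setdefault(uuid, []).extend(revs)
    return diff


# Hypothetical locations: a Stage workspace that is ahead of Live.
stage = {"uuid-1": ["1-a", "2-b"], "uuid-2": ["1-x"]}
live = {"uuid-1": ["1-a"]}

copied = replicate(stage, live)
print(copied)                          # {'uuid-1': ['2-b'], 'uuid-2': ['1-x']}
print(missing_revisions(stage, live))  # {}
```

Because only the difference is transferred, running the replication a second time copies nothing, which is what makes repeated deployments cheap.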

Workspace

This module enables single-site content staging and full-site previews. The UI lets you create workspaces and switch between them. With the Replication module, different workspaces on the same site can behave like different editorial environments.

RELAXed Web Services

This module facilitates cross-site content staging. It provides a more extensive REST API for Drupal with support for UUIDs, translations, file attachments and parent revisions, all of which are important for solving the unique challenges of content staging (e.g. UUIDs and revision information are needed to resolve merge conflicts). The RELAXed Web Services module extends the Replication module and makes it possible to replicate content from local workspaces to workspaces on remote sites using this API.

In short, Multiversion provides the "storage plumbing", whereas Replication, Workspace, and RELAXed Web Services provide the "transport plumbing".

Deploy

Historically, the Deploy module has taken care of everything related to content staging, from bottom to top. But for Drupal 8, Deploy has been rewritten to just provide a UI on top of the Workspace and Replication modules. This UI lets you manage content deployments between workspaces on a single site, or between workspaces across sites (if used together with the RELAXed Web Services module). The maintainers of the Deploy module have put together a marketing site with more details on what it does: http://www.drupaldeploy.org.

Trash

To handle use case #5 (content recovery) the Trash module was implemented to restore entities marked as deleted. Much like a desktop trash or recycle bin, the module displays all entities from all supported content types where the default revision is flagged as deleted. Restoring creates a newer revision, which is not flagged as deleted.
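Here is a small Python sketch of that behavior. It is not the Trash module's code: the `Rev` and `Entity` classes and the `trash_bin` function are hypothetical, and the sketch assumes that the newest revision is always the default revision and that deleting an entity is itself recorded as a new revision flagged as deleted.

```python
from dataclasses import dataclass, field


@dataclass
class Rev:
    number: int
    title: str
    deleted: bool = False


@dataclass
class Entity:
    uuid: str
    revisions: list = field(default_factory=list)

    @property
    def default_revision(self):
        # Assumption for this sketch: the newest revision is the default one.
        return self.revisions[-1]

    def _add_revision(self, deleted):
        last = self.default_revision
        self.revisions.append(Rev(last.number + 1, last.title, deleted))

    def delete(self):
        # Nothing is removed; the deletion is just another revision.
        self._add_revision(deleted=True)

    def restore(self):
        # Restoring adds a newer revision that is not flagged as deleted.
        self._add_revision(deleted=False)


def trash_bin(entities):
    """List the UUIDs of entities whose default revision is flagged as deleted."""
    return [e.uuid for e in entities if e.default_revision.deleted]


page = Entity("uuid-1", [Rev(1, "Launch page")])
page.delete()
print(trash_bin([page]))    # ['uuid-1']
page.restore()
print(trash_bin([page]))    # []
print(len(page.revisions))  # 3
```

Since both the delete and the restore are kept as revisions, the full history stays available, which also serves the audit log use case.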

Synchronizing sites with a battle-tested API

When a Drupal site installs and enables RELAXed Web Services, its API will look and behave like the REST API from CouchDB. This is a pretty clever trick because it enables us to leverage the CouchDB ecosystem of tools. For example, you can use PouchDB, a JavaScript implementation of CouchDB, to provide a fully decoupled offline database in the web browser or on a mobile device. Using the same API design as CouchDB also gives you "streaming replication" with instant updates between the backend and frontend. This is how Pfizer implements offline browse and publish.

This animated gif shows how a content creator can switch between multiple workspaces and get a full-site preview on a single site. In this example the Live workspace is empty, while a lot of content has been added to the Stage workspace.
This animated gif shows how a workspace is deployed from 'Stage' to 'Live' on a single site. In this example the Live workspace is initially empty.

Conclusion

Drupal 8.0 core packed many great improvements, but we didn't focus much on advancing Drupal's content workflow capabilities. As we think about Drupal 8.x and beyond, it might be good to move some of our focus to features like content staging, better auditability, offline support, full-site preview, and more. If you are a content manager, I'd love to hear what you think about better supporting some or all of these use cases. And if you are a developer, I encourage you to take a look at these modules, try them out and let us know what you think.

Thanks to Tim Millwood, Dick Olsson and Nathaniel Catchpole for their feedback on the blog post. Special thanks to Pfizer for contributing to Drupal.

Comments

jstoller (not verified):

Yes please! This makes me so happy. :-) And perhaps you could throw in revision state support, so we've got more than published and not published?

April 04, 2016
Tim Millwood (not verified):

This is a good suggestion. We're already working closely with Crell and the Workbench Moderation team. Currently you can set moderation states on workspaces and the content in these workspaces gets automatically deployed when the moderation state changes to published.

April 05, 2016
Larry Garfield (not verified):

Yep. Workbench Moderation (https://www.drupal.org/project/workbench_moderation) for Drupal 8 now works on any revisionable content entity. Multiversion makes (nearly) all content entities revisionable. Workspaces are revisionable content entities. It all adds up. :-)

Palantir.net has been working with Dixon and Tim for most of this year on Workspace and Workbench Moderation specifically. We've already added code to the Workspace module so that if you moderate a Workspace using Workbench Moderation, making it published will cause it to merge to its upstream workspace. Net result: You can have an entire workspace that is in "Draft" or "Needs Review", and allow only selected people to move it to Published, aka "merge it". Essentially it becomes a more access-fine-grained UI to this system as an alternative to Deploy. (I think it works only for single-server setups, but I've not tried it in any other configuration yet so who knows, it may work.)

Workbench Moderation is in beta2 right now, and I'm reasonably confident we'll have a 1.0 before DrupalCon.

April 05, 2016
Imran Sarder (not verified):

It's really great. I just want to say: WOW! :)

April 05, 2016
Tanc (not verified):

This is extremely exciting, and having been involved in a large-scale Deploy-based build in Drupal 7, these improvements and developments look fantastic! Top work Dick and Dave, really looking forward to trying this suite out. I'd love to see content staging in Drupal core, or at least an extremely easy to use contrib module(s).

April 05, 2016
Nikhil Sukul (not verified):

Pretty awesome. It would be great if we could have Drupal 7 versions for these modules. So that we can integrate them in our existing sites.

April 05, 2016
David Rothstein (not verified):

For some of these there are other Drupal 7 modules that provide similar functionality:

- The Deploy module itself exists for Drupal 7, of course, and provides the "cross-site content staging" feature.
- Shared Content (https://www.drupal.org/project/shared_content) builds on top of Deploy and lets content creators push content from one site to another via a single checkbox.
- CPS (https://www.drupal.org/project/cps) provides "Preview your site" and "Content branching" features within a single site.

There are probably others too; those are just the ones I know off the top of my head since I've been involved with writing/maintaining two of them :)

April 06, 2016
moshe weitzman (not verified):

This is a quantum leap in core CMS functionality. I'd love to assemble a small team dedicated to getting this suite of modules into Drupal core.

April 05, 2016
dalin (not verified):

Before we jump to the "how", let's first think about whether this is a good idea to include in core:

  • Will it be used by >80% of Drupal sites — No (probably more like 2-5% for most of it, maybe up to 20% for the "trash" feature)
  • Will it make Drupal Core harder to maintain — ??? (my guess is yes)
  • Will it make Drupal's API more difficult for other developers doing more straightforward things like simple node creation — ??? (we'd need to be intentional to avoid this from happening)
  • Is there some marketing benefit that would offset the above concerns — For Acquia, probably yes; for the Drupal project at large, probably no.

So maybe this is better as something that continues to live in contrib, but changes could be made to the Drupal APIs to help it along, as long as those changes don't adversely affect people not involved.

April 05, 2016
Tim Millwood (not verified):

Firstly, I don't think we should look at putting all these features in.

When you say these features won't be used by 80% of Drupal sites, I agree, I don't think all of them will. However I think we should enforce some sensible features. Like Trash, you pick this out as something that will be used by 20% of Drupal sites, I think Drupal should enforce it to be used by 100% of sites. Another thing is things like making all content entities revisionable. Primarily this will allow us to trash any content entities, but it also opens up things like Workbench Moderation states on all content entities, rather than just Nodes and Block content.

The Multiversion module also adds many non-visible features. Some of these have been particularly tricky to implement in contrib and would be super useful to many contrib modules if they were in core.

As for making Drupal harder to maintain and the API more difficult, I feel that some of these changes may help make the Entity API easier. The issue we have is that semantic versioning does not allow us to break existing APIs, but when we get to Drupal 9 we can look at simplifying a lot of the Entity API, maybe even removing the option for entities to not be revisionable, and even to not be translatable.

April 06, 2016
Larry Garfield (not verified):

I agree with Tim entirely. Core's entity API tried to account for all of the various different combinations of feature sets present in previous core versions. In hindsight, I believe that was a mistake. Content entities may or may not be revisionable, but you can't tell that from the class/interface: They should just always be revisionable, period. (That is, CRAP not CRUD.) Content entities may or may not be translatable, but you can't tell that from the class/interface: They should always support translations, period. Much of what Multiversion does is impose a lot more consistency and predictability on Entity API, which core absolutely can and should adopt. That would make life far easier for modules like Workbench Moderation, where I have to spend a lot of time and code working around the inconsistencies in Entity API and entity definitions.

Adding a "deleted" flag to an entity revision, to support Trash-like behavior, is technically not necessary but certainly would be a nice add on, and if you're using Entity API properly in D8 you won't even notice the difference if it's not relevant to you.

Workspaces in core, eh, take 'em or leave 'em. But a lot of the underlying plumbing is totally core-appropriate.

April 06, 2016
cosmicdreams (not verified):

It would be really awesome if there was a distribution with great documentation that we could use to test this implementation for ourselves. Then I can feel comfortable adopting this approach.

If I could set up a scenario where I can prove to myself that I can get content to replicate from site to site then I can

April 06, 2016
Adam Balsam (not verified):

Acquia's Lightning Distribution has this functionality in its roadmap. It's targeted for our 1.1.0 release:
https://www.drupal.org/node/2677938

We (the Lightning Team and the Module Acceleration Team) have been working closely with Dick and Tim for the last couple of months. Our user stories are slightly different, but use the same underlying modules and should be a great starting point.

April 06, 2016
dixon_ (not verified):

That's a good idea. I know that Lightning (https://www.drupal.org/project/lightning) have plans to include these modules in their distribution. But perhaps me and the Deploy team should also put together a lightweight distro that only includes the content staging parts... We might give this a go soon :)

April 07, 2016
Bojhan Somers (not verified):

This is very interesting. I've been following these presentations for quite a while, as they seek to resolve major holes in Drupal's content workflow proposition. From my understanding of Drupal's audience, many of these features will be highly beneficial even when not actively leveraged; as moshe calls it, a quantum leap.

We definitely have an opportunity to better understand and map, which features are beneficial to what parts of our audience. This should in part help us prioritize the proposed features.

However, what worries me continuously throughout this initiative is the epic proportions of approach and development required. The idea of our new release strategy is that we are able to onboard new functionality at a much faster pace. This post indicates that we are only just exploring, and the road to inclusion is a large one requiring many fundamental framework elements needing to be put in place and mature.

From a product standpoint, this is very worrisome - as to me this indicates that the timelines, and with that the strategies and tactics, do not align with what we should be doing to remain competitive in a fast-moving market. That is perhaps acceptable for how we used to do things in Drupal 6-8. Ideally we are much more agile: move from idea, to MVP (experiments package), to full core inclusion in a matter of 1-2 releases. This will need commitment from the larger agencies/clients - but theoretically our process should now allow for this.

I am worried that when Drupal 9 comes around we will have a similar post, with an even more thorough and greater understanding of the problem space. In Drupal 8, we did significant work on this part - but little of it had a game-changing effect on core's capabilities for end users.

What's your plan of attack to bring this to core?

April 06, 2016
Dries:

There currently is no plan of attack to bring this in core. Creating such a plan would be the next step. I think you're making various assumptions in your comment that aren't necessarily true.

April 06, 2016
Greg Dunlap (not verified):

I remember being very reluctant to hand over Deploy to Dick, not because of him but because it is a project that was very dear to my heart. But I was ignoring it, the D8 initiative was starting, and it really deserved some love. Boy has he taken it and run with it. It has been thrilling to watch all this work happening over the last several years, and it is a testament to what can happen when you let go of your code and allow it to bloom under new ownership.

April 06, 2016
dixon_ (not verified):

Thanks Greg! That means a lot to hear from you :)

April 07, 2016
Marco Galimberti (not verified):

I'm very happy to hear about that ...one of my three predictions for 2016 is coming true https://www.drupal.org/node/2652402#comment-10796766 :-)

That's what I think about it: this is not an option, it should be a MUST. Every use case presented is important, but I think that this feature should be projected to the future of User Experience Platforms (UXP). Following is just an example of what is defined in Gartner's "MarketScore for User Experience Platform" in 2014 (see Integration and API services).

Content staging is not just useful for a single site or a single environment, but for communities and groups of people that have something to share: that is "federated sites". Drupal has lots of useful and nice features, but what I really like, and what I really think makes the difference between Drupal and other CMSs (if with D8 we can still talk about CMS), is the power and flexibility to create structured content. Drupal should give the chance to publish and maintain every kind of content between different sites.

I'm not a developer, I'm an IT security engineer, but I love the web and I deploy websites too. Here is one real-life example that should be applied: MITRE and NIST both provide security standards and structured data (OVAL, CVE, CWE, SCAP, etc.), each one takes part of other standards. Drupal should be the hub between both sites and each standard. In this use case all these modules should be used, but we still miss a piece of the puzzle. To share this (public) information with third parties (e.g. my own site), a tool to automatically define needed structures to the federated site should be provided. If I want to rely on the information published by these sites, I need to be sure that the structures used to define each single standard are the same.

That's one of the (open source) projects I'd like to do in the future (other, more complex ones require additional features to be developed).

I think that if you choose to include these modules in the system, the community will benefit from features that other CMSs can not give; however I think you are in the right direction.

Many other issues should be evaluated, but this is a simple comment to a blog post :-) Feel free to ask me for details.

Great job ;-)

April 06, 2016
jstoller (not verified):

My first DrupalCon was San Francisco, in 2010. Though I was not a Core contributor at the time (beyond my opinionated posts in the issue queue), I attended the Core Developers Summit for the express purpose of advocating for better content moderation and workflow support. Now, six years later (and nearly a decade after my first issue queue posts) it's exciting to see the attention to this area that I've been hoping for all these years.

Will it be used by >80% of users? I'd say absolutely! When I started looking into this issue I was (and continue to be) amazed at how many people just let users edit live web content. They did this because that's how Drupal came out of the box and implementing content moderation has ranged from difficult to impossible, depending on the version of core and the site's needs. As soon as best practices like revisions and content moderation become the default in core, their use will jump exponentially. Because why wouldn't you allow users to save drafts, have content changes approved prior to publication, and the like, if you could easily do so? I've yet to work on a single project that wouldn't benefit from these features, even if it was a single editor website. For the large institutional site that's my primary concern, robust content moderation was an absolute requirement. The lack of these features was partially responsible for that site's development being delayed by a good five years. It should have been easier and hopefully in the future it will be.

April 10, 2016