Jekyll for Drupal Users

For the past ten years or so, I’ve had various responsibilities over a web hosting environment that relies on Drupal to power hundreds of sites. I was largely responsible for the selection of Drupal and it was definitely the right solution for us when we picked it, but over the years I’ve grown frustrated with it, mainly because of its complexity, its insanely granular templating system, and the fact that its proprietary database format (which holds content, configuration, and debugging logs, ugh!) makes it very difficult to migrate anything between a development and production environment without a full overwrite.

When I started redesigning the Tay House site for our centennial this past spring, I decided I was definitely not using Drupal. But what would I use? The site had been running in ModX Evolution (now Evolution CMS), a CMS I was excited about when I first launched the site years ago. Evolution is, however, an older product now, and I’ve kind of lost interest in it as time has gone by. I wanted to see what was out there, and my search brought me to Jekyll.

Jekyll is a static site builder. Rather than relying on a server-side technology to build the site as it is accessed, a Jekyll site consists of a series of files containing the structure and content of the site which are compiled into a set of stand-alone HTML files to be deployed to a server. Since no code runs on the server, the site will be both very fast and very secure. I liked it because it meant that I could deploy the site and essentially forget about it. While there are some dynamic aspects of the Tay House site, the site is largely static and the content doesn’t change that often. Though I had set up several sites with ModX back in the day, Tay House was the only one I had that still used it, and since I wasn’t working on the site all the time, it was easy to lose track of when it needed to be updated.

This post explores my initial Jekyll experience and compares it to my many years of experience with Drupal. I also touch on some of the interesting solutions I’ve come up with to get around some of the shortcomings of not running an on-server CMS, though I’ll probably write some follow-up posts that get into them in more detail.

Modules

In Drupal, everything is controlled by modules. If you want to implement a feature, you install a module to do it. The code behind the module controls everything from the way the feature is displayed on the front-end to how it’s configured and interacted with by site editors. Drupal has a thorough system of hooks that allow modules to interact with each other and can influence or override just about any action.

Jekyll has a concept of plugins to allow sites to implement features that are not natively provided by Jekyll. Plugins are very similar to Drupal modules, but since Jekyll does not have a web-based backend and does not run interactively, the plugins tend to be much less complex. Also, since Jekyll provides a lot of flexibility over how sites are built in its core, there’s much less need to rely on plugins to extend Jekyll. In my initial launch of the Tay House site, I only needed one plugin (to help me generate context-specific menus), though I’ve since added a couple more as I’ve continued to build out the site.

Nodes

Content in Drupal is stored as what Drupal calls “nodes.” Typically a node is analogous to a page of the site, though sometimes nodes are used to store data that is used in other ways and is never accessed directly as a page. Each node has a content type which defines which fields are available to that node and, therefore, what type of data can be stored. If you’re a developer, a node is essentially an object and the content type is the class.

Jekyll has a looser data structure in terms of what fields are available to a piece of data, so there’s some degree of flexibility to how it can be used. A page in Jekyll is typically written in Markdown (though HTML can also be used) and each page is stored in a separate file. The file contains a YAML header, known as the front matter, which can hold variables specific to that page, followed by the page content, which is analogous to Drupal’s default content field. The front matter can consist of any of several standard variables, such as title (the title of the page) and permalink (the URL of the page in the generated site), but it can also contain custom variables which can be accessed when the site is generated. Custom variables do not need to be defined anywhere, so you can add as many page-specific variables as you’d like.
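As a rough sketch (not one of the actual Tay House files), a page mixing standard and custom front matter variables might look like the following. The contact_email and show_map variables are made up for illustration, and the default layout is assumed to exist in the site’s _layouts directory.

```liquid
---
# Standard variables that Jekyll itself understands
title: Visiting Us
permalink: /visit/
layout: default
# Custom variables (hypothetical); they do not need to be declared anywhere
contact_email: info@example.org
show_map: true
---
Questions about visiting? Write to {{ page.contact_email }}.

{% if page.show_map %}
A map would be rendered here.
{% endif %}
```

Everything below the second --- line is the page content, and the custom variables are read back through the page variable just like the standard ones.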

One of Jekyll’s standard page variables is layout, which specifies the template that will be used to render the page. Layouts can be used as a simple way to support Drupal-like content types by matching layout elements with expected front matter variables.

Jekyll also supports a concept called collections which can take the content type analogy a bit further. Collections are groups of related data which are stored in individual files in a specifically named directory. They can be set up to render as individual pages, though they don’t have to be; in some cases it makes more sense to access the data more like you would with a Drupal view. When data is rendered into pages, however, collections give some specific advantages, such as the ability to apply a permalink template to all collection items, similar to Drupal’s Pathauto module.

Blocks

Blocks in Drupal are used to show the secondary content on a page. A block might be used to show a site’s navigation, a list of related pages, a Twitter feed, an advertisement, or pretty much anything else you might want to show on one or more pages of a site. Blocks containing static HTML can be defined in Drupal’s UI, they can be created with views to show dynamically updated data based on the view results, or they can be implemented through a module, enabling almost any imaginable functionality.

Jekyll does not have the same concept as blocks, but because of its flexible templating system, it’s possible to implement something similar. Typically, the functionality of a block would be implemented in an include, a sub-template that can be called from within a layout template. This makes the “block” reusable, as it can be included in multiple layouts easily. Another approach would be to create an overarching layout that contains all of the “block” content and then use a sub-layout for each “content type” that contains the page-specific layout, similar to Drupal’s onion-skin template model.
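To make the include approach concrete, here is a minimal sketch of a block-like include. The file name sidebar-links.html, the heading and links parameters, and the url and title fields on each link are all assumptions for the example.

```liquid
{% comment %} _includes/sidebar-links.html (hypothetical file name) {% endcomment %}
<aside class="block">
  <h2>{{ include.heading }}</h2>
  <ul>
    {% for link in include.links %}
      <li><a href="{{ link.url }}">{{ link.title }}</a></li>
    {% endfor %}
  </ul>
</aside>
```

Any layout could then place the “block” with a single tag such as {% include sidebar-links.html heading="Related pages" links=page.related %}, where page.related is assumed to be a list defined in the page’s front matter.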

Themes

Drupal has a very elaborate, and very complicated, theme system. Drupal uses an onion-skin approach to theming that starts at the meta-structure of the page, continues to the basic page layout, and then gets into individual elements of the page. Default templates are specified in the modules that implement them, but they can be overridden by themes. Further, generic templates can be overridden by more specific ones by category or specific element. It’s a very powerful system, but it’s very difficult to understand, especially for novices.

Jekyll uses layouts, which are template files written in Liquid, the template language developed by Shopify. I found Liquid very easy to work with since it is very similar to Smarty, which I’ve used for years with my PHP applications.

A Jekyll layout is a single file containing HTML with additional markup for inserting values from variables as well as some rudimentary logic such as loops and if statements. A layout is specified in a page’s front matter via the layout variable. Layouts can access any of several variables including page, which contains the page’s front matter values, and site, for site-wide values including information about other pages and any auxiliary data loaded from files in the _data directory. The main content of the page is available in the content variable.

Jekyll pages can also contain their own Liquid logic and layouts can insert themselves into “parent” templates by specifying a layout variable in their own front matter. When that happens, the rendered content of the child layout is passed in as the content to the parent layout.
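A child layout along these lines illustrates the nesting. This is only a sketch: the event layout name and the location variable are hypothetical, and it assumes a parent layout named default that outputs {{ content }} somewhere inside its <body>.

```liquid
---
# _layouts/event.html (hypothetical name); wraps itself in the "default" layout
layout: default
---
<article class="event">
  <h1>{{ page.title }}</h1>
  {% if page.location %}
    <p>Where: {{ page.location }}</p>
  {% endif %}
  {{ content }}
</article>
```

A page that sets layout: event in its front matter is rendered into this template first, and the resulting <article> is then handed to the default layout as its content. Pairing a layout with the front matter variables it expects is also what makes it behave a little like a Drupal content type.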

Views

One of Drupal’s most powerful features is the Views module, a query-by-example interface for accessing the data stored in Drupal. Views is commonly used to create lists of data, such as all of the nodes of a given content type tagged with a certain value. For example, to create a landing page that automatically includes all of the pages within a certain section of a site, you might use a view displaying summaries of each page, and links to them. Or, you could use a view to build a list of locations for a business, with each location being its own node, so that they could all be cleanly listed on a single page of a site, even if those nodes are never accessed individually. This way, when you add or remove locations, the page is adjusted and sorted automatically, with no redesign necessary.

Jekyll doesn’t have the concept of views, but it does have multiple ways of getting data into the site, including data files and collections. Once Jekyll has that data, it’s easy to iterate over it using Liquid logic in a page, layout, or include, mimicking the output of a view.

I’ve touched on collections already, so I won’t discuss them further, other than to say that each collection is exposed in Liquid as a variable on site named after the collection. For example, if a collection named locations is declared in the site’s _config.yml and its files are stored in a directory named _locations, this data becomes available in site.locations, which is an array of the parsed contents of each file in the directory. (The site.collections variable also exists, but it is a list of the collections themselves rather than a shortcut to their items.)

Similarly, data can be passed to Jekyll in CSV, tab-separated, JSON, or YAML files, which are stored in the site’s _data directory. When Jekyll runs, it parses each file and inserts the contents, as an array, into the site.data variable with the key being the file’s name without the extension. So a file named locations.csv would be accessed through site.data.locations. For CSV and TSV files, the individual elements’ keys are derived from the first line of the file.
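For instance, a view-like list built from that locations.csv file could be sketched as follows, assuming (purely for illustration) that the first line of the file defines name and city columns.

```liquid
{% comment %} Rows come from _data/locations.csv; "name" and "city" are assumed column headers {% endcomment %}
{% assign sorted_locations = site.data.locations | sort: "name" %}
<ul>
  {% for location in sorted_locations %}
    <li>{{ location.name }}, {{ location.city }}</li>
  {% endfor %}
</ul>
```

Adding or removing a row in the CSV changes the list on the next build, which is about as close as Jekyll gets to a view that updates itself.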

Web-Based Tools

Drupal is completely web based: from configuration, to content editing, to viewing the site, everything can be done through a web UI. While this poses some management challenges, such as making it difficult to promote changes through a typical dev-test-prod workflow, it does make it easy for site owners with no web authoring experience to make changes to their sites.

Jekyll is file based and has no web backend. Software engineers will like that Jekyll can be easily integrated into source control systems like git and can be deployed using CI/CD tools, but less technical users may struggle with the semi-complex file organization system and the need to write Markdown or raw HTML without an editor. Fortunately there are a few tools to help with this, such as Forestry and NetlifyCMS. These tools provide a web interface for editing content and automatically commit changes to git repositories without the user needing to know anything about git. Forestry is a hosted tool and has a subscription cost associated with it, though a limited free version is available. NetlifyCMS is open source and can be installed alongside the Jekyll site, but it isn’t as polished. Neither is as tightly integrated or as customizable as the Drupal admin, but they do make decent solutions for content editing.

Feeds

Drupal has a pretty elaborate feeds system for importing data in various formats. It can be used for everything from creating nodes from data in a CSV file to aggregating news from another site’s RSS feed, to populating a dropdown in a form with options from a JSON API. Imported data is stored in Drupal entities, most often as nodes though it can be stored in any entity type, such as taxonomy terms. The data is linked back to the original feed via a unique ID so that updates can be automatically applied, and there are many options available for how and when to expire old content.

Like many things Drupal, the feeds system is powerful for when you need it, but it can be overkill for simple tasks. Want to display the latest headlines or today’s events in a block on the homepage? Create a feed importer, import each item as its own node, and then create a view to show those locally stored nodes. Then set the feed’s deletion policies so that those nodes get deleted when they get dropped from the feed, else you end up with a lot of cruft in your database.

With Jekyll, I was able to do something similar with two plugins. The first, jekyll_get, enables the import of JSON data from a URL. The URL gets called early in the site’s build process, the feed is parsed, and the data it contains gets added to the site.data variable under a key you specify in the site’s _config.yml, much as if the data were dropped in a file in the _data directory. From there you can use it throughout the site in Liquid markup. Since the feed is pulled each time the site is rendered, there’s no need to worry about stale data being left behind, though it’s more difficult to collect old data if the feed is limited to only the newest content.
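Once the feed is in place, using it looks no different from using a data file. The sketch below assumes the feed was stored under a key named announcements and that it is a JSON array whose items carry title and url fields; all three names are invented for the example.

```liquid
{% comment %} Assumes the imported feed lives at site.data.announcements {% endcomment %}
<ul>
  {% for item in site.data.announcements limit: 5 %}
    <li><a href="{{ item.url }}">{{ item.title }}</a></li>
  {% endfor %}
</ul>
```

The limit parameter keeps the block short even when the feed returns more items than you want on the homepage.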

Data, whether from file or API, is not rendered into pages but can be easily iterated over to create something analogous to a Drupal view. Sometimes, however, you may want to create individual pages. For example, I want to pull events out of an external calendar but display the details of each event as a page on the site. For this I found the data_page_generator plugin. This plugin lets you specify a data set, a layout (which the plugin refers to as a template), some filters, and some details about how to name the files it generates and, when the site is built, you’ll end up with a set of pages containing data from the data set. Again, since the data is reprocessed each time the site is built, if a particular row of data is removed, the page containing that data will also be removed.

Dynamic Content

Drupal is built in PHP and pages are built on the fly (unless, of course, they are cached), so it’s easy to add dynamic content via modules, template files (ugh!), or through PHP code embedded into blocks or nodes (double ugh!).

With Jekyll, being a static site builder, you’d think dynamic content would be out of the question, but it is actually possible with a little creativity. For example, I wanted a dynamic feedback form for the Tay House site, so I wrote it in PHP, and added a <?php include('/blah/blah/contact-form.php'); ?> into my Jekyll page where I wanted the form to appear. Then I just set the permalink of the generated page to have a .php extension and now I have a dynamic page on my static site.
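Put together, such a page might look roughly like this. It is a sketch under a few assumptions: the source file is HTML rather than Markdown (so the PHP tag passes through untouched), a layout named default exists, the web server is configured to run PHP, and the include path is just the placeholder from above.

```liquid
---
title: Contact
layout: default
# The .php extension in the permalink makes Jekyll write the output as contact.php
permalink: /contact.php
---
<h1>{{ page.title }}</h1>

<?php include('/blah/blah/contact-form.php'); ?>
```

Jekyll resolves the Liquid at build time and leaves the PHP tag alone, so the server only has to execute the one include when the page is requested.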

I’m looking to take this concept a bit further as I continue developing the site by having some sections that are locked down via version control, others that can be automatically updated (but still statically built) by having a separate headless CMS trigger the build process, and still others that can pull late-breaking information in from an API on page load.

Some Jekyll purists warn against embedding code, saying it defeats the purpose of a static site generator and the security it provides. Many Jekyll sites, I’ve noticed, rely on JavaScript-embeddable services, like Disqus, to handle add-on features like blog comments, but I’m an experienced developer with a background in web security, and I’d much rather trust code I’ve written over some black box service that I have no control over.

My Project

As I was starting to redo the Tay House site, I was quite surprised at how well Jekyll was able to do everything I wanted without much effort. So far I’ve rebuilt all of the site’s general content as Jekyll pages and can render them using a set of layouts that I built from scratch in a matter of hours. (For the record, I’ve never been able to build a Drupal theme completely from scratch and often spend as long as it took me to build the Jekyll templates, if not longer, just trying to disable crap I don’t want from Drupal’s starter themes.) I probably had the first pass at the site built and functional, but without a lot of content, in the time I’d spend just trying to figure out what modules I’d need in Drupal.

The site is currently managed in my own git repository, which I host using Gitea. I may end up moving it to GitHub, however, since I’m not sure I’ll be able to get NetlifyCMS or Forestry to work with Gitea and I hope to get one of them working in the near future.

Once I had the basic site done, it was time for some semi-dynamic content. While most of the site doesn’t change often, some things, namely the announcements and calendar, need to be edited more frequently. I figured that putting these in a headless CMS would make it easier for me to let other people keep on top of them. I selected Directus for this, since I like how it uses normalized SQL while still retaining revision history.

Now that I’ve proven that Jekyll will work for my use case, I’ve started to come up with my ideal configuration. I feed the data from Directus into the site at build time using its JSON API and jekyll_get and create an individual page for each item with data_page_generator.

For now, I have to build the site manually each time I make a change and then deploy it manually with rsync. I’m looking to automate this with some sort of CI/CD workflow. Ideally, I’ll get a web-based editor set up that deploys changes to a devel branch of the git repo, giving other people an easy way to manage the site while allowing me to retain editorial control by controlling the merges to prod.

I also hope to configure Directus to rebuild the prod site automatically whenever a calendar event or news item is added or changed, as I want those changes to become available as soon as possible. Since I don’t store the rendered site in git, I can do this without having to worry about merge conflicts in the repository. I’d also like to do automatic nightly rebuilds so that I can, say, show a list of the next five events on the homepage and have them drop off automatically as they pass.
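The “next five events” list is a good example of why the nightly rebuild matters. In the sketch below, the events key and the start and title fields are assumptions about how the calendar data would be shaped, with start holding an ISO-style date string.

```liquid
{% comment %} "now" is evaluated when the site is built, not when the page is viewed {% endcomment %}
{% assign now = "now" | date: "%s" | plus: 0 %}
{% assign shown = 0 %}
{% assign events = site.data.events | sort: "start" %}
<ul>
  {% for event in events %}
    {% assign starts = event.start | date: "%s" | plus: 0 %}
    {% if starts >= now and shown < 5 %}
      <li>{{ event.start | date: "%b %-d" }}: {{ event.title }}</li>
      {% assign shown = shown | plus: 1 %}
    {% endif %}
  {% endfor %}
</ul>
```

Because the comparison against the current time happens at build time, past events only drop off the list when the site is rebuilt.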

There’s a pretty good community of Jekyll users out there and, whenever I’ve gotten stuck, I’ve been able to find answers to most of my questions online. Now that I’ve gotten more used to using Jekyll, I’m starting to push the envelope a bit more, so I’ll start posting some tutorials with some of the cool stuff I’m doing soonish.