The Atomic Content Engine

Jessie Dib
Mar 30, 2024

Updated: Oct 10, 2024


How do you productively transform complex, print-based content into reusable digital assets? How do you ensure content delivery across multiple platforms while maintaining structure, interactivity, and a single source of truth?

We would like to introduce you to the Atomic Content Engine, our custom-built solution for automating content extraction and transformation. It covers the whole path from content creation to digital delivery, giving our clients the control to get the most out of their content on every platform.

The Atomic Content Engine is designed to improve workflows by bridging the creation of print content and its delivery to digital applications. In this post, we dive deeper into our process, the story of why we created this tool, how the engine helps our clients innovate in an ever-changing digital landscape, and how it works!

Introduction to the Atomic Process

Our development practice is the result of delivering products on back-to-school schedules for over twenty years. Along the way, we have learned what really matters in software products and how to operate continuously in a high-stakes environment. We know that education can change lives, so we take it seriously.

Start from Content

The Atomic team covers the full span of content-first applications, from publishing content to developing the digital products that deliver it. That focus means we understand how to get your content to your technology product teams. We call it content engineering, and it is critical to every content business.

We understand content management systems, InDesign, Adobe’s ExtendScript and CEP frameworks, XML and XSL, and K4 and WoodWing. But we also know how to integrate those content tools into the digital development process. So our stack includes those tools as well as GitHub and Bitbucket, React and Angular, Java, SQL databases for metadata, JSON for content, and cloud services from AWS and Azure. The result is much closer coordination of content publishing and digital product development, and it has saved thousands of hours of expensive engineering time.

Content engineering has been part of our process from the beginning. We believe it’s where all great ed-tech products start, and we actively invent ways to improve content workflows with our clients every day.

The Origin Story

During our years of hands-on experience in helping partners transform print content into digital formats, we joined authors, editors, content producers, and product engineers alike in the struggle to produce interactive products from print content.

The old print-to-digital process was tedious and error-prone, requiring numerous manual tasks that were time-consuming and costly. Educational content is high stakes – we have seen small copy-paste errors and copy-fit errors cause major disruptions in limited classroom time.

In particular, repurposing print content, integrating metadata, and supporting multiple delivery formats presented significant challenges, even though all three are crucial for reducing app development costs and achieving high quality.

We were determined to improve this process and empower our clients, so we set out to solve those problems and enable a print content team to focus on the content. The result was the Atomic Content Engine.

The Problem It Solves

The Atomic Content Engine is built on the idea that print content can make its deadlines, can be interactive and engaging on multiple devices, and can be cost-effective.

To do all of that, it must first be reusable: it has to work for all kinds of print content, with highly variable layouts in InDesign templates, but still deliver consistently to product engineers.

These were our design constraints:

Enable a single source of truth for print and digital content
Allow authors, editors, designers, and compositors to work on the same file
Avoid duplication or confusion about where content updates are made
Empower the content team to manage the output internally
Integrate with clients' existing tools and processes
Include content validation as part of the process
Provide reports about the content extracted to ensure correct delivery to the developers
Include customization as needed, such as metadata extraction, image transformation, and other requirements based on client needs

Basically, there was no reusable bridge from print content to digital, and the content team had to fill the gap. They filled it with the late nights and manual work of producing the content twice, making two sets of deadlines, depending on the engineering team, and waiting for weeks before they could finally confirm that it all worked in the browser. That’s the problem we solved!

How Does it Work?

The Atomic Content Engine interprets InDesign documents so that a print production group can create high-quality, structured content once. They can lay it out as needed for print and structure it for application developers, all in one place.

It starts with a generic extraction that can work on any InDesign document without modification and produce simple, high-quality, valid XML. With a consistently named set of styles and consistent layout, this format will be sufficient for many InDesign file types and applications. Full stop: it just works.
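To make the idea concrete, here is a minimal sketch of what a style-driven generic extraction could look like. It is an illustration only: the real engine runs inside InDesign as ExtendScript, and this post does not show its code or its output schema. The `document` root element, the `style` attribute, and the helper names `style_to_tag` and `extract` are all assumptions made for the example, and the input is a plain list of (style name, text) pairs standing in for an InDesign document.

```python
import re
import xml.etree.ElementTree as ET


def style_to_tag(style_name):
    """Turn a paragraph style name into a safe XML element name (hypothetical mapping)."""
    tag = re.sub(r"[^A-Za-z0-9_-]+", "-", style_name.strip()).strip("-").lower()
    if not tag:
        return "p"
    if not tag[0].isalpha():
        # XML element names cannot start with a digit or a hyphen.
        tag = "p-" + tag
    return tag


def extract(paragraphs):
    """Build one XML element per paragraph, named after its paragraph style."""
    root = ET.Element("document")  # assumed root element name
    for style, text in paragraphs:
        element = ET.SubElement(root, style_to_tag(style), {"style": style})
        element.text = text
    return ET.tostring(root, encoding="unicode")


# Stand-in for the paragraphs of an InDesign story: (style name, text).
sample = [
    ("Head 1", "Plants"),
    ("Body Text", "Plants need light & water."),
]
xml_output = extract(sample)
print(xml_output)
```

Because every element name comes straight from a style name, consistent style naming in the template is what makes this kind of output predictable for developers.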

However, for many projects, styles vary, and layouts are not consistent. So, the engine can also apply reusable, custom matching rules to transform the content into a more specific target format. The rules can be made from a powerful combination of criteria in the content:

Structural elements and labels of the InDesign template
Pattern matching of the content itself
Positions of content within an X-Y coordinate range
Styles that indicate different presentations in print but might mean the same thing to a digital user

By working with those criteria in simple configuration files for the matching rules, the content team can manage their print as they see fit and deliver to engineering all from one source.
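As a rough sketch of how the four kinds of criteria above might combine in one rule, consider the example below. It is not the engine's actual rule syntax, which this post does not show. The `Frame` and `Rule` shapes, the field names, the style names, and the coordinate values are all hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Frame:
    """Simplified stand-in for one extracted text frame (hypothetical shape)."""
    label: str
    style: str
    text: str
    x: float
    y: float


@dataclass
class Rule:
    """Hypothetical matching rule; any criterion left unset is ignored."""
    target: str
    labels: set = field(default_factory=set)
    styles: set = field(default_factory=set)
    pattern: Optional[str] = None
    x_range: Optional[tuple] = None
    y_range: Optional[tuple] = None

    def matches(self, frame):
        if self.labels and frame.label not in self.labels:
            return False
        if self.styles and frame.style not in self.styles:
            return False
        if self.pattern and not re.search(self.pattern, frame.text):
            return False
        if self.x_range and not (self.x_range[0] <= frame.x <= self.x_range[1]):
            return False
        if self.y_range and not (self.y_range[0] <= frame.y <= self.y_range[1]):
            return False
        return True


def classify(frame, rules, default="para"):
    """Return the target of the first matching rule, else the default."""
    for rule in rules:
        if rule.matches(frame):
            return rule.target
    return default


RULES = [
    # Pattern matching on the content itself: numbered questions.
    Rule(target="question", pattern=r"^\d+\.\s"),
    # Two print styles that mean the same thing to a digital user.
    Rule(target="heading", styles={"Head-A", "Head-Blue"}),
    # Template label plus an X coordinate range.
    Rule(target="sidebar", labels={"sidebar"}, x_range=(400, 600)),
]

frames = [
    Frame("body", "Head-Blue", "Photosynthesis", 72, 90),
    Frame("body", "Body", "1. What do plants need?", 72, 200),
    Frame("sidebar", "Body", "Did you know?", 450, 120),
    Frame("body", "Body", "Plants make food from light.", 72, 150),
]
for frame in frames:
    print(classify(frame, RULES), "<-", frame.text)
```

In this sketch the rules are plain data, which is the property that would let a content team edit them in a configuration file without touching application code.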

The matching rules also provide a way to validate the transformations. Alongside each matching rule, the content team can include a validation that checks the result, long before they send it downstream. Validations such as “a list must have at least three items” or “a heading must always have three sub-heads” can be automated and reported to the right person to take action.
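A validation like “a list must have at least three items” can be expressed as a small check over the transformed XML. The sketch below is one possible way to write such a check, not the engine's own validation syntax. The element names (`section`, `list`, `item`) and the helper `min_children` are assumptions for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical transformed output for one section.
SAMPLE = """
<section>
  <heading>Plants</heading>
  <list><item>Roots</item><item>Stems</item></list>
  <list><item>Sun</item><item>Water</item><item>Air</item></list>
</section>
"""


def min_children(path, child, minimum):
    """Build a check: every element at `path` needs at least `minimum` `child` elements."""
    def check(root):
        problems = []
        for index, element in enumerate(root.findall(path), start=1):
            count = len(element.findall(child))
            if count < minimum:
                problems.append(
                    f"{path}[{index}]: expected at least {minimum} <{child}>, found {count}"
                )
        return problems
    return check


list_has_three_items = min_children(".//list", "item", 3)

root = ET.fromstring(SAMPLE.strip())
problems = list_has_three_items(root)
for problem in problems:
    print(problem)
```

Here the first list fails and the second passes, so the check returns a single message that can be routed to whoever owns that file.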

The rule syntax is designed to be approachable for people on the content production team who already know the content and XML expressions. They can test and modify the results without help from application developers.

The results of every transformation and validation for each InDesign file are saved in a human-readable report to move content QA earlier (shift left!). Additionally, while many schema validators stop at the first error, the Atomic Content Engine checks all the rules with every run, making it easier to resolve multiple issues at once.
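The difference between stopping at the first error and checking every rule is easy to see in a small sketch. The example below is an illustration under stated assumptions, not the engine's report format: the check functions, the dictionary that stands in for extracted content, the file name, and the report layout are all hypothetical.

```python
def run_all(checks, content):
    """Run every check, even after a failure, and return (name, passed, message) rows."""
    results = []
    for name, check in checks:
        try:
            message = check(content)  # None means the check passed
        except Exception as exc:
            message = f"check crashed: {exc}"
        results.append((name, message is None, message or "ok"))
    return results


def format_report(filename, results):
    """Render the results as a short human-readable report."""
    failed = sum(1 for _, passed, _ in results if not passed)
    lines = [f"Report for {filename}: {len(results)} checks, {failed} failed"]
    for name, passed, message in results:
        status = "PASS" if passed else "FAIL"
        lines.append(f"  [{status}] {name}: {message}")
    return "\n".join(lines)


def has_heading(content):
    return None if content.get("heading") else "missing heading"


def list_min_three(content):
    count = len(content.get("list_items", []))
    return None if count >= 3 else f"list has {count} items, expected at least 3"


def three_subheads(content):
    count = len(content.get("subheads", []))
    return None if count == 3 else f"heading has {count} sub-heads, expected 3"


CHECKS = [
    ("has_heading", has_heading),
    ("list_min_three", list_min_three),
    ("three_subheads", three_subheads),
]

# Stand-in for the content extracted from one InDesign file.
content = {"heading": "Plants", "subheads": ["Roots"], "list_items": ["Sun", "Water"]}
results = run_all(CHECKS, content)
report = format_report("unit1.indd", results)
print(report)
```

Both failing checks appear in the same report, so an editor can fix the list and the sub-heads in one pass instead of rerunning after each fix.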

The entire engine runs inside Adobe InDesign, using ExtendScript, Adobe's extended version of JavaScript. This allows an individual user to prototype results on their desktop and, when they’re ready, run the engine without modification on InDesign Server.

Finally, the engine integrates with print workflow systems like WoodWing and K4 to listen for file status and automatically process files, transforming, validating, and delivering downstream to the product engineers.
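The flow described above (status change, then transform, validate, and deliver) can be sketched as a single handler. This is a simplified illustration that does not use any WoodWing or K4 API. The status name "Ready for Digital", the function names, and the choice to hold a file when validation fails are assumptions made for the example.

```python
def process_file(filename, status, transform, validate, deliver,
                 trigger_status="Ready for Digital"):
    """Handle one status-change event from a (hypothetical) workflow system."""
    if status != trigger_status:
        return "skipped"
    xml = transform(filename)
    problems = validate(xml)
    if problems:
        # Assumption: a file with validation problems is held, not delivered.
        return "held: " + "; ".join(problems)
    deliver(filename, xml)
    return "delivered"


# Stub steps standing in for the real transformation, validation, and delivery.
delivered = []


def fake_transform(filename):
    return f"<document source='{filename}'/>"


def fake_validate(xml):
    return [] if "unit1" in xml else ["list has 2 items, expected at least 3"]


def fake_deliver(filename, xml):
    delivered.append((filename, xml))


events = [
    ("unit1.indd", "In Layout"),
    ("unit1.indd", "Ready for Digital"),
    ("unit2.indd", "Ready for Digital"),
]
outcomes = [
    process_file(name, status, fake_transform, fake_validate, fake_deliver)
    for name, status in events
]
print(outcomes)
```

Passing the three steps in as functions keeps the handler independent of any particular workflow system, which is one plausible way to support more than one.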

Let's Get to Work

We believe our Atomic Content Engine is critical for partners who understand the value of print materials, are tied to print deadlines, and are developing digital applications with the same content. If that sounds familiar, contact us for a demo. Let us show you how the Atomic Content Engine and the Atomic NYC team can meet your unique needs.