Season of Docs Case Study for 2021
Migrate BRL-CAD's Documentation Infrastructure
Description: BRL-CAD is a cross-platform solid modeling system for 3D computer-aided design (CAD) and graphic visualization. With a development history spanning more than 40 years, BRL-CAD is in production use and continuously developed by a collective of industry, academia, and government participants from all over the world. This project focuses on improving BRL-CAD's documentation infrastructure for future generations.
Case Study Author: Sean Morrison <brlcad>
In all, BRL-CAD has more than a million words of documentation across hundreds of manual pages, dozens of tutorials, hundreds of wiki pages, dozens of technical papers, and other resources.
This project attempts to modernize BRL-CAD's infrastructure, with this specific effort initiating an exploratory migration of BRL-CAD's existing documentation from the Docbook XML format to AsciiDoc in Antora. This is a first step towards improving documentation management by consolidating onto a system that is more accessible and maintainable while also preserving our ability to "compile" documentation into HTML, PDF, and other formats. It is also a step towards supporting bidirectional authoring and editing of documents both online and under revision control.
Our organization's focus for the 2021 Season of Docs entailed migrating BRL-CAD's core documentation infrastructure off of Docbook XML. For many years, BRL-CAD has utilized Docbook XML for primary documentation management, but doing so has required tooling customization, imposed documentation support limitations, and presented significant authoring challenges. Docbook XML tools are available and robust, but have required custom integration and custom maintenance. Docbook XML as a format (with or without XSLT) is rich and complex, but still not well-suited to certain types of documents (e.g., presentations, flyers, quick reference sheets, or documents with very specific formatting). It's also not as common or familiar as simpler text-based formats (e.g., Asciidoc, Markdown, etc.) that have since become popular and commonplace.
As such, this project involved setting up a new system in order to better understand modern available options and to begin migration of a significant portion of BRL-CAD's documentation to a system addressing one or more of the challenges faced with Docbook XML. Antora was proposed as a potential solution worth exploring in significant detail given Asciidoc's strong alignment, richness, and compatibility with Docbook as well as its simple text formatting and ability to support manual pages, presentations, and other document types.
Creating the proposal
The BRL-CAD development team had already identified documentation infrastructure as a potential project several years ago, generally recognizing that it was tedious and non-trivial for most contributors to work on BRL-CAD documentation. On multiple occasions, new contributors to BRL-CAD would attempt to work on documentation, only to disengage due in part to the infrastructure complexity. Even the development team struggled at times to migrate documentation in other formats to Docbook XML given the verbosity, complexity, and lack of editing support tools. The custom integration of Docbook XML within the build system was (and is) very powerful and robust, but cumbersome to edit. Hierarchy was (and is) rigidly coupled to the build system making restructuring maintenance very costly (hence avoided).
We were even successful getting a few translations of our 1-page "About BRL-CAD" documentation in Docbook XML format, but garnering casual interest was demonstrably a very significant challenge for contributors. As such, evaluating new doc infrastructure has received attention and discussion -- particularly surrounding any solutions with the potential for *round-trip* editing while still being manageable in a revision control repository. Round-trip editing proved to be a specific challenge for Docbook XML as online editing options are limited. New infrastructure was also seen as an opportunity to modernize our toolchain and potentially reach new contributors.
A call for feedback was put out on our Zulip chat and potential options were discussed amongst community members. Early discussion and preliminary research were invested in comparing Antora, Docusaurus, MkDocs, and other solutions. Ultimately, an executive decision was made to trial a conversion to Antora as it appeared to offer the most alignment and compatibility with Docbook XML, and this project initiation was the result.
Project budget estimation was very straightforward. As the project was constrained to documents already under revision control in Docbook XML format, we already had prior experience with their approximate conversion costs and the level of effort involved. By counting the number of documents, document complexity, and number of pages, and by taking the timeline into consideration, we arrived at an estimate of approximately 3-4 months of full-time effort. Several documents were manually converted to validate estimation intuition.
With over 500 documents needing to be converted, we budgeted for an amortized estimate of 1 hour per document. That time was to include setup, research, debugging, and interaction times. The appropriateness of the estimate would depend on the person's background and skills.
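As a rough cross-check of that budget, the sketch below turns the per-document estimate into months of effort. The document count (505) and the 1 hour per document figure come from this case study; the 40-hour work week and the average month length are assumptions made only for illustration.

```python
# Back-of-the-envelope check of the conversion budget.
DOCUMENTS = 505             # Docbook XML files identified for conversion
HOURS_PER_DOCUMENT = 1.0    # amortized estimate (setup, research, debugging, interaction)
HOURS_PER_WEEK = 40.0       # assumption: full-time effort
WEEKS_PER_MONTH = 52 / 12   # assumption: average month length in weeks


def estimate_months(documents, hours_per_document,
                    hours_per_week=HOURS_PER_WEEK,
                    weeks_per_month=WEEKS_PER_MONTH):
    """Convert a per-document hour estimate into months of full-time effort."""
    total_hours = documents * hours_per_document
    return total_hours / hours_per_week / weeks_per_month


months = estimate_months(DOCUMENTS, HOURS_PER_DOCUMENT)
print(f"Estimated effort: {months:.1f} months")
```

Under those assumptions the result lands just under 3 months, which sits at the low end of the 3-4 month estimate and leaves some margin for review and coordination.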
We did not encounter any unexpected expenses except for time and resourcing required to set up banking accounts for paying our writer. Account establishment and other fiscal responsibilities took considerably more time (ultimately months) than was expected resulting in significant delays and frustration. This is elaborated in more detail below in the Analysis section.
This project was principally worked on by our hired writer, Dashamir Hoxha.
With an interest in participating in the Season of Docs, Dashamir came to our community and became involved in community discussions many months in advance. We also reached out to other community contributors who have been involved in BRL-CAD documentation and community development in the past to assist with mentoring and review. Kesha Shaw agreed to serve as a formal reviewer and has been helpful in discussions.
As all participant interactions were online and virtual, communications were generally contained within our Zulip chat channel.
As for recruiting, communication, and project management, the importance of communication cannot be overstated. We learned that many of the communication procedures that we had developed over the years under the Google Summer of Code (GSoC) did not transfer very well over to Season of Docs (SoD) despite both being virtual programs. The dynamic of the writer being hired by the community created a different communication atmosphere.
With a formal agreement and direct fiscal responsibility, interactions with the writer were less informal than was typical under GSoC and other mentoring programs. We also learned the need to be more transparent in our discussions about payment systems, account setup, and payment methods. We also learned that our organization has not been very resilient to the effects of covid on our community and that our communication systems suffered as individuals became overworked, stressed, and burdened with multiple competing obligations. We learned that we need to be considerably more proactive, ideally measuring metrics along the way so we can adapt more effectively.
We anticipated the project would require 3-4 months of near full-time effort. As mentioned, we scoped this project to the size of our existing Docbook XML files, consisting of approximately 505 documents. After converting several by hand while tracking time, a shorter or longer SoD timeline became supportable.
Our writer was consistently ahead of schedule such that he completed the majority of the scope in a little under 2.5 months. He came to the project with significant Docbook and documentation system experience, so he adapted and got set up quickly. He then invested another month expanding his tasks to work on our wiki and gallery conversion, which were not in Docbook XML format. Those conversions came as a very welcome surprise.
As for dates, we set a preliminary start date to coincide with the start of SoD and a tentative end date approximately 3-4 months later. Since our writer remained consistently ahead of schedule, he was able to complete the work within 3 months even with his addition of wiki and gallery tasking.
The conversion work is still pending brlcad.org integration and final review as it's a big corpus of changes, but the new documentation system is staged publicly at: https://brl-cad.fs.al/
Dashamir was ultimately successful in converting over 600 documents from Docbook XML to Antora and Asciidoc format.
Our docs were converted from Docbook, MediaWiki, and Piwigo for our main docs, wiki docs, and gallery respectively. All three are now available on our GitHub repos:
https://github.com/BRL-CAD/brlcad-docs
https://github.com/BRL-CAD/wiki
https://github.com/BRL-CAD/gallery
Measuring success was (initially) a simple function of percent documentation converted to Asciidoc and deployed to Antora. At project start, there were precisely 505 Docbook XML documents in our doc/docbook folder of the repository that required conversion. For the project to be considered successful, we identified (somewhat arbitrarily but based on need) a requirement of 90% of the existing Docbook files to be fully converted to Asciidoc, imported to Antora, and deployed online. Our writer ended up creating approximately 674 Asciidoc documents in the brlcad-docs repository. This far exceeded our success criteria. No additional metrics or criteria were added.
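To make the success criterion concrete, the following sketch computes the minimum number of converted documents implied by the 90% threshold. The counts (505 Docbook files, roughly 674 Asciidoc files) are taken from this case study; the function names are hypothetical, and the sketch assumes the threshold is rounded up to a whole document.

```python
import math

DOCBOOK_FILES = 505      # Docbook XML files at project start
SUCCESS_PERCENT = 90     # required share of files converted and deployed
ASCIIDOC_FILES = 674     # approximate Asciidoc files the writer created


def required_conversions(total, percent):
    """Smallest whole number of documents that meets the percentage threshold."""
    return math.ceil(total * percent / 100)


def is_successful(converted, total, percent=SUCCESS_PERCENT):
    """True when the converted count meets or exceeds the threshold."""
    return converted >= required_conversions(total, percent)


needed = required_conversions(DOCBOOK_FILES, SUCCESS_PERCENT)
print(f"Documents needed for success: {needed} of {DOCBOOK_FILES}")
print(f"Asciidoc files relative to Docbook files: {ASCIIDOC_FILES / DOCBOOK_FILES:.0%}")
```

Note that the Asciidoc count includes documents beyond the original Docbook set, so the final ratio compares output size to the original scope rather than measuring a one-to-one conversion rate.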
Our writer was adept at revision control and documentation systems, so he was able to establish repositories on GitHub with complete transparency on progress being made. Updates were shared over Zulip as well.
As for our writer's efforts: the project was a resounding success. We had identified approximately 505 Docbook XML files for conversion. Our result is 674 Asciidoc files as well as two additional repositories of converted content. The BRL-CAD "wiki" repository included a conversion of 624 of our website's wiki pages from MediaWiki markup to Markdown format.
While our writer worked very well independently, our development team and reviewers struggled with changes this year (unrelated to SoD). We lost a couple of core developers to covid-related stress and professional employment status changes, and had several developers managing a backlog of responsibilities. There were also the aforementioned issues with fiscal processing that required an exceptional level of time and attention that we were not prepared to invest. There were also issues with our primary website's dedicated host hardware (hard drive failure and file system corruption) that consumed time as well. As such, communications with our writer suffered in an unplanned manner.
There were 41 people that interacted on our Zulip Season of Docs stream, but 1-on-1 interactions with our writer were not at the level we typically want to see for other mentoring programs (e.g., GSoC). We typically like to see several messages per day, throughout the day, and interactions were typically several messages per week instead. As our writer was doing a phenomenal job making visible progress independently and without roadblocks, developer attention shifted. Several developers, myself included, found themselves overwhelmed with too many concurrent responsibilities. We did not adapt well to our staffing changes and communication challenges.
Despite all that, the project is still definitively considered a success. Not only did our writer prove that BRL-CAD's documentation could be effectively converted over to Antora + Asciidoc, but he also showed that we have options for incorporating and managing our wiki documentation and other documentation types in a centralized manner. It's a beacon indicator that tells us taking the next steps to sort out compilation integration will be worthwhile.
As for the SoD program: the application process was a little confusing this year, more so than in previous years, in part due to changes. Status and uploads are not tracked through a web app / interface as they are with the Summer of Code, so that also contributed to confusion. We had to repeatedly go back to the program documentation to see if there was anything we overlooked or an action we missed, and there was always a bit of uncertainty.
The documentation, templates, and examples from Google are excellent, but we found it to be a very manual process. It would be helpful if there were a login portal for administrators that is used to view org submission status, see upcoming deadlines, manage writer applications, etc, as is done for GSoC. We realize SoD is different and targets different individuals, but the benefits of a personalized dashboard would have greatly helped reduce some uncertainties.
While we loved the idea and potential, and think Open Collective is a great resource, it was our biggest hurdle. It was excessively complex, confusing, and ultimately contributed to a loss of trust with our writer. It took months to resolve banking issues. We didn't know it would at the time and resolution always seemed a week or two away, but he eventually (and understandably) questioned why weeks would pass without payment. Ultimately, we had to establish a new business banking account that would work internationally.
If our writer had been okay with a PayPal transfer, or if we already had an international bank account established, or if our org were comfortable entering into a legal relationship with an unknown fiscal sponsor on Open Collective, none of this would have been a problem, but that is hindsight knowledge specific to our participant...
I think it would be a great improvement if payment could be issued from Google to writers directly, so orgs are not caught in the middle of international payment system rules and regulations. It would also help avoid multiple exchange rate / transfer fees.
The objectives originally set out for this project included a conversion of all the Docbook XML documents in BRL-CAD's main source repository. All 505 documents were converted along with many other legacy documents, ultimately resulting in 674 Asciidoc documents in the Antora system. An online prototype of the new documentation system was demonstrated and new infrastructure processing was set up in a GitHub repository. Going above and beyond, our writer also demonstrated a conversion of 624 MediaWiki documents to the Markdown format and MkDocs for comparison. This was also set up in a separate GitHub repository. Last but not least, our writer converted our existing image gallery to a static website demonstration and set up a GitHub repository for it as well.
In terms of technical knowledge, we learned that Antora and Asciidoc are powerful documentation infrastructure easily capable of handling even the largest of projects. We learned that BRL-CAD's documentation can be readily and effectively migrated, and that (in my humble opinion) we should migrate in full quickly in order to not risk continued documentation fragmentation. We learned that MkDocs may be a viable alternative as well, with more research needed on how to integrate a system such as Antora or MkDocs into our CMake build system as a submodule.
Our organization learned numerous lessons regarding establishing and managing fiscal accounts. We learned how to get set up on Open Collective and that significant time and resources will need to be allocated until our community is more familiar with costs and implications. We also learned that our community does not adapt well to large changes in staffing capacity.
The main lesson learned that we would choose to do differently is communication. We needed to communicate more frequently and consistently, as we have done in the past, despite all the staffing and time obligation changes faced. We needed to ensure more mentors were involved early on so that the writer has multiple points of contact to lean on for feedback and interaction.
Special thanks to Cliff Yapp for his mentoring support and challenges he posed to the writer on where this effort was heading. He outlined several important integration considerations that will need to be addressed.
Special thanks to the Open Collective staff who helped with various payment processing questions, and for their patience in helping our org resolve issues.