Stuck in the Past
Why we should treat research like software and how Quarto can help
I’m officially a third-year PhD student folks! One of my favorite aspects of being a PhD is freedom: the freedom to explore, the freedom to learn, the freedom to create, and, last but not least, the freedom to choose the tools I want to use along the way. On this blog, I have repeatedly advocated for Quarto, my preferred tool for scientific publishing: it is open-source, straightforward to use, backed by an incredibly strong and responsive development team, works well with Julia—see my previous posts here and here—and is highly customizable—see my post on extensions.
One of my least favorite aspects of being a PhD is the old-fashioned way of publishing scientific papers. Having to deal with clunky LaTeX templates when all I want to do is write up my research is the bane of my existence. It makes me dread the writing process because I know that every time I’m forced to peek inside the LaTeX compiler log, my eyes will bleed. It shouldn’t be this way! I want to be able to look forward to writing. It should be the part where you finally get to reap the rewards for the hard work you put into your project. A creative process that allows you to communicate your findings in an engaging manner. Ideas should just be allowed to flow out freely, uninterrupted by frustration about formatting issues.
This frustration is amplified by the fact that I’m so painfully aware of the fact that there are better ways to do this. We have the technology to make the process of scientific publishing so much more enjoyable, efficient, engaging, accessible and reproducible. We also generally seem to agree that science can be much more effectively communicated through modern-day forms of media that involve animated or even interactive content. But for some reason, the academic community still insists on good old-fashioned PDFs as its gold standard for scientific publishing and communication, often enough locked behind paywalls. No wonder that the general public feels disconnected from academia.
📜 Legacy Systems
Scientists pride themselves in the fact that they are continuously pushing the boundaries of human knowledge, overturning old theories and replacing them with new ones. No idea is ever too good to be challenged. The better the idea, the more challenge it is likely going to attract. The scientific method, in its ideal form, is a transparent process of continuous refinement and creative destruction. And yet we treat our main scientific output, the research paper, as a static artifact: write it up, have it peer-reviewed, publish it, pad yourself on the back—your work is now set in stone, the stamp of approval has been obtained, move on to the next one!
“Treating research outputs as static artifacts is fundamentally at odds with the dynamic nature of science.”
I have a real issue with this approach and think that treating research outputs as static artifacts is fundamentally at odds with the dynamic nature of science. It incentivizes us to focus on quantity over quality. A static artifact creates the illusion of a finished product, which is not what science is about. It is about the process of discovery, not the end result. Even though we treat papers as if they were end results, they are really just interim results: the current version of knowledge.
Wait … did I say version? Don’t we have a well-established system for managing versions of digital knowledge artifacts? Yes, we do! It’s called version control and it is the backbone of modern software development. It allows us to keep track of changes to a digital artifact over time, to collaborate with others on the same artifact, and to easily revert to previous versions if necessary. It is a powerful tool that has revolutionized the way we develop software.
“Reward researchers for the quality of their total contributions to the scientific community, not for the number of first-author papers they publish.”
If academia took the paradigm of continuous and transparent improvement seriously, it would treat research outputs like open-source software. It would use version control to keep track of changes to research outputs over time and create an environment that fosters open collaboration. It would not reward researchers for the number of first-author papers they publish but for the quality of their total contributions to the scientific community. Peer review would be a continuous process, not a one-off event. Peer review could come in many forms: from a simple comment on a GitHub issue to a full-blown pull request.
My dear colleagues and supervisors at TU Delft have recently published a position paper that argues for exactly this approach (Liem and Demetriou 2023). It is a well-written and thought-provoking piece that I hope will inspire many to rethink the way we do science. Among the things Liem and Demetriou (2023) point out is that there are clear parallels between the state of free and open-source software (FOSS) today and the goals of the open science movement. The current state of FOSS is characterized by the following:
“Where in terms of ownership, public open-source repositories may have an active team of maintainers and owners of an artifact, other people not in these groups are explicitly welcome to raise issues or feature requests if they see points for improvement, and implement and suggest contributions themselves, that the maintainers and owners may choose to incorporate.”
— Liem and Demetriou (2023)
Which is exactly what we should be doing in science:
“Similarly, in scientific insight, a core team may work on a particular project, but other researchers and interested parties may suggest changes or improvements that could be incorporated with visible provenance.”
— Liem and Demetriou (2023)
To get to that point, it seems obvious that we need to stop treating static papers as our gold standard. We need to start treating our research outputs as what they really are: interim results of a continuous process of discovery. We need to start treating them like software.
🔆 Signs of a Brighter Future
This has been a bit of a rant so far and I’m not quite done yet. Some of you may be wondering what all the fuss is about. You may think that ‘publishing through LaTeX and PDF is just what we do; it’s a proven and reliable system; been there, done that; so why should we fix something that isn’t broken?’ You might personally prefer engaging with research that is printed out on paper and the LaTeX compiler log doesn’t even scare you anymore. Perhaps all of this debate so far has simply been below your h-index. In that case, please don’t feel like you have to stick around! But if you’re still here reading this, then let me now show you some signs of a brave new world of scientific publishing that is already emerging.
Quarto Journal Extensions
Quarto journal extensions are the first sign of a brighter future. They allow you to write your paper in Quarto and then render it to a PDF that complies with some journal’s LaTeX template. This is a huge step forward because it allows you to focus on the content of your paper and not on the formatting. It also allows you to use the same document to render a PDF for submission to the journal and an HTML version for your website. This is a great way to make your work more accessible to the general public.
The current list of available journal extensions is admittedly still short. But the list is growing, partially due to the immense efforts of the Quarto team, and partially due to the fact that it is relatively straightforward to create your own journal extension (another hat tip to the dev team). While not everyone has the time and resources to create their own journal extension for every venue they submit to, I encourage you to give it a try. The dev team is very responsive and will help you along the way. Future generations of researchers will thank you for it!
Peer Review meets Version Control
Another sign of a brighter future is the emergence of publication processes that combine peer review with version control. The Journal of Open Source Software (JOSS) is a great example of this. JOSS is a free and open-access journal that publishes research software packages. It uses GitHub as its platform for peer review and version control. While I have personally never submitted to JOSS, I have submitted my work to JuliaCon Proceedings, which follows a very similar process. This year, we published our first paper on CounterfactualExplanations.jl
(Altmeyer, Deursen, et al. 2023). The review process was refreshingly transparent and collaborative. I was able to engage with reviewers and editors directly on GitHub and paper revisions were automatically rerendered through a bot. Since most of us, especially in the computational sciences, are already using GitHub or GitLab for version control, this approach seems like a no-brainer to me and I am somewhat puzzled that it hasn’t been adopted more widely yet.
🤗 Embracing Change
While the publication process for JuliaCon Proceedings is already ahead of the game, I think it can be improved even further. In particular, I think that it could benefit from embracing Quarto. The community as a whole has certainly heard of Quarto by now: in 2022, the CEO of posit himself, J.J. Allaire, gave a talk at JuliaCon about Quarto; later that same year, yours truly gave a talk about using Quarto with Julia at Julia Eindhoven; other Julia developers like Ronny Bergmann have come up with clever ways to combine Quarto with existing documentation tools like Documenter.jl through GitHub actions. To my mind, JuliaCon Proceedings is the perfect place to turn to next for Quarto adoption.
Quarto for JuliaCon Proceedings: A Proposal
I have been working on just that: a Quarto journal extension for JuliaCon Proceedings that I would like to introduce in this blog post. The extension is called quarto-juliacon-proceedings
. It is based on the existing JuliaCon Proceedings LaTeX template and will allow authors to write their JuliaCon Proceedings paper (and more!) in Quarto.
It is close to being finished but still needs some work. The repo contains a Quarto version of our JuliaCon Proceedings paper Explaining Black-Box Models through Counterfactuals (Altmeyer, Deursen, et al. 2023). The rendered PDF serves as a comparison to the published version of the paper. The remaining differences in formatting need to be sorted out. If you notice any differences that we have not already listed, please open an issue. There is also an open issue on JuliaConSubmission.jl
which you may use to share your thoughts on this proposal with the JuliaCon Proceedings team (or simply support this proposal by giving it a thumbs up).
At the time of writing, this is a proof-of-concept for how we could use Quarto for JuliaCon proceedings. For current submissions, please follow the official instructions here.
Provided we can eventually get this extension to be officially supported by the JuliaCon Proceedings team, I think it would be a great step forward for the Julia community. It would reduce the process of complying with the JuliaCon Proceedings LaTeX template to a single line of code:
quarto use template pat-alt/quarto-juliacon-proceedings
Simply run this command and start writing your paper in Quarto. The extension will take care of the rest.
Beyond making life easier for authors, I think that this extension would also be a great opportunity for the JuliaCon Proceedings team to rethink the way we publish JuliaCon Proceedings papers. I’ll finish this post by brainstorming some ideas for how we could improve the current process even further by embracing Quarto.
Continuous Peer Review
The current process is still very much based on the old-fashioned, print-based approach to scientific publishing. Once the paper is published, that’s that. This approach is at odds with the way developers treat the software they write about in their papers. Software is continuously updated and improved and aspects discussed in a paper may become outdated over time. I could imagine a world where the JuliaCon Proceedings paper is automatically updated and tagged whenever a new version of the package is released, just like the documentation. Standard code review processes that involve the community as a whole would then act as a form of continuous peer review.
Dynamic and Interactive Content
Of course, Quarto adds a whole new dimension to the way we can communicate our work. Since the same Quarto document that is used to render the static PDF can also be used to generate HTML, we can now include animated and interactive content in our papers. Many developers already use thoughtful and illustrative animations in their package documentation. It seems like a lost opportunity to not feature these in the most condensed and public-facing output (the proceedings paper). Further down the road, I could even imagine a world where Quarto integrates seamlessly with Pluto.jl—both Quarto documents and Pluto notebooks are essentially just code blended with Markdown—to allow for fully interactive versions of our papers.
Synergies between Docs and Papers
Finally, using Quarto for package documentation and JuliaCon Proceedings papers would allow us to create synergies between the two. Developers typically spend a lot of time carefully crafting their package documentation. Independent of whether or not they use Quarto to this end, the Markdown-based documentation files are already a great starting point for a JuliaCon Proceedings paper. Quarto makes it easier than LaTeX to recycle that content and turn it into a paper, since any Markdown file can be seamlessly converted to a Quarto document.
🌯 Wrap Up
I hope that this post has given you some food for thought. I am aware that some of the ideas I have presented here are quite radical and that it will take time for the academic community to embrace them. I am also aware that there are many other aspects of the scientific publishing process that need to be rethought. But the tools I highlight in this post are freely accessible to us. We just need to start using them.
References
Citation
@online{altmeyer2023,
author = {Altmeyer, Patrick},
title = {Stuck in the {Past}},
date = {2023-11-25},
url = {https://www.patalt.org/blog/posts/quarto-juliacon-proceedings/},
langid = {en}
}