Software Packages

Research Software Sharing, Publication, & Distribution Checklist

Considerations for publishing a software package that may be used in research, or that you are publishing as a researcher

📒 Source control

How can you keep track of the history of your project and collaborate on it?

  • If the language you are writing in has a convenient tool for initiating a template for a package, then you may want to get your project’s git repository started using that tool. R, for example, has the {usethis} package, which makes the creation of a minimal R package very easy, including adding automated building and testing with GitHub Actions (see the sketch below).
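
A minimal sketch of getting started this way in R might look like the following; the package name and the particular {usethis} helpers shown here are illustrative rather than prescriptive:

```r
# Illustrative {usethis} calls for scaffolding a new R package;
# "mypackage" and the helpers chosen are placeholders.
usethis::create_package("mypackage")          # scaffold a minimal package
usethis::use_git()                            # initialise a git repository
usethis::use_testthat()                       # set up a {testthat} test suite
usethis::use_readme_md()                      # add a README
usethis::use_github_action("check-standard")  # add a GitHub Actions R CMD check workflow
```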

© Licensing

On what terms can others use your code, and how can you communicate this?

  • All software needs a license if you want to permit others to reuse it. It is important to give some thought to the type of license that best suits your project, as this choice can have significant long-term implications. Check out The Turing Way chapter on licensing for an introduction to the subject. If you are short on time, some fairly safe choices are: for a permissive license, the Apache 2.0, which allows the reuse of your work in closed commercial code; for a ‘copyleft’ license, the GPLv3 (AGPL for server-side apps), which requires that anyone distributing software containing your code, or derivatives of it, share the source code with the people they distributed it to.
  • If you are including external code in your package then you should check that the licenses are compatible and that you are legally allowed to distribute your code together in this way. Check out this resource on license compatibility.
  • REUSE.software is a tool that can help you keep track of licenses in complex multi-license projects. It identifies the license of the code in individual files with SPDX licence identifiers (see the header sketch below) and has an approach to doing the same for binary assets.
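
As an illustration, REUSE records license information in per-file comment headers using SPDX identifiers; a hypothetical header in an R source file might look like this (the name, year, and license are placeholders):

```r
# Hypothetical per-file header following the REUSE / SPDX convention
# SPDX-FileCopyrightText: 2024 Jane Doe <jane.doe@example.org>
# SPDX-License-Identifier: Apache-2.0
```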

📖 Documentation

How do people know what your project is, how to use it and how to contribute?

For small and simple projects a README file may be sufficient documentation. Depending on the project, this may genuinely be all that you need, or it may be inadequate to the task.

  • README / Manual
    • What your project is and what it does
    • Installation instructions
    • Contribution guidance
      • for example: Issue templates, a code of conduct, process details
    • Development environment setup
      • Overview of project organisation and structure
  • CHANGELOG: it can be a good idea to include a CHANGELOG file in your project documenting things which have changed since the previous release. This manuscript on The impact of package selection and versioning on single-cell RNA-seq analysis provides a nice case study for why this can be useful in academic settings, especially if decisions have been made to change defaults between versions.
  • ‘Docstrings’ and similar
    • Many programming languages have a way of documenting your code inline which can automate the generation of some parts of the documentation. This often takes the form of specifically marked-up comments. Examples include Python’s docstrings, R’s Roxygen2, and Perl’s POD (a Roxygen2 sketch follows this list).
  • Vignettes / Examples
    • Examples of use of the code in the context of a real problem, beyond simple example snippets which might be included in the documentation of individual functions/objects. (These can also serve as a form of simple integration tests if you run them as a part of your documentation build.)
  • Larger projects might also include Project Documentation: Plans, Design documents and Specifications
  • Process Documentation: how to proceed with various tasks related to the project. This might include: submitting issues, submitting merge requests, reporting possible vulnerabilities, testing, documentation, release, and code review.
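
As an example of the inline documentation mentioned above, an R function documented with {roxygen2} comments might look like this minimal sketch (the function itself is hypothetical); tools such as devtools::document() turn these comments into the package’s help pages:

```r
#' Add two numbers
#'
#' A trivial, hypothetical example function used here only to
#' illustrate {roxygen2} inline documentation.
#'
#' @param x A numeric value.
#' @param y A numeric value.
#' @return The sum of `x` and `y`.
#' @examples
#' add_numbers(2, 3)
#' @export
add_numbers <- function(x, y) {
  x + y
}
```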

🔗 Making Citable

How should people make reference to your project and credit your work?

It is important that code used in research can be properly cited by researchers so that they can communicate which version they used, where to find the code, and give appropriate credit to its authors. Even if you are not an academic it is important that academics be able to credit your work so that it can be appropriately valued in the scientific funding ecosystem. If it is not framed as contributing to a research output it is harder to justify funding it and paying developer salaries, even if indirectly.

Beyond making it possible to consistently reference a research output, the higher tiers on the checklist help you to follow better citation and bibliographic practices. This extends from the practical, such as making your work easy to import into reference managers like Zotero; to protecting against link rot through the use of persistent digital object identifiers; to the use of linked / semantic data practices to identify and connect contributors, the nature of their contributions, their institutional associations, and the things to which they contributed.

Further information:
  • Including a CITATION.cff (Citation File Format) file in your project repo is a simple way of making your code citable. The format is human-readable YAML and permits the provision of the metadata needed for citation (a minimal example follows this list).
  • Zenodo permits you to mint a digital object identifier (DOI) for your code; this is a persistent identifier which can be used to refer to it. You can tie the minting of versioned DOIs to releases of your project. Using a DOI permits the existing ecosystem of academic software, e.g. Zotero, to use APIs to retrieve citation metadata about your project. Zenodo also hosts a snapshot of your source code, so that if your main code repository ever went down it would still be possible to retrieve it there. Citation metadata can be imported from a .cff file or a .zenodo.json file in your repository. This makes it pretty easy to manage updates, as you can just edit these files and have a platform integration or a step in your CI push them to Zenodo the next time you do a release.
  • Software Heritage is an expansive archive of open source software operated by a non-profit organisation in collaboration with UNESCO; see how to reference and archive code in Software Heritage. Software Heritage identifiers (SWHIDs) have the advantage that they are content-based identifiers, meaning that you can check that the content you get back when you retrieve it is what you expected to get based on its identifier. The Software Heritage API permits you to automate the archiving of your project repository via a webhook from popular git forges like GitHub, GitLab, and others. Unlike Zenodo, which only preserves a snapshot of your repository at the time of deposition and at subsequent manual time points and/or tagged releases, Software Heritage archives the whole repository.
  • Further reading on the ethics of CROTs (contributor role ontologies or taxonomies), and their evolution and adoption, is potentially useful in selecting a CROT suitable for your project.
  • Nix and Guix
    • General software repositories may not make specific provision for citation of software packages in the academic fashion. However, some provide what is, for some use cases, a superior form of ‘citation’ of their own sources, i.e. a complete ‘software bill of materials’ (SBOM). This is a list of all the code used in another piece of code: its dependencies, and their dependencies recursively, along with all of their versions. Nix can do this, for example, but Guix is perhaps the most comprehensive in its approach. It not only provides all the information necessary for a complete SBOM but can bootstrap the software packages in its repository from source with an extremely minimal fixed set of binaries, an important capability for creating somewhat trustworthy builds. This creates a compute environment which is not only reproducible but verifiable, meaning the source of all of an environment’s dependencies can in theory be scrutinised. It also adopts an approach to commit signing and authorisation of signers that gives it a currently uniquely complete supply chain security architecture. Packages or ‘derivations’ are ‘pure functions’ in the sense that only their inputs affect their outputs and they have no side-effects; package builds are sandboxed to prevent dependencies on any external source not explicitly provided as an input, and inputs are hashed to ensure that they cannot differ from the values expected when they were packaged. This gives these technologies an unrivalled ability to readily demonstrate the reproducibility and provenance of compute environments specified using them.
    • Whilst not yet fully implemented and adopted, these technologies also afford some fascinating opportunities for seamless access to archival versions of software in the future. Due to the similarities in the content-based addressing used by Git, Nix, Guix, IPFS (the InterPlanetary File System), and Software Heritage’s IDs, it may be possible to construct an approach to archiving, distributing, and caching the sources of packages in a way that would ensure that low-demand archived software sources and high-demand current packages can be distributed transparently through the same mechanism. This would in theory permit the reconstruction of any historically specified compute environment that had been archived, with no changes to the normal workflow other than perhaps a longer build time. This approach also makes the creation of ‘mirrors’ of the archive relatively simple and requires no client-side changes, as an IPFS resource will be resolved irrespective of the node on which it is stored. See: NLnet Software heritage and IPFS, Tweag - software heritage and Nixpkgs, John Ericson - Nix x IPFS Gets a New Friend: SWH (SoN2022 - public lecture series)
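
Returning to the CITATION.cff file mentioned above, a minimal example might look something like the following sketch; the author, title, and identifiers are placeholders, and the Citation File Format documentation describes the full set of available fields:

```yaml
# CITATION.cff - minimal illustrative example with placeholder values
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "mypackage: a hypothetical example package"
type: software
authors:
  - family-names: Doe
    given-names: Jane
    orcid: "https://orcid.org/0000-0000-0000-0000"
version: 1.0.0
date-released: 2024-01-15
repository-code: "https://github.com/example/mypackage"
license: Apache-2.0
```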

✅ Testing

How can you test your project so you can be confident it does what you think it does?

  • A good test suite allows you to refactor your code without fear of breaking its functionality. Good tests are agnostic to the implementation details of the action that you are testing, so that you can change how you implemented something without needing to change the tests. The use of automated testing frameworks is especially useful for software that is under ongoing development, as it allows developers to catch unintended consequences that a change made in one place has on some other, unanticipated part of the code.
  • Examples of automated testing frameworks include {testthat} for R and unittest for Python (a minimal {testthat} example follows this list). Tools like Codecov or Coveralls, in conjunction with language-specific tools such as covr, can help with code coverage monitoring and insights.
  • Unit tests allow you to spell out in detail what you expect the behaviour of your software to be under a particular circumstance and test if it conforms to these expectations. Automatically running tests like this can be added to CI/CD pipelines on git forges.
  • Test coverage does not necessarily need to be 100%, or even especially high, but code coverage tools allow you to spot gaps in test coverage over important parts of your codebase and ensure that you cover them, and they can give you an indication when you have added new, poorly covered code that you may want to write tests for.
  • Try to make sure that your test suite runs fast so that you can run it regularly and quickly iterate.
  • Test Driven Development (TDD) is the practice of writing your tests first and then developing the code which conforms to these tests. It works well if you have an extremely well defined idea of what exactly you want your code to do and not do.
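
A minimal {testthat} unit test might look like the sketch below; normalise_counts() is a hypothetical function standing in for whatever behaviour your own package provides:

```r
# tests/testthat/test-normalise-counts.R - illustrative only;
# normalise_counts() is a hypothetical package function.
test_that("normalise_counts() returns values that sum to one", {
  counts <- c(2, 3, 5)
  result <- normalise_counts(counts)
  expect_equal(sum(result), 1)
  expect_length(result, length(counts))
})

test_that("normalise_counts() rejects negative input", {
  expect_error(normalise_counts(c(-1, 2)))
})
```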

🤖 Automation

What tasks can you automate to increase consistency and reduce manual work?

  • Linting is a process of statically analysing source code to catch errors which can be detected without compiling/running the code, such as syntax errors. Examples include {lintr} for R and Ruff for Python.
  • Automate the use of a standard style / format. Using an automated code formatter ensures that your project has consistent code formatting. This can forestall such debates among contributors as ‘spaces vs tabs’ for indentation, at least once you have agreed to bake that decision into your formatter and quash further discussion on the topic. Examples include {styler} for R and Black for Python.
  • Building Documentation
  • Git hooks: it can be preferable to automate certain actions based on git events
    • pre-commit hooks can be especially useful for automating things like linting and code formatting on your local system, so that you cannot commit any changes to your git history which do not conform to these standards. Another useful application is where your documentation is built from source documents in your repository and is also under version control; a pre-commit hook can ensure that you cannot commit your documentation sources and their build artefacts in an inconsistent state. You can even do this with automated testing, so contributors cannot commit code that breaks tests locally; that way, if tests break in continuous integration (CI) it is likely to be something related to differences in the testing environment or conflicting changes.
    • You can write and manage your own git hooks. There is also the tool pre-commit, written in Python and configured in YAML, which is a package manager for git commit hooks, allowing you to simply install and configure many existing hooks (a minimal configuration sketch follows this list).
  • Continuous Integration and Deployment (CI/CD) is, broadly, the automation of the integration of new changes from contributors to your project and the deployment of those changes to your users on an ongoing basis.
    • Many modern software forges combine hosting of source control with CI/CD tools. GitHub has ‘GitHub Actions’ and GitLab has ‘GitLab CI/CD’; these tools are tied to those specific git hosting services, meaning that adopting them can generate significant lock-in to that specific git hosting tool/platform (a minimal workflow sketch follows this list). Codeberg provides an instance of Woodpecker CI and there are other git host-agnostic CI/CD tools such as Jenkins available.
    • CI/CD pipelines can also be a good place to run linters and code formatters which either reject merges/pushes which do not conform to these standards or automatically apply them and use bots to commit them. This can act as a second line of defense to ensure that contributors have linted their code, applied standard formatting and built documentation.
    • CI is a good place to run automated testing so that CD does not deploy anything detectably broken by your automated test suite.
    • CI/CD is a very convenient way to manage documentation websites for your software which are built from your package’s sources. For example, the {pkgdown} package for R is specifically for building documentation sites for R packages and integrates with GitHub Actions and GitHub Pages to do so.
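
As a sketch, a minimal .pre-commit-config.yaml for the pre-commit tool mentioned above might look like the following; the specific hooks are illustrative, and the pre-commit documentation lists many more, including hooks that wrap language-specific formatters and linters:

```yaml
# .pre-commit-config.yaml - illustrative minimal configuration
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace   # strip trailing whitespace
      - id: end-of-file-fixer     # ensure files end with a newline
      - id: check-yaml            # validate YAML syntax
      - id: check-merge-conflict  # block unresolved merge-conflict markers
```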
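
Similarly, a minimal GitHub Actions workflow running R CMD check on pushes and pull requests might look something like the sketch below; it assumes the community-maintained r-lib/actions, and the versions and options shown are illustrative:

```yaml
# .github/workflows/R-CMD-check.yaml - illustrative minimal workflow
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

name: R-CMD-check

jobs:
  R-CMD-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: r-lib/actions/setup-r@v2
      - uses: r-lib/actions/setup-r-dependencies@v2
        with:
          extra-packages: any::rcmdcheck
      - uses: r-lib/actions/check-r-package@v2
```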

👥 Peer review

      • Published a peer-reviewed article with a scientific review of the theoretical / statistical / mathematical underpinnings of the tool that you implemented, in addition to a technical peer review of the code quality. (These may well be separate reviews, for example by a methods journal and a software repository, reflecting their different expertise.)
      • You have had an independent ‘red team’ attempt to find errors in your project and incorporated any relevant changes as a result.
      • Your project is a part of a bug bounty program.
  • Entities like The Journal of Open Source Software (JOSS), rOpenSci, and pyOpenSci provide a more ‘academic peer review flavoured’ form of software review and make it easy to cite software in the academic style.
  • Package repositories like CRAN and Bioconductor have quite robust processes for reviewing the suitability and quality of the packages listed in them; this is a form of peer review, though with a more technical focus than academic peer review of research manuscripts. By contrast, PyPI and npm have minimal review processes and anyone with an account can upload packages which meet their technical specifications for packaging. Different language communities have different standards and practices around their major package repositories.

📦 Distribution

  • Packaging your software so that it can easily be installed by package and environment management tools is important to allow people to use your software. Using standard packaging formats and build tools also often makes it easier to automate testing and documentation building from your source code, as well as building binary packages for different versions, operating systems, and architectures.
  • Package repositories and other packaging formats: conda, Spack, Nix.
  • If you do not have the resources to maintain your package it may be preferable to leave it out of the main package repositories; many may not allow your code to be included there without an active maintainer.

💽 Environment Management / Portability

      • conda / environment.yml (see the sketch after this list)
      • Make use of functional package managers like Nix/Guix whose package derivations make the strongest guarantees about the ability to re-build a package as they describe a pure function called in a sandboxed environment.
      • Cross operating system / architecture builds: does your package build on different operating systems and instruction set architectures (ARM, x86, RISC-V, etc.)?
  • Use of a robust environment management tool for your language which can exactly reproduce the environment in which any given build of your software was made. In particular, any released version of your software would ideally be re-buildable from source in a bit-for-bit identical fashion.
  • For a software package that people may want to run in many different environments, and which may be run with different versions of the language and other packages, it is important to check a broad combination of factors which might crop up in the environments that people are likely to be using, such as:
    • Different operating systems and versions of these operating systems
      • e.g. Linux vs Windows vs MacOS, and Win10 vs Win11
    • Different language versions
      • e.g. R 3.6.3 and R 4.3.2
    • Different computational architectures
      • x86_64, arm64, RISC-V
      • There are practically analogous issues with code that runs on AMD vs Intel vs Nvidia accelerators
  • Combinations of all of the above
  • You cannot cover all of these in all combinations, nor do you need to; just cover the ones most relevant to your software and its users.
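
As a sketch of the conda / environment.yml approach mentioned above, an environment file pinning exact versions might look like this (the package names and versions are placeholders):

```yaml
# environment.yml - illustrative environment with pinned versions
name: myproject
channels:
  - conda-forge
dependencies:
  - r-base=4.3.2
  - r-dplyr=1.1.4
```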

🌱 Energy Efficiency


Everyone likes fast and efficient code, but especially if your code is going to be re-used by a lot of people in a computationally demanding application it can burn a lot of energy. This translates to carbon emissions, water use, and opportunity costs for whatever else could have been done with that energy and compute time. If you are making a pipeline that produces a lot of intermediate files and outputs, consider which of these are needed or are good defaults, which could be optional, and which could be discarded by default. Defaults are king, and people will mostly keep whatever your tool outputs, often essentially indefinitely, so you can reduce the energy expended on unnecessary storage by keeping your outputs lean. Consider what you can do to make your code a little more efficient.

Good documentation and good error handling can reduce the number of times people make mistakes using your code and have to re-run or partially re-run their analysis before they figure out how to use it correctly.

  • Don’t generate unnecessary outputs that will sit on people’s drives unused; clean up the results of intermediate steps
  • For pipelines in particular, cache results and avoid needing to re-compute things if possible; make the best use of these features in pipeline managers, for example by having small granular tasks to minimise repeated work on run failure.
  • Choice of libraries and frameworks: some libraries may be more efficient than others, be a wrapper around an efficient implementation in another language, or be able to offload work to hardware accelerators.
  • Benchmarking & profiling to locate and improve inefficient code (see the sketch after this list)
  • Language Choice
  • Offload to hardware accelerators
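
As a small illustration of benchmarking in R, the {bench} package (one option among several, chosen here purely as an example) can compare candidate implementations of the same operation; profilers such as {profvis} can then help locate hot spots:

```r
# Illustrative benchmark comparing two ways of computing row sums;
# {bench} is only one of several benchmarking options in R.
library(bench)

m <- matrix(runif(1e6), nrow = 1e3)

bench::mark(
  apply_based = apply(m, 1, sum),  # generic apply-based version
  vectorised  = rowSums(m)         # dedicated vectorised implementation
)
```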

⚖ Governance, Conduct, & Continuity

How can you be excellent to each other, make good decisions well, and continue to do so?

  • If you are the Benevolent Dictator For Life (BDFL) of your project and the Code of Conduct (CoC) is “Don’t be a Dick”, that’s fine; for many individual hobby projects this is a functional reality. Becoming a BDFL tends to be the default unless you take steps to avoid it and cultivate community governance as your project begins to grow; failing to do this and being stuck in charge can become quite the burden in successful projects. Be wary of adopting policies for which you lack the resources, time, interest, skill, or inclination to act as an active enforcer, mediator, and moderator of community norms and disputes. It is helpful to be clear about what you can and cannot commit to doing. Only by communicating this might you be able to find community members to help you with setting and enforcing these norms, if or when your community attains a scale where this becomes relevant; community management is its own skill set. If you can’t moderate them, avoid creating and/or continuing ungoverned community spaces that can become a liability for you and your project’s reputation. Just as there are off-the-shelf licenses there are off-the-shelf codes of conduct; the Contributor Covenant is perhaps the best known and most widely used, though it may need some customisation to your needs. Adopting such a CoC gives you some guidance to follow if there is bad behaviour in your project’s community and communicates that you, as the project leadership, take the responsibility of creating a respectful environment for collaboration seriously. It can also signal that your project is a place where everyone is welcome but expected to treat one another with respect, and that failing to do so will result in penalties, potentially including exclusion from the community. The Turing Way provides quite a nice example of a CoC developed specifically for their project. You will need to provide contact information for the person(s) responsible for the enforcement of the CoC in the appropriate place and be able to follow up in the event it is used. Git forges often recognise files with the name CODE_OF_CONDUCT.md in the root of a project and provide a link to them on project home pages, so this is a good place to document such policies. If you are the BDFL of a small project then interpretation and enforcement of such a CoC tends to fall solely on you; game out some courses of action for what you’d do if faced with some common moderation challenges.
    • Once a project attracts a larger community there is greater scope for disputes, and therefore for the need for dispute resolution mechanisms. Free/Libre and Open Source Software development and maintenance can be thought of as a commons, so I would refer you to the work of Elinor Ostrom on how commons have been successfully (or unsuccessfully) governed when thinking about what processes to adopt for your project. More recently, Nathan Schneider’s Governable Spaces: Democratic Design for Online Life tackles some of these issues as applied to online spaces.
    • This is summarised in the 8 Principles for Managing a Commons:
      1. Define clear group boundaries.
      2. Match rules governing use of common goods to local needs and conditions.
      3. Ensure that those affected by the rules can participate in modifying the rules.
      4. Make sure the rule-making rights of community members are respected by outside authorities.
      5. Develop a system, carried out by community members, for monitoring members’ behavior.
      6. Use graduated sanctions for rule violators.
      7. Provide accessible, low-cost means for dispute resolution.
      8. Build responsibility for governing the common resource in nested tiers from the lowest level up to the entire interconnected system.
    • An informal do-ocracy in the fiefdom of a BDFL is often the default state of projects that have not given much conscious thought to how they want to be governed, and such projects are thus often subject to many of the common failure modes of this model. How are decisions made in your project? Do you need the mechanisms of governance used by community and civil society organisations: by-laws, a committee and/or working groups, general meetings, votes, minutes? A version of these may be necessary to avoid The Tyranny of Structurelessness. How can you map these onto your development infrastructure and make the decisions of your governing bodies enactable and enforceable?
  • Continuity planning: what happens to your project if something happens to you? The code will likely live on due to the distributed nature of git, but what about the issue tracker, the website, etc.? Who else has the highest level of privilege on your project, or a mechanism to attain it? The principle of least privilege dictates that you keep the number of people with this level of access to a minimum, but you may then create a single point of failure. Password managers like Bitwarden have a feature where designated people can be given access to your vault if they request it and you do not deny it within a certain time-frame. This could provide a lower-level admin with a mechanism to escalate their privileges if you are unable to do this for them. However, this delay might be an issue for continuity of operations if administrator action is needed within the waiting period. Game it out, have a plan, write it down, and let people know you have a plan.
  • Planning how to ‘sunset’ your project:
    • If you are no longer actively maintaining the project, let people know that it is not receiving active maintenance and might not be updated for new language and package versions.
    • It can be useful to indicate the status of the project in its README; see repostatus.org, where they define eight different project statuses.
      • Concept – Minimal or no implementation has been done yet, or the repository is only intended to be a limited example, demo, or proof-of-concept.
      • WIP – Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.
      • Suspended – Initial development has started, but there has not yet been a stable, usable release; work has been stopped for the time being but the author(s) intend on resuming work.
      • Abandoned – Initial development has started, but there has not yet been a stable, usable release; the project has been abandoned and the author(s) do not intend on continuing development.
      • Active – The project has reached a stable, usable state and is being actively developed.
      • Inactive – The project has reached a stable, usable state but is no longer being actively developed; support/maintenance will be provided as time allows.
      • Unsupported – The project has reached a stable, usable state but the author(s) have ceased all work on it. A new maintainer may be desired.
      • Moved – The project has been moved to a new location, and the version at that location should be considered authoritative. This status should be accompanied by a new URL.
    • You can also convert repositories to an archival mode on common software forges like GitHub to indicate that they are no longer being worked on.
  • Does your project take donations? Does it have a trademark? Does it need a legal entity to hold these? Who is on the paperwork and who has signing authority? Who keeps track of expenditures? Tools & Organisations like OpenCollective can help with some of these issues.
  • If your project has potential cybersecurity implications, what procedures do you have in place for people to disclose vulnerabilities in the project so that they can be patched before they are made public? What systems do you have in place to disclose a vulnerability once it has been patched and to ensure that users know that they need to update?
  • Whole project data longevity: what plans do you have in place to back up and archive materials pertaining to your project that are not under source control?
  • User support
    • What support can users expect, or not expect?
    • Where can they ask for it?
    • Is there somewhere where users can provide support to other members of the user community, such as a forum?
    • Can they pay for more support?

Research Software Sharing, Publication, & Distribution Checklists by Richard J. Acton is licensed under CC BY 4.0