bv

design: what we changed and why

design decisions

Where bv differs from conda, Docker, and other package managers.

packaging

factored OCI images

The standard BioContainers image for samtools is a ~300 MB monolith: a full conda environment with samtools, its runtime dependencies (htslib, libgcc-ng, openssl, ncurses, ...), and the conda bookkeeping files that no user ever touches again after install. That 300 MB is transferred entirely on every version bump, even if only samtools itself changed.

bv builds factored images: one OCI layer per conda package. The samtools image has roughly 15 to 20 layers, one each for samtools, htslib, openssl, and so on. When samtools moves from 1.20 to 1.21, only the samtools layer changes. The openssl layer is byte-for-byte identical to the one in the bwa image, in the bcftools image, and in every other tool that depends on it.

The construction is intentionally simple. bv-builder downloads each pinned conda package, verifies its sha256, extracts the archive, and writes a reproducible tar. Fixed modification timestamps, deterministic file ordering, symlinks preserved. Two builds from the same input always produce the same bytes, and therefore the same layer digest.

The OCI image config records, for each layer, the sha256 of its uncompressed tar content (its diff_id). Docker verifies these on extraction. Using the compressed digest as a DiffID (easy to do accidentally) produces "wrong diff id calculated on extraction" errors. bv computes and stores both digests separately.
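A minimal Python sketch of the idea (not bv-builder's actual code, and conda-specific details like info/ metadata are skipped): pack the extracted package with fixed metadata so the bytes are reproducible, then record the two digests separately.

import gzip, hashlib, io, os, tarfile

def build_layer(package_root):
    # Pack one extracted conda package into a reproducible gzipped layer.
    raw = io.BytesIO()
    with tarfile.open(fileobj=raw, mode="w") as tar:
        paths = sorted(os.path.join(d, f)                  # deterministic file ordering
                       for d, _, files in os.walk(package_root) for f in files)
        for path in paths:
            info = tar.gettarinfo(path, arcname=os.path.relpath(path, package_root))
            info.mtime, info.uid, info.gid = 0, 0, 0       # fixed timestamps and ownership
            info.uname = info.gname = ""
            if info.issym():
                tar.addfile(info)                          # symlinks carry no file data
            else:
                with open(path, "rb") as fh:
                    tar.addfile(info, fh)
    uncompressed = raw.getvalue()
    compressed = gzip.compress(uncompressed, mtime=0)      # mtime=0 keeps the gzip header stable
    diff_id = "sha256:" + hashlib.sha256(uncompressed).hexdigest()    # goes in the image config
    layer_digest = "sha256:" + hashlib.sha256(compressed).hexdigest() # goes in the manifest
    return compressed, layer_digest, diff_id

Same input, same bytes, same digest, which is what layer sharing below relies on.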

layer sharing

Layer deduplication in OCI registries and in Docker's local content-addressable store is based on digest. Two layers with the same digest are stored and transferred once, regardless of which images reference them.

For this to work across tools, the openssl 3.2.1 layer in samtools must have the same digest as the openssl 3.2.1 layer in bwa. That's only true if the construction is deterministic and both tools were built from the same pinned conda package (same name+version+build+sha256 from repodata). bv ensures both: packages are pinned by all four fields, and the tar construction is reproducible.

When a closure has too many packages to fit in solo layers, bv applies a popularity-based packing strategy. The most popular packages (openssl, libgcc-ng, zlib, ncurses, and so on, ranked across the full registry) each get their own layer; the long tail is batched. The dedicated layer sharing page walks through the algorithm with worked examples.
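Roughly, the packing step looks like this (a sketch; the real popularity ranking comes from registry-wide dependency counts, and the layer budget here is an illustrative number):

def plan_layers(closure, popularity, max_layers=20):
    # closure: package names in one tool's resolved closure.
    # popularity: package name -> how many registry tools depend on it.
    ranked = sorted(closure, key=lambda p: popularity.get(p, 0), reverse=True)
    solo = ranked[:max_layers - 1]          # most-shared packages get a layer of their own
    tail = ranked[max_layers - 1:]          # the long tail is batched into one layer
    layers = [[p] for p in solo]
    if tail:
        layers.append(sorted(tail))         # deterministic order inside the batch
    return layers

openssl, libgcc-ng, zlib and friends land in solo layers shared across tools; the rarely shared remainder goes into a single batched layer per tool.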

resolving transitive dependencies without a solver

Bioconda packages declare their direct dependencies, not their transitive closure. samtools declares htslib, which declares libdeflate and bzip2, which declare libgcc-ng. Running samtools from a scratch container without those libraries produces linker errors or silent failures.

bv-builder resolves the full transitive closure by reading repodata directly. It downloads repodata.json from each declared channel (bioconda, conda-forge) and walks the depends field of each package breadth-first, accumulating a complete pinned package set.

The tradeoff: bv doesn't do full constraint satisfaction. It picks the latest match for each spec and resolves greedily, rather than finding a globally consistent version assignment. This works for the vast majority of bioinformatics tools because the published version is already a known-good combination. For tools with unusual cross-package constraints it may pick wrong versions; in those cases the image will fail at runtime rather than at resolve time, which is a worse failure mode than mamba's upfront conflict detection.

Transitive dependencies that can't be found in any declared channel (private channels, unusual naming conventions, meta-packages) are skipped with a warning rather than failing the build. Direct dependencies still fail loudly.
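In outline, the resolver is a plain breadth-first walk. A Python sketch under simplifying assumptions: repodata is already loaded as one name-to-candidates dict per channel, and "latest" is picked naively rather than with conda's match-spec and version-ordering logic.

from collections import deque

def resolve_closure(root_specs, channels):
    pinned, queue, warned = {}, deque(root_specs), set()
    roots = {s.split()[0] for s in root_specs}
    while queue:
        name = queue.popleft().split()[0]          # drop any version constraint for lookup
        if name in pinned:
            continue
        candidates = next((ch[name] for ch in channels if name in ch), None)
        if candidates is None:
            if name in roots:                      # direct dependencies fail loudly
                raise RuntimeError(f"{name}: not found in any declared channel")
            if name not in warned:                 # transitive misses only get a warning
                print(f"warning: skipping unresolvable dependency {name}")
                warned.add(name)
            continue
        best = max(candidates, key=lambda r: r["version"])  # greedy newest pick, no global solve
        pinned[name] = best                        # record name+version+build+sha256
        queue.extend(best.get("depends", []))
    return pinned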

reproducibility

tags are mutable; digests are not

A Docker image tag like samtools:1.21 is mutable. The registry owner can push a new image under the same tag without changing the tag string. Every user who then pulls samtools:1.21 gets the new image. This is sometimes intentional (security patches) and sometimes not (accidental overwrite), but either way the tag is no longer a stable identifier.

An OCI manifest digest like sha256:3f7a... is not mutable. It's the sha256 of the manifest JSON, which covers the digest of every layer and the config. If any layer changes, the manifest digest changes. Pulling by digest either returns exactly those bytes or fails.

bv records digest, not tag. After building and pushing an image, bv-builder reads the manifest digest back from the registry and writes it to the registry manifest file. bv add reads that digest and embeds it in bv.lock. bv sync pulls by digest. A tag is used only for the initial pull reference; the digest is what actually identifies the image.

the lockfile

bv.lock is a flat TOML file. Each entry records the tool id, version string, image reference, manifest digest, the manifest's own sha256, and the binary routing index. It is generated deterministically from bv.toml and the registry contents.

The design mirrors Cargo.lock and uv.lock: the lockfile is the reproducibility artifact. Commit it. Do not gitignore it. On any other machine, bv sync reads the lockfile and pulls exactly those digests; the decision of what to pull was already made when the lockfile was written.

bv lock --check exits 1 if the lockfile would change given the current bv.toml and registry contents. Use it in CI to detect drift:

bv lock --check   # fails if bv.lock is out of date
bv sync --frozen  # fails if bv.toml and bv.lock are inconsistent
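Because the lockfile is generated deterministically, the --check mode reduces to an equality test. A sketch of the idea (generate_lock stands in for bv's generation step and is not its real name):

from pathlib import Path

def lock_check(project: Path, generate_lock) -> int:
    # generate_lock rebuilds the lockfile text from bv.toml plus registry contents.
    expected = generate_lock(project / "bv.toml")
    current = (project / "bv.lock").read_text() if (project / "bv.lock").exists() else ""
    return 0 if expected == current else 1         # non-zero exit fails the CI job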

usability

the same workflow, on a laptop and on HPC

Docker is standard on laptops. Most HPC clusters don't allow it: the daemon requires elevated privileges, and shared systems don't give those out. Apptainer (formerly Singularity) runs user-space containers without a daemon, so it's the de facto standard on HPC.

Most Docker-based bioinformatics tools just don't work on HPC. Most Apptainer-native workflows are one-off shell scripts that don't transfer to a laptop. The bifurcation is a real cost: the same analysis requires different tooling depending on where you run it.

bv uses the same manifest, the same lockfile, and the same bv run syntax on both backends. The backend is either auto-detected (bv doctor shows what's available) or explicitly pinned in bv.toml or via BV_BACKEND. The only thing that changes between backends is how the container is launched; the image, the mount paths, and the binary routing are identical.

Apptainer's read-only container root is the main practical difference. Docker's writable upper layer means a tool can write anywhere inside the image; on Apptainer it cannot. bv handles this with declared cache paths: the tool's manifest lists the paths it writes to, and bv binds writable host directories there automatically. The common well-known paths (/cache, /root/.cache) are bound by default as a fallback for tools whose manifests haven't declared them yet.
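A sketch of the launch-side difference, assuming only the Docker and Apptainer CLIs (the exact flags bv passes, and its handling of image caching, are not shown):

import os, shutil, subprocess

def launch(image, argv, cache_paths=("/cache", "/root/.cache")):
    if shutil.which("docker"):
        # Docker: the image root is writable, so only the working directory needs a mount.
        cmd = ["docker", "run", "--rm",
               "-v", f"{os.getcwd()}:/data", "-w", "/data", image, *argv]
    elif shutil.which("apptainer"):
        # Apptainer: the root is read-only, so bind a writable host dir over each declared cache path.
        binds = []
        for i, cpath in enumerate(cache_paths):
            host = os.path.abspath(f".bv/cache/{i}")
            os.makedirs(host, exist_ok=True)
            binds += ["--bind", f"{host}:{cpath}"]
        cmd = ["apptainer", "exec", *binds, f"docker://{image}", *argv]
    else:
        raise RuntimeError("no container backend found; run bv doctor")
    return subprocess.run(cmd).returncode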

tools as typed interfaces, not just binaries

pip, conda, and apt see a tool as a name, a version, and a binary. That's enough to install it, but not enough to compose tools, validate pipelines, or generate tool descriptions for AI assistants.

bv manifests declare typed inputs and outputs using the bv-types vocabulary: fasta, fastq, bam, vcf, blast_tab, pdb, and about 15 more. Each I/O entry has a name, a type, a cardinality, and an optional description.

This schema does three things. First, it enables search: bv search fasta returns tools that accept FASTA input. Second, it enables validation: before running a pipeline step, bv can check that the declared output type of one tool matches the declared input type of the next. Third, it generates MCP tool descriptors that AI assistants can consume directly:

bv show blast --format mcp        # MCP tool descriptor for Claude, etc.
bv show blast --format json-schema # JSON Schema for the tool's inputs
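The json-schema output amounts to a mechanical mapping from the typed I/O entries. A sketch (the manifest field names here are illustrative, not bv's exact schema):

def inputs_to_json_schema(manifest):
    props, required = {}, []
    for entry in manifest.get("inputs", []):
        prop = {"type": "string", "format": f"bv-types/{entry['type']}"}   # e.g. bv-types/fasta
        if entry.get("description"):
            prop["description"] = entry["description"]
        if entry.get("cardinality") == "many":
            prop = {"type": "array", "items": prop}     # many files become an array of paths
        props[entry["name"]] = prop
        if entry.get("required", True):
            required.append(entry["name"])
    return {"type": "object", "properties": props, "required": required}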

Tools without typed I/O are assigned the experimental tier and are excluded from default search results. This creates a mild incentive for manifests to be complete without making it a hard gate on publication.

binary routing

A typical bioinformatics project uses five to twenty tools. Those tools expose dozens of binaries: blast exposes blastn, blastp, tblastn, makeblastdb; samtools exposes samtools; htslib exposes bgzip, tabix. You want to call these by name in your scripts without prefixing every command with docker run ....

bv generates PATH shims under .bv/bin/, one per exposed binary. Each shim is a small script that calls bv run <binary>. bv exec prepends .bv/bin/ to PATH before running any command, so existing scripts, Makefiles, and Snakemake rules work without modification.

The binary index (which binary routes to which tool) is stored in bv.lock. Conflicts (two tools exposing the same binary name) are detected at bv lock time with a clear error message, not at runtime. Resolve them with [binary_overrides] in bv.toml.
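Shim generation and conflict detection together fit in a few lines. A sketch (the shim body and the index shape are illustrative):

import stat
from pathlib import Path

def write_shims(exposed, overrides=None, bin_dir=Path(".bv/bin")):
    # exposed: tool id -> list of binary names that tool exposes.
    overrides = overrides or {}                       # [binary_overrides] from bv.toml
    index = {}
    for tool, binaries in exposed.items():
        for name in binaries:
            winner = overrides.get(name, tool)
            if name in index and index[name] != winner:
                raise SystemExit(f"binary '{name}' is exposed by both {index[name]} and {tool}; "
                                 f"add a [binary_overrides] entry to bv.toml")
            index[name] = winner
    bin_dir.mkdir(parents=True, exist_ok=True)
    for name in index:
        shim = bin_dir / name
        shim.write_text(f'#!/bin/sh\nexec bv run {name} "$@"\n')   # each shim defers to bv run
        shim.chmod(shim.stat().st_mode | stat.S_IEXEC)
    return index                                      # recorded in bv.lock as the routing index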

auto-sync before exec

A recurring failure mode: someone adds a tool to bv.toml on one machine, commits the file, and then runs a command on another machine without first running bv sync. The binary shim doesn't exist yet. The command fails with a confusing error about a missing executable rather than a clear "you need to sync."

bv exec checks whether the project state matches bv.lock before running. If images are missing or shims are stale, it syncs first. This mirrors how uv run installs dependencies before executing. The check is fast: it compares digests from the lockfile against what's cached locally without hitting the network if everything is present.

Pass --no-sync to skip this check when you know the environment is current and don't want the overhead.
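The pre-exec check, in outline (is_locally_available and sync are passed in here as stand-ins for bv internals, not its real function names):

import os, subprocess

def exec_with_autosync(lock_entries, command, is_locally_available, sync, no_sync=False):
    if not no_sync:
        # Digest check against the local cache only; no network if everything is present.
        missing = [e for e in lock_entries if not is_locally_available(e["digest"])]
        stale = [b for e in lock_entries for b in e["binaries"]
                 if not os.path.exists(f".bv/bin/{b}")]
        if missing or stale:
            sync(lock_entries)                         # pull by digest, regenerate shims
    env = dict(os.environ)
    env["PATH"] = os.path.abspath(".bv/bin") + os.pathsep + env["PATH"]   # shims win name lookup
    return subprocess.run(command, env=env).returncode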

prior work

comparison

feature | conda / mamba | Docker (manual) | Nextflow / Snakemake | bv
--- | --- | --- | --- | ---
pins by digest | no (version strings) | if you write it down | per-process container refs | yes, in bv.lock
works on HPC without root | yes | no | with Apptainer backend | yes (Apptainer backend)
same config, laptop + HPC | yes (if env is compatible) | no | no (backend differs) | yes
layer deduplication across tools | n/a | only if Dockerfiles share base | only if images share base | yes (per-package layers)
typed I/O schema | no | no | no | yes (bv-types vocabulary)
binary routing (call by name) | yes (active env) | no | no | yes (PATH shims)
transitive dep resolution | yes (full SAT solver) | manual or via base image | manual | yes (greedy BFS, no solver)

bv doesn't replace Nextflow or Snakemake for complex multi-step workflows; it manages the tool layer that those frameworks call into. The intended position: bv handles "which exact bytes of samtools am I running," and the workflow framework handles "in what order do I run these steps."

Compared to plain conda: bv doesn't do full SAT solving, so it will miss some version conflicts that mamba would catch. The tradeoff is that bv produces an OCI artifact with a stable digest, which conda does not. If you need the solver, use mamba to build the environment and bv to package and pin the result.

the pull pipeline

On macOS, Docker Desktop routes all docker pull commands through a Linux VM. Layers are fetched by the VM's network stack, transferred to the macOS host, then decompressed by containerd. The VM hop adds latency and CPU overhead for every layer.

bv bypasses this for GHCR images: it downloads all layer blobs directly over native HTTPS, assembles an OCI Image Layout tar in memory, and pipes it to docker load. The download happens on the host; containerd only sees the final assembled archive.

bearer authentication

OCI registries use OAuth2-style bearer tokens. bv performs the standard exchange:

  1. Probe GET /v2/ to receive a 401 with a WWW-Authenticate: Bearer realm=...,service=...,scope=... header.
  2. Read credentials from ~/.docker/config.json. bv checks credHelpers[registry] first, then credsStore, then the base64 auths entry. If a helper is configured (e.g. docker-credential-desktop), bv spawns it as a subprocess, writes the registry hostname to stdin, and parses the JSON credentials response.
  3. Fetch a bearer token scoped to repository:<name>:pull using Basic auth against the token endpoint from realm.

This mirrors what docker pull does internally, but in-process on the host rather than inside the VM.
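The same three steps in Python, using requests (the challenge parsing is simplified; the credential-helper exchange follows the documented protocol: the action as an argument, the registry on stdin, JSON on stdout):

import base64, json, os, subprocess, requests

def bearer_token(registry, repository):
    # 1. Probe /v2/ and parse the WWW-Authenticate challenge from the 401.
    probe = requests.get(f"https://{registry}/v2/")
    challenge = probe.headers["WWW-Authenticate"].removeprefix("Bearer ")
    fields = dict(part.split("=", 1) for part in challenge.split(","))
    realm = fields["realm"].strip('"')
    service = fields["service"].strip('"')

    # 2. Credentials from ~/.docker/config.json: credHelpers, then credsStore, then auths.
    cfg = json.load(open(os.path.expanduser("~/.docker/config.json")))
    helper = cfg.get("credHelpers", {}).get(registry) or cfg.get("credsStore")
    if helper:
        out = subprocess.run([f"docker-credential-{helper}", "get"],
                             input=registry, text=True, capture_output=True, check=True)
        creds = json.loads(out.stdout)
        auth = (creds["Username"], creds["Secret"])
    else:
        auth = tuple(base64.b64decode(cfg["auths"][registry]["auth"]).decode().split(":", 1))

    # 3. Exchange for a token scoped to pull on this repository.
    resp = requests.get(realm, auth=auth,
                        params={"service": service, "scope": f"repository:{repository}:pull"})
    body = resp.json()
    return body.get("token") or body["access_token"]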

manifest and layer downloads

GET /v2/<repo>/manifests/<digest> returns the raw manifest JSON. bv preserves the exact response bytes; sha256(manifest_bytes) must equal the digest stored in bv.lock. The Docker-Content-Digest response header provides the digest; if absent, bv computes it from the bytes.

The manifest lists a config blob and all layer blobs. bv spawns a tokio JoinSet task for each blob and downloads them all concurrently. For a typical bv image (samtools has ~15 layers), this matters: layers that would download in ~8s sequentially complete in ~1.5s at 10x concurrency. After all tasks complete, bv restores manifest order for reproducibility before assembly.

bv.lock pins by OCI manifest digest, not by tag. Fetching /manifests/sha256:3f7a... guarantees exactly those bytes are returned, or the request fails. There is no ambiguity from tag mutability.
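A rough asyncio analogue of the concurrent fetch (bv's implementation is a tokio JoinSet in Rust; requests-in-threads is used here only to keep the sketch short):

import asyncio, hashlib, requests

async def fetch_image(registry, repo, manifest_digest, token):
    headers = {"Authorization": f"Bearer {token}",
               "Accept": "application/vnd.oci.image.manifest.v1+json"}
    # Fetch the manifest by digest and verify the raw bytes against the pinned digest.
    r = requests.get(f"https://{registry}/v2/{repo}/manifests/{manifest_digest}", headers=headers)
    assert "sha256:" + hashlib.sha256(r.content).hexdigest() == manifest_digest, \
        "manifest bytes do not match the digest pinned in bv.lock"
    manifest = r.json()

    # Config blob plus every layer blob, downloaded concurrently.
    digests = [manifest["config"]["digest"]] + [l["digest"] for l in manifest["layers"]]
    def get_blob(d):
        return d, requests.get(f"https://{registry}/v2/{repo}/blobs/{d}", headers=headers).content
    done = await asyncio.gather(*(asyncio.to_thread(get_blob, d) for d in digests))
    blobs = dict(done)
    # Restore manifest order so the assembled archive is reproducible.
    return r.content, {d: blobs[d] for d in digests}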

OCI Image Layout assembly

The OCI Image Layout spec defines a directory structure that Docker 28+ accepts via docker load:

oci-layout                        # {"imageLayoutVersion":"1.0.0"}
index.json                        # manifest index with ref.name annotation
blobs/sha256/<manifest_hex>      # raw manifest bytes (sha256 = manifest digest)
blobs/sha256/<config_hex>        # image config JSON
blobs/sha256/<layer1_hex>        # compressed layer tar
blobs/sha256/<layer2_hex>        # ...

bv assembles this structure in memory as a tar archive. Layer blobs are deduplicated before writing; OCI images frequently include empty layers with identical digests.

The org.opencontainers.image.ref.name annotation in index.json is required. Without it, docker load imports the blobs but doesn't map the OCI manifest digest as a repo digest. Subsequent docker image inspect --format {{.Id}} <ref>@<digest> calls then fail; that inspect-by-digest call is how bv checks whether an image is already locally available, so without the annotation every sync would re-pull the image.
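Assembly of the layout tar in memory, sketched with the standard library (the annotation key, media type, and index shape follow the OCI spec; the rest is illustrative):

import hashlib, io, json, tarfile

def assemble_oci_layout(manifest_bytes, blobs, ref):
    # blobs: digest -> compressed blob bytes (config + layers); keyed by digest, so already deduplicated.
    manifest_digest = "sha256:" + hashlib.sha256(manifest_bytes).hexdigest()
    index = {"schemaVersion": 2,
             "manifests": [{"mediaType": "application/vnd.oci.image.manifest.v1+json",
                            "digest": manifest_digest,
                            "size": len(manifest_bytes),
                            # Without this annotation, docker load won't register the repo digest.
                            "annotations": {"org.opencontainers.image.ref.name": ref}}]}
    out = io.BytesIO()
    with tarfile.open(fileobj=out, mode="w") as tar:
        def add(name, data):
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
        add("oci-layout", b'{"imageLayoutVersion":"1.0.0"}')
        add("index.json", json.dumps(index).encode())
        add("blobs/sha256/" + manifest_digest.removeprefix("sha256:"), manifest_bytes)
        for digest, data in blobs.items():
            add("blobs/sha256/" + digest.removeprefix("sha256:"), data)
    return out.getvalue()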

loading into Docker

bv spawns docker load, writes the in-memory tar to its stdin, and waits for exit. Docker's containerd backend verifies each blob's sha256, decompresses layers into its content store, and registers the manifest digest. No temporary files are written to disk.

On a subsequent bv sync, is_locally_available runs docker image inspect by digest. The image is found immediately and the pull is skipped entirely.
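Both steps are thin wrappers over the Docker CLI. A Python sketch (is_locally_available is named as in the text above; docker_load is illustrative):

import subprocess

def docker_load(layout_tar: bytes):
    # Stream the in-memory OCI layout straight into docker load; nothing touches disk.
    subprocess.run(["docker", "load"], input=layout_tar, check=True)

def is_locally_available(ref: str, digest: str) -> bool:
    # Inspect by repo digest; this only succeeds once docker load has registered the digest.
    r = subprocess.run(["docker", "image", "inspect", "--format", "{{.Id}}", f"{ref}@{digest}"],
                       capture_output=True)
    return r.returncode == 0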

benchmarks

methodology

All runs were on macOS (Apple Silicon) with Docker Desktop, at tool counts of 1, 5, and 10.

Tools tested: samtools, bwa, bcftools, blast, minimap2, fastp, fastqc, trimmomatic, seqkit, hisat2. Wall-clock time was measured with time. Each condition was run once from a fully cold state.

install time

[chart: install time at 1, 5, and 10 tools for bv, pixi, and micromamba]

For a single tool, pixi is faster: conda packages are smaller than Docker images and rattler's parallel download is well-optimized for small dependency counts. At 10 tools the gap closes because layer sharing kicks in: the 10th tool in a typical bioinformatics stack shares ~60–70% of its layers with tools already downloaded. The marginal cost per additional tool falls as the installed set grows.

first-run latency

Install time is a one-time cost. First-run latency (time from "installed" to seeing tool output) recurs on every pipeline invocation. bv runs tools as Docker containers; startup with a locally cached image is fast because the runtime fork-execs without any environment activation step.

[chart: first-run latency: bv 0.35s, pixi 3.5s, micromamba 4.2s]

For pipelines that invoke tools many times, this 10x difference compounds. A Snakemake rule that calls samtools sort once per sample over 200 samples adds ~700s of activation overhead with pixi; with bv the same 200 invocations cost ~70s.