bv

contributing: workspace layout, dev commands, and CI

dev setup

contributing to the bv codebase or registry

getting started

You need Rust stable, Docker (for integration tests), and git.

git clone https://github.com/tejasprabhune/bv
cd bv
cargo build                          # build all workspace crates
cargo test                           # unit tests only (no Docker needed)
cargo test --test integration -- --include-ignored   # integration tests (needs Docker)

Set BV_CACHE_DIR to a temp path so each integration test run starts with a clean cache:

BV_CACHE_DIR=/tmp/bv-test cargo test --test integration -- --include-ignored

Build a release binary and put it on your PATH for manual testing:

cargo build --release -p bv-cli
export PATH="$PWD/target/release:$PATH"
bv doctor

workspace layout

| crate | what it is |
| --- | --- |
| bv-cli | the bv binary. All commands live here. Depends on everything below. |
| bv-core | shared types: Manifest, Lockfile, CacheLayout, error types. No I/O. |
| bv-index | IndexBackend trait plus GitIndex, which clones bv-registry and resolves tool lookups. |
| bv-runtime | ContainerRuntime trait plus DockerRuntime. Handles pull, run, inspect, image availability. |
| bv-runtime-apptainer | Apptainer/Singularity backend. Converts OCI images to SIF files, runs with --nv for GPU. |
| bv-types | The bv-types vocabulary: FASTA, FASTQ, BAM, VCF, and ~40 others. Used for typed I/O in manifests. |
| bv-conformance | Library for smoke-testing a tool's declared binaries. Used by bv conformance and the registry CI. |
| bv-builder | Builds factored OCI images from conda specs (one layer per package). Used only in registry CI, not shipped in the user-facing binary. |
| bv-bench | Install-path benchmark harness: compares bv against mamba, conda, and pixi on install time, footprint, and cold-run latency. |

bv codebase

working on the bv CLI and runtime

dev commands

| command | when to use it |
| --- | --- |
| cargo build -p bv-cli | build just the CLI (faster than building the whole workspace) |
| cargo test -p bv-core | unit tests for a single crate |
| cargo test --test integration -- --include-ignored | full integration suite; needs a running Docker daemon |
| cargo clippy -- -D warnings | must be clean before opening a PR |
| cargo fmt --check | formatting check; run cargo fmt to fix |
| cargo deny check | license and vulnerability audit (uses deny.toml) |

adding a command

  1. Add a variant to the relevant Commands or sub-command enum in bv-cli/src/cli.rs.
  2. Add a pub (async) fn run(...) in bv-cli/src/commands/<name>.rs.
  3. Wire it in bv-cli/src/commands/mod.rs and the dispatch match in bv-cli/src/main.rs.
  4. Add an integration test in bv-cli/tests/integration.rs. Mark #[ignore] if Docker is required.
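Steps 1–3 can be sketched like this. The enum and module shapes mirror the layout described above, but the variant names here (Doctor, Frobnicate) and the function bodies are illustrative placeholders, not the actual command set in bv-cli:

```rust
// Step 1: a variant on the command enum (real enum lives in bv-cli/src/cli.rs).
// `Frobnicate` is a made-up example command.
enum Commands {
    Doctor,
    Frobnicate { input: String },
}

// Step 3: the dispatch match (real match lives in bv-cli/src/main.rs).
fn dispatch(cmd: Commands) -> Result<(), String> {
    match cmd {
        Commands::Doctor => commands::doctor::run(),
        Commands::Frobnicate { input } => commands::frobnicate::run(&input),
    }
}

// Step 2: one module per command under bv-cli/src/commands/.
mod commands {
    pub mod doctor {
        pub fn run() -> Result<(), String> {
            Ok(())
        }
    }
    pub mod frobnicate {
        pub fn run(input: &str) -> Result<(), String> {
            if input.is_empty() {
                Err("empty input".into())
            } else {
                Ok(())
            }
        }
    }
}
```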

All user-visible output goes to stderr. Only structured data (tables, JSON) goes to stdout.

conformance (bv-conformance)

bv-conformance is the library behind bv conformance <tool>. It pulls the tool's image and smoke-checks each binary declared in [tool.binaries] by trying --version, -version, --help, -h, -v, version, and a bare invocation in order. A binary passes if any probe exits 0 or produces more than 30 bytes of output.

# run locally against the live registry
bv conformance samtools
bv conformance samtools --backend apptainer

# run against a local registry clone (useful when reviewing a new manifest)
bv conformance samtools --registry /path/to/bv-registry

# run against a specific image digest (skip the registry lookup)
bv conformance samtools --digest sha256:abc123...

The library is in bv-conformance/src/runner.rs. The entry point is runner::run(manifest, image_digest, runtime), which returns a ConformanceResult with per-binary pass/fail messages.
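The probe ordering and pass rule described above can be sketched as follows; `probe_passes` and `binary_passes` are illustrative names, not the actual API of runner.rs:

```rust
// Probes are tried in this order; the empty slice is the bare invocation.
const PROBES: &[&[&str]] = &[
    &["--version"],
    &["-version"],
    &["--help"],
    &["-h"],
    &["-v"],
    &["version"],
    &[],
];

// A single probe passes if the process exited 0
// or produced more than 30 bytes of output.
fn probe_passes(exit_code: i32, output_bytes: usize) -> bool {
    exit_code == 0 || output_bytes > 30
}

// A binary passes if any probe passes.
// Each tuple is one probe's (exit code, bytes of output).
fn binary_passes(results: &[(i32, usize)]) -> bool {
    results.iter().any(|&(code, bytes)| probe_passes(code, bytes))
}
```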

The conformance suite also runs in registry CI on every PR that touches tools/**, and on a weekly schedule for the full set of core tools. See .github/workflows/conformance.yml in bv-registry.

conformance sweep

A conformance sweep runs bv conformance --all against every tool in the registry to verify that each tool's declared binaries still respond to at least one probe. Run a sweep before merging changes that affect many tools, or after a builder upgrade.

prerequisites

A bv binary on your PATH (see getting started above) and a running Docker daemon.

running a sweep

# full sweep, skipping GPU-only tools and tools that require reference data
bv conformance --all --skip-gpu --skip-reference-data

# single tool
bv conformance samtools

output status codes

| status | meaning |
| --- | --- |
| PASS | all declared binaries responded to at least one probe |
| FAIL | one or more binaries did not respond to any probe |
| SKIP | tool was excluded by a flag (--skip-gpu, --skip-reference-data) or has no declared binaries |
| ERR | the tool's image could not be pulled or the container failed to start |

when a tool fails

  1. Check the tool's manifest conformance section to see which binaries are declared and what probes are expected.
  2. Rebuild the image using bv-builder or by re-running the build-factored.yml workflow.
  3. Re-run bv conformance <tool-name> to confirm the fix before opening a PR.

Conformance must pass for all affected tools before a PR can be merged to the core or community tier. The conformance.yml CI workflow enforces this automatically on every PR that touches tools/**.

benchmark (bv-bench)

bv-bench compares bv against mamba, conda, and pixi across three metrics: install time, disk footprint, and cold-run latency. It uses two fixture suites, selected with --suite: mac and linux.

Each suite has three fixture sizes: 1 tool, 5 tools, and 10 or 20 tools (depending on the suite).

# build the harness
cargo build --release -p bv-bench

# mac suite: bv + mamba + conda + pixi
./target/release/bv-bench --suite mac --mamba --conda --pixi

# linux suite: bv + mamba + conda + pixi (conda/pixi show n/a for linux-only fixtures on macOS)
./target/release/bv-bench --suite linux --mamba --conda --pixi

# output as JSON instead of a table
./target/release/bv-bench --suite mac --mamba --json-out results.json

# override the work directory (default: /tmp/bv-bench)
./target/release/bv-bench --suite mac --mamba --work-dir /tmp/my-bench

how footprint is measured

For bv, the harness reads image_size_bytes from the bv.lock file that bv add writes; this is the compressed OCI image size for each tool. For mamba, conda, and pixi, it measures the uncompressed size of the conda environment directory.
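The bv side of that computation reduces to summing one field per locked tool. A minimal sketch, assuming a LockedTool struct with an image_size_bytes field (the real lockfile types live in bv-core and may look different):

```rust
// Hypothetical shape of one bv.lock entry; only image_size_bytes
// matters for the footprint metric.
struct LockedTool {
    name: String,
    image_size_bytes: u64,
}

// bv's footprint = sum of compressed image sizes across locked tools.
fn total_footprint_bytes(tools: &[LockedTool]) -> u64 {
    tools.iter().map(|t| t.image_size_bytes).sum()
}

// Helper for the table output: bytes -> MiB.
fn human_mib(bytes: u64) -> f64 {
    bytes as f64 / (1024.0 * 1024.0)
}
```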

The fixtures are defined in bv-bench/src/fixture.rs. The install paths (bv, mamba, conda, pixi) are in bv-bench/src/main.rs.

bv-registry

adding tools, rebuilding images, understanding CI

layout

bv-registry/
  tools/          # manifests, one .toml per (tool, version)
    samtools/
      1.21.0.toml
  specs/          # build specs for the factored OCI builder
    samtools/
      1.21.0.toml
  web/            # static website (GitHub Pages)
  index.json      # pre-computed search index, updated by CI
  scripts/        # update-index.py, update-popularity.py

A tool entry needs both a manifest (tools/) and a build spec (specs/). The manifest describes the tool's I/O, hardware, and entrypoint. The spec tells the builder which conda packages to include and which platform to target.

adding a tool spec

For conda-based tools (the common case), add two files:

# 1. build spec: tells bv-builder what to build
cat > specs/seqtk/1.4.0.toml << 'EOF'
name = "seqtk"
version = "1.4.0"
channels = [
    "https://conda.anaconda.org/bioconda",
    "https://conda.anaconda.org/conda-forge",
]
packages = ["seqtk ==1.4.0"]
platform = "linux/amd64"

[entrypoint]
command = "/opt/conda/bin/seqtk"
EOF

# 2. tool manifest: describes the tool for users
cat > tools/seqtk/1.4.0.toml << 'EOF'
[tool]
id = "seqtk"
version = "1.4.0"
description = "seqtk: toolkit for processing FASTA/FASTQ sequences"
homepage = "https://github.com/lh3/seqtk"
license = "MIT"
tier = "community"
maintainers = ["github:lh3"]

[[tool.inputs]]
name = "reads"
type = "fastq"
cardinality = "one"
description = "Input reads"

[[tool.outputs]]
name = "reads_out"
type = "fastq"
cardinality = "one"
description = "Processed reads"

[tool.entrypoint]
command = "seqtk"
EOF

Push to a branch and open a PR. CI builds the image and runs conformance automatically. You do not need to run the builder locally.

The [tool.image] section (digest, reference) is written by CI after the build, not by hand. Leave it out of the manifest you submit.

image builder (bv-builder)

bv-builder turns a specs/<tool>/<version>.toml into a factored OCI image: one layer per conda package, pushed to ghcr.io/tejasprabhune/bv-pkg/<tool>:<version>. CI runs the builder automatically on every new or changed spec file.

You rarely need to run it locally, but when you do:

cargo build --release -p bv-builder

# resolve a spec to a pinned package list (dry run, no image built)
./target/release/bv-builder resolve specs/samtools/1.21.0.toml

# build and save as a tar archive (does not push)
./target/release/bv-builder build specs/samtools/1.21.0.toml --out /tmp/samtools.tar

# build and push to GHCR (needs GHCR_TOKEN or GITHUB_TOKEN in env)
./target/release/bv-builder build specs/samtools/1.21.0.toml --push

how it works

  1. resolve: downloads repodata.json for each channel, BFS-resolves transitive deps with numeric version comparison.
  2. build layers: downloads each conda package (.conda or .tar.bz2), extracts it into a reproducible tar layer. One layer per package.
  3. push: pushes all layers to GHCR, retrying on 429 rate-limit responses with exponential backoff (up to 8 attempts).
  4. update manifest: CI writes the pushed digest back to tools/<tool>/<version>.toml and commits.
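Step 1's BFS walk and numeric version comparison might look like this sketch. It is deliberately simplified: real resolution parses conda repodata.json and full match specs, and this per-segment comparison ignores build strings and pre-release tags:

```rust
use std::collections::{HashSet, VecDeque};

// Compare versions segment by segment as numbers, so "1.10" >= "1.9"
// (a plain string comparison would get this wrong).
fn version_ge(a: &str, b: &str) -> bool {
    let parse = |v: &str| -> Vec<u64> {
        v.split('.').map(|p| p.parse().unwrap_or(0)).collect()
    };
    // Vec<u64> compares lexicographically, i.e. numerically per segment.
    parse(a) >= parse(b)
}

// BFS over transitive deps: start from the requested packages,
// visit each dependency once, collect the closure in visit order.
// `deps_of` stands in for a repodata lookup.
fn resolve<'a>(
    roots: &[&'a str],
    deps_of: impl Fn(&str) -> Vec<&'a str>,
) -> Vec<&'a str> {
    let mut seen: HashSet<&str> = roots.iter().copied().collect();
    let mut queue: VecDeque<&str> = roots.iter().copied().collect();
    let mut order = Vec::new();
    while let Some(pkg) = queue.pop_front() {
        order.push(pkg);
        for dep in deps_of(pkg) {
            if seen.insert(dep) {
                queue.push_back(dep);
            }
        }
    }
    order
}
```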

Large tools (metaphlan4 has ~530 packages) take 20+ minutes to push due to GHCR's per-minute blob upload limit.
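The push retry policy from step 3 can be sketched as follows. Only the 429 trigger and the 8-attempt cap come from the text above; the base delay and function shapes are illustrative assumptions:

```rust
// Delay schedule for exponential backoff: attempt 1 runs immediately,
// attempts 2..=max wait base * 2^(n-2). Base value is an assumption.
fn backoff_delays_ms(base_ms: u64, max_attempts: u32) -> Vec<u64> {
    (0..max_attempts.saturating_sub(1)).map(|i| base_ms << i).collect()
}

// Retry only on HTTP 429; any other error fails immediately.
// The actual sleep between attempts is elided here.
fn push_with_retry(
    mut try_push: impl FnMut() -> Result<(), u16>,
    max_attempts: u32,
) -> Result<(), u16> {
    let mut attempt = 0;
    loop {
        attempt += 1;
        match try_push() {
            Ok(()) => return Ok(()),
            // rate-limited and attempts remain: back off and retry
            Err(429) if attempt < max_attempts => continue,
            // out of attempts, or a non-retryable error
            Err(code) => return Err(code),
        }
    }
}
```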

CI workflows

| workflow | trigger | what it does |
| --- | --- | --- |
| build-factored.yml | push to specs/**, or manual dispatch | builds and pushes OCI images for changed specs; updates digest in tools/ |
| conformance.yml | PR touching tools/**, weekly schedule | smoke-tests each declared binary using bv conformance |
| update-index.yml | push to tools/** | regenerates index.json for the website and bv search |
| update-popularity.yml | weekly schedule | refreshes download counts from conda-stats |
| pages.yml | push to web/** | deploys the static site to GitHub Pages |

rerunning a failed build

Build failures are usually transient (network error downloading a conda package, or GHCR rate limit on large images). To rerun a single tool:

gh workflow run build-factored.yml \
  --repo tejasprabhune/bv-registry \
  -f spec=specs/metaphlan4/4.1.1.toml

To rebuild all tools at once (after a breaking change to the builder, for example):

gh workflow run build-factored.yml --repo tejasprabhune/bv-registry

The detect job runs first and lists every specs/**/*.toml; each spec becomes a parallel matrix job.

conventions

code style

PR conventions