Hello Guix!

This patch series adds the Software Heritage (SWH) client library
initially discussed at:

  https://lists.gnu.org/archive/html/guix-devel/2018-11/msg00285.html

Furthermore, it uses it in (guix git-download) to download code from SWH
when it is unavailable upstream and on our servers.  This bit relies on
the “vault” API of SWH, which allows you to fetch a checkout as a tarball.
Not all revisions are readily available as tarballs, understandably, so
the vault API has a mechanism that allows you to request the “cooking”
of a specific checkout.  Cooking is asynchronous and can take some time.

  https://docs.softwareheritage.org/devel/swh-vault/api.html

When downloading over SWH, the ‘swh-download’ procedure first resolves
the tag (if it’s a tag), then tries to download the corresponding tarball
from the vault.  If the vault doesn’t have it yet, it sends a cooking
request and waits for it to complete by periodically checking the cooking
status.

In the future, we should provide a “lister” and “loader” so that SWH can
regularly obtain a list of Guix packages with their source URL and
commit/tag:

  https://forge.softwareheritage.org/T1352

The SWH team is also considering pre-cooking all VCS tags such that
every time we refer to a tag, we can be sure its contents are already
available in the vault:

  https://forge.softwareheritage.org/T1350

Feedback welcome!

Ludo’.

Ludovic Courtès (2):
  Add (guix swh).
  git-download: Download from Software Heritage as a last resort.

 Makefile.am           |   1 +
 guix/git-download.scm |  64 +++--
 guix/swh.scm          | 551 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 596 insertions(+), 20 deletions(-)
 create mode 100644 guix/swh.scm

-- 
2.19.1
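For readers who want a feel for the “request cooking, then poll” flow
described in the cover letter, here is a minimal sketch.  It is not the
code from this series: ‘wait-for-tarball’, its keyword arguments, and the
status symbols (‘missing’, ‘pending’, ‘done’, ‘failed’) are hypothetical
names chosen for illustration, and the two procedure arguments stand in
for whatever HTTP calls actually talk to the vault.

```scheme
(use-modules (ice-9 match))

;; Hypothetical sketch, not the implementation in (guix swh).
;; STATUS is a procedure taking an ID and returning a symbol;
;; REQUEST-COOKING is a procedure taking an ID and asking the vault
;; to cook it.  Both are supplied by the caller.
(define* (wait-for-tarball id status request-cooking
                           #:key (interval 5) (max-attempts 60))
  "Return #t once STATUS reports ID as done; return #f if cooking
failed or MAX-ATTEMPTS polls were exhausted."
  (let loop ((attempt 0)
             (requested? #f))
    (if (>= attempt max-attempts)
        #f
        (match (status id)
          ('done #t)
          ('failed #f)
          ('missing
           ;; Not available yet: send the cooking request once, then
           ;; keep polling.
           (unless requested? (request-cooking id))
           (sleep interval)
           (loop (+ attempt 1) #t))
          (_
           ;; Any other status (e.g. 'pending): cooking is in
           ;; progress, so wait and check again.
           (sleep interval)
           (loop (+ attempt 1) requested?))))))
```

Passing the status and request procedures as arguments keeps the loop
independent of the network, so it can be exercised with stubs; bounding
the number of attempts avoids waiting forever on a checkout that never
finishes cooking.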
> When downloading over SWH, the ‘swh-download’ procedure first resolves
> the tag (if it’s a tag), then tries to download the corresponding tarball
Speaking of tags, it’s not news but tags are bad from a reproducibility
standpoint: they are mutable and per-repository.  Tag lookup is
necessarily relative to a repository URL (and to a snapshot of the
repository, since it can be mutated):

  scheme@(guile-user)> (lookup-origin-revision "https://git.savannah.gnu.org/git/guix.git"
                                               "v0.15.0")
  $5 = #<<revision> id: "359fdda40f754bbf1b5dc261e7427b75463b59be" date: #<date nanosecond: 0 second: 39 minute: 16 hour: 22 day: 5 month: 7 year: 2018 zone-offset: 7200> directory: "27c69c5d298a43096a53affbf881e7b13f17bdcd" directory-url: "/api/1/directory/27c69c5d298a43096a53affbf881e7b13f17bdcd/">

So if, say, SWH archived a mirror of
https://git.savannah.gnu.org/git/guix.git but not
https://git.savannah.gnu.org/git/guix.git itself, then tag lookup will
fail, which is sad given that the code is actually there.

To address this, possible options include:

  1. Always store commit IDs rather than tags, effectively giving us
     “normal” Git content-addressability.  This is not great for code
     readability and review though.

  2. Store ‘sha1_git’ hashes (SHA1s of Git trees) instead of or in
     addition to nar sha256 hashes so we can perform lookups by content
     hash on SWH or Git mirrors.

#2 might be the best long-term option though it would require daemon
support to compute, store, and check these Git-style hashes.

Ludo’.
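To make option 1 concrete, here is a small sketch of the two ways a
package source could point at the same code.  It assumes Guix is on the
load path and that the revision id returned by the lookup above is the
Git commit that ‘v0.15.0’ pointed to at the time; the variable names
‘by-tag’ and ‘by-commit’ are only for illustration.

```scheme
(use-modules (guix git-download))

;; By tag: readable in review, but the tag is mutable and can only be
;; resolved relative to this particular repository URL.
(define by-tag
  (git-reference
   (url "https://git.savannah.gnu.org/git/guix.git")
   (commit "v0.15.0")))

;; By commit ID (option 1): the ID identifies the content itself, so
;; any archive or mirror holding that commit could serve it.
;; Assumption: this is the revision id from the REPL session above.
(define by-commit
  (git-reference
   (url "https://git.savannah.gnu.org/git/guix.git")
   (commit "359fdda40f754bbf1b5dc261e7427b75463b59be")))
```

The trade-off is the one already noted: the second form no longer
depends on tag lookup for a given URL, at the cost of an opaque
identifier in the package definition.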