cuirass: Add build products download support.

  • Done
  • quality assurance status badge
Details
5 participants
  • Danny Milosavljevic
  • Ludovic Courtès
  • Mathieu Othacehe
  • Mathieu Othacehe
  • Mathieu Othacehe
Owner
unassigned
Submitted by
Mathieu Othacehe
Severity
normal
M
M
Mathieu Othacehe wrote on 1 May 2020 10:54
(address . guix-patches@gnu.org)
87ees4uja7.fsf@gmail.com
Hello,

Here's a patch adding support for build products downloading in
Cuirass. It is inspired by a similar mechanism in Hydra.

Attached a screenshot of what I obtained with the following
specification:

Toggle snippet (21 lines)
(define hello-master
'((#:name . "guix-master")
(#:load-path-inputs . ())
(#:package-path-inputs . ())
(#:proc-input . "guix")
(#:proc-file . "build-aux/cuirass/gnu-system.scm")
(#:proc . cuirass-jobs)
(#:proc-args (subset . "all"))
(#:inputs . (((#:name . "guix")
(#:url . "https://gitlab.com/mothacehe/guix")
(#:load-path . ".")
(#:branch . "master")
(#:no-compile? . #t))))
(#:build-outputs . (((#:job . "iso9660-image*")
(#:type . "iso")
(#:output . "out")
(#:path . ""))))))

(list hello-master)

Thanks,

Mathieu
Attachment: download.png
D
D
Danny Milosavljevic wrote on 1 May 2020 12:09
(name . Mathieu Othacehe)(address . m.othacehe@gmail.com)
20200501120914.606ffe02@scratchpost.org
Hi Mathieu,

very cool!

Though I agree using sendfile would be much better, especially since the user
can download 800 MB image files there.

The guile (web server) module allows passing a procedure as the #:body, but
then it makes a bytevector out of the result and hard-codes the content-type :P.

Eventually (web server http) http-write is reached, which only supports encoding
bytevectors and #f, that's it. No files.

So we'd have to overwrite http-write.

But we are using our own (web server fiberized) impl already.

So our impl chould be extended to be able to get and process FDs.

client-loop there has

(lambda (response body)
(write-response response client)
(when body
(put-bytevector client body))

which means the "when body" part should be extended to also handle files, not just bytevectors.
-----BEGIN PGP SIGNATURE-----

iQEzBAEBCAAdFiEEds7GsXJ0tGXALbPZ5xo1VCwwuqUFAl6r9UoACgkQ5xo1VCww
uqV0Ygf7BNjc81KkTQWXt1QmYRHfzdoPbd+ejt9vYvY4GNmF8IgKMZkPcZoHxPDH
Us3I1WRknemIK0hDngXqaRaq6UBaPUQ1TntES0CcmGmpe3CK3UZCEqlgIVwXCIOI
GIzgJm7y1zLaTNc7SiL0lSzbOSGtBSyunyQVFchpqKdBNTVgDiYcbhSHNrNSqPxq
/+OCuRv1gHM8MGBZFQpUJh4ehQp2O/JXAgyJD2DGKDdHFIR15rn/Voel6DQTLdiJ
//vPJWAu/BNhOeNuvvSm/ZagbRnCAfLRQBp2E6Vyu7UO72gIXYwSMn/IwIQyZW+W
76TLr3lzKkVl/KwVFKav2qzbZADFRA==
=AN9j
-----END PGP SIGNATURE-----


M
M
Mathieu Othacehe wrote on 1 May 2020 15:35
(name . Danny Milosavljevic)(address . dannym@scratchpost.org)
874ksz4w21.fsf@gmail.com
Hey Danny,

Toggle quote (2 lines)
> very cool!

Thanks :)

Toggle quote (24 lines)
> Though I agree using sendfile would be much better, especially since the user
> can download 800 MB image files there.
>
> The guile (web server) module allows passing a procedure as the #:body, but
> then it makes a bytevector out of the result and hard-codes the content-type :P.
>
> Eventually (web server http) http-write is reached, which only supports encoding
> bytevectors and #f, that's it. No files.
>
> So we'd have to overwrite http-write.
>
> But we are using our own (web server fiberized) impl already.
>
> So our impl chould be extended to be able to get and process FDs.
>
> client-loop there has
>
> (lambda (response body)
> (write-response response client)
> (when body
> (put-bytevector client body))
>
> which means the "when body" part should be extended to also handle files, not just bytevectors.

The problem is that even with our fiberized implementation, what we pass
as "body" is checked in "sanitize-response" procedure of Guile's (web
server) module.

With the (very) hacky patch attached, I fool sanitize-response, by
sending the file name as a bytevector. This allows me to save gigabytes
of RAM when downloading disk images.

WDYT?

Thanks,

Mathieu
L
L
Ludovic Courtès wrote on 1 May 2020 23:11
(name . Danny Milosavljevic)(address . dannym@scratchpost.org)
87d07ne4xh.fsf@gnu.org
Hi,

Danny Milosavljevic <dannym@scratchpost.org> skribis:

Toggle quote (11 lines)
> Though I agree using sendfile would be much better, especially since the user
> can download 800 MB image files there.
>
> The guile (web server) module allows passing a procedure as the #:body, but
> then it makes a bytevector out of the result and hard-codes the content-type :P.
>
> Eventually (web server http) http-write is reached, which only supports encoding
> bytevectors and #f, that's it. No files.
>
> So we'd have to overwrite http-write.

See how ‘guix publish’ uses ‘sendfile’ for nars. It’s hacky because it

Ludo’.
L
L
Ludovic Courtès wrote on 1 May 2020 23:17
(name . Mathieu Othacehe)(address . m.othacehe@gmail.com)
871ro3e4oa.fsf@gnu.org
Hello!

Mathieu Othacehe <m.othacehe@gmail.com> skribis:

Toggle quote (4 lines)
> With the (very) hacky patch attached, I fool sanitize-response, by
> sending the file name as a bytevector. This allows me to save gigabytes
> of RAM when downloading disk images.

Yay! This is similar to what ‘guix publish’ does. :-)

Toggle quote (16 lines)
> From 0c5e91c170639d50d1cc339fa0b0e68ea4fba68c Mon Sep 17 00:00:00 2001
> From: Mathieu Othacehe <m.othacehe@gmail.com>
> Date: Fri, 1 May 2020 15:03:12 +0200
> Subject: [PATCH] cuirass: Use sendfiles instead of raw copies.
>
> * src/cuirass/http.scm (respond-file): Send the file name as an UTF8
> bytevector, instead of the raw file content,
> (respond-gzipped-file): ditto. Also set 'content-disposition header.
> * src/web/server/fiberized.scm (client-loop): Check if 'content-disposition is
> set. If it's the case, assume that the bytevector is the file name, and use
> sendfiles to send it. Otherwise, keep the existing behaviour and send directly
> the received bytevector.


> +(define extend-response (@@ (web server) extend-response))

@@ is evil and it’s not guaranteed to work with Guile 3: the procedure
might be inlined.

But you can use these ‘guix publish’ helper procedures, which rely on
(srfi srfi-9 gnu):

(define (strip-headers response)
"Return RESPONSE's headers minus 'Content-Length' and our internal headers."
(fold alist-delete
(response-headers response)
'(content-length x-raw-file x-nar-compression)))

(define (with-content-length response length)
"Return RESPONSE with a 'content-length' header set to LENGTH."
(set-field response (response-headers)
(alist-cons 'content-length length
(strip-headers response))))

Toggle quote (8 lines)
> + (call-with-input-file file
> + (lambda (port)
> + (write-response
> + (extend-response response 'content-length
> + file-size)
> + client)
> + (sendfile client port file-size))))

I didn’t look at the other patches, but note that ‘sendfile’ blocks.
Since Cuirass is fiberized, you shouldn’t block a fiber.

‘guix publish’ doesn’t use Fibers but it shouldn’t block either while
sending a nar, so what it does is spawn a new thread for the ‘sendfile’
call.

HTH!

Ludo’.
M
M
Mathieu Othacehe wrote on 3 Jun 2020 13:54
(name . Ludovic Courtès)(address . ludo@gnu.org)
874krs74ax.fsf@gnu.org
Hello Ludo,

Toggle quote (7 lines)
> I didn’t look at the other patches, but note that ‘sendfile’ blocks.
> Since Cuirass is fiberized, you shouldn’t block a fiber.
>
> ‘guix publish’ doesn’t use Fibers but it shouldn’t block either while
> sending a nar, so what it does is spawn a new thread for the ‘sendfile’
> call.

Thanks for your help! I copied what's done in (guix scripts publish),
except that I used "non-blocking" instead of using a plain
"call-with-new-thread".

If you could have a short look to the first patch (introducing
build products) and tell me if the concept is ok for you, that would be
great :)

Thanks,

Mathieu
L
L
Ludovic Courtès wrote on 3 Jun 2020 22:14
(name . Mathieu Othacehe)(address . othacehe@gnu.org)
87h7vrkiuf.fsf@gnu.org
Hi!

Mathieu Othacehe <othacehe@gnu.org> skribis:

Toggle quote (13 lines)
> From c99cc0314b98e349a577f38870d1271a3f1c3a54 Mon Sep 17 00:00:00 2001
> From: Mathieu Othacehe <m.othacehe@gmail.com>
> Date: Wed, 3 Jun 2020 13:41:30 +0200
> Subject: [PATCH] cuirass: Use sendfiles instead of raw copies.
>
> * src/cuirass/http.scm (respond-file): Send the file name as 'x-raw-file
> header argument, instead of the raw file content,
> (respond-gzipped-file): ditto. Also set 'content-disposition header.
> * src/web/server/fiberized.scm (strip-headers, with-content-length): New procedures,
> (client-loop): Check if 'x-raw-file is set. If it's the case, use sendfiles to
> send the given file. Otherwise, keep the existing behaviour and send directly
> the received bytevector.

If it works for you, LGTM!

Ludo’.
L
L
Ludovic Courtès wrote on 3 Jun 2020 22:26
(name . Mathieu Othacehe)(address . m.othacehe@gmail.com)(address . 40993@debbugs.gnu.org)
875zc7ki9g.fsf@gnu.org
Hello!

Mathieu Othacehe <m.othacehe@gmail.com> skribis:

Toggle quote (3 lines)
> Here's a patch adding support for build products downloading in
> Cuirass. It is inspired by a similar mechanism in Hydra.

Neat!

Toggle quote (21 lines)
> Attached a screenshot of what I obtained with the following
> specification:
>
> (define hello-master
> '((#:name . "guix-master")
> (#:load-path-inputs . ())
> (#:package-path-inputs . ())
> (#:proc-input . "guix")
> (#:proc-file . "build-aux/cuirass/gnu-system.scm")
> (#:proc . cuirass-jobs)
> (#:proc-args (subset . "all"))
> (#:inputs . (((#:name . "guix")
> (#:url . "https://gitlab.com/mothacehe/guix")
> (#:load-path . ".")
> (#:branch . "master")
> (#:no-compile? . #t))))
> (#:build-outputs . (((#:job . "iso9660-image*")
> (#:type . "iso")
> (#:output . "out")
> (#:path . ""))))))

For the record, in Hydra, build products would be found if there’s a
special ‘nix-support/hydra-build-products’ file in the output. The
advantage is that it’s more flexible, but the downside is that you’d
have to adjust your derivations specifically for that.

Toggle quote (30 lines)
>>From dbb78929d7c8aa3b9007660795f55232ab47dbfb Mon Sep 17 00:00:00 2001
> From: Mathieu Othacehe <m.othacehe@gmail.com>
> Date: Fri, 1 May 2020 10:32:18 +0200
> Subject: [PATCH] Add support for build products downloading.
>
> * src/sql/upgrade-7.sql: New file.
> * Makefile.am: Add it.
> * src/cuirass/base.scm (create-build-outputs): New procedure,
> (build-packages): call it,
> (process-spec): add the new spec argument and pass it to create-build-outputs.
> * src/cuirass/database.scm (db-add-build-product, db-get-build-product-path,
> db-get-build-products): New exported procedures.
> * src/cuirass/http.scm (respond-static-file): Move file sending to ...
> (respond-file): ... this new procedure,
> (url-handler): add a new "download/<id>" route, serving the requested file
> with the new respond-file procedure. Also gather build products and pass them
> to "build-details" for "build/<id>/details" route.
> * src/cuirass/templates.scm (build-details): Honor the new "products" argument
> to display all the build products associated to the given build.
> * src/schema.sql (BuildProducts): New table,
> (Specifications)[build_outputs]: new field.
> * tests/database.scm: Add empty build-outputs spec.
> * tests/http.scm: Ditto.
> * examples/guix-jobs.scm: Ditto.
> * examples/hello-git.scm: Ditto.
> * examples/hello-singleton.scm: Ditto.
> * examples/hello-subset.scm: Ditto.
> * examples/random.scm: Ditto.
> * doc/cuirass.texi (overview): Document it.

[...]

Toggle quote (12 lines)
> + (map (lambda (spec)
> + (let* ((build (find-build (assq-ref spec #:job)))
> + (product (find-product build spec)))
> + (when (and product (file-exists? product))
> + (db-add-build-product `((#:build . ,(assq-ref build #:id))
> + (#:type . (assq-ref spec #:type))
> + (#:file-size . ,(file-size product))
> + ;; TODO: Implement it.
> + (#:sha256-hash . "")
> + (#:path . ,product))))))
> + product-specs))

Use ‘for-each’ if it’s for effects, as seems to be the case.

Regarding #:sha256-hash: there’s a somewhat standard format to represent
hashes and their algorithms as strings, but I forgot the name. Like,
you’d write “sha256-” followed by a base64 string, something like that.

Perhaps it’d be wiser to use it rather than hard-code sha256?

Also, we don’t really have tests for the web UI, I don’t know how much
work it’d be to add tests.

Apart from that, I have little to say, other than the fact that it’s
really cool. :-)

Thank you!

Ludo’.
M
M
Mathieu Othacehe wrote on 10 Jun 2020 17:44
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 40993@debbugs.gnu.org)
87h7vjlyci.fsf@gmail.com
Hola!

Toggle quote (5 lines)
> For the record, in Hydra, build products would be found if there’s a
> special ‘nix-support/hydra-build-products’ file in the output. The
> advantage is that it’s more flexible, but the downside is that you’d
> have to adjust your derivations specifically for that.

Yes, I first hesitated to replicate this mechanism but finally opted for
the implemented one. I didn't like so much the idea of adding and extra,
CI-specific file, to the output of some jobs.

Toggle quote (2 lines)
> Use ‘for-each’ if it’s for effects, as seems to be the case.

Oh you keep repeating that to me, and I keep doing the same mistake!

Toggle quote (6 lines)
> Regarding #:sha256-hash: there’s a somewhat standard format to represent
> hashes and their algorithms as strings, but I forgot the name. Like,
> you’d write “sha256-” followed by a base64 string, something like that.
>
> Perhaps it’d be wiser to use it rather than hard-code sha256?

Yes sure, I changed the database field to "checksum" so that we can use
the string representation you're talking about.

Toggle quote (3 lines)
> Also, we don’t really have tests for the web UI, I don’t know how much
> work it’d be to add tests.

If a more web-aware hacker could propose something, it would be great
indeed :)

Toggle quote (3 lines)
> Apart from that, I have little to say, other than the fact that it’s
> really cool. :-)

Thanks for reviewing! I just pushed those two commits and updated the
Cuirass package. Now I'll see with Marius or Ricardo how to update
Berlin, I guess.

Thanks,

Mathieu
M
M
Mathieu Othacehe wrote on 10 Jun 2020 17:44
control message for bug #40993
(address . control@debbugs.gnu.org)
87d067lybs.fsf@meru.i-did-not-set--mail-host-address--so-tickle-me
close 40993
quit
?