Skip to content

feature: allow specifying which shell to use for cram test#8134

Closed
haochenx wants to merge 33 commits intoocaml:mainfrom
haochenx:cram-shell-spec
Closed

feature: allow specifying which shell to use for cram test#8134
haochenx wants to merge 33 commits intoocaml:mainfrom
haochenx:cram-shell-spec

Conversation

@haochenx
Copy link
Member

@haochenx haochenx commented Jul 6, 2023

Design

Add (shell ..) option to the (cram) stanza and user action. The syntax is described below:

  • (shell :system) will use the system shell (aka /usr/bin/env sh)
  • (shell :bash) will use bash (aka /usr/bin/env bash)
  • (shell <dep>) will resolve <dep> (that may contain variables) as a path to an executable and use it as the shell
    • e.g. (shell %{bin:zsh}) will use zsh as the shell
    • e.g. (shell %{bin:dash}) will use dash as the shell

Discussions

  • Ast of cram action is changed (to accommodate specifying shell_spec)
    • this might be a breaking change for external tools relying on output of dune rules
    • it seems that dune rules does not output entries regarding cram tests so we can probably safely assume
      that currently no external tools depends on the encoded action spec

Sub-tasks

  • allow config of shell for Cram_exec.action (done in 80a5248)
  • add stanza syntax (done in 506ff24)
    • (cram (shell <shell_spec>)) where <shell_spec> could be
      • :system, which is interpreted as System_shell (aka env sh)
      • :bash, which is interpreted as Bash_shell (aka env bash)
      • <dep>, which is interpreted the same way as those in (deps <dep>)
  • add rule action syntax (done in c5724de)
    • seems like an undocumented feature?
  • add new test cases
    • for stanza syntax (test/blackbox-tests/test-cases/cram/{shell-opt,shell-opt-zsh}.t added)
    • for rule action syntax (test/blackbox-tests/test-cases/cram/cram-action-shell-opt.t added)
    • for min dune lang version (test/blackbox-tests/test-cases/cram/shell-opt-dune-lang-version.t added)
  • check for dune language version (>= 3.10)
  • modify documentation
  • add change log

Optional?

Signed-off-by: Haochen Kotoi-Xie <hx@kxc.inc>
@haochenx haochenx marked this pull request as draft July 6, 2023 18:37
haochenx added 5 commits July 7, 2023 06:17
Signed-off-by: Haochen Kotoi-Xie <hx@kxc.inc>
Signed-off-by: Haochen Kotoi-Xie <hx@kxc.inc>
Signed-off-by: Haochen Kotoi-Xie <hx@kxc.inc>
Signed-off-by: Haochen Kotoi-Xie <hx@kxc.inc>
Signed-off-by: Haochen Kotoi-Xie <hx@kxc.inc>
@haochenx haochenx changed the title WIP: allow specifying which shell to use for cram test WIP: feature: allow specifying which shell to use for cram test Jul 6, 2023
haochenx added 3 commits July 7, 2023 16:46
Signed-off-by: Haochen Kotoi-Xie <hx@kxc.inc>
Signed-off-by: Haochen Kotoi-Xie <hx@kxc.inc>
Signed-off-by: Haochen Kotoi-Xie <hx@kxc.inc>
@haochenx
Copy link
Member Author

haochenx commented Jul 7, 2023

@rgrinberg do you mind to take a look at the design and implementation of this PR

@johnridesabike do you think this approach satisfy your use case?

(cram)

(cram
(enabled_if %{bin-available:zsh})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you do this, you will have to add zsh to the CI so we can run the tests. Probably best if you restrict to unix too.

Copy link
Member Author

@haochenx haochenx Jul 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the macOS CI environment should have zsh available.

What's the additional benefits to add restriction to unix in addition to checking whether zsh exists?

Copy link
Collaborator

@Alizter Alizter Jul 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well actually I'm thinking that we shouldn't be using bin-available. If we miss putting zsh in the CI then this test will never be run, and it might surprise some other devs when they have zsh if this test is broken. So we have three choices:

  1. Either we make zsh mandatory (hence the unix restriction) so that the test is always seen to run. We can therefore catch any changes in CI.
  2. We make a dummy zsh or any other shell, just to check how it works. (echo would probably work)
  3. We don't test zsh.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I did have similar feeling when I added the test, but I ended up with adding them since I wanted to have some non-artificial tests.

But given that your concern is solid, I am in favor of doing (1) or (3) as you suggested. I don't think (2) is a good option since it seems that dune's cram test executor expects the shell at least have (a) output redirecting and (b) script sourcing features and making a convincing dummy zsh above the level of what is already done in shell-opt.t requires some significant efforts.

Regarding (1), I'm in general in favor of it, but since it involves changing the CI configuration, I don't think I should make the call.

If the maintainers of dune is not in favor of (1), then I think we should take (3) as missing this test case isn't crucial for the completeness of testing the feature added in this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(thank you for taking a look btw!)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How should we reach to a consensus here?

@johnridesabike
Copy link

This current implementation would certainly be helpful, but using an environment variable or CLI argument would be preferable for my use case.

My goal is to check that my tests are shell-agnostic, so I'd like to run them in something like dash locally (or in multiple different shells), but allow other users/CIs run them in whatever they have installed. I wouldn't want to commit a Dune rule that requires using a specific shell.

@haochenx
Copy link
Member Author

haochenx commented Jul 7, 2023

It's probably not what you want, but since you can refer to environment variable, you could probably achieve what you wanted to achieve.

For example, you can use (shell %{env:DUNE_CRAM_SHELL=/bin/sh}) to achieve what you described.

Yet it is indeed not as elegant. I will probably look into using an environment variable, but I think we need to set the expected behavior straight first I don't think I will be adding it in this PR.

haochenx added 2 commits July 7, 2023 23:29
Signed-off-by: Haochen Kotoi-Xie <hx@kxc.inc>
Signed-off-by: Haochen Kotoi-Xie <hx@kxc.inc>
@haochenx haochenx marked this pull request as ready for review July 7, 2023 14:35
@haochenx haochenx requested a review from christinerose as a code owner July 7, 2023 14:35
@haochenx haochenx changed the title WIP: feature: allow specifying which shell to use for cram test feature: allow specifying which shell to use for cram test Jul 7, 2023
@johnridesabike
Copy link

For example, you can use (shell %{env:DUNE_CRAM_SHELL=/bin/sh}) to achieve what you described.

Wouldn't this require all users to have that environment variable set in order to run cram tests?

If we can only configure it in a Dune file, then I'll just add a (shell ...) stanza locally and not commit it. That's not necessarily a problem, just a little clunky. It's still better than not having the feature at all though.

@haochenx
Copy link
Member Author

haochenx commented Jul 7, 2023

For example, you can use (shell %{env:DUNE_CRAM_SHELL=/bin/sh}) to achieve what you described.

Wouldn't this require all users to have that environment variable set in order to run cram tests?

The =/bin/sh part provides the default value (in this case, /bin/sh) so I guess not all users need to specify that environment variable.

I agree that being able to specify the shell elegantly via environment variable is a great feature.

I will think about how the spec.

@haochenx
Copy link
Member Author

haochenx commented Jul 8, 2023

I'm thinking about to add two environment variables that matters for dune:

  • DUNE_CRAM_SYSTEM_SHELL, which if set, decides the path when Shell_spec.System_shell is selected
  • DUNE_CRAM_BASH_SHELL, which if set, decides the path when Shell_spec.Bash_shell is selected

As it comes for determining the actual path for Shell_spec.System_shell (and respectively for Shell_spec.Bash_shell), the following logic is to be used:

  • check whether envar DUNE_CRAM_SYSTEM_SHELL exists and being non-empty, if so
    • if the given path is an absolute path, check its existence and use it as the shell program
      • if the path does not exists or is a directory, issue an error
    • if the given path is a simple name, resolve it with Stdune.Bin.which
      • if this fails, issue an error
    • otherwise, issue an error
  • otherwise, use the current behavior (viz. resolve "sh" with Stdune.Bin.which)

Comments are welcomed.

@Alizter
Copy link
Collaborator

Alizter commented Jul 8, 2023

I would go for something more general. We have system and bash actions that users can put in rules. We should just have DUNE_BASH_SHELL and DUNE_SYSTEM_SHELL configure these also, not just cram. However that would make the scope of this PR bigger than it needs to be.

@haochenx
Copy link
Member Author

haochenx commented Jul 8, 2023

I would go for something more general. We have system and bash actions that users can put in rules. We should just have DUNE_BASH_SHELL and DUNE_SYSTEM_SHELL configure these also, not just cram. However that would make the scope of this PR bigger than it needs to be.

Sounds better! I agree that this change will go beyond the scope of this PR. I'm happy to submit another PR to implement that. If my understanding is correct, the place also need to be modify should be the following?

| System cmd ->
let path, arg =
Utils.system_shell_exn ~needed_to:"interpret (system ...) actions"
in
let+ () = exec_run ~display ~ectx ~eenv path [ arg; cmd ] in
Done
| Bash cmd ->
let+ () =
exec_run ~display ~ectx ~eenv
(bash_exn ~loc:ectx.rule_loc ~needed_to:"interpret (bash ...) actions")
[ "-e"; "-u"; "-o"; "pipefail"; "-c"; cmd ]
in
Done

sh.ml
$ cat foo.cram.corrected
$ echo "foo from foo.cram"
***** UNREACHABLE *****
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this and the rest unreachable?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My guess is that when creating foo.cram, the $ is interpreted by the parent shell. You can use << 'EOF' line 69 to prevent this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests try to see whether the PATH environment variable is regarded by making our own bash and sh compiled from bash.ml and sh.ml. But since those are nowhere near complete shell implementation, they do not handle stdout redirection directives (like echo hello > file.txt), which cram_exec relies on to capture the output. Therefore the cram test executor will think the cram test crashed and include ***** UNREACHABLE ***** here.

We know our own mocked version of bash and sh are invoked here by looking at https://github.com/ocaml/dune/pull/8134/files/e2913128ca6046e24b2dbfa7255be40588ce58d1#diff-0484438795287a34ad56a4a530a816caea0a5480e23fffb006408f39591b35e6R74 and https://github.com/ocaml/dune/pull/8134/files/e2913128ca6046e24b2dbfa7255be40588ce58d1#diff-0484438795287a34ad56a4a530a816caea0a5480e23fffb006408f39591b35e6R95

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is less than ideal but I didn't come up with other ways to perform this test.

@rgrinberg
Copy link
Member

The problem with the special symbols is twofold:

  • The dash and oil shells are just as useful as zsh and bash.
  • You're not allowing users to customize the command line arguments for running the shell

This raise ambiguity though since we don't know whether we should use dune's dependency language to expand bash (or system, respectively) or should lookup bash (or sh, respectively) using which.

Do you mean the dependency language or for allowing percent forms? I assume you meant the latter because that's the only thing that makes sense in this context. I think we should allow for expanding dune's percent forms and we should also allow command line arguments as well. For example, one should be able to:

(cram
 (shell %{bin:dash} -v %{script})) ;; %{script} will be expanded to the path of the shell script. like %{input-file} for ppx.

Dune has a single way of resolving binaries so the question regarding which are automatically resolved. E.g. see how Run is expanded in action_unexpanded.ml.

Once this is resolved and you remove the optional argument, this PR is good to go.

@haochenx
Copy link
Member Author

@rgrinberg @emillon thanks for having a look and point out a direction. I will take sometime later this week to re-look at this issue and bring it forward.

@haochenx
Copy link
Member Author

haochenx commented Sep 20, 2023

Unfortunately (and my apologies for that) I didn't get time to work on this PR last week.. I will see whether I can find some time this weekend.

Signed-off-by: Haochen M. Kotoi-Xie <hx@kxc.inc>
Signed-off-by: Haochen M. Kotoi-Xie <hx@kxc.inc>
@haochenx haochenx marked this pull request as draft September 25, 2023 03:01
Signed-off-by: Haochen M. Kotoi-Xie <hx@kxc.inc>
Signed-off-by: Haochen M. Kotoi-Xie <hx@kxc.inc>
Signed-off-by: Haochen M. Kotoi-Xie <hx@kxc.inc>
type t = int * int

let latest = 3, 11
let latest = 3, 12
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure whether this should be in this PR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a separate PR would be best.

Signed-off-by: Haochen M. Kotoi-Xie <hx@kxc.inc>
Comment on lines +174 to +199
| Custom_shell { prog; args } ->
let shell_deps, sandbox = Dep_conf_eval.unnamed ~expander [ File prog ] in
let dir = Path.build dir in
let expand_arg sw =
let open Action_builder.O in
let+ v =
String_expander.Action_builder.expand
sw
~dir
~mode:Many
~f:(Expander.expand_pform expander)
in
Value.L.to_strings v ~dir
in
let shell =
let open Action_builder.O in
let+ shell_args = Action_builder.all (List.map args ~f:expand_arg)
and+ shell_prog =
Expander.(
With_deps_if_necessary.expand_single_path expander prog
|> Deps.action_builder)
in
shell_prog, List.concat shell_args
in
return ~shell_deps ~sandbox (Some shell)
in
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure whether this is the best way to expand the prog path and arg here

@haochenx
Copy link
Member Author

I have addressed #8134 (comment) except for %{script}, which I find a bit difficult to do.

@rgrinberg do you think %{script} is necessary?

Signed-off-by: Haochen M. Kotoi-Xie <hx@kxc.inc>
@haochenx haochenx marked this pull request as ready for review September 25, 2023 04:05
@rgrinberg
Copy link
Member

do you think %{script} is necessary?

I can give it a try myself

@haochenx
Copy link
Member Author

@rgrinberg regarding %{script}, I could probably attempt if you could kindly
(1) point me to code that demonstrates how it could be implement
(2) confirm the exact semantics that we want: I assume that we should stop appending the %{script} automatically when one is specified

@rgrinberg
Copy link
Member

Thanks @haochenx.

  1. You need to use the Expander.add_bindings function to add a script binding. See how we handling %{input-file} it should be pretty similar. Maybe we should just call it %{input-file} as well. I wonder what you think about that.
  2. That's correct.

@haochenx
Copy link
Member Author

@rgrinberg thanks for the pointers. last time I tried to grep for input-file to find code examples, but the logic involved was quite complicated and not directly referable. I will check again.

Using Expander.add_bindings to add the binding is one thing, but suppressing the default behavior is another. When attempting, I will see which of the following approach fit the code base the best, but please review what I may do:

Maybe we should just call it %{input-file} as well. I wonder what you think about that.

My take is that it will be confusing for the users, as it's not strictly the "input file", as the script executed is quite heavily processed snippets of the original .t test description files. So sticking to %{script} is a better idea IMO.

(I estimate that I will attempt to implement %{script} in a couple of days, but no guarantee.)

Signed-off-by: Christine Rose <christinerose@users.noreply.github.com>
rgrinberg added a commit that referenced this pull request Dec 28, 2025
Allow explicitly using `bash` as the shell instead of `sh`.

Extracted and/or loosely inspired form #8134

---------

Signed-off-by: Rudi Grinberg <me@rgrinberg.com>
@rgrinberg
Copy link
Member

We ended up providing bash as an option. Not as general as this PR, but enough to cover our needs. Feel free to extend what we have to add more shells.

@rgrinberg rgrinberg closed this Jan 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants