Conversation
GPT-4.1-generated tasks covering 185 Unix commands across 9 categories (coreutils, text processing, compression, findutils, file utilities, networking, process management, environment, misc). Each task exercises a single command with realistic setup data and deterministic tests.

Generated by: https://github.com/gb-vmax/unix-101
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

GPT-4.1-generated tasks covering 157 Unix commands across 9 categories, all validated end-to-end in Docker (build, solve, test pass).

Commands fully cut (no passing tasks — need TTY, network, or produced unreliable tests): arp, bzcat, chgrp, df, dig, fuser, host, hostname, iconv, last, locate, nslookup, numfmt, parallel, patch, pidof, pr, printenv, sha1sum, ss, stat, strings, time, ts, unexpand, uniq, wdiff, who

Generated by: https://github.com/gb-vmax/unix-101
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Performed full review of 3542d8d...caa0c31
Analysis
• Duplicate Dockerfile chown commands appear across all 370 tasks—a code generation artifact that increases build time and indicates the pipeline needs refinement. Pre-merge removal required.
• Test validation gaps allow solutions to pass without verifying the intended command was actually used (e.g., xargs tasks only check final state). Tests need strengthening to validate command execution, not just outcomes.
• Category taxonomy inconsistencies exist where commands are grouped by package origin rather than functional purpose, creating a mismatch with the PR's stated organization.
• Alias/environment variable tasks have fundamental persistence issues: the solutions don't work as intended, and the tests pass only because they redefine the alias themselves. Architectural rethinking or clearer framing needed.
• 370 programmatically-generated tasks are explicitly "not yet validated"—high probability of edge cases, flag incompatibilities, and environmental failures. Validation pipeline must run before production readiness.
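The command-verification gap in the second bullet is usually closed with a PATH shim that logs invocations before delegating to the real binary. A minimal sketch; all paths here are illustrative, not taken from the task suite:

```shell
# Shadow xargs with a logging wrapper earlier on PATH, so a test can
# assert the command was actually invoked, not just that the outcome holds.
shims=$(mktemp -d)
real_xargs=$(command -v xargs)
cat > "$shims/xargs" <<EOF
#!/bin/bash
echo "xargs \$*" >> "$shims/cmd.log"
exec "$real_xargs" "\$@"
EOF
chmod +x "$shims/xargs"

# Stand-in for a solution script that is supposed to use xargs:
PATH="$shims:$PATH" bash -c "printf 'a\nb\n' | xargs echo"

# The test can now check usage in addition to the final state:
grep -q '^xargs' "$shims/cmd.log" && echo "xargs was invoked"
```

Wiring the same shim into a task's test.sh would let the xargs tasks verify both the resulting files and that xargs itself produced them.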
300 files reviewed | 3 comments
RUN echo 'Temporary file 2' > /home/user/temp_dir/subdir/file2.txt
RUN chown -R user:user /home/user
RUN chown -R user:user /home/user
Duplicate chown -R user:user /home/user command on lines 8 and 9. This pattern appears in many Dockerfiles across the PR (e.g., unix_101_20b2bab8, unix_101_229a1c19). Remove one of the duplicate commands to avoid redundant operations during image build.
Prompt for Agent
Task: Address review feedback left on GitHub.
Repository: VmaxAI/tasks#14
File: data/unix_101_008aa989/environment/Dockerfile#L9
Action: Open this file location in your editor, inspect the highlighted code, and resolve the issue described below.
Feedback:
Duplicate `chown -R user:user /home/user` command on lines 8 and 9. This pattern appears in many Dockerfiles across the PR (e.g., unix_101_20b2bab8, unix_101_229a1c19). Remove one of the duplicate commands to avoid redundant operations during image build.
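Because the duplicate is always the immediately following line, the cleanup is mechanical; a sketch using uniq on a stand-in file (the real Dockerfiles live under paths like data/unix_101_008aa989/environment/):

```shell
# Stand-in for one of the affected Dockerfiles.
df=$(mktemp)
cat > "$df" <<'EOF'
RUN echo 'Temporary file 2' > /home/user/temp_dir/subdir/file2.txt
RUN chown -R user:user /home/user
RUN chown -R user:user /home/user
EOF

# uniq drops consecutive identical lines, removing the duplicated chown
# while leaving distinct RUN instructions untouched.
uniq "$df" > "$df.fixed"
grep -c 'chown' "$df.fixed"   # prints 1
```

Applied across the affected environment/Dockerfile files, this removes the redundant build step flagged above; confirming with a quick grep that the duplicates really are adjacent would be prudent before running it in bulk.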
@@ -0,0 +1,2 @@
#!/bin/bash
bash -i -c "alias greet='echo Hello, world!' && alias -p" > /home/user/greet_alias.txt
Creating an alias in a non-interactive script won't persist after the script exits. Aliases are not exported to subshells. This solution won't work as intended - the test passes because test.sh redefines the alias itself. Consider instructing users to add this to ~/.bashrc instead, or clarify that this task demonstrates alias syntax only.
Prompt for Agent
Task: Address review feedback left on GitHub.
Repository: VmaxAI/tasks#14
File: data/unix_101_131a0447/solution/solve.sh#L2
Action: Open this file location in your editor, inspect the highlighted code, and resolve the issue described below.
Feedback:
Creating an alias in a non-interactive script won't persist after the script exits. Aliases are not exported to subshells. This solution won't work as intended - the test passes because test.sh redefines the alias itself. Consider instructing users to add this to ~/.bashrc instead, or clarify that this task demonstrates alias syntax only.
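The persistence problem can be demonstrated directly. In the sketch below a throwaway rcfile stands in for ~/.bashrc (to keep the demo self-contained); the alias survives because a fresh interactive shell sources its rcfile at startup:

```shell
# An alias defined inside a non-interactive script dies with that shell;
# writing the definition to a startup file is what makes it persist.
rc=$(mktemp)
echo "alias greet='echo Hello, world!'" > "$rc"

# A new interactive shell reads the rcfile and can expand the alias.
# (2>/dev/null hides the no-TTY job-control warnings.)
bash --rcfile "$rc" -i -c 'greet' 2>/dev/null
```

For the task itself, the equivalent fix is having solve.sh append the alias line to /home/user/.bashrc and having test.sh check it from an interactive shell, rather than letting the test redefine the alias.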
version = "1.0"
title = "Extract specific character columns from text file"
command = "cut"
category = "coreutils"
Category mismatch: The cut command is a text processing tool, but it's categorized as 'coreutils'. While technically part of coreutils, for consistency with the PR's stated organization (text processing, compression, findutils, etc.), consider using category = "text_processing" to match the taxonomy described in the PR summary.
Prompt for Agent
Task: Address review feedback left on GitHub.
Repository: VmaxAI/tasks#14
File: data/unix_101_00cbfa08/task.toml#L5
Action: Open this file location in your editor, inspect the highlighted code, and resolve the issue described below.
Feedback:
Category mismatch: The `cut` command is a text processing tool, but it's categorized as 'coreutils'. While technically part of coreutils, for consistency with the PR's stated organization (text processing, compression, findutils, etc.), consider using category = "text_processing" to match the taxonomy described in the PR summary.
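Under the functional taxonomy the fragment would read as follows (fields copied from the diff above; only the category value changes, and the exact name text_processing is an assumption inferred from the PR summary wording):

```toml
version = "1.0"
title = "Extract specific character columns from text file"
command = "cut"
category = "text_processing"
```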
Summary
--help/man output extracted from the Docker base image

Commands dropped (28)
No passing tasks — need TTY, network, or produced unreliable tests:
arp, bzcat, chgrp, df, dig, fuser, host, hostname, iconv, last, locate, nslookup, numfmt, parallel, patch, pidof, pr, printenv, sha1sum, ss, stat, strings, time, ts, unexpand, uniq, wdiff, who

Generation pipeline
discover.py extracts real help text from the Docker container (197 commands discovered)
generate.py feeds help text to GPT-4.1 to produce tasks with setup, solution, and test scripts
validate.py builds each task in Docker, runs the solution, runs the test — only passing tasks included

Test plan
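The per-task check that validate.py automates can be sketched in shell; the image tag, script paths, and the injectable $RUNNER (which lets the control flow run without Docker) are all assumptions, not the actual implementation:

```shell
# A task is kept only if build, solve, and test all succeed.
RUNNER=${RUNNER:-docker}   # swap in 'true' to dry-run the control flow
validate_task() {
  local dir=$1
  local tag="unix101/$(basename "$dir")"
  "$RUNNER" build -t "$tag" "$dir/environment" &&
  "$RUNNER" run --rm "$tag" bash /home/user/solve.sh &&
  "$RUNNER" run --rm "$tag" bash /home/user/test.sh
}

RUNNER=true validate_task data/unix_101_00cbfa08 && echo "task kept"
```

Chaining the three steps with && mirrors the described behavior: any build, solution, or test failure drops the task from the suite.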
validate.py against all tasks in Docker

🤖 Generated with Claude Code