Add 370 unix101 single-command tasks #14

Open
gb-vmax wants to merge 2 commits into VmaxAI:main from gb-vmax:unix101/harbor-tasks

gb-vmax commented Feb 28, 2026

Summary

  • Adds 283 validated Harbor-format tasks covering 157 Unix commands across 9 categories
  • Each task exercises a single command (with flags) — no pipes or chaining
  • All tasks validated end-to-end in Docker (build, solve, test pass)
  • Tasks generated by GPT-4.1 using real --help/man output extracted from the Docker base image
  • Categories: coreutils, text processing, compression, findutils, file utilities, networking, process management, environment, misc

Commands dropped (28)

No passing tasks — need TTY, network, or produced unreliable tests:

arp, bzcat, chgrp, df, dig, fuser, host, hostname, iconv, last, locate, nslookup, numfmt, parallel, patch, pidof, pr, printenv, sha1sum, ss, stat, strings, time, ts, unexpand, uniq, wdiff, who

Generation pipeline

  • Source repo: https://github.com/gb-vmax/unix-101
  • discover.py extracts real help text from the Docker container (197 commands discovered)
  • generate.py feeds help text to GPT-4.1 to produce tasks with setup, solution, and test scripts
  • validate.py builds each task in Docker, runs the solution, runs the test — only passing tasks included
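
The validate.py step described above can be sketched roughly as follows. This is not the code from the unix-101 repo: the function names, the image tag, the /task mount point, and the tests/test.sh path are assumptions made for illustration. Only the environment/Dockerfile and solution/solve.sh locations appear in this PR.

```python
import subprocess
from pathlib import Path
from typing import Callable, Sequence

# A runner takes a command line and returns its exit code.
Runner = Callable[[Sequence[str]], int]


def run(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code without raising."""
    return subprocess.run(list(cmd), check=False).returncode


def validate_task(task_dir: Path, runner: Runner = run) -> bool:
    """Build the task image, then run the solution and the test in one container.

    Assumed layout (hypothetical apart from the first two paths):
      task_dir/environment/Dockerfile
      task_dir/solution/solve.sh
      task_dir/tests/test.sh
    """
    tag = f"unix101-validate-{task_dir.name}".lower()  # hypothetical image tag
    steps = [
        ["docker", "build", "-t", tag, str(task_dir / "environment")],
        [
            "docker", "run", "--rm",
            "-v", f"{task_dir.resolve()}:/task:ro",  # hypothetical mount point
            tag, "bash", "-c",
            # Solution and test share a container so the test sees the solution's effects.
            "bash /task/solution/solve.sh && bash /task/tests/test.sh",
        ],
    ]
    # all() stops at the first failing step, so a failed build never reaches `docker run`.
    return all(runner(step) == 0 for step in steps)
```

Passing a custom runner makes it possible to dry-run the flow without Docker installed; a task would be kept only when validate_task returns True.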

Test plan

  • Run validate.py against all tasks in Docker
  • Remove tasks that fail validation
  • Verify task coverage across command categories

🤖 Generated with Claude Code

gb-vmax and others added 2 commits February 27, 2026 16:23
GPT-4.1-generated tasks covering 185 Unix commands across 9 categories
(coreutils, text processing, compression, findutils, file utilities,
networking, process management, environment, misc). Each task exercises
a single command with realistic setup data and deterministic tests.

Generated by: https://github.com/gb-vmax/unix-101

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

GPT-4.1-generated tasks covering 157 Unix commands across 9 categories,
all validated end-to-end in Docker (build, solve, test pass).

Commands fully cut (no passing tasks — need TTY, network, or
produced unreliable tests):
  arp, bzcat, chgrp, df, dig, fuser, host, hostname, iconv, last,
  locate, nslookup, numfmt, parallel, patch, pidof, pr, printenv,
  sha1sum, ss, stat, strings, time, ts, unexpand, uniq, wdiff, who

Generated by: https://github.com/gb-vmax/unix-101

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
mesa-dot-dev bot left a comment

Performed full review of 3542d8d...caa0c31

Analysis

• Duplicate Dockerfile chown commands appear across all 370 tasks—a code generation artifact that increases build time and indicates the pipeline needs refinement. Pre-merge removal required.

• Test validation gaps allow solutions to pass without verifying the intended command was actually used (e.g., xargs tasks only check final state). Tests need strengthening to validate command execution, not just outcomes.

• Category taxonomy inconsistencies exist where commands are grouped by package origin rather than functional purpose, creating organizational misalignment with stated design.

• Alias/environment variable tasks have fundamental persistence issues where solutions don't work as intended—only tests pass because they redefine the alias themselves. Architectural rethinking or clearer framing needed.

• 370 programmatically-generated tasks are explicitly "not yet validated"—high probability of edge cases, flag incompatibilities, and environmental failures. Validation pipeline must run before production readiness.

300 files reviewed | 3 comments

RUN echo 'Temporary file 2' > /home/user/temp_dir/subdir/file2.txt
RUN chown -R user:user /home/user
RUN chown -R user:user /home/user

Low

Duplicate chown -R user:user /home/user command on lines 8 and 9. This pattern appears in many Dockerfiles across the PR (e.g., unix_101_20b2bab8, unix_101_229a1c19). Remove one of the duplicate commands to avoid redundant operations during image build.
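
One way to clean this up across many generated Dockerfiles is a small post-processing pass. The sketch below is an assumption about how that could be done, not part of the existing pipeline: the function name is hypothetical, and it drops only a non-blank line that exactly repeats the line immediately before it, so it would need review for Dockerfiles that legitimately repeat a line (for example inside a heredoc).

```python
def drop_consecutive_duplicates(dockerfile_text: str) -> str:
    """Remove any non-blank line that exactly repeats the previous line."""
    kept: list[str] = []
    for line in dockerfile_text.splitlines():
        is_repeat = bool(kept) and line.strip() != "" and line.strip() == kept[-1].strip()
        if is_repeat:
            continue  # skip the redundant instruction
        kept.append(line)
    # Preserve the trailing newline if the input had one.
    trailing = "\n" if dockerfile_text.endswith("\n") else ""
    return "\n".join(kept) + trailing


if __name__ == "__main__":
    sample = (
        "RUN echo 'Temporary file 2' > /home/user/temp_dir/subdir/file2.txt\n"
        "RUN chown -R user:user /home/user\n"
        "RUN chown -R user:user /home/user\n"
    )
    print(drop_consecutive_duplicates(sample), end="")
```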

File: data/unix_101_008aa989/environment/Dockerfile#L9

@@ -0,0 +1,2 @@
#!/bin/bash
bash -i -c "alias greet='echo Hello, world!' && alias -p" > /home/user/greet_alias.txt

Medium

An alias created in this script won't persist after the script exits: it exists only inside the bash -i -c child shell, and aliases are not inherited by other shell processes. This solution won't work as intended; the test passes only because test.sh redefines the alias itself. Consider instructing users to add the alias to ~/.bashrc instead, or clarify that this task demonstrates alias syntax only.
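
The persistence point can be checked with a short experiment. The sketch below is illustrative only (the function name is hypothetical, and it assumes bash is on PATH with no startup file that defines the alias): it defines an alias in one bash process, then asks a second, fresh bash process whether that alias exists.

```python
import subprocess


def alias_visible_in_new_shell(name: str) -> bool:
    """Define an alias in one bash process, then probe for it in a fresh one."""
    # First shell: define the alias, then exit.
    subprocess.run(["bash", "-c", f"alias {name}='echo Hello, world!'"], check=True)
    # Second shell: `alias NAME` exits non-zero when NAME is not defined.
    probe = subprocess.run(
        ["bash", "-c", f"alias {name}"],
        capture_output=True,
        text=True,
    )
    return probe.returncode == 0


if __name__ == "__main__":
    print(alias_visible_in_new_shell("greet"))
```

The second shell does not see the alias, which is why a test that only passes after redefining the alias is not exercising the solution.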

File: data/unix_101_131a0447/solution/solve.sh#L2

version = "1.0"
title = "Extract specific character columns from text file"
command = "cut"
category = "coreutils"

Low

Category mismatch: The cut command is a text processing tool, but it's categorized as 'coreutils'. While technically part of coreutils, for consistency with the PR's stated organization (text processing, compression, findutils, etc.), consider using category = "text_processing" to match the taxonomy described in the PR summary.

File: data/unix_101_00cbfa08/task.toml#L5
