
Occupancy grid image#822

Closed
s-desh wants to merge 2568 commits into dev from occupancy_grid_image
@s-desh s-desh commented Dec 10, 2025

This PR is one of many for encoding maps for agents. #804

Adds

  • Evals that use map images for point placement and map comprehension.
  • OccupancyGridImage, which encodes an OccupancyGrid as an RGB image and overlays the robot pose.
  • Interpret map skill: pulls maps and places points based on the query for navigation.
  • Vibe-coded annotator to add queries to evals.
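
The encoding step can be sketched roughly as follows. This is a minimal illustration, not the PR's OccupancyGridImage: it assumes ROS-style cell values (-1 unknown, 0 free, 100 occupied), and the names grid_to_rgb and overlay_robot are hypothetical. The colors follow the convention described later in this description (black obstacles, white free space, gray unexplored). The robot heading is not drawn here.

```python
import numpy as np

# Assumed cell convention (ROS-style): -1 unknown, 0 free, 100 occupied.
FREE_RGB = (255, 255, 255)     # white
OCCUPIED_RGB = (0, 0, 0)       # black
UNKNOWN_RGB = (128, 128, 128)  # gray
ROBOT_RGB = (255, 0, 0)        # marker color, chosen for illustration


def grid_to_rgb(grid: np.ndarray, occupied_threshold: int = 50) -> np.ndarray:
    """Map an (H, W) integer occupancy grid to an (H, W, 3) uint8 image."""
    image = np.empty((*grid.shape, 3), dtype=np.uint8)
    image[:] = FREE_RGB
    image[grid >= occupied_threshold] = OCCUPIED_RGB
    image[grid < 0] = UNKNOWN_RGB
    return image


def overlay_robot(image: np.ndarray, row: int, col: int, radius: int = 2) -> np.ndarray:
    """Return a copy with a filled square marker centered on the robot's cell."""
    out = image.copy()
    height, width = out.shape[:2]
    # Clip the marker to the image bounds.
    r0, r1 = max(row - radius, 0), min(row + radius + 1, height)
    c0, c1 = max(col - radius, 0), min(col + radius + 1, width)
    out[r0:r1, c0:c1] = ROBOT_RGB
    return out
```
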

Evals and Results

A dataset of floorplans (only 2 right now, with variations) is used to generate grids, which are evaluated on point placement and map comprehension. The dataset can be extended by adding new floorplans with expected answers for queries.

dimos/agents2/skills/interpret_map/eval/test_map_interpretability.yaml has queries of varying difficulty to evaluate spatial reasoning. For now, the minimum pass rates for point placement and map comprehension are set to 0.25 and 0.7 respectively.

Run

For evals run pytest -s dimos/agents2/skills/interpret_map/eval/test_map_eval.py.

Examples of successful point placement results.


Go to the conference table in the office

Second room to the robot’s left along the corridor

a point immediately behind the robot

second room to the robot’s left along the corridor

Debug

  • Failed point placement tasks store the image for debugging in this format: debug_goal_placement_<map_id>_<query>.png. The placed goal is marked with +.
  • Failed map comprehension answers are logged in the terminal.

Adding new maps and queries for eval

  1. Get a map image, with black representing obstacles, white representing free space and gray representing unexplored regions.
  2. Add an entry with a new map_id, image_path, robot_pose.position in pixels and orientation under map_comprehension_tests or point_placement_tests in dimos/agents2/skills/interpret_map/eval/test_map_interpretability.yaml.
  3. For point placement tests, use dimos/agents2/skills/interpret_map/eval/annotate.py <image.png> to create bounding box and question pairs. These are saved into a questions.yaml; copy them to the main testing yaml mentioned above.
  4. For map comprehension tests, manually add questions and the expected regex patterns to check in the main yaml.
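
The two kinds of checks described above can be sketched as follows. This is an illustrative sketch only: the bounding box layout (x_min, y_min, x_max, y_max), the case-insensitive regex match, and the helper names are assumptions, not the code in test_map_eval.py. The two thresholds are the minimum pass rates quoted earlier.

```python
import re

MIN_POINT_PLACEMENT_PASS_RATE = 0.25
MIN_MAP_COMPREHENSION_PASS_RATE = 0.7


def point_in_bbox(point, bbox):
    """Check a placed (x, y) pixel against an annotated box.

    bbox is assumed to be (x_min, y_min, x_max, y_max) in pixels.
    """
    x, y = point
    x_min, y_min, x_max, y_max = bbox
    return x_min <= x <= x_max and y_min <= y <= y_max


def answer_matches(answer, pattern):
    """Check a free-text answer against an expected regex (case-insensitive here)."""
    return re.search(pattern, answer, flags=re.IGNORECASE) is not None


def pass_rate(results):
    """Fraction of passed checks; an empty run counts as 0.0."""
    results = list(results)
    return sum(results) / len(results) if results else 0.0
```

A run would then pass when, for example, pass_rate(point_results) >= MIN_POINT_PLACEMENT_PASS_RATE.
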

Running with agents

The interpret_map_skill pulls the map, places a goal based on the query, and returns the world coordinates to navigate to.

Run dimos --replay run unitree-go2-agentic --extra-module interpret_map_skill

In the CLI, you can ask queries like "get a goal right in front of the robot" or "get a goal to the northeast side of the map".

Few observations

  • Orienting the map so the robot always points up improves scores on robot centric queries like "second room to the robot's left" etc.
  • Point placement is highly sensitive to the prompt. For example, the identified points are far off from the description if the prompt repeatedly mentions "place the point only in free (white) space". Moving a goal to the nearest free space in post-processing is feasible. Another example: using "white pixels" instead of "white area" in the prompt gives better results.
  • The VLM does better at queries that do not need robot orientation to answer.
  • Noise influences how well the robot orientation is understood.
  • Qwen pixel identification works well when max dimension is limited to 1024px, while keeping aspect ratio.
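
Two of the observations above lend themselves to a short sketch: limiting the longest image side to 1024px while keeping the aspect ratio, and moving a goal to the nearest free cell in post-processing. The function names are hypothetical and the nearest-cell search is a simple brute-force choice for illustration, not necessarily what the skill does.

```python
import numpy as np

MAX_DIM = 1024  # longest side, per the observation above


def fit_within(width, height, max_dim=MAX_DIM):
    """Return (new_width, new_height, scale), preserving aspect ratio; never upscales."""
    scale = min(1.0, max_dim / max(width, height))
    return max(1, round(width * scale)), max(1, round(height * scale)), scale


def to_original_pixels(x, y, scale):
    """Map a point found on the resized image back to original image pixels."""
    return x / scale, y / scale


def snap_to_nearest_free(free_mask, row, col):
    """Return the free (row, col) closest to the requested cell, or None if no cell is free.

    free_mask is a boolean (H, W) array that is True where the map is free space.
    """
    if free_mask[row, col]:
        return row, col
    rows, cols = np.nonzero(free_mask)
    if rows.size == 0:
        return None
    # Brute-force squared Euclidean distance to every free cell.
    nearest = np.argmin((rows - row) ** 2 + (cols - col) ** 2)
    return int(rows[nearest]), int(cols[nearest])
```
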

@s-desh s-desh closed this Dec 10, 2025
@s-desh s-desh reopened this Dec 10, 2025

@greptile-apps greptile-apps Bot left a comment

Contributor


13 files reviewed, 2 comments


Comment thread dimos/agents2/skills/interpret_map/OccupancyGridImage.py
Comment thread dimos/agents2/skills/interpret_map/OccupancyGridImage.py

@leshy leshy left a comment

Member


just these small things, otherwise looks good

Comment thread dimos/agents2/skills/navigation.py Outdated
Comment thread dimos/agents2/skills/test_map_eval.py Outdated
Comment thread dimos/agents2/skills/interpret_map/OccupancyGridImage.py
@s-desh s-desh force-pushed the occupancy_grid_image branch 3 times, most recently from 0bdf5a6 to 6fbe4eb Compare December 16, 2025 14:13
@leshy

leshy commented Dec 16, 2025

Member

I wrote a quick way to ask an agent a question and see the result in foxglove in realtime on top of your map. Your resolution was wrong: you were placing points in pixels and not meters. For example, [480, 270, 0.0] is 480,270 meters away from zero zero on the map.

I made the system define transforms correctly (for occupancygrid world frame, for robot base_link)

answer to "conference room with a bunch of chairs"

2025-12-16_23-30

likely something wrong with how image is rendered for an agent? idk, but wanted this so I can ask a few questions myself and see results

run foxglove-bridge in console, run foxglove, import occupancygrid_agent_foxglove.json dashboard in your evals/ dir

run (twice initially to see the image, because bridge is a bit dumb)

pytest -svk ivan dimos/agents2/skills/interpret_map/eval/test_map_eval.py

@s-desh

s-desh commented Dec 17, 2025

Author

Your resolution was wrong, you were placing points in pixels and not meters, for example [480, 270, 0.0] is 480,270 meters away from zero zero on the map.

This is taken care of by position=[ i * self.occupancy_grid.info.resolution for i in self.robot_pose["position"] # convert pixels to meters ], and it works as expected in tests. I've removed it now, so you can continue using meters for position.

Pixel to world conversion was incorrect in your version; that's fixed. The actual response for your query looks like

Screenshot from 2025-12-17 12-44-28

Adding "long table" in your query gives a correct response.
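
For reference, the pixel/meter distinction discussed in this thread can be sketched as below. This is a hedged illustration, not the conversion in this branch: it assumes image row 0 is the top of the map, the grid origin is the bottom-left corner of the bottom-left cell, and the grid is not rotated. The function names and the 0.05 m resolution in the usage note are hypothetical.

```python
import math


def pixel_to_world(px, py, image_height, resolution, origin_x=0.0, origin_y=0.0):
    """Convert an image pixel (column px, row py) to world meters at the cell center.

    Assumes row 0 is the top of the image, so the y axis is flipped.
    """
    x = origin_x + (px + 0.5) * resolution
    y = origin_y + (image_height - 1 - py + 0.5) * resolution
    return x, y


def world_to_pixel(x, y, image_height, resolution, origin_x=0.0, origin_y=0.0):
    """Inverse of pixel_to_world under the same assumptions."""
    px = math.floor((x - origin_x) / resolution)
    py = image_height - 1 - math.floor((y - origin_y) / resolution)
    return px, py
```

With an assumed resolution of 0.05 m per pixel and a 540 px tall image, pixel (480, 270) lands about 24 m and 13.5 m from the origin, rather than 480 m and 270 m.
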

@s-desh s-desh force-pushed the occupancy_grid_image branch from 482ed81 to 7165549 Compare December 23, 2025 15:26
Comment thread dimos/agents2/skills/interpret_map/OccupancyGridImage.py
Comment thread dimos/agents2/skills/interpret_map/OccupancyGridImage.py
Comment thread dimos/agents2/skills/interpret_map/eval/test_map_eval.py
Comment thread dimos/agents2/skills/interpret_map/eval/test_map_eval.py
@s-desh s-desh force-pushed the occupancy_grid_image branch from ab9769c to 3e5509b Compare December 26, 2025 10:12
leshy and others added 13 commits December 26, 2025 22:37
Former-commit-id: f721a9d [formerly ec2cc6d]
Former-commit-id: a5f5091
Former-commit-id: 1ebf5bd [formerly c1ce353]
Former-commit-id: 5906348
Former-commit-id: ea5cf0d [formerly 063c712]
Former-commit-id: c1fa3fa
Former-commit-id: 49e4bd4 [formerly 09ef890]
Former-commit-id: cefb46a
Former-commit-id: 6e70cc1 [formerly eb37f0d]
Former-commit-id: 0233868
Former-commit-id: c1f8483 [formerly 97123d1]
Former-commit-id: d1211a1
Former-commit-id: 4da9043 [formerly fa06c86]
Former-commit-id: 0182f73
Former-commit-id: 4a1b103 [formerly 69ba61f]
Former-commit-id: e6f8de4
Former-commit-id: 48efc26 [formerly d97b178]
Former-commit-id: b224c15
Former-commit-id: 5f3d01c [formerly a70c3a9]
Former-commit-id: d9f13ea
s-desh and others added 13 commits January 4, 2026 14:54
Former-commit-id: 2c9e1be [formerly 2b25e30]
Former-commit-id: 886c43c
Former-commit-id: db6bf76
Former-commit-id: 599b66f
Former-commit-id: 6fbe4eb
Former-commit-id: 7dad55a
Former-commit-id: b5590e2
Former-commit-id: 6377930
Former-commit-id: c634412
Former-commit-id: dc3c687
Former-commit-id: 2d0b444
Former-commit-id: 00a1719
Former-commit-id: 79eb8c0
Former-commit-id: f0f486a
Former-commit-id: 842021d
Former-commit-id: 86d0b7c
Former-commit-id: 15f8bb1
Former-commit-id: 6ff050c
Former-commit-id: 7165549
Former-commit-id: 260c101
Former-commit-id: 3e5509b
Former-commit-id: 90d5a91
@spomichter spomichter force-pushed the occupancy_grid_image branch from 3e5509b to 90d5a91 Compare January 8, 2026 13:59
@spomichter spomichter requested a review from a team January 8, 2026 13:59
@greptile-apps

greptile-apps Bot commented Jan 8, 2026

Contributor

Too many files changed for review.


@paul-nechifor

Contributor

Closed because it's old and it's before the rebase.


7 participants