This repository contains the official implementation of LookPlanGraph, a method for embodied instruction following that leverages a scene graph composed of static assets and object priors. It also includes the GraSIF (Graph Scenes for Instruction Following) benchmark.
LookPlanGraph enables robots to plan and execute complex instructions in dynamic environments where object positions may change. It uses a Memory Graph to track the scene state and a Scene Graph Simulator (SGS) to validate actions. A Graph Augmentation Module utilizes a Vision Language Model (VLM) to dynamically update the graph based on the agent's observations.
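The interplay of these components can be illustrated with a minimal sketch. Note that the class and method names below are purely illustrative, not the repository's actual API:

```python
# Hypothetical sketch of a memory-graph update loop; names are illustrative
# and do NOT correspond to the repository's actual classes.
from dataclasses import dataclass, field


@dataclass
class MemoryGraph:
    """Tracks the believed scene state: objects mapped to their believed locations."""
    edges: dict = field(default_factory=dict)

    def update(self, observation: dict) -> None:
        # Merge newly observed object positions into the graph. In the real
        # system, a VLM extracts these observations from the agent's camera.
        self.edges.update(observation)


graph = MemoryGraph()
graph.update({"mug": "kitchen_table"})  # object prior from the static scene graph
graph.update({"mug": "sink"})           # agent observes that the mug has moved
assert graph.edges["mug"] == "sink"
```

The point of the sketch is the dynamic aspect: priors seed the graph, and later observations overwrite them, so plans are always validated against the most recent belief.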
```
Code/
├── LookPlanGraph/        # Core implementation of LookPlanGraph
├── baselines/            # Baseline method implementations
├── benchmarks/           # Benchmarks
│   ├── grasif/           # GraSIF benchmark
│   └── dynamic_env/      # Data for dynamic environments
├── utils/                # Utility scripts
├── results/              # Directory for storing experiment results
│   ├── grasif/
│   ├── ablation/
│   ├── show_config.yaml
│   └── calculate_metrics.py
├── config_grasif.yaml    # Main configuration file
├── grasif_test.py        # Main entry point for running experiments
└── requirements.txt      # Python dependencies
```
1. Clone the repository:

   ```bash
   git clone <repository-url>
   cd LookPlanGraph
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Set up environment variables. You need an API key for the LLM provider (OpenRouter is used by default):

   ```bash
   export OPEN_ROUTER_KEY='your_key_here'
   ```

   Alternatively, you can set the key directly in `config_grasif.yaml`.
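For scripts that need the key at runtime, it can be read back from the environment; a minimal sketch (the fallback behavior is an assumption, not the repository's documented logic):

```python
# Sketch: read the OpenRouter API key set in the step above.
import os

key = os.environ.get("OPEN_ROUTER_KEY", "")
if not key:
    # Assumed fallback: the repository can also take the key from
    # config_grasif.yaml, as noted above.
    print("OPEN_ROUTER_KEY is not set; falling back to config_grasif.yaml")
```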
The experiment settings are defined in `config_grasif.yaml`. You can modify this file to select the method, dataset, and LLM.
```yaml
LLM:
  model_name: meta-llama/llama-3.3-70b-instruct  # Model to use
  # ...
methods:
  names: [LookPlanGraph]  # Options: LookPlanGraph, ReAct, SayPlan, SayPlanLite, LLMasP, LLM+P
  mode: null              # Ablation modes: 'no_memory', 'no_corrections', etc.
dataset:
  subdatasets: [SayPlanOffice]  # Options: SayPlanOffice, VirtualHome, Behaviour1k
```

To run the evaluation on the GraSIF benchmark:

```bash
python grasif_test.py
```

This script will:
- Load the configuration from `config_grasif.yaml`.
- Initialize the selected dataset(s).
- Run the specified method(s) on the tasks.
- Save the results (success rates, plans, logs) to the `results/` directory.
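For scripted experiments it can be handy to read the same configuration programmatically. A minimal sketch using PyYAML; the inline YAML mirrors the excerpt above, and reading the actual file is shown in a comment:

```python
# Sketch: inspect the experiment settings with PyYAML (pip install pyyaml).
import yaml

CONFIG_TEXT = """
methods:
  names: [LookPlanGraph]
dataset:
  subdatasets: [SayPlanOffice]
"""

# In the repository you would instead do:
#   with open("config_grasif.yaml") as f:
#       config = yaml.safe_load(f)
config = yaml.safe_load(CONFIG_TEXT)
print(config["methods"]["names"])  # → ['LookPlanGraph']
```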
- ReAct, SayPlan, SayPlanLite: can be run directly by adding them to the `methods.names` list in `config_grasif.yaml`.
- LLM+P / LLMasP: these methods require a separate PDDL solver, which runs as a server inside a Docker container:
  1. Navigate to `Code/baselines/llmpp/`.
  2. Build and run the Docker container.
  3. Set the `llmpp_url` from Docker in `config_grasif.yaml` (default: `http://localhost:8091`).
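As a rough illustration of how a client might talk to such a solver server, here is a sketch; the endpoint path and payload field names are assumptions, not the repository's documented API:

```python
# Hypothetical client sketch for the PDDL-solver server. The endpoint path
# and JSON field names below are assumptions, NOT the repository's actual API.
import json

LLMPP_URL = "http://localhost:8091"  # matches the default llmpp_url in config_grasif.yaml


def build_request(domain_pddl: str, problem_pddl: str) -> str:
    """Package a planning request as JSON (field names are illustrative)."""
    return json.dumps({"domain": domain_pddl, "problem": problem_pddl})


payload = build_request("(define (domain demo) ...)", "(define (problem p1) ...)")
# A real call might then look like (endpoint path is a guess):
#   requests.post(LLMPP_URL + "/solve", data=payload)
print(payload)
```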
We evaluated LookPlanGraph against several baselines on the GraSIF dataset. The table below summarizes the performance in terms of Success Rate (SR), Average Plan Precision (APP), and Tokens Per Action (TPA).
| Method | SayPlan Office (SR↑ / APP↑ / TPA↓) | BEHAVIOR-1K (SR↑ / APP↑ / TPA↓) | RobotHow (SR↑ / APP↑ / TPA↓) |
|---|---|---|---|
| LLM-as-P | 0.47 / 0.59 / 1409 | 0.39 / 0.53 / 178 | 0.44 / 0.51 / 3417 |
| LLM+P | 0.07 / 0.21 / 1945 | 0.33 / 0.37 / 160 | 0.30 / 0.38 / 5396 |
| SayPlan | 0.46 / 0.59 / 3697 | 0.36 / 0.43 / 1888 | 0.86 / 0.87 / 5576 |
| SayPlan Lite | 0.53 / 0.68 / 1368 | 0.61 / 0.76 / 524 | 0.84 / 0.89 / 4641 |
| ReAct | 0.38 / 0.64 / 2503 | 0.47 / 0.61 / 1713 | 0.89 / 0.91 / 1322 |
| LookPlanGraph | 0.62 / 0.73 / 1989 | 0.60 / 0.77 / 1472 | 0.87 / 0.89 / 2653 |
The implementations of LLM-as-Planner and LLM+P were adapted from the original repository: https://github.com/Cranial-XIX/llm-pddl.
We utilize Fast Downward (https://github.com/aibasel/downward) as the underlying planner.
If you use this code or dataset in your research, please cite our paper:
```bibtex
@inproceedings{onishchenko2025lookplangraph,
  title={LookPlanGraph: Embodied Instruction Following Method with VLM Graph Augmentation},
  author={Onishchenko, Anatoly O. and Kovalev, Alexey K. and Panov, Aleksandr I.},
  year={2025}
}
```