
Commit cadd3d9

ICICLE metadata added

1 parent 7570419 commit cadd3d9

File tree

2 files changed: +119 -46 lines changed

README.md

Lines changed: 92 additions & 46 deletions
@@ -9,6 +9,8 @@
 
 LLM-based reasoning using Z3 theorem proving with multiple backend support (SMT2 and JSON).
 
+**Tags:** `AI4CI` `Software`
+
 ## Features
 
 - **Dual Backend Support**: Choose between SMT2 (default) or JSON execution backends
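The dual-backend bullet above is easiest to grasp with a concrete sketch of the two program shapes. Note this is a hypothetical illustration: the JSON structure below is invented for exposition and is not z3adapter's actual DSL.

```python
# Hypothetical illustration of the two program formats behind the dual-backend
# design: the same trivial fact as an SMT-LIB 2.0 string and as JSON data.
# The JSON shape here is invented for illustration, not z3adapter's real DSL.
import json

smt2_program = """\
(declare-const mortal Bool)
(assert mortal)
(check-sat)
"""

json_program = {
    "declarations": [{"name": "mortal", "sort": "Bool"}],
    "assertions": ["mortal"],
    "goal": "check-sat",
}

# The SMT2 backend would hand the string to Z3 as-is; a JSON backend would
# walk the structure and issue equivalent Z3 API calls.
print("(check-sat)" in smt2_program)                         # True
print(json.loads(json.dumps(json_program)) == json_program)  # True
```

The practical difference: SMT2 programs are plain text an LLM can emit directly, while a JSON program is structured data that can be validated before execution.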
@@ -18,6 +20,38 @@ LLM-based reasoning using Z3 theorem proving with multiple backend support (SMT2
 - **Batch Evaluation Pipeline**: Built-in tools for dataset evaluation and metrics
 - **Postprocessing Techniques**: Self-Refine, Self-Consistency, Decomposed Prompting, and Least-to-Most Prompting for enhanced reasoning quality
 
+### License
+
+[![License](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+
+This project is licensed under the MIT License. See [LICENSE](LICENSE) for details.
+
+## References
+
+- [Z3 Theorem Prover](https://github.com/Z3Prover/z3) — The underlying SMT solver used by ProofOfThought.
+- [OpenAI API](https://platform.openai.com/docs) — LLM provider for reasoning generation.
+- [Azure OpenAI Service](https://learn.microsoft.com/en-us/azure/ai-services/openai/) — Azure-hosted LLM endpoint support.
+- [SMT-LIB Standard](https://smtlib.cs.uiowa.edu/) — The SMT-LIB 2.0 standard used by the SMT2 backend.
+- [Diataxis Documentation Framework](https://diataxis.fr/) — Framework guiding the structure of this documentation.
+
+## Acknowledgements
+
+This work was supported by:
+
+*National Science Foundation (NSF) funded AI institute for Intelligent Cyberinfrastructure with Computational Learning in the Environment (ICICLE) (OAC 2112606)*
+
+## Issue Reporting
+
+If you encounter any issues, please report them via [GitHub Issues](https://github.com/debarghaG/proofofthought/issues). When filing an issue, please include:
+- A clear description of the problem
+- Steps to reproduce the issue
+- Your Python version and OS
+- Relevant logs or error messages
+
+---
+
+# Tutorials
+
 ## Installation
 
 ### From PyPI (Recommended)
@@ -123,21 +157,35 @@ result = pot.query("Would Nancy Pelosi publicly denounce abortion?")
 print(result.answer) # False
 ```
 
-## Batch Evaluation
+## Examples
 
-```python
-from z3adapter.reasoning import EvaluationPipeline, ProofOfThought
+The `examples/` directory contains complete working examples for various use cases:
 
-evaluator = EvaluationPipeline(proof_of_thought=pot, output_dir="results/")
-result = evaluator.evaluate(
-    dataset="data/strategyQA_train.json",
-    question_field="question",
-    answer_field="answer",
-    max_samples=10
-)
-print(f"Accuracy: {result.metrics.accuracy:.2%}")
+- **simple_usage.py** - Basic usage with OpenAI
+- **azure_simple_example.py** - Simple Azure OpenAI integration
+- **backend_comparison.py** - Comparing SMT2 vs JSON backends
+- **batch_evaluation.py** - Evaluating on datasets
+- **postprocessor_example.py** - Using postprocessing techniques
+
+### Running Examples After pip Install
+
+If you installed via `pip install proofofthought`, you can create your own scripts anywhere using the Quick Start examples above. The examples directory is primarily for development and testing.
+
+### Running Examples in Development Mode
+
+If you cloned the repository:
+
+```bash
+cd /path/to/proofofthought
+python examples/simple_usage.py
 ```
 
+**Note:** Some examples use helper modules like `utils/azure_config.py` which are only available when running from the repository root.
+
+---
+
+# How-To Guides
+
 ## Backend Selection
 
 ProofOfThought supports two execution backends:
@@ -186,40 +234,21 @@ Available techniques:
 
 See [POSTPROCESSORS.md](POSTPROCESSORS.md) for complete documentation and usage examples.
 
-## Architecture
-
-The system has two layers:
-
-1. **High-level API** (`z3adapter.reasoning`) - Simple Python interface for reasoning tasks
-2. **Low-level execution** (`z3adapter.backends`) - JSON DSL or SMT2 backend for Z3
-
-Most users should use the high-level API.
-
-## Examples
-
-The `examples/` directory contains complete working examples for various use cases:
-
-- **simple_usage.py** - Basic usage with OpenAI
-- **azure_simple_example.py** - Simple Azure OpenAI integration
-- **backend_comparison.py** - Comparing SMT2 vs JSON backends
-- **batch_evaluation.py** - Evaluating on datasets
-- **postprocessor_example.py** - Using postprocessing techniques
-
-### Running Examples After pip Install
-
-If you installed via `pip install proofofthought`, you can create your own scripts anywhere using the Quick Start examples above. The examples directory is primarily for development and testing.
-
-### Running Examples in Development Mode
+## Batch Evaluation
 
-If you cloned the repository:
+```python
+from z3adapter.reasoning import EvaluationPipeline, ProofOfThought
 
-```bash
-cd /path/to/proofofthought
-python examples/simple_usage.py
+evaluator = EvaluationPipeline(proof_of_thought=pot, output_dir="results/")
+result = evaluator.evaluate(
+    dataset="data/strategyQA_train.json",
+    question_field="question",
+    answer_field="answer",
+    max_samples=10
+)
+print(f"Accuracy: {result.metrics.accuracy:.2%}")
 ```
 
-**Note:** Some examples use helper modules like `utils/azure_config.py` which are only available when running from the repository root.
-
 ## Running Experiments
 
 You can use this repository as a strong baseline for LLM+Solver methods. This code is generally benchmarked with GPT-5 on the first 100 samples of 5 datasets, as an indicator of whether we broke something during development. These numbers are not the best, and you can certainly get better numbers with better prompt engineering with this same tooling. Please feel free to put in a PR if you get better numbers with modified prompts.
@@ -236,9 +265,28 @@ This will:
 - Generate results tables in `results/`
 - Automatically update the benchmark results section below
 
+---
+
+# Explanation
+
+## Architecture
+
+The system has two layers:
+
+1. **High-level API** (`z3adapter.reasoning`) - Simple Python interface for reasoning tasks
+2. **Low-level execution** (`z3adapter.backends`) - JSON DSL or SMT2 backend for Z3
+
+Most users should use the high-level API.
+
+For full documentation, visit the [ProofOfThought Documentation Site](https://debarghag.github.io/proofofthought/).
+
+---
+
+# Reference
+
 <!-- BENCHMARK_RESULTS_START -->
 
-# Benchmark Results
+## Benchmark Results
 
 **Last Updated:** 2025-10-16 18:14:07
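The two-layer architecture described above can be sketched as a thin facade over pluggable backends. All names below are hypothetical stand-ins for exposition, not the actual z3adapter classes:

```python
# Illustrative sketch of a two-layer design: a high-level facade delegating to
# interchangeable low-level backends. Names are hypothetical, not z3adapter's.
from abc import ABC, abstractmethod


class Backend(ABC):
    """Low-level layer: executes a logic program and reports satisfiability."""

    @abstractmethod
    def execute(self, program: str) -> bool: ...


class FakeSmt2Backend(Backend):
    def execute(self, program: str) -> bool:
        # A real backend would invoke Z3; this stub just pattern-matches.
        return "(assert false)" not in program


class ReasoningFacade:
    """High-level layer: the only surface most users need to touch."""

    def __init__(self, backend: Backend) -> None:
        self.backend = backend

    def query(self, program: str) -> bool:
        return self.backend.execute(program)


pot = ReasoningFacade(FakeSmt2Backend())
print(pot.query("(assert true) (check-sat)"))  # True
```

The design choice this illustrates: swapping SMT2 for JSON execution only requires a different `Backend`, while user code against the facade stays unchanged.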
@@ -255,11 +303,9 @@ This will:
 | CONDITIONALQA | JSON | 100 | 76.00% | 0.9180 | 0.8750 | 0.8960 | 89.00% |
 | STRATEGYQA | JSON | 100 | 68.00% | 0.7500 | 0.7895 | 0.7692 | 86.00% |
 
-
-
 <!-- BENCHMARK_RESULTS_END -->
 
-# Citations
+## Citations
 
 Please consider citing our work if you find this useful.
 
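The precision, recall, and F1 columns in the benchmark rows above are internally consistent: F1 is the harmonic mean of precision and recall. A quick check against the two quoted rows:

```python
# Verify F1 = 2*P*R / (P + R) against the benchmark rows quoted above.
def f1(precision: float, recall: float) -> float:
    return 2 * precision * recall / (precision + recall)

# CONDITIONALQA row: P=0.9180, R=0.8750 -> table reports F1 = 0.8960
print(round(f1(0.9180, 0.8750), 4))  # 0.896
# STRATEGYQA row: P=0.7500, R=0.7895 -> table reports F1 = 0.7692
print(round(f1(0.7500, 0.7895), 4))  # 0.7692
```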
@@ -283,4 +329,4 @@ booktitle={The Thirty-ninth Annual Conference on Neural Information Processing S
 year={2025},
 url={https://openreview.net/forum?id=QfKpJ00t2L}
 }
-```
+```

component.yaml

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+components:
+  - id: ProofOfThought
+    owner: Debargha Ganguly
+    primaryThrust: core/ModelCommons
+    name: Proof of Thought
+    status: BetaRelease
+    website: https://debarghag.github.io/proofofthought/
+    description: LLM-based reasoning using Z3 theorem proving with multiple backend support (SMT2 and JSON). Provides a high-level Python API for neurosymbolic reasoning tasks with postprocessing techniques for enhanced quality.
+    componentVersion: 1.0.1
+    targetIcicleRelease: 2023-04
+    licenseUrl: https://github.com/debarghaG/proofofthought/blob/main/LICENSE
+    publicAccess: true
+    sourceCodeUrl: https://github.com/debarghaG/proofofthought
+    releaseNotesUrl: https://github.com/debarghaG/proofofthought/releases
+    citation: "Debargha Ganguly, Srinivasan Iyengar, Vipin Chaudhary, Shivkumar Kalyanaraman. (2024) PROOF OF THOUGHT: Neurosymbolic Program Synthesis allows Robust and Interpretable Reasoning. The First Workshop on System-2 Reasoning at Scale, NeurIPS'24. https://openreview.net/forum?id=Pxx3r14j3U"
+    pypiPackage: proofofthought:1.0.1
+    codeReviewConducted: true
+    testsWritten: true
+    securityReviewConducted: false
+    biasAssessmentConducted: false
+    usageDocumentationAvailable: true
+    usageDocumentationUrl: https://debarghag.github.io/proofofthought/
+    developerDocumentationAvailable: true
+    developerDocumentationUrl: https://github.com/debarghaG/proofofthought/blob/main/README.md
+    trainingTutorialsAvailable: true
+    trainingTutorialsUrl: https://debarghag.github.io/proofofthought/
+    usageMetricsCollected: false
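A consumer of this metadata would normally parse the file with PyYAML. As a dependency-free sketch of reading the flat `key: value` fields shown above (the embedded sample and the notion of "required" keys are assumptions for illustration, not an ICICLE specification):

```python
# Dependency-free sketch: extract the flat "key: value" fields of a component
# entry like the one above. A real consumer would use PyYAML; the sample text
# and key choices here are illustrative assumptions.
component_yaml = """\
components:
  - id: ProofOfThought
    owner: Debargha Ganguly
    componentVersion: 1.0.1
    publicAccess: true
    pypiPackage: proofofthought:1.0.1
"""

def parse_component(text: str) -> dict[str, str]:
    fields: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip().lstrip("- ")
        # Skip section headers like "components:"; split only on the first
        # colon so values such as "proofofthought:1.0.1" survive intact.
        if ":" in line and not line.endswith(":"):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields

meta = parse_component(component_yaml)
print(meta["id"])                # ProofOfThought
print(meta["componentVersion"])  # 1.0.1
```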
