LLM Bias Analysis Tools

A research toolkit for systematically analyzing gender bias in Large Language Model (LLM) responses to job description generation tasks.

Overview

This repository contains tools that help researchers study gender bias patterns in artificial intelligence systems, specifically Large Language Models. The toolkit systematically tests how different LLMs respond to job description generation prompts and analyzes whether the models exhibit gender stereotyping or bias in their outputs.

What This Tool Does

  • Generates job descriptions across 196 occupations using a range of LLMs
  • Tests for gender bias by comparing responses to neutral vs. gendered job titles (e.g., "server" vs. "waiter" vs. "waitress")
  • Analyzes multiple dimensions including salary estimates, language patterns, and role descriptions
  • Supports 40+ models from providers such as Anthropic (Claude), OpenAI (GPT), Google (Gemini), and Meta (Llama)
  • Provides statistical analysis including Inter-Rater Reliability (IRR) and Bem Sex Role Inventory (BSRI) scoring

Research Applications

This toolkit is designed for academic researchers studying:

  • AI bias and fairness
  • Gender representation in AI systems
  • Computational social science
  • Digital humanities
  • Technology ethics and policy

Authors and Contributors

Authors: Jennifer M. Krebsbach, Jane E. Lee, Steven Zeck, Arti Thakur, and Martin Hilbert

Institution: University of California, Davis

Contact: spzeck@health.ucdavis.edu, jkrebsbach@ucdavis.edu

Citation

If you use this framework in your research, please cite:

    @misc{llm-bias-analysis-tools,
      title={LLM Gender Bias Analysis Tools},
      author={Krebsbach, Jennifer and Lee, Jane and Zeck, Steven},
      year={2025},
      url={https://github.com/xaintly/llm_bias_analysis_tools}
    }

License

This project is licensed under the MIT License - see the LICENSE file for details.

Getting Started

Prerequisites

  • Python 3.8 or higher
  • Access to at least one LLM provider (AWS Bedrock, OpenAI, Google AI, etc.)
  • Basic familiarity with command line operations

Installation

  1. Clone the repository:

    git clone https://github.com/xaintly/llm_bias_analysis_tools.git
    cd llm_bias_analysis_tools
  2. Install Python dependencies:

    pip install boto3 openai google-generativeai requests pandas openpyxl

Configuration Setup

The toolkit supports multiple LLM providers. You only need to configure the providers you plan to use.

AWS Bedrock Setup (Recommended for comprehensive model access)

AWS Bedrock provides access to Claude, Llama, Titan, and other models through a single interface.

  1. Create AWS profile file: Create a file named .aws.profile in the project directory containing the name of the AWS profile to use:

    default
    
  2. Configure AWS credentials (choose one method):

    Option A: Environment variables

    export AWS_ACCESS_KEY_ID=your_access_key_here
    export AWS_SECRET_ACCESS_KEY=your_secret_key_here
    export AWS_SESSION_TOKEN=your_session_token_here  # if using temporary credentials

    Option B: AWS credentials file

    aws configure --profile default
  3. Request model access:

    • Log into AWS Console → Bedrock → Model Access
    • Request access to desired models (Claude, Llama, etc.)
    • Access approval can take 1-2 business days
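
Before a full run, you can confirm that your credentials work with a minimal sketch like the one below (not part of the toolkit; the profile name and region are assumptions, so adjust them to your account):

    import boto3

    # List the foundation models visible in this region; models you have
    # not been granted access to will still appear, but invoking them fails.
    session = boto3.Session(profile_name="default", region_name="us-east-1")
    bedrock = session.client("bedrock")
    for summary in bedrock.list_foundation_models()["modelSummaries"]:
        print(summary["modelId"])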

OpenAI Setup

  1. Get an API key: Create a key in the OpenAI dashboard (https://platform.openai.com/api-keys).

  2. Create credentials file: Create .openai-api.key in the project directory:

    your_openai_api_key_here
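
A quick way to verify the key file (a sketch, not part of the toolkit) is to read it back and list a few models your key can access:

    from pathlib import Path
    from openai import OpenAI

    # Read the key from the file created above and list a few model IDs
    # as a connectivity check.
    client = OpenAI(api_key=Path(".openai-api.key").read_text().strip())
    print([m.id for m in client.models.list()][:5])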
    

Google AI (Gemini) Setup

  1. Get an API key: Create a key in Google AI Studio (https://aistudio.google.com/).

  2. Create credentials file: Create .gemini-api.key in the project directory:

    your_gemini_api_key_here
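
As with OpenAI, you can smoke-test the key file with a short sketch (again, not part of the toolkit):

    from pathlib import Path
    import google.generativeai as genai

    # Configure the SDK with the key file created above and print the
    # models that support text generation.
    genai.configure(api_key=Path(".gemini-api.key").read_text().strip())
    for model in genai.list_models():
        if "generateContent" in model.supported_generation_methods:
            print(model.name)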
    

Local Models (Ollama) Setup

For running models locally:

  1. Install Ollama: Download and install it from https://ollama.com.

  2. Pull desired models:

    ollama pull llama2:7b
    ollama pull deepseek-r1:7b
  3. Create URL file: Create .ollama.url in the project directory:

    http://localhost:11434/api/generate
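
You can confirm the local endpoint responds with a single non-streaming request (a sketch; the model tag must match one you pulled above):

    import requests

    # POST one prompt to the same URL stored in .ollama.url; with
    # "stream": False, Ollama returns the full completion as one JSON body.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama2:7b", "prompt": "Say hello.", "stream": False},
        timeout=120,
    )
    print(resp.json()["response"])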
    

Running the Analysis

Basic Usage

  1. Test your setup:

    python run_llm_prompt.py

    This will generate job descriptions for a single occupation using available models.

  2. View results: Results are saved in the results/ directory, organized by:

    • Model name (e.g., claude-3-5-sonnet/)
    • Prompt type (e.g., prompt1/)
    • Job title (e.g., server.txt)

Advanced Configuration

Modify job lists:

  • Edit input_data/job_triads.csv to change which occupations are tested
  • Each row contains: neutral_title, male_title, female_title, and metadata
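
For illustration, a row following that layout might look like this (the exact header and metadata fields here are assumptions based on the description above, not the file's actual columns):

    neutral_title,male_title,female_title,category
    server,waiter,waitress,food service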

Customize prompts:

  • Edit files in input_data/ to modify the prompts sent to models
  • prompt1.txt: Basic job description generation
  • prompt2.txt: Bias evaluation prompt
  • prompt3.1.txt: BSRI-based evaluation

Select specific models:

  • Edit input_data/llm_config.ini to enable/disable specific models or providers
  • Use disabled_models parameter to exclude certain models
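
As an illustration only, an entry along these lines (the section name and model identifiers are assumptions; check the actual keys in input_data/llm_config.ini):

    [models]
    disabled_models = gpt-3.5-turbo, llama2:7b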

Data Analysis

Generate Summary Reports

python create_data_summary.py

This creates Excel files with:

  • Salary analysis across models and job types
  • Bias scoring and Inter-Rater Reliability metrics
  • BSRI (Bem Sex Role Inventory) analysis
  • Statistical comparisons
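
If you want to post-process the summary yourself, the workbooks load cleanly with pandas (a sketch; sheet names depend on the analyses in your run):

    import glob
    import pandas as pd

    # Load the most recent summary workbook as a dict of DataFrames,
    # one per sheet, and report each sheet's shape.
    latest = sorted(glob.glob("results/data_summary-*.xlsx"))[-1]
    for name, df in pd.read_excel(latest, sheet_name=None).items():
        print(name, df.shape)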

Validate Results

python validate_llm_results.py

This checks for:

  • Missing or incomplete responses
  • Data quality issues
  • Consistency across model outputs

Understanding the Output

File Structure

results/
├── claude-3-5-sonnet/               # Model-specific folders
│   ├── prompt1/                     # Basic job descriptions
│   │   ├── server.txt               # Individual job outputs
│   │   └── waiter.txt
│   └── prompt2/                     # Bias evaluation outputs
├── data_summary-YYYYMMDD.xlsx       # Aggregated analysis
└── irr_combinations-YYYYMMDD.xlsx   # Reliability metrics

Key Metrics

  • Salary Analysis: Compares estimated salaries across gendered variants
  • Bias Scores: Quantifies detected bias on 1-10 scales
  • BSRI Scores: Measures masculine/feminine/neutral trait attribution
  • IRR Metrics: Assesses consistency between different models/raters
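
The BSRI scores follow Bem's approach of averaging trait ratings on masculine and feminine scales; the sketch below shows the basic computation (the trait lists and ratings are illustrative, not the toolkit's actual item set):

    from statistics import mean

    # Hypothetical 1-7 ratings a model might assign to BSRI-style traits.
    masculine = {"assertive": 5, "independent": 6, "forceful": 3}
    feminine = {"warm": 4, "gentle": 5, "compassionate": 6}

    masc_score = mean(masculine.values())
    fem_score = mean(feminine.values())
    # A positive difference leans masculine; a negative one leans feminine.
    print(f"masculine={masc_score:.2f} feminine={fem_score:.2f} "
          f"difference={masc_score - fem_score:+.2f}")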

Troubleshooting

Common Issues

"Model unavailable" errors:

  • Check that you've requested access through the provider (especially AWS Bedrock)
  • Verify your API keys are correctly configured
  • Some models may have regional restrictions

API rate limiting:

  • The toolkit includes automatic retry logic with exponential backoff
  • Consider running smaller batches if you encounter persistent rate limits
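
The retry pattern is roughly the following (a generic sketch, not the toolkit's actual implementation):

    import random
    import time

    def call_with_backoff(fn, max_retries=5, base_delay=1.0):
        """Retry fn() with exponential backoff plus jitter."""
        for attempt in range(max_retries):
            try:
                return fn()
            except Exception:  # in practice, catch the provider's rate-limit error
                if attempt == max_retries - 1:
                    raise
                time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))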

Missing dependencies:

  • Install required packages: pip install boto3 openai google-generativeai requests pandas openpyxl

Credential errors:

  • Ensure credential files are in the project root directory
  • Check that environment variables are set correctly
  • Verify file permissions on credential files

Getting Help

For technical issues:

  1. Check the results/prompt1_failures/ directory for detailed error logs
  2. Review the configuration in input_data/llm_config.ini
  3. Test individual model connections before running full analysis

Research Ethics and Considerations

Responsible Use

  • This toolkit is designed for academic research on AI bias detection and mitigation
  • Results should be interpreted within appropriate statistical and social contexts
  • Consider potential limitations of bias measurement approaches

Data Privacy

  • No personal data is collected or processed
  • All interactions are with AI models using synthetic job descriptions
  • API usage follows standard terms of service for each provider

Reproducibility

  • All prompts and configurations are version-controlled
  • Random seeds and model parameters are documented
  • Results include timestamps and model version information

Contributing

We welcome contributions from researchers and developers:

  1. Fork the repository
  2. Create a feature branch for your changes
  3. Add appropriate documentation
  4. Submit a pull request with a clear description of changes

Areas for Contribution

  • Additional LLM provider integrations
  • New bias measurement approaches
  • Statistical analysis enhancements
  • Documentation improvements
  • Visualization tools

Acknowledgments

This research builds upon established work in:

  • AI bias detection methodologies
  • Gender stereotype measurement in psychology
  • Computational social science approaches
  • Open science and reproducible research practices