House of 200,000 Synthetic Personae

A toolkit for working with synthetic personas, built on the PERSONA HUB dataset. It provides tools for persona synthesis, clustering, and message testing across different business roles.

Features

Persona Synthesis (main.py): Generate specialized content from personas, including:
- Math problems tailored to persona backgrounds
- Instructions based on persona expertise
- NPC characters for gaming contexts
- Knowledge articles from persona perspectives

Persona Clustering (persona_clusters.py):
- Automated clustering of personas by business role
- Support for IT Admin, Executive, Facilities, Retail, and Multifamily roles
- Feature extraction based on role-specific keywords
- Demographic and expertise-based grouping

Message Testing (message_tester.py):
- Test marketing messages against specific persona clusters
- Get detailed resonance scores and feedback
- Analyze message effectiveness across different roles
- Generate improvement suggestions based on persona responses
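
To make the keyword-based clustering idea above concrete, here is a minimal sketch of how personas might be assigned to roles by counting role-specific keywords. The keyword lists and the function names (`keyword_features`, `assign_role`, `cluster_personas`) are assumptions made for illustration; the actual logic in persona_clusters.py may differ.

```python
from collections import defaultdict

# Hypothetical keyword lists for illustration only;
# the real lists in persona_clusters.py may differ.
ROLE_KEYWORDS = {
    "IT_ADMIN": ["network", "sysadmin", "it administrator", "server"],
    "EXECUTIVE": ["ceo", "executive", "vice president", "director"],
    "FACILITIES": ["facilities", "building maintenance", "hvac"],
    "RETAIL": ["retail", "store manager", "merchandis"],
    "MULTIFAMILY": ["multifamily", "property manager", "apartment"],
}


def keyword_features(persona):
    """Count keyword occurrences per role in a persona description."""
    text = persona.lower()
    return {
        role: sum(text.count(kw) for kw in keywords)
        for role, keywords in ROLE_KEYWORDS.items()
    }


def assign_role(persona):
    """Return the role with the most keyword hits, or None if nothing matches."""
    features = keyword_features(persona)
    role, hits = max(features.items(), key=lambda kv: kv[1])
    return role if hits > 0 else None


def cluster_personas(personas):
    """Group persona descriptions by assigned role, skipping unmatched ones."""
    clusters = defaultdict(list)
    for persona in personas:
        role = assign_role(persona)
        if role is not None:
            clusters[role].append(persona)
    return dict(clusters)
```

Substring counting is deliberately crude: it is easy to inspect, but it can misfire on short keywords, so a real implementation would likely need tokenization or a trained classifier.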

Getting Started

Install dependencies:

pip install -r requirements.txt

Set up your OpenAI API key in a .env file:

OPENAI_API_KEY=your_key_here
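
Before running the scripts, it can help to confirm that the key is actually visible. The following is a small standard-library sketch that checks the environment and then a .env file; `load_env_file` and `api_key_configured` are hypothetical helpers, not part of this project, and the project's own scripts may load the file differently.

```python
import os


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file into a dict (empty if missing)."""
    values = {}
    if not os.path.exists(path):
        return values
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            # Skip blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def api_key_configured(path=".env"):
    """True if OPENAI_API_KEY is set to something other than the placeholder."""
    key = os.environ.get("OPENAI_API_KEY") or load_env_file(path).get(
        "OPENAI_API_KEY", ""
    )
    return bool(key) and key != "your_key_here"
```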

Run persona synthesis:

python main.py

Test messages against personas:

python message_tester.py --messages_file messages.json --output_file results.json --role it_admin

Usage Examples

Persona Synthesis

python main.py --template math --sample_size 10
python main.py --template npc --sample_size 10
python main.py --template knowledge --sample_size 10

Message Testing

# Test IT-focused messages
python message_tester.py --messages_file it_messages.json --output_file it_results.json --role it_admin

# Test executive-focused messages
python message_tester.py --messages_file exec_messages.json --output_file exec_results.json --role executive

Persona Clustering

# Cluster personas and analyze demographics
python persona_clusters.py

Project Structure

main.py - Main script for persona synthesis
message_tester.py - Tool for testing messages against persona clusters
persona_clusters.py - Persona clustering and analysis
prompt_templates.py - Templates for different synthesis tasks
requirements.txt - Project dependencies

Output Formats

Message Testing Results

{
    "message": "Your message here",
    "average_score": 7.5,
    "detailed_responses": [
        {
            "resonance_score": 8,
            "technical_accuracy": "...",
            "operational_impact": "...",
            "security_considerations": "...",
            "implementation_concerns": [...]
        }
    ],
    "key_themes": {...}
}
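
As a rough illustration of how a result in the shape above could be post-processed, the sketch below recomputes a mean resonance score from `detailed_responses` and ranks several results. It assumes that `average_score` is the plain mean of the `resonance_score` values, which the source does not confirm; `mean_resonance` and `rank_messages` are hypothetical helper names.

```python
from statistics import mean


def mean_resonance(result):
    """Mean of resonance_score over detailed_responses, or None if there are none."""
    scores = [
        response["resonance_score"]
        for response in result.get("detailed_responses", [])
        if "resonance_score" in response
    ]
    return mean(scores) if scores else None


def rank_messages(results):
    """Sort result objects by average_score, highest first."""
    return sorted(results, key=lambda r: r.get("average_score", 0), reverse=True)


# Example with made-up values in the documented shape.
example = {
    "message": "Your message here",
    "average_score": 7.5,
    "detailed_responses": [{"resonance_score": 8}, {"resonance_score": 7}],
}
print(mean_resonance(example))
```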

Persona Clusters

{
    "IT_ADMIN": [...],
    "EXECUTIVE": [...],
    "FACILITIES": [...],
    "RETAIL": [...],
    "MULTIFAMILY": [...]
}
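
A quick way to inspect a cluster object of this shape is to count members per role. This is a minimal sketch; `cluster_summary` is a hypothetical helper, and it assumes each role maps to a list of personas as shown above.

```python
def cluster_summary(clusters):
    """Return the member count and share of the total for each role."""
    total = sum(len(members) for members in clusters.values())
    return {
        role: {
            "count": len(members),
            "share": len(members) / total if total else 0.0,
        }
        for role, members in clusters.items()
    }


# Example with placeholder persona strings.
summary = cluster_summary({"IT_ADMIN": ["a", "b", "c"], "EXECUTIVE": ["d"]})
```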

Disclaimer

This toolkit facilitates synthetic data creation at scale to simulate diverse inputs from a wide variety of personas. While powerful, it comes with important ethical considerations. Please review the full disclaimer and ethical considerations from the original PERSONA HUB project.

Credits

This project is built on the PERSONA HUB dataset and toolkit, a collection of 200,000 synthetic personas created by Tencent AI Lab. We acknowledge and thank the creators of PERSONA HUB for providing this resource.
