Commit 3ede96e

3.0.0a public release
1 parent 56305c6 commit 3ede96e

52 files changed, 4502 insertions(+), 1824 deletions(-)

.env.example

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+# LinkedIn credentials for scraping
+# Copy this file to .env and fill in your credentials
+
+# Use either LINKEDIN_EMAIL or LINKEDIN_USERNAME (both work)
+LINKEDIN_EMAIL=your.email@example.com
+# LINKEDIN_USERNAME=your.email@example.com
+
+LINKEDIN_PASSWORD=your_password_here
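
The comments in `.env.example` say that either `LINKEDIN_EMAIL` or `LINKEDIN_USERNAME` may hold the login. A minimal sketch of how a script might read such a file and resolve the login is shown here. The helper names `parse_env_text` and `resolve_credentials` are hypothetical and are not taken from the package; the precedence of `LINKEDIN_EMAIL` over `LINKEDIN_USERNAME` is also an assumption.

```python
from typing import Dict, Mapping, Tuple


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def resolve_credentials(env: Mapping[str, str]) -> Tuple[str, str]:
    """Return (login, password). Hypothetical helper, not the package's API."""
    # Assumption: LINKEDIN_EMAIL wins when both variables are set.
    login = env.get("LINKEDIN_EMAIL") or env.get("LINKEDIN_USERNAME")
    password = env.get("LINKEDIN_PASSWORD")
    if not login or not password:
        raise ValueError(
            "Set LINKEDIN_EMAIL or LINKEDIN_USERNAME, plus LINKEDIN_PASSWORD"
        )
    return login, password
```

In practice the mapping passed to `resolve_credentials` could be `os.environ` after the `.env` values have been loaded into it.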

.gitignore

Lines changed: 30 additions & 0 deletions
@@ -13,3 +13,33 @@ scrape.py
 creds.json
 venv
 *.zip
+.env
+
+# Test outputs
+*.log
+test_*.db
+test_linkedin.db
+test_summary.json
+results_*.json
+person_*.json
+*_improved.json
+
+# Debug scripts (keep debug_connection_selectors.py)
+debug_*.py
+!debug_connection_selectors.py
+
+# Backup files
+*_old.py
+*.backup
+
+# Session files (sensitive cookies)
+linkedin_session.json
+
+# Build artifacts
+MANIFEST
+MANIFEST.ini
+.pytest_cache/
+
+# Basic package build artifacts
+build-basic/
+dist-basic/
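
The `debug_*.py` / `!debug_connection_selectors.py` pair in `.gitignore` relies on Git's rule that the last matching pattern decides, with a leading `!` re-including a file. The sketch below illustrates that rule only. It is a simplification: it matches bare file names with `fnmatchcase` and does not model directory patterns, slash anchoring, or the rest of Git's matching. The function name `is_ignored` is hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable

# The two patterns from the "Debug scripts" group in .gitignore.
DEBUG_PATTERNS = ["debug_*.py", "!debug_connection_selectors.py"]


def is_ignored(filename: str, patterns: Iterable[str]) -> bool:
    """Apply patterns in order; the last pattern that matches decides."""
    ignored = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        glob = pattern[1:] if negated else pattern
        if fnmatchcase(filename, glob):
            # A plain pattern ignores the file; a '!' pattern re-includes it.
            ignored = not negated
    return ignored
```

Under these assumptions, `is_ignored("debug_login.py", DEBUG_PATTERNS)` is true while `is_ignored("debug_connection_selectors.py", DEBUG_PATTERNS)` is false, which is why the order of the two lines matters.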

CONTRIBUTING.md

Lines changed: 206 additions & 0 deletions
@@ -0,0 +1,206 @@
+# Contributing to LinkedIn Scraper
+
+Thank you for your interest in contributing to LinkedIn Scraper! This document provides guidelines for contributing to the project.
+
+## Getting Started
+
+### Prerequisites
+
+- Python 3.8 or higher
+- pip package manager
+- Git
+
+### Setting Up Development Environment
+
+1. **Clone the repository**
+```bash
+git clone https://github.com/joeyism/linkedin_scraper.git
+cd linkedin_scraper
+```
+
+2. **Create a virtual environment**
+```bash
+python -m venv venv
+source venv/bin/activate # On Windows: venv\Scripts\activate
+```
+
+3. **Install dependencies**
+```bash
+pip install -r requirements.txt
+pip install -r requirements-dev.txt
+```
+
+4. **Install Playwright browsers**
+```bash
+playwright install chromium
+```
+
+5. **Set up environment variables** (for testing)
+```bash
+cp .env.example .env
+# Edit .env and add your LinkedIn credentials (optional, for integration tests)
+```
+
+## Development Workflow
+
+### Running Tests
+
+```bash
+# Run all tests
+pytest
+
+# Run specific test file
+pytest tests/test_person.py
+
+# Run with verbose output
+pytest -v
+
+# Run with coverage
+pytest --cov=linkedin_scraper
+```
+
+### Code Style
+
+This project follows these guidelines:
+
+- **PEP 8**: Python code style guide
+- **Type hints**: Use type annotations where appropriate
+- **Docstrings**: Document all public functions and classes
+- **Line length**: Maximum 100 characters
+
+Before submitting, ensure your code passes linting:
+
+```bash
+# Format code
+black linkedin_scraper/
+
+# Check for issues
+flake8 linkedin_scraper/
+mypy linkedin_scraper/
+```
+
+### Testing Your Changes
+
+1. Write tests for new functionality
+2. Ensure all existing tests pass
+3. Test manually with the sample scripts in `samples/`
+4. Verify documentation is updated
+
+## Making Changes
+
+### Branching Strategy
+
+- `main` - Stable release branch
+- `feature/your-feature` - New features
+- `fix/your-bugfix` - Bug fixes
+- `docs/your-doc-change` - Documentation updates
+
+### Commit Messages
+
+Write clear, descriptive commit messages:
+
+```
+Add support for scraping job descriptions
+
+- Extract full job description text
+- Parse job requirements section
+- Add tests for job description parsing
+```
+
+Format:
+- First line: Brief summary (50 chars or less)
+- Blank line
+- Detailed explanation (wrap at 72 chars)
+- List specific changes with bullet points
+
+### Pull Request Process
+
+1. **Create a new branch**
+```bash
+git checkout -b feature/your-feature-name
+```
+
+2. **Make your changes**
+- Write clean, well-documented code
+- Add tests for new functionality
+- Update documentation as needed
+
+3. **Commit your changes**
+```bash
+git add .
+git commit -m "Your descriptive commit message"
+```
+
+4. **Push to your fork**
+```bash
+git push origin feature/your-feature-name
+```
+
+5. **Create a Pull Request**
+- Go to the GitHub repository
+- Click "New Pull Request"
+- Select your branch
+- Fill out the PR template
+- Wait for review
+
+### Pull Request Guidelines
+
+- **Description**: Clearly describe what changes you made and why
+- **Tests**: Include tests that cover your changes
+- **Documentation**: Update README.md or other docs if needed
+- **Small PRs**: Keep changes focused and manageable
+- **Responsive**: Be ready to address feedback
+
+## What to Contribute
+
+### Good First Issues
+
+Look for issues labeled `good first issue` for beginner-friendly tasks:
+
+- Documentation improvements
+- Bug fixes
+- Additional test coverage
+- Code refactoring
+
+### Feature Requests
+
+Before implementing major features:
+
+1. Check existing issues to avoid duplication
+2. Open an issue to discuss the feature
+3. Wait for maintainer approval
+4. Implement the feature once approved
+
+### Bug Reports
+
+When reporting bugs, include:
+
+- Python version
+- Operating system
+- Steps to reproduce
+- Expected vs actual behavior
+- Error messages/stack traces
+- Sample code if possible
+
+## Code Review Process
+
+1. A maintainer will review your PR
+2. They may request changes
+3. Make requested changes and push updates
+4. Once approved, a maintainer will merge your PR
+
+## Questions?
+
+If you have questions about contributing:
+
+- Open an issue on GitHub
+- Check existing issues and discussions
+- Review the README.md for usage examples
+
+## License
+
+By contributing, you agree that your contributions will be licensed under the Apache License 2.0.
+
+---
+
+Thank you for contributing to LinkedIn Scraper! Your efforts help make this project better for everyone.
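
The commit message format given in CONTRIBUTING.md (a summary of 50 characters or less, a blank line, then a body wrapped at 72 characters) could be checked mechanically. The following is a minimal sketch of such a check, not something this commit adds; the function name `check_commit_message` and its parameters are hypothetical.

```python
from typing import List

# The example message quoted in CONTRIBUTING.md.
EXAMPLE_MESSAGE = """Add support for scraping job descriptions

- Extract full job description text
- Parse job requirements section
- Add tests for job description parsing
"""


def check_commit_message(
    message: str, summary_max: int = 50, body_max: int = 72
) -> List[str]:
    """Return a list of format problems; an empty list means none were found."""
    lines = message.splitlines()
    if not lines or not lines[0].strip():
        return ["missing summary line"]
    problems: List[str] = []
    if len(lines[0]) > summary_max:
        problems.append(f"summary is {len(lines[0])} chars (max {summary_max})")
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank")
    # Body starts on the third line of the message.
    for number, line in enumerate(lines[2:], start=3):
        if len(line) > body_max:
            problems.append(f"line {number} is {len(line)} chars (max {body_max})")
    return problems
```

A function like this could be called from a `commit-msg` Git hook, though wiring it up that way is left as an assumption here.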

MANIFEST

Lines changed: 0 additions & 6 deletions
This file was deleted.

MANIFEST.in

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
+# Include documentation
+include README.md
+include LICENSE
+include AGENTS.md
+include ARCHITECTURE.md
+include USAGE.md
+include TESTING.md
+
+# Include configuration files
+include config.yaml
+include test_config.yaml
+include docker-compose.yml
+include requirements.txt
+include requirements-dev.txt
+
+# Include setup files
+include setup.py
+include setup.cfg
+include pytest.ini
+
+# Include all Python files in the package
+recursive-include linkedin_scraper *.py
+
+# Exclude build, test, and cache files
+global-exclude __pycache__
+global-exclude *.py[cod]
+global-exclude *$py.class
+global-exclude *.so
+global-exclude .DS_Store
+
+# Exclude test files and directories
+prune tests
+prune samples
+prune build
+prune dist
+prune *.egg-info
+prune venv
+prune .pytest_cache
+prune .git

MANIFEST.ini

Lines changed: 0 additions & 5 deletions
This file was deleted.
