GhostWire CTI Platform v7
[ 10-Engine · Multi-Vector · MITRE ATT&CK Mapped · Parallel ]
GhostWire CTI is an open-source, locally-run Cyber Threat Intelligence platform built for analysts who don't want their investigation data leaking to a SaaS vendor.
Submit a suspicious URL, file hash, email, IP, or raw file, and ten analysis engines fire in parallel. Results land in seconds: a risk score, a verdict, a STIX 2.1 bundle ready for your TAXII server, and a PDF forensic report.
The AI runs locally via Ollama. External APIs are query-only: you send the IOC, they return data. Nothing else leaves your machine.
TARGET ──► [URL / IP / HASH / EMAIL / FILE]
                │
      ┌─────────▼─────────┐
      │ 10 ENGINE POOL    │ ← concurrent.futures (parallel)
      │                   │
      │ 1. Heuristics     │ URL structural analysis
      │ 2. WHOIS          │ Domain age, registrar
      │ 3. AI NLP         │ Ollama LLM (local, offline)
      │ 4. VirusTotal     │ 70+ AV engines
      │ 5. Deception      │ Typosquatting, homoglyph
      │ 6. Sandbox        │ Behavioral simulation
      │ 7. SSL/TLS        │ Certificate anomaly
      │ 8. Passive DNS    │ ASN, geo, VPN detection
      │ 9. Shodan         │ Ports, CVEs, banners
      │ 10. GreyNoise     │ Internet scanner classification
      │                   │
      │ + URLhaus         │ abuse.ch malware URL database
      │ + OTX             │ AlienVault threat intel
      │ + Hybrid Anal.    │ Cloud sandbox detonation
      │ + Forensic Eng.   │ Deep file analysis (PE/PDF/Office)
      └─────────┬─────────┘
                │
      ┌─────────▼─────────┐
      │ SCORING ENGINE    │ 0-100 risk score
      │ VERDICT ENGINE    │ SAFE/LOW/MEDIUM/HIGH/CRITICAL
      │ STIX 2.1 EXPORT   │ TAXII-compatible IOC bundle
      │ PDF REPORT        │ Full forensic report (ReportLab)
      │ AUDIT LOG         │ JSONL audit trail, defanged URLs
      └───────────────────┘
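The fan-out in the diagram above can be sketched with `concurrent.futures`. This is a minimal illustration, not the contents of `backend/async_runner.py`: the three engine functions, the `run_engines` helper, and the result shape are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


# Hypothetical stand-ins for real engines: each takes a target, returns a dict.
def heuristics_engine(target: str) -> dict:
    return {"score": 30 if "@" in target else 0}


def whois_engine(target: str) -> dict:
    return {"score": 10}


def failing_engine(target: str) -> dict:
    raise TimeoutError("upstream API timed out")


def run_engines(target: str, engines: dict, timeout: float = 8.0) -> dict:
    """Run every engine concurrently and collect results by engine name.

    A failing engine produces an error entry instead of aborting the run.
    as_completed() itself raises TimeoutError if the whole pool overruns.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = {pool.submit(fn, target): name for name, fn in engines.items()}
        for future in as_completed(futures, timeout=timeout):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:  # degrade gracefully per engine
                results[name] = {"score": 0, "error": str(exc)}
    return results


ENGINES = {
    "heuristics": heuristics_engine,
    "whois": whois_engine,
    "broken": failing_engine,
}
results = run_engines("http://user@198.51.100.7/login", ENGINES)
```

Threads suit this workload because most engines spend their time waiting on network I/O rather than on the CPU.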
| Tab | Input | Parallel Engines |
|---|---|---|
| URL / Domain | Any URL, domain, or IP address | 10+ |
| File / Hash | SHA-256 / SHA-1 / MD5 lookup or file upload | 4 + Forensic |
| Email / SMS | Raw headers + body, or OCR screenshot | 7 |
| IP Intelligence | Standalone deep IP analysis | 6 |
| Sandbox | Hybrid Analysis cloud detonation | HA API |
Engine 1: URL Heuristics
Structural analysis before any network call. Catches raw IPs in URLs, @ credential tricks, encoded path components, suspicious TLDs (.tk .ml .cf .ga), excessive subdomains, and keyword patterns like login, secure, verify, update.
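A few of these structural checks can be sketched with the standard library alone. The function name, flag names, and the subdomain threshold below are assumptions for illustration; `backend/heuristics.py` may implement them differently.

```python
import ipaddress
from urllib.parse import urlsplit

SUSPICIOUS_TLDS = {"tk", "ml", "cf", "ga"}
KEYWORDS = ("login", "secure", "verify", "update")


def heuristic_flags(url: str) -> list[str]:
    """Return structural red flags for a URL without making any network call."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    flags = []

    # Raw IP address used in place of a domain name.
    try:
        ipaddress.ip_address(host)
        flags.append("raw_ip")
    except ValueError:
        pass

    # "user@host" trick: the real destination is whatever follows the "@".
    if "@" in parts.netloc:
        flags.append("at_credentials")

    # Percent-encoded characters in the path.
    if "%" in parts.path:
        flags.append("encoded_path")

    if host.rsplit(".", 1)[-1] in SUSPICIOUS_TLDS:
        flags.append("suspicious_tld")

    # Hypothetical threshold: four or more dots counts as excessive nesting.
    if host.count(".") >= 4:
        flags.append("excessive_subdomains")

    if any(word in url.lower() for word in KEYWORDS):
        flags.append("keyword")

    return flags
```

For example, `heuristic_flags("http://paypal.com@192.0.2.10/%6Cogin")` reports the raw IP, the `@` trick, and the encoded path, while a plain `https://example.com/` returns an empty list.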
Engine 2: WHOIS / Domain Age
Domains under 30 days old score high. Checks registrar reputation, privacy-protected WHOIS in suspicious context, and creation/expiry anomalies.
Engine 3: AI NLP (Ollama, local)
Runs phi3:mini, llama3, llama3.2, or mistral locally. Detects urgency language, financial threats, authority impersonation, and provides a domain legitimacy assessment. Structured JSON output: no hallucinated free-text verdicts.
Engine 4: Reputation (VirusTotal + AbuseIPDB)
70+ AV engine scan via VT. Community votes, comments NLP, malicious file relations, domain popularity rank. AbuseIPDB confidence score, report count, Tor exit node detection, DNS record analysis.
Engine 5: Technical Deception
Levenshtein distance check against 50+ global brands. Unicode homoglyph detection, Punycode/IDN domain flagging, redirect chain depth, link shortener identification.
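The Levenshtein check can be illustrated as follows. The brand list is a three-entry hypothetical subset and the distance cutoff of 2 is an assumption; the real list and threshold in `backend/deception.py` are not documented here.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution (free if chars match)
            ))
        prev = curr
    return prev[-1]


BRANDS = ("paypal", "google", "microsoft")  # hypothetical subset


def closest_brand(label: str, max_distance: int = 2):
    """Return (brand, distance) for a near-miss; None for exact or distant labels."""
    label = label.lower()
    brand, dist = min(((b, levenshtein(label, b)) for b in BRANDS), key=lambda t: t[1])
    return (brand, dist) if 0 < dist <= max_distance else None
```

An exact match returns `None` on purpose: the legitimate brand domain is not a typosquat, only labels that are close but not identical are.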
Engine 6: Sandbox Simulation
Local behavioral analysis: HTTP response headers, server fingerprinting, redirect chain tracking, parked domain content patterns, VPN/proxy ASN identification.
Engine 7: SSL/TLS Certificate
Certificate validity, free CA detection (Let's Encrypt, ZeroSSL), SAN mismatch, self-signed flag, wildcard abuse detection. Certs under 7 days old raise a fresh phishing infrastructure flag.
Engine 8: Passive DNS + IP Intel
Geolocation, ASN-based risk scoring, shared hosting density, VPN/proxy/Tor ASN recognition, PTR reverse DNS analysis.
Engine 9: Shodan
Full open port inventory, service banners (HTTP/SSH/FTP/SMB), CVE list, Shodan tags. Falls back to InternetDB, which works without an API key.
Engine 10: GreyNoise
Mass internet scanner classification. RIOT benign service detection (Google, Cloudflare). Actor attribution, scan intent analysis. Community API works without a key.
URLhaus (abuse.ch): Active malware URL database. Tags: phishing/malware/botnet/c2. Shows URL status (online/offline), associated malware family, and SHA256 of dropped files.
OTX (AlienVault): Pulse count, MITRE ATT&CK technique IDs, threat actor attribution (Lazarus, APT28...), campaign names. Infrastructure-signal pulses (Tor exit nodes, HoneyNet feeds, C2 feeds, scanner feeds) contribute to the score independently of VirusTotal, so network-classified threats are not missed when VT has no detections.
Hybrid Analysis: Cloud sandbox detonation for URLs, files, hashes, domains, and IPs. Environments: Windows 10 64-bit, Windows 7 32-bit, Android. Supports up to 10 API keys in a rotation pool and auto-switches on 429.
When a file is uploaded, backend/forensic_engine.py runs in parallel with the VT/URLhaus/OTX lookups:
FILE BYTES
│
├─ Hash computation        MD5 + SHA1 + SHA256
├─ Magic byte detection    Never trusts the extension
├─ MIME mismatch check     Extension ≠ actual type → masquerading flag
│
├─ PE (EXE / DLL)
│  ├─ Section entropy      >7.2 = packed/encrypted/crypter
│  ├─ Import analysis      VirtualAlloc, CreateRemoteThread, WinExec...
│  ├─ Compile timestamp    Zeroed = timestamp stomping
│  └─ Overlay data         Bytes after PE = appended payload
│
├─ PDF
│  ├─ /JS /JavaScript      Embedded script execution
│  ├─ /OpenAction /AA      Auto-exec on document open
│  ├─ /Launch              External process execution
│  ├─ /EmbeddedFile        Hidden file inside PDF
│  ├─ Polyglot detection   PDF/ZIP dual-format (GootLoader technique)
│  └─ Creator tool check   msfvenom / Cobalt Strike signatures
│
├─ Office (DOCX / XLSX / XLSM)
│  ├─ VBA macro presence   vbaProject.bin detection
│  ├─ Auto-exec triggers   AutoOpen, Document_Open, Workbook_Open
│  ├─ Chr() obfuscation    >20 Chr() calls = string hiding
│  ├─ PowerShell in VBA    Inline PS execution chains
│  └─ OLE embedded objs    External template / OLE injection
│
├─ ZIP bomb detection
│  ├─ Ratio check          >100:1 compression ratio
│  ├─ Absolute size cap    >100MB uncompressed → blocked
│  ├─ Metadata spoof       file_size=0 with real compressed data
│  └─ Nested archives      Matryoshka / 42.zip style
│
├─ Sandbox evasion patterns
│  ├─ Long sleep()         Timeout evasion
│  ├─ IsDebuggerPresent    Anti-analysis
│  ├─ VM string checks     vmware / virtualbox / qemu / sandbox
│  └─ Mouse / window       User presence detection
│
└─ Network indicator extraction
   ├─ URL / domain / IP    Strings embedded in binary
   ├─ DGA pattern          Random-looking domains + abuse TLDs
   └─ C2Indicator objs     Confidence-scored with context
Result: ForensicReport with risk score, threat level, final verdict, MIME mismatch flag, extracted IOCs, AI NLP label, engine divergence explanation, and a plain-English answer to "why does this hash flag in VT but look clean in a container?"
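As one concrete piece of the tree above, the section-entropy rule can be sketched with Shannon entropy over the raw bytes. The helper names are hypothetical; only the 7.2 threshold comes from this document, and extracting the section bytes from a PE file is assumed to happen elsewhere.

```python
import math
from collections import Counter

PACKED_THRESHOLD = 7.2  # bits per byte, as listed in the forensic tree


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 (constant) up to 8.0 (uniform)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def looks_packed(section: bytes) -> bool:
    """Flag a section whose entropy suggests packing or encryption."""
    return shannon_entropy(section) > PACKED_THRESHOLD
```

Plain text and ordinary machine code usually sit well below the threshold, while compressed or encrypted data approaches 8 bits per byte, which is why a high value is treated as a packer or crypter signal.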
Score components (0-100 final):
heuristics_score URL structure
whois_score Domain age
ai_score Ollama NLP
reputation_score VT + AbuseIPDB
deception_score Typosquat + homoglyph
sandbox_score Behavioral simulation
ssl_score Certificate anomaly
pdns_score Passive DNS
shodan_score Port / CVE intel
greynoise_score Scanner classification
urlhaus_score Malware URL database
otx_score Threat intel (infra-signal aware)
forensic_score File deep analysis
FINAL = min(Σ(engine × weight) / 2.0, 100)
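The formula can be written out directly. The weights in the usage note are made up for illustration, since the per-engine weights in `backend/scoring.py` are not listed in this README; the function name and the default weight of 1.0 are also assumptions.

```python
def final_score(engine_scores: dict[str, float], weights: dict[str, float]) -> float:
    """FINAL = min(sum(engine * weight) / 2.0, 100).

    Engines missing from `weights` default to 1.0 (an assumption of this sketch).
    """
    weighted = sum(score * weights.get(name, 1.0) for name, score in engine_scores.items())
    return min(weighted / 2.0, 100.0)
```

With hypothetical weights of 1.0 for `heuristics_score` and 1.5 for `reputation_score`, scores of 40 and 60 give (40 × 1.0 + 60 × 1.5) / 2.0 = 65.0. Any weighted sum of 200 or more is capped at 100.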
Score Floor Rules:
TOR_EXIT_NODE Confirmed Tor exit node → minimum MEDIUM (40)
VT_DETECTION VT ≥3 malicious engines → minimum 35
INFRA_OVERRIDE VT relations ≥8 malicious files → locked ≥ 85
BRAND_SQUATTING 2+ squatting signals + new domain → +35 pts
AZ_WHITELIST .gov.az / .edu.az / .mil.az → fully protected
OTX Infrastructure Signals (VT-independent scoring):
Tor Exit Node +20 pts (anonymisation infrastructure)
HoneyNet Feed +15 pts (active attacker/scanner feed)
C2/Botnet Feed +18 pts (malware infrastructure)
Brute Force Feed +12 pts (active attack feed)
Scanner Feed +10 pts (mass reconnaissance feed)
Malware Feed +15 pts (malware distribution feed)
Verdict thresholds:
SAFE 0-19 #00ffb4
LOW 20-39 #78d97a
MEDIUM 40-64 #ffd060
HIGH 65-84 #ff6b35
CRITICAL 85-100 #ff2d55
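The verdict bands, together with three of the score floor rules listed earlier, can be sketched as two small functions. The function and parameter names are hypothetical, and the order in which floors are applied is an assumption; only the numeric thresholds come from this document.

```python
# (lower bound, label), checked from the highest band down.
VERDICT_BANDS = [(85, "CRITICAL"), (65, "HIGH"), (40, "MEDIUM"), (20, "LOW"), (0, "SAFE")]


def apply_floors(score: float, *, tor_exit: bool = False,
                 vt_malicious: int = 0, vt_related_malicious_files: int = 0) -> float:
    """Raise the score to the documented minimums; never lowers it."""
    if tor_exit:                          # TOR_EXIT_NODE: minimum MEDIUM (40)
        score = max(score, 40)
    if vt_malicious >= 3:                 # VT_DETECTION: minimum 35
        score = max(score, 35)
    if vt_related_malicious_files >= 8:   # INFRA_OVERRIDE: locked at 85 or above
        score = max(score, 85)
    return min(score, 100)


def verdict(score: float) -> str:
    """Map a 0-100 score onto the verdict bands."""
    for lower, label in VERDICT_BANDS:
        if score >= lower:
            return label
    return "SAFE"
```

For example, a raw score of 10 on a confirmed Tor exit node becomes 40 and reads as MEDIUM, and the same raw score with eight malicious related files becomes 85 and reads as CRITICAL.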
URL Defanging: Every URL written to the audit log is defanged (https://evil.com → hxxps://evil[.]com). This prevents hot-links in log viewers, email clients, and SIEMs.
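A minimal defanging sketch that reproduces the example above. The project lists a `defang_url` helper in `config.py`; the body below is an illustration and may not match the real implementation.

```python
def defang_url(url: str) -> str:
    """Rewrite http(s) as hxxp(s) and every '.' as '[.]' so the URL is not clickable."""
    scheme, sep, rest = url.partition("://")
    if not sep:
        # No scheme present: treat the whole string as host/path.
        scheme, rest = "", url
    else:
        scheme = scheme.lower().replace("http", "hxxp", 1)
    return f"{scheme}{sep}{rest.replace('.', '[.]')}"
```

`defang_url("https://evil.com")` returns `hxxps://evil[.]com`, and a bare `evil.com` becomes `evil[.]com`.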
SSRF Protection: Screenshot engine runs Playwright headless with JavaScript disabled. run_screenshot defaults to off. Tested in tests/test_ssrf.py.
WHOIS Injection Prevention: Domain string sanitized with strict regex before shelling out to python-whois.
Supply Chain Hardening: All packages in requirements.txt pinned with ==. No floating versions.
Non-root Docker: ghostwire user UID/GID 1000. Container never runs as root.
Rate Limiting: Two layers: a 5-second minimum gap between requests (per session) plus 10 requests per 60-second window. A process-level counter prevents multi-tab bypass.
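The two layers can be combined in one small class. This is a sketch under assumptions: the class name, the injectable `clock` parameter (added here so the behaviour can be tested without sleeping), and the single shared instance are not taken from `app.py`.

```python
import threading
import time
from collections import deque


class RateLimiter:
    """Minimum gap between requests plus a sliding-window request cap."""

    def __init__(self, min_gap: float = 5.0, max_requests: int = 10,
                 window: float = 60.0, clock=time.monotonic):
        self.min_gap = min_gap
        self.max_requests = max_requests
        self.window = window
        self._clock = clock
        self._stamps = deque()          # timestamps of accepted requests
        self._lock = threading.Lock()   # one counter shared across sessions

    def allow(self) -> bool:
        with self._lock:
            now = self._clock()
            # Drop timestamps that have aged out of the window.
            while self._stamps and now - self._stamps[0] >= self.window:
                self._stamps.popleft()
            # Layer 1: enforce the minimum gap since the last accepted request.
            if self._stamps and now - self._stamps[-1] < self.min_gap:
                return False
            # Layer 2: enforce the cap within the sliding window.
            if len(self._stamps) >= self.max_requests:
                return False
            self._stamps.append(now)
            return True
```

Requests spaced exactly 5 seconds apart pass the gap check, but the eleventh one inside a 60-second window is still refused by the second layer.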
HA Key Rotation: Thread-safe round-robin pool, up to 10 Hybrid Analysis API keys. On 429, the offending key cools for 65 seconds; the pool switches automatically.
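A round-robin pool with per-key cooldown might look like the sketch below. The class and method names are hypothetical, and the `clock` parameter is added only to make the behaviour testable; the real rotator lives in `config.py` and may differ.

```python
import threading
import time


class KeyPool:
    """Thread-safe round-robin API key pool with a cooldown after HTTP 429."""

    def __init__(self, keys, cooldown: float = 65.0, max_keys: int = 10,
                 clock=time.monotonic):
        if not keys:
            raise ValueError("at least one API key is required")
        self._keys = list(keys)[:max_keys]
        self._cooldown = cooldown
        self._clock = clock
        self._cool_until = {key: 0.0 for key in self._keys}
        self._next = 0
        self._lock = threading.Lock()

    def acquire(self):
        """Return the next key that is not cooling down, or None if all are."""
        with self._lock:
            now = self._clock()
            for _ in range(len(self._keys)):
                key = self._keys[self._next]
                self._next = (self._next + 1) % len(self._keys)
                if self._cool_until[key] <= now:
                    return key
            return None

    def report_429(self, key: str) -> None:
        """Mark a key as rate-limited so it is skipped until the cooldown ends."""
        with self._lock:
            self._cool_until[key] = self._clock() + self._cooldown
```

A caller would `acquire()` a key before each request and call `report_429(key)` when the API answers with status 429; a `None` result means every key is cooling and the request should be deferred.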
OTX Infrastructure Scoring: OTX pulse count alone never increases the score for generic pulses. Infrastructure-signal pulses (Tor exit, HoneyNet, C2, scanner feeds) are authoritative by definition and score independently, because VT cannot detect network-classified threats. All other pulse types require VirusTotal corroboration.
DoS Protection: PDF/binary scan capped at 5 MB decode limit. ReDoS-safe regex patterns (bounded quantifiers, no unbounded lazy .*?). String extraction capped at 5 MB.
Prompt Injection Prevention: Filenames sanitized before embedding in Ollama prompts. Only alphanumeric, dot, dash, space, and parentheses are allowed.
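The allow-list described above fits in a single regular expression. The function name and the length cap are assumptions of this sketch, and "alphanumeric" is interpreted as ASCII letters and digits.

```python
import re

# Anything that is NOT an ASCII letter, digit, dot, dash, space, or parenthesis.
_DISALLOWED = re.compile(r"[^A-Za-z0-9.\- ()]")


def sanitize_filename(name: str, max_len: int = 128) -> str:
    """Strip disallowed characters before the name is placed in an LLM prompt."""
    return _DISALLOWED.sub("", name)[:max_len]
```

Newlines, quotes, braces, and path separators are all removed, so a filename cannot open a new line or a new quoted section inside the prompt.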
Ollama Model Whitelist: Only explicitly approved model identifiers are passed to the Ollama API. Unknown model names fall back to phi3:mini with a warning.
SHA-256 Validation: Hash strings validated with re.fullmatch(r"[0-9a-fA-F]{64}", ...) before URL interpolation into VT API calls.
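The validation step can be shown end to end. The `vt_file_url` helper is hypothetical; the regular expression is the one quoted above, and the URL follows the public VirusTotal v3 file-report endpoint.

```python
import re

_SHA256_RE = re.compile(r"[0-9a-fA-F]{64}")


def vt_file_url(sha256: str) -> str:
    """Build a VirusTotal file-report URL only from a well-formed SHA-256 digest."""
    # fullmatch() rejects partial matches, path separators, and trailing newlines.
    if not _SHA256_RE.fullmatch(sha256):
        raise ValueError("not a SHA-256 hex digest")
    return f"https://www.virustotal.com/api/v3/files/{sha256.lower()}"
```

Using `fullmatch` rather than `match` matters here: `match` with a trailing `$` would still accept a digest followed by a newline, while `fullmatch` requires the entire string to be exactly 64 hex characters.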
Audit Trail: ~/.ghostwire/audit.jsonl. Fields: session ID, hostname, request sequence, defanged target, verdict, score, duration.
| Technique | ID | Engine |
|---|---|---|
| Phishing | T1566 | Email + AI NLP |
| Spearphishing Link | T1566.002 | URL + Deception |
| Drive-by Compromise | T1189 | Sandbox + SSL |
| Exploit Public-Facing App | T1190 | Shodan CVE |
| Command and Scripting Interpreter | T1059 | Forensic (PS1/VBA/JS) |
| Obfuscated Files or Information | T1027 | Entropy + encoding |
| Remote Template Injection | T1221 | Office forensic |
| Exfiltration over C2 | T1041 | C2 indicator extraction |
| Dynamic Resolution / DGA | T1568 | DGA pattern heuristics |
| Web Service C2 | T1102 | Domain + sandbox |
ATT&CK technique IDs from OTX pulses render as chips in the UI. Exported in STIX 2.1 bundles and PDF reports.
- Python 3.12+
- Ollama (local AI)
- VirusTotal + AbuseIPDB API keys (minimum)
git clone https://github.com/yourusername/GhostWire_CTI.git
cd GhostWire_CTI
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
python -m playwright install chromium
cp env.example .env # fill in your API keys
ollama pull phi3:mini # 2.3 GB, fast
ollama serve # separate terminal
streamlit run app.py
# → http://localhost:8501
cp env.example .env # fill in your API keys
docker compose build
docker compose up -d
docker exec ghostwire-ollama ollama pull phi3:mini
# → http://localhost:8501
Two containers: ghostwire (Streamlit, port 8501) + ghostwire-ollama (Ollama, port 11434).
| Service | Purpose | Free Tier | Link |
|---|---|---|---|
| VirusTotal | 70+ AV engines | 500/day, 4/min | virustotal.com |
| AbuseIPDB | IP abuse confidence | 1,000/day | abuseipdb.com |
| Shodan | Ports, CVEs, banners | InternetDB (no key) | account.shodan.io |
| GreyNoise | Scanner classification | Community API (no key) | greynoise.io |
| URLhaus | Malware URL database | ~10 req/min | auth.abuse.ch |
| Hybrid Analysis | Cloud sandbox | 200 req/min, 5 sub/hr | hybrid-analysis.com |
| AlienVault OTX | MITRE ATT&CK, actors | Unlimited (key required) | otx.alienvault.com |
| Ollama | Local AI NLP | Free, runs offline | ollama.ai |
Minimum required: VirusTotal + AbuseIPDB. All others degrade gracefully.
Hybrid Analysis note: Free accounts start at Restricted auth level (hash lookup only). To enable URL submission and file detonation, visit hybrid-analysis.com → Profile → API Key and request an upgrade to Default level (free, requires account verification).
# Required
VIRUSTOTAL_API_KEY=your_key_here
ABUSEIPDB_API_KEY=your_key_here
# Recommended
HYBRID_ANALYSIS_API_KEY=your_key_here
OTX_API_KEY=your_key_here
URLHAUS_API_KEY=your_key_here
# HA Key Pool (up to 10 keys, auto-rotates on 429)
HYBRID_ANALYSIS_API_KEY_2=second_key
HYBRID_ANALYSIS_API_KEY_3=third_key
# Optional
SHODAN_API_KEY=your_key_here
GREYNOISE_API_KEY=your_key_here
# Local AI
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=phi3:mini
# App
SANDBOX_MAX_BYTES=524288
REQUEST_TIMEOUT=8
DEBUG=false
ghostwire_cti/
│
├── app.py # Streamlit entry point, tab routing, rate limiter
├── config.py # Config singleton, URL defanging, HA key rotator
├── requirements.txt # Pinned dependencies
├── Dockerfile # python:3.12-slim, non-root ghostwire user
├── docker-compose.yml # ghostwire + ollama containers
│
├── backend/
│   ├── forensic_engine.py # Deep file analysis: PE/PDF/Office/ZIP/shellcode
│   ├── hash_engine.py # Hash lookup + file pipeline
│   ├── ai_analyzer.py # Ollama client, JSON prompt engineering
│   ├── async_runner.py # concurrent.futures parallel engine pool
│   ├── audit_log.py # JSONL audit trail, defanging
│   ├── caching.py # CacheManager: VT/WHOIS/SSL/DNS cache layers
│   ├── deception.py # Typosquat, homoglyph, redirect, shortener
│   ├── email_engine.py # Header parser, OCR, brand impersonation
│   ├── external_intel.py # Shodan + GreyNoise clients
│   ├── heuristics.py # URL structural analysis (15 checks)
│   ├── hybrid_analysis.py # HA Cloud Sandbox, key pool rotation
│   ├── ip_intel.py # IP standalone pipeline
│   ├── logging_config.py # Centralised logging configuration
│   ├── otx_engine.py # AlienVault OTX: MITRE, pulses, actor, infra signals
│   ├── passive_dns.py # DNS, geo, ASN, VPN detection
│   ├── pdf_report.py # ReportLab PDF generator
│   ├── reputation.py # VirusTotal + AbuseIPDB clients
│   ├── sandbox.py # Local behavioral sandbox simulation
│   ├── scoring.py # Score aggregator, whitelist, overrides, Tor floor
│   ├── screenshot_engine.py # Playwright headless (JS disabled)
│   ├── ssl_engine.py # TLS certificate analysis
│   ├── stix_export.py # STIX 2.1 bundle + CSV IOC export
│   ├── urlhaus_engine.py # abuse.ch URLhaus client
│   ├── url_utils.py # Shared URL helpers (normalise, extract, detect)
│   ├── verdict.py # Verdict calculator, mitigation library
│   ├── whois_check.py # WHOIS domain age, registrar
│   └── whois_timeline.py # Historical WHOIS timeline
│
├── frontend/
│   ├── components.py # Shared widgets: gauge, banner, IOC chips
│   ├── extra_widgets.py # WHOIS timeline, threat map, Shodan/GN panels
│   ├── ha_renderer.py # Hybrid Analysis results renderer
│   ├── other_renderers.py # Hash, email, IP, forensic renderers
│   ├── otx_panel.py # OTX panel (MITRE chips, actor badges)
│   ├── stix_panel.py # STIX 2.1 export UI
│   ├── styles.py # inject_css(): dark UI, Space Mono
│   ├── url_renderer.py # URL pipeline main renderer
│   └── urlhaus_panel.py # URLhaus panel renderer
│
├── pipelines/
│   ├── pipeline_email.py # Email/SMS orchestration
│   ├── pipeline_hash.py # Hash/File + Forensic Engine orchestration
│   ├── pipeline_ip.py # IP intelligence pipeline
│   ├── pipeline_sandbox.py # Hybrid Analysis sandbox pipeline
│   └── pipeline_url.py # URL/Domain/IP main pipeline
│
└── tests/
    ├── conftest.py # Shared fixtures and mocks
    ├── test_caching.py # CacheManager behaviour
    ├── test_config.py # Config loader, defang_url, HA key rotator
    ├── test_email_engine.py # Urgency patterns, brand detection
    ├── test_hash_validation.py # MD5 / SHA1 / SHA256 format validation
    ├── test_otx_engine.py # OTX mock responses, infra-signal scoring
    ├── test_scoring.py # AZ domain tiers, whitelist, overrides
    ├── test_scoring_normalization.py # Score normalisation edge cases
    ├── test_ssrf.py # SSRF prevention
    ├── test_urlhaus_engine.py # URLhaus mock + graceful degradation
    ├── test_urlhaus_integration.py # URLhaus integration tests
    └── test_v8_fixes.py # Forensic wiring, dead file removal, fixes
GhostWire exports fully compliant STIX 2.1 bundles with deterministic UUIDs: the same IOC always produces the same STIX ID, so repeated TAXII pushes do not create duplicate conflicts.
{
"type": "bundle",
"spec_version": "2.1",
"objects": [
{ "type": "identity" },
{ "type": "indicator" },
{ "type": "malware" },
{ "type": "threat-actor" },
{ "type": "relationship" },
{ "type": "report" }
]
}
Push to TAXII 2.1:
curl -X POST https://taxii.yourorg.com/api/collections/{id}/objects/ \
-H "Content-Type: application/taxii+json;version=2.1" \
-H "Authorization: Bearer $TAXII_TOKEN" \
-d @ghostwire_stix_export.json
CSV IOC export also available for bulk SIEM import.
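The deterministic IDs mentioned at the start of this section can be produced with name-based UUIDs (version 5). This is a sketch under assumptions: the namespace URL, the `indicator_id` helper, and the `type:value` naming scheme are hypothetical and are not taken from `backend/stix_export.py`.

```python
import uuid

# Hypothetical project namespace. Any fixed UUID works, provided it never changes
# and is not the namespace the STIX specification reserves for cyber-observables.
GHOSTWIRE_NS = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.org/ghostwire")


def indicator_id(ioc_type: str, value: str) -> str:
    """Derive a stable STIX indicator ID from the IOC type and value."""
    name = f"{ioc_type}:{value.strip()}"
    return f"indicator--{uuid.uuid5(GHOSTWIRE_NS, name)}"
```

Because UUIDv5 is a hash of the namespace and the name, exporting the same IOC twice yields the same `indicator--…` ID, which lets a TAXII server treat the second push as the same object rather than a new one.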
pytest tests/ -v
pytest tests/ --cov=backend --cov-report=term-missing
| Test | Covers |
|---|---|
| test_scoring.py | AZ domain tiers, whitelist, squatting, overrides, Tor floor |
| test_scoring_normalization.py | Score normalisation edge cases |
| test_email_engine.py | Urgency patterns, brand impersonation, header parsing |
| test_hash_validation.py | MD5 / SHA1 / SHA256 format validation |
| test_otx_engine.py | OTX mock responses, infrastructure-signal scoring |
| test_urlhaus_engine.py | URLhaus mock responses, graceful degradation |
| test_urlhaus_integration.py | URLhaus end-to-end integration |
| test_caching.py | CacheManager hit/miss/expiry behaviour |
| test_ssrf.py | SSRF prevention |
| test_config.py | Config loader, defang_url, HA key rotator |
| test_v8_fixes.py | Forensic engine wiring, dead file removal, integration |
- VirusTotal: Google's AV aggregation platform
- AbuseIPDB: IP abuse reporting database
- Shodan: The search engine for the internet
- GreyNoise: Internet scanner intelligence
- abuse.ch / URLhaus: Malware URL database
- AlienVault OTX: Open Threat Exchange
- Hybrid Analysis: Falcon Sandbox (CrowdStrike)
- Ollama: Local LLM runtime
- Streamlit: Python web UI framework
- ReportLab: PDF generation
- Playwright: Headless browser automation
- MITRE ATT&CK: Adversary tactic framework
> connection established
> target acquired
> analysis complete
> stay ghost.
GhostWire CTI v7
