feat(dns): DNS-over-HTTPS resolver for mobile networks #151
Open
samosvalishe wants to merge 8 commits into
You can test it here: https://github.com/samosvalishe/turn-proxy-android/releases/tag/v1.8.0
…nostics
- Pick turn URL by streamID % len(urls) instead of always urlsRaw[0]
- Add countingConn to track bytes written/read for TCP TURN connections
- Add classifyNetErr helper for structured error categorization
- Log TCP dial failures always; verbose logs gated behind isDebug
…oxy)
Brings the captcha-solver improvements from main into feat/doh while keeping the flat client/ layout (no internal/* refactor pulled in).
- Persistent SavedProfile (UA + Sec-CH-UA + device JSON + browser_fp) captured during manual solve and replayed by auto/slider so VK sees a consistent fingerprint across runs. Stored under $VK_PROFILE_PATH | UserCacheDir | TempDir | CWD.
- callCaptchaNotRobot: per-session adFp, sha256 debug_info, jittered connectionRtt/connectionDownlink, cursor "[]" on first check, headers switched to Origin api.vk.ru / Referer not_robot_captcha.
- Slider session: per-session adFp + debugInfo, savedProfile injection, ApplyBrowserProfileFhttp + same captcha headers on every request, getContent fallback with/without captcha_settings, second componentDone before getContent (matches real widget lifecycle).
- Manual proxy: strip WebView identity headers (X-Requested-With and friends), server-side rewrite of src/href/action attributes (skipping <script>/<style> spans), inject helper script at <head> opening, sendBeacon + form fallback for token delivery on mobile WebView, /generic_proxy SSRF allowlist + scheme check + security-header strip + server-side success_token extract, loggingTransport that captures the real browser fingerprint and persists it as SavedProfile, best-effort 3s Shutdown, Windows rundll32 launcher, PII redaction in logs.
- solvePoW returns an error instead of an empty string.
- Manual captcha timeout bumped 60s -> 3m on context.Background so a human has time to solve regardless of the auth-level deadline; non-empty token/key from the manual goroutine is treated as success even if the server cleanup returned an error.
Two related changes ported from main, adapted to the flat client/ layout:
1. Identity caching + per-slot TURN creds (vkauth)
- Split the monolithic getTokenChain into:
* acquireVkIdentity — captcha-gated steps 1-3 (anonym_token,
getCallPreview, getAnonymousToken). Cached per (link, client_id)
for identityLifetime=8m, globally serialised via vkRequestMu +
3-6s cooldown.
* acquireVkTurnSlot — lightweight steps 4-5 (auth.anonymLogin
with fresh device_id, vchat.joinConversationByLink). Each call
returns a distinct (username, password) pair, so multiple
streams under the same identity each get their own VK-side
slot — bypasses per-username throttling without re-solving
captcha.
- vkCredentialsList trimmed from 5 to 2: VKVIDEO_* and VK_ID_AUTH_APP
started returning error_code:3 "Unknown method" on
calls.getAnonymousToken (observed 2026-04-28) and only burned
throttle budget if kept in rotation.
- streamsPerCache 10 -> 1: each stream now caches its own slot
creds because slots are unique per call.
- Credential rotation starts at streamID%n offset so concurrent
streams spread across the credential list instead of all hitting
the same client_id first.
- identityStore + identityEntry give per-(link, client_id)
serialisation: only one stream solves captcha per identity.
- turn_server.urls picking is transport-aware (prefers urls whose
?transport= matches udpMode, falls back to the full list when
nothing matches to preserve -port override) and round-robins
within an identity via urlCounter — streamID%len(pool) collapses
every stream of an identity onto the same parity.
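The transport-aware URL selection with a per-identity round-robin counter can be sketched as follows. This is an illustration under assumptions: `urlPicker`, `transportOf` and `pick` are hypothetical names, and the URL shape `turn:host:port?transport=udp` is assumed from the commit text.

```go
package main

import (
	"strings"
	"sync/atomic"
)

// transportOf returns the lower-cased ?transport= value of a TURN URL such
// as "turn:host:3478?transport=udp", or "" when the parameter is absent.
func transportOf(u string) string {
	_, query, ok := strings.Cut(u, "?")
	if !ok {
		return ""
	}
	for _, kv := range strings.Split(query, "&") {
		if k, v, _ := strings.Cut(kv, "="); k == "transport" {
			return strings.ToLower(v)
		}
	}
	return ""
}

// urlPicker holds the round-robin counter for one identity.
// Hypothetical type: it stands in for the urlCounter named in the commit.
type urlPicker struct {
	counter atomic.Uint64
}

// pick prefers URLs whose transport matches the requested mode, falls back
// to the full list when nothing matches, and rotates within that pool.
func (p *urlPicker) pick(urls []string, udp bool) string {
	if len(urls) == 0 {
		return ""
	}
	want := "tcp"
	if udp {
		want = "udp"
	}
	pool := make([]string, 0, len(urls))
	for _, u := range urls {
		if transportOf(u) == want {
			pool = append(pool, u)
		}
	}
	if len(pool) == 0 {
		pool = urls // nothing matched: keep the full list
	}
	n := p.counter.Add(1) - 1
	return pool[n%uint64(len(pool))]
}
```

Keeping the counter per identity rather than deriving the index from the stream ID means consecutive picks under one identity walk through the pool in order, regardless of which stream IDs happen to share that identity.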
2. Multiple TURN allocations per stream (oneTurnConnection)
- New -allocs-per-stream flag (default 1).
- dialTurn extracted as a helper that returns a turnAllocation
(dialConn, turn.Client, relay PacketConn).
- relayPool wraps the live relays with sync.RWMutex + atomic
counter for round-robin pick on the outbound hot path.
- Outbound goroutine (conn2 -> relay) uses pool.pick() round-robin.
Per-relay inbound goroutine (relay -> conn2) is spawned via
spawnInbound; they all feed the same conn2 keyed by
internalPipeAddr.
- Primary allocation opens immediately. Extras are deferred 3s so
the DTLS handshake completes over the primary first, letting the
server install the Connection ID; subsequent multi-path packets
are then matched to the existing session via CID rather than
5-tuple. Each extra is jittered 200ms apart.
- Allocation tracking + deadline-on-cancel + close-on-exit handle
clean shutdown of all relays.
A new udpMode global mirrors the -udp flag so acquireVkTurnSlot
(called from the credential layer, which doesn't have access to
turnParams) can filter URLs by transport.
- relayPool/sessionPool: atomic.Pointer copy-on-write, drop RWMutex from pick()
- DTLS read loop caches activeLocalPeer locally to skip type-assert per packet
- solvePoW parallelised across runtime.NumCPU() workers
- vkRequestMu replaced with per-client_id throttle so distinct client_ids run in parallel
- inboundChan 2000 -> 8192, periodic drop-counter logging
- listener caches addr.String() to avoid redundant atomic.Value stores
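A copy-on-write pool of the kind the first bullet describes might be shaped like the sketch below. It is a generic illustration, not the PR's relayPool: the type parameter stands in for the real relay type, and the method names are assumptions.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// relayPool is a copy-on-write pool: readers load an immutable snapshot
// without locking, writers replace the snapshot wholesale.
// Hypothetical sketch; T stands in for the real relay type.
type relayPool[T any] struct {
	items atomic.Pointer[[]T] // current snapshot, never mutated in place
	next  atomic.Uint64       // round-robin cursor
	mu    sync.Mutex          // serialises writers only
}

// add publishes a new snapshot that contains item.
func (p *relayPool[T]) add(item T) {
	p.mu.Lock()
	defer p.mu.Unlock()
	var old []T
	if cur := p.items.Load(); cur != nil {
		old = *cur
	}
	fresh := make([]T, len(old)+1)
	copy(fresh, old)
	fresh[len(old)] = item
	p.items.Store(&fresh)
}

// pick returns the next item in round-robin order without taking a lock.
func (p *relayPool[T]) pick() (T, bool) {
	var zero T
	cur := p.items.Load()
	if cur == nil || len(*cur) == 0 {
		return zero, false
	}
	s := *cur
	return s[(p.next.Add(1)-1)%uint64(len(s))], true
}
```

The hot path costs one atomic load and one atomic add. The trade-off is that every add allocates a new slice, which is cheap when relays change rarely and packets flow constantly.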

- handshakeSem 3 -> 8 default + new -handshake-concurrency flag
- per-stream startup jitter 100-500ms -> 30-130ms
- TURN dial ticker 200ms -> 100ms
- extra alloc deferral 3s -> 1s (DTLS handshake completes fast)
- VLESS maintainer stagger 300ms -> 100ms
For N=10 streams cold start drops from ~5-8s of pure scheduling overhead to ~1.5-2s; bottleneck is now the per-client_id throttle and VK API latency.
…bound
- sleepCtx helper (NewTimer+Stop) replaces time.After in long backoff sites: DTLS reconnect (10-30s), captcha 60s ban backoff, lockout sleep. Avoids long-lived timer leak when ctx cancels mid-wait.
- startIdentityJanitor: prunes expired identityStore entries every 5 min. Two-phase with TryLock so an in-flight acquireVkIdentity (which can hold entry.mu for tens of seconds during captcha) never blocks the janitor or other acquires.
- getYandexCreds: cap ws read loop at 64 messages so a chatty/malformed peer cannot keep us reading until the 15s deadline burns down.
On the mobile traffic of some carriers the client fails before it even reaches the TURN stage, while obtaining VK credentials (#144):
login.vk.ru, api.vk.ru, calls.okcdn.ru do not get through … at the TLS handshake.
Root cause
Before this patch, names were resolved in exactly one way: UDP/53 to a set of public resolvers (Yandex/Google/Cloudflare) hard-coded in getCustomNetDialer().
On mobile networks this path fails for two reasons:
- … :53 appears to go through (UDP is connectionless, so SendTo does not fail), but no real reply ever arrives.
- … the carrier transparently intercepts UDP/53 and substitutes its own answer. Changing DNS in the Android settings does not help, because the interception happens at the network level.
TCP/53 is blocked even more aggressively, so it is no use as a fallback.
Solution: DNS-over-HTTPS (RFC 8484) with automatic switchover
What was done
New module client/doh.go, a DoH resolver:
- … application/dns-message to preselected endpoints.
- … depended on the system DNS.
- … IPv4-only CGNAT).
- … [10s, 1h].
- … (golang.org/x/crypto/x509roots/fallback), needed for Android builds with CGO_ENABLED=0.
Local UDP/TCP forwarder on 127.0.0.1: the Go resolver connects to it as to an ordinary DNS server, and the forwarder wraps the incoming wire-format query in DoH and returns the response.
All of the Go resolver's edge cases (RESINFO, EDNS, TCP length-prefix, retries) are handled as usual.
Single entry point appDialer() in main.go. All of the project's network clients now resolve names the same way:
- tls-client for VK auth;
- http.Transport for the Telemost conference;
- websocket.Dialer for Telemost WSS;
- http.Transport for the manual captcha proxy.
Flag -dns=udp|doh|auto (default auto).
In auto mode, one real DNS round-trip over UDP/53 is made at startup under a 1.5 s deadline. If no reply arrives, the process switches to DoH for its entire lifetime.
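The auto-mode startup probe can be sketched with the standard library alone. Everything here is an assumption made for illustration: the function names, the probed name example.com, and the rule that any reply with a matching ID and the QR bit set counts as "UDP works".

```go
package main

import (
	"encoding/binary"
	"net"
	"strings"
	"time"
)

// buildQuery builds a minimal DNS query for the A record of name.
// Labels longer than 63 bytes are not handled in this sketch.
func buildQuery(id uint16, name string) []byte {
	msg := make([]byte, 12, 64)
	binary.BigEndian.PutUint16(msg[0:], id)
	binary.BigEndian.PutUint16(msg[2:], 0x0100) // flags: recursion desired
	binary.BigEndian.PutUint16(msg[4:], 1)      // QDCOUNT = 1
	for _, label := range strings.Split(strings.TrimSuffix(name, "."), ".") {
		msg = append(msg, byte(len(label)))
		msg = append(msg, label...)
	}
	return append(msg, 0, 0, 1, 0, 1) // root label, QTYPE=A, QCLASS=IN
}

// udpDNSWorks performs one real round-trip to server ("ip:53") and reports
// whether a response with the same ID arrived before the deadline.
func udpDNSWorks(server string, timeout time.Duration) bool {
	conn, err := net.DialTimeout("udp", server, timeout)
	if err != nil {
		return false
	}
	defer conn.Close()
	_ = conn.SetDeadline(time.Now().Add(timeout))
	const id = 0x1234
	if _, err := conn.Write(buildQuery(id, "example.com")); err != nil {
		return false
	}
	buf := make([]byte, 512)
	n, err := conn.Read(buf)
	if err != nil || n < 12 {
		return false
	}
	// Same transaction ID and the QR (response) bit set.
	return binary.BigEndian.Uint16(buf[0:]) == id && buf[2]&0x80 != 0
}

// chooseMode resolves the -dns flag value to "udp" or "doh".
// Any value other than "udp" or "doh" is treated as "auto".
func chooseMode(flagValue, server string) string {
	switch flagValue {
	case "udp", "doh":
		return flagValue
	}
	if udpDNSWorks(server, 1500*time.Millisecond) {
		return "udp"
	}
	return "doh"
}
```

Note that a probe like this only detects silent drops. A carrier that intercepts UDP/53 and answers with its own response would still pass it, so handling that case needs some check beyond this sketch.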
The endpoint list is deliberately ordered: Yandex comes first because it stays reachable from mobile carriers more reliably than Google/CF.
Removed the bschaatsbergen/dnsdialer dependency and its transitive dependencies (hashicorp/golang-lru/v2, google.golang.org/grpc, google.golang.org/genproto/...): the DoH resolver fully covers the functionality we used.