Commit dbc41b7

docs: describe precerts and SCT validation (google#1434)
1 parent f6216dc commit dbc41b7

2 files changed

+269
-0
lines changed

docs/SCTValidation.md

Lines changed: 269 additions & 0 deletions
@@ -0,0 +1,269 @@
1+
# Signed Certificate Timestamp (SCT) Validation
2+
3+
- [SCT Concepts](#sct-concepts)
4+
- [SCT Contents](#sct-contents)
5+
- [SCT Delivery Mechanisms](#sct-delivery-mechanisms)
6+
- [Precertificates](#precertificates)
7+
- [Pre-Issued Precertificates](#pre-issued-precertificates)
8+
- [SCT Validation Steps](#sct-validation-steps)
9+
- [Embedded SCTs](#embedded-scts)
10+
- [Inclusion Checking](#inclusion-checking)
11+
12+
## SCT Concepts
13+
14+
Certificate Transparency (CT) involves recording X.509 certificates in a CT Log.
15+
Because the Log might not be able to incorporate a certificate into its data
16+
store right away, it returns a **Signed Certificate Timestamp** (SCT), which is
17+
a promise to incorporate that certificate soon.
18+
19+
That description included a couple of
20+
[weasel words](https://en.wikipedia.org/wiki/Weasel_word):
21+
22+
- "Soon": A CT Log publishes its *maximum merge delay* (often 24 hours), which
23+
says how soon. If a Log hasn't incorporated a certificate within that
24+
period, it is misbehaving and may be punished. A "signed certificate
25+
timestamp" includes a "timestamp" to allow this to be checked.
26+
- "Promise": A CT Log includes a cryptographic signature in the contents of the
27+
SCT; this means that the Log cannot later claim that it never saw the
28+
original certificate. This is the origin of the "signed" part of "signed
29+
certificate timestamp".
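The "soon" check can be sketched in a few lines of Python. This is a minimal illustration, not part of any CT library: the function names are hypothetical, and the 24-hour default is only the commonly seen maximum merge delay mentioned above (each Log publishes its own value).

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def merge_deadline(sct_timestamp_ms: int,
                   mmd: timedelta = timedelta(hours=24)) -> datetime:
    """Latest time by which the Log should have incorporated the entry."""
    # SCT timestamps count milliseconds since the Unix epoch (RFC 6962).
    return UNIX_EPOCH + timedelta(milliseconds=sct_timestamp_ms) + mmd


def mmd_has_elapsed(sct_timestamp_ms: int, now: datetime,
                    mmd: timedelta = timedelta(hours=24)) -> bool:
    """True once the Log no longer has an excuse for a missing entry."""
    return now > merge_deadline(sct_timestamp_ms, mmd)
```

If `mmd_has_elapsed` returns true and the Log still cannot produce an inclusion proof for the entry, the Log has broken its promise.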
30+
31+
32+
## SCT Contents
33+
34+
As described above, a signed certificate timestamp includes some key components:
35+
- "signed": a cryptographic signature over the data, which proves that the Log
36+
definitely saw the submitted certificate.
37+
- "certificate": some way of confirming that the SCT applies to a particular
38+
X.509 certificate.
39+
- "timestamp": a timestamp that gives a limit to how soon the certificate must
40+
be visible in the Log.
41+
42+
The precise contents of the SCT are defined by
43+
[RFC 6962](https://tools.ietf.org/html/rfc6962#section-3.2) (and shown in
44+
[this diagram](images/RFC6962Structures.png)), but there are a couple of
45+
features that are worth pointing out.
46+
47+
First, the SCT includes a signature *over* the timestamp and certificate details
48+
(among other things), but doesn't include the certificate data. This means that
49+
you can't verify the SCT on its own – you also need to have the
50+
certificate that it corresponds to.
51+
52+
Secondly, the SCT structure has two variants: one for certificates, and one for
53+
*precertificates*. That's the subject of a [later section](#precertificates),
54+
once we've touched on how SCTs get to users.
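To make the RFC 6962 layout concrete, here is a rough Python sketch that decodes the TLS-encoded v1 SCT structure from section 3.2 of the RFC. The `SCT` dataclass and `parse_sct` name are illustrative assumptions, not an existing API. Note that the parsed result contains no certificate bytes, which is why an SCT cannot be verified on its own.

```python
import struct
from dataclasses import dataclass


@dataclass
class SCT:
    version: int              # 0 means v1
    log_id: bytes             # SHA-256 hash of the Log's public key
    timestamp_ms: int         # milliseconds since the Unix epoch
    extensions: bytes         # CtExtensions (currently empty in practice)
    hash_algorithm: int       # TLS HashAlgorithm code, e.g. 4 = SHA-256
    signature_algorithm: int  # TLS SignatureAlgorithm code, e.g. 3 = ECDSA
    signature: bytes


def parse_sct(data: bytes) -> SCT:
    """Decode a single serialized SCT (hypothetical helper)."""
    header = ">B32sQH"  # version, log_id, timestamp, extensions length
    version, log_id, timestamp_ms, ext_len = struct.unpack_from(header, data, 0)
    offset = struct.calcsize(header)
    extensions = data[offset:offset + ext_len]
    offset += ext_len
    # digitally-signed: hash alg, signature alg, 2-byte length, signature
    hash_alg, sig_alg, sig_len = struct.unpack_from(">BBH", data, offset)
    offset += 4
    signature = data[offset:offset + sig_len]
    if len(extensions) != ext_len or offset + sig_len != len(data):
        raise ValueError("malformed SCT")
    return SCT(version, log_id, timestamp_ms, extensions,
               hash_alg, sig_alg, signature)
```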
55+
56+
57+
## SCT Delivery Mechanisms
58+
59+
We've seen that an SCT is a promise that a certificate will be logged, so it
60+
also shows that the certificate issuance was public. If an HTTPS client gets a
61+
certificate that has one or more SCTs associated with it, that's a pretty good
62+
sign that the certificate was publicly issued – but only if the
63+
client:
64+
* gets the SCTs
65+
* validates the SCTs.
66+
67+
(Of course, *publicly issued* isn't necessarily the same as *legitimately
68+
issued*, but that's a
69+
[different topic](https://tools.ietf.org/html/rfc6962#section-5.3).)
70+
71+
[RFC 6962](https://tools.ietf.org/html/rfc6962#section-3.3) describes three ways
72+
for SCTs to make their way to users:
73+
- The TLS server can include the SCT(s) as an extension in the TLS handshake.
74+
- The TLS server can include the SCT(s) as an extension in a
75+
[stapled OCSP response](https://tools.ietf.org/html/rfc6066).
76+
- The X.509 certificate itself can include the SCT(s) as a certificate
77+
extension.
78+
79+
The last of these obviously poses a
80+
[chicken-and-egg problem](https://en.wikipedia.org/wiki/Chicken_or_the_egg): if
81+
the SCT includes a signature over the certificate, and the certificate includes
82+
the SCT, which came first?
83+
84+
The answer to this question involves *precertificates*, discussed next.
85+
86+
## Precertificates
87+
88+
If a Certificate Authority (CA) wants to embed SCTs within the certificates it
89+
issues (which provides a simple delivery mechanism for those SCTs), it needs to
90+
get an SCT which covers a precursor version of the certificate, known as a
91+
**precertificate**.
92+
93+
This is conceptually just the final certificate without the embedded SCT list,
94+
but there's a technical difficulty: CAs are supposed to ensure that valid
95+
certificates have a serial number that is unique for a given issuing key. If a
96+
CA signed the same certificate both with and without the embedded SCT
97+
list, some felt that this might be in violation of this rule – there would
98+
be two different valid certificates that shared the same issuer and serial number.
99+
100+
Precertificates take advantage of the weasel word "valid" in the previous
101+
description: the precertificate version of the certificate (which the CA signs
102+
and the Log registers) includes an X.509 extension that is marked as critical,
103+
but which is deliberately non-standard – the so-called *poison extension*.
104+
According to the rules of
105+
[RFC 5280](https://tools.ietf.org/html/rfc5280#section-4.2), a
106+
"certificate-using system MUST reject the certificate if it encounters a
107+
critical extension it does not recognize" – so the precertificate is not a
108+
*valid* X.509 certificate.
109+
110+
Unfortunately, that's got some knock-on effects on how SCTs for precertificates
111+
work, which makes life complicated.
112+
113+
First, the "certificate" part of "signed certificate timestamp" can no longer be
114+
over the whole certificate, because what the "certificate" is keeps changing:
115+
- On submission, the "certificate" is a whole X.509 certificate that includes
116+
the poison extension, and has a signature from the cert issuer over the whole
117+
cert (including the poison).
118+
- On embedded SCT validation, the "certificate" is a whole X.509 certificate
119+
that includes the SCT list extension, and has a signature from the cert
120+
issuer over the whole cert (including the SCT list).
121+
122+
To allow these "certificates" to be compared, the SCT signature for a
123+
precertificate is defined to only cover the inner part of the X.509 certificate,
124+
without the issuer's signature; this is known as the `tbsCertificate`, where
125+
`tbs` stands for "to-be-signed". This makes the two versions comparable: in
126+
either case, dropping the certificate signature and any embedded CT-related
127+
extension (poison or SCT list) should give the same bytes.
128+
129+
**Note**: this has an important corollary: to make sure this is true, any code
130+
that manipulates [pre]certificates has to make sure that *nothing else* changes
131+
in the certificate. In particular:
132+
- the [extensions](https://tools.ietf.org/html/rfc5280#section-4.2) have to
133+
stay in the same order
134+
- extension contents have to stay in the same order (e.g. for the
135+
[SAN](https://tools.ietf.org/html/rfc5280#section-4.2.1.6))
136+
- all ASN.1 types have to stay the same (no switching from `UTF8String` to
137+
`PrintableString`).
138+
139+
That leaves one more complication: if the SCT only covers the `tbsCertificate`,
140+
what guarantee do we have that the issuer of the final certificate matches the
141+
one that logged the precertificate? To cover this concern, the SCT for a
142+
precertificate also signs over the hash of the issuer's public key.
143+
144+
So to sum up precertificates:
145+
- The CA builds and signs a version of the certificate that includes the
146+
poison extension, and submits this to the log.
147+
- The Log removes the poison and extracts the inner `tbsCertificate`. This is
148+
combined with the hash of the issuer's key, to form the core `PreCert` data
149+
that the Log deals with.
150+
- The Log sends an SCT back to the CA, which includes a timestamp as usual but
151+
has a signature over data including the `PreCert` bundle of `tbsCertificate`
152+
and issuer key hash.
153+
- The CA builds an SCT list extension that includes this SCT, attaches it to
154+
the original (un-poisoned) certificate, and signs the whole thing.
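The data that the Log signs for a precertificate can be sketched as follows. This is an illustrative encoding of the `digitally-signed` input from RFC 6962 section 3.2, assuming the caller already has the DER bytes of the cleaned `tbsCertificate` (poison removed) and of the issuer's `SubjectPublicKeyInfo`; the function name is hypothetical and no ASN.1 handling is shown.

```python
import hashlib
import struct


def precert_signature_input(timestamp_ms: int,
                            issuer_spki_der: bytes,
                            tbs_certificate_der: bytes,
                            extensions: bytes = b"") -> bytes:
    """Bytes covered by the SCT signature for a precert entry (sketch)."""
    # The issuer key hash is SHA-256 over the DER SubjectPublicKeyInfo.
    issuer_key_hash = hashlib.sha256(issuer_spki_der).digest()
    return b"".join([
        struct.pack(
            ">BBQH",
            0,             # sct_version: v1
            0,             # signature_type: certificate_timestamp
            timestamp_ms,  # uint64 timestamp
            1,             # entry_type: precert_entry
        ),
        issuer_key_hash,                          # PreCert.issuer_key_hash
        len(tbs_certificate_der).to_bytes(3, "big"),  # 24-bit length prefix
        tbs_certificate_der,                      # PreCert.tbs_certificate
        struct.pack(">H", len(extensions)),       # 16-bit length prefix
        extensions,
    ])
```

Because the issuer key hash is part of the signed bytes, an SCT obtained for one issuer will not verify against a certificate with the same `tbsCertificate` signed by a different key.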
155+
156+
157+
### Pre-Issued Precertificates
158+
159+
But wait, it (optionally) gets more complicated!
160+
161+
(Feel free to skip this section – in practice, this mechanism is rarely
162+
used in the wild, other than by explicit CT testing/monitoring systems.)
163+
164+
To allow for the possibility that the "valid" certificate loophole might not be
165+
enough, RFC 6962 also allows an extra level of indirection: the precertificate
166+
can be signed by a different key than the key that the final certificate will be
167+
signed by. This means that the "true issuer" key only ever signs one version of
168+
the leaf certificate: the final certificate with embedded SCT list.
169+
170+
The extra level of indirection comes in the form of a "pre-issuer": the key used
171+
to sign (just) the precertificate is embedded in a CA cert of its own, and this
172+
pre-issuer cert is signed by the true issuer. To make it clear that this is a
173+
special case, this pre-issuer cert needs to have the Certificate Transparency
174+
extended key usage (EKU).
175+
176+
However, this involves yet more modifications to the submitted precertificate:
177+
as it is now issued by a different intermediate, those parts of the certificate
178+
that refer to the issuer have to be updated to match:
179+
- The Issuer field needs to be updated to match the pre-issuer Subject name.
180+
- The Authority Key Identifier extension (if present) needs to be updated to
181+
identify the pre-issuer's key.
182+
183+
These modifications won't be present in the final version of the certificate
184+
that the true issuer signs, so the Log has to reverse these modifications before
185+
storing and signing over the precertificate.
186+
187+
The extra pre-issuer certificate also makes the Log's job more complicated when
188+
verifying the submitted certificate chain. The chain of signatures has to be
189+
verified using the full chain, including the pre-issuer, but other validity
190+
checks need to be done as if the pre-issuer were not present. For example:
191+
- The pre-issuer has the CT EKU, but the leaf cert does not, so any check that
192+
the leaf's EKUs are a subset of its issuer's EKUs will incorrectly fail.
193+
- Any path length constraint (in a
194+
[Basic Constraints](https://tools.ietf.org/html/rfc5280#section-4.2.1.9)
195+
extension) for a CA certificate in the chain may be off by one (and so
196+
theoretically not allow the pre-issuer to be a CA certificate).
197+
198+
So to sum up *pre-issued* precertificates:
199+
- The CA builds a precertificate version of the certificate that includes the
200+
poison.
201+
- Next, the CA modifies the Issuer and (optional) Authority Key Identifier
202+
extension in the precertificate to match the pre-issuer rather than the true
203+
issuer.
204+
- The CA signs the resulting precertificate using the pre-issuer's key.
205+
- The CA submits the whole chain (precert, pre-issuer, true-issuer, ... root) to
206+
the Log.
207+
- The Log removes the poison and extracts the inner `tbsCertificate`.
208+
- The Log notices that the direct issuer has the Certificate Transparency
209+
extended key usage, so treats the next entry in the chain as the true issuer.
210+
- The Log updates the Issuer and (optional) Authority Key Identifier extension
211+
to match the true issuer rather than the pre-issuer.
212+
- The Log combines the resulting inner `tbsCertificate` with the hash of the
213+
(true) issuer's key, to form the core `PreCert` data that the Log deals with.
214+
- The Log sends an SCT back to the CA, which includes a timestamp as usual but
215+
has a signature over data including the `PreCert` bundle of `tbsCertificate`
216+
and issuer key hash.
217+
- The CA builds an SCT list extension that includes this SCT, attaches it to
218+
the original (un-poisoned) certificate, and signs the whole thing with the
219+
true issuer's key.
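The Log-side steps that undo the pre-issuer modifications can be sketched with a deliberately simplified model. The `Cert` dataclass below is a hypothetical stand-in for a parsed X.509 certificate (it holds only the fields this section talks about), and `resolve_true_issuer` is an illustrative name; real code would operate on the DER-encoded `tbsCertificate`. The OID is the Certificate Transparency EKU defined in RFC 6962.

```python
from dataclasses import dataclass, replace
from typing import Optional, Sequence, Tuple

# Certificate Transparency extended key usage (RFC 6962, section 3.1).
CT_EKU_OID = "1.3.6.1.4.1.11129.2.4.4"


@dataclass(frozen=True)
class Cert:
    """Hypothetical, heavily simplified view of a certificate."""
    subject: str
    issuer: str
    key_id: str
    authority_key_id: Optional[str] = None
    ekus: frozenset = frozenset()


def resolve_true_issuer(chain: Sequence[Cert]) -> Tuple[Cert, Cert]:
    """Return (precert with issuer fields restored, true issuer)."""
    precert, direct_issuer = chain[0], chain[1]
    if CT_EKU_OID not in direct_issuer.ekus:
        # Ordinary precertificate: nothing to undo.
        return precert, direct_issuer
    if len(chain) < 3:
        raise ValueError("pre-issuer present but true issuer missing")
    true_issuer = chain[2]
    # Only rewrite the Authority Key Identifier if the precert has one.
    aki = true_issuer.key_id if precert.authority_key_id is not None else None
    restored = replace(precert, issuer=true_issuer.subject,
                       authority_key_id=aki)
    return restored, true_issuer
```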
220+
221+
This is complicated, but it's worth pointing out that only the CA and the Log
222+
need to deal with all of this; the SCT (and the tree leaf it corresponds to)
223+
only covers data that pertains to the final certificate. This means that a
224+
client who receives a certificate with an embedded SCT can just do the
225+
[same checks](#embedded-scts) as for any other embedded SCT.
226+
227+
228+
## SCT Validation Steps
229+
230+
There are two main aspects to validating an SCT:
231+
- Signature Validation: this requires the following (in addition to the SCT itself):
232+
- the Log's public key
233+
- the [pre]certificate data that the signature encompasses
234+
- (for a precertificate) the issuer's public key
235+
- Inclusion Checking: this requires the following (in addition to the SCT
236+
itself):
237+
- enough time to have passed (at most, the Log's MMD) for the certificate
238+
to be incorporated
239+
- the Log's URL (for accessing the `get-proof-by-hash`
240+
[entrypoint](https://tools.ietf.org/html/rfc6962#section-4.5))
241+
- the [pre]certificate data that the signature encompasses
242+
- (for a precertificate) the issuer's public key
243+
- a published tree size and root hash for the Log (via the `get-sth`
244+
[entrypoint](https://tools.ietf.org/html/rfc6962#section-4.1))
245+
- the Log's public key (to allow validation of the signed tree head (STH)
246+
that provided the tree size/root hash).
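Once those inputs are available, the last step of inclusion checking is to recompute the root hash from the audit path returned by `get-proof-by-hash` and compare it with the root hash from the STH. The sketch below follows the standard Merkle audit path verification procedure for RFC 6962 trees; the function names are illustrative, and STH signature validation is assumed to have been done separately.

```python
import hashlib
from typing import Sequence


def _node_hash(left: bytes, right: bytes) -> bytes:
    # Interior nodes are hashed with a 0x01 prefix (leaves use 0x00).
    return hashlib.sha256(b"\x01" + left + right).digest()


def verify_inclusion(leaf_hash: bytes, leaf_index: int, tree_size: int,
                     audit_path: Sequence[bytes], root_hash: bytes) -> bool:
    """Check an audit path against a tree size and root hash (sketch)."""
    if leaf_index < 0 or leaf_index >= tree_size:
        return False
    fn, sn = leaf_index, tree_size - 1
    r = leaf_hash
    for p in audit_path:
        if sn == 0:
            return False  # path is longer than the tree allows
        if fn & 1 or fn == sn:
            # Current node is a right child (or the rightmost node).
            r = _node_hash(p, r)
            if not fn & 1:
                # Skip levels where the rightmost node has no sibling.
                while fn != 0 and not fn & 1:
                    fn >>= 1
                    sn >>= 1
        else:
            r = _node_hash(r, p)
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root_hash
```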
247+
248+
249+
### Embedded SCTs
250+
251+
To check SCTs that are embedded in an X.509 certificate, a client needs to
252+
rebuild the [precertificate](#precertificates) data that the SCT and the leaf
253+
hash encompass:
254+
255+
- The SCT list extension must be removed.
256+
- The inner `tbsCertificate` data must be extracted, and combined with the hash
257+
of the issuer's public key.
258+
259+
### Inclusion Checking
260+
261+
To perform on-line inclusion checking for an SCT, the client needs to generate
262+
the *leaf hash* for the submitted entry, to submit to the `get-proof-by-hash`
263+
[entrypoint](https://tools.ietf.org/html/rfc6962#section-4.5).
264+
265+
This is the SHA-256 hash of a zero byte followed by the TLS encoding of a
266+
`MerkleTreeLeaf` structure (shown in
267+
[the diagram](images/RFC6962Structures.png)). Building this structure
268+
requires the same information as does validating the SCT signature, but
269+
in a different order/structure.
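As a rough illustration, the leaf hash for a precertificate entry could be computed as below. The layout follows the `MerkleTreeLeaf` and `TimestampedEntry` definitions in RFC 6962 section 3.4; the function names are hypothetical, and the caller is assumed to already hold the rebuilt `tbsCertificate` DER bytes and the 32-byte issuer key hash. The `hash` and `tree_size` query parameters are the inputs that RFC 6962 section 4.5 defines for `get-proof-by-hash`.

```python
import base64
import hashlib
import struct
from urllib.parse import urlencode


def precert_leaf_hash(timestamp_ms: int, issuer_key_hash: bytes,
                      tbs_certificate_der: bytes,
                      extensions: bytes = b"") -> bytes:
    """SHA-256(0x00 || MerkleTreeLeaf) for a precert entry (sketch)."""
    if len(issuer_key_hash) != 32:
        raise ValueError("issuer_key_hash must be a SHA-256 digest")
    leaf = b"".join([
        struct.pack(
            ">BBQH",
            0,             # version: v1
            0,             # leaf_type: timestamped_entry
            timestamp_ms,  # uint64 timestamp from the SCT
            1,             # entry_type: precert_entry
        ),
        issuer_key_hash,
        len(tbs_certificate_der).to_bytes(3, "big"),  # 24-bit length prefix
        tbs_certificate_der,
        struct.pack(">H", len(extensions)),           # 16-bit length prefix
        extensions,
    ])
    return hashlib.sha256(b"\x00" + leaf).digest()


def proof_query(leaf_hash: bytes, tree_size: int) -> str:
    """Query string for get-proof-by-hash (hash is base64 encoded)."""
    return urlencode({
        "hash": base64.b64encode(leaf_hash).decode("ascii"),
        "tree_size": tree_size,
    })
```

The resulting query string would be appended to the Log's `ct/v1/get-proof-by-hash` URL; the `tree_size` should come from an STH whose signature has been validated.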

docs/images/RFC6962Structures.png

118 KB
