This repository was archived by the owner on Feb 26, 2025. It is now read-only.
Publishing Public Suffix List as a DAFSA binary #373
Merged
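For orientation, the update flow this PR implements can be sketched as: fetch the latest commit hash of the public suffix list from GitHub, compare it against the hash stored in the Remote Settings record, and rebuild and upload the DAFSA binary only when they differ. A minimal sketch of that decision logic, using hypothetical stub names rather than the module's real network and Kinto calls:

```python
# Hedged sketch of the publishing flow in this PR. Function names here are
# illustrative only; the actual module fetches from GitHub and publishes to
# Remote Settings via kinto_http.

def sync_needed(latest_hash, stored_hash):
    # Rebuild and publish only when the upstream list changed.
    return latest_hash != stored_hash

def run_pipeline(fetch_latest, fetch_stored, build_and_publish):
    latest = fetch_latest()
    stored = fetch_stored()
    if sync_needed(latest, stored):
        build_and_publish(latest)
        return "published"
    return "skipped"

# Stubbed example run: hashes match, so nothing is published.
result = run_pipeline(
    fetch_latest=lambda: "abc123",
    fetch_stored=lambda: "abc123",
    build_and_publish=lambda h: None,
)
print(result)  # -> skipped
```

The real entry point (`publish_dafsa` below) follows this same shape, with the build step running inside a temporary directory.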
Changes from all commits (44 commits, all authored by heyitsarpit):

- e431c32 dafsa etld_datacreated successfully
- 482e864 Replaced wget dependency and added kinto record publishing code
- 4403331 errors for fetching functions handled with decorator
- 2d798e1 kinto update records
- de83ba2 script refactored, publish_dafsa now entry point for script
- a26a443 refactored redundant nested else statement
- 3f27425 get_latest_hash() now only returns hash when public_suffix_list.dat i…
- f41bef1 removed directory as **kwargs argument
- c845e6a Credentials replaced with environment variables
- 0c5e08a attachment uploading
- b2175e4 Creating filepaths with os.path.join()
- 06fbcd6 removed mimitypes module, check respose status, initialised client wi…
- 024f261 get_latest_hash() function now checks for just the single file with n…
- 9bc4dbc Record and attachment now uploaded in the single session
- 9aa0843 Client creation moved to different function for now.
- 14f0e3a regrouped imports and removed decorator function
- 8171561 raising more exceptions and getting server/auth from event object
- 361d5a7 Record fetching to compare and passing client as argument
- f76a857 added auth split to tuple
- 386f6d7 removed redundant code and refactoring
- 104de46 auth key and dafsa flag added
- a7a27ec get bucket through env, added context arg to entry function
- af955d8 conform to flake8 standard
- 022cff4 conform to black standard
- 1cce889 Request review
- 09c92f3 refactor
- 6f9a0c2 dafsa creation comment update
- 70fc4a1 tests added for publish_dafsa and added argument for get_latest_hash()
- 9740e66 Test added for dafsa creation
- dd45bca dafsa file creation now captures output of stdout
- 3ab6aff kinto client tests
- fb0fb61 dafsa creation and publishing broken into two methods
- 179e730 tests for new functions
- 9d798a5 tests wip
- 1faccfa kinto expection test
- 9e68bef All assertRaises as context managers
- b3b541e get_latest_hash test passing
- 2f16062 removed unneccsasry responses and added new test
- a751efd new function for record fetching
- e018a64 tests refactored
- 8822f e4 all tests passing except remote settings publish
- a2c6eae added decortor to test_HTTPError_raised_when_404
- 3350c70 test for remote_settings_publish now passing
- 4cca805 removed redundant bits and added a new test
.gitignore (one line added):

```diff
@@ -4,3 +4,4 @@ venv/
 __pycache__/
 docs/build/
 requests_cache1.sqlite
+.vscode/
```
commands/publish_dafsa.py (new file, 111 lines):

```python
import os
import json
import tempfile
import subprocess

import requests
from kinto_http import Client, KintoException


PSL_FILENAME = "public_suffix_list.dat"

COMMIT_HASH_URL = (
    f"https://api.github.com/repos/publicsuffix/list/commits?path={PSL_FILENAME}"
)
LIST_URL = f"https://raw.githubusercontent.com/publicsuffix/list/master/{PSL_FILENAME}"

MAKE_DAFSA_PY = "https://raw.githubusercontent.com/arpit73/temp_dafsa_testing_repo/master/publishing/make_dafsa.py"  # noqa
PREPARE_TLDS_PY = "https://raw.githubusercontent.com/arpit73/temp_dafsa_testing_repo/master/publishing/prepare_tlds.py"  # noqa

BUCKET_ID = os.getenv("BUCKET_ID", "main-workspace")
COLLECTION_ID = "public-suffix-list"
RECORD_ID = "tld-dafsa"


def get_latest_hash(url):
    response = requests.get(url)
    response.raise_for_status()
    return response.json()[0]["sha"]


def download_resources(directory, *urls):
    for url in urls:
        file_name = os.path.basename(url)
        file_location = os.path.join(directory, file_name)
        response = requests.get(url, stream=True)
        response.raise_for_status()

        with open(file_location, "wb") as f:
            for chunk in response.iter_content(chunk_size=1024):
                f.write(chunk)


def get_stored_hash(client):
    record = {}
    try:
        record = client.get_record(id=RECORD_ID)
    except KintoException as e:
        if e.response is None or e.response.status_code == 404:
            raise
    return record.get("data", {}).get("commit-hash")


def prepare_dafsa(directory):
    """
    prepare_tlds.py is called with three arguments: the location of
    the downloaded public suffix list, the name of the output file, and
    the '--bin' flag to create a binary file.
    """
    download_resources(directory, LIST_URL, MAKE_DAFSA_PY, PREPARE_TLDS_PY)
    output_binary_name = "dafsa.bin"
    output_binary_path = os.path.join(directory, output_binary_name)
    prepare_tlds_py_path = os.path.join(directory, "prepare_tlds.py")
    raw_psl_path = os.path.join(directory, PSL_FILENAME)
    # Make the DAFSA
    command = (
        f"python3 {prepare_tlds_py_path} {raw_psl_path} --bin > {output_binary_path}"
    )
    run = subprocess.Popen(
        command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    )
    run.wait()
    if run.returncode != 0:
        raise Exception("DAFSA Build Failed !!!")

    return output_binary_path


def remote_settings_publish(client, latest_hash, binary_path):
    # Upload the attachment
    binary_name = os.path.basename(binary_path)
    mimetype = "application/octet-stream"
    filecontent = open(binary_path, "rb").read()
    record_uri = client.get_endpoint("record", id=RECORD_ID)
    attachment_uri = f"{record_uri}/attachment"
    multipart = [("attachment", (binary_name, filecontent, mimetype))]
    commit_hash = json.dumps({"commit-hash": latest_hash})
    client.session.request(
        method="post", data=commit_hash, endpoint=attachment_uri, files=multipart
    )
    # Requesting the new record for review
    client.patch_collection(data={"status": "to-review"})


def publish_dafsa(event, context):
    server = event.get("server") or os.getenv("SERVER")
    auth = event.get("auth") or os.getenv("AUTH")
    # Auth format assumed to be "Username:Password"
    if auth:
        auth = tuple(auth.split(":", 1))

    client = Client(
        server_url=server, auth=auth, bucket=BUCKET_ID, collection=COLLECTION_ID
    )

    latest_hash = get_latest_hash(COMMIT_HASH_URL)
    stored_hash = get_stored_hash(client)

    if stored_hash != latest_hash:
        with tempfile.TemporaryDirectory() as tmp:
            output_binary_path = prepare_dafsa(tmp)
            remote_settings_publish(client, latest_hash, output_binary_path)
```
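One detail worth noting in publish_dafsa() above: the credential arrives as a single "Username:Password" string, and `split(":", 1)` splits only on the first colon, so a password that itself contains ':' survives intact. A quick stdlib illustration (the values here are hypothetical):

```python
# Auth string parsing as done in publish_dafsa(): split on the FIRST colon
# only, so passwords containing ':' remain whole.
auth = "arpit73:pAs:SwErD"
user, password = tuple(auth.split(":", 1))
print(user)      # -> arpit73
print(password)  # -> pAs:SwErD
```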
Tests for commands/publish_dafsa (new file, 205 lines). Note: the original diff asserted `e.status_code` on the `assertRaises` context manager, which has no such attribute; the assertions below read the status code from `e.exception.response` instead:

```python
import os
import tempfile
import unittest
from unittest import mock

import requests
import responses
from kinto_http import Client, KintoException

from commands.publish_dafsa import (
    get_latest_hash,
    download_resources,
    get_stored_hash,
    prepare_dafsa,
    remote_settings_publish,
    publish_dafsa,
    PREPARE_TLDS_PY,
    MAKE_DAFSA_PY,
    LIST_URL,
    BUCKET_ID,
    COLLECTION_ID,
    RECORD_ID,
    COMMIT_HASH_URL,
)


class TestsGetLatestHash(unittest.TestCase):
    def test_get_latest_hash_returns_sha1_hash(self):
        size_latest_hash = len(get_latest_hash(COMMIT_HASH_URL))
        self.assertEqual(size_latest_hash, 40)

    @responses.activate
    def test_HTTPError_raised_when_404(self):
        responses.add(
            responses.GET, COMMIT_HASH_URL, json={"error": "not found"}, status=404
        )
        with self.assertRaises(requests.exceptions.HTTPError) as e:
            get_latest_hash(COMMIT_HASH_URL)
        self.assertEqual(e.exception.response.status_code, 404)


class TestDownloadResources(unittest.TestCase):
    def test_all_files_downloaded_with_correct_names(self):
        with tempfile.TemporaryDirectory() as tmp:
            download_resources(tmp, PREPARE_TLDS_PY, MAKE_DAFSA_PY, LIST_URL)
            self.assertEqual(
                sorted(os.listdir(tmp)),
                sorted(["public_suffix_list.dat", "prepare_tlds.py", "make_dafsa.py"]),
            )

    @responses.activate
    def test_HTTPError_raised_when_404(self):
        with tempfile.TemporaryDirectory() as tmp:
            responses.add(
                responses.GET, PREPARE_TLDS_PY, json={"error": "not found"}, status=404
            )
            with self.assertRaises(requests.exceptions.HTTPError) as e:
                download_resources(tmp, PREPARE_TLDS_PY)
            self.assertEqual(e.exception.response.status_code, 404)


class TestGetStoredHash(unittest.TestCase):
    def setUp(self):
        server = "https://fake-server.net/v1"
        auth = ("arpit73", "pAsSwErD")
        self.client = Client(
            server_url=server, auth=auth, bucket=BUCKET_ID, collection=COLLECTION_ID
        )
        self.record_uri = server + self.client.get_endpoint(
            "record", id=RECORD_ID, bucket=BUCKET_ID, collection=COLLECTION_ID
        )

    @responses.activate
    def test_stored_hash_fetched_successfully(self):
        responses.add(
            responses.GET,
            self.record_uri,
            json={"data": {"commit-hash": "fake-commit-hash"}},
        )
        stored_hash = get_stored_hash(self.client)
        self.assertEqual(stored_hash, "fake-commit-hash")

    @responses.activate
    def test_KintoException_raised_when_stored_hash_fetching_failed(self):
        responses.add(
            responses.GET, self.record_uri, json={"error": "not found"}, status=404
        )
        with self.assertRaises(KintoException) as e:
            get_stored_hash(self.client)
        self.assertEqual(e.exception.response.status_code, 404)


class TestPrepareDafsa(unittest.TestCase):
    def test_file_is_created_in_output_folder(self):
        with tempfile.TemporaryDirectory() as tmp:
            output_binary_path = prepare_dafsa(tmp)
            self.assertIn(os.path.basename(output_binary_path), os.listdir(tmp))
            self.assertGreater(os.path.getsize(output_binary_path), 0)

    def test_exception_is_raised_when_process_returns_non_zero(self):
        with tempfile.TemporaryDirectory() as tmp:
            with mock.patch("subprocess.Popen") as mocked:
                mocked.return_value.returncode = 1
                with self.assertRaises(Exception) as e:
                    prepare_dafsa(tmp)
                self.assertIn("DAFSA Build Failed", str(e.exception))


class TestRemoteSettingsPublish(unittest.TestCase):
    def setUp(self):
        server = "https://fake-server.net/v1"
        auth = ("arpit73", "pAsSwErD")
        self.client = Client(
            server_url=server, auth=auth, bucket=BUCKET_ID, collection=COLLECTION_ID
        )
        record_uri = server + self.client.get_endpoint(
            "record", id=RECORD_ID, bucket=BUCKET_ID, collection=COLLECTION_ID
        )
        self.collection_uri = server + self.client.get_endpoint(
            "collection", bucket=BUCKET_ID, collection=COLLECTION_ID
        )
        self.attachment_uri = f"{record_uri}/attachment"

    @responses.activate
    def test_record_was_posted(self):
        responses.add(
            responses.POST,
            self.attachment_uri,
            json={"data": {"commit-hash": "fake-commit-hash"}},
        )
        responses.add(
            responses.PATCH, self.collection_uri, json={"data": {"status": "to-review"}}
        )

        with tempfile.TemporaryDirectory() as tmp:
            dafsa_filename = f"{tmp}/dafsa.bin"
            with open(dafsa_filename, "wb") as f:
                f.write(b"some binary data")
            remote_settings_publish(self.client, "fake-commit-hash", dafsa_filename)

        self.assertEqual(len(responses.calls), 2)

        self.assertEqual(responses.calls[0].request.url, self.attachment_uri)
        self.assertEqual(responses.calls[0].request.method, "POST")

        self.assertEqual(responses.calls[1].request.url, self.collection_uri)
        self.assertEqual(responses.calls[1].request.method, "PATCH")


class TestPublishDafsa(unittest.TestCase):
    def setUp(self):
        self.event = {
            "server": "https://fake-server.net/v1",
            "auth": "arpit73:pAsSwErD",
        }
        client = Client(
            server_url=self.event.get("server"),
            auth=("arpit73", "pAsSwErD"),
            bucket=BUCKET_ID,
            collection=COLLECTION_ID,
        )
        self.record_uri = self.event.get("server") + client.get_endpoint(
            "record", id=RECORD_ID, bucket=BUCKET_ID, collection=COLLECTION_ID
        )

        mocked = mock.patch("commands.publish_dafsa.prepare_dafsa")
        self.addCleanup(mocked.stop)
        self.mocked_prepare = mocked.start()

        mocked = mock.patch("commands.publish_dafsa.remote_settings_publish")
        self.addCleanup(mocked.stop)
        self.mocked_publish = mocked.start()

    @responses.activate
    def test_prepare_and_publish_are_not_called_when_hashes_matches(self):
        responses.add(
            responses.GET, COMMIT_HASH_URL, json=[{"sha": "fake-commit-hash"}]
        )
        responses.add(
            responses.GET,
            self.record_uri,
            json={"data": {"commit-hash": "fake-commit-hash"}},
        )

        publish_dafsa(self.event, context=None)

        self.assertFalse(self.mocked_prepare.called)
        self.assertFalse(self.mocked_publish.called)

    @responses.activate
    def test_prepare_and_publish_are_called_when_hashes_do_not_match(self):
        responses.add(
            responses.GET, COMMIT_HASH_URL, json=[{"sha": "fake-commit-hash"}]
        )
        responses.add(
            responses.GET,
            self.record_uri,
            json={"data": {"commit-hash": "different-fake-commit-hash"}},
        )

        publish_dafsa(self.event, context=None)

        self.assertTrue(self.mocked_prepare.called)
        self.assertTrue(self.mocked_publish.called)
```

A reviewer noted that once the HTTP traffic is intercepted with the responses library, assertions can be made directly on the intercepted calls (count, method, URL); the `responses.calls` assertions in test_record_was_posted above came out of that resolved suggestion.
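test_get_latest_hash_returns_sha1_hash above relies on GitHub commit ids being 40-character SHA-1 hex digests. That length invariant is easy to confirm with the stdlib alone (the input bytes here are arbitrary):

```python
import hashlib

# Any SHA-1 digest rendered as hex is exactly 40 characters, which is what
# the test asserts about the GitHub "sha" field.
digest = hashlib.sha1(b"public_suffix_list.dat").hexdigest()
print(len(digest))  # -> 40
```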