Use aiobotocore 2.2.0 to support assume role credentials #157
Relates to: #156
    Bucket=self._bucket,
    Key=self._clean_key(path),
)
async with self.session.create_client('s3', region_name=self.region_name,
Did you test this with a load-intensive process? We had performance issues in the past without the singleton client
I did some tests with the hey client at the time and didn't hit any bad behavior (but I didn't keep the results :( )
I remember that the client was a coroutine itself ... but I'll run some tests and paste the results here to help with the analysis.
Originally, I was worried about opening the payload at https://github.com/thumbor-community/aws/pull/157/files#diff-8c5f6e09db7784ddba2fc0a87e8c9e5436275868ae07088bc1f5a1c888c45224R74-R81, but your point about the client is a valid concern as well.
I'll be back soon with the hey results.
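For reference, the "singleton client" idea being discussed can be sketched independently of aiobotocore: open one client from an async context manager and hand the same instance to every request, instead of entering `create_client(...)` on each call. This is only a sketch under assumptions: `SharedClient` and `fake_client` are hypothetical names, and the fake factory stands in for `self.session.create_client('s3', ...)` so that the example runs without AWS.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager


class SharedClient:
    """Lazily open one client from an async-context-manager factory and reuse it."""

    def __init__(self, factory):
        # In the plugin, the factory would be something like:
        #   lambda: session.create_client('s3', region_name=...)
        self._factory = factory
        self._stack = AsyncExitStack()
        self._client = None
        self._lock = asyncio.Lock()

    async def get(self):
        # The lock keeps concurrent first callers from opening several clients.
        async with self._lock:
            if self._client is None:
                self._client = await self._stack.enter_async_context(self._factory())
        return self._client

    async def close(self):
        # Exits the client's context manager, releasing its connections.
        await self._stack.aclose()
        self._client = None


async def demo(requests=50):
    opened = []

    @asynccontextmanager
    async def fake_client():  # hypothetical stand-in for a real S3 client
        opened.append(1)
        yield object()

    shared = SharedClient(fake_client)
    clients = await asyncio.gather(*(shared.get() for _ in range(requests)))
    await shared.close()
    # (number of clients opened, number of distinct client objects handed out)
    return len(opened), len({id(c) for c in clients})
```

Running `asyncio.run(demo())` shows how many clients were opened for 50 concurrent callers. Whether this matters in practice depends on how expensive client creation is under load, which is what the benchmarks in this thread try to measure.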
Are there any performance-testing guidelines or baseline numbers to compare against?
Here is the information about the tests that I made.
In my case, my CPU is the following:
processor : 23
vendor_id : AuthenticAMD
cpu family : 23
model : 113
model name : AMD Ryzen 9 3900X 12-Core Processor
stepping : 0
microcode : 0x8701021
cpu MHz : 2456.247
cache size : 512 KB
physical id : 0
siblings : 24
core id : 14
cpu cores : 12
apicid : 29
initial apicid : 29
fpu : yes
fpu_exception : yes
cpuid level : 16
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip rdpid overflow_recov succor smca sme sev sev_es
bugs : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass
bogomips : 7585.13
TLB size : 3072 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 43 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]
I ran all the tests using Docker on a Linux machine, and each test was run three times. In my scenario, the S3 bucket is served through MinIO and there is a Redis cache:
version: '3'
volumes:
  redis-data:
  s3-data:
  thumbor-data:
  thumbor-logs:
services:
  s3:
    image: gcavalcante8808/minio-dev:latest
    environment:
      MINIO_ROOT_USER: minio
      MINIO_ROOT_PASSWORD: minio123
      MINIO_INITIAL_BUCKET: default
      MINIO_INITIAL_BUCKET_PERMISSION: none
    volumes:
      - s3-data:/data
  thumbor:
    image: apsl/thumbor:latest
    volumes:
      - thumbor-data:/data
      - thumbor-logs:/logs
    environment:
      - DETECTORS=['thumbor.detectors.queued_detector.queued_complete_detector']
      - STORAGE=thumbor.storages.mixed_storage
      - REDIS_STORAGE_SERVER_HOST=redis
      - REDIS_STORAGE_SERVER_PORT=6379
      - REDIS_STORAGE_SERVER_DB=0
      - REDIS_QUEUE_SERVER_HOST=redis
      - REDIS_QUEUE_SERVER_PORT=6379
      - REDIS_QUEUE_SERVER_DB=0
      - MIXED_STORAGE_DETECTOR_STORAGE=tc_redis.storages.redis_storage
      - S3_USE_SIGV4=false
      - LOADER=tc_aws.loaders.s3_loader
      - TC_AWS_REGION=us-east-1
      - TC_AWS_LOADER_BUCKET=default
      - TC_AWS_ENDPOINT="http://s3:9000"
      - AWS_ACCESS_KEY_ID=minio
      - AWS_SECRET_ACCESS_KEY=minio123
    ports:
      - 8080:8000
  new:
    image: thumbor:dev
    build: thumbor/
    volumes:
      - thumbor-data:/data
      - thumbor-logs:/logs
      - ./thumbor/thumbor.conf:/usr/src/thumbor.conf
    command:
      - thumbor
      - -c
      - /usr/src/thumbor.conf
    environment:
      - DETECTORS=['thumbor.detectors.queued_detector.queued_complete_detector']
      - STORAGE=thumbor.storages.mixed_storage
      - REDIS_STORAGE_SERVER_HOST=redis
      - REDIS_STORAGE_SERVER_PORT=6379
      - REDIS_STORAGE_SERVER_DB=0
      - REDIS_QUEUE_SERVER_HOST=redis
      - REDIS_QUEUE_SERVER_PORT=6379
      - REDIS_QUEUE_SERVER_DB=0
      - MIXED_STORAGE_DETECTOR_STORAGE=tc_redis.storages.redis_storage
      - S3_USE_SIGV4=false
      - LOADER=tc_aws.loaders.s3_loader
      - TC_AWS_REGION=us-east-1
      - TC_AWS_LOADER_BUCKET=default
      - TC_AWS_STORAGE_BUCKET=default
      - TC_AWS_ENDPOINT="http://s3:9000"
      - AWS_ACCESS_KEY_ID=minio
      - AWS_SECRET_ACCESS_KEY=minio123
    ports:
      - 9999:8888
  redis:
    image: redis:latest
    volumes:
      - redis-data:/data
Below are the results for both thumbor 6.3 and thumbor 7.0.7 with the new plugin.
Thumbor 6.3.0 using apsl/thumbor docker image
This is an old but functional image (though it carries many critical CVEs) with the following packages:
appdirs==1.4.3
backports-abc==0.5
boto==2.42.0
botocore==1.2.12
certifi==2017.4.17
colour==0.1.3
contextlib2==0.5.4
dateutils==0.6.6
derpconf==0.8.1
docutils==0.13.1
envtpl==0.4.1
futures==3.1.1
graphicsmagick-engine==0.1.1
itty==0.8.2
Jinja2==2.9.6
jmespath==0.9.2
libthumbor==1.3.2
MarkupSafe==1.0
numpy==1.11.0
opencv-engine==1.0.1
packaging==16.8
pexif==0.15
pgmagick==0.6.1
Pillow==3.4.2
pycrypto==2.6.1
pycurl==7.43.0
pylibmc==1.5.2
pymongo==3.4.0
pyparsing==2.2.0
pyremotecv==0.5.0
pyres==1.2
pystache==0.5.4
python-dateutil==2.6.0
pytz==2017.2
raven==5.15.0
redis==2.10.5
remotecv==2.2.1
requests==2.13.0
setproctitle==1.1.10
shortuuid==0.5.0
simplejson==3.10.0
singledispatch==3.4.0.3
six==1.10.0
statsd==3.2.1
tc-aws==6.0.2
tc-core==0.4.0
tc-mongodb==5.1.0
tc-redis==1.0.1
tc-shortener==0.2.2
thumbor==6.3.0
thumbor-memcached==5.1.0
tornado==4.5
tornado-botocore==1.1.0
virtualenv==15.1.0
The command hey -c 100 -z 30s http://localhost:8080/unsafe/300x200/smart/0864bf97-8369-42d7-ad8c-449541ea541c-original.png, which simulates 100 concurrent clients for 30 seconds, yielded the following results:
Summary:
Total: 33.6502 secs
Slowest: 4.5284 secs
Fastest: 0.2421 secs
Average: 3.8583 secs
Requests/sec: 24.5764
Total data: 28952443 bytes
Size/request: 35009 bytes
Response time histogram:
0.242 [1] |
0.671 [3] |
1.099 [12] |■
1.528 [8] |■
1.957 [12] |■
2.385 [11] |■
2.814 [11] |■
3.242 [9] |■
3.671 [20] |■■
4.100 [512] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
4.528 [228] |■■■■■■■■■■■■■■■■■■
Latency distribution:
10% in 3.6563 secs
25% in 3.9749 secs
50% in 4.0372 secs
75% in 4.1039 secs
90% in 4.1648 secs
95% in 4.1994 secs
99% in 4.3036 secs
Details (average, fastest, slowest):
DNS+dialup: 0.0017 secs, 0.2421 secs, 4.5284 secs
DNS-lookup: 0.0008 secs, 0.0000 secs, 0.0317 secs
req write: 0.0000 secs, 0.0000 secs, 0.0011 secs
resp wait: 3.8565 secs, 0.2397 secs, 4.5265 secs
resp read: 0.0000 secs, 0.0000 secs, 0.0001 secs
Status code distribution:
[200] 827 responses
During the tests, CPU usage was 100% (1 CPU) and RAM usage was about 160MB in the first run, then grew by ~20MB on each test round, which may indicate some sort of memory leak.
Thumbor 7.0.7
This image has the following packages:
aiobotocore==0.12.0
aiohttp==3.8.1
aioitertools==0.10.0
aiosignal==1.2.0
async-timeout==4.0.2
attrs==21.4.0
botocore==1.15.15
cairocffi==1.3.0
CairoSVG==2.5.2
certifi==2021.10.8
cffi==1.15.0
cfgv==3.3.1
charset-normalizer==2.0.12
colorful==0.5.4
cssselect2==0.6.0
defusedxml==0.7.1
Deprecated==1.2.13
derpconf==0.8.3
distlib==0.3.4
docutils==0.15.2
filelock==3.6.0
frozenlist==1.3.0
identify==2.4.12
idna==3.3
jmespath==0.10.0
libthumbor==2.0.2
multidict==6.0.2
nodeenv==1.6.0
numpy==1.22.3
opencv-python-headless==4.5.5.64
packaging==21.3
Pillow==9.1.0
platformdirs==2.5.2
pre-commit==2.18.1
py3exiv2==0.7.1
pycparser==2.21
pycurl==7.45.1
pyparsing==3.0.8
pyres==1.5
python-dateutil==2.8.2
pytz==2022.1
PyYAML==6.0
redis==4.2.2
remotecv @ git+https://github.com/thumbor/remotecv@58f46eaa8ffe4e83c5afe2ea04397da8d8834a7b
sentry-sdk==0.14.4
setproctitle==1.2.3
simplejson==3.17.6
six==1.16.0
socketfromfd==0.2.0
statsd==3.3.0
tc-aws==7.0b0
tc-redis @ git+https://github.com/thumbor-community/redis@e4dea465e1f388173083143dbc0942caa143ef48
thumbor==7.0.7
tinycss2==1.1.1
toml==0.10.2
tornado==6.1
typing-extensions==4.2.0
urllib3==1.25.11
virtualenv==20.14.1
webcolors==1.11.1
webencodings==0.5.1
wrapt==1.14.0
yarl==1.7.2
The command hey -c 100 -z 30s http://localhost:9999/unsafe/300x200/smart/0864bf97-8369-42d7-ad8c-449541ea541c-original.png, which simulates 100 concurrent clients for 30 seconds, yielded the following results:
Summary:
Total: 32.8143 secs
Slowest: 3.1669 secs
Fastest: 0.0409 secs
Average: 2.6795 secs
Requests/sec: 35.7161
Total data: 41030548 bytes
Size/request: 35009 bytes
Response time histogram:
0.041 [1] |
0.353 [10] |
0.666 [11] |
0.979 [10] |
1.291 [11] |
1.604 [11] |
1.917 [11] |
2.229 [12] |
2.542 [11] |
2.854 [978] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
3.167 [106] |■■■■
Latency distribution:
10% in 2.7035 secs
25% in 2.7624 secs
50% in 2.7868 secs
75% in 2.8109 secs
90% in 2.8479 secs
95% in 2.8790 secs
99% in 2.9021 secs
Details (average, fastest, slowest):
DNS+dialup: 0.0011 secs, 0.0409 secs, 3.1669 secs
DNS-lookup: 0.0004 secs, 0.0000 secs, 0.0288 secs
req write: 0.0000 secs, 0.0000 secs, 0.0009 secs
resp wait: 2.6784 secs, 0.0388 secs, 3.1652 secs
resp read: 0.0001 secs, 0.0000 secs, 0.0002 secs
Status code distribution:
[200] 1172 responses
This time, RAM usage stayed around 82MB and did not change across the other test rounds.
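For a quick side-by-side of the two runs, the summary figures above can be put through a few lines of Python. This is just arithmetic on the numbers hey reported: measured throughput is responses divided by total time, and dividing the concurrency by the average latency (Little's law) gives a rough sanity check on it. The labels and helper names are mine, not part of hey's output.

```python
CONCURRENCY = 100  # from `hey -c 100`

# Figures copied from the two hey summaries in this thread.
runs = {
    "thumbor 6.3.0": {"responses": 827, "total_s": 33.6502, "avg_s": 3.8583},
    "thumbor 7.0.7": {"responses": 1172, "total_s": 32.8143, "avg_s": 2.6795},
}


def throughput(run):
    """Measured requests per second over the whole run."""
    return run["responses"] / run["total_s"]


def littles_law_estimate(run, concurrency=CONCURRENCY):
    """Requests per second expected with `concurrency` requests always in flight."""
    return concurrency / run["avg_s"]


old, new = runs["thumbor 6.3.0"], runs["thumbor 7.0.7"]
speedup = throughput(new) / throughput(old)
latency_drop = 1 - new["avg_s"] / old["avg_s"]

for name, run in runs.items():
    print(f"{name}: {throughput(run):.2f} req/s measured, "
          f"{littles_law_estimate(run):.2f} req/s estimated at {CONCURRENCY} in flight")
print(f"throughput ratio: {speedup:.2f}x, average latency reduction: {latency_drop:.1%}")
```

On these numbers the 7.0.7 run handles roughly 1.45 times the throughput of the 6.3.0 run, with the average latency about 30% lower. Both runs are bound by a single CPU on a local MinIO setup, so the ratio says little about behavior against real S3.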
Thanks for the performance tests.
Our earlier load testing ran on "live" servers with a lot of simulated users, so the numbers are not really comparable, but a wait time of ~3-4 secs here looks quite slow.
Ideally we should do some load testing on a live server with AWS S3 and compare before (Thumbor 6) / current (Thumbor 7 with the latest tc_aws, without your PR) / after (Thumbor 7 + your PR) to check the improvements.

@gcavalcante8808 hi, did you try this on a live server? Did it handle the load properly?

@Bladrak Well, since I ended up leaving the company I was working for, I can only speak for the period I was there: it ran in production for 6 months without complications =D

Ok great :) Would you mind switching the target branch to master and rebasing this? I think we will be able to merge it now!

setup.py
Outdated
'python-dateutil>=2.8',
'thumbor>=7.0.0a2,<8',
'aiobotocore==2.2.0',
'boto3>=1.9,<1.13',
You'll need to update those versions to match what aiobotocore needs. I've restricted the range to reduce the build time.
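One way to find the ranges to match is to read the requirements that the installed aiobotocore distribution declares for botocore and boto3. A minimal sketch using only the standard library; `declared_requirements` is a hypothetical helper, and it returns an empty list when the distribution is not installed in the current environment.

```python
import re
from importlib import metadata


def declared_requirements(dist_name, only=("botocore", "boto3")):
    """Return the requirement strings `dist_name` declares for the packages in `only`."""
    try:
        requirements = metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return []
    wanted = []
    for requirement in requirements:
        # The project name is everything before the first version/extra/marker character.
        name = re.split(r"[\s;<>=!~\[(]", requirement, maxsplit=1)[0].lower()
        if name in only:
            wanted.append(requirement)
    return wanted


if __name__ == "__main__":
    for line in declared_requirements("aiobotocore"):
        print(line)
```

The printed strings can then be compared with the `boto3` range in setup.py. Note that boto3 is typically declared as an optional extra, so its line may carry an `extra == ...` marker.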

@gcavalcante8808 there seems to be an issue with Circle CI from your fork. It may be related to https://circleci.com/docs/oss/#build-pull-requests-from-forked-repositories
Can you check that this is set up correctly for your fork, and update it if need be?

Hi @gcavalcante8808 could you check out the CircleCI issue?

Any updates on this? I would also like to use assumeRole via AWS Web Identity Token, and it looks like the old boto version does not support it. I am currently getting the following errors.

Hi @oliverschewe it seems this PR is a bit outdated. If you'd like to take over the PR and submit an updated one, I'll be happy to review.

Should be fixed in https://github.com/thumbor-community/aws/releases/tag/7.0.3

Scenario
This PR bumps aiobotocore to its latest version (2.2.0), which allows thumbor-aws to authenticate using IAM Roles/WebIdentity credentials.
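For the WebIdentity case, botocore's credential chain reads the role and token location from the `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` environment variables. A small pre-flight check along these lines can help confirm the container is set up before starting thumbor. It is only a sketch: `web_identity_problems` is a hypothetical helper that is not part of this PR, and it only checks that the variables are set and the token file exists, not that the role can actually be assumed.

```python
import os

# Environment variables botocore uses for assume-role-with-web-identity.
REQUIRED = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def web_identity_problems(env=None):
    """Return a list of human-readable problems; an empty list means the basics look fine."""
    env = os.environ if env is None else env
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]
    token_path = env.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    if token_path and not os.path.isfile(token_path):
        problems.append(f"token file {token_path!r} does not exist")
    return problems


if __name__ == "__main__":
    for problem in web_identity_problems():
        print("web identity check:", problem)
```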
What has been Done