Conversation
Kenneth-W-Chen
commented
Jun 28, 2024
- Created 2 tables to cache API request responses and a config for each cache
- So far, only used for Printful API (excludes orders)
- Current PHP functions let you get cached content from the table, add stuff to the cache, clear out the cache (or remove one item from the cache), and add/update cache configs
| $isOrder =substr($URI, 0, 6) == 'orders'; | ||
| if(!$isOrder){ // check cache, ignore orders because they shouldn't be cached | ||
| $r = getCached($endpoint); | ||
| if($r!== null){ // endpoint is in cache |
There was a problem hiding this comment.
I'd like for the api-cache to be able to handle the cache checking, returning and insertion without needing to implement this everywhere we use the cache. Could the getCached method take a instance of $ch here where the client just sets the curlopts and it sends the request if needed?
There was a problem hiding this comment.
Changed in 7cb4d48. Only caveat is that I made a new object to encapsulate some cURL info and return the object instead of just the content. I only did that because the PrintfulCustomAPIExceptions checked the HTTP response code, which isn't 200 if a cached item is returned (because curl_exec wasn't called).
I can have it pass the result only if we switch to throwing generic exceptions instead. Or we could pass the exception type name as a string in the parameter? Let me know if you want one of these instead of the current implementation
| if ($r) { | ||
| $id = $r['id']; | ||
| $queryString = 'UPDATE api_cache | ||
| SET last_successfully_retrieved = UTC_TIMESTAMP, |
There was a problem hiding this comment.
You might be able to get fancy if you want and set the DB column to auto update the last_successfully_retrieved timestamp each time the content column updates -- not required at all, just in case you wanted to experiement with that
There was a problem hiding this comment.
I can't get ON UPDATE to work with UTC_TIMESTAMP. Apparently, it's a bug https://bugs.mysql.com/bug.php?id=103228
| if ($config_name) { | ||
| return $db->query('INSERT INTO outgoing_request_cache_config (ttl, config_name) VALUES (' . $ttl . ', "' . $db->real_escape_string($config_name) . '")'); | ||
| } | ||
| return $db->query('INSERT INTO outgoing_request_cache_config (ttl) VALUES (' . $ttl . ')'); |
There was a problem hiding this comment.
config_name is always required, we should not allow it to be inserted empty, in fact, we should update the SQL to make it a non nullable column
| $configId = $db->query('SELECT * FROM outgoing_request_cache_config WHERE config_name = "' . $db->real_escape_string($config) . '"')->fetch_array(MYSQLI_ASSOC)['id']; | ||
| } | ||
| $queryString = "INSERT INTO api_cache (endpoint, last_successfully_retrieved, last_attempted_retrieval, config_id, content) | ||
| values ('" . $db->real_escape_string($endpoint) . "', UTC_TIMESTAMP, |
There was a problem hiding this comment.
I'm used to using CURRENT_TIMESTAMP for things like this, out of curiosity, is that just an alias for UTC_TIMESTAMP or is there actually a timezone difference between those two built in methods?
There was a problem hiding this comment.
CURRENT_TIMESTAMP returns the timestamp in the server's timezone. I was using UTC since I wasn't sure what the timezone on the server was, and I needed UTC since time() in PHP is a Unix timestamp (which is UTC).
For similar reasons, I was appending UTC to the retrieval time when converting last_successfully_retrieved to a Unix timestamp. I think strtotime() uses the server's timezone by default
| @@ -1,5 +1,5 @@ | |||
| <?php | |||
|
|
|||
| require(BASE . '/api/api-cache.php'); | |||
There was a problem hiding this comment.
Is it a guarantee that BASE is defined here? I think you might have just been lucky that BASE is also provided upstream, and you should actually be importing the template/top first here.
FYI, php will only re-import the template/top here if it wasn't already imported upstream.
| } | ||
| return false; | ||
| } | ||
| return null; |
There was a problem hiding this comment.
Oh the beauty of PHP's dynamic and mixed types. I miss it.
| $config = $db->query('SELECT * FROM outgoing_request_cache_config WHERE id = ' . $r['config_id'])->fetch_array(MYSQLI_ASSOC); | ||
| $ttl = $config['ttl']; | ||
| $now = time(); | ||
| $retrievalTime = strtotime($r['last_successfully_retrieved'] . 'UTC'); |
There was a problem hiding this comment.
You've probably already tested this but out of interest, what happens here if/when last_successfully_retrieved is null?
There was a problem hiding this comment.
It resolves to the current time. I changed it in 7cb4d48
| PRIMARY KEY (id), | ||
| FOREIGN KEY (config_id) REFERENCES outgoing_request_cache_config(id) | ||
| )AUTO_INCREMENT=8 DEFAULT CHARSET=latin1; | ||
| )AUTO_INCREMENT=8 DEFAULT CHARSET=utf8; |
There was a problem hiding this comment.
Some of the symbols in some Printful API responses are UTF-8. One of the errors lists \\xE2\\x80\\xB3\\xE2\\x80\\x93... as an issue. It might actually be a different charset, but given that it's not throwing errors over it anymore, I think UTF-8 will be fine.
There was a problem hiding this comment.
HTTP is always ISO-8859-1 charset, we can transform that into whatever client side charset we feel like. Latin1 should be fine, but if not we can update it to utf-8 if there's a specific problem.
There was a problem hiding this comment.
The issue was inserting response contents into the table. MySQL was giving an "incorrect string value" error for some products. The character I mentioned above is the double prime character which I don't think is present in ISO-8859-1. It's in the product description for every hat product
There was a problem hiding this comment.
Interesting, the curl library must be converting to UTF-8 instead of the standard ISO-8859-1
| require_once('../template/top.php'); | ||
|
|
||
| // RegEx to match certain endpoints we want to ignore, e.g., orders | ||
| const IGNORED_ENDPOINTS = '/(?:orders)/i'; |
There was a problem hiding this comment.
There's no need to have ignored endpoints, each entry in the db table would be a specific endpoint so if we want to ignore it we just won't make an entry for it.
There was a problem hiding this comment.
I think maybe the confusion here is that you have the code adding a cache entry automatically if its not found, we should not do this. Cache details need to be manually added to the database, if they don't exist and the retrieve cache method is called for that endpoint, we should just throw an error.
If we allow any call to create a cache entry, we can't enforce sensible TTLs, we'll get erroneously created entries, each time an endpoint or API base changes then suddenly we'll get duplicate config entries. We should stay on top of these things manually.
There was a problem hiding this comment.
Adjusted in c289fad, specifically, here (line 73) and here (line 87)
| /** | ||
| * @var bool $curl_executed true if curl_exec was called, false otherwise | ||
| */ | ||
| public $curl_executed; |
There was a problem hiding this comment.
curl_executed is a bad variable name, it could suggest executed is true anytime there isn't an error for example and in the future we may not stay with curl, be as verbose and specific as possible: perhaps, fetched_new_content or something
There was a problem hiding this comment.
Changed in c289fad. Also changed the respective parameter in the constructor
| } | ||
| $result = curl_exec($ch); | ||
| // add to cache if no errors and HTTP OK | ||
| if (!curl_errno($ch) && curl_getinfo($ch, CURLINFO_HTTP_CODE) == 200) { |
There was a problem hiding this comment.
success error codes are anything between 200 and 299, inclusive. for example, 201 and 204 are common success codes alongside 200
There was a problem hiding this comment.
| class PrintfulCustomAPI { | ||
| private $api_key; | ||
|
|
||
| private const PRINTFUL_CACHE_CONFIG_ID = 1; |
There was a problem hiding this comment.
Config ID should be unnecessary in the code, we can look everything up using the endpoint URL.
| PRIMARY KEY (id), | ||
| FOREIGN KEY (config_id) REFERENCES outgoing_request_cache_config(id) | ||
| )AUTO_INCREMENT=8 DEFAULT CHARSET=latin1; | ||
| )AUTO_INCREMENT=8 DEFAULT CHARSET=utf8; |
There was a problem hiding this comment.
HTTP is always ISO-8859-1 charset, we can transform that into whatever client side charset we feel like. Latin1 should be fine, but if not we can update it to utf-8 if there's a specific problem.
| curl_setopt($ch, CURLOPT_HTTPHEADER, $headers); | ||
|
|
||
| $result = curl_exec($ch); | ||
| $cache_result = get_valid_cache_entry($endpoint, $ch, $this::PRINTFUL_CACHE_CONFIG_ID); |
There was a problem hiding this comment.
Aside from the config ID being provided here, this is exactly what I was looking for 👍
| * | ||
| FROM | ||
| api_cache | ||
| JOIN |
There was a problem hiding this comment.
This might need to be a LEFT JOIN, a standard JOIN I think will return nothing at all if there's no entry in the api_cache table but there is an entry in the outgoing_request_cache_config table (i.e. JOIN expects both to exist to return anything, but the first time a cache is ever called there won't be an api_cache entry just yet)
| if ($retrievalTime !== null && $now - strtotime($retrievalTime . 'UTC') < $ttl) { | ||
| return new CacheResult($r['content']); | ||
| } | ||
| } |
There was a problem hiding this comment.
I think if $q is false we should probably throw an error
There was a problem hiding this comment.
Um, yeah I'd say return null and error log for now.
|
Couple of general notes:
|
|
|
||
| curl_setopt($ch, CURLOPT_URL, 'https://api.printful.com/' . $URI); | ||
| $endpoint = 'https://api.printful.com/' . $URI; | ||
| curl_setopt($ch, CURLOPT_URL, $endpoint); |
There was a problem hiding this comment.
It seems tautological to pass $endpoint into the get_valid_cache_entry method and also set the endpoint before passing this in, we should probably just move this inside of the method.
|
This PR looks good now, can you include the insert SQL for the creation of the configs for printful? |
|
Should I put them in the same file as the table creation or make a separate Also what would be a good TTL? |
|
It should go in the |
|
Oh, I see now that |
This comment has been minimized.
This comment has been minimized.
Changed some SELECTs to grab fewer columns. Added newlines and indents to queries in PHP file to improve readability
This comment has been minimized.
This comment has been minimized.
Rearranged some columns, specified default for columns of type timestamp because of MySQL server settings Parameterized endpoint strings and added helper function to convert parameterized endpoint strings to the full version Add varargs for send_request and api-cache functions Reformatted code
|
60c91fe I changed The specific cause is because the product has 28 variants, and Printful returns info for each variant for the single request. The data is repetitive (e.g., each variant lists if it's available in the US, UK, etc.). I can compress it into a normal zip down to 10kB or into a Edit: Changed it to compress the data in 8c98739 since 20-200kB of JSON isn't human readable (or human parsable) in the first place |