-
Notifications
You must be signed in to change notification settings - Fork 26.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(image-optimizer): use previously cached image cache entry if upstream image is identical #67257
base: canary
Are you sure you want to change the base?
feat(image-optimizer): use previously cached image cache entry if upstream image is identical #67257
Conversation
Allow CI Workflow Run
Note: this should only be enabled once the PR is ready to go and can only be enabled by a maintainer |
isStatic: boolean, | ||
xCache: XCacheHeader, | ||
imagesConfig: ImageConfigComplete, | ||
maxAge: number, | ||
isDev: boolean | ||
) { | ||
const contentType = getContentType(extension) | ||
const etag = getHash([buffer]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are storing the etag already, shouldn't we use that instead of recalculating the etag
? This can be reverted to use getImageTag
.
bda2f1a
to
a1f627d
Compare
@styfle Hi, I hate to be that guy who randomly pings maintainers, but could you look at this to check if this is a wanted feature? For instance, our site becomes slow and unresponsive after a stale cache. We have minimumCacheTTL configured for an hour, but many people might not know about it, and IMO, this would benefit people in general. |
Failing test suitesCommit: a1f627d
Expand output● ReactRefreshRequire › re-runs accepted modules
● ReactRefreshRequire › propagates a hot update to closest accepted module
● ReactRefreshRequire › propagates a module that stops accepting in next version
Read more about building and testing Next.js in contributing.md.
Expand output● app-dir - params hooks compat › should only access search params with useSearchParams
Read more about building and testing Next.js in contributing.md. |
'"url" parameter is valid but image type is not allowed' | ||
) | ||
} | ||
if ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this svg logic moved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like this a leftover, or don't remember anymore why I moved it, seems like there's no point in doing that. Reverted this now
return ( | ||
previousCacheEntry?.value?.kind === 'IMAGE' && | ||
// If the cached image is optimized | ||
previousCacheEntry.value.upstreamEtag !== previousCacheEntry.value.etag && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why upstreamEtag should not be equal to etag?
This seems like it will always return true when not equal and therefore it will never actually revalidate, which seems wrong to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What this PR tries to do is prevent hitting this specific path, with optimizeImage
is the expensive operation.
upstreamEtag
represents the original source image's etag (currently calculated via the same function getImageEtag
), whereas the cache entry's etag
represents the resulting image's etag
, i.e. in the case image optimization is done, the etag
will be the hash of that optimized image instead of the upstream image.
This condition exists to cause a revalidation on cases where the optimization did not happen previously (in this case we fallback to using upstreamBuffer
and upstreamEtag
for the image), so this will fail early on those cases and cause a new optimization attempt. Although now thinking back at it the likelyhood of the image optimization ending up with the same outcome is quite likely anyways.
When this condition is true, we'll check the next condition, comparing the previous upstream image's etag against the current upstream image's etag to determine if we should revalidate or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The optimization not happening could be because of a fallback (such as an error was thrown) or it could be because the image was vector form (svg doesnt need to be optimized).
Either way, we still need to revalidate those images because the upstream response could have changed since the previous request. So I suggest removing line 463 and that way the following line (465) can actually compare the new upstream etag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I'm not quite following, as the function is shouldUsePreviouslyCachedEntry
, all of the conditions inside it need to return true for the revalidation not to occur.
This is how I see it:
- If this comparison is false, we will revalidate the image
- If this comparison is true, we will do the etag comparison (line 465) to determine if we want to revalidate the image.
By revalidate, I mean that the previous entry is not used and we'll fallback to previous behaviour
Am I misunderstanding something? What do you mean by revalidation? I'm fine with removing this line, but don't fully understand the reasoning 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what is the scenario where the line is needed? Is there a test for this?
I can't see your link anymore because you force pushed.
(note: you don't need to force push because we squash on merging)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see now. I read through this again and I see that the line is handling the case when svg or animated. I think the comment should mention this.
// If the cached image is optimized | ||
previousCacheEntry.value.upstreamEtag !== previousCacheEntry.value.etag && | ||
// and the upstream etag is the same as the previous cache entry's | ||
getImageEtag(upstreamImage.buffer) === previousCacheEntry.value.upstreamEtag |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could improve performance here by using the etag from the upstream response header and fallback to the hash if its missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought of using the upstream response header at first, but didn't like how the code turned out, or because of the weak validation syntax (it can start with W/
, the etag value would need to be base64-encoded correctly to prevent any errors.
What I had previously was alter ImageUpstream
to include etag
and then get it from the response and have the hash as fallback. Didn't quite like the implementation then, but writing an ETag extractor helper function would probably make it a bit nicer and make it possible to write tests for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added this now along with tests, not 100% happy with the tests but they should still be sufficient I think.
a1f627d
to
39d6d5a
Compare
54d55bd
to
3c09907
Compare
3c09907
to
fba6061
Compare
it('should return base64url encoded etag if etag is provided', () => { | ||
const etag = 'sample-etag' | ||
const result = extractEtag(etag, Buffer.from('')) | ||
expect(result).toMatch(/^([a-z]|[A-Z]|[0-9]|-|_)+$/) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use exact string comparison instead of regex here?
it('should not return weak etag identifier if etag is provided', () => { | ||
const etag = 'W/"sample-etag"' | ||
const result = extractEtag(etag, Buffer.from('')) | ||
expect(result).not.toMatch(/^W\//) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, can we use exact string comparison instead of regex here?
f2acee3
to
cd61981
Compare
…tream image is identical
…shing if available
cd61981
to
f406464
Compare
etag: string | null | undefined, | ||
imageBuffer: Buffer | ||
) { | ||
if (!!etag) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (!!etag) { | |
if (etag) { |
expect(result).not.toContain('/') | ||
expect(result).toMatch('Vy8ic2FtcGxlLWV0YWci') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expect(result).not.toContain('/') | |
expect(result).toMatch('Vy8ic2FtcGxlLWV0YWci') | |
expect(result).toEqual('Vy8ic2FtcGxlLWV0YWci') |
it('should return base64url encoded etag if etag is provided', () => { | ||
const etag = 'sample-etag' | ||
const result = extractEtag(etag, Buffer.from('')) | ||
expect(result).toMatch('c2FtcGxlLWV0YWc') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expect(result).toMatch('c2FtcGxlLWV0YWc') | |
expect(result).toEqual('c2FtcGxlLWV0YWc') |
const maxAge = getMaxAge(imageUpstream.cacheControl) | ||
|
||
if (shouldUsePreviouslyCachedEntry(imageUpstream, previousCacheEntry)) { | ||
// Return the cached optimized image |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Return the cached optimized image |
You can remove this comment.
Alternatively, you can change shouldUsePreviouslyCachedEntry()
from returning a boolean to returning the result and naming it something like getPreviouslyCachedOrNull()
// Return the cached optimized image | ||
return { | ||
buffer: previousCacheEntry.value.buffer, | ||
contentType: paramsResult.mimeType, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure the contentType
is correct here?
We have a lot of logic to determine the contentType below so perhaps we should move the cached entry check below this:
next.js/packages/next/src/server/image-optimizer.ts
Lines 636 to 649 in ca73527
let contentType: string | |
if (mimeType) { | |
contentType = mimeType | |
} else if ( | |
upstreamType?.startsWith('image/') && | |
getExtension(upstreamType) && | |
upstreamType !== WEBP && | |
upstreamType !== AVIF | |
) { | |
contentType = upstreamType | |
} else { | |
contentType = JPEG | |
} |
What?
getHash
-function to returnbase64url
instead of custom variant of standardbase64
. This can be removed if deemed not necessary.getImageEtag
-function to do it) to happen insideimage-optimizer
for better consistency.Why?
Currently when an image is requested and it becomes stale, the server will trigger a full image optimization for that image, using a significant spike in CPU-usage for routes that have multiple images (e.g. an image carousel).
How?
By calculating and storing the upstream etag in the cache entry, we can utilize the previously cached entry to check if the upstream image has remained the same, making it possible to reuse the previously cached entry again, as the image optimization would produce the same results anyways.