Cover photo: an open laptop showing image editing software (Photo by Radek Grzybowski on Unsplash)

Simple S3 Thumbnail For Images (using custom metadata)

Isaaac
14 min read · Mar 10, 2023

If you have ever used Amazon S3 to store image assets, including some pretty large files, and need to quickly browse them without downloading the full source files (which can be costly in both time and money), you may find the following idea useful. By storing a base64-encoded thumbnail in S3 custom metadata, you can fetch it along with each item’s metadata when listing a bucket (one cheap HEAD request per object) and display an inexpensive preview. The same thumbnail can also serve as an image loading placeholder if you blur it with just one line of CSS.
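To make the placeholder idea concrete, the blur really is a one-liner. Here is a minimal sketch (the class name and blur radius are just illustrative, and the base64 payload is truncated):

<!-- the tiny thumbnail from metadata, blurred while the real image loads -->
<img class="thumb-placeholder" src="data:image/webp;base64,..." />
<style>
  .thumb-placeholder { filter: blur(8px); }
</style>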

It might sound like an odd idea, and I’m not yet convinced it’s the best approach, but it works well in practice. I couldn’t find any literature on this niche topic, so here is my contribution :^)

Alternatives Pros/Cons

First of all, I’m aware that there are already more common methods to create low-resolution image previews:

The CDN / image resize service way

The most common approach is to use an image CDN service like Cloudinary to resize, compress, or otherwise transform your heavy source files on the fly. This way, you can generate thumbnail URLs based on your source file URLs.

However, these methods come at a cost. You will still pay for some S3 egress of your heavy files (although CDN caching can help mitigate this), and you’ll also have to pay for the CDN service or be limited by its free tier.

If you’re already paying for an image CDN service, it may be the simplest solution, as you don’t have to worry about the complexity. You’re paying for a service to do the job for you. However, if money is a concern or you’re not already using an image CDN service, you may find my approach helpful.

The multiple file size generation

The old-school way is to either tweak the API responsible for uploading your files to S3 or build a cron job that scrapes your raw images at fixed intervals to generate one or more versions of your source assets. You can then either store them alongside the original with a predictable name (e.g. {{source_file_name}}-thumbnail.jpg) or store them in a dedicated root folder (e.g. /thumbnails/*).

This approach works well, and it gives you more control over the thumbnail sizes, but it can quickly become challenging to maintain. For example, every time you update or delete the image source, you must propagate the changes to every sub-asset version. Additionally, you have to ensure that all cache rules are the same across all your asset versions to prevent unpredictable behavior.

If you can easily maintain this level of complexity, then this method may be suitable for you. Otherwise, you can continue reading :)

The metadata thumbnail solution

You may already know that you can set custom metadata when you store any object in S3. This feature is useful for storing related data, such as a database ID or an image ratio. I thought, why not store a base64 thumbnail here? Base64 is a stringifiable byte blob representation, so it can easily be stored in a custom metadata field. Plus, it’s straightforward to display in a web environment by adding the following code to the src attribute of an <img/> element:

<img src="data:image/${format};base64,${data}" />

In this code, format refers to the image format (webp, jpeg, png, gif, avif, etc.), and data is the base64 string representation of the image that you may have stored in S3 metadata.

Great! You now understand the main concept. You can implement this technique in any programming language that provides easy access to the AWS SDK and an image processing library. In the following sections, I’ll provide a step-by-step implementation guide using Node.js. I’ll also provide explanations for those who are not familiar with Node.js, the Sharp library, or AWS SDK for S3 manipulations.

If you’re not planning on reading any further, let me warn you of one huge caveat: the total size of custom metadata in S3 is limited to 2KB. Therefore, I recommend using an efficient image compression format like WebP and tweaking the compression quality and thumbnail dimensions to find the sweet spot where you use as much of the remaining space as possible without exceeding the limit (otherwise your request will simply be rejected). Later in this post (see “Target size thumbnail generation”) I provide a code snippet that automatically lowers quality and size until the thumbnail weighs just below a given threshold.

You may also consider a more efficient data representation than base64 for the S3 metadata value, but I’m not sure I understand all the details, and my goal was to keep things simple. With base64 we don’t have to decode anything: just retrieve the string and put it in the image src attribute. That’s it.

Here is a quick summary of what I’ll cover in this post:

- Saving an image to S3 with custom metadata
- Generating a base64 thumbnail with sharp
- Listing all objects together with their metadata
- Bonus: updating existing object metadata, generating a thumbnail for a target byte length, regenerating thumbnails for a whole bucket, and a few helpers to get an image Buffer (from a URL, from disk, or from S3)

Implementation

Save image in S3 with metadata

Here is how you can save an S3 object with custom metadata (e.g. your base64 thumbnail)

import { S3 } from 'aws-sdk';

const s3 = new S3({
  accessKeyId: 'your_access_key',
  secretAccessKey: 'your_secret_key',
});

// assume you already have your main image
const image: Buffer = getImage();

// and its thumbnail
const thumbnail: Buffer = generateThumbnail(image);

const uploadParams = {
  Bucket: 'your_bucket_name',
  Key: 'your_image_name',
  Body: image,
  // ContentType: 'image/jpeg', // hard coded for demo
  // CacheControl: 'max-age=31536000', // think about cache
  Metadata: { // custom metadata!
    'thumbnail': thumbnail.toString("base64")
  }
};

s3.upload(uploadParams).promise()
  .then(() => console.log('Success'))
  .catch(err => console.error('Error', err));

N.B. It’s not possible to update the metadata of an existing object in S3 directly. You have to copy the object over itself with the updated metadata. This may not be a problem unless you rely on the object’s creation date. See the “Update existing object metadata” example in the Bonus section below.

Generate base64 thumbnail with sharp

Next, you need to generate the base64 string that will be used as the custom metadata value in the example above. For this you can use the sharp library. The input imageBuffer in the example below can be loaded directly from disk, downloaded from the internet, or even retrieved from an S3 bucket (there are examples for all three in the Bonus section). As long as it is a Buffer representation of an image, it should work. You can use sharp to resize and compress the image before converting it to a base64 string. Here's an example code snippet:

import sharp from 'sharp';

export default async function generateThumbnail(
  imageBuffer: Buffer,
  size: number = 32, // size of the max square containing your image
  quality: number = 30, // vary if you need smaller images
): Promise<Buffer> {
  const image = sharp(imageBuffer);
  const processed = await image
    .resize({
      fit: 'inside', // 'inside' to keep ratio or 'cover' to crop
      width: size,
      height: size,
    })
    .webp({ quality })
    .toBuffer();
  return processed;
}

Also, if you ever need to check the amount of space the thumbnail will take as S3 sees it, you can run the following code:

// const myThumbnail: Buffer = await generateThumbnail(…);
const s3ByteLength = Buffer.byteLength(myThumbnail.toString("base64"), "utf8");

List all Metas from S3

Here is how you can list all S3 items while recovering only their attached metadata (one HEAD request per object, no body download). This way you can easily build a listing UI with thumbnails from the thumbnail base64 metadata. The example below loads every object under the given path; you can easily tweak it to paginate and lazily request each page if your bucket contains a really huge number of items (see the sketch after the usage example below).

import { S3 } from "aws-sdk";

const s3 = new S3({
  accessKeyId: 'your_access_key',
  secretAccessKey: 'your_secret_key',
});

// first we need a function to get one object's metadata
export async function getObjectMetadata(
  bucket: string,
  key: string
): Promise<S3.Metadata | undefined> {
  const params: S3.Types.HeadObjectRequest = {
    Bucket: bucket,
    Key: key,
  };
  const res = await s3.headObject(params).promise();
  return res.Metadata;
}

// just a custom type to aggregate the object returned by the listing
// + its associated metadata, so it's more convenient to use as a response
interface S3ObjectWithMeta extends S3.Object {
  Metadata: S3.Metadata;
}

// then we can use listObjectsV2 to efficiently loop through all items
export async function listObjects(
  bucket: string,
  path: string = ""
): Promise<S3ObjectWithMeta[]> {
  let output: S3ObjectWithMeta[] = [];
  let IsTruncated: boolean | undefined = undefined;
  let NextContinuationToken: string | undefined = undefined;
  try {
    do {
      const params: S3.Types.ListObjectsV2Request = {
        Bucket: bucket,
        Prefix: path,
        MaxKeys: 1000,
        ContinuationToken: NextContinuationToken,
      };
      const res = await s3.listObjectsV2(params).promise();
      IsTruncated = res.IsTruncated;
      NextContinuationToken = res.NextContinuationToken;

      if (!res.Contents) continue;
      for (const object of res.Contents) {
        if (!object.Key) continue;
        const metadata = await getObjectMetadata(bucket, object.Key);
        if (!metadata) continue;
        output = [
          ...output,
          {
            ...object,
            Metadata: metadata,
          },
        ];
      }
    } while (IsTruncated);
    return output;
  } catch (err) {
    console.error(err);
    throw err;
  }
}

Usage example:

import { listObjects } from 'the_example_above';

// assuming we're in an async context
// (also note that path should end with a trailing slash)
const objects = await listObjects('my_bucket', 'm_yFolder/');
for (const object of objects) {
  console.log(object.Key);
  console.log(object.Metadata);
}
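If your bucket is really huge and you don’t want to hold everything in memory, the same loop can be turned into an async generator that lazily yields one page of objects (with their metadata) at a time. This is only a sketch of the idea, reusing the s3 client and getObjectMetadata from the example above (listObjectsByPage and the pageSize parameter are my own naming, not part of the original code):

// lazily yields one page of objects (with metadata) at a time
export async function* listObjectsByPage(
  bucket: string,
  path: string = "",
  pageSize: number = 100
): AsyncGenerator<S3ObjectWithMeta[]> {
  let NextContinuationToken: string | undefined = undefined;
  do {
    const params: S3.Types.ListObjectsV2Request = {
      Bucket: bucket,
      Prefix: path,
      MaxKeys: pageSize,
      ContinuationToken: NextContinuationToken,
    };
    const res = await s3.listObjectsV2(params).promise();
    NextContinuationToken = res.NextContinuationToken;

    const page: S3ObjectWithMeta[] = [];
    for (const object of res.Contents || []) {
      if (!object.Key) continue;
      const metadata = await getObjectMetadata(bucket, object.Key);
      if (metadata) page.push({ ...object, Metadata: metadata });
    }
    yield page;
  } while (NextContinuationToken);
}

// usage:
// for await (const page of listObjectsByPage('my_bucket', 'm_yFolder/')) {
//   // render this page of thumbnails before fetching the next one
// }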

Bonus

You already have the basics, but to make this more detailed and cover more use cases, here are some additional code samples.

Note that they are just examples of how you could do things, and they are not designed to be copy-pasted right away into your code. Also, I’m not responsible if you do this and it breaks something.

Update existing object metadata

As said above, there is no way to directly update the metadata of an existing object in S3. You’ll have to copy the object over itself with the new metadata. Not really a problem as long as you don’t rely on object creation dates. Here is an example implementation:

import { S3 } from 'aws-sdk';

const s3 = new S3({
  accessKeyId: 'your_access_key',
  secretAccessKey: 'your_secret_key',
});

export async function updateObjectMetadata(
  bucketName: string,
  objectKey: string,
  metadata: S3.Types.Metadata
): Promise<S3.CopyObjectOutput> {
  try {
    // Get the current metadata of the object
    const { Metadata } = await s3
      .headObject({
        Bucket: bucketName,
        Key: objectKey
      })
      .promise();

    // Update the metadata by adding/modifying key-value pairs
    const updatedMetadata = { ...Metadata, ...metadata };

    // Replace the object's metadata by copying it onto itself
    const response = await s3
      .copyObject({
        Bucket: bucketName,
        CopySource: `${bucketName}/${objectKey}`,
        Key: objectKey,
        Metadata: updatedMetadata,
        MetadataDirective: "REPLACE",
      })
      .promise();

    return response;
  } catch (error) {
    console.error(`Error updating object metadata: ${error}`);
    throw error;
  }
}

Usage example:

import { updateObjectMetadata } from 'the_example_above';

// assuming we're in an async context
void await updateObjectMetadata(
  'my_bucket',
  'm_yFolder/my_file.jpg',
  { 'my_meta': '123' }
);

Target size thumbnail generation

Because I wanted to use close to 100% of the remaining space in S3 metadata (2KB) for the thumbnail, without exceeding it of course, I needed a small piece of code that can generate a thumbnail dynamically for a specific target byte length. That way I can calculate the space available for my thumbnail (2KB minus the byte length of every other metadata key and value) and call this snippet to get the most efficient thumbnail for the context.

The idea is to find the sweet spot between quality and size to generate a thumbnail that fits within the available space. I have a sample snippet below to show how to generate a thumbnail based on a target byte length. It’s not perfect, but it can give you an idea of how to build your own implementation. Feel free to share your better implementation with me, and we can update this for future readers.

The following snippet tweaks the quality and pixel width of the thumbnail between two given minimum and maximum values until it fits within the target byte length:

import sharp from "sharp";

type GenThumbOptions = {
  // the size we shouldn't go beyond but get as close as possible
  targetByteLength?: number;
  // min/max quality
  qualityMin?: number;
  qualityMax?: number;
  // min/max pixel width
  sizeMin?: number;
  sizeMax?: number;
  // should we keep image ratio?
  keepRatio?: boolean;
  // should we use webp (I STRONGLY RECOMMEND IT, SO MUCH LIGHTER)
  webp?: boolean;
};

// calculate the size of a base64
// buffer as seen by S3 (for metadata)
export function getS3Size(buffer: Buffer): number {
  return Buffer.byteLength( // this is how S3 will see the byteLength
    buffer.toString("base64"),
    "utf8"
  );
}

export async function generateThumbnailImage(
  imageBuffer: Buffer,
  options: GenThumbOptions
): Promise<Buffer> {
  // vars
  let resultImageBuffer = Buffer.from("");
  const targetByteLength = options.targetByteLength || 2048;
  let quality = options.qualityMax || 70;
  const minQuality = options.qualityMin || 20;
  let size = options.sizeMax || 512;
  const minSize = options.sizeMin || 32;
  const imageFileType = options.webp ? "webp" : "jpg";
  const imageFit = options.keepRatio ? "inside" : "cover";
  const image = sharp(imageBuffer);
  let loopBreaker = 50; // security to never get an infinite loop

  // Change those values to have smaller decrease steps,
  // but it will take longer to find the right image.
  // If you might exceed 50 steps, tweak your min/max values
  // or change the loopBreaker value above.
  const sizeDecreaseStep = 20; // px
  const qualityDecreaseStep = 5;

  // try generating the max allowed image until it's smaller than
  // the target byteLength. Retries with slightly worse size/quality
  // each time.
  do {
    // generate one thumbnail version
    let processing = image.resize({
      fit: imageFit,
      width: size,
      height: size,
    });

    if (imageFileType === "webp") {
      processing = processing.webp({ quality });
    } else {
      processing = processing.jpeg({ quality });
    }

    resultImageBuffer = await processing.toBuffer();

    // decrease quality/width alternately between each try
    // (set values for the next loop exec, might not be used
    // if the current thumbnail meets requirements)
    if (loopBreaker % 2 === 0) {
      quality = Math.max(minQuality, quality - qualityDecreaseStep);
    } else {
      size = Math.max(minSize, size - sizeDecreaseStep);
    }

    loopBreaker--;
  } while (
    loopBreaker > 0 &&
    getS3Size(resultImageBuffer) > targetByteLength
  );

  // check the final result (not just the loop counter), in case
  // the last attempt still doesn't fit within the target
  if (getS3Size(resultImageBuffer) > targetByteLength) {
    throw Error(`Cannot generate image under target byte length of [${
      targetByteLength
    }] with the following params: [${
      JSON.stringify(options)
    }]`);
  }

  return resultImageBuffer;
}

Usage example:

// find your remaining byte length by subtracting
// all other metadata keys and values
const remainingMetadataBytes = 2000 - key1.length - value2.length; // …
const thumbnailImage = await generateThumbnailImage(imageBuffer, {
  keepRatio: true,
  qualityMax: 60,
  qualityMin: 20,
  sizeMax: 256,
  sizeMin: 32,
  targetByteLength: remainingMetadataBytes,
  webp: true,
});
// then I'm sure to have a thumbnail fitting the remaining space,
// so I can store it in S3 as a metadata of the source image
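To tie this back together: once generateThumbnailImage gives you a buffer that fits, you store it exactly like in the first upload example. A rough sketch, assuming s3, imageBuffer, and the bucket/key placeholders from the earlier snippets:

// generate a thumbnail that fits the metadata budget,
// then upload the original image with the thumbnail attached as metadata
const thumbnail = await generateThumbnailImage(imageBuffer, {
  webp: true,
  keepRatio: true,
  targetByteLength: 2000 - 'thumbnail'.length,
});

await s3.upload({
  Bucket: 'your_bucket_name',
  Key: 'your_image_name',
  Body: imageBuffer,
  ContentType: 'image/jpeg', // set the real content type here
  Metadata: {
    'thumbnail': thumbnail.toString('base64'),
  },
}).promise();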

Regenerate all thumbnails

If you want to update an existing S3 bucket that already has images stored in it to include their thumbnails, you can use the following code. However, keep in mind that as mentioned in the previous section, you cannot directly update an object’s metadata in S3. You must make a copy of each object with the added metadata and override the old object. This code snippet will iterate over all objects in the specified S3 bucket and create a copy of each object with the added thumbnail metadata.

Please note that if you use this code, it’s at your own risk and I am not responsible for any data loss that may occur. This is just a snippet that I used because it suited my needs, but please make sure it fits your requirements before using it and always create a backup of your data :)

import { S3 } from "aws-sdk";
import { updateObjectMetadata } from "example above";
import { getImageFromS3 } from "example below";
import { generateThumbnailImage } from "example above";

const s3 = new S3({
  accessKeyId: 'your_access_key',
  secretAccessKey: 'your_secret_key',
});

// just assume you implemented this
// (it sums the byte length of all existing metadata keys and values)
declare function calculateByteLength(metadata: S3.Types.Metadata): number;

// the name of the metadata key containing the base64 thumbnail
const META_KEY = "thumbnail";

export async function generateMissingThumbnails(
  bucket: string,
  path: string
): Promise<void> {
  const promises: Promise<void>[] = [];
  let IsTruncated: boolean | undefined = undefined;
  let NextContinuationToken: string | undefined = undefined;
  try {
    do {
      const params: S3.Types.ListObjectsV2Request = {
        Bucket: bucket,
        Prefix: path,
        MaxKeys: 1000,
        ContinuationToken: NextContinuationToken,
      };
      const res = await s3.listObjectsV2(params).promise();
      IsTruncated = res.IsTruncated;
      NextContinuationToken = res.NextContinuationToken;

      if (!res.Contents) continue;
      for (const object of res.Contents) {
        if (!object.Key) continue;
        const metadataParams: S3.Types.HeadObjectRequest = {
          Bucket: bucket,
          Key: object.Key,
        };
        const promise = s3
          .headObject(metadataParams)
          .promise()
          .then(async ({ Metadata }) => {
            if (Metadata && Metadata[META_KEY]) {
              // you can remove this short-circuit if you
              // want to regenerate the thumbnail even if the
              // prop already exists
              return;
            }
            if (!object.Key) return;
            const imageBuffer = await getImageFromS3(bucket, object.Key);
            const remainingSpaceForMetadata =
              2000
              - META_KEY.length
              - (Metadata ? calculateByteLength(Metadata) : 0);
            const thumbnail = await generateThumbnailImage(imageBuffer, {
              targetByteLength: remainingSpaceForMetadata,
              /*...*/ // configure this how you want
            });
            await updateObjectMetadata(bucket, object.Key, {
              [META_KEY]: thumbnail.toString("base64"),
            });
          });

        // collect the promise so all objects are processed in parallel
        promises.push(promise);
      }
    } while (IsTruncated);

    // wait for all object processing to finish
    void (await Promise.all(promises));
    return;
  } catch (err) {
    console.error(err);
    throw err;
  }
}

Usage example:

import { generateMissingThumbnails } from 'the_example_above';

// assuming we're in an async context
void await generateMissingThumbnails('my_bucket', 'm_yFolder/');
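The calculateByteLength helper above is only declared, not implemented. As far as I understand, S3 measures user-defined metadata by summing the UTF-8 byte length of every key and value, so a minimal sketch could look like this:

// sums the UTF-8 byte length of all existing metadata keys and values,
// which is what counts against the 2KB user metadata limit
export function calculateByteLength(metadata: S3.Types.Metadata): number {
  return Object.entries(metadata).reduce(
    (total, [key, value]) =>
      total + Buffer.byteLength(key, "utf8") + Buffer.byteLength(value, "utf8"),
    0
  );
}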

Download imageBuffer (with axios or fetch)

How to download an image (as a Buffer) straight from a URL. If your main image is stored somewhere else, maybe not on S3, you can download it this way to get back your imageBuffer:

import axios from "axios";

export async function downloadImageBuffer(url: string): Promise<Buffer> {
  const imageResponse = await axios.get(url, { responseType: "arraybuffer" });
  const imageBuffer = imageResponse.data;
  return imageBuffer;
}

You may use fetch instead; the syntax is different, but as long as you are server-side I recommend axios, which is totally worth the overhead. Here is the equivalent with fetch anyway (I just asked ChatGPT to translate the axios version, so no guarantees it works lol):

export async function downloadImageBuffer(url: string): Promise<Buffer> {
  const imageResponse = await fetch(url);
  const imageArrayBuffer = await imageResponse.arrayBuffer();
  const imageBuffer = Buffer.from(imageArrayBuffer);
  return imageBuffer;
}

This buffer can be used directly by sharp:

// const myImageBuffer = await downloadImageBuffer('https://my-image-url');
const image = sharp(myImageBuffer);
// then apply any changes

Get imageBuffer from disk

How to get an image Buffer representation from disk

// assuming you're in a NodeJS (server) context
import fs from 'fs';

export function getImageFromDisk(imagePath: string): Buffer {
  // read image file as a buffer
  const imageBuffer: Buffer = fs.readFileSync(imagePath);
  return imageBuffer;
}

Usage example:

const imagePath = '/path/to/image.jpg';
const myImageBuffer = getImageFromDisk(imagePath);
// then do something with the buffer
// const image = sharp(myImageBuffer);

Get ImageBuffer from S3

How to recover the body (main data) of an S3 object given its key and the bucket it belongs to.

import { S3 } from 'aws-sdk';

const s3 = new S3({
  accessKeyId: 'your_access_key',
  secretAccessKey: 'your_secret_key',
});

export async function getImageFromS3(
  bucket: string,
  key: string
): Promise<Buffer> {
  const getParams: S3.Types.GetObjectRequest = {
    Bucket: bucket,
    Key: key,
  };
  const response = await s3.getObject(getParams).promise();
  const imageBuffer: Buffer = response.Body as Buffer;
  return imageBuffer;
}

Usage example:

const bucket = 'my_bucket_name';
const key = 'path/to/my/image.jpg'
const myImageBuffer = await getImageFromS3(bucket, key);
// then do something with the buffer
// const image = sharp(myImageBuffer);

Get single object metadata from S3

How to recover ONLY the metadata of an S3 object, given its key and the bucket it belongs to.

import { S3 } from 'aws-sdk';

const s3 = new S3({
  accessKeyId: 'your_access_key',
  secretAccessKey: 'your_secret_key',
});

export async function getObjectMetadata(
  bucket: string,
  key: string
): Promise<S3.Metadata | undefined> {
  const params: S3.Types.HeadObjectRequest = {
    Bucket: bucket,
    Key: key,
  };
  const res = await s3.headObject(params).promise();
  return res.Metadata;
}

Usage example:

const bucket = 'my_bucket_name';
const key = 'path/to/my/image.jpg';
const myMetadata = await getObjectMetadata(bucket, key);

// myMetadata is of type {[key: string]: string} | undefined, so you can
// just check if the meta you want exists
// example:
const myMetaThumbnail = myMetadata?.['thumbnail'];
if (myMetaThumbnail) {
  // do your thing
  // example: build the image base64 data URI (ready to put in <img src=X>)
  const imageDataUri = `data:image/webp;base64,${myMetaThumbnail}`;
  // ^ you may recover the type dynamically
  const img = document.createElement('img');
  img.src = imageDataUri;
  document.body.appendChild(img);
}
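As hinted at in the comment above, you may want to recover the format dynamically instead of hard-coding webp. One option (my own assumption, not something covered earlier) is to store the format in a second metadata key when uploading, keeping in mind it also counts against the 2KB budget:

// hypothetical: the upload would also set a 'thumbnail-format' metadata key,
// e.g. Metadata: { 'thumbnail': '...', 'thumbnail-format': 'webp' }
const format = myMetadata?.['thumbnail-format'] || 'webp'; // fall back to webp
const imageDataUri = `data:image/${format};base64,${myMetaThumbnail}`;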

Conclusion

I hope this has been helpful for you. If you have any suggestions or feedback, please let me know.

Additionally, if you found this guide useful, consider liking it or sharing it with others. I’m really considering writing more posts about the weird nerdy things I do. I already have plenty of subjects in mind but don’t know whether they would interest anyone, so your feedback is more than welcome :D

✨ Have a lovely day ✨
