How to Scan Files in AWS S3 for Malware
If your application stores user-uploaded files in S3, those files need scanning. It doesn't matter how they got there — direct upload, presigned URL, transfer from another service — if the file came from outside your trust boundary, it could contain malware.
This is a real-world attack vector that comes up in penetration tests, compliance audits, and incident reports. A malicious file lands in your bucket, another system or user downloads it, and your infrastructure has become the distribution mechanism.
AWS offers GuardDuty Malware Protection for S3, but it comes with trade-offs, covered later in this guide, that matter for many applications. This guide covers the practical options for scanning S3 files for malware, starting with a single API call and working up to a fully automated pipeline.
The Options
There are three main approaches to scanning files in S3:
| Approach | How it works | Best for |
|---|---|---|
| API with presigned URL | Generate a presigned URL, send it to a scanning API | On-demand scanning, existing upload flows |
| Lambda + S3 events | Trigger a Lambda function on upload, scan automatically | Automated scanning of every upload |
| GuardDuty Malware Protection | AWS-native scanning via GuardDuty | Teams already deep in the AWS security ecosystem |
We'll cover all three below.
Scanning with a Presigned URL
The simplest approach: generate a presigned URL for the S3 object and pass it to AttachmentScanner. The scanning engines fetch the file directly from S3 — your application never has to download it.
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

// API_URL and API_TOKEN come from your AttachmentScanner account.
// Reading them from environment variables is an assumption of this example.
const API_URL = process.env.API_URL;
const API_TOKEN = process.env.API_TOKEN;

const s3 = new S3Client({ region: "us-east-1" });

// Generate a presigned URL (valid for 15 minutes)
const presignedUrl = await getSignedUrl(
  s3,
  new GetObjectCommand({ Bucket: "my-bucket", Key: "uploads/document.pdf" }),
  { expiresIn: 900 }
);

// Scan the file
const response = await fetch(`https://${API_URL}/v1.0/scans`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${API_TOKEN}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ url: presignedUrl }),
});
const result = await response.json();

if (result.status === "found") {
  console.log("Malware detected:", result.matches);
  // Delete or quarantine the object
}
That's it. The presigned URL gives AttachmentScanner temporary read access to the object without exposing your AWS credentials or making the bucket public.
Handling the Response
| Status | Meaning | Action |
|---|---|---|
| ok | File is clean | Allow access |
| found | Malware detected | Delete, quarantine, or flag |
| warning | Macros, encrypted archives | Depends on your policy |
| pending | Scan still running | Poll or wait for callback |
| failed | Scan error | Retry or flag for review |
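As a rough sketch, the table above can be expressed as a small decision function. The status strings come from the table; the action names, the `WarningPolicy` option, and the choice to send unknown statuses to review are assumptions made for illustration, not part of the AttachmentScanner API.

```typescript
// Hypothetical action names; adapt them to whatever your application does.
type Action = "allow" | "quarantine" | "review" | "wait" | "retry";
type WarningPolicy = "allow" | "review" | "block";

function actionFor(status: string, warningPolicy: WarningPolicy = "review"): Action {
  switch (status) {
    case "ok":
      return "allow";
    case "found":
      return "quarantine";
    case "warning":
      // Macros and encrypted archives: the outcome depends on your policy.
      if (warningPolicy === "allow") return "allow";
      if (warningPolicy === "block") return "quarantine";
      return "review";
    case "pending":
      return "wait";
    case "failed":
      return "retry";
    default:
      // Unknown status: fail closed and hold the file for a human to check.
      return "review";
  }
}
```

Keeping this mapping in one function means the synchronous path and any callback handler can share the same policy.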
For small files, scanning takes a couple of seconds. Larger files or high-volume
bursts may return pending, which means the scan is still running. Any scan over
30 seconds automatically returns pending.
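If you choose to poll rather than wait for a callback, a generic loop along these lines may be enough. This is a minimal sketch: `fetchStatus` is a caller-supplied function (hypothetical here) that asks the scanning API for the current state of a scan, so the code assumes nothing about a particular endpoint. The interval and attempt limits are arbitrary defaults.

```typescript
interface PollableResult {
  status: string;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls fetchStatus until the status is no longer "pending",
// waiting intervalMs between attempts and giving up after maxAttempts.
async function pollScan<T extends PollableResult>(
  fetchStatus: () => Promise<T>,
  { intervalMs = 2000, maxAttempts = 30 } = {}
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await fetchStatus();
    if (result.status !== "pending") return result;
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  throw new Error(`Scan still pending after ${maxAttempts} attempts`);
}
```

Treat a timeout the same way as an unscanned file: keep it inaccessible until a result arrives.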
Going Async: The Recommended Approach
The presigned URL example above is synchronous — your code waits for the scan to finish. For production workloads, we recommend async scanning with callbacks. Your application isn't blocked waiting for results, and you're set up to handle files of any size.
The pattern:
- File arrives in S3 (via your app, presigned upload, or transfer)
- Your application generates a presigned URL and kicks off an async scan
- AttachmentScanner fetches the file and scans it
- When the scan completes, AttachmentScanner POSTs the result to your callback URL
- Your callback handler tags the object, deletes it, or moves it to quarantine
// Kick off an async scan with a callback
async function scanS3Object(bucket: string, key: string) {
  const presignedUrl = await getSignedUrl(
    s3,
    new GetObjectCommand({ Bucket: bucket, Key: key }),
    { expiresIn: 900 }
  );

  const response = await fetch(`https://${API_URL}/v1.0/scans`, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${API_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      url: presignedUrl,
      callback: "https://your-app.com/webhooks/scan-complete",
      async: true,
    }),
  });
  return response.json();
}

// Callback handler.
// tagS3Object, deleteS3Object, and notifyAdmin are your own helpers (not shown).
async function handleScanCallback(req, res) {
  const result = req.body;

  if (result.status === "ok") {
    await tagS3Object(result.url, { "scan-status": "clean" });
  } else if (result.status === "found") {
    await deleteS3Object(result.url);
    await notifyAdmin(result);
  } else if (result.status === "warning") {
    await tagS3Object(result.url, { "scan-status": "review" });
  }

  return res.status(200).end();
}
This approach gives you non-blocking scans, a clear audit trail via S3 tags, and clean separation between your upload flow and your scanning flow.
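The callback handler above passes `result.url` to helpers that act on the S3 object, so those helpers need to turn the presigned URL back into a bucket and key. One possible sketch follows. `parseS3Url` is a hypothetical helper, and it assumes a virtual-hosted-style URL of the form `https://BUCKET.s3.REGION.amazonaws.com/KEY?...`; path-style URLs and custom endpoints are not handled.

```typescript
// Hypothetical helper: extracts bucket and key from a virtual-hosted-style
// S3 URL. Throws if the hostname does not look like one.
function parseS3Url(url: string): { bucket: string; key: string } {
  const { hostname, pathname } = new URL(url);
  const marker = hostname.lastIndexOf(".s3.");
  if (marker <= 0 || !hostname.endsWith(".amazonaws.com")) {
    throw new Error(`Not a virtual-hosted-style S3 URL: ${hostname}`);
  }
  return {
    bucket: hostname.slice(0, marker),
    // The path is percent-encoded; drop the leading "/" and decode it.
    key: decodeURIComponent(pathname.slice(1)),
  };
}
```

An alternative that avoids URL parsing is to record the bucket and key against the scan when you start it and look them up in the callback.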
Automating with Lambda
If you want every file scanned the moment it lands in S3, you can trigger scanning automatically using S3 event notifications and a Lambda function.
When a file is uploaded, S3 fires an s3:ObjectCreated:* event. A Lambda
function picks it up, generates a presigned URL, and sends it to
AttachmentScanner. When the scan completes, a callback handles the result.
// Lambda handler, triggered by S3 event notification
import type { S3Event } from "aws-lambda";

export async function handler(event: S3Event) {
  for (const record of event.Records) {
    const bucket = record.s3.bucket.name;
    // Keys in S3 events are URL-encoded, with spaces sent as "+"
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, " "));
    await scanS3Object(bucket, key);
  }
}
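The key-decoding line in the handler is easy to get wrong, so here it is in isolation. S3 event notifications URL-encode object keys and send spaces as `+`, which `decodeURIComponent` alone does not undo. The function name `decodeS3EventKey` is made up for this example.

```typescript
// Converts a key as it appears in an S3 event record back into the real
// object key: "+" becomes a space, then percent-escapes are decoded.
function decodeS3EventKey(rawKey: string): string {
  return decodeURIComponent(rawKey.replace(/\+/g, " "));
}

// A key containing spaces and parentheses, as S3 would report it.
console.log(decodeS3EventKey("uploads/my+report+%282024%29.pdf"));
```

A literal plus sign in a key arrives as `%2B`, so it survives the replacement and is restored by the decode step.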
We also provide a ready-made serverless application that you can deploy from the AWS Serverless Application Repository with zero code. It supports tagging, logging, and automatic deletion of malicious files.
For a full walkthrough of the Lambda setup, see our guide: Scanning an AWS S3 Bucket for Malware with Zero Code.
What About GuardDuty?
AWS GuardDuty Malware Protection for S3 launched in 2024 and provides native scanning. It's worth considering if you're already invested in the GuardDuty ecosystem, but there are trade-offs:
| | GuardDuty | AttachmentScanner |
|---|---|---|
| Engines | Single (AWS-internal) | Multiple detection engines |
| Portability | AWS only | Any cloud, any storage |
| Result handling | EventBridge + tags | Callbacks, tags, or poll |
| Setup | GuardDuty console or API | Single API call |
| Pricing | Per GB scanned + GuardDuty base | Per scan, predictable |
| Macros/encrypted files | Binary pass/fail | warning status with detail |
GuardDuty scans with a single engine and returns a binary result — threats found
or not. AttachmentScanner runs files against multiple detection engines and
distinguishes between confirmed malware (found), suspicious content like macros
or encrypted archives (warning), and clean files (ok). That distinction
matters when you need a policy for files that aren't malicious but aren't
straightforward either.
If you're using S3 alongside other storage (Azure Blob, GCS, DigitalOcean Spaces, on-premise), AttachmentScanner gives you a single scanning API across all of them — you're not locked into one provider's security tooling.
Testing Your Setup
Once scanning is wired up, verify it works with the EICAR test file. EICAR is
a standardised test file that every detection engine recognises. It's completely
harmless but triggers a found result.
We host a copy at https://www.attachmentscanner.com/eicar.com:
curl -X POST "https://YOUR_API_URL/v1.0/scans" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://www.attachmentscanner.com/eicar.com"}'
You should get "status": "found". Then test through your actual S3 flow —
upload the EICAR file to your bucket and confirm the scan triggers and the
result is handled correctly.
Best Practices
Scan before serving. Never serve a file to users before its scan status is confirmed clean. Use S3 object tags or a database flag to gate access.
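A tag-based gate might look something like the sketch below. It assumes the `scan-status` tag key and `clean` value used elsewhere in this guide, and an input shaped like the `TagSet` array that S3's GetObjectTagging returns. `isServable` is a hypothetical name.

```typescript
// Shape of one entry in an S3 TagSet.
interface ObjectTag {
  Key?: string;
  Value?: string;
}

// Returns true only when the object is explicitly tagged as clean.
// Missing tags, "review", "found", or anything else all block access.
function isServable(tagSet: ObjectTag[]): boolean {
  const status = tagSet.find((tag) => tag.Key === "scan-status")?.Value;
  return status === "clean";
}
```

The check fails closed: an object that has not been scanned yet has no tag, so it is not served.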
Use presigned URLs with short expiry. 15 minutes is plenty. The scanning engines fetch the file immediately — long-lived URLs are an unnecessary risk.
Handle warning status. Files with macros or encrypted archives get a
warning rather than found. Decide on a policy — block them, flag for review,
or allow them depending on your use case.
Tag scanned objects. Add tags like scan-status: clean or
scan-status: found to S3 objects. This gives you an audit trail and makes it
easy to query your bucket for unscanned or flagged files.
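One detail worth handling when tagging: S3's PutObjectTagging replaces the object's entire tag set, so writing only `scan-status` would drop any tags already on the object. The sketch below merges the scan status into existing tags and builds the input object for that call. The helper names are hypothetical; the `Bucket`, `Key`, and `Tagging.TagSet` fields follow the S3 API shape.

```typescript
interface S3Tag {
  Key: string;
  Value: string;
}

// Returns the existing tags with scan-status set (or replaced).
function withScanStatus(existing: S3Tag[], scanStatus: string): S3Tag[] {
  const others = existing.filter((tag) => tag.Key !== "scan-status");
  return [...others, { Key: "scan-status", Value: scanStatus }];
}

// Builds the input for a PutObjectTagging request.
function taggingInput(bucket: string, key: string, tagSet: S3Tag[]) {
  return { Bucket: bucket, Key: key, Tagging: { TagSet: tagSet } };
}
```

In practice you would fetch the current tags first, pass them through `withScanStatus`, and send the result with the SDK's `PutObjectTaggingCommand`.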
Don't skip scanning for internal uploads. Internal systems get compromised too. If a file ends up in your bucket from any source, scan it.
Getting Started
- Sign up for an account — you'll get an API token and endpoint URL
- Test with the EICAR test file to confirm detection works
- Scan a presigned URL to verify your S3 integration
- Set up async scanning with callbacks for production
- Optionally, automate with Lambda for hands-free scanning
For a broader look at scanning user uploads beyond S3, see our complete guide to scanning user uploads.
AttachmentScanner Team