How to Scan Files in AWS S3 for Malware

If your application stores user-uploaded files in S3, those files need scanning. It doesn't matter how they got there — direct upload, presigned URL, transfer from another service — if the file came from outside your trust boundary, it could contain malware.

This is a real-world attack vector that comes up in penetration tests, compliance audits, and incident reports. A malicious file lands in your bucket, another system or user downloads it, and your infrastructure has become the distribution mechanism.

AWS offers GuardDuty Malware Protection for S3, but it scans with a single engine, works only within AWS, and returns a binary result, which limits its usefulness for many applications. This guide covers the practical options for scanning S3 files for malware, starting with a single API call and working up to a fully automated pipeline.

The Options

There are three main approaches to scanning files in S3:

| Approach | How it works | Best for |
| --- | --- | --- |
| API with presigned URL | Generate a presigned URL, send it to a scanning API | On-demand scanning, existing upload flows |
| Lambda + S3 events | Trigger a Lambda function on upload, scan automatically | Automated scanning of every upload |
| GuardDuty Malware Protection | AWS-native scanning via GuardDuty | Teams already deep in the AWS security ecosystem |

We'll cover all three below.

Scanning with a Presigned URL

The simplest approach: generate a presigned URL for the S3 object and pass it to AttachmentScanner. The scanning engines fetch the file directly from S3 — your application never has to download it.

import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

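const s3 = new S3Client({ region: "us-east-1" });

// API_URL (the hostname of your endpoint) and API_TOKEN come from your AttachmentScanner account.
// The environment variable names below are placeholders, not required names.
const API_URL = process.env.ATTACHMENTSCANNER_API_URL;
const API_TOKEN = process.env.ATTACHMENTSCANNER_API_TOKEN;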

// Generate a presigned URL (valid for 15 minutes)
const presignedUrl = await getSignedUrl(
  s3,
  new GetObjectCommand({ Bucket: "my-bucket", Key: "uploads/document.pdf" }),
  { expiresIn: 900 }
);

// Scan the file
const response = await fetch(`https://${API_URL}/v1.0/scans`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${API_TOKEN}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ url: presignedUrl }),
});

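// Surface HTTP-level failures (bad token, bad request) before reading the body
if (!response.ok) {
  throw new Error(`Scan request failed with HTTP ${response.status}`);
}

const result = await response.json();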

if (result.status === "found") {
  console.log("Malware detected:", result.matches);
  // Delete or quarantine the object
}

That's it. The presigned URL gives AttachmentScanner temporary read access to the object without exposing your AWS credentials or making the bucket public.

Handling the Response

| Status | Meaning | Action |
| --- | --- | --- |
| ok | File is clean | Allow access |
| found | Malware detected | Delete, quarantine, or flag |
| warning | Macros, encrypted archives | Depends on your policy |
| pending | Scan still running | Poll or wait for callback |
| failed | Scan error | Retry or flag for review |

For small files, scanning takes a couple of seconds. Larger files or high-volume bursts may return pending, which means the scan is still running. Any scan over 30 seconds automatically returns pending.
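The status table above can be collapsed into one decision function. This is a minimal sketch, not part of the AttachmentScanner API: the `Action` names and the `actionFor` helper are assumptions made for illustration, and the mapping is only one possible policy. Unrecognised statuses fall back to review, so an unexpected response never unlocks a file.

```typescript
// Hypothetical action names; adapt them to your own workflow.
type Action = "allow" | "quarantine" | "review" | "wait" | "retry";

// Status values come from the response table; the actions are one possible policy.
const ACTIONS = new Map<string, Action>([
  ["ok", "allow"],
  ["found", "quarantine"],
  ["warning", "review"],
  ["pending", "wait"],
  ["failed", "retry"],
]);

function actionFor(status: string): Action {
  // Fail closed: anything unrecognised goes to manual review.
  return ACTIONS.get(status) ?? "review";
}
```

Keeping the policy in one place means the synchronous path and a callback handler can share the same rules.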

The presigned URL example above is synchronous — your code waits for the scan to finish. For production workloads, we recommend async scanning with callbacks. Your application isn't blocked waiting for results, and you're set up to handle files of any size.

The pattern:

[Diagram: upload to S3 → presign URL and scan request → scan API (async) returns "pending" → callback → clean: tag as scanned; malicious: delete and alert]

  1. File arrives in S3 (via your app, presigned upload, or transfer)
  2. Your application generates a presigned URL and kicks off an async scan
  3. AttachmentScanner fetches the file and scans it
  4. When the scan completes, AttachmentScanner POSTs the result to your callback URL
  5. Your callback handler tags the object, deletes it, or moves it to quarantine

// Kick off an async scan with a callback
async function scanS3Object(bucket: string, key: string) {
  const presignedUrl = await getSignedUrl(
    s3,
    new GetObjectCommand({ Bucket: bucket, Key: key }),
    { expiresIn: 900 }
  );

  const response = await fetch(`https://${API_URL}/v1.0/scans`, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${API_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      url: presignedUrl,
      callback: "https://your-app.com/webhooks/scan-complete",
      async: true,
    }),
  });

  return response.json();
}

// Callback handler (Express-style req and res objects are assumed here).
// tagS3Object, deleteS3Object, and notifyAdmin are application-specific helpers you supply.
async function handleScanCallback(req, res) {
  const result = req.body;

  if (result.status === "ok") {
    await tagS3Object(result.url, { "scan-status": "clean" });
  } else if (result.status === "found") {
    await deleteS3Object(result.url);
    await notifyAdmin(result);
  } else {
    // "warning", "failed", or any unexpected status: hold the file for review
    await tagS3Object(result.url, { "scan-status": "review" });
  }

  return res.status(200).end();
}

This approach gives you non-blocking scans, a clear audit trail via S3 tags, and clean separation between your upload flow and your scanning flow.
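The callback handler passes `result.url` to its helpers, so those helpers need to turn a presigned URL back into a bucket and key. The sketch below shows one way to do that, assuming the callback payload echoes the URL that was submitted (as the handler implies) and that the URL uses a standard `amazonaws.com` hostname. `parseS3Url` and `S3Location` are hypothetical names, and dual-stack, accelerate, and custom endpoints are not handled.

```typescript
interface S3Location {
  bucket: string;
  key: string;
}

// Matches S3 hostnames with or without a bucket prefix and region, e.g.
// my-bucket.s3.us-east-1.amazonaws.com or s3.amazonaws.com.
const S3_HOST = /^(?:(.+)\.)?s3[.-](?:[a-z0-9-]+\.)?amazonaws\.com$/;

function parseS3Url(rawUrl: string): S3Location {
  const url = new URL(rawUrl);
  const match = S3_HOST.exec(url.hostname);
  if (!match) {
    throw new Error(`Not a recognised S3 hostname: ${url.hostname}`);
  }
  // The query string (signature, expiry) is ignored; only host and path matter.
  const path = decodeURIComponent(url.pathname.slice(1));
  if (match[1]) {
    // Virtual-hosted style: the bucket is part of the hostname.
    return { bucket: match[1], key: path };
  }
  // Path style: the first path segment is the bucket.
  const slash = path.indexOf("/");
  if (slash <= 0) {
    throw new Error("Missing bucket or key in path");
  }
  return { bucket: path.slice(0, slash), key: path.slice(slash + 1) };
}
```

An alternative that avoids URL parsing altogether is to store the bucket and key alongside the scan request when it is created and look them up when the callback arrives.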

Automating with Lambda

If you want every file scanned the moment it lands in S3, you can trigger scanning automatically using S3 event notifications and a Lambda function.

When a file is uploaded, S3 fires an s3:ObjectCreated:* event. A Lambda function picks it up, generates a presigned URL, and sends it to AttachmentScanner. When the scan completes, a callback handles the result.

// Lambda handler, triggered by an S3 event notification.
// Reuses scanS3Object from the async scanning example above.
import type { S3Event } from "aws-lambda";

export async function handler(event: S3Event) {
  for (const record of event.Records) {
    const bucket = record.s3.bucket.name;
    // Keys in S3 events are URL-encoded, with spaces sent as "+"
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, " "));

    await scanS3Object(bucket, key);
  }
}
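
In the handler above, a rejected scan request stops the loop, so any remaining records in the same event are skipped. The variant below is a sketch of one way to isolate failures per record. `scanRecords` and `S3RecordLike` are hypothetical names, and the scan function is passed in as a parameter so the logic can be exercised without AWS or network access.

```typescript
// Minimal subset of an S3 event record; the real type is S3EventRecord from "aws-lambda".
interface S3RecordLike {
  s3: { bucket: { name: string }; object: { key: string } };
}

interface ScanTarget {
  bucket: string;
  key: string;
}

async function scanRecords(
  records: S3RecordLike[],
  scan: (bucket: string, key: string) => Promise<unknown>
): Promise<{ total: number; failed: ScanTarget[] }> {
  const failed: ScanTarget[] = [];
  for (const record of records) {
    const bucket = record.s3.bucket.name;
    // Keys in S3 events are URL-encoded, with spaces sent as "+"
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, " "));
    try {
      await scan(bucket, key);
    } catch (err) {
      // Record the failure and keep going so other objects are still scanned
      failed.push({ bucket, key });
    }
  }
  return { total: records.length, failed };
}
```

A caller could log the `failed` list or rethrow at the end so that Lambda retries the event; which is appropriate depends on how retries and duplicates are handled in your pipeline.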

We also provide a ready-made serverless application that you can deploy from the AWS Serverless Application Repository with zero code. It supports tagging, logging, and automatic deletion of malicious files.

For a full walkthrough of the Lambda setup, see our guide: Scanning an AWS S3 Bucket for Malware with Zero Code.

What About GuardDuty?

AWS GuardDuty Malware Protection for S3 launched in 2024 and provides native scanning. It's worth considering if you're already invested in the GuardDuty ecosystem, but there are trade-offs:

| | GuardDuty | AttachmentScanner |
| --- | --- | --- |
| Engines | Single (AWS-internal) | Multiple detection engines |
| Portability | AWS only | Any cloud, any storage |
| Result handling | EventBridge + tags | Callbacks, tags, or poll |
| Setup | GuardDuty console or API | Single API call |
| Pricing | Per GB scanned + GuardDuty base | Per scan, predictable |
| Macros/encrypted files | Binary pass/fail | warning status with detail |

GuardDuty scans with a single engine and returns a binary result — threats found or not. AttachmentScanner runs files against multiple detection engines and distinguishes between confirmed malware (found), suspicious content like macros or encrypted archives (warning), and clean files (ok). That distinction matters when you need a policy for files that aren't malicious but aren't straightforward either.

If you're using S3 alongside other storage (Azure Blob, GCS, DigitalOcean Spaces, on-premise), AttachmentScanner gives you a single scanning API across all of them — you're not locked into one provider's security tooling.

Testing Your Setup

Once scanning is wired up, verify it works with the EICAR test file. EICAR is a standardised test file that every detection engine recognises. It's completely harmless but triggers a found result.

We host a copy at https://www.attachmentscanner.com/eicar.com:

curl -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://www.attachmentscanner.com/eicar.com"}' \
  -XPOST https://YOUR_API_URL/v1.0/scans

You should get "status": "found". Then test through your actual S3 flow — upload the EICAR file to your bucket and confirm the scan triggers and the result is handled correctly.

Best Practices

Scan before serving. Never serve a file to users before its scan status is confirmed clean. Use S3 object tags or a database flag to gate access.
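
As a sketch of tag-based gating, the check below accepts the tag list for an object (the `TagSet` array that S3 `GetObjectTagging` returns) and allows access only when the object is explicitly marked clean. The `scan-status` tag and its `clean` value follow the examples in this article; `canServe` is a hypothetical helper name.

```typescript
// Mirrors the tag shape returned by S3 GetObjectTagging.
interface Tag {
  Key?: string;
  Value?: string;
}

function canServe(tagSet: Tag[]): boolean {
  // Untagged, pending, flagged, and review objects are all refused.
  return tagSet.some((tag) => tag.Key === "scan-status" && tag.Value === "clean");
}
```

Checking for a positive "clean" marker, rather than the absence of a "found" marker, means a file that has not been scanned yet is never served by accident.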

Use presigned URLs with short expiry. 15 minutes is plenty. The scanning engines fetch the file immediately — long-lived URLs are an unnecessary risk.

Handle warning status. Files containing macros, as well as encrypted archives, return warning rather than found. Decide on a policy (block, flag for review, or allow) that suits your use case.

Tag scanned objects. Add tags like scan-status: clean or scan-status: found to S3 objects. This gives you an audit trail and makes it easy to query your bucket for unscanned or flagged files.
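
One detail to watch when tagging: S3 `PutObjectTagging` replaces the object's entire tag set, so writing only `scan-status` would drop any tags the object already has. The sketch below merges the new status into the existing tags first. `withScanStatus` is a hypothetical helper; the resulting array would be passed as `Tagging.TagSet` to `PutObjectTaggingCommand` from `@aws-sdk/client-s3`.

```typescript
interface Tag {
  Key: string;
  Value: string;
}

function withScanStatus(existing: Tag[], status: string): Tag[] {
  // Drop any previous scan-status so the tag appears exactly once
  const kept = existing.filter((tag) => tag.Key !== "scan-status");
  return [...kept, { Key: "scan-status", Value: status }];
}
```

In practice this means reading the current tags with `GetObjectTagging`, merging, and then writing the merged set back.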

Don't skip scanning for internal uploads. Internal systems get compromised too. If a file ends up in your bucket from any source, scan it.

Getting Started

  1. Sign up for an account — you'll get an API token and endpoint URL
  2. Test with the EICAR test file to confirm detection works
  3. Scan a presigned URL to verify your S3 integration
  4. Set up async scanning with callbacks for production
  5. Optionally, automate with Lambda for hands-free scanning

For a broader look at scanning user uploads beyond S3, see our complete guide to scanning user uploads.

2026-03-09
AttachmentScanner Team