API Lambda (Persons + Photos)

Create a small HTTP Lambda that returns People grid + Person detail data for the beetroot React app.

Goal

By the end of this phase, you'll have a backend API Lambda that can power two React screens:

  1. People grid
     GET /persons → returns a list of persons, each with { personId, photoCount, repThumbKey }
  2. Person detail
     GET /persons/{personId}/photos → returns the person's photos, each with { photoId, photoBucket, photoKey, thumbKey }

Why do we need an API at all?

Your React app should not query DynamoDB directly. A small API Lambda acts like a safe “middle layer” that reads DynamoDB and returns only what the UI needs.

What we already have

From earlier phases, your DynamoDB data is now ready:

  • Persons: personId, photoCount, repThumbKey
  • Occurrences: for a personId, gives photoId + thumbKey
  • Photos: for a photoId, gives original s3Bucket + s3Key

Create API Lambda

Create function

  1. Go to Lambda → Create function
  2. Select Author from scratch
  3. Fill:
    • Function name: beetroot-api
    • Runtime: Python 3.14
  4. Under Permissions:
    • Click Change default execution role
    • Select Use an existing role
    • Choose your role: beetroot-api-role (the role you created)
  5. Click Create function

Add environment variables

  • PERSONS_TABLE = Persons
  • OCCURRENCES_TABLE = Occurrences
  • PHOTOS_TABLE = Photos

API Lambda Code

What this code does

This Lambda behaves like a tiny HTTP router:

  • reads method + path from the incoming request
  • returns JSON + CORS headers
  • supports 2 routes:
    • GET /persons
    • GET /persons/{personId}/photos

Part 1: Imports + DynamoDB table handles

This section connects to DynamoDB and prepares table objects.

import json
import os
from decimal import Decimal

import boto3
from boto3.dynamodb.conditions import Key

ddb = boto3.resource("dynamodb")

PERSONS_TABLE = ddb.Table(os.environ.get("PERSONS_TABLE", "Persons"))
OCC_TABLE = ddb.Table(os.environ.get("OCCURRENCES_TABLE", "Occurrences"))
PHOTOS_TABLE = ddb.Table(os.environ.get("PHOTOS_TABLE", "Photos"))

Why Key from boto3.dynamodb.conditions?

We use Key("personId").eq(person_id) to query the Occurrences table efficiently by partition key.

Part 2: JSON safe for DynamoDB numbers

DynamoDB numbers come back from boto3 as Decimal. Python's json.dumps() cannot serialize Decimal unless we convert it.

Add this helper once, and reuse it everywhere.

def _json_default(o):
    if isinstance(o, Decimal):
        # Whole numbers (like photoCount=2) should stay integers
        if o % 1 == 0:
            return int(o)
        return float(o)
    raise TypeError(f"Object of type {o.__class__.__name__} is not JSON serializable")

Why do we need this?

Without this, you may see: TypeError: Object of type Decimal is not JSON serializable. This keeps your API responses valid JSON for React.
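
As a quick check that runs without AWS, the sketch below pushes a hand-made item through json.dumps with and without the helper. The item values, including the "score" field, are made up for illustration and are not part of the workshop tables.

```python
import json
from decimal import Decimal


def _json_default(o):
    # Same helper as above: whole numbers become int, everything else float
    if isinstance(o, Decimal):
        if o % 1 == 0:
            return int(o)
        return float(o)
    raise TypeError(f"Object of type {o.__class__.__name__} is not JSON serializable")


# Hypothetical item shaped like a Persons row; boto3 returns numbers as Decimal
item = {"personId": "p-1", "photoCount": Decimal("2"), "score": Decimal("0.5")}

try:
    json.dumps(item)  # no default= hook, so Decimal is rejected
    failed_without_helper = False
except TypeError:
    failed_without_helper = True

encoded = json.dumps(item, default=_json_default)
print(failed_without_helper)  # True
print(encoded)                # {"personId": "p-1", "photoCount": 2, "score": 0.5}
```

Note that float() can lose precision for very long decimals; for counts and similar small values this is usually not a concern.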

Part 3: Helper to return JSON (with CORS)

Instead of repeating response formatting in every route, we create one helper:

def _resp(status: int, body: dict):
    return {
        "statusCode": status,
        "headers": {
            "Content-Type": "application/json",
            "Access-Control-Allow-Origin": "*",
            "Access-Control-Allow-Methods": "GET,OPTIONS",
            "Access-Control-Allow-Headers": "Content-Type",
        },
        # default=_json_default converts DynamoDB Decimal values (see Part 2)
        "body": json.dumps(body, default=_json_default),
    }

About CORS = *

This is convenient for development. Later, you should restrict this to your real frontend domain.
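
One possible way to restrict the origin later is an allowlist that echoes the caller's Origin header only when it is known. This is a minimal sketch, not the workshop's implementation: ALLOWED_ORIGINS, cors_headers and resp_for are hypothetical names, and the two origins are placeholders for your real frontend URLs.

```python
import json

# Hypothetical allowlist; replace with your real frontend origin(s)
ALLOWED_ORIGINS = {"https://app.example.com", "http://localhost:5173"}


def cors_headers(event: dict) -> dict:
    """Build CORS headers, echoing the caller's Origin only if it is allowlisted."""
    # Header names can arrive in any case, so normalise them first
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
    origin = headers.get("origin", "")
    out = {
        "Content-Type": "application/json",
        "Access-Control-Allow-Methods": "GET,OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type",
        # Tell caches that the response varies with the Origin header
        "Vary": "Origin",
    }
    if origin in ALLOWED_ORIGINS:
        out["Access-Control-Allow-Origin"] = origin
    return out


def resp_for(event: dict, status: int, body: dict) -> dict:
    # In the real handler you would still pass default=_json_default here
    return {"statusCode": status, "headers": cors_headers(event), "body": json.dumps(body)}


allowed = cors_headers({"headers": {"Origin": "https://app.example.com"}})
blocked = cors_headers({"headers": {"origin": "https://unknown.example"}})
print("Access-Control-Allow-Origin" in allowed)
print("Access-Control-Allow-Origin" in blocked)
```

When the origin is not allowlisted, the Access-Control-Allow-Origin header is simply omitted, and the browser blocks the page from reading the response.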

Part 4: Read method + path and handle OPTIONS

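Every HTTP request carries a method and a path. The handler reads them from either event shape: the HTTP API / Function URL payload (requestContext.http.method and rawPath) or the REST API payload (httpMethod and path).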

Browsers may send an OPTIONS request (CORS preflight) before the real GET.

def lambda_handler(event, context):
    method = event.get("requestContext", {}).get("http", {}).get("method") or event.get("httpMethod")
    path = event.get("rawPath") or event.get("path") or ""

    if method == "OPTIONS":
        return _resp(200, {"ok": True})

Browser sends a preflight:

method = OPTIONS
path = /persons

Lambda returns 200 so the browser allows the real GET request.
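
To see how the two lookups in the handler behave, the sketch below applies the same expressions to two trimmed-down sample events. The events are hand-written for illustration (real ones carry many more fields), and method_and_path is a hypothetical helper name.

```python
def method_and_path(event: dict) -> tuple:
    """Same expressions as in lambda_handler, pulled out so they can be tried alone."""
    method = event.get("requestContext", {}).get("http", {}).get("method") or event.get("httpMethod")
    path = event.get("rawPath") or event.get("path") or ""
    return method, path


# Shape used by HTTP APIs and Function URLs: method under requestContext.http, path in rawPath
http_api_event = {"rawPath": "/persons", "requestContext": {"http": {"method": "OPTIONS"}}}

# Shape used by REST APIs: httpMethod and path at the top level
rest_api_event = {"httpMethod": "GET", "path": "/persons/abc/photos", "requestContext": {}}

print(method_and_path(http_api_event))  # ('OPTIONS', '/persons')
print(method_and_path(rest_api_event))  # ('GET', '/persons/abc/photos')
print(method_and_path({}))              # (None, '')
```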

Part 5: Route 1 — GET /persons

This route returns the list of persons for your People grid.

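In this workshop version, we:

  • scan() the table, reading up to about 200 items
  • sort results by photoCount in Python
  • return the top 100

if method == "GET" and path == "/persons":
    items = []
    # First page; Limit caps how many items DynamoDB evaluates per call
    resp = PERSONS_TABLE.scan(Limit=200)
    items.extend(resp.get("Items", []))
    # Keep paging only while more data exists and we hold fewer than 200 items
    while "LastEvaluatedKey" in resp and len(items) < 200:
        resp = PERSONS_TABLE.scan(ExclusiveStartKey=resp["LastEvaluatedKey"], Limit=200)
        items.extend(resp.get("Items", []))

    # photoCount arrives as Decimal; int() gives a plain number to sort on
    items.sort(key=lambda x: int(x.get("photoCount", 0)), reverse=True)

    return _resp(200, {"persons": items[:100]})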

Why scan here?

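Scan reads every item it touches, so it gets slow and costly on large tables, but it's the simplest way to get started. Because this route stops after roughly 200 items, the sort only ranks the persons that were scanned; with more persons than that, the result is not a true top 100. Once you understand the flow, we can add a GSI (global secondary index) for efficient sorting later.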
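
If you want the sort to cover the whole table while it is still small, one option is to follow LastEvaluatedKey until it disappears. The sketch below assumes only that the table object has a scan() method shaped like boto3's; scan_all and FakeTable are hypothetical names, and FakeTable is an in-memory stand-in so the example runs without AWS.

```python
def scan_all(table, max_items=None):
    """Collect items page by page until DynamoDB stops returning LastEvaluatedKey."""
    items = []
    kwargs = {}
    while True:
        resp = table.scan(**kwargs)
        items.extend(resp.get("Items", []))
        if max_items is not None and len(items) >= max_items:
            return items[:max_items]  # optional safety cap
        last_key = resp.get("LastEvaluatedKey")
        if not last_key:
            return items  # no more pages
        kwargs["ExclusiveStartKey"] = last_key


class FakeTable:
    """In-memory stand-in for a boto3 Table; serves fixed pages from scan()."""

    def __init__(self, pages):
        self.pages = pages

    def scan(self, **kwargs):
        index = kwargs.get("ExclusiveStartKey", {}).get("page", 0)
        resp = {"Items": self.pages[index]}
        if index + 1 < len(self.pages):
            resp["LastEvaluatedKey"] = {"page": index + 1}
        return resp


# Two pages of made-up Persons rows
table = FakeTable([
    [{"personId": "a", "photoCount": 1}],
    [{"personId": "b", "photoCount": 5}],
])
people = scan_all(table)
people.sort(key=lambda x: int(x.get("photoCount", 0)), reverse=True)
top = [p["personId"] for p in people]
print(top)  # ['b', 'a']
```

With the real table you would call scan_all(PERSONS_TABLE). A full scan still costs read capacity for every item, so this only postpones the need for an index.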

Part 6: Route 2 — GET /persons/{personId}/photos

What this route does

When the UI opens a person's page, it needs:

  • which photos this person appears in
  • the thumbnail for each appearance
  • the original photo location (bucket + key) so the UI can load it later

We build that response by combining data from two tables:

  1. Occurrences (fast query by personId)
  2. Photos (lookup by photoId)

6.1: Validate the path and extract personId

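We only accept paths like:

  • /persons/<personId>/photos

# Runs inside the route check:
# if method == "GET" and path.startswith("/persons/") and path.endswith("/photos"):
parts = path.strip("/").split("/")   # "/persons/abc/photos" -> ["persons", "abc", "photos"]
if len(parts) != 3:
    return _resp(400, {"error": "bad path"})
person_id = parts[1]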

Example path:

/persons/0c7d4865.../photos

Extracted value:

person_id = "0c7d4865..."

Why do we parse the path manually?

We don't have API Gateway routing yet, so the Lambda matches the path itself. This keeps the logic simple and readable. Later, the HTTP layer can map this route automatically.
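
As a sketch of how the manual parsing could coexist with gateway routing later: API Gateway places matched route parameters in event["pathParameters"], so a helper can prefer that and fall back to a pattern match. The names extract_person_id and _PHOTOS_ROUTE are hypothetical, and the sketch assumes the route parameter is called personId.

```python
import re

# Exactly /persons/<non-empty id>/photos, nothing before or after
_PHOTOS_ROUTE = re.compile(r"/persons/([^/]+)/photos")


def extract_person_id(event: dict, path: str):
    """Return the personId from gateway path parameters or from the raw path, else None."""
    # Assumption: the gateway route is declared as /persons/{personId}/photos
    params = event.get("pathParameters") or {}
    if params.get("personId"):
        return params["personId"]
    match = _PHOTOS_ROUTE.fullmatch(path)
    return match.group(1) if match else None


print(extract_person_id({}, "/persons/abc/photos"))        # abc
print(extract_person_id({}, "/persons//photos"))           # None (empty id rejected)
print(extract_person_id({}, "/persons/abc/photos/extra"))  # None
print(extract_person_id({"pathParameters": {"personId": "xyz"}}, "/anything"))  # xyz
```

Unlike the split-based check, the pattern rejects an empty personId such as /persons//photos instead of passing an empty string to the query.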

6.2: Query Occurrences by personId

The Occurrences table is designed exactly for this query:

  • Partition key: personId
  • Sort key: photoId

So we can fetch a person's photos (up to 200 here, because of Limit) using a single query.

occ = OCC_TABLE.query(
    KeyConditionExpression=Key("personId").eq(person_id),
    Limit=200
).get("Items", [])

Example input:

person_id = "0c7d4865..."

Example result:

occ = [
  {"photoId": "fe40...", "thumbKey": "faces-thumbs/0c7d.../fe40_face_1.jpg", ...},
  ...
]

Why not scan Occurrences?

A scan reads the full table. A query reads only the partition you want. That's why we designed Occurrences with PK = personId.
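
The difference can be illustrated with a toy in-memory model. This is not how DynamoDB is implemented; it only counts how many rows each style has to examine. The rows and the scan_for and query_for helpers are made up for the example.

```python
from collections import defaultdict

# Hypothetical rows shaped like Occurrences items
rows = [
    {"personId": "p1", "photoId": "a"},
    {"personId": "p2", "photoId": "b"},
    {"personId": "p1", "photoId": "c"},
    {"personId": "p3", "photoId": "d"},
]

# Grouping by partition key keeps each person's rows together
partitions = defaultdict(list)
for row in rows:
    partitions[row["personId"]].append(row)


def scan_for(person_id):
    """Scan-style: examine every row and keep the matches."""
    examined = 0
    found = []
    for row in rows:
        examined += 1
        if row["personId"] == person_id:
            found.append(row)
    return found, examined


def query_for(person_id):
    """Query-style: go straight to one partition and read only its rows."""
    found = list(partitions.get(person_id, []))
    return found, len(found)


print(scan_for("p1")[1])   # 4 rows examined
print(query_for("p1")[1])  # 2 rows examined
```

Both calls return the same two rows for p1; only the amount of data read differs, and that gap widens as the table grows.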

6.3: Fetch the original photo info from Photos

Occurrences tells us photoId, but the UI also needs where the photo lives:

  • s3Bucket
  • s3Key

That info is stored in the Photos table, so we fetch it.

photos = []

for o in occ:
    photo_id = o.get("photoId")
    if not photo_id:
        continue  # skip malformed occurrence rows

    p = PHOTOS_TABLE.get_item(
        Key={"photoId": photo_id}
    ).get("Item")
    if not p:
        continue  # no Photos record for this id; skip it

Example lookup:

photo_id = "fe40..."
p = {
  "photoId": "fe40...",
  "s3Bucket": "beetroot-raw",
  "s3Key": "photos-raw/st5.jpg",
  ...
}

Why do we need Photos table here?

Occurrences is “person ↔ photo link”. Photos is the single source of truth for the original S3 path.

6.4: Build the response objects for the UI

Now we combine:

  • thumbKey from Occurrences
  • s3Bucket + s3Key from Photos
  • and return a clean list React can use

photos.append({
    "photoId": photo_id,
    "photoBucket": p.get("s3Bucket"),
    "photoKey": p.get("s3Key"),
    "thumbKey": o.get("thumbKey"),
})

From Occurrences:

o.thumbKey = "faces-thumbs/0c7d.../fe40_face_1.jpg"

From Photos:

p.s3Bucket = "beetroot-raw"
p.s3Key = "photos-raw/st5.jpg"

One UI-ready entry:

{
  "photoId": "fe40...",
  "photoBucket": "beetroot-raw",
  "photoKey": "photos-raw/st5.jpg",
  "thumbKey": "faces-thumbs/0c7d.../fe40_face_1.jpg"
}

6.5: Return the final JSON response

return _resp(200, {"personId": person_id, "photos": photos})

Student question: is this efficient?

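This is “good enough” for learning, but it makes one get_item call per occurrence, so latency grows with the number of photos. Later, we can optimize this with batch_get_item (the BatchGetItem operation, which accepts up to 100 keys per request) once the API works end-to-end.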
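
A rough sketch of what the batched version might look like, assuming the boto3 service resource's batch_get_item call and a Photos table keyed by photoId. The helper name batch_get_photos is hypothetical, and FakeDynamo is an in-memory stand-in so the example runs without AWS.

```python
def batch_get_photos(ddb, table_name, photo_ids, chunk_size=100):
    """Fetch Photos items in chunks (BatchGetItem allows at most 100 keys per request)."""
    # Drop empty ids and duplicates; duplicate keys in one request are rejected
    unique_ids = list(dict.fromkeys(pid for pid in photo_ids if pid))
    found = {}
    for start in range(0, len(unique_ids), chunk_size):
        chunk = unique_ids[start:start + chunk_size]
        request = {table_name: {"Keys": [{"photoId": pid} for pid in chunk]}}
        while request:
            resp = ddb.batch_get_item(RequestItems=request)
            for item in resp.get("Responses", {}).get(table_name, []):
                found[item["photoId"]] = item
            # Some keys may come back unprocessed; retry only those
            request = resp.get("UnprocessedKeys") or {}
    return found


class FakeDynamo:
    """In-memory stand-in for boto3.resource("dynamodb"), for illustration only."""

    def __init__(self, items):
        self.items = items
        self.calls = 0

    def batch_get_item(self, RequestItems):
        self.calls += 1
        (table_name, spec), = RequestItems.items()
        hits = [self.items[k["photoId"]] for k in spec["Keys"] if k["photoId"] in self.items]
        return {"Responses": {table_name: hits}, "UnprocessedKeys": {}}


fake = FakeDynamo({
    "a": {"photoId": "a", "s3Bucket": "beetroot-raw", "s3Key": "photos-raw/a.jpg"},
    "b": {"photoId": "b", "s3Bucket": "beetroot-raw", "s3Key": "photos-raw/b.jpg"},
})
result = batch_get_photos(fake, "Photos", ["a", "a", "missing", "b", None])
print(sorted(result))  # ['a', 'b']
print(fake.calls)      # 1
```

In the handler, the loop over occ would then look each photoId up in the returned dictionary instead of calling get_item. A production version would likely add a short backoff before retrying unprocessed keys, and the API role would need permission for dynamodb:BatchGetItem.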

Part 7: Default route (404)

If the request doesn't match either of the two routes:

return _resp(404, {"error": "not found", "path": path})

Lambda Code

beetroot-api/lambda_function.py

import json
import os
from decimal import Decimal

import boto3
from boto3.dynamodb.conditions import Key

ddb = boto3.resource("dynamodb")

PERSONS_TABLE = ddb.Table(os.environ.get("PERSONS_TABLE", "Persons"))
OCC_TABLE = ddb.Table(os.environ.get("OCCURRENCES_TABLE", "Occurrences"))
PHOTOS_TABLE = ddb.Table(os.environ.get("PHOTOS_TABLE", "Photos"))


def _json_default(o):
    if isinstance(o, Decimal):
        if o % 1 == 0:
            return int(o)
        return float(o)
    raise TypeError(f"Object of type {o.__class__.__name__} is not JSON serializable")


def _resp(status: int, body: dict):
    return {
        "statusCode": status,
        "headers": {
            "Content-Type": "application/json",
            "Access-Control-Allow-Origin": "*",
            "Access-Control-Allow-Methods": "GET,OPTIONS",
            "Access-Control-Allow-Headers": "Content-Type",
        },
        "body": json.dumps(body, default=_json_default),
    }


def lambda_handler(event, context):
    method = event.get("requestContext", {}).get("http", {}).get("method") or event.get("httpMethod")
    path = event.get("rawPath") or event.get("path") or ""

    if method == "OPTIONS":
        return _resp(200, {"ok": True})

    if method == "GET" and path == "/persons":
        items = []
        resp = PERSONS_TABLE.scan(Limit=200)
        items.extend(resp.get("Items", []))
        while "LastEvaluatedKey" in resp and len(items) < 200:
            resp = PERSONS_TABLE.scan(ExclusiveStartKey=resp["LastEvaluatedKey"], Limit=200)
            items.extend(resp.get("Items", []))

        items.sort(key=lambda x: int(x.get("photoCount", 0)), reverse=True)
        return _resp(200, {"persons": items[:100]})

    if method == "GET" and path.startswith("/persons/") and path.endswith("/photos"):
        parts = path.strip("/").split("/")
        if len(parts) != 3:
            return _resp(400, {"error": "bad path"})
        person_id = parts[1]

        occ = OCC_TABLE.query(
            KeyConditionExpression=Key("personId").eq(person_id),
            Limit=200
        ).get("Items", [])

        photos = []
        for o in occ:
            photo_id = o.get("photoId")
            if not photo_id:
                continue
            p = PHOTOS_TABLE.get_item(Key={"photoId": photo_id}).get("Item")
            if not p:
                continue
            photos.append({
                "photoId": photo_id,
                "photoBucket": p.get("s3Bucket"),
                "photoKey": p.get("s3Key"),
                "thumbKey": o.get("thumbKey"),
            })

        return _resp(200, {"personId": person_id, "photos": photos})

    return _resp(404, {"error": "not found", "path": path})
