API Lambda (Persons + Photos)
Create a small HTTP Lambda that returns People grid + Person detail data for the beetroot React app.
Goal
By the end of this phase, you'll have a backend API Lambda that can power two React screens:
- People grid
  GET /persons → returns a list of { personId, photoCount, repThumbKey }
- Person detail
  GET /persons/{personId}/photos → returns a list of { photoId, photoBucket, photoKey, thumbKey }
Why do we need an API at all?
Your React app should not query DynamoDB directly. A small API Lambda acts like a safe “middle layer” that reads DynamoDB and returns only what the UI needs.
What we already have
From earlier phases, your DynamoDB data is now ready:
- Persons: personId, photoCount, repThumbKey
- Occurrences: for a personId, gives photoId + thumbKey
- Photos: for a photoId, gives the original photo's s3Bucket + s3Key
Create API Lambda
Create function
- Go to Lambda → Create function
- Select Author from scratch
- Fill:
  - Function name: beetroot-api
  - Runtime: Python 3.14
- Under Permissions:
  - Click Change default execution role
  - Select Use an existing role
  - Choose your role: beetroot-api-role (the role you created)
- Click Create function
Add environment variables
PERSONS_TABLE=Persons
OCCURRENCES_TABLE=Occurrences
PHOTOS_TABLE=Photos
API Lambda Code
What this code does
This Lambda behaves like a tiny HTTP router:
- reads method + path from the incoming request
- returns JSON + CORS headers
- supports 2 routes:
  GET /persons
  GET /persons/{personId}/photos
Part 1: Imports + DynamoDB table handles
This section connects to DynamoDB and prepares table objects.
import json
import os
from decimal import Decimal

import boto3
from boto3.dynamodb.conditions import Key

ddb = boto3.resource("dynamodb")
PERSONS_TABLE = ddb.Table(os.environ.get("PERSONS_TABLE", "Persons"))
OCC_TABLE = ddb.Table(os.environ.get("OCCURRENCES_TABLE", "Occurrences"))
PHOTOS_TABLE = ddb.Table(os.environ.get("PHOTOS_TABLE", "Photos"))
Why Key from boto3.dynamodb.conditions?
We use Key("personId").eq(person_id) to query the Occurrences table efficiently by partition key.
Part 2: JSON safe for DynamoDB numbers
DynamoDB numbers come back from boto3 as Decimal.
Python's json.dumps() cannot serialize Decimal unless we convert it.
Add this helper once, and reuse it everywhere.
def _json_default(o):
    if isinstance(o, Decimal):
        # Whole numbers (like photoCount=2) should stay integers
        if o % 1 == 0:
            return int(o)
        return float(o)
    raise TypeError(f"Object of type {o.__class__.__name__} is not JSON serializable")
Why do we need this?
Without this, you may see:
TypeError: Object of type Decimal is not JSON serializable
This helper keeps your API responses valid JSON for React.
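To see the helper in action without touching AWS, the sketch below serializes a made-up item shaped like a Persons row. The personId value and the score attribute are hypothetical and only there to show the integer and float branches.

```python
import json
from decimal import Decimal


def _json_default(o):
    # Same helper as above: whole numbers become int, everything else float.
    if isinstance(o, Decimal):
        if o % 1 == 0:
            return int(o)
        return float(o)
    raise TypeError(f"Object of type {o.__class__.__name__} is not JSON serializable")


# Hypothetical item: boto3 returns DynamoDB numbers as Decimal.
item = {"personId": "p-1", "photoCount": Decimal("2"), "score": Decimal("0.5")}

# Without default=..., json.dumps(item) raises TypeError.
encoded = json.dumps(item, default=_json_default)
print(encoded)
```

Note that float conversion can lose precision for very long decimals; for counts and simple scores that is usually acceptable.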
Part 3: Helper to return JSON (with CORS)
Instead of repeating response formatting in every route, we create one helper:
def _resp(status: int, body: dict):
    return {
        "statusCode": status,
        "headers": {
            "Content-Type": "application/json",
            "Access-Control-Allow-Origin": "*",
            "Access-Control-Allow-Methods": "GET,OPTIONS",
            "Access-Control-Allow-Headers": "Content-Type",
        },
        # default=_json_default converts DynamoDB Decimal values (see Part 2)
        "body": json.dumps(body, default=_json_default),
    }
About CORS = *
This is convenient for development. Later, you should restrict this to your real frontend domain.
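One possible way to tighten this later is to read the allowed origin from configuration instead of hard-coding it. The sketch below assumes a hypothetical ALLOWED_ORIGIN environment variable (not part of this workshop's setup) and a placeholder domain; it falls back to * when nothing is configured.

```python
import json
import os


def cors_headers(allowed_origin=None):
    # ALLOWED_ORIGIN is a hypothetical variable name used only in this sketch.
    origin = allowed_origin or os.environ.get("ALLOWED_ORIGIN", "*")
    return {
        "Content-Type": "application/json",
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": "GET,OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type",
    }


def resp(status, body, allowed_origin=None):
    # Same response shape as _resp, with the origin made configurable.
    return {
        "statusCode": status,
        "headers": cors_headers(allowed_origin),
        "body": json.dumps(body),
    }


# Placeholder domain; replace with your real frontend origin.
example = resp(200, {"ok": True}, allowed_origin="https://app.example.com")
print(example["headers"]["Access-Control-Allow-Origin"])
```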
Part 4: Read method + path and handle OPTIONS
API calls always include a method and path.
Browsers may send an OPTIONS request (CORS preflight) before the real GET.
def lambda_handler(event, context):
    method = event.get("requestContext", {}).get("http", {}).get("method") or event.get("httpMethod")
    path = event.get("rawPath") or event.get("path") or ""
    if method == "OPTIONS":
        return _resp(200, {"ok": True})
Browser sends a preflight:
method = OPTIONS
path = /persons
Lambda returns 200 so the browser allows the real GET request.
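The two lookups in the handler exist because the event shape depends on what sits in front of the Lambda. API Gateway HTTP APIs with payload format 2.0 (and Lambda function URLs) put the method under requestContext.http.method and the path in rawPath, while REST API proxy events use httpMethod and path. The sketch below runs the same extraction against two trimmed-down sample events; real events carry many more fields, and the helper name method_and_path is made up for illustration.

```python
def method_and_path(event):
    # Try the payload 2.0 location first, then fall back to the 1.0 fields.
    method = event.get("requestContext", {}).get("http", {}).get("method") or event.get("httpMethod")
    path = event.get("rawPath") or event.get("path") or ""
    return method, path


# Trimmed sample events (assumed shapes, only the fields the handler reads).
v2_event = {"rawPath": "/persons", "requestContext": {"http": {"method": "GET"}}}
v1_event = {"httpMethod": "OPTIONS", "path": "/persons", "requestContext": {}}

print(method_and_path(v2_event))
print(method_and_path(v1_event))
print(method_and_path({}))
```

An empty event yields a method of None and an empty path, so the handler falls through to its 404 response instead of crashing.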
Part 5: Route 1 — GET /persons
This route returns the list of persons for your People grid.
In this workshop version, we do:
- scan() the table
- sort results by photoCount in Python
    if method == "GET" and path == "/persons":
        items = []
        resp = PERSONS_TABLE.scan(Limit=200)
        items.extend(resp.get("Items", []))
        while "LastEvaluatedKey" in resp and len(items) < 200:
            resp = PERSONS_TABLE.scan(ExclusiveStartKey=resp["LastEvaluatedKey"], Limit=200)
            items.extend(resp.get("Items", []))
        items.sort(key=lambda x: int(x.get("photoCount", 0)), reverse=True)
        return _resp(200, {"persons": items[:100]})
Why scan here?
Scan is not ideal for huge tables, but it's the simplest way to get started. Once you understand the flow, we can add a GSI for efficient sorting later.
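One caveat with the loop above: it stops once about 200 items have been collected, so on a larger table the "top 100" is only the top of whatever the scan returned first. A minimal sketch of a helper that keeps following LastEvaluatedKey until the scan is exhausted (or a safety cap is reached) might look like this. The names scan_all, max_items, and FakeTable are assumptions for illustration; in the Lambda you would pass PERSONS_TABLE instead of the fake.

```python
def scan_all(table, max_items=5000, page_size=200):
    """Collect items page by page until the scan ends or max_items is reached."""
    items = []
    kwargs = {"Limit": page_size}
    while True:
        resp = table.scan(**kwargs)
        items.extend(resp.get("Items", []))
        last_key = resp.get("LastEvaluatedKey")
        if not last_key or len(items) >= max_items:
            break
        kwargs["ExclusiveStartKey"] = last_key
    return items[:max_items]


class FakeTable:
    """In-memory stand-in that mimics scan() paging, for local testing only.
    Real DynamoDB returns the item's key attributes as LastEvaluatedKey, not an offset."""

    def __init__(self, rows):
        self.rows = rows

    def scan(self, Limit, ExclusiveStartKey=None):
        start = ExclusiveStartKey["offset"] if ExclusiveStartKey else 0
        end = start + Limit
        out = {"Items": self.rows[start:end]}
        if end < len(self.rows):
            out["LastEvaluatedKey"] = {"offset": end}
        return out


# Made-up data: 450 persons, more than the original loop would read.
rows = [{"personId": f"p{i}", "photoCount": i} for i in range(450)]
people = scan_all(FakeTable(rows))
people.sort(key=lambda x: int(x.get("photoCount", 0)), reverse=True)
top = people[:100]
print(len(people), top[0]["personId"])
```

This still reads the whole table on every request, so it only postpones the need for the GSI mentioned above.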
Part 6: Route 2 — GET /persons/{personId}/photos
What this route does
When the UI opens a person's page, it needs:
- which photos this person appears in
- the thumbnail for each appearance
- the original photo location (bucket + key) so the UI can load it later
We build that response by combining data from two tables:
- Occurrences (fast query by personId)
- Photos (lookup by photoId)
6.1: Validate the path and extract personId
We only accept paths like:
/persons/<personId>/photos
    if method == "GET" and path.startswith("/persons/") and path.endswith("/photos"):
        parts = path.strip("/").split("/")
        if len(parts) != 3:
            return _resp(400, {"error": "bad path"})
        person_id = parts[1]
Example path:
/persons/0c7d4865.../photos
Extracted value:
person_id = "0c7d4865..."
Why do we parse the path manually?
Right now, we don't have API Gateway routing yet. We're keeping the Lambda logic simple and readable. Later, the HTTP layer can map this route automatically.
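The parsing step can be tried on its own. The sketch below is a slightly stricter variant, not the handler's exact logic: it also checks the first and last segments (which the full handler covers with startswith and endswith) and rejects an empty personId, since a path like /persons//photos would otherwise pass an empty string to DynamoDB. The helper name extract_person_id and the sample ID are made up for illustration.

```python
def extract_person_id(path):
    """Return the personId from /persons/<personId>/photos, or None if the path does not match."""
    parts = path.strip("/").split("/")
    if len(parts) != 3:
        return None
    if parts[0] != "persons" or parts[2] != "photos" or not parts[1]:
        return None
    return parts[1]


# Hypothetical paths to exercise the matching and non-matching cases.
for candidate in ["/persons/abc123/photos", "/persons/abc123/photos/", "/persons/photos", "/persons//photos"]:
    print(candidate, "->", extract_person_id(candidate))
```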
6.2: Query Occurrences by personId
The Occurrences table is designed exactly for this query:
- Partition key: personId
- Sort key: photoId
So we can fetch all photos for a person using a single query.
        occ = OCC_TABLE.query(
            KeyConditionExpression=Key("personId").eq(person_id),
            Limit=200
        ).get("Items", [])
Example:
person_id = "0c7d4865..."
occ = [
    {"photoId": "fe40...", "thumbKey": "faces-thumbs/0c7d.../fe40_face_1.jpg", ...},
    ...
]
Why not scan Occurrences?
A scan reads the full table. A query reads only the partition you want. That's why we designed Occurrences with PK = personId.
6.3: Fetch the original photo info from Photos
Occurrences tells us photoId, but the UI also needs where the photo lives:
- s3Bucket
- s3Key
That info is stored in the Photos table, so we fetch it.
        photos = []
        for o in occ:
            photo_id = o.get("photoId")
            if not photo_id:
                continue
            p = PHOTOS_TABLE.get_item(
                Key={"photoId": photo_id}
            ).get("Item")
            if not p:
                continue
Example:
photo_id = "fe40..."
p = {
    "photoId": "fe40...",
    "s3Bucket": "beetroot-raw",
    "s3Key": "photos-raw/st5.jpg",
    ...
}
Why do we need the Photos table here?
Occurrences is “person ↔ photo link”. Photos is the single source of truth for the original S3 path.
6.4: Build the response objects for the UI
Now we combine:
- thumbKey from Occurrences
- s3Bucket + s3Key from Photos
- and return a clean list React can use
            photos.append({
                "photoId": photo_id,
                "photoBucket": p.get("s3Bucket"),
                "photoKey": p.get("s3Key"),
                "thumbKey": o.get("thumbKey"),
            })
From Occurrences:
o.thumbKey = "faces-thumbs/0c7d.../fe40_face_1.jpg"
From Photos:
p.s3Bucket = "beetroot-raw"
p.s3Key = "photos-raw/st5.jpg"
One UI-ready entry:
{
  "photoId": "fe40...",
  "photoBucket": "beetroot-raw",
  "photoKey": "photos-raw/st5.jpg",
  "thumbKey": "faces-thumbs/0c7d.../fe40_face_1.jpg"
}
6.5: Return the final JSON response
        return _resp(200, {"personId": person_id, "photos": photos})
Student question: is this efficient?
This is “good enough” for learning, but it makes one get_item call per occurrence, so a person with 200 photos costs 200 sequential round trips to DynamoDB. Later, we can optimize this with BatchGet once the API works end-to-end.
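As a rough sketch of that later optimization: DynamoDB's BatchGetItem accepts up to 100 keys per request, rejects duplicate keys, and may return some keys as UnprocessedKeys, so the caller has to chunk, de-duplicate, and retry. The helper name batch_get_photos and the FakeDdb stand-in are hypothetical; the helper takes the boto3 DynamoDB resource (ddb above) and the table name as arguments, and assumes photoId is the Photos table's only key attribute, as the get_item call suggests.

```python
def batch_get_photos(ddb, table_name, photo_ids, chunk_size=100):
    """Fetch Photos items for many photoIds; returns {photoId: item}."""
    # De-duplicate while keeping order, and drop empty IDs.
    unique_ids = list(dict.fromkeys(pid for pid in photo_ids if pid))
    found = {}
    for i in range(0, len(unique_ids), chunk_size):
        keys = [{"photoId": pid} for pid in unique_ids[i:i + chunk_size]]
        request = {table_name: {"Keys": keys}}
        while request:
            resp = ddb.batch_get_item(RequestItems=request)
            for item in resp.get("Responses", {}).get(table_name, []):
                found[item["photoId"]] = item
            # Keys DynamoDB could not process come back here; a production
            # version would sleep with backoff before retrying.
            request = resp.get("UnprocessedKeys") or None
    return found


class FakeDdb:
    """Local stand-in: answers at most 60 keys per call to exercise the retry path."""

    def __init__(self, rows):
        self.rows = rows
        self.calls = 0

    def batch_get_item(self, RequestItems):
        self.calls += 1
        (name, spec), = RequestItems.items()
        now, later = spec["Keys"][:60], spec["Keys"][60:]
        items = [self.rows[k["photoId"]] for k in now if k["photoId"] in self.rows]
        return {"Responses": {name: items},
                "UnprocessedKeys": {name: {"Keys": later}} if later else {}}


# Made-up sample data: 150 photos, plus a duplicate ID and an unknown ID in the request.
rows = {f"ph{i}": {"photoId": f"ph{i}", "s3Bucket": "beetroot-raw", "s3Key": f"photos-raw/{i}.jpg"}
        for i in range(150)}
fake = FakeDdb(rows)
result = batch_get_photos(fake, "Photos", [f"ph{i}" for i in range(150)] + ["ph0", "missing"])
print(len(result), fake.calls)
```

In the route, the loop over occ would then look up result.get(photo_id) instead of calling get_item, skipping occurrences whose photo is missing just as the current code does.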
Part 7: Default route (404)
If the request doesn't match either of the two routes:
    return _resp(404, {"error": "not found", "path": path})
Lambda Code
import json
import os
from decimal import Decimal

import boto3
from boto3.dynamodb.conditions import Key

ddb = boto3.resource("dynamodb")
PERSONS_TABLE = ddb.Table(os.environ.get("PERSONS_TABLE", "Persons"))
OCC_TABLE = ddb.Table(os.environ.get("OCCURRENCES_TABLE", "Occurrences"))
PHOTOS_TABLE = ddb.Table(os.environ.get("PHOTOS_TABLE", "Photos"))


def _json_default(o):
    if isinstance(o, Decimal):
        if o % 1 == 0:
            return int(o)
        return float(o)
    raise TypeError(f"Object of type {o.__class__.__name__} is not JSON serializable")


def _resp(status: int, body: dict):
    return {
        "statusCode": status,
        "headers": {
            "Content-Type": "application/json",
            "Access-Control-Allow-Origin": "*",
            "Access-Control-Allow-Methods": "GET,OPTIONS",
            "Access-Control-Allow-Headers": "Content-Type",
        },
        "body": json.dumps(body, default=_json_default),
    }


def lambda_handler(event, context):
    method = event.get("requestContext", {}).get("http", {}).get("method") or event.get("httpMethod")
    path = event.get("rawPath") or event.get("path") or ""

    if method == "OPTIONS":
        return _resp(200, {"ok": True})

    if method == "GET" and path == "/persons":
        items = []
        resp = PERSONS_TABLE.scan(Limit=200)
        items.extend(resp.get("Items", []))
        while "LastEvaluatedKey" in resp and len(items) < 200:
            resp = PERSONS_TABLE.scan(ExclusiveStartKey=resp["LastEvaluatedKey"], Limit=200)
            items.extend(resp.get("Items", []))
        items.sort(key=lambda x: int(x.get("photoCount", 0)), reverse=True)
        return _resp(200, {"persons": items[:100]})

    if method == "GET" and path.startswith("/persons/") and path.endswith("/photos"):
        parts = path.strip("/").split("/")
        if len(parts) != 3:
            return _resp(400, {"error": "bad path"})
        person_id = parts[1]

        occ = OCC_TABLE.query(
            KeyConditionExpression=Key("personId").eq(person_id),
            Limit=200
        ).get("Items", [])

        photos = []
        for o in occ:
            photo_id = o.get("photoId")
            if not photo_id:
                continue
            p = PHOTOS_TABLE.get_item(Key={"photoId": photo_id}).get("Item")
            if not p:
                continue
            photos.append({
                "photoId": photo_id,
                "photoBucket": p.get("s3Bucket"),
                "photoKey": p.get("s3Key"),
                "thumbKey": o.get("thumbKey"),
            })
        return _resp(200, {"personId": person_id, "photos": photos})

    return _resp(404, {"error": "not found", "path": path})