
Match Faces

Use Rekognition SearchFacesByImage to check if each cropped face already exists in the collection.

Goal

For every cropped face thumbnail, the ingestion Lambda should:

  1. Run SearchFacesByImage against your Rekognition collection
  2. Log one of these outcomes:
    • Match: returns an existing FaceId
    • NoMatch: no similar face found (we handle this in the next phase)

Important Note

SearchFacesByImage can only match faces that are already indexed in the collection. If the collection is empty, you will see NoMatch for everything — that’s expected until the next phase (IndexFaces).

Setup

Environment variables

  • REKOGNITION_COLLECTION_ID = beetroot-faces
  • FACE_MATCH_THRESHOLD = 95

Why 95?

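A higher threshold is safer: it produces fewer wrong groupings (two different people merged under one FaceId), at the cost of more “new person” cases when the same person fails to match. For comparison, Rekognition's default FaceMatchThreshold is 80.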
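
Because the threshold arrives as an environment variable, a mistyped value such as 950 would otherwise only surface as a Rekognition error at search time. A minimal sketch of a validating reader follows. The helper name read_threshold and the decision to reject out-of-range values are assumptions for illustration, not part of the original handler; the 0 to 100 range reflects that Rekognition expresses the threshold as a percentage.

```python
import os


def read_threshold(env=os.environ, default="95"):
    """Parse FACE_MATCH_THRESHOLD as a percentage in [0, 100]."""
    raw = env.get("FACE_MATCH_THRESHOLD", default)
    value = float(raw)  # raises ValueError on non-numeric input
    if not 0.0 <= value <= 100.0:
        raise ValueError(
            f"FACE_MATCH_THRESHOLD must be between 0 and 100, got {raw!r}"
        )
    return value
```

Passing a plain dict instead of os.environ makes the helper easy to exercise outside Lambda, for example read_threshold({"FACE_MATCH_THRESHOLD": "90"}).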

Search Code

What this update does

This phase adds one step inside the face-cropping loop:

  • Use the in-memory cropped face bytes (BytesIO) to search the collection
  • Log whether it matches an existing face

We do not create people records yet. This phase is only: match vs no-match logging.

Part 1: Read config once

Add these lines near the top of lambda_handler, before the loop over faces, so the environment variables are read once per invocation rather than once per face.

PHOTOS_TABLE_NAME = os.environ.get("PHOTOS_TABLE", "Photos")
RAW_PREFIX = os.environ.get("RAW_PREFIX", "photos-raw/")

collection_id = os.environ["REKOGNITION_COLLECTION_ID"] 
threshold = float(os.environ.get("FACE_MATCH_THRESHOLD", "95")) 

Collection ID is required

Use os.environ["REKOGNITION_COLLECTION_ID"] (not os.environ.get) so the function fails fast with a KeyError if the variable is missing.

Part 2: Search after cropping

Place the code blocks below right after:

out = BytesIO()
face_im.save(out, format="JPEG", quality=90)
out.seek(0)

2.1 The Search request

search_resp = rek.search_faces_by_image(
    CollectionId=collection_id,
    Image={"Bytes": out.getvalue()},
    MaxFaces=1,
    FaceMatchThreshold=threshold,
)

Request parameters:

  • CollectionId: your collection (ex: beetroot-faces)
  • Image.Bytes: the cropped face JPEG (in-memory)
  • MaxFaces=1: return only the best match
  • FaceMatchThreshold=threshold: require high similarity (95 unless FACE_MATCH_THRESHOLD overrides it)

Response fields:

  • FaceMatches: list of matches, highest similarity first (empty if none)
  • If present, the top item contains:
    • Face.FaceId
    • Similarity
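
To make the response fields concrete, here is a hand-written sample in the shape of a SearchFacesByImage response, trimmed to the fields this phase reads. The FaceId and Similarity values are invented for illustration, and summarize_search is a hypothetical helper name, so treat this as a sketch rather than real output.

```python
# Hand-written samples, trimmed to the fields used in this phase.
# The FaceId and Similarity values are invented for illustration.
sample_match = {
    "FaceMatches": [
        {
            "Similarity": 99.1,
            "Face": {"FaceId": "11111111-2222-3333-4444-555555555555"},
        }
    ]
}
sample_no_match = {"FaceMatches": []}


def summarize_search(search_resp):
    """Return ("Match", face_id, similarity) or ("NoMatch", None, None)."""
    matches = search_resp.get("FaceMatches", [])
    if not matches:
        return ("NoMatch", None, None)
    top = matches[0]  # with MaxFaces=1 there is at most one entry
    return ("Match", top["Face"]["FaceId"], top.get("Similarity"))


print(summarize_search(sample_match))
print(summarize_search(sample_no_match))
```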

2.2 Read the result and log

matches = search_resp.get("FaceMatches", [])
if matches:
    top = matches[0]
    face_id = top["Face"]["FaceId"]
    similarity = top.get("Similarity")
    print(f"Match: idx={idx} faceId={face_id} similarity={similarity}")
else:
    print(f"NoMatch: idx={idx} threshold={threshold}")
    # Index code insert here

Search Code Snippet

out = BytesIO()
face_im.save(out, format="JPEG", quality=90)
out.seek(0)

face_bytes = out.getvalue()

# 1) Search for a match
search_resp = rek.search_faces_by_image(
    CollectionId=collection_id,
    Image={"Bytes": face_bytes},
    MaxFaces=1,
    FaceMatchThreshold=threshold,
)

# 2) Match or no match
matches = search_resp.get("FaceMatches", [])
if matches:
    top = matches[0]
    person_id = top["Face"]["FaceId"]  # personId == FaceId (your chosen rule)
    similarity = top.get("Similarity")
    print(f"Match: idx={idx} personId={person_id} similarity={similarity}")
else:
    print(f"NoMatch: idx={idx} threshold={threshold}")
    # 3) No match -> index the face into the collection (next phase)
    # Index code insert here
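
One case the snippet above does not cover: SearchFacesByImage raises InvalidParameterException when it cannot detect a face in the supplied image, which can happen with very small or blurry crops. A minimal sketch of a wrapper that treats that case as a skip follows. The function name search_face and the choice to return None are assumptions for illustration; rek is the boto3 Rekognition client already used above, and rek.exceptions is the client's standard attribute for its modeled exceptions.

```python
def search_face(rek, collection_id, face_bytes, threshold):
    """Return the top FaceMatches entry, or None if there is no match
    or Rekognition rejects the crop as invalid input."""
    try:
        resp = rek.search_faces_by_image(
            CollectionId=collection_id,
            Image={"Bytes": face_bytes},
            MaxFaces=1,
            FaceMatchThreshold=threshold,
        )
    except rek.exceptions.InvalidParameterException:
        # Raised when no face is detected in the image (among other
        # invalid-input cases); skip this crop instead of failing the run.
        return None
    matches = resp.get("FaceMatches", [])
    return matches[0] if matches else None
```

Note that this sketch returns None for both "no match" and "no face detected". If the next phase indexes every unmatched crop, it may be worth distinguishing the two so that crops without a detectable face are not sent to IndexFaces.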

Common issues

AccessDeniedException → your Lambda role is missing rekognition:SearchFacesByImage.

ResourceNotFoundException → wrong REKOGNITION_COLLECTION_ID or wrong region.

NoMatch for every face → normal if you haven’t indexed any faces yet (next phase).
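
The exception names above are the error codes that boto3 attaches to a ClientError under exc.response["Error"]["Code"]. A minimal sketch of turning that code into a log hint follows. HINTS and hint_for are hypothetical names, and the hint text only restates the list above.

```python
HINTS = {
    "AccessDeniedException": "Lambda role is missing rekognition:SearchFacesByImage",
    "ResourceNotFoundException": "wrong REKOGNITION_COLLECTION_ID or wrong region",
}


def hint_for(exc):
    """Map a ClientError-style exception to a troubleshooting hint."""
    # botocore's ClientError exposes the parsed error under
    # .response["Error"]["Code"]; anything else falls through to the default.
    response = getattr(exc, "response", None) or {}
    code = response.get("Error", {}).get("Code", "")
    return HINTS.get(code, f"unrecognized error code: {code or 'n/a'}")
```

In the handler this could be used inside an except botocore.exceptions.ClientError block around the search call, printing hint_for(exc) before re-raising.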
