Automatic Emby Media Pruning (Movies & TV Episodes) Using a Custom Docker Script – Any-User Watch Logic

Posted December 1, 2025

Hi everyone,

I wanted to share a solution I built with a lot of help from ChatGPT.
I’m not a programmer, but I needed a smart, automatic way to clean up old movies and TV episodes on my Emby server.

Since I couldn’t find anything that did what I wanted, I ended up putting together a Docker-based pruning script. It’s been working really well for me, so I’m posting it here in case it helps someone else.

What This Does

This script automatically deletes media based on real viewing activity across all users on the server.
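The "any user" part boils down to merging every user's play data for an item into one most-recent play date and one total play count. A minimal sketch of that idea, assuming each user's data arrives as a dict with `PlayCount` and `LastPlayedDate` keys (the function name `summarize_plays` and the sample data are made up for illustration):

```python
from datetime import datetime, timezone


def summarize_plays(userdata_per_user):
    """Combine per-user UserData dicts into (most_recent_play, total_plays)."""
    most_recent = None
    total_plays = 0
    for ud in userdata_per_user:
        total_plays += ud.get("PlayCount") or 0
        raw = ud.get("LastPlayedDate")
        if not raw:
            continue  # this user never played the item
        played = datetime.fromisoformat(raw.replace("Z", "+00:00"))
        if played.tzinfo is None:
            played = played.replace(tzinfo=timezone.utc)  # assume UTC
        if most_recent is None or played > most_recent:
            most_recent = played
    return most_recent, total_plays


# Made-up sample: one user watched in March, one in June, one never.
sample = [
    {"PlayCount": 1, "LastPlayedDate": "2025-03-01T20:00:00+00:00"},
    {"PlayCount": 2, "LastPlayedDate": "2025-06-15T21:30:00+00:00"},
    {"PlayCount": 0},
]
latest, plays = summarize_plays(sample)
print(latest.isoformat(), plays)
```

Only the newest play across all users counts, so one person rewatching something keeps it alive for everyone.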

The retention values below are fully customisable – see the Environment Variables section.

Movies

Deleted if no user has watched them in the last 120 days
If never watched by anyone, deleted only once they were added more than 120 days ago (based on DateCreated)

TV Episodes

If played at least once → delete after 90 days with no play by any user
If never played → delete only if added more than 180 days ago (based on DateCreated)
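The rules above can be written as one small decision function. This is only a sketch of the logic with the default values; the names `RULES` and `is_stale` are made up for illustration and the dates are supplied by hand rather than fetched from Emby:

```python
from datetime import datetime, timedelta, timezone

# Defaults matching the retention values quoted above.
RULES = {
    "Movie": {"played_days": 120, "never_played_days": 120},
    "Episode": {"played_days": 90, "never_played_days": 180},
}


def is_stale(item_type, last_played, date_created, now, rules=RULES):
    """Return True if the item would be pruned under the rules above."""
    rule = rules.get(item_type)
    if rule is None:
        return False  # unknown item types are never pruned
    if last_played is not None:
        return last_played < now - timedelta(days=rule["played_days"])
    if date_created is None:
        return False  # no play history and no creation date: keep it
    return date_created < now - timedelta(days=rule["never_played_days"])


now = datetime(2025, 12, 1, tzinfo=timezone.utc)


def days_ago(n):
    return now - timedelta(days=n)


print(is_stale("Episode", days_ago(100), days_ago(400), now))  # True: last play is over 90 days old
print(is_stale("Episode", None, days_ago(100), now))           # False: unplayed, but under 180 days old
print(is_stale("Movie", days_ago(30), days_ago(400), now))     # False: played last month
```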

The script talks to Emby via the API and calls DELETE /Items/{Id}, so items are removed cleanly from the library and the filesystem. No orphaned metadata.

Why I Built It

I share my Emby with multiple users
Storage was growing fast
I wanted cleanup based on what people actually watch, not just file dates
I wanted different rules for movies and TV
And I’m not a programmer, so it needed to be simple and repeatable

This solution is:

Docker-based (works nicely on Unraid (my setup), Synology, Linux Docker, etc.)
Configurable via environment variables
Safe to test using dry-run mode

Setup Overview

I’ll show the steps using a Linux/Unraid style setup with nano, but the same idea works on any box with Docker.

Folder structure

Create a folder for the script and Dockerfile, for example:

mkdir -p /mnt/user/appdata/emby-prune
cd /mnt/user/appdata/emby-prune

Step 1 – Create the Dockerfile (with `nano`)

In a terminal on your server:

cd /mnt/user/appdata/emby-prune
nano Dockerfile

Paste this into nano:

FROM python:3.12-alpine

WORKDIR /app

RUN pip install --no-cache-dir requests

COPY emby_prune.py /app/emby_prune.py

CMD ["python", "/app/emby_prune.py"]

Save & exit:

Ctrl + O, Enter to save
Ctrl + X to exit

Step 2 – Create the Python Script (with `nano`)

Still in the same folder:

nano emby_prune.py

Paste the entire script below into nano:

import os
import sys
import requests
from datetime import datetime, timedelta, timezone

def parse_iso(dt_str):
    if not dt_str:
        return None
    # Emby uses ISO 8601 with offset, e.g. 2024-02-10T12:34:56.0000000+00:00
    dt_str = dt_str.replace("Z", "+00:00")
    try:
        dt = datetime.fromisoformat(dt_str)
        # Ensure timezone-aware; assume UTC if none
        if dt.tzinfo is None:
            dt = dt.replace(tzinfo=timezone.utc)
        return dt
    except Exception:
        return None

def get_env(name, default=None, required=False):
    value = os.getenv(name, default)
    if required and not value:
        print(f"Missing required env var: {name}", file=sys.stderr)
        sys.exit(1)
    return value

def fetch_users(session, emby_url, api_key):
    """Fetch all Emby users."""
    params = {"api_key": api_key}
    r = session.get(f"{emby_url}/Users", params=params, timeout=30)
    r.raise_for_status()
    users = r.json()
    print(f"Found {len(users)} users in Emby.")
    return users

def fetch_user_item_userdata(session, emby_url, api_key, user_id, item_id):
    """Fetch per-user item data (UserData) for given user and item."""
    params = {
        "api_key": api_key,
        "Fields": "UserData"
    }
    r = session.get(f"{emby_url}/Users/{user_id}/Items/{item_id}", params=params, timeout=30)
    r.raise_for_status()
    data = r.json()
    return data.get("UserData") or {}

def main():
    emby_url = get_env("EMBY_URL", required=True)       # e.g. http://emby:8096/emby
    api_key  = get_env("EMBY_API_KEY", required=True)

    # Defaults: movies 120d, TV played 90d, TV never-played 180d
    movie_days            = int(get_env("MOVIE_PRUNE_DAYS", get_env("PRUNE_DAYS", "120")))
    tv_days               = int(get_env("TV_PRUNE_DAYS", "90"))
    tv_never_played_days  = int(get_env("TV_NEVER_PLAYED_DAYS", "180"))
    dry_run               = get_env("DRY_RUN", "true").lower() == "true"

    now_utc = datetime.now(timezone.utc)
    cutoff_movies   = now_utc - timedelta(days=movie_days)
    cutoff_tv       = now_utc - timedelta(days=tv_days)
    cutoff_tv_never = now_utc - timedelta(days=tv_never_played_days)

    print(f"Emby prune starting (movies + TV, any-user mode)")
    print(f"  URL                     : {emby_url}")
    print(f"  Movie cutoff            : {cutoff_movies.isoformat()} (older than {movie_days} days)")
    print(f"  TV cutoff (played)      : {cutoff_tv.isoformat()} (older than {tv_days} days)")
    print(f"  TV cutoff (never played): {cutoff_tv_never.isoformat()} (older than {tv_never_played_days} days)")
    print(f"  Dry run                 : {dry_run}")
    print("")

    session = requests.Session()
    base_params = {"api_key": api_key}

    # 1) Get all users
    try:
        users = fetch_users(session, emby_url, api_key)
    except Exception as e:
        print(f"ERROR: Failed to fetch users: {e}", file=sys.stderr)
        sys.exit(1)

    if not users:
        print("No users found – aborting.")
        sys.exit(0)

    page_size = 200
    start_index = 0
    total_deleted = 0
    total_candidates = 0

    while True:
        # 2) Get movies + episodes
        params = {
            **base_params,
            "Recursive": "true",
            "IncludeItemTypes": "Movie,Episode",
            "Fields": "Path,DateCreated,Type,SeriesName,SeasonName,IndexNumber,ParentIndexNumber",
            "StartIndex": start_index,
            "Limit": page_size,
        }

        r = session.get(f"{emby_url}/Items", params=params, timeout=60)
        r.raise_for_status()
        data = r.json()

        items = data.get("Items", [])
        total_records = data.get("TotalRecordCount", 0)

        if not items:
            break

        for item in items:
            item_id = item.get("Id")
            name = item.get("Name", "Unknown")
            path = item.get("Path")
            date_created = parse_iso(item.get("DateCreated"))
            item_type = item.get("Type", "")

            # For episodes, we can build a bit more context (optional)
            series_name = item.get("SeriesName")
            season_num = item.get("ParentIndexNumber")
            episode_num = item.get("IndexNumber")

            # 3) Look at all users' UserData for this item
            most_recent_play = None
            total_plays_all_users = 0

            for user in users:
                user_id = user.get("Id")
                user_name = user.get("Name", "Unknown")

                try:
                    ud = fetch_user_item_userdata(session, emby_url, api_key, user_id, item_id)
                except Exception as e:
                    print(f"Warning: failed to fetch UserData for item {item_id} and user {user_name}: {e}")
                    continue

                lp = parse_iso(ud.get("LastPlayedDate"))
                pc = ud.get("PlayCount", 0) or 0
                total_plays_all_users += pc

                if lp:
                    if (most_recent_play is None) or (lp > most_recent_play):
                        most_recent_play = lp

            stale = False
            reason = ""

            if item_type == "Movie":
                # Movie rule: stale if no plays in movie_days
                if most_recent_play:
                    if most_recent_play < cutoff_movies:
                        stale = True
                        reason = f"movie: last played by some user at {most_recent_play.isoformat()}"
                else:
                    if date_created and date_created < cutoff_movies:
                        stale = True
                        reason = f"movie: never played by any user; created {date_created.isoformat()}"

            elif item_type == "Episode":
                # TV rule:
                # - if played: stale if last play older than tv_days
                # - if never played: stale only if older than tv_never_played_days
                if most_recent_play:
                    if most_recent_play < cutoff_tv:
                        stale = True
                        reason = f"episode: last played by some user at {most_recent_play.isoformat()}"
                else:
                    if date_created and date_created < cutoff_tv_never:
                        stale = True
                        reason = f"episode: never played by any user; created {date_created.isoformat()}"

            # Unknown type: skip
            if not stale:
                continue

            total_candidates += 1

            # Nicely formatted label for episodes
            if item_type == "Episode" and series_name:
                if season_num is not None and episode_num is not None:
                    label = f"{series_name} S{season_num:02d}E{episode_num:02d} - {name}"
                else:
                    label = f"{series_name} - {name}"
            else:
                label = name

            print(f"[STALE] {label}")
            print(f"        Type      : {item_type}")
            print(f"        ID        : {item_id}")
            print(f"        Path      : {path}")
            print(f"        Reason    : {reason}")
            print(f"        Total plays (all users) : {total_plays_all_users}")
            if dry_run:
                print("        Action    : SKIP (dry-run)\n")
                continue

            del_params = {"api_key": api_key}
            del_url = f"{emby_url}/Items/{item_id}"
            try:
                resp = session.delete(del_url, params=del_params, timeout=60)
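                if resp.status_code in (200, 204):
                    total_deleted += 1
                    # The deleted item drops out of later /Items results, so the
                    # remaining items shift down by one. Step the paging offset
                    # back so the next page does not skip an item.
                    start_index -= 1
                    print("        Action    : DELETED\n")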
                else:
                    print(f"        Action    : FAILED (status {resp.status_code})\n")
            except Exception as e:
                print(f"        Action    : FAILED ({e})\n")

        start_index += len(items)
        if start_index >= total_records:
            break

    print("")
    print(f"Scan complete.")
    print(f"  Stale candidates : {total_candidates}")
    print(f"  Deleted          : {total_deleted} (dry_run={dry_run})")

if __name__ == "__main__":
    main()

Save & exit:

Ctrl + O, Enter
Ctrl + X
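One thing worth knowing about a real (non-dry-run) run: the script pages through /Items by offset, and a deleted item no longer appears in later results, so everything behind it moves down. The toy simulation below uses a plain Python list in place of the Emby library (no API calls; `prune_with_paging` and its parameters are made up for illustration) to show how a fixed offset can skip items, and how stepping the offset back by one per deletion avoids that:

```python
def prune_with_paging(library, should_delete, page_size, compensate):
    """Page through `library` by offset, deleting matches as we go.

    Returns the list of items that were actually inspected.
    """
    inspected = []
    start = 0
    while True:
        page = library[start:start + page_size]  # stands in for GET /Items
        if not page:
            break
        deleted_here = 0
        for item in page:
            inspected.append(item)
            if should_delete(item):
                library.remove(item)  # stands in for DELETE /Items/{Id}
                deleted_here += 1
        start += len(page)
        if compensate:
            start -= deleted_here  # remaining items shifted down by this much
    return inspected


def is_even(n):
    return n % 2 == 0


naive = prune_with_paging(list(range(10)), is_even, page_size=4, compensate=False)
fixed = prune_with_paging(list(range(10)), is_even, page_size=4, compensate=True)
print("fixed offset inspected      :", naive)
print("compensated offset inspected:", fixed)
```

With the fixed offset, items 4 and 5 are never looked at in this example. Since the script runs on a schedule, anything skipped would normally be picked up on a later run, but it is worth keeping in mind when comparing a real run against the dry-run output.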

Step 3 – Build the Docker Image

From the same folder:

cd /mnt/user/appdata/emby-prune
docker build -t emby-prune .

You should see something like:

Successfully built ...
Successfully tagged emby-prune:latest

Environment Variables (Retention Rules)

These control how aggressive the pruning is:

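Variable	Meaning	Default
`MOVIE_PRUNE_DAYS`	Movies: delete after X days without a play, or X days after being added if never played (falls back to `PRUNE_DAYS` if unset)	120
`TV_PRUNE_DAYS`	TV: delete after X days without a play, if the episode was played at least once	90
`TV_NEVER_PLAYED_DAYS`	TV: delete after X days if never played	180
`DRY_RUN`	Preview only (`true` / `false`)	true
`EMBY_URL`	Your Emby server base URL, e.g. http://server:8096/emby	none (required)
`EMBY_API_KEY`	An API key generated in Emby	none (required)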
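Note that the script treats any `DRY_RUN` value other than `true` (in any letter case) as "delete for real", so a typo such as `yes` or `ture` switches deletion on. If you want stricter handling, here is a rough sketch of readers that reject anything ambiguous. The helper names `read_days` and `read_flag` are hypothetical and not part of the script above; the `env` argument only exists so the example can run without touching real environment variables:

```python
import os


def read_days(name, default, env=None):
    """Read a positive whole number of days, falling back to `default`."""
    env = os.environ if env is None else env
    raw = env.get(name, "").strip() or str(default)
    try:
        days = int(raw)
    except ValueError:
        raise ValueError(f"{name} must be a whole number of days, got {raw!r}") from None
    if days <= 0:
        raise ValueError(f"{name} must be greater than 0, got {days}")
    return days


def read_flag(name, default, env=None):
    """Read a true/false switch and refuse values that are not clearly either."""
    env = os.environ if env is None else env
    raw = env.get(name, "").strip().lower()
    if not raw:
        return default
    if raw in ("true", "1", "yes"):
        return True
    if raw in ("false", "0", "no"):
        return False
    raise ValueError(f"{name} must be true or false, got {raw!r}")


# Made-up settings standing in for the container's environment.
settings = {"TV_PRUNE_DAYS": "45", "DRY_RUN": "yes"}
print(read_days("TV_PRUNE_DAYS", 90, settings))     # 45
print(read_days("MOVIE_PRUNE_DAYS", 120, settings))  # 120 (not set, default used)
print(read_flag("DRY_RUN", True, settings))          # True
```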

Step 4 – Test with a Dry Run (Safe Mode)

This does not delete anything, just prints what would be removed:

docker run --rm \
  --name emby-prune-test \
  -e EMBY_URL="http://YOURSERVER:8096/emby" \
  -e EMBY_API_KEY="YOUR_API_KEY" \
  -e MOVIE_PRUNE_DAYS="120" \
  -e TV_PRUNE_DAYS="90" \
  -e TV_NEVER_PLAYED_DAYS="180" \
  -e DRY_RUN="true" \
  emby-prune

Check the output
Make sure the stuff marked [STALE] looks sensible
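On a big library the dry-run output can be long. If you save it to a file (for example by adding `> dry-run.log` to the end of the command), a few lines of Python can total the [STALE] entries per type. This is just a sketch that relies on the layout the script prints; `summarize_dry_run` and the sample text are made up for illustration:

```python
from collections import Counter


def summarize_dry_run(log_text):
    """Count [STALE] entries per item type in the script's printed output."""
    counts = Counter()
    in_entry = False
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[STALE]"):
            in_entry = True
        elif in_entry and stripped.startswith("Type") and ":" in stripped:
            counts[stripped.split(":", 1)[1].strip()] += 1
            in_entry = False
    return counts


# Made-up sample in the same layout the script prints.
sample_log = "\n".join([
    "[STALE] Some Movie",
    "        Type      : Movie",
    "        ID        : 101",
    "        Action    : SKIP (dry-run)",
    "",
    "[STALE] Some Show S01E02 - Second Episode",
    "        Type      : Episode",
    "        ID        : 202",
    "        Action    : SKIP (dry-run)",
    "",
    "[STALE] Some Show S01E03 - Third Episode",
    "        Type      : Episode",
    "        ID        : 203",
    "        Action    : SKIP (dry-run)",
])

for item_type, count in sorted(summarize_dry_run(sample_log).items()):
    print(f"{item_type}: {count}")
```

To use it on a saved file, read the file's text with `open("dry-run.log").read()` and pass that in instead of `sample_log`.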

Step 5 – Run for Real (Optional)

Once you’re happy with the dry-run output, you can run it for real:

docker run --rm \
  --name emby-prune-run \
  -e EMBY_URL="http://YOURSERVER:8096/emby" \
  -e EMBY_API_KEY="YOUR_API_KEY" \
  -e MOVIE_PRUNE_DAYS="120" \
  -e TV_PRUNE_DAYS="90" \
  -e TV_NEVER_PLAYED_DAYS="180" \
  -e DRY_RUN="false" \
  emby-prune

To automate it, run the same command from your system's scheduler and set the schedule to something like:

Weekly, Monday, 03:00

Final Notes

This has been running nicely on my Emby setup:

Library stays tidy
Storage doesn’t explode
Old/watched stuff eventually ages out
New and recently watched content is kept
All rules are tweakable via environment variables

I may update this when I get time to add the following (no promises!):

Whitelists / never-delete tags
Notifications
Logging to a file

Posted December 1, 2025

Thanks for sharing.

Posted December 20, 2025

thx! i will try

Posted January 22

Exactly what I am looking for, but I don't see separate variables for movies, i.e. a MOVIE_NEVER_PLAYED_DAYS alongside MOVIE_PRUNE_DAYS.

I also tried adding an INTERVAL variable so the container reruns the script every so often. How are you scheduling this (I am using compose)?

Dockerfile

FROM python:3.12-alpine
WORKDIR /app
RUN pip install --no-cache-dir requests
COPY emby_prune.py /app/emby_prune.py
CMD ["sh", "-c", "\
HOURS=${INTERVAL:-24}; \
SLEEP_SECONDS=$((HOURS*3600)); \
echo \"[emby-prune] Running every ${HOURS} hours\"; \
while true; do \
python /app/emby_prune.py; \
sleep ${SLEEP_SECONDS}; \
done \
"]
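An alternative to the shell loop above is to do the repeating in Python. A rough sketch, assuming the same idea of an interval in hours; `run_every` is a hypothetical helper, and its `sleep` and `max_runs` parameters exist only so the example finishes instantly instead of waiting:

```python
import time


def run_every(job, interval_hours, sleep=time.sleep, max_runs=None):
    """Call job(), wait interval_hours, repeat. max_runs=None means forever."""
    runs = 0
    while max_runs is None or runs < max_runs:
        try:
            job()
        except Exception as exc:  # keep the loop alive if one run fails
            print(f"[emby-prune] run failed: {exc}")
        runs += 1
        if max_runs is None or runs < max_runs:
            sleep(interval_hours * 3600)
    return runs


# Demo: record the waits instead of actually sleeping.
waits = []
count = run_every(lambda: print("pruning..."), interval_hours=24,
                  sleep=waits.append, max_runs=3)
print(count, waits)
```

In the real script this would mean calling something like `run_every(main, int(os.getenv("INTERVAL", "24")))` instead of `main()`. Keep in mind that the `sys.exit(1)` calls in the script raise SystemExit, which this loop does not catch, so a fatal error would still stop the container.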
