Design YouTube/Netflix


Table of Contents
  1. Functional Requirements
  2. Nonfunctional Requirements
  3. Resource Estimation
  4. Storage Estimation
  5. Bandwidth Estimation
  6. Number of Servers Estimation
  7. Building Blocks
  8. High Level Design
  9. Server between client and encoder
  10. API design
  11. Detailed Design and Flow Using AWS
  12. Fulfilling requirements using AWS
  13. Follow-up Questions and Answers

Functional Requirements

Nonfunctional Requirements

Resource Estimation

Storage Estimation

Bandwidth Estimation

Number of Servers Estimation

Building Blocks

High Level Design - YouTube


  1. Client uploads video → request goes to Application Server.
  2. Server sends raw video to Encoder for multi-format transcoding (240p–4K, HLS/DASH).
  3. Server extracts and stores metadata (title, tags, thumbnails) in Metadata DB.
  4. Encoded video chunks are stored in Blob/Object Storage (e.g., S3).
  5. CDN (e.g., CloudFront) uses Blob Storage as its origin and caches video chunks at edge locations globally.
  6. Client requests video from CDN edge location.
  7. CDN serves cached chunks directly or fetches from Blob Storage if not cached.
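
Steps 6 and 7 amount to a pull-through cache. The sketch below is a minimal illustration of that idea, assuming an in-memory dict as blob storage. The EdgeCache class and the chunk key layout are hypothetical names for this example and are not part of any CDN API.

```python
from typing import Callable, Dict


class EdgeCache:
    """Toy stand-in for one CDN edge location (hypothetical, not a CloudFront API)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch_from_origin = fetch_from_origin
        self._cache: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, chunk_key: str) -> bytes:
        # Step 7: serve from the edge cache when the chunk is already there.
        if chunk_key in self._cache:
            self.hits += 1
            return self._cache[chunk_key]
        # Otherwise pull the chunk from blob storage and keep it for later requests.
        self.misses += 1
        data = self._fetch_from_origin(chunk_key)
        self._cache[chunk_key] = data
        return data


# An in-memory dict plays the role of blob storage (the origin). Assumed key layout.
blob_storage = {"vid_12345/720p/seg_0001.ts": b"chunk-bytes"}
edge = EdgeCache(lambda key: blob_storage[key])

first = edge.get("vid_12345/720p/seg_0001.ts")   # miss: fetched from the origin
second = edge.get("vid_12345/720p/seg_0001.ts")  # hit: served from the edge
```

A real CDN would also apply TTLs and eviction, which this sketch leaves out.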

Why there is a server between the client and the encoder
  1. Security & Authentication – Clients should not directly access internal encoding infrastructure; the server enforces auth, rate-limiting, validation, and quota checks.
  2. Load Balancing & Orchestration – Server decides which encoder worker should process the video; encoders usually run in a distributed cluster.
  3. Metadata Handling – Server extracts and stores metadata (title, description, tags, thumbnails) in a Metadata DB while passing video to encoder.
  4. Asynchronous Processing – Encoding is long-running, so the server enqueues the job in a message queue and returns immediately with a "processing" status; encoder workers pick up jobs asynchronously.
  5. Error Handling & Retries – If encoding fails, the server can retry, redirect to another encoder, or notify the client.
  6. Separation of Concerns – Server acts as API + control plane; encoder acts as data plane (CPU/GPU processing); both can scale independently.
  7. Audit & Analytics – Server logs uploads and processing details for monitoring, billing, and abuse prevention.
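
Points 4 and 5 (asynchronous processing and retries) can be sketched as follows. This is a rough single-process illustration, assuming a standard-library queue in place of a real message queue. The names EncodeJob, submit_upload, run_worker, MAX_ATTEMPTS and the "failed" status are assumptions made for the example; "processing" and "ready" are the statuses used elsewhere in these notes.

```python
import queue
from dataclasses import dataclass
from typing import Callable, Dict

MAX_ATTEMPTS = 3  # assumed retry budget


@dataclass
class EncodeJob:
    video_id: str
    attempts: int = 0


def submit_upload(video_id: str, jobs: "queue.Queue[EncodeJob]", status: Dict[str, str]) -> Dict[str, str]:
    """Server side: enqueue the job and answer immediately."""
    status[video_id] = "processing"
    jobs.put(EncodeJob(video_id))
    return {"videoId": video_id, "status": "processing"}


def run_worker(jobs: "queue.Queue[EncodeJob]", status: Dict[str, str], encode: Callable[[str], None]) -> None:
    """Encoder side: drain the queue, re-enqueueing failed jobs up to MAX_ATTEMPTS."""
    while not jobs.empty():
        job = jobs.get()
        job.attempts += 1
        try:
            encode(job.video_id)
            status[job.video_id] = "ready"
        except RuntimeError:
            if job.attempts < MAX_ATTEMPTS:
                jobs.put(job)  # retry later, possibly on another worker
            else:
                status[job.video_id] = "failed"


calls = {"n": 0}


def flaky_encode(video_id: str) -> None:
    # Simulated encoder that crashes on its first call only.
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("encoder crashed")


jobs: "queue.Queue[EncodeJob]" = queue.Queue()
status: Dict[str, str] = {}
response = submit_upload("vid_12345", jobs, status)
run_worker(jobs, status, flaky_encode)
```

The client gets its response before any encoding happens, and the one simulated failure is absorbed by a retry.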

API design
1. Upload Video API
  1. Method: POST /api/v1/videos
  2. Headers:
    • Authorization: Bearer <token>
    • Content-Type: multipart/form-data
  3. Body (multipart form fields, shown in JSON-like notation):
    {
        "title": "My Travel Vlog",
        "description": "Exploring Bali",
        "tags": ["travel", "vlog", "bali"],
        "file": <binary_video_file>
    }
        
  4. Response:
    {
        "videoId": "vid_12345",
        "status": "processing"
    }
        

2. Stream Video API
  1. Method: GET /api/v1/videos/{videoId}/stream
  2. Headers:
    • Authorization: Bearer <token> (optional for public videos)
  3. Query Params: quality=720p, format=HLS
  4. Response: Returns an .m3u8 playlist (for HLS); the video chunks it references are served via the CDN
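
For format=HLS, the response could be a master playlist listing one variant per quality. The sketch below shows roughly what generating it might look like. The rendition bitrates, the URL layout and the function name build_master_playlist are assumptions for illustration; the #EXTM3U and #EXT-X-STREAM-INF tags are standard HLS.

```python
from typing import List, Tuple

# (quality label, peak bandwidth in bits/s, resolution). Bitrates are assumed values.
RENDITIONS: List[Tuple[str, int, str]] = [
    ("240p", 400_000, "426x240"),
    ("720p", 2_800_000, "1280x720"),
    ("1080p", 5_000_000, "1920x1080"),
]


def build_master_playlist(video_id: str, base_url: str) -> str:
    """Return an HLS master playlist that points at one media playlist per rendition."""
    lines = ["#EXTM3U"]
    for name, bandwidth, resolution in RENDITIONS:
        lines.append(f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},RESOLUTION={resolution}")
        lines.append(f"{base_url}/{video_id}/{name}/index.m3u8")
    return "\n".join(lines) + "\n"


playlist = build_master_playlist("vid_12345", "https://cdn.example.com")
```

The player picks a variant based on measured throughput, which is how adaptive bitrate streaming follows from this one file.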

3. Search Videos API
  1. Method: GET /api/v1/videos/search
  2. Query Params:
    • q=travel vlog (URL-encoded on the wire, e.g. q=travel+vlog)
    • page=1
    • limit=20
  3. Response:
    {
        "results": [
            {
                "videoId": "vid_12345",
                "title": "My Travel Vlog",
                "thumbnailUrl": "https://cdn.example.com/thumbs/vid_12345.jpg",
                "views": 100000
            }
        ]
    }
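
The q, page and limit parameters can be illustrated with a small sketch. It assumes page numbering starts at 1 and that results are sliced from an already ranked list; the helper names build_search_url and paginate are hypothetical.

```python
from typing import Any, Dict, List
from urllib.parse import urlencode


def build_search_url(q: str, page: int = 1, limit: int = 20) -> str:
    """Client side: URL-encode the query parameters."""
    return "/api/v1/videos/search?" + urlencode({"q": q, "page": page, "limit": limit})


def paginate(items: List[Any], page: int = 1, limit: int = 20) -> Dict[str, List[Any]]:
    """Server side: return the slice of ranked results for a 1-based page."""
    if page < 1 or limit < 1:
        raise ValueError("page and limit must be >= 1")
    start = (page - 1) * limit
    return {"results": items[start:start + limit]}


url = build_search_url("travel vlog")
second_page = paginate(list(range(50)), page=2, limit=20)
```

Offset-based paging like this is simple but gets slow for deep pages; a cursor-based variant is a common alternative.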
        

4. Like / Dislike Video API
  1. Method:
    • POST /api/v1/videos/{videoId}/like
    • POST /api/v1/videos/{videoId}/dislike
  2. Headers: Authorization: Bearer <token>
  3. Response:
    {
        "videoId": "vid_12345",
        "likes": 1023,
        "dislikes": 45
    }
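
One way to keep these counts correct when a client retries is to store one reaction per user and derive the totals. The sketch below assumes exactly that; the ReactionStore class is a hypothetical in-memory stand-in for whatever database holds the reactions.

```python
from typing import Dict


class ReactionStore:
    """In-memory stand-in: video_id -> user_id -> 'like' or 'dislike'."""

    def __init__(self) -> None:
        self._reactions: Dict[str, Dict[str, str]] = {}

    def react(self, video_id: str, user_id: str, reaction: str) -> Dict[str, object]:
        if reaction not in ("like", "dislike"):
            raise ValueError("reaction must be 'like' or 'dislike'")
        # Overwriting the user's previous reaction makes repeated calls idempotent.
        self._reactions.setdefault(video_id, {})[user_id] = reaction
        return self.counts(video_id)

    def counts(self, video_id: str) -> Dict[str, object]:
        values = list(self._reactions.get(video_id, {}).values())
        return {
            "videoId": video_id,
            "likes": values.count("like"),
            "dislikes": values.count("dislike"),
        }


store = ReactionStore()
store.react("vid_12345", "usr_1111", "like")
after_repeat = store.react("vid_12345", "usr_1111", "like")     # same user, same reaction
after_switch = store.react("vid_12345", "usr_1111", "dislike")  # user changes their mind
```

At real scale the totals would usually be kept as counters rather than recomputed on each call.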
        

5. Comment on Video API
  1. Method: POST /api/v1/videos/{videoId}/comments
  2. Headers: Authorization: Bearer <token>
  3. Body:
    {
        "comment": "Amazing video! Keep it up 👏"
    }
  4. Response:
    {
        "commentId": "cmt_98765",
        "videoId": "vid_12345",
        "userId": "usr_1111",
        "comment": "Amazing video! Keep it up 👏",
        "createdAt": "2025-08-18T10:00:00Z"
    }
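
A handler producing this response might look roughly like the sketch below. It assumes the comment ID and the current time are supplied by the caller so the function stays deterministic; create_comment is a hypothetical name, and the user ID would normally come from the bearer token.

```python
from datetime import datetime, timezone
from typing import Dict


def create_comment(video_id: str, user_id: str, text: str,
                   comment_id: str, now: datetime) -> Dict[str, str]:
    """Validate the comment text and build the response body shown above."""
    text = text.strip()
    if not text:
        raise ValueError("comment must not be empty")
    return {
        "commentId": comment_id,
        "videoId": video_id,
        "userId": user_id,
        "comment": text,
        # ISO 8601 timestamp in UTC with a trailing 'Z'.
        "createdAt": now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


comment = create_comment(
    "vid_12345", "usr_1111", "Amazing video! Keep it up 👏",
    comment_id="cmt_98765",
    now=datetime(2025, 8, 18, 10, 0, 0, tzinfo=timezone.utc),
)
```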
        

6. Get Thumbnails API
  1. Method: GET /api/v1/videos/{videoId}/thumbnails
  2. Response:
    {
        "thumbnails": [
            {"resolution": "120x90", "url": "https://cdn.example.com/thumbs/vid_12345_120x90.jpg"},
            {"resolution": "480x360", "url": "https://cdn.example.com/thumbs/vid_12345_480x360.jpg"},
            {"resolution": "1280x720", "url": "https://cdn.example.com/thumbs/vid_12345_1280x720.jpg"}
        ]
    }
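
The response above follows a regular URL pattern, so it can be derived from the video ID alone. The sketch below assumes that pattern holds for every video; thumbnails_response and THUMBNAIL_SIZES are hypothetical names.

```python
from typing import Dict, List

THUMBNAIL_SIZES = ["120x90", "480x360", "1280x720"]  # sizes taken from the sample response


def thumbnails_response(video_id: str,
                        cdn_base: str = "https://cdn.example.com/thumbs") -> Dict[str, List[Dict[str, str]]]:
    """Build the thumbnails payload using the <videoId>_<resolution>.jpg naming shown above."""
    return {
        "thumbnails": [
            {"resolution": size, "url": f"{cdn_base}/{video_id}_{size}.jpg"}
            for size in THUMBNAIL_SIZES
        ]
    }


thumbs = thumbnails_response("vid_12345")
```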
                

Detailed Design and Flow Using AWS


📌 Overview

🏗️ Components & Responsibilities
▶️ Watch Flow
  1. User requests a video (client → CloudFront)
  2. CloudFront checks cache:
    • Hit → Serve directly from edge
    • Miss → Forward to origin (S3 or ALB)
  3. App server fetches metadata (DynamoDB/RDS)
  4. App server generates signed URLs / playback manifest
  5. Client streams video via CloudFront (adaptive bitrate)
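
Step 4 relies on signed URLs. The sketch below illustrates the general idea with an HMAC over the path and an expiry time. This is a simplified, hypothetical scheme for illustration only: CloudFront's own signed URLs use a different format and key type, and sign_url, verify_url and SECRET are assumed names.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"  # assumed shared secret between app server and edge


def _signature(path: str, expires_at: int, secret: bytes) -> str:
    return hmac.new(secret, f"{path}:{expires_at}".encode(), hashlib.sha256).hexdigest()


def sign_url(path: str, expires_at: int, secret: bytes = SECRET) -> str:
    """App server: append an expiry timestamp and a signature to the path."""
    query = urlencode({"expires": expires_at, "sig": _signature(path, expires_at, secret)})
    return f"{path}?{query}"


def verify_url(url: str, now: int, secret: bytes = SECRET) -> bool:
    """Edge: accept the request only if the signature matches and the URL has not expired."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    expected = _signature(parsed.path, expires_at, secret)
    return hmac.compare_digest(sig, expected) and now < expires_at


signed = sign_url("/videos/vid_12345/720p/index.m3u8", expires_at=1_700_000_600)
ok = verify_url(signed, now=1_700_000_000)
expired = verify_url(signed, now=1_700_000_601)
tampered = verify_url(signed.replace("vid_12345", "vid_99999"), now=1_700_000_000)
```

Because the path is part of the signed message, a URL issued for one video cannot be reused for another.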

⬆️ Upload Flow
  1. User initiates an upload (client → ALB → App server)
  2. App server generates pre-signed S3 upload URL
  3. User uploads raw video directly to the S3 Upload Bucket using that URL
  4. App server updates metadata (status = uploaded)
  5. A MediaConvert job is triggered → encodes video → stores output in S3 Blob Storage
  6. Metadata updated (status = ready, add final URLs)
  7. CloudFront serves transcoded video to users
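
The metadata status changes in this flow can be modelled as a small state machine. The sketch below assumes the statuses "uploaded" and "ready" from the steps above and "processing" from the upload API response; "pending" and "failed" are added assumptions, as are the names ALLOWED_TRANSITIONS and advance.

```python
from typing import Dict, Set

# Which status may follow which. "pending" and "failed" are assumed extras.
ALLOWED_TRANSITIONS: Dict[str, Set[str]] = {
    "pending": {"uploaded"},
    "uploaded": {"processing"},
    "processing": {"ready", "failed"},
    "failed": {"processing"},  # a failed encode may be retried
    "ready": set(),
}


def advance(metadata: Dict[str, str], new_status: str) -> Dict[str, str]:
    """Return a copy of the metadata record with the new status, rejecting illegal jumps."""
    current = metadata["status"]
    if new_status not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot go from {current!r} to {new_status!r}")
    return {**metadata, "status": new_status}


record = {"videoId": "vid_12345", "status": "pending"}
for next_status in ("uploaded", "processing", "ready"):
    record = advance(record, next_status)
```

Rejecting illegal transitions guards against out-of-order events, such as a late "uploaded" notification arriving after encoding has finished.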

💾 Data & Storage Patterns
⚡ Scalability & Reliability
🔒 Security & Monitoring

📌 Fulfilling requirements using AWS
⚡ Low Latency / Smooth Streaming

📈 Scalability

✅ Availability

🔒 Reliability

Follow-up Questions and Answers
1. Why do we need a server between the client and the encoder?

2. How does the video upload process work end-to-end?

3. How is scalability achieved in this architecture?

4. How is low latency and smooth streaming ensured?

5. What databases are used and why?

6. How is data durability and availability ensured?

7. How is security managed in this architecture?

8. How does the system handle video encoding failures?

9. Why use both S3 Upload Bucket and Blob Storage?

10. How are user interactions such as likes and comments handled?

11. What kind of CDN caching strategy is used?

12. How does the system support different video qualities and formats?