Twitter Notification and Real-Time Push System Design Detailed Guide

Introduction

Notifications and real-time pushes are key interaction mechanisms on Twitter/X: they carry events such as likes, @mentions, and replies so that users receive updates promptly. With hundreds of millions of daily active users (DAU) and billions of interactions per day, the system must support high-concurrency pushes, low-latency delivery, and high availability. This article walks through the architecture options for notification delivery, their trade-offs, engineering implementation details, and common interview follow-up questions. The design is based on asynchronous decoupling (Kafka) and real-time communication (WebSocket), and aims to balance resource consumption against user experience.


1. Requirements and Challenges

  • High-Frequency Interactions: Billions of notifications daily (such as likes/@), with peak TPS in the tens of thousands; hot events can trigger millions of pushes instantly.

  • Real-Time Performance: Delivery latency under 1 s, even though mobile clients reconnect frequently (more than 10% of users).

  • Resource Contention: Each long-lived connection holds more than 1 KB of memory and CPU can exceed 80% at peak; queue backlogs can cause notification loss.

  • Consistency and Filtering: Eventual consistency can tolerate second-level delays, but active users must be distinguished to avoid invalid pushes.

  • Fault Tolerance: Rate-limit and degrade gracefully during DDoS attacks or network partitions; support multiple channels (Web/mobile).


2. Scheme Comparison and Trade-offs

2.1 WebSocket (Push Mode)

  • Principle: Establish a persistent bidirectional connection where the server actively pushes notifications, similar to a real-time phone call: once there’s a message, the system directly “calls” the user’s connection.

  • Advantages: Real-time low latency (<1 s), efficient bandwidth usage.

  • Disadvantages: Complex connection management, high memory/CPU under high concurrency; high reconnection overhead on disconnections.

2.2 Polling (Pull Mode)

  • Principle: The client periodically polls the server for new notifications, similar to checking a mailbox: the user asks “Any new messages?” every few seconds.

  • Advantages: Simple implementation, no need for long connections.

  • Disadvantages: High latency (up to one polling interval, about half an interval on average), bandwidth waste (many empty polls); unsuitable for real-time scenarios.

2.3 Hybrid Mode (Push + Pull)

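  • Principle: Use WebSocket push for online users and fall back to FCM/APNs for offline or mobile users. FCM and APNs are themselves push services; the pull half is the client fetching its pending notifications when it next comes online. It is similar to smart mail: instant notifications when online, batch delivery when offline.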

  • Advantages: Balances real-time and resources, reduces invalid pushes.

  • Disadvantages: Complex multi-channel management, consistency challenges.

Comparison Table:

| Scheme | Advantages | Disadvantages | Applicable Scenario Comparison |
|---|---|---|---|
| WebSocket | Real-time delivery <1 s; low bandwidth consumption | High connection memory; high DDoS risk | Superior to Polling in interaction-intensive scenarios, but requires rate limiting for high-concurrency scaling. |
| Polling | Simple implementation, stateless; easy fault tolerance | High latency (> a few seconds); high bandwidth/server pressure | Suitable for low-frequency notifications, but poor real-time performance; not recommended as primary. |
| Hybrid Mode | Balances online/offline; resource optimization, delivery >99% | Complex implementation, multi-channel synchronization; filtering logic overhead | Best for large scale: reduces pressure by 50% compared to pure push, high real-time performance. |


3. Recommended Architecture

3.1 Notification Type Classification

  • Distinguish high-priority (@mentions/replies) and low-priority (likes), prioritize hot notifications.

  • Prioritize pushes for active users (Redis online status, TTL=5 min).
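The priority split above can be sketched with a priority queue. This is a minimal illustration, not the platform's actual implementation: the `PriorityDispatch` and `Pending` names, the type labels ("mention", "reply", "like"), and the rule that ties are broken by creation time are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class PriorityDispatch {
    // Declaration order matters: HIGH sorts before LOW.
    enum Priority { HIGH, LOW }

    // Hypothetical pending notification; only the fields needed for ordering.
    static final class Pending {
        final String type;            // illustrative labels: "mention", "reply", "like"
        final long createdAtMillis;

        Pending(String type, long createdAtMillis) {
            this.type = type;
            this.createdAtMillis = createdAtMillis;
        }

        // @mentions and replies are high priority; everything else is low.
        Priority priority() {
            boolean high = "mention".equals(type) || "reply".equals(type);
            return high ? Priority.HIGH : Priority.LOW;
        }

        long createdAtMillis() {
            return createdAtMillis;
        }
    }

    // Returns the notifications with HIGH before LOW, oldest first within a priority.
    static List<Pending> drainInPriorityOrder(List<Pending> incoming) {
        Comparator<Pending> order = Comparator
            .comparing(Pending::priority)
            .thenComparingLong(Pending::createdAtMillis);
        PriorityQueue<Pending> queue = new PriorityQueue<>(order);
        queue.addAll(incoming);
        List<Pending> ordered = new ArrayList<>();
        while (!queue.isEmpty()) {
            ordered.add(queue.poll());
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<Pending> ordered = drainInPriorityOrder(Arrays.asList(
            new Pending("like", 1L),
            new Pending("mention", 3L),
            new Pending("reply", 2L)));
        for (Pending p : ordered) {
            System.out.println(p.type); // reply, mention, like
        }
    }
}
```

In a real deployment the same ordering could also be achieved with separate Kafka topics per priority; the in-process queue here only shows the ordering rule.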

3.2 Asynchronous Push

  • After interaction triggers, deliver to Kafka queue; consumers filter and batch push.

  • Merge notifications (e.g., “5 people liked”) to reduce frequency.
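The merge step can be illustrated with a small aggregator. This is a sketch under assumptions: the `NotificationMerger` and `Event` names, the grouping key (type plus tweet ID), and the summary wording are hypothetical. Counting distinct actors rather than raw events means a redelivered event does not inflate the number.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class NotificationMerger {
    // Hypothetical event shape; field names are illustrative.
    static final class Event {
        final long fromUserId;
        final String type;    // e.g. "like", "reply", "mention"
        final long tweetId;

        Event(long fromUserId, String type, long tweetId) {
            this.fromUserId = fromUserId;
            this.type = type;
            this.tweetId = tweetId;
        }
    }

    static String verbFor(String type) {
        if ("like".equals(type)) return "liked";
        if ("reply".equals(type)) return "replied to";
        if ("mention".equals(type)) return "mentioned you in";
        return "interacted with";
    }

    // Groups events by (type, tweetId) and emits one summary line per group.
    static List<String> merge(List<Event> events) {
        // LinkedHashMap keeps groups in first-seen order; the Set drops repeated actors.
        Map<String, Set<Long>> actorsByGroup = new LinkedHashMap<>();
        for (Event e : events) {
            String groupKey = e.type + ":" + e.tweetId;
            actorsByGroup.computeIfAbsent(groupKey, k -> new HashSet<>()).add(e.fromUserId);
        }
        List<String> summaries = new ArrayList<>();
        for (Map.Entry<String, Set<Long>> group : actorsByGroup.entrySet()) {
            String[] parts = group.getKey().split(":", 2);
            int actors = group.getValue().size();
            String who = actors == 1 ? "1 person" : actors + " people";
            summaries.add(who + " " + verbFor(parts[0]) + " tweet " + parts[1]);
        }
        return summaries;
    }

    public static void main(String[] args) {
        List<String> summaries = merge(Arrays.asList(
            new Event(1L, "like", 42L),
            new Event(2L, "like", 42L),
            new Event(2L, "like", 42L),   // duplicate delivery of the same like
            new Event(3L, "like", 42L),
            new Event(7L, "reply", 42L)));
        for (String line : summaries) {
            System.out.println(line);
        }
    }
}
```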

3.3 Multi-Channel Delivery

  • Web/App: WebSocket as the main channel.

  • Mobile: Fallback to FCM/APNS.

3.4 Rate Limiting and Degradation

  • Redis token bucket rate limiting (1000/min/user).

  • Batch merge pushes every 5 s during high load.
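To make the token bucket idea concrete, here is an in-process sketch of the algorithm. It is not the Redis-backed version the text describes: a shared limiter would need the refill and the decrement to run atomically on the Redis side, which this example does not attempt. The `TokenBucket` class name is hypothetical, and the clock is passed in explicitly so the behavior is easy to reason about.

```java
class TokenBucket {
    private final long capacity;            // maximum burst size
    private final long refillPeriodMillis;  // time to refill an empty bucket completely
    private double tokens;
    private long lastRefillMillis;

    TokenBucket(long capacity, long refillPeriodMillis, long nowMillis) {
        if (capacity <= 0 || refillPeriodMillis <= 0) {
            throw new IllegalArgumentException("capacity and refill period must be positive");
        }
        this.capacity = capacity;
        this.refillPeriodMillis = refillPeriodMillis;
        this.tokens = capacity;             // start full so an idle user can burst
        this.lastRefillMillis = nowMillis;
    }

    // Refills in proportion to elapsed time, then spends one token if available.
    synchronized boolean tryAcquire(long nowMillis) {
        long elapsed = Math.max(0L, nowMillis - lastRefillMillis);
        double refilled = (double) elapsed * capacity / refillPeriodMillis;
        tokens = Math.min((double) capacity, tokens + refilled);
        lastRefillMillis = Math.max(lastRefillMillis, nowMillis);
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // 1000 pushes per minute per user, matching the limit quoted above.
        TokenBucket perUser = new TokenBucket(1000, 60_000, System.currentTimeMillis());
        System.out.println(perUser.tryAcquire(System.currentTimeMillis()));
    }
}
```

Compared with a fixed 60 s counter, a token bucket allows a short burst up to the capacity and then settles to the steady refill rate, which avoids the spike that can occur at a window boundary.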

The following is a simplified diagram of the recommended architecture (Mermaid syntax):

graph TD
    A["Interaction Event (Like/@)"] --> B["Notification Service"]
    B --> C["Kafka: Asynchronous Queue"]
    C --> D["Consumer: Filter Active Users + Merge Notifications"]
    D --> E{"Online?"}
    E -->|"Yes"| F["WebSocket: Real-time Push"]
    E -->|"No"| G["FCM/APNS: Mobile Push"]
    H["Redis: Online Status + Rate Limiting"] -.-> D
    H -.-> F
    I["Prometheus: Monitoring Delivery Rate"] -.-> G
    I -.-> F

This diagram shows the overall process from event triggering to delivery.


4. Key Data Structures and Processes

4.1 Redis Structure

  • online:{userId}: Online status, value “1”, TTL=5 min.

  • rate:{userId}: Push rate limiting count, EX=60 s.

  • notification:{userId}: Temporary storage for merged notifications (Redis Hash).
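The `online:{userId}` key behaves like a flag that disappears unless it is refreshed. The sketch below mimics that behavior in memory so the expiry logic is visible; it is an illustration, not a Redis client. The `PresenceTracker` class and its method names are hypothetical, and the clock is passed in as a parameter instead of being read from the system.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class PresenceTracker {
    private final long ttlMillis;
    // key -> absolute expiry time, standing in for a Redis key with a TTL
    private final Map<String, Long> expiresAt = new ConcurrentHashMap<>();

    PresenceTracker(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    private static String key(long userId) {
        return "online:" + userId;
    }

    // Called on connect and on each heartbeat; comparable to setting the key again with a fresh TTL.
    void touch(long userId, long nowMillis) {
        expiresAt.put(key(userId), nowMillis + ttlMillis);
    }

    // A user counts as online only while the most recent touch has not expired.
    boolean isOnline(long userId, long nowMillis) {
        Long deadline = expiresAt.get(key(userId));
        if (deadline == null) {
            return false;
        }
        if (nowMillis >= deadline) {
            // Remove only if nobody refreshed the entry in the meantime.
            expiresAt.remove(key(userId), deadline);
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        PresenceTracker presence = new PresenceTracker(5 * 60 * 1000L); // TTL = 5 min
        long now = System.currentTimeMillis();
        presence.touch(1001L, now);
        System.out.println(presence.isOnline(1001L, now + 1_000L));
    }
}
```

Because the flag expires on its own, a client that drops off the network without closing its connection is treated as offline once the TTL runs out, with no explicit cleanup required.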

4.2 Notification Push Process

The following is the process sequence diagram (Mermaid syntax):

sequenceDiagram
    participant Client as Client
    participant Service as Interaction Service
    participant Kafka as Kafka Queue
    participant Consumer as Consumer
    participant Redis as Redis (Online/Rate Limiting)
    participant WS as WebSocket
    Client->>Service: Trigger Interaction (e.g., Like)
    Service->>Kafka: Deliver Notification Event
    Kafka->>Consumer: Consume Batch
    Consumer->>Redis: Check Active + Rate Limiting
    Redis-->>Consumer: Return Status
    Consumer->>Consumer: Merge Notifications
    Consumer->>WS: Push Merged Notifications
    WS-->>Client: Deliver
    Note over Consumer, WS: Offline fallback to FCM/APNS

Java Code Example (Delivering Notifications)

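import org.apache.kafka.clients.producer.ProducerRecord;

// Deliver notifications to Kafka in the interaction service.
// kafkaProducer and serializeNotification are assumed to be defined elsewhere.
public void sendNotification(long fromUserId, long toUserId, String type, long tweetId) {
    // Serialize notification data
    String notificationJson = serializeNotification(fromUserId, toUserId, type, tweetId);
    // Key by recipient so that one user's notifications land on the same partition, in order.
    // send() is asynchronous and returns before the broker acknowledges the record.
    kafkaProducer.send(new ProducerRecord<>("notification-queue", String.valueOf(toUserId), notificationJson));
}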

Java Code Example (Consumer Processing and Pushing)

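import java.time.Duration;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Kafka consumer processing logic.
// props, redis, connections, fcm, JSON, Session (the WebSocket session type),
// and the helper methods are assumed to be defined elsewhere.
public void processNotifications() {
    // try-with-resources closes the consumer if the loop ever exits
    try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
        consumer.subscribe(Collections.singleton("notification-queue"));
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
            Map<String, List<Notification>> batchMap = groupByUser(records); // Aggregate by user
            for (Map.Entry<String, List<Notification>> entry : batchMap.entrySet()) {
                String userId = entry.getKey();
                List<Notification> notifications = entry.getValue();
                // Fixed-window rate limit: start the 60 s window only on the first
                // increment, so steady traffic cannot keep pushing the expiry out
                long count = redis.incr("rate:" + userId);
                if (count == 1) {
                    redis.expire("rate:" + userId, 60);
                }
                if (count > 1000) {
                    continue; // Over the per-user limit: skip this batch
                }
                // Merge notifications
                Map<String, Integer> merged = aggregateNotifications(notifications);
                // Route by presence: WebSocket when online, mobile push otherwise
                boolean online = "1".equals(redis.get("online:" + userId));
                Session ws = online ? connections.get(userId) : null;
                if (ws != null && ws.isOpen()) {
                    ws.getAsyncRemote().sendText(JSON.toString(merged));
                } else {
                    // Fallback to mobile push for offline users and stale connections
                    fcm.sendToDevice(getDeviceToken(userId), merged);
                }
            }
        }
    }
}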

5. Performance Optimization and Engineering Details

  • Batch Processing: Consumers process 500 notifications per batch, Redis MGET for batch online checks.

  • Heartbeat Detection: WebSocket ping/pong every 30 s to update online status.

  • Security Protection: IP rate limiting to blunt DDoS traffic; encrypted (TLS) push channels to protect notification content.

  • Scaling: Kafka partitions=100; WebSocket nodes autoscale when CPU exceeds 80%.

  • Monitoring: Prometheus collects delivery rate and latency, Grafana displays them, and alerts fire when losses exceed 1%.


6. High-Frequency Interview Follow-Ups and Real Engineering Pitfalls

High-Frequency Follow-Ups

  • How to prevent queue backlogs under high concurrency? (Dynamic consumer scaling, priority queues)

  • How to handle message loss and keep delivery consistent? (Kafka persistence, retry mechanisms)

  • How to ensure multi-channel synchronization? (Unified JSON format, Redis temporary storage)

Engineering Pitfalls

  • Connection memory overflow; solution: Horizontal scaling, limit connections/node <100 k.

  • DDoS amplification; solution: Token bucket + Captcha.

  • Merge logic bugs leading to duplicate notifications; solution: HashMap aggregation + test scripts.
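One common source of duplicate notifications is the same event being processed twice, for example after a consumer retry. A minimal guard is to remember recently seen event IDs and skip repeats. The sketch below assumes each event carries a unique ID, which the article does not specify; the `RecentEventDeduplicator` name and the bounded-memory policy (evict the oldest ID once the set is full) are also illustrative choices.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class RecentEventDeduplicator {
    private final Set<String> seen;

    RecentEventDeduplicator(final int maxEntries) {
        if (maxEntries <= 0) {
            throw new IllegalArgumentException("maxEntries must be positive");
        }
        // Insertion-ordered map that evicts its oldest entry once it grows past maxEntries.
        this.seen = Collections.newSetFromMap(new LinkedHashMap<String, Boolean>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > maxEntries;
            }
        });
    }

    // Returns true the first time an event ID is seen, false for a repeat
    // that is still inside the remembered window.
    synchronized boolean firstTime(String eventId) {
        return seen.add(eventId);
    }

    public static void main(String[] args) {
        RecentEventDeduplicator dedup = new RecentEventDeduplicator(100_000);
        System.out.println(dedup.firstTime("like:1:42")); // true
        System.out.println(dedup.firstTime("like:1:42")); // false
    }
}
```

The memory bound is a trade-off: an ID evicted from the window will be accepted again, so the window should be sized to cover the longest realistic redelivery gap.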

Common Misconceptions

  • Relying on Polling alone: poor real-time performance drives user churn.

  • Skipping rate limiting: traffic peaks crash the system.

  • Ignoring offline pushes: mobile users get a poor experience.


7. Summary

Twitter notifications and real-time pushes are core to user stickiness. A WebSocket + Kafka asynchronous, multi-channel architecture achieves efficient delivery and resource optimization. The engineering focus is on filtering and merging, rate limiting and degradation, and monitoring to keep the delivery rate above 99%. Coordinate with upstream systems (such as timeline events) to avoid high-frequency bottlenecks, and emphasize trade-offs in interviews (such as real-time performance vs. resources). Actual deployment requires load testing at the scale of billions of notifications per day to support platform growth.



