At companies like Netflix, backend services often need concurrency for handling many requests, doing background work, or parallelizing CPU-heavy tasks. Choosing between processes and threads affects performance, safety, and operational complexity.
Explain the difference between a process and a thread.
In your answer, cover:
Assume a modern OS (Linux/macOS/Windows). You don’t need kernel-level implementation details, but you should be precise about isolation, synchronization primitives, and practical trade-offs in real systems.
A process typically has its own virtual address space, providing strong isolation between workloads. Threads within the same process share the same address space, which enables fast sharing but increases the risk of data races and memory corruption impacting the whole process.
Switching between threads in the same process is often cheaper than switching between processes because less memory mapping/state needs to change. However, real costs depend on OS scheduler behavior, cache/TLB effects, and synchronization contention.
Threads communicate via shared memory and must use synchronization (locks, semaphores, condition variables) to avoid races. Processes communicate via IPC (pipes, sockets, shared memory segments), which is safer but usually adds serialization/copying overhead and complexity.
import threading
lock = threading.Lock()
# Threads share memory; lock protects shared state
A crash in one thread can bring down the entire process because they share the same memory space. Separate processes provide better fault containment and are commonly used for sandboxing, privilege separation, and running untrusted plugins.
Threads are often effective for I/O-bound concurrency because they can overlap waiting on I/O. For CPU-bound parallelism, multiple processes can be preferable in runtimes with a global interpreter lock (e.g., CPython), while native languages may use threads for true parallel execution.
from multiprocessing import Pool
# Processes can achieve CPU parallelism in CPython