Linux Kernel & OS Internals
This area validates your command of core kernel subsystems and how they interact with AMD platforms. You will discuss control paths, memory management, synchronization, and subsystem-specific details relevant to EPYC-class servers.
Be ready to go over:
- Memory management (MMU, page tables, NUMA, THP): Page faults, TLB shootdowns, NUMA placement, and huge page strategies under mixed workloads.
- Interrupts and I/O (APIC/x2APIC, MSI/MSI-X, IRQ handling): Affinity, balancing, latency impacts, and ISR vs. threaded handlers.
- Kernel concurrency and synchronization: Locks, RCU, atomics, and memory barriers, including when and why each is appropriate.
- Advanced concepts (less common): KASLR, mitigations for speculative execution, lockdep/KASAN/UBSAN workflows, live patching, and eBPF tracing models.
Example questions or scenarios:
- "Walk through debugging a kernel panic with an oops trace and provide a hypothesis-driven plan to isolate the offending subsystem."
- "How would you diagnose a NUMA imbalance causing latency spikes on an EPYC system?"
- "What’s your approach to resolving a THP performance regression after a kernel upgrade?"
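For the NUMA imbalance question, one possible first step is to compare per-node allocation counters. The sketch below parses the `name value` format of `/sys/devices/system/node/nodeN/numastat` and computes the share of a node's allocations that were intended for a different node. The sample values and the helper names (`parse_numastat`, `miss_ratio`, `read_node`) are illustrative assumptions, not output from a real EPYC system.

```python
from pathlib import Path

# Illustrative contents of a per-node numastat file; the values are made up.
SAMPLE_NODE0 = """\
numa_hit 900000
numa_miss 100000
numa_foreign 5000
interleave_hit 200
local_node 890000
other_node 110000
"""


def parse_numastat(text):
    """Parse 'name value' lines from a per-node numastat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def miss_ratio(stats):
    """Share of pages allocated on this node that were meant for another node."""
    hit = stats.get("numa_hit", 0)
    miss = stats.get("numa_miss", 0)
    total = hit + miss
    return miss / total if total else 0.0


def read_node(node):
    """Read live counters for one node (requires a Linux host with NUMA sysfs)."""
    path = Path(f"/sys/devices/system/node/node{node}/numastat")
    return parse_numastat(path.read_text())


if __name__ == "__main__":
    stats = parse_numastat(SAMPLE_NODE0)
    print(f"node0 miss ratio: {miss_ratio(stats):.1%}")
```

A rising miss ratio on one node while others stay flat suggests placement pressure worth correlating with the latency spikes. The counters are cumulative, so in practice you would sample them twice and compare deltas.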
Virtualization & KVM/QEMU
You’ll be assessed on KVM architecture, QEMU device models, and how AMD virtualization extensions surface through the stack. Expect deep questions about VMEXITs, NPT, vCPU scheduling, and virtio performance.
Be ready to go over:
- KVM fundamentals: vCPU lifecycles, exits, CPUID/feature exposure, shadow vs. nested paging (NPT).
- Device virtualization: virtio, vhost, SR-IOV with IOMMU interactions and DMA remapping.
- Live migration and dirty logging: Strategies, pitfalls for large-memory VMs, and ensuring consistency.
- Advanced concepts (less common): SEV/SEV-ES/SEV-SNP, nested virtualization behaviors, PMU virtualization, and migration under memory encryption.
Example questions or scenarios:
- "A workload experiences excessive VMEXITs after enabling a new CPU feature—how do you triage?"
- "Design a plan to expose a new CPUID leaf via KVM and safely surface it to guests."
- "Explain why live migration might fail with encrypted VMs and how you’d mitigate it."
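The dirty-logging pitfalls for large-memory VMs can be reasoned about with a toy pre-copy model. The sketch below assumes a constant dirty rate and constant link bandwidth, both in pages per second. It is not QEMU's actual migration algorithm, and the function name and parameters are hypothetical.

```python
def precopy_rounds(total_pages, dirty_rate, bandwidth, max_downtime_s, max_rounds=30):
    """Count iterative pre-copy rounds needed before stop-and-copy fits the downtime.

    Toy model: each round sends the currently dirty set; while it is in flight
    the guest dirties pages at a constant rate (capped at total guest memory).
    Returns None if the budget is not met within max_rounds.
    """
    remaining = float(total_pages)
    completed = 0
    while completed <= max_rounds:
        transfer_time = remaining / bandwidth
        if transfer_time <= max_downtime_s:
            return completed  # the final pass can run with the guest paused
        # Pages dirtied during this round become the next round's work.
        remaining = min(float(total_pages), dirty_rate * transfer_time)
        completed += 1
    return None


if __name__ == "__main__":
    # Dirty rate at half the bandwidth: the dirty set halves every round.
    print(precopy_rounds(1_000_000, 50_000, 100_000, 0.3))
    # Dirty rate above the bandwidth: the model never converges.
    print(precopy_rounds(1_000_000, 200_000, 100_000, 0.3))
```

The model makes the core constraint visible: migration converges only when the guest dirties memory more slowly than the link can drain it. That is why throttling, compression, or post-copy come up for write-heavy guests.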
x86-64 Architecture & SoC Features
Interviewers will probe your fluency with x86-64 microarchitecture and server SoC features: how they are configured, exposed, and monitored in Linux.
Be ready to go over:
- Core architecture: Caches, SMT, microcode, MSRs, APIC, power states (C/P-states) and their OS interfaces.
- Platform features: ACPI tables (e.g., SRAT/SLIT), PCIe, IOMMU, RAS (MCA/MCE), and CXL.
- Security and memory technologies: SME/SEV, page attributes, and mitigation trade-offs.
- Advanced concepts (less common): Side-channel mitigations, firmware-first RAS vs. OS-first, CXL.mem attach semantics.
Example questions or scenarios:
- "Interpret an MCE log on an EPYC system and outline next steps for containment and recovery."
- "How would you enable a new CXL.mem region in Linux and validate performance/NUMA policy?"
- "Explain the interaction between APIC/x2APIC and interrupt distribution on multi-socket servers."
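When interpreting an MCE log, a reasonable first pass is to decode the high flag bits of the bank's MCi_STATUS value. The sketch below covers only the architectural bits 63 through 57 and the low 16-bit MCA error code. It deliberately ignores vendor-specific fields such as AMD's deferred and poison bits, so the classification is coarse. The example status values are constructed for illustration and do not come from a real log.

```python
# Architectural MCi_STATUS flag bits shared across x86 vendors.
MCI_STATUS_FLAGS = {
    63: "VAL",    # register contains a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # MCi_MISC holds valid data
    58: "ADDRV",  # MCi_ADDR holds a valid address
    57: "PCC",    # processor context corrupt
}


def decode_mci_status(status):
    """Return the set flag names (high bit first) and the low 16-bit MCA error code."""
    flags = [
        name
        for bit, name in sorted(MCI_STATUS_FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]
    return {"flags": flags, "error_code": status & 0xFFFF}


def classify(status):
    """Coarse severity triage; vendor-specific bits are not considered."""
    flags = decode_mci_status(status)["flags"]
    if "VAL" not in flags:
        return "invalid"
    if "UC" in flags and "PCC" in flags:
        return "uncorrected, context corrupt"
    if "UC" in flags:
        return "uncorrected"
    return "corrected"


if __name__ == "__main__":
    for value in (0xB600000000000135, 0x9800000000000151):
        print(hex(value), decode_mci_status(value), classify(value))
```

The severity class then drives next steps: corrected errors usually feed trend monitoring and threshold policies, while an uncorrected error with PCC set means the context cannot be trusted and containment takes priority over recovery.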
Open-Source Upstreaming & Collaboration
Success in this role depends on your ability to deliver changes to the Linux community efficiently and sustainably. You’ll discuss mailing list workflows, maintainer engagement, and long-term maintenance strategy.
Be ready to go over:
- Patch lifecycle: RFCs, vN revisions, cover letters, checkpatch, sign-off (DCO), and bisection for regressions.
- Maintainer relations: Responding to review feedback, handling NAKs, and reaching consensus on interfaces.
- Stable/backports: Criteria, risk mitigation, and distro-specific constraints.
- Advanced concepts (less common): ABI stability principles, deprecation strategies, CI in kernel workflows, licensing nuances.
Example questions or scenarios:
- "Draft the outline of a cover letter for a 5-patch series enabling a new RAS capability."
- "You received conflicting feedback from two maintainers—how do you proceed?"
- "How do you structure a backport policy for a feature used by multiple enterprise distros?"
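Bisection for regressions, mentioned under the patch lifecycle, is at heart a binary search over history. The sketch below assumes a linear list of commits ordered oldest to newest, with the first known good and the last known bad. Real `git bisect` also handles merge-heavy, non-linear history, which this simplification does not. The commit names and the `is_bad` predicate are hypothetical.

```python
def first_bad(commits, is_bad):
    """Find the first bad commit in a linear history.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad.
    Returns (first_bad_commit, number_of_test_runs).
    """
    lo, hi = 0, len(commits) - 1  # lo is always good, hi is always bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi], tests


if __name__ == "__main__":
    history = [f"c{i}" for i in range(16)]
    # Hypothetical test: the regression was introduced in c11.
    culprit, runs = first_bad(history, lambda c: int(c[1:]) >= 11)
    print(culprit, runs)
```

The number of test runs grows with the logarithm of the range size. That is why a reliable, fast pass/fail check matters more than a narrow starting range.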
Debugging, Performance, and Testing
AMD will test how methodically you debug and how rigorously you approach performance engineering under real constraints. You must demonstrate data-driven analysis and the ability to reproduce and isolate complex faults.
Be ready to go over:
- Tools and techniques: perf, ftrace/trace-cmd, bpftrace/eBPF, kprobes, kgdb, kdump/crash, lockdep/KASAN.
- Performance methodology: Baselines, noise control, PMU events, flamegraphs, and regression detection.
- Robust validation: Unit tests (e.g., KUnit), kselftest, QEMU-based CI, and stress/fault injection.
- Advanced concepts (less common): Microarchitectural counter interpretation, NUMA-aware load generation, reproducibility across kernels.
Example questions or scenarios:
- "A 15% regression appears after enabling a new IOMMU feature—what’s your investigation plan?"
- "Analyze a soft lockup report and propose the next three commands you’d run."
- "Describe how you’d build a minimal reproducer for a rare use-after-free."
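Baselines and noise control can be illustrated with a small comparison of repeated benchmark runs. The sketch below flags a change only when the shift in the mean exceeds a multiple of the baseline's standard deviation. This is a simple threshold heuristic, not a formal statistical test, and the sample numbers and the `noise_sigmas` default are assumptions chosen for illustration.

```python
import statistics


def compare_runs(baseline, candidate, noise_sigmas=3.0):
    """Return (percent change in mean, whether the change exceeds baseline noise)."""
    base_mean = statistics.mean(baseline)
    cand_mean = statistics.mean(candidate)
    base_sd = statistics.stdev(baseline)  # needs at least two baseline runs
    delta_pct = (cand_mean - base_mean) / base_mean * 100.0
    significant = abs(cand_mean - base_mean) > noise_sigmas * base_sd
    return delta_pct, significant


if __name__ == "__main__":
    # Made-up throughput samples: a quiet baseline and a clear drop.
    quiet = [100.0, 101.0, 99.0, 100.0, 100.0]
    slower = [85.0, 86.0, 84.0, 85.0, 85.0]
    print(compare_runs(quiet, slower))

    # A noisy baseline hides a small shift of the same kind.
    noisy = [100.0, 110.0, 90.0, 105.0, 95.0]
    slightly_slower = [98.0, 99.0, 97.0, 98.0, 98.0]
    print(compare_runs(noisy, slightly_slower))
```

Because the same percentage shift can be significant on a quiet system and invisible on a noisy one, reducing variance (CPU pinning, fixed frequency, isolated NUMA nodes) usually comes before drawing conclusions from any single delta.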