Kdump 
lkong 
2014-11-10 
Contents
1 Kdump
2 Agenda
3 Background
3.1 Note
4 Kexec - Overview
4.1 Note
5 Kdump - Overview
6 Kdump - Overview
7 Kdump - Overview
7.1 Note
7.2 Note
8 Kdump - Overview
8.1 Note
9 Install and configure kdump
9.1 Note
10 Install and configure kdump
10.1 Note
11 Kdump on Xen guest
12 Kdump on Xen guest
13 Kdump on Xen guest
14 Xen dump
15 Using crash tool to check core file
16 Related bugs
17 Reference
17.1 Note
18 Q & A
19 Q & A
19.1 Note
1 Kdump slide 
2 Agenda slide 
• Background 
• Kexec - Overview 
• Kdump - Overview 
• Install and configure kdump 
• Kdump on Xen guest 
• Xen dump 
• Using crash tool to check core file 
• Related bugs 
• Reference 
• Q & A 
3 Background slide 
• The Linux kernel is fairly robust; nevertheless, kernel panics still occur
• Dump tools such as LKCD, netdump, and diskdump have their limitations
• A guest may hit problems at any point in its life cycle, such as a crash, reboot, hang, or shutdown
• People may want to know the current state of the guest OS for troubleshooting
• kexec - directly boot into a new kernel
3.1 Note notes 
• Unable to save memory dumps to local RAID (md) devices or outside the network
• Unstable
• No multi-version support, which is why kdump was created
• Kdump is a much more flexible tool
4 Kexec - Overview slide 
• Kexec is a fastboot mechanism that allows booting a Linux kernel from 
the context of an already running kernel without going through BIOS 
• kexec performs the function of the boot loader from within the kernel 
• Consists of two components
– User space tool - kexec-tools
– Kernel system call (kexec_load())
• Using kexec consists of
– loading the kernel to be booted into memory
kexec -l kernel-image --initrd=initrd-image --append=command-line-options
cat /sys/kernel/kexec_loaded 
– actually rebooting to the pre-loaded kernel 
kexec -e 
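The two steps above (load, then execute) can be guarded by checking /sys/kernel/kexec_loaded before rebooting. The following is a minimal sketch, not part of the original slides: the function name kexec_is_loaded is hypothetical, and the sysfs path is made a parameter only so that the check can be tried against an ordinary file.

```shell
#!/bin/sh
# kexec_is_loaded [FILE]: succeed if a kexec kernel is currently loaded.
# FILE defaults to the sysfs entry shown on the slide; the kernel writes
# 1 there once "kexec -l" has loaded an image, 0 otherwise.
kexec_is_loaded() {
    state_file="${1:-/sys/kernel/kexec_loaded}"
    [ -r "$state_file" ] && [ "$(cat "$state_file")" = "1" ]
}

# Typical use as root, with the same placeholder names as on the slide:
#   kexec -l kernel-image --initrd=initrd-image --append="command-line-options"
#   kexec_is_loaded && kexec -e
```

Calling kexec -e only after the check succeeds avoids attempting a reboot when the load step failed silently.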
4.1 Note notes 
yum install kexec-tools
kexec -l vmlinuz-3.10.0-118.el7.x86_64 --initrd=initramfs-3.10.0-118.el7.x86_64.img --[...]
5 Kdump - Overview slide 
• Kdump uses kexec to quickly boot into a dump-capture kernel
• The previous kernel's memory is preserved across the boot into the dump-capture kernel
• Dump information is passed between the kernels as an ELF-format core file
• Common commands, such as cp and scp, can be used to copy the memory image to a dump file on the local disk, or across the network to a remote system.
• Kdump and kexec are currently supported on the x86, x86_64, ppc64, ia64, and s390x architectures. kdump is installed by default in RHEL5, RHEL6, and RHEL7.
• Accessing dump image in ELF Core format 
– /proc/vmcore 
• Accessing dump image in linear raw format 
– /dev/oldmem 
6 Kdump - Overview slide 
• Write Out the Dump File 
cp /proc/vmcore <dump-file> 
mknod /dev/oldmem c 1 12 
dd if=/dev/oldmem of=oldmem.001 
• Based on the architecture and type of image (relocatable or not), one can choose to load the uncompressed vmlinux or the compressed bzImage/vmlinuz of the dump-capture kernel
For i386 and x86_64: 
- Use vmlinux if kernel is not relocatable. 
- Use bzImage/vmlinuz if kernel is relocatable. 
For ppc64: 
- Use vmlinux 
For ia64: 
- Use vmlinux or vmlinuz.gz 
For s390x: 
- Use image or bzImage 
7 Kdump - Overview slide 
• Arch specific command line options to be used while loading dump-capture 
kernel 
For i386, x86_64 and ia64: 
"1 irqpoll maxcpus=1 reset_devices" 
For ppc64: 
"1 maxcpus=1 noirqdistrib reset_devices" 
• If you use an uncompressed image, remember to add '--args-linux' to the kexec command line (not needed for ia64)
7.1 Note notes 
• The "irqpoll" boot parameter reduces driver initialization failures due 
to shared interrupts in the dump-capture kernel 
• Boot parameter "1" boots the dump-capture kernel into single-user 
mode without networking. If you want networking, use "3" 
7.2 Note notes 
• Terminology: the production (first, standard) kernel versus the crash (capture, second) kernel
kexec -p <kernel-image> --append=<options>
• The capture kernel is executed on
– panic()
– Alt-SysRq-c
8 Kdump - Overview slide 
8.1 Note notes 
9 Install and configure kdump slide 
• Install kexec-tools
yum install kexec-tools
If you wish to configure kdump using a graphical user interface instead of the command line:
yum install system-config-kdump
• Configure kdump (remember to set crashkernel on the kernel command line)
crashkernel=128M 
crashkernel=128M@16M 
crashkernel=512M-2G:64M,2G-:128M 
crashkernel=512M-2G:64M,2G-:128M@16M 
grub2-mkconfig -o /boot/grub2/grub.cfg 
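The ranged form crashkernel=512M-2G:64M,2G-:128M selects the reservation from the amount of installed RAM: each range start is inclusive and its end exclusive, and an optional @offset fixes the start address. As a rough illustration of how such a spec resolves, here is a small sketch; the helper names to_mib and pick_crashkernel are hypothetical, and only the M and G suffixes are handled.

```shell
#!/bin/sh
# to_mib SIZE: convert a size such as 512M or 2G to MiB.
to_mib() {
    case "$1" in
        *G) echo $(( ${1%G} * 1024 )) ;;
        *M) echo "${1%M}" ;;
        *)  return 1 ;;
    esac
}

# pick_crashkernel SPEC RAM_MIB: print the size a ranged crashkernel spec
# selects for RAM_MIB of memory, or "none" if no range matches.
# The body runs in a subshell so the IFS change does not leak to the caller.
pick_crashkernel() (
    spec="${1%@*}"            # drop an optional @offset
    ram="$2"
    IFS=','
    for entry in $spec; do
        range="${entry%%:*}"; size="${entry#*:}"
        start="${range%%-*}"; end="${range#*-}"
        if [ "$ram" -ge "$(to_mib "$start")" ]; then
            # an empty end means the range is open-ended
            if [ -z "$end" ] || [ "$ram" -lt "$(to_mib "$end")" ]; then
                echo "$size"
                exit 0
            fi
        fi
    done
    echo "none"
)

pick_crashkernel '512M-2G:64M,2G-:128M' 1024        # 1 GiB of RAM
pick_crashkernel '512M-2G:64M,2G-:128M@16M' 4096    # 4 GiB of RAM
```

With these inputs the first call selects 64M and the second 128M; a machine with less than 512M matches no range, so nothing is reserved.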
9.1 Note notes 
• A limitation in the current implementation of the Intel IOMMU driver can occasionally prevent the kdump service from capturing the core dump image. To use kdump on Intel architectures reliably, it is advised that IOMMU support be disabled.
• Set crashkernel in GRUB_CMDLINE_LINUX
• chkconfig kdump on; service kdump status; service kdump start
• Factors that influence the crashkernel size: 1) architecture, 2) total amount of installed system memory (128 MB + 4 bits for every 4 KB page)
• With less than 4 GB of memory, crashkernel=auto is not recommended. Y must be larger than 16M. Usually the size is 128M + Y M, where Y is calculated.
• On many systems, kdump can reserve memory automatically. This behavior is enabled by default. However, automatic memory reservation only works on systems which have more than a certain amount of total available memory
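The sizing rule quoted in the notes (128 MB plus 4 bits for every 4 KB page) can be worked through with shell arithmetic. This is only a sketch that applies the formula literally as stated on the slide; the function name estimate_crashkernel_kib is hypothetical, and the figure actually required on a given release and architecture should be taken from the Kernel Crash Dump Guide.

```shell
#!/bin/sh
# estimate_crashkernel_kib RAM_MIB: 128 MB base plus 4 bits per 4 KB page,
# following the formula quoted in the notes. The result is printed in KiB.
estimate_crashkernel_kib() {
    ram_mib="$1"
    pages=$(( ram_mib * 1024 / 4 ))          # number of 4 KB pages
    extra_kib=$(( pages * 4 / 8 / 1024 ))    # 4 bits per page -> bytes -> KiB
    echo $(( 128 * 1024 + extra_kib ))
}

estimate_crashkernel_kib 4096       # 4 GiB of RAM
estimate_crashkernel_kib 1048576    # 1 TiB of RAM
```

Under this formula a 4 GiB machine needs only 512 KiB on top of the 128 MB base, while 1 TiB of RAM adds a further 128 MiB.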
10 Install and configure kdump slide 
• Configure your kernel for kdump 
– System kernel config options 
* CONFIG_KEXEC=y 
* CONFIG_SYSFS=y 
* CONFIG_DEBUG_INFO=y
– Dump-capture kernel config options 
* CONFIG_CRASH_DUMP=y 
* CONFIG_PROC_VMCORE=y 
* CONFIG_HIGHMEM64G=y or CONFIG_HIGHMEM4G=y 
(only for i386) 
* CONFIG_RELOCATABLE=y 
* CONFIG_PHYSICAL_START=0x100000 
• Configuration file: /etc/kdump.conf
• Configuration file: /etc/sysconfig/kdump
10.1 Note notes 
• If CONFIG_SMP=y, then specify maxcpus=1 on the kernel command line when loading the dump-capture kernel; see section "Load the Dump-capture Kernel".
• CONFIG_RELOCATABLE allows the kernel to be placed somewhere else in memory.
• If the kernel is not relocatable (CONFIG_RELOCATABLE=n), bzImage will decompress itself to the above physical address and run from there. Otherwise bzImage will run from the address where it has been loaded by the boot loader
• Enabling the kdump service
# systemctl enable kdump.service
# systemctl start kdump.service
• Testing the dump configuration
# systemctl is-active kdump
# echo 1 > /proc/sys/kernel/sysrq
# echo c > /proc/sysrq-trigger
• core_collector makedumpfile -c --message-level 1 -d 31: -c compresses the dump, --message-level 1 sets the message level to 1, and -d 31 excludes zero pages, cache pages, cache private pages, user data, and free pages
• "kexec system call" in "Processor type and features". CONFIG_KEXEC=y
• "Filesystem" -> "Pseudo filesystems" -> "sysfs file system support". CONFIG_SYSFS=y
• "Compile the kernel with debug info" in "Kernel hacking". CONFIG_DEBUG_INFO=y
• "kernel crash dumps" support under "Processor type and features". CONFIG_CRASH_DUMP=y
• "/proc/vmcore support" under "Filesystems" -> "Pseudo filesystems". CONFIG_PROC_VMCORE=y
• On i386, enable high memory support under "Processor type and features". CONFIG_HIGHMEM64G=y
• On i386 and x86_64, disable symmetric multi-processing support under "Processor type and features". CONFIG_SMP=n
• "Build a relocatable kernel" support under "Processor type and features". CONFIG_RELOCATABLE=y
• "Physical address where the kernel is loaded" (under "Processor type and features"). CONFIG_PHYSICAL_START=0x1000000
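The -d 31 argument to makedumpfile in the notes above is a bitmask of page types to leave out of the dump: 1 for zero pages, 2 for cache pages, 4 for cache private pages, 8 for user data, and 16 for free pages. The sketch below decodes a dump level into those names; the function name decode_dump_level is hypothetical and is not a makedumpfile feature.

```shell
#!/bin/sh
# decode_dump_level LEVEL: list the page types that "makedumpfile -d LEVEL"
# excludes from the dump, based on the bit values of the dump level.
decode_dump_level() {
    level="$1"
    out=""
    for pair in "1:zero pages" "2:cache pages" "4:cache private pages" \
                "8:user data" "16:free pages"; do
        bit="${pair%%:*}"; name="${pair#*:}"
        if [ $(( level & bit )) -ne 0 ]; then
            out="${out:+$out, }$name"    # append with a comma separator
        fi
    done
    echo "${out:-nothing excluded}"
}

decode_dump_level 31
decode_dump_level 17
```

Level 31 sets all five bits, which is why it gives the smallest dump; level 17 would drop only zero pages and free pages.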
11 Kdump on Xen guest slide 
• kdump is not supported in a RHEL6 PV guest
• Delete or comment out (using #) the line with the Kdump_not_supported_on_Xen_domU_guest marker, if it's present
• Booting without paravirt drivers (Method 1)
– Prepare the kdump initrd (dumprd), if needed
dracut -f /boot/initramfs-$(uname -r).img $(uname -r)
– Add the parameters xen_emul_unplug=never crashkernel=128M to the kernel's command line and reboot.
– Start the kdump service and trigger a kernel panic in the guest (SELinux needs to be disabled first)
service kdump restart
echo c > /proc/sysrq-trigger
– Log in to the guest and check the vmcore; the vmcore can be analyzed
12 Kdump on Xen guest slide 
• Booting without paravirt drivers (Method 2)
– Change the kernel command line parameter to xen_emul_unplug=unnecessary
– Blacklist the Xen PV modules
cat /etc/modprobe.d/blacklist.conf
[...]
blacklist xen_blkfront
blacklist xen_netfront
– Remove the paravirt drivers if needed
modprobe -r xen_netfront
modprobe -r xen_blkfront
– Append "xen_emul_unplug=unnecessary" to the kernel command line in grub.cfg
– Reboot the guest and log in; you will see that a new kdump ramdisk has been generated under /boot
– Start the kdump service and trigger a kernel panic in the guest
service kdump restart
echo c > /proc/sysrq-trigger
– Log in to the guest and check the vmcore; the vmcore can be analyzed
13 Kdump on Xen guest slide 
• Booting with paravirt drivers, and without unplug
– Prepare the kdump initrd (dumprd), if needed
dracut -f /boot/initramfs-$(uname -r).img $(uname -r)
– Edit /etc/modprobe.d/blacklist.conf by adding the three lines shown below to blacklist the drivers used for the emulated devices.
blacklist ata_piix
blacklist 8139too
blacklist 8139cp
– Change xen_emul_unplug=never to xen_emul_unplug=unnecessary in the kernel command line and reboot
– Start the kdump service and trigger a kernel panic in the guest
service kdump restart
echo c > /proc/sysrq-trigger
– Log in to the guest and check the vmcore; the vmcore can be analyzed
14 Xen dump slide 
• Manual dump core
– By default, Xen first pauses the guest before dumping its core
– Manual dump core is applicable to both PV and HVM guests; no xenpv driver is needed for HVM.
• Automatic dump core
– 'enable-dump' must be set to 'yes' in /etc/xen/xend-config.sxp
– Automatic dump core is applicable to both PV and HVM guests, as long as the xenpv driver is installed in the HVM guest.
15 Using crash tool to check core file slide 
• Install crash tool 
yum install crash 
• Install debuginfo packages 
yum install kernel-debuginfo-$(uname -r) 
yum install kernel-debuginfo-common-$(uname -r) 
• Using crash tool to check 
crash /usr/lib/debug/lib/modules/$(uname -r)/vmlinux $core_file 
• Commonly used commands 
help 
h 
log 
bt 
foreach bt 
ps 
vm 
files 
exit or q 
16 Related bugs slide 
• Bug 771712 - kexec and kdump for Xen PVonHVM guests 
• Bug 1007328 - RHEL7 guest hangs when triggering crash with kdump 
enabled on a Xen4.3 host 
• Bug 657506 - Xen 32bit HVM guest will have large time-drift after 
save/restore 
17 Reference slide 
• Section B.1, "Memory Requirements for kdump" [Kernel Crash Dump Guide]
• Section B.2, "Minimum Threshold for Automatic Memory Reservation" [Kernel Crash Dump Guide]
• Guest dump core
• Enable and test kdump in RHEL7.0 guest
• Guest kdump
17.1 Note notes 
• Minimum Threshold is for crashkernel=auto 
18 Q & A slide 
1. A compressed Linux kernel image is not the same as a relocatable kernel image. You can get a relocatable kernel image by setting CONFIG_RELOCATABLE=y when you build the kernel image
2. When the kdump service is running and a kernel panic is triggered, kdump first saves the dump core file and then reboots the host automatically. If kdump fails to save the core file, it acts according to the value of default in /etc/kdump.conf:
• default reboot -> just reboot the host
• default halt -> bring the system to a halt, requiring a manual reset or power-off
• default poweroff -> power the system off, requiring a manual power-on
• default shell -> drop to a shell session inside the initramfs, from where you can manually perform additional recovery actions. Exiting this shell reboots the system
3. Option --args-linux
When you use the kexec tool to load a second kernel, you sometimes need to add the option '--args-linux' to the kexec command line. I tried to find the reason on Google but could not find any useful information. If you try to load a new kernel and it fails, it is safe to add or remove --args-linux and try again.
19 Q & A slide 
1. What's the exact definition of a relocatable kernel?
I could not find any official definition of *relocatable kernel*, but the following lines explain what a relocatable kernel is:
A non-relocatable kernel must be loaded at a fixed memory address in order to work (usually at 1M of the memory space); loading it in a different place would not work. When we use kdump, we need to load the second kernel in a different place (the fixed memory address is reserved by the first kernel), so when we build a kernel as a second kernel we need to make sure it can be loaded at a different memory address. There are two ways to do this:
1. Build a relocatable kernel
*A relocatable kernel can be loaded at different 4K-aligned addresses*, but always below 1 GB.
To build a relocatable kernel, set the kernel option:
CONFIG_RELOCATABLE=y
Because the kernel is relocatable, it can run from any physical address, so the kexec boot loader will load it into the memory region reserved for it (we can specify that region by adding crashkernel=X@Y or crashkernel=auto to the first kernel's command line).
With CONFIG_RELOCATABLE=y we can also choose the bzImage/vmlinuz kernel format.
2. Hard-code the load address and tell the first kernel the start of the memory region with crashkernel=X@Y
In this case we build the second kernel with the kernel option:
CONFIG_PHYSICAL_START=0x1000000
and give that address on the first kernel's command line:
crashkernel=X@Y
With CONFIG_PHYSICAL_START=0x1000000, Y should be 16M. Of course you can set a different value for CONFIG_PHYSICAL_START, as long as Y matches it, but 16M is the value recommended for the x86 architecture.
When CONFIG_RELOCATABLE=n or CONFIG_RELOCATABLE is not set, we can only choose the vmlinux kernel format.
To learn more about relocatable kernels we would need to learn more about the kernel itself, which is of course another big topic.
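When the load address is hard-coded, CONFIG_PHYSICAL_START and the Y in crashkernel=X@Y have to name the same address, and the two are written in different units (hex bytes versus megabytes). A quick conversion makes the comparison easy; the helper name addr_to_mib below is hypothetical.

```shell
#!/bin/sh
# addr_to_mib ADDR: print a physical address (hex or decimal bytes) in MiB.
addr_to_mib() {
    echo $(( $1 / 1048576 ))
}

addr_to_mib 0x1000000    # start address matching crashkernel=X@16M
addr_to_mib 0x100000     # the usual 1M default load address
```

0x1000000 is 16 MiB and 0x100000 is 1 MiB, so a kernel built for a fixed start of 0x1000000 pairs with Y = 16M.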
19.1 Note notes 
• Does SELinux need to be disabled before starting the kdump service?

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (20)

QEMU Disk IO Which performs Better: Native or threads?
QEMU Disk IO Which performs Better: Native or threads?QEMU Disk IO Which performs Better: Native or threads?
QEMU Disk IO Which performs Better: Native or threads?
 
Linux Crash Dump Capture and Analysis
Linux Crash Dump Capture and AnalysisLinux Crash Dump Capture and Analysis
Linux Crash Dump Capture and Analysis
 
From printk to QEMU: Xen/Linux Kernel debugging
From printk to QEMU: Xen/Linux Kernel debuggingFrom printk to QEMU: Xen/Linux Kernel debugging
From printk to QEMU: Xen/Linux Kernel debugging
 
XPDDS19: Core Scheduling in Xen - Jürgen Groß, SUSE
XPDDS19: Core Scheduling in Xen - Jürgen Groß, SUSEXPDDS19: Core Scheduling in Xen - Jürgen Groß, SUSE
XPDDS19: Core Scheduling in Xen - Jürgen Groß, SUSE
 
XPDDS18: Xenwatch Multithreading - Dongli Zhang, Oracle
XPDDS18: Xenwatch Multithreading - Dongli Zhang, OracleXPDDS18: Xenwatch Multithreading - Dongli Zhang, Oracle
XPDDS18: Xenwatch Multithreading - Dongli Zhang, Oracle
 
First steps on CentOs7
First steps on CentOs7First steps on CentOs7
First steps on CentOs7
 
Reconnaissance of Virtio: What’s new and how it’s all connected?
Reconnaissance of Virtio: What’s new and how it’s all connected?Reconnaissance of Virtio: What’s new and how it’s all connected?
Reconnaissance of Virtio: What’s new and how it’s all connected?
 
LOAD BALANCING OF APPLICATIONS USING XEN HYPERVISOR
LOAD BALANCING OF APPLICATIONS  USING XEN HYPERVISORLOAD BALANCING OF APPLICATIONS  USING XEN HYPERVISOR
LOAD BALANCING OF APPLICATIONS USING XEN HYPERVISOR
 
XPDDS18: Design and Implementation of Automotive: Virtualization Based on Xen...
XPDDS18: Design and Implementation of Automotive: Virtualization Based on Xen...XPDDS18: Design and Implementation of Automotive: Virtualization Based on Xen...
XPDDS18: Design and Implementation of Automotive: Virtualization Based on Xen...
 
Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)
 
Enhancing and Preparing TIMES for High Performance Computing
Enhancing and Preparing TIMES for High Performance ComputingEnhancing and Preparing TIMES for High Performance Computing
Enhancing and Preparing TIMES for High Performance Computing
 
Kernel Recipes 2016 - Understanding a Real-Time System (more than just a kernel)
Kernel Recipes 2016 - Understanding a Real-Time System (more than just a kernel)Kernel Recipes 2016 - Understanding a Real-Time System (more than just a kernel)
Kernel Recipes 2016 - Understanding a Real-Time System (more than just a kernel)
 
LSA2 - 01 Virtualization with KVM
LSA2 - 01 Virtualization with KVMLSA2 - 01 Virtualization with KVM
LSA2 - 01 Virtualization with KVM
 
How Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for PerformanceHow Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for Performance
 
Project ACRN Device Passthrough Introduction
Project ACRN Device Passthrough IntroductionProject ACRN Device Passthrough Introduction
Project ACRN Device Passthrough Introduction
 
XPDS14 - Intel(r) Virtualization Technology for Directed I/O (VT-d) Posted In...
XPDS14 - Intel(r) Virtualization Technology for Directed I/O (VT-d) Posted In...XPDS14 - Intel(r) Virtualization Technology for Directed I/O (VT-d) Posted In...
XPDS14 - Intel(r) Virtualization Technology for Directed I/O (VT-d) Posted In...
 
PV-Drivers for SeaBIOS using Upstream Qemu
PV-Drivers for SeaBIOS using Upstream QemuPV-Drivers for SeaBIOS using Upstream Qemu
PV-Drivers for SeaBIOS using Upstream Qemu
 
XPDS13: HVM Dom0 - Any unmodified OS as Dom0 - Will Auld, Intel
XPDS13: HVM Dom0 - Any unmodified OS as Dom0 - Will Auld, IntelXPDS13: HVM Dom0 - Any unmodified OS as Dom0 - Will Auld, Intel
XPDS13: HVM Dom0 - Any unmodified OS as Dom0 - Will Auld, Intel
 
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
 
Kernel Configuration and Compilation
Kernel Configuration and CompilationKernel Configuration and Compilation
Kernel Configuration and Compilation
 

Ähnlich wie Kdump

kexec / kdump implementation in Linux Kernel and Xen hypervisor
kexec / kdump implementation in Linux Kernel and Xen hypervisorkexec / kdump implementation in Linux Kernel and Xen hypervisor
kexec / kdump implementation in Linux Kernel and Xen hypervisor
The Linux Foundation
 

Ähnlich wie Kdump (20)

Basics_of_Kernel_Panic_Hang_and_ Kdump.pdf
Basics_of_Kernel_Panic_Hang_and_ Kdump.pdfBasics_of_Kernel_Panic_Hang_and_ Kdump.pdf
Basics_of_Kernel_Panic_Hang_and_ Kdump.pdf
 
Kernel Recipes 2015 - Kernel dump analysis
Kernel Recipes 2015 - Kernel dump analysisKernel Recipes 2015 - Kernel dump analysis
Kernel Recipes 2015 - Kernel dump analysis
 
kdump: usage and_internals
kdump: usage and_internalskdump: usage and_internals
kdump: usage and_internals
 
HKG15-409: ARM Hibernation enablement on SoCs - a case study
HKG15-409: ARM Hibernation enablement on SoCs - a case studyHKG15-409: ARM Hibernation enablement on SoCs - a case study
HKG15-409: ARM Hibernation enablement on SoCs - a case study
 
Linux Kernel Platform Development: Challenges and Insights
 Linux Kernel Platform Development: Challenges and Insights Linux Kernel Platform Development: Challenges and Insights
Linux Kernel Platform Development: Challenges and Insights
 
kubernetes practice
kubernetes practicekubernetes practice
kubernetes practice
 
SiteGround Tech TeamBuilding
SiteGround Tech TeamBuildingSiteGround Tech TeamBuilding
SiteGround Tech TeamBuilding
 
kexec / kdump implementation in Linux Kernel and Xen hypervisor
kexec / kdump implementation in Linux Kernel and Xen hypervisorkexec / kdump implementation in Linux Kernel and Xen hypervisor
kexec / kdump implementation in Linux Kernel and Xen hypervisor
 
[k8s] Kubernetes terminology (1).pdf
[k8s] Kubernetes terminology (1).pdf[k8s] Kubernetes terminology (1).pdf
[k8s] Kubernetes terminology (1).pdf
 
”Bare-Metal Container" presented at HPCC2016
”Bare-Metal Container" presented at HPCC2016”Bare-Metal Container" presented at HPCC2016
”Bare-Metal Container" presented at HPCC2016
 
RAC-Installing your First Cluster and Database
RAC-Installing your First Cluster and DatabaseRAC-Installing your First Cluster and Database
RAC-Installing your First Cluster and Database
 
Control your service resources with systemd
 Control your service resources with systemd  Control your service resources with systemd
Control your service resources with systemd
 
Deploying Foreman in Enterprise Environments
Deploying Foreman in Enterprise EnvironmentsDeploying Foreman in Enterprise Environments
Deploying Foreman in Enterprise Environments
 
OpenNebulaConf 2016 - Storage Hands-on Workshop by Javier Fontán, OpenNebula
OpenNebulaConf 2016 - Storage Hands-on Workshop by Javier Fontán, OpenNebulaOpenNebulaConf 2016 - Storage Hands-on Workshop by Javier Fontán, OpenNebula
OpenNebulaConf 2016 - Storage Hands-on Workshop by Javier Fontán, OpenNebula
 
Containers with systemd-nspawn
Containers with systemd-nspawnContainers with systemd-nspawn
Containers with systemd-nspawn
 
Intro to Kernel Debugging - Just make the crashing stop!
Intro to Kernel Debugging - Just make the crashing stop!Intro to Kernel Debugging - Just make the crashing stop!
Intro to Kernel Debugging - Just make the crashing stop!
 
Docker 原理與實作
Docker 原理與實作Docker 原理與實作
Docker 原理與實作
 
NFD9 - Matt Peterson, Data Center Operations
NFD9 - Matt Peterson, Data Center OperationsNFD9 - Matt Peterson, Data Center Operations
NFD9 - Matt Peterson, Data Center Operations
 
Systemd for developers
Systemd for developersSystemd for developers
Systemd for developers
 
Real-World Docker: 10 Things We've Learned
Real-World Docker: 10 Things We've Learned  Real-World Docker: 10 Things We've Learned
Real-World Docker: 10 Things We've Learned
 

Mehr von Lingfei Kong (8)

Emacs presentation
Emacs presentationEmacs presentation
Emacs presentation
 
It经典图书(附免费下载地址)
It经典图书(附免费下载地址)It经典图书(附免费下载地址)
It经典图书(附免费下载地址)
 
Shell实现的windows回收站功能的脚本
Shell实现的windows回收站功能的脚本Shell实现的windows回收站功能的脚本
Shell实现的windows回收站功能的脚本
 
Python学习笔记
Python学习笔记Python学习笔记
Python学习笔记
 
Device virtualization and management in xen
Device virtualization and management in xenDevice virtualization and management in xen
Device virtualization and management in xen
 
Congfigure python as_ide
Congfigure python as_ideCongfigure python as_ide
Congfigure python as_ide
 
Emacs tutorial
Emacs tutorialEmacs tutorial
Emacs tutorial
 
SR-IOV Introduce
SR-IOV IntroduceSR-IOV Introduce
SR-IOV Introduce
 

Kürzlich hochgeladen

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Kürzlich hochgeladen (20)

"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Kdump

  • 1. Kdump lkong 2014-11-10 Contents 1 Kdump 3 2 Agenda 3 3 Background 4 3.1 Note . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 4 Kexec - Overview 5 4.1 Note . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 5 Kdump - Overview 6 6 Kdump - Overview 6 7 Kdump - Overview 7 7.1 Note . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 7.2 Note . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 8 Kdump - Overview 10 8.1 Note . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 9 Install and configure kdump 11 9.1 Note . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 10 Install and configure kdump 13 10.1 Note . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 11 Kdump on Xen guest 15 1
  • 2. 12 Kdump on Xen guest 16 13 Kdump on Xen guest 17 14 Xen dump 18 15 Using crash tool to check core file 18 16 Related bugs 19 17 Reference 19 17.1 Note . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 18 Q & A 20 19 Q & A 21 19.1 Note . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 2
  • 3. 1 Kdump slide 2 Agenda slide • Background • Kexec - Overview • Kdump - Overview • Install and configure kdump 3
  • 4. • Kdump on Xen guest • Xen dump • Using crash tool to check core file • Related bugs • Reference • Q & A 3 Background slide • Linux kernel is a rather robust entity, nevertheless kernel panic still occurs • Dump tools like: LKCD, netdump, diskdump have there limitations • Guest may hit problems in its whole life cycle, such as crash, reboot, hang, shutdown, etc • People may want to know current state of the guest OS for trouble shooting • kexec - directly boot into a new kernel 3.1 Note notes • Unable to save memory dumps to local RAID (md) devices, outside network 4
  • 5. • Unstable • Not mul version support – So kdump borns • Kdump is a much more flexible tool 4 Kexec - Overview slide • Kexec is a fastboot mechanism that allows booting a Linux kernel from the context of an already running kernel without going through BIOS • kexec performs the function of the boot loader from within the kernel • Include two components – User space tool - kexec-tools – Kernel System Call (kexec_load()) • Using kexec consists of – loading the kernel to be rebooted to into memory kexec -l kernel-image --initrd=ini-trd-image --append=command-line-optio cat /sys/kernel/kexec_loaded – actually rebooting to the pre-loaded kernel kexec -e 4.1 Note notes yum install kexec-tools kexec -l kexec -l vmlinuz-3.10.0-118.el7.x86_64 --initrd=initramfs-3.10.0-118.el7.x86_64.img --5
  • 6. 5 Kdump - Overview slide • Kdump uses kexec to quickly boot to dump-capture kernel • Previous kernel’s memory is preserved before crash • Dump information across the kernel is exchanged in and ELF format Core file • Can use common commands, such as cp and scp, to copy the memory image to a dump file on the local disk, or across the network to a re-mote system. • Kdump and kexec are currently supported on the x86, x86_64, ppc64, ia64, and s390x architectures. And kdump is install in RHEL5, RHEL6, RHEL7 as default. • Accessing dump image in ELF Core format – /proc/vmcore • Accessing dump image in linear raw format – /dev/oldmem 6 Kdump - Overview slide • Write Out the Dump File cp /proc/vmcore <dump-file> mknod /dev/oldmem c 1 12 dd if=/dev/oldmem of=oldmem.001 6
• Based on the architecture and type of image (relocatable or not), one can choose to load the uncompressed vmlinux or the compressed bzImage/vmlinuz of the dump-capture kernel
  – For i386 and x86_64:
    * Use vmlinux if the kernel is not relocatable.
    * Use bzImage/vmlinuz if the kernel is relocatable.
  – For ppc64:
    * Use vmlinux
  – For ia64:
    * Use vmlinux or vmlinuz.gz
  – For s390x:
    * Use image or bzImage

7 Kdump - Overview slide
• Architecture-specific command line options to use while loading the dump-capture kernel
  – For i386, x86_64 and ia64: "1 irqpoll maxcpus=1 reset_devices"
  – For ppc64: "1 maxcpus=1 noirqdistrib reset_devices"
• If you use an uncompressed image, remember to add '--args-linux' to the kexec command line (not needed for ia64)

7.1 Note notes
• The "irqpoll" boot parameter reduces driver initialization failures due to shared interrupts in the dump-capture kernel
• Boot parameter "1" boots the dump-capture kernel into single-user mode without networking. If you want networking, use "3"
7.2 Note notes
• Terminology: production kernel = first kernel = standard kernel; crash kernel = capture kernel = second kernel
    kexec -p <kernel-image> --append=<options>
• The capture kernel is executed on
  – panic()
  – Alt-SysRq-c
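The per-architecture options from section 7 and the `kexec -p` form above can be combined in a dry-run helper. This is a sketch under stated assumptions: the function names are hypothetical, the kernel and initrd paths in the usage line are placeholders, and s390x is rejected only because the slides give no option string for it.

```shell
# capture_append <arch>
# Prints the option string the slides list for the dump-capture kernel.
capture_append() {
    case $1 in
        i386|x86_64|ia64) echo "1 irqpoll maxcpus=1 reset_devices" ;;
        ppc64)            echo "1 maxcpus=1 noirqdistrib reset_devices" ;;
        *)                echo "no option string known for: $1" >&2
                          return 1 ;;
    esac
}

# print_kexec_load <arch> <kernel-image> <initrd-image>
# Prints the load command instead of running it, so it can be reviewed first.
print_kexec_load() {
    arch=$1
    kernel=$2
    initrd=$3
    append=$(capture_append "$arch") || return 1
    printf 'kexec -p %s --initrd=%s --append="%s"\n' "$kernel" "$initrd" "$append"
}
```

For example, `print_kexec_load x86_64 /boot/vmlinuz /boot/initrd.img` prints a command line that can be pasted into a root shell once it looks right.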
8 Kdump - Overview slide
8.1 Note notes

9 Install and configure kdump slide
• Install kexec-tools
    yum install kexec-tools
  If you wish to configure kdump using a graphical user interface instead of the command line
    yum install system-config-kdump
• Configure kdump (remember to set crashkernel on the kernel command line)
    crashkernel=128M
    crashkernel=128M@16M
    crashkernel=512M-2G:64M,2G-:128M
    crashkernel=512M-2G:64M,2G-:128M@16M
    grub2-mkconfig -o /boot/grub2/grub.cfg

9.1 Note notes
• A limitation in the current implementation of the Intel IOMMU driver can occasionally prevent the kdump service from capturing the core dump image. To use kdump on Intel architectures reliably, it is advised that IOMMU support be disabled.
• GRUB_CMDLINE_LINUX
• chkconfig kdump on; service kdump status; service kdump start
• Factors that influence the crashkernel size: 1) the architecture; 2) the total amount of installed system memory (128 MB + 4 bits for every 4 KB page)
• With less than 4 GB of memory, auto is not recommended. Y must be larger than 16M. The size is usually 128M + Y M, where Y is calculated.
• On many systems, kdump can reserve memory automatically. This behavior is enabled by default. However, automatic memory reservation only works on systems which have more than a certain amount of total available memory
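The range form `crashkernel=512M-2G:64M,2G-:128M` picks a reservation from the amount of installed RAM: each entry is `start-end:size`, the start is inclusive, the end is exclusive, and an empty end means "no upper limit". The sketch below shows how such a specification is read. The function names are hypothetical, and only the M and G suffixes are handled.

```shell
# to_mib <size>
# Converts a size such as 512M or 2G to MiB.
to_mib() {
    case $1 in
        *G) echo $(( ${1%G} * 1024 )) ;;
        *M) echo $(( ${1%M} )) ;;
        *)  return 1 ;;
    esac
}

# pick_reservation <ram-in-MiB> <range-spec>
# Prints the size reserved by the first range that contains the RAM size.
# Returns non-zero (and prints nothing) when no range matches.
pick_reservation() {
    ram=$1
    spec=${2%@*}            # drop an optional @offset suffix
    old_ifs=$IFS
    IFS=,
    set -- $spec            # split the spec into one entry per range
    IFS=$old_ifs
    for entry in "$@"; do
        range=${entry%%:*}
        size=${entry#*:}
        start=$(to_mib "${range%%-*}") || return 1
        end=${range#*-}
        if [ -n "$end" ]; then
            end=$(to_mib "$end") || return 1
        fi
        if [ "$ram" -ge "$start" ] && { [ -z "$end" ] || [ "$ram" -lt "$end" ]; }; then
            echo "$size"
            return 0
        fi
    done
    return 1
}
```

With the specification from the slide, a 1 GiB machine falls into the first range and a 4 GiB machine into the second; a machine with 256 MiB matches no range, so nothing is reserved.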
10 Install and configure kdump slide
• Configure your kernel for kdump
  – System kernel config options
    * CONFIG_KEXEC=y
    * CONFIG_SYSFS=y
    * CONFIG_DEBUG_INFO=Y
  – Dump-capture kernel config options
    * CONFIG_CRASH_DUMP=y
    * CONFIG_PROC_VMCORE=y
    * CONFIG_HIGHMEM64G=y or CONFIG_HIGHMEM4G=y (only for i386)
    * CONFIG_RELOCATABLE=y
    * CONFIG_PHYSICAL_START=0x1000000
• Configuration file: /etc/kdump.conf
• Configuration file: /etc/sysconfig/kdump

10.1 Note notes
• If CONFIG_SMP=y, specify maxcpus=1 on the kernel command line when loading the dump-capture kernel; see the section "Load the Dump-capture Kernel".
• CONFIG_RELOCATABLE allows the kernel to be placed somewhere else in memory.
• If the kernel is not relocatable (CONFIG_RELOCATABLE=n), bzImage will decompress itself to the physical address given by CONFIG_PHYSICAL_START and run from there. Otherwise bzImage will run from the address where it has been loaded by the boot loader
• Enabling and testing
  – Enabling the kdump service
      # systemctl enable kdump.service
      # systemctl start kdump.service
  – Testing the dump configuration (the first echo enables SysRq, the second triggers the crash)
      # systemctl is-active kdump
      # echo 1 > /proc/sys/kernel/sysrq
      # echo c > /proc/sysrq-trigger
• core_collector makedumpfile -c --message-level 1 -d 31
  – -c: makedumpfile compresses the dump data
  – --message-level 1: print only the progress indicator (1)
  – -d 31: exclude zero pages, cache pages, private cache pages, user data and free pages
• "kexec system call" in "Processor type and features": CONFIG_KEXEC=y
• "Filesystem" -> "Pseudo filesystems" -> "sysfs file system support": CONFIG_SYSFS=y
• "Compile the kernel with debug info" in "Kernel hacking": CONFIG_DEBUG_INFO=Y
• "kernel crash dumps" support under "Processor type and features": CONFIG_CRASH_DUMP=y
• "/proc/vmcore support" under "Filesystems" -> "Pseudo filesystems": CONFIG_PROC_VMCORE=y
• On i386, enable high memory support under "Processor type and features": CONFIG_HIGHMEM64G=y
• On i386 and x86_64, disable symmetric multi-processing support under "Processor type and features": CONFIG_SMP=n
• "Build a relocatable kernel" support under "Processor type and features": CONFIG_RELOCATABLE=y
• "Physical address where the kernel is loaded" (under "Processor type and features"): CONFIG_PHYSICAL_START=0x1000000

11 Kdump on Xen guest slide
• kdump in a RHEL6 PV guest is not supported
• Delete or comment out (using #) the line with the Kdump_not_supported_on_Xen_domU_guest marker, if it is present
• Booting without paravirt drivers (Method 1)
  – Prepare the kdump initrd (dumprd), if needed
      dracut -f /boot/initramfs-$(uname -r).img $(uname -r)
  – Add the parameters
      xen_emul_unplug=never crashkernel=128M
    to the kernel's command line and reboot.
  – Start the kdump service and trigger a kernel panic in the guest (SELinux needs to be disabled first)
      service kdump restart
      echo c > /proc/sysrq-trigger
  – Log in to the guest and check the vmcore; the vmcore can be analyzed

12 Kdump on Xen guest slide
• Booting without paravirt drivers (Method 2)
  – Change the kernel command line parameter to xen_emul_unplug=unnecessary
  – Blacklist the Xen PV modules
      cat /etc/modprobe.d/blacklist.conf
      [...]
      blacklist xen_blkfront
      blacklist xen_netfront
  – Remove the paravirt drivers if needed
      modprobe -r xen_netfront
      modprobe -r xen_blkfront
  – Append "xen_emul_unplug=unnecessary" to the kernel command line in grub.cfg
  – Reboot the guest and log in; a new kdump ramdisk will have been generated under /boot
  – Start the kdump service and trigger a kernel panic in the guest
      service kdump restart
      echo c > /proc/sysrq-trigger
  – Log in to the guest and check the vmcore; the vmcore can be analyzed

13 Kdump on Xen guest slide
• Booting with paravirt drivers, and without unplug
  – Prepare the kdump initrd (dumprd), if needed
      dracut -f /boot/initramfs-$(uname -r).img $(uname -r)
  – Edit /etc/modprobe.d/blacklist.conf by adding the three lines shown below to blacklist the drivers used for the emulated devices.
      blacklist ata_piix
      blacklist 8139too
      blacklist 8139cp
  – Change xen_emul_unplug=never to xen_emul_unplug=unnecessary in the kernel command line and reboot
  – Start the kdump service and trigger a kernel panic in the guest
      service kdump restart
      echo c > /proc/sysrq-trigger
  – Log in to the guest and check the vmcore; the vmcore can be analyzed
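Each method above ends with "check the vmcore". A minimal sketch of that last step follows, assuming the dump target is the usual /var/crash directory with one subdirectory per crash. The function name `latest_vmcore` is hypothetical, and the directory can be overridden if kdump.conf points elsewhere.

```shell
# latest_vmcore [crash-dir]
# Prints the path of the most recently written vmcore file,
# or nothing when no dump has been saved yet.
latest_vmcore() {
    crash_dir=${1:-/var/crash}
    # ls -t sorts by modification time, newest first
    ls -t "$crash_dir"/*/vmcore 2>/dev/null | head -n 1
}
```

After the guest comes back up, `ls -lh "$(latest_vmcore)"` shows whether a dump was written and how large it is.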
14 Xen dump slide
• Manual dump core
  – By default, Xen first pauses the guest before dumping its core
  – Manual dump core is applicable to both PV and HVM; no xenpv driver is needed for HVM.
• Automatic dump core
  – 'enable-dump' must be set to 'yes' in /etc/xen/xend-config.sxp
  – Automatic dump core is applicable to both PV and HVM, as long as the xenpv driver is installed for the HVM guest.

15 Using crash tool to check core file slide
• Install the crash tool
    yum install crash
• Install the debuginfo packages
    yum install kernel-debuginfo-$(uname -r)
    yum install kernel-debuginfo-common-$(uname -r)
• Use the crash tool to check the core file
    crash /usr/lib/debug/lib/modules/$(uname -r)/vmlinux $core_file
• Commonly used commands
    help
    h
    log
    bt
    foreach bt
    ps
    vm
    files
    exit or q

16 Related bugs slide
• Bug 771712 - kexec and kdump for Xen PVonHVM guests
• Bug 1007328 - RHEL7 guest hangs when triggering crash with kdump enabled on a Xen4.3 host
• Bug 657506 - Xen 32bit HVM guest will have large time-drift after save/restore

17 Reference slide
• Section B.1, "Memory Requirements for kdump" [Kernel Crash Dump Guide]
• Section B.2, "Minimum Threshold for Automatic Memory Reservation" [Kernel Crash Dump Guide]
• Guest dump core
• Enable and test kdump in RHEL7.0 guest
• Guest kdump
17.1 Note notes
• The minimum threshold applies to crashkernel=auto

18 Q & A slide
1. A compressed Linux kernel image is not the same thing as a relocatable kernel image. You get a relocatable kernel image by setting CONFIG_RELOCATABLE=y when you build the kernel.
2. When the kdump service is running and a kernel panic is triggered, kdump first saves the core file and then reboots the host automatically. When kdump fails to save the core file, it acts according to the value of default in /etc/kdump.conf:
   • default=reboot -> just reboot the host
   • default=halt -> bring the system to a halt, requiring a manual reset
   • default=poweroff -> power the system off, requiring a manual power-on
   • default=shell -> drop to a shell session inside the initramfs, from where you can manually perform additional recovery actions. Exiting this shell reboots the system
3. Option --args-linux
   When you use the kexec tool to load a second kernel, you sometimes need to add the option '--args-linux' to the kexec command line. I tried to find the reason on Google but could not find any useful information. If loading a new kernel fails, it is safe to add or remove --args-linux and try again.
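To see which of the failure actions from item 2 a host will take, the `default` line can be read out of the configuration file. This is a sketch under assumptions: the function name is hypothetical, it accepts both `default <action>` and the `default=<action>` spelling used above, it expects the directive to start in the first column, and it falls back to `reboot` when no directive is present, which is an assumption about the tool's built-in behaviour rather than something the slides state.

```shell
# kdump_default_action [conf-file]
# Prints the action configured by the last uncommented "default" line,
# or "reboot" (assumed fallback) when there is none.
kdump_default_action() {
    conf=${1:-/etc/kdump.conf}
    # Split fields on blanks or "=" so both spellings are accepted.
    action=$(awk -F'[ \t=]+' '$1 == "default" { print $2 }' "$conf" | tail -n 1)
    echo "${action:-reboot}"
}
```

Commented-out lines such as `#default shell` are ignored because their first field is `#default`, not `default`.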
19 Q & A slide
1. What is the exact definition of a relocatable kernel?
   I could not find any official definition of *relocatable kernel*, but the following explains what a relocatable kernel is:
   A non-relocatable kernel has to be loaded at a fixed memory address in order to work (usually at 1M of the memory space); loading it at a different place would not work.
   When we use kdump, we need to load the second kernel at a different place (the fixed memory address is already used by the first kernel). So when we build a kernel to act as the second kernel, we need to make sure it can be loaded at a different memory address. There are two ways to achieve this:
   1. Build a relocatable kernel
      *Relocatable kernel can be loaded at different 4K-aligned addresses*, but always below 1 GB. To build a relocatable kernel, set the kernel option:
        CONFIG_RELOCATABLE=y
      Because the kernel is relocatable, it can run from any physical address, so the kexec boot loader loads it into the memory region reserved for it (we specify that region by adding crashkernel=X@Y or crashkernel=auto to the first kernel's command line).
      With CONFIG_RELOCATABLE=y we can also choose the bzImage/vmlinuz kernel format.
   2. Hard-code the load address and tell the first kernel where the memory region starts with crashkernel=X@Y
      In this case we build the second kernel with the kernel option:
        CONFIG_PHYSICAL_START=0x1000000
      and put the same address on the first kernel's command line:
        crashkernel=X@Y
      With CONFIG_PHYSICAL_START=0x1000000, Y should be 16M. You can of course set a different value for CONFIG_PHYSICAL_START, but 0x1000000 is recommended for the x86 architecture.
      With CONFIG_RELOCATABLE=n, or with CONFIG_RELOCATABLE not set, we can only choose the vmlinux kernel format.
   Learning more about relocatable kernels means learning more about the kernel itself, which would be another big topic.

19.1 Note notes
• Does SELinux need to be disabled before starting the kdump service?