Post Mortem Debugging in Embedded Linux Systems

This presentation by Anton Bondarenko (Senior Software Engineer/Architect, Bosch Sensortec, Sweden) was delivered at GlobalLogic Kharkiv Embedded Conference 2019 on July 7, 2019.

Live debugging in Linux is a good method during development, but it is not always possible. The alternative is post-mortem debugging. Post-mortem analysis consists of investigations performed on a system snapshot. Several tools support this approach; the ‘crash’ tool is one of them, and Anton’s talk reviewed it in detail. The talk also covered other aspects of post-mortem analysis, such as data collection, processing, and comparison with other methods.

Conference materials: https://www.globallogic.com/ua/events/kharkiv-embedded-conference-2019/

Published in: Technology

Post Mortem Debugging in Embedded Linux Systems

  1. Post mortem debugging in Embedded Linux Systems
     Anton Bondarenko, Senior Software Engineer/Architect, Bosch Sensortec
  2. Topics
     ● Introduction
     ● What is post-mortem analysis?
     ● Why do we need post-mortem data?
     ● How can it be retrieved? Problems and solutions
     ● How can it be analyzed?
       ○ Crash tool
     ● Examples
  3. Introduction
     ● 10+ years of Embedded Linux experience
     ● 4 years as a system engineer at Sony Mobile, working on the Xperia Z to Z3 generations with a focus on stability
       ○ Major activity was post-mortem analysis using different methods and approaches
  4. Post-mortem analysis
     Post-mortem analysis consists of different methods for investigating data collected at the moment the system state becomes unstable.
     Well-known solutions:
     ● GDB with a core dump
  5. Post-mortem data
     Post-mortem data may include:
     ● RAM regions
     ● CPU state
     ● Peripheral state
     [Diagram labels: RAM, Video/GFX, Shared memory]
  6. Why do we want post-mortem data
     Live debugging
     ● Focused on flow control
     ● Single instance
     ● Online on target
     ● System continues to evolve
     ● Limited scope
     Post-mortem debugging
     ● State analysis
     ● Multiple processing
     ● Offline or semi-offline
     ● System state is atomic
     ● Global scope
  7. How it can be retrieved
     Important rules to follow:
     ● Keep critical state information unmodified
     ● Collect as much as possible
     Collection may happen:
     ● With a system reset, for example in the bootloader
     ● Without a system reset, for example with the kdump approach
     ● In a hypervisor, as a VM dump
  8. Bootloader dumper
     Advantages:
     ● Small footprint
     ● Handles hardware cases
     Disadvantages:
     ● Separate drivers & tools
     ● Requires special handling for RAM initialization
     ● Intermediate boot stages
     [Diagram labels: First kernel, Unexpected system reset, ROM bootloader, RAM bootloader, Disk, Network]
  9. KDump
     Advantages:
     ● “Same” kernel
     ● Same utils
     ● Direct jump
     Disadvantages:
     ● Requires more memory
     ● Memory reservation
     ● Might not work after HW failures
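The "memory reservation" point on the KDump slide refers to the RAM that has to be set aside for the capture kernel, which is commonly requested with the `crashkernel=` kernel command-line parameter. The following is only a minimal sketch of reading such a reservation back from a command line. It is written in Python for illustration; the helper names and the sample command line are hypothetical, and only the simple `size[@offset]` form is handled (range-based or `auto` values raise an error).

```python
import re

# Size suffixes accepted by this sketch (K, M, G as powers of two).
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def _to_bytes(text):
    """Convert a string such as '128M' into a number of bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.upper())
    if match is None:
        raise ValueError(f"unsupported size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def parse_crashkernel(cmdline):
    """Return (size, offset) in bytes for a crashkernel=size[@offset] entry.

    Returns None if the parameter is absent; offset is None if not given.
    """
    for token in cmdline.split():
        if token.startswith("crashkernel="):
            value = token[len("crashkernel="):]
            size, _, offset = value.partition("@")
            return _to_bytes(size), (_to_bytes(offset) if offset else None)
    return None


# Hypothetical command line, used only to exercise the parser.
sample = "console=ttyS0,115200 root=/dev/mmcblk0p2 crashkernel=128M@16M"
reservation = parse_crashkernel(sample)
```

On a running target the same function could be fed the contents of `/proc/cmdline` to check how much memory the capture kernel was given.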
  10. Hypervisor
      All important information is controlled by the hypervisor.
      [Diagram labels: RAM, Video/GFX, VM1, VM0, VMM]
  11. How it can be analyzed
      Main requirement: OS and CPU architecture awareness
      Tool examples:
      ● Lauterbach TRACE32
      ● Red Hat Crash
  12. Lauterbach TRACE32
      ● Many supported architectures
      ● Requires a Linux kernel OS awareness library
      ● Supports scripting with its own script language
      ● Active maintenance
      ● License: Proprietary
  13. Red Hat Crash Utility
      ● Many supported architectures (x86, ARM, ARM64, MIPS)
      ● Uses GDB as its core library
      ● Native support for the Linux kernel
      ● Active maintenance
      ● License: GPL
  14. Crash extensions
      ● Native support for the plugin concept
      ● A few are available, including a very promising one
        ○ Python scripts in the Crash environment (PyScript)
      ● Supports symbols for the whole system: kernel + modules + userspace
      ● Full access to OS memory
        ○ User space analysis directly in the tool
        ○ JVM stack and state analysis
  15. Linux Kernel crash
      ● Possible causes
        ○ Many different ones
      ● Important information
        ○ Access to OS memory
  16. Linux Kernel crash (sys)
  17. Linux Kernel crash
  18. Linux Kernel crash (bt -l)
  19. Linux Kernel crash
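The kernel crash slides show output of the `sys` and `bt -l` commands inside the crash tool. As a rough sketch of how the same two commands might be run non-interactively against a dump, the Python helper below writes them to a command file and builds a command line for `crash`. The helper name and the file names `vmlinux` and `vmcore` are hypothetical, and the `-s` (silent start) and `-i <file>` (read commands from a file) options are assumed to be available in the installed crash version; `crash --help` should confirm this.

```python
import os
import tempfile


def build_crash_invocation(vmlinux, vmcore, commands):
    """Write crash commands to a temporary file and return (argv, path).

    An 'exit' command is appended so that crash terminates after the batch.
    """
    script = "\n".join(list(commands) + ["exit"]) + "\n"
    fd, path = tempfile.mkstemp(suffix=".crashcmd", text=True)
    with os.fdopen(fd, "w") as handle:
        handle.write(script)
    # Assumed invocation form: crash [options] <kernel image> <memory dump>
    argv = ["crash", "-s", "-i", path, vmlinux, vmcore]
    return argv, path


# Hypothetical file names; nothing is executed by this sketch.
argv, script_path = build_crash_invocation("vmlinux", "vmcore", ["sys", "bt -l"])
# To actually run it (requires crash and real files):
#   import subprocess; subprocess.run(argv, check=True)
```

Keeping the command list in a file makes it easy to apply the same first-look analysis to every collected dump.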
  20. IPC issues
      ● Possible causes
        ○ Unexpected state in a complex system
      ● Important information
        ○ All involved parts of memory (both kernel and userspace)
      [Diagram labels: Android App 1, Android Framework Manager, Android Framework Service, Android App 2, Android Framework Manager, ?]
  21. LK Deadlock
      ● Possible causes
        ○ Wrong handling of locks
      ● Important information
        ○ Access to lock memory
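The deadlock slide names access to lock memory as the important information. One way to use it, sketched here under the assumption that the owner of each lock and the lock each blocked task is waiting on have already been extracted from the snapshot, is to look for a cycle in the resulting wait-for relation. The function name, task names, and lock names below are hypothetical illustration data, not output of any tool.

```python
def find_deadlock(waiting_for, owner_of):
    """Find a cycle of tasks that block each other.

    waiting_for: dict mapping a task to the lock it is blocked on.
    owner_of:    dict mapping a lock to the task that currently holds it.
    Returns the list of tasks forming a cycle, or None if there is none.
    """
    for start in waiting_for:
        path = []
        seen = set()
        task = start
        # Follow task -> wanted lock -> lock owner until the chain ends
        # or a task repeats.
        while task in waiting_for and task not in seen:
            seen.add(task)
            path.append(task)
            task = owner_of.get(waiting_for[task])
        if task in seen:
            # The repeated task marks where the cycle begins.
            return path[path.index(task):]
    return None


# Hypothetical snapshot: a classic lock-order inversion between two tasks.
waiting = {"task_a": "lock_1", "task_b": "lock_2"}
owners = {"lock_1": "task_b", "lock_2": "task_a"}
cycle = find_deadlock(waiting, owners)
```

Because a snapshot freezes the system state, the owner and waiter information is consistent across all locks, which is what makes this kind of global check possible offline.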
  22. Watchdog
      ● Possible causes
        ○ Interrupt handling
        ○ Hardware errors
        ○ Memory corruption
      ● Important information
        ○ CPU register state
        ○ Special traces and logging
  23. Links
      ● KDump examples: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/deployment_guide/s1-kdump-crash
      ● Crash whitepaper: https://people.redhat.com/anderson/crash_whitepaper
      ● Crash tool main page: http://people.redhat.com/anderson/
      ● Crash tool sources: https://github.com/crash-utility/crash