SlideShare ist ein Scribd-Unternehmen logo
1 von 43
Downloaden Sie, um offline zu lesen
Node.js Postmortem
Working Group Update
Yunong Xiao, Netflix
Michael Dawson, IBM
Yunong Xiao
Platform Architect, Netflix
@yunongx
http://yunong.io
About Michael
About The Postmortem Workgroup
Howard Hellyer
@hhellyer
David Pacheco
@davepacheco
Julien Gilli
@mistredjules
Michael Dawson
@mhdawson
Chris Bailey
@seabaylea
Daniel Khan
@danielkhan
Joshua Clulow
@jclulow
Yunong Xiao
@yunong
James Bellenger
@jbellenger
Bradley Meck
@bmeck
Luca Maraschi
@lucamaraschi
David Clements
@davidmarkclements
Richard Chamberlain
@rnchamberlain
Mission Statement
The working group is dedicated to the support and improvement of
postmortem debugging for Node.js.
Debugging Node.js
Debugging Node.js
Test Environment
What about production?
“The method described in this article was designed to
provide a core dump… with a minimal impact on the
spacecraft… as the resumption of data acquisition from
the spacecraft is the highest priority.”
- Chafin, R. "Pioneer F & G Telemetry and Command Processor Core Dump
Program." JPL Technical Report XVI, no. 32-1526 (1971): 174.
Core Dumps: Brief History
● Magnetic core memory
● Dump out the contents
of “core” memory for
debugging
● “Core dump” was coined
● Initially printed on paper
● Postmortem debugging
was born
Production Constraints
● Uptime is critical
● Not easily reproducible
● Can’t simulate environment
● Resume normal operations ASAP
Postmortem Debugging
└─[0] <> node --abort_on_uncaught_exception throw.js
Uncaught Error
FROM
Object.<anonymous> (/Users/yunong/throw.js:1:63)
Module._compile (module.js:435:26)
Object.Module._extensions..js (module.js:442:10)
Module.load (module.js:356:32)
Function.Module._load (module.js:311:12)
Function.Module.runMain (module.js:467:10)
startup (node.js:134:18)
node.js:961:3
[1] 4131 illegal hardware instruction (core dumped) node
--abort_on_uncaught_exception throw.js
Where: Inspect stack trace
Why: Inspect heap and stack variable state
Generate Core Dump Ad-hoc
root@demo:~# gcore `pgrep node`
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7facaeffd700 (LWP 5650)]
[New Thread 0x7facaf7fe700 (LWP 5649)]
[New Thread 0x7facaffff700 (LWP 5648)]
[New Thread 0x7facbc967700 (LWP 5647)]
[New Thread 0x7facbd168700 (LWP 5617)]
[New Thread 0x7facbd969700 (LWP 5616)]
[New Thread 0x7facbe16a700 (LWP 5615)]
[New Thread 0x7facbe96b700 (LWP 5614)]
0x00007facbea5b5a9 in syscall () from /lib/x86_64-linux-gnu/libc.so.6
Saved corefile core.5602
Example
Netflix API Request
Node to API RPC
[2016-09-09T16:25:48.388Z] WARN: reactive
socket/rs-pool/17352 on lgud-yunong:
TcpLoadBalancer._connect: no more free
connections
Postmortem Debugging
Connection Pool State
> ::findjsobjects -p _connections | ::jsprint
{
"connected": {
},
"connecting": {},
"free": {
"100.82.188.185:7001": [...],
"100.82.37.181:7001": [...],
"100.82.41.121:7001": [...],
"100.82.102.157:7001": [...],
"100.82.106.115:7001": [...],
"100.82.129.239:7001": [...],
"100.82.102.158:7001": [...],
"100.82.74.237:7001": [...],
...
}
}
Postmortem Debugging is Critical to Large Scale
Production Node Deployments
Postmortem WG
github.com/nodejs/post-mortem/
Postmortem WG - Mission
Guide improvements in postmortem
● Interfaces/APIs
● Dump formats
● Tools and Techniques
State of key tools today
Heap dump - snapshot of heap
● heapdump module - https://github.com/bnoordhuis/node-heapdump
● Chrome developer tools
● Limitations
● Need to modify application
● Slow to generate (minutes or hours)
● O(N) memory usage
● Limited content
● Output is large
State of key tools today
Core dump - memory image
● Creation
○ Crash, signal
○ --abort-on-uncaught-exception
○ Fast (relative to heap dumps)
○ Size matches process memory
● OS debuggers
○ Examination at C/C++ or assembler level
○ No knowledge of Node/v8 structures
● Node core dump inspectors
○ MDB (limited platform support)
○ IDDE (IBM SDK specific)
○ LLNODE (newer, less complete)
Example commands
MDB_V8 command LLNODE command IDDE
Print a stack trace jsstack, jsframe v8 bt !stack, !frame
Find objects findjsobjects v8 findjsobjects,
v8 findjsinstances <type>
!jslistobjects
!jsgroupobjects
!jsfindbyproperty
!jsobjectsmatching
Print an object jsprint v8 inspect !jsobject
Print function source jssource v8 source (prints source for a stack
frame)
!jsobject, !string + work
Find constructor for an
object
jsconstructor n/a !jsconstructor
Print elements of a
FixedArray
v8array v8 inspect <instance> !array
Find native memory
backing a buffer
nodebuffer v8 inspect <instance> !nodebuffers
How to make this better?
● Improve ease of use
● Common APIs to introspect dumps
● Cross platform support
● Common command set
● Lightweight dump
The Postmortem WG is working on...
Common Heap Dump Format
Improved Core Dump Analysis
● Library in C & JS
● Tools: mdb_v8(mdb), llnode(lldb), ...
Node Report
Common Heap Dump Format
Enabler for new tools
Generation
● mdb
● llnode
Consumption
● Conversion to existing v8 format - > chrome dev tools
● C/Javascript APIs
Core Dump Analysis
Currently working on
● Platform coverage
● Re-use of command implementation
● Common APIs
Soon to be
nodejs/llnode !
https://github.com/nodejs/post-mortem/issues/37
Working to get to….
Node Report
Lightweight Dump
● Fast
● Small
● Human readable
● Key information to start investigating
● Triggers: exception, fatal error, signal, JavaScript API
NodeReport
example - heap out
of memory error
NodeReport content:
● Event summary
● Node.js and OS versions
● JavaScript stack trace
● Native stack trace
● Heap and GC statistics
● Resource usage
● libuv handle summary
● Environment variables
● OS ulimit settings
Javascript API
API in Javascript
● More accessible
● Leverages
○ llnode
○ Common Heap Dump (future)
JavaScript API - example application
Summary
What is postmortem debugging
Example of where it’s helpful
Activities of the working group
● Common heap format
● APIs (C/JS)
● Tools(lldb, mdb_v8, NodeReport)
Get Involved !
Great chance to learn
● Low level machine details
● Key debugging techniques
● Different platforms/operating systems
Where
● Most work done through GitHub issues/Pull Requests
● http://github.com/nodejs/post-mortem/
Postmortem Debugging is Critical to Large Scale
Production Node Deployments
Some production problems are otherwise impossible
Save complete process state for debugging later
Copyrights and Trademarks
IBM, the IBM logo, ibm.com are trademarks or registered
trademarks of International Business Machines Corp.,
registered in many jurisdictions worldwide. Other product and
service names might be trademarks of IBM or other companies.
A current list of IBM trademarks is available on the Web at
“Copyright and trademark information” at
www.ibm.com/legal/copytrade.shtml
Node.js is an official trademark of Joyent. IBM SDK for Node.js is not formally related to or endorsed by the official
Joyent Node.js open source or commercial project.
Java, JavaScript and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or
its affiliates.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United
States, other countries, or both.

Weitere ähnliche Inhalte

Was ist angesagt?

LAS16-301: OpenStack on Aarch64, running in production, upstream improvements...
LAS16-301: OpenStack on Aarch64, running in production, upstream improvements...LAS16-301: OpenStack on Aarch64, running in production, upstream improvements...
LAS16-301: OpenStack on Aarch64, running in production, upstream improvements...Linaro
 
Kernel Recipes 2018 - Live (Kernel) Patching: status quo and status futurus -...
Kernel Recipes 2018 - Live (Kernel) Patching: status quo and status futurus -...Kernel Recipes 2018 - Live (Kernel) Patching: status quo and status futurus -...
Kernel Recipes 2018 - Live (Kernel) Patching: status quo and status futurus -...Anne Nicolas
 
[OSSummitEU2017]Ten Llayers of Linux Container Security
[OSSummitEU2017]Ten Llayers of Linux Container Security[OSSummitEU2017]Ten Llayers of Linux Container Security
[OSSummitEU2017]Ten Llayers of Linux Container SecurityDaniel Oh
 
Native Java with GraalVM
Native Java with GraalVMNative Java with GraalVM
Native Java with GraalVMSylvain Wallez
 
gcma: guaranteed contiguous memory allocator
gcma:  guaranteed contiguous memory allocatorgcma:  guaranteed contiguous memory allocator
gcma: guaranteed contiguous memory allocatorSeongJae Park
 
LCE13: Test and Validation Mini-Summit: Review Current Linaro Engineering Pro...
LCE13: Test and Validation Mini-Summit: Review Current Linaro Engineering Pro...LCE13: Test and Validation Mini-Summit: Review Current Linaro Engineering Pro...
LCE13: Test and Validation Mini-Summit: Review Current Linaro Engineering Pro...Linaro
 
Fosdem'20_Past, present & future of DRLM project
Fosdem'20_Past, present & future of DRLM projectFosdem'20_Past, present & future of DRLM project
Fosdem'20_Past, present & future of DRLM projectDidac Oliveira
 
A165 tools for java and javascript
A165 tools for java and javascriptA165 tools for java and javascript
A165 tools for java and javascriptToby Corbin
 
Evaluating GPU programming Models for the LUMI Supercomputer
Evaluating GPU programming Models for the LUMI SupercomputerEvaluating GPU programming Models for the LUMI Supercomputer
Evaluating GPU programming Models for the LUMI SupercomputerGeorge Markomanolis
 
Building Android for the Cloud: Android as a Server (Mobile World Congress 2014)
Building Android for the Cloud: Android as a Server (Mobile World Congress 2014)Building Android for the Cloud: Android as a Server (Mobile World Congress 2014)
Building Android for the Cloud: Android as a Server (Mobile World Congress 2014)Ron Munitz
 
Las16 309 - lua jit arm64 port - status
Las16 309 - lua jit arm64 port - statusLas16 309 - lua jit arm64 port - status
Las16 309 - lua jit arm64 port - statusLinaro
 
Best practices for optimizing Red Hat platforms for large scale datacenter de...
Best practices for optimizing Red Hat platforms for large scale datacenter de...Best practices for optimizing Red Hat platforms for large scale datacenter de...
Best practices for optimizing Red Hat platforms for large scale datacenter de...Jeremy Eder
 
Red Hat Summit 2018 5 New High Performance Features in OpenShift
Red Hat Summit 2018 5 New High Performance Features in OpenShiftRed Hat Summit 2018 5 New High Performance Features in OpenShift
Red Hat Summit 2018 5 New High Performance Features in OpenShiftJeremy Eder
 
Let's make it flow ... one way
Let's make it flow ... one wayLet's make it flow ... one way
Let's make it flow ... one wayRoberto Ciatti
 
Introduction to OpenVX
Introduction to OpenVXIntroduction to OpenVX
Introduction to OpenVX家榮 張
 
Newbie’s guide to_the_gpgpu_universe
Newbie’s guide to_the_gpgpu_universeNewbie’s guide to_the_gpgpu_universe
Newbie’s guide to_the_gpgpu_universeOfer Rosenberg
 
The Self-Service Developer - GOTOCon CPH
The Self-Service Developer - GOTOCon CPHThe Self-Service Developer - GOTOCon CPH
The Self-Service Developer - GOTOCon CPHLaszlo Fogas
 
Trace kernel code tips
Trace kernel code tipsTrace kernel code tips
Trace kernel code tipsViller Hsiao
 
Java applications containerized and deployed
Java applications containerized and deployedJava applications containerized and deployed
Java applications containerized and deployedAnthony Dahanne
 

Was ist angesagt? (20)

LAS16-301: OpenStack on Aarch64, running in production, upstream improvements...
LAS16-301: OpenStack on Aarch64, running in production, upstream improvements...LAS16-301: OpenStack on Aarch64, running in production, upstream improvements...
LAS16-301: OpenStack on Aarch64, running in production, upstream improvements...
 
Kernel Recipes 2018 - Live (Kernel) Patching: status quo and status futurus -...
Kernel Recipes 2018 - Live (Kernel) Patching: status quo and status futurus -...Kernel Recipes 2018 - Live (Kernel) Patching: status quo and status futurus -...
Kernel Recipes 2018 - Live (Kernel) Patching: status quo and status futurus -...
 
[OSSummitEU2017]Ten Llayers of Linux Container Security
[OSSummitEU2017]Ten Llayers of Linux Container Security[OSSummitEU2017]Ten Llayers of Linux Container Security
[OSSummitEU2017]Ten Llayers of Linux Container Security
 
Native Java with GraalVM
Native Java with GraalVMNative Java with GraalVM
Native Java with GraalVM
 
gcma: guaranteed contiguous memory allocator
gcma:  guaranteed contiguous memory allocatorgcma:  guaranteed contiguous memory allocator
gcma: guaranteed contiguous memory allocator
 
LCE13: Test and Validation Mini-Summit: Review Current Linaro Engineering Pro...
LCE13: Test and Validation Mini-Summit: Review Current Linaro Engineering Pro...LCE13: Test and Validation Mini-Summit: Review Current Linaro Engineering Pro...
LCE13: Test and Validation Mini-Summit: Review Current Linaro Engineering Pro...
 
Fosdem'20_Past, present & future of DRLM project
Fosdem'20_Past, present & future of DRLM projectFosdem'20_Past, present & future of DRLM project
Fosdem'20_Past, present & future of DRLM project
 
A165 tools for java and javascript
A165 tools for java and javascriptA165 tools for java and javascript
A165 tools for java and javascript
 
Evaluating GPU programming Models for the LUMI Supercomputer
Evaluating GPU programming Models for the LUMI SupercomputerEvaluating GPU programming Models for the LUMI Supercomputer
Evaluating GPU programming Models for the LUMI Supercomputer
 
Building Android for the Cloud: Android as a Server (Mobile World Congress 2014)
Building Android for the Cloud: Android as a Server (Mobile World Congress 2014)Building Android for the Cloud: Android as a Server (Mobile World Congress 2014)
Building Android for the Cloud: Android as a Server (Mobile World Congress 2014)
 
Las16 309 - lua jit arm64 port - status
Las16 309 - lua jit arm64 port - statusLas16 309 - lua jit arm64 port - status
Las16 309 - lua jit arm64 port - status
 
The GPGPU Continuum
The GPGPU ContinuumThe GPGPU Continuum
The GPGPU Continuum
 
Best practices for optimizing Red Hat platforms for large scale datacenter de...
Best practices for optimizing Red Hat platforms for large scale datacenter de...Best practices for optimizing Red Hat platforms for large scale datacenter de...
Best practices for optimizing Red Hat platforms for large scale datacenter de...
 
Red Hat Summit 2018 5 New High Performance Features in OpenShift
Red Hat Summit 2018 5 New High Performance Features in OpenShiftRed Hat Summit 2018 5 New High Performance Features in OpenShift
Red Hat Summit 2018 5 New High Performance Features in OpenShift
 
Let's make it flow ... one way
Let's make it flow ... one wayLet's make it flow ... one way
Let's make it flow ... one way
 
Introduction to OpenVX
Introduction to OpenVXIntroduction to OpenVX
Introduction to OpenVX
 
Newbie’s guide to_the_gpgpu_universe
Newbie’s guide to_the_gpgpu_universeNewbie’s guide to_the_gpgpu_universe
Newbie’s guide to_the_gpgpu_universe
 
The Self-Service Developer - GOTOCon CPH
The Self-Service Developer - GOTOCon CPHThe Self-Service Developer - GOTOCon CPH
The Self-Service Developer - GOTOCon CPH
 
Trace kernel code tips
Trace kernel code tipsTrace kernel code tips
Trace kernel code tips
 
Java applications containerized and deployed
Java applications containerized and deployedJava applications containerized and deployed
Java applications containerized and deployed
 

Ähnlich wie Post mortem talk - Node Interactive EU

Kandroid for nhn_deview_20131013_v5_final
Kandroid for nhn_deview_20131013_v5_finalKandroid for nhn_deview_20131013_v5_final
Kandroid for nhn_deview_20131013_v5_finalNAVER D2
 
Customize and Secure the Runtime and Dependencies of Your Procedural Language...
Customize and Secure the Runtime and Dependencies of Your Procedural Language...Customize and Secure the Runtime and Dependencies of Your Procedural Language...
Customize and Secure the Runtime and Dependencies of Your Procedural Language...VMware Tanzu
 
Why use JavaScript in Hardware? GoTo Conf - Berlin
Why use JavaScript in Hardware? GoTo Conf - Berlin Why use JavaScript in Hardware? GoTo Conf - Berlin
Why use JavaScript in Hardware? GoTo Conf - Berlin TechnicalMachine
 
Критика "библиотечного" подхода в разработке под Android. UA Mobile 2016.
Критика "библиотечного" подхода в разработке под Android. UA Mobile 2016.Критика "библиотечного" подхода в разработке под Android. UA Mobile 2016.
Критика "библиотечного" подхода в разработке под Android. UA Mobile 2016.UA Mobile
 
An Introduction To Android
An Introduction To AndroidAn Introduction To Android
An Introduction To Androidnatdefreitas
 
GeoServer Developers Workshop
GeoServer Developers WorkshopGeoServer Developers Workshop
GeoServer Developers WorkshopJody Garnett
 
Node.js Web Apps @ ebay scale
Node.js Web Apps @ ebay scaleNode.js Web Apps @ ebay scale
Node.js Web Apps @ ebay scaleDmytro Semenov
 
Lessons learned from designing QA automation event streaming platform(IoT big...
Lessons learned from designing QA automation event streaming platform(IoT big...Lessons learned from designing QA automation event streaming platform(IoT big...
Lessons learned from designing QA automation event streaming platform(IoT big...Omid Vahdaty
 
Rapid and Reliable Developing with HTML5 & GWT
Rapid and Reliable Developing with HTML5 & GWTRapid and Reliable Developing with HTML5 & GWT
Rapid and Reliable Developing with HTML5 & GWTManuel Carrasco Moñino
 
Why kernelspace sucks?
Why kernelspace sucks?Why kernelspace sucks?
Why kernelspace sucks?OpenFest team
 
What's New in OpenLDAP
What's New in OpenLDAPWhat's New in OpenLDAP
What's New in OpenLDAPLDAPCon
 
"You Don't Know NODE.JS" by Hengki Mardongan Sihombing (Urbanhire)
"You Don't Know NODE.JS" by Hengki Mardongan Sihombing (Urbanhire)"You Don't Know NODE.JS" by Hengki Mardongan Sihombing (Urbanhire)
"You Don't Know NODE.JS" by Hengki Mardongan Sihombing (Urbanhire)Tech in Asia ID
 
Troubleshooting .net core on linux
Troubleshooting .net core on linuxTroubleshooting .net core on linux
Troubleshooting .net core on linuxPavel Klimiankou
 
Real-World Docker: 10 Things We've Learned
Real-World Docker: 10 Things We've Learned  Real-World Docker: 10 Things We've Learned
Real-World Docker: 10 Things We've Learned RightScale
 
Inside Android's UI / ABS 2013
Inside Android's UI / ABS 2013Inside Android's UI / ABS 2013
Inside Android's UI / ABS 2013Opersys inc.
 

Ähnlich wie Post mortem talk - Node Interactive EU (20)

Kandroid for nhn_deview_20131013_v5_final
Kandroid for nhn_deview_20131013_v5_finalKandroid for nhn_deview_20131013_v5_final
Kandroid for nhn_deview_20131013_v5_final
 
Customize and Secure the Runtime and Dependencies of Your Procedural Language...
Customize and Secure the Runtime and Dependencies of Your Procedural Language...Customize and Secure the Runtime and Dependencies of Your Procedural Language...
Customize and Secure the Runtime and Dependencies of Your Procedural Language...
 
Why use JavaScript in Hardware? GoTo Conf - Berlin
Why use JavaScript in Hardware? GoTo Conf - Berlin Why use JavaScript in Hardware? GoTo Conf - Berlin
Why use JavaScript in Hardware? GoTo Conf - Berlin
 
Nodejs
NodejsNodejs
Nodejs
 
Критика "библиотечного" подхода в разработке под Android. UA Mobile 2016.
Критика "библиотечного" подхода в разработке под Android. UA Mobile 2016.Критика "библиотечного" подхода в разработке под Android. UA Mobile 2016.
Критика "библиотечного" подхода в разработке под Android. UA Mobile 2016.
 
Node js meetup
Node js meetupNode js meetup
Node js meetup
 
An Introduction To Android
An Introduction To AndroidAn Introduction To Android
An Introduction To Android
 
GeoServer Developers Workshop
GeoServer Developers WorkshopGeoServer Developers Workshop
GeoServer Developers Workshop
 
Surge2012
Surge2012Surge2012
Surge2012
 
Node.js Web Apps @ ebay scale
Node.js Web Apps @ ebay scaleNode.js Web Apps @ ebay scale
Node.js Web Apps @ ebay scale
 
Getting started with AMD GPUs
Getting started with AMD GPUsGetting started with AMD GPUs
Getting started with AMD GPUs
 
Lessons learned from designing QA automation event streaming platform(IoT big...
Lessons learned from designing QA automation event streaming platform(IoT big...Lessons learned from designing QA automation event streaming platform(IoT big...
Lessons learned from designing QA automation event streaming platform(IoT big...
 
Rapid and Reliable Developing with HTML5 & GWT
Rapid and Reliable Developing with HTML5 & GWTRapid and Reliable Developing with HTML5 & GWT
Rapid and Reliable Developing with HTML5 & GWT
 
Why kernelspace sucks?
Why kernelspace sucks?Why kernelspace sucks?
Why kernelspace sucks?
 
What's New in OpenLDAP
What's New in OpenLDAPWhat's New in OpenLDAP
What's New in OpenLDAP
 
Android Internals
Android InternalsAndroid Internals
Android Internals
 
"You Don't Know NODE.JS" by Hengki Mardongan Sihombing (Urbanhire)
"You Don't Know NODE.JS" by Hengki Mardongan Sihombing (Urbanhire)"You Don't Know NODE.JS" by Hengki Mardongan Sihombing (Urbanhire)
"You Don't Know NODE.JS" by Hengki Mardongan Sihombing (Urbanhire)
 
Troubleshooting .net core on linux
Troubleshooting .net core on linuxTroubleshooting .net core on linux
Troubleshooting .net core on linux
 
Real-World Docker: 10 Things We've Learned
Real-World Docker: 10 Things We've Learned  Real-World Docker: 10 Things We've Learned
Real-World Docker: 10 Things We've Learned
 
Inside Android's UI / ABS 2013
Inside Android's UI / ABS 2013Inside Android's UI / ABS 2013
Inside Android's UI / ABS 2013
 

Mehr von Michael Dawson

Index 2018 talk to your code
Index 2018   talk to your codeIndex 2018   talk to your code
Index 2018 talk to your codeMichael Dawson
 
Index 2018 node.js what's next
Index 2018   node.js what's nextIndex 2018   node.js what's next
Index 2018 node.js what's nextMichael Dawson
 
N api - node interactive 2017
N api - node interactive 2017N api - node interactive 2017
N api - node interactive 2017Michael Dawson
 
N api-node summit-2017-final
N api-node summit-2017-finalN api-node summit-2017-final
N api-node summit-2017-finalMichael Dawson
 
Accelerate your digital transformation
Accelerate your digital transformationAccelerate your digital transformation
Accelerate your digital transformationMichael Dawson
 
Node.js Community Benchmarking WG update
Node.js Community  Benchmarking WG updateNode.js Community  Benchmarking WG update
Node.js Community Benchmarking WG updateMichael Dawson
 
A294 fips support in node
A294  fips support in nodeA294  fips support in node
A294 fips support in nodeMichael Dawson
 
A295 nodejs-knowledge-accelerator
A295   nodejs-knowledge-acceleratorA295   nodejs-knowledge-accelerator
A295 nodejs-knowledge-acceleratorMichael Dawson
 
A301 ctu madrid2016-monitoring
A301 ctu madrid2016-monitoringA301 ctu madrid2016-monitoring
A301 ctu madrid2016-monitoringMichael Dawson
 
Update from-build-workgroup
Update from-build-workgroupUpdate from-build-workgroup
Update from-build-workgroupMichael Dawson
 
Micro app-framework - NodeLive Boston
Micro app-framework - NodeLive BostonMicro app-framework - NodeLive Boston
Micro app-framework - NodeLive BostonMichael Dawson
 
Node liveboston welcome
Node liveboston welcomeNode liveboston welcome
Node liveboston welcomeMichael Dawson
 
Node home automation with Node.js and MQTT
Node home automation with Node.js and MQTTNode home automation with Node.js and MQTT
Node home automation with Node.js and MQTTMichael Dawson
 

Mehr von Michael Dawson (18)

Index 2018 talk to your code
Index 2018   talk to your codeIndex 2018   talk to your code
Index 2018 talk to your code
 
Index 2018 node.js what's next
Index 2018   node.js what's nextIndex 2018   node.js what's next
Index 2018 node.js what's next
 
N api - node interactive 2017
N api - node interactive 2017N api - node interactive 2017
N api - node interactive 2017
 
N api-node summit-2017-final
N api-node summit-2017-finalN api-node summit-2017-final
N api-node summit-2017-final
 
Accelerate your digital transformation
Accelerate your digital transformationAccelerate your digital transformation
Accelerate your digital transformation
 
Ask us anything v9
Ask us anything v9Ask us anything v9
Ask us anything v9
 
Node.js Community Benchmarking WG update
Node.js Community  Benchmarking WG updateNode.js Community  Benchmarking WG update
Node.js Community Benchmarking WG update
 
Cascon intro
Cascon introCascon intro
Cascon intro
 
A294 fips support in node
A294  fips support in nodeA294  fips support in node
A294 fips support in node
 
A295 nodejs-knowledge-accelerator
A295   nodejs-knowledge-acceleratorA295   nodejs-knowledge-accelerator
A295 nodejs-knowledge-accelerator
 
A301 ctu madrid2016-monitoring
A301 ctu madrid2016-monitoringA301 ctu madrid2016-monitoring
A301 ctu madrid2016-monitoring
 
Update from-build-workgroup
Update from-build-workgroupUpdate from-build-workgroup
Update from-build-workgroup
 
Node fips
Node fipsNode fips
Node fips
 
Micro app-framework - NodeLive Boston
Micro app-framework - NodeLive BostonMicro app-framework - NodeLive Boston
Micro app-framework - NodeLive Boston
 
Node liveboston welcome
Node liveboston welcomeNode liveboston welcome
Node liveboston welcome
 
Micro app-framework
Micro app-frameworkMicro app-framework
Micro app-framework
 
Node home automation with Node.js and MQTT
Node home automation with Node.js and MQTTNode home automation with Node.js and MQTT
Node home automation with Node.js and MQTT
 
Java one 2015 - v1
Java one   2015 - v1Java one   2015 - v1
Java one 2015 - v1
 

Kürzlich hochgeladen

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 

Kürzlich hochgeladen (20)

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 

Post mortem talk - Node Interactive EU

  • 1. Node.js Postmortem Working Group Update Yunong Xiao, Netflix Michael Dawson, IBM
  • 2. Yunong Xiao Platform Architect, Netflix @yunongx http://yunong.io
  • 4. About The Postmortem Workgroup Howard Hellyer @hhellyer David Pacheco @davepacheco Julien Gilli @mistredjules Michael Dawson @mhdawson Chris Bailey @seabaylea Daniel Khan @danielkhan Joshua Clulow @jclulow Yunong Xiao @yunong James Bellenger @jbellenger Bradley Meck @bmeck Luca Maraschi @lucamaraschi David Clements @davidmarkclements Richard Chamberlain @rnchamberlain
  • 5. Mission Statement The working group is dedicated to the support and improvement of postmortem debugging for Node.js.
  • 9.
  • 10.
  • 11. “The method described in this article was designed to provide a core dump… with a minimal impact on the spacecraft… as the resumption of data acquisition from the spacecraft is the highest priority.” - Chafin, R. "Pioneer F & G Telemetry and Command Processor Core Dump Program." JPL Technical Report XVI, no. 32-1526 (1971): 174.
  • 12. Core Dumps: Brief History ● Magnetic core memory ● Dump out the contents of “core” memory for debugging ● “Core dump” was coined ● Initially printed on paper ● Postmortem debugging was born
  • 13. Production Constraints ● Uptime is critical ● Not easily reproducible ● Can’t simulate environment ● Resume normal operations ASAP
  • 15. └─[0] <> node --abort_on_uncaught_exception throw.js Uncaught Error FROM Object.<anonymous> (/Users/yunong/throw.js:1:63) Module._compile (module.js:435:26) Object.Module._extensions..js (module.js:442:10) Module.load (module.js:356:32) Function.Module._load (module.js:311:12) Function.Module.runMain (module.js:467:10) startup (node.js:134:18) node.js:961:3 [1] 4131 illegal hardware instruction (core dumped) node --abort_on_uncaught_exception throw.js
  • 16. Where: Inspect stack trace Why: Inspect heap and stack variable state
  • 17. Generate Core Dump Ad-hoc root@demo:~# gcore `pgrep node` [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". [New Thread 0x7facaeffd700 (LWP 5650)] [New Thread 0x7facaf7fe700 (LWP 5649)] [New Thread 0x7facaffff700 (LWP 5648)] [New Thread 0x7facbc967700 (LWP 5647)] [New Thread 0x7facbd168700 (LWP 5617)] [New Thread 0x7facbd969700 (LWP 5616)] [New Thread 0x7facbe16a700 (LWP 5615)] [New Thread 0x7facbe96b700 (LWP 5614)] 0x00007facbea5b5a9 in syscall () from /lib/x86_64-linux-gnu/libc.so.6 Saved corefile core.5602
  • 20. Node to API RPC
  • 21. [2016-09-09T16:25:48.388Z] WARN: reactive socket/rs-pool/17352 on lgud-yunong: TcpLoadBalancer._connect: no more free connections
  • 23. Connection Pool State > ::findjsobjects -p _connections | ::jsprint { "connected": { }, "connecting": {}, "free": { "100.82.188.185:7001": [...], "100.82.37.181:7001": [...], "100.82.41.121:7001": [...], "100.82.102.157:7001": [...], "100.82.106.115:7001": [...], "100.82.129.239:7001": [...], "100.82.102.158:7001": [...], "100.82.74.237:7001": [...], ... } }
  • 24. Postmortem Debugging is Critical to Large Scale Production Node Deployments
  • 26. Postmortem WG - Mission Guide improvements in postmortem ● Interfaces/APIs ● Dump formats ● Tools and Techniques
  • 27. State of key tools today Heap dump - snapshot of heap ● heapdump module - https://github.com/bnoordhuis/node-heapdump ● Chrome developer tools ● Limitations ● Need to modify application ● Slow to generate (minutes or hours) ● O(N) memory usage ● Limited content ● Output is large
  • 28. State of key tools today Core dump - memory image ● Creation ○ Crash, signal ○ --abort-on-uncaught-exception ○ Fast (relative to heap dumps) ○ Size matches process memory ● OS debuggers ○ Examination at C/C++ or assembler level ○ No knowledge of Node/v8 structures ● Node core dump inspectors ○ MDB (limited platform support) ○ IDDE (IBM SDK specific) ○ LLNODE (newer, less complete)
  • 29. Example commands MDB_V8 command LLNODE command IDDE Print a stack trace jsstack, jsframe v8 bt !stack, !frame Find objects findjsobjects v8 findjsobjects, v8 findjsinstances <type> !jslistobjects !jsgroupobjects !jsfindbyproperty !jsobjectsmatching Print an object jsprint v8 inspect !jsobject Print function source jssource v8 source (prints source for a stack frame) !jsobject, !string + work Find constructor for an object jsconstructor n/a !jsconstructor Print elements of a FixedArray v8array v8 inspect <instance> !array Find native memory backing a buffer nodebuffer v8 inspect <instance> !nodebuffers
  • 30. How to make this better? ● Improve ease of use ● Common APIs to introspect dumps ● Cross platform support ● Common command set ● Lightweight dump
  • 31. The Postmortem WG is working on... Common Heap Dump Format Improved Core Dump Analysis ● Library in C & JS ● Tools: mdb_v8(mdb), llnode(lldb), ... Node Report
  • 32. Common Heap Dump Format Enabler for new tools Generation ● mdb ● llnode Consumption ● Conversion to existing v8 format - > chrome dev tools ● C/Javascript APIs
  • 33. Core Dump Analysis Currently working on ● Platform coverage ● Re-use of command implementation ● Common APIs Soon to be nodejs/llnode ! https://github.com/nodejs/post-mortem/issues/37
  • 34. Working to get to….
  • 35. Node Report Lightweight Dump ● Fast ● Small ● Human readable ● Key information to start investigating ● Triggers: exception, fatal error, signal, JavaScript API
  • 36. NodeReport example - heap out of memory error NodeReport content: ● Event summary ● Node.js and OS versions ● JavaScript stack trace ● Native stack trace ● Heap and GC statistics ● Resource usage ● libuv handle summary ● Environment variables ● OS ulimit settings
  • 37. Javascript API API in Javascript ● More accessible ● Leverages ○ llnode ○ Common Heap Dump (future)
  • 38. JavaScript API - example application
  • 39. Summary What is postmortem debugging Example of where it’s helpful Activities of the working group ● Common heap format ● APIs (C/JS) ● Tools(lldb, mdb_v8, NodeReport)
  • 40. Get Involved ! Great chance to learn ● Low level machine details ● Key debugging techniques ● Different platforms/operating systems Where ● Most work done through GitHub issues/Pull Requests ● http://github.com/nodejs/post-mortem/
  • 41. Postmortem Debugging is Critical to Large Scale Production Node Deployments
  • 42. Some production problems are otherwise impossible Save complete process state for debugging later
  • 43. Copyrights and Trademarks IBM, the IBM logo, ibm.com are trademarks or registered trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml Node.js is an official trademark of Joyent. IBM SDK for Node.js is not formally related to or endorsed by the official Joyent Node.js open source or commercial project. Java, JavaScript and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.