In our everyday Java programming, we rely on familiar APIs without fully realizing their hidden performance impacts. This session aims to unveil the concealed performance aspects of common Java APIs and shed light on how they can influence your application's performance.
Join us to explore the unnoticed performance effects of these APIs and learn strategies to mitigate their impact. Whether you're a seasoned developer or new to Java, this session equips you with essential knowledge to optimize your applications.
There are certain Java APIs that we use in our everyday programming. However, we may not be aware of their notorious performance side effects. In this session, we are going to discuss a few common Java APIs and their performance impact on your application.
This session will introduce you to five frequent Java performance problems encountered by large organizations. We will examine potential remedies for these issues, as well as how to detect them well before they manifest.
Uncover the hidden challenges that plague production environments in this eye-opening session. Join us as we explore the five most common performance problems that emerge in live systems. Gain invaluable insights into detecting these issues early on, before they wreak havoc on your operations. Discover practical solutions that empower you to address these challenges head-on, ensuring optimal performance and seamless user experiences.
16 ARTIFACTS TO CAPTURE WHEN YOUR CONTAINER APPLICATION IS IN TROUBLE - Tier1 app
Troubleshooting a container application's performance is tricky if proper diagnostic information isn't captured. In this session, we will share 16 essential artifacts that you can consider capturing when your container application is in trouble. We will also discuss effective tools, techniques, and tips that you can use to analyze these artifacts.
1. The document discusses various steps and tools for troubleshooting real production problems related to CPU spikes, thread dumps, memory leaks, and garbage collection issues.
2. It provides guidance on using tools like 'top', 'jstack', 'jmap', 'jcmd', Eclipse MAT and HeapHero to analyze thread dumps, capture heap dumps, and diagnose memory leaks.
3. The document also emphasizes the importance of enabling GC logs and capturing the right system metrics like thread states, file descriptors, and GC throughput to detect problems early.
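The thread-state metric mentioned in the summary above can also be sampled from inside the JVM. The following is a minimal sketch, not taken from the deck: it uses the standard `java.lang.management` API rather than `jstack`, and the class and method names (`ThreadStateSnapshot`, `countThreadStates`) are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper: summarizes an in-process thread dump by thread state.
public class ThreadStateSnapshot {

    /** Counts the live threads of this JVM by state (RUNNABLE, BLOCKED, WAITING, ...). */
    public static Map<Thread.State, Integer> countThreadStates() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        // false, false: skip locked-monitor and synchronizer details to keep the dump cheap.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            if (info != null) {
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        countThreadStates().forEach((state, n) -> System.out.println(state + ": " + n));
    }
}
```

A steadily growing BLOCKED or WAITING count across samples is the kind of signal such a snapshot is meant to surface; a full `jstack` dump is still needed to see the stack traces behind it.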
The document provides an overview of real-time Java and its key concepts:
- Real-time means determinism where deadlines must be met, as opposed to just fast throughput.
- The Real-Time Specification for Java (RTSJ) addresses limitations of Java SE for real-time applications like unpredictable delays from class loading and garbage collection.
- RTSJ implementations like IBM WebSphere Real Time and Sun RTS add features like high resolution timers, priority scheduling, and asynchronous event handling.
Analyzing the Performance of Mobile Web - Ariya Hidayat
This document discusses techniques for analyzing the performance of mobile web applications. It covers challenges like network variability, different device hardware, and continuous integration. Approaches mentioned include benchmarking, injecting instrumentation, emulation, and remote inspection. Strategies suggested are reducing complexity, replicating analysis on desktop, and tweaking at the system level. Tools mentioned include the Nexus One, Gingerbread, PhantomJS, and headless WebKit. The document provides examples and caveats for analyzing areas like network traffic, graphics commands, garbage collection, and JavaScript parsing.
‘16 artifacts’ to capture when there is a production problem - Tier1 app
1. The document discusses 16 types of data ("artifacts") that can be captured to troubleshoot production problems, including GC logs, thread dumps, heap dumps, process information, system calls, and application logs.
2. It provides details on each type of artifact, how to capture it, and tools that can be used to analyze it. Specific artifacts covered include GC logs, thread dumps, heap dumps, heap substitute data, top, ps, top -H, disk usage, dmesg, netstat, ping, vmstat, iostat, kernel parameters, and application logs.
3. The document emphasizes that collecting these different types of 360-degree data around the application, JVM, operating system,
H2O Design and Infrastructure with Matt Dowle - Sri Ambati
This document provides an overview of H2O, an open source machine learning platform that allows for distributed, in-memory analytics of large datasets. It discusses how H2O works, including how it uses a map-reduce style to parallelize machine learning algorithms across multiple nodes. The document demonstrates starting an 8-node H2O cluster on Amazon EC2 and importing a 23GB dataset in under a minute, significantly faster than with other tools. It also summarizes how H2O's distributed fork-join framework executes tasks across nodes and shares data through its distributed data structures.
Invited Netflix talk: JVM issues in the age of scale! We take an under-the-hood look at Java locking, the memory model, overheads, serialization, UUID, GC tuning, CMS, ParallelGC, java.
16 artifacts to capture when there is a production problem - Tier1 app
Production problems are tricky to troubleshoot if proper diagnostic information isn't captured. This session discusses 16 important artifacts that you need to capture and the effective tools that you can use to analyze them.
After explaining what problem Reactive Programming solves, I will give an introduction to one implementation: RxJava. I show how to compose Observables without concurrency first and then with Schedulers. I finish the talk by showing examples of flow control and drawbacks.
Inspired by https://www.infoq.com/presentations/rxjava-reactor and https://www.infoq.com/presentations/rx-service-architecture
Code: https://github.com/toff63/Sandbox/tree/master/java/rsjug-rx/rsjug-rx/src/main/java/rs/jug/rx
Google App Engine is a PaaS that allows developers to build and host web applications in the Google cloud. The document summarizes a workshop on using the Java runtime environment on GAE. It discusses the SDKs, deploying and managing apps on GAE, data storage using the datastore, and limitations like the 30-second request limit. The biggest benefits of GAE are scalability and low startup costs, while the hardest limit is the 30-second request processing time.
Dmitry Ivanov "My First Application in the Cloud, or Why You Should Use... - DataArt
This document discusses the benefits of using Azure Web Apps for building cloud applications. It highlights that Azure Web Apps allows for continuous deployment from various source control services, automatic scaling based on performance metrics, use of web jobs for background tasks, traffic management across regions, backups, and hybrid connections for private access to resources. Developing applications with Azure Web Apps provides agility, scalability, and global reach.
Node.js is a JavaScript runtime built on Chrome's V8 JavaScript engine. It uses an event-driven, non-blocking I/O model that makes it lightweight and efficient for data-intensive real-time applications that run across distributed devices. The document discusses Node.js' architecture, its use of JavaScript for simplicity, and how it is inspired by other technologies like Twisted and EventMachine. It also covers related tools like NPM for package management and Grunt for automating tasks.
Shooting the troubles: Crashes, Slowdowns, CPU Spikes - Tier1 app
This presentation covers best practices for troubleshooting production problems, how to analyze thread dumps, heap dumps, GC logs and other artifacts, and real-world examples of problems that caused outages in major enterprises.
Non-blocking I/O, Event loops and node.js - Marcus Frödin
This 15 minute presentation discusses non-blocking I/O, event loops, and Node.js. It builds on previous work by Ryan Dahl, explaining how threads can be expensive due to context switching and memory usage, and how Node.js uses an event-driven, non-blocking model to avoid these costs. Code examples demonstrate getting and printing a policy object, handling HTTP requests asynchronously without blocking additional connections, and using callbacks to chain asynchronous actions together.
Discover the Top 5 Java Performance Problems in our presentation. Learn about common issues in Java coding and how to fix them. This guide helps you make your Java applications run better and faster.
Presto generates Java bytecode at runtime to optimize query execution. Key query operations like filtering, projections, joins and aggregations are compiled into efficient Java methods using libraries like ASM and Fastutil. This bytecode generation improves performance by 30% through techniques like compiling row hashing for join lookups directly into machine instructions.
This document summarizes steps taken to optimize CPU usage of a JVM running an Akka application using Spray. The application was not utilizing the CPU effectively, with throughput very low. Understanding Akka's asynchronous, actor-based architecture and obtaining thread dumps revealed the logging was blocking threads. The solution was to configure logging to occur asynchronously within actors to avoid blocking and better utilize the CPU.
Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes. It is written in Java and uses a pluggable backend. Presto is fast due to code generation and runtime compilation techniques. It provides a library and framework for building distributed services and fast Java collections. Plugins allow Presto to connect to different data sources like Hive, Cassandra, MongoDB and more.
Apache Flink(tm) - A Next-Generation Stream Processor - Aljoscha Krettek
This talk begins with a brief overview of the current state of streaming data analysis. It continues with a short introduction to the Apache Flink system for real-time data analysis before diving deeper into some of the interesting properties that distinguish Flink from the other players in this space. To that end, we will look at example use cases that either come directly from users or are based on our experience with users. Specific properties we will consider include support for splitting events into individual sessions based on the time at which an event occurred (event time), determining points in time at which to save the state of a streaming program for later restarts, efficient handling of very large stateful streaming computations, and access to that state from outside.
Production Time Profiling and Diagnostics on the JVM - Marcus Hirt
These are the slides for my Code One 2018 talk on profiling and diagnostics on the JVM. The talk goes through various serviceability technologies built into the JVM, but with a focus on the production time use cases.
This document provides an introduction to Node.js, a framework for building scalable server-side applications with asynchronous JavaScript. It discusses what Node.js is, how it uses non-blocking I/O and events to avoid wasting CPU cycles, and how external Node modules help create a full JavaScript stack. Examples are given of using Node modules like Express for building RESTful APIs and Socket.IO for implementing real-time features like chat. Best practices, limitations, debugging techniques and references are also covered.
IBM InterConnect: Java vs JavaScript for Enterprise WebApps - Chris Bailey
The last few years have seen huge growth in the usage of JavaScript, to the extent that it is often reported to be the #1 programming language in use today. Additionally, the arrival of server-side JavaScript through frameworks such as Node.js and Ringo.js, and JavaScript on the JVM through Nashorn and Avatar.js, means that enterprise web applications written in JavaScript are not just a possibility but a reality for companies such as LinkedIn, eBay, Yahoo, ADP and Dow Jones. This session will compare and contrast the two platforms and describe the advantages of each for deploying, managing and monitoring highly scalable applications. It will also introduce IBM's strategy for building a common ecosystem around the two languages.
Presented at IBM InterConnect, Feb 25th, 2015
The document discusses the ELK stack, including Logstash for collecting, centralizing, parsing, storing, and searching logs; Elasticsearch for storing parsed log data from Logstash in a searchable format; and Kibana for visualizing and interacting with logs stored in Elasticsearch. It provides examples of using Logstash to ingest logs from multiple systems and ship the parsed data to Elasticsearch.
The document discusses changes in Java versions from Java 8 to Java 14. It covers major new features and improvements in each version including modules in Java 9, switch expressions in Java 12, and records in Java 14. It also discusses real world challenges with upgrading such as compatibility, multiple JVMs, library updates, and IDE support.
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ... - Jason Dai
This document summarizes a CVPR 2020 tutorial on the Analytics Zoo platform for automated machine learning workflows for distributed big data using Apache Spark. The tutorial covers an overview of Analytics Zoo and the BigDL distributed deep learning framework. It demonstrates distributed training of deep learning models using TensorFlow and PyTorch on Spark, and features of Analytics Zoo like end-to-end pipelines, ML workflow for automation, and model deployment with cluster serving. Real-world use cases applying Analytics Zoo at companies like SK Telecom, Midea, and MasterCard are also presented.
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR - Tier1 app
On the surface, ‘java.lang.OutOfMemoryError’ appears to be a single error, but underneath there are 9 types of OutOfMemoryError. Each type has different causes, diagnosis approaches and solutions. This session equips you with the knowledge, tools, and techniques needed to troubleshoot and conquer OutOfMemoryError in all its forms, ensuring smoother, more efficient Java applications.
Effectively Troubleshoot 9 Types of OutOfMemoryError - Tier1 app
Embark on a journey into the depths of java.lang.OutOfMemoryError as we unravel its complex nature. Discover the nine distinct faces of this memory-related challenge and gain valuable insights into their unique causes and solutions. This session equips you with the knowledge, tools, and techniques needed to troubleshoot and conquer OutOfMemoryError in all its forms, ensuring smoother, more efficient Java applications.
Similar to KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
1) The document discusses how monitoring micro-metrics like garbage collection logs and thread dumps can help predict production outages in applications. It provides examples of how specific micro-metrics could predict issues like memory leaks, backend slowdowns, CPU spikes, and poor response times.
2) The document also describes yCrash, a tool that captures micro-metrics every 3 minutes from applications and uses machine learning to detect potential problems and trigger full troubleshooting if an issue is forecasted. It provides open-source scripts to collect various system and application metrics for troubleshooting.
3) Real-world case studies are presented on how micro-metrics helped predict and solve issues for major financial, trading, and travel companies to prevent production
Step into the future of application performance monitoring as we unveil the game-changing potential of micro-metrics. In this enlightening session, we'll explore why traditional macro-metrics fall short in predicting performance problems and how to overcome this limitation. Discover the key micro-metrics that serve as powerful lead indicators, enabling you to forecast application performance with remarkable accuracy. Unleash the ability to detect and mitigate potential outages several minutes, even hours, before they impact your operations.
Predicting Production Outages: Unleashing the Power of Micro-Metrics – ADDO C... - Tier1 app
This document discusses various micro-metrics and tools that can be used to predict and troubleshoot production issues related to memory leaks, garbage collection throughput, backend slowdowns, CPU spikes, and concurrency issues. It provides examples of micro-metrics like GC throughput and tools like IBM GC & Memory Visualizer, yCrash, FastThread, and Google Garbage Cat that can analyze GC logs, thread dumps, and process data to identify potential problems. The document aims to help predict issues before they impact customers by continuously monitoring these micro-metrics and signals.
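As a rough illustration of the GC throughput micro-metric named above, the sketch below estimates it in-process. It assumes the common definition (the percentage of elapsed time not spent in garbage collection) and uses the standard `GarbageCollectorMXBean` counters, which are only approximate; the tools listed above work from GC logs instead. The class and method names are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: estimates GC throughput from the JVM's management beans.
public class GcThroughput {

    /** Percentage of elapsed time NOT spent in GC; both arguments are in milliseconds. */
    public static double throughputPercent(long gcTimeMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 100.0; // no elapsed time measured yet
        }
        return 100.0 * (uptimeMillis - gcTimeMillis) / uptimeMillis;
    }

    /** Sums the approximate accumulated collection time reported by every collector. */
    public static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report a time
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC throughput: %.2f%%%n", throughputPercent(totalGcTimeMillis(), uptime));
    }
}
```

Sampling this value periodically and watching for a downward trend is one way to treat it as a lead indicator; for concurrent collectors the reported time may include concurrent work, so the figure should be read as an estimate.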
There are more than 600 arguments that you can pass to JVM only for garbage collection and memory. It's way too many arguments for anyone to digest and comprehend. In this session, we will highlight seven essential JVM parameters that will improve the performance of your application.
In this session, we will be discussing major outages that happened in major enterprises. We will analyse the actual thread dumps, heap dumps, GC logs, and other artifacts captured at the time of the problem. After this session, troubleshooting CPU spikes, OutOfMemoryError, response time degradations, network connectivity issues, and application unresponsiveness may not stump you.
It's not just stock market charts that have patterns. Your application memory also has patterns. In this session, you are going to learn 6 unique memory patterns. Using these patterns, you can *predict* application outages well in advance and also optimize the application's performance.
This session brings to your attention how several millions of dollars are wasted and what you can do to save money. Optimizing garbage collection performance not only saves money, but also improves the overall customer experience as well.
Ram Lakshmanan, the founder and architect of yCrash, talked about how to diagnose and solve issues when an app crashes due to a memory leak, thread leak, CPU spike, unresponsiveness, BLOCKED threads, deadlocks, and heavy I/O activity.
In this session, the following topics have been discussed: code snippets that can generate memory leak, thread leak, CPU spike, unresponsiveness, BLOCKED threads, Deadlocks, Heavy I/O activity. If you can understand what triggers these problems, diagnosing and solving them might become easier.
In this session, sample code snippets that can generate a memory leak, thread leak, CPU spike, unresponsiveness, BLOCKED threads, deadlocks, and heavy I/O activity are discussed. If you can understand what triggers these problems, diagnosing and solving them might become easier.
7 habits of highly effective Performance Troubleshooters (Tier1 app)
Troubleshooting production performance problems is a combination of art, science, and discipline. Below is the presentation deck shared at the conference, which explains how to forecast problems, what to do when a problem is happening, how to identify the root cause instantly, and how to prevent problems from happening in the future.
In this session, we will be discussing major outages that happened in major enterprises. We will be analyzing the actual thread dumps, heap dumps, GC logs, and other artifacts captured at the time of the problem. After this session, troubleshooting CPU spikes, OutOfMemoryError, response time degradations, network connectivity issues, application unresponsiveness may not stump you.
The Java Virtual Machine (JVM) is a black box for several of us. In this session, you will learn the fundamentals of JVM just in 1 single slide. You will understand what happens under the hood when your program executes; you will be able to troubleshoot, tune, and optimize the JVM effectively.
Accelerating Incident Response To Production Outages (Tier1 app)
In this webinar, the following topics were discussed:
1) Production outages that happened in major enterprises in their JVM applications.
2) Analyzing the actual thread dumps, heap dumps, GC logs, and other artifacts captured at the time of the problem.
This document discusses 7 important JVM arguments for optimizing Java applications:
1. -Xmx sets the maximum heap size to control memory usage and number of JVM instances.
2. -XX:MaxMetaspaceSize sets the maximum metaspace size for class metadata.
3. -Xss sets the thread stack size which impacts memory efficiency of Java threads.
4. GC algorithm arguments like -XX:+UseParallelGC to select the garbage collection algorithm.
5. GC logging arguments like -Xloggc to analyze garbage collection logs.
6. Network timeout arguments -Dsun.net.client.defaultConnectTimeout and -Dsun.net.client.
This document discusses 7 important JVM arguments for optimizing Java applications:
1. -Xmx and -XX:MaxMetaspaceSize for configuring heap and metaspace sizes
2. -Xss for configuring thread stack sizes
3. GC algorithm arguments like -XX:+UseParallelGC
4. Arguments for enabling GC logging like -Xloggc
5. -XX:+HeapDumpOnOutOfMemoryError for generating heap dumps on OOM errors
6. Network timeout arguments like -Dsun.net.client.defaultReadTimeout
7. -Duser.timezone for setting the application timezone
7. Thread dump analysis report – stack trace
STATE : BLOCKED
java.security.SecureRandom.nextBytes(SecureRandom.java:433)
java.util.UUID.randomUUID(UUID.java:159)
com.buggycompany.jtm.bp.<init>(bp.java:185)
com.buggycompany.jtm.a4.f(a4.java:94)
com.buggycompany.agent.trace.RootTracer.topComponentMethodBbuggycompanyin(RootTracer.java:439)
weblogicx.servlet.gzip.filter.GZIPFilter.doFilter(GZIPFilter.java)
weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:56)
weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.wrapRun(WebAppServletContext.java:3730)
weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3696)
weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
weblogic.security.service.SecurityManager.runAs(SecurityManager.java:120)
weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2273)
weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2179)
weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1490)
weblogic.work.ExecuteThread.execute(ExecuteThread.java:256)
weblogic.work.ExecuteThread.run(ExecuteThread.java:221)
Checking Entropy in Linux
cat /proc/sys/kernel/random/entropy_avail
If < 1000, it’s a problem
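The same check can be done from inside a Java process. The following is a minimal sketch, not part of the original deck: the class name `EntropyCheck` and its methods are hypothetical, and the threshold of 1000 is simply the value quoted on the slide. It is passed as a parameter because a suitable value may differ across kernel versions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: reads the kernel's available-entropy counter on Linux.
class EntropyCheck {
    static final Path ENTROPY_FILE = Path.of("/proc/sys/kernel/random/entropy_avail");

    // The file holds a single integer followed by a newline.
    static int parseAvailable(String fileContents) {
        return Integer.parseInt(fileContents.trim());
    }

    // True when the available entropy is below the given threshold.
    static boolean isLow(int available, int threshold) {
        return available < threshold;
    }

    public static void main(String[] args) throws IOException {
        if (!Files.isReadable(ENTROPY_FILE)) {
            System.out.println("entropy_avail is not readable on this system");
            return;
        }
        int available = parseAvailable(Files.readString(ENTROPY_FILE));
        // 1000 is the threshold from the slide; adjust it for your environment.
        System.out.println("entropy_avail=" + available + ", low=" + isLow(available, 1000));
    }
}
```

Running `main` on a non-Linux host only prints a notice, since the file does not exist there.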
8. Solution
• RHEL
• Upgrade to RHEL 7 or above version
• If < RHEL 7, follow recommendations given here
• Install Haveged Library - Unpredictable Random number generator
• Use /dev/urandom instead of /dev/random
• ‘/dev/random’ serves as a pseudorandom number generator
• ‘/dev/urandom’ is another special file that is capable of generating random numbers. Downside: reduced security due to less randomness
• -Djava.security.egd=file:/dev/urandom
10. System.getProperty()
• The ‘java.lang.System.getProperty()’ API internally uses the ‘java.util.Hashtable.get()’ API, which is a synchronized method:
public synchronized V get(Object key) {
    :
    :
}
• If used in a critical code path, it can significantly affect application performance, because only one thread at a time can execute it
12. Victim Thread Stack trace
http-nio-8080-exec-293
Stack Trace is:
java.lang.Thread.State: BLOCKED (on object monitor)
at java.util.Hashtable.get(Hashtable.java:362)
- waiting to lock <0x0000000080f5e118> (a java.util.Properties)
at java.util.Properties.getProperty(Properties.java:969)
at java.util.Properties.getProperty(Properties.java:988)
at java.lang.System.getProperty(System.java:756)
at net.java.ao.atlassian.ConverterUtils.enforceLength(ConverterUtils.java:16)
at net.java.ao.atlassian.ConverterUtils.checkLength(ConverterUtils.java:9)
:
13. Culprit Thread Stack trace
Camel Thread #6 – backboneThreadPool
Stack Trace is:
at java.util.Hashtable.get(Hashtable.java:362)
- locked <0x0000000080f5e118> (a java.util.Properties)
at java.util.Properties.getProperty(Properties.java:969)
at java.util.Properties.getProperty(Properties.java:988)
at java.lang.System.getProperty(System.java:756)
at net.java.ao.atlassian.ConverterUtils.enforceLength(ConverterUtils.java:16)
at net.java.ao.atlassian.ConverterUtils.checkLength(ConverterUtils.java:9)
:
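The victim/culprit relationship in the two stack traces above can be reproduced in isolation. The sketch below is an illustration under stated assumptions, not the application's code: the class `BlockedThreadDemo`, the thread names, and the plain `Object` lock are all hypothetical stand-ins for the shared `java.util.Properties` monitor. One thread holds the monitor while a second thread waits for it, and the waiting thread's state is sampled.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical demo: one thread holds a monitor, another waits on it.
class BlockedThreadDemo {
    // Returns the state observed for the victim thread while the culprit
    // thread still holds the monitor both of them need.
    static Thread.State observeVictimState() {
        final Object lock = new Object();
        CountDownLatch culpritHasLock = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread culprit = new Thread(() -> {
            synchronized (lock) {
                culpritHasLock.countDown();
                try {
                    release.await(); // keep holding the monitor
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "culprit");

        Thread victim = new Thread(() -> {
            synchronized (lock) {
                // Nothing to do; only the wait to enter matters here.
            }
        }, "victim");

        culprit.start();
        try {
            culpritHasLock.await();
            victim.start();
            // Poll (bounded to about 5 seconds) until the victim is waiting on the monitor.
            long deadline = System.nanoTime() + 5_000_000_000L;
            while (victim.getState() != Thread.State.BLOCKED && System.nanoTime() < deadline) {
                Thread.sleep(1);
            }
            return victim.getState();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while observing", e);
        } finally {
            release.countDown(); // let the culprit leave the synchronized block
        }
    }

    public static void main(String[] args) {
        System.out.println("victim state: " + observeVictimState());
    }
}
```

In a real thread dump the same pattern shows up as one thread with `locked <address>` and others with `waiting to lock <address>` on the same address.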
14. Solution
• Upgrade to JDK 11 or above
The synchronized Hashtable has been replaced with ConcurrentHashMap
• Cache the values:
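// Before: looks up the system property on every call
public static String getAppName() {
    String app = System.getProperty("appName");
    return app;
}

// After: looks up the system property once, when the class is initialized
private static String app = System.getProperty("appName");
public static String getAppName() {
    return app;
}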
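The caching idea can be generalized to arbitrary property names. This is a hedged sketch that extends the slide's single static field: the class `CachedProperties` and its `get` method are hypothetical, and it assumes the properties do not change after they are first read, since later changes will not be seen.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical cache in front of System.getProperty().
class CachedProperties {
    private static final ConcurrentMap<String, String> CACHE = new ConcurrentHashMap<>();

    // The first call for a key reads the system property; later calls are
    // served from the map. Absent properties (null) are not cached, so they
    // are looked up again on every call.
    static String get(String key) {
        return CACHE.computeIfAbsent(key, System::getProperty);
    }

    public static void main(String[] args) {
        System.setProperty("appName", "demo");
        System.out.println(get("appName"));
    }
}
```

The trade-off is staleness: a value updated with `System.setProperty()` after the first read is ignored, which is usually acceptable for start-up configuration but not for properties that are meant to change at runtime.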
16. Interview question
• What is the difference between HashMap and Hashtable?
• But what happens when you do concurrent put() and get() on HashMap?
• How to diagnose a CPU spike?
top -H -p <PROCESS_ID> + Thread dump
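When correlating the two outputs, `top -H` lists thread IDs in decimal, while many JDK versions print the native thread ID in the thread dump as a hexadecimal `nid=0x...` value. The sketch below shows the conversion. The class `ThreadIdConverter` is hypothetical, and the hexadecimal format is an assumption to verify against your own thread dumps, since some newer JDKs may print the ID in decimal.

```java
// Hypothetical helper for matching a `top -H` thread ID to a thread dump.
class ThreadIdConverter {
    // Converts a decimal OS thread ID into the "0x..." form used by the
    // nid field in many thread dumps.
    static String toNid(long osThreadId) {
        return "0x" + Long.toHexString(osThreadId);
    }

    public static void main(String[] args) {
        // Pass the thread ID from the hot row in `top -H -p <PROCESS_ID>`.
        long osThreadId = args.length > 0 ? Long.parseLong(args[0]) : 12345L;
        System.out.println("search the thread dump for nid=" + toNid(osThreadId));
    }
}
```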
27. Real world example – Trading app
public void clear() {
    modCount++;
    // clear to let GC do its work
    for (int i = 0; i < size; i++)
        elementData[i] = null;
    size = 0;
}
30. 1 million threads
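// requires: import java.util.concurrent.TimeUnit;
for (int i = 0; i < 1_000_000; i++) {
    new Thread(new Runnable() {
        @Override
        public void run() {
            try {
                TimeUnit.HOURS.sleep(1);
            } catch (InterruptedException e) {
                // restore the interrupt flag and let the thread exit
                Thread.currentThread().interrupt();
            }
        }
    }).start();
}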
31. Performance Comparison
Thread Type | Thread Count | Memory Size | Thread Analysis | Heap Analysis
Platform Threads | 1599. After that OutOfMemoryError | 1.85 MB | https://tinyurl.com/ntfastthread | https://tinyurl.com/ntheaphero
Virtual Threads | 1 million. No issues | 401 MB | https://tinyurl.com/vtfastthread | https://tinyurl.com/vtheaphero
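For comparison with the platform-thread loop on the previous slide, here is a minimal sketch of the virtual-thread variant. It assumes JDK 21 or later, where `Thread.startVirtualThread` is available as a final API. The class `VirtualThreadDemo`, the method `runSleepers`, and the short sleep durations are hypothetical choices made so the example finishes quickly; they are not the settings used to produce the numbers in the table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo; requires JDK 21+ for virtual threads.
class VirtualThreadDemo {
    // Starts `count` virtual threads that each sleep for `sleepMillis`,
    // waits for all of them, and returns how many ran to completion.
    static int runSleepers(int count, long sleepMillis) {
        AtomicInteger completed = new AtomicInteger();
        List<Thread> threads = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            threads.add(Thread.startVirtualThread(() -> {
                try {
                    Thread.sleep(sleepMillis);
                    completed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("completed: " + runSleepers(10_000, 100));
    }
}
```

Raising `count` toward one million is possible with this structure, but memory use then depends on the heap settings, so treat the table's figures as the deck's own measurements rather than something this sketch reproduces.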
39. Real world – Long GC Pause in Top Cloud Provider
https://blog.gceasy.io/2022/03/04/garbage-collection-tuning-success-story-reducing-young-gen-size/
40. What is GC throughput?
How does 96% GC Throughput sound?
1 day = 1440 minutes (i.e., 24 hours x 60 minutes)
96% GC Throughput means the app is pausing for 57.6 minutes/day (4% of 1440 minutes)
GC throughput compares the amount of time the application spends processing customer transactions vs. the amount of time it spends processing garbage collection activity
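The arithmetic on this slide can be written as a small helper. This is a sketch under assumptions: the class `GcThroughput` and its method names are hypothetical, and throughput is taken to be the share of total time not spent in GC pauses, which is one common reading of the definition given on the slide.

```java
// Hypothetical helper for the GC throughput arithmetic.
class GcThroughput {
    static final double MINUTES_PER_DAY = 24 * 60; // 1440

    // Minutes per day spent paused in GC for a given throughput percentage.
    static double pauseMinutesPerDay(double throughputPercent) {
        return MINUTES_PER_DAY * (100.0 - throughputPercent) / 100.0;
    }

    // Throughput as the percentage of total time spent on application work,
    // given application time and GC pause time in the same unit.
    static double throughputPercent(double appTime, double gcPauseTime) {
        return 100.0 * appTime / (appTime + gcPauseTime);
    }

    public static void main(String[] args) {
        System.out.println(pauseMinutesPerDay(96.0) + " minutes/day paused at 96% throughput");
    }
}
```

The same helper shows why small percentage gains matter: at 99% throughput the daily pause time drops to 14.4 minutes.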
44. How to tune GC Performance?
Free Video: https://www.youtube.com/watch?v=6G0E4O5yxks
Online Training: https://ycrash.io/java-performance-training
45. Thank you friends!
Ram Lakshmanan
ram@tier1app.com
@tier1app
linkedin.com/company/ycrash
This deck will be published in: https://blog.fastthread.io