An in-depth virtual session on Einstein Analytics for the Belgian Salesforce Administrators user group: what Einstein Analytics is, how it compares to standard Salesforce Reports & Dashboards, an in-depth demo, and how to quickstart your knowledge of EA.
BIWUG1303 - SharePoint BI on top of Project Server data (BIWUG)
This document discusses using Project Server and SharePoint for business intelligence reporting. It covers:
- Project Server as a data source for reporting on project management data
- Different reporting tools in SharePoint including quick views, Excel Services, PowerPivot, Reporting Services, and PerformancePoint
- The importance of choosing the right tool based on the audience and job
- Involving end users in report development
- SharePoint's capabilities for project reporting
The document lists the various courses that Ricardo A. VanEgas has successfully completed related to data analysis, data science, business intelligence, programming, and project management. Some of the key courses completed include Data Analysis with Excel, Introduction to Data Science, Querying with Transact-SQL, and Data Visualizations with Power BI in Excel 2013. The dates of achievement for each course are also provided.
Sigma Infosolutions developed a web-based reporting solution using Pentaho for a global energy company to monitor the performance of over 120 MW of battery storage systems across multiple projects and locations. The automated reporting engine leverages Pentaho tools to provide customized reports and dashboards for analytical reporting and visualization. It extracts data from the company's database, as well as JIRA and Google APIs, to generate operational reports, KPIs, analytical reports, and OLAP cubes for strategic decision making.
The Alteryx Designer delivers an intuitive workflow for data blending and advanced analytics that leads to deeper insights in hours, not the weeks typical of traditional approaches. It empowers data analysts by combining data blending, predictive analytics, spatial analytics, reporting, visualization, and analytic apps into one workflow.
Google Cloud Data Platform - Why Google for Data Analysis? (Andreas Raible)
An introduction to our Data Platform, covering capture, processing, analysis, and exploration.
The Google Cloud Platform products are based on our internal systems that power Google AdWords, Search, YouTube, and our leading research in the field of real-time data analysis.
You can get access to our free trial ($300 for 60 days) through google.com/cloud.
The document defines a data warehouse as a subject-oriented, integrated, and time-variant collection of data used to support management decision making. It notes that while databases are transaction-oriented and store online data, data warehouses are designed to analyze historical data from across an entire organization. Data warehouses are used for reporting, analysis, and decision making, and may include departmental data marts that contain only relevant information. The document also references the architecture of a data warehouse but does not provide details.
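The contrast the summary draws — transaction-oriented databases versus subject-oriented, time-variant warehouses — can be made concrete with a minimal star-schema sketch. This is an illustrative example (the table and column names are invented, and SQLite stands in for a real warehouse engine), showing a fact table joined to a date dimension and aggregated across history:

```python
import sqlite3

# Illustrative star schema: a fact table joined to a date dimension, queried
# for historical, subject-oriented totals (the data-warehouse pattern), in
# contrast to the row-at-a-time lookups typical of transactional databases.
# Table names and data are made up; SQLite stands in for a warehouse engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    CREATE TABLE fact_sales (date_key INTEGER, region TEXT, amount REAL);
    INSERT INTO dim_date VALUES (20230115, 2023, 1), (20230220, 2023, 2), (20240310, 2024, 3);
    INSERT INTO fact_sales VALUES (20230115, 'EU', 100.0), (20230220, 'EU', 250.0),
                                  (20240310, 'US', 400.0);
""")

# Time-variant analysis: aggregate over the entire history, grouped by year.
rows = conn.execute("""
    SELECT d.year, SUM(f.amount)
    FROM fact_sales f JOIN dim_date d USING (date_key)
    GROUP BY d.year ORDER BY d.year
""").fetchall()
print(rows)  # [(2023, 350.0), (2024, 400.0)]
```

A departmental data mart would follow the same shape, restricted to the facts and dimensions relevant to one audience.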
This document discusses how to synchronize data between applications using Oracle's data synchronization tools. It describes creating mapping tables between source and destination dimensions, using a wizard to set up synchronizations by specifying the source and destination applications, mapping source dimensions to destination dimensions, applying filters, validating synchronizations, executing synchronizations, and viewing data flows between applications. The goal is to share and consolidate data across different applications.
Notebooks @ Netflix: From analytics to engineering with Jupyter notebooks (Michelle Ufford)
Slides from JupyterCon 2018 in NYC on 8/23/2018.
Notebooks have moved beyond a niche solution at Netflix; they are now the critical path for how everyone runs jobs against the company’s data platform. From creating original content to delivering bufferless streaming, Netflix relies on notebooks to inform decisions and fuel experiments across the company. Netflix also uses notebooks to power its machine learning infrastructure and run over 150,000 jobs against its 100 PB cloud-based data warehouse every day. The goal is to deliver a compelling notebooks experience that simplifies end-to-end workflows for every type of user. To enable this, Netflix is investing deeply in notebook infrastructure and open source projects such as nteract.
In this talk, Michelle Ufford and Kyle Kelley share interesting ways Netflix uses data and some of the big bets the company is making on notebooks. Topics will include architecture, kernels, UIs, and Netflix’s open source collaborations with projects such as Jupyter, nteract, pandas, and Spark.
This document discusses Google BigQuery, a tool for analyzing large datasets that is fast, easy to use, and cost effective. It provides SQL-like queries against nested and columnar data stored in Google's infrastructure. Developers can access BigQuery through Google Cloud Storage, a REST API, or command line tools. BigQuery handles the infrastructure maintenance and offers on-demand or reserved pricing models.
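Since the summary mentions access via a REST API, here is a sketch of what such a request looks like. The project id, dataset, and table name are hypothetical placeholders, and the request is only constructed, not sent; the endpoint path and the `useLegacySql` flag follow BigQuery's public REST conventions:

```python
import json

# Sketch of the request body for BigQuery's jobs.query REST endpoint.
# "my-project" and the table name are hypothetical placeholders; the
# request is only constructed here, never sent.
project_id = "my-project"  # hypothetical
endpoint = f"https://bigquery.googleapis.com/bigquery/v2/projects/{project_id}/queries"

payload = {
    "query": """
        SELECT repo.name, COUNT(*) AS events
        FROM `my-project.my_dataset.github_events`
        GROUP BY repo.name
        ORDER BY events DESC
        LIMIT 10
    """,
    # Standard SQL supports dotted access into nested (record) fields.
    "useLegacySql": False,
}
body = json.dumps(payload)
print(endpoint)
```

In practice most developers use the official client libraries or the `bq` command-line tool rather than raw HTTP, but the query payload has this shape either way.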
Smart View is a Microsoft Excel add-in that allows users to connect to and analyze financial data from within Excel. It provides tools to connect to data sources, view and manipulate data using familiar Excel functions and formulas, and submit data back to the source system. Key capabilities include ad hoc analysis of retrieved data, pivoting and drilling into dimension hierarchies, and creating functions to exchange data between Excel and the source application.
Big Query - Utilizing Google Data Warehouse for Media Analytics (hafeeznazri)
This topic covers an intermediate understanding of Google BigQuery and how Media Prima Digital utilizes BigQuery as a data warehouse in production.
Tor Hovland: Taking a swim in the big data lake (AnalyticsConf)
Are you curious about the possibilities enabled by Microsoft Azure and Cortana Analytics? Come and see how to handle data input from a large number of “Internet of Things” devices, how to work with all the data, how to scale big computations, how to make predictions, and how to build applications on top of it. There will be demos!
The document discusses the idea behind Apache Hivemall, which is an open-source machine learning library that allows running machine learning on large datasets stored in data warehouses. It addresses concerns about scalability, data movement, and tools when performing machine learning on big data. It suggests pushing more machine learning logic, like data preprocessing, back to the database where the data resides for better performance and stability. Hivemall provides machine learning functions that can be used within SQL queries on Hadoop systems like Hive and Spark SQL, enabling parallel and distributed machine learning.
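The core idea — exposing ML as SQL functions so the work runs where the data lives — can be mimicked in miniature. This is not Hivemall's actual API (Hivemall registers UDFs in HiveQL on Hadoop); it is a toy sketch of the same pattern using a SQLite user-defined function, with made-up model weights:

```python
import math
import sqlite3

# Hivemall's pattern: expose ML as SQL functions (UDFs) so scoring runs inside
# the query engine, next to the data. This sketch mimics the idea in SQLite.
# The weights are made-up illustration values; this is not Hivemall's API.
WEIGHTS = {"x1": 1.2, "x2": -0.7}
BIAS = 0.1

def logistic_score(x1, x2):
    """Logistic-regression score for one row."""
    z = BIAS + WEIGHTS["x1"] * x1 + WEIGHTS["x2"] * x2
    return 1.0 / (1.0 + math.exp(-z))

conn = sqlite3.connect(":memory:")
conn.create_function("predict", 2, logistic_score)  # register the UDF
conn.execute("CREATE TABLE samples (x1 REAL, x2 REAL)")
conn.executemany("INSERT INTO samples VALUES (?, ?)", [(1.0, 0.5), (0.0, 2.0)])

# Scoring happens inside the SQL query, not in a separate export step.
scores = [round(s, 4) for (s,) in conn.execute("SELECT predict(x1, x2) FROM samples")]
print(scores)
```

The payoff Hivemall targets is the same as in this toy: no bulk data movement out of the warehouse, and the engine's own parallelism applies to the scoring.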
Implementing google big query automation using google analytics dataCountants
The increasing value of big data analytics for business presents a lot of use cases for BigQuery technology. Through Google Analytics to BigQuery automation, data analysts can save time as well as extract deeper insights from the latest Google Analytics data.
EPSi was founded in 1999 with the goal of re-engineering decision support. Over the past 20 years we have grown our client base to more than 950 individual healthcare institutions across the United States and internationally. Our customers rely on our platform for the insights they need, including 15 of the top 20 U.S. News & World Report Best Hospitals, and our experience in delivering world-class financial decision support and integrated performance management applications is unparalleled.
EPSi’s platform has always been highly scalable. We have installations in regional medical centers and community hospitals, as well as large health systems. This includes 128 Integrated Health System customers, among them Catholic Health Initiatives, a flagship EPSi customer and one of the nation's largest healthcare systems, comprising 108 acute care hospitals and over 11,000 beds. EPSi’s base also includes 40 prestigious academic medical centers and 14 stand-alone pediatric hospitals using our software applications.
In addition to the above, EPSi continues our rich history of innovation with the launch of our new RealCost™ platform. Our vision for RealCost™ was catalyzed by the rapidly changing healthcare landscape, with new reimbursement models, increasing cost of care, and organizational consolidation, while available systems today do not offer the level of integration, precision, and timeliness required for effective decision making.
RealCost™ changes all of this with a high performing, cloud-based solution encompassing specific application modules along with intuitive workflow and embedded analytics. One of the goals of RealCost™ is to predict upcoming cost variances so that you can avoid costs before they happen – thereby empowering healthcare organizations to move away from the traditional practice of month-end retrospective analysis and inefficient backend processes – in favor of real-time actions – to help drive more timely and effective decision making for streamlining operations and improving financial outcomes.
RealCost will supplement existing decision support applications – and in some cases replace specific decision support system functions entirely – like Cost Accounting, the first RealCost™ module released in 2018. Additional modules scheduled for release in 2019 include Rolling Forecast and Real Time Cost.
This document discusses Enterprise Data Analytics (EDA), an open source business analytics tool. EDA allows users to build reports and dashboards without code or SQL. It features an integrated data model, maps, alerts, Sankey charts, cross tables, master-detail reports, and more. EDA is available both for self-hosting and as a hosted service called EDA Server. It supports multiple languages and has an active user forum for feedback and questions.
ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed... (Databricks)
In this talk, we will present how we used Spark, Databricks, Airflow and MLflow to process big data, and build a pipeline of both ML (XGBoost) and statistical models that maximizes our revenues in one of our core products, called the “Offer Wall”. The “Offer Wall” is a mobile phone product that is integrated with existing apps, suggesting users perform tasks in exchange for in-app currency. The problem gets even more interesting when considering the fact that some of the tasks users do take 15 minutes and some may take up to two weeks, forcing us to make revenue-determining decisions in an uncertain space all of the time. The solution we developed utilizes Databricks and Spark’s strengths and diversity in machine learning, big data, MLflow and Airflow integrations, allowing us to deliver a production-grade solution with short development time between experiments.
BDT has moved from a SAS-based workflow to a cloud-based workflow leveraging tools like BigQuery, Looker, and Apache Airflow. Originally presented at the 2018 Pennsylvania Data Users Conference: https://pasdcconference.org/
Using open source BI. Practical experience 2012 - En (Serge Ivanko)
The document discusses the implementation of an open source business intelligence (BI) solution at a digital e-commerce company. Previously, the company used various homegrown systems and Google Analytics, requiring manual data merging. The goals of the new solution were to effectively manage key performance indicators (KPIs) and market responses. Pentaho was selected as it provided OLAP, pivots, and integration with the company's MySQL database. The implementation took around 6 months, resulting in a single source of data truth, timely insights from regular KPI monitoring, and new opportunities from business analysis and data mining.
This document provides an agenda and notes for a DB2 Update Day event held in March 2015. The agenda includes sessions on DB2 use cases, archives, native stored procedures, and updates for developers. Additional notes provide status updates from various Nordic locations attending the event with a total of 403 attendees. The document also includes sections on IBM zSystems and enabling mobile analytics and reporting on zSystems using tools like QMF and Cognos.
IBM offers a business analytics portfolio including Cognos BI, Cognos TM1, and SPSS to provide capabilities across reporting, planning, analysis, and predictive analytics. The suite integrates various data sources and allows different levels of analytics from basic reporting and dashboards to more advanced predictive modeling and optimization. It provides a unified workspace and promotes collaboration.
The document discusses reporting capabilities with SAP cloud solutions. It describes SAP's analytics concept and the types of reporting objects like data sources, reports, key figures, and characteristics. It outlines the integrated analytics capabilities for editing existing reports, designing new reports and data sources, and integrating reports with Microsoft Excel. It also discusses interfaces for connecting external analytics systems and retrieving analytics data using OData web services.
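Retrieving analytics data over OData, as the summary describes, amounts to building a URL with the standard `$`-prefixed query options. In this sketch the service root and report collection name are hypothetical placeholders; only the `$select`/`$filter`/`$top`/`$format` options follow real OData conventions:

```python
from urllib.parse import urlencode

# Sketch of an OData query URL for pulling report data, as SAP cloud
# solutions expose analytics via OData web services. The service root and
# report/characteristic names are hypothetical; the $-options are standard OData.
service_root = "https://example-tenant.sap-cloud.invalid/odata/analytics.svc"  # hypothetical
report = "InvoiceVolumeReport"  # hypothetical collection name

params = {
    "$select": "FiscalYear,InvoicedAmount",     # characteristics and key figures
    "$filter": "FiscalYear eq '2023'",
    "$format": "json",
    "$top": "100",
}
url = f"{service_root}/{report}?{urlencode(params)}"
print(url)
```

An external analytics system would issue a GET against such a URL (with authentication) and receive the report rows as JSON or Atom XML.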
An Analytics Engineer’s Guide to Streaming With Amy Chen | Current 2022 (Hosted by Confluent)
What happens to the modern data stack (MDS) and analytics as a whole when streaming becomes accessible? For years, the MDS has been centered around batch-based workflows with dbt at its core, introducing software engineering best practices to analysts. But now with even major data warehouses like Snowflake getting in the game, expanding their streaming capabilities, what does that mean?
In this talk, we will explore what streaming in a batch-based analytics world should look like. How does that change your thoughts about implementing testing and performance optimization in your data pipelines? Do you still need dbt? And the question that we are all asking: do you really need a real-time dashboard?
This presentation covers the support that the WSO2 middleware platform provides for Big Data Analytics. It explains how WSO2 makes data-driven intelligence for your enterprise easy, covering both real-time (complex event processing based) and batch mode (business activity monitoring based) options for Big Data Analytics.
P6 Analytics provides business intelligence and analytical reporting capabilities for project management data stored in Oracle Primavera P6. It allows users to create customizable dashboards to analyze project trends over time and support decision making. The document discusses the sample dashboards included in P6 Analytics and how users can build their own dashboards by combining data from the system's 13 subject areas into analyses, views, and prompts. It emphasizes taking a business perspective to identify key questions and building reusable reporting components to produce meaningful results.
Data Architecture at Vente-Exclusive.com - TOTM Exellys (Wout Scheepers)
Vente-Exclusive is a leading e-commerce company in the Benelux region with over 6 million members and annual turnover of €126M in 2016. As the company looks to scale its business geographically and add new channels, its monolithic architecture needs to be modernized. The company transitioned to a microservices architecture using containers, Kubernetes, and Google Cloud Platform. This allows independent scaling of services, continuous deployment, and a focus on application development over infrastructure. Business intelligence was also improved through use of BigQuery, Tableau, and event-driven data collection across services.
The document summarizes new features in the Jedox 7 planning and performance management software. It highlights pre-built models for profit and loss, cost centers, HR planning, and sales planning that provide out-of-the-box functionality. It also describes the new Jedox marketplace for additional business content and models from Jedox and partners. Key benefits of Jedox 7 include simplified planning, intuitive visualization and reporting, enterprise performance, and support for secure cloud, hybrid and on-premises deployment.
Dynamics Day 2014: Microsoft Dynamics NAV - Business Insight (Reporting and A... (Intergen)
Reporting from simple, do-it-yourself through to advanced analytics.
Dynamics Day is Australasia's leading event for users of Microsoft Dynamics. For those of you who couldn't make it along to the event, we have made all session content available online.
This document discusses DevOps and MLOps practices for machine learning models. It outlines that while ML development shares some similarities with traditional software development, such as using version control and CI/CD pipelines, there are also key differences related to data, tools, and people. Specifically, ML requires additional focus on exploratory data analysis, feature engineering, and specialized infrastructure for training and deploying models. The document provides an overview of how one company structures its ML team and processes.
1. Spil Games uses a bottom-up monthly forecasting process where ARIMA models in R generate initial traffic forecasts for 500 markets/channels, which are then loaded into Tableau for exploratory analysis and adjustment.
2. Key business users explore and modify the forecasts in Tableau before the adjusted forecast is loaded back into the data warehouse.
3. Forecasting considers factors like seasonality, known events, and regressors to predict metrics like traffic, gameplays, pageviews, and advertising across markets/channels on a monthly basis.
This document discusses next generation big data business intelligence (BI). It describes traditional BI and how it is evolving to incorporate big data. Key points:
- Traditional BI includes dashboards, KPIs, OLAP, reporting, and forecasting to provide insights from structured data.
- Next generation BI leverages big data technologies like Hadoop and NoSQL databases to handle larger and more diverse unstructured data in batch and real-time.
- This enables deeper insights through analytics across all data, from basic queries to advanced predictive modeling and streaming analysis.
- The modern BI stack incorporates big data technologies alongside traditional data warehousing and OLAP for integrated insights.
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake, by Walaa Eldin Moustafa
Dynamic policy enforcement is becoming an increasingly important topic in today’s world where data privacy and compliance is a top priority for companies, individuals, and regulators alike. In these slides, we discuss how LinkedIn implements a powerful dynamic policy enforcement engine, called ViewShift, and integrates it within its data lake. We show the query engine architecture and how catalog implementations can automatically route table resolutions to compliance-enforcing SQL views. Such views have a set of very interesting properties: (1) They are auto-generated from declarative data annotations. (2) They respect user-level consent and preferences (3) They are context-aware, encoding a different set of transformations for different use cases (4) They are portable; while the SQL logic is only implemented in one SQL dialect, it is accessible in all engines.
#SQL #Views #Privacy #Compliance #DataLake
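To make the idea of auto-generated, compliance-enforcing views concrete, here is a minimal sketch (not LinkedIn's actual ViewShift code): declarative per-column annotations drive the generation of a SQL view, so queries against the view only ever see masked data. SQLite stands in for the data lake's query engine.

```python
import sqlite3

# Illustrative sketch: generate a compliance-enforcing SQL view from
# declarative column annotations. Policy names and table contents are
# hypothetical.
ANNOTATIONS = {"email": "mask", "name": "pass", "age": "null"}

def build_compliance_view(table, columns):
    exprs = []
    for col in columns:
        policy = ANNOTATIONS.get(col, "pass")
        if policy == "mask":
            exprs.append(f"'***' AS {col}")      # redact the value
        elif policy == "null":
            exprs.append(f"NULL AS {col}")       # suppress the value entirely
        else:
            exprs.append(col)                    # pass through unchanged
    return f"CREATE VIEW {table}_compliant AS SELECT {', '.join(exprs)} FROM {table}"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (name TEXT, email TEXT, age INT)")
conn.execute("INSERT INTO members VALUES ('Ada', 'ada@example.com', 36)")
conn.execute(build_compliance_view("members", ["name", "email", "age"]))
row = conn.execute("SELECT name, email, age FROM members_compliant").fetchone()
# row == ('Ada', '***', None): raw email and age never reach the reader
```

The properties listed above (auto-generation from annotations, context-awareness, portability across engines) would layer on top of this basic view-rewriting mechanism.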
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag..., by sameer shah
"Join us for STATATHON, a dynamic 2-day event dedicated to exploring statistical knowledge and its real-world applications. From theory to practice, participants engage in intensive learning sessions, workshops, and challenges, fostering a deeper understanding of statistical methodologies and their significance in various fields."
Build applications with generative AI on Google Cloud, by Márton Kodok
We will explore Vertex AI Model Garden powered experiences and the integration of these generative AI APIs, and see in action what the Gemini family of generative models offers developers for building and deploying AI-driven applications. Vertex AI includes a suite of foundation models, referred to as the PaLM and Gemini families of generative AI models, which come in different versions. We will cover how to use the API to:
- execute prompts in text and chat
- handle multimodal use cases with image prompts
- fine-tune and distill models to improve knowledge domains
- run function calls with foundation models to optimize them for specific tasks
At the end of the session, developers will understand how to innovate with generative AI and build apps that follow current generative AI industry trends.
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W..., by Social Samosa
The Modern Marketing Reckoner (MMR) is a comprehensive resource packed with POVs from 60+ industry leaders on how AI is transforming the 4 key pillars of marketing – product, place, price and promotions.
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai..., by Kaxil Naik
Navigating today's data landscape isn't just about managing workflows; it's about strategically propelling your business forward. Apache Airflow has stood out as the benchmark in this arena, driving data orchestration forward since its early days. As we dive into the complexities of our current data-rich environment, where the sheer volume of information and its timely, accurate processing are crucial for AI and ML applications, the role of Airflow has never been more critical.
In my journey as the Senior Engineering Director and a pivotal member of Apache Airflow's Project Management Committee (PMC), I've witnessed Airflow transform data handling, making agility and insight the norm in an ever-evolving digital space. At Astronomer, our collaboration with leading AI & ML teams worldwide has not only tested but also proven Airflow's mettle in delivering data reliably and efficiently—data that now powers not just insights but core business functions.
This session is a deep dive into the essence of Airflow's success. We'll trace its evolution from a budding project to the backbone of data orchestration it is today, constantly adapting to meet the next wave of data challenges, including those brought on by Generative AI. It's this forward-thinking adaptability that keeps Airflow at the forefront of innovation, ready for whatever comes next.
The ever-growing demands of AI and ML applications have ushered in an era where sophisticated data management isn't a luxury—it's a necessity. Airflow's innate flexibility and scalability are what makes it indispensable in managing the intricate workflows of today, especially those involving Large Language Models (LLMs).
This talk isn't just a rundown of Airflow's features; it's about harnessing these capabilities to turn your data workflows into a strategic asset. Together, we'll explore how Airflow remains at the cutting edge of data orchestration, ensuring your organization is not just keeping pace but setting the pace in a data-driven future.
Session in https://budapestdata.hu/2024/04/kaxil-naik-astronomer-io/ | https://dataml24.sessionize.com/session/667627
Learn SQL from basic queries to advanced queries, by manishkhaire30
Dive into the world of data analysis with our comprehensive guide on mastering SQL! This presentation offers a practical approach to learning SQL, focusing on real-world applications and hands-on practice. Whether you're a beginner or looking to sharpen your skills, this guide provides the tools you need to extract, analyze, and interpret data effectively.
Key Highlights:
Foundations of SQL: Understand the basics of SQL, including data retrieval, filtering, and aggregation.
Advanced Queries: Learn to craft complex queries to uncover deep insights from your data.
Data Trends and Patterns: Discover how to identify and interpret trends and patterns in your datasets.
Practical Examples: Follow step-by-step examples to apply SQL techniques in real-world scenarios.
Actionable Insights: Gain the skills to derive actionable insights that drive informed decision-making.
Join us on this journey to enhance your data analysis capabilities and unlock the full potential of SQL. Perfect for data enthusiasts, analysts, and anyone eager to harness the power of data!
#DataAnalysis #SQL #LearningSQL #DataInsights #DataScience #Analytics
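The progression described above, from basic retrieval through filtering to aggregation, can be sketched with a tiny self-contained example. Python's built-in sqlite3 module is used here purely for convenience; the SQL itself is standard and the table is made up.

```python
import sqlite3

# Minimal illustration of the SQL basics listed above: retrieval,
# filtering, and aggregation against a small in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("EU", 100.0), ("EU", 250.0), ("US", 400.0)])

# Retrieval with filtering: only EU rows.
eu_rows = conn.execute("SELECT amount FROM sales WHERE region = 'EU'").fetchall()

# Aggregation: total per region, largest first.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY 2 DESC"
).fetchall()
# eu_rows == [(100.0,), (250.0,)]
# totals == [('US', 400.0), ('EU', 350.0)]
```

Advanced queries build on exactly these pieces: joins combine tables before the WHERE/GROUP BY stages, and window functions extend aggregation to running totals and rankings.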
5. Open Business Analytics
Featured News
EDA
New connectors to "big data" databases:
● Google BigQuery
● Snowflake
In addition to the existing ones:
● Vertica
● PostgreSQL
● MySQL / MariaDB
● SQL Server
● Oracle
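On the client side, "adding a connector" usually boils down to knowing how to address each engine through a connection URL. The templates below follow common DSN conventions (as used by SQLAlchemy-style tooling); they are an illustration of the idea, not EDA's actual connector configuration format.

```python
# Hypothetical sketch: connection-URL templates for the engines listed
# above, following common DSN conventions. Field names are illustrative.
URL_TEMPLATES = {
    "postgresql": "postgresql://{user}:{password}@{host}:{port}/{database}",
    "mysql": "mysql://{user}:{password}@{host}:{port}/{database}",
    "snowflake": "snowflake://{user}:{password}@{account}/{database}",
    "bigquery": "bigquery://{project}/{dataset}",
}

def connection_url(engine, **params):
    """Fill in the template for the chosen engine."""
    return URL_TEMPLATES[engine].format(**params)

url = connection_url("postgresql", user="eda", password="secret",
                     host="localhost", port=5432, database="analytics")
# url == "postgresql://eda:secret@localhost:5432/analytics"
```

Note that BigQuery differs from the classic engines: it authenticates via a service account rather than a user/password pair, which is why its URL shape carries a project and dataset instead.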
6. New chart types:
● Funnel
● Speedometer
In addition to:
● Tables
● Cross tables
● KPI
● Pie
● Polar
● Bars
● Stacked bars
● Horizontal bars
● Lines
● Mixed bars and lines
● Parallel series
● Treemap
● Scatterplot
● Coordinate map
● Area map
7. Smart Cache
○ No cache ⇒ real time
○ With cache:
■ Choose when the cache is refreshed
■ Choose what to cache
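The caching behaviour described on this slide can be sketched as follows. This is an assumed implementation for illustration, not EDA's actual code: each query either bypasses the cache entirely (real time) or is served from memory until its chosen refresh interval elapses.

```python
import time

# Minimal sketch of the Smart Cache idea: per-query opt-in caching with a
# configurable refresh interval. Names are illustrative.
class SmartCache:
    def __init__(self):
        self._store = {}  # query -> (result, fetched_at)

    def run(self, query, fetch, use_cache=True, ttl_seconds=3600):
        """fetch is a callable that hits the real database."""
        if not use_cache:
            return fetch()                    # "no cache => real time"
        entry = self._store.get(query)
        if entry and time.time() - entry[1] < ttl_seconds:
            return entry[0]                   # fresh enough: serve cached result
        result = fetch()                      # stale or missing: refresh now
        self._store[query] = (result, time.time())
        return result

cache = SmartCache()
calls = []
def fetch():
    calls.append(1)   # count how often the database is actually hit
    return 42

first = cache.run("SELECT 1", fetch)    # hits the source
second = cache.run("SELECT 1", fetch)   # served from cache; fetch not called again
```

"Choose the moment of refreshment" would map to scheduling the refresh (e.g. nightly) rather than a rolling TTL, but the cache-or-fetch decision is the same.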
8. Email alerts:
KPIs can be configured as alerts that are checked on a schedule; when a KPI is triggered, an alert email is sent.
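The alert mechanism can be sketched like this (an assumed implementation, not EDA's code): a scheduled check compares the KPI's current value against its threshold and builds an alert email when it is breached. Actual delivery (e.g. via smtplib) is omitted; addresses and KPI names are made up.

```python
from email.message import EmailMessage

# Sketch of a scheduled KPI alert check: returns an alert email when the
# observed value falls below its threshold, otherwise None.
def check_kpi(name, value, threshold):
    if value >= threshold:
        return None  # KPI healthy: no alert
    msg = EmailMessage()
    msg["Subject"] = f"KPI alert: {name}"
    msg["To"] = "analytics-team@example.com"
    msg.set_content(f"{name} is at {value}, below the threshold of {threshold}.")
    return msg

alert = check_kpi("daily_revenue", 8000, 10000)
# alert["Subject"] == "KPI alert: daily_revenue"; a healthy KPI returns None
```

In a deployed setup the same function would run from a scheduler (cron, or the BI server's own timer), with the returned message handed to an SMTP client.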
10. How to have it
EDA is Open Source:
● GitHub:
https://github.com/jortilles/EDA
● Docker:
docker run -p 80:80 jortilles/eda:latest
EDAServer: