Midway in our life's journey, I went astray from the straight imperative road and woke to find myself alone in a dark declarative wood.
My guide out of this dark declarative wood was a familiar friend, SQL, who showed me how to wrap my data in the context of a window and push it through with Window Functions, escaping the Inferno.
Next I found myself somewhere in between, running uphill with my friend LINQ, one foot always in front of the other so that the leading foot was never on the ground; by wrapping the context of a collection around my data, I advanced my journey through Purgatorio.
My last guide, into the blinding brilliant light of Paradiso, came from the Dutch Caribbean and taught me how to wrap my computations into a context and move my data through it, leading me into brilliant bliss.
Join me on my divine data comedy.
Overview of a few ways to group and summarize data in R using sample airfare data from DOT/BTS's O&D Survey.
Starts with naive approach with subset() & loops, shows base R's tapply() & aggregate(), highlights doBy and plyr packages.
Presented at the March 2011 meeting of the Greater Boston useR Group.
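Since this collection also includes a "Python for R developers" talk, the split-apply-combine idea behind tapply() and aggregate() can be sketched in Python for comparison; the carrier/fare records below are invented for illustration, not taken from the O&D Survey sample.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical airfare records: (carrier, fare) pairs standing in for
# the DOT/BTS O&D Survey sample used in the talk.
fares = [
    ("AA", 250.0), ("AA", 310.0),
    ("B6", 180.0), ("B6", 200.0), ("B6", 220.0),
]

# Split: group fares by carrier.
groups = defaultdict(list)
for carrier, fare in fares:
    groups[carrier].append(fare)

# Apply/combine: summarize each group, like tapply(fare, carrier, mean) in R.
avg_fare = {carrier: mean(vals) for carrier, vals in groups.items()}
print(avg_fare)  # {'AA': 280.0, 'B6': 200.0}
```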
Python for R developers and data scientists, by Lambda Tree
This is an introductory talk aimed at data scientists who are well versed with R but would like to work with Python as well. I will cover common workflows in R and how they translate into Python. No Python experience necessary.
Cascading provides a simpler way to write MapReduce programs through data flows. It uses a pipe and tap metaphor where data flows through pipes and is read from or written to taps. This allows assembling MapReduce jobs as data flow graphs in a more logical way compared to the traditional MapReduce API.
This document provides an overview of the R programming language. It describes R as a functional programming language for statistical computing and graphics that is open source and has over 6000 packages. Key features of R discussed include matrix calculation, data visualization, statistical analysis, machine learning, and data manipulation. The document also covers using R Studio as an IDE, reading and writing different data types, programming features like flow control and functions, and examples of correlation, regression, and plotting in R.
The document defines a function called covcor() that calculates and returns the covariance and correlation between variables in a data frame. The function takes a data frame as input, splits it by a grouping variable, applies covariance and correlation calculations to subsets of the data, and combines the results into an output data frame. Three methods for defining the covcor() function are presented: 1) Using subset() and merge(), 2) Using tapply(), and 3) Using ddply() from the plyr package. The function is demonstrated on orange tree data to calculate covariance and correlation between tree age and circumference for each tree. Transforming the circumference variable affects the covariance but not the correlation, demonstrating properties of these statistical measures.
- R is a free software environment for statistical computing and graphics. It has an active user community and supports graphical capabilities.
- R can import and export data, perform data manipulation and summaries. It provides various plotting functions and control structures to control program flow.
- Debugging tools in R include traceback, debug, browser and trace which help identify and fix issues in functions.
This tutorial is designed for HDF5 users with some HDF5 experience. It will cover advanced features of the HDF5 library for achieving better I/O performance and efficient storage. The following HDF5 features will be discussed: partial I/O, compression and other filters (including the new n-bit and scale+offset filters), and data storage options. Significant time will be devoted to complex HDF5 datatypes such as strings, variable-length, array, and compound datatypes. Participants will work through the tutorial examples and exercises during the hands-on sessions.
The document provides an overview of using the R programming language for data munging, modeling, visualization and simulation. It discusses R's capabilities for data manipulation, statistical modeling, visualization and as a programming language. Specific functions and code examples are provided for basic math operations, probability distributions, data structures and simulation.
This document discusses various functions in R for exporting data, including print(), cat(), paste(), paste0(), sprintf(), writeLines(), write(), write.table(), write.csv(), and sink(). It provides descriptions, syntax, examples, and help documentation for each function. The functions can be used to output data to the console, files, or save R objects. write.table() and write.csv() convert data to a data frame or matrix before writing to a text file or CSV. sink() diverts R output to a file instead of the console.
Learn to manipulate strings in R using the built-in R functions. This tutorial is part of the Working With Data module of the R Programming Course offered by r-squared.
This document summarizes a presentation about the North Concepts Data Pipeline framework. It describes how Data Pipeline allows converting, transforming, and transferring data through a Java library. It uses a pipes and filters architecture with readers, writers, and operations to process data records. Common data formats and transformations are supported out of the box, and the framework can be customized through subclassing components or implementing custom readers, writers, and transformers.
This document provides information about functions in Apache Hive, including a cheat sheet covering user defined functions (UDFs) and built-in functions. It describes how to create UDFs, UDAFs, and UDTFs in Hive along with examples. The document also lists many common mathematical, string, date and other function types available in Hive with descriptions.
The document describes time series analysis techniques in R including decomposition, forecasting, clustering, and classification. It provides examples of decomposing the AirPassengers dataset, building an ARIMA forecasting model, hierarchical clustering with Euclidean and DTW distances, and classifying the synthetic control chart time series data using decision trees with and without discrete wavelet transform feature extraction. Accuracy improved from 81.8% to 88.8% when using DWT features with decision trees.
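The DTW distance used for the hierarchical clustering above can be sketched in a few lines of Python; this is a minimal textbook version of the algorithm, not the implementation used in the talk.

```python
def dtw(a, b):
    """Dynamic Time Warping distance between two numeric sequences:
    the minimum cumulative |a[i] - b[j]| cost over all alignments."""
    INF = float("inf")
    n, m = len(a), len(b)
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch a
                                 cost[i][j - 1],      # stretch b
                                 cost[i - 1][j - 1])  # match step
    return cost[n][m]

print(dtw([1, 2, 3], [1, 2, 3]))  # 0.0 -- identical series
print(dtw([1, 2, 3], [2, 3, 4]))  # 2.0 -- shifted copy costs more
```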
Spark 4th Meetup London - Building a Product with Spark, by samthemonad
This document discusses common technical problems encountered when building products with Spark and provides solutions. It covers Spark exceptions like out of memory errors and shuffle file problems. It recommends increasing partitions and memory configurations. The document also discusses optimizing Spark code using functional programming principles like strong and weak pipelining, and leveraging monoid structures to reduce shuffling. Overall it provides tips to debug issues, optimize performance, and productize Spark applications.
Stefan Hochdörfer - The NoSQL Store everyone ignores: PostgreSQL, by NoSQLmatters
PostgreSQL is well known as an object-relational database management system. At its core, PostgreSQL is schema-aware, dealing with fixed database tables and column types. However, recent versions of PostgreSQL have made it possible to deal with schema-free data. Learn which new features PostgreSQL supports and how to use them in your application.
1) The document discusses merging multiple CSV files containing air pollution data from 332 different monitors into a single dataframe in R. Each file has data from a single monitor and the ID is in the file name (e.g. data for monitor 200 is in "200.csv").
2) It provides information on relevant functions in R like rbind() and lists the steps to bind all 332 files into a single dataframe. First, all file paths are stored in a list using list.files(). Then, the files are read and row bound (rbind()) into a growing data object.
3) It also discusses how to handle missing values (NAs) in R and provides an example function for computing summary statistics while ignoring NAs.
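The read-and-row-bind workflow described in the steps above can be sketched in Python for comparison; the monitor IDs, file layout, and pm25 column below are invented for illustration.

```python
import csv
import glob
import os
import tempfile

# Write two tiny hypothetical monitor files, one per monitor,
# with the monitor ID in the file name (as in "200.csv" above).
tmp = tempfile.mkdtemp()
data = {"200": [("2004-01-01", 7.2)],
        "201": [("2004-01-01", 9.1), ("2004-01-02", 8.4)]}
for monitor_id, rows in data.items():
    with open(os.path.join(tmp, f"{monitor_id}.csv"), "w", newline="") as f:
        w = csv.writer(f)
        w.writerow(["date", "pm25"])
        w.writerows(rows)

# Read every file and row-bind the records, like looping rbind() in R.
combined = []
for path in sorted(glob.glob(os.path.join(tmp, "*.csv"))):
    monitor = os.path.splitext(os.path.basename(path))[0]
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            row["id"] = monitor  # carry the monitor ID from the file name
            combined.append(row)

print(len(combined))  # 3 rows bound together
```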
This talk was prepared for the November 2013 DataPhilly Meetup: Data in Practice ( http://www.meetup.com/DataPhilly/events/149515412/ )
Map Reduce: Beyond Word Count by Jeff Patti
Have you ever wondered what map reduce can be used for beyond the word count example you see in all the introductory articles about map reduce? Using Python and mrjob, this talk will cover a few simple map reduce algorithms that in part power Monetate's information pipeline
Bio: Jeff Patti is a backend engineer at Monetate with a passion for algorithms, big data, and long walks on the beach. Prior to working at Monetate he performed software R&D for Lockheed Martin, where he worked on projects ranging from social network analysis to robotics.
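The map/shuffle/reduce phases such algorithms rely on can be mimicked in plain Python (no mrjob or Hadoop involved); the records and the max-per-key job below are invented for illustration.

```python
from collections import defaultdict

# A toy map/reduce job that goes beyond word count: maximum value per key.
records = [("page_a", 3), ("page_b", 7), ("page_a", 5), ("page_b", 2)]

# Map phase: emit (key, value) pairs.
mapped = [(k, v) for k, v in records]

# Shuffle phase: group values by key.
shuffled = defaultdict(list)
for k, v in mapped:
    shuffled[k].append(v)

# Reduce phase: one output per key.
reduced = {k: max(vs) for k, vs in shuffled.items()}
print(reduced)  # {'page_a': 5, 'page_b': 7}
```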
Cassandra data structures and algorithms, by Duyhai Doan
This document discusses Cassandra data structures and algorithms. It begins with an introduction and agenda, then covers Cassandra's use of CRDTs, bloom filters, and Merkle trees for its data model. It explains how Cassandra columns can be modeled as a CRDT join semilattice and proves their eventual convergence. The document also covers Cassandra's write path, read path optimized with bloom filters, and the math behind bloom filter probabilities.
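A Bloom filter like the one Cassandra consults on its read path can be sketched in a few lines of Python; the bit-array size, hash count, and hashing scheme below are illustrative choices, not Cassandra's actual implementation.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hash functions over an m-bit array.
    False positives are possible; false negatives are not."""

    def __init__(self, m=1024, k=3):
        self.m, self.k, self.bits = m, k, 0

    def _positions(self, item):
        # Derive k positions by salting a SHA-256 hash with the index.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits |= 1 << p

    def might_contain(self, item):
        return all(self.bits & (1 << p) for p in self._positions(item))

bf = BloomFilter()
bf.add("row-key-42")
print(bf.might_contain("row-key-42"))  # True -- no false negatives
```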
- Apply functions in R are used to apply a specified function to each column or row of R objects. Common apply functions include apply(), lapply(), sapply(), tapply(), vapply(), and mapply().
- The dplyr package is a powerful R package for data manipulation. It provides verbs like select(), filter(), arrange(), mutate(), and summarize() to work with tabular data.
- Functions like apply(), lapply(), and sapply() apply a function over lists or matrices; arrange() reorders data, mutate() adds new variables, and summarize() collapses multiple values into single values.
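Rough plain-Python analogs of these dplyr verbs, run over an invented list-of-dicts table:

```python
from statistics import mean

# A tiny hypothetical table, one dict per row.
rows = [{"name": "a", "x": 3}, {"name": "b", "x": 1}, {"name": "c", "x": 2}]

filtered = [r for r in rows if r["x"] > 1]           # filter(): keep matching rows
mutated = [{**r, "x2": r["x"] * 2} for r in rows]    # mutate(): add a new variable
arranged = sorted(rows, key=lambda r: r["x"])        # arrange(): reorder rows
summary = mean(r["x"] for r in rows)                 # summarize(): collapse to one value

print(summary)  # 2
```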
The document contains summaries of code snippets and explanations of technical concepts. It discusses:
1) How a code snippet with post-increment operator i++ would output a garbage value.
2) Why a code snippet multiplying two ints and storing in a long int variable would not give the desired output.
3) Why a code snippet attempting to concatenate a character to a string would not work.
4) How to determine the maximum number of elements an array can hold based on its data type and memory model.
5) How to read data from specific memory locations using the peekb() function in C.
This document contains a dictionary that maps nucleotide sequences to amino acids. It defines the genetic code by listing the 3-letter codon sequences and their corresponding single-letter amino acid abbreviations. The document also contains some example nucleotide sequences and a prompt to find the answer in the file ultimate-sequence.txt.
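A small fragment of such a codon dictionary, sketched in Python; only a handful of codons are shown, and the translate() helper is illustrative rather than taken from the document.

```python
# A fragment of the genetic code as a dict mapping 3-letter codons
# to single-letter amino acid abbreviations.
CODON_TABLE = {
    "ATG": "M", "TGG": "W", "TTT": "F", "AAA": "K",
    "TAA": "*", "TAG": "*", "TGA": "*",   # stop codons
}

def translate(seq):
    """Translate a nucleotide sequence codon by codon, stopping at a stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "?")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

print(translate("ATGTTTAAATAA"))  # MFK
```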
Data warehouse or conventional database: Which is right for you?Data Con LA
Data Con LA 2020
Description
Developers have a plethora of choices for application data stores. In this talk we'll explore the differences between transaction processing systems like MySQL and analytic databases like ClickHouse to help you make the best choice for your application. Confused about when to use a data warehouse vs a traditional relational database? Open source has so many choices! Using MySQL and ClickHouse as examples, we'll work through use cases to see where each shines. Along the way we'll explore key technical differences like:
* row vs. column storage
* indexing and compression
* query parallelization
* concurrency support
* transaction models.
Finally we'll discuss how to handle use cases that require capabilities of both. Listeners will leave with clear criteria and a deeper understanding of database internals that enable them to make the right choice(s) for their own use cases.
Speaker
Robert Hodges, Altinity, Inc, CEO
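The row-versus-column storage difference called out above can be illustrated with plain Python lists; the records below are invented, and real engines layer compression and vectorized execution on top of this layout difference.

```python
# The same three records laid out both ways.
rows = [  # row store: one tuple per record
    ("2020-01-01", "clicks", 10),
    ("2020-01-02", "clicks", 30),
    ("2020-01-03", "clicks", 20),
]

columns = {  # column store: one list per field
    "date":   ["2020-01-01", "2020-01-02", "2020-01-03"],
    "metric": ["clicks", "clicks", "clicks"],
    "value":  [10, 30, 20],
}

# Aggregating 'value' touches every tuple in the row store...
row_total = sum(r[2] for r in rows)
# ...but only a single contiguous list in the column store.
col_total = sum(columns["value"])
assert row_total == col_total == 60
```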
This document provides an overview of MongoDB aggregation which allows processing data records and returning computed results. It describes some common aggregation pipeline stages like $match, $lookup, $project, and $unwind. $match filters documents, $lookup performs a left outer join, $project selects which fields to pass to the next stage, and $unwind deconstructs an array field. The document also lists other pipeline stages and aggregation pipeline operators for arithmetic, boolean, and comparison expressions.
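The semantics of these stages can be mimicked over plain Python dicts (this is not MongoDB syntax, just an illustration of what each stage does, with invented documents):

```python
docs = [
    {"sku": "a", "qty": 5, "tags": ["red", "blank"]},
    {"sku": "b", "qty": 15, "tags": ["blue"]},
]

# $match: keep documents with qty >= 10.
matched = [d for d in docs if d["qty"] >= 10]

# $project: pass only selected fields to the next stage.
projected = [{"sku": d["sku"], "tags": d["tags"]} for d in matched]

# $unwind: one output document per element of the 'tags' array.
unwound = [{**d, "tags": t} for d in projected for t in d["tags"]]

print(unwound)  # [{'sku': 'b', 'tags': 'blue'}]
```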
This document provides a cheat sheet for frequently used commands in Stata for data processing, exploration, transformation, and management. It highlights commands for viewing and summarizing data, importing and exporting data, string manipulation, merging datasets, and more. Keyboard shortcuts for navigating Stata are also included.
Data Exploration and Visualization with R, by Yanchang Zhao
- The document discusses exploring and visualizing data with R. It explores the iris data set through various visualizations and statistical analyses.
- Individual variables in the iris data are explored through histograms, density plots, and summaries of their distributions. Correlations between variables are also examined.
- Multiple variables are visualized through scatter plots, box plots, and a heatmap of distances between observations. Three-dimensional scatter plots are also demonstrated.
- The document shows how to access attributes of the data, view the first rows, and aggregate statistics by subgroups. Various plots are created to visualize the data from different perspectives.
Stata cheat sheet: programming. Co-authored with Tim Essam (linkedin.com/in/timessam). See all cheat sheets at http://bit.ly/statacheatsheets. Updated 2016/06/04
This document discusses refactoring Java code to Clojure using macros. It provides examples of refactoring Java code that uses method chaining to equivalent Clojure code using the threading macros (->> and -<>). It also discusses other Clojure features like type hints, the doto macro, and polyglot projects using Leiningen.
Pandas is a popular Python library for data analysis and manipulation. It allows users to import data from various sources like CSV, Excel, JSON, databases into DataFrames. DataFrames can then be queried, filtered, reshaped, inspected, and exported back into different formats. Some key operations include importing/exporting data, subsetting rows and columns, reshaping the layout, and aggregating/summarizing the data.
A good data model is key to getting the best performance from Apache Cassandra. The Log Structured Storage Engine and its distributed architecture mean we cannot rely on a paradigm such as Normal Form to evaluate a model. Instead we need to design data models that support the read path of the application. In this talk Aaron Morton will walk through the key principles and patterns of a good CQL3 data model using simple examples.
R is an open source statistical computing platform that is rapidly growing in popularity within academia. It allows for statistical analysis and data visualization. The document provides an introduction to basic R functions and syntax for assigning values, working with data frames, filtering data, plotting, and connecting to databases. More advanced techniques demonstrated include decision trees, random forests, and other data mining algorithms.
This document provides an overview of using R for financial modeling. It covers basic R commands for calculations, vectors, matrices, lists, data frames, and importing/exporting data. Graphical functions like plots, bar plots, pie charts, and boxplots are demonstrated. Advanced topics discussed include distributions, parameter estimation, correlations, linear and nonlinear regression, technical analysis packages, and practical exercises involving financial data analysis and modeling.
This document provides an introduction and overview of Cassandra and NoSQL databases. It discusses the challenges faced by modern web applications that led to the development of NoSQL databases. It then describes Cassandra's data model, API, consistency model, and architecture including write path, read path, compactions, and more. Key features of Cassandra like tunable consistency levels and high availability are also highlighted.
PostgreSQL provides several data types for storing different kinds of data. It supports standard SQL data types like integers, floats, characters, and timestamps. It also provides range, network, geometric, and JSON data types. PostgreSQL allows arrays, composite types, and domains to define constraints on columns. Additional types like hstore and XML are provided by extensions. Indexes can be created on columns and expressions to improve query performance. Tables can be partitioned using table inheritance.
This document provides a summary of frequently used commands in Stata with brief explanations. Key commands are highlighted for importing, exploring, summarizing, and viewing data. Other sections cover manipulating strings, combining data, and reshaping data.
This document discusses SQL windowing functions. It covers topics such as window aggregate functions like COUNT, SUM, AVG; set-based vs iterative programming; uses of window functions like paging, deduplicating data, and running totals; ranking functions; common table expressions; optimizing ranking functions; creating sequences; removing duplicate entries; pivoting; and what's new in SQL Server 2012 like distribution and offset functions.
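A windowed running total, SUM(amount) OVER (PARTITION BY acct ORDER BY day), can be mimicked in Python to show the semantics; the account/day/amount rows are invented for illustration.

```python
from itertools import groupby

# (acct, day, amount) rows; sorting puts each partition in order.
rows = [("a", 1, 100), ("a", 2, 50), ("b", 1, 70), ("b", 2, 30)]

result = []
for acct, grp in groupby(sorted(rows), key=lambda r: r[0]):
    total = 0  # the running aggregate resets per partition
    for _, day, amount in grp:
        total += amount
        result.append((acct, day, total))

print(result)  # [('a', 1, 100), ('a', 2, 150), ('b', 1, 70), ('b', 2, 100)]
```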
AWS July Webinar Series: Amazon Redshift Optimizing Performance, by Amazon Web Services
This document provides an overview and best practices for optimizing performance on Amazon Redshift. It discusses topics like data distribution, sort keys, compression, loading data efficiently, vacuum operations, and query processing. The webinar agenda covers architecture, distribution styles, sort keys, compression, workload management and more. Examples are provided to demonstrate how different techniques can significantly improve query performance. Administrative scripts and views are also recommended as helpful tools.
The document discusses pointers and arrays in C programming. It explains that an array stores multiple elements of the same type in contiguous memory locations, while a pointer variable stores the address of another variable. The summary demonstrates how to declare and initialize arrays and pointers, access array elements using pointers, pass arrays to functions by reference using pointers, and how pointers and arrays are related but not synonymous concepts.
Getting started with Pandas Cheatsheet.pdfSudhakarVenkey
Pandas is a popular Python library for data wrangling and analysis that can handle various data formats like CSV, Excel, JSON, HTML, and SQL. It allows users to create DataFrames from dictionaries, lists, and by importing data from text, Excel files, websites, databases and JSON. DataFrames can be exported, inspected by viewing rows and columns, subsetted by selecting rows and columns, queried using logical conditions, and reshaped by changing layout, renaming columns, appending rows/columns, and sorting values.
This is a brief introduction to how R can be useful in the manufacturing sector to calculate the frequency of faults and then developing the model so that preventive maintenance can be done
The document discusses Structured Query Language (SQL). It describes SQL as a declarative query language used to define database schemas, manipulate data through queries, and perform operations like insert, update, delete. It also outlines SQL's data definition language for defining database structure and data types, and its data manipulation language for conducting queries and CRUD operations. The document provides a brief history of SQL and describes the SQL standard.
UKOUG Tech14 - Using Database In-Memory Column Store with Complex DatatypesMarco Gralike
Presentation used during the UKOUG Tech14 conference in Liverpool (UK) discussing possibilities of the use of the 12.1.0.2 In-Memory Column Store option with XMLType data type storage options common to the Oracle 12.1 database
This document summarizes the analysis of pump failure data from three different datasets - Kirlosker, KSB, and a turbine/compressor dataset. The data is extracted from multiple excel files into R and combined. Dates are standardized and the number of days between installation and first failure is calculated for each pump. The failure notification dates are arranged in descending order for each pump and the number of days between failures is found. The analysis aims to understand pump reliability and failure patterns over time.
Designing a database like an archaeologistyoavrubin
The document describes the design approach for Datomic, a database modeled after archaeological excavation sites. Each change to an entity creates a new layer in the database, similar to layers uncovered during excavation. Entities have attributes with values at specific times, represented as datoms. Transactions operate by adding layers and updating database state. Queries can retrieve the evolutionary history of attributes or treat the database as a graph with entities as nodes and references as labeled links.
This document provides a cheat sheet on vector and matrix operations, time series analysis functions, modeling functions, and plotting functions in R. It includes the basic syntax for constructing and selecting vectors and matrices, as well as functions for time series decomposition, modeling, testing, and plotting time series data. Examples are given for accessing Quandl data and plotting it using ggplot2.
Desk reference for data wrangling, analysis, visualization, and programming in Stata. Co-authored with Tim Essam(@StataRGIS, linkedin.com/in/timessam). See all cheat sheets at http://bit.ly/statacheatsheets. Updated 2016/06/03
S, K, and I combinators with example from C# using MoreLINQ's pipe (K combinator). Implementations of S, K, and I combinators in F# and JavaScript. Ends with proof that SKK = I.
Great Scott! C# 7 is almost out!
Time to hop into the DeLorean with Doc Brown. If my calculations are correct, when this baby hits 88 miles per hour, we'll be traveling back to the future with C# 7. Switches? Where we're going, we don't need Switches we got Pattern Matching. Just hold on for one second. Let's get this straight, in the future Tuples will be usable! This will beg the questions. Where are we? When are we? We are in the future and we now have Local Functions and Records.
You'll walk away with a sense of where C# is going and how you can learn about its new features today by looking back to the future. This is heavy.
Hiking through the Functional Forest with Fizz BuzzMike Harris
FizzBuzz the interview question which we all laugh at. What if I told you that FizzBuzz can be used as a tool to navigate your way through the functional forest? Join me on a short hike through the functional forest using FizzBuzz to navigate our way. We’ll look at Union Types, Pattern Matching, and Higher Order Functions all while staying in the domain of FizzBuzz. You’ll never look at FizzBuzz the same way again.
The document discusses Roman numeral kata and provides potential solutions in F#. It also lists resources for learning more about F#, lambda calculus, and type theory including book recommendations and links. It concludes by thanking the reader and sharing a quote about how programming languages should influence your thinking.
Using the work of Dr. Graham Hutton as our guide, we'll look at how to satisfy all of your list processing needs with one function, fold. First we'll start off simple by finding the length of a list, then we'll reverse a list, followed by and-ing and or-ing a list; all using fold. Next we'll look at implementing the higher order functions of: map, filter, and zip. Last we'll look at fold in action by using it on the Coin Changer kata. You'll never look at fold the same way again.
Using the work of Dr. Graham Hutton as our guide, we'll look at how to satisfy all of your list processing needs with one function, fold. First we'll start off simple by finding the length of a list, then we'll reverse a list, followed by and-ing and or-ing a list; all using fold. Next we'll look at implementing the higher order functions of: map, filter, and zip. Last we'll look at fold in action by using it on the Coin Changer kata. You'll never look at fold the same way again.
This document summarizes Mike Harris's travels learning Clojure. It outlines the key topics covered in the log, including higher order functions, variadic functions, destructuring, and working with map structures. Code examples are provided in JavaScript, ECMAScript 6, and Clojure to demonstrate these concepts. Bibliographies of relevant conference talks, online videos, and books are also included.
The Marvelous Land of Higher Order FunctionsMike Harris
From my talk at LCNUG.
http://www.lcnug.org/News/14-12-16/LCNUG_January_8th_-_The_Marvelous_Land_of_Higher_Order_Functions.aspx
Looking to be more pragmatic with LINQ? Want to see the connection between the C# you use every day and Functional Programming? Want to stop using for loops?
Join me on a trip to the marvelous Land of Higher Order Functions.
We’ll travel to the forest of Maps, Filters, and Folding—oh my!
Then we’ll visit the towns of Zip and Mapcat.
We’ll end our trip with a sight of the City of FP.
The document discusses different approaches to testing the FizzBuzz problem using properties and traits. It describes testing FizzBuzz translations using NUnit tests, SpecFlow scenarios, and property-based testing with FsCheck. Examples are provided of generating test data to validate that numbers divisible by 3 return "Fizz", numbers divisible by 5 return "Buzz", and numbers divisible by 15 return "FizzBuzz". The document suggests creating a model to further demonstrate the robustness of the FizzBuzz solution.
Learn You a Functional JavaScript for Great GoodMike Harris
We have been told by just about everyone that we should learning functional programming, let this session be your introduction using a language you already know JavaScript.
Yep, JavaScript. No endless amounts of parentheses. No monadic arrows. Just good old JavaScript.
We'll look take a look at the functional JavaScript landscape see how to use:
Currying
Combinators
Multimethods
We'll be using Underscore.js, allong.es, and bilby.js to help us.
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeAftab Hussain
Understanding variable roles in code has been found to be helpful by students
in learning programming -- could variable roles help deep neural models in
performing coding tasks? We do an exploratory study.
- These are slides of the talk given at InteNSE'23: The 1st International Workshop on Interpretability and Robustness in Neural Software Engineering, co-located with the 45th International Conference on Software Engineering, ICSE 2023, Melbourne Australia
Graspan: A Big Data System for Big Code AnalysisAftab Hussain
We built a disk-based parallel graph system, Graspan, that uses a novel edge-pair centric computation model to compute dynamic transitive closures on very large program graphs.
We implement context-sensitive pointer/alias and dataflow analyses on Graspan. An evaluation of these analyses on large codebases such as Linux shows that their Graspan implementations scale to millions of lines of code and are much simpler than their original implementations.
These analyses were used to augment the existing checkers; these augmented checkers found 132 new NULL pointer bugs and 1308 unnecessary NULL tests in Linux 4.4.0-rc5, PostgreSQL 8.3.9, and Apache httpd 2.2.18.
- Accepted in ASPLOS ‘17, Xi’an, China.
- Featured in the tutorial, Systemized Program Analyses: A Big Data Perspective on Static Analysis Scalability, ASPLOS ‘17.
- Invited for presentation at SoCal PLS ‘16.
- Invited for poster presentation at PLDI SRC ‘16.
WhatsApp offers simple, reliable, and private messaging and calling services for free worldwide. With end-to-end encryption, your personal messages and calls are secure, ensuring only you and the recipient can access them. Enjoy voice and video calls to stay connected with loved ones or colleagues. Express yourself using stickers, GIFs, or by sharing moments on Status. WhatsApp Business enables global customer outreach, facilitating sales growth and relationship building through showcasing products and services. Stay connected effortlessly with group chats for planning outings with friends or staying updated on family conversations.
OpenMetadata Community Meeting - 5th June 2024OpenMetadata
The OpenMetadata Community Meeting was held on June 5th, 2024. In this meeting, we discussed about the data quality capabilities that are integrated with the Incident Manager, providing a complete solution to handle your data observability needs. Watch the end-to-end demo of the data quality features.
* How to run your own data quality framework
* What is the performance impact of running data quality frameworks
* How to run the test cases in your own ETL pipelines
* How the Incident Manager is integrated
* Get notified with alerts when test cases fail
Watch the meeting recording here - https://www.youtube.com/watch?v=UbNOje0kf6E
May Marketo Masterclass, London MUG May 22 2024.pdfAdele Miller
Can't make Adobe Summit in Vegas? No sweat because the EMEA Marketo Engage Champions are coming to London to share their Summit sessions, insights and more!
This is a MUG with a twist you don't want to miss.
SOCRadar's Aviation Industry Q1 Incident Report is out now!
The aviation industry has always been a prime target for cybercriminals due to its critical infrastructure and high stakes. In the first quarter of 2024, the sector faced an alarming surge in cybersecurity threats, revealing its vulnerabilities and the relentless sophistication of cyber attackers.
SOCRadar’s Aviation Industry, Quarterly Incident Report, provides an in-depth analysis of these threats, detected and examined through our extensive monitoring of hacker forums, Telegram channels, and dark web platforms.
GraphSummit Paris - The art of the possible with Graph TechnologyNeo4j
Sudhir Hasbe, Chief Product Officer, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Łukasz Chruściel
No one wants their application to drag like a car stuck in the slow lane! Yet it’s all too common to encounter bumpy, pothole-filled solutions that slow the speed of any application. Symfony apps are not an exception.
In this talk, I will take you for a spin around the performance racetrack. We’ll explore common pitfalls - those hidden potholes on your application that can cause unexpected slowdowns. Learn how to spot these performance bumps early, and more importantly, how to navigate around them to keep your application running at top speed.
We will focus in particular on tuning your engine at the application level, making the right adjustments to ensure that your system responds like a well-oiled, high-performance race car.
Software Engineering, Software Consulting, Tech Lead, Spring Boot, Spring Cloud, Spring Core, Spring JDBC, Spring Transaction, Spring MVC, OpenShift Cloud Platform, Kafka, REST, SOAP, LLD & HLD.
Zoom is a comprehensive platform designed to connect individuals and teams efficiently. With its user-friendly interface and powerful features, Zoom has become a go-to solution for virtual communication and collaboration. It offers a range of tools, including virtual meetings, team chat, VoIP phone systems, online whiteboards, and AI companions, to streamline workflows and enhance productivity.
Hand Rolled Applicative User ValidationCode KataPhilip Schwarz
Could you use a simple piece of Scala validation code (granted, a very simplistic one too!) that you can rewrite, now and again, to refresh your basic understanding of Applicative operators <*>, <*, *>?
The goal is not to write perfect code showcasing validation, but rather, to provide a small, rough-and ready exercise to reinforce your muscle-memory.
Despite its grandiose-sounding title, this deck consists of just three slides showing the Scala 3 code to be rewritten whenever the details of the operators begin to fade away.
The code is my rough and ready translation of a Haskell user-validation program found in a book called Finding Success (and Failure) in Haskell - Fall in love with applicative functors.
Enterprise Resource Planning System includes various modules that reduce any business's workload. Additionally, it organizes the workflows, which drives towards enhancing productivity. Here are a detailed explanation of the ERP modules. Going through the points will help you understand how the software is changing the work dynamics.
To know more details here: https://blogs.nyggs.com/nyggs/enterprise-resource-planning-erp-system-modules/
Odoo ERP software
Odoo ERP software, a leading open-source software for Enterprise Resource Planning (ERP) and business management, has recently launched its latest version, Odoo 17 Community Edition. This update introduces a range of new features and enhancements designed to streamline business operations and support growth.
The Odoo Community serves as a cost-free edition within the Odoo suite of ERP systems. Tailored to accommodate the standard needs of business operations, it provides a robust platform suitable for organisations of different sizes and business sectors. Within the Odoo Community Edition, users can access a variety of essential features and services essential for managing day-to-day tasks efficiently.
This blog presents a detailed overview of the features available within the Odoo 17 Community edition, and the differences between Odoo 17 community and enterprise editions, aiming to equip you with the necessary information to make an informed decision about its suitability for your business.
Utilocate offers a comprehensive solution for locate ticket management by automating and streamlining the entire process. By integrating with Geospatial Information Systems (GIS), it provides accurate mapping and visualization of utility locations, enhancing decision-making and reducing the risk of errors. The system's advanced data analytics tools help identify trends, predict potential issues, and optimize resource allocation, making the locate ticket management process smarter and more efficient. Additionally, automated ticket management ensures consistency and reduces human error, while real-time notifications keep all relevant personnel informed and ready to respond promptly.
The system's ability to streamline workflows and automate ticket routing significantly reduces the time taken to process each ticket, making the process faster and more efficient. Mobile access allows field technicians to update ticket information on the go, ensuring that the latest information is always available and accelerating the locate process. Overall, Utilocate not only enhances the efficiency and accuracy of locate ticket management but also improves safety by minimizing the risk of utility damage through precise and timely locates.
Takashi Kobayashi and Hironori Washizaki, "SWEBOK Guide and Future of SE Education," First International Symposium on the Future of Software Engineering (FUSE), June 3-6, 2024, Okinawa, Japan
6. –Dante (tr. Longfellow), Inferno
Canto I, Lines 1-3
Midway upon the journey of our life
I found myself within a forest dark,
For the straightforward pathway had been lost.
7. Canto
• Data Flow in a System
• The Language of Data Flow Manipulations
• Data Flow in Context
8. Canto
• Data Flow in a System
• The Language of Data Flow Manipulations
• Data Flow in Context
9. Example Data Set
create table stock_price (
symbol varchar(5) not null
,price_date date not null
,price money not null
);
insert into stock_price
(symbol, price_date, price)
values
('ZVZZT', '2017-01-03', 10.02)
,('CBO' , '2017-01-03', 18.99)
,('ZVZZT', '2017-01-04', 9.99)
,('CBO' , '2017-01-04', 19.01)
;
10. Example Data Set
Symbol Price Date Price
ZVZZT 2017-01-03 10.02
CBO 2017-01-03 18.99
ZVZZT 2017-01-04 9.99
CBO 2017-01-04 19.01
11. SQL Statement
select symbol, price
from stock_price
where symbol = 'ZVZZT'
order by price_date
;
Symbol Price Date Price
ZVZZT 2017-01-03 10.02
ZVZZT 2017-01-04 9.99
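The filter-then-order flow of this query can be sketched outside SQL as well. Below is a rough Python translation (mine, not part of the deck): the table becomes a list of dicts, the WHERE clause a filter, and ORDER BY a sort.

```python
# Hedged sketch: the stock_price table as plain Python data, then the
# slide's query (where symbol = 'ZVZZT' ... order by price_date).
stock_price = [
    {"symbol": "ZVZZT", "price_date": "2017-01-03", "price": 10.02},
    {"symbol": "CBO",   "price_date": "2017-01-03", "price": 18.99},
    {"symbol": "ZVZZT", "price_date": "2017-01-04", "price": 9.99},
    {"symbol": "CBO",   "price_date": "2017-01-04", "price": 19.01},
]

rows = sorted(
    (r for r in stock_price if r["symbol"] == "ZVZZT"),  # where
    key=lambda r: r["price_date"],                        # order by
)
result = [(r["symbol"], r["price"]) for r in rows]        # select
print(result)  # [('ZVZZT', 10.02), ('ZVZZT', 9.99)]
```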
12. Logical Order of a T-SQL Statement
1. FROM
2. ON
3. JOIN
4. WHERE
5. GROUP BY
6. WITH ROLLUP
7. HAVING
8. SELECT
9. DISTINCT
10. ORDER BY
11. TOP
13. Logical Order of a T-SQL Statement
1. FROM
2. ON
3. JOIN
4. WHERE
5. GROUP BY
6. WITH ROLLUP
7. HAVING
8. SELECT
9. DISTINCT
10. ORDER BY
11. TOP
Extract
Aggregate
Map
Filter
Order
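The Extract/Filter/Aggregate/Map/Order labels above can be sketched as ordinary functions composed in the logical (not written) order. This is my illustrative Python sketch, with an aggregate over price rather than the deck's price_date, purely for variety.

```python
# Hedged sketch of the slide's stages as a Python pipeline.
data = [("ZVZZT", 10.02), ("CBO", 18.99),
        ("ZVZZT", 9.99), ("CBO", 19.01)]            # extract (FROM)

filtered = [row for row in data if row[0] == "ZVZZT"]  # filter (WHERE)

grouped = {}                                           # aggregate (GROUP BY + max)
for symbol, price in filtered:
    grouped[symbol] = max(grouped.get(symbol, price), price)

projected = [(s, p) for s, p in grouped.items()]       # map (SELECT)
ordered = sorted(projected)                            # order (ORDER BY)
print(ordered)  # [('ZVZZT', 10.02)]
```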
14. Logical Order of a T-SQL Statement
1. FROM
2. ON
3. JOIN
4. WHERE
5. GROUP BY
6. WITH ROLLUP
7. HAVING
8. SELECT
9. DISTINCT
10. ORDER BY
11. TOP
16. SQL Statement
select symbol, price
from stock_price
where symbol = 'ZVZZT'
order by price_date
;
Symbol Price Date Price
ZVZZT 2017-01-03 10.02
ZVZZT 2017-01-04 9.99
(slide callouts mark the logical evaluation order: from = 1, where = 2, select = 3, order by = 4)
23. Group By Folded SQL Statement
select
symbol
,max(price_date) as price_date
from stock_price
group by symbol
;
Symbol Price Date
CBO 2017-01-04
ZVZZT 2017-01-04
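The GROUP BY fold above collapses many rows into one per symbol. A rough Python equivalent (my sketch, not the deck's code) keeps a running max per group:

```python
# Hedged sketch: group by symbol, folding each group down to its max price_date.
rows = [
    ("ZVZZT", "2017-01-03"), ("CBO", "2017-01-03"),
    ("ZVZZT", "2017-01-04"), ("CBO", "2017-01-04"),
]
latest = {}
for symbol, price_date in rows:
    latest[symbol] = max(latest.get(symbol, price_date), price_date)
print(sorted(latest.items()))
# [('CBO', '2017-01-04'), ('ZVZZT', '2017-01-04')]
```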
24. Logical Order of a T-SQL Statement
1. FROM
2. ON
3. JOIN
4. WHERE
5. GROUP BY
6. WITH ROLLUP
7. HAVING
8. SELECT
9. DISTINCT
10. ORDER BY
11. TOP
25. Group By Folded SQL Statement Data Flow
from → group by → select max
26. Group By Folded SQL Statement
select
symbol
,max(price_date) as price_date
from stock_price
group by symbol
;
Symbol Price Date
CBO 2017-01-04
ZVZZT 2017-01-04
(slide callouts mark the logical evaluation order: from = 1, group by = 2, select max = 3)
28. Window Folded SQL Statement
select distinct
symbol
,max(price_date) over(
partition by symbol) as price_date
from stock_price
;
Symbol Price Date
CBO 2017-01-04
ZVZZT 2017-01-04
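Unlike GROUP BY, the window version keeps every input row and attaches the per-partition max to each one; DISTINCT then collapses the duplicates. A rough Python sketch of that two-step flow (illustrative, not the deck's code):

```python
# Hedged sketch of max(price_date) over (partition by symbol) + distinct.
rows = [
    ("ZVZZT", "2017-01-03"), ("CBO", "2017-01-03"),
    ("ZVZZT", "2017-01-04"), ("CBO", "2017-01-04"),
]
max_by_symbol = {}
for symbol, price_date in rows:
    max_by_symbol[symbol] = max(max_by_symbol.get(symbol, price_date), price_date)

# The window keeps one output row per input row...
windowed = [(symbol, max_by_symbol[symbol]) for symbol, _ in rows]
# ...and distinct collapses them.
distinct = sorted(set(windowed))
print(distinct)  # [('CBO', '2017-01-04'), ('ZVZZT', '2017-01-04')]
```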
29. Logical Order of a T-SQL Statement
1. FROM
2. ON
3. JOIN
4. WHERE
5. GROUP BY
6. WITH ROLLUP
7. HAVING
8. SELECT
9. DISTINCT
10. ORDER BY
11. TOP
30. Window SQL Statement Data View
max(price_date) over(
partition by symbol) as price_date
Symbol  Price_Date    price_date (max over partition)
CBO     2017-01-03    2017-01-04
CBO     2017-01-04    2017-01-04
32. Window Folded SQL Statement
select distinct
symbol
,max(price_date) over(
partition by symbol) as price_date
from stock_price
;
Symbol Price Date
CBO 2017-01-04
ZVZZT 2017-01-04
(slide callouts mark the logical evaluation order: from = 1, windowed select = 2)
34. Window SQL Statement
select
symbol
,price
,lag(price, 1) over (
partition by symbol
order by price_date) as prev_price
,price - lag(price, 1) over (
partition by symbol
order by price_date) as price_change
,lead(price, 1) over (
partition by symbol
order by price_date) as next_price
from stock_price
;
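lag and lead look one row back or forward within each symbol's date-ordered partition, yielding null at the edges. Here is my rough Python sketch of that statement (data layout and names are illustrative):

```python
# Hedged sketch of lag/lead over (partition by symbol order by price_date).
rows = [
    ("ZVZZT", "2017-01-03", 10.02), ("CBO", "2017-01-03", 18.99),
    ("ZVZZT", "2017-01-04", 9.99),  ("CBO", "2017-01-04", 19.01),
]

# Partition by symbol, ordering each partition by price_date.
partitions = {}
for symbol, price_date, price in sorted(rows, key=lambda r: (r[0], r[1])):
    partitions.setdefault(symbol, []).append(price)

result = []
for symbol, prices in partitions.items():
    for i, price in enumerate(prices):
        prev_price = prices[i - 1] if i > 0 else None                 # lag(price, 1)
        next_price = prices[i + 1] if i + 1 < len(prices) else None   # lead(price, 1)
        change = round(price - prev_price, 2) if prev_price is not None else None
        result.append((symbol, price, prev_price, change, next_price))
print(result)
```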
36. Window SQL Statement Data View
lag(price, 1) over (
partition by symbol
order by price_date) as prev_price
Symbol  Price_Date    Price    prev_price
CBO     2017-01-03    18.99    (null)
CBO     2017-01-04    19.01    18.99
37. Window SQL Statement Data View
lead(price, 1) over (
partition by symbol
order by price_date) as next_price
Symbol  Price_Date    Price    next_price
CBO     2017-01-03    18.99    19.01
CBO     2017-01-04    19.01    (null)
38. Logical Order of a T-SQL Statement
1. FROM
2. ON
3. JOIN
4. WHERE
5. GROUP BY
6. WITH ROLLUP
7. HAVING
8. SELECT
9. DISTINCT
10. ORDER BY
11. TOP
40. Window SQL Statement
select
symbol
,price
,lag(price, 1) over (
partition by symbol
order by price_date) as prev_price
,price - lag(price, 1) over (
partition by symbol
order by price_date) as price_change
,lead(price, 1) over (
partition by symbol
order by price_date) as next_price
from stock_price
;
(slide callouts mark the logical evaluation order: from = 1, windowed select = 2)
41. Canto
• Data Flow in a System
• The Language of Data Flow Manipulations
• Data Flow in Context
42. Imperative Example
var s = "Midway in our life's journey, I went astray";
var count = 0;
foreach(var word in s.Split(' '))
{
var stripped =
Regex.Replace(word, @"('s)|\W+", "");
if (stripped.Length > 2)
count++;
}
44. Nested Example
var s = "Midway in our life's journey, I went astray";
var count = Enumerable.Count(
Enumerable.Where(
Enumerable.Select(
s.Split(' '),
word => Regex.Replace(
word, @"('s)|\W+", "")),
stripped => stripped.Length > 2));
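The nested style reads inside-out: Count(Where(Select(...))). The same shape in Python (a translation of mine, not part of the talk) nests len/filter/map the same way:

```python
# Hedged sketch: the nested C# example as nested Python calls, read inside-out.
import re

s = "Midway in our life's journey, I went astray"
count = len(list(filter(
    lambda stripped: len(stripped) > 2,              # Where
    map(lambda word: re.sub(r"('s)|\W+", "", word),  # Select
        s.split(" ")),
)))
print(count)  # 6
```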
48. The Language of Data Flow Manipulations
• map
• bind
• filter
• fold
• mutate
• group by
• order by
49. The Language of Data Flow Manipulations
• map: Select
• bind: SelectMany
• filter: Where
• fold: Aggregate
• mutate: ForEach
• group by: GroupBy
• order by: OrderBy
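The same vocabulary exists in Python, though the mapping is rougher than the LINQ one. A sketch of my own, using the standard library:

```python
# Hedged mapping: map -> comprehension/map(), bind -> nested comprehension,
# filter -> comprehension/filter(), fold -> functools.reduce,
# group by -> itertools.groupby, order by -> sorted().
from functools import reduce
from itertools import groupby

ns = [3, 9, 10, 33, 100]
mapped = [n * 2 for n in ns]                        # map / Select
filtered = [n for n in ns if n % 3 == 0]            # filter / Where
folded = reduce(lambda acc, n: acc + n, ns, 0)      # fold / Aggregate
flat = [c for word in ["ab", "cd"] for c in word]   # bind / SelectMany
grouped = {k: list(g)                               # group by / GroupBy
           for k, g in groupby(sorted(ns, key=lambda n: n % 2),
                               key=lambda n: n % 2)}
ordered = sorted([33, 9, 100, 3, 10])               # order by / OrderBy
print(folded, filtered, flat)  # 155 [3, 9, 33] ['a', 'b', 'c', 'd']
```

Note that itertools.groupby, unlike LINQ's GroupBy, only groups adjacent elements, hence the sort by key first.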
50. The Language of Data Flow Manipulations
• map
• bind
• filter
• fold
• mutate
• group by
• order by
60. Bind Example
var list = new [] {
"Midway in our life's journey,",
"I went astray"
}
.SelectMany(s => s.Split(' '))
.Select(w => Regex.Replace(w, @"('s)|\W+", ""));
output>
new [] { "Midway", "in", "our",
"life", "journey", "I",
"went", "astray" }
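In Python, bind/SelectMany is a nested comprehension: the inner for flattens each sentence into words, and re.sub plays the role of Regex.Replace. This is my translation, not the deck's code:

```python
# Hedged sketch: SelectMany + Select as one nested comprehension.
import re

sentences = ["Midway in our life's journey,", "I went astray"]
words = [
    re.sub(r"('s)|\W+", "", word)   # map: strip possessives and punctuation
    for s in sentences
    for word in s.split(" ")        # bind: flatten sentences into words
]
print(words)
# ['Midway', 'in', 'our', 'life', 'journey', 'I', 'went', 'astray']
```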
61. Bind Query Syntax Example
var list =
from sentences in
new [] {
"Midway in our life's journey,",
"I went astray"
}
from words in sentences.Split(' ')
select Regex.Replace(words, @"('s)|\W+", "");
output>
new [] { "Midway", "in", "our",
"life", "journey", "I",
"went", "astray" }
64. SelectMany Source Code
public override bool MoveNext()
{
switch (_state)
{
case 1:
// Retrieve the source enumerator.
_sourceEnumerator = _source.GetEnumerator();
_state = 2;
goto case 2;
case 2:
// Take the next element from the source enumerator.
https://github.com/dotnet/corefx/blob/7a673547563dc92f1f4cad802d7b57483cdf7500/src/System.Linq/src/System/Linq/SelectMany.cs#L127-L223
65. SelectMany Source Code
case 2:
// Take the next element from the source enumerator.
if (!_sourceEnumerator.MoveNext())
{
break;
}
TSource element = _sourceEnumerator.Current;
// Project it into a sub-collection and get its enumerator.
_subEnumerator = _selector(element).GetEnumerator();
_state = 3;
goto case 3;
case 3:
// Take the next element from the sub-collection and yield.
https://github.com/dotnet/corefx/blob/7a673547563dc92f1f4cad802d7b57483cdf7500/src/System.Linq/src/System/Linq/SelectMany.cs#L127-L223
66. SelectMany Source Code
case 3:
// Take the next element from the sub-collection and yield.
if (!_subEnumerator.MoveNext())
{
_subEnumerator.Dispose();
_subEnumerator = null;
_state = 2;
goto case 2;
}
_current = _subEnumerator.Current;
return true;
https://github.com/dotnet/corefx/blob/7a673547563dc92f1f4cad802d7b57483cdf7500/src/System.Linq/src/System/Linq/SelectMany.cs#L127-L223
67. Bind Example
var list = new [] {
"Midway in our life's journey, I went astray"
}
.SelectMany(s => s.Split(' '))
.Select(w => Regex.Replace(w, @"('s)|\W+", ""));
output>
new [] { "Midway", "in", "our",
"life", "journey", "I",
"went", "astray" }
68. The Language of Data Flow Manipulations
• map
• bind
• filter
• fold
• mutate
• group by
• order by
85. ForEach Source Code
public void ForEach(Action<T> action)
{
// ... assert we have something
int version = _version;
for (int i = 0; i < _size; i++)
{
if (version != _version)
{
break;
}
action(_items[i]);
}
// cannot alter list
if (version != _version)
throw new InvalidOperationException(
SR.InvalidOperation_EnumFailedVersion);
}
https://github.com/dotnet/corefx/blob/bffef76f6af208e2042a2f27bc081ee908bb390b/src/System.Collections/src/System/Collections/Generic/List.cs#L598-L618
86. Mutate Example
var visited = new List<int>();
new [] {3, 9, 10, 33, 100}
.Where(n => n % 3 == 0)
.ToList()
.ForEach(n => visited.Add(n));
output>
//void
var visited = new [] {3, 9, 33}
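Mutate is the odd one out: ForEach returns nothing and works only by side effect. In Python terms (my sketch), the same result can come from an effectful loop or a pure comprehension:

```python
# Hedged sketch: mutate as a side-effecting loop vs. a pure pipeline.
visited = []
for n in [3, 9, 10, 33, 100]:
    if n % 3 == 0:
        visited.append(n)      # side effect, like List<T>.ForEach

same = [n for n in [3, 9, 10, 33, 100] if n % 3 == 0]  # no mutation needed
print(visited, same)  # [3, 9, 33] [3, 9, 33]
```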
87. The Language of Data Flow Manipulations
• map
• bind
• filter
• fold
• mutate
• group by
• order by
89. Group By Example
var groups = new [] {3, 9, 10, 33, 100}
.GroupBy(n => n % 2 == 0)
.ToDictionary(g => g.Key, g => g.ToList());
output>
new Dictionary<bool, List<int>>()
{
{true, new List<int>() {10, 100}},
{false, new List<int>() {3, 9, 33}}
}
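The GroupBy-into-dictionary shape above translates directly to Python (my sketch): partition the numbers by parity into a dict of lists, mirroring ToDictionary(g => g.Key, g => g.ToList()).

```python
# Hedged sketch: GroupBy(n => n % 2 == 0).ToDictionary(...) in Python.
groups = {}
for n in [3, 9, 10, 33, 100]:
    groups.setdefault(n % 2 == 0, []).append(n)
print(groups)  # {False: [3, 9, 33], True: [10, 100]}
```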
92. Group By Example
var groups = new [] {3, 9, 10, 33, 100}
.GroupBy(n => n % 2 == 0)
.ToDictionary(g => g.Key, g => g.ToList());
output>
new Dictionary<bool, List<int>>()
{
{true, new List<int>() {10, 100}},
{false, new List<int>() {3, 9, 33}}
}
93. The Language of Data Flow Manipulations
• map
• bind
• filter
• fold
• mutate
• group by
• order by
98. Order By Example
var unordered = new [] {33, 9, 100, 3, 10}
.OrderBy(n => n);
output>
new [] {3, 9, 10, 33, 100}
99. Canto
• Data Flow in a System
• The Language of Data Flow Manipulations
• Data Flow in Context
100. –Dante (tr. Barolini), Paradiso
Canto XVII, Lines 55-60
You shall leave everything you love most dearly:
this is the arrow that the bow of exile
shoots first. You are to know the bitter taste
of others’ bread, how salt it is, and know
how hard a path it is for one who goes
descending and ascending others’ stairs.
119. Next Steps
• StrangeLoop (conference)
https://www.thestrangeloop.com/
• Brian Lonsdorf - Professor Frisby Introduces Composable Functional JavaScript
(video series)
https://egghead.io/courses/professor-frisby-introduces-composable-functional-javascript
• Aditya Bhargava - Functors, Applicatives, And Monads In Pictures (blog post)
http://adit.io/posts/2013-04-17-functors,_applicatives,_and_monads_in_pictures.html
• Scott Wlaschin - F# for Fun and Profit (blog series)
http://fsharpforfunandprofit.com/series/thinking-functionally.html
• Dixin Yan - LINQ and Functional Programming via C# (blog series)
https://weblogs.asp.net/dixin/linq-via-csharp
122. Bibliography
• Barolini, Teodolinda. Commento Baroliniano, Digital Dante. New York, NY: Columbia University Libraries, 2017. https://digitaldante.columbia.edu/dante/divine-comedy/
• Ben-Gan, Itzik. Microsoft SQL Server 2012 High-Performance T-SQL Using Window Functions. Microsoft Press, 2012.
• Byham, Rick. Microsoft Docs. https://docs.microsoft.com/en-us/sql/t-sql/queries/select-transact-sql
• Cook, William R. and Herzman, Ronald B. Dante's Divine Comedy. The Great Courses. http://www.thegreatcourses.com/courses/dante-s-divine-comedy.html
• Equity Regulatory Alert #2016-3: Guidance On Test Stock Usage. https://www.nasdaqtrader.com/MicroNews.aspx?id=ERA2016-3
• Petricek, Tomas. Beyond the Monad fashion (I.): Writing idioms in LINQ. http://tomasp.net/blog/idioms-in-linq.aspx/
• Securities Industry Automation Corporation. New York, 10 September 2015. https://www.nyse.com/publicdocs/ctaplan/notifications/trader-update/CTS_CQS%20%20NEW%20DEDICATED%20TEST%20SYMBOL_09102015.pdf
• Wickham, Hadley, and Garrett Grolemund. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. Sebastopol, CA: O'Reilly Media, 2016. http://r4ds.had.co.nz/
123. Images
• By Domenico di Michelino - Jastrow, Self-photographed, Public Domain, https://commons.wikimedia.org/w/index.php?curid=970608
• By Sailko - Own work, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=49841035
• By Lua - Sotheby's, London, 17 December 2015, lot 22, Public Domain, https://commons.wikimedia.org/w/index.php?curid=32782393
• By Sailko - Own work, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=56266161
• By William Blake - http://www.blakearchive.org/exist/blake/archive/work.xq?workid=but812&java=no, Public Domain, https://commons.wikimedia.org/w/index.php?curid=27118518
• By Salvatore Postiglione - http://www.hampel-auctions.com/en/92-325/onlinecatalog-detail-n229.html, Public Domain, https://commons.wikimedia.org/w/index.php?curid=26540023
• By Giovanni Stradano - Own work, 2007-10-25, Public Domain, https://commons.wikimedia.org/w/index.php?curid=2981981