r2c Documentation
Release 0.0.18
r2c
July 09, 2019
CONTENTS:
1 Getting Started
  1.1 Installation
  1.2 Creating an Analyzer
  1.3 Writing Analysis
  1.4 Running Analysis Locally
  1.5 Running on r2c
  1.6 Analyzing Results
2 API Reference
  2.1 analyzer.json
  2.2 Database Schemas
  2.3 Docker Containers
  2.4 JSON Results
3 Best Practices
4 Troubleshooting
  4.1 The /analysis directory is completely empty
Note: This documentation is for our closed beta as of release 0.0.18. Visit app.r2c.dev to join the waitlist!
r2c is a program analysis platform and CLI for security researchers, hackers, and developers. Run your analysis on our
infrastructure to quickly get results for millions of projects.
The platform offers:
1. scale: run your analysis at large scale across millions of open-source projects and their history
2. re-use: create analysis pipelines with reusable analyzers and shared results
3. determinism: easily define source code targets from GitHub, npm, PyPi, and others
4. collaboration & feedback: collaboratively triage, label, and disclose results
Have questions about the beta, r2c, or our fellowships? We’d love to help! Contact us at hello@r2c.dev.
CHAPTER
ONE
GETTING STARTED
This section will walk you through using r2c-cli and the platform by writing an analyzer that computes the percentage
of whitespace in the source code of 1000 npm packages. We'll use it to find which projects have checked
in minified JavaScript files.
For detailed documentation of commands, options, and formats, see API Reference.
Installation
We’ll start our journey with r2c-cli, which is used to develop and test analyzers locally before pushing them to the
platform to run at scale.
Requirements
Note: We support Mac OSX and Ubuntu for local analyzer development.
Please install the following required software:
• docker: a tool for running software in isolated containers
• python3.6+: the programming language needed to run (but not to develop) analyzers
• pip3: the Python package manager used to install and update r2c-cli
When installing docker:
• Do not use snap; it is incompatible with r2c-cli
• For Ubuntu users, we highly encourage setting up Docker for non-root users. You need to run docker as the
same user you'll run r2c-cli with, and we discourage running r2c-cli as root or as a system administrator
Run the following to test docker, python3, and pip3 installation:
$ docker run hello-world
$ python3 --version
$ pip3 --version
Getting r2c-cli
It’s easy! Just run:
$ pip3 install r2c-cli
To verify the r2c-cli installation, run:
$ r2c --help
Usage: r2c [OPTIONS] COMMAND [ARGS]...
Options:
--debug Show extra output, error messages, and exception
stack traces
--version Show current version of r2c cli.
--no-traverse-manifest Don't attempt to find an analyzer.json if it doesn't
exist in the current or specified directory
--help Show this message and exit.
Commands:
build Builds an analyzer without running it.
init Creates an example analyzer for analyzing JavaScript/TypeScript.
login Log in to the R2C analysis platform.
logout Log out of the R2C analysis platform.
push Push the analyzer in the current directory to the R2C platform.
run Run the analyzer in the current directory over a code directory.
test Locally run integration tests for the current analyzer.
unittest Locally run unit tests for the current analyzer directory.
If the help prompt prints, you’re ready to move on to Creating an Analyzer.
Creating an Analyzer
Creating the Boilerplate
In a directory where you want to create your new analyzer, run:
$ r2c init
For the Analyzer name prompt, enter <YOUR-USERNAME>-minifinder. For the rest of the prompts we’ll
use the defaults.
$ r2c init
Analyzer name (can only contain lowercase letters, numbers or - and _): <YOUR-USERNAME>-minifinder
Author name [Jav A. Script]:
Author email [hello@example.com]:
Will your analyzer produce:
- output for a particular `git` repository
- output for a particular git `commit` in a repo
- the same `constant` output regardless of commit or repo? (git, commit, constant) [commit]:
Does your analyzer output
- a single schema-compliant JSON file
- a full filesystem output? (json, filesystem) [json]:
✓ Done! Your analyzer can be found in the <YOUR-USERNAME>-minifinder directory
Check it out by changing to the new folder:
$ cd <YOUR-USERNAME>-minifinder/
Understanding Analyzer Files
The init command created several files in the directory you initialized:
.
├── Dockerfile
├── README.md
├── analyzer.json
└── src
    ├── analyze.sh
    └── unittest.sh
Each of these files is used by the r2c system in a different way.
• analyzer.json defines how your analyzer will interact with the r2c system and tools. Some important
values in this file are:
  – analyzer_name: The namespaced analyzer name. For the beta, all analyzers must be namespaced with
    beta to be uploaded to the platform.
  – version: The version of the analyzer. Versions should follow semantic versioning. r2c uses the analyzer
    name, version, and other parameters for caching: when running an analyzer at scale, all of these are used to
    determine if the analysis has already been run on that piece of code.
  – dependencies: This analyzer depends on the most basic component, public/source-code. It specifies
    that it depends on any version of the public/source-code component by specifying "*" as the
    version. For more complicated analysis, we could depend on components such as r2c/transpiler or
    r2c/typed-ast, which are beyond the scope of this tutorial.
• Dockerfile is responsible for the container’s setup and configuration. In this file you can install dependencies
to build and run your analysis. To learn more about Dockerfiles in general, see Docker’s tutorial.
Note: Though it can be tempting to use images like node:latest, most analyzers should be deterministic and
therefore benefit from pinning their base image to a specific version. For more information, see Best Practices.
• src/analyze.sh is the main entry point. From this file, we’ll run your program that performs analysis!
• src/unittest.sh runs your analyzer’s unit tests, if it has them, inside the container; call them from
this file.
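To make that caching idea concrete, here is a minimal sketch — our own illustration, not the platform's actual implementation; the function name and key scheme are assumptions — of how a stable cache key could be derived from those parameters:

```python
import hashlib
import json

def cache_key(analyzer_name, version, commit_hash):
    """Hypothetical: combine the parameters r2c uses for caching into one stable key."""
    payload = json.dumps(
        {"analyzer": analyzer_name, "version": version, "commit": commit_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Identical inputs always map to the same key, so a previously computed
# result can be reused; bumping the version forces a fresh run.
k1 = cache_key("beta/minifinder", "0.1.0", "abc123")
k2 = cache_key("beta/minifinder", "0.1.0", "abc123")
k3 = cache_key("beta/minifinder", "0.2.0", "abc123")
```

This is also why, as described above, a given analyzer name and version pair corresponds to exactly one set of results for a piece of code.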
Once you’ve checked out those files, let’s move on to Writing Analysis.
Writing Analysis
For this tutorial, we’re writing an analysis that reports how much of each file in a project is whitespace. We’ll use it
to find which projects have checked in minified JavaScript files.
Setting up the Container
We will write this analysis in the Python programming language. Though we are using Python for this tutorial, you
can write analysis in any programming language as each docker container can have different software installed.
Before we can write Python, we’ll need to install it in our docker container. To do this, add the following line to the
project’s Dockerfile:
FROM ubuntu:17.10

RUN apt update && apt install -y python3 \
    && rm -rf /var/lib/apt/lists/*

RUN groupadd -r analysis && useradd -m --no-log-init --gid analysis analysis

USER analysis
COPY src /analyzer

WORKDIR /
CMD ["/analyzer/analyze.sh"]
When we edit our code later, the container will automatically be rebuilt by r2c run.
Writing the Code
We need to be able to count both whitespace and non-whitespace characters in a given file. We can do this with a
simple regular expression in Python. Create a file src/whitespace.py with the following contents:
import json
import re
import sys

WHITESPACE_RE = re.compile(r"\s")


def count_whitespace(path):
    print("Counting whitespace in file {}".format(path))
    with open(path, "r", encoding="utf-8") as file:
        data = file.read()
    result = {}
    result["check_id"] = "whitespace"
    result["path"] = path
    result["extra"] = {}
    result["extra"]["size"] = len(data)
    result["extra"]["num_whitespace"] = len(WHITESPACE_RE.findall(data))
    return result


all_results = []
for path in sys.argv[1:]:
    all_results.append(count_whitespace(path))

with open("/analysis/output/output.json", "w") as output:
    output.write(json.dumps({"results": all_results}, sort_keys=True, indent=4))
This file computes the number of whitespace characters and total characters in each file in its input. When we run our
analyzer, we want to run this file with all JavaScript input files as arguments.
We write this object to /analysis/output/output.json because this is a JSON-type analyzer. r2c also
supports filesystem type analyzers, which modify or augment their input but want to preserve a filesystem structure or
output large binary data, e.g. neural net training results. Most analysis eventually leads to JSON output, because JSON
output is what gets consumed by r2c's other tools.
To get just JavaScript files, we’ll use the find program on our mounted source-code directory. Change
src/analyze.sh to look like this:
#!/bin/bash

set -e
CODE_DIR="/analysis/inputs/public/source-code"

cd ${CODE_DIR}

find . -type f -name '*.js' -print0 | xargs -0 python3 /analyzer/whitespace.py
First, we change to the directory where our source code is checked out. That folder is
/analysis/inputs/public/source-code/ inside the docker container. This location is a result of
minifinder depending on the source-code component (configured in analyzer.json). For more information
about dependencies and locating their output, see API Reference.
Then, we use the find command to find all files that end in .js and use xargs to pass all of those file paths
as arguments to our Python program.
Note: Though we wrote our Python in a file src/whitespace.py, inside of src/analyze.sh we invoke it at
the path /analyzer/whitespace.py. This is because our Dockerfile's COPY instruction copies the src folder to
the /analyzer folder inside the container.
Now that we’ve written our code, let’s try Running Analysis Locally.
Running Analysis Locally
Let’s test our analyzer! First we’ll need a target JavaScript project to test it on. Let’s use the popular left-pad project.
Make a folder to keep our test projects in and clone left-pad to the folder:
$ mkdir ~/test-repos
$ git clone https://github.com/stevemao/left-pad ~/test-repos/left-pad
Now we can run our analyzer:
$ r2c run ~/test-repos/left-pad
We get the following output:
$ r2c run ~/test-repos/left-pad/
Running analyzer...
Counting whitespace in file ./test.js
Counting whitespace in file ./perf/es6Repeat.js
Counting whitespace in file ./perf/perf.js
Counting whitespace in file ./perf/O(n).js
Counting whitespace in file ./index.js
{
"results": [
...
Notice how the print statements in our Python code appear as if we were running the program outside of r2c. This
makes it easy to debug programs normally. The full JSON output of the program should look something like this:
{
    "results": [
        {
            "check_id": "whitespace",
            "extra": {
                "num_whitespace": 638,
                "size": 3673
            },
            "path": "./test.js"
        },
        {
            "check_id": "whitespace",
            "extra": {
                "num_whitespace": 61,
                "size": 216
            },
            "path": "./perf/es6Repeat.js"
        },
        {
            "check_id": "whitespace",
            "extra": {
                "num_whitespace": 213,
                "size": 1442
            },
            "path": "./perf/perf.js"
        },
        {
            "check_id": "whitespace",
            "extra": {
                "num_whitespace": 77,
                "size": 241
            },
            "path": "./perf/O(n).js"
        },
        {
            "check_id": "whitespace",
            "extra": {
                "num_whitespace": 360,
                "size": 1137
            },
            "path": "./index.js"
        }
    ]
}
Cool! Now we can hunt for minified files. All of these files look reasonable; though we don’t know exactly what our
threshold should be, minified files will probably be less than 5 to 10 percent whitespace. To find minified files in top
projects and get a sense of what cut-off point to use, we’ll want to run this analyzer at scale against npm packages.
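Before moving on, we can sanity-check that threshold intuition locally with the same regular expression the analyzer uses. This is a standalone sketch; the two JavaScript snippets are made up for illustration:

```python
import re

WHITESPACE_RE = re.compile(r"\s")

def percent_whitespace(code):
    """Fraction of characters in `code` that are whitespace."""
    return len(WHITESPACE_RE.findall(code)) / len(code) if code else 0

formatted = "function add(a, b) {\n    return a + b;\n}\n"
minified = "function add(a,b){return a+b}"

# The formatted snippet is roughly a third whitespace, while the minified
# version falls well under the 5 to 10 percent range we guessed at above.
print(percent_whitespace(formatted))
print(percent_whitespace(minified))
```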
To get started, head on over to Running on r2c.
Running on r2c
Note: The following sections require you to be a member of the r2c beta.
Logging in to r2c
To publish and download analyzers you’ll need to log in to the r2c platform.
$ r2c login
Please enter your org name, or to use the common r2c platform, press enter [wildwest]:
Opening web browser to get login token. Do you want to continue? [Y/n]:
trying to open https://app.r2c.dev/settings/token in your browser...
Please enter the API token: ...
Follow the on-screen instructions, accepting the default wildwest org name (which is tied to the beta).
Pushing to r2c
Before uploading our analyzer, let’s take a quick look at it to make sure everything is ready. Once published, analyzer
versions can’t be unpublished. For larger analyzers, this would also be the time to run unit and integration tests.
For this tutorial analyzer, we’ll just sanity-check the fields in analyzer.json:
{
    "analyzer_name": "[YOUR GROUP]/minifinder",
    "author_name": "Jav. A. Script",
    "author_email": "hello@example.com",
    "version": "0.0.1",
    "spec_version": "1.2.0",
    "dependencies": {
        "public/source-code": "*"
    },
    "type": "commit",
    "output": {
        "type": "json"
    },
    "deterministic": true
}
Everything looks mostly good. However, to follow best practices, our analyzer should use Semantic Versioning. As
this is our first release and we're not yet sure everything is production-ready, we should designate this release version
0.1.0:
"version": "0.1.0",
Great! Now we’re all set. We can push our analyzer to r2c by running the following command from within the analyzer
directory:
$ r2c push
That’s it! Head on over to app.r2c.dev to run your analyzer on an npm input set and dive into the results.
Starting the Job
Once you’ve logged in to the platform you’ll be taken to the Jobs page, which will look similar to this:
Click the “Create Job” button, which will pull up a menu letting you select the analyzer you’ve just pushed!
Now we’ll select an input set to run our analyzer on: r2c npm 1000, 2019-04-01.
Click “Run Job” to start the analysis! Your job will be added to the jobs list, where you can click on it to see output,
console logs, and errors coming in in real time!
See Analyzing Results for next steps.
Note: When a job is first kicked off, the infrastructure may need to “warm up”; idle machines will start processing
your job and new machines will be brought online to handle the demand. Once warmed up a job should proceed
quickly and you’ll get results within just a few minutes for quick analyzers like the tutorial.
Note: We currently limit your container to running for 5 minutes with 1.5 GB of memory. Most analyzers don't
hit these limits; if you believe you're running into them, please let us know and we can adjust them for your
research.
Analyzing Results
Note: The following instructions are for analyzers with output type json. See analyzer.json to learn more about this
and other output types.
Now that your analyzer has been run in the web UI (and completed successfully, we hope!) we’ll write some Python
to load and graph the results.
When you run an analyzer on the r2c platform we store the results in a PostgreSQL database. To interact with this
database, we’ll use a mix of sqlalchemy, psycopg2, pandas, and matplotlib Python libraries. Let’s install these in your
local environment:
$ pip3 install sqlalchemy psycopg2-binary pandas matplotlib
Next, create a file in a scratch directory called minifinder-results.py:
$ touch minifinder-results.py
Copy the following code into minifinder-results.py and update the constants with your org information:
import pandas as pd
import psycopg2  # must be installed: the PostgreSQL driver used by sqlalchemy
from sqlalchemy import create_engine
import matplotlib.pyplot as plt

###################################
# EDIT THESE CONSTANTS
###################################

DB_PASSWORD = "DB-PASSWORD-FROM-EMAIL"
AUTHOR_NAME = "YOUR-NAME"

ANALYZER_NAME = f"beta/{AUTHOR_NAME}-minifinder"
ANALYZER_VERSION = "0.1.0"
CORPUS_NAME = "r2c-1000"

###################################
# END EDIT SECTION
###################################

# Canonical SQL query to get job-specific results back.
JOB_QUERY = """
SELECT *
FROM result,
     commit_corpus
WHERE result.commit_hash = commit_corpus.commit_hash
  AND analyzer_name = %(analyzer_name)s
  AND analyzer_version = %(analyzer_version)s
  AND corpus_name = %(corpus_name)s
"""

QUERY_PARAMS = {
    "corpus_name": CORPUS_NAME,
    "analyzer_name": ANALYZER_NAME,
    "analyzer_version": ANALYZER_VERSION,
}

# Connect to PostgreSQL host and query for job-specific results
engine = create_engine(f"postgresql://notebook_user:{DB_PASSWORD}@db.r2c.dev/postgres")
job_df = pd.read_sql(JOB_QUERY, engine, params=QUERY_PARAMS)

# Print pandas dataframe to stdout for debugging
print("Raw job dataframe:")
print(job_df[1:10])


# Helper method to compute % whitespace from the num_whitespace and size fields in our 'extra' column
def get_percent_whitespace(row):
    size = row.extra["size"]
    # Avoid 'division by zero' exceptions.
    return row.extra["num_whitespace"] / size if size else 0


# Add 'percent_whitespace' column
job_df["percent_whitespace"] = job_df.apply(get_percent_whitespace, axis=1)

# Create a histogram of our data using the `percent_whitespace` column
job_df.hist(column="percent_whitespace", bins=100)
plt.show()
Finally, run minifinder-results.py, which will produce a graph of results:
$ python3 minifinder-results.py
Success! You’ve gone from a question, “how much of each file in a project is whitespace”, to analyzing 1000 npm
projects, to graphing the results!
Next up is writing your own analyzer, interrogating your data, and making the JavaScript ecosystem a better place. If
you have any questions or concerns, please don't hesitate to reach out to us at hello@r2c.dev.
CHAPTER
TWO
API REFERENCE
The r2c analysis API has three main parts: the analyzer.json manifest file, which specifies how the analyzer
interacts with the rest of the r2c system; the Docker container, in which the analyzer runs and which determines how r2c expects
programs in the container to behave; and the final output formats of the analysis, which go on to be used by other parts of
the r2c toolchain.
analyzer.json
The current version of the analyzer.json specification is 1.0.0. This version supports the following fields in
the top-level JSON object:
analyzer_name [string] The name of the analyzer. All analyzers must belong to an org, e.g. r2c/transpiler.
author_name [string] The name of the author.
author_email [string] A contact address for the author.
version [string] The semantic version string representing the current analyzer version.
Note: Analyzer versions are used for caching and result correctness. A given version of a deterministic analyzer run
on the same input is expected to always produce the same result. One consequence of this is that you cannot overwrite
or push multiple of the same version for a given analyzer.
spec_version [string] The version of the analyzer specification this analyzer is compatible with. To match the version
of the specification documented on this page, this should always be 1.0.0.
dependencies [object] This is the main mechanism by which analyzers interact with other analysis components
in the r2c system. Each key in this object is the name of another analyzer, such as r2c/transpiler.
Each value is the version of the analyzer you want to use. An analyzer’s output for the same com-
mit or repository appears under its name in /analysis/inputs when your container runs, e.g.
/analysis/inputs/r2c/transpiler.
type [“constant”, “commit”, or “git”] The type of input for which this analyzer produces a unique output.
• A constant analyzer always produces the same output. This can be useful for the trained data for a
machine learning based analysis, by having the analyzer depend on its trained weights separately from the
actual analysis.
• A commit analyzer produces output for a JavaScript project at a specific point in its history, a single
instance of code. This analyzer should not expect to make use of any version control features. Most
analyzers are commit analyzers.
• A git analyzer understands the evolution and history of a repository. It produces unique outputs for an
entire commit graph or bare git checkout. These analyzers can directly compare trends in a project.
output [“json”, “filesystem”, or “both”] The type of output this analyzer produces.
• json output is the most common output format. For the structure of the resulting JSON, see JSON
Results.
• filesystem output allows the analyzer to output an arbitrary directory and file structure. Hard links
are not supported. These analyzers may do such things as transpile code, install dependencies, or produce
binary or other data from an analysis that would not be serializable as JSON. They lose the benefit of parts
of the r2c toolchain that understand the JSON format, so most analysis should eventually end up as JSON.
• The both value indicates that both json and filesystem output appear together. This is useful for
describing properties of the filesystem output to query the results.
deterministic [boolean] The current version of the r2c platform does not support non-deterministic analyzers. However,
this feature is part of analyzer specification version 1.0.0 and will be implemented soon.
Database Schemas
Note: We recommend psql for raw querying and exploration of data or the Python library sqlalchemy for data analysis
in environments like Jupyter Notebook. See Analyzing Results for more details and connection strings.
Results from analyzers with output type json (discussed in analyzer.json) are stored in our PostgreSQL database. There are two
primary tables:
1. result
2. commit_corpus
result
The schema for the result table closely follows the format of JSON Results.
Column | Type
------------------+---------
id | integer
analyzer_name | text
analyzer_version | text
commit_hash | text
check_id | text
path | text
start_line | integer
start_col | integer
end_line | integer
end_col | integer
extra | jsonb
commit_corpus
Column | Type
-------------+------
commit_hash | text
repo_url | text
corpus_name | text
Docker Containers
The Dockerfile for r2c analyzers allows you to customize your operating system, software, and filesystem as much as
you like. However, r2c expects a few things of these containers to be able to run analysis, and a few more things for a
smooth experience when developing analysis on your local machine.
Filesystem Structure
Dependencies for analyzers are mounted into the container at /analysis/inputs. For more on specifying depen-
dencies, see manifest.
All dependency names must also be valid Linux filesystem paths; this enables r2c to mount each dependency
in a subfolder matching the org and analyzer name. For example, for the transpiler analyzer published by
r2c, specified in the dependency object as "r2c/transpiler": "1.0.0", its output will appear under
/analysis/inputs/r2c/transpiler.
This behavior differs slightly between filesystem type analyzers and JSON type analyzers. For filesystem type
analyzers, their output appears at that path. If r2c/transpiler is of type filesystem and writes a file named
foo.bin to /analysis/output/output/, then when depending on r2c/transpiler, that file would
exist at /analysis/inputs/r2c/transpiler/foo.bin. If, on the other hand, r2c/transpiler is
of type JSON and writes its output to /analysis/output/output.json, that JSON file would exist at
/analysis/inputs/r2c/transpiler.json.
Above we allude to /analysis/output/; this is the folder where analyzers are expected to write their output. For
filesystem type analyzers, the contents of the folder /analysis/output/output/ will be saved. For JSON type
analyzers, the contents of /analysis/output/output.json will be saved.
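As a rough illustration of that layout — the helper below is our own, not part of any r2c API — the mapping from a dependency's name and output type to its mount path inside a downstream container could be written as:

```python
def dependency_input_path(dep_name, output_type):
    """Where a dependency's saved output appears inside the container.

    filesystem-type dependencies mount as a directory tree; json-type
    dependencies appear as a single <org>/<name>.json file.
    """
    base = "/analysis/inputs/" + dep_name
    return base if output_type == "filesystem" else base + ".json"

print(dependency_input_path("r2c/transpiler", "filesystem"))
# /analysis/inputs/r2c/transpiler
print(dependency_input_path("r2c/transpiler", "json"))
# /analysis/inputs/r2c/transpiler.json
```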
Entry Point Behavior
Analyzers may specify any entry point (default Docker CMD) they wish. By default, the r2c init template sets this to
/analyzer/analyze.sh. However, this command could instead be a java -jar command, a Python program,
or any other valid command. The only expectation r2c has of this command is that if it exits with a non-zero return
code, the analysis is considered to have failed and its output will not be saved.
User Permissions
It is considered bad practice to run as the root user inside Docker containers. It is also considered bad practice
to develop r2c analyzers as root when running locally. This leads to one complication: files created by the
user inside the container may have permissions that prevent the local user running r2c from cleaning up
the temporary files.
To avoid this, the default r2c template takes arguments that create a user with the same ID as the current user on the
host machine. This happens once at build time, and does not affect the ability of the analyzer to run at scale on r2c
infrastructure, but avoids files being created that require extra permission to clean up when running locally. If you need
to, you may have analysis inside the Docker container run as any user you like, though it is often not good practice to
do so.
JSON Results
The JSON results produced by your analysis must conform to the output.json specification. As of 1.0.0, there
are two top-level fields: results (required) and errors (optional):
{
"results": [],
"errors": []
}
results
The results key is a required top-level field containing an array of 0 or more result objects that specify the type
of result and its location (if applicable).
A sample results response from our whitespace-finding analyzer in Running Analysis Locally is:
{
    "results": [
        {
            "check_id": "whitespace",
            "path": "./perf/O(n).js",
            "extra": {
                "num_whitespace": 77,
                "size": 241
            }
        }
    ]
}
Each result object supports the check_id, path, and extra fields shown above, plus optional start and end location
objects whose line and column positions mirror the start_line, start_col, end_line, and end_col columns described in Database Schemas.
errors
Optional field containing an array of 0 or more error objects with error messages and related data.
A sample errors response looks like:
{
"results":[],
"errors": [
{
"message": "Cyclomatic complexity limit reached.",
"data": {
"path": "foobar.js"
}
}
]
}
Each error object supports the message and data fields shown above.
CHAPTER
THREE
BEST PRACTICES
Coming soon!
CHAPTER
FOUR
TROUBLESHOOTING
The /analysis directory is completely empty
If the /analysis directory doesn’t even have inputs and output subdirectories, then Docker is probably having
issues mounting those folders. To verify, run
$ tempdir=$(mktemp -d)
$ touch $tempdir/data
$ docker run --volume $tempdir:/dummy alpine:latest ls -l /dummy
You should see a single line of output:
total 0
If this is the case, then Docker can’t mount folders inside /tmp. This can happen if you installed Docker via the snap
package manager (i.e., which docker shows a path starting with /snap). If you did that, you’ll need to run sudo
snap remove docker and then reinstall Docker via your package manager or a similar method.
If you see two lines of output like
total 0
-rw-r--r-- 1 root root 0 Apr 15 21:47 data
then the issue is something else. In that case, please contact us and we’ll be happy to assist.
JavaDay Kiev 2017 - Integration testing with TestContainers
 
10 Cool Facts about Gradle
10 Cool Facts about Gradle10 Cool Facts about Gradle
10 Cool Facts about Gradle
 
Malware analysis using volatility
Malware analysis using volatilityMalware analysis using volatility
Malware analysis using volatility
 
Security and dev ops for high velocity organizations
Security and dev ops for high velocity organizationsSecurity and dev ops for high velocity organizations
Security and dev ops for high velocity organizations
 
Kernel Recipes 2018 - A year of fixing Coverity issues all over the Linux ker...
Kernel Recipes 2018 - A year of fixing Coverity issues all over the Linux ker...Kernel Recipes 2018 - A year of fixing Coverity issues all over the Linux ker...
Kernel Recipes 2018 - A year of fixing Coverity issues all over the Linux ker...
 
The Gradle in Ratpack: Dissected
The Gradle in Ratpack: DissectedThe Gradle in Ratpack: Dissected
The Gradle in Ratpack: Dissected
 
Kernel Recipes 2018 - 10 years of automated evolution in the Linux kernel - J...
Kernel Recipes 2018 - 10 years of automated evolution in the Linux kernel - J...Kernel Recipes 2018 - 10 years of automated evolution in the Linux kernel - J...
Kernel Recipes 2018 - 10 years of automated evolution in the Linux kernel - J...
 
Deployment With Subversion - Lorna Mitchell
Deployment With Subversion - Lorna MitchellDeployment With Subversion - Lorna Mitchell
Deployment With Subversion - Lorna Mitchell
 
Open Source tools overview
Open Source tools overviewOpen Source tools overview
Open Source tools overview
 
Introduction of k8s rancher
Introduction of k8s rancherIntroduction of k8s rancher
Introduction of k8s rancher
 

Similar to R2C Documentation: Getting Started with Writing Your First Analyzer

Reproducible Research in R and R Studio
Reproducible Research in R and R StudioReproducible Research in R and R Studio
Reproducible Research in R and R StudioSusan Johnston
 
Openobject install
Openobject installOpenobject install
Openobject installAli Mashduqi
 
Architecting the Future: Abstractions and Metadata - STL SilverLinings
Architecting the Future: Abstractions and Metadata - STL SilverLiningsArchitecting the Future: Abstractions and Metadata - STL SilverLinings
Architecting the Future: Abstractions and Metadata - STL SilverLiningsDaniel Barker
 
Analysis of merge requests in GitLab using PVS-Studio for C#
Analysis of merge requests in GitLab using PVS-Studio for C#Analysis of merge requests in GitLab using PVS-Studio for C#
Analysis of merge requests in GitLab using PVS-Studio for C#Andrey Karpov
 
Code quality par Simone Civetta
Code quality par Simone CivettaCode quality par Simone Civetta
Code quality par Simone CivettaCocoaHeads France
 
Investigation report on 64 bit support and some of new features in aosp master
Investigation report on 64 bit support and some of new features in aosp masterInvestigation report on 64 bit support and some of new features in aosp master
Investigation report on 64 bit support and some of new features in aosp masterhidenorly
 
Openshift cheat rhce_r3v1 rhce
Openshift cheat rhce_r3v1 rhceOpenshift cheat rhce_r3v1 rhce
Openshift cheat rhce_r3v1 rhceDarnette A
 
Container Monitoring with Sysdig
Container Monitoring with SysdigContainer Monitoring with Sysdig
Container Monitoring with SysdigSreenivas Makam
 
Architecting the Future: Abstractions and Metadata - CodeStock
Architecting the Future: Abstractions and Metadata - CodeStockArchitecting the Future: Abstractions and Metadata - CodeStock
Architecting the Future: Abstractions and Metadata - CodeStockDaniel Barker
 
Spinnaker Summit 2018: CI/CD Patterns for Kubernetes with Spinnaker
Spinnaker Summit 2018: CI/CD Patterns for Kubernetes with SpinnakerSpinnaker Summit 2018: CI/CD Patterns for Kubernetes with Spinnaker
Spinnaker Summit 2018: CI/CD Patterns for Kubernetes with SpinnakerAndrew Phillips
 
OpenSCAP Overview(security scanning for docker image and container)
OpenSCAP Overview(security scanning for docker image and container)OpenSCAP Overview(security scanning for docker image and container)
OpenSCAP Overview(security scanning for docker image and container)Jooho Lee
 
Extending OpenShift Origin: Build Your Own Cartridge with Bill DeCoste of Red...
Extending OpenShift Origin: Build Your Own Cartridge with Bill DeCoste of Red...Extending OpenShift Origin: Build Your Own Cartridge with Bill DeCoste of Red...
Extending OpenShift Origin: Build Your Own Cartridge with Bill DeCoste of Red...OpenShift Origin
 
Automate drupal deployments with linux containers, docker and vagrant
Automate drupal deployments with linux containers, docker and vagrant Automate drupal deployments with linux containers, docker and vagrant
Automate drupal deployments with linux containers, docker and vagrant Ricardo Amaro
 

Similar to R2C Documentation: Getting Started with Writing Your First Analyzer (20)

Reproducible Research in R and R Studio
Reproducible Research in R and R StudioReproducible Research in R and R Studio
Reproducible Research in R and R Studio
 
Openobject install
Openobject installOpenobject install
Openobject install
 
Installation d openerp
Installation d openerpInstallation d openerp
Installation d openerp
 
ABCs of docker
ABCs of dockerABCs of docker
ABCs of docker
 
ROS ROS
ROS ROSROS ROS
ROS ROS
 
Architecting the Future: Abstractions and Metadata - STL SilverLinings
Architecting the Future: Abstractions and Metadata - STL SilverLiningsArchitecting the Future: Abstractions and Metadata - STL SilverLinings
Architecting the Future: Abstractions and Metadata - STL SilverLinings
 
Analysis of merge requests in GitLab using PVS-Studio for C#
Analysis of merge requests in GitLab using PVS-Studio for C#Analysis of merge requests in GitLab using PVS-Studio for C#
Analysis of merge requests in GitLab using PVS-Studio for C#
 
Code quality par Simone Civetta
Code quality par Simone CivettaCode quality par Simone Civetta
Code quality par Simone Civetta
 
Investigation report on 64 bit support and some of new features in aosp master
Investigation report on 64 bit support and some of new features in aosp masterInvestigation report on 64 bit support and some of new features in aosp master
Investigation report on 64 bit support and some of new features in aosp master
 
textconfig
textconfigtextconfig
textconfig
 
Openshift cheat rhce_r3v1 rhce
Openshift cheat rhce_r3v1 rhceOpenshift cheat rhce_r3v1 rhce
Openshift cheat rhce_r3v1 rhce
 
Container Monitoring with Sysdig
Container Monitoring with SysdigContainer Monitoring with Sysdig
Container Monitoring with Sysdig
 
Introduction to Docker
Introduction to DockerIntroduction to Docker
Introduction to Docker
 
Architecting the Future: Abstractions and Metadata - CodeStock
Architecting the Future: Abstractions and Metadata - CodeStockArchitecting the Future: Abstractions and Metadata - CodeStock
Architecting the Future: Abstractions and Metadata - CodeStock
 
Spinnaker Summit 2018: CI/CD Patterns for Kubernetes with Spinnaker
Spinnaker Summit 2018: CI/CD Patterns for Kubernetes with SpinnakerSpinnaker Summit 2018: CI/CD Patterns for Kubernetes with Spinnaker
Spinnaker Summit 2018: CI/CD Patterns for Kubernetes with Spinnaker
 
Aci dp
Aci dpAci dp
Aci dp
 
Readme
ReadmeReadme
Readme
 
OpenSCAP Overview(security scanning for docker image and container)
OpenSCAP Overview(security scanning for docker image and container)OpenSCAP Overview(security scanning for docker image and container)
OpenSCAP Overview(security scanning for docker image and container)
 
Extending OpenShift Origin: Build Your Own Cartridge with Bill DeCoste of Red...
Extending OpenShift Origin: Build Your Own Cartridge with Bill DeCoste of Red...Extending OpenShift Origin: Build Your Own Cartridge with Bill DeCoste of Red...
Extending OpenShift Origin: Build Your Own Cartridge with Bill DeCoste of Red...
 
Automate drupal deployments with linux containers, docker and vagrant
Automate drupal deployments with linux containers, docker and vagrant Automate drupal deployments with linux containers, docker and vagrant
Automate drupal deployments with linux containers, docker and vagrant
 

More from Felipe Prado

DEF CON 24 - Sean Metcalf - beyond the mcse red teaming active directory
DEF CON 24 - Sean Metcalf - beyond the mcse red teaming active directoryDEF CON 24 - Sean Metcalf - beyond the mcse red teaming active directory
DEF CON 24 - Sean Metcalf - beyond the mcse red teaming active directoryFelipe Prado
 
DEF CON 24 - Bertin Bervis and James Jara - exploiting and attacking seismolo...
DEF CON 24 - Bertin Bervis and James Jara - exploiting and attacking seismolo...DEF CON 24 - Bertin Bervis and James Jara - exploiting and attacking seismolo...
DEF CON 24 - Bertin Bervis and James Jara - exploiting and attacking seismolo...Felipe Prado
 
DEF CON 24 - Tamas Szakaly - help i got ants
DEF CON 24 - Tamas Szakaly - help i got antsDEF CON 24 - Tamas Szakaly - help i got ants
DEF CON 24 - Tamas Szakaly - help i got antsFelipe Prado
 
DEF CON 24 - Ladar Levison - compelled decryption
DEF CON 24 - Ladar Levison - compelled decryptionDEF CON 24 - Ladar Levison - compelled decryption
DEF CON 24 - Ladar Levison - compelled decryptionFelipe Prado
 
DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101Felipe Prado
 
DEF CON 24 - Chris Rock - how to overthrow a government
DEF CON 24 - Chris Rock - how to overthrow a governmentDEF CON 24 - Chris Rock - how to overthrow a government
DEF CON 24 - Chris Rock - how to overthrow a governmentFelipe Prado
 
DEF CON 24 - Fitzpatrick and Grand - 101 ways to brick your hardware
DEF CON 24 - Fitzpatrick and Grand - 101 ways to brick your hardwareDEF CON 24 - Fitzpatrick and Grand - 101 ways to brick your hardware
DEF CON 24 - Fitzpatrick and Grand - 101 ways to brick your hardwareFelipe Prado
 
DEF CON 24 - Rogan Dawes and Dominic White - universal serial aBUSe remote at...
DEF CON 24 - Rogan Dawes and Dominic White - universal serial aBUSe remote at...DEF CON 24 - Rogan Dawes and Dominic White - universal serial aBUSe remote at...
DEF CON 24 - Rogan Dawes and Dominic White - universal serial aBUSe remote at...Felipe Prado
 
DEF CON 24 - Jay Beale and Larry Pesce - phishing without frustration
DEF CON 24 - Jay Beale and Larry Pesce - phishing without frustrationDEF CON 24 - Jay Beale and Larry Pesce - phishing without frustration
DEF CON 24 - Jay Beale and Larry Pesce - phishing without frustrationFelipe Prado
 
DEF CON 24 - Gorenc Sands - hacker machine interface
DEF CON 24 - Gorenc Sands - hacker machine interfaceDEF CON 24 - Gorenc Sands - hacker machine interface
DEF CON 24 - Gorenc Sands - hacker machine interfaceFelipe Prado
 
DEF CON 24 - Allan Cecil and DwangoAC - tasbot the perfectionist
DEF CON 24 - Allan Cecil and DwangoAC -  tasbot the perfectionistDEF CON 24 - Allan Cecil and DwangoAC -  tasbot the perfectionist
DEF CON 24 - Allan Cecil and DwangoAC - tasbot the perfectionistFelipe Prado
 
DEF CON 24 - Rose and Ramsey - picking bluetooth low energy locks
DEF CON 24 - Rose and Ramsey - picking bluetooth low energy locksDEF CON 24 - Rose and Ramsey - picking bluetooth low energy locks
DEF CON 24 - Rose and Ramsey - picking bluetooth low energy locksFelipe Prado
 
DEF CON 24 - Rich Mogull - pragmatic cloud security
DEF CON 24 - Rich Mogull - pragmatic cloud securityDEF CON 24 - Rich Mogull - pragmatic cloud security
DEF CON 24 - Rich Mogull - pragmatic cloud securityFelipe Prado
 
DEF CON 24 - Grant Bugher - Bypassing captive portals
DEF CON 24 - Grant Bugher - Bypassing captive portalsDEF CON 24 - Grant Bugher - Bypassing captive portals
DEF CON 24 - Grant Bugher - Bypassing captive portalsFelipe Prado
 
DEF CON 24 - Patrick Wardle - 99 problems little snitch
DEF CON 24 - Patrick Wardle - 99 problems little snitchDEF CON 24 - Patrick Wardle - 99 problems little snitch
DEF CON 24 - Patrick Wardle - 99 problems little snitchFelipe Prado
 
DEF CON 24 - Plore - side -channel attacks on high security electronic safe l...
DEF CON 24 - Plore - side -channel attacks on high security electronic safe l...DEF CON 24 - Plore - side -channel attacks on high security electronic safe l...
DEF CON 24 - Plore - side -channel attacks on high security electronic safe l...Felipe Prado
 
DEF CON 24 - Six Volts and Haystack - cheap tools for hacking heavy trucks
DEF CON 24 - Six Volts and Haystack - cheap tools for hacking heavy trucksDEF CON 24 - Six Volts and Haystack - cheap tools for hacking heavy trucks
DEF CON 24 - Six Volts and Haystack - cheap tools for hacking heavy trucksFelipe Prado
 
DEF CON 24 - Dinesh and Shetty - practical android application exploitation
DEF CON 24 - Dinesh and Shetty - practical android application exploitationDEF CON 24 - Dinesh and Shetty - practical android application exploitation
DEF CON 24 - Dinesh and Shetty - practical android application exploitationFelipe Prado
 
DEF CON 24 - Klijnsma and Tentler - stargate pivoting through vnc
DEF CON 24 - Klijnsma and Tentler - stargate pivoting through vncDEF CON 24 - Klijnsma and Tentler - stargate pivoting through vnc
DEF CON 24 - Klijnsma and Tentler - stargate pivoting through vncFelipe Prado
 
DEF CON 24 - Antonio Joseph - fuzzing android devices
DEF CON 24 - Antonio Joseph - fuzzing android devicesDEF CON 24 - Antonio Joseph - fuzzing android devices
DEF CON 24 - Antonio Joseph - fuzzing android devicesFelipe Prado
 

More from Felipe Prado (20)

DEF CON 24 - Sean Metcalf - beyond the mcse red teaming active directory
DEF CON 24 - Sean Metcalf - beyond the mcse red teaming active directoryDEF CON 24 - Sean Metcalf - beyond the mcse red teaming active directory
DEF CON 24 - Sean Metcalf - beyond the mcse red teaming active directory
 
DEF CON 24 - Bertin Bervis and James Jara - exploiting and attacking seismolo...
DEF CON 24 - Bertin Bervis and James Jara - exploiting and attacking seismolo...DEF CON 24 - Bertin Bervis and James Jara - exploiting and attacking seismolo...
DEF CON 24 - Bertin Bervis and James Jara - exploiting and attacking seismolo...
 
DEF CON 24 - Tamas Szakaly - help i got ants
DEF CON 24 - Tamas Szakaly - help i got antsDEF CON 24 - Tamas Szakaly - help i got ants
DEF CON 24 - Tamas Szakaly - help i got ants
 
DEF CON 24 - Ladar Levison - compelled decryption
DEF CON 24 - Ladar Levison - compelled decryptionDEF CON 24 - Ladar Levison - compelled decryption
DEF CON 24 - Ladar Levison - compelled decryption
 
DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101
 
DEF CON 24 - Chris Rock - how to overthrow a government
DEF CON 24 - Chris Rock - how to overthrow a governmentDEF CON 24 - Chris Rock - how to overthrow a government
DEF CON 24 - Chris Rock - how to overthrow a government
 
DEF CON 24 - Fitzpatrick and Grand - 101 ways to brick your hardware
DEF CON 24 - Fitzpatrick and Grand - 101 ways to brick your hardwareDEF CON 24 - Fitzpatrick and Grand - 101 ways to brick your hardware
DEF CON 24 - Fitzpatrick and Grand - 101 ways to brick your hardware
 
DEF CON 24 - Rogan Dawes and Dominic White - universal serial aBUSe remote at...
DEF CON 24 - Rogan Dawes and Dominic White - universal serial aBUSe remote at...DEF CON 24 - Rogan Dawes and Dominic White - universal serial aBUSe remote at...
DEF CON 24 - Rogan Dawes and Dominic White - universal serial aBUSe remote at...
 
DEF CON 24 - Jay Beale and Larry Pesce - phishing without frustration
DEF CON 24 - Jay Beale and Larry Pesce - phishing without frustrationDEF CON 24 - Jay Beale and Larry Pesce - phishing without frustration
DEF CON 24 - Jay Beale and Larry Pesce - phishing without frustration
 
DEF CON 24 - Gorenc Sands - hacker machine interface
DEF CON 24 - Gorenc Sands - hacker machine interfaceDEF CON 24 - Gorenc Sands - hacker machine interface
DEF CON 24 - Gorenc Sands - hacker machine interface
 
DEF CON 24 - Allan Cecil and DwangoAC - tasbot the perfectionist
DEF CON 24 - Allan Cecil and DwangoAC -  tasbot the perfectionistDEF CON 24 - Allan Cecil and DwangoAC -  tasbot the perfectionist
DEF CON 24 - Allan Cecil and DwangoAC - tasbot the perfectionist
 
DEF CON 24 - Rose and Ramsey - picking bluetooth low energy locks
DEF CON 24 - Rose and Ramsey - picking bluetooth low energy locksDEF CON 24 - Rose and Ramsey - picking bluetooth low energy locks
DEF CON 24 - Rose and Ramsey - picking bluetooth low energy locks
 
DEF CON 24 - Rich Mogull - pragmatic cloud security
DEF CON 24 - Rich Mogull - pragmatic cloud securityDEF CON 24 - Rich Mogull - pragmatic cloud security
DEF CON 24 - Rich Mogull - pragmatic cloud security
 
DEF CON 24 - Grant Bugher - Bypassing captive portals
DEF CON 24 - Grant Bugher - Bypassing captive portalsDEF CON 24 - Grant Bugher - Bypassing captive portals
DEF CON 24 - Grant Bugher - Bypassing captive portals
 
DEF CON 24 - Patrick Wardle - 99 problems little snitch
DEF CON 24 - Patrick Wardle - 99 problems little snitchDEF CON 24 - Patrick Wardle - 99 problems little snitch
DEF CON 24 - Patrick Wardle - 99 problems little snitch
 
DEF CON 24 - Plore - side -channel attacks on high security electronic safe l...
DEF CON 24 - Plore - side -channel attacks on high security electronic safe l...DEF CON 24 - Plore - side -channel attacks on high security electronic safe l...
DEF CON 24 - Plore - side -channel attacks on high security electronic safe l...
 
DEF CON 24 - Six Volts and Haystack - cheap tools for hacking heavy trucks
DEF CON 24 - Six Volts and Haystack - cheap tools for hacking heavy trucksDEF CON 24 - Six Volts and Haystack - cheap tools for hacking heavy trucks
DEF CON 24 - Six Volts and Haystack - cheap tools for hacking heavy trucks
 
DEF CON 24 - Dinesh and Shetty - practical android application exploitation
DEF CON 24 - Dinesh and Shetty - practical android application exploitationDEF CON 24 - Dinesh and Shetty - practical android application exploitation
DEF CON 24 - Dinesh and Shetty - practical android application exploitation
 
DEF CON 24 - Klijnsma and Tentler - stargate pivoting through vnc
DEF CON 24 - Klijnsma and Tentler - stargate pivoting through vncDEF CON 24 - Klijnsma and Tentler - stargate pivoting through vnc
DEF CON 24 - Klijnsma and Tentler - stargate pivoting through vnc
 
DEF CON 24 - Antonio Joseph - fuzzing android devices
DEF CON 24 - Antonio Joseph - fuzzing android devicesDEF CON 24 - Antonio Joseph - fuzzing android devices
DEF CON 24 - Antonio Joseph - fuzzing android devices
 

Recently uploaded

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 

Recently uploaded (20)

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 

R2C Documentation: Getting Started with Writing Your First Analyzer

Note: This documentation is for our closed beta as of release 0.0.18. Visit app.r2c.dev to join the waitlist!

r2c is a program analysis platform and CLI for security researchers, hackers, and developers. Run your analysis on our infrastructure to quickly get results for millions of projects.

The platform offers:

1. scale: run your analysis at large scale across millions of open-source projects and their history
2. re-use: create analysis pipelines with reusable analyzers and shared results
3. determinism: easily define source code targets from GitHub, npm, PyPI, and others
4. collaboration & feedback: collaboratively triage, label, and disclose results

Have questions about the beta, r2c, or our fellowships? We'd love to help! Contact us at hello@r2c.dev.
CHAPTER ONE

GETTING STARTED

This section will walk you through using r2c-cli and the platform by writing an analyzer that computes the percentage of whitespace in the source code of 1000 npm packages. We'll use it to find which projects have checked in minified JavaScript files. For detailed documentation of commands, options, and formats, see API Reference.

Installation

We'll start our journey with r2c-cli, which is used to develop and test analyzers locally before pushing them to the platform to run at scale.

Requirements

Note: We support macOS and Ubuntu for local analyzer development.

Please install the following required software:

• docker: a tool for running software in isolated containers
• python3.6+: the programming language needed to run (but not to develop) analyzers
• pip3: the Python package manager used to install and update r2c-cli

When installing docker:

• Do not use snap; it is incompatible with r2c-cli.
• For Ubuntu users, we highly encourage setting up Docker for non-root users. You need to run docker as the same user you'll run r2c-cli with, and we discourage running r2c-cli as root or a system administrator.

Run the following to test the docker, python3, and pip3 installations:

$ docker run hello-world
$ python3 --version
$ pip3 --version

Getting r2c-cli

It's easy! Just run:
$ pip3 install r2c-cli

To verify the r2c-cli installation, run:

$ r2c --help
Usage: r2c [OPTIONS] COMMAND [ARGS]...

Options:
  --debug                 Show extra output, error messages, and exception stack traces
  --version               Show current version of r2c cli.
  --no-traverse-manifest  Don't attempt to find an analyzer.json if it doesn't exist in the current or specified directory
  --help                  Show this message and exit.

Commands:
  build     Builds an analyzer without running it.
  init      Creates an example analyzer for analyzing JavaScript/TypeScript.
  login     Log in to the R2C analysis platform.
  logout    Log out of the R2C analysis platform.
  push      Push the analyzer in the current directory to the R2C platform.
  run       Run the analyzer in the current directory over a code directory.
  test      Locally run integration tests for the current analyzer.
  unittest  Locally run unit tests for the current analyzer directory.

If the help prompt prints, you're ready to move on to Creating an Analyzer.

Creating an Analyzer

Creating the Boilerplate

In a directory where you want to create your new analyzer, run:

$ r2c init

For the Analyzer name prompt, enter <YOUR-USERNAME>-minifinder. For the rest of the prompts we'll use the defaults.

$ r2c init
Analyzer name (can only contain lowercase letters, numbers or - and _): <YOUR-USERNAME>-minifinder
Author name [Jav A. Script]:
Author email [hello@example.com]:
Will your analyzer produce:
  - output for a particular `git` repository
  - output for a particular git `commit` in a repo
  - the same `constant` output regardless of commit or repo?
(git, commit, constant) [commit]:
Does your analyzer output
  - a single schema-compliant JSON file
  - a full filesystem output?
(json, filesystem) [json]:
✓ Done! Your analyzer can be found in the <YOUR-USERNAME>-minifinder directory

Check it out by changing to the new folder:

$ cd <YOUR-USERNAME>-minifinder/
Understanding Analyzer Files

The init command created several files in the directory you initialized:

.
├── Dockerfile
├── README.md
├── analyzer.json
└── src
    ├── analyze.sh
    └── unittest.sh

Each of these files is used by the r2c system in a different way.

• analyzer.json defines how your analyzer will interact with the r2c system and tools. Some important values in this file are:
  – analyzer_name: The namespaced analyzer name. For the beta, all analyzers must be namespaced with beta to be uploaded to the platform.
  – version: The version of the analyzer. Versions should follow semantic versioning. r2c uses the analyzer name, version, and other parameters for caching: when running an analyzer at scale, all of these are used to determine if the analysis has already been run on that piece of code.
  – dependencies: This analyzer depends on the most basic component, public/source-code. It specifies that it depends on any version of the public/source-code component by specifying "*" as the version. For more complicated analysis, we could depend on components such as r2c/transpiler or r2c/typed-ast, which are beyond the scope of this tutorial.
• Dockerfile is responsible for the container's setup and configuration. In this file you can install the dependencies needed to build and run your analysis. To learn more about Dockerfiles in general, see Docker's tutorial.

Note: Though it can be tempting to use images like node:latest, most analyzers should be deterministic and therefore benefit from pinning their base image to a specific version. For more information, see Best Practices.

• src/analyze.sh is the main entry point. From this file, we'll run the program that performs our analysis!
• src/unittest.sh runs your analyzer's unit tests, if it has them, inside the container; call them from this file.

Once you've checked out those files, let's move on to Writing Analysis.
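Putting the fields above together, the generated manifest for this tutorial looks roughly like the following sketch (it mirrors the analyzer.json shown in Pushing to r2c; your analyzer name and author fields will reflect your init answers):

```json
{
    "analyzer_name": "beta/<YOUR-USERNAME>-minifinder",
    "author_name": "Jav A. Script",
    "author_email": "hello@example.com",
    "version": "0.0.1",
    "spec_version": "1.2.0",
    "dependencies": {
        "public/source-code": "*"
    },
    "type": "commit",
    "output": {
        "type": "json"
    },
    "deterministic": true
}
```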
Writing Analysis

For this tutorial, we're writing an analysis that reports how much of each file in a project is whitespace. We'll use it to find which projects have checked in minified JavaScript files.

Setting up the Container

We will write this analysis in the Python programming language. Though we are using Python for this tutorial, you can write analysis in any programming language, as each docker container can have different software installed. Before we can write Python, we'll need to install it in our docker container. To do this, add a line installing python3 to the project's Dockerfile, so it looks like this:
FROM ubuntu:17.10

RUN apt update && apt install -y python3 \
    && rm -rf /var/lib/apt/lists/*

RUN groupadd -r analysis && useradd -m --no-log-init --gid analysis analysis

USER analysis
COPY src /analyzer

WORKDIR /
CMD ["/analyzer/analyze.sh"]

When we edit our code later, the container will automatically be rebuilt by r2c run.

Writing the Code

We need to be able to count both whitespace and non-whitespace characters in a given file. We can do this with a simple regular expression in Python. Create a file src/whitespace.py with the following contents:

import json
import re
import sys

WHITESPACE_RE = re.compile(r"\s")


def count_whitespace(path):
    print("Counting whitespace in file {}".format(path))
    with open(path, "r", encoding="utf-8") as file:
        data = file.read()
    result = {}
    result["check_id"] = "whitespace"
    result["path"] = path
    result["extra"] = {}
    result["extra"]["size"] = len(data)
    result["extra"]["num_whitespace"] = len(WHITESPACE_RE.findall(data))
    return result


all_results = []
for path in sys.argv[1:]:
    all_results.append(count_whitespace(path))

with open("/analysis/output/output.json", "w") as output:
    output.write(json.dumps({"results": all_results}, sort_keys=True, indent=4))

This file computes the number of whitespace characters and total characters in each file in its input. When we run our analyzer, we want to run this file with all JavaScript input files as arguments.

We write this object to /analysis/output/output.json because this is a JSON-type analyzer. r2c also supports filesystem-type analyzers, which modify or augment their input but want to preserve a filesystem structure, or which output large binary data, e.g. neural-net training results. Most analysis eventually leads to JSON output, because JSON output is what gets consumed by r2c's other tools.
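The counting logic in whitespace.py can be sanity-checked outside the container with plain Python (the two sample strings below are made up for illustration):

```python
import re

# Same pattern as whitespace.py: \s matches spaces, tabs, and newlines.
WHITESPACE_RE = re.compile(r"\s")


def whitespace_stats(data):
    """Return (num_whitespace, size) for a source string."""
    return len(WHITESPACE_RE.findall(data)), len(data)


readable = "const add = (a, b) => a + b;\n"
minified = "add=(a,b)=>a+b;mul=(a,b)=>a*b;"

for label, src in [("readable", readable), ("minified", minified)]:
    ws, size = whitespace_stats(src)
    print(label, round(ws / size, 2))
```

On these two strings the readable source is about 31% whitespace while the minified one contains none, which is exactly the gap our analyzer exploits at scale.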
To get just JavaScript files, we'll use the find program on our mounted source-code directory. Change src/analyze.sh to look like this:
#!/bin/bash

set -e
CODE_DIR="/analysis/inputs/public/source-code"

cd ${CODE_DIR}

find . -type f -name '*.js' -print0 | xargs -0 python3 /analyzer/whitespace.py

First, we change to the directory where our source code is checked out. That folder is /analysis/inputs/public/source-code/ inside the docker container. This location is a result of minifinder depending on the source-code component (configured in analyzer.json). For more information about dependencies and locating their output, see API Reference.

Then, we use the find command to find all files that end in .js and use xargs to pass all of those file paths as arguments to our Python program; the -print0/-0 pair keeps this safe even for paths containing spaces.

Note: Though we wrote our Python in the file src/whitespace.py, inside src/analyze.sh we invoke it at the path /analyzer/whitespace.py. This is because our Dockerfile copies the src folder to the /analyzer folder inside the container.

Now that we've written our code, let's try Running Analysis Locally.

Running Analysis Locally

Let's test our analyzer! First we'll need a target JavaScript project to test it on. Let's use the popular left-pad project. Make a folder to keep our test projects in and clone left-pad into it:

$ mkdir ~/test-repos
$ git clone https://github.com/stevemao/left-pad ~/test-repos/left-pad

Now we can run our analyzer:

$ r2c run ~/test-repos/left-pad

We get the following output:

$ r2c run ~/test-repos/left-pad/
Running analyzer...
Counting whitespace in file ./test.js
Counting whitespace in file ./perf/es6Repeat.js
Counting whitespace in file ./perf/perf.js
Counting whitespace in file ./perf/O(n).js
Counting whitespace in file ./index.js
{
    "results": [
    ...

Notice how the print statements in our Python code appear as if we were running the program outside of r2c. This makes it easy to debug programs normally.
The full JSON output of the program should look something like this:

{
    "results": [
        {
            "check_id": "whitespace",
            "extra": {
                "num_whitespace": 638,
                "size": 3673
            },
            "path": "./test.js"
        },
        {
            "check_id": "whitespace",
            "extra": {
                "num_whitespace": 61,
                "size": 216
            },
            "path": "./perf/es6Repeat.js"
        },
        {
            "check_id": "whitespace",
            "extra": {
                "num_whitespace": 213,
                "size": 1442
            },
            "path": "./perf/perf.js"
        },
        {
            "check_id": "whitespace",
            "extra": {
                "num_whitespace": 77,
                "size": 241
            },
            "path": "./perf/O(n).js"
        },
        {
            "check_id": "whitespace",
            "extra": {
                "num_whitespace": 360,
                "size": 1137
            },
            "path": "./index.js"
        }
    ]
}

Cool! Now we can hunt for minified files. All of these files look reasonable; though we don't know exactly what our threshold should be, minified files will probably be less than 5 to 10 percent whitespace. To find minified files in top projects and get a sense of what cut-off point to use, we'll want to run this analyzer at scale against npm packages. To get started, head on over to Running on r2c.

Running on r2c

Note: The following sections require you to be a member of the r2c beta.
Logging in to r2c

To publish and download analyzers you'll need to log in to the r2c platform.

$ r2c login
Please enter your org name, or to use the common r2c platform, press enter [wildwest]:
Opening web browser to get login token. Do you want to continue? [Y/n]:
trying to open https://app.r2c.dev/settings/token in your browser...
Please enter the API token:
...

Follow the on-screen instructions, accepting the default wildwest org name (which is tied to the beta).

Pushing to r2c

Before uploading our analyzer, let's take a quick look at it to make sure everything is ready. Once published, analyzer versions can't be unpublished. For larger analyzers, this would also be the time to run unit and integration tests. For this tutorial analyzer, we'll just sanity-check the fields in analyzer.json:

{
    "analyzer_name": "[YOUR GROUP]/minifinder",
    "author_name": "Jav. A. Script",
    "author_email": "hello@example.com",
    "version": "0.0.1",
    "spec_version": "1.2.0",
    "dependencies": {
        "public/source-code": "*"
    },
    "type": "commit",
    "output": {
        "type": "json"
    },
    "deterministic": true
}

Everything looks mostly good. However, to follow best practices, our analyzer should use Semantic Versioning. As this is our first release, but we're not yet sure everything is production ready, we should designate this release version 0.1.0:

"version": "0.1.0",

Great! Now we're all set. We can push our analyzer to r2c by running the following command from within the analyzer directory:

$ r2c push

That's it! Head on over to app.r2c.dev to run your analyzer on an npm input set and dive into the results.

Starting the Job

Once you've logged in to the platform you'll be taken to the Jobs page, which will look similar to this:
Click the "Create Job" button, which will pull up a menu letting you select the analyzer you've just pushed! Now we'll select an input set to run our analyzer on: r2c npm 1000, 2019-04-01.
Click "Run Job" to start the analysis! Your job will be added to the jobs list, where you can click on it to see output, console logs, and errors arriving in real time! See Analyzing Results for next steps.

Note: When a job is first kicked off, the infrastructure may need to "warm up": idle machines will start processing your job, and new machines will be brought online to handle the demand. Once warmed up, a job should proceed quickly, and you'll get results within just a few minutes for quick analyzers like this tutorial's.

Note: We currently limit your container to running for 5 minutes with 1.5 GB of memory. Most analyzers don't hit these limits; however, if you believe you're running into them, please let us know and we can adjust them for your research.

Analyzing Results

Note: The following instructions are for analyzers with output type json. See analyzer.json to learn more about this and other output types.

Now that your analyzer has been run in the web UI (and completed successfully, we hope!) we'll write some Python to load and graph the results. When you run an analyzer on the r2c platform, we store the results in a PostgreSQL database. To interact with this database, we'll use a mix of the sqlalchemy, psycopg2, pandas, and matplotlib Python libraries.
Let's install these in your local environment:

$ pip3 install sqlalchemy psycopg2-binary pandas matplotlib

Next, create a file in a scratch directory called minifinder-results.py:

$ touch minifinder-results.py

Copy the following code into minifinder-results.py and update the constants with your org information:

import os
import pandas as pd
import psycopg2
from sqlalchemy import create_engine
import matplotlib.pyplot as plt

###################################
# EDIT THESE CONSTANTS
###################################

DB_PASSWORD = "DB-PASSWORD-FROM-EMAIL"
AUTHOR_NAME = "YOUR-NAME"

ANALYZER_NAME = f"beta/{AUTHOR_NAME}-minifinder"
ANALYZER_VERSION = "0.1.0"
CORPUS_NAME = "r2c-1000"

###################################
# END EDIT SECTION
###################################

# Canonical SQL query to get job-specific results back.
JOB_QUERY = """
SELECT *
FROM result,
     commit_corpus
WHERE result.commit_hash = commit_corpus.commit_hash
  AND analyzer_name = %(analyzer_name)s
  AND analyzer_version = %(analyzer_version)s
  AND corpus_name = %(corpus_name)s
"""

QUERY_PARAMS = {
    "corpus_name": CORPUS_NAME,
    "analyzer_name": ANALYZER_NAME,
    "analyzer_version": ANALYZER_VERSION,
}

# Connect to the PostgreSQL host and query for job-specific results
engine = create_engine(f"postgresql://notebook_user:{DB_PASSWORD}@db.r2c.dev/postgres")
job_df = pd.read_sql(JOB_QUERY, engine, params=QUERY_PARAMS)

# Print the pandas dataframe to stdout for debugging
print("Raw job dataframe:")
print(job_df[1:10])


# Helper to compute % whitespace from the num_whitespace and size fields
# in our 'extra' column
def get_percent_whitespace(row):
    size = row.extra["size"]
    # Avoid 'division by zero' exceptions.
    return row.extra["num_whitespace"] / size if size else 0


# Add a 'percent_whitespace' column
job_df["percent_whitespace"] = job_df.apply(get_percent_whitespace, axis=1)

# Create a histogram of our data using the 'percent_whitespace' column
job_df.hist(column="percent_whitespace", bins=100)
plt.show()

Finally, run minifinder-results.py, which will produce a graph of the results:

$ python3 minifinder-results.py
Success! You've gone from a question ("how much of each file in a project is whitespace?") to analyzing 1000 npm projects to graphing the results! Next up is writing your own analyzer, interrogating your data, and making the JavaScript ecosystem a better place. If you have any questions or concerns, please don't hesitate to reach out to us at hello@r2c.dev.
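As a next step, the same data can be used to flag likely-minified files. Here is a pure-Python sketch of that filtering step (the rows below are illustrative, shaped like the joined result/commit_corpus columns; the 5% cutoff is the guess from earlier, not a measured threshold):

```python
THRESHOLD = 0.05  # our earlier guess for "probably minified"

rows = [  # illustrative rows, not real job output
    {"repo_url": "https://github.com/example/app", "path": "dist/bundle.js",
     "extra": {"num_whitespace": 40, "size": 9000}},
    {"repo_url": "https://github.com/example/lib", "path": "index.js",
     "extra": {"num_whitespace": 360, "size": 1137}},
]


def percent_whitespace(row):
    size = row["extra"]["size"]
    # Avoid 'division by zero' exceptions, as in minifinder-results.py.
    return row["extra"]["num_whitespace"] / size if size else 0


suspected = [r for r in rows if percent_whitespace(r) < THRESHOLD]
for r in suspected:
    print(r["repo_url"], r["path"])
# prints: https://github.com/example/app dist/bundle.js
```

With pandas, the equivalent would be a boolean filter on the percent_whitespace column added by minifinder-results.py.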
CHAPTER TWO

API REFERENCE

The r2c analysis API has three main parts: the analyzer.json manifest file, which specifies how the analyzer interacts with the rest of the r2c system; the docker container in which the analyzer runs, along with how r2c expects programs in the container to behave; and the final output formats of the analysis, which go on to be used by other parts of the r2c toolchain.

analyzer.json

The current version of the analyzer.json specification is 1.0.0. This version supports the following fields in the top-level JSON object:

analyzer_name [string]
    The name of the analyzer. All analyzers must belong to an org, e.g. r2c/transpiler.

author_name [string]
    The name of the author.

author_email [string]
    A contact address for the author.

version [string]
    The semantic version string representing the current analyzer version.

    Note: Analyzer versions are used for caching and result correctness. A given version of a deterministic analyzer run on the same input is expected to always produce the same result. One consequence of this is that you cannot overwrite a version, or push multiple copies of the same version, for a given analyzer.

spec_version [string]
    The version of the analyzer specification this analyzer is compatible with. To match the version of the specification documented on this page, this should always be 1.0.0.

dependencies [object]
    This is the main mechanism by which analyzers interact with other analysis components in the r2c system. Each key in this object is the name of another analyzer, such as r2c/transpiler. Each value is the version of that analyzer you want to use. An analyzer's output for the same commit or repository appears under its name in /analysis/inputs when your container runs, e.g. /analysis/inputs/r2c/transpiler.

type ["constant", "commit", or "git"]
    The type of input for which this analyzer produces a unique output.

    • A constant analyzer always produces the same output.
      This can be useful for the trained data of a machine-learning-based analysis: the analyzer can depend on its trained weights separately from the actual analysis.
    • A commit analyzer produces output for a JavaScript project at a specific point in its history, a single instance of code. This analyzer should not expect to make use of any version control features. Most analyzers are commit analyzers.
    • A git analyzer understands the evolution and history of a repository. It produces unique outputs for an entire commit graph or bare git checkout. These analyzers can directly compare trends in a project.
output ["json", "filesystem", or "both"]
    The type of output this analyzer produces.

    • json output is the most common output format. For the structure of the resulting JSON, see JSON Results.
    • filesystem output allows the analyzer to output an arbitrary directory and file structure. Hard links are not supported. These analyzers may do such things as transpile code, install dependencies, or produce binary or other data from an analysis that would not be serializable as JSON. They lose the benefit of the parts of the r2c toolchain that understand the JSON format, so most analysis should eventually end up as JSON.
    • The both value indicates that both json and filesystem output appear together. This is useful for describing properties of the filesystem output in order to query the results.

deterministic [boolean]
    The current version of the r2c platform does not support non-deterministic analyzers. However, this field is part of analyzer specification version 1.0.0 and will be implemented soon.

Database Schemas

Note: We recommend psql for raw querying and exploration of data, or the Python library sqlalchemy for data analysis in environments like Jupyter Notebook. See Analyzing Results for more details and connection strings.

Results from analyzers with output type json (discussed in analyzer.json) are stored in our PostgreSQL database. There are two primary tables:

1. result
2. commit_corpus

result

The schema for the result table closely follows the format of JSON Results.

      Column      |  Type
------------------+---------
 id               | integer
 analyzer_name    | text
 analyzer_version | text
 commit_hash      | text
 check_id         | text
 path             | text
 start_line       | integer
 start_col        | integer
 end_line         | integer
 end_col          | integer
 extra            | jsonb

commit_corpus

   Column    | Type
-------------+------
 commit_hash | text
 repo_url    | text
 corpus_name | text
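For quick exploration with psql, the same join that Analyzing Results performs programmatically can be run directly. The query below is an illustrative sketch built from the schemas above; the analyzer name, version, and corpus values follow the tutorial's constants:

```sql
-- Count results per repository for one analyzer version.
SELECT cc.repo_url, count(*) AS findings
FROM result r
JOIN commit_corpus cc ON r.commit_hash = cc.commit_hash
WHERE r.analyzer_name = 'beta/<YOUR-USERNAME>-minifinder'
  AND r.analyzer_version = '0.1.0'
  AND cc.corpus_name = 'r2c-1000'
GROUP BY cc.repo_url
ORDER BY findings DESC;
```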
Docker Containers

The Dockerfile for r2c analyzers allows you to customize your operating system, software, and filesystem as much as you like. However, r2c expects a few things of these containers in order to run analysis, and a few more things for a smooth experience when developing analysis on your local machine.

Filesystem Structure

Dependencies for analyzers are mounted into the container at /analysis/inputs. For more on specifying dependencies, see analyzer.json. All dependency names must also be valid Linux filesystem paths; this enables r2c to mount each dependency in a subfolder matching the org and analyzer name. For example, the transpiler analyzer published by r2c, specified in the dependency object as "r2c/transpiler": "1.0.0", will have its output appear under /analysis/inputs/r2c/transpiler.

This behavior differs slightly between filesystem-type and JSON-type analyzers. For a filesystem-type dependency, its output directory appears at that path: if r2c/transpiler is of type filesystem and writes a file named foo.bin to /analysis/output/output/, then when depending on r2c/transpiler, that file would exist at /analysis/inputs/r2c/transpiler/foo.bin. If, on the other hand, r2c/transpiler is of type json and writes its output to /analysis/output/output.json, that JSON file would exist at /analysis/inputs/r2c/transpiler.json.

Above we allude to /analysis/output/; this is the folder where analyzers are expected to write their output. For filesystem-type analyzers, the contents of the folder /analysis/output/output/ will be saved. For JSON-type analyzers, the contents of /analysis/output/output.json will be saved.

Entry Point Behavior

Analyzers may specify any entry point (the default Docker CMD) they wish. By default, the r2c init template sets this to /analyzer/analyze.sh. However, this command could instead be a java -jar command, a python program, or any other valid command.
The only expectation r2c has of this command is that if it exits with a non-zero return code, the analysis is considered to have failed and its output will not be saved.

User Permissions

It is considered bad practice to run inside Docker containers as the root user. It is also considered bad practice to develop r2c analyzers locally as root. This leads to one complication: files created in the container by the user inside the container may have user permissions that prevent the local user running r2c from cleaning up the temporary files.

To avoid this, the default r2c template takes build arguments that create a container user with the same ID as the current user on the host machine. This happens once at build time, and does not affect the ability of the analyzer to run at scale on r2c infrastructure, but it avoids creating files that require extra permissions to clean up when running locally. If you need to, you may have analysis inside the Docker container run as any user you like, though it is usually not good practice to do so.

JSON Results

The JSON results produced by your analysis must conform to the output.json specification. As of 1.0.0, there are two top-level fields: results (required) and errors (optional):

{
    "results": [],
    "errors": []
}
results

The results key is a required top-level field containing an array of 0 or more result objects that specify the type of result and its location (if applicable). A sample results response from our whitespace-finding analyzer in Running Analysis Locally is:

{
    "results": [
        {
            "check_id": "whitespace",
            "path": "perf/O(n).js",
            "extra": {
                "whitespace": 77,
                "total": 241
            }
        }
    ]
}

Each result object supports the fields stored in the result table (see Database Schemas): a check_id string, a path, an optional source location (start_line, start_col, end_line, end_col), and a free-form extra object.

errors

errors is an optional top-level field containing an array of 0 or more error objects with error messages and related data. A sample errors response looks like:

{
    "results": [],
    "errors": [
        {
            "message": "Cyclomatic complexity limit reached.",
            "data": {
                "path": "foobar.js"
            }
        }
    ]
}

Each error object supports a message string and an optional data object with related values.
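A payload like the ones above can be assembled and sanity-checked with a few lines of Python. Note that check_output below is a minimal sketch, not r2c's actual schema validator:

```python
import json


def make_output(results, errors=None):
    """Assemble a minimal output.json payload: required 'results',
    optional 'errors' (per the 1.0.0 shape described above)."""
    payload = {"results": results}
    if errors:
        payload["errors"] = errors
    return payload


def check_output(payload):
    """Very small sanity check -- not the official validator."""
    assert isinstance(payload.get("results"), list)
    for r in payload["results"]:
        assert "check_id" in r  # every result names the check it came from
    return True


out = make_output(
    results=[{"check_id": "whitespace", "path": "perf/O(n).js",
              "extra": {"whitespace": 77, "total": 241}}],
    errors=[{"message": "Cyclomatic complexity limit reached.",
             "data": {"path": "foobar.js"}}],
)
check_output(out)
print(json.dumps(out, indent=2))
```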
CHAPTER FOUR

TROUBLESHOOTING

The /analysis directory is completely empty

If the /analysis directory doesn't even have inputs and output subdirectories, then Docker is probably having issues mounting those folders. To verify, run:

$ tempdir=$(mktemp -d)
$ touch $tempdir/data
$ docker run --volume $tempdir:/dummy alpine:latest ls -l /dummy

You should see a single line of output:

total 0

If this is the case, then Docker can't mount folders inside /tmp. This can happen if you installed Docker via the snap package manager (i.e., which docker shows a path starting with /snap). If you did, you'll need to run sudo snap remove docker and then reinstall Docker via your package manager or a similar method.

If you see two lines of output like

total 0
-rw-r--r-- 1 root root 0 Apr 15 21:47 data

then the issue is something else. In that case, please contact us and we'll be happy to assist.