CodeChecker Overview Nov 2019

CodeChecker
overview and demo
Olivera Milenkovic, Ericsson
Dublin C++ Meetup, 18.11.2019
Olivera.O.Milenkovic@ericsson.com

Contents
• Introduction
• What is CodeChecker
• Why use CodeChecker: features
• Demo
• Deployment
• Limitations
• Next steps

Main reasons to use static code analysis
• Find bugs
• Have reliable and robust code
• Increase path coverage, not just line coverage
• Identify potential security vulnerabilities
• Improve customer perception: find software bugs before they find your customers
• Reduce the cost of fixing customer trouble reports
• by decreasing the number of potential defects, especially those that are hard to reproduce
• Comply with coding standards, for example MISRA
• Increase coding and design rule competence within teams
• Analyze source code without running it
• Find optimization opportunities
• Visualize
• Metrics
It saves time and money

• No additional test code needed
• Features
• Full path coverage
• False path pruning
• Inline explanations of defects
• Customer trust and satisfaction
• Enforce safe coding standards: MISRA, CWE, …
STATIC CODE ANALYSIS TOOLS
• Examples: ClangSA, Cppcheck, FlexeLint, and many more…
• Licensing: commercial or open source
• Selection criteria: speed, depth, accuracy, usability, supported versions of OS and language

1. Textual Pattern Matching – (CppCheck...)
2. AST Matchers/Walkers - (CppCheck, Clang Tidy...)
3. Abstract Interpretation - Flow sensitive Algorithms (Compiler warnings)
4. Abstract Interpretation - Path sensitive algorithms, Symbolic Execution (Clang static analyzer)
5. Concolic Execution
6. …
How can we measure the precision of the checkers?
1. False positive rate: False Reports / All Reports
2. False negative rate: Non-reported defects / All existing defects
The lower these values, the better, but reaching 0 for both is not possible in practice.
Checking techniques
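The two precision measures above can be written down directly. This is a minimal sketch, not part of CodeChecker: the CheckerTally struct and its field names are hypothetical, and it assumes every true report corresponds to one distinct real defect.

```cpp
#include <cstddef>

// Hypothetical tally of one checker evaluation (names are illustrative only).
struct CheckerTally {
    std::size_t true_reports;    // reports that point at a real defect
    std::size_t false_reports;   // reports that do not
    std::size_t missed_defects;  // real defects the checker did not report
};

// False positive rate: false reports / all reports.
double false_positive_rate(const CheckerTally& t) {
    const std::size_t all_reports = t.true_reports + t.false_reports;
    if (all_reports == 0) return 0.0;  // no reports, so nothing can be false
    return static_cast<double>(t.false_reports) / static_cast<double>(all_reports);
}

// False negative rate: non-reported defects / all existing defects.
// Assumes each true report maps to exactly one existing defect.
double false_negative_rate(const CheckerTally& t) {
    const std::size_t all_defects = t.true_reports + t.missed_defects;
    if (all_defects == 0) return 0.0;  // no defects exist, so none were missed
    return static_cast<double>(t.missed_defects) / static_cast<double>(all_defects);
}
```

For a tally of 75 true reports, 25 false reports, and 25 missed defects, both rates come out as 0.25.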

1. Pattern matching
Original code:
int foo(const int value) {
    int clampedValue = value;
    if (clampedValue < minVal)
        clampedValue = minVal;
    else if (clampedValue > maxVal)
        clampedValue = maxVal;
    return clampedValue;
}
Desired code:
int foo(const int value) {
    const int clampedValue = clamp(value, minVal, maxVal);
    return clampedValue;
}
Design rule matching pattern (sample from CppCheck):
if \( \w+ \x3c \w+ \) { \w+ = \w+ ; } else { if \( \w+ \x3c \w+ \) { \w+ = \w+ ; } }
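As a sketch of the rewrite this design rule asks for, the two functions below compare hand-written clamping with std::clamp from the C++17 standard library. The values of minVal and maxVal are assumptions chosen for illustration; the slide does not define them.

```cpp
#include <algorithm>

// Illustrative limits; the original slide leaves minVal and maxVal undefined.
constexpr int minVal = 0;
constexpr int maxVal = 100;

// Hand-written clamping: the shape a textual pattern rule is meant to flag.
int clamp_by_hand(const int value) {
    int clampedValue = value;
    if (clampedValue < minVal)
        clampedValue = minVal;
    else if (clampedValue > maxVal)
        clampedValue = maxVal;
    return clampedValue;
}

// The desired form, using std::clamp (C++17, declared in <algorithm>).
int clamp_with_std(const int value) {
    return std::clamp(value, minVal, maxVal);
}
```

Both functions return the same result for every input, which is what makes the replacement safe to suggest from a purely textual match.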

2. ASTMatchers
Suspicious code (the size of the pointer is obtained instead of the size of the pointed-to type):
void foo(double *p, size_t s) {
    …
    p = (double *)malloc(s * sizeof(p));
    …
}
Correct code:
void foo(double *p, size_t s) {
    …
    p = (double *)malloc(s * sizeof(*p));
    …
}
[Figure: AST of the suspicious code. Arglist > Arg (Type: double *, Name: p); Primary Expr: sizeof > paramlist > Param 1: p]
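The difference between the two sizeof expressions can be made visible with a small sketch. The helper names below are hypothetical and exist only to expose the two byte counts.

```cpp
#include <cstddef>
#include <cstdlib>

// Bytes requested by the suspicious form: s times the size of the pointer.
std::size_t suspicious_bytes(const double* p, std::size_t s) {
    return s * sizeof(p);
}

// Bytes requested by the correct form: s times the size of the pointed-to type.
std::size_t correct_bytes(const double* p, std::size_t s) {
    return s * sizeof(*p);  // sizeof does not evaluate *p, so a null p is fine here
}

// Allocates room for s doubles using the correct form.
// The caller owns the result and releases it with std::free.
double* allocate_doubles(std::size_t s) {
    double* p = nullptr;
    p = static_cast<double*>(std::malloc(s * sizeof(*p)));
    return p;
}
```

On many 64-bit platforms a double and a pointer are both 8 bytes, so the suspicious form happens to allocate enough and tests pass. On a platform with 4-byte pointers it under-allocates, which is why a structural check on the AST is more reliable than testing here.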

Multiple tool usage
More tools ->
• More defects detected due to different analysis engines
• Better quality: shift left
• Fewer security vulnerabilities
• Less vendor lock-in
• More false positives
• More hardware usage
• More education
• More cost if commercial tools are used
[Figure: tool1, tool2, tool3, tool4]

Successful projects depend on
• Members
• Developers
• Standards
• Infrastructure…
to develop products that the market will adopt

Ericsson CodeChecker Team
• Develops the CodeChecker Tooling in open source:
• https://github.com/Ericsson/codechecker
• Contributes to the open source Clang project
• With new checkers
• With new analyzer features: CTU, statistical analysis
• With bug fixes
• Contributing to open source Clang since 2013

CodeChecker references
• https://github.com/Ericsson/codechecker
• https://codechecker.readthedocs.io/en/latest/
• https://codechecker.readthedocs.io/en/latest/usage/
• http://codechecker-demo.eastus.cloudapp.azure.com/login.html#
• User id: demo
• Password: demo
• http://clang-analyzer.llvm.org/available_checks.html
• http://clang.llvm.org/extra/clang-tidy/checks/list.html

CODECHECKER
LLVM Project (http://llvm.org/)
• Collection of modular and reusable compiler and toolchain technologies
Clang (http://clang.llvm.org/, http://clang.llvm.org/docs/index.html)
• "LLVM native" C/C++/Objective-C compiler
• Open source, modular
• Can be reused as a library
• Many analysis-related tools:
• Address Sanitizer, Thread Sanitizer, Clang-format, …
• Clang Tidy, Clang Static Analyzer, Clang compiler warnings
• Up-to-date C/C++ language support (C++11, 14, 17)
• Active developer community: Google, Samsung, Sony, Ericsson, …
Clang Static Analyzer (http://clang-analyzer.llvm.org)
• Source code analysis tool
• Finds programming bugs in C, C++, and Objective-C
• Impressive checker framework
• Symbolic execution
• 150+ checkers
• Extensible with new checkers
• Active community: Apple, Ericsson, Samsung, …
Clang Tidy (http://clang.llvm.org/extra/clang-tidy/)
CodeChecker (https://github.com/Ericsson/codechecker)
• Database and viewer extension for the Clang Static Analyzer

CodeChecker by Ericsson
open source tooling for clang analyzers
[Figure: CodeChecker architecture. Analyzers: Clang Tidy (clang.llvm.org), Clang SA (clang-analyzer.llvm.org), E/// checkers (Ericsson-only checkers). CodeChecker parts: command line, analyzer report storage, report management web server, utilities. Clients: browser, Eclipse client, Git (CI) commit loop.]
Features highlighted in the figure (numbered 1 to 3):
• Ericsson-only checkers
• New analyzer features: Cross Translation Unit Analysis, Statistical Checkers
• Viewer & report management

Summary
Short summary card: CodeChecker 6.10.1, current latest version (Clang 8)
Description: Analyzer tooling, defect database and viewer extension for the Clang Static Analyzer and Clang Tidy; static analysis infrastructure built on the LLVM/Clang Static Analyzer toolchain
Supported Languages: C, C++, Objective-C
Supported Architectures: Linux, OS X, Docker support
Technology: AST Matcher, Symbolic Execution
Analyzers: Support for multiple analyzers, currently Clang Static Analyzer and Clang-Tidy, soon also CppCheck; the plan is to also add FB Infer and other tools (sanitizers)
Features: Inter-procedural analysis, Cross TU Analysis, statistical checkers, suppression handling, filtering, …
Price: Free
False Positive Rate: Low
Checker Database: ~300+ checkers (ClangSA 120+, clang-tidy 250+, clang warnings, …), plus 50 Ericsson rules
Developer Community: Large open source community, supported by Apple, Google, Ericsson

ClangSA Capabilities & Limitations
CodeChecker invokes the Clang analyzers. ClangSA has the following capabilities:
• Constraints are registered for bools, integers, chars, and pointers (during symbolic execution, values are represented as intervals)
• Memory aliasing is detected
int a = 0; int *p = &a; 1 / *p; // Division by zero! The pointer value is followed
• Hierarchical memory model for arrays, structs, and classes: the values of all members are tracked
• Path-sensitive analysis: all branches and switch statements are followed
• Context-sensitive inter-procedural analysis: function calls are followed, even into external TUs
Limitations (mainly scalability)
• The number of unique paths is exponential in the number of branches (if, switch, for, while), so only some paths are explored opportunistically, with promising paths selected by heuristics; this does not give full path coverage
• Limited loop unrolling: by default not all iterations are explored (a loop is unrolled 4 times if its condition variable is unknown; loops with a fixed bound are unrolled where possible)
• Limited call depth: calls cannot be followed indefinitely, so a fixed limit is used
• Overly long paths are hard for humans to understand; in some cases the analyzer presumes that a return value is unknown
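The loop limitation above can be illustrated with a defect that only appears late in a loop. This is a hypothetical example, not taken from the talk: the division fails on the iteration where i equals 10, so an analysis that follows only the first few iterations of a loop with an unknown bound may never reach it.

```cpp
// Adds up 100 / (10 - i) for i in [0, n).
// Safe for n <= 10; for n > 10 the iteration with i == 10 divides by zero.
int sum_of_quotients(int n) {
    int total = 0;
    for (int i = 0; i < n; ++i) {
        total += 100 / (10 - i);  // defect is reachable only on the 11th iteration
    }
    return total;
}
```

Calling sum_of_quotients(3) is fine and yields 10 + 11 + 12 = 33. Whether a given analyzer run reports the defect depends on its loop and budget settings, which is one reason for occasional longer runs with increased limits.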

Included checkers
• CodeChecker runs many types of checks
• http://clang-analyzer.llvm.org/available_checks.html
• http://clang.llvm.org/extra/clang-tidy/checks/list.html
• Ericsson customized checkers
• New Clang analysis features, such as CTU (Cross Translation Unit Analysis)
• Statistical checkers are not open sourced yet, but the plan is to open source them in 2020
• It identifies
• defects in the code that were missed during desk checks and testing
• places where users did not follow design rules and best practices

SEI CERT rules and testing
• https://wiki.sei.cmu.edu/confluence/display/seccode
• Example
• https://wiki.sei.cmu.edu/confluence/display/cplusplus/DCL50-CPP.+Do+not+define+a+C-style+variadic+function
• The automation tool support information on the SEI CERT website is neither complete nor up to date
• Ericsson test suite testing SEI CERT rules
• CodeChecker with Clang 8 covers about 38% of rules
• 151 PASSED out of 380 test cases: 39.7% in total
• We need clarification from SEI CERT to publish the test suite
• With other Clang dynamic sanitizer tools, coverage can be increased

Checker categories covered by clang I
Null pointer dereferences
• Dereference after a null check
• Dereference a null return value
• Dereference before a null check
Security best practice violations
• Possible buffer overflow
• Copy into a fixed sized buffer
• Calling risky function
• User pointer dereference
Program hangs
• sleep() while holding lock
• Double lock or missing unlock
• Infinite loop
• Negative loop bound
• Thread deadlock
Code Maintainability Issues
• Multiple return statements
• Unused pointer value
Incorrect Expressions
• Evaluation Order Violation
• Copy & paste error
Insecure Data handling
• Integer overflow
• Loop bound by untrusted source
• Write/read array/pointer with untrusted value
• Format string with untrusted source
Performance inefficiencies
• Big parameter passed by value
• Large stack use

Checker categories covered by clang II
Resource Leaks
• Memory Leaks
• Resource leak in object
• Incomplete delete
Uninitialized variables
• Missing return statement
• Uninitialized pointer/scalar/…
• Uninitialized data member in class or struct
Integer Handling Issues
• Improper use of negative value
• Unintended sign extension
• Incompatible cast
Improper use of APIs
• Insecure chroot
• Using invalid iterator
• printf() argument mismatch
Memory Corruptions
• Out-of-bounds access
• String length miscalculations
• Copying to too small destination buffers
• Overflowed pointer write
• Negative array index write
• Allocation size error
Memory-illegal access
• Incorrect delete operator
• Overflowed pointer read
• Out-of-bounds read
• Returning pointer to local variable
• Negative array index read
• Use/read pointer after free
Control Flow issues
• Logically/structurally dead code
• Missing break in switch
Error handling issues
• Unchecked return value
• Uncaught exception

MAIN REASONS TO USE CODECHECKER
1. Easier visual understanding of defects (the root cause of each defect is clearly explained, making it easy to fix bugs)
2. Full path coverage: CTU analyses and statistical checkers
3. Overall summary of results for a product (good for status monitoring and planning of cleanups)
4. Filtering possibilities
5. Visibility of the "depth" of a finding: the number of steps that lead to the error
6. Suppression handling (per finding, not per file; false positive vs. intentional)
7. Report generation
8. Easy detection of new defects
9. Easy integration with Gerrit verification for new defects
10. Additional Ericsson checkers
11. Eclipse integration...
12. Low false positive rate: path pruning

THE CLANG STATIC ANALYZER
•Approximations/heuristics
• Non-complete and non-sound
• False positive/false negative
• Industrial use (Ericsson, Apple, …)
•Symbolic execution
• Path sensitive

Simple analysis
source1.c:
int abs(int a) {
    if (a < 0)
        return -a;
    else
        return a;
}
void test1() {
    int z = 1 / (3 - abs(3));   // Error: Division by zero
}
Interprocedural: symbolic execution across procedure (function) boundaries.
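One way to fix a defect like the one on this slide is to check the denominator before dividing. The sketch below is an assumption about how such a fix might look, not the speaker's code; the names absolute and inverse_of_gap are hypothetical, and std::optional requires C++17.

```cpp
#include <optional>

// Same logic as the slide's abs(), renamed to avoid clashing with the standard library.
int absolute(int a) {
    if (a < 0)
        return -a;
    return a;
}

// Computes 1 / (limit - |value|), or returns no value when the gap is zero.
std::optional<int> inverse_of_gap(int limit, int value) {
    const int gap = limit - absolute(value);
    if (gap == 0)
        return std::nullopt;  // this is the path the analyzer reported as a division by zero
    return 1 / gap;
}
```

With the guard in place, the path through absolute() that makes the gap zero no longer reaches the division, so an inter-procedural analysis has nothing left to report on it.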

Cross Translation Unit analysis (CTU)
source1.c (Translation Unit 1):
int abs(int a) {
    if (a < 0)
        return -a;
    else
        return a;
}
source2.c (Translation Unit 2):
int abs(int a);
void test1() {
    int z = 1 / (3 - abs(3));   // Error: Division by zero
}
To detect bugs across source file boundaries, CTU analysis is needed!

Statistical analysis
void func1() {
    char *s = read_from_user();
    if (s != NULL)
        print("user said: %s", s);
}
void func2() {
    char *s = read_from_user();
    if (s != NULL)
        print("user says: %s", s);
}
void func3() {
    char *s = read_from_user();
    if (s)
        print("user thinks: %s", s);
}
void func4() {
    char *s = read_from_user();
    print("user wrote: %s", s);   // Error: s can potentially be NULL, so a null pointer dereference is reported here
}
Based on usage samples, the analyzer knows that read_from_user() can return a NULL pointer.
Statistical checkers: null return, negative return, checked return
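The pattern the statistical checker learns from can be sketched in a self-contained form. Everything here is hypothetical: read_from_user takes a flag only so that the example can be exercised, and the caller names are illustrative.

```cpp
#include <string>

// Hypothetical stand-in for read_from_user(): returns nullptr when there is no input.
const char* read_from_user(bool has_input) {
    return has_input ? "hello" : nullptr;
}

// Checked caller, in the style of func1 to func3: the return value is tested first.
std::string checked_caller(bool has_input) {
    const char* s = read_from_user(has_input);
    if (s != nullptr)
        return std::string("user said: ") + s;
    return "user said nothing";
}

// Unchecked caller, in the style of func4: using s without a test is undefined
// behavior when it is null. This is the call site a statistical checker would
// flag, because most other callers check the same return value.
std::string unchecked_caller(bool has_input) {
    const char* s = read_from_user(has_input);
    return std::string("user wrote: ") + s;
}
```

The checker does not need to see the body of read_from_user. It infers from the majority of call sites that the return value is expected to be checked.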

SUMMARY CTU
•Clang SA TU-internal analysis extended to Cross-TU analysis for C/C++
•Finds 2-3 times more reports
•Scalable & useful for industrial-size projects (PostgreSQL, OpenSSL, …)
•Patch has been accepted into upstream Clang
•Try it yourself with latest Clang and CodeChecker

Analysis approach
• Executable level
• Generate code, create the build database, and analyze all code included in the executable
• Library level
• Generate code and the build database only for the lib/code under investigation
• Analyze code
• Limitations
• Some higher-level defects would be missed
• Statistical analyses would also give different results

Example of flag usage
• --ctu --stats --report-hash context-free -j10 --skip <path>/codechecker_skip_file -e sensitive -e abseil-string-find-startswith -e apiModeling.TrustNonnull -e bugprone-narrowing-conversions -e cert-msc32-c -d bugprone-virtual-near-miss -d bugprone-incorrect-roundings -o
• CodeChecker checkers --profile sensitive: displays all checkers included in the profile

CodeChecker suppression handling

Static analysis findings
• Bug: fix the code
• Intentional: the tool reports correctly, but we have a reason not to fix
• False positive: the tool is not correct and should be improved

Suppression handling
• Before adding any suppression, read the guide that explains how to deal with false positives: https://github.com/Ericsson/codechecker/blob/master/docs/false_positives.md
• If you still decide to go ahead with suppressions, their format is explained at https://github.com/Ericsson/codechecker/blob/master/docs/user_guide.md#suppression-code
• codechecker_false_positive
• codechecker_intentional
• codechecker_confirmed
• If you added a suppression to the code by mistake and change your mind after a run was stored to the database, you need to remove the suppression from the code and also change the review status of that bug in the database to "Unreviewed"
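As a sketch of how the review-status keywords above might look in source code, the comments below follow the form described in the linked user guide: the keyword, a checker name in square brackets, and a reason, placed directly above the reported line. The functions and reasons are hypothetical, and the exact syntax should be confirmed against the guide for your CodeChecker version.

```cpp
// Divides value by divisor; callers are assumed to validate divisor first.
int scale(int value, int divisor) {
    // codechecker_false_positive [core.DivideZero] every caller validates divisor before the call
    return value / divisor;
}

// Returns the first element; a null input is assumed to be rejected earlier.
int first_element(const int* values) {
    // codechecker_intentional [core.NullDereference] null input is rejected at the API boundary
    return values[0];
}
```

The reason text matters: it is what a later reviewer reads when deciding whether the suppression is still justified.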

Tracking results & Reporting tool issues
False positives False negatives
• Configuration or filtering issue or
• Tool does not detect
• Checker exists, but the tool has a limitation
• Unsupported type of checker
• Should be fixed to improve tool and
prevent waste
True positive-bug
• Should be treated as TR
• Collect stats about identified issues
• Track accuracy of checker
True positive - intentional
• Prepare standalone reproducible example and report ticket!

Examples using open source projects
• http://codechecker-demo.eastus.cloudapp.azure.com/Default/#
• User id / password: demo / demo
• https://codechecker.readthedocs.io/en/latest/usage/

Demo of codechecker workflow
1. Analyze a project
2. View results in command line & static HTML
3. Upload results to CodeChecker Web Server (optional)
4. Add a bug & Fix a bug
5. List results changes after code update
6. Fix the added bugs
7. Confirm that no new bug is introduced
8. Commit patch & (Store new analysis results)

1. Analyze project
//Create compile commands json
CodeChecker log -b "make clean;make -j10" -o compile_commands.json
//Create a first clean analysis
CodeChecker analyze ./compile_commands.json -j20 -o reports
[INFO 2018-03-04 19:08] - Starting static analysis ...
[INFO 2018-03-04 19:08] - [1/10] clang-tidy analyzed tinystr.cpp successfully.
[INFO 2018-03-04 19:08] - [2/10] clang-tidy analyzed tinyxmlerror.cpp successfully.
Analysis results are stored in the reports directory.

2. View results
CodeChecker parse ./reports --print-steps
Found no defects while analyzing xmltest.cpp
[HIGH] tinyxml.cpp:1542:23: Access to field 'next' results in a dereference of a null pointer (loaded from
variable 'node') [core.NullDereference]
node->prev->next = node->next;
^
Report hash: f99cd33b42d9620b2aba1e32bfdce636
Steps:
1, tinyxml.cpp:564:2: Calling 'TiXmlElement::ClearThis'
2, tinyxml.cpp:568:1: Entered call from '~TiXmlElement'
3, tinyxml.cpp:571:9: Entering loop body
4, tinyxml.cpp:574:3: Calling 'TiXmlAttributeSet::Remove'
5, tinyxml.cpp:1534:1: Entered call from 'TiXmlElement::ClearThis'
7, tinyxml.cpp:1538:48: Value assigned to 'node'
8, tinyxml.cpp:1538:2: Looping back to the head of the loop
10, tinyxml.cpp:1540:8: Assuming 'node' is equal to 'removeMe'
11, tinyxml.cpp:1540:8: Assuming pointer value is null
12, tinyxml.cpp:1542:23: Access to field 'next' results in a dereference of a null pointer (loaded from
variable 'node')
Found 2 defect(s) while analyzing tinyxml.cpp
----==== Summary ====----
--------------------------------
Filename | Report count
--------------------------------
tinyxmlparser.cpp | 2
tinyxml.cpp | 1
tinyxml.h | 1
--------------------------------
-----------------------
Severity | Report count
-----------------------
HIGH | 4
-----------------------
----=================----
Total number of reports: 4
-----------

3. Generate reports to static HTML
CodeChecker parse ./reports -e html -o reports_html
[INFO 2018-03-04 19:11] - Generating html output files:
...
To view the results in a browser run:
> firefox /home/ednikru/work/codechecker/education_material/TinyXML/reports_html

4. Start server and store results
//Starting server (with sqlite db backend)
CodeChecker server -v 12345 -w ./workspace&
//Store analysis results to the local server
CodeChecker store ./reports --url http://localhost:12345/Default -n tinyxml_base
//results can be viewed at
//firefox http://localhost:12345/Default
For official CI runs in your organization, set up a permanent database, set up access rights, and push results to it.
This is done by CI scripts and not by the developer!

5. List result changes after code update
Edit the code and reanalyze the project!
//Re-analyzing the project
CodeChecker check -b "make" -o reports -j10
Note: Here we use incremental analysis!
//Calculating diff to the central DB
CodeChecker cmd diff -n ./reports -b tinyxml_base --url http://localhost:12345/Default --new
[INFO 2018-03-04 20:07] - Matching against runs: tinyxml_base
/home/ednikru/work/codechecker/education_material/TinyXML/tinyxml.cpp:1887:2: Potential leak of memory pointed to by 'something' [cplusplus.NewDeleteLeaks]
DoIndent();
You can generate the diff output to static HTML too:
CodeChecker cmd diff -n ./reports -b tinyxml_base --url http://localhost:12345/Default --new -o html -e ./diff_html
//To see the results
//$ firefox ./diff_html

6. Fix the code and re-check
Fix the errors reported by CodeChecker in your source code!
//Re-analyzing the project
CodeChecker check -b "make" -o reports -j10
//Calculating diff to the central DB
CodeChecker cmd diff -n ./reports -b tinyxml_base --url http://localhost:12345/Default --new
[INFO 2018-03-04 19:49] - Matching against runs: tinyxml_base
[INFO 2018-03-04 19:49] - No results
All good!
Your patch will not introduce new faults that can be detected by Clang!
It is safe to push it for Gerrit review.
All re-analysis and diffing is done locally by the developer without updating the central DB!

What to do with false positives?
1. Ruling out infeasible paths
1. Use asserts (and debug builds)
2. Correlated conditions
3. Partial functions
4. Loops
5. Prefer standard functions
6. Use const whenever possible
7. Do not turn off core checks
2. Suppress or skip results
1. 3rd party/Unauthored code
• Put them to skip list
2. Authored code:
• Use asserts, make intents explicit
• Otherwise // codechecker_suppress
For more info see:
CodeChecker False positives Guide
Clang SA FAQ and How to Deal with Common False Positives

If nothing else helps, use codechecker_suppress
If nothing else helps, you can use in-line code suppression to suppress a false positive.
A false positive that cannot be ruled out in any other way likely indicates a bug in the analyzer.

False positives in 3pp code: use skip file
CodeChecker analyze ./compile_commands.json -j20 -o reports -i SKIPFILE
Skip file format:
-/skip/all/files/in/directory/*
-/do/not/check/this.file
+/dir/do.check.this.file
-/dir/*
Introducing a new analysis tool or changing settings
• New tool, new version of an existing tool, or modified settings
• New legacy findings identified
• Cleanup of legacy findings (most critical first) and/or suppression of some less critical ones
• Continue using the new version, preventing new defects

CLANG STATIC ANALYZER – RESTRICTIONS
•Limits on exploration: call stack depth, number of inlined functions, number of ExplodedNodes…
•Analysis budget
•Clang SA supported: inter-procedural within one translation unit (TU)
void foo(bool b) {
for(int i = 0; i < 100; ++i)
{
if (b) ...
}
}

How to get more complete results
• Limitations
• Hardware limitations: abstract syntax tree size
• Analysis time limitations
• Settings control the depth of analysis, number of loop iterations, etc.
• Goal
• To get potential defects in the code from the tool
• Not speed of runs
• Regular runs with default settings, but also some less frequent, longer runs with increased settings
• Cross translation unit analyses
• Statistical checkers
• Incremental analyses

Further plans
• Multiple analyzers will be supported
• More checkers will be added
• Source code will be updated to Python 3
• Debian packages will be released

Activities in open source
• Contributors on GitHub:
• Total: 27
• External active in this year: 5
• Open source projects analyzed regularly with CodeChecker:
• Firefox
• Chromium
• …
• Docker image is available on Docker Hub for the web server
• docker run -d -p 8001:8001 -v /home/$USER/codechecker_workspace:/workspace codechecker/codechecker-web:6.10.1

Various comparisons of tools: additional features and settings matter
•https://www.spazioit.com/pages_en/sol_inf_en/code_quality_en/clang_vrs_fb_infer_en/
Reports produced by the two tools:
•Clang Report
•FB Infer Report
•SAFe Toolset
•Clang Analyzer Demo I
•Clang Analyzer Demo II
•Clang Analyzer Demo III
•Facebook Infer Demo
•SonarQube Demo

Summary
• Define good strategy for legacy cleanups
• Together we can do more in open source community
• Reuse more and share findings
• Report bugs and examples
• Contribute with checkers
• Compare tools so that we specialize usage and get more in total
