SlideShare ist ein Scribd-Unternehmen logo
1 von 4
Downloaden Sie, um offline zu lesen
How to add a new diagnostic rule into
PVS-Studio? Days from developers' life...
Author: Evgeniy Ryzhkov

Date: 14.09.2011

We are often asked the question how one can add one's own diagnostic rule into our static analyzer
PVS-Studio. And we always answer that it is very simple: "You just need to write us a letter with your
request and we will add this rule into the analyzer". This interface of adding new rules is convenient to
users. This is the best and most convenient interface actually. It's not so easy to do it on your own as it
may seem. In this post, I will show you the bottom of the iceberg implied in the words "we have added
this simple rule".




OK, let's be honest. PVS-Studio doesn't have a mechanism to enable users to write their own rules. This
is, first of all, an architectural restriction. Some users may get upset because of it. But actually such
people don't understand what they want because the mechanism of adding a new rule is rather
complicated in practice.

For instance, there is rule V536: Be advised that the utilized constant value is represented by an octal
form. It is intended for detecting errors like the following one found in Miranda IM:

static const struct _tag_cpltbl

{

    unsigned cp;

    const char* mimecp;

} cptbl[] =

{
{     037, "IBM037" },              // IBM EBCDIC US-Canada

    {     437, "IBM437" },              // OEM United States

    {     500, "IBM500" },              // IBM EBCDIC International

    {     708, "ASMO-708" },            // Arabic (ASMO 708)

    ...

}

Value 037 is written here instead of 37 just for the sake of alignment, although it is really number 31 in
decimal form because 037 is an octal number according to the C++ rules. The error is rather simple for
diagnosis, isn't it? Indeed: search for a constant, check if it begins with 0 and generate a diagnostic
message. This diagnostic rule can be implemented in an hour. And a static analyzer's user (a skilled
programmer but novice regarding development of such tools) is theoretically ready to make this rule
himself/herself.

But he/she will fail, or rather he/she will succeed but the result won't satisfy him/her because of too
many false reports.

We (PVS-Studio's developers) do it in the following way. Before starting to implement a rule, we at first
write unit-tests for it where we add special marks for those lines that should trigger the rule and leave
those lines that shouldn't. Then we write the first version of the code for diagnostics. As this new rule
passes successfully all the unit-tests written particularly for it, we run full unit-tests to make sure that
the new rule hasn't broken anything. The full unit-tests are run for several tens of minutes. Surely, you
rarely manage "not to break anything" at the first time, so iterations are run again and again until the
unit-tests are passed completely.

The unit-tests are followed by the next step when we run the rule on real projects. We have about 70
projects in our test base (for 2011); these are famous and not very famous open source projects we test
our analyzer on. We have a self-made launcher that opens projects in Visual Studio, launches the
analyzer, saves the log, compares it to the master copy and shows the differences (messages
disappeared, messages added and messages changed). One cycle of running all the projects for Visual
Studio 2005 takes 4 hours on a computer with four cores and 8 Gbytes of memory, the analysis process
being parallelized. Judging by the analysis results we can see how well the new rule is integrated.
Usually we start to study the differences without waiting for analysis to complete because sometimes
it's clear if something has gone wrong. But we may stop running tests even if everything works well and
nothing is broken. It happens when there are too many false reports. If we see that a diagnostic rule is
triggered too often, we make exceptions from this rule.

Let's return to the search of incorrect octal numbers, the V536 rule. It has quite a number of exceptions.
These are the situations when the warning is not generated:

    1. The constant's value is below 8.
    2. The number is defined through #define.
    3. There are more than two octal numbers in one block (the rule was triggered more than twice),
       except for 0 and char type.
    4. This number is used as an argument in functions: _open, mknod, open, wxMkdir,
       wxFileName::Mkdir, wxFileName::AppendDir, chmod.
5. This is a nested block of numbers. We check only the top one. For example: { {090}, {091}, {092},
       {093} }.
    6. The number is <= 0777 and acts as a part of a statement that contains words: file, File.
    7. The number is an argument of a function one parameter of which contains the string "%o".

Sometimes you cannot predict what exceptions there will be before running the rule on real projects.
That's why the iteration "making an exception - running tests - analyzing results" may be repeated many
times.

Then, when all the exceptions are implemented and the number of false reports is tolerable, the tests'
results (saved log-files of detected errors) are defined as a master copy. After that we run the same tests
for other versions of Visual Studio. At present (in 2011), we have support for the three versions of Visual
Studio: Visual Studio 2005/2008/2010. Tests in them run for 4/4.5/5 hours correspondingly, which
makes the total time of 14-15 hours. It may appear that one and the same code causes a different
number of diagnostic messages depending on Visual Studio's version. Usually it is determined by
differences in header files of different SDKs. We try to eliminate the differences so that the tests run in
all the versions of Visual Studio give identical results.

Why is it so important to run all those tiresome tests? Is it only because of false reports? No, it's not only
because of them. The point is that you can't take account of all the possible types of constructs while
making a rule and write it so that it works correctly and moreover doesn't cause the analyzer to crash.
This is what we need all those numerous tests for!

What strange constructs do we mean? Here you are a couple of examples.

Do you know that you may define a variable, for instance, in a way like this:

int const unsigned static a22 = 0;

Do you know that __int64 can be a variable's name?

int __identifier(__int64);

Do you know that braces are not necessary after switch?

switch (0)

   if (X) Foo();

Even if you do not use such constructs in code yourself, there can be some in libraries you use.

Finally, having passed all the tests, we may start working on documentation. For each diagnostic rule we
write a help entry in Russian and English. Translation to English also takes some time. Now that the code
is debugged, the tests are passed and documentation is written, the rule appears in the next release of
PVS-Studio.

So, the development cycle of a new diagnostic rule can be briefly represented as follows:

    1. An idea of a new rule: either we invent it ourselves, or our users tell us, or we manage to spot it
       somewhere.
    2. Formulating the rule.
    3. Implementation of the rule.
4. Testing of every kind.
    5. Analyzing the results. Variants are possible:
           a. Improving the rule's formulation, adding exceptions, going to step 2.
           b. Rule's implementation is considered established; we may write and translate the
               documentation.

The total time of implementing one rule is several days. Usually it is from 2 to 5 days. Naturally, we have
enough experience now and can predict what exceptions from a rule we may need and implement them
right away. We also manage to reduce the number of test runs. But anyway, the key idea is that only
running tests you can formulate the exceptions and eliminate most of the false reports.

Now let's return to the question of how users can add rules themselves. As you can see, there are two
reasons why it's difficult to do: they don't have enough skill to make out good exceptions and a good
test base to run the rule on it.

A question arises: "Then why are there means of creating user-own rules in some famous and
respectable static analyzers?" Perhaps, we can't do it? Partly because of this, yes. But the mechanism of
creating user-own rules serves only one purpose - to compose "rival-comparison matrixes":




Well, users will ask, won't we be enabled to create our rules in PVS-Studio some day? Some day we will
manage to implement this mechanism and not to lose in comparison matrixes :-). But for now, all who
want to create a new rule will receive a link to this post with the words: "We have the best interface for
adding user rules - just write us a letter!"

Weitere ähnliche Inhalte

Mehr von Andrey Karpov

PVS-Studio в 2021 - Примеры ошибок
PVS-Studio в 2021 - Примеры ошибокPVS-Studio в 2021 - Примеры ошибок
PVS-Studio в 2021 - Примеры ошибокAndrey Karpov
 
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...Andrey Karpov
 
Best Bugs from Games: Fellow Programmers' Mistakes
Best Bugs from Games: Fellow Programmers' MistakesBest Bugs from Games: Fellow Programmers' Mistakes
Best Bugs from Games: Fellow Programmers' MistakesAndrey Karpov
 
Does static analysis need machine learning?
Does static analysis need machine learning?Does static analysis need machine learning?
Does static analysis need machine learning?Andrey Karpov
 
Typical errors in code on the example of C++, C#, and Java
Typical errors in code on the example of C++, C#, and JavaTypical errors in code on the example of C++, C#, and Java
Typical errors in code on the example of C++, C#, and JavaAndrey Karpov
 
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)Andrey Karpov
 
Game Engine Code Quality: Is Everything Really That Bad?
Game Engine Code Quality: Is Everything Really That Bad?Game Engine Code Quality: Is Everything Really That Bad?
Game Engine Code Quality: Is Everything Really That Bad?Andrey Karpov
 
C++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical ReviewerC++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical ReviewerAndrey Karpov
 
The Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source SoftwareThe Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source SoftwareAndrey Karpov
 
Static Code Analysis for Projects, Built on Unreal Engine
Static Code Analysis for Projects, Built on Unreal EngineStatic Code Analysis for Projects, Built on Unreal Engine
Static Code Analysis for Projects, Built on Unreal EngineAndrey Karpov
 
Safety on the Max: How to Write Reliable C/C++ Code for Embedded Systems
Safety on the Max: How to Write Reliable C/C++ Code for Embedded SystemsSafety on the Max: How to Write Reliable C/C++ Code for Embedded Systems
Safety on the Max: How to Write Reliable C/C++ Code for Embedded SystemsAndrey Karpov
 
The Great and Mighty C++
The Great and Mighty C++The Great and Mighty C++
The Great and Mighty C++Andrey Karpov
 
Static code analysis: what? how? why?
Static code analysis: what? how? why?Static code analysis: what? how? why?
Static code analysis: what? how? why?Andrey Karpov
 
Zero, one, two, Freddy's coming for you
Zero, one, two, Freddy's coming for youZero, one, two, Freddy's coming for you
Zero, one, two, Freddy's coming for youAndrey Karpov
 
PVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOps
PVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOpsPVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOps
PVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOpsAndrey Karpov
 
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...Andrey Karpov
 
Analysis of commits and pull requests in Travis CI, Buddy and AppVeyor using ...
Analysis of commits and pull requests in Travis CI, Buddy and AppVeyor using ...Analysis of commits and pull requests in Travis CI, Buddy and AppVeyor using ...
Analysis of commits and pull requests in Travis CI, Buddy and AppVeyor using ...Andrey Karpov
 
PVS-Studio in the Clouds: CircleCI
PVS-Studio in the Clouds: CircleCIPVS-Studio in the Clouds: CircleCI
PVS-Studio in the Clouds: CircleCIAndrey Karpov
 
PVS-Studio in the Clouds: Azure DevOps
PVS-Studio in the Clouds: Azure DevOpsPVS-Studio in the Clouds: Azure DevOps
PVS-Studio in the Clouds: Azure DevOpsAndrey Karpov
 

Mehr von Andrey Karpov (20)

PVS-Studio в 2021 - Примеры ошибок
PVS-Studio в 2021 - Примеры ошибокPVS-Studio в 2021 - Примеры ошибок
PVS-Studio в 2021 - Примеры ошибок
 
PVS-Studio в 2021
PVS-Studio в 2021PVS-Studio в 2021
PVS-Studio в 2021
 
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...
 
Best Bugs from Games: Fellow Programmers' Mistakes
Best Bugs from Games: Fellow Programmers' MistakesBest Bugs from Games: Fellow Programmers' Mistakes
Best Bugs from Games: Fellow Programmers' Mistakes
 
Does static analysis need machine learning?
Does static analysis need machine learning?Does static analysis need machine learning?
Does static analysis need machine learning?
 
Typical errors in code on the example of C++, C#, and Java
Typical errors in code on the example of C++, C#, and JavaTypical errors in code on the example of C++, C#, and Java
Typical errors in code on the example of C++, C#, and Java
 
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)
 
Game Engine Code Quality: Is Everything Really That Bad?
Game Engine Code Quality: Is Everything Really That Bad?Game Engine Code Quality: Is Everything Really That Bad?
Game Engine Code Quality: Is Everything Really That Bad?
 
C++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical ReviewerC++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical Reviewer
 
The Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source SoftwareThe Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source Software
 
Static Code Analysis for Projects, Built on Unreal Engine
Static Code Analysis for Projects, Built on Unreal EngineStatic Code Analysis for Projects, Built on Unreal Engine
Static Code Analysis for Projects, Built on Unreal Engine
 
Safety on the Max: How to Write Reliable C/C++ Code for Embedded Systems
Safety on the Max: How to Write Reliable C/C++ Code for Embedded SystemsSafety on the Max: How to Write Reliable C/C++ Code for Embedded Systems
Safety on the Max: How to Write Reliable C/C++ Code for Embedded Systems
 
The Great and Mighty C++
The Great and Mighty C++The Great and Mighty C++
The Great and Mighty C++
 
Static code analysis: what? how? why?
Static code analysis: what? how? why?Static code analysis: what? how? why?
Static code analysis: what? how? why?
 
Zero, one, two, Freddy's coming for you
Zero, one, two, Freddy's coming for youZero, one, two, Freddy's coming for you
Zero, one, two, Freddy's coming for you
 
PVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOps
PVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOpsPVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOps
PVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOps
 
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
 
Analysis of commits and pull requests in Travis CI, Buddy and AppVeyor using ...
Analysis of commits and pull requests in Travis CI, Buddy and AppVeyor using ...Analysis of commits and pull requests in Travis CI, Buddy and AppVeyor using ...
Analysis of commits and pull requests in Travis CI, Buddy and AppVeyor using ...
 
PVS-Studio in the Clouds: CircleCI
PVS-Studio in the Clouds: CircleCIPVS-Studio in the Clouds: CircleCI
PVS-Studio in the Clouds: CircleCI
 
PVS-Studio in the Clouds: Azure DevOps
PVS-Studio in the Clouds: Azure DevOpsPVS-Studio in the Clouds: Azure DevOps
PVS-Studio in the Clouds: Azure DevOps
 

Kürzlich hochgeladen

08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 

Kürzlich hochgeladen (20)

08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 

How to add a new diagnostic rule into PVS-Studio?

  • 1. How to add a new diagnostic rule into PVS-Studio? Days from developers' life... Author: Evgeniy Ryzhkov Date: 14.09.2011 We are often asked the question how one can add one's own diagnostic rule into our static analyzer PVS-Studio. And we always answer that it is very simple: "You just need to write us a letter with your request and we will add this rule into the analyzer". This interface of adding new rules is convenient to users. This is the best and most convenient interface actually. It's not so easy to do it on your own as it may seem. In this post, I will show you the bottom of the iceberg implied in the words "we have added this simple rule". OK, let's be honest. PVS-Studio doesn't have a mechanism to enable users to write their own rules. This is, first of all, an architectural restriction. Some users may get upset because of it. But actually such people don't understand what they want because the mechanism of adding a new rule is rather complicated in practice. For instance, there is rule V536: Be advised that the utilized constant value is represented by an octal form. It is intended for detecting errors like the following one found in Miranda IM: static const struct _tag_cpltbl { unsigned cp; const char* mimecp; } cptbl[] = {
  • 2. { 037, "IBM037" }, // IBM EBCDIC US-Canada { 437, "IBM437" }, // OEM United States { 500, "IBM500" }, // IBM EBCDIC International { 708, "ASMO-708" }, // Arabic (ASMO 708) ... } Value 037 is written here instead of 37 just for the sake of alignment, although it is really number 31 in decimal form because 037 is an octal number according to the C++ rules. The error is rather simple for diagnosis, isn't it? Indeed: search for a constant, check if it begins with 0 and generate a diagnostic message. This diagnostic rule can be implemented in an hour. And a static analyzer's user (a skilled programmer but novice regarding development of such tools) is theoretically ready to make this rule himself/herself. But he/she will fail, or rather he/she will succeed but the result won't satisfy him/her because of too many false reports. We (PVS-Studio's developers) do it in the following way. Before starting to implement a rule, we at first write unit-tests for it where we add special marks for those lines that should trigger the rule and leave those lines that shouldn't. Then we write the first version of the code for diagnostics. As this new rule passes successfully all the unit-tests written particularly for it, we run full unit-tests to make sure that the new rule hasn't broken anything. The full unit-tests are run for several tens of minutes. Surely, you rarely manage "not to break anything" at the first time, so iterations are run again and again until the unit-tests are passed completely. The unit-tests are followed by the next step when we run the rule on real projects. We have about 70 projects in our test base (for 2011); these are famous and not very famous open source projects we test our analyzer on. We have a self-made launcher that opens projects in Visual Studio, launches the analyzer, saves the log, compares it to the master copy and shows the differences (messages disappeared, messages added and messages changed). One cycle of running all the projects for Visual Studio 2005 takes 4 hours on a computer with four cores and 8 Gbytes of memory, the analysis process being parallelized. Judging by the analysis results we can see how well the new rule is integrated. Usually we start to study the differences without waiting for analysis to complete because sometimes it's clear if something has gone wrong. But we may stop running tests even if everything works well and nothing is broken. It happens when there are too many false reports. If we see that a diagnostic rule is triggered too often, we make exceptions from this rule. Let's return to the search of incorrect octal numbers, the V536 rule. It has quite a number of exceptions. These are the situations when the warning is not generated: 1. The constant's value is below 8. 2. The number is defined through #define. 3. There are more than two octal numbers in one block (the rule was triggered more than twice), except for 0 and char type. 4. This number is used as an argument in functions: _open, mknod, open, wxMkdir, wxFileName::Mkdir, wxFileName::AppendDir, chmod.
  • 3. 5. This is a nested block of numbers. We check only the top one. For example: { {090}, {091}, {092}, {093} }. 6. The number is <= 0777 and acts as a part of a statement that contains words: file, File. 7. The number is an argument of a function one parameter of which contains the string "%o". Sometimes you cannot predict what exceptions there will be before running the rule on real projects. That's why the iteration "making an exception - running tests - analyzing results" may be repeated many times. Then, when all the exceptions are implemented and the number of false reports is tolerable, the tests' results (saved log-files of detected errors) are defined as a master copy. After that we run the same tests for other versions of Visual Studio. At present (in 2011), we have support for the three versions of Visual Studio: Visual Studio 2005/2008/2010. Tests in them run for 4/4.5/5 hours correspondingly, which makes the total time of 14-15 hours. It may appear that one and the same code causes a different number of diagnostic messages depending on Visual Studio's version. Usually it is determined by differences in header files of different SDKs. We try to eliminate the differences so that the tests run in all the versions of Visual Studio give identical results. Why is it so important to run all those tiresome tests? Is it only because of false reports? No, it's not only because of them. The point is that you can't take account of all the possible types of constructs while making a rule and write it so that it works correctly and moreover doesn't cause the analyzer to crash. This is what we need all those numerous tests for! What strange constructs do we mean? Here you are a couple of examples. Do you know that you may define a variable, for instance, in a way like this: int const unsigned static a22 = 0; Do you know that __int64 can be a variable's name? int __identifier(__int64); Do you know that braces are not necessary after switch? switch (0) if (X) Foo(); Even if you do not use such constructs in code yourself, there can be some in libraries you use. Finally, having passed all the tests, we may start working on documentation. For each diagnostic rule we write a help entry in Russian and English. Translation to English also takes some time. Now that the code is debugged, the tests are passed and documentation is written, the rule appears in the next release of PVS-Studio. So, the development cycle of a new diagnostic rule can be briefly represented as follows: 1. An idea of a new rule: either we invent it ourselves, or our users tell us, or we manage to spot it somewhere. 2. Formulating the rule. 3. Implementation of the rule.
  • 4. 4. Testing of every kind. 5. Analyzing the results. Variants are possible: a. Improving the rule's formulation, adding exceptions, going to step 2. b. Rule's implementation is considered established; we may write and translate the documentation. The total time of implementing one rule is several days. Usually it is from 2 to 5 days. Naturally, we have enough experience now and can predict what exceptions from a rule we may need and implement them right away. We also manage to reduce the number of test runs. But anyway, the key idea is that only running tests you can formulate the exceptions and eliminate most of the false reports. Now let's return to the question of how users can add rules themselves. As you can see, there are two reasons why it's difficult to do: they don't have enough skill to make out good exceptions and a good test base to run the rule on it. A question arises: "Then why are there means of creating user-own rules in some famous and respectable static analyzers?" Perhaps, we can't do it? Partly because of this, yes. But the mechanism of creating user-own rules serves only one purpose - to compose "rival-comparison matrixes": Well, users will ask, won't we be enabled to create our rules in PVS-Studio some day? Some day we will manage to implement this mechanism and not to lose in comparison matrixes :-). But for now, all who want to create a new rule will receive a link to this post with the words: "We have the best interface for adding user rules - just write us a letter!"