Static code analysis for verification of the
64-bit applications
Authors: Andrey Karpov, Evgeniy Ryzhkov

Date: 22.04.2007


Abstract
The arrival of 64-bit processors on the PC market poses a problem that developers have to solve:
old 32-bit applications must be ported to the new platform. After such a migration an application
may behave incorrectly. This article discusses the development and use of a static code analyzer
for checking the correctness of such applications. It considers some problems that emerge in
applications after recompilation for 64-bit systems, as well as the rules by which the code is
checked.

This article contains various examples of 64-bit errors. However, we have learned of many more
examples and types of errors since we started writing the article, and they are not included here.
Please see the article "A Collection of Examples of 64-bit Errors in Real Programs", which covers
the defects in 64-bit programs known to us most thoroughly. We also recommend studying the course
"Lessons on development of 64-bit C/C++ applications", where we describe the methodology of
creating correct 64-bit code and searching for all types of defects using the Viva64 code analyzer.


1. Introduction
Mass production of 64-bit processors and their wide adoption have made it necessary for developers
to produce 64-bit versions of their programs. Applications must be recompiled for 64-bit
architectures so that users get the real advantages of the new processors. In theory, this process
should cause no problems. In practice, after recompilation an application often does not work the
way it is supposed to. The failures range from corrupted data files to a broken help system. The
cause is the change in the sizes of the base data types on 64-bit processors or, more precisely,
the change in the ratios between type sizes. That is why the main migration problems appear in
applications written in languages such as C or C++. In languages with a strictly structured type
system (for example, the .NET Framework languages) such problems, as a rule, do not arise.

So why are these languages in particular affected? The point is that even high-level constructs
and C++ libraries are ultimately implemented in terms of low-level data types, such as a pointer
or a machine word. When the architecture changes and these data types change with it, the behavior
of the program may change as well.

To be sure that a program is correct on the new platform, one would have to review all of its code
manually. However, for a real commercial application such a review is impractical because of the
sheer size of the code.


2. Examples of problems arising when code is ported to 64-bit platforms
Here are some examples that illustrate new errors appearing in an application after its code is
migrated to a 64-bit platform. Other examples can be found in other articles [1, 2].

When the amount of memory needed for an array was computed, a constant element size was used. On a
64-bit system the size of the type changed, but the code remained the same:

size_t ArraySize = N * 4;

intptr_t *Array = (intptr_t *)malloc(ArraySize);
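
One possible correction, shown here only as a sketch, is to let the compiler supply the element
size through sizeof instead of the literal 4. The helper names ArrayBytes and AllocArray are
hypothetical and are introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: number of bytes needed for n elements of intptr_t.
// sizeof(intptr_t) is typically 4 on 32-bit targets and 8 on 64-bit ones,
// so the result follows the platform instead of a hard-coded 4.
std::size_t ArrayBytes(std::size_t n) {
    return n * sizeof(std::intptr_t);
}

// Hypothetical helper: allocates a zero-initialized array of n intptr_t values.
// calloc receives the element count and the element size separately.
std::intptr_t *AllocArray(std::size_t n) {
    return static_cast<std::intptr_t *>(std::calloc(n, sizeof(std::intptr_t)));
}
```

The caller releases the memory with std::free, as with any calloc allocation.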

Some function returned the value -1 of type size_t if there was an error. The result was checked
in the following way:

size_t result = func();

if (result == 0xffffffffu) {

// error

}

On a 64-bit system the value -1 of this type differs from 0xffffffff, so the check does not
work.
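
A width-independent form of the check can be sketched as follows. The names kError, FindOrFail and
IsError are assumptions made for the illustration; FindOrFail merely stands in for func() from the
text.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical error marker: the value -1 converted to size_t,
// that is, all bits set at whatever width size_t has on the target.
const std::size_t kError = static_cast<std::size_t>(-1);

// Hypothetical stand-in for func(): returns kError on failure.
std::size_t FindOrFail(bool fail) {
    return fail ? kError : 42;
}

// Compares against the full-width marker rather than the literal 0xffffffffu.
bool IsError(std::size_t result) {
    return result == kError;
}
```

Because the marker is derived from the type itself, the same comparison holds on both 32-bit and
64-bit systems.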

Pointer arithmetic is a permanent source of problems. In 64-bit applications, new problems are
added to the existing ones. Consider the example:

unsigned a16, b16, c16;

char *pointer;

...

pointer += a16 * b16 * c16;

As we can see, the pointer can never be advanced by more than 4 gigabytes at a time, because the
product is computed in the 32-bit unsigned type. Modern compilers do not warn about this, and in
the future it may prevent programs from working. There are many more examples of potentially
dangerous code.
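
The difference between computing the offset in unsigned and in a memsize type can be sketched with
two small functions. OffsetNarrow and OffsetWide are hypothetical names; the sketch assumes that
unsigned is 32 bits wide, as on the platforms discussed here.

```cpp
#include <cassert>
#include <cstddef>

// Offset computed as in the example above: entirely in unsigned.
// The product wraps modulo 2^N, where N is the width of unsigned (commonly 32).
unsigned OffsetNarrow(unsigned a, unsigned b, unsigned c) {
    return a * b * c;
}

// Offset computed in size_t: the first operand is widened before multiplying,
// so the whole product is evaluated at the width of size_t.
std::size_t OffsetWide(unsigned a, unsigned b, unsigned c) {
    return static_cast<std::size_t>(a) * b * c;
}
```

With a = b = c = 2000 the mathematical product is 8,000,000,000, which OffsetWide can return on a
64-bit system, while OffsetNarrow returns the value reduced modulo 2^32.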

All these and many other errors were discovered in real applications during migration to the
64-bit platform.


3. The review of the existing solutions
There are different approaches to ensuring the correctness of application code. Let us list the
most widespread ones: unit testing, dynamic code analysis (performed while the application is
running), and static code analysis (analysis of the source code). No one can claim that one of
these kinds of testing is better than the others; rather, each approach supports a different
aspect of application quality.

Unit tests are meant for quick checking of small sections of code, for instance, single functions
and classes [3]. Their distinguishing feature is that they run quickly and can therefore be run
often. This leads to two limitations. First, the tests have to be written. Second, testing large
amounts of memory (for example, more than two gigabytes) takes a long time, so it is impractical
because unit tests must run fast.

Dynamic code analyzers (the best representative of which is Compuware BoundsChecker) are meant to
find errors in an application while it is running. This principle of operation determines the main
disadvantage of a dynamic analyzer: to make sure the program is correct, every possible code
branch has to be executed. For a real program this may be difficult. This does not mean that
dynamic analysis is useless, though: it can detect errors that depend on the user's actions and
cannot be determined from the application code alone.

Static code analyzers (for instance, Gimpel Software PC-lint and Parasoft C++test) are meant for
comprehensive code quality assurance and contain several hundred analysis rules [4]. Some of these
rules check the correctness of 64-bit applications. However, they are general-purpose analyzers
and were not designed specifically for this task, so they are not always well suited to ensuring
the quality of 64-bit applications. Another serious disadvantage is their orientation toward the
data model used in Unix systems (LP64), while the data model used in Windows systems (LLP64) is
quite different. That is why static analyzers can be used to check 64-bit Windows applications
only after non-obvious additional configuration.

Special diagnostic facilities for potentially incorrect code (for instance, the /Wp64 switch of
the Microsoft Visual C++ compiler) can be regarded as an additional level of code checking.
However, this switch tracks only the most obviously incorrect constructs and misses many other
dangerous operations.

The question arises: "Is it really necessary to check the code while migrating to 64-bit systems
if there are only a few such errors in the application?" We believe that this checking is
necessary, if only because large companies (such as IBM and Hewlett-Packard) have published
articles [2] on their sites devoted to the errors that appear when code is ported.


4. The Rules of the Code Correctness Analysis
We have formulated 10 rules for finding C++ constructs that are dangerous from the point of view
of migrating code to a 64-bit system.

In the rules we use a specially introduced term, memsize type. By this we mean any simple integer
type that can store a pointer and that changes its size when the platform word size changes from
32 to 64 bits. Examples of memsize types are size_t, ptrdiff_t, all pointers, intptr_t, INT_PTR,
and DWORD_PTR.
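
The defining property of a memsize type, having the same size as a pointer, can be expressed as a
small compile-time check. This is only a sketch: IsPointerSized is a hypothetical helper, and it
assumes a mainstream platform with a flat address space, where size_t, ptrdiff_t and intptr_t are
all pointer-sized.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when T has the same size as a data pointer
// on the current target.
template <typename T>
constexpr bool IsPointerSized() {
    return sizeof(T) == sizeof(void *);
}
```

For int or unsigned the result depends on the data model: it is true on typical 32-bit systems and
false under LP64 and LLP64, which is exactly the change in type size ratio the rules below target.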

Now let's list the rules themselves and give some examples of their application.

RULE 1.
Implicit and explicit conversions of 32-bit integer types to memsize types should be considered
dangerous:

unsigned a;
size_t b = a;

array[a] = 1;

The exceptions are:

1) The converted 32-bit integer value is the result of an expression whose value requires fewer
than 32 bits to represent:

unsigned short a;

unsigned char b;

size_t c = a * b;

At the same time, the expression must not consist solely of numeric literals:

size_t a = 100 * 100 * 100;

2) The converted 32-bit type is represented by a numeric literal:

size_t a = 1;

size_t b = 'G';

RULE 2.
Implicit and explicit conversions of memsize types to 32-bit integer types should be considered
dangerous:

size_t a;

unsigned b = a;

An exception: the converted size_t value is the result of the sizeof() operator:

int a = sizeof(float);
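
Where a memsize value really has to be stored in a 32-bit variable, the conversion can be guarded
by a range check. The following is a minimal sketch; CheckedNarrow is a hypothetical helper, not a
standard function.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <stdexcept>

// Hypothetical helper: converts size_t to unsigned and throws
// instead of silently truncating a value that does not fit.
unsigned CheckedNarrow(std::size_t value) {
    if (value > std::numeric_limits<unsigned>::max()) {
        throw std::range_error("value does not fit into unsigned");
    }
    return static_cast<unsigned>(value);
}
```

On a 32-bit system, where size_t and unsigned have the same width, the check can never fail; on a
64-bit system it turns silent truncation into a visible error.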

RULE 3.
A virtual function should also be considered dangerous if it meets the following conditions:

a) The function is declared both in the base class and in the derived class.

b) The types of the function arguments do not coincide, but they are equivalent on a 32-bit system
(for example, unsigned and size_t) and are not equivalent on a 64-bit one.

class Base {

   virtual void foo(size_t);

};

class Derive : public Base {

   virtual void foo(unsigned);

};
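
A corrected pair of classes might look like the sketch below. It assumes a compiler with C++11
support, since the override specifier is newer than this article; the return values and the
CallThroughBase function are added only to make the dispatch observable.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

class Base {
public:
    virtual ~Base() = default;
    virtual std::string foo(std::size_t) { return "Base"; }
};

// The parameter type matches the base declaration exactly, and `override`
// makes the compiler reject the class if the signatures ever diverge
// (for example, if the parameter is changed to unsigned).
class Derived : public Base {
public:
    std::string foo(std::size_t) override { return "Derived"; }
};

// Calls foo through a reference to the base class, as client code would.
std::string CallThroughBase(Base &object) {
    return object.foo(0);
}
```

With foo(unsigned) in the derived class, a 64-bit build would declare a new, unrelated function
and CallThroughBase would reach Base::foo instead.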

RULE 4.
Calls of overloaded functions with an argument of a memsize type should be considered dangerous
when the function is overloaded for both 32-bit and 64-bit data types:

void WriteValue(__int32);

void WriteValue(__int64);

...

ptrdiff_t value;

WriteValue(value);
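
One way to remove the dependence on the platform is to choose the overload explicitly. In this
sketch the standard types std::int32_t and std::int64_t replace __int32 and __int64, the overloads
return the number of bytes they would write, and WriteOffset is a hypothetical wrapper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the WriteValue pair: each overload reports
// how many bytes it would write.
std::size_t WriteValue(std::int32_t) { return 4; }
std::size_t WriteValue(std::int64_t) { return 8; }

// Fixes the stored width at 64 bits regardless of how wide ptrdiff_t is
// on the current platform.
std::size_t WriteOffset(std::ptrdiff_t value) {
    return WriteValue(static_cast<std::int64_t>(value));
}
```

The explicit cast means the data format no longer changes when the program is rebuilt for another
bitness.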

RULE 5.
The explicit conversion of one pointer type to another should be considered dangerous if one of
them points to a 32-/64-bit type and the other points to a memsize type:

int *array;

size_t *sizetPtr = (size_t *)(array);
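
Instead of reinterpreting the array through a pointer cast, the values can be converted element by
element. This is a minimal sketch with a hypothetical helper ToSizeT; it assumes the source values
are non-negative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: converts an int array to size_t values one by one.
// Each element is converted individually, so the code does not depend on
// int and size_t having the same size.
std::vector<std::size_t> ToSizeT(const int *array, std::size_t count) {
    std::vector<std::size_t> result(count);
    for (std::size_t i = 0; i < count; ++i) {
        result[i] = static_cast<std::size_t>(array[i]);
    }
    return result;
}
```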

RULE 6.
Explicit and implicit conversions of a memsize type to double and vice versa should be considered
dangerous:

size_t a;

double b = a;
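
The reason is precision: a 64-bit integer has more significant bits than a double can hold. The
sketch below assumes that double is an IEEE 754 binary64 type with a 53-bit significand and uses
std::uint64_t so that it behaves the same on any platform; the names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Largest N such that every integer in [0, N] is exactly representable
// in an IEEE 754 double (assumed here to have a 53-bit significand).
const std::uint64_t kMaxExactInDouble = std::uint64_t{1} << 53;

// Conservative test: returns true only when the conversion to double
// is guaranteed to be exact.
bool IsExactInDouble(std::uint64_t value) {
    return value <= kMaxExactInDouble;
}
```

For example, 2^53 + 1 and 2^53 convert to the same double value, so a round trip through double
can change a large size or offset.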

RULE 7.
Passing a memsize type to a function with a variable number of arguments should be considered
dangerous:

size_t a;

printf("%u", a);
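
For size_t the C99 and C++11 standards provide the %zu conversion, which matches the type at any
width. The sketch below uses it in a hypothetical helper FormatSize; compilers from the time of
this article, including older versions of Microsoft Visual C++, may not support %zu.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a size_t with %zu, the conversion
// specifier that corresponds to size_t on every platform.
std::string FormatSize(std::size_t value) {
    char buffer[32];
    std::snprintf(buffer, sizeof(buffer), "%zu", value);
    return std::string(buffer);
}
```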

RULE 8.
The use of magic constants (4, 32, 0x7fffffff, 0x80000000, 0xffffffff) should be considered
dangerous:

size_t values[ARRAY_SIZE];

memset(values, 0, ARRAY_SIZE * 4);
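
A version without the magic constant can take the byte count from sizeof. In this sketch the value
of ARRAY_SIZE and the function ClearAndVerify are assumptions made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

const std::size_t ARRAY_SIZE = 16;  // hypothetical size for the illustration

// Fills the array with a non-zero pattern, then clears it using sizeof,
// so the byte count follows the real element size.
bool ClearAndVerify() {
    std::size_t values[ARRAY_SIZE];
    std::memset(values, 0xFF, sizeof(values));
    std::memset(values, 0, sizeof(values));  // ARRAY_SIZE * sizeof(size_t) bytes
    for (std::size_t i = 0; i < ARRAY_SIZE; ++i) {
        if (values[i] != 0) {
            return false;
        }
    }
    return true;
}
```

With ARRAY_SIZE * 4 as the byte count, only half of the array would be cleared on a system where
size_t is 8 bytes wide.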

RULE 9.
The presence of memsize-type members in unions should be considered dangerous:

union PtrNumUnion {

   char *m_p;

   unsigned m_n;
} u;

...

u.m_p = str;

u.m_n += delta;
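
If a pointer really has to be manipulated as a number, an integer type of pointer width avoids the
truncation. The sketch below replaces the union with explicit conversions through std::uintptr_t;
AdvanceViaInteger is a hypothetical name, and the result of such conversions is
implementation-defined, so the sketch assumes a platform with a flat address space.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: advances a pointer by delta bytes through an integer
// representation. uintptr_t is wide enough to hold a pointer, unlike
// unsigned on a 64-bit system.
char *AdvanceViaInteger(char *p, std::uintptr_t delta) {
    std::uintptr_t n = reinterpret_cast<std::uintptr_t>(p);
    n += delta;
    return reinterpret_cast<char *>(n);
}
```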

RULE 10.
Throwing and handling exceptions with the use of memsize types should be considered dangerous:

char *p1, *p2;

try {

    throw (p1 - p2);

}

catch (int) {

    ...

}
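
The handler matches again once it names the type that is actually thrown. The difference of two
pointers has the type ptrdiff_t, so a sketch of the corrected code, with a hypothetical wrapper
function, could be:

```cpp
#include <cassert>
#include <cstddef>

// Throws the distance between two pointers (type ptrdiff_t) and catches it
// by the same type, so the handler matches on both 32-bit and 64-bit systems.
std::ptrdiff_t ThrowAndCatchDistance(const char *p1, const char *p2) {
    std::ptrdiff_t result = 0;
    try {
        throw (p1 - p2);
    } catch (std::ptrdiff_t distance) {
        result = distance;
    }
    return result;
}
```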

Note that rule 1 covers type conversion not only in assignments but also in function calls, array
indexing, and pointer arithmetic. These rules (the first as well as the others) describe a much
larger class of errors than the given examples cover. In other words, the examples only illustrate
some particular situations in which the rules apply.

The rules presented here are implemented in the static code analyzer Viva64. The principle of its
operation is described in the next section.


5. Analyzer architecture
The work of the analyzer consists of several stages, some of which are typical of ordinary C++
compilers (picture 1).




Picture 1. Analyzer architecture.

The input of the analyzer is a file with source code, and its output is a report about potential
code errors (with line numbers attached). The stages of the analyzer's work are the following:
preprocessing, parsing, and the analysis itself.

At the preprocessing stage, the files included by means of the #include directive are inserted,
and the conditional compilation directives (#ifdef/#endif) are processed.

Parsing a file produces an abstract syntax tree that carries the information necessary for the
subsequent analysis. Let's consider a simple example:

int A, B;

ptrdiff_t C;

C = B * A;

There is a potential problem in this code related to the different data types. Variable C can
never receive a value below -2 gigabytes or above 2 gigabytes, because the product B * A is
computed in the 32-bit type int, and such a situation may be incorrect. The analyzer must report
that the line "C = B * A" contains a potentially incorrect construct. There are several ways to
correct this code. If variables B and A cannot hold values outside the range from -2 to 2
gigabytes, but variable C can, the expression should be written in the following way:

C = (ptrdiff_t)(B) * (ptrdiff_t)(A);

But if variables A and B themselves can hold large values on a 64-bit system, their type should be
changed to ptrdiff_t:

ptrdiff_t A;

ptrdiff_t B;

ptrdiff_t C;

C = B * A;
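
The effect of widening the operands before the multiplication can be sketched with a small
function. ProductWide is a hypothetical name; the sketch follows the first correction, where the
operands stay int and are cast to ptrdiff_t inside the expression.

```cpp
#include <cassert>
#include <cstddef>

// Both operands are converted to ptrdiff_t before multiplying, so the
// product is evaluated at the width of ptrdiff_t rather than int.
std::ptrdiff_t ProductWide(int a, int b) {
    return static_cast<std::ptrdiff_t>(b) * static_cast<std::ptrdiff_t>(a);
}
```

On a 64-bit system ProductWide(100000, 100000) can return 10,000,000,000, a value that does not
fit into a 32-bit int and whose computation in int would overflow.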

Let's see how all this can be performed at the parsing stage.

First, an abstract syntax tree is constructed for the code (picture 2).

Picture 2. Abstract syntax tree.

Then, at the parsing stage, it is necessary to determine the types of the variables that
participate in the evaluation of the expression. For this purpose, auxiliary information gathered
while the tree was being constructed is used (the type storage module). This is shown in
picture 3.




Picture 3. Type information storage.

Once the types of all the variables participating in the expression are determined, the resulting
types of the subexpressions must be calculated. In the given example, the type of the result of
the intermediate expression "B * A" must be determined. This is done by the type evaluation
module, as shown in picture 4.

Picture 4. Expression type evaluation.

Then the resulting type of the whole expression is checked (the "=" operation in the given
example), and in the case of a type conflict the construct is marked as potentially dangerous.
There is such a conflict in the given example, because variable C is 64 bits wide (on a 64-bit
system) while the result of the expression "B * A" is 32 bits wide.

The other rules are analyzed in a similar way, because almost all of them concern checking the
types of one parameter or another.
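
The type check described above can be illustrated with a deliberately simplified model. This is
not the implementation used in Viva64: the sketch reduces a type to its width in bits, keeps the
variable types in a plain map that plays the role of the type storage, and all names in it are
hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>

// Hypothetical, highly simplified type storage: variable name -> width in bits.
using TypeTable = std::map<std::string, int>;

// Result width of a binary arithmetic expression "lhs op rhs": each operand
// is promoted to at least int (32 bits here), then the wider operand wins.
int BinaryResultBits(const TypeTable &types, const std::string &lhs,
                     const std::string &rhs) {
    const int intBits = 32;
    return std::max({intBits, types.at(lhs), types.at(rhs)});
}

// Flags "target = lhs op rhs" when the target is wider than the result
// of the expression on the right-hand side.
bool IsSuspiciousAssignment(const TypeTable &types, const std::string &target,
                            const std::string &lhs, const std::string &rhs) {
    return types.at(target) > BinaryResultBits(types, lhs, rhs);
}
```

For the example from this section, a table with A and B at 32 bits and C at 64 bits makes
IsSuspiciousAssignment report the line "C = B * A"; after A and B are changed to 64 bits the
report disappears.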


6. Results
Most of the code analysis methods described in this article are implemented in the commercial
static code analyzer Viva64. Using this analyzer on real projects has proved the value of checking
code while developing 64-bit applications: real code errors are discovered much more quickly with
the analyzer than by ordinary review of the source code.


References
    1. J. P. Mueller. "24 Considerations for Moving Your Application to a 64-bit Platform", DevX.com,
       June 30, 2006.
    2. Hewlett-Packard, "Transitioning C and C++ programs to the 64-bit data model".
    3. S. Sokolov, "Bulletproofing C++ Code", Dr. Dobb's Journal, January 09, 2007.
    4. S. Meyers, M. Klaus, "A First Look at C++ Program Analyzer", Dr. Dobb's Journal, Feb. Issue,
       1997.

Weitere Àhnliche Inhalte

Was ist angesagt?

Mining Fix Patterns for FindBugs Violations
Mining Fix Patterns for FindBugs ViolationsMining Fix Patterns for FindBugs Violations
Mining Fix Patterns for FindBugs Violations
Dongsun Kim
 
Binary code obfuscation through c++ template meta programming
Binary code obfuscation through c++ template meta programmingBinary code obfuscation through c++ template meta programming
Binary code obfuscation through c++ template meta programming
nong_dan
 

Was ist angesagt? (18)

Mining Fix Patterns for FindBugs Violations
Mining Fix Patterns for FindBugs ViolationsMining Fix Patterns for FindBugs Violations
Mining Fix Patterns for FindBugs Violations
 
64-bit Loki
64-bit Loki64-bit Loki
64-bit Loki
 
TBar: Revisiting Template-based Automated Program Repair
TBar: Revisiting Template-based Automated Program RepairTBar: Revisiting Template-based Automated Program Repair
TBar: Revisiting Template-based Automated Program Repair
 
20 issues of porting C++ code on the 64-bit platform
20 issues of porting C++ code on the 64-bit platform20 issues of porting C++ code on the 64-bit platform
20 issues of porting C++ code on the 64-bit platform
 
Program errors occurring while porting C++ code from 32-bit platforms on 64-b...
Program errors occurring while porting C++ code from 32-bit platforms on 64-b...Program errors occurring while porting C++ code from 32-bit platforms on 64-b...
Program errors occurring while porting C++ code from 32-bit platforms on 64-b...
 
C04701019027
C04701019027C04701019027
C04701019027
 
How to do code review and use analysis tool in software development
How to do code review and use analysis tool in software developmentHow to do code review and use analysis tool in software development
How to do code review and use analysis tool in software development
 
100% code coverage by static analysis - is it that good?
100% code coverage by static analysis - is it that good?100% code coverage by static analysis - is it that good?
100% code coverage by static analysis - is it that good?
 
Analysis of PascalABC.NET using SonarQube plugins: SonarC# and PVS-Studio
Analysis of PascalABC.NET using SonarQube plugins: SonarC# and PVS-StudioAnalysis of PascalABC.NET using SonarQube plugins: SonarC# and PVS-Studio
Analysis of PascalABC.NET using SonarQube plugins: SonarC# and PVS-Studio
 
Static code analysis and the new language standard C++0x
Static code analysis and the new language standard C++0xStatic code analysis and the new language standard C++0x
Static code analysis and the new language standard C++0x
 
Static code analysis and the new language standard C++0x
Static code analysis and the new language standard C++0xStatic code analysis and the new language standard C++0x
Static code analysis and the new language standard C++0x
 
Difficulties of comparing code analyzers, or don't forget about usability
Difficulties of comparing code analyzers, or don't forget about usabilityDifficulties of comparing code analyzers, or don't forget about usability
Difficulties of comparing code analyzers, or don't forget about usability
 
Difficulties of comparing code analyzers, or don't forget about usability
Difficulties of comparing code analyzers, or don't forget about usabilityDifficulties of comparing code analyzers, or don't forget about usability
Difficulties of comparing code analyzers, or don't forget about usability
 
Difficulties of comparing code analyzers, or don't forget about usability
Difficulties of comparing code analyzers, or don't forget about usabilityDifficulties of comparing code analyzers, or don't forget about usability
Difficulties of comparing code analyzers, or don't forget about usability
 
Project 2 the second project involves/tutorialoutlet
Project 2 the second project involves/tutorialoutletProject 2 the second project involves/tutorialoutlet
Project 2 the second project involves/tutorialoutlet
 
Binary code obfuscation through c++ template meta programming
Binary code obfuscation through c++ template meta programmingBinary code obfuscation through c++ template meta programming
Binary code obfuscation through c++ template meta programming
 
C and CPP Interview Questions
C and CPP Interview QuestionsC and CPP Interview Questions
C and CPP Interview Questions
 
Intro to c++
Intro to c++Intro to c++
Intro to c++
 

Andere mochten auch

Andere mochten auch (7)

The impact of innovation on travel and tourism industries (World Travel Marke...
The impact of innovation on travel and tourism industries (World Travel Marke...The impact of innovation on travel and tourism industries (World Travel Marke...
The impact of innovation on travel and tourism industries (World Travel Marke...
 
Open Source Creativity
Open Source CreativityOpen Source Creativity
Open Source Creativity
 
Reuters: Pictures of the Year 2016 (Part 2)
Reuters: Pictures of the Year 2016 (Part 2)Reuters: Pictures of the Year 2016 (Part 2)
Reuters: Pictures of the Year 2016 (Part 2)
 
What's Next in Growth? 2016
What's Next in Growth? 2016What's Next in Growth? 2016
What's Next in Growth? 2016
 
The Six Highest Performing B2B Blog Post Formats
The Six Highest Performing B2B Blog Post FormatsThe Six Highest Performing B2B Blog Post Formats
The Six Highest Performing B2B Blog Post Formats
 
The Outcome Economy
The Outcome EconomyThe Outcome Economy
The Outcome Economy
 
32 Ways a Digital Marketing Consultant Can Help Grow Your Business
32 Ways a Digital Marketing Consultant Can Help Grow Your Business32 Ways a Digital Marketing Consultant Can Help Grow Your Business
32 Ways a Digital Marketing Consultant Can Help Grow Your Business
 

Ähnlich wie Static code analysis for verification of the 64-bit applications

Diving into VS 2015 Day2
Diving into VS 2015 Day2Diving into VS 2015 Day2
Diving into VS 2015 Day2
Akhil Mittal
 

Ähnlich wie Static code analysis for verification of the 64-bit applications (20)

The forgotten problems of 64-bit programs development
The forgotten problems of 64-bit programs developmentThe forgotten problems of 64-bit programs development
The forgotten problems of 64-bit programs development
 
Comparison of analyzers' diagnostic possibilities at checking 64-bit code
Comparison of analyzers' diagnostic possibilities at checking 64-bit codeComparison of analyzers' diagnostic possibilities at checking 64-bit code
Comparison of analyzers' diagnostic possibilities at checking 64-bit code
 
Lesson 19. Pattern 11. Serialization and data interchange
Lesson 19. Pattern 11. Serialization and data interchangeLesson 19. Pattern 11. Serialization and data interchange
Lesson 19. Pattern 11. Serialization and data interchange
 
Safety of 64-bit code
Safety of 64-bit codeSafety of 64-bit code
Safety of 64-bit code
 
Development of a static code analyzer for detecting errors of porting program...
Development of a static code analyzer for detecting errors of porting program...Development of a static code analyzer for detecting errors of porting program...
Development of a static code analyzer for detecting errors of porting program...
 
The static code analysis rules for diagnosing potentially unsafe construction...
The static code analysis rules for diagnosing potentially unsafe construction...The static code analysis rules for diagnosing potentially unsafe construction...
The static code analysis rules for diagnosing potentially unsafe construction...
 
PVS-Studio advertisement - static analysis of C/C++ code
PVS-Studio advertisement - static analysis of C/C++ codePVS-Studio advertisement - static analysis of C/C++ code
PVS-Studio advertisement - static analysis of C/C++ code
 
Traps detection during migration of C and C++ code to 64-bit Windows
Traps detection during migration of C and C++ code to 64-bit WindowsTraps detection during migration of C and C++ code to 64-bit Windows
Traps detection during migration of C and C++ code to 64-bit Windows
 
PVS-Studio, a solution for developers of modern resource-intensive applications
PVS-Studio, a solution for developers of modern resource-intensive applicationsPVS-Studio, a solution for developers of modern resource-intensive applications
PVS-Studio, a solution for developers of modern resource-intensive applications
 
Lesson 6. Errors in 64-bit code
Lesson 6. Errors in 64-bit codeLesson 6. Errors in 64-bit code
Lesson 6. Errors in 64-bit code
 
C++11 and 64-bit Issues
C++11 and 64-bit IssuesC++11 and 64-bit Issues
C++11 and 64-bit Issues
 
20 issues of porting C++ code on the 64-bit platform
20 issues of porting C++ code on the 64-bit platform20 issues of porting C++ code on the 64-bit platform
20 issues of porting C++ code on the 64-bit platform
 
64 bits, Wp64, Visual Studio 2008, Viva64 and all the rest...
64 bits, Wp64, Visual Studio 2008, Viva64 and all the rest...64 bits, Wp64, Visual Studio 2008, Viva64 and all the rest...
64 bits, Wp64, Visual Studio 2008, Viva64 and all the rest...
 
Lesson 22. Pattern 14. Overloaded functions
Lesson 22. Pattern 14. Overloaded functionsLesson 22. Pattern 14. Overloaded functions
Lesson 22. Pattern 14. Overloaded functions
 
Comparing capabilities of PVS-Studio and Visual Studio 2010 in detecting defe...
Comparing capabilities of PVS-Studio and Visual Studio 2010 in detecting defe...Comparing capabilities of PVS-Studio and Visual Studio 2010 in detecting defe...
Comparing capabilities of PVS-Studio and Visual Studio 2010 in detecting defe...
 
Driver Development for Windows 64-bit
Driver Development for Windows 64-bitDriver Development for Windows 64-bit
Driver Development for Windows 64-bit
 
PVS-Studio advertisement - static analysis of C/C++ code
PVS-Studio advertisement - static analysis of C/C++ codePVS-Studio advertisement - static analysis of C/C++ code
PVS-Studio advertisement - static analysis of C/C++ code
 
Diving into VS 2015 Day2
Diving into VS 2015 Day2Diving into VS 2015 Day2
Diving into VS 2015 Day2
 
Undefined behavior is closer than you think
Undefined behavior is closer than you thinkUndefined behavior is closer than you think
Undefined behavior is closer than you think
 
Optimization of 64-bit programs
Optimization of 64-bit programsOptimization of 64-bit programs
Optimization of 64-bit programs
 

KĂŒrzlich hochgeladen

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
Christopher Logan Kennedy
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

KĂŒrzlich hochgeladen (20)

EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 

Static code analysis for verification of the 64-bit applications

  • 1. Static code analysis for verification of the 64-bit applications Authors: Andrey Karpov, Evgeniy Ryzhkov Date: 22.04.2007 Abstract The coming of 64-bit processors to the PC market causes a problem which the developers have to solve: the old 32-bit applications should be ported to the new platform. After such code migration an application may behave incorrectly. The article is elucidating question of development and appliance of static code analyzer for checking out of the correctness of such application. Some problems emerging in applications after recompiling in 64-bit systems are considered in this article as well as the rules according to which the code check up is performed. This article contains various examples of 64-bit errors. However, we have learnt much more examples and types of errors since we started writing the article and they were not included into it. Please see the article "A Collection of Examples of 64-bit Errors in Real Programs" that covers defects in 64-bit programs we know of most thoroughly. We also recommend you to study the course "Lessons on development of 64-bit C/C++ applications" where we describe the methodology of creating correct 64- bit code and searching for all types of defects using the Viva64 code analyzer. 1. Introduction Mass production of the 64-bit processors and the fact that they are widely spread led the developers to the necessity to develop 64-bit versions of their programs. The applications must be recompiled to support 64-bit architectures exactly for users to get real advantages of the new processors. Theoretically, this process must not contain any problems. But in practice after the recompiling an application often does not function in the way it is supposed to do. This may occur in different situations: from data file failure up to help system break down. The cause of such behavior is the alteration of the base type data size in 64-bit processors, to be more exact, in the alteration of type size ratio. 
That is why the main problems of code migration appear in applications developed in languages like C or C++. In languages with a strictly structured type system (for example, the .NET Framework languages) such problems, as a rule, do not arise.

So what is the problem with these particular languages? The point is that even the high-level constructs and C++ libraries are ultimately implemented in terms of low-level data types such as a pointer, a machine word, etc. When the architecture changes and these data types change with it, the behavior of the program may change as well.

To be sure that the program is correct on the new platform, it is necessary to review the whole code manually. However, a complete manual review of a real commercial application is impractical because of its huge size.
2. Examples of problems arising when code is ported to 64-bit platforms

Here are some examples that illustrate new errors appearing in an application after its code is migrated to a 64-bit platform. Other examples may be found in other articles [1, 2].

A constant type size was used when computing the amount of memory needed for an array. On the 64-bit system this size changed, but the code remained the same:

    size_t ArraySize = N * 4;
    intptr_t *Array = (intptr_t *)malloc(ArraySize);

A function returned the value -1 of type size_t if an error occurred. The result was checked in the following way:

    size_t result = func();
    if (result == 0xffffffffu) {
      // error
    }

On a 64-bit system the value -1 of this type differs from 0xffffffff, and the check does not work.

Pointer arithmetic is a permanent source of problems, and 64-bit applications add some new problems to the existing ones. Consider the example:

    unsigned a16, b16, c16;
    char *pointer;
    ...
    pointer += a16 * b16 * c16;

The product is computed in the 32-bit unsigned type, so the pointer can never be incremented by more than 4 gigabytes. Modern compilers do not diagnose this with a warning, and in the future it would make programs unable to work.

There are many more examples of potentially dangerous code. All these and many other errors were discovered in real applications during migration to the 64-bit platform.

3. Review of the existing solutions

There are different approaches to ensuring the correctness of application code. The most widespread ones are unit testing, dynamic code analysis (performed while the application is running) and static code analysis (analysis of the source code). None of these can be claimed to be better than the others; each approach supports a different aspect of application quality.
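Before looking at each approach in turn, the error-code check from section 2 can be made concrete. The following is a minimal sketch, not code from the article: `func_or_fail` is a hypothetical stand-in for the article's `func()`, and the two helper predicates are names introduced here only for illustration. It contrasts the fragile comparison with a comparison that does not depend on the width of size_t.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for func() from section 2:
// it signals an error by returning -1 converted to size_t.
std::size_t func_or_fail(bool fail) {
    return fail ? static_cast<std::size_t>(-1) : 0;
}

// The check from the article: correct only where size_t is 32 bits wide.
bool is_error_fragile(std::size_t result) {
    return result == 0xffffffffu;
}

// A width-independent check: compare with -1 converted to size_t itself.
bool is_error_portable(std::size_t result) {
    return result == static_cast<std::size_t>(-1);
}

// Usage sketch: the portable check recognizes the failure value on any platform.
void demo_error_check() {
    assert(is_error_portable(func_or_fail(true)));
    assert(!is_error_portable(func_or_fail(false)));
}
```

On a build where size_t is 64 bits wide, `is_error_fragile` no longer recognizes the failure value, which is exactly the defect described in section 2.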
Unit tests are meant for quick checking of small sections of code, for instance, single functions and classes [3]. Their distinctive feature is that they run quickly and can therefore be run often. This leads to two caveats. First, the tests have to be written. Second, testing large amounts of memory (for example, more than two gigabytes) takes a long time, so it is impractical because unit tests must run fast.

Dynamic code analyzers (the best representative of which is Compuware Bounds Checker) are meant to find errors in an application while it is running. This principle of operation determines the main disadvantage of a dynamic analyzer: to make sure the program is correct, it is necessary to execute all possible code branches. For a real program this may be difficult. This does not mean that dynamic analysis is useless, though. It allows discovering errors that depend on the actions of the user and cannot be determined from the application code.

Static code analyzers (for instance, Gimpel Software PC-lint and Parasoft C++test) are meant for comprehensive code quality assurance and contain several hundred analysis rules [4]. Some of these rules check the correctness of 64-bit applications. However, these are general-purpose analyzers that were not designed for this task, so using them to ensure the quality of 64-bit applications is not always appropriate. Another serious disadvantage is their orientation toward the data model used in Unix systems (LP64), while the data model used in Windows systems (LLP64) is quite different. That is why static analyzers can be used to check 64-bit Windows applications only after non-obvious additional configuration.
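The difference between the two data models can be observed directly from the sizes of the basic types. In LP64, long and pointers are 64 bits wide; in LLP64, long stays 32 bits and only pointers (and long long) are 64 bits. The sketch below assumes 8-bit bytes; the function names `data_model_name` and `long_holds_pointer` are introduced here for illustration and do not come from the article or from any analyzer.

```cpp
#include <cassert>
#include <cstring>

// Classifies the data model of the current build from the sizes of
// int, long and pointers (assumption: a byte is 8 bits).
const char *data_model_name() {
    const bool int_is_32 = sizeof(int) == 4;
    if (int_is_32 && sizeof(long) == 4 && sizeof(void *) == 4) return "ILP32";
    if (int_is_32 && sizeof(long) == 4 && sizeof(void *) == 8) return "LLP64";
    if (int_is_32 && sizeof(long) == 8 && sizeof(void *) == 8) return "LP64";
    return "other";
}

// True when long is wide enough to store a pointer.
// This holds under ILP32 and LP64 but not under LLP64.
bool long_holds_pointer() {
    return sizeof(long) >= sizeof(void *);
}

// Usage sketch: under LLP64 a pointer must not be stored in a long.
void demo_data_model() {
    if (std::strcmp(data_model_name(), "LLP64") == 0) {
        assert(!long_holds_pointer());
    }
}
```

A rule set tuned for LP64 would treat long as a pointer-sized type, which is wrong for a 64-bit Windows build; this is one reason why such an analyzer needs reconfiguration.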
Special diagnostics for potentially incorrect code built into the compiler (for instance, the /Wp64 switch of the Microsoft Visual C++ compiler) can be regarded as an additional level of code checking. However, this switch tracks only the most obviously incorrect constructs and misses many other dangerous operations.

A question arises: "Is it really necessary to check the code while migrating to 64-bit systems if there are only a few such errors in the application?" We believe that this checking is necessary, if only because large companies (such as IBM and Hewlett-Packard) have published articles [2] on their sites devoted to the errors that appear when code is ported.

4. The Rules of the Code Correctness Analysis

We have formulated 10 rules for finding C++ language constructs that are dangerous from the point of view of code migration to a 64-bit system.

The rules use a specially introduced memsize type. By this we mean any simple integer type that can store a pointer and that changes its size when the platform word size changes from 32 to 64 bits. Examples of memsize types are size_t, ptrdiff_t, all pointers, intptr_t, INT_PTR and DWORD_PTR.

Now let's list the rules themselves and give some examples of their application.

RULE 1. Implicit and explicit conversions of 32-bit integer types to memsize types should be considered dangerous:

    unsigned a;
    size_t b = a;
    array[a] = 1;

The exceptions are:

1) The converted 32-bit integer type is the result of an expression in which fewer than 32 bits are required to represent the value of the expression:

    unsigned short a;
    unsigned char b;
    size_t c = a * b;

At the same time the expression must not consist of only numerical literals:

    size_t a = 100 * 100 * 100;

2) The converted 32-bit type is represented by a numeric literal:

    size_t a = 1;
    size_t b = 'G';

RULE 2. Implicit and explicit conversions of memsize types to 32-bit integer types should be considered dangerous:

    size_t a;
    unsigned b = a;

An exception: the converted size_t is the result of the sizeof() operator:

    int a = sizeof(float);

RULE 3. A virtual function should also be considered dangerous if it meets the following conditions:

a) The function is declared in the base class and in the derived class.

b) The function argument types do not coincide, but they are equivalent on a 32-bit system (for example, unsigned and size_t) and are not equivalent on a 64-bit one.

    class Base {
      virtual void foo(size_t);
    };
    class Derive : public Base {
      virtual void foo(unsigned);
    };
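The effect described by rule 3 can be observed at run time. The sketch below is an illustration under stated assumptions rather than the article's code: foo returns a string instead of void so that the selected function is visible, and `call_through_base` and `expected_target` are helper names introduced here. When unsigned and size_t are the same type, Derive::foo overrides Base::foo; when they differ, it merely hides it, and a call through the base class silently reaches Base::foo.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

struct Base {
    virtual ~Base() = default;
    virtual std::string foo(std::size_t) { return "Base"; }
};

struct Derive : Base {
    // Overrides Base::foo only if unsigned and size_t are the same type;
    // otherwise this is an unrelated virtual function that hides Base::foo.
    virtual std::string foo(unsigned) { return "Derive"; }
};

// Calls foo the way client code written against the base class would.
std::string call_through_base(Base &object) {
    return object.foo(std::size_t{1});
}

// The function that is actually reached for a Derive object on this platform.
std::string expected_target() {
    return std::is_same<std::size_t, unsigned>::value ? "Derive" : "Base";
}

// Usage sketch: the observed target matches the type relationship.
void demo_rule3() {
    Derive d;
    assert(call_through_base(d) == expected_target());
}
```

Some compilers can report this situation (for example through a hidden-overload warning), and marking the derived function with the C++11 `override` specifier turns the silent mismatch into a compile error on platforms where the types differ.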
RULE 4. Calls of overloaded functions with an argument of a memsize type should be considered dangerous, provided that the function is overloaded for both 32-bit and 64-bit data types:

    void WriteValue(__int32);
    void WriteValue(__int64);
    ...
    ptrdiff_t value;
    WriteValue(value);

RULE 5. The explicit conversion of one pointer type to another should be considered dangerous if one of them refers to a 32/64-bit type and the other refers to a memsize type:

    int *array;
    size_t *sizetPtr = (size_t *)(array);

RULE 6. Explicit and implicit conversions of memsize types to double and vice versa should be considered dangerous:

    size_t a;
    double b = a;

RULE 7. Passing a memsize type to a function with a variable number of arguments should be considered dangerous:

    size_t a;
    printf("%u", a);

RULE 8. The use of magic constants (4, 32, 0x7fffffff, 0x80000000, 0xffffffff) should be considered dangerous:

    size_t values[ARRAY_SIZE];
    memset(values, 0, ARRAY_SIZE * 4);

RULE 9. The presence of memsize type members in unions should be considered dangerous:

    union PtrNumUnion {
      char *m_p;
      unsigned m_n;
    } u;
    ...
    u.m_p = str;
    u.m_n += delta;

RULE 10. Throwing and handling exceptions with the use of memsize types should be considered dangerous:

    char *p1, *p2;
    try {
      throw (p1 - p2);
    }
    catch (int) {
      ...
    }

It should be noted that rule 1 covers type conversion not only in assignments, but also when a function is called, when an array is indexed and in pointer arithmetic. These rules (the first as well as the others) describe a much larger class of errors than the given examples show. In other words, the examples only illustrate some particular situations in which the rules apply.

The rules presented here are implemented in the Viva64 static code analyzer. The principle of its operation is covered in the following section.

5. Analyzer architecture

The work of the analyzer consists of several stages, some of which are typical of ordinary C++ compilers (picture 1).

Picture 1. Analyzer architecture.
The input of the analyzer is a file with source code, and the result of its work is a report on potential code errors (with line numbers attached). The stages of the analyzer's work are the following: preprocessing, parsing and the analysis itself.

At the preprocessing stage the files included by means of the #include directive are inserted, and the conditional compilation directives (#ifdef/#endif) are processed.

Parsing a file produces an abstract syntax tree with the information necessary for the subsequent analysis. Let's take a simple example:

    int A, B;
    ptrdiff_t C;
    C = B * A;

There is a potential problem in this code related to the different data types. Because "B * A" is evaluated as a 32-bit int, variable C can never receive a value outside the 32-bit range (roughly plus or minus 2 gigabytes), and this may be incorrect. The analyzer must report that there is a potentially incorrect construct in the line "C = B * A".

There are several ways to correct this code. If the variables B and A themselves cannot hold values outside the 2-gigabyte range, but the variable C can, the expression should be written in the following way:

    C = (ptrdiff_t)(B) * (ptrdiff_t)(A);

But if the variables A and B can hold large values on a 64-bit system, they should be given the ptrdiff_t type:

    ptrdiff_t A;
    ptrdiff_t B;
    ptrdiff_t C;
    C = B * A;

Let's see how all this can be performed at the parsing stage. First, an abstract syntax tree is constructed for the code (picture 2).
Picture 2. Abstract syntax tree.

Then, still at the parsing stage, it is necessary to determine the types of the variables that participate in the evaluation of the expression. For this purpose, auxiliary information collected while the tree was being constructed is used (the type storage module). This is shown in picture 3.

Picture 3. Type information storage.

After the types of all the variables participating in the expression have been determined, it is necessary to calculate the resulting types of the subexpressions. In the given example it is necessary to determine the type of the result of the intermediate expression "B * A". This is done by the type evaluation module, as shown in picture 4.
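The type storage and type evaluation steps just described can be sketched as a toy model. This is an illustration under heavy simplifications, not the implementation of Viva64: there are only two types (a 32-bit integer and a memsize type), only variables, multiplication and assignment, and the exceptions listed under rule 1 are not modeled. All names (`Type`, `Node`, `TypeTable`, `evaluate_type`, `is_dangerous_assignment`) are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

enum class Type { Int32, Memsize };

// Type storage: maps variable names to their declared types.
using TypeTable = std::map<std::string, Type>;

// A node of a tiny abstract syntax tree.
struct Node {
    char op;                         // 'v' = variable, '*' = product, '=' = assignment
    std::string name;                // used only when op == 'v'
    std::unique_ptr<Node> lhs, rhs;  // used only for binary nodes
};

std::unique_ptr<Node> var(const std::string &name) {
    return std::unique_ptr<Node>(new Node{'v', name, nullptr, nullptr});
}

std::unique_ptr<Node> binary(char op, std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
    return std::unique_ptr<Node>(new Node{op, std::string(), std::move(l), std::move(r)});
}

// Type evaluation: computes the resulting type of a subexpression.
Type evaluate_type(const Node &node, const TypeTable &types) {
    if (node.op == 'v') return types.at(node.name);
    const Type l = evaluate_type(*node.lhs, types);
    const Type r = evaluate_type(*node.rhs, types);
    if (node.op == '=') return l;  // an assignment has the type of its target
    // A product is memsize if either operand is; otherwise it stays 32-bit.
    return (l == Type::Memsize || r == Type::Memsize) ? Type::Memsize : Type::Int32;
}

// Rule 1, simplified: a 32-bit value assigned to a memsize target is suspicious.
bool is_dangerous_assignment(const Node &node, const TypeTable &types) {
    assert(node.op == '=');
    return evaluate_type(*node.lhs, types) == Type::Memsize &&
           evaluate_type(*node.rhs, types) == Type::Int32;
}
```

For the tree of "C = B * A" with A and B declared as 32-bit integers and C as a memsize type, `is_dangerous_assignment` returns true. Once A or B is declared as a memsize type, the product is evaluated as memsize and the assignment is no longer flagged.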
Picture 4. Expression type evaluation.

Then the resulting type of the whole expression is checked for correctness (the "=" operation in the given example), and in the case of a type conflict the construct is marked as potentially dangerous. There is such a conflict in the given example, because the variable C is 64 bits in size (on a 64-bit system), while the result of the expression "B * A" is 32 bits in size.

The other rules are analyzed in a similar way, because almost all of them concern checking the types of one parameter or another.

6. Results

Most of the code analysis methods described in this article are implemented in the commercial static code analyzer Viva64. Using this analyzer on real projects has confirmed that checking the code while developing 64-bit applications is worthwhile: real code errors were discovered much faster with the analyzer than by ordinary review of the source code.

References

1. J. P. Mueller. "24 Considerations for Moving Your Application to a 64-bit Platform", DevX.com, June 30, 2006.
2. Hewlett-Packard, "Transitioning C and C++ programs to the 64-bit data model".
3. S. Sokolov, "Bulletproofing C++ Code", Dr. Dobb's Journal, January 09, 2007.
4. S. Meyers, M. Klaus, "A First Look at C++ Program Analyzer", Dr. Dobb's Journal, Feb. Issue, 1997.