Optimization of 64-bit programs
Author: Andrey Karpov

Date: 12.10.2008


Abstract
The article considers several ways to increase the performance of 64-bit Windows applications.


Introduction
People often ask about the performance of 64-bit solutions and how to increase it. This article examines
some of the debatable points and then gives recommendations on optimizing program code.


1. The result of porting to 64-bit systems
In a 64-bit environment, old 32-bit applications run thanks to the Wow64 subsystem. This subsystem
emulates a 32-bit environment by means of an additional layer between a 32-bit application and the
64-bit Windows API. In some places this layer is thin, in others it is thicker. For an average program the
performance loss caused by this layer is about 2%; for some programs it may be larger. 2% is certainly
not much, but we still have to take into account that 32-bit applications run a bit slower under a 64-bit
operating system than under a 32-bit one.

Compiling the code for 64 bits not only eliminates Wow64 but also increases performance. This is due to
architectural changes in microprocessors, such as the larger number of general-purpose registers. For an
average program the expected performance gain from a simple recompilation is 5-15%, although
everything depends on the application and its data types. For instance, Adobe claims that the new
64-bit "Photoshop CS4" is 12% faster than its 32-bit version.

Some programs that deal with large data arrays may gain a great deal of performance from the larger
address space. Being able to keep all the necessary data in RAM eliminates slow data swapping
operations. In this case the performance gain can be measured in multiples rather than in percent.

Consider the following example: Alfa Bank integrated an Itanium 2-based platform into its IT
infrastructure. The growth of the bank's investment business meant that the existing system could no
longer cope with the increasing workload: delays in serving users reached a critical level. Analysis
showed that the system's bottleneck was not processor performance but the limitation of the 32-bit
architecture in the memory subsystem, which does not allow efficient use of more than 4 GB of the
server's address space. The database itself was larger than 9 GB, and its intensive use put a critical load
on the input-output subsystem. Alfa Bank decided to purchase a cluster of two four-processor
Itanium 2-based servers with 12 GB of RAM. This decision provided the necessary level of system
performance and fault tolerance. According to company representatives, introducing the Itanium 2-based
servers resolved the problems and cut costs.


2. Program code optimization
Optimization can be considered at three levels: optimization of microprocessor instructions, code
optimization at the level of high-level languages, and algorithmic optimization (which takes into account
the peculiarities of 64-bit systems). The first is available when using development tools such as
assembler and is too specific to interest a wide audience. Those who are interested in this topic may
consult the "Software Optimization Guide for AMD64 Processors" [2], AMD's guide to optimizing
applications for the 64-bit architecture. Algorithmic optimization is unique to every task and is beyond
the scope of this article.

From the point of view of high-level languages such as C++, optimization for the 64-bit architecture
depends on choosing optimal data types. Using homogeneous 64-bit data types allows the optimizing
compiler to build simpler and more efficient code, because 32-bit and 64-bit values no longer need to be
converted into each other so often. This applies primarily to variables used as loop counters and array
indexes, and to variables that store sizes. Traditionally, types such as int, unsigned and long are used for
these purposes. On 64-bit Windows systems, which use the LLP64 [3] data model, these types remain
32-bit. In a number of cases this leads to less efficient code because of additional conversions. For
instance, to compute the address of an array element in 64-bit code, the 32-bit index must first be
extended to 64 bits.

Types such as ptrdiff_t and size_t are more effective, as they have the optimal size for representing
indexes and counters. On 32-bit systems they are 32-bit, on 64-bit systems they are 64-bit (see table 1).
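
These sizes can be checked directly with sizeof. The following is only a minimal sketch: the TypeSizes
structure and the query_type_sizes function are hypothetical names introduced for illustration and do not
come from the article. The static_assert reflects what mainstream platforms do, not a guarantee of the
C++ standard.

```cpp
#include <cstddef>

// Hypothetical helper type (not from the article): sizes, in bytes,
// of the types discussed above on the platform being compiled for.
struct TypeSizes
{
    std::size_t int_size;
    std::size_t long_size;
    std::size_t size_t_size;
    std::size_t ptrdiff_t_size;
    std::size_t pointer_size;
};

// Collects the sizes with sizeof; all values are compile-time constants.
constexpr TypeSizes query_type_sizes()
{
    return TypeSizes{ sizeof(int), sizeof(long), sizeof(std::size_t),
                      sizeof(std::ptrdiff_t), sizeof(void *) };
}

// On mainstream platforms size_t and ptrdiff_t are the unsigned and
// signed types of the same width (an assumption, not a language rule).
static_assert(sizeof(std::size_t) == sizeof(std::ptrdiff_t),
              "size_t and ptrdiff_t are expected to have the same width");
```

Under the LLP64 model described above, a 64-bit Windows build would be expected to report 4 bytes for
int and long and 8 bytes for size_t, ptrdiff_t and pointers, while a 32-bit build reports 4 bytes for all
five.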




        Table 1. Data type sizes in 32-bit and 64-bit versions of the Windows operating system.

Using ptrdiff_t, size_t and types derived from them can speed up program code by up to 30%. An
example of such optimization is studied in the article "Development of resource-intensive applications in
Visual C++ environment" [4]. An additional advantage is more reliable code: using 64-bit variables as
indexes avoids overflows when dealing with large arrays of several billion elements.
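
As an illustration, here is a small sketch of a loop that uses size_t as its index type, together with a
helper that shows what is left of a large element count once it is squeezed into 32 bits. Both function
names are hypothetical. std::uint32_t stands in for the unsigned type, which is 32-bit on Windows, so
that the result does not depend on the compiler.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Sums the array using size_t as the index type, so that on a 64-bit
// system the index already has the width used for address arithmetic.
long long sum_elements(const std::vector<int> &values)
{
    long long sum = 0;
    for (std::size_t i = 0; i != values.size(); ++i)
        sum += values[i];
    return sum;
}

// Shows what a 32-bit index can hold: converting an element count to a
// 32-bit unsigned value keeps only the low 32 bits.
std::uint32_t truncate_to_32_bits(std::uint64_t element_count)
{
    return static_cast<std::uint32_t>(element_count);
}
```

For example, a count of 5,000,000,000 elements becomes 705,032,704 after truncation, so a 32-bit
counter would wrap around long before reaching the end of such an array.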

Changing data types is not an easy task, and it is harder still to tell whether a change is really necessary.
We offer the Viva64 static code analyzer as a tool intended to simplify this process. Although it
specializes in finding errors in 64-bit code, following its recommendations on changing data types can
considerably increase code performance.


3. Memory usage decrease
After a program is compiled in 64-bit mode, it starts consuming more memory than its 32-bit version
did. Often this increase is almost imperceptible, but sometimes memory consumption doubles. The
reasons are the following:

    •   Larger memory footprint of certain objects, for instance pointers;
    •   Changed data alignment rules in structures;
    •   Increased stack memory consumption.

An increase in RAM consumption can often be tolerated. The advantage of 64-bit systems is precisely
that a large amount of memory is available. There is nothing wrong if a program took 300 MB on a
32-bit system with 2 GB of memory but takes 400 MB on a 64-bit system with 8 GB of memory. In
relative terms, on the 64-bit system this program takes about a third as much of the available physical
memory (roughly 5% instead of 15%). There is no sense in fighting this growth in memory consumption;
it is easier to add some memory.

But the growth in consumed memory has one disadvantage: it causes a loss of performance. Although
64-bit program code runs faster, fetching larger amounts of data from memory can cancel out all the
advantages and even reduce performance. Data transfer between memory and the microprocessor
(cache) is not a cheap operation.

Let us assume that we have a program which processes a large amount of text data (up to 400 MB). It
creates an array of pointers, each pointing to the next word in the processed text. Let the average word
length be 5 characters. Then the program will require about 80 million pointers. So the 32-bit version of
the program will require 400 MB + (80 million * 4 bytes) = 720 MB of memory, while the 64-bit version
will require 400 MB + (80 million * 8 bytes) = 1040 MB. This is a considerable increase which may
adversely affect program performance. And if there is no need to process gigabyte-sized texts, the extra
width of the pointers buys nothing. A simple and effective solution is to use indexes of the unsigned
type instead of pointers. In this case the memory consumed is again 720 MB.
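
A minimal sketch of this idea is given below. It stores a 32-bit offset for each word instead of a
pointer and repeats the arithmetic from the example. The function names are hypothetical, words are
assumed to be runs of characters separated by spaces, and megabytes are counted as millions of bytes,
as in the text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Returns the offset of the first character of every word in `text`.
// Words are taken to be runs of non-space characters (an assumption of
// this sketch). Offsets are 32-bit, which limits the text to 4 GB.
std::vector<std::uint32_t> build_word_offsets(const std::string &text)
{
    assert(text.size() <= UINT32_MAX);
    std::vector<std::uint32_t> offsets;
    bool at_word_start = true;
    for (std::size_t i = 0; i != text.size(); ++i)
    {
        if (text[i] == ' ')
            at_word_start = true;
        else if (at_word_start)
        {
            offsets.push_back(static_cast<std::uint32_t>(i));
            at_word_start = false;
        }
    }
    return offsets;
}

// Estimated memory in MB: the text itself plus one index entry per word.
// Millions of entries times bytes per entry gives millions of bytes.
std::size_t estimate_memory_mb(std::size_t text_mb,
                               std::size_t words_in_millions,
                               std::size_t bytes_per_entry)
{
    return text_mb + words_in_millions * bytes_per_entry;
}
```

With these definitions, estimate_memory_mb(400, 80, 4) gives 720 and estimate_memory_mb(400, 80, 8)
gives 1040, matching the figures above. Each entry of the offset array takes 4 bytes on both 32-bit and
64-bit systems.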

A considerable amount of memory can be wasted when data alignment rules change. Let us consider an
example:

struct MyStruct1
{
    char m_c;
    void *m_p;
    int m_i;
};

In a 32-bit program the size of this structure is 12 bytes, and in a 64-bit program it is 24 bytes, which is
wasteful. We can improve the situation by changing the order of the members as follows:

struct MyStruct2
{
    void *m_p;
    int m_i;
    char m_c;
};

The size of MyStruct2 is still 12 bytes in a 32-bit program, and in a 64-bit program it is only 16 bytes. At
the same time, MyStruct1 and MyStruct2 are equivalent in terms of data access efficiency. Picture 1
shows how the structure members are laid out in memory.

Picture 1.

It is not easy to give precise instructions on the order of members in structures. But the general
recommendation is this: members should be arranged in order of decreasing size.
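
The sizes quoted above can be checked with sizeof, and the amount of padding can be computed from
them. This is only a sketch: the constants kPayloadSize, kPadding1 and kPadding2 are hypothetical names
introduced here, and the exact values depend on the compiler's alignment rules.

```cpp
#include <cstddef>

// The two structures from the text.
struct MyStruct1
{
    char m_c;
    void *m_p;
    int m_i;
};

struct MyStruct2
{
    void *m_p;
    int m_i;
    char m_c;
};

// Both structures hold the same members; only the padding differs.
static_assert(sizeof(MyStruct2) <= sizeof(MyStruct1),
              "ordering members by decreasing size should not add padding");

// Bytes actually occupied by the members, without any padding.
constexpr std::size_t kPayloadSize =
    sizeof(char) + sizeof(void *) + sizeof(int);

// Padding in each structure: total size minus the size of the members.
constexpr std::size_t kPadding1 = sizeof(MyStruct1) - kPayloadSize;
constexpr std::size_t kPadding2 = sizeof(MyStruct2) - kPayloadSize;
```

With 8-byte pointers and 4-byte int the members occupy 13 bytes, so the 24-byte MyStruct1 carries 11
bytes of padding and the 16-byte MyStruct2 only 3. In a 32-bit build both structures would be expected to
carry 3 bytes of padding.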

The last point is the growth in stack memory consumption. Larger return addresses and data alignment
increase stack usage. Optimizing them makes no sense: a sensible developer would never create
megabyte-sized objects on the stack. Still, when porting a 32-bit program to a 64-bit system, remember
to adjust the stack size in the project settings; for instance, you can double it. By default, both a 32-bit
and a 64-bit application are given a 1 MB stack. It may turn out to be insufficient, so it makes sense to
play safe.
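
Both pieces of advice can be sketched in code. The pragma below is specific to Visual C++ and is one way
to pass the /STACK option to the linker from the source file; the 4 MB value is an arbitrary choice for
illustration. The count_zero_bytes function is a hypothetical example of keeping a large buffer on the
heap so that it does not depend on the stack size at all.

```cpp
#include <cstddef>
#include <vector>

#if defined(_MSC_VER)
// Visual C++ only: asks the linker to reserve a 4 MB stack, the same
// effect as setting the /STACK option in the project settings.
// Other compilers skip this block.
#pragma comment(linker, "/STACK:4194304")
#endif

// Keeps a large working buffer on the heap rather than on the stack,
// so its size does not depend on the stack reserve.
std::size_t count_zero_bytes(std::size_t buffer_size)
{
    std::vector<unsigned char> buffer(buffer_size); // elements start at 0
    std::size_t zeros = 0;
    for (unsigned char byte : buffer)
        if (byte == 0)
            ++zeros;
    return zeros;
}
```

A call such as count_zero_bytes(4 * 1024 * 1024) works with a 4 MB buffer without touching more than a
few bytes of stack, whereas a local array of the same size would need most of a stack of that size.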


Conclusion
The author hopes that this article will help in developing efficient 64-bit solutions and invites you to
visit www.viva64.com to learn more about 64-bit technologies. There you can find many articles devoted
to the development, testing and optimization of 64-bit applications. We wish you the best of luck in
developing your 64-bit projects.


References
  1. Valentin Sedykh. Russian 64 bit: let's dot all the "i"s.
     http://www.viva64.com/go.php?url=151
  2. Software Optimization Guide for AMD64 Processors.
     http://www.viva64.com/go.php?url=59
  3. Blog "The Old New Thing": "Why did the Win64 team choose the LLP64 model?"
     http://www.viva64.com/go.php?url=25
  4. Andrey Karpov, Evgeniy Ryzhkov. Development of Resource-intensive Applications in Visual C++.
     http://www.viva64.com/art-1-2-2014169752.html

Weitere ähnliche Inhalte

Ähnlich wie Optimization of 64-bit programs

Lesson 26. Optimization of 64-bit programs
Lesson 26. Optimization of 64-bit programsLesson 26. Optimization of 64-bit programs
Lesson 26. Optimization of 64-bit programsPVS-Studio
 
Seven Steps of Migrating a Program to a 64-bit System
Seven Steps of Migrating a Program to a 64-bit SystemSeven Steps of Migrating a Program to a 64-bit System
Seven Steps of Migrating a Program to a 64-bit SystemAndrey Karpov
 
Seven Steps of Migrating a Program to a 64-bit System
Seven Steps of Migrating a Program to a 64-bit SystemSeven Steps of Migrating a Program to a 64-bit System
Seven Steps of Migrating a Program to a 64-bit SystemPVS-Studio
 
Lesson 19. Pattern 11. Serialization and data interchange
Lesson 19. Pattern 11. Serialization and data interchangeLesson 19. Pattern 11. Serialization and data interchange
Lesson 19. Pattern 11. Serialization and data interchangePVS-Studio
 
Development of resource-intensive applications in Visual C++
Development of resource-intensive applications in Visual C++Development of resource-intensive applications in Visual C++
Development of resource-intensive applications in Visual C++PVS-Studio
 
Development of resource-intensive applications in Visual C++
Development of resource-intensive applications in Visual C++Development of resource-intensive applications in Visual C++
Development of resource-intensive applications in Visual C++Andrey Karpov
 
64 bit computing
64 bit computing64 bit computing
64 bit computingAnkita Nema
 
Driver Development for Windows 64-bit
Driver Development for Windows 64-bitDriver Development for Windows 64-bit
Driver Development for Windows 64-bitPVS-Studio
 
Static code analysis for verification of the 64-bit applications
Static code analysis for verification of the 64-bit applicationsStatic code analysis for verification of the 64-bit applications
Static code analysis for verification of the 64-bit applicationsPVS-Studio
 
The reasons why 64-bit programs require more stack memory
The reasons why 64-bit programs require more stack memoryThe reasons why 64-bit programs require more stack memory
The reasons why 64-bit programs require more stack memoryPVS-Studio
 
Design and Implementation of DMC for Memory Reliability Enhancement
Design and Implementation of DMC for Memory Reliability EnhancementDesign and Implementation of DMC for Memory Reliability Enhancement
Design and Implementation of DMC for Memory Reliability EnhancementIRJET Journal
 
Virtual Memory In Contemporary Microprocessors And 64-Bit Microprocessors Arc...
Virtual Memory In Contemporary Microprocessors And 64-Bit Microprocessors Arc...Virtual Memory In Contemporary Microprocessors And 64-Bit Microprocessors Arc...
Virtual Memory In Contemporary Microprocessors And 64-Bit Microprocessors Arc...Anurag Deb
 
Comparison of analyzers' diagnostic possibilities at checking 64-bit code
Comparison of analyzers' diagnostic possibilities at checking 64-bit codeComparison of analyzers' diagnostic possibilities at checking 64-bit code
Comparison of analyzers' diagnostic possibilities at checking 64-bit codePVS-Studio
 
USING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICS
USING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICSUSING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICS
USING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICSHCL Technologies
 
Cpu 64x architecture
Cpu 64x architectureCpu 64x architecture
Cpu 64x architectureAmmAr mobark
 
Interview with Anatoliy Kuznetsov, the author of BitMagic C++ library
Interview with Anatoliy Kuznetsov, the author of BitMagic C++ libraryInterview with Anatoliy Kuznetsov, the author of BitMagic C++ library
Interview with Anatoliy Kuznetsov, the author of BitMagic C++ libraryPVS-Studio
 
L'architettura di classe enterprise di nuova generazione - Massimo Brignoli
L'architettura di classe enterprise di nuova generazione - Massimo BrignoliL'architettura di classe enterprise di nuova generazione - Massimo Brignoli
L'architettura di classe enterprise di nuova generazione - Massimo BrignoliData Driven Innovation
 

Ähnlich wie Optimization of 64-bit programs (20)

Lesson 26. Optimization of 64-bit programs
Lesson 26. Optimization of 64-bit programsLesson 26. Optimization of 64-bit programs
Lesson 26. Optimization of 64-bit programs
 
Seven Steps of Migrating a Program to a 64-bit System
Seven Steps of Migrating a Program to a 64-bit SystemSeven Steps of Migrating a Program to a 64-bit System
Seven Steps of Migrating a Program to a 64-bit System
 
Seven Steps of Migrating a Program to a 64-bit System
Seven Steps of Migrating a Program to a 64-bit SystemSeven Steps of Migrating a Program to a 64-bit System
Seven Steps of Migrating a Program to a 64-bit System
 
Lesson 19. Pattern 11. Serialization and data interchange
Lesson 19. Pattern 11. Serialization and data interchangeLesson 19. Pattern 11. Serialization and data interchange
Lesson 19. Pattern 11. Serialization and data interchange
 
Development of resource-intensive applications in Visual C++
Development of resource-intensive applications in Visual C++Development of resource-intensive applications in Visual C++
Development of resource-intensive applications in Visual C++
 
Development of resource-intensive applications in Visual C++
Development of resource-intensive applications in Visual C++Development of resource-intensive applications in Visual C++
Development of resource-intensive applications in Visual C++
 
Operating system
Operating systemOperating system
Operating system
 
64 bit computing
64 bit computing64 bit computing
64 bit computing
 
Driver Development for Windows 64-bit
Driver Development for Windows 64-bitDriver Development for Windows 64-bit
Driver Development for Windows 64-bit
 
Static code analysis for verification of the 64-bit applications
Static code analysis for verification of the 64-bit applicationsStatic code analysis for verification of the 64-bit applications
Static code analysis for verification of the 64-bit applications
 
The reasons why 64-bit programs require more stack memory
The reasons why 64-bit programs require more stack memoryThe reasons why 64-bit programs require more stack memory
The reasons why 64-bit programs require more stack memory
 
Design and Implementation of DMC for Memory Reliability Enhancement
Design and Implementation of DMC for Memory Reliability EnhancementDesign and Implementation of DMC for Memory Reliability Enhancement
Design and Implementation of DMC for Memory Reliability Enhancement
 
Virtual Memory In Contemporary Microprocessors And 64-Bit Microprocessors Arc...
Virtual Memory In Contemporary Microprocessors And 64-Bit Microprocessors Arc...Virtual Memory In Contemporary Microprocessors And 64-Bit Microprocessors Arc...
Virtual Memory In Contemporary Microprocessors And 64-Bit Microprocessors Arc...
 
Comparison of analyzers' diagnostic possibilities at checking 64-bit code
Comparison of analyzers' diagnostic possibilities at checking 64-bit codeComparison of analyzers' diagnostic possibilities at checking 64-bit code
Comparison of analyzers' diagnostic possibilities at checking 64-bit code
 
USING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICS
USING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICSUSING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICS
USING FACTORY DESIGN PATTERNS IN MAP REDUCE DESIGN FOR BIG DATA ANALYTICS
 
64 bit arch
64 bit arch64 bit arch
64 bit arch
 
Cpu 64x architecture
Cpu 64x architectureCpu 64x architecture
Cpu 64x architecture
 
Higher Homework
Higher HomeworkHigher Homework
Higher Homework
 
Interview with Anatoliy Kuznetsov, the author of BitMagic C++ library
Interview with Anatoliy Kuznetsov, the author of BitMagic C++ libraryInterview with Anatoliy Kuznetsov, the author of BitMagic C++ library
Interview with Anatoliy Kuznetsov, the author of BitMagic C++ library
 
L'architettura di classe enterprise di nuova generazione - Massimo Brignoli
L'architettura di classe enterprise di nuova generazione - Massimo BrignoliL'architettura di classe enterprise di nuova generazione - Massimo Brignoli
L'architettura di classe enterprise di nuova generazione - Massimo Brignoli
 

Kürzlich hochgeladen

CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 

Kürzlich hochgeladen (20)

CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 

Optimization of 64-bit programs

  • 1. Optimization of 64-bit programs Author: Andrey Karpov Date: 12.10.2008 Abstract Some means of 64-bit Windows applications performance increase are considered in the article. Introduction People often have questions concerning 64-bit solutions performance and means of its increasing. Some questionable points are considered in this article and then some recommendations concerning program code optimization are given. 1. The result of porting to 64-bit systems In a 64-bit environment old 32-bit application run owing to Wow64 subsystem. This subsystem emulates 32-bit environment by means of an additional layer between a 32-bit application and 64-bit Windows API. In some localities this layer is thin, in others it's thicker. For an average program the productivity loss caused by this layer is about 2%. For some programs this value may be larger. 2% are certainly not much but still we have to take into account the fact that 32-bit applications function a bit slower under a 64-bit operation system than under a 32-bit one. Compiling of a 64-bit code not only eliminates Wow64 but also increases performance. It's related to architectural alterations in microprocessors, such as the increase in number of general-purpose registers. For an average program the expected performance growth caused by an ordinary compilation is 5-15%. But in this case everything depends upon the application and data types. For instance, Adobe Company claims that new 64-bit "Photoshop CS4" is 12% faster than its 32-bit version. Some programs dealing with large data arrays may greatly increase their performance when expanding address space. The ability to store all the necessary data in the random access memory eliminates slow operations of data swapping. In this case performance increase can be measured in times, not in percent rate. Here we can consider the following example: Alfa Bank has integrated Itanium 2-based platform into its IT infrastructure. 
The bank's investment growth resulted in the fact that the existing system became unable to cope with the increasing workload: users' service delays attained its deadline. Case analysis showed up that the system's bottleneck is not the processors' performance but the limitation of 32-bit architecture in a memory subsystem part that does not allow using efficiently more than 4 GB of the server's addressing space. The data base itself was larger than 9 GB. Its intensive usage resulted in the critical workload of input-output subsystem. Alfa Bank decided to purchase a cluster consisting of two four-processor Itanium2-based servers with 12GB of random access memory. This decision allowed to ensure the necessary level of system's performance and fault-tolerance. As explained by company representatives implementation of Itanium2-based servers allowed to terminate problems to cut costs.
  • 2. 2. Program code optimization We can consider optimization at three levels: microprocessor instructions optimization, code optimization on the level of high-level languages and algorithmic optimization (which takes into account peculiarities of 64-bit systems). The first one is available when we use such development tools as assembler and is too specific to be of any interest for a wide audience. For those who are interested in this theme we can recommend "Software Optimization Guide for AMD64 Processors" [2] -an AMD guide of application optimization for a 64-bit architecture. Algorithmic optimization is unique for every task and its consideration is beyond this article. From the point of view of high-level languages, such as C++, 64-bit architecture optimization depends on the choice of optimal data types. Using homogeneous 64-bit data types allows the optimizing compiler to construct a simpler and more efficient code, as there's no need to convert 32-bit and 64-bit data inter se often. Primarily, this can be referred to variables which are used as loop counters, array indexes and for variables storing different sizes. Traditionally we use such types as int, unsigned and long to represent the above-listed types. With 64-bit Windows systems which use LLP64 [3] data model these types remain 32-bit ones. In a number of cases this results in less efficient code construction for there are some additional conversions. For instance, if you need to figure out the address of an element in an array with a 64-bit code, first you must turn the 32-bit index into a 64-bit one. The use of such types as ptrdiff_t and size_t is more effective, as they possess optimal size for representing indexes and counters. For 32-bit systems they are scaled as 32-bit, for 64-bit systems as 64- bit (see table 1). Table 1. Data type dimension of 32-bit and 64-bit versions of Windows operation system.

Using ptrdiff_t, size_t and types derived from them can speed up program code by up to 30%. An example of such optimization is given in the article "Development of resource-intensive applications in Visual C++ environment" [4]. An additional advantage is more reliable code: using 64-bit variables as indexes avoids overflows when we deal with large arrays of several billion elements.

Changing data types is not an easy task, and deciding whether a change is really necessary is harder still. We offer the Viva64 static code analyzer as a tool meant to simplify this process. Although it specializes in finding errors in 64-bit code, following its recommendations on changing data types can also considerably increase code performance.


3. Memory usage decrease
After being compiled in 64-bit mode, a program starts to consume more memory than its 32-bit version did. Often the increase is almost imperceptible, but sometimes memory consumption doubles. The reasons are the following:

• larger memory size of certain objects, for instance pointers;
• changed rules of data alignment in structures;
• increased stack memory consumption.

The increase in RAM consumption is often acceptable. The advantage of 64-bit systems is precisely that the amount of this memory is rather large. There is nothing wrong if a program took 300 MB on a 32-bit system with 2 GB of memory and takes 400 MB on a 64-bit system with 8 GB of memory. In relative terms, on the 64-bit system the program occupies about one third of the share of available physical memory it occupied before. There is no sense in fighting this growth in memory consumption; it is easier to add memory.

But the increase in consumed memory has one disadvantage: it causes a loss of performance.
Although 64-bit code runs faster, fetching larger amounts of data from memory can cancel out all the advantages and even decrease performance. Data transfer between memory and the microprocessor (cache) is not a cheap operation.

Let us assume that we have a program which processes a large amount of text data (up to 400 MB). It creates an array of pointers, each pointing to the next word in the processed text. Let the average word length be 5 characters. Then the program will require about 80 million pointers. So a 32-bit version of the program will require 400 MB + (80 million * 4 bytes) = 720 MB of memory. A 64-bit version will require 400 MB + (80 million * 8 bytes) = 1040 MB of memory. This is a considerable increase which may adversely affect the program's performance. And if there is no need to process gigabyte-sized texts, the extra width of the pointers in this data structure is simply wasted. Using indexes of unsigned type instead of pointers is a simple and effective solution to the problem. In this case the amount of consumed memory is again 720 MB.

A considerable amount of memory can also be wasted because of the changed rules of data alignment. Let us consider an example:

struct MyStruct1
{
  char m_c;
  void *m_p;
  int m_i;
};

The structure size in a 32-bit program is 12 bytes, and in a 64-bit program it is 24 bytes, which is wasteful. But we can improve the situation by changing the order of the members in the following way:

struct MyStruct2
{
  void *m_p;
  int m_i;
  char m_c;
};

The size of MyStruct2 is still 12 bytes in a 32-bit program, and in a 64-bit program it is only 16 bytes. At the same time, from the point of view of data access efficiency the MyStruct1 and MyStruct2 structures are equivalent. Picture 1 shows how the structure members are laid out in memory.
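
The two memory-saving ideas above can be sketched in a few lines. The structures repeat MyStruct1 and MyStruct2 from the text. The IndexWords function is a hypothetical helper, not part of the article: it stores the start of each word as a 32-bit offset instead of a pointer, and assumes that words are separated by spaces and that the text is smaller than 4 GB.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// The two layouts discussed in the text.
struct MyStruct1 { char m_c; void* m_p; int m_i; };
struct MyStruct2 { void* m_p; int m_i; char m_c; };

// Assumption: a mainstream Windows or Linux ABI. There the reordered
// structure is never larger than the original one
// (12 vs 12 bytes on x86, 16 vs 24 bytes on x64).
static_assert(sizeof(MyStruct2) <= sizeof(MyStruct1),
              "reordering should not increase the structure size");

// Hypothetical helper: records where every word of `text` starts,
// using 4-byte offsets instead of pointers. Each entry costs 4 bytes
// in both 32-bit and 64-bit builds.
std::vector<std::uint32_t> IndexWords(const std::string& text) {
    assert(text.size() <= UINT32_MAX);  // offsets must fit in 32 bits
    std::vector<std::uint32_t> starts;
    bool inWord = false;
    for (std::size_t i = 0; i != text.size(); ++i) {
        const bool isSpace = (text[i] == ' ');
        if (!isSpace && !inWord)
            starts.push_back(static_cast<std::uint32_t>(i));
        inWord = !isSpace;
    }
    return starts;
}
```

A word is then reached as text.data() + starts[k]. The price of the smaller table is one extra addition per access and the 4 GB limit on the text size, which is why this only makes sense when such large inputs are not expected.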

Picture 1.

It is not easy to give clear instructions concerning the order of members in structures. But the common recommendation is the following: members should be arranged in order of decreasing size.

The last point is the growth of stack memory consumption. Storing larger return addresses and stricter data alignment increase the stack size. Trying to optimize this makes no sense. A sensible developer would never create megabyte-sized objects on the stack. But if you are porting a 32-bit program to a 64-bit system, do not forget to change the stack size in the project settings. For instance, you can double it. By default both a 32-bit and a 64-bit application are assigned a 1 MB stack. It may turn out to be insufficient, so adding a safety margin makes sense.


Conclusion
The author hopes that this article will help in developing efficient 64-bit solutions and invites you to visit www.viva64.com to learn more about 64-bit technologies. There you can find many articles devoted to the development, testing and optimization of 64-bit applications. We wish you the best of luck in developing your 64-bit projects.


References
1. Valentin Sedykh. Russian 64 bit: let's dot all the "i"s. http://www.viva64.com/go.php?url=151
2. Software Optimization Guide for AMD64 Processors. http://www.viva64.com/go.php?url=59
3. Blog "The Old New Thing": "Why did the Win64 team choose the LLP64 model?" http://www.viva64.com/go.php?url=25
4. Andrey Karpov, Evgeniy Ryzhkov. Development of Resource-intensive Applications in Visual C++. http://www.viva64.com/art-1-2-2014169752.html