The author downloaded and analyzed the source code of the TortoiseSVN project using the PVS-Studio static code analyzer. The analysis found several bugs, including identical comparisons, unsafe uses of formatting functions like printf(), and obsolete null checks after memory allocation. While many of the issues would not cause failures, some could lead to undefined behavior, especially in 64-bit systems where pointer sizes are larger than integer types. The author concludes by recommending regular use of static analysis to find bugs early.
Generic or specific? Making sensible software design decisions
Rechecking TortoiseSVN with the PVS-Studio Code Analyzer
1. Rechecking TortoiseSVN with the PVS-Studio
Code Analyzer
Author: Andrey Karpov
Date: 25.06.2013
We gave the TortoiseSVN developers a free registration key for some time so that they could check their
project. While they haven't utilized it yet, I've decided to download the TortoiseSVN source codes and check
it myself. My interest is obvious: I want to make another article to advertise PVS-Studio.
We already checked the TortoiseSVN project long ago. It was done at the same time as PVS-Studio 4.00 was
released, which for the first time included diagnostic rules for general analysis.
We recheck some projects from time to time to demonstrate how useful it is to use the tool regularly. It's
pointless to check a project just a couple of times: new bugs are constantly being added into a live code, and
you then spend a lot of time and nerves to fix them. Consequently, you'll gain the largest benefit from static
analysis when using PVS-Studio daily, or, what's even better, when using incremental analysis.
So, let's take a look at some interesting issues PVS-Studio 5.05 has managed to find in the project. The
TortoiseSVN source codes were downloaded June 19, 2013, from
http://tortoisesvn.googlecode.com/svn/trunk. The TortoiseSVN project is by the way a very high-quality one
and has a huge base of users, programmers. That's why finding at least a few defects is a great achievement.
Strange Conditions
static void ColouriseA68kDoc (....)
{
if (((sc.state == SCE_A68K_NUMBER_DEC) && isdigit(sc.ch))
....
|| ((sc.state == SCE_A68K_MACRO_ARG) && isdigit(sc.ch))
2. || ((sc.state == SCE_A68K_MACRO_ARG) && isdigit(sc.ch))
....
}
PVS-Studio's diagnostic message: V501 There are identical sub-expressions '((sc.state == 11) &&
isdigit(sc.ch))' to the left and to the right of the '||' operator. lexa68k.cxx 160
Two identical comparisons are present, which is perhaps caused by a misprint.
The following code also seems to have a misprint: the 'rv' variable's value is being checked twice.
struct hentry * AffixMgr::compound_check(
....
if (rv && forceucase && (rv) && ....)
....
}
PVS-Studio's diagnostic message: V501 There are identical sub-expressions to the left and to the right of the
'&&' operator: rv && forceucase && (rv):
• affixmgr.cxx 1784
• affixmgr.cxx 1879
One more code fragment with an incorrect comparison:
int main(int argc, char **argv)
{
....
DWORD ticks;
....
if (run_timers(now, &next)) {
ticks = next - GETTICKCOUNT();
if (ticks < 0) ticks = 0;
} else {
ticks = INFINITE;
}
3. ....
}
PVS-Studio's diagnostic message: V547 Expression 'ticks < 0' is always false. Unsigned type value is never <
0. winplink.c 635
The variable 'ticks' is unsigned, which means that the check "if (ticks < 0)" is pointless: the issue with an
overflow won't be handled.
Consider a bug that causes the function 'strncmp' to compare strings only partially.
int AffixMgr::parse_convtable(...., const char * keyword)
{
char * piece;
....
if (strncmp(piece, keyword, sizeof(keyword)) != 0) {
....
}
PVS-Studio's diagnostic message: V579 The strncmp function receives the pointer and its size as arguments.
It is possibly a mistake. Inspect the third argument. affixmgr.cxx 3654
The 'sizeof' operator calculates the pointer size. This value is not related in any way to the string length.
Strange String Forming
Variable argument functions can be found everywhere and are always dangerous.
class CTSVNPath
{
....
private:
mutable CString m_sBackslashPath;
mutable CString m_sLongBackslashPath;
mutable CString m_sFwdslashPath;
....
4. };
const FileStatusCacheEntry * SVNFolderStatus::BuildCache(
const CTSVNPath& filepath, ....)
{
....
CTraceToOutputDebugString::Instance() (_T(__FUNCTION__)
_T(": building cache for %sn"), filepath);
....
}
PVS-Studio's diagnostic message: V510 The 'operator()' function is not expected to receive class-type
variable as second actual argument:
• svnfolderstatus.cpp 150
• svnfolderstatus.cpp 355
• svnfolderstatus.cpp 360
The "%s" specifier specifies that the function is waiting for a string to be passed to it as an actual argument.
But the 'filepath' variable is not a string at all, but a complex object consisting of a number of strings. I
cannot say for sure what will be printed and whether the code won't crash at all.
It's not safe to use such functions as 'printf()' in the following way: "printf(myStr);". If there are control
specifiers inside 'myStr', the program may print what it shouldn't or crash.
Look at the following code fragment from TortoiseSVN:
BOOL CPOFile::ParseFile(....)
{
....
printf(File.getloc().name().c_str());
....
}
PVS-Studio's diagnostic message: V618 It's dangerous to call the 'printf' function in such a manner, as the
line being passed could contain format specification. The example of the safe code: printf("%s", str);
pofile.cpp 158
5. If the file name is "myfile%s%i%s.txt", that will be a fail.
Note. We have an interesting article telling how dangerous it is to use the printf() function.
Incorrect Array Clearing
I don't know for sure if it is dangerous to leave the buffers' content without clearing them in case of
ToroiseSVN. Perhaps it's absolutely safe. But the code does contain a fragment intended to clear the
buffers. Since it doesn't work, I suppose I should mention it. Here are the bugs:
static void sha_mpint(SHA_State * s, Bignum b)
{
unsigned char lenbuf[4];
....
memset(lenbuf, 0, sizeof(lenbuf));
}
PVS-Studio's diagnostic message: V597 The compiler could delete the 'memset' function call, which is used
to flush 'lenbuf' buffer. The RtlSecureZeroMemory() function should be used to erase the private data.
sshdss.c 23
The array 'lenbuf' should be cleared before leaving the function. Since the array is not used after that any
more, the compiler optimization will remove the call of the 'memset' function. To avoid this you need to use
special functions.
Here are other fragments where the compiler will remove 'memset()' function calls:
• sshdss.c 37
• sshdss.c 587
• sshdes.c 861
• sshdes.c 874
• sshdes.c 890
• sshdes.c 906
• sshmd5.c 252
• sshrsa.c 113
• sshpubk.c 153
• sshpubk.c 361
• sshpubk.c 1121
• sshsha.c 256
Something Strange
BOOL InitInstance(HINSTANCE hResource, int nCmdShow)
6. {
....
app.hwndTT; // handle to the ToolTip control
....
}
PVS-Studio's diagnostic message: V607 Ownerless expression 'app.hwndTT'. tortoiseblame.cpp 1782
The 'hwndTT' member of the function 'InitInstance()' should probably be initialized with some value, but the
code appeared to be incomplete due to a misprint.
64-Bit Bugs
My search for bugs is quite superficial. My attention is focused just to the extent necessary to notice enough
examples of bugs to serve as the basis for an article. No, I'm not a baddie. It's just that an analysis of the
product performed by its authors will undoubtedly be more thorough than mine.
Even more superficial is my examination of 64-bit bugs. It's very difficult to judge whether or not a certain
error will occur without knowing the project structure.
So, I'll show you only a couple of unsafe fragments:
void LoginDialog::CreateModule(void)
{
....
DialogBoxParam(g_hmodThisDll, MAKEINTRESOURCE(IDD_LOGIN),
g_hwndMain, (DLGPROC)(LoginDialogProc),
(long)this);
....
}
PVS-Studio's diagnostic message: V220 Suspicious sequence of types castings: memsize -> 32-bit integer ->
memsize. The value being casted: 'this'. logindialog.cpp 105
The pointer 'this' is explicitly cast to the 'long' type. It is then inexplicitly extended to the LPARAM
(LONG_PTR) type. The important thing is that the pointer for some time turns into 'long', which is very bad
when your program is 64-bit. The pointer size is 64 bits, while the 'long' type in Win64 is still a 32-bit type.
This causes the high-order bits of a 64-bit variable to get lost.
7. If an object is created outside the low-order 4 Gbytes of memory, the program will start undefined
behavior. Though the probability of this event is not high, the bug is very difficult to reproduce.
This is the fixed code: DialogBoxParam(...., (LPARAM)this);
Consider one more dangerous type conversion:
static int cmpforsearch(void *av, void *bv)
{
Actual_Socket b = (Actual_Socket) bv;
unsigned long as = (unsigned long) av,
bs = (unsigned long) b->s;
if (as < bs)
return -1;
if (as > bs)
return +1;
return 0;
}
PVS-Studio's diagnostic message: V205 Explicit conversion of pointer type to 32-bit integer type: (unsigned
long) av:
• winnet.c 139
• winhandl.c 359
• winhandl.c 348
The pointers are explicitly cast to 'unsigned long' and put into the variables 'as' and 'bs'. Since the high-order
bits of the address can get lost during this operation, the comparison may become incorrect. After all, I
don't see any reason at all why to cast pointers to integer types in this code; you could just compare them
as they are.
Obsolete Null Pointer Checks
Long gone are the times when the 'new' operator returned NULL when it failed to allocate memory. Now it
throws a std::bad_alloc exception. We could of course have the 'new' operator return 0, but this is not the
case.
Despite what is said above, programs are still inhabited by code fragments like this:
8. int _tmain(....)
{
....
pBuf = new char[maxlength];
if (pBuf == NULL)
{
_tprintf(_T("Could not allocate enough memory!n"));
delete [] wc;
delete [] dst;
delete [] src;
return ERR_ALLOC;
}
....
}
PVS-Studio's diagnostic message: V668 There is no sense in testing the 'pBuf' pointer against null, as the
memory was allocated using the 'new' operator. The exception will be generated in the case of memory
allocation error.
• subwcrev.cpp 912
• repositorybrowser.cpp 2565
• repositorybrowser.cpp 4225
• svnstatuslistctrl.cpp 5254
• svnprogressdlg.cpp 2357
• bugtraqassociations.cpp 116
• xmessagebox.cpp 792
• xmessagebox.cpp 797
• hyperlink_base.cpp 166
• affixmgr.cxx 272
• hashmgr.cxx 363
• hashmgr.cxx 611
That'll Do
In my articles I don't cite many of the bugs I find in code, for they don't prevent a program from working
correctly. This time, however, I've decided to tell you about a couple of those. You see, it's just fun to watch
a program run well from sheer luck, but not because it was written well.
9. void CBaseView::OnContextMenu(CPoint point, DiffStates state)
{
....
popup.AppendMenu(MF_STRING | oWhites.HasTrailWhiteChars ?
MF_ENABLED : (MF_DISABLED|MF_GRAYED),
POPUPCOMMAND_REMOVETRAILWHITES, temp);
....
}
PVS-Studio's diagnostic message: V502 Perhaps the '?:' operator works in a different way than it was
expected. The '?:' operator has a lower priority than the '|' operator. baseview.cpp 2246
Depending on the 'oWhites.HasTrailWhiteChars' variable's value, you need to get one of the following
combinations of constants:
• MF_STRING | MF_ENABLED
• MF_STRING | MF_DISABLED | MF_GRAYED
But the code works in quite a different way. The priority of the '|' operation is higher than that of the '?:'
operation. Let's add parentheses to make it clearer:
(MF_STRING | oWhites.HasTrailWhiteChars) ? MF_ENABLED : MF_DISABLED | MF_GRAYED
The code runs correctly only because the constant 'MF_STRING' equals 0. It doesn't affect the result in any
way, which causes the incorrect expression to work well.
Here's one more example of programmer's luck. The HWND type is often used as the 'unsigned' type in
TortoiseSVN. To be able to do this, the programmer had to use explicit type conversions, as in the following
functions:
HWND m_hWnd;
UINT_PTR uId;
INT_PTR CBaseView::OnToolHitTest(....) const
{
....
pTI->uId = (UINT)m_hWnd;
....
}
10. UINT_PTR idFrom;
HWND m_hWnd;
BOOL CBaseView::OnToolTipNotify(
UINT, NMHDR *pNMHDR, LRESULT *pResult)
{
if (pNMHDR->idFrom != (UINT)m_hWnd)
return FALSE;
....
}
Or, for instance, the value of a HWND variable is printed as if it were the 'long' type.
bool CCommonAppUtils::RunTortoiseProc(....)
{
....
CString sCmdLine;
sCmdLine.Format(L"%s /hwnd:%ld",
(LPCTSTR)sCommandLine, AfxGetMainWnd()->GetSafeHwnd());
....
}
From a formal viewpoint the code is incorrect: the 'HWND' type is a pointer, which means it cannot be cast
to 32-bit integer types. The PVS-Studio analyzer is anxious about violations of that rule and generates the
warnings.
But the funny thing is that this code will work as it should!
The HWND type is used to store descriptors which are used in Windows to handle various system objects.
Such are also the types HANDLE, HMENU, HPALETTE, HBITMAP, etc.
Though descriptors are in fact 64-bit pointers, only the low-order 32 bits of these are used - to provide
better compatibility (for instance, to enable interaction between 32-bit and 64-bit processes). See
11. "Microsoft Interface Definition Language (MIDL): 64-Bit Porting Guide" (USER and GDI handles are sign
extended 32b values) for details.
I doubt that the TortoiseSVN developers took these assumptions into account when putting the HWND type
into 32-bit types. It must be just a carelessly written code which works well due to luck and Windows API
developers' efforts.
Conclusion
Use static analysis regularly in your development process, and you'll find a whole lot of bugs at the very
early stages of development. I naturally recommend you to consider trying the PVS-Studio code analyzer in
the first place; but there also exist many other good analyzers: static code analysis tools.
References
Here are additional references to clarify some subtleties described in the article.
1. Knowledge base. Overwriting memory - why?
2. Documentation. V668. There is no sense in testing the pointer against null, as the memory was
allocated using the 'new' operator.
3. Knowledge base. How to correctly cast a pointer to int in a 64-bit application?
4. Andrey Karpov, Evgeniy Ryzhkov. Lessons on development of 64-bit C/C++ applications.