Our C# expert Eric Lippert provides his take on the psychology of C# analysis, including the business case for C#, developer characteristics and analysis tools.
4. Intro
• Psychological factors in language design…
• … and compiler error messages…
• … and static analysis tools…
• … and funny pictures of cats.
5. Who is this guy?
• Compiler developer / language designer at Microsoft from
1996 through 2012
• Visual Basic, VBScript, JScript, VS Tools for Office, C# / Roslyn
• Static analysis architect for C# at Coverity since January
• I will use “we” totally inconsistently
• I have no formal background in static analysis
• I take an engineering rather than academic approach
10. The business case for C#
• Productive, successful professional developers who target
Microsoft platforms make those platforms more attractive
to Microsoft’s customers
• Original design goal was “a simple, modern, general-purpose language”
• Any language with an 800-page specification is no longer simple, but modern and general-purpose still apply
• Understanding developer psychology is key to achieving
wide adoption of any developer tool
11. Target C# Developer Characteristics
• Professionals, not amateurs
• Engineers, not hackers
• Programming experts, not line-of-business experts
• Pragmatists, not academics
• Skeptics, not true believers
• Conservatives, not radicals
13. Conservatism
• C# developers hate breaking changes imposed by tools
• Even trivial breaking changes are agonized over
• In 11 years and 6 releases C# has never added a new
reserved keyword
• New keywords are contextual so as to not be breaking
• This imposes considerable restrictions on new syntaxes
• For example, consider iterator blocks:
double yield = 123.4;
yield return yield;
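The two lines above compile only inside an iterator block. A minimal sketch of a complete method, assuming a hypothetical class and method name chosen for illustration:

```csharp
using System.Collections.Generic;

public static class ContextualKeywordDemo
{
    // Inside an iterator block, "yield" is special only when it is
    // immediately followed by "return" or "break". Everywhere else it
    // remains an ordinary identifier, so older code that used it as a
    // variable name keeps compiling.
    public static IEnumerable<double> Values()
    {
        double yield = 123.4;   // a local variable that happens to be named "yield"
        yield return yield;     // "yield return" statement that yields that local
    }
}
```

Enumerating `ContextualKeywordDemo.Values()` produces the single value 123.4.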
14. Conservatism
• C# app developers also hate breaking their users
• Facilitating versionable components was a priority-one design goal
• Numerous seemingly counterintuitive features actually mitigate brittle-base-class failures:
class Base
{
    public void M(int x) { }
}
class Derived : Base
{
    public void M(double x) { }
}
...
derived.M(123); // Base.M or Derived.M?
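The question in the comment can be answered with a small sketch. The string return values and the `OverloadDemo` class are additions for illustration only. C# overload resolution discards methods declared in a base class whenever a method declared in a more derived class is applicable, so the call binds to Derived.M even though Base.M(int) is an exact match.

```csharp
public class Base
{
    public string M(int x) { return "Base.M(int)"; }
}

public class Derived : Base
{
    public string M(double x) { return "Derived.M(double)"; }
}

public static class OverloadDemo
{
    public static string Call()
    {
        var derived = new Derived();
        // 123 converts implicitly to double, so Derived.M is applicable.
        // Because an applicable method exists on the more derived type,
        // the base class candidate is removed from consideration.
        return derived.M(123);
    }
}
```

`OverloadDemo.Call()` returns "Derived.M(double)". The design consequence is that a base class author who later adds M(int) cannot silently change which method existing callers of Derived bind to.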
16. Conservatism
C# 4.0 added dynamic dispatch to facilitate interoperability
with dynamic languages and “legacy” object models
• Enormous MVP community pushback
• “I will use this feature correctly, but my coworkers are going to abuse it and then I’m going to have to fix their god-awful hacked-up code”
• Anything that makes the compiler less capable of finding
bugs is met with skepticism and resistance
• The feature was completely redesigned based on early feedback
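A minimal sketch of why dynamic dispatch makes the compiler less capable of finding bugs. The class and method names are hypothetical, and `NoSuchMethod` is deliberately a member that does not exist on `string`:

```csharp
using Microsoft.CSharp.RuntimeBinder;

public static class DynamicDemo
{
    // Returns true if the call to a nonexistent member is reported
    // only when the code runs.
    public static bool MissingMemberFailsAtRuntime()
    {
        dynamic text = "hello";
        try
        {
            // With a static type of string this line would be a
            // compile-time error. With dynamic it compiles, and the
            // member lookup is deferred to run time.
            text.NoSuchMethod();
            return false;
        }
        catch (RuntimeBinderException)
        {
            return true;
        }
    }
}
```

`DynamicDemo.MissingMemberFailsAtRuntime()` returns true: the mistake survives compilation and surfaces only when that line executes.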
18. Error reporting psychology
• Dealing with correct code is literally the smallest problem
• “Roslyn” does syntactic analysis of broken code in the time
between keystrokes; semantic analysis takes a little longer
• Error messages need to be understandable, accurate, polite
and diagnostic rather than prescriptive
• Let’s take a look at some examples
20. Error reporting psychology
“A params parameter must be the last parameter in a formal parameter list”
Is this saying:
• If there is a params parameter, it must be the last one? Or
• The last parameter, and only the last parameter, must always be a params parameter? Or
• The last parameter must be a params parameter; if others are as well, that’s fine too?
The error is only clear if the feature is already understood
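The first reading is the intended one: a method need not have a params parameter, but if it has one, it must come last. A minimal sketch of a declaration that satisfies the rule, with a hypothetical method name and a `label` parameter included only to show the ordering:

```csharp
public static class ParamsDemo
{
    // Legal: the params parameter is the last one in the list.
    // Moving "params int[] values" in front of "label" would
    // produce the error message quoted above.
    public static int Sum(string label, params int[] values)
    {
        int total = 0;
        foreach (int value in values)
        {
            total += value;
        }
        return total;
    }
}
```

`ParamsDemo.Sum("three", 1, 2, 3)` returns 6, and `ParamsDemo.Sum("none")` returns 0 because the compiler passes an empty array.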
21. Error reporting psychology
Error messages must read the mind of a developer who
wrote broken code and figure out what they meant.
class C
{
    public virtual static void M(){}
}
23. Error reporting psychology
Complex operator + (Complex x, Complex y) { ...
User-defined operator must be declared static and public
• This is an example of a prescriptive error done right
• The user absolutely, positively has to do this to overload an operator
• Odds that they were not trying to overload an operator are low
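Following the prescription in that error message gives a declaration like the sketch below. The fields and constructor of `Complex` are assumptions made so the example is self-contained; the slide shows only the operator signature.

```csharp
public struct Complex
{
    public readonly double Real;
    public readonly double Imaginary;

    public Complex(double real, double imaginary)
    {
        Real = real;
        Imaginary = imaginary;
    }

    // Both modifiers demanded by the error message are present:
    // a user-defined operator must be public and static.
    public static Complex operator +(Complex x, Complex y)
    {
        return new Complex(x.Real + y.Real, x.Imaginary + y.Imaginary);
    }
}
```

With this declaration, `new Complex(1, 2) + new Complex(3, 4)` yields a value whose Real is 4 and whose Imaginary is 6.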
25. Warnings are harder than errors
• Must infer the developer’s erroneous thoughts
• Compiler must be fast
• This creates an opportunity for third-party tools
• Must be plausibly wrong
• A warning for code that no one would reasonably type is unhelpful
• Must be able to eliminate the warning
• And ideally the warning should tell you how
• Must have low false positive rate
• Encouraging developers to change correct code is harmful
• We will return to this point later
26. What do C# developers want?
Rigidly defined areas of doubt and uncertainty
• Static type checking, type safety, memory safety…
• … that can be disabled if necessary.
• A compiler that infers developer intent…
• … with predictable behavior and understandable rules
• Actionable errors when inference fails…
• …rather than muddling on through and getting it wrong
28. C# was originally called SafeC
C# throws developers into the “Pit of Success”:
• Eliminate unimportant dangerous features entirely
• switch fall through
• Restrict dangerous features to clearly-marked unsafe code regions
• Eliminate implementation-defined behaviours
• x = ++x + x++; is well-defined in C# …
• …but still a bad idea.
• Define common undefined behaviours
• Accessing an array out of bounds causes an exception
• Mandate compiler warnings
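Two of the points above can be checked directly. This is a minimal sketch with hypothetical class and method names: the first method relies on C#'s left-to-right operand evaluation, and the second on the runtime bounds check for arrays.

```csharp
using System;

public static class WellDefinedDemo
{
    // Operands are evaluated left to right, so the result is fixed:
    //   ++x sets x to 2 and yields 2
    //   x++ yields 2 and then sets x to 3
    //   the sum, 4, is finally assigned to x
    public static int EvaluationOrder()
    {
        int x = 1;
        x = ++x + x++;
        return x;
    }

    // Reading past the end of an array raises an exception instead of
    // touching unrelated memory.
    public static bool OutOfBoundsThrows()
    {
        int[] items = new int[3];
        int index = 3;
        try
        {
            _ = items[index];
            return false;
        }
        catch (IndexOutOfRangeException)
        {
            return true;
        }
    }
}
```

`WellDefinedDemo.EvaluationOrder()` returns 4 on every conforming implementation, and `WellDefinedDemo.OutOfBoundsThrows()` returns true.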
There are numerous defects that the Coverity C/C++ analysis checkers
detect which are impossible, unlikely, or already warnings in C#.
Let’s look at a few dozen. Quickly. These are all defects found by Coverity
in C/C++ that are not worth checking in C#…
29. C/C++ defects inapplicable to C#:
• Local read before assignment
• C# rejects programs that use uninitialized locals
• Uninitialized fields / arrays
• Fields and arrays are automatically zeroed out
• Treating a pointer to a variable as a pointer to an array
• Rare, must be marked as unsafe
• Buffer length arithmetic errors
• Strings and arrays know their lengths; checked at runtime
• Pointer/integer/char/bool/enum type errors
• Not inter-assignable in C# without explicit cast operators
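A small sketch of the first two items in this list, using hypothetical type names. The commented-out lines show the kind of code the compiler rejects; they are left as comments so the example compiles.

```csharp
public class Counter
{
    public int Count;      // fields start at the default value, 0
    public string Name;    // reference-type fields start as null
}

public static class InitializationDemo
{
    public static bool FieldsAndArraysAreZeroed()
    {
        var counter = new Counter();
        int[] buffer = new int[4];   // every element starts at 0

        // int local;
        // int copy = local;   // rejected: use of unassigned local variable

        return counter.Count == 0
            && counter.Name == null
            && buffer[0] == 0
            && buffer[3] == 0;
    }
}
```

`InitializationDemo.FieldsAndArraysAreZeroed()` returns true.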
30. C/C++ defects inapplicable to C#:
• Failure to consistently check error return codes
• C# uses exceptions
• Accidental sign extension
• Is either an error or a warning
• Implementation-defined side effect order
• Side effect order is well-defined
• Statement with no effect
• Is actually a parse-time error in C#
• Accidental use of ambiguous names
• C# requires that a simple name have a unique meaning in a block
31. C/C++ defects inapplicable to C#:
• sizeof mistakes
• C#’s sizeof operator only takes types
• Unintentional switch fall-through
• Is an error
• Unreachable code
• Is a warning
• Accidental assignment or comparison of variable to itself
• Yep, that’s a warning too
• Field never written or never read
• Man that’s a lot of warnings
• Missing return statement
• Is illegal
• malloc without free / free without malloc / allocator – deallocator mismatch / use after free
• Not needed in a garbage-collected language
• Dereferencing an address that lived longer than the storage it refers to
• References to variables may not be stored in long-term storage
• Accidental use of function pointer
• Method group expressions can only be used in strictly limited locations
• Overriding errors
• The language was designed to mitigate brittle base class failures by default
33. Defects common to C/C++ and C#
• Copy paste mistakes
• Expression contains variables but always
has the same result
• You checked for null here, you dereferenced
without checking there.
• Some infinite loops
• Dangling else and other indentation issues
• Array index out of bounds
• Integer overflow
• checked arithmetic is off by default
• Non-memory resource leaks
• Such as forgetting to close a file
• Stray semicolons
• Swapped arguments
• Unused return value
• Uncaught exception
• Missing or misordered critical sections
• Including non-atomic operations
inconsistently inside critical sections
• And many more!
And these are just a few that are common to C and C#; there are a whole host of defects specific to C# programs that we could find statically.
Let’s consider the psychological aspects of static analysis tools beyond the compiler.
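The integer overflow item in the list above is easy to demonstrate. In this minimal sketch the class and method names are hypothetical, and the explicit `unchecked` and `checked` operators are used so the behaviour does not depend on compiler settings:

```csharp
using System;

public static class OverflowDemo
{
    // Matches the default behaviour the slide refers to: the result
    // silently wraps around on overflow.
    public static int AddUnchecked(int a, int b)
    {
        return unchecked(a + b);
    }

    // Opting in to checked arithmetic turns the same overflow into
    // an OverflowException.
    public static int AddChecked(int a, int b)
    {
        return checked(a + b);
    }
}
```

`OverflowDemo.AddUnchecked(int.MaxValue, 1)` returns `int.MinValue`, while `OverflowDemo.AddChecked(int.MaxValue, 1)` throws `OverflowException`. A static analyzer is useful here because the wrapping version produces no diagnostic at all.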
35. Developer Adoption is Key
• Soundness is explicitly a non-goal
• We don’t want to find all defects or even most defects
• We want every defect reported to be a customer-affecting bug
• Developers won’t adopt a product that they perceive as making
their jobs harder for no customer benefit
• Our business model requires adoption to drive renewals
• How do developers – who, remember, are using C# because they
like a statically-typed language – react to static analysis tools?
37. Developer psychology WRT analysis tools
• Egotistical
• I don’t need this tool for my code
• But my coworkers on the other hand…
• Clever management uses this trait to advantage
39. Developer psychology WRT analysis tools
• Skeptical, conservative, dismissive
• Resistant to change
• Quick to criticize “stupid” false positives
• The first five defects they see had better be true positives
41. Developer psychology WRT analysis tools
• “Busy” with, you know, “real work”
• Code annotations are unacceptable
• Analysis tool must adapt to customer’s build process
• Overnight analysis runs are acceptable – barely
43. Developer psychology WRT analysis tools
• Any change in what defects are reported on the same code
over time – a.k.a. “churn” – is the enemy
• Randomized analysis is right out, unfortunately
• Any improvement to our analysis heuristics can cause
unwanted churn
• We try to keep churn below 5% on every release
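The slides do not say how churn is measured. One plausible definition, offered purely as an assumption, is the number of defect reports that appeared or disappeared between two tool versions run on the same code, divided by the number of reports from the older version. A sketch with hypothetical names, identifying each report by a string:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ChurnDemo
{
    // Assumed definition: (disappeared + appeared) / previous count.
    public static double Churn(ISet<string> previous, ISet<string> current)
    {
        if (previous.Count == 0)
        {
            // No baseline to compare against: treat any report as full churn.
            return current.Count == 0 ? 0.0 : 1.0;
        }

        int disappeared = previous.Count(id => !current.Contains(id));
        int appeared = current.Count(id => !previous.Contains(id));
        return (double)(disappeared + appeared) / previous.Count;
    }
}
```

With previous reports {a, b, c, d} and current reports {a, b, c, e}, one report disappeared and one appeared, so the function returns 0.5, far above a 5% target.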
45. Developer psychology WRT analysis tools
• Responds well to perverse incentives
• Hard-to-understand defect reports are easy to ignore
• No downside to incorrectly triaging true positives as false positives
• Finding defects is hard; presenting evidence that prevents
incorrect classification as a false positive is harder
• Deep analysis with theorem provers can be worse than shallow
analysis with cheap heuristics.
• Presenting the result is insufficient; the developer must understand
the proof to fix the defect.
47. Displaying good defect messages
public void GetThing(Type type, bool includeFrobs)
{
    bool isFrob = (type != null) &&
        typeof(IFrob).IsAssignableFrom(type);
    object instance = this.objects[this.name];
    if (instance is IFrob && includeFrobs)
    { [...] }
    else if (type.IsAssignableFrom(instance.GetType()))
    { [...] }
}
48. Displaying good defect messages
public void GetThing(Type type, bool includeFrobs)
{
Assuming type is null.
type != null evaluated to false.
    bool isFrob = (type != null) &&
        typeof(IFrob).IsAssignableFrom(type);
    object instance = this.objects[this.name];
instance is IFrob evaluated to true.
includeFrobs evaluated to false.
    if (instance is IFrob && includeFrobs)
    { [...] }
Dereference after null check:
dereferencing type while it is null.
    else if (type.IsAssignableFrom(instance.GetType()))
    { [...] }
}
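One possible fix for the reported defect is sketched below. Everything outside the method body is hypothetical scaffolding (the `ThingStore` class, its constructor, the `Frob` type, and the string return values), added so the example is self-contained. Treating a null `type` as "no match" is also an assumption; the original author might prefer to throw instead.

```csharp
using System;
using System.Collections.Generic;

public interface IFrob { }
public class Frob : IFrob { }

public class ThingStore
{
    private readonly Dictionary<string, object> objects = new Dictionary<string, object>();
    private readonly string name;

    public ThingStore(string name, object instance)
    {
        this.name = name;
        this.objects[name] = instance;
    }

    public string GetThing(Type type, bool includeFrobs)
    {
        object instance = this.objects[this.name];
        if (instance is IFrob && includeFrobs)
        {
            return "frob";
        }
        // The null check now guards the dereference on the same path,
        // so the event sequence in the defect report can no longer occur.
        else if (type != null && type.IsAssignableFrom(instance.GetType()))
        {
            return "assignable";
        }
        return "none";
    }
}
```

With a store holding a `Frob`, `GetThing(null, false)` now returns "none" instead of throwing `NullReferenceException`.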
50. Management psychology
• The first time static analysis runs there may be thousands
of errors; typical rate is one defect per thousand LOC
• Academic answer: rank heuristics
• Pragmatic answer: ignore them all
• Simply ignore all defects in existing code
• Triage and fix defects in new code
• “Someday” get around to fixing defects in old code
• Why is this so popular?
• Old code is in the field. It works well enough. Risk is low.
• New code is unproven. It might work, or it might not. Risk is high.
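The "ignore them all" policy amounts to recording a baseline and triaging only what is not in it. A minimal sketch, assuming each defect report can be reduced to a stable string identifier; the names here are hypothetical and do not describe how any particular product implements baselining:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class BaselineDemo
{
    // Returns the defects from the latest run that were not present
    // when the baseline was recorded, preserving their order.
    public static List<string> DefectsToTriage(
        IEnumerable<string> baseline,
        IEnumerable<string> latest)
    {
        var known = new HashSet<string>(baseline);
        return latest.Where(id => !known.Contains(id)).ToList();
    }
}
```

Given a baseline of {old-1, old-2} and a latest run of {old-1, new-7, old-2, new-8}, the method returns new-7 and new-8, which are the only reports developers are asked to look at.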
52. Management psychology
• Management actually pays for the developer tools
• And typically has no idea how to use them effectively
• Middle management has perverse incentives too
• Time, cost and complexity are easily measured; quality is not
• “Never upgrade the static analysis tool before release”
• Worse tools are better; better tools are worse
53. Worse is better; better is worse
[Graph: known defects (vertical axis) against time (horizontal axis)]
No tool improvements == Management gets bonus
54. Worse is better; better is worse
[Two graphs: known defects (vertical axis) against time (horizontal axis)]
No tool improvements == Management gets bonus
Tool upgrades find more defects == Management gets no bonus
The fix rate is the same in these two graphs but if the tool improves faster than the fix rate, no bonus.
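The arithmetic behind the two graphs can be sketched with a toy model. All numbers and names are invented for illustration: each month the tool may surface some previously unknown defects, and the team fixes a constant number.

```csharp
using System;

public static class KnownDefectsDemo
{
    // Known defect count after the given number of months.
    public static int KnownDefects(
        int initial, int fixedPerMonth, int newlyFoundPerMonth, int months)
    {
        int known = initial;
        for (int month = 0; month < months; month++)
        {
            known += newlyFoundPerMonth;              // tool upgrade finds more
            known -= Math.Min(fixedPerMonth, known);  // team fixes at a constant rate
        }
        return known;
    }
}
```

With 1000 initial defects and 50 fixes per month, six months without tool improvements gives `KnownDefects(1000, 50, 0, 6)` = 700, a falling line. If upgrades surface 60 new defects per month, `KnownDefects(1000, 50, 60, 6)` = 1060: the team did exactly the same work, yet the metric management is judged on went up.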
55. Good news
If you have a well-engineered product that:
• makes good use of theoretical and pragmatic approaches,
• finds real-world, user-affecting defects, and
• takes developer and management psychology into account
Then you can make a positive difference
59. Conclusion
• Theoretical static analysis techniques are awesome; we can
and do use them in industry…
• … but doing all that math is actually only one small part of shipping
a static analysis product
• Understanding developer and management psychology is
necessary to ensure adoption of any developer tools
• C# was carefully designed to match a target developer mindset
• Coverity thinks about developer and manager psychology at every
stage in the analysis and overall product design
• Research into better ways to present defects would be awesome
60. More information
• Learn about Coverity at www.Coverity.com
• Read “A Few Billion Lines Of Code Later”
• Find me on Twitter at @ericlippert
• Or read my C# blog at www.EricLippert.com
• Or ask me about C# at www.StackOverflow.com