2. Language Background
•What is the best programming language?
oWrong question
oTradeoffs ... choosing the right language for a situation
o("tradeoff" is a good job interview topic!)
•How many computer languages are there?
•The answer is: 3
oC/C++, Java, Python
oYou should know these 3
oThen using any other language is pretty easy
oCompiler / JIT / Interpreter
•And I have too much teeny text on my slides, since really
they're just my notes to myself
3. Language Comparison Dimensions
•CPU efficiency
•Memory efficiency
•Programmer efficiency (Moore's law vs. the above two)
oi.e. why do programmers today get more done?
oTerseness (probably too much focus here)
oType System
oMemory: manual vs. automatic (i.e. GC)
oLibraries of code to use -- re-use
4. 1. C/C++ -- The Foundation
•CPU/mem efficiency -- great, the best
•Memory management -- great efficiency
oBut programmer does most of the work
•Static (compile time) type system
•Dynamic type support = void* and that's it
oProgrammer is on their own
•Libraries -- medium
•You must understand the C/C++ Space
oGives understanding of pointers, call stack, hardware
oPython interpreter -- written in C
oJava JIT system -- C
oMentioned as Stanford grad advantage
5. 2. Java - The Middle Language
•CPU efficiency -- 1-2x worse than C
•Mem efficiency -- 2x worse than C
opointer chasing -- bleh
•Programmer efficiency -- great, type sys, GC, libs
•Java has a full compile time type system
oLots of type info making the code verbose
oCompile time error detection, code completion (yay)
•Java ALSO tracks types at run time
oObject x = new Object() vs. new String()
oString y = ((String)x).toLowerCase()
oChecks type of what x points to at run time (unlike C)
•Java mostly looks like a statically typed language
•Its run time typing provides extra flexibility + error checking
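A minimal sketch of the run time check on the cast, assuming a hypothetical CastDemo class (the class and method names are illustrative, not from the slides). The cast compiles for any Object; the JVM checks what x actually points to when the cast runs.

```java
// Hypothetical demo class; names are illustrative only.
class CastDemo {
    // Returns the lower-cased text if x really points to a String,
    // or a marker string if the run time cast check fails.
    static String lower(Object x) {
        try {
            // Compiles for any Object; the actual type is checked here at run time.
            return ((String) x).toLowerCase();
        } catch (ClassCastException e) {
            return "<not a String>";
        }
    }

    public static void main(String[] args) {
        System.out.println(lower("HELLO"));       // x points to a String
        System.out.println(lower(new Object()));  // x points to a plain Object
    }
}
```

In C the equivalent cast from void* would simply be trusted, with no check.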
6. Aside 1 - Java Implementation
•Java compiles to portable "bytecode"
oLike PDF, not tied to OS, CPU etc.
•Old: interpreter runs bytecode
•Modern: Just In Time compiler (JIT) -> native code
•HotSpot, GPL open source from Sun/Oracle
•HotSpot does aggressive re-writes of "hot" code
•Done at run time -- HotSpot can see all the cards
•IMHO: this is the future of interesting code optimizations --
run time re-writes of running code
•Mozilla/Google JITs for JavaScript
oJIT is more difficult with dynamic languages
7. Aside 2 - HotSpot Optimization Simple
// HotSpot re-writes code to remove
// bounds check of a[i]
int sum = 0;
for (int i=0; i<a.length; i++) {
sum += a[i]; // remove [i] bounds check
}
// The re-write must not change the program's
// output. The optimizer must in effect prove
// that a computation is not needed.
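A rough sketch of what "remove the bounds check" means, assuming a hypothetical BoundsDemo class. sumChecked spells out the check that a[i] performs implicitly; since the loop condition already guarantees 0 <= i < a.length, the check can never fire, which is the kind of fact an optimizer has to establish before dropping it. This is an illustration of the idea, not HotSpot's actual output.

```java
// Hypothetical demo class; names are illustrative only.
class BoundsDemo {
    // The loop with the implicit check on a[i] written out by hand.
    static int sumChecked(int[] a) {
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            // Conceptually what the JVM does before every a[i] access.
            if (i < 0 || i >= a.length) {
                throw new ArrayIndexOutOfBoundsException(i);
            }
            sum += a[i];
        }
        return sum;
    }

    // The loop as written on the slide.
    static int sumPlain(int[] a) {
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i];
        }
        return sum;
    }
}
```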
8. Aside 3a - Complex Optimization
// Suppose we have int[] p.a, int p.min
for (int i=0; i<p.a.length; i++)
if (p.min > p.a[i])
p.min = p.a[i];
// Running code as written:
// dereferencing p.xxx in the loop
// :( very bad performance
9. Aside 3b - Complex Optimization
// HotSpot: dynamically re-write the code
// so p.a temporarily lives in r1,
// p.min lives in r2, so the computation
// uses registers directly.
for (int i=0; i<r1.length; i++)
if (r2 > r1[i])
r2 = r1[i];
// After the code, write the results
// from registers back to p.xxx in the heap.
// Key point: note that the rest of the system
// cannot tell that we did this, and we get C
// performance. HotSpot can be aggressive,
// since it works at runtime; it can see
// everything.
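The same re-write can be imitated by hand, as a sketch. MinHolder and HoistDemo are hypothetical names, and the local variables r1 and r2 only stand in for registers. This assumes a single thread, so nothing else can observe p between the reads and the final write-back.

```java
// Hypothetical holder for the slide's p.a and p.min.
class MinHolder {
    int[] a;
    int min;

    MinHolder(int[] a, int min) {
        this.a = a;
        this.min = min;
    }
}

// Hypothetical demo class; names are illustrative only.
class HoistDemo {
    // As written on the slide: each iteration goes through p.
    static void minAsWritten(MinHolder p) {
        for (int i = 0; i < p.a.length; i++) {
            if (p.min > p.a[i]) {
                p.min = p.a[i];
            }
        }
    }

    // Hand-hoisted version: locals play the role of r1 and r2.
    static void minHoisted(MinHolder p) {
        int[] r1 = p.a;
        int r2 = p.min;
        for (int i = 0; i < r1.length; i++) {
            if (r2 > r1[i]) {
                r2 = r1[i];
            }
        }
        p.min = r2;  // write the result back to the heap
    }
}
```

Both methods leave p in the same state; only the number of trips through p differs.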
10. What is Compile Time Typing?
•C/C++, Java
•Compile time analysis
oJust have the static code text
oHierarchy of declarations
•System of declarations about what will happen at run time
•An approximate forecast, e.g.
oThis will be an int
oThis pointer will be one of A, B, C, but don't know which
•x.foo() -- how do different language compilers treat this?
o"no method foo() exists" -- when is this error possible?
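One way to see both timings of the "no method foo()" error in Java, as a sketch with a hypothetical FooDemo class. A direct call s.foo() is rejected by the compiler, while a reflective lookup by name defers the same question to run time, closer to what a dynamic language does on every call.

```java
// Hypothetical demo class; names are illustrative only.
class FooDemo {
    // Looks up a public no-argument method by name at run time.
    // Returns true if the object's class has it, false otherwise.
    static boolean hasMethod(Object x, String name) {
        try {
            x.getClass().getMethod(name);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String s = "abc";
        // s.foo();  // would not compile: String has no method foo()
        System.out.println(hasMethod(s, "length"));
        System.out.println(hasMethod(s, "foo"));
    }
}
```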
11. Compile Time Typing Pro/Con
•Advantages
oDetect some errors at compile time
oBetter performance, decision not deferred to run time
e.g. a + b (int vs. string .. ct vs. rt decision)
oBetter tool refactoring and auto-complete (!)
o+/- Readable/Verbose
oEnables easier performance optimization (JIT compiler)
Dynamic language may get there someday
•Disadvantages
oExtra stuff to keyboard in, more verbose code
oMay be hard to express some ideas in type system,
though they will actually work at run time (!)
oMaintaining type info is work and can get in the way
•Demo Eclipse: java code, verbose, error check, refactoring
12. 3. Python
•Dynamic language: no types in source code, figure out types
at run time.
•Other examples: JavaScript, PHP, Lisp
•Interpreters are surprisingly easy to implement
oLoop + switch statement
•Results:
oTerse -- not much to keyboard in (can be hard to read)
oVery flexible, easy to write libraries
oSlow
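The "loop + switch" shape can be sketched with a toy stack machine. The opcodes and the TinyInterpreter class are hypothetical, invented for illustration; a real Python or Java bytecode interpreter has many more opcodes but roughly the same skeleton.

```java
// Hypothetical toy interpreter; opcodes are invented for illustration.
class TinyInterpreter {
    static final int PUSH = 0;  // followed by one operand: the value to push
    static final int ADD = 1;   // pop two values, push their sum
    static final int MUL = 2;   // pop two values, push their product
    static final int HALT = 3;  // stop and return the top of the stack

    static int run(int[] code) {
        int[] stack = new int[64];
        int sp = 0;  // next free stack slot
        int pc = 0;  // index of the next instruction
        while (true) {                 // the loop
            int op = code[pc++];
            switch (op) {              // the switch
                case PUSH:
                    stack[sp++] = code[pc++];
                    break;
                case ADD: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a + b;
                    break;
                }
                case MUL: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a * b;
                    break;
                }
                case HALT:
                    return stack[sp - 1];
                default:
                    throw new IllegalStateException("unknown opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        // Encodes (2 + 3) * 4
        int[] program = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT};
        System.out.println(run(program));  // prints 20
    }
}
```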
13. Dynamic Typing
•Dynamic typing
oEverything is a void*
oEverything in the heap, GC
oAt runtime, the interpreter follows the pointer, sees what it is,
makes decisions
•aka Run Time Typing, "scripting languages"
•Possible to JIT to native code, but the defer-to-rt semantics
are still present, even in the JIT native code
oThe PyPy project JITs Python code
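A sketch of the defer-to-rt idea written in Java, assuming a hypothetical DynamicAdd class. Both operands arrive as plain Object references (roughly the void* above), and the meaning of "+" is decided only after inspecting what they point to. With static types the compiler would have made this choice once, ahead of time.

```java
// Hypothetical demo class; names are illustrative only.
class DynamicAdd {
    // Decides at run time whether "+" means integer addition
    // or string concatenation, based on the operands' actual types.
    static Object add(Object a, Object b) {
        if (a instanceof Integer && b instanceof Integer) {
            int x = (Integer) a;
            int y = (Integer) b;
            return x + y;  // boxed back into an Integer
        }
        if (a instanceof String && b instanceof String) {
            String s = (String) a;
            String t = (String) b;
            return s + t;
        }
        // Neither case applies: the error surfaces only at run time.
        throw new IllegalArgumentException("unsupported operand types");
    }

    public static void main(String[] args) {
        System.out.println(add(1, 2));      // prints 3
        System.out.println(add("a", "b"));  // prints ab
    }
}
```

The type tests run on every call, which is one source of the performance cost of deferring decisions.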
14. Dynamic Typing Pro/Con
•Advantages
oLess to keyboard in, less to get in the way (!!)
oLanguage not limited to what can be expressed within
static typing structure, most flexible
Python has lots of features for a reason
Code is short, defer-to-rt is simple to implement
Interpreters are surprisingly easy to build
•Disadvantages
oIn some ways hard to read -- type info not there
Find yourself adding it in the variable names
A clue that type info can be useful to the reader
oWorse performance (defer decisions)
oWorse compile time error detection
Compensate with unit tests
oWorse compile time tool refactoring, auto-complete (!!)
15. Language Choice Realities 1
•Reality: source code is high inertia, high cost (unintuitive?)
oNot a criticism, just a reality you should plan for
•1. Therefore: "Legacy" huge driver
oWe use language X because that's how this project
started 10 years ago, and now it weighs 10 million tons
oNot an error, just rational
•2. Therefore: Avoid building your system on top of locked-in,
proprietary infrastructure
oOnce your system develops high inertia,
you are screwed
oLike building your extremely expensive building on
someone else's land
oStandards: C, C++, Java, Python, HTML, JavaScript
oWhy standard/open-source infrastructure dominates
16. Language Choice Precepts 2
•Warning: engineers can get excited and emotional about
language choice
oModern languages+tools are all pretty good
•Aside: Bikeshed Painting (Wikipedia)
oPeople feel they should say something to not look stupid.
If something is simple, they are most comfortable arguing
about it.
oTherefore most discussion is on the trivial.
oStrategy: if it's kind of trivial, consider just letting the
implementer do it however they like, even if you prefer the
other way.
•New skill to develop: "shutting up skills"
17. Conclusion1 Three Language Choices
•3 Languages, all can work very well (Nick's opinion here)
•Here "features" = language + its libraries
•1. C/C++
oHigh performance, Small mem use, Low dependency
oFeatures are pretty weak (C++0x, limits of what can do)
•Good fit for:
olow level projects, e.g. disk driver
osmall or simple projects, e.g. Arduino, or a library that just
does one narrow thing
operformance sensitive projects, e.g. Google storage
olegacy
18. Conclusion2 Three Language Choices
•2. Java
oStatic typing
oPerformance good (varies)
oFeatures very productive -- language + libraries
(hence popular)
oGood fit for:
Large or complex project, multiple contributors
Type system ties things together
Must tolerate the verbosity and the memory use
19. Conclusion3 Three Language Choices
•3. Python
oDynamic typing
oPerformance the worst
(however many things are just waiting for i/o anyway)
oFeatures large (+ flexibility that comes with dynamic, no
compile step!)
oGood fit for:
Small projects, where brevity/simplicity shines (IMHO)
e.g. a 3 page script .. get done so fast
oSome people love Python for everything
Depends on whether they like having a CT type system
And whether they care much about performance