SE 2016 - programming languages landscape.
1. PROGRAMMING LANGUAGES LANDSCAPE
OLD & NEW IDEAS
Ruslan Shevchenko
VertaMedia/ Researcher
ruslan@shevchenko.kiev.ua
https://github.com/rssh
@rssh1
2. PROGRAMMING LANGUAGES LANDSCAPE: OLD & NEW IDEAS
What will tomorrow's programming be like?
Languages
Complexity
Hardware
Worlds
Learning Curve
Expressibility
Layers
5. Hardware
1 Processor Unit
1 Memory Unit
1 Machine
N Different Processors (CPU, GPU, NTU, QTU)
N Different Storage Systems (Cache, Mem, SSD, ..)
N Different Machines
PL: Main Language Constructs:
still execution flows
6. Memory Access Evolution:
Fortran 57 : static memory allocation
Algol 60 : Stack
Lisp 58: Garbage Collection
BCPL, C [70] - manual
Objective-C [88] - reference counting
Java [95] - garbage collection becomes mainstream
Rust [2010-15] - compile-time analysis becomes mainstream
C++ [..88] - manual + destructors
Algol 68 - stack + manual + GC
Smalltalk [72-80] - GC (RC as GC optimization)
Objective C++
ML [70] - compile time analysis
Simula 67
// not all, not main
7. Memory Access Evolution:
Manual allocation: Risky, low-level
Garbage Collection: Generally Ok, but pauses:
not for Real-time systems
not for System-level programming
Type analysis [Rust]
subculture of sun.misc.Unsafe in Java
8. RUST: ownership & lifetimes
T - object of type T (owned by code in scope)
&T - borrowed reference to type T (not owned by us)
&'L T - reference to type T with lifetime L
mut T - mutable object of type T
*const T, *mut T - raw unsafe pointers
let y: &str;
{
    let email = retrieve_email(….. );
    let domain = first_entry(&email, "@");
    y = domain;
}
// does not compile if y is used after this block:
// y lives in the outer scope, but email is dropped here while y still borrows from it
fn first_entry<'a, 'b>(value: &'a str, pattern: &'b str) -> &'a str
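A minimal runnable sketch of the working variant, under stated assumptions: retrieve_email is a hypothetical stand-in that returns an owned String, and the body of first_entry (the slide gives only its signature) is assumed to return the text before the first occurrence of pattern. Here email lives at least as long as y, while the pattern is dropped earlier, which the separate lifetimes 'a and 'b allow.

```rust
// Hypothetical stand-in for the slide's retrieve_email: returns an owned String.
fn retrieve_email() -> String {
    String::from("user@example.com")
}

// The result borrows from `value` only (lifetime 'a); `pattern` may die earlier.
// The body is an assumption: the text before the first occurrence of `pattern`,
// or the whole of `value` when `pattern` does not occur.
fn first_entry<'a, 'b>(value: &'a str, pattern: &'b str) -> &'a str {
    match value.find(pattern) {
        Some(pos) => &value[..pos],
        None => value,
    }
}

// `email` outlives `y`, so the borrow checker accepts this.
fn local_part_len() -> usize {
    let email = retrieve_email();
    let y: &str;
    {
        let pattern = String::from("@"); // dropped at the end of this block
        y = first_entry(&email, &pattern);
    }
    y.len() // still valid: y borrows from email, not from pattern
}
```

Moving the declaration of email into the inner block reproduces the error from the slide, because y would then outlive the String it points into.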
9. RUST: general
Next step in low-level system languages.
Zero-cost abstraction + safety
harder to write than GC languages
fast, and easier to write than C [maybe C++]
Alternatives:
advanced GC [Go, D, Nim]
10. Concurrency Models Evolution:
Fortran 57 : one execution flow
PL/1 64 : Multitasking API
1972: Actor Model
1988: Erlang [ Actor Model implementation]
1978: CSP Model
1983: Occam [first CSP Model implementation]
1980: Implicit parallelism in functional languages (80-1)
1977: Future [MultiLisp]
2007: Go (CSP becomes mainstream)
2010: Akka in Scala (Actor Model becomes mainstream)
2015: Pony [actors + ownership]
// not all, not main
11. Concurrency Models:
Callbacks: [manual], Futures [Semi-manual]
hard to maintain
Actor-Model (Active Object)
CSP Channels; Generators
Async methods.
lightweight threads [coroutines, fibers .. ]
execution flow ‘breaks’ thread boundaries.
Implicit parallelism
hard to implement, not yet in mainstream
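As an illustration of the channel-based (CSP) style from the list above, here is a small sketch in Rust using only the standard library. The function name sum_of_squares and the one-thread-per-item layout are assumptions made for the example, not something the slides prescribe.

```rust
use std::sync::mpsc;
use std::thread;

// Sum the squares of `inputs`: each worker thread sends one result
// over a channel; the caller receives until all senders are gone.
fn sum_of_squares(inputs: Vec<u64>) -> u64 {
    let (tx, rx) = mpsc::channel();
    let mut handles = Vec::new();
    for n in inputs {
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            tx.send(n * n).expect("receiver is alive");
        }));
    }
    drop(tx); // drop the original sender so the receive loop can end
    let total: u64 = rx.iter().sum();
    for h in handles {
        h.join().expect("worker panicked");
    }
    total
}
```

The workers share no mutable state; the only communication is the values moved through the channel.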
14. Async/Transform (by compiler/interpreter):
def method():Future[Int] = async {
val x = retrieveX()
val y = retrieveY()
x+y
}
def method():Future[Int] = async {
val x = await(retrieveX())
val y = await(retrieveY())
x+y
}
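// roughly what the compiler generates: an explicit state machine
class methodAsync {
  var state: Int = 0
  val promise: Promise[Int] = Promise[Int]()
  var x, y: Int = 0
  def m(): Unit =
  {
    state match {
      case 0 => retrieveX() onSuccess { case v => x = v; state = 1; m() }
      case 1 => retrieveY() onSuccess { case v => y = v; state = 2; m() }
      case 2 => promise.success(x + y)
    }
  }
}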
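The same rewrite can be sketched as an explicit state machine in Rust. This is an illustration of the idea only, not what any particular compiler emits: retrieve_x and retrieve_y are hypothetical stand-ins that complete immediately, and step plays the role of one resumption of the method.

```rust
// States of the rewritten method: which await point we are parked at.
enum State {
    Start,
    HaveX { x: i32 },
    Done { result: i32 },
}

// Hypothetical stand-ins for the asynchronous calls on the slide.
fn retrieve_x() -> i32 {
    40
}
fn retrieve_y() -> i32 {
    2
}

// One resumption of the method: run until the next await point.
fn step(state: State) -> State {
    match state {
        State::Start => State::HaveX { x: retrieve_x() },
        State::HaveX { x } => State::Done { result: x + retrieve_y() },
        State::Done { result } => State::Done { result },
    }
}

// A trivial driver; a real runtime would call `step` from completion callbacks.
fn run_to_completion() -> i32 {
    let mut state = State::Start;
    loop {
        state = match step(state) {
            State::Done { result } => return result,
            next => next,
        };
    }
}
```

Local variables that must survive an await point (here x) become fields of the state, which is why the generated class on the slide carries x and y as members.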
15. Concurrency Models / current state
Problems:
Data Races. Possible solutions:
immutability (functional programming)
copy/move semantics [Go, Rust]
static alias analysis [Rust, Pony]
Async IO interfaces.
Future:
Heterogeneous/Distributed case
Implicit parallelism
16. RUST: race control
T: std::marker::Send
- it is safe to send the object to another thread
- otherThread(t) is safe
T: std::marker::Sync
- it is safe to share the object between threads
- share = send a reference
- otherThread(&t) is safe
{
    let x = 1;
    thread::spawn(|| {
        do_something(x)
    });
}
// error: the closure borrows x and may outlive it
{
    let x = 1;
    thread::spawn(move || {
        do_something(x)
    });
}
// ok: the closure owns a copy of the original
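A short sketch of both markers in use, with hypothetical function names. The first function relies on Send (a Vec is moved into the new thread); the second relies on Sync (an Arc<Mutex<i32>> is shared between threads). Replacing Arc with Rc would be rejected at compile time, because Rc is neither Send nor Sync.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Send: the Vec is moved into the new thread, which becomes its only owner.
fn sum_in_other_thread(data: Vec<i32>) -> i32 {
    thread::spawn(move || data.iter().sum::<i32>())
        .join()
        .expect("worker panicked")
}

// Sync: the Mutex is shared; each thread gets its own Arc handle to it.
fn shared_counter(workers: usize) -> i32 {
    let counter = Arc::new(Mutex::new(0));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                *counter.lock().expect("mutex poisoned") += 1;
            })
        })
        .collect();
    for h in handles {
        h.join().expect("worker panicked");
    }
    let total = *counter.lock().expect("mutex poisoned");
    total
}
```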
17. Pony:
Actors
Type Analysis for data sharing.
Pony type = type + capability
- T iso - isolated
- T val - value
- T ref - reference
- T box - read-only
- T trn - transition (write part of the box)
- T tag - identity only
Destructive read/write
fun test(T iso a) {
  var ref x = a
}
// error - an iso cannot be aliased as ref
fun test(T iso a) {
  var iso x = consume a // ok
  var iso y = a
  // error - a is consumed
}
19. Scala, count words:
val lines = load(uri)
val count = lines.flatMap(_.split(" "))
                 .map(word => (word, 1))
                 .reduceByKey(_ + _)
// Same code, different execution
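For reference, the three pipeline stages (split into words, pair each word with 1, sum per key) can be spelled out as a local, single-threaded sketch in Rust. The function name count_words is an assumption, and skipping empty tokens is an addition not present on the slide.

```rust
use std::collections::HashMap;

// Local version of the pipeline on the slide:
// flatMap(split) -> map(word => (word, 1)) -> reduceByKey(_ + _).
fn count_words(lines: &[&str]) -> HashMap<String, usize> {
    lines
        .iter()
        .flat_map(|line| line.split(' '))
        .filter(|word| !word.is_empty()) // addition: ignore repeated spaces
        .map(|word| (word, 1usize))
        .fold(HashMap::new(), |mut counts, (word, n)| {
            *counts.entry(word.to_string()).or_insert(0) += n;
            counts
        })
}
```

A distributed engine keeps the same shape of code but runs the per-key reduction across machines, which is the point of "same code, different execution".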
20. Java, count words:
// Excerpt from a Hadoop MapReduce word count; the mapper class header is not shown.
  @Override
  public void map(Object key, Text value, Context context
                  ) throws IOException, InterruptedException {
    String line = (caseSensitive) ?
        value.toString() : value.toString().toLowerCase();
    for (String pattern : patternsToSkip) {
      line = line.replaceAll(pattern, "");
    }
    StringTokenizer itr = new StringTokenizer(line);
    while (itr.hasMoreTokens()) {
      word.set(itr.nextToken());
      context.write(word, one);
      Counter counter = context.getCounter(CountersEnum.class.getName(),
          CountersEnum.INPUT_WORDS.toString());
      counter.increment(1);
    }
  }
} // closes the mapper class
public static class IntSumReducer
    extends Reducer<Text,IntWritable,Text,IntWritable> {
  private IntWritable result = new IntWritable();
  public void reduce(Text key, Iterable<IntWritable> values,
                     Context context
                     ) throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable val : values) {
      sum += val.get();
    }
    result.set(sum);
    context.write(key, result);
  }
}
21. Java, count words:
Can we do better (?) - Yes [but not for free]
- retargeting stream API (impl.effort)
- via annotation processor
- use byte-code rewriting (low-level)
22. Java, count words(2):
// Near same code, different execution
List{???}<String> lines = load(uri);
Map<String,Integer> count = lines.stream()
  .flatMap(x -> Arrays.stream(x.split(" ")))
  .collect(Collectors.groupingBy{Concurrent,Distributed}(w -> w,
      Collectors.mapping(w -> 1,
          Collectors.reducing(0, Integer::sum))));
[distributed version is theoretically possible]
23. Ideas
Language Extensibility: F: A=>B F: Expr[A] => Expr[B]
• Functional interpreters: Expr[A] build on top of L
• well-known functional programming pattern
• Macros: Expr[A] == {program in A}
• Lisp macros [1960 … ]
• Compiler plugins [X10],
• Non-standard interpretation of arguments [R]
Rich enough type system to express Expr[A] (inside the language)
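A minimal sketch of the "functional interpreter" idea: the program is an ordinary data structure (an Expr tree), and the host language supplies more than one interpretation of it. The type and function names below are hypothetical and chosen for the example; the slides use Scala for their own version of this.

```rust
// A tiny expression language built on top of the host language.
enum Expr {
    Const(f64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

// Interpretation 1: evaluate directly, looking variables up via `env`.
fn eval(e: &Expr, env: &dyn Fn(&str) -> f64) -> f64 {
    match e {
        Expr::Const(c) => *c,
        Expr::Var(name) => env(name.as_str()),
        Expr::Add(l, r) => eval(l, env) + eval(r, env),
        Expr::Mul(l, r) => eval(l, env) * eval(r, env),
    }
}

// Interpretation 2: emit C-like source text for the same tree.
fn emit(e: &Expr) -> String {
    match e {
        Expr::Const(c) => format!("{:?}", c),
        Expr::Var(name) => name.clone(),
        Expr::Add(l, r) => format!("({} + {})", emit(l), emit(r)),
        Expr::Mul(l, r) => format!("({} * {})", emit(l), emit(r)),
    }
}

// Example tree: x * 2.0 + 1.0
fn sample() -> Expr {
    Expr::Add(
        Box::new(Expr::Mul(
            Box::new(Expr::Var("x".to_string())),
            Box::new(Expr::Const(2.0)),
        )),
        Box::new(Expr::Const(1.0)),
    )
}
```

The second interpretation is the one that matters for retargeting: the emitted text can be handed to another compiler, for example one that targets a GPU.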
24. Language Extensibility: F: A=>B F: Expr[A] => Expr[B]
Small example (functional compiler)
trait GE[T]
case class Code(
  fundefs: Map[String, String],
  expr: String
)
trait GERunner
{
  def loadValues(values: Map[String, Array[Double]]): Unit
  def loadCode(code: GE[_]): Unit
  def run(): Unit
  def retrieveValues(name: String): Array[Double]
}
// GPU contains OpenCL or CUDA compiler
// available via system API
25. case class GEArray(name: String) extends GE[Array[Double]]
{
  def apply(i: GE[Int]): GE[Double] = GEArrayIndex(this, i)
  def update(i: GE[Int], x: GE[Double]): GE[Unit] = GEUpdate(this, i, x)
  // inside the anonymous object, `this` would refer to that object,
  // so the enclosing array is named explicitly
  def index = new {
    def map(f: GE[Int] => GE[Double]): GE[Array[Double]] = GEMap(GEArray.this, f)
    def foreach[T](f: GE[Int] => GE[T]): GE[Unit] = GEForeach(GEArray.this, f)
  }
}
case class GEPlus(x: GE[Double], y: GE[Double])
  extends GE[Double]
implicit class GEPlusSyntax(val x: GE[Double]) extends AnyVal
{
  def + (y: GE[Double]) = GEPlus(x, y)
}
case class GEMap(a: GE[Array[Double]], f: GE[Int] => GE[Double])
  extends GE[Array[Double]]
case class GEArrayIndex(a: GE[Array[Double]], i: GE[Int])
  extends GE[Double]
case class GEConstant[T](x: T) extends GE[T]
case class GEVar[T](name: String) extends GE[T]
26. val a = GEArray("a")
val b = GEArray("b")
val c = GEArray("c")
for (i <- a.index) {
  c(i) = a(i) + b(i)
}
// the compiler desugars this step by step:
a.index.foreach(i => c(i) = a(i) + b(i))
a.index.foreach(i => c.update(i,
  GEArrayIndex(a, i) + GEArrayIndex(b, i)))
GEForeach(a, i =>
  GEUpdate(c, i,
    GEPlus(GEArrayIndex(a, i), GEArrayIndex(b, i))))
27. trait GE[T] {
  def generate(): GPUCode
}
case class GPUCode(
  defs: Map[String,String],
  expr: String
)
class GEArrayIndex(x: GE[Array[Double]], i: GE[Int]) extends GE[Double]
{
  def generate(): GPUCode =
  {
    val (cx, ci) = (x.generate(), i.generate())
    GPUCode(defs = merge(cx.defs, ci.defs),
            expr = s"${cx.expr}[${ci.expr}]")
  }
}
// GEArrayIndex(GEArrayVar(a), GEVar(i)) => "a[i]"
class GEIntVar(name: String) ..
{
  def generate(): GPUCode =
    GPUCode(
      defs = Map(name -> s"int ${name};"),
      expr = name)
}
28. GEPlus(GEArrayIndex(GEArrayVar(a), GEVar(i)),
       GEArrayIndex(GEArrayVar(b), GEVar(i))) =>
  "(a[i] + b[i])"
class GEPlus(x: GE[Double], y: GE[Double]) extends GE[Double]
{
  def generate(): GPUCode =
  {
    val (cx, cy) = (x.generate(), y.generate())
    GPUCode(defs = merge(cx.defs, cy.defs),
            expr = s"(${cx.expr} + ${cy.expr})")
  }
}
29. c.update(i, a(i) + b(i)) => "c[i] = (a[i] + b[i])"
class GEUpdate(x: GE[Array[Double]], i: GE[Int], y: GE[Double]) extends GE[Unit]
{
  def generate(): GPUCode =
  {
    val (cx, ci, cy) = (x.generate(), i.generate(), y.generate())
    GPUCode(defs = merge(cx.defs, ci.defs, cy.defs),
            expr = s"${cx.expr}[${ci.expr}] = ${cy.expr}")
  }
}
30. class GEForeach[T](x: GE[Array[Double]],
                   f: GE[Int] => GE[T]) extends GE[Unit]
{
  def generate(): GPUCode =
  {
    val i = new GEIntVar(System.newName)
    val (cx, ci, cfi) = (x.generate(), i.generate(), f(i).generate())
    val fName = System.newName
    val fBody = s"""
      __kernel void ${fName}(${genParamDefs(x)}) {
        int ${i.name} = get_global_id(0);
        ${cfi.expr};
      }
    """
    GPUCode(
      defs = merge(cx.defs, ci.defs, cfi.defs, Map(fName -> fBody)),
      expr = s"${fName}(${genParams(x)})")
  }
}
31. for (i <- a.index)
  c(i) = a(i) + b(i)
=>
defs: """
__kernel void f1(__global double* a,
                 __global double* b,
                 __global double* c, int n) {
  int i2 = get_global_id(0);
  c[i2] = (a[i2] + b[i2]);
}
"""
32. Finally:
val a = GEArray("a")
val b = GEArray("b")
val c = GEArray("c")
for (i <- a.index) {
  c(i) = a(i) + b(i)
}
=>
__kernel void f1(__global double* a,
                 __global double* b,
                 __global double* c,
                 int n) {
  int i2 = get_global_id(0);
  c[i2] = (a[i2] + b[i2]);
}
GPUExpr(
)
// with macros, this can be done at compile time
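The whole path from expression tree to kernel text can be condensed into a short, self-contained sketch. It is written in Rust rather than the Scala of the slides, all names (Ge, gen_expr, gen_kernel) are hypothetical, and it only produces the OpenCL-style source string; compiling and launching the kernel is left to the GPU runtime. Unlike the kernel on the slide, it omits the int n parameter.

```rust
// Element-wise expression over named arrays, indexed by the work-item id.
enum Ge {
    Elem(&'static str),     // a[i]
    Plus(Box<Ge>, Box<Ge>), // x + y
}

// Render the expression for index variable `idx` and record the arrays it reads.
fn gen_expr(e: &Ge, idx: &str, arrays: &mut Vec<&'static str>) -> String {
    match e {
        Ge::Elem(name) => {
            if !arrays.contains(name) {
                arrays.push(*name);
            }
            format!("{}[{}]", name, idx)
        }
        Ge::Plus(x, y) => {
            let left = gen_expr(x, idx, arrays);
            let right = gen_expr(y, idx, arrays);
            format!("{} + {}", left, right)
        }
    }
}

// Build an OpenCL-style kernel that stores `e` into `target` element by element.
fn gen_kernel(kernel_name: &str, target: &'static str, e: &Ge) -> String {
    let idx = "i";
    let mut arrays = Vec::new();
    let body = gen_expr(e, idx, &mut arrays);
    if !arrays.contains(&target) {
        arrays.push(target);
    }
    let params: Vec<String> = arrays
        .iter()
        .map(|a| format!("__global double* {}", a))
        .collect();
    format!(
        "__kernel void {}({}) {{\n  int {} = get_global_id(0);\n  {}[{}] = {};\n}}\n",
        kernel_name,
        params.join(", "),
        idx,
        target,
        idx,
        body
    )
}
```

Calling gen_kernel("f1", "c", &e) with e built as Plus(Elem("a"), Elem("b")) yields a kernel whose body is c[i] = a[i] + b[i];, matching the shape of the output on the slide.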
33. Complexity
Loose coupling (can be built independently)
Amount of shared infrastructure (duplication)
Amount of location information.
34. Typeclasses:
typeclasses in Haskell
implicit type transformations in Scala
concepts in C++14x (WS, not ISO)
traits in Rust
Diagram: A, B, C
A doesn't care about B & C
B doesn't care about A
C: representation of A
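One way to read the A/B/C picture, sketched with Rust traits. The mapping is an interpretation, not something the slide spells out: A is a plain data type, B is generic code written against a trait, and C is the separately written instance that represents A for B. All names are hypothetical.

```rust
// The "typeclass": anything that can be rendered as a JSON value.
trait ToJson {
    fn to_json(&self) -> String;
}

// A: a plain data type that knows nothing about JSON.
struct Point {
    x: i32,
    y: i32,
}

// C: the representation of A, written separately from both A and B.
impl ToJson for Point {
    fn to_json(&self) -> String {
        format!("{{\"x\":{},\"y\":{}}}", self.x, self.y)
    }
}

// Instances can also be added for types defined elsewhere.
impl ToJson for i32 {
    fn to_json(&self) -> String {
        self.to_string()
    }
}

// B: generic code that depends only on the trait, not on Point.
fn to_json_array<T: ToJson>(items: &[T]) -> String {
    let parts: Vec<String> = items.iter().map(|item| item.to_json()).collect();
    format!("[{}]", parts.join(","))
}
```

Because the instance lives outside both the data type and the generic function, the three pieces can be built and changed independently, which is the loose coupling the previous slide asks for.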