C2
  The Server Compiler
5th JVM Source Code Reading Meeting




       @ytoshima



                        1
Compilation trigger
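The interpreter counts invoke* and goto bytecodes
(gotos to a negative offset, i.e. backward branches).
When a counter exceeds its threshold, a compile request
is sent to CompileBroker. The request is placed on a
queue, and a CompilerThread watching that queue starts
the compilation. Compilation can also run concurrently
with GC.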
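
The counter, threshold, and queue flow described above can be sketched in a few lines. This is only an illustration under stated assumptions: `ToyCompilePolicy` and `CompileRequest` are hypothetical names, not HotSpot classes, and a real CompilerThread would block on the queue instead of polling it.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a compile request; not HotSpot's CompileTask.
struct CompileRequest {
    std::string method;
    bool osr;  // true when triggered by a backward branch (on-stack replacement)
};

class ToyCompilePolicy {
public:
    explicit ToyCompilePolicy(int threshold) : threshold_(threshold) {}

    // Called by the "interpreter" on each method invocation.
    void on_invocation(const std::string& m) { bump(invocations_, m, false); }
    // Called by the "interpreter" on each backward branch.
    void on_back_branch(const std::string& m) { bump(back_branches_, m, true); }

    // A compiler thread would block on this queue; here we simply poll.
    bool next(CompileRequest& out) {
        if (queue_.empty()) return false;
        out = queue_.front();
        queue_.pop_front();
        return true;
    }

private:
    void bump(std::unordered_map<std::string, int>& counters,
              const std::string& m, bool osr) {
        // Enqueue exactly once, when the counter first reaches the threshold.
        if (++counters[m] == threshold_) queue_.push_back({m, osr});
    }

    int threshold_;
    std::unordered_map<std::string, int> invocations_;
    std::unordered_map<std::string, int> back_branches_;
    std::deque<CompileRequest> queue_;
};
```

Keeping separate counters for invocations and backward branches mirrors the two entry points on the next slide (method invocation vs. OSR).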




                                       2
Compilation trigger
// Method invocation
SimpleCompPolicy::method_invocation_event
  CompileBroker::compile_method(...)

// OSR
SimpleCompPolicy::method_back_branch_event
  CompileBroker::compile_method(...)




                                        3
Compilation trigger
CompileBroker::compile_method
  CompileBroker::compile_method_base
    :
    CompileQueue* queue =
        compile_queue(comp_level);
    task = create_compile_task(queue,
        compile_id, method,
        osr_bci, comp_level,
        hot_method, hot_count,
        comment,
        blocking);



                                        4
Compile
Compile::Compile
  :
    if ((jvms = cg->generate(jvms)) == NULL) // Parse
  :
  Optimize();
  :
  Code_Gen();




                                      5
Compiler data structures
Node
MachNode
Type
ciObject
Phase
JVMState
nmethod




                           6
Compiler data structures




Compile::build_start_state + aload_0
(this)

                                       7
Compiler data structures




Graph Text Representation
                            8
Compiler data structures
$ ~/jdk1.7.0-b147/fastdebug/bin/java -XX:+PrintCompilation -XX:+PrintIdeal -XX:CICompilerCount=1 sum

    214    1             sum::doit (22 bytes)
VM option '+PrintIdeal' ...
  21  ConI  === 0 [[ 180 ]] #int:0
 180  Phi   === 184 21 70 [[ 179 ]] #int !orig=[159],[139],[66] !jvms: sum::doit @ bci:10
 179  AddI  === _ 180 181 [[ 178 ]] !orig=[154],[137],70,[145] !jvms: sum::doit @ bci:12
 178  AddI  === _ 179 181 [[ 177 ]] !orig=[153],[146],[135],86,[71] !jvms: sum::doit @ bci:14
 177  AddI  === _ 178 181 [[ 176 ]] !orig=[165],[152],86,[71] !jvms: sum::doit @ bci:14
 148  ConI  === 0 [[ 87 ]] #int:97
 176  AddI  === _ 177 181 [[ 190 ]] !orig=[168],[152],86,[71] !jvms: sum::doit @ bci:14

// <idx> <node type> === <in[]> [[out[]]] <additional desc>
// jvms = JVMState, root()->dump(9999) would dump IR as above;

Real example: https://gist.github.com/1369656




                                                                             9
Ideal Graph Visualizer




                   Graph display options

At level 4, the IR can be displayed at each stage
   of parsing and at each stage of optimization

                                10
Ideal Graph Visualizer
public int getValue() { return value; }




                             enum { Control, I_O, Memory,
                             FramePtr, ReturnAdr, Parms };




                                                             11
Ideal Graph Visualizer




Final Code: the nodes have become
machine-dependent nodes such as Mach* nodes,
Epilog, Prolog, and Ret

                                   12
Compiler data structures
Ideal Graph Visualizer
http://ssw.jku.at/General/Staff/TW/igv.html


etc/idealgraphvisualizer.conf:

default_options="-J-Xmx400m --branding idealgraphvisualizer"

You need a fastdebug or a debug build.
In the OpenJDK build dir:
$ make fastdebug_build
  OR
$ make debug_build




                                              13
Compiler data structures
Options to generate data for Ideal
Graph Visualizer
-XX:PrintIdealGraphLevel=0 [0:None, 4: most verbose]
-XX:PrintIdealGraphPort=4444
-XX:PrintIdealGraphAddress="127.0.0.1"
-XX:PrintIdealGraphFile=<path to IR xml file>

IdealGraphVisualizer listens to port 4444 by default.




                                                        14
Compiler data structures
// -XX:PrintIdealGraphFile=<path> , Ideal Graph Visualizer can display this
<graphDocument>
  <group>
    <properties> <p name="name"> virtual jint Call.doit()</p> </properties>
    <graph name="Bytecode 0: aload_0">
      <nodes>
        <node id="159337448">
          <properties>
            <p name="name"> Root</p>
            <p name="type"> bottom</p>
            <p name="idx"> 0</p>
            ...
          </properties>
        </node>
        <node id="159414516"> ... </node>
        <node id="159337448"> ... </node>
        ...
      </nodes>
      <edges>
        <edge index="0" to="159337448" from="159337448"></edge>
        <edge index="0" to="159414516" from="159414516"></edge>

Example: https://gist.github.com/1369620




                                                                              15
Compiler data structures
Node
    #   _in : Node** // use-def
    #   _out: Node** // def-use
    #   _cnt: node_idx_t // # of required inputs
    #   _max: node_idx_t // actual input array length
    #   _outcnt: node_idx_t
    #   _outmax: node_idx_t
    -   _class_id: jushort
    -   _flags: jushort

// _in is ordered; the position of each input matters
// 340 subclasses in total <-> 62 in C1
// _class_id is a 16-bit value used to determine the node type (ideal and mach)
// _flags is enum NodeFlags: Flag_is_Copy, Flag_is_Call,
// Flag_is_macro, Flag_is_Con...




                                                          16
Compiler data structures
// Insert a new required input at the end
void Node::ins_req( uint idx, Node *n ) {
  assert( is_not_dead(n), "can not use dead node");
  add_req(NULL);                // Make space
  ...
  _in[idx] = n;    // Stuff over old required edge
  if (n != NULL) n->add_out((Node *)this); // Add reciprocal def-use edge
}

  void add_out( Node *n ) {
    if (is_top()) return;
    if( _outcnt == _outmax ) out_grow(_outcnt);
    _out[_outcnt++] = n;
  }
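
The pairing of use-def (`_in`) and def-use (`_out`) edges can be illustrated with a stripped-down node. This is a minimal sketch, assuming a hypothetical `ToyNode` that uses `std::vector` instead of the arena-allocated arrays in HotSpot; only the invariant is the same: every input edge has a reciprocal output edge.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy node: in_ holds ordered use-def edges, out_ holds unordered def-use edges.
struct ToyNode {
    std::vector<ToyNode*> in_;
    std::vector<ToyNode*> out_;

    // Append a required input and record the reciprocal def-use edge.
    void add_req(ToyNode* def) {
        in_.push_back(def);
        if (def != nullptr) def->out_.push_back(this);
    }

    // Replace input idx, keeping both edge directions consistent.
    void set_req(std::size_t idx, ToyNode* def) {
        ToyNode* old = in_.at(idx);
        if (old != nullptr) {
            // Remove one occurrence of this node from the old def's out_ list.
            for (std::size_t i = 0; i < old->out_.size(); ++i) {
                if (old->out_[i] == this) {
                    old->out_[i] = old->out_.back();
                    old->out_.pop_back();
                    break;
                }
            }
        }
        in_[idx] = def;
        if (def != nullptr) def->out_.push_back(this);
    }
};
```

Because `out_` is unordered, removing an edge can swap in the last element, which keeps the removal cheap once the edge is found.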




                                                      17
Compiler data structures
RegionNode
 Merges control.
PhiNode
 Merges the data that accompanies a control merge.
 Points to the corresponding RegionNode.
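
The positional correspondence between a region's control inputs and a phi's data inputs can be sketched as follows. `ToyRegion` and `ToyPhi` are hypothetical types for illustration; real C2 nodes keep this correspondence through input indices (phi input i pairs with region input i), not through names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy merge point: preds[i] names the i-th incoming control path.
struct ToyRegion {
    std::vector<std::string> preds;
};

// Toy phi: values[i] is the value that arrives along region->preds[i].
struct ToyPhi {
    const ToyRegion* region;
    std::vector<int> values;

    // Select the merged value for the path that was actually taken.
    int value_from(const std::string& pred) const {
        assert(values.size() == region->preds.size());
        for (std::size_t i = 0; i < region->preds.size(); ++i)
            if (region->preds[i] == pred) return values[i];
        assert(false && "unknown predecessor");
        return 0;
    }
};
```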




                               18
Compiler data structures
Node // Optimize functions
// more ideal node, canonicalize
virtual Node *Ideal(PhaseGVN *phase, bool can_reshape);

// set of values this node can take
virtual const Type *Value( PhaseTransform *phase ) const;

// existing node which computes same
virtual Node *Identity( PhaseTransform *phase );


                                       19
Compiler data structures
Node Ideal
 defined in Add*Node, MinINode, StartNode, ReturnNode,
RethrowNode, SafePointNode, AllocateArrayNode, LockNode,
UnlockNode, RegionNode, PhiNode, PCTableNode,
NeverBranchNode, CMove*Node, ConstraintCastNode,
CheckCastPPNode, Conv?2?Node, Div?Node, Mod?Node,
IfNode, LoopNode, CountedLoopNode, LoopLimitNode,
Load*Node, Store*Node, ClearArrayNode, StrIntrinsicNode,
MemBarNode, MergeMemNode, Mul*Node, And*Node,
LShift*Node, URShift*Node, RootNode, HaltNode, Sub*Node,
Cmp*Node, BoolNode

Many subclasses define their own Ideal, Value, and Identity
to suit their purpose.




                                                           20
Compiler data structures
AddNode     Ideal
Convert     "(x+1)+2" into "x+(1+2)"
Convert     "(x+1)+y" into "(x+y)+1"
Convert     "x+(y+1)" into "(x+y)+1"

x    Con1       Con2   x   Con1     Con2

    Add                         Add

          Add                 Add



                                           21
Compiler data structures
AddINode Ideal
 Node* in1 = in(1);
 Node* in2 = in(2);
 int op1 = in1->Opcode();
 int op2 = in2->Opcode();
 // Fold (con1-x)+con2 into (con1+con2)-x
 if ( op1 == Op_AddI && op2 == Op_SubI ) {
   // Swap edges to try optimizations below
   in1 = in2;
   in2 = in(1);
   op1 = op2;
   op2 = in2->Opcode();
 }
 if( op1 == Op_SubI ) {
   // Convert "(a-b)+(c-d)" into "(a+c)-(b+d)"
   // Convert "(a-b)+(b+c)" into "(a+c)"
   // Convert "(a-b)+(c+b)" into "(a+c)"




                                              22
Compiler data structures
const Type *AddNode::Value(...)
  // Either input is TOP ==> the result is TOP
  // Either input is BOTTOM ==> the result is the local BOTTOM
  // Check for an addition involving the additive identity




                                                          23
Compiler data structures
Node *AddNode::Identity(...)
  // If either input is a constant 0, return the other input.

  const Type *zero = add_id(); // The additive identity
  if( phase->type( in(1) )->higher_equal( zero ) ) return in(2);
  if( phase->type( in(2) )->higher_equal( zero ) ) return in(1);
  return this;




                                                          24
Compiler data structures
Node *AddINode::Identity(...)
  // Fold (x-y)+y   OR   y+(x-y)   into   x

  if( in(1)->Opcode() == Op_SubI && phase->eqv(in(1)->in(2),in(2)) ) {
    return in(1)->in(1);
  }
  else if( in(2)->Opcode() == Op_SubI && phase->eqv(in(2)->in(2),in(1)) ) {
    return in(2)->in(1);
  }
  return AddNode::Identity(phase);




                                                           25
Compiler data structures
Node *PhaseGVN::transform_no_reclaim
// Return a node which computes the same function
// as this node, but in a faster or cheaper fashion.

  while( 1 ) {
    Node *i = k->Ideal(this, /*can_reshape=*/false);
    if( !i ) break;...
  }
  const Type *t = k->Value(this); // Get runtime Value set
  k->raise_bottom_type(t);
  Node *i = k->Identity(this); if (i != k) return i;

  i = hash_find_insert(k); if( i && (i != k)) return i;

Parse transforms each Node as it creates it, so GVN runs while parsing.
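
The effect of transforming every node at creation time can be seen in a miniature value-numbering table. This is a sketch, not HotSpot's implementation: `ToyGVN` is a hypothetical class, nodes are plain integer ids, and `intern` only plays the role that `hash_find_insert` plays in the slide above.

```cpp
#include <cassert>
#include <map>
#include <tuple>
#include <utility>
#include <vector>

// Toy value-numbering table. Nodes are integer ids.
class ToyGVN {
public:
    enum Op { Con, Param, Add };

    int con(int v)   { return intern(Con, v, -1); }
    int param(int i) { return intern(Param, i, -1); }

    int add(int a, int b) {
        // Identity step: x + 0 and 0 + x are just x.
        if (is_zero(a)) return b;
        if (is_zero(b)) return a;
        // Value step: both inputs constant, so the result is a constant.
        if (nodes_[a].op == Con && nodes_[b].op == Con)
            return con(nodes_[a].x + nodes_[b].x);
        // Canonical operand order so that a+b and b+a share one entry.
        if (a > b) std::swap(a, b);
        return intern(Add, a, b);
    }

    int size() const { return static_cast<int>(nodes_.size()); }

private:
    struct N { Op op; int x; int y; };

    bool is_zero(int id) const { return nodes_[id].op == Con && nodes_[id].x == 0; }

    // Return the existing id for this (op, x, y) or create a new node.
    int intern(Op op, int x, int y) {
        auto key = std::make_tuple(op, x, y);
        auto it = table_.find(key);
        if (it != table_.end()) return it->second;
        int id = static_cast<int>(nodes_.size());
        nodes_.push_back({op, x, y});
        table_[key] = id;
        return id;
    }

    std::vector<N> nodes_;
    std::map<std::tuple<Op, int, int>, int> table_;
};
```

Since every node goes through `add` or `intern` when it is built, redundant nodes never enter the graph in the first place.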




                                                          26
Compiler data structures
Node // raise_bottom_type related
  TypeNode // Type* _type
    ConNode // ConINode, ConPNode ..
    PhiNode // TypePtr* _adr_type
          // int _inst_id,
          // _inst_index, _inst_offset
    ConvI2LNode
  MemNode // TypePtr* _adr_type
    LoadNode // Type* _type
      LoadPNode // load obj or arr
      LoadINode
https://gist.github.com/1369608


                                        27
Compiler data structures
Node
^
RegionNode
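Maps onto basic blocks. Its inputs are control sources.
A PhiNode has an input that points to its RegionNode.
The data inputs merged by a PhiNode correspond one-to-one
to the inputs of the RegionNode.
Input 0 of a PhiNode is the RegionNode; input 0 of a
RegionNode is the RegionNode itself.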
PhiNode* has_phi() const
^
LoopNode // Simple Loop Header
  short _loop_flags
^
RootNode




                                                  28
Compiler data structures
MultiNode

SafePointNode




                           29
Compiler data structures
TypeNode
^
ConNode




                           30
Compiler data structures
Node
^
ProjNode // project a single elem out of a tuple or signature type
^
ParmNode // incoming Parameters
  const uint _con;        // The field in the tuple we are projecting
  const bool _is_io_use;  // Used to distinguish between the projections
                          // used on the control and io paths from a macro node




                                                           31
Compiler data structures
Node
^
MergeMem // (See comment in memnode.cpp near MergeMemNode::MergeMemNode for semantics.)
  in(AliasIdxTop) = in(1) is always the top node
  in(0) is NULL
  in(AliasIdxBot) is a "wide" memory state.
  For in(AliasIdxRaw) = in(3) and above, mem state for alias type <N> or top
  base_memory() // wide state
  memory_at(N) // for alias type <N>
  Identity: if base is empty, return base; otherwise this
  Ideal: Simplify stacked MergeMem




                                                          32
Compiler data structures
TypeNode
^
PhiNode
Merges values arriving from different control paths.
Slot 0 is the controlling RegionNode.




                                           33
Compiler data structures
class ConINode : public ConNode {
public:
  ConINode( const TypeInt *t ) : ConNode(t) {}
  virtual int Opcode() const;

  // Factory method:
  static ConINode* make( Compile* C, int con ) {
    return new (C, 1) ConINode( TypeInt::make(con) );
  }
class ConNode : public TypeNode {
public:
  ConNode( const Type *t ) : TypeNode(t,1) {
    init_req(0, (Node*)Compile::current()->root());
    init_flags(Flag_is_Con);
  }
class TypeNode : public Node {
  const Type* const _type;
  TypeNode( const Type *t, uint required ) : Node



                                                        34
Compiler data structures
// Add pointer plus integer to get pointer. NOT commutative, really.
// So not really an AddNode. Lives here, because people associate it with
// an add.
class AddPNode : public Node {
public:
  enum { Control,               // When is it safe to do this add?
         Base,                  // Base oop, for GC purposes
         Address,               // Actually address, derived from base
         Offset } ;             // Offset added to address
  AddPNode( Node *base, Node *ptr, Node *off ) : Node(0,base,ptr,off) {
    init_class_id(Class_AddP);
  }
  Identity: if one input is 0, return in(Address), otherwise this
  Ideal: if the left input is an add of a constant, flatten the expression tree
         for a raw pointer whose address is NULL, return CastX2PNode(offset)
         if the right input is an add of a constant, change
         (ptr + (offset+con)) into (ptr + offset) + con




                                                                            35
Compiler data structures
// Return from subroutine node
class ReturnNode : public Node {
public:
  ReturnNode( uint edges, Node *cntrl, Node *i_o, Node *memory, Node *retadr, Node *frameptr );
  virtual int Opcode() const;
  virtual bool is_CFG() const { return true; }




                                                                       36
Compiler data structures
JVMState
 JVMState*        _caller    // for scope chains
 uint             _depth, _locoff, _stkoff, _monoff,
 uint             _scloff    // offset of scalar objs
 uint             _endoff
 uint             _sp
 int              _bci
 ReexecuteState   _reexecute
 ciMethod*        _method
 SafePointNode*   _map




                                                        37
Compiler data structures
class Type {
public:
  enum TYPES { Bad = 0, Control,
    Top,
    Int, Long, Half, NarrowOop,
    Tuple, Array,
    AnyPtr, RawPtr, OopPtr, InstPtr, AryPtr, KlassPtr,
    Function, Abio, Return_Address, Memory,
    FloatTop, FloatCon, FloatBot,
    DoubleTop, DoubleCon, DoubleBot,
    Bottom, lasttype };
private:
  const Type *_dual;
protected:
  const TYPES _base;




                                                         38
Compiler data structures
class Type {
  :
public:
  TYPES base();
  static const Type *make(enum TYPES);
  static int cmp(Type*, Type*);
  int higher_equal( Type *t);
  const Type *meet(Type *t);
  virtual const Type *widen(Type *old, Type* limit);
  virtual const Type *narrow(Type *old);
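
How `meet` and `higher_equal` relate is easiest to see on a tiny lattice. The sketch below assumes a hypothetical three-level `ToyType` (Top, a single int constant, Bottom); C2's actual Type hierarchy is far richer (integer ranges, pointer types, duals), so this only illustrates the ordering idea.

```cpp
#include <cassert>

// Toy three-level lattice for int values: Top above every constant,
// Bottom below every constant.
struct ToyType {
    enum Level { Top, Con, Bottom } level;
    int value;  // meaningful only when level == Con

    bool operator==(const ToyType& o) const {
        return level == o.level && (level != Con || value == o.value);
    }
};

const ToyType kTop{ToyType::Top, 0};
const ToyType kBottom{ToyType::Bottom, 0};
ToyType int_con(int v) { return ToyType{ToyType::Con, v}; }

// meet moves down the lattice: the result is at or below both inputs.
ToyType meet(const ToyType& a, const ToyType& b) {
    if (a.level == ToyType::Top) return b;
    if (b.level == ToyType::Top) return a;
    if (a == b) return a;
    return kBottom;
}

// a is "higher or equal" to b when meeting them yields b.
bool higher_equal(const ToyType& a, const ToyType& b) {
    return meet(a, b) == b;
}
```

With this ordering, the check in AddNode::Identity (`higher_equal(zero)`) reads as "this input is known to be at least as precise as the constant zero".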




                                                      39
Compiler data structures
class   Dict;
class   Type;
class     TypeD;
class     TypeF;
class     TypeInt;
class     TypeLong;
class     TypeNarrowOop;
class     TypeAry;
class     TypeTuple;
class     TypePtr;
class       TypeRawPtr;
class       TypeOopPtr;
class         TypeInstPtr;
class         TypeAryPtr;
class         TypeKlassPtr;




                              40
Compiler data structures
Phase
                                PhaseTransform
  Compile
                                  PhaseIdealLoop
  GraphKit
                                  Matcher
  PhaseCFG
                                  PhaseValues
  PhaseBlockLayout
                                    PhaseGVN
  PhaseCoalesce
                                      PhaseIterGVN
    PhaseAggressiveCoalesce
                                        PhaseCCP
    PhaseConservativeCoalesce
                                  PhasePeephole
  PhaseIFG
                                PhaseStringOpts
  PhaseLive
  PhaseMacroExpand
  PhaseRegAlloc
    PhaseChaitin
  PhaseRemoveUseless




                                                     41
Compiler data structures
class Phase : public StackObj {
public:
  enum PhaseNumber { Compiler, Parser,Remove_Useless, ...}
protected:
  enum PhaseNumber _pnum;
public:
  Compile * C;
}




                                                             42
Compiler data structures
class Compile : public Phase {
  const int        _compile_id;
  ciMethod*        _method;
  int              _entry_bci;
  const TypeFunc*  _tf;
  InlineTree*      _ilt;
  Arena            _comp_arena;
  ConnectionGraph* _congraph;
  uint             _unique;
  Arena            _node_arena;
  RootNode*        _root;
  Node*            _top;
  :
}




                                  43
Compiler data structures
class Compile : public Phase {
  :
  PhaseGVN*         _initial_gvn;
  Unique_Node_List  _for_igvn;
  WarmCallInfo*     _warm_calls;
  PhaseCFG*         _cfg;
  Matcher*          _matcher;
  PhaseRegAlloc*    _regalloc;
  OopMapSet*        _oop_map_set;
  :
}




                                    44
Compiler data structures
class PhaseTransform : public Phase {
protected:
  Arena* _arena;
  Node_Array _nodes;
  Type_Array _types;
  ConINode*  _icons[...];
  ConLNode*  _lcons[...];
  ConNode*   _zcons[...];
  :
}




                                        45
Compiler data structures
class PhaseTransform : public Phase {
public:
  const Type* type(const Node* n) const;
  const Type* type_or_null(const Node* n) const;
  void set_type(const Node* n, const Type* t);
  void set_type_bottom(const Node* n);
  void ensure_type_or_null(const Node* n);
  ConNode* makecon(const Type* t);
  ConINode* intcon(jint i);
  ConLNode* longcon(jlong l);
  virtual Node *transform(Node *) = 0;
  :
}




                                                   46
Compiler data structures
Manages values in a table
class PhaseValues : public PhaseTransform {
protected:
  NodeHash       _table; // for value-numbering
public:
  bool   hash_delete(Node *n);
  bool   hash_insert(Node *n);
  Node  *hash_find_insert(Node* n);
  Node  *hash_find(Node* n);
  :
}




                                                  47
Compiler data structures
Local, pessimistic GVN-style optimization
class PhaseGVN : public PhaseValues {
public:
  Node  *transform(Node *n);
  Node  *transform_no_reclaim(Node *n);
  :
}




                                          48
Compiler data structures
Iterative local, pessimistic GVN-style optimization plus ideal transformations
class PhaseIterGVN : public PhaseGVN {
private:
  bool _delay_transform;
  virtual Node *transform_old(Node *a_node);
  void subsume_node(Node *old, Node *nn);
protected:
  virtual Node *transform(Node *a_node);
  void init_worklist(Node *a_root);
  virtual const Type* saturate(Type*, Type*, Type*)
public:
  Unique_Node_List _worklist;
  void optimize();
  :
}




                                                      49
Parse
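The first pass identifies the blocks. The second pass
visits each block and processes its bytecodes:
it creates objects of Node subclasses, creates and
updates JVMState, and optimizes along the way.
To help value propagation, blocks are visited so that,
as far as possible, the predecessors of a block have
already been parsed.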




                             50
Parse
#0 Parse::do_one_bytecode()
#1 Parse::do_one_block()
#2 Parse::do_all_blocks()
#3 Parse::Parse(JVMState*, ciMethod*, float) ()
#4 ParseGenerator::generate(JVMState*)
#5 Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool)




                                        51
Parse do_one_bytecode
  switch (bc()) {
  case Bytecodes::_nop:
    // do nothing
    break;
  case Bytecodes::_lconst_0:
    push_pair(longcon(0)); break;
  :
  case Bytecodes::_iconst_5: push(intcon( 5)); break;
  case Bytecodes::_bipush:   push(intcon(iter().get_constant_u1())); break;
  case Bytecodes::_sipush:   push(intcon(iter().get_constant_u2())); break;

There are also helper functions such as makecon and intcon that return nodes representing constants.




                                                        52
Parse do_one_bytecode
  case Bytecodes::_ldc:
  case Bytecodes::_ldc_w:
  case Bytecodes::_ldc2_w:
    // If the constant is unresolved, run this BC once in the interpreter.
    {
      ciConstant constant = iter().get_constant();
      if (constant.basic_type() == T_OBJECT &&
          !constant.as_object()->is_loaded()) {
        int index = iter().get_constant_pool_index();




                                                            53
Parse do_one_bytecode
 case Bytecodes::_aload_0:
   push( local(0) );
   break;
 :

 case Bytecodes::_aload:
   push( local(iter().get_index()) );
   break;

push and local ultimately read or update the state held in
the JVMState and SafePointNode.

iter() can be used to fetch the operands of a bytecode.




                                                  54
Parse do_one_bytecode
 case Bytecodes::_fstore_0:
 case Bytecodes::_istore_0:
 case Bytecodes::_astore_0:
   set_local( 0, pop() );
   break;

 :
 case Bytecodes::_fstore:
 case Bytecodes::_istore:
 case Bytecodes::_astore:
   set_local( iter().get_index(), pop() );
   break;




                                             55
Parse do_one_bytecode
 case Bytecodes::_pop: _sp -= 1;    break;
 case Bytecodes::_pop2: _sp -= 2;   break;
 case Bytecodes::_swap:
   a = pop();
   b = pop();
   push(a);
   push(b);
   break;
 case Bytecodes::_dup:
   a = pop();
   push(a);
   push(a);
   break;




                                             56
Parse do_one_bytecode
  case Bytecodes::_baload: array_load(T_BYTE);   break;
  case Bytecodes::_caload: array_load(T_CHAR);   break;
  case Bytecodes::_iaload: array_load(T_INT);    break;
  case Bytecodes::_saload: array_load(T_SHORT); break;
  case Bytecodes::_faload: array_load(T_FLOAT); break;
  case Bytecodes::_aaload: array_load(T_OBJECT); break;
  case Bytecodes::_laload: {
    a = array_addressing(T_LONG, 0);
    if (stopped()) return;      // guaranteed null or range check
    _sp -= 2;                   // Pop array and index
    push_pair( make_load(control(), a, TypeLong::LONG, T_LONG, TypeAryPtr::LONGS));
    break;
  }




                                                          57
Parse do_one_bytecode
  case Bytecodes::_bastore: array_store(T_BYTE); break;
  case Bytecodes::_castore: array_store(T_CHAR); break;
  case Bytecodes::_iastore: array_store(T_INT);    break;
  case Bytecodes::_sastore: array_store(T_SHORT); break;
  case Bytecodes::_fastore: array_store(T_FLOAT); break;
  case Bytecodes::_aastore: {
    d = array_addressing(T_OBJECT, 1);
    if (stopped()) return;      // guaranteed null or range check
    array_store_check();
    c = pop();                  // Oop to store
    b = pop();                  // index (already used)
    a = pop();                  // the array itself
    const TypeOopPtr* elemtype = _gvn.type(a)->is_aryptr()->elem()->make_oopptr();
    const TypeAryPtr* adr_type = TypeAryPtr::OOPS;
    Node* store = store_oop_to_array(control(), a, d, adr_type, c, elemtype, T_OBJECT);



                                                            58
Parse do_one_bytecode
 case Bytecodes::_getfield:
   do_getfield();
   break;

 case Bytecodes::_getstatic:
   do_getstatic();
   break;

 case Bytecodes::_putfield:
   do_putfield();
   break;

 case Bytecodes::_putstatic:
   do_putstatic();
   break;




                               59
Parse do_one_bytecode
 // implementation of _get* and _put* bytecodes
 void do_getstatic() { do_field_access(true, false); }
 void do_getfield () { do_field_access(true, true); }
 void do_putstatic() { do_field_access(false, false); }
 void do_putfield () { do_field_access(false, true); }




                                                          60
Parse do_one_bytecode
Parse::do_field_access
  Parse::do_get_xxx(Node* obj, ciField* field, bool is_field)
    Node *adr = basic_plus_adr(obj, obj, offset);
    :
    Node* ld = make_load(NULL, adr, type, bt, adr_type, is_vol);


Node* GraphKit::basic_plus_adr(Node* base, Node* ptr, Node* offset) {
  // short-circuit a common case
  if (offset == intcon(0)) return ptr;
  return _gvn.transform( new (C, 4) AddPNode(base, ptr, offset) );
}




                                                          61
Parse do_one_bytecode
// factory methods in "int adr_idx"
Node* GraphKit::make_load(Node* ctl, Node* adr, const Type* t, BasicType bt,
                          int adr_idx,
                          bool require_atomic_access) {
  :  // setup of adr_type is elided here
  Node* mem = memory(adr_idx);
  Node* ld;
  if (require_atomic_access && bt == T_LONG) {
    ld = LoadLNode::make_atomic(C, ctl, mem, adr, adr_type, t);
  } else {
    ld = LoadNode::make(_gvn, ctl, mem, adr, adr_type, t, bt);
  }
  return _gvn.transform(ld);
}




                                                          62
Parse do_one_bytecode
Node* GraphKit::memory(uint alias_idx) {
  MergeMemNode* mem = merged_memory();
  Node* p = mem->memory_at(alias_idx);
  _gvn.set_type(p, Type::MEMORY); // must be mapped
  return p;
}




                                                      63
Parse do_one_bytecode
_iadd
b = pop(), a = pop()
push(_gvn.transform(
    new (C, 3) AddINode(a,b)))

// GraphKit::pop()
Node* pop() { ..; return
  _map->stack(_map->_jvms,--_sp); }
// SafePointNode::stack
Node *stack(JVMState* jvms, uint idx) const {
  return in(jvms->stkoff() + idx);
}
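
The parser does not execute bytecodes; it runs them over a stack of symbolic values so that each instruction builds a piece of the graph. A minimal sketch of that idea, assuming a hypothetical `ToyParser` whose stack holds strings where C2's stack holds Node pointers inside the SafePointNode map:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy abstract interpreter: the "stack" holds symbolic values, so
// executing bytecodes builds expressions instead of numbers.
class ToyParser {
public:
    explicit ToyParser(std::vector<std::string> locals) : locals_(std::move(locals)) {}

    void iload(int i)  { push(locals_.at(i)); }
    void istore(int i) { locals_.at(i) = pop(); }
    void iconst(int v) { push(std::to_string(v)); }
    void iadd() {
        std::string b = pop();  // top of stack is the right operand
        std::string a = pop();
        push("(" + a + " + " + b + ")");
    }
    void dup()  { std::string a = pop(); push(a); push(a); }
    void swap() { std::string a = pop(); std::string b = pop(); push(a); push(b); }

    const std::string& local(int i) const { return locals_.at(i); }
    std::size_t depth() const { return stack_.size(); }

private:
    void push(const std::string& v) { stack_.push_back(v); }
    std::string pop() {
        assert(!stack_.empty());
        std::string v = stack_.back();
        stack_.pop_back();
        return v;
    }

    std::vector<std::string> locals_;
    std::vector<std::string> stack_;
};
```

Running `iload 0; iload 1; iadd; istore 0` on this toy leaves the expression `(a + b)` in local 0, which is the string analogue of the AddINode that `_iadd` creates above.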




                                                64
Parse do_one_bytecode
case Bytecodes::_iinc:         // Increment local
    i = iter().get_index();     // Get local index
    set_local( i, _gvn.transform(
        new (C, 3) AddINode(
            _gvn.intcon(iter().get_iinc_con()),
            local(i) ) ) );
    break;




                                                     65
Parse do_one_bytecode
_goto, _goto_w
    int target_bci = (bc() == Bytecodes::_goto) ?
        iter().get_dest() : iter().get_far_dest();
    // If this is a backwards branch in the bytecodes, add Safepoint
    maybe_add_safepoint(target_bci);
    // Update method data
    profile_taken_branch(target_bci);
    // Add loop predicate if it goes to a loop
    if (should_add_predicate(target_bci)){
      add_predicate();
    }
    // Merge the current control into the target basic block
    merge(target_bci);
    ...



                                                         66
Parse do_one_bytecode
_goto, _goto_w
    ...// See if we can get some profile data and hand it off to the next block
    Block *target_block = block()->successor_for_bci(target_bci);
    if (target_block->pred_count() != 1)  break;
    ciMethodData* methodData = method()->method_data();
    if (!methodData->is_mature())  break;
    ciProfileData* data = methodData->bci_to_data(bci());
    assert( data->is_JumpData(), "" );
    int taken = ((ciJumpData*)data)->taken();
    taken = method()->scale_count(taken);
    target_block->set_count(taken);
    break;




                                                          67
Parse do_one_bytecode
case _ifnull:    btest = BoolTest::eq;
    goto handle_if_null;
case _ifnonnull: btest = BoolTest::ne;
    goto handle_if_null;
handle_if_null:
    // If this is a backwards branch in the bytecodes, add Safepoint
    maybe_add_safepoint(iter().get_dest());
    a = null();
    b = pop();
    c = _gvn.transform( new (C, 3) CmpPNode(b, a) );
    do_ifnull(btest, c);
    break;




                                                         68
Parse do_one_bytecode
case _if_acmpeq: btest = BoolTest::eq;
    goto handle_if_acmp;
case _if_acmpne: btest = BoolTest::ne;
    goto handle_if_acmp;
handle_if_acmp:
    // If this is a backwards branch in the bytecodes, add Safepoint
    maybe_add_safepoint(iter().get_dest());
    a = pop();
    b = pop();
    c = _gvn.transform( new (C, 3) CmpPNode(b, a) );
    do_if(btest, c);
    break;




                                                         69
Parse do_one_bytecode
case Bytecodes::_tableswitch:
    do_tableswitch();
    break;

case Bytecodes::_lookupswitch:
    do_lookupswitch();
    break;




                                 70
Parse do_one_bytecode
case Bytecodes::_invokestatic:
case Bytecodes::_invokedynamic:
case Bytecodes::_invokespecial:
case Bytecodes::_invokevirtual:
case Bytecodes::_invokeinterface:
    do_call();
    break;
case Bytecodes::_checkcast:
    do_checkcast();
    break;
case Bytecodes::_instanceof:
    do_instanceof();
    break;




                                    71
Parse do_one_bytecode
getClass is inlined and turns into
LoadKlass -> a memory access.
hashCode becomes a static call.



public class Call {
  public static void main(String[] args) {
    Call c = new Call();
    for (int i = 0; i < 100000; i++) {
      c.doit();
    }
  }
  int doit() {
    return getClass().hashCode();
  }
}




                                             72
Parse do_one_bytecode
case Bytecodes::_anewarray:
    do_anewarray();
    break;
case Bytecodes::_newarray:
    do_newarray((BasicType)iter().get_index());
    break;
case Bytecodes::_multianewarray:
    do_multianewarray();
    break;
case Bytecodes::_new:
    do_new();
    break;




                                                  73
Parse do_one_bytecode
case Bytecodes::_jsr:
case Bytecodes::_jsr_w:
    do_jsr();
    break;

case Bytecodes::_ret:
    do_ret();
    break;




                          74
Parse do_one_bytecode
case Bytecodes::_monitorenter:
    do_monitor_enter();
    break;

case Bytecodes::_monitorexit:
    do_monitor_exit();
    break;




                                 75
Optimize
PhaseIterGVN igvn(initial_gvn)
igvn.optimize()
ConnectionGraph::do_analysis(this,
&igvn) // EscapeAnalysis
igvn.optimize()
PhaseIdealLoop ideal_loop(igvn, true)
PhaseIdealLoop ideal_loop(igvn, ...)
PhaseIdealLoop ideal_loop(igvn, ...)
PhaseCCP ccp( &igvn )
PhaseMacroExpand mex(igvn)




                                        76
Optimize
PhaseIterGVN igvn(initial_gvn)
igvn.optimize()
 // Take a node off the worklist and transform it.
 // If the node changed, update the edge information
 // and put its users on the worklist.
 while( _worklist.size() ) {
   Node *n = _worklist.pop();
   if (++loop_count >= K * C->unique()) { // bounds check
     ...}
   if (n->outcnt() != 0) {
     Node *nn = transform_old(n);
   } else if (!n->is_top()) {
     remove_dead_node(n);
   }
 }
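
The worklist discipline (process a node, and if it changed, requeue its users) can be shown with plain constant folding. This is a sketch under stated assumptions: `ToyGraph` is hypothetical, folding additions stands in for the full Ideal/Value/Identity transform, and dead-node removal is left out.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Toy graph: each node is a known constant, an unknown value, or an
// addition of two earlier nodes. users[] is the def-use direction.
struct ToyGraph {
    struct N { bool is_con; int value; int a; int b; };
    std::vector<N> nodes;
    std::vector<std::vector<int>> users;

    int con(int v) { return make({true, v, -1, -1}); }
    // A placeholder that is not yet known to be constant.
    int unknown()  { return make({false, 0, -1, -1}); }
    int add(int a, int b) {
        int id = make({false, 0, a, b});
        users[a].push_back(id);
        users[b].push_back(id);
        return id;
    }

    // Later information: node id turns out to be the constant v.
    void set_con(int id, int v) { nodes[id] = {true, v, -1, -1}; }

    // Drain the worklist: fold a node when both inputs are constants and,
    // because it changed, revisit everything that uses it.
    void optimize(std::deque<int> worklist) {
        while (!worklist.empty()) {
            int id = worklist.front();
            worklist.pop_front();
            N& n = nodes[id];
            if (n.is_con || n.a < 0) continue;
            if (nodes[n.a].is_con && nodes[n.b].is_con) {
                n = {true, nodes[n.a].value + nodes[n.b].value, -1, -1};
                for (int u : users[id]) worklist.push_back(u);
            }
        }
    }

private:
    int make(N n) {
        nodes.push_back(n);
        users.emplace_back();
        return static_cast<int>(nodes.size()) - 1;
    }
};
```

Only the users of a changed node are revisited, so the cost follows the amount of change and not the size of the graph.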




                                                     78
Optimize
Node *PhaseIterGVN::transform_old( Node *n )

Points to note:
the can_reshape argument passed to Ideal is true, and
when a node is computed to be a constant, subsume_node
redirects its users to the new node.




                                                79
Optimize
PhaseIterGVN igvn(initial_gvn)
igvn.optimize()
ConnectionGraph::do_analysis(this,
&igvn) // EscapeAnalysis
igvn.optimize()
PhaseIdealLoop ideal_loop(igvn, true)
PhaseIdealLoop ideal_loop(igvn, ...)
PhaseIdealLoop ideal_loop(igvn, ...)
PhaseCCP ccp( &igvn )
PhaseMacroExpand mex(igvn)




                                        80
Optimize
ConnectionGraph::do_analysis(this,
&igvn) // EscapeAnalysis
 if (congraph->compute_escape()) {
   // There are non escaping objects.
   C->set_congraph(congraph);
 }

congraph is consulted by LockNode and UnlockNode; when the locked
object does not escape, the lock operations are eliminated.
Locking and unlocking a local object is pointless.




                                                 81
Optimize
ConnectionGraph::compute_escape()
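Returns false if there is no Java object allocation.
Puts AddP, MergeMem, etc. on a worklist, along with their outs.

Examines the nodes on the worklist in detail.

Registers them in GrowableArray<PointsToNode> _nodes, classifies them as
GlobalEscape, ArgEscape, or NoEscape, and propagates the state to
reachable nodes.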

// comment in escape.hpp
// flags: PrintEscapeAnalysis PrintEliminateAllocations
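
The propagation step (escape state flows to everything reachable) can be sketched on a small graph. `ToyConnectionGraph` is a hypothetical simplification: the real ConnectionGraph distinguishes several node and edge kinds, while this sketch has a single edge kind and only pushes the "worst" state forward.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Ordered so that a larger value means "escapes further".
enum EscapeState { NoEscape = 0, ArgEscape = 1, GlobalEscape = 2 };

// Toy points-to graph: edges[n] lists the objects reachable from n.
struct ToyConnectionGraph {
    std::vector<EscapeState> state;
    std::vector<std::vector<int>> edges;

    int add_node(EscapeState s = NoEscape) {
        state.push_back(s);
        edges.emplace_back();
        return static_cast<int>(state.size()) - 1;
    }
    void add_edge(int from, int to) { edges[from].push_back(to); }

    // Anything reachable from an escaping node escapes at least as far.
    void propagate() {
        std::deque<int> worklist;
        for (int i = 0; i < static_cast<int>(state.size()); ++i)
            if (state[i] != NoEscape) worklist.push_back(i);
        while (!worklist.empty()) {
            int n = worklist.front();
            worklist.pop_front();
            for (int m : edges[n]) {
                if (state[m] < state[n]) {
                    state[m] = state[n];
                    worklist.push_back(m);
                }
            }
        }
    }
};
```

Nodes that are still NoEscape after propagation are the candidates for lock elimination and scalar replacement mentioned on the neighbouring slides.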




                                                          82
Optimize
class ConnectionGraph: public ResourceObj
  // escape state of a node
  PointsToNode::EscapeState escape_state(Node *n);

  // other information we have collected
  bool is_scalar_replaceable(Node *n) {
    if (_collecting || (n->_idx >= nodes_size()))
      return false;
    PointsToNode* ptn = ptnode_adr(n->_idx);
    return ptn->escape_state() == PointsToNode::NoEscape && ptn->_scalar_replaceable;
  }




                                                           83
Optimize
PhaseIterGVN igvn(initial_gvn)
igvn.optimize()
ConnectionGraph::do_analysis(this,
&igvn) // EscapeAnalysis
igvn.optimize()
PhaseIdealLoop ideal_loop(igvn, true)
PhaseIdealLoop ideal_loop(igvn, ...)
PhaseIdealLoop ideal_loop(igvn, ...)
PhaseCCP ccp( &igvn )
PhaseMacroExpand mex(igvn)




                                        84
Optimize
PhaseIdealLoop::PhaseIdealLoop(PhaseIterGVN &igvn, bool do_split_ifs)

  build_and_optimize(do_split_ifs);




                                      85
Optimize
// Convert to counted loops where possible
PhaseIdealLoop::is_counted_loop( Node *x, IdealLoopTree *loop )
  PhaseIdealLoop::is_counted_loop attempts the conversion to
  CountedLoop. counted_loop is also called recursively on child loops.
void PhaseIdealLoop::do_peeling( IdealLoopTree *loop, Node_List &old_new )
// Peels off the first iteration. Illustrated in loopTransform.cpp
void PhaseIdealLoop::do_unroll( IdealLoopTree *loop, Node_List &old_new, bool adjust_min_trip )
void PhaseIdealLoop::do_maximally_unroll( IdealLoopTree *loop, Node_List &old_new )

// Eliminate range-checks and other trip-counter vs loop-invariant tests.
void PhaseIdealLoop::do_range_check( IdealLoopTree *loop, Node_List &old_new )
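Peeling and unrolling are easier to picture at the source level. The following is a hand-written Java illustration, not the output of C2. The class and method names are hypothetical, and C2 performs these transformations on the ideal graph rather than on source code.

```java
class LoopTransformDemo {
    // The loop as written.
    static int original(int[] a) {
        int sm = 0;
        for (int i = 0; i < a.length; i++) {
            sm += a[i];
        }
        return sm;
    }

    // Peeling: the first iteration is pulled out in front of the loop.
    static int peeled(int[] a) {
        int sm = 0;
        int i = 0;
        if (i < a.length) {
            sm += a[i];             // peeled first iteration
            i++;
            for (; i < a.length; i++) {
                sm += a[i];
            }
        }
        return sm;
    }

    // Unrolling by two: the body is duplicated and a small post-loop
    // handles the leftover iteration when the trip count is odd.
    static int unrolledByTwo(int[] a) {
        int sm = 0;
        int i = 0;
        for (; i + 1 < a.length; i += 2) {
            sm += a[i];
            sm += a[i + 1];
        }
        for (; i < a.length; i++) {
            sm += a[i];
        }
        return sm;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5};
        System.out.println(original(data) + " " + peeled(data) + " " + unrolledByTwo(data));
    }
}
```

All three methods compute the same sum for any array length, including the empty array.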



                                                          86
Optimize -- PhaseIdealLoop
  After Parsing




static int doit() {
  int sm = 0;
  for (int i = 0; i < 100; i++)
    sm += i;
  return sm;
}




                                  87
Optimize -- PhaseIdealLoop
After CountedLoop




 static int doit() {
   int sm = 0;
   for (int i = 0; i < 100; i++)
     sm += i;
   return sm;
 }




                                   88
Optimize -- PhaseIdealLoop
Optimization Finished




                            Unrolling?


                                         89
Optimize
PhaseIterGVN igvn(initial_gvn)
igvn.optimize()
ConnectionGraph::do_analysis(this,
&igvn) // EscapeAnalysis
igvn.optimize()
PhaseIdealLoop ideal_loop(igvn, true)
PhaseIdealLoop ideal_loop(igvn, ...)
PhaseIdealLoop ideal_loop(igvn, ...)
PhaseCCP ccp( &igvn )
PhaseMacroExpand mex(igvn)




                                        90
Optimize
PhaseCCP ccp( &igvn )
ccp.do_transform
  C->set_root( transform(C->root())->as_Root() );

 Replaces whatever can be replaced by a constant.
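Constant propagation of this kind is usually described with a small lattice. The sketch below is a simplified three-level version (TOP, a single constant, BOTTOM), assuming integer values only. ConstLattice and its methods are hypothetical names, and the real Type lattice in C2 is much richer (ranges, pointers, and so on).

```java
class ConstLattice {
    enum Kind { TOP, CONSTANT, BOTTOM }

    final Kind kind;
    final int value;   // meaningful only when kind == CONSTANT

    static final ConstLattice TOP = new ConstLattice(Kind.TOP, 0);        // nothing known yet
    static final ConstLattice BOTTOM = new ConstLattice(Kind.BOTTOM, 0);  // not a constant

    private ConstLattice(Kind kind, int value) {
        this.kind = kind;
        this.value = value;
    }

    static ConstLattice constant(int v) {
        return new ConstLattice(Kind.CONSTANT, v);
    }

    // Meet of two incoming values, as needed at a Phi.
    static ConstLattice meet(ConstLattice a, ConstLattice b) {
        if (a.kind == Kind.TOP) return b;
        if (b.kind == Kind.TOP) return a;
        if (a.kind == Kind.BOTTOM || b.kind == Kind.BOTTOM) return BOTTOM;
        return a.value == b.value ? a : BOTTOM;
    }

    // Transfer function for an integer add: TOP wins first, then BOTTOM,
    // otherwise both inputs are constants and the sum is folded.
    static ConstLattice add(ConstLattice a, ConstLattice b) {
        if (a.kind == Kind.TOP || b.kind == Kind.TOP) return TOP;
        if (a.kind == Kind.BOTTOM || b.kind == Kind.BOTTOM) return BOTTOM;
        return constant(a.value + b.value);
    }

    public static void main(String[] args) {
        ConstLattice folded = add(constant(2), constant(3));
        System.out.println(folded.kind + " " + folded.value);
    }
}
```

A node whose final lattice value is a CONSTANT can then be replaced by a constant node.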




                                       91
Optimize
PhaseIterGVN igvn(initial_gvn)
igvn.optimize()
ConnectionGraph::do_analysis(this,
&igvn) // EscapeAnalysis
igvn.optimize()
PhaseIdealLoop ideal_loop(igvn, true)
PhaseIdealLoop ideal_loop(igvn, ...)
PhaseIdealLoop ideal_loop(igvn, ...)
PhaseCCP ccp( &igvn )
PhaseMacroExpand mex(igvn)




                                        92
Optimize
PhaseMacroExpand mex(igvn)
mex.expand_macro_nodes()
  ...
  eliminate_allocate_node
    scalar_replacement

Processes the results of escape analysis; converts
allocations into stack operations?
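On the question above: scalar replacement does not move the object to the stack as an object. It replaces the fields of a non-escaping object with plain values. A source-level picture, as a sketch with hypothetical names (the real transformation happens on the ideal graph):

```java
class ScalarReplacementDemo {
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // As written: allocates a Point that never escapes the method.
    static int withAllocation(int x, int y) {
        Point p = new Point(x, y);
        return p.x * p.x + p.y * p.y;
    }

    // Roughly what remains after scalar replacement: the fields become
    // ordinary local values and the allocation disappears.
    static int scalarReplaced(int x, int y) {
        int px = x;
        int py = y;
        return px * px + py * py;
    }

    public static void main(String[] args) {
        System.out.println(withAllocation(3, 4) + " " + scalarReplaced(3, 4));
    }
}
```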




                                    93
Code_Gen
Matcher m(proj_list)
m.match()
PhaseCFG cfg(node_arena(), root())
cfg.Dominators()
cfg.Estimate_Block_Frequency()
cfg.GlobalCodeMotion(m,unique(),proj)
PhaseChaitin regalloc(unique, cfg, m)
regalloc->Register_Allocate()
PhaseBlockLayout
PhasePeephole
Output



                                        94
Matcher
#0 in addI_eRegNode::Expand(State*, Node_List&, Node*) ()
#1 in Matcher::ReduceInst(State*, int, Node*&) ()
#2 in Matcher::match_tree(Node const*) ()
#3 in Matcher::xform(Node*, int) ()
#4 in Matcher::match() ()
#5 in Compile::Code_Gen() ()
#6 in Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool) ()


                                      95
Matcher
// x86_32.ad, an ADLC file
instruct addI_eReg(eRegI dst, eRegI src, eFlagsReg cr) %{
  match(Set dst (AddI dst src));
  effect(KILL cr);

  size(2);
  format %{ "ADD   $dst,$src" %}
  opcode(0x03);
  ...
%}



                                      96
PhaseCFG
PhaseCFG::build_cfg()
Builds the CFG (Control Flow Graph) from RegionNode and
StartNode, so that the subsequent, more machine-oriented
operations can be performed.




                                 97
PhaseCFG
class PhaseCFG : public Phase
+ _num_blocks: uint
+ _blocks: Block_List
+ _root: RootNode*
+ _bbs: Block_Array
+ _broot: Block*
+ _rpo_ctr: uint
+ _root_loop: CFGLoop*
+ _node_latency: GrowableArray<uint>*




                                       98
PhaseCFG
class Block :   public CFGElement
+ _nodes    :   Node_List
+ _succs    :   Block_Array
+ _num_succs:   uint
+ _pre_order:   uint // Pre-order DFS #

+ _dom_depth: uint
+ _idom     : Block*

+ _loop     : CFGLoop*
+ _rpo      : uint
: // reg pressure, etc


                                          99
PhaseCFG
PhaseCFG::Dominators()
// Lengauer & Tarjan algorithm
// Sets _dom_depth and _idom of each Block
// This data is the basis for code motion

PhaseCFG::Estimate_Block_Frequency()
// Computes each block's frequency from the IfNode
// probabilities and stores it in _freq, a field of
// Block's parent class
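What the dominator pass produces can be shown with a much simpler algorithm than the one C2 uses. The sketch below is a naive iterative set-based computation, not Lengauer & Tarjan. It assumes block 0 is the entry and every block is reachable from it. DominatorSketch and its methods are hypothetical names, and the depth here counts from 0 at the entry, which need not match how _dom_depth is numbered.

```java
import java.util.BitSet;

class DominatorSketch {
    // preds[b] lists the predecessor blocks of block b; block 0 is the entry.
    // Returns idom[b] for every block, with idom[0] == -1.
    static int[] immediateDominators(int[][] preds) {
        int n = preds.length;
        if (n == 0) return new int[0];
        BitSet[] dom = new BitSet[n];
        for (int b = 0; b < n; b++) {
            dom[b] = new BitSet(n);
            if (b == 0) dom[b].set(0);   // the entry is dominated only by itself
            else dom[b].set(0, n);       // start from "dominated by every block"
        }
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int b = 1; b < n; b++) {
                // dom(b) = {b} united with the intersection of dom(p) over all predecessors p
                BitSet next = new BitSet(n);
                next.set(0, n);
                for (int p : preds[b]) next.and(dom[p]);
                next.set(b);
                if (!next.equals(dom[b])) {
                    dom[b] = next;
                    changed = true;
                }
            }
        }
        int[] idom = new int[n];
        idom[0] = -1;
        for (int b = 1; b < n; b++) {
            // The immediate dominator is the strict dominator that itself has the most dominators.
            int best = -1;
            for (int d = dom[b].nextSetBit(0); d >= 0; d = dom[b].nextSetBit(d + 1)) {
                if (d != b && (best < 0 || dom[d].cardinality() > dom[best].cardinality())) {
                    best = d;
                }
            }
            idom[b] = best;
        }
        return idom;
    }

    // Depth in the dominator tree, counting the entry block as depth 0.
    static int depth(int[] idom, int b) {
        int d = 0;
        while (idom[b] >= 0) {
            b = idom[b];
            d++;
        }
        return d;
    }

    public static void main(String[] args) {
        // Diamond 0 -> {1, 2} -> 3, followed by 3 -> 4.
        int[][] preds = { {}, {0}, {0}, {1, 2}, {3} };
        int[] idom = immediateDominators(preds);
        System.out.println(java.util.Arrays.toString(idom));
    }
}
```

In the diamond, the join block is dominated by the entry but by neither branch, which is the kind of fact code motion relies on when deciding where a node may legally be placed.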



                                       100
PhaseCFG
PhaseCFG::GlobalCodeMotion
  schedule_early
  schedule_late




                             101
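Register allocation: Briggs-Chaitin
Register coloring

Colors the interference graph of the variables' live ranges with
as many colors as there are registers; if no coloring is found,
adds spills and retries... C2 uses an improved version of this.

I have not read this part yet.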
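The basic scheme described above can be sketched in a few lines. This is a toy version in the spirit of Briggs' optimistic variant, not PhaseChaitin. ColoringSketch and its names are hypothetical, and instead of inserting spill code and retrying, it simply marks an uncolorable live range as spilled.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class ColoringSketch {
    static final int SPILLED = -1;

    // adj[v][w] is true when live ranges v and w interfere; k is the number of registers.
    // Returns a register index per live range, or SPILLED.
    static int[] color(boolean[][] adj, int k) {
        int n = adj.length;
        boolean[] removed = new boolean[n];
        Deque<Integer> stack = new ArrayDeque<>();

        // Simplify: remove nodes one by one, preferring a node with fewer than k
        // remaining neighbours. If none exists, optimistically remove the node
        // with the highest degree and hope it can still be coloured later.
        for (int step = 0; step < n; step++) {
            int pick = -1;
            int pickDegree = -1;
            for (int v = 0; v < n; v++) {
                if (removed[v]) continue;
                int degree = 0;
                for (int w = 0; w < n; w++) {
                    if (w != v && !removed[w] && adj[v][w]) degree++;
                }
                if (degree < k) {
                    pick = v;
                    break;
                }
                if (degree > pickDegree) {
                    pick = v;
                    pickDegree = degree;
                }
            }
            removed[pick] = true;
            stack.push(pick);
        }

        // Select: pop nodes and give each the lowest register not used by an
        // already coloured neighbour. A node with no free register is spilled.
        int[] reg = new int[n];
        Arrays.fill(reg, SPILLED);
        while (!stack.isEmpty()) {
            int v = stack.pop();
            boolean[] used = new boolean[k];
            for (int w = 0; w < n; w++) {
                if (w != v && adj[v][w] && reg[w] != SPILLED) used[reg[w]] = true;
            }
            for (int r = 0; r < k; r++) {
                if (!used[r]) {
                    reg[v] = r;
                    break;
                }
            }
        }
        return reg;
    }

    public static void main(String[] args) {
        // Three live ranges that all interfere with each other.
        boolean[][] triangle = {
            {false, true, true},
            {true, false, true},
            {true, true, false}
        };
        System.out.println(Arrays.toString(color(triangle, 2)));
        System.out.println(Arrays.toString(color(triangle, 3)));
    }
}
```

With two registers one of the three mutually interfering live ranges has to be spilled; with three registers each gets its own.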




                                    102
Output
Replace StartNode with MachPrologNode
Set up the unverified entry point
Place a MachEpilogNode before each return
ScheduleAndBundle()
BuildOopMap()
Fill_buffer()
  Prepare a CodeBuffer
  for (i = 0; i < _cfg->_num_blocks; i++)
    for (j = 0; j < last_inst; j++)
      …
      n->emit(*cb, _regalloc)


-XX:+PrintOptoAssembly to dump instructions

https://gist.github.com/1376858




                                              103
Bonus

[Figure: bytecode size vs arena use. x-axis: bytecode size (0 to 7000); y-axis: bytes (0 to 14000000); series: comp, node, res]


                                                                                              104
[Figure: compiler memory use. x-axis: unique (number of nodes) (0 to 30000); y-axis: bytes (0 to 14000000); series: comp_arena, node_arena, res_area]


                                                                                                         105
References
http://www.usenix.org/events/jvm01/full_papers/paleczny/paleczny.pdf

http://ssw.jku.at/Research/Papers/Wuerthinger07Master/Wuerthinger07Master.pdf




                                      106

Weitere ähnliche Inhalte

Was ist angesagt?

Intrinsic Methods in HotSpot VM
Intrinsic Methods in HotSpot VMIntrinsic Methods in HotSpot VM
Intrinsic Methods in HotSpot VMKris Mok
 
PHP と SAPI と ZendEngine3 と
PHP と SAPI と ZendEngine3 とPHP と SAPI と ZendEngine3 と
PHP と SAPI と ZendEngine3 とdo_aki
 
C++ マルチスレッドプログラミング
C++ マルチスレッドプログラミングC++ マルチスレッドプログラミング
C++ マルチスレッドプログラミングKohsuke Yuasa
 
OPcache の最適化器の今
OPcache の最適化器の今OPcache の最適化器の今
OPcache の最適化器の今y-uti
 
Nginxを使ったオレオレCDNの構築
Nginxを使ったオレオレCDNの構築Nginxを使ったオレオレCDNの構築
Nginxを使ったオレオレCDNの構築ichikaway
 
非同期処理の基礎
非同期処理の基礎非同期処理の基礎
非同期処理の基礎信之 岩永
 
カスタムROM開発者の視点から見たAndroid
カスタムROM開発者の視点から見たAndroidカスタムROM開発者の視点から見たAndroid
カスタムROM開発者の視点から見たAndroidandroid sola
 
Zynq + Vivado HLS入門
Zynq + Vivado HLS入門Zynq + Vivado HLS入門
Zynq + Vivado HLS入門narusugimoto
 
ContainerとName Space Isolation
ContainerとName Space IsolationContainerとName Space Isolation
ContainerとName Space Isolationmaruyama097
 
オレ流のOpenJDKの開発環境(JJUG CCC 2019 Fall講演資料)
オレ流のOpenJDKの開発環境(JJUG CCC 2019 Fall講演資料)オレ流のOpenJDKの開発環境(JJUG CCC 2019 Fall講演資料)
オレ流のOpenJDKの開発環境(JJUG CCC 2019 Fall講演資料)NTT DATA Technology & Innovation
 
Exploring Twitter's Finagle technology stack for microservices
Exploring Twitter's Finagle technology stack for microservicesExploring Twitter's Finagle technology stack for microservices
Exploring Twitter's Finagle technology stack for microservices💡 Tomasz Kogut
 
Php and threads ZTS
Php and threads ZTSPhp and threads ZTS
Php and threads ZTSjulien pauli
 
AVX-512(フォーマット)詳解
AVX-512(フォーマット)詳解AVX-512(フォーマット)詳解
AVX-512(フォーマット)詳解MITSUNARI Shigeo
 
[131]chromium binging 기술을 node.js에 적용해보자
[131]chromium binging 기술을 node.js에 적용해보자[131]chromium binging 기술을 node.js에 적용해보자
[131]chromium binging 기술을 node.js에 적용해보자NAVER D2
 
[CNCF TAG-Runtime 2022-10-06] Lima
[CNCF TAG-Runtime 2022-10-06] Lima[CNCF TAG-Runtime 2022-10-06] Lima
[CNCF TAG-Runtime 2022-10-06] LimaAkihiro Suda
 
Part II: LLVM Intermediate Representation
Part II: LLVM Intermediate RepresentationPart II: LLVM Intermediate Representation
Part II: LLVM Intermediate RepresentationWei-Ren Chen
 

Was ist angesagt? (20)

Intrinsic Methods in HotSpot VM
Intrinsic Methods in HotSpot VMIntrinsic Methods in HotSpot VM
Intrinsic Methods in HotSpot VM
 
PHP と SAPI と ZendEngine3 と
PHP と SAPI と ZendEngine3 とPHP と SAPI と ZendEngine3 と
PHP と SAPI と ZendEngine3 と
 
C++ マルチスレッドプログラミング
C++ マルチスレッドプログラミングC++ マルチスレッドプログラミング
C++ マルチスレッドプログラミング
 
OPcache の最適化器の今
OPcache の最適化器の今OPcache の最適化器の今
OPcache の最適化器の今
 
Nginxを使ったオレオレCDNの構築
Nginxを使ったオレオレCDNの構築Nginxを使ったオレオレCDNの構築
Nginxを使ったオレオレCDNの構築
 
golang profiling の基礎
golang profiling の基礎golang profiling の基礎
golang profiling の基礎
 
非同期処理の基礎
非同期処理の基礎非同期処理の基礎
非同期処理の基礎
 
カスタムROM開発者の視点から見たAndroid
カスタムROM開発者の視点から見たAndroidカスタムROM開発者の視点から見たAndroid
カスタムROM開発者の視点から見たAndroid
 
Zynq + Vivado HLS入門
Zynq + Vivado HLS入門Zynq + Vivado HLS入門
Zynq + Vivado HLS入門
 
ContainerとName Space Isolation
ContainerとName Space IsolationContainerとName Space Isolation
ContainerとName Space Isolation
 
LLVM最適化のこつ
LLVM最適化のこつLLVM最適化のこつ
LLVM最適化のこつ
 
オレ流のOpenJDKの開発環境(JJUG CCC 2019 Fall講演資料)
オレ流のOpenJDKの開発環境(JJUG CCC 2019 Fall講演資料)オレ流のOpenJDKの開発環境(JJUG CCC 2019 Fall講演資料)
オレ流のOpenJDKの開発環境(JJUG CCC 2019 Fall講演資料)
 
Exploring Twitter's Finagle technology stack for microservices
Exploring Twitter's Finagle technology stack for microservicesExploring Twitter's Finagle technology stack for microservices
Exploring Twitter's Finagle technology stack for microservices
 
Plan 9のお話
Plan 9のお話Plan 9のお話
Plan 9のお話
 
Php and threads ZTS
Php and threads ZTSPhp and threads ZTS
Php and threads ZTS
 
AVX-512(フォーマット)詳解
AVX-512(フォーマット)詳解AVX-512(フォーマット)詳解
AVX-512(フォーマット)詳解
 
[131]chromium binging 기술을 node.js에 적용해보자
[131]chromium binging 기술을 node.js에 적용해보자[131]chromium binging 기술을 node.js에 적용해보자
[131]chromium binging 기술을 node.js에 적용해보자
 
[CNCF TAG-Runtime 2022-10-06] Lima
[CNCF TAG-Runtime 2022-10-06] Lima[CNCF TAG-Runtime 2022-10-06] Lima
[CNCF TAG-Runtime 2022-10-06] Lima
 
Linux Namespaces
Linux NamespacesLinux Namespaces
Linux Namespaces
 
Part II: LLVM Intermediate Representation
Part II: LLVM Intermediate RepresentationPart II: LLVM Intermediate Representation
Part II: LLVM Intermediate Representation
 

Ähnlich wie J5回JVMソースコードリーディングの会 サーバコンパイラ

Chainer-Compiler 動かしてみた
Chainer-Compiler 動かしてみたChainer-Compiler 動かしてみた
Chainer-Compiler 動かしてみたAkira Maruoka
 
20145-5SumII_CSC407_assign1.htmlCSC 407 Computer Systems II.docx
20145-5SumII_CSC407_assign1.htmlCSC 407 Computer Systems II.docx20145-5SumII_CSC407_assign1.htmlCSC 407 Computer Systems II.docx
20145-5SumII_CSC407_assign1.htmlCSC 407 Computer Systems II.docxeugeniadean34240
 
Score (smart contract for icon)
Score (smart contract for icon) Score (smart contract for icon)
Score (smart contract for icon) Doyun Hwang
 
C++ amp on linux
C++ amp on linuxC++ amp on linux
C++ amp on linuxMiller Lee
 
Flashback, el primer malware masivo de sistemas Mac
Flashback, el primer malware masivo de sistemas MacFlashback, el primer malware masivo de sistemas Mac
Flashback, el primer malware masivo de sistemas MacESET Latinoamérica
 
Code is not text! How graph technologies can help us to understand our code b...
Code is not text! How graph technologies can help us to understand our code b...Code is not text! How graph technologies can help us to understand our code b...
Code is not text! How graph technologies can help us to understand our code b...Andreas Dewes
 
All I know about rsc.io/c2go
All I know about rsc.io/c2goAll I know about rsc.io/c2go
All I know about rsc.io/c2goMoriyoshi Koizumi
 
ISCA Final Presentaiton - Compilations
ISCA Final Presentaiton -  CompilationsISCA Final Presentaiton -  Compilations
ISCA Final Presentaiton - CompilationsHSA Foundation
 
Checking Oracle VM VirtualBox. Part 2
Checking Oracle VM VirtualBox. Part 2Checking Oracle VM VirtualBox. Part 2
Checking Oracle VM VirtualBox. Part 2Andrey Karpov
 
PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...
PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...
PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...Andrey Karpov
 
Modify this code to use multiple threads with the same data1.Modif.pdf
Modify this code to use multiple threads with the same data1.Modif.pdfModify this code to use multiple threads with the same data1.Modif.pdf
Modify this code to use multiple threads with the same data1.Modif.pdfmallik3000
 
Top 10 bugs in C++ open source projects, checked in 2016
Top 10 bugs in C++ open source projects, checked in 2016Top 10 bugs in C++ open source projects, checked in 2016
Top 10 bugs in C++ open source projects, checked in 2016PVS-Studio
 
COSCUP2023 RSA256 Verilator.pdf
COSCUP2023 RSA256 Verilator.pdfCOSCUP2023 RSA256 Verilator.pdf
COSCUP2023 RSA256 Verilator.pdfYodalee
 
Linux kernel tracing superpowers in the cloud
Linux kernel tracing superpowers in the cloudLinux kernel tracing superpowers in the cloud
Linux kernel tracing superpowers in the cloudAndrea Righi
 
Hot Code is Faster Code - Addressing JVM Warm-up
Hot Code is Faster Code - Addressing JVM Warm-upHot Code is Faster Code - Addressing JVM Warm-up
Hot Code is Faster Code - Addressing JVM Warm-upMark Price
 
Analysis of Haiku Operating System (BeOS Family) by PVS-Studio. Part 2
Analysis of Haiku Operating System (BeOS Family) by PVS-Studio. Part 2Analysis of Haiku Operating System (BeOS Family) by PVS-Studio. Part 2
Analysis of Haiku Operating System (BeOS Family) by PVS-Studio. Part 2PVS-Studio
 
Java Jit. Compilation and optimization by Andrey Kovalenko
Java Jit. Compilation and optimization by Andrey KovalenkoJava Jit. Compilation and optimization by Andrey Kovalenko
Java Jit. Compilation and optimization by Andrey KovalenkoValeriia Maliarenko
 

Ähnlich wie J5回JVMソースコードリーディングの会 サーバコンパイラ (20)

Chainer-Compiler 動かしてみた
Chainer-Compiler 動かしてみたChainer-Compiler 動かしてみた
Chainer-Compiler 動かしてみた
 
20145-5SumII_CSC407_assign1.htmlCSC 407 Computer Systems II.docx
20145-5SumII_CSC407_assign1.htmlCSC 407 Computer Systems II.docx20145-5SumII_CSC407_assign1.htmlCSC 407 Computer Systems II.docx
20145-5SumII_CSC407_assign1.htmlCSC 407 Computer Systems II.docx
 
Score (smart contract for icon)
Score (smart contract for icon) Score (smart contract for icon)
Score (smart contract for icon)
 
Day 1
Day 1Day 1
Day 1
 
C++ amp on linux
C++ amp on linuxC++ amp on linux
C++ amp on linux
 
Flashback, el primer malware masivo de sistemas Mac
Flashback, el primer malware masivo de sistemas MacFlashback, el primer malware masivo de sistemas Mac
Flashback, el primer malware masivo de sistemas Mac
 
Code is not text! How graph technologies can help us to understand our code b...
Code is not text! How graph technologies can help us to understand our code b...Code is not text! How graph technologies can help us to understand our code b...
Code is not text! How graph technologies can help us to understand our code b...
 
All I know about rsc.io/c2go
All I know about rsc.io/c2goAll I know about rsc.io/c2go
All I know about rsc.io/c2go
 
ISCA Final Presentaiton - Compilations
ISCA Final Presentaiton -  CompilationsISCA Final Presentaiton -  Compilations
ISCA Final Presentaiton - Compilations
 
Checking Oracle VM VirtualBox. Part 2
Checking Oracle VM VirtualBox. Part 2Checking Oracle VM VirtualBox. Part 2
Checking Oracle VM VirtualBox. Part 2
 
PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...
PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...
PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...
 
Modify this code to use multiple threads with the same data1.Modif.pdf
Modify this code to use multiple threads with the same data1.Modif.pdfModify this code to use multiple threads with the same data1.Modif.pdf
Modify this code to use multiple threads with the same data1.Modif.pdf
 
Top 10 bugs in C++ open source projects, checked in 2016
Top 10 bugs in C++ open source projects, checked in 2016Top 10 bugs in C++ open source projects, checked in 2016
Top 10 bugs in C++ open source projects, checked in 2016
 
COSCUP2023 RSA256 Verilator.pdf
COSCUP2023 RSA256 Verilator.pdfCOSCUP2023 RSA256 Verilator.pdf
COSCUP2023 RSA256 Verilator.pdf
 
C++InputOutput.pptx
C++InputOutput.pptxC++InputOutput.pptx
C++InputOutput.pptx
 
C++InputOutput.PPT
C++InputOutput.PPTC++InputOutput.PPT
C++InputOutput.PPT
 
Linux kernel tracing superpowers in the cloud
Linux kernel tracing superpowers in the cloudLinux kernel tracing superpowers in the cloud
Linux kernel tracing superpowers in the cloud
 
Hot Code is Faster Code - Addressing JVM Warm-up
Hot Code is Faster Code - Addressing JVM Warm-upHot Code is Faster Code - Addressing JVM Warm-up
Hot Code is Faster Code - Addressing JVM Warm-up
 
Analysis of Haiku Operating System (BeOS Family) by PVS-Studio. Part 2
Analysis of Haiku Operating System (BeOS Family) by PVS-Studio. Part 2Analysis of Haiku Operating System (BeOS Family) by PVS-Studio. Part 2
Analysis of Haiku Operating System (BeOS Family) by PVS-Studio. Part 2
 
Java Jit. Compilation and optimization by Andrey Kovalenko
Java Jit. Compilation and optimization by Andrey KovalenkoJava Jit. Compilation and optimization by Andrey Kovalenko
Java Jit. Compilation and optimization by Andrey Kovalenko
 

Kürzlich hochgeladen

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 

Kürzlich hochgeladen (20)

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 

J5回JVMソースコードリーディングの会 サーバコンパイラ

  • 1. C2 The Server Compiler 第5回JVMソースコードリーディングの会 @ytoshima 1
  • 2. Compilation trigger invoke* と goto bytecode (to negative offset) の呼び出し数を interpreter でカウ ントし,しきい値を越えると CompilerBroker にコンパイルのリクエストを出す。キューにリク エストが入り、キューを見ている CompilerThread がコンパイルを開始する。GC 動作とも並列して動作できる様になっている。 2
  • 3. Compilation trigger // Method invocation SimpleCompPolicy::method_invocation_e vent CompileBroker::compile_method(...) // OSR SimpleCompPolicy::method_back_branch_ event CompileBroker::compile_method(...) 3
  • 4. Compilation trigger CompileBroker::compile_method CompileBroker::compile_method_base : CompileQueue* queue = compile_queue(comp_level); task = create_compile_task(queue, compile_id, method, osr_bci, comp_level, hot_method, hot_count, comment, blocking); 4
  • 5. Compile Compile::Compile : if ((jvms = cg->generate(jvms)) == NULL) // Parse : Optimize(); : Code_Gen(); 5
  • 8. Compiler data structures Graph Text Representation 8
  • 9. Compiler data structures $ ~/jdk1.7.0-b147/fastdebug/bin/java -XX:+PrintCompilation -XX:+PrintIdeal -XX:CICompilerCount=1 sum     214 1 sum::doit (22 bytes) VM option '+PrintIdeal' ...   21" ConI" === 0 [[ 180 ]] #int:0  180" Phi" === 184 21 70 [[ 179 ]] #int !orig=[159],[139],[66] ! jvms: sum::doit @ bci:10  179" AddI" === _ 180 181 [[ 178 ]] !orig=[154],[137],70,[145] !jvms: sum::doit @ bci:12  178" AddI" === _ 179 181 [[ 177 ]] !orig=[153],[146],[135],86,[71] ! jvms: sum::doit @ bci:14  177" AddI" === _ 178 181 [[ 176 ]] !orig=[165],[152],86,[71] !jvms: sum::doit @ bci:14  148" ConI" === 0 [[ 87 ]] #int:97  176" AddI" === _ 177 181 [[ 190 ]] !orig=[168],[152],86,[71] !jvms: sum::doit @ bci:14 // <idx> <node type> === <in[]> [[out[]]] <additional desc> // jvms = JVMState, root()->dump(9999) would dump IR as above; Real example: https://gist.github.com/1369656 9
  • 10. Ideal Graph Visualizer グラフ表示オプション level 4 ではパースの各段階と最適 化の各段階の IR が表示可能 10
  • 11. Ideal Graph Visualizer public int getValue() { return value; } enum { Control, I_O, Memory, FramePtr, ReturnAdr, Parms }; 11
  • 12. Ideal Graph Visualizer Final Code: Mach* node や Epilog, Prolog, Ret などマシン依存のノードに なっている 12
  • 13. Compiler data structures Ideal Graph Visualizer http://ssw.jku.at/General/Staff/TW/igv.html etc/idealgraphvisualizer.conf: default_options="-J-Xmx400m --branding idealgraphvisualizer" You need a fastdebug or a debug build In OpenJDK build dir $ make fastdebug_build OR $ make debug_build 13
  • 14. Compiler data structures Options to generate data for Ideal Graph Visualizer -XX:PrintIdealGraphLevel=0 [0:None, 4: most verbose] -XX:PrintIdealGraphPort=4444 -XX:PrintIdealGraphAddress=”127.0.0.1” -XX:PrintIdealGraphFile=<path to IR xml file> IdealGraphVisualizer listens to port 4444 by default. 14
  • 15. Compiler data structures // -XX:PrintIdealGraphFile=<path> , IdealGraphViewer can display this <graphDocument> <group> <properties> <p name="name"> virtual jint Call.doit()</p> </properties> <graph name="Bytecode 0: aload_0"> <nodes> <node id="159337448"> <properties> <p name="name"> Root</p> <p name="type"> bottom</p> <p name="idx"> 0</p> ... </properties> </node> <node id="159414516"> ... </node> <node id="159337448"> ... </node> ... </nodes> <edges> <edge index="0" to="159337448" from="159337448"></edge> <edge index="0" to="159414516" from="159414516"></edge> Example: https://gist.github.com/1369620 15
  • 16. Compiler data structures Node   # _in : Node** // use-def   # _out: Node** // def-use   # _cnt: node_idx_t // # of required inputs   # _max: node_idx_t // actual input array length   # _outcnt: node_idx_t   # _outmax: node_idx_t - _class_id: jushort - _flags: jushort // _in は ordered, 位置も重要 // サブクラスの合計 340 個 <-> C1 62 個 // _class_id は 16 bit 値、ideal, mach で node の型を判断 // _flags は enum NodeFlags: Flag_is_Copy, Flag_is_Call, // Flag_is_macro, Flag_is_con... 16
  • 17. Compiler data structures // Insert a new required input at the end void Node::ins_req( uint idx, Node *n ) { assert( is_not_dead(n), "can not use dead node"); add_req(NULL); // Make space ... _in[idx] = n; // Stuff over old required edge if (n != NULL) n->add_out((Node *)this); // Add reciprocal def-use edge } void add_out( Node *n ) { if (is_top()) return; if( _outcnt == _outmax ) out_grow(_outcnt); _out[_outcnt++] = n; } 17
  • 18. Compiler data structures RegionNode: merges control. PhiNode: merges the data that accompanies a control merge, and points to its corresponding RegionNode. 18
  • 19. Compiler data structures Node // Optimize functions // more ideal node, canonicalize virtual Node *Ideal(PhaseGVN *phase, bool can_reshape); // set of values this node can take virtual const Type *Value ( PhaseTransform *phase ) const; // existing node which computes same virtual Node *Identity ( PhaseTransform *phase ); 19
  • 20. Compiler data structures Node Ideal defined in Add*Node, MinINode, StartNode, ReturnNode, RethrowNode, SafePointNode, AllocateArrayNode, LockNode, UnlockNode, RegionNode, PhiNode, PCTableNode, NeverBranchNode, CMove*Node, ConstraintCastNode, CheckCastPPNode, Conv?2?Node, Div?Node, Mod?Node, IfNode, LoopNode, CountedLoopNode, LoopLimitNode, Load*Node, Store*Node, ClearArrayNode, StrIntrinsicNode, MemBarNode, MergeMemNode, Mul*Node, And*Node, LShift*Node, URShift*Node, RootNode, HaltNode, Sub*Node, Cmp*Node, BoolNode Many subclasses define their own versions of Ideal, Value, and Identity to suit their purpose. 20
  • 21. Compiler data structures AddNode Ideal Convert "(x+1)+2" into "x+(1+2)" Convert "(x+1)+y" into "(x+y)+1" Convert "x+(y+1)" into "(x+y)+1" [Figure: graphs of x, Con1, Con2, and two Add nodes before and after reassociation] 21
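The first rewrite on this slide, "(x+1)+2" into "x+(1+2)", can be illustrated on a toy expression tree. This is a sketch only: `Expr`, `var`, `con`, `add`, and `ideal_add` are hypothetical names, and the function follows the Ideal convention of returning null when nothing changed. Integer overflow handling is left out.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

// Hypothetical toy expression node; not a HotSpot class.
struct Expr {
    enum Kind { Var, Con, Add } kind = Var;
    std::string name;  // used by Var
    int value = 0;     // used by Con
    ExprPtr lhs, rhs;  // used by Add
};

ExprPtr var(const std::string& n) {
    auto e = std::make_shared<Expr>();
    e->kind = Expr::Var;
    e->name = n;
    return e;
}

ExprPtr con(int v) {
    auto e = std::make_shared<Expr>();
    e->kind = Expr::Con;
    e->value = v;
    return e;
}

ExprPtr add(ExprPtr a, ExprPtr b) {
    auto e = std::make_shared<Expr>();
    e->kind = Expr::Add;
    e->lhs = a;
    e->rhs = b;
    return e;
}

// One reassociation step: (x + con1) + con2  ==>  x + (con1 + con2),
// with the constant sum folded right away.
// Returns nullptr when the pattern does not match (no progress).
ExprPtr ideal_add(const ExprPtr& n) {
    if (n->kind != Expr::Add) return nullptr;
    const ExprPtr& l = n->lhs;
    const ExprPtr& r = n->rhs;
    if (l->kind == Expr::Add && l->rhs->kind == Expr::Con &&
        r->kind == Expr::Con) {
        return add(l->lhs, con(l->rhs->value + r->value));
    }
    return nullptr;
}
```

Pushing constants to the right and folding them leaves a single constant per chain of additions, which is what makes later value numbering more effective.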
  • 22. Compiler data structures AddINode Ideal Node* in1 = in(1); Node* in2 = in(2); int op1 = in1->Opcode(); int op2 = in2->Opcode(); // Fold (con1-x)+con2 into (con1+con2)-x if ( op1 == Op_AddI && op2 == Op_SubI ) { // Swap edges to try optimizations below in1 = in2; in2 = in(1); op1 = op2; op2 = in2->Opcode(); } if( op1 == Op_SubI ) { "(a-b)+(c-d)" into "(a+c)-(b+d)" "(a-b)+(b+c)" into "(a+c)" "(a-b)+(c+b)" into "(a+c)" 22
  • 23. Compiler data structures const Type *AddNode::Value(...) // Either input is TOP ==> the result is TOP // Either input is BOTTOM ==> the result is the local BOTTOM // Check for an addition involving the additive identity 23
  • 24. Compiler data structures Node *AddNode::Identity(...) // If either input is a constant 0, return the other input. const Type *zero = add_id(); // The additive identity if( phase->type( in(1) )->higher_equal( zero ) ) return in(2); if( phase->type( in(2) )->higher_equal( zero ) ) return in(1); return this; 24
  • 25. Compiler data structures Node *AddINode::Identity(...) // Fold (x-y)+y OR y+(x-y) into x if( in(1)->Opcode() == Op_SubI && phase->eqv(in(1)->in(2),in(2)) ) { return in(1)->in(1); } else if( in(2)->Opcode() == Op_SubI && phase->eqv(in(2)->in(2),in(1)) ) { return in(2)->in(1); } return AddNode::Identity(phase); 25
  • 26. Compiler data structures Node *PhaseGVN::transform_no_reclaim // Return a node which computes the same function // as this node, but in a faster or cheaper fashion. while( 1 ) { Node *i = k->Ideal(this, /*can_reshape=*/false); if( !i ) break;... } const Type *t = k->Value(this); // Get runtime Value set k->raise_bottom_type(t); Node *i = k->Identity(this); if (i != k) return i; i = hash_find_insert(k); if( i && (i != k)) return i; Every Node created during Parse is passed through transform, so GVN runs while parsing. 26
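The last step of `transform_no_reclaim`, the `hash_find_insert` lookup, is what gives value numbering its effect: a node with the same opcode and the same inputs as an existing node is replaced by the existing one. A minimal sketch of that idea follows, assuming a hypothetical `ValueTable` keyed on (opcode, input, input). It is not the HotSpot `NodeHash`.

```cpp
#include <cassert>
#include <map>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical miniature of value numbering; not the HotSpot NodeHash.
struct ValueTable {
    enum Op { ConI, AddI, MulI };
    // For ConI, a holds the constant. Otherwise a and b are input node ids.
    struct Node { Op op; int a; int b; };

    std::vector<Node> nodes;
    std::map<std::tuple<int, int, int>, int> table;

    // Return the id of an equivalent existing node, or insert a new one.
    int transform(Op op, int a, int b = 0) {
        // Canonicalize commutative operations so x+y and y+x hash alike.
        if (op != ConI && a > b) std::swap(a, b);
        auto key = std::make_tuple(static_cast<int>(op), a, b);
        auto it = table.find(key);
        if (it != table.end()) return it->second;
        int id = static_cast<int>(nodes.size());
        nodes.push_back(Node{op, a, b});
        table.emplace(key, id);
        return id;
    }
};
```

Because the lookup happens at node creation time, duplicates never enter the graph in the first place, which matches the "GVN while parsing" remark on the slide.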
  • 27. Compiler data structures Node // raise_bottom_type related TypeNode // Type* _type ConNode // ConINode, ConPNode .. PhiNode // TypePtr* _adr_type // int _inst_id, // inst_index, _inst_offset ConvI2LNode MemNode // TypePtr* _adr_type LoadNode // Type* _type LoadPNode // load obj or arr LoadINode https://gist.github.com/1369608 27
  • 28. Compiler data structures Node ^ RegionNode A RegionNode can be mapped to a basic block. Its inputs are control sources. A PhiNode has an input that points to its RegionNode. The data inputs merged by a PhiNode correspond one-to-one with the inputs of the RegionNode. Input 0 of a PhiNode is the RegionNode, and input 0 of a RegionNode is the RegionNode itself. PhiNode* has_phi() const ^ LoopNode // Simple Loop Header short _loop_flags ^ RootNode 28
  • 31. Compiler data structures Node ^ ProjNode // project a single elem out of a tuple or signature type ^ ParmNode // incoming Parameters const uint _con; // The field in the tuple we are projecting const bool _is_io_use; // Used to distinguish between the projections // used on the control and io paths from a macro node 31
  • 32. Compiler data structures Node ^ MergeMem // (See comment in memnode.cpp near MergeMemNode::MergeMemNode for semantics.) in(AliasIdxTop) = in(1) is always the top node in(0) is NULL in(AliasIdxBot) is a "wide" memory state. For in(AliasIdxRaw) = in(3) and above, mem state for alias type <N> or top base_memory() // wide state memory_at(N) // for alias type <N> Identity: returns base if base is empty, otherwise this Ideal: Simplify stacked MergeMem 32
  • 34. Compiler data structures class ConINode : public ConNode { public: ConINode( const TypeInt *t ) : ConNode(t) {} virtual int Opcode() const; // Factory method: static ConINode* make( Compile* C, int con ) { return new (C, 1) ConINode( TypeInt::make(con) ); } class ConNode : public TypeNode { public: ConNode( const Type *t ) : TypeNode(t,1) { init_req(0, (Node*)Compile::current()->root()); init_flags(Flag_is_Con); } class TypeNode : public Node { const Type* const _type; TypeNode( const Type *t, uint required ) : Node 34
  • 35. Compiler data structures // Add pointer plus integer to get pointer. NOT commutative, really. // So not really an AddNode. Lives here, because people associate it with // an add. class AddPNode : public Node { public: enum { Control, // When is it safe to do this add? Base, // Base oop, for GC purposes Address, // Actually address, derived from base Offset } ; // Offset added to address AddPNode( Node *base, Node *ptr, Node *off ) : Node(0,base,ptr,off) { init_class_id(Class_AddP); } Identity: if one input is 0, return in(Address), otherwise this Ideal: if the left input is an add of a constant, flatten the expression tree; for a raw pointer that is NULL, use CastX2PNode(offset); if the right input is an add of a constant, rewrite (ptr + (offset+con)) as (ptr + offset) + con 35
  • 36. Compiler data structures // Return from subroutine node class ReturnNode : public Node { public: ReturnNode( uint edges, Node *cntrl, Node *i_o, Node *memory, Node *retadr, Node *frameptr ); virtual int Opcode() const; virtual bool is_CFG() const { return true; } 36
  • 37. Compiler data structures JVMState JVMState* _caller // for scope chains uint _depth, _locoff, _stkoff, _monoff, uint _scloff // offset of scalar objs uint _endoff uint _sp int _bci ReexecuteState _reexecute ciMethod* _method SafePointNode* _map 37
  • 38. Compiler data structures class Type { public:   enum TYPES { Bad = 0, Control,     Top,     Int, Long, Half, NarrowOop,     Tuple, Array,     AnyPtr, RawPtr, OopPtr, InstPtr, AryPtr, KlassPtr,     Function, Abio, Return_Address, Memory,     FloatTop, FloatCon, FloatBot,     DoubleTop, DoubleCon, DoubleBot,     Bottom, lasttype }; private:   const Type *_dual; protected:   const TYPES _base; 38
  • 39. Compiler data structures class Type {   : public:   TYPES base();   static const Type *make(enum TYPES);   static int cmp(Type*, Type*);   int higher_equal( Type *t)   const Type *meet(Type *t);   virtual const Type *widen(Type *old, Type* limit)   virtual const Type *narrow(Type *old) 39
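The `meet` and `higher_equal` operations listed above treat types as elements of a lattice with Top above and Bottom below. A three-level integer lattice is enough to show the idea. This is a sketch under that simplification: `IntLat`, `same`, `meet`, and `higher_equal` below are hypothetical free functions, not the HotSpot `Type` members.

```cpp
#include <cassert>

// Hypothetical three-level lattice: Top (no information yet),
// Con (exactly one known value), Bottom (any value possible).
struct IntLat {
    enum Level { Top, Con, Bottom } level;
    int con;  // meaningful only when level == Con
};

inline bool same(IntLat a, IntLat b) {
    return a.level == b.level && (a.level != IntLat::Con || a.con == b.con);
}

// Meet moves down the lattice: combining two facts yields the most
// precise element that covers both of them.
inline IntLat meet(IntLat a, IntLat b) {
    if (a.level == IntLat::Top) return b;
    if (b.level == IntLat::Top) return a;
    if (a.level == IntLat::Bottom || b.level == IntLat::Bottom) {
        return IntLat{IntLat::Bottom, 0};
    }
    return a.con == b.con ? a : IntLat{IntLat::Bottom, 0};
}

// a is higher than or equal to b when meeting them gives b back.
inline bool higher_equal(IntLat a, IntLat b) {
    return same(meet(a, b), b);
}
```

A Phi merging the constants 5 and 7 therefore ends up at Bottom, while a Phi merging 5 and 5 stays at the constant 5.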
  • 40. Compiler data structures class Dict; class Type; class   TypeD; class   TypeF; class   TypeInt; class   TypeLong; class   TypeNarrowOop; class   TypeAry; class   TypeTuple; class   TypePtr; class     TypeRawPtr; class     TypeOopPtr; class       TypeInstPtr; class       TypeAryPtr; class       TypeKlassPtr; 40
  • 41. Compiler data structures Phase PhaseTransform Compile PhaseIdealLoop GraphKit Matcher PhaseCFG PhaseValues PhaseBlockLayout PhaseGVN PhaseCoalesce PhaseIterGVN PhaseAggressiveCoalesce PhaseCCP PhaseConservativeCoalesce PhasePeephole PhaseIFG PhaseStringOpts PhaseLive PhaseMacroExpand PhaseRegAlloc PhaseChaitin PhaseRemoveUseless 41
  • 42. Compiler data structures class Phase : public StackObj { public:   enum PhaseNumber { Compiler, Parser,Remove_Useless, ...} protected:   enum PhaseNumber _pnum; public:   Compile * C; } 42
  • 43. Compiler data structures class Compile : public Phase {   const int        _compile_id;   ciMethod*        _method;   int              _entry_bci;   const TypeFunc*  _tf;   InlineTree*      _ilt;   Arena            _comp_arena;   ConnectionGraph* _congraph;   uint             _unique;   Arena            _node_arena;   RootNode*        _root;   Node*            _top;   : } 43
  • 44. Compiler data structures class Compile : public Phase {   :   PhaseGVN*         _initial_gvn;   Unique_Node_List  _for_igvn;   WarmCallInfo*     _warm_calls;   PhaseCFG*         _cfg;   Matcher*          _matcher;   PhaseRegAlloc*    _regalloc;   OopMapSet*        _oop_map_set;   : } 44
  • 45. Compiler data structures class PhaseTransform : public Phase { protected:   Arena* _arena;   Node_Array _nodes;   Type_Array _types;   ConINode*  _icons[...];   ConLNode*  _lcons[...];   ConNode*   _zcons[...];   : } 45
  • 46. Compiler data structures class PhaseTransform : public Phase { public:   const Type* type(const Node* n) const;   const Type* type_or_null(const Node* n) const;   void set_type(const Node* n, const Type* t);   void set_type_bottom(const Node* n);   void ensure_type_or_null(const Node* n);   ConNode* makecon(const Type* t);   ConINode* intcon(jint i);   ConLNode* longcon(jlong l);   virtual Node *transform(Node *) = 0;   : } 46
  • 47. Compiler data structures Manages values in a hash table class PhaseValues : public PhaseTransform { protected:   NodeHash       _table; // for value-numbering public:   bool   hash_delete(Node *n);   bool   hash_insert(Node *n);   Node  *hash_find_insert(Node* n);   Node  *hash_find(Node* n);   : } 47
  • 48. Compiler data structures Local, pessimistic GVN-style optimization class PhaseGVN : public PhaseValues { public:   Node  *transform(Node *n);   Node  *transform_no_reclaim(Node *n);   : } 48
  • 49. Compiler data structures Iterative local, pessimistic GVN-style optimization plus ideal transformations class PhaseIterGVN : public PhaseGVN { private:   bool _delay_transform;   virtual Node *transform_old(Node *a_node);   void subsume_node(Node *old, Node *nn); protected:   virtual Node *transform(Node *a_node);   void init_worklist(Node *a_root);   virtual const Type* saturate(Type*, Type*, Type*) public:   Unique_Node_List _worklist;   void optimize();   : } 49
  • 51. Parse #0 Parse::do_one_bytecode() #1 Parse::do_one_block() #2 Parse::do_all_blocks() #3 Parse::Parse(JVMState*, ciMethod*, float) () #4 ParseGenerator::generate (JVMState*) #5 Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool) 51
  • 52. Parse do_one_bytecode switch (bc()) { case Bytecodes::_nop: // do nothing break; case Bytecodes::_lconst_0: push_pair(longcon(0)); break; : case Bytecodes::_iconst_5: push(intcon( 5)); break; case Bytecodes::_bipush: push(intcon(iter().get_constant_u1())); break; case Bytecodes::_sipush: push(intcon(iter().get_constant_u2())); break; There are also functions such as makecon and intcon that return nodes representing constants. 52
  • 53. Parse do_one_bytecode case Bytecodes::_ldc: case Bytecodes::_ldc_w: case Bytecodes::_ldc2_w: // If the constant is unresolved, run this BC once in the interpreter. { ciConstant constant = iter().get_constant(); if (constant.basic_type() == T_OBJECT && !constant.as_object()->is_loaded()) { int index = iter().get_constant_pool_index(); 53
  • 54. Parse do_one_bytecode case Bytecodes::_aload_0: push( local(0) ); break; : case Bytecodes::_aload: push( local(iter().get_index()) ); break; push and local ultimately operate on the state held in JVMState and SafePointNode. iter() can be used to fetch the operands of a bytecode. 54
  • 55. Parse do_one_bytecode case Bytecodes::_fstore_0: case Bytecodes::_istore_0: case Bytecodes::_astore_0: set_local( 0, pop() ); break; : case Bytecodes::_fstore: case Bytecodes::_istore: case Bytecodes::_astore: set_local( iter().get_index(), pop() ); break; 55
  • 56. Parse do_one_bytecode case Bytecodes::_pop: _sp -= 1; break; case Bytecodes::_pop2: _sp -= 2; break; case Bytecodes::_swap: a = pop(); b = pop(); push(a); push(b); break; case Bytecodes::_dup: a = pop(); push(a); push(a); break; 56
  • 57. Parse do_one_bytecode case Bytecodes::_baload: array_load(T_BYTE); break; case Bytecodes::_caload: array_load(T_CHAR); break; case Bytecodes::_iaload: array_load(T_INT); break; case Bytecodes::_saload: array_load(T_SHORT); break; case Bytecodes::_faload: array_load(T_FLOAT); break; case Bytecodes::_aaload: array_load(T_OBJECT); break; case Bytecodes::_laload: { a = array_addressing(T_LONG, 0); if (stopped()) return; // guaranteed null or range check _sp -= 2; // Pop array and index push_pair( make_load(control(), a, TypeLong::LONG, T_LONG, TypeAryPtr::LONGS)); break; } 57
  • 58. Parse do_one_bytecode case Bytecodes::_bastore: array_store(T_BYTE); break; case Bytecodes::_castore: array_store(T_CHAR); break; case Bytecodes::_iastore: array_store(T_INT); break; case Bytecodes::_sastore: array_store(T_SHORT); break; case Bytecodes::_fastore: array_store(T_FLOAT); break; case Bytecodes::_aastore: { d = array_addressing(T_OBJECT, 1); if (stopped()) return; // guaranteed null or range check array_store_check(); c = pop(); // Oop to store b = pop(); // index (already used) a = pop(); // the array itself const TypeOopPtr* elemtype = _gvn.type(a)->is_aryptr()->elem()->make_oopptr(); const TypeAryPtr* adr_type = TypeAryPtr::OOPS; Node* store = store_oop_to_array(control(), a, d, adr_type, c, elemtype, T_OBJECT); 58
  • 59. Parse do_one_bytecode case Bytecodes::_getfield: do_getfield(); break; case Bytecodes::_getstatic: do_getstatic(); break; case Bytecodes::_putfield: do_putfield(); break; case Bytecodes::_putstatic: do_putstatic(); break; 59
  • 60. Parse do_one_bytecode // implementation of _get* and _put* bytecodes void do_getstatic() { do_field_access(true, false); } void do_getfield () { do_field_access(true, true); } void do_putstatic() { do_field_access(false, false); } void do_putfield () { do_field_access(false, true); } 60
  • 61. Parse do_one_bytecode Parse::do_field_access Parse::do_get_xxx(Node* obj, ciField* field, bool is_field) Node *adr = basic_plus_adr(obj, obj, offset); : Node* ld = make_load(NULL, adr, type, bt, adr_type, is_vol); Node* GraphKit::basic_plus_adr(Node* base, Node* ptr, Node* offset) { // short-circuit a common case if (offset == intcon(0)) return ptr; return _gvn.transform( new (C, 4) AddPNode(base, ptr, offset) ); } 61
  • 62. Parse do_one_bytecode // factory methods in "int adr_idx" Node* GraphKit::make_load(Node* ctl, Node* adr, const Type* t, BasicType bt,int adr_idx, bool require_atomic_access) { Node* mem = memory(adr_idx); Node* ld; if (require_atomic_access && bt == T_LONG) { ld = LoadLNode::make_atomic(C, ctl, mem, adr, adr_type, t); } else { ld = LoadNode::make(_gvn, ctl, mem, adr, adr_type, t, bt); } return _gvn.transform(ld); } 62
  • 63. Parse do_one_bytecode Node* GraphKit::memory(uint alias_idx) { MergeMemNode* mem = merged_memory(); Node* p = mem->memory_at(alias_idx); _gvn.set_type(p, Type::MEMORY); // must be mapped return p; } 63
  • 64. Parse do_one_bytecode _iadd b = pop(), a = pop() push(_gvn.transform( new (C, 3) AddINode(a,b))) // GraphKit::pop() Node* pop() { ..; return _map->stack(_map->_jvms,--_sp); } // SafePointNode::stack Node *stack(JVMState* jvms, uint idx) const {   return in(jvms->stkoff() + idx); } 64
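The parser works as an abstract interpreter: the operand stack and the locals hold graph nodes instead of runtime values, so executing `_iadd` builds an AddI node from the two popped nodes. The following sketch shows that mechanism with strings standing in for nodes. `MiniParse`, `Bc`, and `Insn` are hypothetical names, and the real parser also tracks control, memory, and JVMState, which are omitted here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical four-instruction bytecode subset.
enum class Bc { IConst, ILoad, IAdd, IStore };
struct Insn { Bc op; int arg; };

// Hypothetical miniature parser: the stack holds node descriptions.
struct MiniParse {
    std::vector<std::string> locals;
    std::vector<std::string> stack;

    explicit MiniParse(std::size_t nlocals) {
        for (std::size_t i = 0; i < nlocals; ++i) {
            locals.push_back("L" + std::to_string(i));
        }
    }

    void push(const std::string& n) { stack.push_back(n); }

    std::string pop() {
        std::string n = stack.back();
        stack.pop_back();
        return n;
    }

    void do_one_bytecode(const Insn& i) {
        const std::size_t idx = static_cast<std::size_t>(i.arg);
        switch (i.op) {
        case Bc::IConst:
            push("ConI(" + std::to_string(i.arg) + ")");
            break;
        case Bc::ILoad:
            push(locals[idx]);
            break;
        case Bc::IStore:
            locals[idx] = pop();
            break;
        case Bc::IAdd: {
            std::string b = pop();  // right operand is on top
            std::string a = pop();
            push("AddI(" + a + "," + b + ")");
            break;
        }
        }
    }
};
```

Running `iload_0, iconst 5, iadd, istore_1` through this sketch leaves the expression for local 1 as a small tree over local 0 and a constant, with no explicit stack traffic left.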
  • 65. Parse do_one_bytecode case Bytecodes::_iinc:         // Increment local     i = iter().get_index();     // Get local index     set_local( i, _gvn.transform(         new (C, 3) AddINode(             _gvn.intcon(iter().get_iinc_con()), local(i) ) ) );     break; 65
  • 66. Parse do_one_bytecode _goto, _goto_w     int target_bci = (bc() == Bytecodes::_goto) ?         iter().get_dest() : iter().get_far_dest();     // If this is a backwards branch in the bytecodes, add Safepoint     maybe_add_safepoint(target_bci);     // Update method data     profile_taken_branch(target_bci);     // Add loop predicate if it goes to a loop     if (should_add_predicate(target_bci)){       add_predicate();     }     // Merge the current control into the target basic block     merge(target_bci);     ... 66
  • 67. Parse do_one_bytecode _goto, _goto_w     ...// See if we can get some profile data and hand it off to the next block     Block *target_block = block()->successor_for_bci(target_bci);     if (target_block->pred_count() != 1)  break;     ciMethodData* methodData = method()->method_data();     if (!methodData->is_mature())  break;     ciProfileData* data = methodData->bci_to_data(bci());     assert( data->is_JumpData(), "" );     int taken = ((ciJumpData*)data)->taken();     taken = method()->scale_count(taken);     target_block->set_count(taken);     break; 67
  • 68. Parse do_one_bytecode case _ifnull:    btest = BoolTest::eq; goto handle_if_null; case _ifnonnull: btest = BoolTest::ne; goto handle_if_null; handle_if_null:     // If this is a backwards branch in the bytecodes, add Safepoint     maybe_add_safepoint(iter().get_dest());     a = null();     b = pop();     c = _gvn.transform( new (C, 3) CmpPNode(b, a) );     do_ifnull(btest, c);     break; 68
  • 69. Parse do_one_bytecode case _if_acmpeq: btest = BoolTest::eq; goto handle_if_acmp; case _if_acmpne: btest = BoolTest::ne; goto handle_if_acmp; handle_if_acmp:     // If this is a backwards branch in the bytecodes, add Safepoint     maybe_add_safepoint(iter().get_dest());     a = pop();     b = pop();     c = _gvn.transform( new (C, 3) CmpPNode(b, a) );     do_if(btest, c);     break; 69
  • 70. Parse do_one_bytecode case Bytecodes::_tableswitch:     do_tableswitch();     break; case Bytecodes::_lookupswitch:     do_lookupswitch();     break; 70
  • 71. Parse do_one_bytecode case Bytecodes::_invokestatic: case Bytecodes::_invokedynamic: case Bytecodes::_invokespecial: case Bytecodes::_invokevirtual: case Bytecodes::_invokeinterface:     do_call();     break; case Bytecodes::_checkcast:     do_checkcast();     break; case Bytecodes::_instanceof:     do_instanceof();     break; 71
  • 72. Parse do_one_bytecode getClass is inlined and becomes a LoadKlass, that is, a memory access. hashCode becomes a static call. public class Call { public static void main(String[] args) { Call c = new Call(); for (int i = 0; i < 100000; i++) { c.doit(); } } int doit() { return getClass().hashCode(); } } 72
  • 73. Parse do_one_bytecode case Bytecodes::_anewarray:     do_anewarray();     break; case Bytecodes::_newarray:     do_newarray((BasicType)iter().get_index());     break; case Bytecodes::_multianewarray:     do_multianewarray();     break; case Bytecodes::_new:     do_new();     break; 73
  • 74. Parse do_one_bytecode case Bytecodes::_jsr: case Bytecodes::_jsr_w:     do_jsr();     break; case Bytecodes::_ret:     do_ret();     break; 74
  • 75. Parse do_one_bytecode case Bytecodes::_monitorenter:     do_monitor_enter();     break; case Bytecodes::_monitorexit:     do_monitor_exit();     break; 75
  • 76. Optimize PhaseIterGVN igvn(initial_gvn) igvn.optimize() ConnectionGraph::do_analysis(this, &igvn) // EscapeAnalysis igvn.optimize() PhaseIdealLoop ideal_loop(igvn, true) PhaseIdealLoop ideal_loop(igvn, ...) PhaseIdealLoop ideal_loop(igvn, ...) PhaseCCP ccp( &igvn ) PhaseMacroExpand mex(igvn) 76
  • 78. Optimize PhaseIterGVN igvn(initial_gvn) igvn.optimize() // Take a node off the worklist and transform it. // If the node changed, update the edge information // and put its users on the worklist. while( _worklist.size() ) { Node *n = _worklist.pop(); if (++loop_count >= K * C->unique()) { // bounds check ...} if (n->outcnt() != 0) { Node *nn = transform_old(n); } else if (!n->is_top()) { remove_dead_node(n); } } 78
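The worklist loop above can be sketched with constant folding as the only transformation. This is a simplified illustration, not the HotSpot implementation: `MiniIterGVN`, `WNode`, and the folding rule are hypothetical, and dead node removal and the loop count limit are left out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical node: either a constant or an add of two inputs.
struct WNode {
    bool is_con;
    int value;               // valid when is_con is true
    int in1, in2;            // input ids of an add, -1 for constants
    std::vector<int> users;  // def-use edges
};

struct MiniIterGVN {
    std::vector<WNode> nodes;
    std::vector<int> worklist;

    int con(int v) {
        nodes.push_back(WNode{true, v, -1, -1, {}});
        return static_cast<int>(nodes.size()) - 1;
    }

    int add(int a, int b) {
        nodes.push_back(WNode{false, 0, a, b, {}});
        int id = static_cast<int>(nodes.size()) - 1;
        nodes[a].users.push_back(id);
        nodes[b].users.push_back(id);
        worklist.push_back(id);
        return id;
    }

    // Fold an add of two constants in place. Returns true on change.
    bool transform_old(int id) {
        WNode& n = nodes[id];
        if (n.is_con) return false;
        const WNode& a = nodes[n.in1];
        const WNode& b = nodes[n.in2];
        if (!a.is_con || !b.is_con) return false;
        n.value = a.value + b.value;
        n.is_con = true;
        return true;
    }

    // Pop nodes until nothing changes; a changed node re-queues its users.
    void optimize() {
        while (!worklist.empty()) {
            int id = worklist.back();
            worklist.pop_back();
            if (transform_old(id)) {
                for (int u : nodes[id].users) worklist.push_back(u);
            }
        }
    }
};
```

Re-queuing users is the important part: a node that could not be simplified on its first visit gets another chance once one of its inputs has improved.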
  • 79. Optimize Node *PhaseIterGVN::transform_old ( Node *n ) The can_reshape argument passed to Ideal is true. Nodes that compute to a constant are replaced through subsume_node, which redirects their users to the new node. 79
  • 81. Optimize ConnectionGraph::do_analysis(this, &igvn) // EscapeAnalysis if (congraph->compute_escape()) { // There are non escaping objects. C->set_congraph(congraph); } LockNode and UnlockNode consult congraph; if the locked object does not escape, the lock operation is eliminated. Locking and unlocking an object that is local to the thread is pointless. 81
  • 82. Optimize ConnectionGraph::compute_escape() Returns false if there is no Java object allocation. Puts AddP, MergeMem, and similar nodes on the worklist, along with their outs. Examines each worklist node in detail, registers it in GrowableArray<PointsToNode> _nodes, classifies it as GlobalEscape, ArgEscape, or NoEscape, and propagates the state to reachable nodes. // see the comment in escape.hpp // flags: PrintEscapeAnalysis PrintEliminateAllocations 82
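The propagation step can be sketched as a reachability pass: whatever is reachable from an escaping node must escape at least as much. The sketch below assumes a hypothetical `MiniConnectionGraph` with plain adjacency lists. The real connection graph distinguishes several node and edge kinds, which are ignored here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Ordered so that a larger value means "escapes more".
enum EscapeState { NoEscape = 1, ArgEscape = 2, GlobalEscape = 3 };

// Hypothetical miniature connection graph; not the HotSpot class.
struct MiniConnectionGraph {
    std::vector<EscapeState> state;
    std::vector<std::vector<int>> edges;  // edges[n]: nodes reachable from n

    int add_node(EscapeState s) {
        state.push_back(s);
        edges.emplace_back();
        return static_cast<int>(state.size()) - 1;
    }

    void add_edge(int from, int to) { edges[from].push_back(to); }

    // Raise the state of every node reachable from a more escaping node.
    void propagate() {
        std::vector<int> worklist;
        for (std::size_t n = 0; n < state.size(); ++n) {
            worklist.push_back(static_cast<int>(n));
        }
        while (!worklist.empty()) {
            int n = worklist.back();
            worklist.pop_back();
            for (int t : edges[n]) {
                if (state[t] < state[n]) {
                    state[t] = state[n];
                    worklist.push_back(t);
                }
            }
        }
    }
};
```

An object stored into a static field becomes GlobalEscape, and so does everything it references, while an object that is never reachable from such a node keeps NoEscape and remains a candidate for lock elimination or scalar replacement.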
  • 83. Optimize class ConnectionGraph: public ResourceObj // escape state of a node PointsToNode::EscapeState escape_state(Node *n); // other information we have collected bool is_scalar_replaceable(Node *n) { if (_collecting || (n->_idx >= nodes_size())) return false; PointsToNode* ptn = ptnode_adr(n->_idx); return ptn->escape_state() == PointsToNode::NoEscape && ptn->_scalar_replaceable; } 83
  • 86. Optimize // Convert to counted loops where possible PhaseIdealLoop::is_counted_loop( Node *x, IdealLoopTree *loop ) PhaseIdealLoop::is_counted_loop tries to convert the loop into a CountedLoop. counted_loop is also called recursively on child loops. void PhaseIdealLoop::do_peeling( IdealLoopTree *loop, Node_List &old_new ) // Peels off the first iteration. Illustrated in loopTransform.cpp. void PhaseIdealLoop::do_unroll( IdealLoopTree *loop, Node_List &old_new, bool adjust_min_trip ) void PhaseIdealLoop::do_maximally_unroll( IdealLoopTree *loop, Node_List &old_new ) // Eliminate range-checks and other trip-counter vs loop-invariant tests. void PhaseIdealLoop::do_range_check( IdealLoopTree *loop, Node_List &old_new ) 86
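Peeling and unrolling are easier to picture at source level. The sketch below applies both by hand to a summation loop of the same shape as the `doit` example in this deck. The function names are hypothetical, and C2 performs these transformations on the ideal graph, not on source code.

```cpp
#include <cassert>

// Original loop.
int sum_plain(int n) {
    int sm = 0;
    for (int i = 0; i < n; i++) sm += i;
    return sm;
}

// Peeled: the first iteration is split out in front of the loop.
int sum_peeled(int n) {
    int sm = 0;
    int i = 0;
    if (i < n) {
        sm += i;  // peeled first iteration
        i++;
        for (; i < n; i++) sm += i;
    }
    return sm;
}

// Unrolled by two: the main loop does two iterations per trip and a
// trailing loop handles a leftover iteration when n is odd.
int sum_unrolled(int n) {
    int sm = 0;
    int i = 0;
    for (; i + 1 < n; i += 2) {
        sm += i;
        sm += i + 1;
    }
    for (; i < n; i++) sm += i;
    return sm;
}
```

All three functions return the same value for every `n`; the transformed versions only trade code size for fewer loop-exit tests per element.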
  • 87. Optimize -- PhaseIdealLoop After Parsing static int doit() { int sm = 0; for (int i = 0; i < 100; i++) sm += i; return sm; } 87
  • 88. Optimize -- PhaseIdealLoop After CountedLoop static int doit() { int sm = 0; for (int i = 0; i < 100; i++) sm += i; return sm; } 88
  • 89. Optimize -- PhaseIdealLoop Optimization Finished Unrolling? 89
  • 91. Optimize PhaseCCP ccp( &igvn ) ccp.do_transform C->set_root( transform(C->root())->as_Root() ); Replaces whatever can be replaced with a constant. 91
  • 93. Optimize PhaseMacroExpand mex(igvn) mex.expand_macro_nodes() ... eliminate_allocate_node scalar_replacement Processes the results of Escape Analysis; converts allocations into stack operations? 93
  • 94. Code_Gen Matcher m(proj_list) m.match() PhaseCFG cfg(node_arena(), root()) cfg.Dominators() cfg.Estimate_Block_Frequency() cfg.GlobalCodeMotion(m,unique(),proj) PhaseChaitin regalloc(unique, cfg, m) regalloc->Register_Allocate() PhaseBlockLayout PhasePeephole Output 94
  • 95. Matcher #0 in addI_eRegNode::Expand(State*, Node_List&, Node*) () #1 in Matcher::ReduceInst(State*, int, Node*&) () #2 in Matcher::match_tree(Node const*) () #3 in Matcher::xform(Node*, int) () #4 in Matcher::match() () #5 in Compile::Code_Gen() () #6 in Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool) () 95
  • 96. Matcher // x86_32.ad, an ADLC file instruct addI_eReg(eRegI dst, eRegI src, eFlagsReg cr) %{ match(Set dst (AddI dst src)); effect(KILL cr); size(2); format %{ "ADD $dst,$src" %} opcode(0x03); ... %} 96
  • 97. PhaseCFG PhaseCFG::build_cfg() Builds the CFG (Control Flow Graph) from RegionNode and StartNode so that the subsequent machine-oriented operations can be performed. 97
  • 98. PhaseCFG class PhaseCFG : public Phase + _num_blocks: uint + _blocks: RootNode* + _bbs: Block_Array + _broot: Block* + _rpo_ctr: uint + _root_loop: + _node_latency: GrowableArray<uint>* 98
  • 99. PhaseCFG class Block : public CFGElement + _nodes : Node_List + _succs : Block_Array + _num_succs: uint + _pre_order: uint // Pre-order DFS # + _dom_depth: uint + _idom : Block* + _loop : CFGLoop* + _rpo : uint : // reg pressure, etc 99
  • 100. PhaseCFG PhaseCFG::Dominators() // Lengauer & Tarjan algorithm // Sets _dom_depth and _idom of each Block // This data is the basis for code motion PhaseCFG::Estimate_Block_Frequency() // Computes block frequencies from the probabilities of IfNodes // and stores them in _freq, a field of Block's parent class 100
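To make `_idom` and `_dom_depth` concrete, the sketch below computes both for a small CFG. It deliberately uses the simpler iterative algorithm by Cooper, Harvey, and Kennedy in place of Lengauer and Tarjan, and it assumes that blocks are already numbered in reverse postorder with block 0 as the entry and that every block is reachable. The function names are hypothetical, and the depth of the entry block is taken as 0 here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Walk both candidates up the dominator tree until they meet.
// Relies on reverse postorder numbering: a dominator has a smaller id.
int intersect(const std::vector<int>& idom, int a, int b) {
    while (a != b) {
        while (a > b) a = idom[a];
        while (b > a) b = idom[b];
    }
    return a;
}

// preds[b] lists the predecessors of block b. Returns idom per block.
std::vector<int> compute_idom(const std::vector<std::vector<int>>& preds) {
    const int n = static_cast<int>(preds.size());
    std::vector<int> idom(n, -1);
    idom[0] = 0;  // the entry block dominates itself
    bool changed = true;
    while (changed) {
        changed = false;
        for (int b = 1; b < n; ++b) {
            int new_idom = -1;
            for (int p : preds[b]) {
                if (idom[p] == -1) continue;  // predecessor not visited yet
                new_idom = (new_idom == -1) ? p : intersect(idom, p, new_idom);
            }
            if (new_idom != idom[b]) {
                idom[b] = new_idom;
                changed = true;
            }
        }
    }
    return idom;
}

// Depth in the dominator tree; valid because idom[b] < b for b > 0.
std::vector<int> compute_dom_depth(const std::vector<int>& idom) {
    std::vector<int> depth(idom.size(), 0);
    for (std::size_t b = 1; b < idom.size(); ++b) {
        depth[b] = depth[static_cast<std::size_t>(idom[b])] + 1;
    }
    return depth;
}
```

For a diamond-shaped CFG (0 branches to 1 and 2, both join at 3, then 4 follows), the join block 3 is immediately dominated by 0 and not by either branch, which is the information that code motion relies on when deciding how far up an instruction may legally be placed.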
  • 103. Output Replaces StartNode with MachPrologNode. Sets up the unverified entry point. Places a MachEpilogNode before each return. ScheduleAndBundle() BuildOopMap() Fill_buffer() Prepares a CodeBuffer for (i = 0; i < _cfg->_num_blocks; i++) for (j = 0; j < last_inst; j++) … n->emit(*cb, _regalloc) -XX:+PrintOptoAssembly to dump instructions https://gist.github.com/1376858 103
  • 104. Bonus [Chart: bytecode size vs arena use. x-axis: bytecode size, 0 to 7000; y-axis: bytes, 0 to 14000000; series: comp, node, res] 104
  • 105. [Chart: compiler memory use. x-axis: unique (number of nodes), 0 to 30000; y-axis: bytes, 0 to 14000000; series: comp_arena, node_arena, res_area] 105