11. A pitfall of Symbol
✔ No symbols are garbage collected.
✔ Many beginners don't know this.
✔ Even good Rubyists make this mistake.
✔ Prone to vulnerability
✔ User input → symbol
✔ Puts pressure on memory
12. Simple cases
✖ if user.respond_to?(params[:method].to_sym)
Is this method callable?
NG: params[:method] is user input
✖ params[params[:attr].to_sym]
Get a value of a hash via a symbol key.
NG: params[:attr] is user input.
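One common way to avoid both patterns is to compare the raw string against a fixed list before converting it. The following is a minimal sketch, not taken from the slides: ALLOWED_ATTRS and safe_attr are hypothetical names, and the params hash is only a stand-in for user-supplied parameters.

```ruby
# Hypothetical allowlist of attribute names that callers may request.
ALLOWED_ATTRS = %w[name email].freeze

# Returns a symbol only for known names; unknown input is never converted.
def safe_attr(input)
  ALLOWED_ATTRS.include?(input) ? input.to_sym : nil
end

params = { "attr" => "name" }    # stand-in for user-supplied parameters
key = safe_attr(params["attr"])  # :name
```

Because the comparison happens on strings, arbitrary user input never reaches to_sym.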
25. ID
✔ ID: used at the C level.
✔ IDs are stored in a method table or a variable table.
✔ A unique number that corresponds to a symbol.
✔ Created by rb_intern(“foo”) in the C API.
✔ :sym == :sym → 1001 == 1001
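The ID itself is not visible from Ruby, but its effect is: the same name always yields the same Symbol object, so comparing symbols is an identity check rather than a string comparison. A small illustration follows; symbol_identity_demo is a hypothetical helper, and object_id is used only as a Ruby-level stand-in for the C-level ID.

```ruby
# Compares how symbols and strings with the same content behave.
def symbol_identity_demo
  a = :foo
  b = "foo".to_sym
  s1 = "foo".dup
  s2 = "foo".dup
  {
    symbols_identical: a.equal?(b),               # one object per name
    same_object_id: a.object_id == b.object_id,
    strings_identical: s1.equal?(s2)              # strings are separate objects
  }
end

p symbol_identity_demo
```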
28. For example, it stores an ID in the static area of the C extension
[Diagram] C side: global_symbols holds sym_id (hash) with “foo” → 1001, and last_id (long) = 1001.
Ruby side: SYMBOL (VALUE) :foo.
Ruby's C extension: static public ID id; holds SYM2ID(:foo) = 1001.
29. If :foo is collected, the ID in sym_id will be deleted.
[Diagram] GC START. C side: global_symbols with sym_id (hash) “foo” → 1001 and last_id (long) = 1001.
Ruby side: SYMBOL (VALUE) :foo.
Ruby's C extension: static public ID id; still holds 1001.
30. Then “foo”.to_sym is called.
:foo == :foo, but the ID is different
[Diagram] C side: global_symbols holds sym_id (hash) with “foo” → 1002; last_id (long) is shown as 1001.
Ruby side: SYMBOL (VALUE) :foo.
Ruby's C extension: static public ID id; holds 1001, which is different from 1002.
SYM2ID(:foo) != id
31. Why garbage symbols can't be collected
✔ Problem: IDs remain on the C side.
✔ We can't detect and manage all the IDs held by C extensions.
✔ Same symbol but different ID
✔ Collecting the symbol would create inconsistent IDs.
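The inconsistency can be reproduced with a toy model of the symbol table. This is only a sketch of the idea in the diagrams, not CRuby's implementation: NaiveSymbolTable and naive_gc_demo are hypothetical names, and the starting ID of 1001 is borrowed from the slides.

```ruby
# Toy model of global_symbols. This is not CRuby's implementation.
class NaiveSymbolTable
  def initialize(first_id)
    @sym_id = {}              # name => ID, like sym_id (hash)
    @last_id = first_id - 1   # like last_id (long)
  end

  # Like rb_intern("foo"): reuse the ID if present, else number a new one.
  def intern(name)
    @sym_id[name] ||= (@last_id += 1)
  end

  # A naive collector that simply forgets the entry.
  def collect(name)
    @sym_id.delete(name)
  end
end

def naive_gc_demo
  table = NaiveSymbolTable.new(1001)
  cached_id = table.intern("foo")  # the C extension keeps this in a static variable
  table.collect("foo")             # :foo looks unreachable from the Ruby side
  fresh_id = table.intern("foo")   # "foo".to_sym is called again
  [cached_id, fresh_id]
end

p naive_gc_demo  # [1001, 1002]
```

The cached value and the freshly numbered value now disagree for the same name, which is the SYM2ID(:foo) != id situation.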
32. In Ruby world
RIP. A symbol is dead...
Photo by MIKI Yoshihito, https://www.flickr.com/photos/mujitra/7571022490
33. In C world
WRRRYYYYY!!!
I'm still alive....!
ID
Photo by Zufallsfaktor, https://www.flickr.com/photos/zufallsfaktor/5911338959
36. Symbols are separated into two types
✔ Immortal Symbol (C World)
✔ Mortal Symbol (Ruby World)
37. Immortal Symbol
✔ These symbols have a corresponding ID
✔ e.g. method names, variable names, constant names, etc.
✔ Used mainly at the C level
✔ Uncollectable
✔ A symbol stays alive once it has been assigned an ID.
✔ There is no transition to Mortal Symbol.
38. def foo; end
[Diagram] C side: global_symbols holds sym_id (hash) with “foo” → 1001 and last_id (long) = 1001, together with the Frozen String “foo”.
39. Store the ID in the method table
[Diagram] C side: global_symbols holds sym_id (hash) with “foo” → 1001 and last_id (long) = 1001, together with the Frozen String “foo”.
Method table: 1001 → def foo; end
40. ID2SYM(ID) → VALUE
[Diagram] C side: global_symbols holds sym_id (hash) with “foo” → 1001 and last_id (long) = 1001; Frozen String “foo” with ID: 1001.
Method table: 1001 → def foo; end
Ruby side: ID2SYM(ID) gives the Immortal Symbol (VALUE) :foo.
41. Mortal Symbol
✔ These symbols don't have an ID
✔ “sym”.to_sym → Mortal Symbol
✔ Used mainly at the Ruby level
✔ Collectable
✔ Unreachable symbols are collected.
✔ A Mortal Symbol can transition to an Immortal Symbol.
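The immortal/mortal split can be sketched as a two-tier table. Everything here is a simplified assumption for illustration: TwoTierSymbolTable and its methods are hypothetical, and the real bookkeeping in CRuby is more involved.

```ruby
# Toy model of the two kinds of symbols. This is not CRuby's implementation.
class TwoTierSymbolTable
  def initialize
    @immortal = {}   # name => ID (entries are never removed)
    @mortal = {}     # name => true (no ID assigned)
    @last_id = 1000
  end

  # Like "foo".to_sym: reuse an immortal symbol if one already exists.
  def to_sym(name)
    return :immortal if @immortal.key?(name)
    @mortal[name] = true
    :mortal
  end

  # Like SYM2ID: assigning an ID promotes the symbol to immortal.
  def sym2id(name)
    @mortal.delete(name)
    @immortal[name] ||= (@last_id += 1)
  end

  # Collects only mortal symbols that are no longer reachable.
  def gc(reachable_names = [])
    @mortal.keep_if { |name, _| reachable_names.include?(name) }
  end

  def known?(name)
    @immortal.key?(name) || @mortal.key?(name)
  end
end

table = TwoTierSymbolTable.new
table.to_sym("tmp")   # mortal, no ID
table.sym2id("foo")   # immortal, ID 1001
table.gc              # only "tmp" is dropped
```

Since an ID is only ever handed out for immortal entries, and those are never removed, a cached ID cannot go stale in this model.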
46. def foo; end
[Diagram] C side: global_symbols holds sym_id (hash) with “foo” → 1001 and last_id (long) = 1001; Frozen String “foo” with ID: 1001.
Ruby side: ID2SYM(ID) gives the Immortal Symbol (VALUE) :foo.
47. “foo”.to_sym
[Diagram] C side: global_symbols holds sym_id (hash) with “foo” → 1001 and last_id (long) = 1001; Frozen String “foo” with ID:1001.
Ruby side: Immortal Symbol (VALUE) :foo via ID2SYM(ID), and Mortal Symbol (VALUE) :foo.
Diagram labels: Check; Use this one
53. Immortal Symbol
✔ Not all symbols are garbage collected.
✔ Immortal symbols are not garbage collected.
✔ A Mortal symbol becomes an Immortal symbol when it is assigned an ID.
✔ This can still lead to a vulnerability!
54. A new pitfall
✔ Immortal symbols can increase unintentionally.
✔ For instance: getting a name from a symbol
✔ rb_id2str(SYM2ID(sym))
✔ Mortal → Immortal
✔ Please use rb_sym2str() instead.
✔ Watch out for careless use of SYM2ID().
55. Please keep monitoring
✔ Check Symbol.all_symbols.size
✔ Please report a bug to ruby-core or the library author if the number of symbols keeps increasing.
✔ It's a transition period now.
✔ It will get better gradually.
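A small helper can make the Symbol.all_symbols.size check repeatable. This is a sketch under assumptions: symbol_growth is a hypothetical name, and the exact number it reports depends on the Ruby version and on what else the process is doing.

```ruby
# Returns the net change in Symbol.all_symbols.size across the block,
# measured after a full GC on both sides.
def symbol_growth
  GC.start
  before = Symbol.all_symbols.size
  yield
  GC.start
  Symbol.all_symbols.size - before
end

growth = symbol_growth do
  1_000.times { |i| "tmp_symbol_#{i}".to_sym }
end
puts "symbols retained: #{growth}"
```

A value that keeps climbing across repeated runs of the same code path is the kind of increase worth reporting.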
57. Static Symbol, Dynamic Symbol
✔ Static Symbol = Immediate value
✔ Immortal
✔ Dynamic Symbol = RVALUE
✔ Mortal or Immortal
✔ Changes to an immortal symbol when it needs an ID.
✔ Similar to Float and FLONUM
58. Details of RSymbol struct
struct RSymbol {
    struct RBasic basic;
    VALUE fstr;   /* Frozen String, e.g. “foo” */
    ID type;      /* ID type, one of the values below */
};
ID_LOCAL    0b00000
ID_INSTANCE 0b00010
ID_GLOBAL   0b00110
ID_ATTRSET  0b01000
...
59. ID Structure
0bxxx.....xxx 000
High-order 61 bits = Counter, low-order 3 bits = ID type
0bxxx.....xx 000 1
Low-order 1 bit = Static Symbol Flag
60. Recognizing an ID quickly
✔ Low-order 1 bit = 1 → Static Symbol
✔ Dynamic Symbol ID = RVALUE address
✔ Low-order 1 bit = 0
✔ Only the lowest bit needs to be checked.
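The bit layout can be tried out with plain integer arithmetic. This sketch assumes the layout with the flag in the lowest bit and the ID type in the three bits above it, using the type values from slide 58. The constant and method names are made up for this example, and the sample address is arbitrary.

```ruby
# Layout assumed here: [counter][3-bit ID type][1-bit static symbol flag]
ID_STATIC_FLAG   = 0b0001
ID_TYPE_MASK     = 0b1110
ID_COUNTER_SHIFT = 4
ID_TYPES = {
  local:    0b0000,
  instance: 0b0010,
  global:   0b0110,
  attrset:  0b1000
}.freeze

# Packs a counter and an ID type into a static symbol ID.
def make_static_id(counter, type)
  (counter << ID_COUNTER_SHIFT) | ID_TYPES.fetch(type) | ID_STATIC_FLAG
end

# Only the lowest bit is inspected.
def static_id?(id)
  (id & ID_STATIC_FLAG) == 1
end

# Splits a static symbol ID back into its counter and type.
def decode_static_id(id)
  { counter: id >> ID_COUNTER_SHIFT, type: ID_TYPES.key(id & ID_TYPE_MASK) }
end

static_id  = make_static_id(1001, :instance)
dynamic_id = 0x7f0000001230   # arbitrary aligned address, lowest bit is 0

p static_id?(static_id)       # true
p static_id?(dynamic_id)      # false
p decode_static_id(static_id)
```

The check works because object addresses are aligned, so the lowest bit of a dynamic symbol's ID is always 0.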
62. Conclusion
✔ Most symbols will be garbage collected.
✔ But some symbols won't be garbage collected.
✔ “sym”.to_sym → OK
✔ define_method(“sym”.to_sym){} → NG
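The NG case can be observed from Ruby. The sketch below assumes CRuby 2.2 or later: pinned_names_after_gc is a hypothetical helper that defines methods with generated names on a throwaway class, drops the class, runs a GC, and counts how many of the names are still present in Symbol.all_symbols. Names are compared as strings so that the check itself does not create symbols.

```ruby
# Counts how many generated method names are still symbols after a GC.
def pinned_names_after_gc(prefix, count)
  # Names are built at run time so they do not exist as symbols beforehand.
  names = Array.new(count) { |i| "#{prefix}_#{i}" }
  klass = Class.new
  names.each { |n| klass.send(:define_method, n.to_sym) {} }
  klass = nil   # drop the only reference to the class
  GC.start
  live = Symbol.all_symbols.map(&:to_s)
  names.count { |n| live.include?(n) }
end

puts pinned_names_after_gc("pinned_demo", 50)
```

If the names come from user input, every distinct name stays in memory for the life of the process, which is why this pattern still needs an allowlist.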
63. Acknowledgments
✔ Sasada-san
✔ Taught me the idea of Symbol GC.
✔ Refined the Symbol GC code.
✔ Nakada-san, Tsujimoto-san, U.Nakamura-san, etc.
✔ Fixed many bugs.
✔ NaCl members