Weird Stream API

JETConf 2016 - Weird Stream API

  1. 1. Weird Stream API Tagir Valeev JetBrains
  2. 2. Why can you trust me? https://github.com/amaembo/streamex 2
  3. 3. Why can you trust me? • JDK-8072727 Add variation of Stream.iterate() that's finite • JDK-8136686 Collectors.counting can use Collectors.summingLong to reduce boxing • JDK-8141630 Specification of Collections.synchronized* need to state traversal constraints • JDK-8145007 Pattern splitAsStream is not late binding as required by the specification • JDK-8146218 Add LocalDate.datesUntil method producing Stream<LocalDate> • JDK-8147505 BaseStream.onClose() should not allow registering new handlers after stream is consumed • JDK-8148115 Stream.findFirst for unordered source optimization • JDK-8148250 Stream.limit() parallel tasks with ordered non-SUBSIZED source should short-circuit • JDK-8148838 Stream.flatMap(...).spliterator() cannot properly split after tryAdvance() • JDK-8148748 ArrayList.subList().spliterator() is not late-binding • JDK-8151123 Collectors.summingDouble/averagingDouble unnecessarily call mapper twice • JDK-8153293 Preserve SORTED and DISTINCT characteristics for boxed() and asLongStream() operations • JDK-8154387 Parallel unordered Stream.limit() tries to collect 128 elements even if limit is less • JDK-8155600 Performance optimization of Arrays.asList().iterator() • JDK-8164189 Collectors.toSet() parallel performance improvement 3
  6. 6. source.stream() 6
  7. 7. .map(x -> x.squash()) 7
  8. 8. .filter(x -> x.getColor() != YELLOW) 8
  9. 9. .forEach(System.out::println) 9
  10. 10. source.stream() .map(x -> x.squash()) .filter(x -> x.getColor() != YELLOW) .forEach(System.out::println) 10
  11. 11. AtomicLong cnt = new AtomicLong(); .peek(x -> cnt.incrementAndGet()) 11
  12. 12. AtomicLong cnt = new AtomicLong(); source.stream() .map(x -> x.squash()) .peek(x -> cnt.incrementAndGet()) .filter(x -> x.getColor() != YELLOW) .forEach(System.out::println) 12
  13. 13. LongStream.range(1, 100) .count(); 13
  14. 14. LongStream.range(1, 100) .count(); >> 99 14
  15. 15. LongStream.range(0, 1_000_000_000_000_000_000L) .count(); 15
  17. 17. LongStream.range(0, 1_000_000_000_000_000_000L) .count(); >> 1000000000000000000 Java 9: JDK-8067969 Optimize Stream.count for SIZED Streams 17
  18. 18. public interface Spliterator<T> { boolean tryAdvance(Consumer<? super T> action); default void forEachRemaining(Consumer<? super T> action) { … } Spliterator<T> trySplit(); long estimateSize(); default long getExactSizeIfKnown() { … } int characteristics(); default boolean hasCharacteristics(int characteristics) { … } default Comparator<? super T> getComparator() { … } … } 18
  19. 19. Characteristics SIZED SUBSIZED SORTED ORDERED DISTINCT NONNULL IMMUTABLE CONCURRENT 19
  20. 20. list.stream() .peek(data -> process(data)) .peek(data -> processInAnotherWay(data)) .count(); // some terminal operation 20
  21. 21. Characteristics SIZED SUBSIZED SORTED ORDERED DISTINCT NONNULL IMMUTABLE CONCURRENT 21
  22. 22. Characteristics SIZED SUBSIZED SORTED ORDERED DISTINCT NONNULL IMMUTABLE CONCURRENT 22
  23. 23. Stream<T> IntStream LongStream DoubleStream 23
  24. 24. Stream<T> IntStream Stream<Integer> LongStream Stream<Long> DoubleStream Stream<Double> 24
  25. 25. 25 Primitive is faster!
  26. 26. int[] ints; Integer[] integers; @Setup public void setup() { ints = new Random(1).ints(1000000, 0, 1000) .toArray(); integers = new Random(1).ints(1000000, 0, 1000) .boxed().toArray(Integer[]::new); } 26
  27. 27. 27 @Benchmark public long stream() { return Stream.of(integers).distinct().count(); } @Benchmark public long intStream() { return IntStream.of(ints).distinct().count(); }
  28. 28. 28 @Benchmark public long stream() { return Stream.of(integers).distinct().count(); } @Benchmark public long intStream() { return IntStream.of(ints).distinct().count(); } # VM version: JDK 1.8.0_92, VM 25.92-b14 Benchmark Mode Cnt Score Error Units intStream avgt 30 17.121 ± 0.296 ms/op stream avgt 30 15.764 ± 0.116 ms/op
  29. 29. 29 Stream<T> ⇒ ReferencePipeline IntStream ⇒ IntPipeline LongStream ⇒ LongPipeline DoubleStream ⇒ DoublePipeline AbstractPipeline package java.util.stream;
  30. 30. 30 // java.util.stream.IntPipeline @Override public final IntStream distinct() { // While functional and quick to implement, // this approach is not very efficient. // An efficient version requires // an int-specific map/set implementation. return boxed().distinct().mapToInt(i -> i); }
  31. 31. 31 @Benchmark public long stream() { return Stream.of(integers).distinct().count(); } @Benchmark public long intStream() { // IntStream.of(ints).distinct().count(); return IntStream.of(ints).boxed().distinct() .mapToInt(i -> i).count(); } -prof gc stream 48870 B/op intStream 13642536 B/op
  32. 32. 32 Task: compute the sum of 5 distinct pseudorandom numbers from 1 to 20. Image courtesy: http://magenta-stock.deviantart.com/
  33. 33. 33 Task: compute the sum of 5 distinct pseudorandom numbers from 1 to 20. // IntStream ints(long streamSize, int origin, int bound) new Random().ints(5, 1, 20+1).sum();
  34. 34. 34 Task: compute the sum of 5 distinct pseudorandom numbers from 1 to 20. // IntStream ints(long streamSize, int origin, int bound) new Random().ints(5, 1, 20+1).sum();
  35. 35. 35 Task: compute the sum of 5 distinct pseudorandom numbers from 1 to 20. // IntStream ints(long streamSize, int origin, int bound) new Random().ints(5, 1, 20+1).sum(); new Random().ints(5, 1, 20+1).distinct().sum();
  36. 36. 36 Task: compute the sum of 5 distinct pseudorandom numbers from 1 to 20. // IntStream ints(long streamSize, int origin, int bound) new Random().ints(5, 1, 20+1).sum(); new Random().ints(5, 1, 20+1).distinct().sum(); // IntStream ints(int origin, int bound) new Random().ints(1, 20+1).distinct().limit(5).sum(); …
  37. 37. 37 Task: compute the sum of 5 distinct pseudorandom numbers from 1 to 20. new Random().ints(1, 20+1).distinct().limit(5).sum(); new Random().ints(1, 20+1).parallel().distinct().limit(5).sum(); new Random().ints(1, 20+1).distinct().parallel().limit(5).sum(); new Random().ints(1, 20+1).distinct().limit(5).parallel().sum();
  38. 38. 38 Task: compute the sum of 5 distinct pseudorandom numbers from 1 to 20. new Random().ints(1, 20+1).distinct().limit(5).sum(); new Random().ints(1, 20+1).parallel().distinct().limit(5).sum(); new Random().ints(1, 20+1).distinct().parallel().limit(5).sum(); new Random().ints(1, 20+1).distinct().limit(5).parallel().sum(); new Random().ints(1, 20+1).parallel().distinct().limit(5) .sequential().sum(); JDK-8132800 clarify stream package documentation regarding sequential vs parallel modes
  39. 39. 39 Task: compute the sum of 5 distinct pseudorandom numbers from 1 to 20. new Random().ints(1, 20+1).distinct().limit(5).sum(); # VM version: JDK 1.8.0_92, VM 25.92-b14 Benchmark Mode Cnt Score Error Units rndSum avgt 30 0.286 ± 0.001 us/op
  40. 40. 40 Task: compute the sum of 5 distinct pseudorandom numbers from 1 to 20. new Random().ints(1, 20+1).distinct().limit(5).sum(); # VM version: JDK 1.8.0_92, VM 25.92-b14 Benchmark Mode Cnt Score Error Units rndSum avgt 30 0.286 ± 0.001 us/op rndSumPar ~6080 years/op
  41. 41. 41 Task: compute the sum of 5 distinct pseudorandom numbers from 1 to 20. new Random().ints(1, 20+1).distinct().limit(5).sum(); # VM version: JDK 1.8.0_92, VM 25.92-b14 Benchmark Mode Cnt Score Error Units rndSum avgt 30 0.286 ± 0.001 us/op rndSumPar ~6080 years/op java.util.stream.StreamSpliterators.UnorderedSliceSpliterator<T, T_SPLITR>
  42. 42. 42 Task: compute the sum of 5 distinct pseudorandom numbers from 1 to 20. public IntStream ints(int randomNumberOrigin, int randomNumberBound) Returns an effectively unlimited stream of pseudorandom int values, each conforming to the given origin (inclusive) and bound (exclusive). Implementation Note: This method is implemented to be equivalent to ints(Long.MAX_VALUE, randomNumberOrigin, randomNumberBound).
  43. 43. 43 Task: compute the sum of 5 distinct pseudorandom numbers from 1 to 20. new Random().ints(1, 20+1).distinct().limit(5).sum(); new Random().ints(1, 20+1).parallel().distinct().limit(5).sum(); # VM version: JDK 9-ea, VM 9-ea+130 Benchmark Mode Cnt Score Error Units rndSum avgt 30 0.318 ± 0.003 us/op rndSumPar avgt 30 7.793 ± 0.026 us/op JDK-8154387 Parallel unordered Stream.limit() tries to collect 128 elements even if limit is less
  44. 44. parallel().skip() IntStream.range(0, 100_000_000).skip(99_000_000).sum(); (104.5±4.4 ms) IntStream.range(0, 100_000_000).parallel().skip(99_000_000).sum(); (?) 44
  46. 46. parallel().skip() API Note: While skip() is generally a cheap operation on sequential stream pipelines, it can be quite expensive on ordered parallel pipelines, especially for large values of n, since skip(n) is constrained to skip not just any n elements, but the first n elements in the encounter order. 46
  47. 47. parallel().skip() IntStream.range(0, 100_000_000).skip(99_000_000).sum(); (104.5±4.4 ms) IntStream.range(0, 100_000_000).parallel().skip(99_000_000).sum(); (1.4±0.2 ms, 74.6×) 47
  48. 48. parallel(): trySplit() (SIZED+SUBSIZED) 0..99_999_999 splits into 0..49_999_999 and 50_000_000..99_999_999; 0..49_999_999 splits into 0..24_999_999 and 25_000_000..49_999_999; 50_000_000..99_999_999 splits into 50_000_000..74_999_999 and 75_000_000..99_999_999; … 48
  49. 49. parallel().skip() (SIZED+SUBSIZED) 0..99_999_999 splits into 0..49_999_999 and 50_000_000..99_999_999; 50_000_000..99_999_999 splits into 50_000_000..74_999_999 and 75_000_000..99_999_999; … 49
  50. 50. class Node { private String name; private List<Node> children; public Node(String name, Node... children) { this.name = name; this.children = Arrays.asList(children); } @Override public String toString() { return name; } public Stream<Node> allNodes() { return …? } } 50
  51. 51. class Node { private String name; private List<Node> children; ... @Override public String toString() { return name; } public Stream<Node> allNodes() { return children.isEmpty() ? Stream.of(this) : Stream.concat( Stream.of(this), children.stream().flatMap(Node::allNodes)); } } 51
  52. 52. flatMap() Stream<Integer> a = ...; Stream<Integer> b = ...; Stream<Integer> c = ...; Stream<Integer> d = ...; Stream<Integer> res = Stream.concat( Stream.concat(Stream.concat(a, b), c), d); Stream<Integer> res = Stream.of(a, b, c, d) .flatMap(Function.identity()); 52
  53. 53. flatMap(): Cartesian product 53 List<List<String>> input = asList(asList("a", "b", "c"), asList("x", "y"), asList("1", "2", "3")); Supplier<Stream<String>> s = input.stream() .<Supplier<Stream<String>>>map(list -> list::stream) .reduce((sup1, sup2) -> () -> sup1.get() .flatMap(e1 -> sup2.get().map(e2 -> e1+e2))) .orElse(() -> Stream.of("")); s.get().forEach(System.out::println); >> ax1 >> ax2 >> … >> cy3
  54. 54. flatMap() parallel? 54 Stream.of(list1, list2).parallel() .flatMap(List::stream).forEach(item -> {...}); list1 Item1 Item2 Item3 … Item1000 list2 Item1 Item2 Item3 … Item1000
  55. 55. flatMap() parallel? 55 Stream.of(list1, list2).parallel() .flatMap(l -> l.stream().parallel()).forEach(item -> {...}); list1 Item1 Item2 Item3 … Item1000 list2 Item1 Item2 Item3 … Item1000
  56. 56. concat() is better, though! 56 Stream.concat(list1.stream().parallel(), list2.stream().parallel()).forEach(item -> {}); list1 Item1 Item2 … Item500 Item501 Item502 … Item1000 list2 Item1 Item2 … Item500 Item501 Item502 … Item1000
  57. 57. flatMap() and short-circuiting 57 private IntStream stream() { if (!flatMap) { return IntStream.rangeClosed(0, 1_000_000_000); } else { return IntStream.of(1_000_000_000) .flatMap(x -> IntStream.rangeClosed(0, x)); } }
  58. 58. flatMap() and short-circuiting 58 @Benchmark public boolean hasGreaterThan2() { return stream().anyMatch(x -> x > 2); }
  59. 59. flatMap() and short-circuiting 59 @Benchmark public boolean hasGreaterThan2() { return stream().anyMatch(x -> x > 2); } # VM version: JDK 1.8.0_101, VM 25.101-b13 Benchmark (flatMap) Mode Score Error Units hasGreaterThan2 false avgt 0.040 ± 0.001 μs/op hasGreaterThan2 true avgt 980489.972 ± 12335.916 μs/op
  60. 60. flatMap() and tryAdvance() flatMap = false; Spliterator<Integer> spliterator = stream().spliterator(); spliterator.tryAdvance(System.out::println); >> 0 60
  61. 61. flatMap() and tryAdvance() 61 flatMap = true; Spliterator<Integer> spliterator = stream().spliterator(); spliterator.tryAdvance(System.out::println);
  62. 62. java.lang.OutOfMemoryError: Java heap space at java.util.stream.SpinedBuffer.ensureCapacity at java.util.stream.SpinedBuffer.increaseCapacity at java.util.stream.SpinedBuffer.accept at java.util.stream.IntPipeline$4$1.accept at java.util.stream.IntPipeline$7$1.lambda$accept$198 at java.util.stream.IntPipeline$7$1$$Lambda$7/1831932724.accept at java.util.stream.Streams$RangeIntSpliterator.forEachRemaining at java.util.stream.IntPipeline$Head.forEach at java.util.stream.IntPipeline$7$1.accept at java.util.stream.Streams$IntStreamBuilderImpl.tryAdvance at java.util.Spliterator$OfInt.tryAdvance ... at java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance at by.jetconf.streamsamples.FlatMapTryAdvance.main 62
  63. 63. flatMap() and tryAdvance() 63 flatMap = true; Spliterator<Integer> spliterator = stream().spliterator(); spliterator.tryAdvance(System.out::println);
  64. 64. flatMap() and concat() flatMap = true; stream().sum(); stream().findFirst(); // non-short-circuit IntStream.concat(stream(), IntStream.of(1)).sum(); IntStream.concat(stream(), IntStream.of(1)).findFirst(); 64
  65. 65. Exception in thread "main" java.lang.OutOfMemoryError: Java heap space at java.util.stream.SpinedBuffer$OfInt.newArray at java.util.stream.SpinedBuffer$OfInt.newArray at java.util.stream.SpinedBuffer$OfPrimitive.ensureCapacity at java.util.stream.SpinedBuffer$OfPrimitive.increaseCapacity at java.util.stream.SpinedBuffer$OfPrimitive.preAccept at java.util.stream.SpinedBuffer$OfInt.accept ... at java.util.stream.Streams$ConcatSpliterator$OfInt.tryAdvance at java.util.stream.IntPipeline.forEachWithCancel at java.util.stream.AbstractPipeline.copyIntoWithCancel at java.util.stream.AbstractPipeline.copyInto at java.util.stream.AbstractPipeline.wrapAndCopyInto at java.util.stream.FindOps$FindOp.evaluateSequential at java.util.stream.AbstractPipeline.evaluate at java.util.stream.IntPipeline.findFirst at by.jetconf.streamsamples.ConcatFlat.main 65
  66. 66. class Node { private String name; private List<Node> children; ... @Override public String toString() { return name; } public Stream<Node> allNodes() { return children.isEmpty() ? Stream.of(this) : Stream.concat( Stream.of(this), children.stream().flatMap(Node::allNodes)); } } 66
  67. 67. Root Child n0 n1 n2 n3 n999999… 67
  68. 68. root.allNodes().count() root.allNodes().anyMatch(n -> false) 68
  69. 69. count 29.4 ± 2.0 ms 80 001 KB/op anyMatch 70.0 ± 6.0 ms 84 196 KB/op 69
  70. 70. // with concat public Stream<Node> allNodes() { return children.isEmpty() ? Stream.of(this) : Stream.concat( Stream.of(this), children.stream().flatMap(Node::allNodes)); } // without concat public Stream<Node> allNodes() { return children.isEmpty() ? Stream.of(this) : Stream.of(Stream.of(this), children.stream().flatMap(Node::allNodes)) .flatMap(Function.identity()); } 70
  71. 71. count (concat) 29.4 ± 2.0 ms 80 001 KB/op (flatMap) 32.6 ± 2.0 ms 80 001 KB/op anyMatch (concat) 70.0 ± 6.0 ms 84 196 KB/op (flatMap) 31.0 ± 2.3 ms 80 001 KB/op 71
  72. 72. // with concat public Stream<Node> allNodes() { return children.isEmpty() ? Stream.of(this) : Stream.concat( Stream.of(this), children.stream().flatMap(Node::allNodes)); } // without concat public Stream<Node> allNodes() { return children.isEmpty() ? Stream.of(this) : Stream.of(Stream.of(this), children.stream().flatMap(Node::allNodes)) .flatMap(Function.identity()); } 72
  73. 73. import one.util.streamex.StreamEx; public Stream<Node> allNodes() { return StreamEx.ofTree(this, n -> n.children.isEmpty() ? null : StreamEx.of(n.children)); } 73
  74. 74. count (concat) 29.4 ± 2.0 ms 80 001 KB/op (flatMap) 32.6 ± 2.0 ms 80 001 KB/op (StreamEx) 12.4 ± 0.2 ms 0.5 KB/op anyMatch (concat) 70.0 ± 6.0 ms 84 196 KB/op (flatMap) 31.0 ± 2.3 ms 80 001 KB/op (StreamEx) 19.3 ± 0.3 ms 0.5 KB/op 74
  75. 75. Stream and Iterator
  76. 76. iterator() Bring back the for() loop! 76
  77. 77. for vs forEach(): columns for and forEach(); introduced in Java 1.5 and Java 1.8 respectively; rows compared: proper debugging, short stack traces, checked exceptions, mutable variables, early exit via break, parallelism (not needed anyway) 77
  78. 78. iterator() 78
  79. 79. iterator() Stream<String> s = Stream.of("a", "b", "c"); for(String str : (Iterable<String>)s::iterator) { System.out.println(str); } 79
  80. 80. iterator() Stream<String> s = Stream.of("a", "b", "c"); for(String str : (Iterable<String>)s::iterator) { System.out.println(str); } // Joshua Bloch approves for(String str : StreamEx.of("a", "b", "c")) { System.out.println(str); } 80
  81. 81. iterator() IntStream s = IntStream.of(1_000_000_000) .flatMap(x -> IntStream.range(0, x)); for(int i : (Iterable<Integer>)s::iterator) { System.out.println(i); } 81
  82. 82. Exception in thread "main" java.lang.OutOfMemoryError: Java heap space at j.u.s.SpinedBuffer$OfInt.newArray at j.u.s.SpinedBuffer$OfInt.newArray at j.u.s.SpinedBuffer$OfPrimitive.ensureCapacity at j.u.s.SpinedBuffer$OfPrimitive.increaseCapacity at j.u.s.SpinedBuffer$OfPrimitive.preAccept at j.u.s.SpinedBuffer$OfInt.accept at j.u.s.IntPipeline$7$1.lambda$accept$198 ... at j.u.s.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer at j.u.s.StreamSpliterators$AbstractWrappingSpliterator.doAdvance at j.u.s.StreamSpliterators$IntWrappingSpliterator.tryAdvance at j.u.Spliterators$2Adapter.hasNext at by.jetconf.streamsamples.IterableWorkaround.main 82
  83. 83. 83 Map<String, String> userPasswords = Files.lines(Paths.get("/etc/passwd")) .map(str -> str.split(":")) .collect(toMap(arr -> arr[0], arr -> arr[1]));
  85. 85. 85 Map<String, String> userPasswords; try (Stream<String> stream = Files.lines(Paths.get("/etc/passwd"))) { userPasswords = stream.map(str -> str.split(":")) .collect(toMap(arr -> arr[0], arr -> arr[1])); }
  87. 87. 87 http://stackoverflow.com/questions/34753078/
  88. 88. 88 Terminal operations. Normal (internal traversal): forEach() collect() reduce() findFirst() count() toArray() …
  89. 89. 89 Terminal operations. Normal (internal traversal): forEach() collect() reduce() findFirst() count() toArray() … Weird (external traversal): iterator() spliterator()
  90. 90. 90 Use flatMap to save the Stream!
  91. 91. 91 flatMap <R> Stream<R> flatMap( Function<? super T,? extends Stream<? extends R>> mapper) Returns a stream consisting of the results of replacing each element of this stream with the contents of a mapped stream produced by applying the provided mapping function to each element. Each mapped stream is closed after its contents have been placed into this stream. (If a mapped stream is null an empty stream is used, instead.)
  92. 92. 92 Map<String, String> userPasswords = Stream.of(Files.lines(Paths.get("/etc/passwd"))) .flatMap(Function.identity()) .map(str -> str.split(":")) .collect(toMap(arr -> arr[0], arr -> arr[1]));
  93. 93. 93 Files.lines(Paths.get("/etc/passwd")) .spliterator().tryAdvance(...); // reads one line, does not close the file Stream.of(Files.lines(Paths.get("/etc/passwd"))) .flatMap(s -> s) .spliterator().tryAdvance(...); // reads the whole file into memory, closes the file
  94. 94. 94 Files.lines(Paths.get("/etc/passwd")) .spliterator().tryAdvance(...); // reads one line, does not close the file Stream.of(Files.lines(Paths.get("/etc/passwd"))) .flatMap(s -> s) .spliterator().tryAdvance(...); // reads the whole file into memory, closes the file http://stackoverflow.com/a/32767282/4856258 (non-buffering flatMap)
  95. 95. 95 There’s no way to save the Stream 
  96. 96. Thank you for your attention https://twitter.com/tagir_valeev https://github.com/amaembo https://habrahabr.ru/users/lany 96
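
The sequential solution from slides 36 to 39 can be sketched as a small runnable class. This is only an illustration under assumptions: the class name `DistinctRandomSum`, the helper `pickDistinct` and the fixed seed are hypothetical and not from the talk. The stream is kept sequential on purpose, since slide 40 reports the parallel variant as impractically slow on JDK 8.

```java
import java.util.Random;
import java.util.stream.IntStream;

public class DistinctRandomSum {
    // Hypothetical helper: picks `count` distinct pseudorandom ints in [origin, bound).
    // Assumes count <= bound - origin; otherwise distinct().limit(count) cannot be
    // satisfied and the pipeline would keep consuming the effectively unlimited source.
    public static int[] pickDistinct(long seed, int count, int origin, int bound) {
        return new Random(seed)
                .ints(origin, bound)   // effectively unlimited stream of values in [origin, bound)
                .distinct()            // drop values that were already seen
                .limit(count)          // stop once `count` distinct values have passed
                .toArray();
    }

    public static void main(String[] args) {
        // The seed is fixed only to make the run repeatable (an assumption of this sketch).
        int[] picked = pickDistinct(1L, 5, 1, 20 + 1);
        System.out.println(IntStream.of(picked).sum());
    }
}
```

Returning the array rather than the sum makes it easy to check that the five values really are distinct and within 1 to 20.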

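
The closing behaviour quoted from the flatMap() documentation on slide 91 can be checked without touching the file system. The following is a minimal sketch, assuming an in-memory stream with an onClose() handler stands in for Files.lines(); the class and method names are hypothetical. It compares a plain terminal operation with the flatMap() wrapper from slide 92.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FlatMapClosesStream {
    // Runs a terminal operation directly on a stream that has a close handler
    // and reports whether the handler fired.
    public static boolean closedByTerminalOp() {
        AtomicBoolean closed = new AtomicBoolean();
        Stream<String> lines = Stream.of("root:x", "user:y")
                .onClose(() -> closed.set(true));
        lines.collect(Collectors.toList());   // terminal operations do not call close()
        return closed.get();
    }

    // Wraps the same kind of stream in Stream.of(...).flatMap(identity()),
    // as on slide 92, and reports whether the handler fired.
    public static boolean closedByFlatMap() {
        AtomicBoolean closed = new AtomicBoolean();
        Stream<String> lines = Stream.of("root:x", "user:y")
                .onClose(() -> closed.set(true));
        Stream.of(lines)
                .flatMap(Function.identity()) // flatMap closes each mapped stream after draining it
                .collect(Collectors.toList());
        return closed.get();
    }

    public static void main(String[] args) {
        System.out.println(closedByTerminalOp()); // false
        System.out.println(closedByFlatMap());    // true
    }
}
```

This only demonstrates the closing part. As slides 93 and 94 point out, the same wrapper buffers the whole mapped stream when it is traversed through iterator() or spliterator() on the JDK 8 builds shown in the talk, so try-with-resources remains the safer choice.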