
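The big reveal! Speeding up Funliday's recommended-POI calculation 350x with PostgreSQL

Funliday is a web service that depends heavily on point-of-interest (POI) data. Since October, anyone who enters a city in Funliday's "Browse POIs" view, or pans the map to bring up recommended POIs, should have seen them appear almost instantly, a far cry from the speed of the previous two years. The difference comes mainly from changes to the algorithm, the stored data, and the indexes and, more importantly, from no longer requiring strictly real-time results. Together these made the calculation more than 350 times faster. This talk explains how Funliday used these techniques to cut the calculation time.

Text notes: https://www.facebook.com/kewang.information/posts/2728954360714254

Topics expected to be covered: PostgreSQL, Redis, CDN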


  1. The big reveal! Speeding up Funliday's recommended-POI calculation 350x with PostgreSQL (Kewang)
  2. Kewang (王慕羣)
      ● Java / JavaScript
      ● HBase / PostgreSQL / MongoDB / ElasticSearch
      ● Git / DevOps
      ● Passionate about open source
      ● Linkedin: kewangtw; SlideShare: kewang; Gmail: cpckewang; Facebook: Kewang 的資訊進化論; GitHub: kewang; Funliday: kewang
      ● Talks: devopsday taipei '17; hadoopcon '14, '15; jcconf '16, '17, '18; modernweb '18, '19, '20; coscup '20; mopcon '14, '20
  3-6. Recommended POIs
  7. Technical evolution
  8. V1 (2019-02)
  9-14. Summary
      ● GiST index
      ● Cluster
      ● Nearest-Neighbor Search
      ● Advisory lock
      ● Cache
  15. Sequence diagram
  16-25. Sequence diagram (V1; participants: client, AP, Redis, DB)
      1. client → AP: get POIs
      2. AP → Redis: get cache from Redis
      3. Redis → AP: return cache
      4. if hits, AP → client: return POIs
      5. if misses, AP → DB: calculate POIs from DB
      6. DB → AP: return calculated results
      7. AP → Redis: store cache to Redis
      8. Redis → AP: store OK
      9. AP → client: return POIs
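The V1 flow reads like a classic cache-aside lookup. A minimal sketch of that flow, assuming a plain dict stands in for Redis and a hypothetical `fake_db` function stands in for the PostGIS calculation (neither name comes from the talk):

```python
from typing import Callable, Dict, List


def get_pois(city: str,
             cache: Dict[str, List[str]],
             calculate_pois_from_db: Callable[[str], List[str]]) -> List[str]:
    """Cache-aside lookup: try the cache, fall back to the DB, then fill the cache."""
    cached = cache.get(city)              # get cache from Redis (a dict here)
    if cached is not None:                # if hits, return POIs
        return cached
    pois = calculate_pois_from_db(city)   # if misses, calculate POIs from DB
    cache[city] = pois                    # store cache to Redis
    return pois


# Hypothetical stand-in for the expensive DB calculation; it records its calls.
db_calls: List[str] = []


def fake_db(city: str) -> List[str]:
    db_calls.append(city)
    return [f"{city}-poi-{i}" for i in range(3)]


redis_like: Dict[str, List[str]] = {}
first = get_pois("taipei", redis_like, fake_db)    # miss: goes to the DB
second = get_pois("taipei", redis_like, fake_db)   # hit: served from the cache
```

Only the first call reaches the (fake) database; the second is answered from the cache.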
  26. Advisory lock
  27-31. Advisory lock (V1; client/server timeline)
      ● T: client sends req A (search Taipei city); the server starts to calculate
      ● T+1: client sends req B (search Taipei city)
      ● T+3: server returns res B (data processing...)
      ● T+10: server returns res A (calculated)
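The timeline suggests that only the first request for a city runs the calculation, while a concurrent request for the same city is answered immediately with a "data processing" response. The slides do not show which lock function is used; PostgreSQL does offer a non-blocking `pg_try_advisory_lock`, and the sketch below only mimics that try-lock behaviour inside one Python process. The class and method names are hypothetical.

```python
from typing import Dict, List, Set


class CityCalculator:
    """Single-process stand-in for a per-city advisory lock (hypothetical names)."""

    def __init__(self) -> None:
        self._locked: Set[str] = set()            # cities currently being calculated
        self.results: Dict[str, List[str]] = {}   # finished calculations

    def try_lock(self, city: str) -> bool:
        """Non-blocking acquire: True if we got the lock, False if it is held."""
        if city in self._locked:
            return False
        self._locked.add(city)
        return True

    def finish(self, city: str, pois: List[str]) -> None:
        """Store the result and release the lock."""
        self.results[city] = pois
        self._locked.discard(city)

    def request(self, city: str) -> str:
        """Handle one request and return a status string."""
        if city in self.results:
            return "calculated"
        if self.try_lock(city):
            return "calculating"        # this caller owns the calculation
        return "data processing..."     # another request already holds the lock


calc = CityCalculator()
req_a = calc.request("taipei")       # T: req A takes the lock
req_b = calc.request("taipei")       # T+1: req B cannot take it, answered at once
calc.finish("taipei", ["poi-1"])     # T+10: the calculation started by req A ends
req_c = calc.request("taipei")       # later requests see the stored result
```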
  32. Table DDL
  33. Table DDL
  34. Cluster
  35. Before cluster
  36. Before cluster - query plan
  37-39. Statistical correlation
      ● The closer correlation is to 1, the cheaper it is to use the index
      ● Without basic operators, correlation cannot be computed
  40. Running cluster
  41-46. Running cluster - lock
      ● Rebuild table
        – Access Exclusive Lock
      ● Rebuild index
        – Access Share Lock
        – Access Exclusive Lock
  47. After cluster 1
  48-49. After cluster 1 - query plan
  50. After cluster 2
  51-52. After cluster 2 - query plan
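To make the correlation remark concrete: PostgreSQL's statistic describes how well the physical row order matches the logical order of a column's values. The sketch below is only an illustration of that idea on an integer column, computing a Pearson correlation between row position and value rank; it is not how PostgreSQL estimates the statistic (which works from a sample), and all names are made up for the example.

```python
import random
from typing import List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Plain Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def order_correlation(column_in_physical_order: List[int]) -> float:
    """Correlate each row's physical position with the rank of its value."""
    positions = list(range(len(column_in_physical_order)))
    by_value = sorted(positions, key=lambda i: column_in_physical_order[i])
    ranks = [0] * len(positions)
    for rank, i in enumerate(by_value):
        ranks[i] = rank
    return pearson(positions, ranks)


rng = random.Random(0)
shuffled = list(range(1000))
rng.shuffle(shuffled)            # "before cluster": rows stored in arbitrary order
clustered = sorted(shuffled)     # "after cluster": rows rewritten in key order

before = order_correlation(shuffled)    # close to 0
after = order_correlation(clustered)    # 1.0 up to rounding
```

A value near 1 means an index range scan touches neighbouring pages, which is why the planner prices it lower.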
  53. Nearest-Neighbor Search
  54-56. Nearest-Neighbor Search 1
      ● Full table scan because of ST_Distance
  57-59. Nearest-Neighbor Search 2
      ● Speeds up because of ST_Expand
  60-62. Nearest-Neighbor Search 3
      ● KNN via the <-> operator
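The difference between the first two variants can be illustrated outside the database. The sketch below contrasts sorting every row by distance with first discarding rows outside an expanded bounding box. It is a plain-Python analogy with hypothetical function names, not the talk's SQL: in PostGIS the bounding-box test can be answered by the GiST index, whereas the list comprehension here still visits every point. The index-ordered `<->` variant is not sketched because it depends on the index itself.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def nearest_full_scan(pois: List[Point], center: Point, k: int) -> List[Point]:
    """Variant 1: compute the distance to every row, then sort (full scan)."""
    return sorted(pois, key=lambda p: math.dist(p, center))[:k]


def nearest_with_bbox(pois: List[Point], center: Point, k: int,
                      radius: float) -> List[Point]:
    """Variant 2: keep rows inside a box expanded by `radius`, then sort those."""
    cx, cy = center
    candidates = [p for p in pois
                  if abs(p[0] - cx) <= radius and abs(p[1] - cy) <= radius]
    return sorted(candidates, key=lambda p: math.dist(p, center))[:k]


grid = [(float(x), float(y)) for x in range(10) for y in range(10)]
center = (4.2, 4.3)

full = nearest_full_scan(grid, center, k=3)
boxed = nearest_with_bbox(grid, center, k=3, radius=1.0)
```

One caveat of the bounding-box approach: if the box holds fewer than `k` points, the result is short, so the radius has to be chosen (or grown) with the data density in mind.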
  63. V2 (2019-09, late)
  64-66. Summary
      ● Use POI history to make the recommendations more precise
      ● Remove duplicate POIs from the KNN and POI history results via a uniq function
  67. V2.1 (2020-06, late)
  68-69. Summary
      ● Extract city_data (2000M) from poi_data (25000M) to speed up queries
  70. V2.2 (2020-07, late)
  71-79. Summary
      ● Remove unnecessary OSM POIs
        – drinking_water
        – place_of_worship
        – basketball, football, volleyball
        – parking
      ● Expiry time
        – KNN cache expires after 14d
        – POI history cache expires after 1d
  80. V3 (2020-09-22, late)
  81-86. Summary
      ● Read Redis first; if the entry does not exist, set refresh to true
      ● Read the DB second; if the entry does not exist, calculate it and store it in the DB and Redis
      ● Use a Set instead of the uniq function
      ● L2 & L3 cache
      ● A refresher scans for entries with refresh set to true and recalculates them
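The "Set instead of uniq function" item can be read as deduplicating the merged POI lists by tracking already-seen IDs in a set. A minimal sketch of that idea follows; the function name, the sample IDs, and the choice to put POI history ahead of the KNN results are all assumptions made for the example, since the slides show neither the code nor the merge order.

```python
from typing import Iterable, List, Set


def merge_unique(*sources: Iterable[int]) -> List[int]:
    """Concatenate ID lists, keeping only the first occurrence of each ID."""
    seen: Set[int] = set()
    merged: List[int] = []
    for source in sources:
        for poi_id in source:
            if poi_id not in seen:      # set membership is O(1) on average
                seen.add(poi_id)
                merged.append(poi_id)
    return merged


history_ids = [7, 3, 9]      # hypothetical IDs from POI history
knn_ids = [3, 5, 7, 11]      # hypothetical IDs from the KNN query
merged = merge_unique(history_ids, knn_ids)
```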
  87. Sequence diagram - API
  88-104. Sequence diagram - API (V3; participants: client, AP, Redis, DB)
      1. client → AP: get POIs
      2. AP → Redis: get cache
      3. Redis → AP: return cache
      4. if hits, AP → client: return POIs
      5. if misses, AP: set refresh=true (acknowledged with set OK)
      6. AP → DB: get cache
      7. DB → AP: return cache
      8. if hits, AP → client: return POIs
      9. if misses, AP → DB: calculate POIs
      10. DB → AP: return POI IDs
      11. AP: store cache in the DB and in Redis (each acknowledged with store OK)
      12. AP → client: return POIs
  105. Sequence diagram - refresher
  106-116. Sequence diagram - refresher (V3; participants: city IDs, refresher, Redis, DB)
      1. run
      2. refresher → DB: calculate POIs
      3. DB → refresher: return POI IDs
      4. refresher: store cache in the DB and in Redis (each acknowledged with store OK)
      5. refresher: set refresh=false (acknowledged with set OK)
      6. done
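Taken together, the V3 API path and the refresher trade strict freshness for speed: a request may be served slightly stale data from the DB while the city is flagged for recalculation in the background. The sketch below is one possible reading of the two diagrams, with dicts standing in for Redis and the DB. Where the refresh flag is stored is not stated on the slides, so the in-memory `refresh` set, like every name here, is an assumption.

```python
from typing import Callable, Dict, List, Set


class PoiStore:
    """Two-tier lookup with a refresh flag (hypothetical names throughout)."""

    def __init__(self, calculate: Callable[[str], List[int]]) -> None:
        self.redis: Dict[str, List[int]] = {}   # stand-in for Redis
        self.db: Dict[str, List[int]] = {}      # stand-in for the DB cache
        self.refresh: Set[str] = set()          # cities flagged refresh=true
        self.calculate = calculate

    def get_pois(self, city_id: str) -> List[int]:
        cached = self.redis.get(city_id)
        if cached is not None:                  # Redis hit: return POIs
            return cached
        self.refresh.add(city_id)               # Redis miss: set refresh=true
        stored = self.db.get(city_id)
        if stored is not None:                  # DB hit: return POIs as stored
            return stored
        poi_ids = self.calculate(city_id)       # DB miss: calculate POIs
        self.db[city_id] = poi_ids              # store cache (DB)
        self.redis[city_id] = poi_ids           # store cache (Redis)
        return poi_ids

    def run_refresher(self) -> List[str]:
        """Recalculate every flagged city, store it, then clear the flags."""
        refreshed = []
        for city_id in sorted(self.refresh):
            poi_ids = self.calculate(city_id)
            self.db[city_id] = poi_ids
            self.redis[city_id] = poi_ids
            refreshed.append(city_id)
        self.refresh.clear()                    # set refresh=false
        return refreshed


store = PoiStore(calculate=lambda city_id: [1, 2, 3])
store.db["tokyo"] = [1, 2]                  # older result already in the DB

stale = store.get_pois("tokyo")             # served from the DB, flagged
refreshed = store.run_refresher()           # background recalculation
fresh = store.get_pois("tokyo")             # now served from Redis
```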
  117. V3.1 (2020-09-23, early)
  118-119. Summary
      ● Store existing cache entries from Redis back to the DB
  120. Sequence diagram
  121-129. Sequence diagram (V3.1; participants: client, AP, Redis, DB)
      1. run
      2. get cache
      3. return cache
      4. if hits, get cache
      5. return cache
      6. if misses, store cache
      7. store OK
      8. done
  130. V3.2 (2020-09-23, mid)
  131-134. Summary
      ● POI history cache TTL changed from 1d to 14d
        – Balance diversity
        – Flatten burst activities
  135. Heatmap
  136. Heatmap
  137. V3.3 (2020-09-24, early)
  138-140. Summary
      ● Measure execution time
        – Add New Relic custom attribute
  141. New Relic custom attributes
  142-144. New Relic custom attributes
  145. V3.4 (2020-09-24, mid)
  146-149. Summary
      ● Add a result cache keyed by language code and city ID at the AP
      ● Measure execution time
        – Add New Relic custom segment
  150. Sequence diagram
  151-159. Sequence diagram (V3.4; participants: client, AP, LRU cache at AP, DB)
      1. AP → LRU cache: get results (with POI IDs, city ID, language)
      2. LRU cache → AP: return results
      3. if hits, AP → client: return POIs
      4. if misses, AP → DB: build results
      5. DB → AP: return results
      6. AP → LRU cache: store results
      7. LRU cache → AP: store OK
      8. AP → client: return POIs
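The V3.4 result cache can be sketched as a small in-process LRU map keyed by language code and city ID. This is an illustration built on `collections.OrderedDict`; the talk does not say which LRU implementation the AP uses, and `LruCache`, `get_results`, and `fake_build` are hypothetical names.

```python
from collections import OrderedDict
from typing import Callable, Dict, List, Optional, Tuple

Key = Tuple[str, str]             # (language code, city ID)
Result = List[Dict[str, str]]


class LruCache:
    """Least-recently-used cache with a fixed capacity."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[Key, Result]" = OrderedDict()

    def get(self, key: Key) -> Optional[Result]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)          # mark as most recently used
        return self._items[key]

    def put(self, key: Key, value: Result) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # evict the least recently used


def get_results(cache: LruCache, language: str, city_id: str,
                poi_ids: List[int],
                build: Callable[[List[int], str], Result]) -> Result:
    """Return cached results, or build them from the POI IDs and cache them."""
    key = (language, city_id)
    hit = cache.get(key)
    if hit is not None:                       # if hits, return POIs
        return hit
    results = build(poi_ids, language)        # if misses, build results
    cache.put(key, results)                   # store results
    return results


build_calls: List[Tuple[Tuple[int, ...], str]] = []


def fake_build(poi_ids: List[int], language: str) -> Result:
    build_calls.append((tuple(poi_ids), language))
    return [{"id": str(p), "lang": language} for p in poi_ids]


cache = LruCache(capacity=2)
a1 = get_results(cache, "zh-TW", "taipei", [1, 2], fake_build)   # miss
a2 = get_results(cache, "zh-TW", "taipei", [1, 2], fake_build)   # hit
get_results(cache, "ja", "taipei", [1, 2], fake_build)           # miss
get_results(cache, "en", "tokyo", [3], fake_build)               # miss, evicts zh-TW
```

Keeping the cache inside the AP process avoids a network round trip per request, at the cost of each AP instance holding its own copy.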
  160. New Relic custom segments
  161. Before custom segments
  162-163. After custom segments
  164. V3.5 (2020-09-24, late)
  165-169. Summary
      ● Merge join: 70s
        – where and group together
      ● Hash join: 10s
        – where first, then group
  170. Merge join
  171-175. Before optimization: 70s (https://explain.depesz.com/s/Om1c)
  176. Hash join
  177-181. After optimization: 10s (https://explain.depesz.com/s/C9yZ)
  182. V3.6 (2020-09-25, early)
  183-184. Summary
      ● Remove duplicate middleware: 1ms
  185. V3.7 (2020-09-27, early)
  186-188. Summary
      ● Find city from location via GiST index
      ● Query POI history via B-tree index
  189. GiST index
  190. Find city from location
  191-193. Before optimization: 400ms (https://explain.depesz.com/s/y5Vv)
  194. Create GiST index
  195-197. After optimization: 0.4ms (https://explain.depesz.com/s/QqmT)
  198. B-tree index
  199. Query POI history
  200-202. Before optimization: 10s (https://explain.depesz.com/s/7LHy)
  203. Create B-tree index
  204-206. After optimization: 1s (https://explain.depesz.com/s/o59U)
  207. References
      ● digoal/blog
      ● PostgreSQL cluster table using index
      ● 27. Nearest-Neighbour Searching - Introduction to PostGIS