
A Colorectal Cancer Recognition System for Colonoscopy

A Colorectal Cancer Recognition System for Colonoscopy @ Osaka University Seminar


  1. Title slide. Authors include Bisser Raytchev, and many others.
  2. http://www.sciencekids.co.nz/pictures/humanbody/braintomography.html
  3. http://www.sciencekids.co.nz/pictures/humanbody/heartsurfaceanatomy.html
  4. Image credits: http://sozai.rash.jp/medical/p/000154.html http://sozai.rash.jp/medical/p/000152.html http://medical.toykikaku.com/ http://www.sciencekids.co.nz/pictures/humanbody/humanorgans.html
  5. Imaging modalities (CT, …, NBI). Image credits: http://sozai.rash.jp/medical/p/000154.html http://sozai.rash.jp/medical/p/000152.html http://medical.toykikaku.com/
  6. Colorectal cancer in Japan: roughly 235,000 new cases†; 42,434 deaths per year†. Five-year survival depends strongly on stage (stage 1: about 100%)‡. [Charts: fatalities of colorectal cancer by year, '90 to '09; 5-year survival rate [%] for stages 1 to 4] † http://www.mhlw.go.jp/toukei/saikin/ ‡ http://www.gunma-cc.jp/sarukihan/seizonritu/index.html
  7. http://www.mhlw.go.jp/toukei/saikin/hw/jinkou/geppo/nengai11/kekka03.html#k3_2
  8. http://ameblo.jp/gomora16610/entry-10839715830.html http://daichou.com/ben.htm
  9. Colonoscopy: a CCD images the colon wall, and the endoscopist judges lesions visually ("I think this is a cancer…").
  10. http://www.ajinomoto-seiyaku.co.jp/newsrelease/2004/1217.html
  11. http://yotsuba-clinic.jp/WordPress/?p=63 http://www.oiya-clinic.jp/inform3.html https://www.youtube.com/watch?v=40L-y9rNOzw
  12. Capture ~Setup~: NBI endoscope, processing PC, scope.
  13. Light source (NBI), video processor, recorder, scope, connection port.
  14. (image slide)
  15. NBI (image slide)
  16. NBI (image slide)
  17. NBI (Narrow Band Imaging; one of several optical enhancement modes alongside AFI and IRI). Light path: xenon lamp, RGB rotary filter, and an NBI filter that narrows illumination to the 415 nm and 540 nm bands; light reflected from the mucosa is captured by the CCD and color-transformed in the video processor for the monitor. The light source unit switches the NBI filter ON/OFF to toggle between normal light and NBI. http://www.olympus.co.jp/jp/technology/technology/luceraelite/
  18. Diagnosis in practice: the endoscopist weighs several findings and chooses among candidate diagnoses ("I think this is a cancer…"). http://cancernavi.nikkeibp.co.jp/daicho/worry/post_2.html
  19. (same figure as slide 18, with additional annotations)
  20. Prior work on colorectal image classification. Texture analysis: Oh et al., MIA '07; Sundaram et al., MIA '08; Diaz & Rao, PRL '07; Al-Kadi, PR '10; Gunduz-Demir et al., MIA '10; Tosun et al., PR '09. Pit-pattern: Häfner et al., PAA '09; Häfner et al., ICPR '10; Häfner et al., PR '09; Kwitt & Uhl, ICCV '07; Tischendorf et al., Endoscopy '10. NBI: Stehle et al., MI '09; Gross et al., MI '08; PRMU '10; Tamaki et al., ACCV '10. http://cancernavi.nikkeibp.co.jp/daicho/worry/post_2.html
  21. Pit-pattern classification [S. Tanaka et al., '06]: pit shapes on the mucosal surface (types I to V, with IIIS / IIIL and VI / VN subtypes) indicate histology and invasion depth (m / sm).
  22. NBI magnifying findings (NBI: Narrow-band Imaging) [H. Kanao et al., '09]: microvessel patterns classify lesions into Type A, Type B, and Type C (subtypes 1 to 3), based on vessel visibility, dilation, irregularity, and avascular areas (AVA); the types correlate with invasion depth (sm).
  23. Texture analysis approach: Yoshito Takemura, Shigeto Yoshida, Shinji Tanaka, Keiichi Onji, Shiro Oka, Toru Tamaki, Kazufumi Kaneda, Masaharu Yoshihara, Kazuaki Chayama: "Quantitative analysis and development of a computer-aided system for identification of regular pit patterns of colorectal lesions," Gastrointestinal Endoscopy, Vol. 72, No. 5, pp. 1047-1051 (Nov. 2010).
  24. Bag-of-Visual-Words approach. Learning: local features are extracted from Type A / Type B / Type C3 training images, vector-quantized against a codebook in feature space, and each image is represented as a visual-word histogram on which a classifier is trained. Classification: a test image is converted to a histogram the same way and fed to the classifier. [Diagram: description of local features + bag-of-features pipeline; the number grids are sample descriptor values]
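A minimal sketch of this pipeline in Python with scikit-learn, for illustration only: the slides use gridSIFT features and the authors' own implementation, and the word count, helper names, and data layout below are assumptions.

```python
# Bag-of-Visual-Words sketch (hypothetical helpers; not the authors' code).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def build_codebook(descriptor_sets, n_words=1024, seed=0):
    """Cluster all training descriptors; cluster centers become the visual words."""
    all_desc = np.vstack(descriptor_sets)
    return KMeans(n_clusters=n_words, random_state=seed, n_init=3).fit(all_desc)

def to_histogram(km, descriptors):
    """Vector-quantize one image's descriptors into a normalized word histogram."""
    words = km.predict(descriptors)
    hist = np.bincount(words, minlength=km.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)

# train_desc: list of (n_i x 128) descriptor arrays; labels: 0=TypeA, 1=TypeB, 2=TypeC3
# km = build_codebook(train_desc)
# X = np.array([to_histogram(km, d) for d in train_desc])
# clf = SVC(kernel="linear").fit(X, labels)
```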
  25. Object → Bag of words. Slide by Li Fei-Fei at CVPR 2007 Tutorial: http://people.csail.mit.edu/torralba/shortCourseRLOC/
  26. Analogy to documents. Of all the sensory impressions proceeding to the brain, the visual experiences are the dominant ones. Our perception of the world around us is based essentially on the messages that reach the brain from our eyes. For a long time it was thought that the retinal image was transmitted point by point to visual centers in the brain; the cerebral cortex was a movie screen, so to speak, upon which the image in the eye was projected. Through the discoveries of Hubel and Wiesel we now know that behind the origin of the visual perception in the brain there is a considerably more complicated course of events. By following the visual impulses along their path to the various cell layers of the optical cortex, Hubel and Wiesel have been able to demonstrate that the message about the image falling on the retina undergoes a stepwise analysis in a system of nerve cells stored in columns. In this system each cell has its specific function and is responsible for a specific detail in the pattern of the retinal image. [Highlighted words: sensory, brain, visual, perception, retinal, cerebral cortex, eye, cell, optical nerve, image, Hubel, Wiesel] China is forecasting a trade surplus of $90bn (£51bn) to $100bn this year, a threefold increase on 2004's $32bn. The Commerce Ministry said the surplus would be created by a predicted 30% jump in exports to $750bn, compared with an 18% rise in imports to $660bn. The figures are likely to further annoy the US, which has long argued that China's exports are unfairly helped by a deliberately undervalued yuan. Beijing agrees the surplus is too high, but says the yuan is only one factor. Bank of China governor Zhou Xiaochuan said the country also needed to do more to boost domestic demand so more goods stayed within the country. China increased the value of the yuan against the dollar by 2.1% in July and permitted it to trade within a narrow band, but the US wants the yuan to be allowed to trade freely. However, Beijing has made it clear that it will take its time and tread carefully before allowing the yuan to rise further in value. [Highlighted words: China, trade, surplus, commerce, exports, imports, US, yuan, bank, domestic, foreign, increase, trade, value] Slide by Li Fei-Fei at CVPR 2007 Tutorial: http://people.csail.mit.edu/torralba/shortCourseRLOC/
  27. Slide by Li Fei-Fei at CVPR 2007 Tutorial: http://people.csail.mit.edu/torralba/shortCourseRLOC/
  28. The Bag-of-Visual-Words framework [Tamaki et al., 2013]: classifying image patches of lesions. Trained on 908 NBI images (Type A: 359, Type B: 462, Type C3: 87); Types C1 and C2 are excluded because they contain many ambiguous regions; best recognition rate 96%. Recognition flow: extract features from the training images; cluster them and take the representatives as visual words; build a visual-word histogram per image; train an SVM on the histograms; at test time, build the histogram of the input image and classify it (e.g. Type B vs. Type ?).
  29. Features: gridSIFT. Scale Invariant Feature Transform (SIFT) [Lowe, '99]: 128-dimensional descriptors computed at DoG keypoints. Here we use grid-sampled SIFT (gridSIFT): descriptors are computed at fixed grid points instead of DoG detections, with grid spacing and scale size as parameters.
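A sketch of grid-sampled SIFT with OpenCV (requires opencv-python with `cv2.SIFT_create`, roughly version 4.4 or later); the grid spacing and keypoint scale below are illustrative placeholders, not the values used in the experiments.

```python
# gridSIFT sketch: SIFT descriptors at fixed grid points instead of DoG keypoints.
import cv2
import numpy as np

def grid_sift(gray, grid_space=5, scale=7.0):
    """Return an (n_points x 128) array of SIFT descriptors on a regular grid."""
    sift = cv2.SIFT_create()
    h, w = gray.shape
    kps = [cv2.KeyPoint(float(x), float(y), scale)
           for y in range(0, h, grid_space)
           for x in range(0, w, grid_space)]
    _, desc = sift.compute(gray, kps)   # descriptors at the supplied keypoints
    return desc

# img = cv2.imread("patch.png", cv2.IMREAD_GRAYSCALE)
# desc = grid_sift(img)
```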
  30. Classifier: Support Vector Machine (SVM). Kernels compared: linear, radial basis function (RBF), and $\chi^2$:
      $k_{\mathrm{linear}}(u, v) = u \cdot v$
      $k_{\mathrm{RBF}}(u, v) = \exp\left(-\gamma \|u - v\|^2\right)$
      $k_{\chi^2}(u, v) = \exp\left(-\gamma \sum_i \frac{(u_i - v_i)^2}{u_i + v_i}\right)$
      Training maximizes the margin: $\min \frac{1}{2}\|w\|^2$ subject to $y_i\, w^\top \phi(x_i) \ge 1$. Multi-class strategy: one-versus-one.
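For instance, the $\chi^2$ kernel can be plugged into an SVM via scikit-learn's precomputed-kernel interface. This toy sketch uses random histograms and an assumed gamma, not the deck's data or settings.

```python
# chi-squared kernel SVM sketch with toy data.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import chi2_kernel

rng = np.random.default_rng(0)
X_train = rng.random((60, 32))              # toy nonnegative word histograms
y_train = rng.integers(0, 3, 60)            # toy labels for 3 classes
X_test = rng.random((10, 32))

K_train = chi2_kernel(X_train, gamma=0.5)   # k(u,v) = exp(-gamma * chi2 distance)
clf = SVC(kernel="precomputed").fit(K_train, y_train)   # multi-class: one-vs-one
pred = clf.predict(chi2_kernel(X_test, X_train, gamma=0.5))
```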
  31. Dataset: rectangular patches trimmed from lesion regions, ranging from about 100×300 to 900×800 [pix.]; 908 images in total (Type A: 359, Type B: 462, Type C3: 87), with example patches shown for each type.
  32. Results <10-fold Cross Validation>: best correct rate 96.00%. [Plots: correct rate, per-type recall, and per-type precision vs. number of visual words, 10 to 100,000]
  33. Results <Holdout Testing>: best correct rate 92.86%. [Plots: correct rate, per-type recall, and per-type precision vs. number of visual words, 10 to 100,000]
  34. Motivation: toward Type A / B / C3 recognition in NBI endoscopy.
  35. Abstract. Key idea: improve the classifier by self-training on unlabeled samples [Yoshimuta et al., '10].
  36. Self-training: classify unlabeled samples with the current model, then Accept confident predictions into the training set and Reject the rest; retrain and repeat (a minimal sketch follows).
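A minimal self-training loop sketched in Python, assuming a probabilistic SVM and a fixed confidence threshold; the deck does not spell out its accept/reject criterion, so the threshold rule, round count, and helper name here are assumptions.

```python
# Self-training sketch: accept high-confidence pseudo-labels, reject the rest.
import numpy as np
from sklearn.svm import SVC

def self_train(X_lab, y_lab, X_unlab, threshold=0.9, rounds=5):
    X, y = X_lab.copy(), y_lab.copy()
    pool = X_unlab.copy()
    clf = None
    for _ in range(rounds):
        clf = SVC(kernel="linear", probability=True).fit(X, y)
        if len(pool) == 0:
            break
        proba = conf = clf.predict_proba(pool)
        conf = proba.max(axis=1)
        accept = conf >= threshold                 # Accept: confident predictions
        if not accept.any():
            break                                  # Reject everything -> stop early
        X = np.vstack([X, pool[accept]])
        y = np.concatenate([y, clf.classes_[proba[accept].argmax(axis=1)]])
        pool = pool[~accept]                       # remaining unlabeled pool
    return clf
```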
  37. Labeled samples: trimmed patches, about 100×300 to 900×800 [pix.].
      Type A: 359, Type B: 462, Type C3: 87, Total: 908
  38. Unlabeled samples: about 10× the labeled set, trimmed patches from 30×30 to 250×250 [pix.].
      Type A: 3590, Type B: 4610, Type C3: 870, Total: 9070
  39. Result: recognition rates of Algorithm 1, Algorithm 2, and Algorithm 3 compared (* p = 0.013314). [Bar chart: recognition rate, 0.90 to 0.96]
  40. (Recap of slide 28) The Bag-of-Visual-Words framework [Tamaki et al., 2013]: 908 NBI training images (Type A: 359, Type B: 462, Type C3: 87); Types C1/C2 excluded as ambiguous; best recognition rate 96%; feature extraction, clustering into visual words, visual-word histograms, SVM training and classification.
  41. Grid spacing (training set: 908 NBI images; Type A: 359, Type B: 462, Type C3: 87). Extracting more features raises the recognition rate [Jurie et al., 2005]; narrowing the sampling grid confirmed this [Yoshimuta et al., 2011]:
      Grid spacing: 15 [pixel] / 10 [pixel] / 5 [pixel]
      Best recognition rate: 92.11 [%] / 93.89 [%] / 96.00 [%]
      Training time: about 13 min / about 30 min / about 3 h
      Spacing ×2/3 gives 2.25× the features (+1.78%); spacing ×1/2 gives 4× the features (+2.11%). Problem: training time grows with the number of features.
  42. Conventional codebook construction. 1. Extract features on a grid from all training images $\mathcal{I} = \{I_n \mid n = 1, \dots, N\}$. 2. Cluster the extracted features in feature space; the centers become the visual words. 3. Extract grid features from each training image $I_n \in \mathcal{I}$. 4. Vector-quantize them into the visual-word histogram.
  43. Proposed (codebook from a subsample). 1. Extract a small number of features from all training images. 2. Cluster only that small set to form the visual words. 3. Extract many grid features from each training image. 4. Vector-quantize them into the visual-word histogram. (A sketch follows below.)
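A sketch of this codebook-from-a-subsample idea, assuming MiniBatchKMeans and an arbitrary subsample size (the experiment on slide 44 uses 19,742 of 8,678,198 features; the numbers below are placeholders).

```python
# Build the codebook from a random descriptor subsample; quantize densely later.
import numpy as np
from sklearn.cluster import MiniBatchKMeans

def codebook_from_subsample(all_desc, n_words=1024, n_sample=20000, seed=0):
    """Cluster a small random subset of the stacked descriptors into visual words."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(all_desc), size=min(n_sample, len(all_desc)), replace=False)
    return MiniBatchKMeans(n_clusters=n_words, random_state=seed).fit(all_desc[idx])

# Histograms are then built from the *full* dense descriptor set of each image,
# e.g. with the to_histogram() helper sketched after slide 24.
```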
  44. Experiment: confirm the training-time reduction and the recognition-rate improvement. Codebook construction uses fewer features (19,742 instead of 8,678,198); histogram construction uses more features (grid spacing 5, 2, and 1 [pixel]). Classifier: linear SVM. Training images: 908 labeled NBI images (Type A: 359, Type B: 462, Type C3: 87). Environment: OS Linux Fedora 18, CPU Intel Xeon E5-2620, memory 128 GB. Measurements: total training time vs. number of features; recognition rate vs. grid spacing.
  45. Training time (visual words: 32). Conventional method (spacing 5): 10233.45 s; proposed method (spacing 5): 680.72 s (6.6%); proposed (spacing 2): 4167.89 s (40.7%); proposed (spacing 1): 16471.8 s (160.9%). Training time is reduced at spacings of 5 and 2 [pixel], but increases at 1 [pixel].
  46. Correct rate vs. number of visual words (32 to 16384) for the conventional method (spacing 5) and the proposed method (spacings 5, 2, 1). [Plot: correct rate 0.80 to 0.98] There is a clear gap between spacing 5 and spacings 2/1, but little difference between spacings 2 and 1.
  47. Problem: old and new endoscopes coexist. Old endoscopy (EVIS LUCERA) vs. new endoscopy (EVIS LUCERA ELITE): viewing angle 140° (WIDE) / 80° (TELE) vs. 170° (WIDE) / 90° (TELE); resolution 1440×1080 vs. 1980×1080. Different optics produce different images and hence different feature distributions; the new endoscope is wider-angle, higher-resolution, and brighter, so recognition performance drops on the new scope, and old and new training images cannot simply be mixed (recognition assumes the same distribution at training and test time). Collecting new-endoscope training images is difficult: cancer patients are not numerous, images can only be captured during examinations, only physicians can label them, and the latest device has just appeared, so we are in a transition period.
  48. http://www.olympus.co.jp/jp/technology/technology/luceraelite/
  49. Objective. Training and testing on the old endoscope works, but training on the old and testing on the new lowers the recognition rate. Solution: convert new-endoscope features into old-endoscope features and train (a transfer-learning framework); the two image domains are related, so a feature transform between them should exist.
  50. Related work: Adapting Visual Category Models to New Domains [Saenko et al., ECCV 2010]. Their setting recognizes Source ($x$) and Target ($y$) jointly; ours recognizes only the Target. They learn a metric $A$ with class-driven constraints: for each class, $(x_i - y_j)^\top A (x_i - y_j) \le$ upper bound for same-class pairs and $(x_i - y_j)^\top A (x_i - y_j) \ge$ lower bound for different-class pairs; this introduces hyperparameters that must be tuned. Our approach instead finds a matrix $W$ that maps Target to Source, $\arg\min_W \|x - W y\|_F^2$, and has no hyperparameters.
  51. Converting histograms. 1. Treat the visual-word histograms as vectors and stack them into matrices $X = [x_1, \cdots, x_N]$ (Source) and $Y = [y_1, \cdots, y_N]$ (Target). 2. Find the transform minimizing the histogram error with ADMM:
      $\arg\min_W \|X - W Y\|_F^2$ subject to $W_{ij} \ge 0$,
      solved row by row (for each row $n = 1, \dots, N$) by iterating the dual updates
      $W_n^{k+1} = \left(\sum_{n=1}^N y_n y_n^\top + \rho E\right)^{-1}\left(\sum_{n=1}^N x_n y_n^\top + \rho\,(z_n^k - u_n^k)\right)$
      $z_n^{k+1} = \pi_C(W_n^{k+1} + u_n^k)$
      $u_n^{k+1} = u_n^k + W_n^{k+1} - z_n^{k+1}$
      where $\pi_C$ projects onto the nonnegativity constraint set.
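A numpy sketch of the row-wise ADMM iteration reconstructed above, as generic nonnegative least squares; $\rho$ and the iteration count are assumptions, and the $W$-update completes the part truncated on the slide.

```python
# ADMM sketch for one row of the nonnegative transform W in
# argmin_W ||X - W Y||_F^2 s.t. W_ij >= 0.
import numpy as np

def nnls_row_admm(Y, x_row, rho=1.0, iters=100):
    """min_w ||x_row - w @ Y||^2 s.t. w >= 0, via w = z splitting."""
    V = Y.shape[0]
    A = np.linalg.inv(Y @ Y.T + rho * np.eye(V))   # factor once, reuse each iter
    w = np.zeros(V); z = np.zeros(V); u = np.zeros(V)
    for _ in range(iters):
        w = A @ (Y @ x_row + rho * (z - u))        # quadratic subproblem
        z = np.maximum(0.0, w + u)                 # projection pi_C onto w >= 0
        u = u + w - z                              # dual update
    return z                                       # feasible (nonnegative) iterate

# X, Y: (V x N) matrices whose columns are old/new-endoscope histograms
# W = np.vstack([nnls_row_admm(Y, x) for x in X])  # one ADMM solve per row of W
```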
  52. How to make a pseudo dataset. Since the new endoscope is assumed to look sharper and more vivid, apply to Source images: ① contrast enhancement (a linear stretch mapping input [42, 213] to output [0, 255]) and ② a 3×3 sharpening filter (center weight 25/9, surrounding weights in ninths). Limitation: this approach requires corresponding training-image pairs, which are hard to obtain in practice.
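A sketch of the pseudo-pair synthesis, assuming the stretch range [42, 213] from the slide; the sign pattern of the 3×3 kernel did not survive extraction, so a conventional sharpening form (center 25/9, surround -2/9, summing to 1) is assumed.

```python
# Pseudo "new endoscope" synthesis: contrast stretch + 3x3 sharpening filter.
import cv2
import numpy as np

def make_pseudo_new(img, lo=42, hi=213):
    """Stretch input range [lo, hi] to [0, 255], then sharpen."""
    stretched = np.clip((img.astype(np.float32) - lo) * 255.0 / (hi - lo), 0, 255)
    kernel = np.full((3, 3), -2.0 / 9.0, dtype=np.float32)  # assumed surround sign
    kernel[1, 1] = 25.0 / 9.0                               # center weight per slide
    sharpened = cv2.filter2D(stretched, -1, kernel)
    return np.clip(sharpened, 0, 255).astype(np.uint8)
```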
  53. Result: with the transfer, recognition on the new endoscope is almost the same as on the old one. Settings compared (Training / Test): Source / Source, Source / Target, Source+Target / Target, Source+transferred Target / Target.
  54. Related works. Cross-Domain Transform [Saenko et al., ECCV 2010]:
      $\min \operatorname{tr}(W) - \log\det W$ s.t. $W \succeq 0$,
      $\|x_i^s - x_j^t\|_W \le$ upper bound for $(x_i^s, x_j^t)$ in the same class,
      $\|x_i^s - x_j^t\|_W \ge$ lower bound for $(x_i^s, x_j^t)$ in different classes.
      Estimates the transformation minimizing a Mahalanobis distance; considers only the transformed feature distributions; does not ensure the classification result.
      Max-Margin Domain Transfer (MMDT) [Hoffman et al., ICLR 2013]:
      $\min_{W,\theta,b} \frac{1}{2}\|W\|_F^2 + \frac{1}{2}\sum_{k=1}^K \|\theta_k\|_2^2 + C_s \sum_{i=1}^n \sum_{k=1}^K \xi_{i,k}^s + C_t \sum_{j=1}^m \sum_{k=1}^K \xi_{j,k}^t$
      s.t. $y_{i,k}^s(\theta_k^\top x_i^s - b_k) \ge 1 - \xi_{i,k}^s$, $y_{j,k}^t\, \theta_k^\top W x_j^t \ge 1 - \xi_{j,k}^t$, $\xi_{i,k}^s \ge 0$, $\xi_{j,k}^t \ge 0$.
      Optimizes the transformation matrix and the SVM parameters at the same time; ensures the classification result; does not guarantee the transformed feature distributions. ($W$: transform matrix; $\theta_k$: SVM parameter; $\xi^s, \xi^t$: slack variables; $y_{i,k}$: indicator function.)
  55. Proposed method: Max-Margin Domain Transfer with L2 Distance Constraints (MMDTL2). Add L2 distance constraints to MMDT, so that both the classification result and the transformed feature distributions are ensured:
      $\min_{W,\theta,b} \frac{1}{2}\|W\|_F^2 + \frac{1}{2}\sum_{k=1}^K \|\theta_k\|_2^2 + C_s \sum_{i=1}^n \sum_{k=1}^K \xi_{i,k}^s + C_t \sum_{j=1}^m \sum_{k=1}^K \xi_{j,k}^t + \frac{1}{2} D \sum_{i=1}^M \sum_{j=1}^N y_{i,j}\, \|W x_i^t - x_j^s\|_2^2$
      s.t. $y_{i,k}^s(\theta_k^\top x_i^s - b_k) \ge 1 - \xi_{i,k}^s$, $y_{j,k}^t\, \theta_k^\top W x_j^t \ge 1 - \xi_{j,k}^t$, $\xi_{i,k}^s \ge 0$, $\xi_{j,k}^t \ge 0$,
      where the last term pulls transformed target samples toward same-class source samples.
  56. Decomposition into sub-problems. Hoffman et al. decompose the MMDT objective into two sub-problems; our method does the same, and the objective is optimized by iterating (1) and (2):
      (1) SVM parameters: $\min_{\theta,\xi^s,\xi^t} \frac{1}{2}\sum_{k=1}^K \|\theta_k\|_2^2 + C_s \sum_{i=1}^N \sum_{k=1}^K \xi_{i,k}^s + C_t \sum_{j=1}^M \sum_{k=1}^K \xi_{j,k}^t$
      (2) transform matrix: $\min_{W,\xi^t} \frac{1}{2}\|W\|_F^2 + C_t \sum_{j=1}^M \sum_{k=1}^K \xi_{j,k}^t + \frac{1}{2} D \sum_{i=1}^M \sum_{j=1}^M y_{i,j}\, \|W x_i^t - x_j^s\|_2^2$
      both subject to $y_{i,k}^s(\theta_k^\top x_i^s - b_k) \ge 1 - \xi_{i,k}^s$, $y_{j,k}^t\, \theta_k^\top W x_j^t \ge 1 - \xi_{j,k}^t$, $\xi_{i,k}^s \ge 0$, $\xi_{j,k}^t \ge 0$.
  57. Primal problem. With $U(x) = \operatorname{blockdiag}(x x^\top, \dots, x x^\top)$, $v_{i,j} = \operatorname{vec}(x_j^s (x_i^t)^\top)$, $w = \operatorname{vec}(W)$, and $\phi(x) = \operatorname{vec}(\theta x^\top)$, sub-problem (2) becomes
      $\min_{w,\xi^t} \frac{1}{2}\|w\|_2^2 + C_t \sum_{j=1}^M \sum_{k=1}^K \xi_{j,k}^t + \frac{1}{2} D \sum_{i=1}^M \sum_{j=1}^M y_{i,j}\left( w^\top U(x_i^t)\, w - 2 v_{i,j}^\top w + (x_i^t)^\top x_j^s \right)$
      s.t. $\xi_i^t \ge 0$, $y_{i,k}^t\, \phi_k(x_i^t)^\top w \ge 1 - \xi_{i,k}^t$.
      This is a standard quadratic program, but it has high computational cost, needs huge memory, and depends on the data dimension; hence we derive the dual.
  58. Dual problem ($a_i$: Lagrange multipliers). The dual of (2),
      $\max_a -\frac{1}{2}\sum_{k_1=1}^K \sum_{k_2=1}^K \sum_{i=1}^M \sum_{j=1}^M a_i a_j\, y_{i,k_1}^t y_{j,k_2}^t\, \phi_{k_1}(x_i^t)^\top V^{-1} \phi_{k_2}(x_j^t) + \sum_{k=1}^K \sum_{i=1}^M a_i \left(1 - D\, \phi_k(x_i^t)^\top V^{-1} \sum_{m=1}^M \sum_{n=1}^N y_{m,n} v_{m,n}\right)$
      s.t. $0 \le a_i \le C_T$, $\sum_{i=1}^M a_i y_{i,k}^t = 0$, with $V = I + D \sum_{i=1}^M \sum_{j=1}^N y_{i,j}\, U(x_i^t)$,
      has many advantages: low computational cost, a sparse problem structure, and a size that depends on the number of target samples rather than the dimension.
  59. Comparison of primal vs. dual computation time (visual words: 128). SetupTime: computing coefficients (e.g. $U(x)$ and $v_{i,j}$); OptimizationTime: solving the quadratic program; CalculationTime: recovering $w$ from $a$ (dual only). [Bar chart: 0 to 7000 s] The dual is about 14 times faster.
  60. Result: MMDTL2 performs on par with the baseline, but "Not transfer" is the best performer. [Plot: recognition rate (0.4 to 1.0) vs. number of visual words (8 to 1024) for Baseline, Source only, Not transfer, MMDT, MMDTL2]
  61. Online recognition at 14.7 [fps]. Each frame: extract SIFT features from a 120 [pix.] × 120 [pix.] window, build the visual-word histogram, and classify with the SVM into A / B / C3, outputting per-class probabilities.
  62. [Plot: class probabilities (A, B, C3) over time, 0 to 1]
  63. Objective: connect the processing PC to the NBI endoscope and enable online recognition. System configuration: OLYMPUS EVIS LUCERA ELITE NBI endoscope*1, connected via SDI to a Blackmagic DeckLink SDI capture board*2, connected via PCI Express to the processing PC. Development environment: Visual Studio 2012 (retail), OpenCV 3.0-devel, VLFeat 0.9.18, Boost 1.55.0, DeckLink SDK 10.0; OS: Windows 7 Home Premium SP1 64-bit; CPU: Intel Core i7-4470 3.40 GHz; memory: 16 GB. *1 http://www.olympus.co.jp/jp/news/2012b/nr121002luceraj.jsp *2 http://www.genkosha.com/vs/news/entry/sdi.html
  64. Capture ~Setup~: NBI endoscope, processing PC, NBI scope.
  65. Capture ~demo & performance~: NBI endoscope screen vs. processing-PC screen; color conversion and feature extraction stages.
  66. [Plot: class probabilities (A, B, C3) over time, 0 to 1]
  67. [Plot: probability vs. frame number (0 to 200) for Type A / Type B / Type C3; the per-frame results fluctuate between Type A and Type B]
  68. Smoothing with an MRF/HMM over the frame sequence:
      $f(x, y) \propto \exp\left(\sum_i A(x_i, y_i)\right) \cdot \exp\left(\sum_i \sum_{j \in N_i} I(x_i, x_j)\right)$
      where $y$ are the per-frame SVM outputs and $x$ the smoothed labels on a chain $x_1, \dots, x_{200}$ (frames $i = 0, 50, 100, 150, 200$ shown with labels such as B and C3).
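One way to realize this chain-MRF smoothing is Viterbi dynamic programming over the per-frame SVM probabilities; the sketch below assumes a symmetric transition matrix parameterized by a single self-transition probability (such as the DP_0.99 setting in the next slide), which is only one possible choice of $I(x_i, x_j)$.

```python
# Viterbi smoothing sketch: A(x_i, y_i) = log p(y_i | x_i); I rewards label agreement.
import numpy as np

def viterbi_smooth(obs_prob, p_stay=0.99):
    """obs_prob: (T x K) per-frame class probabilities; returns the MAP label path."""
    T, K = obs_prob.shape
    trans = np.full((K, K), np.log((1 - p_stay) / (K - 1)))  # off-diagonal switches
    np.fill_diagonal(trans, np.log(p_stay))                  # self-transitions
    logp = np.log(obs_prob + 1e-12)
    score = logp[0].copy()
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + trans          # (prev x cur) path scores
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + logp[t]
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):              # backtrack the best path
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```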
  69. [Plots: Type A and Type B sequences (frame number 0 to 200), original vs. DP smoothing with self-transition probability 0.8 / 0.9 / 0.99 / 0.999 and Gibbs sampling with p = 0.6 / 0.7 / 0.8 / 0.9; MAP estimates suppress spurious C3 detections]
  70. [Plot: MRF-smoothed probabilities for Type A / Type B / Type C3 vs. frame number (0 to 200)]
  71. (Example frames: Type A / Type B / Type C3)
  72. Colorectal Tumor Classification System in Magnifying Endoscopic NBI Images [Tamaki et al., MedIA 2013]. Recognizing colorectal images: features are a Bag-of-Visual-Words of densely sampled SIFT; classifier is a linear SVM. Extended to video frames, displaying posterior probabilities at each frame. [Plot: probability vs. frame number for classes A / B / C] The frame-by-frame classification results are highly unstable.
  73. Possible cause of instability: classification results would be affected by out-of-focus frames. Experiment: 480 training images (160 per class); 1191 test images with Gaussian blur of varying SD added. [Plot: recognition rate vs. number of visual words for no defocus and SD = 0.5 to 11] Larger SD (stronger blur) lowers the recognition rate.
  74. Particle Filter (online Bayesian filtering). State vector $x_t = (x_t^{(A)}, x_t^{(B)}, x_t^{(C3)})$ with $x_t^{(A)} + x_t^{(B)} + x_t^{(C3)} = 1$; observation vector $y_t = (y_t^{(A)}, y_t^{(B)}, y_t^{(C3)})$ with $y_t^{(A)} + y_t^{(B)} + y_t^{(C3)} = 1$; $t$: time.
      Prediction (state transition): $p(x_t \mid y_{1:t-1}) = \int p(x_t \mid x_{t-1}, \theta_1)\, p(x_{t-1} \mid y_{1:t-1})\, dx_{t-1}$
      Update (likelihood): $p(x_t \mid y_{1:t}) \propto p(y_t \mid x_t, \theta_2)\, p(x_t \mid y_{1:t-1})$
      We use the Dirichlet distribution for both the state transition and the likelihood.
  75. Dirichlet distribution: $\mathrm{Dir}_x[\alpha] = \frac{\Gamma(\sum_{i=1}^N \alpha_i)}{\prod_{i=1}^N \Gamma(\alpha_i)} \prod_{i=1}^N x_i^{\alpha_i - 1}$, with concentration parameter $\alpha$ set as $\alpha(x) = a x + b$. [Simplex plots for $\alpha$ = (0.50, 0.50, 0.50), (0.85, 1.50, 2.00), (1.00, 1.00, 1.00), (1.00, 1.76, 2.35), (4.00, 4.00, 4.00), (3.40, 6.00, 8.00); density low to high]
  76. Problem & our approach: from the Dirichlet Particle Filter (DPF) to the Defocus-aware Dirichlet Particle Filter (D-DPF), whose graphical model adds a defocus measurement $z_t$ tied to a latent blur level $\gamma_t$.
      Prediction: $p(x_t \mid y_{1:t-1}, \gamma_{1:t-1}, z_{1:t-1}) = \int p(x_t \mid x_{t-1}, \theta_1)\, p(x_{t-1} \mid y_{1:t-1}, \gamma_{1:t-1}, z_{1:t-1})\, dx_{t-1}$ (state transition)
      Update: $p(x_t \mid y_{1:t}, \gamma_{1:t}, z_{1:t}) \propto p(y_t, \gamma_t, z_t \mid x_t)\, p(x_t \mid y_{1:t-1}, \gamma_{1:t-1}, z_{1:t-1})$
      Likelihood: $p(y_t, \gamma_t, z_t \mid x_t) = p(y_t \mid x_t, \gamma_t)\, p(z_t \mid \gamma_t)$
  77. Isolated Pixel Ratio (IPR) [Oh et al., MedIA 2007]: run a Canny edge detector on the endoscopic image; an isolated pixel is an edge pixel with no neighboring edge pixels. Clear frames yield connected edges, while defocused frames yield isolated pixels; the IPR is the percentage of isolated pixels among all edge pixels.
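A sketch of the IPR computation with OpenCV, assuming illustrative Canny thresholds and treating "isolated" as an edge pixel with no edge pixel among its 8 neighbors.

```python
# Isolated Pixel Ratio (IPR) sketch after Oh et al., MedIA 2007.
import cv2
import numpy as np

def isolated_pixel_ratio(gray, t1=50, t2=150):
    """Fraction of edge pixels that have no edge pixel among their 8 neighbors."""
    edges = (cv2.Canny(gray, t1, t2) > 0).astype(np.uint8)
    kernel = np.ones((3, 3), dtype=np.float32)
    kernel[1, 1] = 0.0                              # count 8-neighbors only
    neighbor_cnt = cv2.filter2D(edges, -1, kernel)  # edge neighbors per pixel
    n_edge = edges.sum()
    if n_edge == 0:
        return 0.0
    isolated = np.logical_and(edges == 1, neighbor_cnt == 0).sum()
    return isolated / float(n_edge)
```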
  78. Modeling defocus with a Rayleigh distribution and the IPR: $\mathrm{Ray}_x[\sigma] = \frac{x}{\sigma^2} \exp\left(-\frac{x^2}{2\sigma^2}\right)$, with $\sigma(z_t) = 4 \exp(100 \log(0.25)\, z_t)$ and $p(z_t \mid \gamma_t) = \mathrm{Ray}_{\gamma_t}[\sigma(z_t)]$. [Plots: IPR histograms for clear vs. defocused frames (0 to 0.015); $\sigma(z_t)$ vs. $z_t$; Dirichlet densities for blur sigma = 0.5 to 4]
  79. Sequential filtering: prediction and update as on slide 76, with the model terms
      $p(x_t \mid x_{t-1}, \theta_1) = \mathrm{Dir}_{x_t}[\alpha_1(x_{t-1}, \theta_1)]$
      $p(y_t \mid x_t, \gamma_t) = \mathrm{Dir}_{x_t}[\alpha_2(y_t, \gamma_t)]$
      $p(z_t \mid \gamma_t) = \mathrm{Ray}_{\gamma_t}[\sigma(z_t)]$
      so the likelihood factors as $p(y_t, \gamma_t, z_t \mid x_t) = p(y_t \mid x_t, \gamma_t)\, p(z_t \mid \gamma_t)$.
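A toy particle-filter sketch of one DPF step with Dirichlet transition and likelihood, using the $\alpha(x) = a x + b$ parameterization from slide 75; the $a$, $b$ values and particle count are assumptions, and the full D-DPF would additionally weight particles by the Rayleigh defocus term $p(z_t \mid \gamma_t)$.

```python
# Dirichlet particle filter step: predict, weight by likelihood, resample.
import numpy as np
from scipy.stats import dirichlet

def dpf_step(particles, weights, y_t, a1=50.0, b1=1.0, a2=20.0, b2=1.0, rng=None):
    """One step on (n_particles x 3) simplex states given observation y_t."""
    rng = rng if rng is not None else np.random.default_rng()
    y = np.clip(y_t, 1e-6, None); y = y / y.sum()   # keep strictly inside simplex
    # Prediction: Dirichlet state transition with concentration a1 * x + b1
    prop = np.array([rng.dirichlet(a1 * p + b1) for p in particles])
    # Update: Dirichlet likelihood of the observed class probabilities
    w = weights * np.array([dirichlet.pdf(y, a2 * p + b2) for p in prop])
    w = (w + 1e-300) / (w + 1e-300).sum()           # guard against total underflow
    # Resample (multinomial, for brevity; systematic resampling also common)
    idx = rng.choice(len(prop), size=len(prop), p=w)
    return prop[idx], np.full(len(prop), 1.0 / len(prop))

# particles = np.random.default_rng(0).dirichlet(np.ones(3), size=200)
# weights = np.full(200, 1.0 / 200)
# x_est = (weights[:, None] * particles).sum(axis=0)  # posterior-mean estimate
```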
  80. The performance for defocused frames. [Plots vs. frame number (0 to 600): ground truth, raw observations, IPR, result by DPF, result by D-DPF]
  81. Smoothing result for an actual NBI video: no-smoothing vs. smoothing, with Type A / Type B / Type C3 probabilities.
  82. Summary: recognition of colorectal lesions in NBI endoscopy. Baseline: SIFT + Bag-of-Visual-Words. Improving the classifier: self-training with unlabeled samples; feature sampling for codebook construction; domain adaptation / transfer learning between old and new endoscopes. Video / online recognition: temporal smoothing with MRF/HMM; defocus-aware Dirichlet particle filtering.
