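4. RQ2. How do academic papers and technical documents address the design of ML systems?
• Method: systematic literature review
– Academic paper meta-search engine: Engineering Village
– General search engine: Google
• Result: 19 academic papers (s1-s10 and a1-a9) and 19 technical documents (g1-g19)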
Engineering Village query:
((((system) OR (software)) AND (machine learning) AND (implementation pattern) OR (pattern) OR (architecture pattern) OR (design pattern) OR (anti-pattern) OR (recipe) OR (workflow) OR (practice) OR (issue) OR (template))) WN ALL) + ((cpx OR ins OR kna) WN DB) AND (({ca} OR {ja} OR {ip} OR {ch}) WN DT)
Google queries:
(system OR software) "Machine learning" (pattern OR "implementation pattern" OR "architecture pattern" OR "design pattern" OR anti-pattern OR recipe OR workflow OR practice OR issue OR template)
"machine implementation pattern" OR "architecture pattern" OR "design pattern" OR anti-pattern OR recipe OR workflow OR practice OR issue OR template
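The general-search query above follows a regular shape: a group of subject terms, a quoted topic phrase, and a group of pattern-related terms. The following is a minimal sketch of how such a query string could be assembled from keyword lists. The helper names `any_of` and `build_query` are hypothetical and are not part of the study; only the keywords themselves come from the slide.

```python
# Sketch: assemble the Google-style query from keyword lists.
# any_of and build_query are illustrative helpers, not the authors' tooling.

SUBJECTS = ["system", "software"]
PATTERN_TERMS = [
    "pattern", "implementation pattern", "architecture pattern",
    "design pattern", "anti-pattern", "recipe", "workflow",
    "practice", "issue", "template",
]


def any_of(terms):
    """Join alternatives with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(subjects, topic, pattern_terms):
    """Combine subject group, quoted topic phrase, and pattern-term group."""
    return f'{any_of(subjects)} "{topic}" {any_of(pattern_terms)}'


query = build_query(SUBJECTS, "Machine learning", PATTERN_TERMS)
print(query)
```

Keeping the keywords in lists makes it easier to rerun the search with a term added or removed and to record exactly which terms were used.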
6. Academic papers (s1-s10, a1-a9)
s1 Developing machine learning products better and faster at startups
s2 Hidden technical debt in machine learning systems
s3 Machine learning at Facebook: Understanding inference at the edge
s4 Continuous integration of machine learning models with ease.ml/ci: Towards a rigorous yet practical treatment
s5 Uncertainty in machine learning applications: A practice-driven classification of uncertainty
s6 Software architecture of a learning apprentice system in medical billing
s7 ClearTK 2.0: Design patterns for machine learning in UIMA
s8 Solution patterns for machine learning
s9 A survey on security threats and defensive techniques of machine learning: A data driven view
s10 Machine learning system architectural pattern for improving operational stability
a1 Trials and tribulations of developers of intelligent systems: A field study
a2 A methodology to involve domain experts and machine learning techniques in the design of human-centered algorithms
a3 Scaling distributed machine learning with the parameter server
a4 Machine learning software engineering in practice: An industrial case study
a5 Software engineering for machine learning: a case study
a6 Integrated machine learning in the Kepler scientific workflow system
a7 Deep convolutional neural network design patterns
a8 Practical Machine Learning
a9 Stream Analytics with Microsoft Azure: Real-time data processing for quick insights using Azure Stream Analytics
7. Technical documents (g1-g19)
g1 Scaling machine learning at Uber with Michelangelo
g2 Federated learning: Collaborative machine learning without centralized training data
g3 Design patterns for deep learning
g4 Design patterns for machine learning in production
g5 Patterns (and anti-patterns) for developing machine learning systems
g6 The MVC for machine learning: Data-Model-Learner (DML)
g7 Rules of machine learning: Best practices for ML engineering
g8 A design pattern for machine learning with Scala, Spray and Spark
g9 Closed-loop intelligence: A design pattern for machine learning
g10 A design pattern for explainability and reproducibility in production ML
g11 Top trends: Machine learning, microservices, containers, Kubernetes, cloud to edge. What are they and how do they fit together?
g12 Daisy architecture
g13 Event-driven architecture
g14 Demystifying data lake architecture
g15 Exploring development patterns in data science
g16 Architecture of data lake
g17 From insights to value - building a modern logical data lake to drive user adoption and business value
g18 Lambda architecture pattern
g19 Busting event-driven myths
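8. RQ3. How can ML architecture and design patterns be classified?
• Method: examination of various representative processes
• Result: classification by the ML pipeline process and by the ISO/IEC 12207:2008 software life cycle processes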
ML pipeline process (Microsoft)
S. Amershi, et al., “Software engineering for machine learning: a case study,” 41st International
Conference on Software Engineering: Software Engineering in Practice, ICSE (SEIP) 2019
Software Implementation Process (ISO/IEC 12207:2008): Requirements Analysis, Architectural Design, Detailed Design, Construction, Integration, Qualification Testing
10. ML architecture patterns, design patterns, and anti-patterns
ML architecture patterns
a04 Separation of Concerns and Modularization of ML Components
g02a Federated Learning
g05 Handshake or Hand Buzzer
g07a Test the infrastructure independently from the machine learning
g07b Reuse Code between Training Pipeline and Serving Pipeline
g08 Data-Algorithm-Serving-Evaluator
g09 Closed-Loop Intelligence
g10 Canary Model
g12 Daisy Architecture
g13 Event-driven ML Microservices
g15b Microservice Architecture
g16 Data Lake
g17 Kappa Architecture
g18 Lambda Architecture
s02d Design Holistically about Data Collection and Feature Extraction
s02f Reexamine Experimental Branches Periodically
s02h Parameter-Server Abstraction
s02j Descriptive Data Type for Rich Information
s03a Decouple Training Pipeline from Production Pipeline
s03b ML Versioning pattern
s10a Distinguish Business Logic from ML Models
s10b Gateway Routing Architecture
ML design patterns
s05 Isolate and Validate Output of Model
s02b Wrap Black-Box Packages into Common APIs
g02b Secure Aggregation
ML anti-patterns
s02l Undeclared Consumers
s02k Multiple-Language Smell
s02i Plain-Old-Data Type Smell
s02g Abstraction Debt
s02e Dead Experimental Codepaths
s02c Pipeline Jungles
s02a Glue Code
g15a Big Ass Script Architecture
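The catalog above can be kept as structured data so that patterns are easy to filter by group or trace back to their source document. The sketch below is illustrative only: the `Kind` and `Pattern` types and the `group_by_kind` helper are hypothetical names, only a subset of the entries is included, and the assumption that an identifier such as `g02a` refers to source document g2 is an inference from the numbering, not something the slides state.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    ARCHITECTURE = "ML architecture pattern"
    DESIGN = "ML design pattern"
    ANTI = "ML anti-pattern"


@dataclass(frozen=True)
class Pattern:
    pattern_id: str
    name: str
    kind: Kind

    @property
    def source_id(self):
        # Assumption: a trailing letter distinguishes several patterns
        # taken from one source document, e.g. "s02d" -> "s02".
        return self.pattern_id.rstrip("abcdefghijklmnopqrstuvwxyz")


# Subset of the slide's entries, grouped as on the slide.
CATALOG = [
    Pattern("a04", "Separation of Concerns and Modularization of ML Components", Kind.ARCHITECTURE),
    Pattern("g02a", "Federated Learning", Kind.ARCHITECTURE),
    Pattern("g18", "Lambda Architecture", Kind.ARCHITECTURE),
    Pattern("s05", "Isolate and Validate Output of Model", Kind.DESIGN),
    Pattern("s02b", "Wrap Black-Box Packages into Common APIs", Kind.DESIGN),
    Pattern("g02b", "Secure Aggregation", Kind.DESIGN),
    Pattern("s02a", "Glue Code", Kind.ANTI),
    Pattern("s02c", "Pipeline Jungles", Kind.ANTI),
]


def group_by_kind(patterns):
    """Return a dict mapping each Kind to the pattern ids in that group."""
    groups = defaultdict(list)
    for p in patterns:
        groups[p.kind].append(p.pattern_id)
    return dict(groups)


groups = group_by_kind(CATALOG)
for kind, ids in groups.items():
    print(kind.value, ids)
```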
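11. a04. Separation of Concerns and Modularization of ML Components
• Problem: ML components change more readily than other system components in response to changes in requirements and data.
• Solution: Modularize the ML components further so that concerns are separated and the components are easier to reuse.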
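One way to picture this pattern is to have the rest of the system depend only on a small interface, so that the ML component behind it can be retrained or replaced without touching other code. The following is a minimal sketch under that assumption. The names `Scorer`, `LinearScorer`, `ConstantScorer`, and `approve` are hypothetical and do not come from the cited paper.

```python
from typing import Protocol, Sequence


class Scorer(Protocol):
    """Interface the rest of the system relies on (hypothetical)."""

    def score(self, features: Sequence[float]) -> float: ...


class LinearScorer:
    """Stand-in ML component: a fixed linear model."""

    def __init__(self, weights):
        self.weights = list(weights)

    def score(self, features):
        return sum(w * x for w, x in zip(self.weights, features))


class ConstantScorer:
    """Alternative component, e.g. a fallback used while a model is retrained."""

    def __init__(self, value):
        self.value = value

    def score(self, features):
        return self.value


def approve(scorer: Scorer, features, threshold=0.5) -> bool:
    """Non-ML logic: knows the Scorer interface, not the model behind it."""
    return scorer.score(features) >= threshold


features = [1.0, 2.0]
print(approve(LinearScorer([0.1, 0.3]), features))  # score is about 0.7, so True
print(approve(ConstantScorer(0.2), features))       # 0.2 is below 0.5, so False
```

Because `approve` never imports a concrete model class, swapping `LinearScorer` for another implementation is a change local to the ML module, which is the kind of isolation the pattern aims for.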