What Is a TensorFlow Lite Delegate?

Created: 2019.11.17
@Vengineer

As the tweet on the left shows, "TensorFlow Lite Delegate" drew the most interest, so I am publishing these slides.
Note that this material is based on the TensorFlow r2.0 source code.

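What is TensorFlow Lite?
TensorFlow Lite is an open-source deep learning framework that enables
on-device inference (mobile / embedded / IoT) with TensorFlow models.
Running inference on the device has several benefits:
● Latency: no data is sent to a server
● Privacy: data never has to leave the device
● Connectivity: no Internet connection is required
● Power consumption: network connections are power-hungry
TensorFlow Lite for Microcontrollers is not covered in this talk.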

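Steps for using TensorFlow Lite
● Pick a model: train your own or use a pre-trained one
○ Training can run in the cloud: Google Colaboratory (GPU and TPU are available)
○ Plenty of pre-trained models are available
● Convert: TensorFlow => TensorFlow Lite
○ TensorFlow Lite Converter
● Optimize
○ Quantization (8-bit int, 16-bit float), delegates
● Deploy: a dedicated inference engine
○ TensorFlow Lite Interpreter
The TensorFlow Lite guide covers all of this (parts of it are translated into Japanese).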
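The 8-bit integer quantization listed under the optimization step can be illustrated with a small sketch. It assumes the affine scheme real_value = (quantized_value - zero_point) * scale. The function names `quantize` and `dequantize` and the sample parameters are chosen for illustration only and are not TensorFlow Lite APIs.

```python
def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a float to an int8 value: q = round(x / scale) + zero_point, clamped."""
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Recover an approximate float: x ~= (q - zero_point) * scale."""
    return (q - zero_point) * scale


if __name__ == "__main__":
    # Illustrative parameters, not taken from a real model.
    scale, zero_point = 0.5, 0
    for x in (0.0, 1.0, -3.0, 1000.0):
        q = quantize(x, scale, zero_point)
        print(x, "->", q, "->", dequantize(q, scale, zero_point))
```

Values outside the representable range saturate at the int8 limits, which is why the last sample does not round-trip.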

TensorFlow Lite Delegates
The Delegate API is still experimental and subject to change.
A delegate hands off (delegates) part or all of a TensorFlow Lite graph
to an executor other than the CPU.
● GPU: OpenGL ES v3.1 / OpenCL / Metal (iOS)
● AI accelerators: Google Edge TPU / Google Pixel Neural Core
● Android NN API: v1.0 (Android 8.1) / v1.1 (Android 9.0) / v1.2 (Android 10.0)
● Flex: TensorFlow Select
● Others: Arm NN (PR#33436)
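To make "part or all of the graph" concrete, here is a deliberately simplified sketch. It treats the graph as a linear list of op names and groups consecutive ops by whether a delegate supports them. The function `partition` and the op lists are hypothetical; real TensorFlow Lite graphs are not linear lists, and this is not the partitioning algorithm TensorFlow Lite itself uses.

```python
from itertools import groupby


def partition(ops, supported):
    """Split ops (in execution order) into contiguous runs per executor.

    Returns a list of (executor, [op, ...]) tuples, where executor is
    "delegate" when the ops in the run are in `supported`, else "cpu".
    """
    runs = groupby(ops, key=lambda op: "delegate" if op in supported else "cpu")
    return [(executor, list(group)) for executor, group in runs]


if __name__ == "__main__":
    # Hypothetical graph and delegate capabilities.
    graph = ["CONV_2D", "DEPTHWISE_CONV_2D", "CUSTOM_OP", "SOFTMAX"]
    supported_by_delegate = {"CONV_2D", "DEPTHWISE_CONV_2D", "SOFTMAX"}
    for executor, run in partition(graph, supported_by_delegate):
        print(executor, run)
```

When every op is supported the result is a single delegate run (the whole graph is delegated); each unsupported op forces a hand-back to the CPU.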

TensorFlow Lite 2019 Roadmap
https://www.tensorflow.org/lite/guide/roadmap
● Updated NN API support
　　Full support for new Android Q NN API features, ops and types
● GPU backend optimizations
　　OpenCL and Vulkan support on Android
　　Metal and Objective-C CocoaPods for Metal acceleration
● Hexagon DSP backend
　　Initial release of DSP acceleration for pre-Android P devices

TensorFlow Lite Delegate documentation
https://www.tensorflow.org/lite/performance/delegates
Run inference with TensorFlow Lite in Python
https://coral.withgoogle.com/docs/edgetpu/tflite-python/
Python quickstart (a tflite_runtime package is available)
https://www.tensorflow.org/lite/guide/python
Edge TPU inferencing overview
https://coral.withgoogle.com/docs/edgetpu/inference/
Run inference with TensorFlow Lite in C++ (Edge TPU)
https://coral.withgoogle.com/docs/edgetpu/tflite-cpp/

CPU: no delegate, the basic pattern
// Load the TensorFlow Lite model from a file
auto model = FlatBufferModel::BuildFromFile("interesting_model.tflite");
// Load the model into the inference engine (interpreter)
ops::builtin::BuiltinOpResolver op_resolver;
std::unique_ptr<Interpreter> interpreter;
InterpreterBuilder(*model, op_resolver)(&interpreter);
// Allocate memory for the tensors
interpreter->AllocateTensors();
// .... (Prepare input tensors)
// Run inference
interpreter->Invoke();
// .... (Retrieve the result from output tensors)
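The same allocate / set input / invoke / read output sequence can be sketched in Python. The helper `run_inference` is a hypothetical name; it only assumes an object that behaves like `tf.lite.Interpreter` and a model with exactly one input and one output tensor.

```python
def run_inference(interpreter, input_data):
    """Run one inference on an already-constructed interpreter.

    `interpreter` is expected to behave like tf.lite.Interpreter; a single
    input tensor and a single output tensor are assumed.
    """
    interpreter.allocate_tensors()                    # same role as AllocateTensors()
    input_index = interpreter.get_input_details()[0]["index"]
    output_index = interpreter.get_output_details()[0]["index"]
    interpreter.set_tensor(input_index, input_data)   # prepare the input tensor
    interpreter.invoke()                              # same role as Invoke()
    return interpreter.get_tensor(output_index)       # retrieve the output tensor
```

With TensorFlow installed, the interpreter would typically be built as `tf.lite.Interpreter(model_path="interesting_model.tflite")` and `input_data` would be a NumPy array matching the input tensor's shape and dtype.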

GPU Delegate: v1 (OpenGL ES)
// Loading the model and building the interpreter are the same as before
const TfLiteGpuDelegateOptions options = {
    .metadata = NULL,
    .compile_options = {
        .precision_loss_allowed = 1,  // FP16
        .preferred_gl_object_type = TFLITE_GL_OBJECT_TYPE_FASTEST,
        .dynamic_batch_enabled = 0,  // Not fully functional yet
    },
};
auto* delegate = TfLiteGpuDelegateCreate(&options);
interpreter->ModifyGraphWithDelegate(delegate);  // Apply the delegate
// Running inference is also the same as before

GPU Delegate: v2 (OpenCL => OpenGL ES)
// Unlike v1, options are no longer required by default
// v2 tries OpenCL first and falls back to OpenGL ES v3.1 if OpenCL is unavailable
auto* delegate = TfLiteGpuDelegateV2Create(/*default options=*/nullptr);
interpreter->ModifyGraphWithDelegate(delegate);

Google Edge TPU Delegate
size_t num_devices;
std::unique_ptr<edgetpu_device, decltype(&edgetpu_free_devices)> devices(
    edgetpu_list_devices(&num_devices), &edgetpu_free_devices);
const auto& device = devices.get()[0];
// Obtain the Edge TPU delegate through its dedicated API
auto* delegate = edgetpu_create_delegate(device.type, device.path, nullptr, 0);
interpreter->ModifyGraphWithDelegate({delegate, edgetpu_free_delegate});
// The overload that accepts edgetpu_free_delegate is shown on the next slide

The other ModifyGraphWithDelegate overload
// Owning handle to a TfLiteDelegate instance.
using TfLiteDelegatePtr = std::unique_ptr<TfLiteDelegate, void (*)(TfLiteDelegate*)>;
/// Same as ModifyGraphWithDelegate except this interpreter takes
/// ownership of the provided delegate. Be sure to construct the unique_ptr
/// with a suitable destruction function.
/// WARNING: This is an experimental API and subject to change.
TfLiteStatus ModifyGraphWithDelegate(TfLiteDelegatePtr delegate);

Android NN APIs Delegate
auto* delegate = NnApiDelegate();
　// DEPRECATED: Please use StatefulNnApiDelegate class instead.
TfLiteDelegate* NnApiDelegate() {
static StatefulNnApiDelegate* delegate = new StatefulNnApiDelegate();
return delegate;
}

Flex Delegate: when using TensorFlow Select
auto delegate = FlexDelegate::Create();  // returns std::unique_ptr<FlexDelegate>

TensorFlow Select
TensorFlow Select makes TensorFlow ops usable from TensorFlow Lite.
"Select TensorFlow operators to use in TensorFlow Lite"
See the Model Conversion part of the video "Inside TensorFlow: TensorFlow Lite".
When building the converter, pass --define=tflite_convert_with_select_tf_ops=true:
bazel run --define=tflite_convert_with_select_tf_ops=true tflite_convert -- \
  --output_file=/tmp/foo.tflite \
  --graph_def_file=/tmp/foo.pb \
  --input_arrays=input \
  --output_arrays=MobilenetV1/Predictions/Reshape_1 \
  --target_ops=TFLITE_BUILTINS,SELECT_TF_OPS

ArmNN Delegate
Add initial minimal ArmNN delegate plugin. #33436
https://github.com/GeorgeARM/tensorflow/tree/armnn_delegate/tensorflow/lite/delegates/armnn
The following kernels are currently off-loaded:
- Pool2d
- AvgPool
- MaxPool
- L2Pool
- Conv2d
- Depthwise Conv2d
- Softmax
- Squeeze

ArmNN Delegate
Interpreter::TfLiteDelegatePtr CreateArmNNDelegate(ArmNNDelegate::Options options) {
  return Interpreter::TfLiteDelegatePtr(
      new ArmNNDelegate(options), [](TfLiteDelegate* delegate) {
        delete reinterpret_cast<ArmNNDelegate*>(delegate);
      });
}
ArmNNDelegate::Options opts;
opts.backend_name = "CpuRef";
auto delegate = CreateArmNNDelegate(opts);
// TfLiteDelegatePtr is a std::unique_ptr, so ownership is moved into the interpreter
interpreter->ModifyGraphWithDelegate(std::move(delegate));

Qualcomm Hexagon Delegate
Tflite Qualcomm DSP acceleration #29028 ?
NPE hardware acceleration support in Qualcomm chips #17526 ?

InterpreterInvoke
Given a model, a delegate, input data, and output data, this single function does the job.
Status InterpreterInvoke(const ::tflite::Model* model,
                         TfLiteDelegate* delegate,
                         const std::vector<TensorFloat32>& inputs,
                         std::vector<TensorFloat32>* outputs);

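InterpreterInvoke
auto model = FlatBufferModel::BuildFromFile("interesting_model.tflite");
auto* delegate = ...;  // create a delegate as on the earlier slides
std::vector<TensorFloat32> inputs;   // fill with the input data
std::vector<TensorFloat32> outputs;  // receives the results
// GetModel() exposes the underlying const ::tflite::Model*
InterpreterInvoke(model->GetModel(), delegate, inputs, &outputs);
// The inference itself runs the same way as before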

Edge TPU example
Just change
interpreter = Interpreter(model_path=args.model_file)
to
interpreter = Interpreter(model_path=args.model_file,
    experimental_delegates=[load_delegate('libedgetpu.so.1.0')])
and import load_delegate with
from tensorflow.lite.python.interpreter import load_delegate
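Because `load_delegate` can fail (its docstring lists `ValueError` and `RuntimeError`), a program may want a CPU fallback. The sketch below is one possible way to arrange that. The function name `make_interpreter` and its parameters are hypothetical, and it assumes a separate CPU-only model file is available, since a model compiled for the Edge TPU is generally not expected to run without the delegate.

```python
def make_interpreter(interpreter_cls, load_delegate, tpu_model_path,
                     cpu_model_path, library="libedgetpu.so.1.0"):
    """Build an Edge TPU interpreter, or a CPU-only one if the delegate fails.

    `interpreter_cls` and `load_delegate` are passed in so the sketch does not
    depend on which package (tensorflow or tflite_runtime) provides them.
    """
    try:
        delegate = load_delegate(library)
    except (ValueError, RuntimeError):
        # ValueError: the delegate failed to load.
        # RuntimeError: delegate loading is unsupported on this platform.
        return interpreter_cls(model_path=cpu_model_path)
    return interpreter_cls(model_path=tpu_model_path,
                           experimental_delegates=[delegate])
```

A caller would pass `Interpreter` and `load_delegate` imported from `tensorflow.lite.python.interpreter`, plus the two model paths.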

lite.experimental.load_delegate
@_tf_export('lite.experimental.load_delegate')
def load_delegate(library, options=None):
  """Returns loaded Delegate object.
  Args:
    library: Name of shared library containing the
      [TfLiteDelegate](https://www.tensorflow.org/lite/performance/delegates).
    options: Dictionary of options that are required to load the delegate. All
      keys and values in the dictionary should be convertible to str. Consult
      the documentation of the specific delegate for required and legal options.
      (default None)
  Returns:
    Delegate object.
  Raises:
    ValueError: Delegate failed to load.
    RuntimeError: If delegate loading is used on unsupported platform.
  """

lite.experimental.load_delegate
  # TODO(b/137299813): Fix darwin support for delegates.
  if sys.platform == 'darwin':
    raise RuntimeError('Dynamic loading of delegates on Darwin not supported.')
  try:
    delegate = Delegate(library, options)
  except ValueError as e:
    raise ValueError('Failed to load delegate from {}\n{}'.format(
        library, str(e)))
  return delegate

Delegate class
class Delegate(object):
  """Python wrapper class to manage TfLiteDelegate objects.
  The shared library is expected to have two functions:
    TfLiteDelegate* tflite_plugin_create_delegate(
        char**, char**, size_t, void (*report_error)(const char *))
    void tflite_plugin_destroy_delegate(TfLiteDelegate*)
  The first one creates a delegate object. It may return NULL to indicate an
  error (with a suitable error message reported by calling report_error()).
  The second one destroys delegate object and must be called for every
  created delegate object. Passing NULL as argument value is allowed, i.e.
    tflite_plugin_destroy_delegate(tflite_plugin_create_delegate(...))
  always works.
  """

lite.Interpreter
@_tf_export('lite.Interpreter')
class Interpreter(object):
  def __init__(self,
               model_path=None,
               model_content=None,
               experimental_delegates=None):
    """Constructor.
    Args:
      model_path: Path to TF-Lite Flatbuffer file.
      model_content: Content of model.
      experimental_delegates: Experimental. Subject to change. List of
        [TfLiteDelegate](https://www.tensorflow.org/lite/performance/delegates)
        objects returned by lite.load_delegate().
    Raises:
      ValueError: If the interpreter was unable to create.
    """

lite.Interpreter
    # Each delegate is a wrapper that owns the delegates that have been loaded
    # as plugins. The interpreter wrapper will be using them, but we need to
    # hold them in a list so that the lifetime is preserved at least as long as
    # the interpreter wrapper.
    self._delegates = []
    if experimental_delegates:
      self._delegates = experimental_delegates
      for delegate in self._delegates:
        self._interpreter.ModifyGraphWithDelegate(
            delegate._get_native_delegate_pointer())  # pylint: disable=protected-access
The Edge TPU Compiler has been updated
https://vengineer.hatenablog.com/entry/71953852

tflite_plugin_create_delegate
extern "C" {
TfLiteDelegate* tflite_plugin_create_delegate(char** options_keys,
                                              char** options_values,
                                              size_t num_options,
                                              ErrorHandler error_handler) {
  return new PosenetDelegateForCustomOp();
}
void tflite_plugin_destroy_delegate(TfLiteDelegate* delegate) {
  delete static_cast<PosenetDelegateForCustomOp*>(delegate);
}
}  // extern "C"
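The plugin entry point above receives its options as two `char**` arrays plus a count. As a rough illustration of how a Python `options` dictionary could be turned into that shape with `ctypes`, consider the sketch below. The helper name `to_c_options` is hypothetical, and this is an approximation of the idea rather than a copy of the TensorFlow implementation.

```python
import ctypes


def to_c_options(options):
    """Convert a dict into (keys, values, count) usable as char**, char**, size_t."""
    options = options or {}
    count = len(options)
    keys = (ctypes.c_char_p * count)()
    values = (ctypes.c_char_p * count)()
    for i, (key, value) in enumerate(options.items()):
        # Keys and values only need to be convertible to str.
        keys[i] = str(key).encode("utf-8")
        values[i] = str(value).encode("utf-8")
    return keys, values, ctypes.c_size_t(count)


if __name__ == "__main__":
    # Hypothetical option names, for illustration only.
    keys, values, count = to_c_options({"device": "usb", "threads": 2})
    print(list(keys), list(values), count.value)
```

The two arrays are parallel: entry `i` of `keys` names the option whose value is entry `i` of `values`.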

Flex Delegate: libflexdelegate.so
#include "tensorflow/lite/delegates/flex/delegate.h"
namespace {
typedef void (*ErrorHandler)(const char*);
}  // namespace
extern "C" {
TfLiteDelegate* tflite_plugin_create_delegate(char** options_key, char** options_values,
                                              size_t num_options, ErrorHandler error_handler) {
  std::unique_ptr<tflite::FlexDelegate> delegate = tflite::FlexDelegate::Create();
  // release() hands ownership to the caller; get() would return a pointer
  // that dangles once the unique_ptr is destroyed at the end of this function
  return delegate.release();
}
void tflite_plugin_destroy_delegate(TfLiteDelegate* delegate) {
  delete static_cast<tflite::FlexDelegate*>(delegate);
}
}
Not built by default.

TensorFlow Lite Python Runtime
https://www.tensorflow.org/lite/guide/python
Python runtime packages for TensorFlow Lite (v1.14) are provided for Python 3.5/3.6/3.7
They install easily on the Raspberry Pi family as well
$ pip3 install tflite_runtime-1.14.0-cp37-cp37m-linux_armv7l.whl
Change
import tensorflow as tf
interpreter = tf.lite.Interpreter(model_path=args.model_file)
to
import tflite_runtime.interpreter as tflite
interpreter = tflite.Interpreter(model_path=args.model_file)
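A script that should run both on machines with full TensorFlow and on devices with only the small runtime package can try the imports in order. The sketch below shows one way to do that; the helper name `first_importable` is hypothetical.

```python
import importlib


def first_importable(module_names):
    """Return the first module in `module_names` that imports successfully."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # not installed here, try the next candidate
    raise ImportError(
        "none of these modules are installed: " + ", ".join(module_names))
```

For example, `tflite = first_importable(["tflite_runtime.interpreter", "tensorflow.lite.python.interpreter"])` prefers the lightweight runtime, after which `tflite.Interpreter(model_path=...)` is used the same way in both cases.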

I am not a deep learning specialist.
I am a computer engineer.

Thank you!
@Vengineer
Source code analysis craftsman
