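Performance

This document lists TensorFlow Lite performance benchmarks for several well-known models running on Android and iOS devices.

The benchmark numbers were generated with the Android TFLite benchmark binary and the iOS benchmark app.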

Android performance benchmarks

When running the Android benchmarks, CPU affinity is pinned to the device's big (high-performance) cores to reduce run-to-run variance (see details).

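The setup assumes that the models have been downloaded and unzipped into the /data/local/tmp/tflite_models directory. The benchmark binary is built following these instructions and is assumed to be located in /data/local/tmp.

Run the benchmark with the following command: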

adb shell taskset ${CPU_MASK} /data/local/tmp/benchmark_model \
  --num_threads=1 \
  --graph=/data/local/tmp/tflite_models/${GRAPH} \
  --warmup_runs=1 \
  --num_runs=50 \
  --use_nnapi=false

Here, ${GRAPH} is the name of the model file and ${CPU_MASK} is the CPU affinity mask, chosen according to the following table:

Device	CPU_MASK
Pixel 2	f0
Pixel XL	0c
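
taskset reads CPU_MASK as a hexadecimal bitmask in which bit i selects CPU core i. As a minimal sketch of how to read the masks in the table above, the following POSIX shell helper lists the cores a mask selects. The function name mask_to_cores is hypothetical and is not part of the TFLite tooling.

```shell
#!/bin/sh
# Print the zero-based CPU core indices selected by a hexadecimal taskset mask.
# mask_to_cores is a hypothetical helper, shown for illustration only.
mask_to_cores() {
  mask=$((0x$1))   # parse the hex string, e.g. f0 -> 240
  core=0
  cores=""
  while [ "$mask" -gt 0 ]; do
    # If the lowest bit is set, the current core is part of the mask.
    if [ $(( $mask & 1 )) -eq 1 ]; then
      cores="${cores:+$cores }$core"
    fi
    mask=$(( $mask >> 1 ))
    core=$(( $core + 1 ))
  done
  echo "$cores"
}

mask_to_cores f0   # Pixel 2 mask
mask_to_cores 0c   # Pixel XL mask
```

This prints 4 5 6 7 for the Pixel 2 mask and 2 3 for the Pixel XL mask.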

Model name	Device	Mean inference time (std dev)
Mobilenet_1.0_224 (float)	Pixel 2	166.5 ms (2.6 ms)
Mobilenet_1.0_224 (float)	Pixel XL	122.9 ms (1.8 ms)
Mobilenet_1.0_224 (quant)	Pixel 2	69.5 ms (0.9 ms)
Mobilenet_1.0_224 (quant)	Pixel XL	78.9 ms (2.2 ms)
NASNet mobile	Pixel 2	273.8 ms (3.5 ms)
NASNet mobile	Pixel XL	210.8 ms (4.2 ms)
SqueezeNet	Pixel 2	234.0 ms (2.1 ms)
SqueezeNet	Pixel XL	158.0 ms (2.1 ms)
Inception_ResNet_V2	Pixel 2	2846.0 ms (15.0 ms)
Inception_ResNet_V2	Pixel XL	1973.0 ms (15.0 ms)
Inception_V4	Pixel 2	3180.0 ms (11.7 ms)
Inception_V4	Pixel XL	2262.0 ms (21.0 ms)
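
To collect numbers for several models in one pass, the benchmark command shown earlier can be wrapped in a loop. The sketch below is one possible way to do this. The helper benchmark_cmd, the DRY_RUN switch, and the model file names are all hypothetical placeholders. Only the flags and paths are taken from the command in this document.

```shell
#!/bin/sh
# Build the benchmark command line for one model.
# $1: hexadecimal CPU mask, $2: model file name inside tflite_models.
# benchmark_cmd is a hypothetical helper, not part of the TFLite tooling.
benchmark_cmd() {
  echo "adb shell taskset $1 /data/local/tmp/benchmark_model" \
    "--num_threads=1" \
    "--graph=/data/local/tmp/tflite_models/$2" \
    "--warmup_runs=1" \
    "--num_runs=50" \
    "--use_nnapi=false"
}

CPU_MASK=f0  # Pixel 2 mask from the CPU_MASK table

# Hypothetical file names; substitute the models you actually downloaded.
for GRAPH in model_a.tflite model_b.tflite; do
  cmd=$(benchmark_cmd "$CPU_MASK" "$GRAPH")
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$cmd"   # default: only print the command
  else
    $cmd          # DRY_RUN=0: run it on the attached device (requires adb)
  fi
done
```

By default the script only prints the commands. Setting DRY_RUN=0 runs them against a connected device.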

iOS benchmarks

To run the iOS benchmarks, the benchmark app was modified to include the appropriate model, and num_threads in benchmark_params.json was set to 1.

Model name	Device	Mean inference time (std dev)
Mobilenet_1.0_224 (float)	iPhone 8	32.2 ms (0.8 ms)
Mobilenet_1.0_224 (quant)	iPhone 8	24.4 ms (0.8 ms)
NASNet mobile	iPhone 8	60.3 ms (0.6 ms)
SqueezeNet	iPhone 8	44.3 ms (0.7 ms)
Inception_ResNet_V2	iPhone 8	562.4 ms (18.2 ms)
Inception_V4	iPhone 8	661.0 ms (29.2 ms)
