Feature #4585

closed

ARM NEON optimized TS 05.03 encoding/decoding

Added by laforge almost 4 years ago. Updated over 3 years ago.

Status: Resolved
Priority: High
Assignee:
Category: libosmocore
Target version: -
Start date: 06/05/2020
Due date:
% Done: 100%
Spec Reference:

Description

We'd like to implement a NEON optimized version of our TS 05.03 convolutional encoding/decoding routines.

libosmocore already has generic C and SSE/AVX optimized versions today (see source:src/conv_acc_sse.c, source:src/conv_acc_sse_avx.c, source:src/conv_acc_sse_impl.h). Ideally the same infrastructure is re-used to introduce the ARM NEON optimized version.
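
For context, the rate 1/2, constraint length 5 code from TS 05.03 (G0 = 1 + D^3 + D^4, G1 = 1 + D + D^3 + D^4) can be sketched in plain C as below. This is only an illustration of the kind of routine being discussed: the function name is hypothetical, it works on one bit per byte, and it is not the libosmocore implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative helper (hypothetical name, not a libosmocore function).
 * Encodes `len` unpacked bits (one bit per byte) with the rate 1/2,
 * K=5 code: c(2k)   = u(k) ^ u(k-3) ^ u(k-4)
 *           c(2k+1) = u(k) ^ u(k-1) ^ u(k-3) ^ u(k-4)
 * The encoder starts from the all-zero state. `out` must hold 2*len bytes.
 * Returns the number of output bits written. */
static size_t demo_conv_encode_k5_n2(const uint8_t *in, size_t len, uint8_t *out)
{
	uint8_t d1 = 0, d2 = 0, d3 = 0, d4 = 0; /* u(k-1) .. u(k-4) */

	assert(in != NULL && out != NULL);

	for (size_t k = 0; k < len; k++) {
		uint8_t u = in[k] & 1;

		out[2 * k]     = u ^ d3 ^ d4;
		out[2 * k + 1] = u ^ d1 ^ d3 ^ d4;

		/* shift the delay line by one bit */
		d4 = d3;
		d3 = d2;
		d2 = d1;
		d1 = u;
	}

	return 2 * len;
}
```

Feeding a single 1 followed by four 0 bits yields the impulse response of the two polynomials, which is a quick way to check the tap positions.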

(assigning to myself until we have figured out who and how).

Side note: There may also be other parts beyond the convolutional code that could benefit from NEON, such as interleaving/deinterleaving and burst mapping. It may be worth looking at those afterwards.


Files

neonlib_vs_neontest.log (12.7 KB), Hoernchen, 07/23/2020 02:25 PM
standardlib_vs_neontest.log (12.7 KB), Hoernchen, 07/23/2020 02:25 PM
Actions #1

Updated by laforge almost 4 years ago

  • Priority changed from Normal to High
Actions #3

Updated by laforge almost 4 years ago

  • Assignee changed from laforge to Hoernchen
Actions #4

Updated by Hoernchen over 3 years ago

Patch is in https://gerrit.osmocom.org/c/libosmocore/+/19372

Requires building libosmocore with the --enable-neon configure flag.

Benchmark code lives at https://github.com/Hoernchen/osmo-conv-test

Attached are two benchmark runs: the original generic libosmocore vs. the benchmark with NEON, and a modified libosmocore that uses the NEON code vs. the benchmark with the same NEON code, to ensure the library's performance actually matches the benchmark.

Speedup is 1.3-1.5x, which is quite reasonable considering SSE only manages ~2-3x. The n2k5 code (N=2, K=5) needs a special case because generic is faster than NEON for the very short RACH input with len ~14 (other runs have len > 100).
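
The special case described above could look roughly like the following selection logic. This is a sketch under assumptions, not the code from the patch: all names are hypothetical, and the cut-off value is made up for illustration (the ticket only says that len ~14 is slower with NEON and that the other runs have len > 100).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical back-end identifiers, for illustration only. */
enum demo_backend { DEMO_BACKEND_GENERIC, DEMO_BACKEND_NEON };

/* ASSUMPTION: arbitrary cut-off somewhere between the short RACH input
 * (len ~14) and the longer runs (len > 100). The real threshold, if any,
 * would have to come from benchmarking. */
#define DEMO_SHORT_LEN_CUTOFF 32

/* Pick a back end for a rate 1/n, constraint length k code of `len` bits.
 * `have_neon` says whether the NEON code was compiled in. */
static enum demo_backend demo_pick_backend(int n, int k, size_t len, int have_neon)
{
	assert(n >= 2 && n <= 4);
	assert(k == 5 || k == 7);

	if (!have_neon)
		return DEMO_BACKEND_GENERIC;

	/* n2k5 with very short input: generic was measured to be faster */
	if (n == 2 && k == 5 && len < DEMO_SHORT_LEN_CUTOFF)
		return DEMO_BACKEND_GENERIC;

	return DEMO_BACKEND_NEON;
}
```

With these assumptions, a 14-bit n2k5 input falls back to the generic path, while longer inputs and all other code parameters use NEON when it is available.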

Actions #5

Updated by fixeria over 3 years ago

  • Status changed from New to Resolved

The patch has been merged today, and caused massive [master] build failures on rpi4-deb9build-ansible:

/usr/bin/ld: ../src/.libs/libosmocore.so: undefined reference to `osmo_conv_neon_metrics_k5_n2'
/usr/bin/ld: ../src/.libs/libosmocore.so: undefined reference to `osmo_conv_neon_vdec_free'
/usr/bin/ld: ../src/.libs/libosmocore.so: undefined reference to `osmo_conv_neon_metrics_k7_n3'
/usr/bin/ld: ../src/.libs/libosmocore.so: undefined reference to `osmo_conv_neon_metrics_k5_n3'
/usr/bin/ld: ../src/.libs/libosmocore.so: undefined reference to `osmo_conv_neon_metrics_k7_n2'
/usr/bin/ld: ../src/.libs/libosmocore.so: undefined reference to `osmo_conv_neon_metrics_k5_n4'
/usr/bin/ld: ../src/.libs/libosmocore.so: undefined reference to `osmo_conv_neon_metrics_k7_n4'
/usr/bin/ld: ../src/.libs/libosmocore.so: undefined reference to `osmo_conv_neon_vdec_malloc'
collect2: error: ld returned 1 exit status

I found and fixed the problem:

https://gerrit.osmocom.org/c/libosmocore/+/19545 configure.ac: fix: do not define HAVE_NEON unconditionally

and submitted a few more cosmetic / misc changes:

https://gerrit.osmocom.org/c/libosmocore/+/19543 src/Makefile.am: add conv_acc_neon_impl.h to EXTRA_DIST
https://gerrit.osmocom.org/c/libosmocore/+/19544 configure.ac: clarify description of --enable-neon
https://gerrit.osmocom.org/c/libosmocore/+/19546 configure.ac: print ARM NEON instructions support status
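
The undefined references above are what one would expect if HAVE_NEON is defined while the NEON source file is not part of the build. The following minimal sketch shows that pattern; it is a reading based on the patch title, with hypothetical names rather than the actual libosmocore code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the generic metric routine. */
static int demo_generic_metrics(int a, int b)
{
	return a + b;
}

#ifdef HAVE_NEON
/* Declared here, but defined in a NEON-only source file. If HAVE_NEON is
 * defined unconditionally while that file is not compiled, the reference
 * below cannot be resolved and the link step fails with
 * "undefined reference". */
int demo_neon_metrics(int a, int b);
#endif

typedef int (*demo_metrics_fn)(int a, int b);

/* Select the metric routine at compile time. */
static demo_metrics_fn demo_select_metrics(void)
{
#ifdef HAVE_NEON
	return demo_neon_metrics;
#else
	return demo_generic_metrics;
#endif
}

/* Run whichever routine was selected. */
static int demo_run(int a, int b)
{
	demo_metrics_fn fn = demo_select_metrics();

	assert(fn != NULL);
	return fn(a, b);
}
```

Built without -DHAVE_NEON, the sketch links and uses the generic routine. Built with -DHAVE_NEON but without a file that defines demo_neon_metrics, it fails at link time in the same way as the log above.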

Tested on my ARMv6 based RPi. Closing this ticket.
