Kaijie Wei - An FPGA off-loading of HARK sound source localization

Journal article

Zhongyang Hou, Kaijie Wei, H. Amano, K. Nakadai
2022 Tenth International Symposium on Computing and Networking Workshops (CANDARW), 2022


Cite

APA
Hou, Z., Wei, K., Amano, H., & Nakadai, K. (2022). An FPGA off-loading of HARK sound source localization. 2022 Tenth International Symposium on Computing and Networking Workshops (CANDARW).

Chicago/Turabian
Hou, Zhongyang, Kaijie Wei, H. Amano, and K. Nakadai. “An FPGA Off-Loading of HARK Sound Source Localization.” 2022 Tenth International Symposium on Computing and Networking Workshops (CANDARW) (2022).

MLA
Hou, Zhongyang, et al. “An FPGA Off-Loading of HARK Sound Source Localization.” 2022 Tenth International Symposium on Computing and Networking Workshops (CANDARW), 2022.

BibTeX

@article{zhongyang2022a,
  title = {An FPGA off-loading of HARK sound source localization},
  year = {2022},
  journal = {2022 Tenth International Symposium on Computing and Networking Workshops (CANDARW)},
  author = {Hou, Zhongyang and Wei, Kaijie and Amano, H. and Nakadai, K.}
}

Abstract

HARK is open-source software for robot audition that provides auditory functions using a microphone array. The primary tasks of robot audition are sound source localization (SSL), sound source separation, and automatic speech recognition of the separated speech. Among these, SSL is the fundamental function: it detects the number and positions of sound sources. For high-speed yet low-power SSL processing, we implement it on an FPGA board with an SoC-type Xilinx Zynq UltraScale+. Since the whole SSL processing is complicated, we off-loaded two core functions, AddCorrelation and Added NormalizeCorrelation, to the hardwired logic. Performance improvements of approximately 295 times and 365 times were achieved compared with software execution.