[P] Offline LLMs at edge – Automating Family Memories

Over winter break I built a prototype: a device (currently a Raspberry Pi) that listens for and detects “meaningful moments” in a given household or family. I have two young kids, so it’s somewhat tailored to that environment.

What I have so far works, catching 80% of the ~1,000 “moments” I manually labeled as worth preserving. I’m confident I could improve it further, but there’s a wall of optimization problems ahead of me. Here’s a brief summary of the system:

1) Microphone ->

2) Rolling audio buffer in memory ->

3) Transcribe (using Whisper – good, but expensive) ->

4) Quantized local LLM (think Mistral, etc.) judges Whisper’s output. The prompt includes the transcript plus semantic details about the conversation: tone, turn-taking, energy, pauses, etc. ->

5) Output structured JSON binned to days/weeks, viewable in a web app, includes a player for listening to the recorded moments
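The five steps above can be sketched roughly as follows. This is a hypothetical skeleton, not the actual implementation: `transcribe` and `judge_moment` are stand-ins for the Whisper and local-LLM calls, and all the names and constants are assumptions.

```python
import json
from collections import deque
from datetime import datetime, timezone

SAMPLE_RATE = 16_000          # assumed 16 kHz mono audio
BUFFER_SECONDS = 30           # assumed rolling window the detector inspects

class RollingBuffer:
    """Fixed-length in-memory audio buffer (step 2)."""
    def __init__(self, seconds=BUFFER_SECONDS, rate=SAMPLE_RATE):
        self.samples = deque(maxlen=seconds * rate)

    def push(self, chunk):
        self.samples.extend(chunk)

def transcribe(samples):
    # Placeholder for Whisper (step 3); would return text plus
    # timing/paralinguistic features extracted from the audio.
    return {"text": "kids laughing about the snowman", "pauses": 2}

def judge_moment(transcript):
    # Placeholder for the quantized LLM judge (step 4); in practice the
    # prompt would also describe tone, turn-taking, energy, etc.
    return {"meaningful": True, "reason": "shared laughter", **transcript}

def record_moment(buffer, store):
    """Steps 3-5: transcribe, judge, and bin the verdict by day."""
    verdict = judge_moment(transcribe(buffer.samples))
    if verdict["meaningful"]:
        day = datetime.now(timezone.utc).date().isoformat()
        store.setdefault(day, []).append(verdict)
    return store

buf = RollingBuffer()
buf.push([0] * SAMPLE_RATE)   # one second of silence as dummy audio
store = record_moment(buf, {})
print(json.dumps(store, indent=2))
```

The day-keyed dict is what a web app (step 5) could render as a per-day timeline, with each stored verdict pointing back at its audio clip.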

I’m currently doing a lot of the heavy lifting with external compute, off-board from the Raspberry Pi. I want everything onboard, with no external connections or compute required. That quickly becomes a heavy optimization problem: achieving all of this with completely offline edge compute while retaining quality.

Naturally you can use smaller distilled models, but the further you go, the more quality you trade away. I’m also not aware of many edge accelerators purpose-built for LLMs; Raspberry Pi just announced an AI hat/accelerator, and I’m curious to experiment with that.
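A toy illustration of the quality tradeoff (not any specific library’s scheme): symmetric 8-bit weight quantization with a single per-tensor scale. The weights are made up; the point is that every value gets rounded to one of 255 levels, and the error bound shrinks or grows with the scale.

```python
def quantize_int8(weights):
    """Map floats to int8 codes with one per-tensor scale factor."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the int8 codes."""
    return [x * scale for x in q]

# Made-up example weights
weights = [0.013, -0.42, 0.301, -0.0007, 0.25]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding error is bounded by half the scale step
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(f"scale={scale:.5f} max_error={max_err:.5f}")
```

Real 4-bit and lower schemes (the kind that make a 7B model fit on a Pi) use per-group scales and smarter rounding, but the same error-vs-bits tension is what drives the quality loss in heavily distilled/quantized edge models.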

I’m also curious to explore options such as TinyML. TinyML opens the door to truly edge-native compute, but LLMs at the edge? I’m trying to read up on the latest and greatest successes in this space.

I’d be interested to hear from anyone experienced in running generative tech offline, at the edge. Thanks!

submitted by /u/GoochCommander
