Lillup · 5 hours ago
Embedded Generative AI Engineer Intern (On-Device LLM + iOS-First) (Remote, Unpaid)
Lillup is building an embedded AI product ecosystem focused on low-latency, privacy-preserving, on-device generative AI. It is seeking an Embedded Generative AI Engineer Intern to help implement and optimize on-device LLM inference and an orchestration layer, working within a fast-moving team spanning AI engineering, product, and UX.
Responsibilities
Embedded LLM Integration (iOS-First)
Performance Optimization for Edge Devices
Orchestration (Product + Research-Backed)
Local Memory and Prompt Assembly
Testing, Documentation, and Iteration
Qualifications
Required
Strong fundamentals in software engineering and systems thinking
Hands-on exposure to Generative AI / LLM concepts (inference, decoding, prompt design)
Comfort working in a remote, fast-paced, research-to-production environment
Ability to read technical docs, prototype quickly, and iterate with discipline
Preferred
iOS development experience (Swift, SwiftUI, Swift Concurrency)
Familiarity with mobile ML deployment constraints (app bundle size limits, asset delivery, thermal throttling)
Experience with on-device inference frameworks or runtimes (any of: Core ML, LiteRT, MediaPipe, ONNX Runtime, llama.cpp, MLC)
Work on open-source or personal projects related to model inference, quantization, or retrieval-augmented generation (RAG)
Prior contributions or experiments involving DeepSeek / LLaMA or inference tooling
A GitHub profile, demos, or technical write-ups