Pulse · 5 months ago
Software Engineer, Inference
Pulse is a fast-growing document intelligence company that extracts structured information from complex documents. The Software Engineer, Inference builds and optimizes inference services for OCR and multimodal models, with a focus on high performance and efficient resource management.
Artificial Intelligence (AI), Data Management, Software
Responsibilities
Build inference services with smart batching and caching
Optimize kernels, tokenization, and model graphs
Evaluate vLLM, TensorRT-LLM, and Triton tradeoffs
Implement autoscaling and admission control with clear SLOs
Own performance dashboards and capacity planning
Qualifications
Required
3+ years in performance engineering or ML systems
Strong Python, plus C++ or CUDA exposure
Experience with GPU profiling and model serving
5 days a week in our San Francisco office
Eager to learn and adapt quickly
Preferred
Prior startup or founding experience is a plus
Experience reducing p95 latency and cost in production ML systems
Benefits
Competitive base salary plus equity
Performance-based bonuses
Relocation assistance for Bay Area moves
Daily meal stipends
Comprehensive medical, vision, and dental coverage
Company
Pulse
Production-Grade Unstructured Document Extraction
H1B Sponsorship
Pulse has a track record of offering H1B sponsorship. Note that this does not
guarantee sponsorship for this specific role. Additional information is presented
below for reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship, with job fields similar to this job highlighted]
Trends of Total Sponsorships
2025 (1)
2024 (1)
2022 (2)
Funding
Current Stage: Early Stage
Total Funding: $4.4M
Key Investors: Y Combinator
2025-02-19 · Seed · $3.9M
2024-09-25 · Pre Seed · $0.5M
Company data provided by Crunchbase