Machine Learning Engineer, LLM Fine-Tuning

First Soft Solutions LLC
04/02/2026

Full-time, Information Technology, Telecommunications, Java, Python, Software Engineer, Testing

Job Description

We are actively hiring for a Machine Learning Engineer focused on LLM fine-tuning for Verilog/RTL applications.

Location: San Jose, CA (Onsite)

Skills: LLM fine-tuning, Verilog/RTL, AWS, Bedrock, SageMaker

Responsibilities

Own the technical roadmap for Verilog/RTL-focused LLM capabilities, from model selection and adaptation to evaluation, deployment, and continuous improvement.
Lead a hands-on team of applied scientists/engineers: set direction, remove technical blockers, review designs/code, and raise the bar on experimentation velocity and reliability.
Fine-tune and customize models using state-of-the-art techniques (LoRA/QLoRA, PEFT, instruction tuning, preference optimization/RLAIF) with robust HDL-specific evals:
- Compile/lint/simulate-based pass rates for code generation, constrained decoding to enforce syntax, and "does it synthesize" checks.
Design privacy-first ML pipelines on AWS:
- Training/customization and hosting using Amazon Bedrock and SageMaker (or EKS + KServe/Triton/DJL) for bespoke training needs.
- Artifacts in S3 with KMS CMKs; isolated VPC subnets & PrivateLink (including Bedrock VPC endpoints), IAM least privilege, CloudTrail auditing, and Secrets Manager for credentials.
- Enforce encryption in transit and at rest, data minimization, and no public egress for customer/RTL corpora.
Stand up dependable model serving: Bedrock model invocation where it fits, and/or low-latency self-hosted inference (vLLM/TensorRT-LLM), autoscaling, and canary/blue-green rollouts.
Build an evaluation culture: automatic regression suites that run HDL compilers/simulators, measure behavioral fidelity, and detect hallucinations/constraint violations; model cards and experiment tracking (MLflow/Weights & Biases).
Partner deeply with hardware design, CAD/EDA, Security, and Legal to source/prepare datasets (anonymization, redaction, licensing), define acceptance gates, and meet compliance requirements.
Drive productization: integrate LLMs with internal developer tools (IDEs/plug-ins, code review bots, CI), retrieval (RAG) over internal HDL repos/specs, and safe tool use/function calling.
Mentor & uplevel: coach ICs on LLM best practices, reproducible training, critical paper reading, and building secure-by-default systems.

Qualifications

10+ years total engineering experience with 5+ years in ML/AI or large-scale distributed systems; 3+ years working directly with transformers/LLMs.
Proven track record shipping LLM-powered features in production and leading ambiguous, cross-functional initiatives at Staff level.
Deep hands-on skill with PyTorch, Hugging Face Transformers/PEFT/TRL, distributed training (DeepSpeed/FSDP), quantization-aware fine-tuning (LoRA/QLoRA), and constrained/grammar-guided decoding.
AWS expertise to design and defend secure enterprise deployments: Bedrock, SageMaker, S3, EC2/EKS/ECR, VPC/Subnets/Security Groups, IAM, KMS, PrivateLink, CloudWatch/CloudTrail, Step Functions, Batch, Secrets Manager.
Strong software engineering fundamentals: testing, CI/CD, observability, performance tuning; Python is a must (bonus for Go/Java/C++).
Demonstrated ability to set technical vision and influence across teams; excellent written and verbal communication for execs and engineers.

Seniority Level

Mid-Senior level

Employment Type

Full-time

Job Function

Engineering and Information Technology

Industries

IT Services and IT Consulting
