Services

We build AI workflows, backends, and infrastructure — and we're honest about what's production-ready and what isn't. Below is what we actually do, grouped by the kind of problem you're solving.

This page is for the founder or team trying to get their first AI workflow into production, or who needs solid software built without the markup. Two things are true at once: we do focused AI consulting, and we deliver affordable traditional software with AI-augmented engineering. We keep them distinct so you know exactly what you're buying.

AI & Agentic Engineering

Most "AI strategy" stops at the slide deck. We help organizations ship their first AI workflows to production — and keep them running once the demo is over.

Agent design and multi-agent orchestration. We design agents and coordinate multiple agents to do real work, built on LangChain and LangGraph. When you can build a proper state graph, you should — so we do.
Production-safe AI delivery. Tool-use patterns, evaluation, and behavior-steering to mitigate hallucination, improve performance, and reduce cost. The goal is an agent that stays on track under real operational load, not one that demos well once.
RAG pipelines. Getting LLMs to safely use data that isn't in the prompt: retrieval, grounding, and the plumbing that connects a model to your sources.
Output and truthfulness evaluation. Measuring whether the model is actually right, with evidence gathering rather than vibes.
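
To make the state-graph idea concrete, here is a minimal sketch in plain Python rather than LangGraph itself. The node names (`draft`, `check`), the `AgentState` fields, and the retry cap are all hypothetical, chosen only to show explicit states and a conditional edge; the `draft` node is a stand-in for a real model call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

END = "end"  # sentinel node name that stops the run


@dataclass
class AgentState:
    """Hypothetical shared state passed from node to node."""
    question: str
    answer: str = ""
    attempts: int = 0
    log: list = field(default_factory=list)


def draft(state: AgentState) -> str:
    # Stand-in for a model call: the first attempt is deliberately empty
    # so the check node has something to reject.
    state.attempts += 1
    state.answer = "" if state.attempts == 1 else f"Answer to: {state.question}"
    state.log.append("draft")
    return "check"


def check(state: AgentState) -> str:
    # Conditional edge: retry on an empty answer, but cap the retries.
    state.log.append("check")
    if state.answer or state.attempts >= 3:
        return END
    return "draft"


NODES: Dict[str, Callable[[AgentState], str]] = {"draft": draft, "check": check}


def run(state: AgentState, start: str = "draft", max_steps: int = 10) -> AgentState:
    """Walk the graph until END or until the step budget runs out."""
    node = start
    for _ in range(max_steps):
        if node == END:
            break
        node = NODES[node](state)
    return state
```

Calling `run(AgentState(question="ping"))` visits `draft`, `check`, `draft`, `check` and then stops. Because every transition is an explicit return value, the path an agent took can be logged and tested like any other code.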

We won't oversell AI as a cure-all. Sometimes the right answer isn't an LLM at all, and we'll tell you when that's the case.

Backend & Cloud

The unglamorous layer that decides whether anything ships. Our founder has built high-throughput software for Fortune 500 companies, and the same discipline applies whether you're under heavy production load or serving your first hundred users.

Python backend development. FastAPI, async, Pydantic, and typed clients. APIs that are fast to read, safe to change, and well tested.
AWS serverless architecture. Lambda, API Gateway, S3, and DynamoDB — pay-for-what-you-use systems that scale down to zero and back up as demand returns.
Infrastructure as Code. Terraform and Terragrunt with reusable modules and OIDC-based CI/CD. Your infrastructure should be inspectable, versioned, and repeatable — not a thing one person knows how to click together.
Enterprise and legacy systems. Java and Spring Boot when that's where your stack lives.

Affordable Traditional & Legacy Software

Not every problem needs AI. A lot of work is still ordinary, well-understood software — and that's where AI-augmented engineering makes the difference. By pairing experienced engineering with AI tooling, we build the boring, dependable stuff faster and cheaper, without cutting corners on quality.

Full-stack and static web delivery. Next.js and TypeScript front ends served on AWS, deployed by GitHub Actions through OIDC-based CI/CD.
Type-safe API client design. Validated, well-tested clients that fail loudly at build time instead of quietly in production. On one client project, that discipline meant 99% test coverage across 46 endpoints.
DevOps automation. GitHub Actions, Jenkins, and Docker — the pipelines that turn "it works on my machine" into "it ships on every push."
Maintenance and modernization. Keeping existing systems understandable and alive, because systems only stay clean when someone insists on it.
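
As one illustration of "fail loudly," here is a small sketch of a client that validates a response at the boundary. It uses only the standard library as a stand-in for a validation library such as Pydantic, and the `Invoice` model, its fields, and `ResponseSchemaError` are hypothetical names invented for this example. Note that this check runs when the response is parsed; static type checking is what moves some of these failures earlier, to build time.

```python
from dataclasses import dataclass
from typing import Any, Mapping


class ResponseSchemaError(ValueError):
    """Raised when a payload does not match the expected shape."""


@dataclass(frozen=True)
class Invoice:
    """Hypothetical response model for an imaginary invoices endpoint."""
    id: int
    customer: str
    total_cents: int


def parse_invoice(payload: Mapping[str, Any]) -> Invoice:
    # Check every field's presence and type up front, so a changed API
    # surfaces as an immediate error instead of a bad value downstream.
    expected = {"id": int, "customer": str, "total_cents": int}
    for name, typ in expected.items():
        if name not in payload:
            raise ResponseSchemaError(f"missing field: {name}")
        value = payload[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, typ):
            raise ResponseSchemaError(
                f"{name}: expected {typ.__name__}, got {type(value).__name__}"
            )
    # Unknown extra fields are ignored; only the expected ones are kept.
    return Invoice(**{name: payload[name] for name in expected})
```

A payload with `"id": "7"` raises `ResponseSchemaError` naming the `id` field, rather than letting a string travel through code that expects an integer.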

Work You Can Inspect

Our whole pitch is real, deployed systems — not slideware. So here are a few public projects you can actually look at:

LangGraph behavior-steering middleware — an original pattern for keeping long-running agents on track at runtime.
An LLM model router — directs queries to simple or advanced models via conditional routing, so you're not paying for a large model on a small question.
A multi-agent 3D-printing system — AI applied to a hardware domain, generating 3D-printable objects.
A multi-model truthfulness evaluator — checks LLM output with evidence gathering rather than vibes.
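
For a sense of what conditional routing means in practice, here is a rough sketch. It is not the code from the router project: the two model functions are placeholders, and the routing heuristic (word count plus a few keywords) is an assumption made up for illustration.

```python
from typing import Callable, Dict


# Hypothetical stand-ins for real model clients.
def small_model(query: str) -> str:
    return f"[small] {query}"


def large_model(query: str) -> str:
    return f"[large] {query}"


MODELS: Dict[str, Callable[[str], str]] = {
    "simple": small_model,
    "advanced": large_model,
}

# Assumed heuristic: phrases that suggest the query needs multi-step reasoning.
REASONING_HINTS = ("why", "compare", "explain", "step by step", "trade-off")


def classify(query: str, max_simple_words: int = 12) -> str:
    """Label a query 'simple' or 'advanced' using a cheap heuristic."""
    text = query.lower()
    if len(text.split()) > max_simple_words:
        return "advanced"
    if any(hint in text for hint in REASONING_HINTS):
        return "advanced"
    return "simple"


def route(query: str) -> str:
    """Send the query to whichever model its label maps to."""
    return MODELS[classify(query)](query)
```

With these placeholders, `route("What is the capital of Idaho?")` goes to the small model, while a query containing "compare" or "explain" goes to the large one. A real router would likely replace the keyword check with a learned or model-based classifier.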

We also publish engineering notes alongside these projects. See more of the work and the writing on the About page.

How We Work

Gem State Digital is a Boise, Idaho, shop led by founder John Sosoka, who has gone from software tester to principal engineer and tech lead and has consulted across legal software, banking, e-commerce, grocery, and construction.

Engineer-to-engineer honesty. We lead with the real constraint, then the elegant solution. We'll name the right tool for the job, not the flashiest one.
Work you can inspect. We publish engineering notes and hands-on projects, and we're explicit about what's production-ready and what's a demo.
Human-in-the-loop by default. We build semi-autonomous systems with people in the loop where it matters. No "set-and-forget" promises.

How an engagement starts

Engagements are scoped per project, and we keep the first step low-pressure: a short discovery conversation about what you're building, where it's stuck, and whether AI is even the right tool. From there we can frame the work and what it would take — no obligation, no bots, just a genuine conversation.

If you're trying to get your first AI workflow into production — or you just need solid software built without the markup — get in touch. If you'd rather look before you talk, start with the work.