Essay · 2026-05-18 · 12 min read

From factory ERP
to production AI

Notes from a senior tech operator: 9 years of engineering across 5 countries and 5 industries

Vyacheslav Chukhaldin

Kaliningrad · published 2026-05-18

This is not an article about "how I became an AI engineer in 6 months." This is a record of how nine years working with large industrial systems, moving through five countries and three industries — forge the skill that today gets called "senior tech operator." I'm writing this once, and I'll only rewrite it if the industry drifts seriously.

Baltic Shipyard taught me what university never did

I joined Baltic Shipyard in Saint Petersburg in October 2017 as a first-category engineer. On paper, it was "shipbuilding project planning in Primavera P6." In practice — multi-level network schedules for ships that take 5–7 years to build, specifications running into tens of thousands of material line items, and an ETL between Excel, Primavera and Infor LN that I had to write myself because "there's no out-of-the-box solution."

Four years in, I was promoted to lead software engineer — by that point I was already writing my own sessions in Infor LN Studio in BAAN 4gl, optimizing multi-tier SQL queries for multi-shop reports, and teaching the workshops how to load progress into Progress Reporter. In parallel — IBM Cognos reporting for the finance department.

What does "production" mean when you've got ten thousand materials on the shop floor and a single SQL typo costs you a day of downtime? I didn't learn it from a book. Production isn't "code that runs at the client." Production is a mode in which every line you write directly affects people who are standing at a workstation right now, waiting for you to fix it.

The lessons I carried away — and which I still apply to AI systems in 2026:

· Every error has an address. At a factory you can't hide behind a "merge conflict" — a specific welder is standing there waiting. Today I treat production-AI bugs the same way: not an "abstract race condition," but a specific user.
· Logging beats elegance. If something breaks overnight — you need a record you can use to reconstruct the chain. Since then I haven't shipped a single AI agent without structured logs.
· "It works" isn't a quality — it's a question of time. Every system degrades. The factory taught me to build in self-cleanup and regular checks, instead of naively trusting that things will stay stable.

At conferences I often hear "I have production AI" from someone who's launched a chat bot in Streamlit. That's not production. Production is when you're ready to take a call at 03:47 AM telling you your agent has gotten stuck yet again.

Enterprise on 3 continents — why the same Infor LN breaks differently

In November 2022 I moved to Turkey to work at IPL Consulting on Western European and Middle Eastern clients. A year later — Bulgaria, Asertiva Solutions, the EU market. Over those two and a half years I saw how the same ERP platform — Infor LN — gets customized for automotive (including Ferrari and Aston Martin), chemicals, pharma, nuclear energy (Kozloduy NPP, Rosatom) and shipbuilding (OSK + dozens of yards across multiple countries).

That sounds like a marketing bullet point. In practice it's hundreds of tickets from companies in different countries and industries, each of which required understanding that "standard Infor LN functionality" is only a skeleton. The client's real production logic always lives in custom 4GL sessions (BAAN), in VB.NET overrides inside Mongoose, in three-tier Birst reports where half the DAX formulas exist "because that's what we agreed on back in 2007."

Same platform. Same tables. And — completely different failure modes. At Ferrari a single ticket would turn into half a ton of custom code because the client had very specific requirements for component traceability. At Kozloduy the same field was being used for regulatory reporting under analogues of Russia's 187-FZ critical-infrastructure law. One bug in one formula — and you either get an incomplete parts shipment to Maranello, or a query from the regulator.

That's the main enterprise lesson: production systems don't "work." They survive through constant rework against reality. They evolve through tiny corrections, because reality changes faster than your spec. If your ERP "just works" — it means you don't have a team maintaining it, and in 18–24 months you'll be doing a painful migration or replacing the system entirely.

When a CTO tells me today "we shipped an AI agent, it works" — I don't believe it. "Works" is a static state. Production AI is a continuous process in which somebody is looking at logs, evals and cost metrics every single day. Without that, in six months you've got something completely different in prod from what you launched in March.

Second enterprise lesson: regulation isn't paperwork — it's an engineering problem. At a nuclear plant you have very specific traceability requirements for every change. In Russia in 2026 — FZ-152 on personal data and 187-FZ on critical infrastructure — these aren't things "the compliance team will write up." That's your code, your architecture, your logs. If you didn't build it in from day one, you have a problem.

Moving into the Apple ecosystem, and why I built twenty iOS apps

In the summer of 2021, while I was still a lead engineer at the shipyard, I signed up for the App Brewery iOS Bootcamp and watched the WWDC21 keynotes in parallel. I wanted to learn how to build products, not just customize other people's. Your own product is a different mode of working: you're responsible for the UX, the back end, and making sure it doesn't hurt the user to pay you.

Over a year I wrote more than twenty iOS apps in Swift / SwiftUI / UIKit. Not all of them shipped to the App Store — most were learning projects — but each one had a concrete problem behind it: a CoreML integration, ARKit for AR scenes, StoreKit for in-app purchases, Catalyst for a desktop variant. CoreData, Firebase, push notifications through APNs.

What did this track give me beyond Swift itself? The understanding that real product-grade knowledge can't be bought. You can sit through 40 hours of a course and you'll know what @StateObject is. But what to do with StoreKit error handling when the user's AppleID has expired and the receipt validation has been stuck for 30 seconds — no lecture will teach you that. You only learn it when your app starts losing real money on real users.

That's the same logic as with AI agents today. Courses will tell you what RAG is. But what to do when your vector database returns a "neutral" chunk in response to a critical user query — nobody teaches you that. Only your own production teaches you that.

$1.34M in performance marketing — how media buying changed how I think about architecture

In November 2024 I left Asertiva and joined Syndicate Group in Moscow as a Middle Media-buyer. Then Senior Media-buyer at SweepStakes from September 2025. Over sixteen months, $1.34M in ad budget passed through my hands across the largest social-media platforms — about $379k of which was net profit. ROI ranged from 25% to 37% depending on vertical and period.

The numbers are nice, but they're not the point. Media buying rewired me architecturally. When you're running an A/B test with a $2–5k daily budget, every extra second of landing page load is minus 2% conversion rate and real money you're burning. Not "theoretically." Right now. In front of you. On the Ads Manager dashboard.

I started looking at performance like a media buyer, not like a backend engineer. Big difference:

· An engineer thinks: "I'm optimizing because it's the right thing to do."
· A media buyer thinks: "I'm optimizing because 200ms = $400 in lost margin per day."

I started writing tooling: bulk uploads through the FB Graph API, creative parsers, GPT classifiers for spy data, integrations with Keitaro via CAPI. None of this got written "because it was cool" — but because uploading 300 creatives by hand in season costs you two weeks of budget.

And that's where the switch flipped. I realized the best software engineers I've ever seen come out of places where cash burn is visible daily. Not from FAANG, where you have "sprint planning." From places where if you don't deliver today, there's nothing to pay with tomorrow. Trading rooms, media buying, e-com operations, small SaaS. These people are a different breed.

I apply this to AI now. When I build a system for a client, I'm not thinking "how do I make this elegant" — I'm thinking "how many dollars per month is this thing going to eat in OpenAI at current traffic" and "what's the ROI." If ROI < 3×, I don't ship — the client's margins are thinner than my nutra campaign's.

My own production fleet of AI systems

Since 2025 I've been running my own production fleet. These aren't "pet projects" in a repo — they're real systems running 24/7 that spend and earn money. If they go down, I find out via Telegram within 60 seconds, because I have regression monitors and an automatic halt mechanism.

What's in it right now:

· Trading infrastructure on Base CTF and Polygon CLOB — my own bot fleet on Limitless / Polymarket / Bybit trading crypto 24/7. With observability, halt mechanisms and a regular postmortem process.
· AI Marketing-Ops Pipeline across 327+ FB campaigns — my own @Analyticsfbbot, which long-polls CAPI metrics and sends Telegram alerts on deviations.
· Landing-page generator with Keitaro API and Cloudflare integration — a Tkinter GUI on top of AI generation that lets you spin up a landing page with geo-specific content swaps in 10 minutes.
· OSINT toolkit built around a 12-step methodology for vetting counterparties in performance marketing. Self-healing browser automation through an AdsPower dual-instance setup.
· Cross-project memory + decision log — a SQLite database with every Edit/Write/Bash from my Claude Code sessions, plus a markdown decision journal with reminders via a Discord webhook.

It runs on five VPS servers in Finland, Switzerland, Austria, Germany and the Netherlands, connected by an AmneziaWG mesh, with alerts to Telegram and Discord, and a regular VACUUM on the SQLite databases every quarter.

The main lesson I've taken from this: I don't "teach" AI. I run it. And between those two verbs is a gigantic chasm that most people haven't crossed. Running it means knowing what to do when OpenAI rate-limits you at midnight, what to do when your evals fail a new model version, what to do when a regulator in Moscow asks where you store PII.

That's what the "senior tech operator" stance actually is: I'm not a consultant with PowerPoint, I'm the person with hands on the wheel.

What I learned about AI after 18 months of my own production

This is the most useful chapter in the essay. If you're a CTO, a founder, or just someone about to spend a lot of money on AI in 2026 — read this one. I'm going to say uncomfortable things that rarely get said at conferences.

1. AI agents aren't autonomous. They're a supporting force inside a correctly built system. Every "autonomous agent" I've seen in production in 2025–2026 is 80% deterministic code (validation, retry, fallback, observability) and 20% LLM calls. If yours is the other way around — you don't have an agent, you have a demo.

2. 95% of "agentic AI" in 2026 is rebranded RPA or a chat bot. I'm not saying "all AI is a fraud" — that's silly. I'm saying that the marketing category "agentic AI" in 2026 has been blurred down to "anything that uses an LLM." Gartner, Apr 2026: agentic AI on the Peak of Inflated Expectations, 40%+ of projects to be cancelled by 2027. Don't trust the slides — look at eval metrics and cost structure.

3. What REALLY matters for production AI:

a. Evals. Not a "vibe check," but reproducible tests on a golden dataset. If you don't have evals, you're flying blind.
b. Observability. Every LLM call should be in your logs with input/output/cost/latency. Without that, you can't figure out why the agent did something stupid today.
c. Cost guardrails. Token-bomb attacks are real. Without a spend cap, your worst-case scenario is $50k overnight.
d. FZ-152 / critical-infrastructure compliance. If you're a Russian legal entity, look at the 2026 fines: up to 20M ₽ + 3% of revenue for a personal-data leak. From January 1, 2026, the 187-FZ critical-infrastructure law bans foreign software in banks, oil and gas, telecom, and the public sector. Cursor / Claude / OpenAI = a formal violation.

4. OWASP LLM Top 10 — required reading, 2025 edition. If you have AI in production and haven't read the latest OWASP LLM edition — drop everything and read it. Indirect prompt injection (LLM01), system prompt leakage (LLM07), excessive agency (LLM06), token bombing (LLM10) — this isn't academic angst, these are real 2025–2026 attacks. CVE-2025-32711 EchoLeak — CVSS 9.3. CVE-2025-53773 GitHub Copilot RCE — CVSS 9.6. This is living reality.

5. Prompt injection shows up in the most unexpected places. I've personally seen it in RSS feeds my RAG was indexing, in email signatures that ended up in context, in comments on pull requests that Copilot was looking at. If your agent reads any user-supplied data — you have an attack surface.

If you currently have an AI agent in production and you don't know the answers to those five points — that's normal. That's how most teams look in 2026. But it means you need an outside fifth eye to come in and show you where the holes are. That's what I do, and it's one of the most in-demand verticals in my practice.

What I now offer clients

After everything above — what do I actually sell as a service? I've split my practice into four categories:

· Customer communication: AI assistants in chat (WhatsApp / Telegram / web), voice automation through Vapi / ElevenLabs, AI helpers on top of the company knowledge base.
· Process automation: n8n / Make pipelines, document processing, marketing-ops automation for FB / Keitaro, "whatever you've got" integrations with CRM/ERP.
· AI infra: standing up and maintaining your own LLM stacks (vLLM / Ollama / on-prem GigaChat for critical-infrastructure requirements), evals, observability, cost guardrails. Defensive AI security per the OWASP LLM Top 10.
· Development: backend in Python / Node.js, mobile (iOS native + cross-platform), frontend in React / Astro, integrations with any API.

Who I'm the best fit for: B2B SaaS founders with 20–100 people, performance agencies spending $500K+/mo, mid-market e-com running AI customer support, lawyers and medical clinics under FZ-152. Who I'll pass on: SMBs with revenue below 1M ₽/mo — you don't have the budget for my practice, and I don't want to waste your time.

If you've read this far, you're interested. Start with the 5,000 ₽ AI audit: in one working day you'll get a clear picture of what to roll out first, what the payback looks like, and whether your business actually needs AI at all (sometimes — no, and I'll tell you that honestly).

Book the 5,000 ₽ AI audit

Thanks for reading to the end. If this essay was useful — share it with that one CTO or founder who currently has a "weird" AI agent in production and doesn't yet know what to do with it.

— Vyacheslav Chukhaldin, Kaliningrad, 2026-05-18