A Lead Software Engineer on why cloud migration in regulated industries keeps failing, and what it takes to move financial systems without breaking them.
Organizations around the world are accelerating the shift of critical workloads into the cloud. Gartner forecasts that spending on sovereign cloud infrastructure alone will reach $80 billion this year. Yet as these migrations expand, legacy infrastructure remains a major technical risk. Many enterprises still rely on outdated systems, and modernization projects continue to stall because of technical debt, integration complexity, and operational constraints.

To better understand why these migrations are so difficult, TechBullion spoke with Sergei Kuznetsov, a Lead Software Engineer with more than 15 years of experience in enterprise systems. Kuznetsov is part of the EPAM Systems team developing a distributed trading platform used by central banks, government pension funds, and institutional investors worldwide. In this role, he led the migration of a sensitive trading data service from a legacy standalone server to a modern cloud-based architecture, eliminating a critical single point of failure and improving the platform’s reliability, scalability, and security.
In our conversation, Kuznetsov explains why high-stakes cloud migrations often fail, what separates a safe transition from a risky one, and how engineers can move mission-critical systems to the cloud without introducing new vulnerabilities.
Sergei, the cloud migration market is booming, yet legacy infrastructure still poses a major technical risk. In your experience, what goes wrong when teams try to migrate systems that handle sensitive financial data?
Most teams design migration frameworks for standard enterprise workloads. You assess dependencies, you containerize, you move. But financial systems operate under a completely different set of constraints. On the trading platform I work on at EPAM, the data service we migrated supported institutional clients’ real-time decision-making. A standalone SharePoint server acted as a single point of failure. Downtime on a trading platform is not a bug report – it is a financial event. So the migration to Azure Cosmos DB had to improve reliability, scalability, and security all at once, without disrupting the users who depend on it. Teams fail when they treat this like a standard lift-and-shift. They underestimate the number of edge cases related to data consistency in high-frequency environments. Anticipating those requires understanding both the infrastructure and the business logic it supports.
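The interview does not go into implementation details, but one common way to catch the data-consistency edge cases Kuznetsov describes is a dual-write phase: every record is written to both the legacy store and the new cloud store, and the two copies are fingerprinted and compared before cutover. The sketch below is a minimal illustration under that assumption; the store interfaces and record shapes are hypothetical, not the actual platform's APIs.

```python
import hashlib
import json


class DualWriteMigrator:
    """Write each record to both the legacy store and the new cloud store,
    then compare content hashes so divergence is caught during migration
    rather than after cutover. Stores are modeled as plain dicts here."""

    def __init__(self, legacy_store, cloud_store):
        self.legacy_store = legacy_store
        self.cloud_store = cloud_store
        self.mismatches = []  # keys whose copies disagree

    @staticmethod
    def _fingerprint(record):
        # Canonical JSON (sorted keys) so field order cannot cause
        # false mismatches between the two stores.
        blob = json.dumps(record, sort_keys=True).encode()
        return hashlib.sha256(blob).hexdigest()

    def write(self, key, record):
        # Dual-write phase: both stores receive every update.
        self.legacy_store[key] = record
        self.cloud_store[key] = record

    def verify(self, key):
        # Consistency check: a record must exist in both stores
        # and hash identically; anything else is recorded.
        old = self.legacy_store.get(key)
        new = self.cloud_store.get(key)
        ok = old is not None and new is not None and \
            self._fingerprint(old) == self._fingerprint(new)
        if not ok:
            self.mismatches.append(key)
        return ok
```

A mismatch list that stays empty across a full verification sweep is one concrete signal that the cutover is safe; any entry in it is an edge case to investigate before switching traffic.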
You mentioned edge cases and business logic. Can you give a concrete example of how that complexity plays out during an actual migration?
Take the data access layer. In a standard app, you redesign the API, run some integration tests, and deploy. On our platform, the users who rely on this data are making important financial decisions based on what the service returns. If the new service behaves even slightly differently, it can affect downstream processes. We had to map every consumer of the old service and understand what each one expected before we could switch anything over. I proposed the architecture, led the implementation with my team of six, and we took our time to thoroughly validate the transition. That kind of patience is not glamorous, but it is the difference between a smooth migration and a very expensive incident.
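Mapping every consumer and verifying that the new service behaves identically is often done with shadow comparison: replaying each consumer's representative requests against both services and diffing the responses before anything is switched over. The sketch below illustrates that idea under stated assumptions; the service callables and consumer names are invented for the example and are not the platform's real interfaces.

```python
class ShadowComparer:
    """Replay each mapped consumer's representative requests against both
    the old and the new service, recording every behavioral difference
    before cutover. Services are modeled as plain callables here."""

    def __init__(self, old_service, new_service):
        self.old_service = old_service
        self.new_service = new_service

    def compare(self, consumer_requests):
        # consumer_requests: {consumer_name: [request, ...]}
        # Returns {consumer_name: [(request, old_resp, new_resp), ...]}
        # listing only the requests where the two services disagreed.
        report = {}
        for consumer, requests in consumer_requests.items():
            diffs = []
            for req in requests:
                old_resp = self.old_service(req)
                new_resp = self.new_service(req)
                if old_resp != new_resp:
                    diffs.append((req, old_resp, new_resp))
            report[consumer] = diffs
        return report
```

An empty diff list for every mapped consumer is the evidence that the behavior contract holds; a single non-empty entry is exactly the "slightly different behavior" that would otherwise surface as a downstream incident.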
Migration failures often come down to a lack of domain knowledge. You have migrated and rebuilt systems across medtech, edtech, and foodtech before moving into finance. How does that cross-industry experience change the way you approach a cloud migration?
More than people expect. At STO Solutions, I optimized a preloading algorithm for a library of 3.5 million medical records used in NLP-based diagnostic coding. Cutting load time by 65% there taught me how to think about performance in systems where speed directly affects human outcomes. Over at Imito AG, I joined a clinical wound-assessment app with years of accumulated technical debt, reduced the bug backlog by 70% in eight months, and improved client-side performance by 30%. Medical software and financial software share a core principle: bugs are not inconveniences; they are risks. A doctor relying on slow data at a patient’s bedside and a trader relying on stale data during market hours face structurally similar problems. Working across domains trains you to see those patterns. An engineer who has only ever worked in one vertical tends to optimize for that vertical’s conventions. Someone who has built products from scratch in four industries starts thinking in terms of failure modes and resilience, not just features.
Cloud migration and architecture migration often occur simultaneously. At Eurekly, you moved a growing platform from a monolith to microservices with a five-person team. When a company is doing both at once, what breaks first – the technology or the team?
Both, but the people side surprised me more. The technology part is methodical: identify modules with clear boundaries, start with the most heavily loaded ones, run parallel systems, and route traffic gradually. We picked the modules with the most independent data flows first, because those carried the lowest risk of cascading failures. But what caught us off guard was how much the team’s daily communication had to change. In a monolith, one deployment covers everything. With microservices, you need to coordinate releases, agree on API contracts, and handle failure in one service without bringing down another. If you do not retrain the team’s habits alongside the architecture, the migration technically succeeds but operationally fails.
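The "route traffic gradually" step is typically implemented as a weighted router: a configurable fraction of requests goes to the extracted microservice while the remainder still hits the monolith, and the fraction is raised in small steps as confidence grows. Here is a minimal sketch of that pattern; the handler names are illustrative, not taken from the Eurekly codebase.

```python
import random


class WeightedRouter:
    """Send a configurable share of traffic to the new microservice while
    the rest still reaches the monolith, so the share can be raised
    gradually during a parallel-run migration."""

    def __init__(self, monolith_handler, service_handler,
                 new_share=0.0, rng=None):
        self.monolith_handler = monolith_handler
        self.service_handler = service_handler
        self.new_share = new_share          # fraction in [0.0, 1.0]
        self.rng = rng or random.Random()   # injectable for testing

    def set_share(self, share):
        # Raise in small steps (e.g. 0.05 -> 0.25 -> 1.0) as monitoring
        # confirms the new service matches the monolith's behavior.
        self.new_share = max(0.0, min(1.0, share))

    def handle(self, request):
        if self.rng.random() < self.new_share:
            return self.service_handler(request)
        return self.monolith_handler(request)
```

Because the share starts at zero, the new service can be deployed and observed in production before it receives a single real request, and rolling back is just setting the share back down.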
At EPAM Systems, one of the world’s leading digital engineering companies, you led the migration of a critical trading data service to a cloud-based architecture. What were the hardest engineering decisions you faced in making that transition without compromising reliability?
The hardest decisions centered on preserving system stability while introducing a new architecture. The trading platform supports institutional investors, central banks, and pension funds, so reliability and data consistency were critical. I proposed the architecture for migrating the service from a standalone SharePoint server to Azure Cosmos DB and led a team of six engineers through the implementation. A key part of the process involved carefully mapping every system that consumed the data service and validating how each one would behave after the transition. We introduced staged testing and additional validation layers to ensure the new cloud service maintained the same data accuracy and response patterns expected by downstream systems. This approach allowed us to remove a single point of failure while improving scalability and long-term platform resilience.
Sergei, you are rightly considered a technical leader in high-stakes system architecture. Given your experience, what would you say is the single most common mistake organizations make when planning a cloud migration for regulated systems?
Treating migration as a technology project instead of an engineering discipline. The tools and platforms are mature enough that technology is rarely the bottleneck. What trips teams up is the lack of engineers who understand the domain, the data flows, and the operational consequences of their changes. You can buy cloud infrastructure in minutes, but you cannot shortcut the expertise required to move a system that central banks rely on. That is exactly why I am developing a course focused on the transition from senior individual contributor to technical leadership, because that gap in expertise is where most migration failures begin.