The Growth Story of a Cloud Migration Product
About the Author
Ray Sun is the CTO of OneProCloud (a joint venture of GDS Holdings, NASDAQ: GDS), an Alibaba Cloud Solutions MVP, co-founder of the Ceph China Community, and an AWS Certified DevOps Professional. He has previously worked at notable domestic and international companies including Yiyang Tongxin, Motorola, and Shunlian Software. He started his entrepreneurial journey in 2013, working in private cloud R&D. In 2016, he led his team in developing the cloud-native migration product HyperMotion, which has been widely adopted in projects for Jiangsu Rural Credit Cooperatives, State Grid Corporation, Haitong Securities, and many others. In 2018, he successfully organized the first-ever global Ceph Summit and helped multiple well-known Chinese enterprises join the Ceph Foundation under the Linux Foundation.
About OneProCloud
OneProCloud Information Technology (Shanghai) Co., Ltd., established in Shanghai, is a leading domestic provider of cloud technology and digital architecture services. OneProCloud is dedicated to providing enterprises with neutral and professional cloud consulting, cloud products, and cloud services, with the mission of becoming a trustworthy cloud service provider for enterprise IT operations and digital development. Guided by the philosophy of product-driven services and technology-enhanced business value, the company continuously delivers a rich portfolio of cloud products, solutions, and professional consulting services, and collaborates with its ecosystem to help enterprises accelerate their growth in the digital era.
OneProCloud’s core R&D team was formed in May 2013. From 2013 to 2016, the team focused on developing OpenStack-based private cloud products. After 2016, the team pivoted entirely to developing a niche cloud product: cloud migration. In 2017, they completed the private cloud platform construction and business system cloud migration for a rural commercial bank in Shuyang, a project that won a Category 4 Technology Achievement Award from the China Banking Regulatory Commission (CBRC) and a Second Prize at the 2nd Outstanding Cloud Computing Open Source Case Awards. In 2018, they completed the construction of a dedicated cloud platform for Jiangsu Rural Credit Cooperatives, and simultaneously used their cloud migration product to migrate more than 1,200 business systems to the cloud in bulk — a project that earned a Category 2 Technology Achievement Award from the CBRC and a Second Prize at the 3rd Outstanding Cloud Computing Open Source Case Awards. In the same year, they completed the batch migration of nearly 20,000 VMware virtual machines across 27 provinces for State Grid Corporation. In 2019, they integrated Haitong Securities’ cloud management platform with their cloud migration product — the first domestic project to integrate a self-service migration service into a cloud management platform. In 2020, they completed the batch migration of VMware virtual machines for Qianhai Equity Exchange to Alibaba Cloud.
How It All Started with Cloud Migration
Starting in 2011, I was engaged in R&D work on OpenStack applications for enterprise private clouds. From 2011 through 2018, the open-source community was at its most active, with companies pouring their primary energy into optimizing various OpenStack modules. At the time, the services offered when building a private cloud platform were comprehensive: from system integration and installation to ongoing operations, maintenance, and custom development — essentially a full-stack solution. Sometimes, even when business systems running on top of the cloud platform ran into problems, customers would still turn to us for help. This was an enormous challenge for any OpenStack company that was still in its early stages.
In 2016, we began pursuing a private cloud project for a rural commercial bank, and after extensive preliminary validation we secured it at the end of that year. Besides building the cloud platform itself, the project carried a second requirement that served as an acceptance criterion: the customer’s existing business systems, which ran on various physical machines, had to be migrated smoothly to the new cloud platform with no disruption to current operations. Afterward, the old hardware would be upgraded as needed and re-added to the new cloud platform.
Looking back at that cloud platform build, the architecture itself was not complex — a classic OpenStack setup using hardware storage plus VLAN. In the actual project implementation, from hardware delivery and rack installation to cloud platform deployment, the entire process took roughly three weeks. However, due to the customer’s requirements for live migration and resource reclamation, the project ultimately took a full six months to complete. Because the customer’s location was not on a high-speed rail line, our engineers traveling from Beijing either had to take an overnight slow train or take the high-speed rail to Xuzhou and then transfer to a long-distance bus. Either way, the journey took at least eight hours. From solution validation to final implementation, the entire team made more than 50 business trips in total, resulting in extremely high implementation costs. When we tried to conduct a post-mortem on the entire process, we found that the most time-consuming part was resolving the various issues that arose during the migration itself.
Moving Forward Through Setbacks
The customer’s business systems were a classic example of aging legacy systems: running on physical machines with hardware storage arrays, with a small number of virtualized environments. The operating systems were diverse — the most common being SUSE 11, along with Windows 2003, CentOS, and others. Databases included DB2, Oracle, and a small amount of MySQL.
Because this was a banking system, there was an extremely strong demand for business continuity, and the following requirements were placed on us for the migration:
First, risk control. In any industry, stability and reliability are the unquestioned top priorities, and this is even more so in the financial industry, which is tied to people’s livelihoods. In practice, moving existing business systems to the cloud during a cloud platform build typically faces the greatest resistance. The root cause is the absence of a complete, scientific methodology and toolset to put customers’ minds at ease about going to the cloud. Therefore, during the migration to the cloud, the system had to be verifiable and rollback-capable. Before the official cutover to the cloud platform, the business systems needed to be thoroughly validated on the cloud platform; and if something went wrong after the cutover, it had to be possible to immediately roll back to the original systems and continue providing services. The goal was to minimize risk throughout the cloud migration process.
Second, ensuring business continuity. Rural commercial banks differ from the four major state-owned banks and from city commercial banks: they often have significant autonomy over their IT infrastructure, and apart from the core transaction systems, all other business systems run locally, which places high demands on local operations and maintenance. During the migration, the continuity of these local systems was critical; if they were interrupted, the bank simply could not open for business. Furthermore, under the relevant CBRC regulations, an incident that prevents normal business operations for 30 minutes or more during business service hours counts as a major operational interruption event. The cutover window was therefore essentially limited to nighttime hours, yet at night the bank also ran data delivery and batch processing jobs, which left very little time for migration. Only an approach that came close to live migration in effect could meet the customer’s needs.
Third, minimizing human intervention to ensure migration reliability. Many systems had been developed by third-party vendors; some applications were old, and in some cases the vendors no longer existed. It was therefore critical to minimize reliance on application vendors during migration, since actions such as reinstalling or reconfiguring could render applications inoperable. In addition, the migration process involved many complex steps, and heavy manual operation was very prone to error.
During this process, we went down many wrong paths. For example, we started with Clonezilla using a cold migration approach, which took 24 hours to migrate a single host. We also investigated various open-source P2V and V2V tools, none of which were suitable. To solve a UEFI boot issue, we modified the Nova code, only to find that a server sat at a black screen for half an hour during startup — and because of that one system, we made five trips back and forth between Beijing and the customer. All of these difficulties forced us to stop and think: why was what seemed like a simple migration ultimately the key factor affecting the project schedule and cost?
Born from Projects, Grown through Projects
To solve the problems we encountered in the field, we tried every approach available and ultimately found that block-level incremental replication, a technique from the disaster recovery domain, combined with a cloud-native approach was the best fit. Block-level incremental replication ensured business continuity, because only changed blocks had to be synchronized while the source systems kept running. Making maximum use of cloud-native APIs and resources gave us the “shortest distance between two points”: it ensured migration reliability, greatly reduced the uncertainty introduced by human intervention, and ultimately met the overarching goal of controllable risk.
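To make the idea of block-level incremental replication concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how HyperMotion works: the fixed block size, the hash comparison, and the function names `block_hashes`, `changed_blocks`, and `apply_blocks` are all hypothetical, and the "disk" is just an in-memory byte buffer of fixed size. Real replication tools commonly track changed blocks as writes happen rather than rehashing the whole disk.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size in bytes


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return one SHA-256 digest per fixed-size block of the source."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(source: bytes, synced_hashes: list[str],
                   block_size: int = BLOCK_SIZE) -> dict[int, bytes]:
    """Return {block index: block data} for blocks that differ from the last sync.

    An empty synced_hashes list means nothing has been copied yet,
    so every block is returned (the initial full copy).
    """
    changes = {}
    for index, digest in enumerate(block_hashes(source, block_size)):
        if index >= len(synced_hashes) or digest != synced_hashes[index]:
            start = index * block_size
            changes[index] = source[start:start + block_size]
    return changes


def apply_blocks(target: bytearray, changes: dict[int, bytes],
                 block_size: int = BLOCK_SIZE) -> None:
    """Write the changed blocks into the replica (assumed same size as the source)."""
    for index, block in changes.items():
        start = index * block_size
        target[start:start + len(block)] = block
```

In this sketch the first call with an empty hash list performs the full copy, and each later call transfers only the blocks written since the previous sync, which is why the source can stay in service until the final short cutover.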
After nearly two years of refinement through 2016 and 2017, a live migration product targeting OpenStack took initial shape. Then 2018 brought another major test: the large-scale migration to the Jiangsu Rural Credit Cooperatives’ dedicated cloud platform, in which we had to move the business systems of all 62 second-tier legal entities across the province to the cloud. The excitement of winning the bid was quickly swallowed up by new challenges. In our previous projects, all migration work had taken place inside local data centers, where every network connection ran at gigabit speed at minimum. In this project, the links between the provincial hub and each second-tier legal entity were dedicated lines of 10 Mbps in the best case, and some were as low as 2 Mbps. These lines were used primarily for data distribution from the provincial hub, so migration traffic could flow only during specific time windows and could not consume the full bandwidth without affecting operations. Yet each second-tier entity held approximately 30 TB to 50 TB of data, and relying on the network alone would theoretically have taken more than a year. Purely network-based transfer was out of the question. We needed to combine hardware with the network: auxiliary hardware would hold the full data set and be physically transported to the provincial hub, and once the full data had been loaded onto the cloud, the network would carry only the incremental data. This approach still achieved the effect of live migration, while migration speed improved dramatically.
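The "over a year" estimate can be checked with simple arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second) and ignores protocol overhead; the function name `transfer_days` and the `utilization` parameter, which stands for the fraction of the line actually available to migration traffic, are hypothetical and not taken from the project.

```python
def transfer_days(data_tb: float, link_mbps: float, utilization: float = 1.0) -> float:
    """Days needed to push data_tb terabytes over a link_mbps line.

    utilization is the share of the line's capacity, averaged over the
    day, that migration traffic is allowed to use (1.0 = all of it).
    """
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6 * utilization)
    return seconds / 86400


if __name__ == "__main__":
    for data_tb in (30, 50):
        for link_mbps in (10, 2):
            days = transfer_days(data_tb, link_mbps)
            print(f"{data_tb} TB over {link_mbps} Mbps: {days:.0f} days")
```

Even with the full 10 Mbps available around the clock, 30 TB needs roughly 278 days and 50 TB roughly 463 days. Once transfers are restricted to limited windows and a share of the bandwidth, or the line is only 2 Mbps, the figure quickly passes a year for a single entity.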
After resolving the large-scale data transfer challenge, we immediately faced the next problem: what to migrate first, and what to migrate later? Application systems have dependencies, so the topology of the application systems had to be mapped out before migration, and post-migration changes to network configurations and application settings had to be analyzed in advance to ensure everything went smoothly. This process is essentially what is referred to in many migration methodologies as the investigation and analysis phase. Through this process, we accumulated our own migration investigation methods and implementation plans, which proved highly valuable for our subsequent projects. We also came to realize that migration is absolutely not a problem that can be solved by a single tool alone — it is a heavily consulting-intensive process, and migration tools only address the final mile.
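One way to turn a dependency map into a migration order is a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9+). The rule that a system's dependencies move before the system itself is an assumption made for illustration, as are the function name `migration_waves` and the system names in the usage note; real ordering also depends on the consulting findings described above.

```python
from graphlib import TopologicalSorter


def migration_waves(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group systems into waves; each wave depends only on earlier waves.

    dependencies maps a system to the set of systems it depends on.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # validates the graph and detects cycles
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # systems whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

For example, with a hypothetical map in which `web-portal` depends on `app-server`, and both `app-server` and `reporting` depend on `database`, the function returns three waves: the database first, then the application server and reporting system together, then the portal. Systems in the same wave have no dependencies on each other, so they are candidates for batch migration.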
Starting in early 2018, we formed a business expert group with the Jiangsu Rural Credit Cooperatives’ team and went deep into each prefecture and city, rigorously following a scientific cloud migration process: investigation, review, implementation, and cutover. The work began with collecting and organizing basic system information, continued through analysis of each business system’s upstream and downstream dependencies, topology mapping, and a comprehensive security assessment, and then used the findings to prepare implementation plans and schedules, so that every post-migration change was documented in advance and the migration could proceed without a hitch. Auxiliary physical devices were used to copy the full data set and were transported to the provincial hub for the switchover to the cloud; incremental data synchronization and business cutover were then completed at the appropriate time. In the second half of 2018, an average of three rural commercial banks per week were fully migrated to the cloud.
This project was a tremendous proving ground for our product, which withstood the test of large-scale migration. Through the dedicated cloud platform construction and business system migration, the solution saved Jiangsu Rural Credit Cooperatives 560 million yuan in IT investment over three years. As of September 30, 2018, a total of 54 second-tier legal entities with over 1,200 systems had been migrated. Meanwhile, the cloud platform grew from its initial 15 nodes to more than 130 nodes, and storage grew from 0.2 PB to 3 PB.
From One Cloud to Many Clouds
By 2019, the cloud-native philosophy embedded in our product was gaining increasing recognition from customers, and this high degree of automation built on cloud-native foundations was precisely filling the gap in the cloud migration market. Some established disaster recovery vendors even began treating us as migration competitors, attacking us in advertorials — which only proved the enormous value our product represented.
But supporting migration to only a single cloud could no longer meet growing market demand, so in the first half of 2019 we set out to support more public and private cloud platforms. We started with China’s largest public cloud provider, Alibaba Cloud. Over the past decade, Alibaba Cloud had become the benchmark of the Chinese cloud computing industry, with an extremely high market share and the broadest API support, which gives partners the most room to build on. Alibaba Cloud and OpenStack differ in some of their mechanisms, so it took nearly three months of research and development before we achieved live migration to Alibaba Cloud. From there, we kept expanding our platform coverage, and in roughly four more months we covered the vast majority of domestic public cloud, dedicated cloud, and private cloud platforms, truly becoming a multi-cloud migration solution.
Building an Outstanding User Experience
The first impression many enterprise products give is that they are professional and complex — you need two days of training before you can even figure out how to use them. The cloud migration space is no different: many cloud migration products are simply light modifications of traditional disaster recovery software by incumbent vendors, with complex interfaces and extremely cumbersome operations. Migrating a single host could easily require 15 to 25 steps as a baseline. So when we iterated on our product, we wanted to build a B2B product with a B2C mindset.
At the initial stage, users only need to configure source and destination information following a wizard, and then they can enter the migration workflow. We distilled the migration process into three simple steps: select hosts, sync data, and start migration. Through a highly automated workflow and clever use of cloud-native APIs and resources, even a junior Linux engineer can get fully up to speed in just a few minutes. And because of the high degree of automation, the advantages during bulk migration are especially pronounced.
Because we had always been working in private cloud product development, there was a kind of inertia in our R&D team’s approach to products. To meet private deployment requirements, we typically needed to package installation media into an ISO format with no network dependencies. The direct consequence was that when users tried out our product, they often had to spend a long time downloading the installation media, then installing it, and only then trying it out. That back-and-forth process could easily waste an entire day. This was especially frustrating in the context of public cloud migration, and so in the second half of 2019 we decided to turn our product into a SaaS offering, so users could experience the product more quickly rather than wasting time on installation. Due to constraints on human resources, both the R&D team and the operations team faced enormous challenges. The R&D team had to develop new modules to support operational requirements, multi-tenancy, and other SaaS needs, while also rearchitecting the original communication model to avoid bidirectional communications. The implementation team had to balance private projects with online operations, which demanded a stable, highly reliable, and easily maintainable platform — making cloud-native adoption especially critical. We leveraged Alibaba Cloud’s Kubernetes container service and various cloud-native components to complete the SaaS transformation, and without adding a single headcount, we achieved the full SaaS go-live in early 2020.
Growing Together on the Shoulders of Giants
In early 2019, AWS acquired the Israeli disaster recovery startup CloudEndure for $250 million. Although the company was acquired under the banner of a disaster recovery firm, its primary business was providing migration services to AWS. Our product’s design philosophy and user experience were very similar to CloudEndure’s, while our product could support a wide range of different cloud vendors in China.
AWS’s acquisition of CloudEndure gave us tremendous confidence and reinforced our commitment to the path of cloud-native migration and disaster recovery products. We found that this market was essentially a blank slate domestically. Although traditional disaster recovery vendors could solve on-site project problems by throwing people at them, only a truly self-service migration platform could let users independently allocate their cloud workloads, accelerate consumption of cloud resources, and ultimately benefit cloud vendors.
And so a bold idea took shape: could we integrate our migration software as a cloud-native service within a public cloud platform? After several rounds of deliberation, we began engaging with Alibaba Cloud. I am deeply grateful to Dr. Chen Xu of Alibaba Cloud, who opened the door to collaboration with the Alibaba Cloud team for us. After we connected with Alibaba Cloud in 2019, the first test we faced came from the Alibaba Cloud ECS team. Once the product had been thoroughly tested, we met with Alibaba Cloud’s ecosystem partner team and investment department in Hangzhou, a meeting that fully opened the doors of collaboration between us and Alibaba Cloud.
At the end of 2019, I was awarded Alibaba Cloud Solutions MVP status, which further deepened our cooperation with Alibaba Cloud. In early 2020, the application tool marketplace in the Alibaba Cloud console caught my attention. This deeply integrated mode with Alibaba Cloud was an ideal home for cloud-native migration and disaster recovery. Through an introduction from the Alibaba Cloud MVP operations team, we successfully connected with the Alibaba Cloud application tool marketplace team, and at the end of February decided to list our product there.

The process of listing on the Alibaba Cloud application tool marketplace was far from smooth sailing. Alibaba Cloud has stringent security requirements, and the product must pass a rigorous review by Alibaba Cloud’s security department before going live. To this end, we made some architectural adjustments and security hardening measures. After nearly three months of effort, our platform officially went live on the evening of July 10, 2020. Once listed, the migration platform maintains a fully consistent user experience with Alibaba Cloud — users experience no friction whatsoever when using it.

Soon after, through the MVP operations team, we connected with the Alibaba Cloud Apsara Stack team and began integrating with the Apsara Stack dedicated cloud. By early August, we had achieved full support for automated migration to Apsara Stack.
Closing Thoughts
In April 2020, the Chinese government introduced the “New Infrastructure” development initiative with information infrastructure at the forefront, and cloud computing, as the foundation of new infrastructure, has never been more important. The epidemic at the start of 2020 made all of society aware of the importance of a “cloud-based society.” It is reasonable to predict that the era of full cloudification is approaching.
Through our comprehensive collaboration with Alibaba Cloud, our product has gained access to top-tier traffic channels, shortening the time it takes to earn customer trust. Looking ahead, we will also build our product into a cloud-native backup and disaster recovery offering, delivering a superior user experience to more cloud customers. We welcome all like-minded individuals to join our team, and we welcome customers with migration needs to join our migration discussion group (follow our WeChat official account and reply “support”).
