Alibaba Cloud Hybrid Cloud Disaster Recovery Service HDR
Hybrid Disaster Recovery (HDR) is a disaster recovery service that provides RPO and RTO as low as a few minutes for enterprise-level applications. It covers cross-zone and cross-region disaster recovery for applications on Alibaba Cloud as well as disaster recovery to the cloud for on-premises applications, and it effectively ensures data security and business continuity. You do not need to build your own disaster recovery center: cloud resources are managed automatically and controlled through a centralized console.
Edited at 2024-01-13 20:50:39
Alibaba Cloud Hybrid Cloud Disaster Recovery Service HDR
Product introduction
Hybrid Disaster Recovery (HDR) is a service that gives data centers integrated local backup and cloud disaster recovery for enterprise-level applications. It provides disaster recovery with RPO as low as seconds and RTO in minutes for local data centers and key enterprise workloads on Alibaba Cloud, effectively ensuring data security and business continuity.
Core issues to be solved
Application-level disaster recovery for business continuity: if a data center fails or a system undergoes long maintenance, applications can be quickly restored on the cloud, which shortens business downtime and greatly reduces losses.
Data-level disaster recovery: back up the databases, virtual machines, and physical machines in your data center. Backup data is stored locally and automatically uploaded to the cloud. This keeps data safe if a major disaster strikes a self-built data center and allows efficient recovery both locally and on the cloud.
Supported data replication technologies
Continuous replication disaster recovery (CDR): a high-standard disaster recovery solution for key enterprise applications, with minute-level RPO and RTO.
Cloud disk asynchronous replication disaster recovery: based on ESSD cloud disk asynchronous replication, it provides RPO as low as 15 minutes and minute-level RTO for complex ECS applications.
Comparison item | Cloud disk asynchronous replication (EBS Async) | Continuous data replication (CDR)
Applicable scenario | Many ECS instances, diverse operating systems, and large data volumes. | 10 or fewer ECS instances. The application is small, the ECS data volume is small, and the operating system is compatible with CDR ECS disaster recovery.
RPO, RTO | 15 minutes | 1-5 minutes (depending on ECS write volume)
Cost | Low. Fees include: disaster recovery cloud disks and replication traffic. Note: HDR does not charge for the use of disaster recovery software during the public beta phase. | Slightly higher. Fees include: disaster recovery software usage, disaster recovery cloud disks, replication traffic, and the duplicate ECS.
Intrusiveness | No intrusion. | A client must be installed, which occupies ECS resources. The Windows client requires a restart after installation.
Single-disk replication throughput upper limit | 100 MB/s | 30 MB/s
Single-site replication throughput upper limit | Unlimited | 50 MB/s
Sandbox drill | Not supported at the moment | Supported
Operating system compatibility | Most operating systems | Limited to designated Windows and Linux systems. See Operating System for more information.
Cloud disk type compatibility | ESSD | All types
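The comparison above can be turned into a rough pre-screening check. The following is a minimal sketch, not an HDR API: the `Workload` fields and the function name are hypothetical, and the thresholds are simply the figures from the comparison (15-minute RPO and ESSD disks for EBS Async; 10 or fewer ECS instances and a compatible operating system for CDR). Because the CDR RPO is stated as 1-5 minutes depending on write volume, the sketch conservatively assumes the 5-minute end of that range.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of the ECS application to protect."""
    ecs_count: int               # number of ECS instances in the application
    required_rpo_minutes: float  # largest data-loss window the business accepts
    all_disks_essd: bool         # EBS Async is listed as compatible with ESSD only
    os_supported_by_cdr: bool    # CDR is limited to designated Windows/Linux systems


EBS_ASYNC_RPO_MINUTES = 15  # from the comparison
CDR_RPO_MINUTES = 5         # conservative end of the stated 1-5 minute range
CDR_MAX_ECS = 10            # from the comparison


def candidate_technologies(w: Workload) -> list[str]:
    """Return the replication technologies that fit the workload, if any."""
    candidates = []
    if w.all_disks_essd and w.required_rpo_minutes >= EBS_ASYNC_RPO_MINUTES:
        candidates.append("EBS Async")
    if (w.ecs_count <= CDR_MAX_ECS
            and w.os_supported_by_cdr
            and w.required_rpo_minutes >= CDR_RPO_MINUTES):
        candidates.append("CDR")
    return candidates


if __name__ == "__main__":
    small_app = Workload(ecs_count=3, required_rpo_minutes=5,
                         all_disks_essd=False, os_supported_by_cdr=True)
    print(candidate_technologies(small_app))
```

An empty result means neither technology fits as described, for example a large fleet that needs an RPO below 15 minutes; cost, intrusiveness, and throughput limits still have to be weighed by hand.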
Basic concepts
Before using hybrid cloud disaster recovery HDR, you need to understand the following basic concepts.
Concept | Description
Failover | The process of restoring an application on Alibaba Cloud when the application in your IDC fails.
Failback | The process of migrating application data back to your own IDC and resuming the application there after the IDC environment is restored.
RPO | Recovery Point Objective: the expected amount of data loss when an application fails. For example, RPO = 15 minutes means that the last 15 minutes of data before the failure cannot be recovered on the cloud.
RTO | Recovery Time Objective: the time required to restore the application and run it on the cloud when a failure occurs.
Hybrid cloud disaster recovery all-in-one machine | An all-in-one machine with disaster recovery and backup functions launched by Alibaba Cloud.
Product advantages
Low total cost
There is no need to build a disaster recovery center yourself, eliminating costs such as computer room operation and maintenance and hardware procurement.
On the cloud, the service mainly consumes storage resources and requires very few computing resources.
Different RPOs and RTOs can be configured according to different application requirements and different network bandwidths, thereby saving costs.
Compared with the solution of building a self-built disaster recovery center, it can save up to 80% of the cost.
Simple and easy to use
Deployment under the cloud is simple, resources on the cloud are fully automatically managed, and the console provides centralized control.
Backup recovery drills and disaster recovery drills can be performed at any time, with one-click start and quick cleanup.
RPO/RTO classification
Enterprises need to develop stepped RPO/RTO for applications with different levels of importance. The enterprise's infrastructure, especially the network conditions, will restrict the disaster recovery indicators that can be achieved.
Continuous replication disaster recovery (CDR) is based on disk-level real-time data replication technology and can provide RPO/RTO in seconds to minutes.
Hybrid cloud big data disaster recovery provides big data disaster recovery with an RPO close to 0. Hadoop clusters can be replicated to Alibaba Cloud OSS or EMR for disaster recovery, and bidirectional real-time replication between Hadoop clusters can be used to build a big data lake.
Application-level disaster recovery and data-level disaster recovery
Supports efficient disaster recovery replication and cloud recovery of Windows and Linux application servers to achieve application-level disaster recovery.
You can also back up only key application data on a schedule and upload the backups to the cloud, including SQL Server and Oracle databases and VMware virtual machines, to achieve data-level disaster recovery.
Application scenarios
Off-site disaster recovery for critical applications
Applications running in local data centers can face various unexpected situations. For example, a damaged software or hardware environment can make it impossible to restore applications quickly, and events such as fires and natural disasters can even force the entire data center to be rebuilt. These situations can make critical applications unavailable for a long time and cause significant business losses. When applications in your own IDC cannot be restored quickly, the hybrid cloud disaster recovery service helps you quickly bring them up on the cloud.
After using the hybrid cloud disaster recovery gateway, the server images, application data, files, etc. of core applications are continuously copied to Alibaba Cloud. If an application in your own IDC encounters a fault that is difficult to recover from, you can start the disaster recovery gateway on Alibaba Cloud and quickly restore the application server on ECS, bringing the application back online quickly and greatly reducing business losses. In normal times, you can also easily conduct disaster recovery drills to ensure a smooth recovery process when a real failure occurs and ensure the accuracy of the disaster recovery plan.
The hybrid cloud disaster recovery service eliminates the need for you to bear the huge investment in building your own disaster recovery center, and you do not need to worry about the complex software and hardware deployment and operation of traditional disaster recovery solutions. It greatly reduces the cost of off-site disaster recovery and improves the effectiveness of disaster recovery.
Whole machine cloud migration
Traditional migration to the cloud generally requires steps such as reinstallation and configuration of applications on cloud images, reconfiguration of ECS virtual machines, and even application reconstruction. This process is often lengthy. Especially for some applications developed by third parties, cloud migration operations are more difficult because of many unclear software dependencies and complex configurations.
The hybrid cloud disaster recovery gateway or disaster recovery all-in-one machine lets you back up an entire machine to the cloud and restore it there, so you can faithfully reproduce the server environment on ECS and make cloud migration simple and intuitive.
Disaster recovery planning
Requirements analysis
Data protection and business continuity are of great significance to data centers. Failure of critical applications or data loss can cause significant losses to your business. Hybrid cloud disaster recovery services provide two levels of capabilities to protect data and ensure business continuity.
Offsite backup
Server images and data are backed up and uploaded directly to the Alibaba Cloud disaster recovery database to achieve highly reliable off-site backup on the cloud. Stable off-site backup ensures that critical data is not lost in extreme situations such as a fire in the local data center, and can be restored to the local location after the local facilities are repaired.
Cloud disaster recovery
In order to reduce business losses caused by application failures, when a serious failure occurs in the data center and cannot be quickly restored, the hybrid cloud disaster recovery service can efficiently and quickly restore your applications on ECS.
RTO and RPO requirements
Application disaster recovery has two core indicators:
RPO: refers to the amount of data loss that can be tolerated when an application fails. The more important the data, the smaller the RPO requirement. The smaller the RPO, the higher the frequency of data backup and replication, the greater the pressure on the production environment and network, and the higher the cost.
RTO: refers to the expected time required from the start of disaster recovery operations to the restoration of the application after a fault occurs. The greater the damage caused to the business by a fault per unit time, the shorter the RTO is required.
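The two indicators above can be checked against the timeline of a drill or a real incident. The following is a minimal sketch under stated assumptions: the function names and the timestamps are hypothetical and only illustrate the definitions (data loss is measured back to the last replicated point, and recovery time from the start of disaster recovery operations to the restoration of the application).

```python
from datetime import datetime, timedelta


def data_loss_window(last_replicated: datetime, failure: datetime) -> timedelta:
    """Data written after the last replicated point cannot be recovered on the cloud."""
    return failure - last_replicated


def recovery_time(recovery_started: datetime, app_restored: datetime) -> timedelta:
    """Time from the start of disaster recovery operations to a running application."""
    return app_restored - recovery_started


def meets_objectives(loss: timedelta, recovery: timedelta,
                     rpo: timedelta, rto: timedelta) -> bool:
    """True when both the measured data loss and the recovery time are within target."""
    return loss <= rpo and recovery <= rto


if __name__ == "__main__":
    # Hypothetical drill timeline.
    last_replicated = datetime(2024, 1, 13, 10, 0)
    failure = datetime(2024, 1, 13, 10, 12)
    recovery_started = datetime(2024, 1, 13, 10, 20)
    app_restored = datetime(2024, 1, 13, 10, 28)

    loss = data_loss_window(last_replicated, failure)
    recovery = recovery_time(recovery_started, app_restored)
    ok = meets_objectives(loss, recovery,
                          rpo=timedelta(minutes=15), rto=timedelta(minutes=10))
    print(loss, recovery, ok)
```

Recording these two durations in every drill gives the business and IT departments measured values to compare with the agreed targets.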
RTO and RPO are generally proposed by the business department and then settled in discussion with the IT department, based on technical feasibility, impact on existing systems, and cost. There is often a linear relationship between the level of RTO and RPO standards and infrastructure costs.
You can also refer to national and industry standards to set RTO and RPO goals. The GB/T 20988-2007 standard is an information system disaster recovery specification formulated by the Standardization Administration of China. Its appendix contains examples of RPO/RTO level specifications for a certain industry, as shown below. For more information, see
Application analysis
Application deployment
Before deploying critical applications, you need to consider the following three elements:
Which servers the application contains
The network connections between the servers
The configuration required on each server
For example, a simple web application contains the following elements:
The application contains: 1 database server, 1 back-end server, and 1 Web front-end server.
3 servers are on the same network.
There is a configuration item in the back-end server that specifies the IP address of the database server, and the Web front-end server has a configuration item that specifies the IP address of the back-end server.
After identifying these elements, you can plan as follows:
The hybrid cloud disaster recovery service needs to protect these three servers.
When restoring on Alibaba Cloud, these three servers need to be restored in the same VPC.
After the entire machines are restored, the application can run only if the servers use the same IP addresses on the cloud as in the original environment. Alternatively, use automated scripts to modify the configuration items after the recovery is complete.
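When the original IP addresses cannot be kept, the configuration items that point at the database and back-end servers have to be rewritten after recovery. The following is a minimal sketch of such a script, assuming plain-text configuration files; the function name, the addresses, and the configuration keys are hypothetical and not part of HDR.

```python
from __future__ import annotations

import re


def rewrite_ips(config_text: str, ip_map: dict[str, str]) -> str:
    """Replace each original IP address with its address in the recovery VPC."""
    if not ip_map:
        return config_text
    # Match whole addresses only, so 192.168.1.10 is not replaced inside 192.168.1.100.
    alternatives = "|".join(re.escape(ip) for ip in ip_map)
    pattern = re.compile(r"(?<![\d.])(" + alternatives + r")(?![\d.])")
    return pattern.sub(lambda m: ip_map[m.group(1)], config_text)


if __name__ == "__main__":
    # Hypothetical mapping: original data center address -> address in the cloud VPC.
    ip_map = {
        "192.168.1.10": "10.0.0.10",  # database server
        "192.168.1.11": "10.0.0.11",  # back-end server
    }
    backend_config = "db_host=192.168.1.10\ndb_port=5432\n"
    frontend_config = "backend_url=http://192.168.1.11:8080/api\n"
    print(rewrite_ips(backend_config, ip_map))
    print(rewrite_ips(frontend_config, ip_map))
```

Keeping the mapping in one place means the same script can be run against every restored server, and it can be exercised during disaster recovery drills before it is needed in a real failure.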
Environment dependencies
Application disaster recovery is a process that requires the cooperation of multiple departments, including the cooperation of application administrators, computer room administrators, network administrators and other roles. A complete disaster recovery solution that can meet business requirements needs to consider details from many aspects, including:
The environment that the application depends on, such as Active Directory (AD), DNS, etc.
Network configuration required by the application
In many cases, an application also has important environment dependencies. For example, many applications in a Windows environment rely on AD. When you restore such applications on the cloud, your VPC must be able to connect to the AD service. DNS is likewise a hard requirement in many environments.
Taking AD as an example, there are usually two situations:
If you have deployed multiple master-slave AD servers in different data centers, you only need to establish an Express Connect or SSL VPN connection between the data center where AD is located and the VPC on the cloud.
If your AD servers are all deployed in a single data center, they may go offline together with the applications. We recommend that you:
Use the hybrid cloud disaster recovery all-in-one machine to protect the AD server, and restore the AD server on the cloud first when a failure occurs.
Deploy a secondary AD server in the VPC on the cloud and keep it connected to the primary AD server in the data center. When the data center fails, use the AD server on the cloud.
Similarly, the DNS server also needs to be configured accordingly to meet the application environment requirements after disaster recovery.
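Because AD and DNS must be available before the applications that depend on them, the recovery order can be derived from a dependency map. The following is a minimal sketch using the Python standard library `graphlib` module (Python 3.9 or later); the server names and the dependency map are hypothetical examples, not something HDR produces.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each server lists what must be running before it.
DEPENDENCIES = {
    "ad": set(),
    "dns": set(),
    "database": {"ad", "dns"},
    "backend": {"database", "dns"},
    "web-frontend": {"backend", "dns"},
}


def recovery_order(dependencies):
    """Return an order in which every server comes after its dependencies."""
    # static_order() raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for step, server in enumerate(recovery_order(DEPENDENCIES), start=1):
        print(step, server)
```

The relative order of servers that do not depend on each other, such as AD and DNS here, is not fixed; only the dependency constraints are guaranteed.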
Application client connection
After the application is restored, you need to ensure that clients can connect to it. The typical cases are:
If the restored application server has the same IP address as the original and the DNS server has been restored, clients only need a network connection to the application. You may need SSL VPN or Express Connect so that clients can reach the restored application on the cloud, or the restored application can expose a public IP address for client access.
The restored application does not have to use the original IP address. If the IP address changes, you can modify DNS so that clients connect to the new service.
If both the domain name and the IP address change, you need to modify the client.
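The cases above amount to a small decision rule that can be written into a recovery runbook. The following is a minimal sketch; the function name and the action strings are hypothetical. The source does not cover a domain name change with an unchanged IP address, so the sketch assumes that this case also requires a client change.

```python
from __future__ import annotations


def client_reconnect_actions(ip_changed: bool, domain_changed: bool) -> list[str]:
    """List what must be done so that clients can reach the restored application."""
    # A network path between the client and the restored application is always needed.
    actions = ["ensure a network path from the client to the restored application"]
    if domain_changed:
        # Assumption: any domain name change requires reconfiguring the client.
        actions.append("modify the client to use the new domain name or address")
    elif ip_changed:
        actions.append("update the DNS record to point at the new IP address")
    return actions


if __name__ == "__main__":
    for ip_changed, domain_changed in [(False, False), (True, False), (True, True)]:
        print(ip_changed, domain_changed, client_reconnect_actions(ip_changed, domain_changed))
```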
Disaster recovery equipment and network environment
Based on the number of application servers, data volume, RPO and RTO standards, and the requirements of the dependent environmental facilities, you can reasonably select disaster recovery equipment and deploy a suitable network environment.
CDR disaster recovery all-in-one machine
If a virtualized environment is available and fewer than 5 servers require disaster recovery protection, it is recommended that you use a virtualized deployment.
If a virtualized environment is not supported, or the number of servers for disaster recovery protection is more than 5, it is recommended to use a CDR disaster recovery all-in-one machine. The available all-in-one models are as follows:
Model | Number of servers supported
Apsara DR100 | <20
Apsara DR200 | <100
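The sizing guidance above can be expressed as a simple lookup. The following is a minimal sketch with a hypothetical function name. The guidance does not say what to do with exactly 5 servers, so the sketch assumes that case falls to the all-in-one machine, and it returns a placeholder string for counts beyond the listed models.

```python
def recommend_equipment(server_count: int, virtualization_available: bool) -> str:
    """Map the number of protected servers to the deployment option described above."""
    if server_count <= 0:
        raise ValueError("server_count must be positive")
    if virtualization_available and server_count < 5:
        return "virtualized deployment"
    # Assumption: exactly 5 servers is treated like "more than 5".
    if server_count < 20:
        return "Apsara DR100"
    if server_count < 100:
        return "Apsara DR200"
    return "beyond the listed models"


if __name__ == "__main__":
    for count, virt in [(3, True), (3, False), (50, True), (120, False)]:
        print(count, virt, recommend_equipment(count, virt))
```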
Network environment
The network environments required for the above disaster recovery equipment include the following two types:
Network between data center and Alibaba Cloud
Due to the optimized data storage and transmission algorithm, the hybrid cloud disaster recovery service does not require the local data center to establish a dedicated line connection with Alibaba Cloud. However, for scenarios with large data volumes and strict RPO requirements, it is recommended that you use dedicated line connections to ensure that the disaster recovery service can meet the required indicators.
After the application is restored, depending on how clients, AD, DNS, and other services need to connect to the Alibaba Cloud VPC, you may need SSL VPN, Express Connect, or a public IP address for the application to ensure normal use.
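Whether the existing Internet link is enough or a dedicated line is worth considering can be estimated with back-of-the-envelope arithmetic. The following is a rough sketch under stated assumptions: the function names, the headroom factor, and the example figures are hypothetical; units are decimal (1 GB = 1000 MB); and the savings from the service's optimized storage and transmission algorithm are ignored, so real requirements may be lower.

```python
def required_mbps(changed_gb_per_hour: float, headroom: float = 1.5) -> float:
    """Sustained uplink (Mbit/s) needed to keep replication up with data changes."""
    megabits_per_hour = changed_gb_per_hour * 8 * 1000
    return megabits_per_hour / 3600 * headroom


def initial_sync_hours(total_gb: float, link_mbps: float) -> float:
    """Hours needed to upload the first full copy over a link of the given speed."""
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    return total_gb * 8 * 1000 / link_mbps / 3600


if __name__ == "__main__":
    # Hypothetical workload: 45 GB of changed data per hour, 4500 GB in total.
    print(required_mbps(45))               # uplink needed with 50% headroom
    print(initial_sync_hours(4500, 1000))  # first full copy over a 1 Gbit/s line
```

If the required rate approaches or exceeds the available uplink, replication falls behind and the achievable RPO grows, which is the situation in which a dedicated line is recommended.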
The network between the hybrid cloud disaster recovery appliance and the protected server
In order to perform normal backup and recovery of the protected server, there needs to be a network connection between the disaster recovery machine and the protected server.
The backup all-in-one machine provides dual Gigabit and dual 10G network cards to choose from, and you can configure them as needed according to the backup and recovery throughput requirements.