Table of Content
TABLE OF CONTENTS
This blog post is the final part of the Data Warehouse Migration to AR series. The second part of the blog post series, Data Warehouse Migration to Amazon Redshift – Part 2, details how to get started with Amazon Redshift and the business and technical benefits of using it.
Modern data warehouse migration to AWS also requires planning for validation, cutover strategy, and post-migration optimization. These steps help ensure that the move to Amazon Redshift delivers not just speed, but long-term operational stability.
1. Migrating to AR
The migrating strategy that you choose depends on various factors, such as:
- The size of the database and its tables
- Network bandwidth between the source server and AWS
- Whether the migration and switchover to AWS will be done in one step or a sequence of steps over time
- The data change rate in the source system
- Transformations during migration
- The partner tool that you plan to use for migration and ETL
The right approach often depends on whether the team is modernizing to Redshift Serverless vs. provisioned capacity, since each option has different implications for flexibility, workload control, and cost. That decision should be aligned with the expected query patterns and operating model.
One-Step Migration
One-step migration is a good choice for small databases that do not require persistent operation. Clients can extricate existing databases as comma-separated value (CSV) files, and then use services such as AWS Import/Export Snowball to deliver datasets to Amazon S3 for stacking into Amazon Redshift. Clients then test the destination Amazon Redshift database for data consistency with the source. Once all validations have passed, the database is switched over to AWS.
This approach can work well for smaller workloads, but teams should still build a data validation and reconciliation framework to verify completeness, accuracy, and consistency after the move. Validation should happen before cutover, not after issues reach production.
Two-Step Migration
This type of migration is commonly used for databases of any size.
Initial data migration: The data is extracted from the source database, preferably during nonpeak usage, to minimize the impact. The data is then migrated to Amazon Redshift by following the one-step migration approach described above.
Changed data migration: This refers to data that changed in the source database after the initial data migration is propagated to the destination before switchover. This step synchronizes the source and destination databases.
Once all the changed data is migrated, validate the data in the destination database, perform necessary tests, and if all tests are passed, switch over to the Amazon Redshift data warehouse.
For larger environments, a staged cutover can also reduce risk by allowing teams to rehearse the zero-downtime cutover patterns needed for business-critical workloads. This is especially useful when source systems must keep running during migration.
Data Migration Tools
There are several tools and technologies available for data migration. These tools can be used interchangeably, or you can make use of third-party tools or open source tools.
When deciding on the best ETL tool to migrate data from Oracle DB to Amazon Redshift or other platforms, the real question is usually how well the tool handles schema conversion, ongoing synchronization, and downtime limits. That is where tooling choice becomes a strategic migration decision.
While migrating your database from one engine to another, you have two major things to consider: the conversion of the schema and code objects, and the migration and conversion of the data itself. AWS has tools to help do both. You can convert schema and code with AWS SCT and migrate data with AWS DMS. AWS DMS helps you migrate your data easily and securely with minimal downtime.
AWS Data Migration Service (DMS)
AWS Database Migration Service (AWS DMS) helps you to migrate databases to AWS efficiently and securely. The source database can remain completely operational amid the relocation, limiting downtime to applications that depend on the database. This tool supports both the one-step and the two-step migration processes described above. To follow the two-step migration process, you should enable supplemental logging to capture changes to the source system. You can enable supplemental logging at the table or database level.
Make use of AWS DMS to migrate data, tables, and primary keys to the target database. All other database elements are not migrated. The service bolsters homogeneous relocations such as Oracle to Oracle, and also heterogeneous relocations between various database stages, such as Oracle to MySQL or MySQL to Amazon Aurora. The source or target database must be on an AWS service.
For many organizations, AWS DMS is one of the AWS-recommended tools for SQL Server to Redshift migration because it supports both initial load and change data capture. It is especially useful when teams need to keep source systems operational during migration.
AWS DMS adopts a moderate approach and creates only those objects required to proficiently relocate the data, for instance, tables with a primary key.
In addition, DMS can support many common source databases and help reduce downtime, but teams still need a clear reconciliation process to confirm that the target warehouse matches the source.
AWS Schema Conversion Tool (SCT)
AWS SCT converts the existing database schema from one database engine to another. You can convert a relational OLTP schema or data warehouse schema. Your converted schema is suitable for an Amazon Relational Database Service (Amazon RDS), MySQL DB instance, an Amazon Aurora DB cluster, an Amazon RDS PostgreSQL DB instance, or an Amazon Redshift cluster.
AWS SCT is especially valuable for Schema Conversion Tooling because it helps translate database schema, code objects, and application logic into a Redshift-compatible format. This reduces manual rewrite effort and improves the consistency of the migration.
AWS SCT gives a task-based UI to automatically change over the database pattern of your source database into a configuration compatible with your target Amazon RDS instance. If a pattern from your source database can't be changed automatically, AWS SCT gives directions on how you can make an equivalent schema in your target Amazon RDS database.
In addition to converting your existing database schema from one database engine to another, AWS SCT has some additional features that help you move your data and applications to the cloud.
SCT also helps teams optimize their target warehouse by recommending distribution keys, sort keys, and other design choices that support better performance after migration. These settings can have a major impact on query speed and workload efficiency.
- Use data extraction agents to extract data from your data warehouse to prepare to migrate it to Amazon Redshift.
- Make use of AWS SCT to create AWS DMS endpoints and tasks, run and monitor these tasks.
- In some cases, database features can't be changed over to proportionate Amazon RDS or Amazon Redshift features. The AWS SCT extension pack wizard can enable you to install AWS Lambda functions and Python libraries to mimic the highlights that can't be changed.
- AWS SCT helps to optimize your existing Amazon Redshift database. AWS SCT recommends sort keys and distribution keys to optimize your database.
- With AWS SCT, you can convert SQL in your C++, C#, Java, or other application code; see, investigate, edit, and save the converted SQL code.
Amazon Redshift Best Practices
Amazon Redshift differs from other SQL database systems. To completely understand the advantages of the Amazon Redshift architecture, you need to explicitly configure, build, and load your tables to use massively parallel processing, columnar data storage, and columnar data compression.
After migration, teams should review distribution keys, sort keys, and WLM optimization post-migration to make sure the warehouse is tuned for actual workload patterns. Without this step, even a successful migration can underperform.
For additional details on best practices for designing tables, loading data into tables, and writing queries, check the references below.
Validation and reconciliation
A strong migration program should include automated reconciliation checks for row counts, key attributes, and aggregate measures across source and target systems. This helps confirm that data moved correctly and that business users can trust the new warehouse from day one.
Choosing data movement tools
The best tool to migrate data from FTP to Redshift, MongoDB to Amazon Redshift, or SQL Server to Redshift depends on whether the priority is batch transfer, CDC, schema conversion, or minimal downtime. In many cases, the right answer is a combination of tools rather than a single utility.
Conclusion
There is a strategic shift in data warehousing as enterprises migrate their analytics databases and solutions from on-premises solutions to the cloud to take advantage of the cloud’s simplicity, performance, and cost-effectiveness. AWS gives an expansive range of services and strong partner ecosystems that empower you to effectively fabricate and run enterprise data warehousing in the cloud.
References
AWS Database Migration Service
Working with the AWS Database Migration Service Using the AWS Schema Conversion Tool
Getting Started with the AWS Schema Conversion Tool
Amazon Redshift best practices for designing tables
Tags
Data-as-an-Asset
Prabhu R Chennupati
Enterprise Consulting Architect
With over two decades of experience spanning enterprise architecture, data and solution architecture, strategic planning, and delivery leadership, Prabhu has significantly guided CDO organizations to develop data architecture strategies and roadmaps for diverse clients.