Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse that gives you the flexibility to use provisioned or serverless compute for your analytical workloads. Using Amazon Redshift Serverless and Query Editor v2, you can load and query large datasets in just a few clicks and pay only for what you use. The decoupled compute and storage architecture of Amazon Redshift enables you to build highly scalable, resilient, and cost-effective workloads. Many customers migrate their data warehousing workloads to Amazon Redshift and benefit from the rich capabilities it offers. The following are just some of the notable capabilities:
- Amazon Redshift seamlessly integrates with broader analytics services on AWS, letting you choose the right tool for the right job. Modern analytics is much wider than SQL-based data warehousing. Amazon Redshift lets you build lake house architectures and then perform any kind of analytics, such as interactive analytics, operational analytics, big data processing, visual data preparation, predictive analytics, machine learning (ML), and more.
- You don't need to worry about workloads such as ETL, dashboards, and ad-hoc queries interfering with each other. You can isolate workloads using data sharing, while still using the same underlying datasets.
- When users run many queries at peak times, compute seamlessly scales within seconds to provide consistent performance at high concurrency. You get one hour of free concurrency scaling capacity for every 24 hours of usage. This free credit meets the concurrency demand of 97% of the Amazon Redshift customer base.
- Amazon Redshift is easy to use, with self-tuning and self-optimizing capabilities. You can get faster insights without spending valuable time managing your data warehouse.
- Fault tolerance is built in. All data written to Amazon Redshift is automatically and continuously replicated to Amazon Simple Storage Service (Amazon S3). Failed hardware is automatically replaced.
- Amazon Redshift is easy to interact with. You can access data from traditional, cloud-native, containerized, serverless web services-based, and event-driven applications.
- Redshift ML makes it easy for data scientists to create, train, and deploy ML models using familiar SQL. They can also run predictions using SQL.
- Amazon Redshift provides comprehensive data security at no extra cost. You can set up end-to-end data encryption, configure firewall rules, define granular row- and column-level security controls on sensitive data, and more.
- Amazon Redshift integrates seamlessly with other AWS services and third-party tools. You can move, transform, load, and query large datasets quickly and reliably.
In this post, we provide a walkthrough for migrating a data warehouse from Google BigQuery to Amazon Redshift using the AWS Schema Conversion Tool (AWS SCT) and AWS SCT data extraction agents. AWS SCT makes heterogeneous database migrations predictable by automatically converting the majority of the database code and storage objects to a format that is compatible with the target database. Any objects that can't be automatically converted are clearly marked so that they can be manually converted to complete the migration. Additionally, AWS SCT can scan your application code for embedded SQL statements and convert them.
Solution overview
AWS SCT uses a service account to connect to your BigQuery project. First, we create an Amazon Redshift database into which the BigQuery data will be migrated. Next, we create an S3 bucket. Then, we use AWS SCT to convert BigQuery schemas and apply them to Amazon Redshift. Finally, to migrate data, we use AWS SCT data extraction agents, which extract data from BigQuery, upload it into the S3 bucket, and then copy it to Amazon Redshift.
Prerequisites
Before starting this walkthrough, you must have the following prerequisites:
- A workstation with AWS SCT, Amazon Corretto 11, and Amazon Redshift drivers.
- You can use an Amazon Elastic Compute Cloud (Amazon EC2) instance or your local desktop as a workstation. In this walkthrough, we use an Amazon EC2 Windows instance. To create it, use this guide.
- To download and install AWS SCT on the EC2 instance that you previously created, use this guide.
- Download the Amazon Redshift JDBC driver from this location.
- Download and install Amazon Corretto 11.
- A GCP service account that AWS SCT can use to connect to your source BigQuery project.
- Grant the BigQuery Admin and Storage Admin roles to the service account.
- Copy the service account key file, which was created in the Google Cloud management console, to the EC2 instance that has AWS SCT.
- Create a Cloud Storage bucket in GCP to store your source data during migration.
This walkthrough covers the following steps:
- Create an Amazon Redshift Serverless workgroup and namespace
- Create the S3 bucket and folder
- Convert and apply the BigQuery schema to Amazon Redshift using AWS SCT
- Connect to the BigQuery source
- Connect to the Amazon Redshift target
- Convert the BigQuery schema to Amazon Redshift
- Analyze the assessment report and address the action items
- Apply the converted schema to the target Amazon Redshift
- Migrate data using AWS SCT data extraction agents
- Generate trust and key stores (optional)
- Install and start the data extraction agent
- Register the data extraction agent
- Add virtual partitions for large tables (optional)
- Create a local migration task
- Start the local data migration task
- View data in Amazon Redshift
Create an Amazon Redshift Serverless workgroup and namespace
In this step, we create an Amazon Redshift Serverless workgroup and namespace. A workgroup is a collection of compute resources, and a namespace is a collection of database objects and users. To isolate workloads and manage different resources in Amazon Redshift Serverless, you can create namespaces and workgroups and manage storage and compute resources separately.
Follow these steps to create an Amazon Redshift Serverless workgroup and namespace:
- Navigate to the Amazon Redshift console.
- In the upper right, choose the AWS Region that you want to use.
- Expand the Amazon Redshift pane on the left and choose Redshift Serverless.
- Choose Create Workgroup.
- For Workgroup name, enter a name that describes the compute resources.
- Verify that the VPC is the same as the VPC of the EC2 instance with AWS SCT.
- Choose Next.

- For Namespace name, enter a name that describes your dataset.
- In the Database name and password section, select the checkbox Customize admin user credentials.
- For Admin user name, enter a user name of your choice, for example awsuser.
- For Admin user password, enter a password of your choice, for example MyRedShiftPW2022.

- Choose Next. Note that data in the Amazon Redshift Serverless namespace is encrypted by default.
- On the Review and Create page, choose Create.
- Create an AWS Identity and Access Management (IAM) role and set it as the default for your namespace, as described in the following steps. Note that there can only be one default IAM role.
- Navigate to the Amazon Redshift Serverless dashboard.
- Under Namespaces / Workgroups, choose the namespace that you just created.
- Navigate to Security and encryption.
- Under Permissions, choose Manage IAM roles.
- Navigate to Manage IAM roles. Then, choose the Manage IAM roles drop-down and choose Create IAM role.
- Under Specify an Amazon S3 bucket for the IAM role to access, choose one of the following methods:
- Choose No additional Amazon S3 bucket to allow the created IAM role to access only S3 buckets with names starting with redshift.
- Choose Any Amazon S3 bucket to allow the created IAM role to access all S3 buckets.
- Choose Specific Amazon S3 buckets to specify one or more S3 buckets for the created IAM role to access. Then choose one or more S3 buckets from the table.
- Choose Create IAM role as default. Amazon Redshift automatically creates the IAM role and sets it as the default.
- Capture the endpoint for the Amazon Redshift Serverless workgroup that you just created.
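If you prefer scripting this setup, the console steps above can be sketched with the AWS SDK for Python. This is a minimal sketch, not the walkthrough's method: the namespace and workgroup names are placeholders, and a newly created workgroup takes a few minutes to become available before its endpoint resolves.

```python
def create_serverless(client, namespace, workgroup, admin_user, admin_password):
    """Create a Redshift Serverless namespace plus a workgroup attached to it,
    and return the workgroup endpoint address (what AWS SCT needs as Server name)."""
    client.create_namespace(
        namespaceName=namespace,
        adminUsername=admin_user,
        adminUserPassword=admin_password,
        dbName="dev",  # database used later when connecting from AWS SCT
    )
    client.create_workgroup(workgroupName=workgroup, namespaceName=namespace)
    # A new workgroup can take a few minutes to become AVAILABLE; poll
    # get_workgroup until the endpoint is populated before using it.
    return client.get_workgroup(workgroupName=workgroup)["workgroup"]["endpoint"]["address"]

# To run for real (requires AWS credentials with redshift-serverless permissions):
#   import boto3
#   print(create_serverless(boto3.client("redshift-serverless"),
#                           "bq-migration-ns", "bq-migration-wg",
#                           "awsuser", "MyRedShiftPW2022"))
```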
Create the S3 bucket and folder
During the data migration process, AWS SCT uses Amazon S3 as a staging area for the extracted data. Follow these steps to create the S3 bucket:
- Navigate to the Amazon S3 console.
- Choose Create bucket. The Create bucket wizard opens.
- For Bucket name, enter a unique DNS-compliant name for your bucket (e.g., uniquename-bq-rs). See the rules for bucket naming when choosing a name.
- For AWS Region, choose the Region in which you created the Amazon Redshift Serverless workgroup.
- Choose Create bucket.
- In the Amazon S3 console, navigate to the S3 bucket that you just created (e.g., uniquename-bq-rs).
- Choose Create folder to create a new folder.
- For Folder name, enter incoming and choose Create folder.
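Bucket names must be DNS-compliant, and an S3 "folder" is really just a zero-byte object whose key ends in a slash. The sketch below (bucket name and Region are illustrative) checks a candidate name against the basic naming rules before creating the bucket and folder with boto3.

```python
import re

# 3-63 chars, lowercase letters/digits/hyphens/dots, starts and ends with a
# letter or digit, and must not be formatted like an IP address.
BUCKET_NAME_RULE = re.compile(r"^(?!\d+\.\d+\.\d+\.\d+$)[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")

def is_valid_bucket_name(name: str) -> bool:
    """Check a candidate bucket name against the basic S3 naming rules."""
    return bool(BUCKET_NAME_RULE.match(name))

print(is_valid_bucket_name("uniquename-bq-rs"))  # True

# To create the bucket and "incoming" folder (requires AWS credentials):
#   import boto3
#   s3 = boto3.client("s3")                      # us-east-1; other Regions also
#   s3.create_bucket(Bucket="uniquename-bq-rs")  # need a CreateBucketConfiguration
#   s3.put_object(Bucket="uniquename-bq-rs", Key="incoming/")  # folder = key ending in "/"
```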
Convert and apply the BigQuery schema to Amazon Redshift using AWS SCT
To convert the BigQuery schema to the Amazon Redshift format, we use AWS SCT. Start by logging in to the EC2 instance that we created previously, and then launch AWS SCT.
Follow these steps using AWS SCT:
Connect to the BigQuery source
- From the File menu, choose Create New Project.
- Choose a location to store your project files and data.
- Provide a meaningful but memorable name for your project, such as BigQuery to Amazon Redshift.
- To connect to the BigQuery source data warehouse, choose Add source from the main menu.
- Choose BigQuery and choose Next. The Add source dialog box appears.
- For Connection name, enter a name to describe the BigQuery connection. AWS SCT displays this name in the tree in the left panel.
- For Key path, provide the path of the service account key file that was previously created in the Google Cloud management console.
- Choose Test Connection to verify that AWS SCT can connect to your source BigQuery project.
- Once the connection is successfully validated, choose Connect.

Connect to the Amazon Redshift target
Follow these steps to connect to Amazon Redshift:
- In AWS SCT, choose Add Target from the main menu.
- Choose Amazon Redshift, then choose Next. The Add Target dialog box appears.
- For Connection name, enter a name to describe the Amazon Redshift connection. AWS SCT displays this name in the tree in the right panel.
- For Server name, enter the Amazon Redshift Serverless workgroup endpoint captured previously.
- For Server port, enter 5439.
- For Database, enter dev.
- For User name, enter the user name chosen when creating the Amazon Redshift Serverless workgroup.
- For Password, enter the password chosen when creating the Amazon Redshift Serverless workgroup.
- Uncheck the Use AWS Glue box.
- Choose Test Connection to verify that AWS SCT can connect to your target Amazon Redshift workgroup.
- Choose Connect to connect to the Amazon Redshift target.
Note that, alternatively, you can use connection values that are stored in AWS Secrets Manager.
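If you keep the connection values in AWS Secrets Manager, a small helper can fetch and normalize them. The secret name and JSON key names below are assumptions for illustration, not anything AWS SCT mandates; adapt them to how your secret is structured.

```python
import json

def redshift_connection_from_secret(client, secret_id):
    """Fetch Redshift connection values stored as a JSON secret in AWS
    Secrets Manager. The key names here are assumptions about how the
    secret was written."""
    raw = client.get_secret_value(SecretId=secret_id)["SecretString"]
    secret = json.loads(raw)
    return {
        "host": secret["host"],
        "port": int(secret.get("port", 5439)),   # default Redshift port
        "database": secret.get("dbname", "dev"),
        "user": secret["username"],
        "password": secret["password"],
    }

# To run for real (requires AWS credentials):
#   import boto3
#   conn = redshift_connection_from_secret(
#       boto3.client("secretsmanager"), "bq-migration/redshift")
```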

Convert the BigQuery schema to Amazon Redshift
After the source and target connections are successfully made, you see the source BigQuery object tree in the left pane and the target Amazon Redshift object tree in the right pane.
Follow these steps to convert the BigQuery schema to the Amazon Redshift format:
- In the left pane, right-click the schema that you want to convert.
- Choose Convert Schema.
- A dialog box appears with the question, The objects might already exist in the target database. Replace?. Choose Yes.
Once the conversion is complete, you see a new schema created in the Amazon Redshift pane (right pane) with the same name as your BigQuery schema.

The sample schema that we used has 16 tables, 3 views, and 3 procedures. You can see these objects in the Amazon Redshift format in the right pane. AWS SCT converts all of the BigQuery code and data objects to the Amazon Redshift format. Additionally, you can use AWS SCT to convert external SQL scripts, application code, or additional files with embedded SQL.
Analyze the assessment report and address the action items
AWS SCT creates an assessment report to assess the migration complexity. AWS SCT can convert the majority of code and database objects, but some objects may require manual conversion. AWS SCT highlights these objects in blue in the conversion statistics diagram and creates action items with a complexity attached to them.
To view the assessment report, switch from the Main view to the Assessment Report view as follows:

The Summary tab shows objects that were converted automatically and objects that were not converted automatically. Green represents automatically converted objects or those with simple action items. Blue represents medium and complex action items that require manual intervention.

The Action Items tab shows the recommended actions for each conversion issue. When you select an action item from the list, AWS SCT highlights the object to which the action item applies.
The report also contains recommendations for how to manually convert the schema item. For example, after the assessment runs, detailed reports for the database/schema show you the effort required to design and implement the recommendations for converting action items. For more information about deciding how to handle manual conversions, see Handling manual conversions in AWS SCT. AWS SCT takes some actions automatically while converting the schema to Amazon Redshift; objects with these actions are marked with a red warning sign.

You can evaluate and inspect the individual object DDL by selecting it in the right pane, and you can also edit it as needed. In the following example, AWS SCT modifies the RECORD and JSON datatype columns in the BigQuery table ncaaf_referee_data to the SUPER datatype in Amazon Redshift. The partition key in the ncaaf_referee_data table is converted to the distribution key and sort key in Amazon Redshift.

Apply the converted schema to the target Amazon Redshift
To apply the converted schema to Amazon Redshift, select the converted schema in the right pane, right-click, and then choose Apply to database.

Migrate data from BigQuery to Amazon Redshift using AWS SCT data extraction agents
AWS SCT extraction agents extract data from your source database and migrate it to the AWS Cloud. In this walkthrough, we show how to configure AWS SCT extraction agents to extract data from BigQuery and migrate it to Amazon Redshift.
First, install the AWS SCT extraction agent on the same Windows instance that has AWS SCT installed. For better performance, we recommend that you use a separate Linux instance to install extraction agents if possible. For big datasets, you can use multiple data extraction agents to increase the data migration speed.
Generating trust and key stores (optional)
You can use Secure Sockets Layer (SSL) encrypted communication with AWS SCT data extractors. When you use SSL, all data passed between the applications remains private and integral. To use SSL communication, you must generate trust and key stores using AWS SCT. You can skip this step if you don't want to use SSL. We recommend using SSL for production workloads.
Follow these steps to generate trust and key stores:
- In AWS SCT, navigate to Settings → Global Settings → Security.
- Choose Generate trust and key store.

- Enter the name and password for the trust and key stores, and choose a location where you would like to store them.

- Choose Generate.
Install and configure the data extraction agent
In the installation package for AWS SCT, you find an agents sub-folder (aws-schema-conversion-tool-1.0.latest.zip\agents). Locate and install the executable file with a name like aws-schema-conversion-tool-extractor-xxxxxxxx.msi.
During the installation process, follow these steps to configure the AWS SCT Data Extractor:
- For Listening port, enter the port number on which the agent listens. The default is 8192.
- For Add a source vendor, enter no, as you don't need drivers to connect to BigQuery.
- For Add the Amazon Redshift driver, enter YES.
- For Enter Redshift JDBC driver file or files, enter the location where you downloaded the Amazon Redshift JDBC drivers.
- For Working folder, enter the path where the AWS SCT data extraction agent will store the extracted data. The working folder can be on a different computer from the agent, and a single working folder can be shared by multiple agents on different computers.
- For Enable SSL communication, enter yes. Choose No here if you don't want to use SSL.
- For Key store, enter the storage location chosen when creating the trust and key store.
- For Key store password, enter the password for the key store.
- For Enable client SSL authentication, enter yes.
- For Trust store, enter the storage location chosen when creating the trust and key store.
- For Trust store password, enter the password for the trust store.
Starting the data extraction agent(s)
Use the following procedure to start extraction agents. Repeat this procedure on each computer that has an extraction agent installed.
Extraction agents act as listeners. When you start an agent with this procedure, the agent starts listening for instructions. You send the agents instructions to extract data from your data warehouse in a later section.
To start the extraction agent, navigate to the AWS SCT Data Extractor Agent directory. For example, in Microsoft Windows, double-click C:\Program Files\AWS SCT Data Extractor Agent\StartAgent.bat.
- On the computer that has the extraction agent installed, from a command prompt or terminal window, run the command listed following for your operating system.
- To check the status of the agent, run the same command but replace start with status.
- To stop an agent, run the same command but replace start with stop.
- To restart an agent, run the same RestartAgent.bat file.
Register the data extraction agent
Follow these steps to register the data extraction agent:
- In AWS SCT, change the view to Data Migration view (other) and choose + Register.
- On the Connection tab:
- For Description, enter a name to identify the data extraction agent.
- For Host name, if you installed the data extraction agent on the same workstation as AWS SCT, enter 0.0.0.0 to indicate local host. Otherwise, enter the host name of the machine on which the AWS SCT data extraction agent is installed. It is recommended to install the data extraction agents on Linux for better performance.
- For Port, enter the number entered for the Listening port when installing the AWS SCT data extraction agent.
- Select the checkbox to use SSL (if using SSL) to encrypt the AWS SCT connection to the data extraction agent.
- If you're using SSL, then on the SSL tab:
- For Trust store, choose the trust store name created when generating trust and key stores (you can skip this if SSL connectivity isn't needed).
- For Key store, choose the key store name created when generating trust and key stores (you can skip this if SSL connectivity isn't needed).

- Choose Test Connection.
- Once the connection is validated successfully, choose Register.

Add virtual partitions for large tables (optional)
You can use AWS SCT to create virtual partitions to optimize migration performance. When virtual partitions are created, AWS SCT extracts the data for the partitions in parallel. We recommend creating virtual partitions for large tables.
Follow these steps to create virtual partitions:
- Deselect all objects in the source database view in AWS SCT.
- Choose the table for which you would like to add virtual partitioning.
- Right-click the table, and choose Add Virtual Partitioning.

- You can use List, Range, or Auto Split partitions. To learn more about virtual partitioning, refer to Use virtual partitioning in AWS SCT. In this example, we use Auto split partitioning, which generates range partitions automatically. You specify the start value, the end value, and how big each partition should be, and AWS SCT determines the partitions automatically. For a demonstration, on the Lineorder table:
- For Start Value, enter 1000000.
- For End Value, enter 3000000.
- For Interval, enter 1000000 to indicate the partition size.
- Choose OK.

You can see the partitions automatically generated under the Virtual Partitions tab. In this example, AWS SCT automatically created the following five partitions for the field:
- <1000000
- >=1000000 and <=2000000
- >2000000 and <=3000000
- >3000000
- IS NULL
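The boundary logic behind those five partitions is easy to reproduce. The following sketch (the column name lo_orderkey is assumed for the Lineorder table) generates the same predicates from a start value, end value, and interval:

```python
def auto_split_predicates(column, start, end, interval):
    """Reproduce the WHERE predicates that Auto Split range partitioning
    produces: one range below start, fixed-size ranges between start and end,
    one above end, and a NULL catch-all."""
    preds = [f"{column} < {start}"]
    lo = start
    while lo < end:
        hi = min(lo + interval, end)
        op = ">=" if lo == start else ">"  # first range is closed on the left
        preds.append(f"{column} {op} {lo} AND {column} <= {hi}")
        lo = hi
    preds.append(f"{column} > {end}")
    preds.append(f"{column} IS NULL")
    return preds

for p in auto_split_predicates("lo_orderkey", 1000000, 3000000, 1000000):
    print(p)
# lo_orderkey < 1000000
# lo_orderkey >= 1000000 AND lo_orderkey <= 2000000
# lo_orderkey > 2000000 AND lo_orderkey <= 3000000
# lo_orderkey > 3000000
# lo_orderkey IS NULL
```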

Create a local migration task
To migrate data from BigQuery to Amazon Redshift, create, run, and monitor the local migration task from AWS SCT. This step uses the data extraction agent to migrate data by creating a task.
Follow these steps to create a local migration task:
- In AWS SCT, under the schema name in the left pane, right-click Standard tables.
- Choose Create Local task.

- There are three migration modes from which you can choose:
- Extract source data and store it on a local computer/virtual machine (VM) where the agent runs.
- Extract data and upload it to an S3 bucket.
- Choose Extract, upload and copy, which extracts data to an S3 bucket and then copies it to Amazon Redshift.

- On the Advanced tab, for Google CS bucket folder, enter the Google Cloud Storage bucket/folder that you created earlier in the GCP Management Console. AWS SCT stores the extracted data in this location.

- On the Amazon S3 Settings tab, for Amazon S3 bucket folder, provide the bucket and folder names of the S3 bucket that you created earlier. The AWS SCT data extraction agent uploads the data into the S3 bucket/folder before copying it to Amazon Redshift.

- Choose Test Task.

- Once the task is successfully validated, choose Create.
Start the local data migration task
To start the task, choose the Start button on the Tasks tab.

- First, the data extraction agent extracts data from BigQuery into the GCP storage bucket.
- Then, the agent uploads the data to Amazon S3 and launches a COPY command to move the data to Amazon Redshift.
- At this point, AWS SCT has successfully migrated data from the source BigQuery table to the Amazon Redshift table.
View data in Amazon Redshift
After the data migration task executes successfully, you can connect to Amazon Redshift and validate the data.
Follow these steps to validate the data in Amazon Redshift:
- Navigate to the Amazon Redshift Query Editor V2.
- Double-click the Amazon Redshift Serverless workgroup name that you created.
- Choose the Federated user option under Authentication.
- Choose Create Connection.

- Create a new editor by choosing the + icon.
- In the editor, write a query to select from the schema name and table name/view name you would like to verify. Explore the data, run ad-hoc queries, and make visualizations, charts, and views.

The following is a side-by-side comparison between the source BigQuery and the target Amazon Redshift for the sports dataset that we used in this walkthrough.
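Beyond eyeballing a side-by-side comparison, a quick programmatic check is to compare per-table row counts from both systems. The helper below takes two {table: count} mappings, however you gather them, and reports any tables that disagree or are missing on one side. The table names and counts are illustrative.

```python
def rowcount_mismatches(bigquery_counts, redshift_counts):
    """Compare per-table row counts from source and target; return the tables
    whose counts differ or that are missing on one side."""
    mismatches = {}
    for table in set(bigquery_counts) | set(redshift_counts):
        src = bigquery_counts.get(table)
        tgt = redshift_counts.get(table)
        if src != tgt:
            mismatches[table] = (src, tgt)
    return mismatches

# Counts would come from a SELECT COUNT(*) per table on each side.
print(rowcount_mismatches(
    {"lineorder": 6000000, "customer": 30000},
    {"lineorder": 6000000, "customer": 29998},
))
# → {'customer': (30000, 29998)}
```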

Clean up any AWS resources that you created for this exercise
Follow these steps to terminate the EC2 instance:
- Navigate to the Amazon EC2 console.
- In the navigation pane, choose Instances.
- Select the checkbox for the EC2 instance that you created.
- Choose Instance state, and then Terminate instance.
- Choose Terminate when prompted for confirmation.
Follow these steps to delete the Amazon Redshift Serverless workgroup and namespace:
- Navigate to the Amazon Redshift Serverless dashboard.
- Under Namespaces / Workgroups, choose the workgroup that you created.
- Under Actions, choose Delete workgroup.
- Select the checkbox Delete the associated namespace.
- Uncheck Create final snapshot.
- Enter delete in the delete confirmation text box and choose Delete.

Follow these steps to delete the S3 bucket:
- Navigate to the Amazon S3 console.
- Choose the bucket that you created.
- Choose Delete.
- To confirm deletion, enter the name of the bucket in the text input field.
- Choose Delete bucket.
Conclusion
Migrating a data warehouse can be a challenging, complex, yet rewarding project. AWS SCT reduces the complexity of data warehouse migrations. Following this walkthrough, you can understand how a data migration task extracts, downloads, and then migrates data from BigQuery to Amazon Redshift. The solution that we presented in this post performs a one-time migration of database objects and data. Data changes made in BigQuery while the migration is in progress won't be reflected in Amazon Redshift. While data migration is in progress, put your ETL jobs to BigQuery on hold, or replay the ETLs by pointing them to Amazon Redshift after the migration. Consider using the best practices for AWS SCT.
AWS SCT has some limitations when using BigQuery as a source. For example, AWS SCT can't convert subqueries in analytic functions, geography functions, statistical aggregate functions, and so on. Find the full list of limitations in the AWS SCT user guide. We plan to address these limitations in future releases. Despite these limitations, you can use AWS SCT to automatically convert most of your BigQuery code and storage objects.
Download and install AWS SCT, sign in to the AWS Management Console, check out Amazon Redshift Serverless, and start migrating!
About the authors
Cedrick Hoodye is a Solutions Architect with a focus on database migrations using the AWS Database Migration Service (AWS DMS) and the AWS Schema Conversion Tool (AWS SCT) at AWS. He works on database migration-related challenges and works closely with EdTech, Energy, and ISV enterprise sector customers to help them realize the true potential of the DMS service. He has helped migrate hundreds of databases into the AWS Cloud using DMS and SCT.
Amit Arora is a Solutions Architect with a focus on Database and Analytics at AWS. He works with our Financial Technology and Global Energy customers and AWS certified partners to provide technical assistance and design customer solutions on cloud migration projects, helping customers migrate and modernize their existing databases to the AWS Cloud.
Jagadish Kumar is an Analytics Specialist Solutions Architect at AWS focused on Amazon Redshift. He is deeply passionate about data architecture and helps customers build analytics solutions at scale on AWS.
Anusha Challa is a Senior Analytics Specialist Solutions Architect at AWS focused on Amazon Redshift. She has helped many customers build large-scale data warehouse solutions in the cloud and on premises. Anusha is passionate about data analytics and data science and enabling customers to achieve success with their large-scale data projects.
