AWS Glue is a serverless data integration service provided by Amazon as part of Amazon Web Services (AWS) and used by hundreds of thousands of organizations worldwide. Because AWS Glue runs in a serverless environment, there are no servers to provision, configure, or spin up.
With a pay-as-you-go pricing model and the capability to integrate petabyte-scale volumes of data, Glue is quickly becoming a popular choice for building secure and scalable data lake, data warehouse, lakehouse, and data mesh architectures. With AWS Glue, you can discover and connect to diverse data sources and manage your data in a centralized data catalog. Its capabilities include data discovery, modern ETL, data cleansing, and transformation. As a serverless service, it requires no infrastructure management and offers flexible support for ETL, ELT, and streaming workloads. Because it scales on demand, it lets you focus on high-value activities that maximize the value of your data.
Glue Connectors
AWS Glue provides built-in support for commonly used data stores such as Amazon Redshift, Amazon Aurora, and Microsoft SQL Server through JDBC connections. These connectors let you create Glue jobs that extract, transform, and load (ETL) data. A connector is an optional code package that assists with accessing data stores in AWS Glue Studio.
AWS Glue also lets you subscribe to connectors offered in AWS Marketplace. When creating a job, you can use a natively supported data source or a connector from AWS Marketplace, depending on the source system you want to extract data from.
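To make the choice between a native source and a Marketplace connector concrete, here is a minimal sketch in Python of how the source settings for a Glue job script might be assembled. The helper names (native_jdbc_source, marketplace_jdbc_source), the host, table, and connection names are all hypothetical and chosen only for illustration. The connection type strings and option keys follow the AWS Glue connection parameter documentation as best I understand it, so verify them against the current docs before relying on them. The actual read call is left as a comment because the awsglue library is only available inside the Glue runtime.

```python
from typing import Any, Dict

# Connection type strings that Glue documents for natively supported JDBC stores.
NATIVE_JDBC_TYPES = {"mysql", "postgresql", "redshift", "sqlserver", "oracle"}


def native_jdbc_source(connection_type: str, url: str, table: str,
                       user: str, password: str) -> Dict[str, Any]:
    """Build keyword arguments for a natively supported JDBC source (hypothetical helper)."""
    if connection_type not in NATIVE_JDBC_TYPES:
        raise ValueError(f"unsupported native JDBC type: {connection_type}")
    return {
        "connection_type": connection_type,
        "connection_options": {
            "url": url,          # JDBC URL of the data store
            "dbtable": table,    # table to read
            "user": user,
            "password": password,
        },
    }


def marketplace_jdbc_source(connection_name: str, table: str) -> Dict[str, Any]:
    """Build keyword arguments for an AWS Marketplace JDBC connector (hypothetical helper)."""
    return {
        "connection_type": "marketplace.jdbc",
        "connection_options": {
            "connectionName": connection_name,  # Glue connection created from the subscription
            "dbTable": table,
        },
    }


if __name__ == "__main__":
    # Placeholder values; in a real job, fetch credentials from a secret store
    # rather than hard-coding them.
    source = native_jdbc_source(
        "sqlserver",
        "jdbc:sqlserver://example-host:1433;databaseName=sales",
        "dbo.orders",
        "etl_user",
        "example-password",
    )
    print(source["connection_type"])

    # Inside a Glue job, the dictionary would be passed to the reader, e.g.:
    #   frame = glue_context.create_dynamic_frame.from_options(**source)
```

Keeping the source settings in one small function means a job can switch from a native data store to a Marketplace connector by changing a single call, while the transform and load steps stay the same.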