Maximizing Data Engineering Potential with Event-Driven Architecture on Google Cloud

June 21, 2023

Overview

Monolithic applications are complex, and their tightly coupled components make them harder to manage and scale than modern applications built with microservices. Monolithic architectures are also poorly suited to applications that require real-time or near-real-time processing. They typically rely on synchronous request-response calls, in which each caller waits for a reply before continuing, and this can lead to latency issues.

Event-Driven Architecture (EDA) offers a solution by decoupling services and enabling real-time processing through events. In this blog post, we explore what EDA is, how it works on Google Cloud, and when to apply it in data engineering. We also show how EDA can be used to decouple a monolithic infrastructure using Google Cloud Pub/Sub and Cloud Functions.

What is Event-Driven Architecture?

EDA is a distributed architecture where components exchange events, enabling loosely coupled communication driven by event occurrence. It promotes agility, scalability, reliability, and resilience. EDA finds applications in microservices integration, data sharing, and external data ingestion/analytics.

Here are a few data engineering scenarios where an event-driven architecture could be a better choice and the right approach:

  • Event-based integration at the application level of each microservice.
  • Sharing and normalizing data across applications.
  • Connecting data from external source devices for data ingestion/analytics.

EDA Applications and Benefits on Cloud

The cloud is an ideal environment for implementing an event-driven architecture because it allows services to scale dynamically in response to changes in demand. Cloud providers such as AWS, Azure, and Google Cloud offer a range of tools and services that can be used to implement EDA, such as AWS Lambda and Google Cloud Pub/Sub.

There are many ways to build an event-driven architecture (EDA) in the cloud. A common approach follows these steps:

  • Identify the events: The first step in implementing an event-driven architecture in the cloud is to identify the events that are meaningful to the system.
  • Use a cloud-based event producer, broker, and consumer: Once events are identified, they can be mapped to event producers and event consumers. Cloud-based event brokers provide a central location for publishing and subscribing to events.
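
The producer, broker, and consumer roles described above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a cloud service: the Broker class, the topic name "file-arrived", and the event fields are hypothetical names chosen for this example.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Event = Dict[str, str]
Handler = Callable[[Event], None]


class Broker:
    """Hypothetical in-memory broker: routes events from producers to consumers."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # A consumer registers interest in a topic, not in a specific producer.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Event) -> int:
        # The producer only knows the topic; the broker fans the event out.
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


broker = Broker()
audit_log: List[str] = []
to_validate: List[str] = []

# Two independent consumers of the same event.
broker.subscribe("file-arrived", lambda e: audit_log.append(e["name"]))
broker.subscribe("file-arrived", lambda e: to_validate.append(e["name"]))

delivered = broker.publish("file-arrived", {"name": "orders_20230621.csv"})
print(delivered, audit_log, to_validate)
```

Because the producer addresses a topic rather than a consumer, a third subscriber could be added later without touching the publishing code. A managed broker plays the same role across processes and machines.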

Figure 1: A typical Event-Driven Architecture Framework

Challenges Solved By Implementing EDA

  • In the traditional request-response model, each service must wait for a response from the other services before continuing. This creates a bottleneck and slows down the entire system.
  • To make decisions quickly, companies need to process events faster.
  • Adopting EDA can address major challenges such as scalability, fault tolerance, high availability, and resilience.
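
The difference from the request-response model can be illustrated with a queue between a producer and a consumer. The following is a small sketch using only the Python standard library. The file names and the upper-casing "work" are placeholders, and a real system would use a managed message service rather than an in-process queue.

```python
import queue
import threading
from typing import List

events: queue.Queue = queue.Queue()
processed: List[str] = []


def consumer() -> None:
    # Runs independently of the producer and drains events as they arrive.
    while True:
        event = events.get()
        if event is None:  # sentinel: no more events
            break
        processed.append(event.upper())  # placeholder for real processing


worker = threading.Thread(target=consumer)
worker.start()

# The producer enqueues events and moves on; it never waits for a result.
for name in ["a.csv", "b.csv", "c.csv"]:
    events.put(name)

events.put(None)  # signal shutdown
worker.join()     # only for this demo, so the final state can be inspected
print(processed)
```

The producer loop finishes as soon as the events are enqueued, regardless of how long each one takes to process.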

EDA on Google Cloud for Data Engineering

Google Cloud offers a variety of services to build Event-Driven Architectures (EDAs). Here are some of the most popular Google Cloud services for EDAs:

  • Google Cloud Pub/Sub: A fully managed messaging service that makes it easy to send and receive messages between independent applications.
  • Google Cloud Functions: A serverless execution environment for building and connecting cloud services.
  • Google Cloud Storage: Google’s object store, which can emit events that trigger Cloud Functions.
  • Google Cloud Run: A serverless computing platform that enables developers to run stateless containers on demand without managing the underlying infrastructure.
  • Google Dataflow: A fully managed streaming analytics service that minimizes latency, processing time, and cost through autoscaling and batch processing.
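
To show how these services connect, here is a hedged sketch of a function that receives a Pub/Sub message describing a new Cloud Storage object. Pub/Sub delivers the message body base64-encoded, and Cloud Storage notifications carry attributes such as eventType, bucketId, and objectId. The function name handle_pubsub_event, the bucket "inbound-files", and the simulated event are assumptions for illustration. The exact handler signature depends on the Cloud Functions generation and runtime, so check it against the official documentation before deploying.

```python
import base64
import json
from typing import Any, Dict, Optional


def handle_pubsub_event(event: Dict[str, Any], context: Optional[Any] = None) -> Dict[str, str]:
    """Decode a Pub/Sub-style event and return the fields a pipeline would need."""
    attributes = event.get("attributes") or {}
    # The message body arrives base64-encoded; here it is assumed to be JSON.
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    return {
        "event_type": attributes.get("eventType", "UNKNOWN"),
        "bucket": attributes.get("bucketId", payload.get("bucket", "")),
        "object": attributes.get("objectId", payload.get("name", "")),
    }


# Simulated event so the sketch runs locally without any Google Cloud dependency.
body = json.dumps({"bucket": "inbound-files", "name": "orders.csv"}).encode("utf-8")
sample_event = {
    "data": base64.b64encode(body).decode("ascii"),
    "attributes": {
        "eventType": "OBJECT_FINALIZE",
        "bucketId": "inbound-files",
        "objectId": "orders.csv",
    },
}
result = handle_pubsub_event(sample_event)
print(result)
```

Keeping the decoding logic in a plain function like this makes it testable locally, independent of how the function is eventually deployed.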

Application of EDA for Data Engineering: A Hexaware Success Story

Our client is a dynamic and innovative organization with a global footprint. They are at the forefront of the healthcare industry. With a strong focus on revolutionizing the pharma landscape, the company combines cutting-edge technology with a deep understanding of patient care. Through their state-of-the-art software solutions, they empower pharmacies to effectively manage prescriptions, automate tasks, and provide personalized care to patients.

Challenges:

The client operated within an on-premises environment where they regularly received files from their partners in a standardized format, delivered to a designated SFTP folder. Quality and compliance checks were manual: personnel performed the validations and either passed each file or sent it back to the vendor for revision per the validation checklist. As the number of partners grew, so did the volume of files, and the manual process created a growing number of challenges. Recognizing the limitations of the current system, the client acknowledged the need for an automated solution to streamline the file-handling process.

The Hexaware Solution:

We provided a robust solution that detected files as soon as they were placed in the SFTP folder and automatically updated the database if a file passed all checks. If a file had errors, it was automatically sent back to the partner. The solution used a hybrid approach built on Google Cloud Pub/Sub and an Event-Driven Architecture.

To solve the problem of processing and storing events, we employed an event-driven approach using Google Cloud Pub/Sub. We leveraged Google Cloud serverless services, which fit this type of application well. We followed the approach below:

Figure 2: Event-Driven approach using Google’s Pub/Sub Service

  • Data Placement: Data files are uploaded to a Google Cloud Storage bucket from the on-premises environment. When a file is ingested, a message is triggered and sent to Google’s Pub/Sub service. This message acts as a notification for further processing.
  • Getting & Tracking the Data Files: The arrival of a data file triggers the “e1” event. The Data Engineering (DE) service consumes this event and checks if the file’s name follows the predefined naming convention. It also tracks file arrivals at a specified time. If the file meets the criteria, the DE service sends an “e2” message to Pub/Sub, indicating that the data files have arrived.
  • Parsing of Data Files: The “e2” message is consumed by a Dataflow (DF) pipeline invoker, which is a Cloud Function. The function launches a DF pipeline, which parses the file and checks the file length and required fields. If the checks succeed, the pipeline sends an “e3” message. If errors occur, a message is sent to a different topic for error logging.
  • Outbound Data to On-Premises: The “e4” message is consumed by an API invoker. After processing the data files, it sends the data back to the on-premises environment, where multiple databases are present. The data is split and fed into the required tables. Data syncing also occurs using the “e5” event message within the cloud.
  • The response generator function collects information about the validity of records and sends it back to the customer.
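
The first stages of this flow can be sketched as plain Python functions that take an incoming event and return the messages to publish next. This is an illustration only, not the implementation used in the project. The naming convention, the required fields, and the "errors" topic name are hypothetical, the Dataflow pipeline is reduced to a simple field check, and the stages after “e3” are omitted.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical naming convention: <partner>_<YYYYMMDD>.csv
FILE_NAME_PATTERN = re.compile(r"^[a-z0-9]+_\d{8}\.csv$")
# Hypothetical required fields for each record.
REQUIRED_FIELDS = ("record_id", "partner_id")

Message = Tuple[str, Dict[str, object]]  # (topic, payload)


def on_e1(file_name: str) -> List[Message]:
    """File arrived: check the name, then announce it on e2 or report an error."""
    if FILE_NAME_PATTERN.match(file_name):
        return [("e2", {"file": file_name})]
    return [("errors", {"file": file_name, "reason": "bad file name"})]


def on_e2(file_name: str, rows: List[Dict[str, str]]) -> List[Message]:
    """Parse step: verify required fields, then emit e3 or an error message."""
    bad_rows = [
        index
        for index, row in enumerate(rows)
        if any(not row.get(field) for field in REQUIRED_FIELDS)
    ]
    if bad_rows:
        return [("errors", {"file": file_name, "reason": "missing fields", "rows": bad_rows})]
    return [("e3", {"file": file_name, "row_count": len(rows)})]


print(on_e1("acme_20230621.csv"))
print(on_e1("report final.xlsx"))
print(on_e2("acme_20230621.csv", [{"record_id": "1", "partner_id": "acme"}]))
print(on_e2("acme_20230621.csv", [{"record_id": "", "partner_id": "acme"}]))
```

Each stage only knows which topic it reads from and which topics it writes to, so a stage can be replaced or scaled without changing the others.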

This architecture demonstrates an event-driven approach using Google’s Pub/Sub service to enable efficient processing and storage of data files in a distributed system. It leverages serverless services on Google Cloud for scalability and flexibility.

Business Benefits of EDA

  • With the EDA-driven approach, services can communicate asynchronously so that one service can send an event and move on without waiting for a response, making the overall system more resilient and responsive.
  • New applications can be deployed and integrated without affecting existing services, which makes the system easier to scale.
  • The system can handle heavy traffic from many events with low latency.
  • This approach also helps companies save on infrastructure setup and development costs.
  • EDA is more fault-tolerant than tightly coupled systems: a failure in one service is less likely to bring down the others, which reduces downtime.

Conclusion

To summarize, if you are building a new application or looking to modernize an existing one, EDA is an excellent option. It can help you build a more scalable, resilient, and agile application. Hexaware offers comprehensive expertise in Event-Driven Architecture, delivering scalable, resilient, and cloud-native solutions. We enable organizations to unlock real-time insights, drive operational efficiency, and achieve competitive advantage in today’s dynamic business landscape.

For more information, contact us at marketing@hexaware.com.

About the Author

Joy Maitra

With over 15 years of experience in IT, Joy Maitra has a strong foundation in data warehousing. His expertise has evolved into the rapidly growing field of Generative AI, where he now focuses on innovative solutions to deliver transformative business outcomes. His diverse background enables him to bridge the gap between traditional data architectures and next-gen AI-driven solutions.
