The Programming Laboratory focuses on fundamental problems in programming languages.
Schema Registry in Kafka
Recently, I was inspired by an article by Amarpreet Singh, so I decided to do some research.
Why Schema Registry?
Kafka transfers data in byte format. No data verification is done at the Kafka cluster level. In fact, Kafka doesn’t even know what kind of data it is sending or receiving, whether it is a string or an integer.
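To make this concrete, here is a minimal sketch (plain Python, no Kafka client involved) of why raw bytes alone are ambiguous. The payload value is an arbitrary example chosen for illustration; the same four bytes are equally valid as a string or as an integer.

```python
# The same four bytes, as they might sit in a Kafka record value.
payload = b"ABCD"

# Reading them as UTF-8 text gives a string...
as_text = payload.decode("utf-8")

# ...while reading them as a big-endian integer gives a number.
as_int = int.from_bytes(payload, byteorder="big")

print(as_text)  # ABCD
print(as_int)   # 1094861636
```

Nothing in the bytes themselves says which reading is the intended one; that knowledge has to come from somewhere else.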
Due to the decoupled nature of Kafka, producers and consumers do not communicate with each other directly; information is transferred via a Kafka topic. At the same time, the consumer still needs to know the type of data the producer is sending in order to deserialize it. What if the producer starts sending bad data to Kafka, or the type of your data changes? Then your consumers will start breaking. We need a common data type that both sides agree upon.
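The failure mode can be sketched without Kafka at all. In the example below, the function name `consume_user_id` and both messages are hypothetical; the consumer assumes a 4-byte integer, and it breaks as soon as the producer switches to sending a string.

```python
import struct


def consume_user_id(payload: bytes) -> int:
    # The consumer assumes every message is a 4-byte big-endian signed integer.
    (user_id,) = struct.unpack(">i", payload)
    return user_id


# The producer originally sends integers...
old_message = struct.pack(">i", 42)
# ...and later, without telling anyone, switches to strings.
new_message = "forty-two".encode("utf-8")

print(consume_user_id(old_message))  # 42

try:
    consume_user_id(new_message)
except struct.error as exc:
    # The consumer fails because the bytes no longer match its assumption.
    print("consumer broke:", exc)
```

Kafka accepted both messages without complaint; the problem only surfaces on the consumer side.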
That’s where Schema Registry comes into the picture. It is an application that resides outside of your Kafka cluster and handles the distribution of schemas to producers and consumers, keeping a copy of each schema in its local cache.
Schema Registry Architecture
With the schema registry in place, the producer talks to the schema registry before sending data to Kafka and checks whether the schema is available. If the schema is not found, the producer registers it with the schema registry and caches it. Once the producer has the schema, it serializes the data with it and sends the data to Kafka in binary format, prefixed with a unique schema ID. When the consumer processes this message, it contacts the schema registry with the schema ID embedded in the message and deserializes the data using the same schema. If there is a schema mismatch, the schema registry will throw an error letting the producer know that it’s breaking the schema agreement.
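The flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `InMemoryRegistry`, `encode` and `decode` are hypothetical names, the registry is an in-process dictionary instead of a real service, and JSON stands in for a real serialization format such as Avro. The framing follows the Confluent wire format as I understand it: one magic byte (0), then the schema ID as a 4-byte big-endian integer, then the payload.

```python
import json
import struct

MAGIC_BYTE = 0  # first byte of the framing used in this sketch


class InMemoryRegistry:
    """Hypothetical stand-in for a schema registry: maps schemas to IDs."""

    def __init__(self):
        self._ids = {}      # schema text -> schema ID
        self._schemas = {}  # schema ID -> schema text

    def register(self, schema: str) -> int:
        # Reuse the existing ID if the schema is already registered.
        if schema not in self._ids:
            new_id = len(self._ids) + 1
            self._ids[schema] = new_id
            self._schemas[new_id] = schema
        return self._ids[schema]

    def get(self, schema_id: int) -> str:
        return self._schemas[schema_id]


def encode(registry: InMemoryRegistry, schema: str, record: dict) -> bytes:
    # Producer side: look up (or register) the schema, then prefix the payload
    # with the magic byte and the 4-byte big-endian schema ID.
    schema_id = registry.register(schema)
    header = struct.pack(">bI", MAGIC_BYTE, schema_id)
    return header + json.dumps(record).encode("utf-8")  # JSON stands in for Avro


def decode(registry: InMemoryRegistry, message: bytes):
    # Consumer side: read the schema ID from the header and fetch the schema.
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != MAGIC_BYTE:
        raise ValueError("unknown magic byte")
    schema = registry.get(schema_id)
    return schema, json.loads(message[5:].decode("utf-8"))


registry = InMemoryRegistry()
user_schema = '{"type": "record", "name": "User", "fields": [{"name": "id", "type": "int"}]}'

message = encode(registry, user_schema, {"id": 7})
schema, record = decode(registry, message)
print(message[:5])  # b'\x00\x00\x00\x00\x01'
print(record)       # {'id': 7}
```

Note that only the 5-byte header travels with each message; the schema itself stays in the registry, which is why the consumer has to be able to reach it.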
Conclusion
Schema Registry is a simple concept, but it’s really powerful in enforcing data governance within your Kafka architecture. Schemas reside outside of your Kafka cluster and only the schema ID travels with the data in Kafka, which makes the schema registry a critical component of your infrastructure. If the schema registry is not available, producers and consumers will break. So it is always a best practice to ensure your schema registry is highly available.