Kafka Community Spotlight #3
1. Personal
Please tell us about yourself, and where you are from.
Hi. My name is Jakub Scholz. Despite my slightly German name, I’m actually Czech. I was born in a small city called Olomouc, but now I live in Prague, Czech Republic.
How do you spend your free time? What are your hobbies?
Right now, I’m “in between” jobs. So I guess Strimzi and Apache Kafka count among my hobbies at the moment. 🙂
Aside from that, my biggest passion is football. I support Aston Villa in the English Premier League and Sigma Olomouc in the Czech league. I also do a lot of reading - usually fiction books, thrillers, crime stories, and so on. And I spend a lot of time walking.
What does your ideal weekend look like?
My football teams win! 😉
Best type of music? Best song?
To be honest, I’m not sure if I have any favourite music. I listen to all kinds of music, but usually it is just background for me when doing something else.
Favorite food? Best cuisine you’d recommend?
I like various Asian cuisines - Indian, Japanese, and so on. And of course, Czech cuisine - it goes well with beer 🙂.
What is the best advice you ever got?
Frankly, I’m not very good at listening. So I probably got a lot of great advice, which I didn’t really listen to or take seriously.
Have you studied at a university? If yes, was it related to computers? Did your study help you with your current job?
I studied at the Czech Technical University in Prague. When I want to sound special, I say that I studied at the Faculty of Nuclear Sciences and Physical Engineering, which sounds fancy. But the reality is a bit more ordinary, as I studied a mix of computer science, economics, and management. And not really nuclear sciences. 🙂
To be honest, I find the economics and management part more useful for life and work. Economics helps you to understand how the world works. And management helps you understand things at work. As for the computer science part, I think you can learn that yourself if you really want to.
2. Kafka
How did you get into Kafka?
I used to work with various messaging technologies at Deutsche Börse - the German stock exchange. Mainly with Apache Qpid and other AMQP-based messaging brokers. I was aware of Apache Kafka, but we did not use it. I started to really work with Kafka only when I joined Red Hat in 2017 to work on it.
What version of Kafka did you start with?
I think that was shortly before the 1.0.0 release. So probably 0.11?
When do you think one ought to use Kafka?
Kafka is pretty versatile. You can start at a small scale and grow from there. You can also use it to handle various use cases, from using it as a dumb pipe to deliver some observability data up to building sophisticated event-driven applications and near-real-time stream processing. I think almost everyone can find some reason to at least give it a try and play with it a bit.
Do you think Kafka has a high entry barrier?
I think it depends.
If you are a developer who wants to write applications using Kafka as consumers or producers, I think it is pretty easy to start. Although people often make it harder for themselves by insisting they will start developing with Kafka on Kubernetes instead of just downloading the Kafka ZIP file and running it locally.
And if you are an ops person who just got tasked with running a big Kafka cluster in production with all the availability and reliability requirements, it is definitely a lot harder. And while a lot of operational knowledge is published somewhere online, you sometimes need to learn things the hard way. But I do not think it is harder than other systems of a similar level of complexity.
Finally, if you are trying to join the community and contribute to the Apache Kafka project itself, that is pretty hard in my experience. It is one of the least friendly communities I have ever encountered. Or at least it was in the past. I mostly stopped trying a long time ago. But who knows, maybe things have improved since.
Can you share more on the community experience? This is a safe space to rant :)
Probably the most frustrating part for me was when I opened a KIP or a PR, and nobody reacted to it. I know that the goal of an open source community is not to accept every single contribution. But I think it is great if you can get at least some acknowledgement, maybe some explanation why it is not a good or interesting idea, and so on.
It was also clear that not all of the committers and PMC members, who are responsible for the community, acted independently. Rather, they were often working based on some behind-the-scenes decisions and agreements. So, for example, a KIP that had been completely ignored for months was suddenly approved within a few minutes because someone somewhere agreed on it. Also, around the ZooKeeper removal, it was clear way before any public announcement that there were many behind-the-scenes discussions and plans. And that many of the things were already decided.
That does not necessarily mean that the final decisions taken were wrong. But the lack of transparency and predictability is certainly not something I find encouraging when deciding in which project I’m going to invest my time.
I think that the second problem has improved at least a bit, partly because the community diversity improved and partly because Confluent’s focus shifted slightly to other things. The first issue probably continues, at least to some extent, to this day.
What’s your favorite feature in Kafka?
I probably don’t have a single favourite feature. Among the recent improvements, tiered storage is something that I believe is very valuable and useful. I also like the Kafka Streams API for its lightweight and yet very powerful approach to stream processing. I don’t think it gets the recognition it deserves.
What’s the most annoying thing in Kafka you can think of?
It is probably how the dynamic configuration updates are handled and how they do not have a single source of truth. That is something that I run into again and again.
If you had a magic wand and could instantly contribute/fix one thing to Kafka, what would it be?
It is not exactly a fix or contribution directly to Apache Kafka itself. But I always wanted to try to create a cloud-native “fork” of Kafka. Sadly, I’m too much of a coward to leave everything behind and fully focus on getting it done. And instead, I spend my time working on easier things that are faster to get done. If I had a magic wand, that is probably what I would use it for. But maybe soon AI will improve enough to be the magic wand and do it for me. 🙂
If AI had the magic wand, can you tell me the main prompt you’d give it for the cloud-native Kafka idea? What features are most important?
I don’t think I have the right prompt for it - at least not yet. But basically, Apache Kafka has a pretty monolithic design. It is great at scale. If you have big clusters with hundreds or thousands of partitions, big throughput, and many client applications … you get a lot of value from economies of scale. With a few big broker nodes, the JVM overhead is shared by many partitions. Kafka’s own logic for scheduling or balancing partitions across a large Kafka cluster makes sense as well.
But when all you need is a small cluster, for example, with one or two topics, each with a few partitions, Kafka’s efficiency isn’t so great anymore, and it becomes very heavy. The JVM overhead gets expensive. And using Kafka’s mechanisms for scheduling, bin-packing, or balancing becomes cumbersome. Now imagine you had a lightweight alternative that ran a single partition per process. And you could deploy it directly on Kubernetes and use Kubernetes’ own primitives for scaling, availability, bin-packing, scheduling, and so on?
3. Business/Work
Whose idea was Strimzi? How did you get into it?
It probably depends on what exactly you mean by the Strimzi idea. David Ingham at Red Hat brought me, Paolo Patierno, and Tom Bentley together with the idea of working on Kafka on Kubernetes. So all of us together started and shaped Strimzi. Paolo and I still work on Strimzi pretty much daily.
How has Strimzi changed over the years? Can you reminisce on the old days?
Initially, we started as a bunch of YAML files that you just applied to your Kubernetes cluster. Looking back, it is funny how primitive that was. And yet, at that time, it seemed pretty advanced. 🙂
But pretty quickly, we moved to the Kubernetes operator pattern. In March 2026, it will be 8 years since the first Strimzi release that followed the operator pattern. Obviously, during that time, there were many changes. The biggest one was definitely moving to KRaft and dropping the ZooKeeper support.
We also witnessed a lot of changes in the world around us. In the beginning, we had to do a lot of explaining of why it makes sense to run stateful applications like Apache Kafka on Kubernetes. And every conference talk started by explaining what a Kubernetes Operator actually is and how it works. These days, everyone understands it, and nobody questions it anymore.
Where is Strimzi today, and what are your ambitions with it? What would success look like in 3 years?
One of the past decisions I regret is that we were always postponing the Strimzi 1.0.0 release. We always had a reasonable justification for it, such as “When ZooKeeper is removed, too many things will change”, and so on. But waiting for 8 years for a 1.0 release is weird. And looking back, I think we should have done things differently. But this year, we will finally release the 1.0.0 version. It is currently planned for April 2026.
We should also start working on the next steps within the Cloud Native Computing Foundation. Strimzi is right now in the “incubating” projects category. And after the 1.0.0 release, we should start working on graduating and moving to “graduated” projects.
Is Kafka workable in Kubernetes without Strimzi? (in your opinion)
Operators like Strimzi have many advantages. But they also have some disadvantages. In particular, they are always “opinionated” about how they do things. And if you are someone who has been running Kafka in production for the last 10 years and you know exactly what to do and how to do it, this might be a big limitation. Running Kafka with a bunch of YAML files or with a Helm Chart might give you a lot more flexibility to do things exactly the way you want.
Of course, with flexibility also comes a lot of responsibility. So it is not something I would recommend to everyone. But you can definitely do it. However, you should think about it very carefully. And it probably should not be an individual decision. Because, as a company, it is probably better to be dependent on an open source community like Strimzi than on a single ops person who is the only one who knows how to upgrade your Kafka cluster.
Do you have thoughts on using local disk storage for Kafka vs. network-attached storage (e.g., EBS)?
I think both have their value. If you have good network-attached storage, you should probably stick with it. But there are many cases - especially in on-premises environments - where there is no network-attached storage available, or when it is available, it is too slow. So the fact that Kafka can deal well with local storage is a big advantage.
Do you have any experience and/or opinions on the different clouds (AWS, GCP, Azure)? Favorite one, least favorite one? Why?
I don’t think I have a favorite one. AWS usually surprises me again and again with how hard it is to deploy its own managed services. But maybe that is simply because I used it more than the other cloud providers. But one of the advantages of working on top of Kubernetes for so many years is that most of the time, I do not really need to care about the actual cloud I use. I just need to deploy Kubernetes there, and I’m immediately in a familiar place.
4. General/Parting
Do you have any thoughts on the general open-core Kafka industry? Things like proprietary Kafka implementations, vendors’ claims against Kafka, the competitive dynamic.
Competition is good. But I think the Kafka community needs to deal with two (slightly related) things:
- Confluent’s role in the Apache Kafka project
- And the dichotomy between Apache Kafka as the software project and Apache Kafka as the protocol
Not addressing these issues can cause a lot of problems and can jeopardize the future of the Apache Kafka project.
It is great that many different companies chose the Kafka protocol for their own proprietary implementations. It shows the importance and relevance of Apache Kafka. But maintaining compatibility between different implementations is pretty hard. Especially when the Kafka protocol does not have a proper specification and versioning because it was not implemented with the intent of being used by various vendors.
So it can very easily happen that the whole Kafka ecosystem suddenly falls apart with too many incompatible implementations. I’m not sure if the Apache Kafka PMC members and committers can really prevent the spread of alternative Kafka implementations. But if not, maybe it is time to “democratize” the Kafka protocol and consider spinning it off from the Apache Kafka implementation?
But as I know pretty well from Strimzi, building a good community and ecosystem is a really hard task. So I do not really want to pretend like I have all the right answers for these issues. We have a lot of our own issues and struggles. So please take it just as my personal take and nothing else.
What’s your opinion on AMQP as a protocol versus, say, Kafka’s protocol?
I used to work with AMQP a lot in the past. It is an open messaging protocol designed for interoperability. It defines the transport layer, type system, and so on. It is approved as the ISO/IEC 19464 standard. And I’m still a big fan of it. But I guess it did not really fulfill many of its expectations. Trying to create an independent protocol with multiple vendors and proper standardization is hard and takes a lot of time.
A lot of it is probably also a question of luck and good timing. AMQP 1.0 landed at a time when the “traditional” message queueing projects were just going a bit out of fashion. And newer projects like Kafka were getting a lot of hype and a lot of great marketing. The fact that RabbitMQ, as one of the big projects in the space, was for a long time stuck on old versions of the AMQP protocol and did not adopt the 1.0 version, which brought major changes and improvements, did not help either.
A proprietary protocol like Kafka’s can move and develop much faster. But as I already suggested before, maybe we need to try to find some middle ground on how to maintain and foster the Kafka protocol within the wider ecosystem and not just within the Apache Kafka project itself?
What was Apache Qpid in a few words, and how was your experience with it at Deutsche Börse?
The Apache Qpid project still exists and continues to work on the AMQP clients and the Java-based messaging broker. At Deutsche Börse, I used to work with the C++-based AMQP messaging broker (or rather the Red Hat MRG-Messaging product based on it). We used it both for internal communication within the company and to communicate with customers. The AMQP protocol was great for this because it provided stability and compatibility across the different clients and releases. Once I started to work on Strimzi, I did not have much time for it anymore. But the Qpid community was great. It was always very friendly, welcoming, and helpful.
Thoughts on Queues vs Fan-out Log messaging? What’s more popular, what do you prefer?
I have my own cattle versus pet analogy for this. If you care about individual messages, if you have the need to query for some particular message because of its parameters, and so on … then the message is a pet for you. And you should probably not use Kafka. On the other hand, if you see the messages as a uniform stream of things to receive and process, then they are cattle, and Kafka is a good choice for you.
The new “Kafka queues” feature is interesting if you are already a big Kafka user and need just a little bit of queueing. But I do not think it should replace messaging brokers designed with queues as a first-class citizen.
That said, a lot of the actual usage is not driven by completely rational decisions. Often, we want to use the new shiny thing. And at this point, while the main Kafka hype is probably over, I think the Kafka ecosystem is still “in”. But it might change one day again.
How do you see the future of Kafka usage and development, 5 years out?
I think it will continue to provide stability and reliability, while continuing to slowly absorb the new features from the various forks and competitors. It is always easier and faster to develop new features when starting fresh without the years of history, old features, old code, and old design decisions. But eventually, Kafka will bring in these new features, and the majority of users appreciate the stability and maturity of the Apache Kafka project. We can see this, for example, with features such as tiered storage or diskless topics.
What I’m curious about is whether something brand new shows up and squeezes Kafka out of the market. So far, all the attempts were only slightly better. They found their use cases and their share of users. But the change was never big enough to gather enough interest.
Do you think we’ve innovated in the messaging space in the last 10-15 years? How have you seen the space change?
I don’t know. It often feels like we are running in circles. Rediscovering the things we have forgotten and forgetting the things we have used before. Most innovation seems to come in how we deliver the capabilities, rather than the capabilities themselves. When I look at Apache Kafka, it did not invent pub-sub messaging. It did not invent commit logs. It did not invent event-driven applications. And so on. Its main innovation was that it gave them to everyone for free, at scale, and in easy-to-use packaging. Similarly, managed services later became the main competitor to “running Apache Kafka yourself”. Again, the main innovation was how the same old capabilities are delivered. So there is definitely a lot of innovation going on. But it does not always happen in the low-level technical primitives.
Any Social Media channels of yours we can follow?
You should be able to find me in the usual places - LinkedIn, Bluesky, Twitter.
Anything else you’d like to add?
Up the Villa! 😉