Kafka has become a key data infrastructure technology, and we all have at least a vague sense that it is a messaging system, but what else is it? How can an overgrown message bus be getting this much buzz? Well, because Kafka is not merely a message bus, but the center of a rich streaming data platform that invites detailed exploration.
In this talk, we’ll look at the entire open-source streaming platform provided by the Apache Kafka and Confluent Open Source projects. Starting with a lonely key-value pair, we’ll build up topics, partitioning, replication, and low-level Producer and Consumer APIs. We’ll group consumers into elastically scalable, fault-tolerant application clusters, then layer on more sophisticated stream processing APIs like Kafka Streams and KSQL. We’ll help teams collaborate around data formats with schema management. We’ll integrate with legacy systems without writing custom code. By the time we’re done, the open-source project we thought was Big Data’s answer to message queues will have become an enterprise-grade streaming platform, all in 90 minutes.
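The partitioning step mentioned above can be sketched in a few lines. Kafka's default partitioner hashes a record's key (using murmur2 on the key bytes) and takes the result modulo the number of partitions, so records with the same key always land on the same partition. A minimal illustrative sketch, with CRC-32 standing in for murmur2:

```python
import zlib

def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition, Kafka-style.

    Kafka's default partitioner uses murmur2 on the key bytes;
    CRC-32 stands in here as a deterministic, illustrative hash.
    """
    return zlib.crc32(key) % num_partitions

# Records with the same key always map to the same partition,
# which is what gives Kafka its per-key ordering guarantee.
p1 = partition_for(b"user-42", 6)
p2 = partition_for(b"user-42", 6)
assert p1 == p2
```

The consequence is worth dwelling on: per-key ordering survives across producers and restarts precisely because the mapping depends only on the key and the partition count.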
On the inside, Kafka is schemaless, but there is nothing schemaless about the worlds we live in. Our languages impose type systems, and the objects in our business domains have fixed sets of properties and semantics that must be obeyed. Pretending that we can operate without competent schema management does us no good at all.
In this talk, we’ll explore how the different parts of the open-source Kafka ecosystem help us manage schema, from KSQL’s data format opinions to the full power of the Confluent Schema Registry. We will examine the Schema Registry’s operation in some detail, see how it handles schema migrations, and look at examples of client code that makes proper use of it. You’ll leave this talk seeing that schema is not just an inconvenience to be endured, but a key means of collaboration around an enterprise-wide streaming platform.
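To make the migration idea concrete: under the Registry's default BACKWARD compatibility mode, a new schema must be able to read data written with the previous one, which for Avro-style records means any newly added field needs a default value. A simplified, purely illustrative check (the helper and its field representation are hypothetical, not the Registry's actual API; real Avro resolution also checks type promotions):

```python
def is_backward_compatible(old_fields: dict, new_fields: dict) -> bool:
    """Simplified stand-in for a BACKWARD compatibility check.

    old_fields / new_fields map field name -> default value, with
    None meaning "no default". This sketch covers only added fields.
    """
    for name, default in new_fields.items():
        if name not in old_fields and default is None:
            # A new field with no default cannot be filled in when
            # reading old data, so the change is breaking.
            return False
    return True

old = {"id": None, "name": None}
# Adding "email" with a default is safe; without one, it is not.
ok = is_backward_compatible(old, {"id": None, "name": None, "email": ""})
bad = is_backward_compatible(old, {"id": None, "name": None, "email": None})
```

A check like this is what lets the Registry reject a breaking schema at publish time, long before any consumer fails at read time.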
Your goal is simple: take everything that is happening in your company—every click, every database change, every application log—and make it all available as a real-time stream of well-structured data. No big deal! You’re just taking your decades-old, batch-oriented data integration and data processing and migrating to real-time streams and real-time processing. In your shop, you call that Tuesday. But among the several challenges you’ll have to tackle, getting data in and out of that stream processing system stands out, and there’s a whole bunch of code there you don’t want to write. This is where Kafka Connect comes in.
Connect is a standard part of Apache Kafka that provides a scalable, fault-tolerant way to stream data between Kafka and other systems. It provides an API for developing standard connectors for common data sources and sinks, giving you the ability to ingest database changes, write streams to tables, store archives in HDFS, and more. We’ll explore the standard connector implementations offered in the Confluent Open Source download, and look at a few operational questions as well. Come to this session to get connected to Kafka!
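As a taste of what “without writing custom code” looks like, a standalone Connect worker can be pointed at a source with nothing but a properties file. A minimal sketch using the FileStreamSource connector that ships with Apache Kafka (the file path and topic name are placeholders):

```properties
# Illustrative example: stream lines of a log file into a Kafka topic.
name=local-file-source
connector.class=org.apache.kafka.connect.file.FileStreamSourceConnector
tasks.max=1
file=/var/log/app/events.log
topic=app-events
```

Swap the connector class and a handful of properties and the same worker ingests database changes or writes streams out to HDFS instead.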
Harold Macmillan was Prime Minister of the United Kingdom from 1957 to 1963, the last British PM born during Queen Victoria’s reign, and one whose wit and even-keeled nature defined his administration. When asked by a reporter what might force his government off the course he had firmly laid out for it, he allegedly replied “Events, dear boy, events.”
The same might be said about what is driving software architectures today. Event-driven systems have enabled organizations to build substantial microservices ecosystems with all of the decoupling and evolvability that we were promised by the distributed computing technologies of 20 years ago. But these systems raise some interesting questions: if events now rule, what has become of entities? If we store events in logs, do we still need databases? Can we merely produce immutable events to trivially scalable logs and loose our microservices to consume them with no regard for what is actually out there in the world?
To make sense of this, we turn to the past. Spanning the 2,500 years before Macmillan deployed his wit on that poor reporter, we will look at what Heraclitus, Aristotle, Karl Popper, and W.V.O. Quine thought and wrote about these same questions. Are there things in the world that maintain their identity over time, or is the world just a sequence of experiences? Life may be a stream of events, but sometimes I still want to look things up by key. Four great thinkers will help us be better at following the paradigm that will shape our systems for the next generation. And as usual, a good philosophy lesson will make us better at practical tasks. We’ll apply a rich view of events and entities to a proposed microservices architecture that can last the next decade.
Build and test software written in Java and many other languages with Gradle, the open source project automation tool that’s getting a lot of attention. This concise introduction provides numerous code examples to help you explore Gradle, both as a build tool and as a complete solution for automating the compilation, test, and release process of simple and enterprise-level applications.
Discover how Gradle improves on the best ideas of Ant, Maven, and other build tools, with standards for developers who want them and lots of flexibility for those who prefer less structure.
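The convention-over-configuration style the book describes starts from builds as small as this — an illustrative minimal build.gradle for a Java project (the dependency version is only an example):

```groovy
// Minimal Gradle build for a Java project: the 'java' plugin
// supplies compile, test, and jar tasks by convention.
plugins {
    id 'java'
}

repositories {
    mavenCentral()
}

dependencies {
    testImplementation 'junit:junit:4.13.2'
}
```

From there, `gradle build` compiles, tests, and packages the project with no further configuration, and flexibility is added only where a project actually departs from convention.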