Oleg Zhurakousky

Principal Architect, Hortonworks

Oleg is a Principal Architect with Hortonworks, responsible for architecting scalable Big Data solutions using various open source technologies available within and outside the Hadoop ecosystem. Before Hortonworks, Oleg was part of SpringSource/VMware, where he was a core engineer working on the Spring Integration framework, leading the Spring Integration Scala DSL and contributing to other projects in the Spring portfolio. He has 17+ years of experience in software engineering across multiple disciplines, including software architecture and design, consulting, business analysis and application development. Oleg has been focusing on professional Java development since 1999. Since 2004 he has been heavily involved in using several open source technologies and platforms across a number of projects around the world, spanning industries such as Telecom, Banking, Law Enforcement, US DOD and others.
As a speaker, Oleg has presented seminars at dozens of conferences worldwide (including SpringOne, JavaOne, Java Zone, Jazoon, Java2Days, Scala Days, Uberconf, and others).

Presentations

High Speed Continuous & Reliable Data Ingest into Hadoop

8:30 AM MDT

This talk will explore the area of real-time data ingest into Hadoop and present the architectural trade-offs as well as demonstrate alternative implementations that strike the appropriate balance across the following common challenges:

Decentralized writes (multiple data centers and collectors)
Continuous Availability, High Reliability
No loss of data
Elasticity of introducing more writers
Bursts in Speed per syslog emitter
Continuous, real-time collection
Flexible Write Targets (local FS, HDFS etc.)

Go Beyond "Debug": Wire Tap your App for Knowledge with Hadoop

10:30 AM MDT

Today, application developers devote roughly 80% of their code to persisting roughly 20% of the total data flowing through their applications. That means two things:

80% of the data flowing through our applications is at best lost in rolling log files, at worst never collected – without ever being analyzed or accounted for.
Application-level database programming, licensing, storage, administration, and ETL processing have maxed out IT budgets and have constrained app development teams from keeping pace with the rate of change in the business.
The other 80% of the data is “Event Data” that can no longer be ignored if you want to stay competitive. Changes to application state are already stored as a sequence of events in application and middleware logs. In fact, since this data never held value to anyone but the developer in the past, a lot of potentially valuable information is often never collected. With Hadoop, we can:

store and query these events - Transaction tracing,
use the event log to reconstruct the application domain at any point in time - ETL,
use the same event log to construct new domains we haven't planned for - ELT, and
automatically adjust our data domains to cope with retroactive changes - ???
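
To give a rough feel for reconstructing the application domain at a point in time, here is a minimal Python sketch that replays an in-memory event log up to a chosen timestamp. The event shape (timestamp, account, balance change), the sample data, and the function name `state_at` are all hypothetical and chosen only for illustration; nothing here is taken from the talk or from any Hadoop API.

```python
from collections import defaultdict

# Hypothetical event log: (timestamp, account, balance change).
EVENTS = [
    (1, "alice", 100),
    (2, "bob", 50),
    (3, "alice", -30),
    (4, "bob", 25),
]


def state_at(events, as_of):
    """Replay events with timestamp <= as_of and return the resulting balances."""
    balances = defaultdict(int)
    # Sort by timestamp so the replay order matches the order of occurrence.
    for timestamp, account, delta in sorted(events):
        if timestamp > as_of:
            break
        balances[account] += delta
    return dict(balances)


print(state_at(EVENTS, 2))  # {'alice': 100, 'bob': 50}
print(state_at(EVENTS, 4))  # {'alice': 70, 'bob': 75}
```

Because the state is derived from the log rather than stored, the same events could in principle be replayed with a different fold function to build a domain that was not planned for up front.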

In this talk, we will demonstrate how capturing all event data could dramatically simplify data collection and management within the enterprise.