In this presentation, Dmitriy will describe the strategy and architecture behind the Apache Ignite™ (incubating) In-Memory Data Fabric, a high-performance, distributed in-memory data management software layer that boosts application performance and scale by orders of magnitude. We will dive into the technical details of distributed clusters and compute grids as well as distributed data grids, and provide code samples for each. Dmitriy will also cover distributed streaming, CEP and Hadoop acceleration, which are integral parts of an In-Memory Data Fabric. This presentation is particularly relevant for software developers and architects who work on the front lines of high-speed, low-latency Fast Data systems, high-performance transactional systems and real-time analytics applications.
Apache Ignite In-Memory Data Fabric for Fast Data and Analytics
1. Apache®, Apache Ignite, Ignite®, and the Apache Ignite logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.
DMITRIY SETRAKYAN
Founder, PPMC
Apache Ignite™ (Incubating) - In-Memory Data Fabric
Fast Data Meets Open Source
http://www.ignite.incubator.apache.org @apacheignite @dsetrakyan
2. Agenda
• About In-Memory Computing
• Apache Ignite™ In-Memory Data Fabric
• Advanced Clustering
• Data Grid
• Compute Grid
• Service Grid
• Ignite For Analytics
• Streaming & CEP
• Share State Across Spark Jobs
• In-Memory MapReduce
• Interactive SQL
• DevOps: Yarn and Mesos
• Q & A
3. Apache Ignite™ In-Memory Data Fabric: Strategic Approach to IMC
• Supports Applications of various types and languages
• Open Source – Apache 2.0
• Simple Java APIs
• 1 JAR Dependency
• High Performance & Scale
• Automatic Fault Tolerance
• Management/Monitoring
• Runs on Commodity Hardware
• Supports existing & new data sources
• No need to rip & replace
4. In-Memory Data Fabric: More Than Data Grid
5. Apache Ignite: Better Cloud Support
• Automatic Discovery
– Simple Configuration
– AWS/EC2/S3
– Google Compute Engine (NEW)
– Other Clouds with JClouds (NEW)
• Docker Support
– Automatically Build and Deploy
6. Data Grid: JCache (JSR 107)
• JCache (JSR 107)
– Basic Cache Operations
– ConcurrentMap APIs
– Collocated Processing (EntryProcessor)
– Events and Metrics
– Pluggable Persistence
• Ignite Data Grid
– ACID Transactions
– SQL Queries (ANSI 99)
– In-Memory Indexes
– Automatic RDBMS Integration
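The "ConcurrentMap APIs" and "Collocated Processing (EntryProcessor)" bullets both point at updating an entry where it lives, instead of reading it out, changing it and writing it back. The following is a minimal plain-Java sketch of that idea. It uses `java.util.concurrent.ConcurrentHashMap` as a stand-in for a cache and is not the JCache or Ignite API; the class and method names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Stand-in for a cache: a plain ConcurrentHashMap, not JCache or Ignite.
class InPlaceUpdateSketch {
    static final ConcurrentMap<String, Integer> cache = new ConcurrentHashMap<>();

    // Atomically increments the counter stored under the key and returns the new value.
    // The update function runs against the entry itself, which is the idea behind
    // an entry processor: send the logic to the data rather than the data to the logic.
    static int increment(String key) {
        return cache.merge(key, 1, Integer::sum);
    }

    public static void main(String[] args) {
        cache.putIfAbsent("hits", 0);   // ConcurrentMap-style conditional put
        increment("hits");
        increment("hits");
        System.out.println(cache.get("hits"));
    }
}
```

In a distributed cache the same pattern would run on the node that owns the key, which saves a network round trip per update.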
7. Data Grid: Partitioned Cache
8. Data Grid: Replicated Cache
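The diagrams for the partitioned and replicated slides did not survive in this transcript. As a rough illustration of the difference between the two modes, the toy model below computes which nodes would hold a copy of a key. It assumes a simple modulo mapping and hypothetical names; a real data grid would normally use a more careful affinity function, and this is not Ignite code.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of cache modes: which nodes hold a copy of a given key.
class CacheModeSketch {
    // Partitioned: the key maps to one primary node, plus `backups` following nodes.
    static List<Integer> partitionedOwners(Object key, int nodes, int backups) {
        int primary = Math.floorMod(key.hashCode(), nodes);
        List<Integer> owners = new ArrayList<>();
        for (int i = 0; i <= Math.min(backups, nodes - 1); i++) {
            owners.add((primary + i) % nodes);
        }
        return owners;
    }

    // Replicated: every node holds every key.
    static List<Integer> replicatedOwners(int nodes) {
        List<Integer> owners = new ArrayList<>();
        for (int n = 0; n < nodes; n++) {
            owners.add(n);
        }
        return owners;
    }

    public static void main(String[] args) {
        System.out.println(partitionedOwners(42, 4, 1));
        System.out.println(replicatedOwners(4));
    }
}
```

The trade-off this sketch hints at: partitioning spreads data so capacity grows with the cluster, while replication keeps every read local at the cost of storing and updating every key on every node.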
9. Data Grid: Off-Heap Memory
• Unlimited Vertical Scale
• Avoid Java Garbage Collection Pauses
• Small On-Heap Footprint
• Large Off-Heap Footprint
• Off-Heap Indexes
• Full RAM Utilization
• Simple Configuration
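To make the on-heap versus off-heap distinction concrete, here is a small sketch that keeps values in a direct `ByteBuffer`, which the JDK typically allocates outside the garbage-collected heap. It only illustrates the general technique; the class name and the fixed-size slot layout are assumptions, and Ignite's own off-heap storage is not shown.

```java
import java.nio.ByteBuffer;

// Stores fixed-size long values in a direct (off-heap) buffer.
class OffHeapSketch {
    private final ByteBuffer buf;

    OffHeapSketch(int slots) {
        // A direct buffer's contents typically live outside the Java heap, so the
        // stored bytes are not individual objects for the garbage collector to trace.
        buf = ByteBuffer.allocateDirect(slots * Long.BYTES);
    }

    void put(int slot, long value) {
        buf.putLong(slot * Long.BYTES, value);
    }

    long get(int slot) {
        return buf.getLong(slot * Long.BYTES);
    }

    boolean isOffHeap() {
        return buf.isDirect();
    }

    public static void main(String[] args) {
        OffHeapSketch store = new OffHeapSketch(1024);
        store.put(7, 123456789L);
        System.out.println(store.get(7) + " offHeap=" + store.isOffHeap());
    }
}
```

Only the small wrapper object stays on the heap, which matches the "Small On-Heap Footprint, Large Off-Heap Footprint" bullets.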
10. Data Grid: Ad-Hoc SQL (ANSI 99)
• ANSI-99 SQL
• Always Consistent
• Fault Tolerant
• In-Memory Indexes (On-Heap and Off-Heap)
• Automatic Group By, Aggregations, Sorting
• Cross-Cache Joins, Unions, etc.
• Ad-Hoc SQL Support
11. SQL Cross-Cache JOIN Example
12. SQL Cross-Cache GROUP BY Example
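The SQL shown on the JOIN and GROUP BY slides is not preserved in this transcript. As a stand-in, the sketch below shows in plain Java what a cross-cache join followed by a grouped aggregate computes. The Person/Organization schema, the field names and the query in the comment are all hypothetical, and two ordinary collections play the role of the two caches.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical schema: persons reference organizations by orgId.
class JoinGroupBySketch {
    static class Person {
        final String name;
        final int orgId;
        final double salary;

        Person(String name, int orgId, double salary) {
            this.name = name;
            this.orgId = orgId;
            this.salary = salary;
        }
    }

    // Roughly: SELECT o.name, AVG(p.salary) FROM Person p
    //          JOIN Organization o ON p.orgId = o.id GROUP BY o.name
    static Map<String, Double> avgSalaryByOrg(List<Person> persons, Map<Integer, String> orgs) {
        Map<String, double[]> acc = new TreeMap<>();   // org name -> {sum, count}
        for (Person p : persons) {
            String org = orgs.get(p.orgId);            // the join: look up the other "cache"
            if (org == null) {
                continue;                              // inner join drops unmatched rows
            }
            double[] a = acc.computeIfAbsent(org, k -> new double[2]);
            a[0] += p.salary;
            a[1] += 1;
        }
        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<String, double[]> e : acc.entrySet()) {
            result.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, String> orgs = new HashMap<>();
        orgs.put(1, "Acme");
        orgs.put(2, "Globex");
        List<Person> persons = Arrays.asList(
            new Person("Ann", 1, 100.0),
            new Person("Bob", 1, 200.0),
            new Person("Cy", 2, 50.0));
        System.out.println(avgSalaryByOrg(persons, orgs));
    }
}
```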
13. In-Memory Compute Grid
• Direct API for MapReduce
• Direct API for ForkJoin
• Zero Deployment
• Cron-like Task Scheduling
• State Checkpoints
• Load Balancing
• Automatic Failover
• Full Cluster Management
• Pluggable SPI Design
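The MapReduce and ForkJoin bullets describe splitting a task into jobs, running them in parallel and aggregating the results. A minimal sketch of that split-and-aggregate shape follows, with a local thread pool standing in for cluster nodes. It uses only `java.util.concurrent`, not the Ignite compute API, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Local thread pool standing in for cluster nodes.
class SplitAggregateSketch {
    // Map: one job per word, each returning its length. Reduce: sum the results.
    static int totalLength(String phrase) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Callable<Integer>> jobs = new ArrayList<>();
            for (String word : phrase.trim().split("\\s+")) {
                jobs.add(word::length);
            }
            int sum = 0;
            for (Future<Integer> f : pool.invokeAll(jobs)) {
                sum += f.get();
            }
            return sum;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(totalLength("Fast Data Meets Open Source"));
    }
}
```

On a real compute grid the jobs would be serialized and sent to other nodes, and features such as load balancing and failover decide where each job runs and what happens if a node dies mid-task.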
14. In-Memory Streaming and CEP
• Streaming Data Never Ends
• Branching Pipelines
• Pluggable Routing
• Sliding Windows for CEP/Continuous Query
• SQL Queries (ANSI 99)
• Query Across Sliding Windows
• Real Time Analysis
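Because a stream never ends, queries are typically run over a bounded sliding window of the most recent events. The sketch below shows one simple form of that: a count-based window that maintains a running average. The class name and the choice of a count-based rather than time-based window are assumptions for illustration, and this is not the Ignite streaming API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Count-based sliding window that keeps a running sum of the last `size` events.
class SlidingWindowSketch {
    private final Deque<Double> window = new ArrayDeque<>();
    private final int size;
    private double sum;

    SlidingWindowSketch(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        this.size = size;
    }

    // Adds an event, evicts the oldest one if the window is over capacity,
    // and returns the average of the events currently in the window.
    double add(double value) {
        window.addLast(value);
        sum += value;
        if (window.size() > size) {
            sum -= window.removeFirst();
        }
        return sum / window.size();
    }

    public static void main(String[] args) {
        SlidingWindowSketch w = new SlidingWindowSketch(3);
        for (double v : new double[] {1, 2, 3, 4, 5}) {
            System.out.println(w.add(v));
        }
    }
}
```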
15. In-Memory Service Grid
• Singletons on the Cluster
– Cluster Singleton
– Node Singleton
– Key Singleton
• Distribute any Data Structure
– Available Anywhere on the Grid
– Access Anywhere via Proxies
• Guaranteed Availability
– Auto Redeployment in Case of Failures
16. Apache Ignite for BI and Analytics
17. DevOps: Integration with Yarn and Mesos
• Automatic Resource Management
• Easy Data Center Installation
• Easy Data Center Configuration
• On-Demand Elasticity
18. Share RDDs Across Spark Jobs
• IgniteRDD
– Share RDD across jobs on the host
– Share RDD across jobs in the application
– Share RDD globally
• Faster SQL
– In-Memory Indexes
– SQL on top of Shared RDD
19. Ignite In-Memory File System
• Ignite In-Memory File System (IGFS)
– Hadoop-compliant
– Easy to Install
– On-Heap and Off-Heap
– Caching Layer for HDFS
– Write-through and Read-through HDFS
– Performance Boost
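The "Caching Layer for HDFS" and "Write-through and Read-through" bullets describe a general caching pattern that can be sketched without any file system at all. In the example below, one map stands in for the in-memory layer and another for the slower backing store. All names are hypothetical, and neither the IGFS nor the Hadoop API is used.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory layer in front of a slower backing store (a Map stands in for HDFS here).
class ReadWriteThroughSketch {
    final Map<String, String> backingStore = new HashMap<>();
    final Map<String, String> memory = new HashMap<>();
    int storeReads = 0;

    // Read-through: on a miss, load from the backing store and keep the value in memory.
    String read(String path) {
        String value = memory.get(path);
        if (value == null) {
            value = backingStore.get(path);
            storeReads++;
            if (value != null) {
                memory.put(path, value);
            }
        }
        return value;
    }

    // Write-through: update memory and the backing store in the same call.
    void write(String path, String data) {
        memory.put(path, data);
        backingStore.put(path, data);
    }

    public static void main(String[] args) {
        ReadWriteThroughSketch fs = new ReadWriteThroughSketch();
        fs.backingStore.put("/logs/a", "hello");
        fs.read("/logs/a");   // miss: goes to the backing store
        fs.read("/logs/a");   // hit: served from memory
        System.out.println("backing store reads: " + fs.storeReads);
    }
}
```

Repeated reads of the same path touch the backing store only once, which is where the performance boost of a caching layer comes from.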
20. Ignite In-Memory Map Reduce
• In-Memory Native Performance
• Zero Code Change
• Use existing MR code
• Use existing Hive queries
• No Name Node
• No Network Noise
• In-Process Data Colocation
• Eager Push Scheduling
21. Interactive SQL with Apache Zeppelin
22. GridGain Enterprise & Apache Ignite Comparison Chart
GridGain Enterprise Subscriptions include the following during the term of the subscription:
> Right to use GridGain Enterprise Edition
> Bug fixes, patches, updates and upgrades
> 9x5 or 24x7 Support
> Ability to procure Training and Consulting Services from GridGain
> Confidence and protection, not provided under Open Source licensing, that only a commercial vendor can provide, such as indemnification

Features                          Apache Ignite   Enterprise Edition
In-Memory Data Grid               ✓               ✓
In-Memory Compute Grid            ✓               ✓
Real-Time Streaming & CEP         ✓               ✓
Hadoop Acceleration               ✓               ✓
Management & Monitoring GUI                       ✓
Portable Objects                                  ✓
.Net and C++ APIs                                 ✓
Enterprise-grade Security                         ✓
Network Segmentation Protection                   ✓
Local Restartable Store                           ✓
Rolling Production Updates                        ✓
Datacenter Replication                            ✓
9x5 and 24x7 Support                              ✓
Long Term Support & Patches                       ✓
23. ANY QUESTIONS?
Thank you for joining us. Follow the conversation.
http://www.ignite.incubator.apache.org
@apacheignite @dsetrakyan