IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Analysis Using In-Memory Computing

Real-time Interactive Big Data Analysis Using In-Memory Computing
Mike Joyce – Manager Software Engineer, iCrossing
Shawn Nguyen – Lead Software Engineer, iCrossing

CONNECTED MARKETING PLATFORM (TECHNOLOGY)
•  Bid Management / Trading Desk
•  Data Management Platform (Core Audience)

STRATEGY & PLANNING
•  Market Research
•  Analytics
•  Strategy & Planning

PROGRAM DESIGN
•  Media Planning & Buying
•  Creative & Experience Design
•  Content Creation & Management

AUDIENCE ENGAGEMENT
•  Search Marketing Programs
•  Social Media / Mobile
•  Technology & App Development
•  Measurement & Optimization

Leveraging audience insights:
•  20+ brands
•  30+ TV networks
•  50+ newspapers
•  300+ magazines

DIGITAL AGENCY INSIDE A CONTENT EMPIRE

Big Data - Cookies!
300+ million unique cookies
•  Subscribers
•  Visitors
•  International
•  Multiple devices

DMP Audience Data
Attributes
•  Geographic
•  Demographic
•  Behavioral
•  Psychographic

11,000+ Unique Attributes

Cookies + Audience Attributes = Super Big Data!
•  90M+ cookies
•  Average user: 800+ attributes
•  Example attributes: Male, Age 20 - 35, Sports Enthusiasts, Iowa, High Income, iPad, iPhone, Drives Mini Van, Foodie
•  72B+ attribute-user pairs
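
The 72B+ figure follows from the two numbers on this slide: 90M cookies times roughly 800 attributes each. The short sketch below reproduces that arithmetic and adds a rough size estimate. The class name `PairEstimate` and the 4-bytes-per-pair encoding are assumptions made only for illustration; the slides do not say how the data was stored.

```java
/** Hypothetical back-of-envelope helper; the storage encoding is an assumption. */
class PairEstimate {

    /** Total attribute-user pairs, failing loudly on overflow instead of wrapping. */
    static long pairs(long cookies, long avgAttributesPerCookie) {
        return Math.multiplyExact(cookies, avgAttributesPerCookie);
    }

    /** Raw size in decimal gigabytes for a given number of bytes per pair. */
    static double gigabytes(long pairs, int bytesPerPair) {
        return pairs * (double) bytesPerPair / 1e9;
    }

    public static void main(String[] args) {
        long pairs = pairs(90_000_000L, 800L); // figures from the slide
        System.out.println(pairs + " attribute-user pairs");
        // Assumed encoding: one 4-byte attribute id per pair, no overhead.
        System.out.println(gigabytes(pairs, 4) + " GB at 4 bytes per pair");
    }
}
```

Even under this optimistic encoding the raw pairs come to a few hundred gigabytes, which is more than a single commodity machine of the time would typically hold in memory.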

Audiences – Targeting vs Discovering
•  Who you are targeting
•  How do you connect with them?
•  What describes them?

Data Scientists
Discovering Audience Attributes
1.  Define an audience using attributes
2.  Identify all attributes of cookies in audience
3.  Calculate highly indexing attributes

1) Define the Audience
Population: 90M Cookies
Audience: 300K Cookies
•  Age: 20-35
•  US > North Dakota
•  Gender: Male

2) Audience Attributes
Attributes of Cookies in Audience (Audience: 300K Cookies)
•  Interest: Sports Enthusiast
•  Interest: Moose Hunting
•  Intent: Auto Purchase > Truck
•  US > North Dakota > Fargo
•  Pet Supplies > Dog Food

Attribute                        Audience Frequency   Population Frequency
Interest: Sports Enthusiast      24%                  27%
Interest: Moose Hunting          40%                  6%
Intent: Auto Purchase > Truck    17%                  4%
US > North Dakota > Fargo        30%                  2%
Pet Supplies > Dog Food          6%                   9%

3) Index the Attributes
Attributes of Cookies in Audience
•  Interest: Sports Enthusiast
•  Interest: Moose Hunting
•  Intent: Auto Purchase > Truck
•  US > North Dakota > Fargo
•  Pet Supplies > Dog Food
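
The index values shown on this slide did not survive extraction. A common definition in audience analytics is index = audience frequency / population frequency × 100, so that 100 means "same as the population". The sketch below assumes that definition (the slides do not state the formula) and applies it to the frequencies listed under step 2. The class name `AttributeIndex` is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; the index formula is an assumption, not taken from the slides. */
class AttributeIndex {

    /** Index = audience frequency / population frequency * 100. */
    static double index(double audienceFreq, double populationFreq) {
        if (populationFreq <= 0.0) {
            throw new IllegalArgumentException("population frequency must be positive");
        }
        return audienceFreq / populationFreq * 100.0;
    }

    public static void main(String[] args) {
        // {audience frequency, population frequency} as listed under step 2
        Map<String, double[]> rows = new LinkedHashMap<>();
        rows.put("Interest: Sports Enthusiast", new double[] {0.24, 0.27});
        rows.put("Interest: Moose Hunting", new double[] {0.40, 0.06});
        rows.put("Intent: Auto Purchase > Truck", new double[] {0.17, 0.04});
        rows.put("US > North Dakota > Fargo", new double[] {0.30, 0.02});
        rows.put("Pet Supplies > Dog Food", new double[] {0.06, 0.09});

        for (Map.Entry<String, double[]> e : rows.entrySet()) {
            double idx = index(e.getValue()[0], e.getValue()[1]);
            System.out.printf("%-32s index %.0f%n", e.getKey(), idx);
        }
    }
}
```

Under this assumed definition, Fargo indexes highest (about 1500) and Dog Food lowest (about 67), which matches the intuition that a North Dakota audience over-represents Fargo relative to the full population.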

Data Scientists
Development Ask
1.  Make it accessible to “normals”
2.  Exportable visualizations & calculations
3.  Reduce query time from 1 hr to 1 sec

Why is this Hard?
•  90M+ cookies
•  Average user: 800+ attributes
•  Example attributes: Male, Age 20 - 35, Sports Enthusiasts, Iowa, High Income, iPad, iPhone, Drives Mini Van, Foodie
•  72B+ attribute-user pairs
Algorithm
1. Check whether each cookie satisfies the audience criteria
2. Collect all attributes for every audience cookie
3. Calculate percentages & index
Within 1 sec!
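
The three algorithm steps above can be sketched as a brute-force, in-memory scan. Everything here is an assumption made for illustration: each cookie is modeled as a `java.util.BitSet` of attribute ids, the audience criteria are a set of required ids, and the index is taken as audience frequency / population frequency × 100. The slides do not describe the actual data layout or formula, and the class name `AudienceScan` is hypothetical.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Hypothetical brute-force scan; data layout and names are assumptions. */
class AudienceScan {

    /** Number of matching cookies plus, per attribute id, how many of them carry it. */
    static final class Result {
        final int matched;
        final int[] counts;
        Result(int matched, int[] counts) { this.matched = matched; this.counts = counts; }
    }

    static BitSet bits(int... ids) {
        BitSet b = new BitSet();
        for (int id : ids) b.set(id);
        return b;
    }

    /** Step 1: a cookie qualifies only if it has every required attribute. */
    static boolean matches(BitSet cookie, BitSet criteria) {
        for (int a = criteria.nextSetBit(0); a >= 0; a = criteria.nextSetBit(a + 1)) {
            if (!cookie.get(a)) return false;
        }
        return true;
    }

    /** Steps 1 and 2: test every cookie, then tally attributes of the matching ones.
     *  All attribute ids must be smaller than attributeCount. */
    static Result scan(List<BitSet> cookies, BitSet criteria, int attributeCount) {
        int matched = 0;
        int[] counts = new int[attributeCount];
        for (BitSet cookie : cookies) {
            if (!matches(cookie, criteria)) continue;
            matched++;
            for (int a = cookie.nextSetBit(0); a >= 0; a = cookie.nextSetBit(a + 1)) {
                counts[a]++;
            }
        }
        return new Result(matched, counts);
    }

    /** Step 3: percentages and index (assumed formula); NaN or Infinity if a group is empty. */
    static double index(Result audience, Result population, int attribute) {
        double audFreq = (double) audience.counts[attribute] / audience.matched;
        double popFreq = (double) population.counts[attribute] / population.matched;
        return audFreq / popFreq * 100.0;
    }

    public static void main(String[] args) {
        // Toy ids: 0 = Male, 1 = Age 20-35, 2 = Moose Hunting, 3 = Dog Food
        List<BitSet> cookies = Arrays.asList(
                bits(0, 1, 2), bits(0, 1, 2, 3), bits(0, 3), bits(1, 3));
        Result audience = scan(cookies, bits(0, 1), 4);    // Male AND Age 20-35
        Result population = scan(cookies, new BitSet(), 4); // empty criteria match everyone
        System.out.println("audience size: " + audience.matched);
        System.out.println("Moose Hunting index: " + index(audience, population, 2));
    }
}
```

The scan touches every cookie and every attribute of each match, which is why the slide treats a one-second budget over 72B+ pairs as the hard part.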

The Answer – Audience Discovery Tool
“Making science accessible”
•  Audience discovery
–  Cookie Attributes
–  Frequency vs Population
•  Built for non-technical users
–  Strategy
–  Sales / Account
–  Anyone
•  Flexible
–  Research tool
–  In-meeting, iterative discovery
•  Approachable
–  Real-time
–  Results in seconds
–  Simple, elegant interface
–  Multiple export formats

Traditional Relational Databases
•  Long load time
•  Complex queries resulting in long query times
•  Rigid data model

Non-Traditional Databases
•  Lack of complex query features
•  Large memory footprint requirement
•  Aggregation queries took many tens of seconds

The Low Hanging Fruit
•  In-memory cache
•  Customizable query using Java code
•  Relatively low data loading time

Distributed Computing Ecosystem
•  Not production ready
•  Data import fails without explanation
•  Aggregation fails to impress

Back to Basics
•  Pure Java code solution
•  Data and logic must exist in the same memory space
•  Capable of advanced filtering
•  Distributed computing, low overhead
•  Data locality
•  Minimal code migration
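
One way to picture "data and logic in the same memory space" with data locality is a partitioned scan: each partition of cookies is scanned where it lives, and only small per-attribute counts are sent back and merged. The sketch below is an assumption-laden illustration, not the presenters' implementation: a parallel stream inside one JVM stands in for cluster nodes, cookies are modeled as `java.util.BitSet`s of attribute ids, and the class name `PartitionedScan` is hypothetical.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Hypothetical sketch: parallel-stream workers stand in for cluster nodes. */
class PartitionedScan {

    static BitSet bits(int... ids) {
        BitSet b = new BitSet();
        for (int id : ids) b.set(id);
        return b;
    }

    static boolean matches(BitSet cookie, BitSet criteria) {
        for (int a = criteria.nextSetBit(0); a >= 0; a = criteria.nextSetBit(a + 1)) {
            if (!cookie.get(a)) return false;
        }
        return true;
    }

    /** Local work on one partition. Slots 0..attributeCount-1 hold per-attribute counts
     *  of matching cookies; the last slot holds the number of matching cookies. */
    static int[] scanPartition(List<BitSet> partition, BitSet criteria, int attributeCount) {
        int[] counts = new int[attributeCount + 1];
        for (BitSet cookie : partition) {
            if (!matches(cookie, criteria)) continue;
            counts[attributeCount]++;
            for (int a = cookie.nextSetBit(0); a >= 0; a = cookie.nextSetBit(a + 1)) {
                counts[a]++;
            }
        }
        return counts;
    }

    /** Element-wise sum into a fresh array, so the reduce identity is never mutated. */
    static int[] add(int[] x, int[] y) {
        int[] sum = new int[x.length];
        for (int i = 0; i < sum.length; i++) sum[i] = x[i] + y[i];
        return sum;
    }

    /** Scans every partition in parallel and merges the small partial results. */
    static int[] scanAll(List<List<BitSet>> partitions, BitSet criteria, int attributeCount) {
        return partitions.parallelStream()
                .map(p -> scanPartition(p, criteria, attributeCount))
                .reduce(new int[attributeCount + 1], PartitionedScan::add);
    }

    public static void main(String[] args) {
        List<List<BitSet>> partitions = Arrays.asList(
                Arrays.asList(bits(0, 1, 2), bits(0, 3)),
                Arrays.asList(bits(0, 1, 2, 3), bits(1, 3)));
        int[] total = scanAll(partitions, bits(0, 1), 4);
        System.out.println(Arrays.toString(total));
    }
}
```

The design point is that the large data never moves: each worker returns an array of `attributeCount + 1` integers regardless of how many cookies it holds, so the merge cost stays small as partitions grow.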

The Challenges
•  Tedious manual data distribution
•  Gar building and deployment issues
•  Development challenges

What We Learned
•  Indexed data requiring minor calculations: databases (relational & NoSQL) are great
•  Large non-indexed data: the data & processing need to live in the same (memory) space
