Cassandra @ Yahoo! JAPAN
About me
Satoshi Konno (http://www.cybergarage.org)
• Engineering Manager of the NoSQL Team @ Yahoo! Japan
• Open source software developer for virtual reality, IoT, and cloud computing
• Doctoral student @ JAIST, Défago Lab : the φ accrual failure detector
Agenda
• Company Profile
• Summary of C* Clusters
• Issues and Solutions of C*
• Next Generation Infrastructures for C*
Company Profile
Founded : January 31, 1996
Businesses : Internet Advertising
e-Commerce
Members Services, etc.
Web Services : 100+
Smartphone Apps: 50+ (iOS), 50+ (Android)
Employees : 5,800+ (as of June 30, 2016)
Head Office : Chiyoda-ku, Tokyo, Japan
Shareholder Composition
An independent and public company in the Japanese Market
U.S. : 35.5 %
Japan : 42.9 %
Market Cap
$22 billion
Market Cap
$29 billion
Market Cap
$60 billion
18th Largest Internet Company in market cap
[Chart : market value of the largest internet companies, in billion U.S. dollars]
http://www.statista.com/statistics/277483/market-value-of-the-largest-internet-companies-worldwide/
19 years
Revenue ¥652B, Operating Income ¥171B (FY2015)
Continued Growth Sustained
Revenue Portfolio (FY2015)
Marketing Solutions : 60 %
Consumer : 32 %
Others : 8 %
Extensive Reach to a Wide Range of Users
80 %
80% of all Japanese Internet users use Yahoo! JAPAN
Nielsen NetView June 2015 : Data by Brands. Access from home and work using PCs (excl. internet applications)
Many Strong Services
Media
US
Search Video Answer Mail
JP
US
JP
Membership C2C Payment C2C EC B2C EC Local
Search Knowledge search Mail News
YAHUOKU! Premium Wallet Loco
Summary of C* Clusters
Yahoo! JAPAN Database Platforms
300+ Systems
NoSQL Team
100+ Services
OSS Database Platforms
300+ Systems
MySQL : 180 Systems, 630 DBs
Cassandra : 100 Systems, 130 DBs
30
70
60
40
Yahoo Japan
NoSQL
Team
RDB
Team
Cassandra @ Yahoo! JAPAN
Years : 2010, 2012, 2014, 2016
Service Departments : C* 0.5, 0.8, 1.x
Our Team (NoSQL Team) : C* 0.8, 1.x, 2.x, 3.x
Our Cassandra Clusters
30 Clusters
30 TB Usage
1000+ Nodes
300,000 Reads/sec
100,000 Writes/sec
2016
10 Nodes / Cluster … 160 Nodes / Cluster
1 Shared Cluster
30 Special Clusters
30 Systems
50 Systems
3 DCs
Our Use Case Summary on Cassandra
100
Systems
20
Database Caching
10
Advertising Services
40
User Databases
50
Service Databases
Browsing History
Impression Data
・・・・
Meta Data
Aggregated Data
・・・・
Generated Data
Session Data
Meta Data
Aggregated Data
・・・・
Generated Data
Recommendation
Demographic Data
Life Log
・・・・
Preference Data
Behavior History
Our Issues and Solutions
ISSUE #1 : C10k Problem – C* Proxy
PC + Tablet : 3.36B PV
Smart Device : 3.45B PV
Total : 6.8 billion PV / month
Yahoo! JAPAN services : 10 to 200 front-end servers per service
• PROBLEM : 200 front-end servers * 128 processes
* 2 (C* request + C* heartbeat)
= 51,200 connections / node
• RESULT : The process goes down under this connection load.
• SOLUTION : Put 1 proxy on each front-end server, so the 128 processes
connect to the proxy instead of to C* directly.
200 front-end servers * 1 proxy * 2 (C* request + C* heartbeat)
= 400 connections / node
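The connection arithmetic for ISSUE #1 can be checked with a few lines of Python. This is only a sketch of the calculation on the slides: the function name and its parameters are hypothetical, and it assumes every client opens 2 connections (request + heartbeat) to each node.

```python
def connections_per_node(front_end_servers: int,
                         clients_per_server: int,
                         connections_per_client: int = 2) -> int:
    """Connections one C* node sees when every client connects to it.

    connections_per_client = 2 stands for C* request + C* heartbeat.
    """
    return front_end_servers * clients_per_server * connections_per_client


# Direct: each of the 128 processes on a server is its own C* client.
direct = connections_per_node(front_end_servers=200, clients_per_server=128)

# Proxied: the 128 processes share one local proxy, so each server
# appears to the C* node as a single client.
proxied = connections_per_node(front_end_servers=200, clients_per_server=1)

print(direct, proxied)
```

The number of processes drops out of the proxied figure entirely, which is why the connection count per node falls by a factor of 128.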
ISSUE #2 : Bootstrap Problem – Driver
• Heavy Services : more than 3,000 qps/node = C* cluster on real servers (SSD is recommended)
• Light Services : less than 1,000 qps/node and less than 3 GB/node = C* cluster on virtual servers on OpenStack
Heavy Service : CPU = Good
Light Service : vCPU = Cheap
• PROBLEM : When a new C* node is added to the cluster, all processes
on every front-end server try to connect to it at the same time.
• PROBLEM : C* authentication is based on BCrypt, which is
heavy processing for the vCPU nodes.
• PROBLEM : Most processes cannot connect to the C*
clusters on OpenStack because of the authentication
processing. The processes time out and retry endlessly
without waiting, which drives all vCPU usage to 100%.
• SOLUTION : Improving the C* drivers so that processes do not
reconnect simultaneously when a connection fails.
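The slides do not say how the drivers were changed. One common way to keep many clients from reconnecting at the same moment is exponential backoff with random jitter; the sketch below illustrates that general technique, not the actual driver patch. The function name, the base delay, and the cap are all hypothetical.

```python
import random
from typing import List, Optional


def reconnect_delays(attempts: int,
                     base: float = 0.5,
                     cap: float = 30.0,
                     rng: Optional[random.Random] = None) -> List[float]:
    """Return the wait (in seconds) before each reconnect attempt.

    The upper bound doubles on every attempt (capped at `cap`), and the
    actual wait is drawn uniformly below that bound, so processes that
    failed at the same time spread their retries out.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(rng.uniform(0.0, ceiling))
    return delays


# Two processes that fail together end up with different retry schedules.
for name, seed in (("process-1", 1), ("process-2", 2)):
    schedule = reconnect_delays(5, rng=random.Random(seed))
    print(name, [round(d, 2) for d in schedule])
```

Because each process waits before retrying, and waits a different amount, a slow BCrypt authentication on a vCPU node is not hit again by every process at once.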
ISSUE #3 : Multi-tenancy – Slow Query
• Small Services : (less than 500 qps and less than 10 GB) / keyspace
= Shared C* cluster on real servers
Shared Cluster : 50 Services
• PROBLEM : We could not identify which service was issuing the
high-load queries in the multi-tenant cluster.
• SOLUTION : CASSANDRA-12403 – slow query detection.
A service found issuing slow queries is moved from the shared
cluster to a special cluster.
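Once slow queries can be detected, attributing them to a service is a simple grouping step. The sketch below is not the CASSANDRA-12403 implementation; it assumes hypothetical records of (keyspace, elapsed milliseconds), that each service owns one keyspace, and an illustrative 500 ms threshold.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def slow_queries_by_keyspace(records: Iterable[Tuple[str, int]],
                             threshold_ms: int) -> List[Tuple[str, int]]:
    """Count queries slower than threshold_ms per keyspace, worst first."""
    counts = Counter(keyspace for keyspace, elapsed_ms in records
                     if elapsed_ms > threshold_ms)
    return counts.most_common()


# Hypothetical sample: (keyspace, elapsed time in ms).
sample = [("shop", 12), ("news", 850), ("news", 1200),
          ("mail", 30), ("shop", 640)]

ranking = slow_queries_by_keyspace(sample, threshold_ms=500)
print(ranking)
```

The keyspace at the top of the ranking is the first candidate to move out of the shared cluster.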
ISSUE #4 : Multi-racking – Inbound Params
• PROBLEM : Our C* clusters are built alongside other services
in the same rack or under the same core switch.
• PROBLEM : C* streaming occurs when a node is
added or removed, either by our operations or by failure
detection.
• PROBLEM : C* streaming generates heavy traffic,
which disrupts the other services.
Existing throttle : stream_throughput_outbound
• SOLUTION : CASSANDRA-11303 – new inbound
throughput parameters for streaming
Throttles : stream_throughput_outbound + stream_throughput_inbound
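A small calculation shows why an outbound limit alone may not protect the receiving side. This is a sketch under stated assumptions: each sender streams at its full outbound cap, and the function name and all Mbps values are hypothetical, not settings from the slides.

```python
from typing import Optional


def inbound_rate_mbps(senders: int,
                      outbound_cap_mbps: float,
                      inbound_cap_mbps: Optional[float] = None) -> float:
    """Traffic arriving at one node that receives streams from `senders` nodes.

    Each sender is limited by its own outbound cap, so without an inbound
    cap the receiver can see the sum of all senders.
    """
    aggregate = senders * outbound_cap_mbps
    if inbound_cap_mbps is None:
        return aggregate
    return min(aggregate, inbound_cap_mbps)


# 10 senders, each capped at 200 Mbps outbound (illustrative values).
without_inbound_cap = inbound_rate_mbps(10, 200.0)
with_inbound_cap = inbound_rate_mbps(10, 200.0, inbound_cap_mbps=400.0)

print(without_inbound_cap, with_inbound_cap)
```

With only the outbound throttle, the receiver's traffic grows with the number of senders; an inbound throttle bounds it regardless of how many nodes stream to it.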
Next Generation Infrastructures for C*
OpenStack @ Yahoo! JAPAN
• PURPOSE : To abstract our data center resources using
OpenStack.
Layers (each exposed through APIs) : Apps, Platforms, Infrastructures
50,000+ instances
Trial #1 : Special Hypervisor for C*
• PROBLEM : Our OpenStack hypervisors host C* VMs alongside
VMs of other services (noisy neighbours).
• SOLUTION : Trying to offer special hypervisors
that run only C* VMs.
Optimal flavors for C* :
vCPU : 8+, Mem : 16 GiB+
SSD : 100 GiB+
Network : 10 Gbps x 2
Trial #2 : Bare Metal Clusters for C*
• PROBLEM : OpenStack vCPUs are too weak to run a C*
node in our special service environment, for example with
its many connections.
vCPU : Authentication (BCrypt) is heavy !
• SOLUTION : Trying to offer special bare metal
clusters that run only C*, using OpenStack Ironic.
CPU : Xeon D-1541 2.1 GHz (1 CPU)
Mem : 32 GB
SSD : SATA SSD 400 GB
Network : 10 Gbps x 2