About me · Infrastructure of CloudStack λ Host is the basic unit of scale.Host run a hypervisor...


About me ...

•  Davor Guttierrez •  Company 3 Gen d.o.o. •  33 employees

•  Located in Ljubljana - Slovenia

•  IBM Mainframe, Virtualization (RHEL, OracleVM, …)

•  Oracle Application Servers •  Oracle Databases

•  JBoss, WebLogic, WebSphere

Agenda

λ  Beginning λ  Research

λ  Project λ  How to …

λ  Our problems

λ  The End … is not near ...

Beginning

λ  What are we looking for? λ  It all started a few years ago …

λ  ESXi virtualization

λ  A few standalone KVM or Xen hosts ...

We use ...

λ  RHEV -  started with version 2.x

-  approx. 150 virtualized servers – SLES 9, 10, 11 and Scientific Linux 6.5

λ  OracleVM

-  started with version 2.x and still using it in one environment -  SLES9, 10, 11 and Oracle Linux

λ  oVirt in our testing environment -  Latest version for testing

-  CentOS 6.x and CentOS 7.x

Orchestration

λ  We needed something for orchestration … λ  We have tried in our testing environment

-  OpenStack

-  CloudStack

-  Eucalyptus

-  OpenNebula

CloudStack

λ  Has met all our requirements λ  It just works λ  Many big companies use CloudStack λ  Good support forum λ  Has everything we need λ  Easy installation - http://www.itsprite.com/how-to-install-cloudstack-4-2-on-centos-6-4/

-  single node – management server -  multiple node -  basic networking -  advanced networking

CloudStack users

λ  Tata λ  Zynga

λ  WebMD

λ  GoDaddy

λ  BT

λ  Softlayer λ  Orange

λ  Alcatel Lucent

Infrastructure of CloudStack

λ  A host is the basic unit of scale. Hosts run a hypervisor or are bare metal λ  One or more hosts running the same hypervisor are grouped into a cluster

λ  All hosts in a cluster have access to primary storage

λ  One or more clusters are grouped into a pod – with an L2 switch

λ  One or more pods are grouped into an availability zone

λ  A zone has access to secondary storage λ  One or more zones are controlled by a management server
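
The hierarchy above can be pictured as nested containers. The following is only an illustrative sketch in plain Python, not CloudStack code: the class names, field names and storage labels are assumptions made for this example, and bare-metal hosts are not modelled. The sample data mirrors a small test layout of one zone, one pod, one cluster and two hosts.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Host:
    name: str
    hypervisor: str  # e.g. "KVM"


@dataclass
class Cluster:
    name: str
    hypervisor: str
    primary_storage: List[str] = field(default_factory=list)
    hosts: List[Host] = field(default_factory=list)

    def add_host(self, host: Host) -> None:
        # All hosts in one cluster run the same hypervisor.
        if host.hypervisor != self.hypervisor:
            raise ValueError(
                f"{host.name} runs {host.hypervisor}, cluster expects {self.hypervisor}"
            )
        self.hosts.append(host)


@dataclass
class Pod:
    name: str
    clusters: List[Cluster] = field(default_factory=list)


@dataclass
class Zone:
    name: str
    secondary_storage: List[str] = field(default_factory=list)
    pods: List[Pod] = field(default_factory=list)

    def host_count(self) -> int:
        # Walk zone -> pod -> cluster -> host.
        return sum(len(c.hosts) for p in self.pods for c in p.clusters)


# Hypothetical names, chosen only for illustration.
cluster = Cluster("cluster1", "KVM", primary_storage=["primary1", "primary2"])
cluster.add_host(Host("host1", "KVM"))
cluster.add_host(Host("host2", "KVM"))
zone = Zone("zone1", secondary_storage=["secondary1"], pods=[Pod("pod1", [cluster])])
print(zone.host_count())
```

Primary storage hangs off the cluster and secondary storage off the zone, which is the split the bullets describe.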

Why CloudStack

λ  Self-Service Portal λ  Resource Pooling

λ  Metered Use

λ  Rapid Elasticity

Self Service is important

λ  We have virtualized almost all of our servers λ  But many operations are still manual

-  provisioning new servers -  networking setup -  application installation -  no scaling

λ  Big risk factor – human error λ  CloudStack has a good self-service portal with many features that are important for our organization

Resource Pooling

λ  Must have in cloud λ  Dynamically allocated

λ  Workloads can be averaged around the globe, across time …

λ  Resource pooling is an IT term used in cloud computing environments to describe a situation in which providers serve multiple clients, customers or "tenants" with provisional and scalable services. These services can be adjusted to suit each client's needs without any changes being apparent to the client or end user.

Metered Use

λ  *aaS model allows measurement -  storage

-  compute

-  network

λ  CloudStack usage server

λ  this is an optional component λ  it must be enabled in the global settings

Challenge

λ  The challenge is to offer IaaS cloud services to developers λ  This requires an IaaS cloud platform with a rich API for easy integration of software

λ  Developers use a simple instance creation form .. λ  Developers have their own quota for servers and resources
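
To give a feel for what integrating against the API involves, here is a minimal sketch of the HMAC-SHA1 request signing that the CloudStack developer documentation describes, as far as we read it: URL-encode the values, sort the pairs by field name, lower-case the string, sign it with the user's secret key and Base64-encode the result. The keys and IDs below are placeholders, and the exact encoding rules should be checked against the documentation for the version in use.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def sign_request(params: dict, secret_key: str) -> str:
    """Return a Base64 HMAC-SHA1 signature for a set of API parameters."""
    # URL-encode every value, then sort the pairs by lower-cased field name.
    pairs = sorted(
        ((key, quote(str(value), safe="")) for key, value in params.items()),
        key=lambda pair: pair[0].lower(),
    )
    query = "&".join(f"{key}={value}" for key, value in pairs)
    # The signature is computed over the lower-cased query string.
    digest = hmac.new(
        secret_key.encode("utf-8"),
        query.lower().encode("utf-8"),
        hashlib.sha1,
    ).digest()
    return base64.b64encode(digest).decode("ascii")


# Placeholder values: none of these identifiers refer to a real installation.
params = {
    "command": "deployVirtualMachine",
    "serviceofferingid": "1",
    "templateid": "2",
    "zoneid": "3",
    "response": "json",
    "apikey": "EXAMPLE-API-KEY",
}
signature = sign_request(params, "EXAMPLE-SECRET-KEY")
print(signature)
```

The signature is then sent as an extra `signature` parameter, URL-encoded, together with the original parameters. Because the pairs are sorted before signing, the order in which the caller builds the dictionary does not matter.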

Start…

Solution

λ  CloudStack (CS) is the base orchestrator λ  We use advanced monitoring with our own monitoring system

λ  Logs are sent to our centralized monitoring server

Benefit

λ  CloudStack's maturity and ease of use allowed us to focus on the things around our cloud (complex network, creating templates, ...)

λ  We have implemented and integrated software like Vagrant, Ansible, Puppet and Foreman to automate the management of the cloud

The End ...

Technical

λ  Two DC λ  All kinds of hardware

λ  Apache CloudStack 4.3.0

λ  KVM hypervisors on CentOS-based servers

λ  We offer Oracle Linux 6.5, CentOS 6.5 and 7, Scientific Linux 6.5 and Ubuntu 12.04 and 14.04 to our clients

λ  Instances are from 512 MB with 1 core to 16 GB with 8 cores

λ  We use Puppet to set up the management server and hypervisors

Best Practices

λ  Deploying a cloud is challenging. There are many different technology choices to make, and CloudStack is flexible enough in its configuration that there are many possible ways to combine and configure the chosen technology.

Process Best Practices

A staging system that models the production environment is strongly advised. It is critical if customizations have been applied to CloudStack.

Allow adequate time for installation, a beta, and learning the system. Installs with basic networking can be done in a few hours.

Installs with advanced networking usually take several days for the first attempt, with complicated installations taking longer. For a full production system, allow at least 4-8 weeks for a beta to work through all of the integration issues. You can get help from fellow users on the cloudstack-users mailing list.

Setup Best Practices

Each host should be configured to accept connections only from well-known entities such as the CloudStack Management Server or your network monitoring software.

Use multiple clusters per pod if you need to achieve a certain switch density.

Primary storage mountpoints or LUNs should not exceed 6 TB in size. It is better to have multiple smaller primary storage elements per cluster than one large one.

When exporting shares on primary storage, avoid data loss by restricting the range of IP addresses that can access the storage. See “Linux NFS on Local Disks and DAS” or “Linux NFS on iSCSI”.

NIC bonding is straightforward to implement and provides increased reliability.

10G networks are generally recommended for storage access when larger servers that can support relatively more VMs are used.

Host capacity should generally be modeled in terms of RAM for the guests. Storage and CPU may be overprovisioned. RAM may not. RAM is usually the limiting factor in capacity designs.
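
Modeling host capacity in terms of guest RAM can be sketched as a simple division. The function name, the 128 GB host and the 4 GB reserved for the hypervisor are assumptions for illustration only. The guest sizes are the smallest and largest instances we offer.

```python
def max_guests_by_ram(host_ram_mb: int, reserved_mb: int, guest_ram_mb: int) -> int:
    """Number of guests of one size that fit into host RAM without overprovisioning."""
    if guest_ram_mb <= 0:
        raise ValueError("guest_ram_mb must be positive")
    # RAM kept back for the hypervisor and host OS is not available to guests.
    usable_mb = host_ram_mb - reserved_mb
    if usable_mb <= 0:
        return 0
    return usable_mb // guest_ram_mb


HOST_RAM_MB = 128 * 1024  # assumed host size
RESERVED_MB = 4 * 1024    # assumed reservation for the hypervisor itself

print(max_guests_by_ram(HOST_RAM_MB, RESERVED_MB, 16 * 1024))  # largest instance: 7
print(max_guests_by_ram(HOST_RAM_MB, RESERVED_MB, 512))        # smallest instance: 248
```

CPU and storage are left out on purpose, because they may be overprovisioned while RAM may not.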

Maintenance Best Practices

Monitor host disk space. Many host failures occur because the host’s root disk fills up from logs that were not rotated adequately.

Monitor the total number of VM instances in each cluster, and disable allocation to the cluster if the total is approaching the maximum that the hypervisor can handle. Be sure to leave a safety margin to allow for the possibility of one or more hosts failing, which would increase the VM load on the other hosts as the VMs are redeployed.

Consult the documentation for your chosen hypervisor to find the maximum permitted number of VMs per host, then use CloudStack global configuration settings to set this as the default limit.

Monitor the VM activity in each cluster and keep the total number of VMs below a safe level that allows for the occasional host failure. For example, if there are N hosts in the cluster, and you want to allow for one host in the cluster to be down at any given time, the total number of VM instances you can permit in the cluster is at most (N-1) * (per-host-limit). Once a cluster reaches this number of VMs, use the CloudStack UI to disable allocation to the cluster.
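
The (N-1) * (per-host-limit) rule can be written down as a small check. This is a sketch only: the function names and the per-host limit of 50 are assumptions, and the real limit has to come from the hypervisor documentation.

```python
def cluster_vm_limit(hosts: int, per_host_limit: int, tolerated_failures: int = 1) -> int:
    """Maximum VMs in a cluster that still leaves room for failed hosts."""
    surviving_hosts = max(hosts - tolerated_failures, 0)
    return surviving_hosts * per_host_limit


def should_disable_allocation(current_vms: int, hosts: int, per_host_limit: int,
                              tolerated_failures: int = 1) -> bool:
    """True once the cluster has reached the safe VM count."""
    return current_vms >= cluster_vm_limit(hosts, per_host_limit, tolerated_failures)


# Two hosts and an assumed limit of 50 VMs per host.
print(cluster_vm_limit(2, 50))               # 50
print(should_disable_allocation(49, 2, 50))  # False
print(should_disable_allocation(50, 2, 50))  # True
```

With only two hosts, tolerating one failure halves the usable capacity. The margin costs relatively less as the cluster grows.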

Our problems

λ  complex network in our environment λ  the upgrade from version 4.2 to 4.3 did not go well … λ  support is OK, but ...

DEMO

•  testing system •  1 zone

•  1 pod

•  1 cluster

•  2 hosts

•  2 primary storage •  advanced network

TNX ...

λ  E-mail: [email protected]

λ  Blog: www.d-mashina.net λ  CV: www.guttierrez.org λ  Company: www.3Gen.si