From pets to cattle - powered by CoreOS, docker, Mesos & nginx
Uploaded by qaware-gmbh
MUC_DATE_THEME_INITIALS
From pets to cattle - powered by CoreOS, docker, Mesos & nginx
Thomas Schneider
Agenda
- Intro
- Pets vs. cattle
- Where we started ...
- Learnings & dead-ends
- ... where we ended up
- Technology stack
- Showtime, baby!
Intro
- Background
  - ~10 years experience in IT
  - Mainly software development
  - Some system engineering, automation & operations
  - @zooplus:
    - Lead engineer, build & runtime platform
    - Trying to encourage DevOps culture
- Get in touch
  - [email protected]
  - github.com/schneidexe
Pets vs. cattle
Where we started …
Build
- Monolithic code-base
- CI partly automated
- Manually configured teamcity
- 10+ all-purpose (= one-purpose) agents
- Up to 10+ builds in parallel
- Maven binary repository (nexus)
- Source repository (subversion)

Deploy
- Done via shell
- .sh scripts, scp, rsync
- Mainly .jar/.war files
- 2 out of 3 apps deployed differently
- Locally != dev != prod
- Devs create & test, but have no idea how it runs
- Ops deploy & run, but have no idea what

Run
- 300+ apps/jobs
- Lots of bare-metal
- Static environment
- Servers used by multiple apps, upgrades blocked
- Outdated machines
- Monitoring: ELK, collectd (hard to get all)
- High error noise level
- Ops == Firefighters
- Devs blocked by Infra
- Biz scaled, errors $$$

Application stack is limited and tailored to the environment. Build and release process is slow, error-prone and inflexible. Infrastructure is not transparent, extension is slow & painful.
Where we started …
Learnings & dead-ends
- Handle diversity
  - Horizontal scaling is not our primary issue
  - Handling tons of different apps is!
- Immutability & automation
  - Everything is (made from) code, build immutable artifacts
  - Automation is not only for speed but for knowledge
  - Re-creation over re-configuration
- Bring the pain forward
  - Do not do it all by yourself, get infra and devs (and biz) on board
  - Think in reverse: from prod to dev
- Keep it simple and fast
  - Get buy-in from devs, they have to understand it
  - Teams had/have to adapt several times (after all those years of stable infra!)
  - Simple is fun, speed is fun - if it's fun, people will use it
Learnings & dead-ends
- Docker helps, but does not solve all of your problems
  - It's not just about 'docker build' & 'docker run'
  - Think about scaling, monitoring and management of your apps on prod
- Stay focused
  - What do YOU need? (not Google, Netflix & co)
  - There are new products around every week, do 2-3 POCs, then stick with your decision
  - Keep it modular, have a plan in mind how to migrate/replace parts
  - Don't be scared of throwing away a few things
- Persistence
  - Try not to mix stateful and stateless things, externalize data
  - A database might be more of a pet than cattle
Learnings & dead-ends
- Puppet
  - Not the best tool for deploying your app
  - Re-configuration can be tricky (systems end up only almost identical)
  - Not immutable (unless you really nail every dependency)
- Fleet (0.9.x)
  - Nice features like side-kicks, low overhead
  - No resource management, too low level
  - Stability issues
- Mixing frameworks on Mesos agent nodes
  - Isolation can get tough if patterns are too different, e.g. jobs and services
  - Not enough resources for big jobs on service nodes
  - Spike utilization of batch jobs or builds can impact overall host performance
- Graphite and containers
  - Cannot handle highly dynamic metrics (30k different containers in 1 week)
... where we ended up
Build
- Distributed code-base
- Full CI/CD life-cycle can be automated
- Pre-configured jenkins master
- Disposable, customized jenkins slaves
- Scalable builds
- Multi-format binary repository (Artifactory)
- Source repository (git)

Deploy
- GUIs and REST APIs
- Deploy
  - Services
  - Jobs
  - Containers
  - Machines
- Unified deployment & management
- Cloud-agnostic
- You build it, you run it!*

Run
- Flexible resource management
- Health-checks & self-healing
- Environment config
- Service discovery & routing
- Horizontal scaling
- House-keeping
- Out-of-the-box monitoring (metrics, logs)

Build, deploy and run any application with high flexibility & low effort (Jenkinsfile, Dockerfile, Deployment .json). Same release process for all applications. High transparency on infrastructure*.
... where we ended up
Technology stack: Service Discovery & Routing, PaaS, CaaS, IaaS
SD/SR: Nixy, Mesos-DNS
PaaS: Marathon, Chronos, Jenkins
Mesos, Zookeeper
CaaS
IaaS: deploy-API + cloud-init
Technology stack: Monitoring
SD/SR
journal + filebeat
dockerbeat
PaaS
CaaS
hostbeat
IaaS: vCenter
Technology stack: Deploy API
- Fast cloud-like provisioning of VMs (resources: cpu, mem, disk, net)
- Lightweight bootstrapping with cloud-init
- Focus on cattle machines
- Re-create over re-configure

    $ curl -X POST \
        --form "image=coreos-1081.3.0" \
        --form "application=docker" \
        --form "env=dev112" \
        --form "cpu=8" \
        --form "mem=16" \
        --form "disk=100" \
        --form "[email protected]" \
        "http://deploy.zooplus.de/api/v1/machines"

    coreos:
      units:
        - name: docker.service
          drop-ins:
            - name: docker-opts.conf
              content: |
                [Service]
                Environment='DOCKER_OPTS=--host=tcp://0.0.0.0:2375'
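The curl call above can also be scripted. The following is a minimal Python sketch, assuming the endpoint and the form field names are exactly those shown in the curl example; the helper names `machine_fields` and `build_request` are hypothetical, and the API's validation rules and response format are not known from the slides. The cloud-init upload field is left out because its name is garbled in the transcript.

```python
import urllib.request
import uuid
from urllib.parse import urljoin

# Assumption: base URL and field names are copied from the curl call on the slide.
DEPLOY_API = "http://deploy.zooplus.de/api/v1/"


def machine_fields(image, application, env, cpu, mem, disk):
    """Return the form fields for one machine, with all values as strings."""
    for name, value in (("cpu", cpu), ("mem", mem), ("disk", disk)):
        if value <= 0:
            raise ValueError(f"{name} must be positive, got {value}")
    return {
        "image": image,
        "application": application,
        "env": env,
        "cpu": str(cpu),
        "mem": str(mem),
        "disk": str(disk),
    }


def build_request(fields, base_url=DEPLOY_API):
    """Encode the fields as multipart/form-data, the encoding curl --form uses."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")
    return urllib.request.Request(
        urljoin(base_url, "machines"),
        data="".join(parts).encode("utf-8"),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


fields = machine_fields("coreos-1081.3.0", "docker", "dev112", cpu=8, mem=16, disk=100)
request = build_request(fields)
print(request.get_method(), request.full_url)
# Sending is left out on purpose: urllib.request.urlopen(request) would create a VM.
```

Building the request separately from sending it keeps the sketch safe to run and makes the payload easy to inspect before a machine is actually created.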
Technology stack: Docker
- Automated reproducible builds with Dockerfile
- Immutable images
- Bundles app and dependencies
- Common artifact format
- Standardized way of deployment, monitoring, etc.
- Isolation of applications
- Resource allocation

    $ cat Dockerfile
    FROM repo.zooplus.de/centos:7
    RUN yum install -y java-1.8.0_47 && \
        yum clean all
    ADD shop.jar /shop.jar
    CMD java -jar shop.jar

    $ docker build -t shop .
    Building image shop
    Step 1 : FROM repo.zooplus.de/centos:7
     ---> 9b92a6d1f7de
    ...

    $ docker run shop
    Starting shop...
Technology stack: Mesos
- Resource manager
- Task distribution
- "Whole DC as single machine"
- All tasks run in docker containers
- Web UI for status/utilization and debugging (logs, task state)
- Usually no direct interaction
Technology stack: Jenkins
- Automation engine
- Jenkins 2.x
- Jenkinsfile and multi-branch support
- Post-commit hooks
- Immutable slaves
  - Running on mesos/docker
  - Customized
  - Highly scalable
- Jenkins master docker image
  - Spawn test instance in <1min (builds should run on prod)
  - Bootstrap with DSL

    folder('catalog') { }
    multibranchWorkflowJob('catalog/app') {
      branchSources {
        git {
          remote('ssh://[email protected]:22/cat/app.git')
          credentialsId('ef406810-be3c-4f2a-ad65-6239706d1766')
        }
      }
    }
Technology stack: Marathon
- Task scheduler for mesos
- Distributed init system
- Long running apps/services
- REST API: submit apps via .json
- GUI: manage apps & manual config
- Health checks & self-healing
- Multi-app deployments
- Rolling updates
- Horizontal scaling
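To make "submit apps via .json" concrete, here is a minimal Python sketch that builds a Marathon app definition with a Docker container, an HTTP health check and a rolling-update strategy. The field names follow the Marathon 1.x app format as far as I know it; the app id `/shop`, the image tag, the port, the health path and all numeric values are hypothetical and not taken from the slides.

```python
import json


def marathon_app(app_id, image, instances=2, cpus=0.5, mem=512,
                 container_port=8080, health_path="/health"):
    """Build an app definition dict that can be serialized to a deployment .json."""
    return {
        "id": app_id,
        "instances": instances,  # horizontal scaling: change this number
        "cpus": cpus,
        "mem": mem,
        "container": {
            "type": "DOCKER",
            "docker": {
                "image": image,
                "network": "BRIDGE",
                # hostPort 0 asks Marathon to pick a free host port
                "portMappings": [{"containerPort": container_port, "hostPort": 0}],
            },
        },
        # Self-healing: tasks failing this check repeatedly get replaced
        "healthChecks": [{
            "protocol": "HTTP",
            "path": health_path,
            "portIndex": 0,
            "gracePeriodSeconds": 60,
            "intervalSeconds": 10,
            "maxConsecutiveFailures": 3,
        }],
        # Rolling updates: keep at least half of the instances healthy
        "upgradeStrategy": {"minimumHealthCapacity": 0.5, "maximumOverCapacity": 0.2},
    }


app = marathon_app("/shop", "repo.zooplus.de/shop:1.0")  # hypothetical id and tag
print(json.dumps(app, indent=2))
```

The resulting JSON would typically be sent with a POST to Marathon's `/v2/apps` endpoint; the exact set of required fields depends on the Marathon version in use.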
Technology stack: Chronos
- Task scheduler for mesos
- Distributed cron system
- Batch jobs
- REST API: submit jobs via .json
- GUI: job details & status, manual execution
- Scheduling
  - Time-based
  - Dependency-based
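The two scheduling modes can be illustrated with a small Python sketch that builds one time-based and one dependency-based job definition. A time-based job carries an ISO 8601 repeating interval in `schedule`, a dependency-based job lists its `parents` instead. The job names, commands, images and resource values are hypothetical, and your Chronos version may require additional fields.

```python
import json
from datetime import datetime, timezone


def iso8601_schedule(start, every_hours, repeat=None):
    """Build a repeating interval R[n]/<start>/PT<h>H; an empty n means 'forever'."""
    count = "" if repeat is None else str(repeat)
    stamp = start.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"R{count}/{stamp}/PT{every_hours}H"


def base_job(name, command, image):
    """Fields shared by both job types; resource values are placeholders."""
    return {
        "name": name,
        "command": command,
        "cpus": 0.5,
        "mem": 256,
        "container": {"type": "DOCKER", "image": image},
    }


def time_based_job(name, command, image, schedule):
    job = base_job(name, command, image)
    job["schedule"] = schedule
    return job


def dependency_based_job(name, command, image, parents):
    if not parents:
        raise ValueError("a dependent job needs at least one parent")
    job = base_job(name, command, image)
    job["parents"] = list(parents)  # runs after all parents have succeeded
    return job


nightly = time_based_job(
    "catalog-export", "java -jar export.jar", "repo.zooplus.de/catalog-export:1.0",
    iso8601_schedule(datetime(2016, 6, 1, 2, 0, tzinfo=timezone.utc), every_hours=24),
)
report = dependency_based_job(
    "catalog-report", "java -jar report.jar", "repo.zooplus.de/catalog-report:1.0",
    parents=[nightly["name"]],
)
print(json.dumps([nightly, report], indent=2))
```

Chronos accepts the two job types on separate REST endpoints (one for ISO 8601 schedules, one for dependent jobs); the paths differ between Chronos versions, so check the documentation of the version you run.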
Technology stack: Nixy/Nginx/Mesos-DNS
- Nixy
  - Service catalog from marathon
  - REST-like API
  - Event-based
  - Configures nginx based on templates
- Nginx
  - State-of-the-art web server
  - Used as service router
  - SSL termination
  - Proxy for HTTP, TCP and UDP
  - Access control & public exposure
- Mesos-DNS
  - Service catalog from mesos
  - Convention-over-configuration naming pattern
  - Used for "internal" services

    "Apps": {
      "/finance/jenkins": {
        "Tasks": [
          ["ops85-150.web.zooplus.de:20357"],
          ["ops85-150.web.zooplus.de:20358"]
        ],
        "Frontends": [
          {
            "Type": "http",
            "Data": ["finance-jenkins"]
          }
        ]
      }
    }

    $ host jenkins-finance.marathon.prod.zooplus.net
    jenkins-finance.marathon.prod.zooplus.net has address 192.168.85.150
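The naming convention and the template-driven routing can be sketched in a few lines of Python. The first function reproduces the pattern visible above, where the app id `/finance/jenkins` resolves as `jenkins-finance.marathon.prod.zooplus.net`. The second renders an nginx `upstream` block from a task list; it only illustrates the idea and is an assumption about the output shape, since Nixy itself works with its own templates. Both function names are hypothetical.

```python
def mesos_dns_name(app_id, domain, framework="marathon"):
    """Map a Marathon app id to its DNS name: path segments reversed, joined by '-'."""
    segments = [s for s in app_id.split("/") if s]
    if not segments:
        raise ValueError("app id must contain at least one path segment")
    return f"{'-'.join(reversed(segments))}.{framework}.{domain}"


def nginx_upstream(name, tasks):
    """Render an nginx upstream block with one server line per running task."""
    if not tasks:
        raise ValueError("an upstream needs at least one task")
    servers = "\n".join(f"    server {task};" for task in tasks)
    return f"upstream {name} {{\n{servers}\n}}"


# Values taken from the service catalog excerpt and the DNS lookup above
tasks = ["ops85-150.web.zooplus.de:20357", "ops85-150.web.zooplus.de:20358"]
print(mesos_dns_name("/finance/jenkins", "prod.zooplus.net"))
print(nginx_upstream("finance-jenkins", tasks))
```

Because both outputs are derived from the app id and the current task list, nothing has to be configured per service; regenerating them on every Marathon event keeps routing in sync with where tasks actually run.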
Technology stack: journal & beats
- Hostbeat
  - Ships host metrics in beats format
  - Like collectd
- Dockerbeat
  - Ships container metrics in beats format
  - Metadata: env, labels
- Journal/Filebeat
  - Ships every single log line from journald to ELK
  - Docker uses the journald log driver to ship stdout/stderr
  - Apps should log in JSON-lines
- ELK/Graphite
  - Elasticsearch: event data
  - Graphite: TSD/metrics
- Nagios
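"Apps should log in JSON-lines" means one self-contained JSON object per line on stdout, so the journal and Filebeat can ship each line as a structured event. Below is a minimal Python sketch using only the standard library; the field names (`timestamp`, `level`, `logger`, `message`) and the logger name `shop` are assumptions, not a schema from the slides.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonLineFormatter(logging.Formatter):
    """Format each log record as a single-line JSON object."""

    def format(self, record):
        event = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # json.dumps escapes newlines, so a stack trace stays on one line
            event["exception"] = self.formatException(record.exc_info)
        return json.dumps(event)


def json_logger(name, stream=sys.stdout):
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonLineFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate plain-text output via the root logger
    return logger


log = json_logger("shop")
log.info("order %s accepted", 4711)
```

Writing to stdout rather than to a file fits the container setup described above: Docker hands stdout/stderr to the journal, and the app needs no knowledge of where logs end up.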
Technology stack: Cluster structure
Technology stack: Questions?
Showtime, baby!
Thank You!
… and yes, we’re hiring! ;)