Data & Open Technologies
A Perfect Combination
Introduction Lecture
Καουκάκης Σταύρος, Information Systems Analyst-Programmer, M.Sc.
Board Member, University of Crete Postgraduate Alumni Association @kaukakis
11 June 2016
Contents
Main Topics
Data Sources (Who Produces Data?)
Open Source & Free Software
Some (future) stats
We Have Data. So, What Do We Need?
Open (Source) Software Tools & Platforms
Some Examples & Case Studies
Discussion
1st Data Driven World 11 June 2016 2
Some Topics
Who Produces Data?
All of us. Everybody!
Environment
When?
All the Time!
Who Collects Data?
Government
Companies
Users
Who Owns Data?
Why Does Everyone Collect Data?
Take Advantage of Data!
Some Big Data Numbers
Over 90% of all data was created in the last 2 years
Every 2 days, as much data is stored as existed in digital form up to 2003
By 2020 the volume of data will grow tenfold (~40 zettabytes)
Every year the volume of data almost doubles
Devices connected to the Internet: 13 billion
By 2020 they are expected to reach 50 billion
Over 3 billion users
DVDs Stack to the Moon!!! (And Back) Ben Golub @golubbe
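The growth figures on this slide imply compound annual rates that are easy to check. The following is a minimal sketch, assuming the 13 billion device count refers to 2016 (the year of this talk; the slide does not state a base year) and that growth compounds once per year. The function names are illustrative only.

```python
import math

def implied_annual_growth(start, end, years):
    """Compound annual growth rate that takes `start` to `end` in `years`."""
    return (end / start) ** (1 / years) - 1

def years_to_multiply(factor, annual_growth):
    """Years needed to grow by `factor` at a constant annual growth rate."""
    return math.log(factor) / math.log(1 + annual_growth)

# Assumption: 13 billion devices in 2016, 50 billion expected in 2020.
device_growth = implied_annual_growth(13e9, 50e9, years=4)

# If data doubles every year (100% annual growth), how long until it is 10x?
tenfold_years = years_to_multiply(10, annual_growth=1.0)

print(f"Implied device growth: {device_growth:.1%} per year")
print(f"Years to a tenfold increase at yearly doubling: {tenfold_years:.2f}")
```

Under these assumptions the device forecast corresponds to roughly 40% growth per year, and yearly doubling reaches a tenfold increase in a little over three years.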
…Every 60 Seconds!
2015 Report Source: qmee.com
Source: wikimedia.org
(Big – Linked) Data & Software
Software & Tools Needed
Open Source Software
Open Hardware
Open Technologies
Open Data Platforms
Why Open Source?
Customizability
Flexibility – Agility
Interoperability
Big Communities
Freedom
Try Before You Buy
Low Cost
Security
Online community and public directory of free and open source software
https://www.openhub.net/
Tools & Software for Data…
Storage
Analysis
Cleaning
Mining
Visualization
Integration
Publishing
Automation
Programming Languages
… Open Technology is everywhere!
CKAN (Data Publishing)
CKAN is a powerful data management system
Publishing
Sharing
Using Data
Web: ckan.org
Case Study: http://www.data.gov.gr/
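Data published through CKAN can also be read by programs, because CKAN exposes a JSON "Action API" whose `package_search` action searches datasets. The sketch below builds such a request and reads dataset titles from a response. The portal address `ckan.example.org` is a placeholder, the sample response is hand-written in CKAN's response envelope, and the helper names are illustrative; whether a particular portal (including the case study above) exposes this API is an assumption to verify.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

def package_search_url(portal, query, rows=5):
    """Build a CKAN Action API URL for a dataset search."""
    params = urlencode({"q": query, "rows": rows})
    return f"{portal.rstrip('/')}/api/3/action/package_search?{params}"

def dataset_titles(response):
    """Extract dataset titles from a decoded package_search response."""
    if not response.get("success"):
        raise ValueError("CKAN reported a failed request")
    return [ds.get("title") or ds["name"] for ds in response["result"]["results"]]

def search(portal, query, rows=5):
    """Query a live portal (needs network access; not called below)."""
    with urlopen(package_search_url(portal, query, rows)) as reply:
        return dataset_titles(json.load(reply))

# Offline demonstration: a made-up response in CKAN's envelope shape.
sample = {
    "success": True,
    "result": {
        "count": 1,
        "results": [{"name": "budget-2015", "title": "Budget 2015"}],
    },
}
url = package_search_url("https://ckan.example.org/", "budget")
titles = dataset_titles(sample)
print(url)
print(titles)
```

To try it against a real portal, call `search()` with that portal's base URL.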
Open Refine (Data Cleaning)
A free, open source, powerful tool for working with messy data
Cleaning
Transforming from one format into another
Extending
Web: openrefine.org
An Example
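The example shown in the talk is not part of this transcript. As a stand-in, here is a small sketch of one cleaning idea, loosely modeled on the key-collision ("fingerprint") clustering that OpenRefine offers: spellings that differ only in case, spacing, punctuation, accents, or word order are reduced to the same key and grouped. This is not OpenRefine's own code, and the sample values are made up.

```python
import string
import unicodedata
from collections import defaultdict

def fingerprint(value):
    """Normalize a string so that near-identical spellings share one key."""
    text = unicodedata.normalize("NFKD", value.strip().lower())
    # Drop combining marks so that accented and plain letters match.
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Remove ASCII punctuation, then sort and de-duplicate the words.
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(text.split())))

def cluster(values):
    """Group values whose fingerprints collide; keep groups with variants."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]

# Made-up messy column.
messy = [
    "University of Crete",
    "university of crete ",
    "Crete, University of",
    "Heraklion",
    "Héraklion",
    "Chania",
]
clusters = cluster(messy)
for group in clusters:
    print(group)
```

Each printed group is a set of candidate duplicates that a person would then review and merge into one value.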
Datawrapper (Data Visualization – Web App)
Datawrapper is like having an amazing graphic designer at the tip of your fingers
Brings Data to Life
Interactive Charts
No Coding Skills Needed
Limitations in the free edition (export to PNG files)
Web: datawrapper.de
Examples: https://datawrapper.de/gallery
Data-Driven Documents (for Programmers)
D3.js is a JavaScript library for manipulating documents based on data
Brings Data to Life
Compatible with modern browsers
Data-driven approach
Web: d3js.org
Examples: github.com/d3/d3/wiki/Gallery
Also have a look at Google Charts
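D3.js itself is a JavaScript library. To stay in one language across these notes, the sketch below illustrates only the data-driven idea in Python: each data item becomes one SVG element whose attributes are computed from that item. It does not use or imitate the D3 API. The function names are illustrative, and the two values are the device counts quoted earlier in this talk.

```python
from html import escape

def linear_scale(domain_max, range_max):
    """Return a function mapping [0, domain_max] onto [0, range_max]."""
    def scale(value):
        return value / domain_max * range_max
    return scale

def bar_chart_svg(data, width=300, bar_height=20, gap=5):
    """Render (label, value) pairs as horizontal bars, one <rect> per item."""
    scale = linear_scale(max(value for _, value in data), width)
    rects = []
    for index, (label, value) in enumerate(data):
        y = index * (bar_height + gap)
        rects.append(
            f'<rect x="0" y="{y}" width="{scale(value):.0f}" '
            f'height="{bar_height}"><title>{escape(label)}: {value}</title></rect>'
        )
    height = len(data) * (bar_height + gap)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" '
        f'height="{height}">' + "".join(rects) + "</svg>"
    )

# Connected devices in billions, as quoted on the statistics slide.
data = [("Connected devices today", 13), ("Expected by 2020", 50)]
svg = bar_chart_svg(data)
print(svg)
```

Saving the printed string as an `.svg` file and opening it in a browser shows the two bars. Changing the data changes the drawing, with no other edits needed.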
Lumify (Analysis and Visualization)
Lumify is an open source big data analysis and visualization platform
Analyze relationships
Geographical view
Share your work in real time
Web: lumify.io
Examples: http://lumify.io/
R Language - Environment
R is a language and environment for statistical computing and graphics
Statistical & Graphical techniques
Linear and nonlinear modeling
Classification, Clustering
Web: r-project.org
Examples: http://www.rexamples.com/
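To make "linear modeling" concrete, here is a minimal sketch of an ordinary least-squares line fit. It is written in Python to stay in one language across these notes; in R the corresponding fit is obtained with `lm(y ~ x)`. The data points are made up and lie exactly on y = 1 + 2x, so the expected result is easy to verify by hand.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Made-up points on the line y = 1 + 2x.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
intercept, slope = fit_line(xs, ys)
print(f"y = {intercept:.1f} + {slope:.1f} * x")
```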
Data Storage - Management & More Open…
Hadoop (hadoop.apache.org)
MongoDB (mongodb.com)
Talend (talend.com)
Rapidminer (rapidminer.com)
Elodina Platform (elodina.net)
RDBMS, like MySQL and PostgreSQL
Online community and public directory of free and open source software
https://www.openhub.net/
https://opensource.org/
Thank You,
Questions?
Καουκάκης Σταύρος
@kaukakis