NoSQL: Non-relational databases for high scalability...

Στέλιος Καραμπασάκης » [email protected] Department of Informatics and Telecommunications, University of Athens (ΕΚΠΑ), Postgraduate Programme (ΠΜΣ) 510 – Topics in Database Applications


description

Download the original PPTX presentation with speaker notes in Greek from: http://www.mediafire.com/?me3h3zfqkny
NoSQL Grunge Logo designed by me and released to the public domain. Download as PSD or PNG from: http://www.mediafire.com/?sharekey=2644cf1d57cb17d6ab1eab3e9fa335cace0f768f8ef0a62b
---------
Presentation given on 26/5/2010 at the Department of Informatics and Telecommunications, University of Athens (ΕΚΠΑ), as part of the postgraduate course "Topics in Database Applications"

Transcript of NoSQL: Non-relational databases for high scalability...

2.
3. servers, reads, updates, database servers, hardware
4. 25 . 100 . 1 . / 55M tweets/ 600 . searches/ 500 . 200 data clusters, 120 . queries/sec. Query throughput: 40GB/sec/cluster, >10PB
5. "Ask Ron, our Systems Engineering Lead, the exact number of servers we have in production and he'll probably respond with 'I don't honestly know'." digg.com
6.
7. Normalization, joins, foreign keys, indexes, SQL parsing, query optimization, persistent storage, transactions, ACID properties (atomicity, consistency, isolation, durability), security/authentication
8. Normalization, joins, foreign keys, indexes, SQL parsing, query optimization, persistent storage, transactions, ACID properties (atomicity, consistency, isolation, durability), security/authentication
9. Scaling up: CPU, RAM, server. Denormalization: joins. Distributed caching: memcached. Replication: master-slave (master writes & slaves reads), multi-master (masters writes & slaves reads). Partitioning: vertical partitioning (tables, servers), horizontal partitioning / sharding (servers)
10. "Our growth has forced us into horizontal and vertical partitioning strategies that have eliminated most of the value of a relational database, while still incurring all the overhead." Digg.com
11.
12. !!
13. "Dealing with failures in an infrastructure comprised of millions of components is our standard mode of operation." Amazon.com
14. !
15. 2004: Google, BigTable. 2005: CouchDB. 2006: paper, BigTable. 2007: Amazon, paper, Dynamo; Amazon S3, Dynamo. 2008: Google, BigTable, Google App Engine. 2008: Facebook, Cassandra, BigTable, Dynamo. 2008: LinkedIn, Project Voldemort.
16.
17.
18.
19. Availability, Consistency, Partition tolerance
20. Availability, Consistency, Partition tolerance
21. = + partitioning + JOINs + = Replication + = latency + access control
22. data models, SQL, query APIs, eventual consistency, NO, partitioning, replication, load balancing
23. schema-less, partitioning
24. partitioning, DHTs. Single object operations: read, update. JOINs. Object versioning, use cases (e.g. Wikipedia, Google Docs)
25. Key-value, Column, Document. John Smith, 212 555-1234, 646 555-4567
26. (hash) (blob)
27. f1:col, f2:col, f3:col1, f3:col2, f3:col3, f999:col ...
28. Key-value: get(key), put(key, context, object). Column: get(table, key, columnName), insert(table, key, rowMutation), delete(table, key, columnName)
29. Diagram labels: nosql, rdbms, mappers, k-v, reducers
30.
31. eventual consistency; vector clocks
32.
33. Consistent Hashing
34. 63 0 56 8 48 16 40 24 32 64
35.
36. k, k
37. cluster
38. cluster, cluster
39. cluster, cluster, cluster
40. consistent hashing
41. consistent hashing
42. cluster, P, V; P=4, V=3; cluster, V; cluster, V
43.
44.
45. N-1, =4, k, k, k
46. get: R. put: W. R, W, N. Trade-off: consistency, latency
47. Key-value: Amazon Dynamo, Scalaris, Voldemort. Column: Google BigTable, Hbase, Cassandra, Hyperbase. Document: CouchDB, Riak, MongoDB
48. F. Chang et al., Bigtable: A distributed storage system for structured data, in Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation (OSDI '06), 2006, http://labs.google.com/papers/bigtable.html
Giuseppe DeCandia et al., Dynamo: Amazon's Highly Available Key-value Store, in Proceedings of the twenty-first ACM SIGOPS Symposium on Operating Systems Principles (Stevenson, Washington, USA: ACM, 2007), 205-220, http://portal.acm.org/citation.cfm?id=1294261.1294281
A. Lakshman and P. Malik, Cassandra - A Decentralized Structured Storage System (2007), http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf
S. Das et al., Clouded Data: Comprehending Scalable Data Management Systems, Technical Report 2008-18, UCSB, 2008.
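Slide 28 lists the column-store operations get(table, key, columnName), insert(table, key, rowMutation) and delete(table, key, columnName). A minimal in-memory sketch of that interface follows. The class name `ColumnStore`, the nested-dict layout, and the choice to model `rowMutation` as a plain dict of column names to values are assumptions made for illustration; they are not taken from the slides or from any particular product.

```python
class ColumnStore:
    """Toy column-oriented store: table -> row key -> {column name: value}.

    Hypothetical illustration of the slide-28 operations; real systems
    add persistence, partitioning, and timestamps.
    """

    def __init__(self):
        self._tables = {}

    def insert(self, table, key, row_mutation):
        # row_mutation is assumed to be a dict of column name -> new value.
        row = self._tables.setdefault(table, {}).setdefault(key, {})
        row.update(row_mutation)

    def get(self, table, key, column_name):
        # Returns None when the table, row, or column does not exist.
        return self._tables.get(table, {}).get(key, {}).get(column_name)

    def delete(self, table, key, column_name):
        # Deleting a missing column is a no-op.
        self._tables.get(table, {}).get(key, {}).pop(column_name, None)


store = ColumnStore()
store.insert("users", "john", {"f1:name": "John Smith", "f3:col1": "212 555-1234"})
store.insert("users", "john", {"f3:col2": "646 555-4567"})  # rows need no fixed schema
store.delete("users", "john", "f3:col1")
```

Each operation touches a single row identified by its key, which is what lets such stores partition rows across servers without needing JOINs.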
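Slide 31 names vector clocks in the context of eventual consistency. As a rough sketch of the idea, each replica keeps a counter per node; one version supersedes another only if all of its counters are at least as large, and otherwise the two versions are concurrent and need reconciliation. The function names below are hypothetical, and the clocks are modeled as plain dicts for brevity.

```python
def increment(clock, node):
    """Return a copy of clock with node's counter advanced by one."""
    new_clock = dict(clock)
    new_clock[node] = new_clock.get(node, 0) + 1
    return new_clock


def descends(a, b):
    """True if clock a has seen everything clock b has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def concurrent(a, b):
    """True if neither clock descends from the other (a write conflict)."""
    return not descends(a, b) and not descends(b, a)


def merge(a, b):
    """Element-wise maximum, used after a conflict has been reconciled."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


# Two replicas update the same object independently, starting from v1.
v1 = increment({}, "node_a")
v2_a = increment(v1, "node_a")   # update handled by node_a
v2_b = increment(v1, "node_b")   # update handled by node_b
reconciled = increment(merge(v2_a, v2_b), "node_a")
```

Here `v2_a` and `v2_b` both descend from `v1` but are concurrent with each other, so the store cannot silently pick a winner; `reconciled` descends from both.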
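Slides 33 to 42 cover consistent hashing, and slide 42 mentions the values P=4 and V=3. The sketch below reads these as four physical nodes with three virtual nodes each; that reading is an assumption, since the explanatory text of the slide did not survive. The class name `HashRing`, the use of MD5 to place labels on the ring, and the `node#i` naming of virtual nodes are likewise illustrative choices, not the presenter's.

```python
import bisect
import hashlib


def ring_position(label):
    """Map a string to a 64-bit position on the ring (MD5 is an arbitrary choice)."""
    digest = hashlib.md5(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=3):
        self.vnodes = vnodes
        self._positions = []  # sorted ring positions of all virtual nodes
        self._owner = {}      # ring position -> physical node
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.vnodes):
            pos = ring_position(f"{node}#{i}")
            if pos in self._owner:
                continue  # position already taken; skip this virtual node
            bisect.insort(self._positions, pos)
            self._owner[pos] = node

    def remove_node(self, node):
        self._positions = [p for p in self._positions if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key):
        """Return the node owning the first virtual node clockwise from the key."""
        if not self._positions:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._positions, ring_position(key))
        return self._owner[self._positions[idx % len(self._positions)]]


ring = HashRing(["node1", "node2", "node3", "node4"], vnodes=3)  # assumed P=4, V=3
owner = ring.lookup("user:42")
```

The property that matters is that adding or removing a node only reassigns the keys adjacent to that node's virtual nodes, while every other key keeps its owner. With plain `hash(key) % P`, changing P would move almost every key.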
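Slide 46 relates get to R, put to W, and the three parameters R, W, N to a trade-off between consistency and latency. Under the usual Dynamo-style reading (an assumption here, since the slide text is incomplete), N is the number of replicas per key, W the number that must acknowledge a write, and R the number consulted on a read; choosing R + W > N forces every read set to overlap the latest write set. The toy class below is hypothetical and deliberately pessimistic: writes reach only the first W replicas and reads consult only the last R.

```python
class ReplicatedRegister:
    """Single replicated value with N replicas and R/W quorum sizes.

    Illustrative only: version numbers are assigned by one coordinator,
    and replica selection is fixed to the least favourable case.
    """

    def __init__(self, n, r, w):
        if not (1 <= r <= n and 1 <= w <= n):
            raise ValueError("need 1 <= R <= N and 1 <= W <= N")
        self.n, self.r, self.w = n, r, w
        self.replicas = [(0, None)] * n  # each replica holds (version, value)

    def put(self, value):
        version = max(v for v, _ in self.replicas) + 1
        for i in range(self.w):          # only the first W replicas acknowledge
            self.replicas[i] = (version, value)
        return version

    def get(self):
        answers = self.replicas[self.n - self.r:]    # only the last R replicas answer
        return max(answers, key=lambda a: a[0])[1]   # newest version wins


strict = ReplicatedRegister(n=3, r=2, w=2)   # R + W > N: read and write sets overlap
strict.put("v1")
fast = ReplicatedRegister(n=3, r=1, w=1)     # R + W <= N: lower latency, may be stale
fast.put("v1")
```

In this worst case `strict.get()` returns the value just written, while `fast.get()` can still return the old (empty) value, which is the consistency versus latency trade-off the slide points at.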