Post on 14-Jun-2015
HBase Data Types
Nick Dimiduk, Hortonworks
@xefyr, n10k.com
Agenda
• Motivations
• Progress thus far
• Future work
• Examples
• More Examples
2014-11-18. Licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
Why introduce types?
• Δ(SQL, byte[]): (╯°□°)╯︵ ┻━┻
• Rule of least surprise
• Interoperability across tools
• Distill best practices
Considerations
• Opt-in for current users
• Easy transition for existing applications
• Client-side only, mostly
  – Filters, Split policies, Coprocessors, Block encoding
• Avoid POJO constraints
  – No required base-class/interface
  – No magic (avoid ASM, ORM)
• Non-Java clients
• HBASE-8089
Inspiration
• Orderly
• PostgreSQL / PostGIS
• HBASE-7221
• HBASE-7692
Features: Encoding
• Order preservation
• Override direction (ASC/DSC)
• Fixed, variable-width
• Nullable
• Self-identifying
• Efficient
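To make "order preservation" and "override direction" concrete, here is a minimal sketch of the idea for a 32-bit integer. It is not the OrderedBytes wire format (type headers and other details are left out), and the class and method names are made up for illustration. The trick: flip the sign bit so that unsigned, byte-by-byte comparison agrees with numeric order, and invert every byte to get descending order.

```java
/** Illustration only: shows the idea of an order-preserving encoding, not HBase's actual format. */
class OrderPreservingInt {

  /** Big-endian bytes with the sign bit flipped, so unsigned byte order matches numeric order. */
  static byte[] encodeAscending(int value) {
    int flipped = value ^ 0x80000000;
    return new byte[] {
      (byte) (flipped >>> 24), (byte) (flipped >>> 16), (byte) (flipped >>> 8), (byte) flipped
    };
  }

  /** Descending order: invert every byte of the ascending encoding. */
  static byte[] encodeDescending(int value) {
    byte[] b = encodeAscending(value);
    for (int i = 0; i < b.length; i++) {
      b[i] = (byte) ~b[i];
    }
    return b;
  }

  /** Reverses encodeAscending. */
  static int decodeAscending(byte[] b) {
    int flipped = ((b[0] & 0xff) << 24) | ((b[1] & 0xff) << 16)
        | ((b[2] & 0xff) << 8) | (b[3] & 0xff);
    return flipped ^ 0x80000000;
  }

  /** Lexicographic comparison treating bytes as unsigned, which is how row keys are sorted. */
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  }

  public static void main(String[] args) {
    int[] values = {-5, -1, 0, 1, 42};
    for (int i = 0; i + 1 < values.length; i++) {
      boolean asc = compare(encodeAscending(values[i]), encodeAscending(values[i + 1])) < 0;
      boolean desc = compare(encodeDescending(values[i]), encodeDescending(values[i + 1])) > 0;
      System.out.println(values[i] + " vs " + values[i + 1] + ": asc ok=" + asc + ", desc ok=" + desc);
    }
  }
}
```

A plain two's-complement big-endian encoding would place -1 (0xFFFFFFFF) after 1 (0x00000001) under the same comparison, which is the surprise the ordered encodings are meant to remove.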
Features: API
• Complex type encoding
  – Compound rowkey pattern
  – Order preservation
  – Nullable fields
• Runtime metadata
• User-extensible
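The compound rowkey pattern can be sketched without any HBase classes. The example below is an assumption-laden illustration, not the Struct implementation: the key layout (a user name followed by a sequence number), the class name, and the choice of a 0x00 terminator are all hypothetical. It shows why a variable-length field in the middle of a key needs a terminator (or a fixed length): without one, ("ab", ...) and ("a", ...) keys cannot be told apart or sorted field by field.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Illustration only: a two-field compound key whose byte order follows (user, seq). */
class CompoundKey {

  /** Encodes user as UTF-8 plus a 0x00 terminator, then seq as a sign-flipped big-endian int. */
  static byte[] encode(String user, int seq) {
    byte[] text = user.getBytes(StandardCharsets.UTF_8);
    ByteArrayOutputStream out = new ByteArrayOutputStream(text.length + 5);
    out.write(text, 0, text.length);
    out.write(0x00); // terminator; assumes the text itself contains no 0x00 byte
    int flipped = seq ^ 0x80000000; // sign-bit flip keeps negative numbers before positive ones
    out.write(flipped >>> 24); // write(int) keeps only the low 8 bits
    out.write(flipped >>> 16);
    out.write(flipped >>> 8);
    out.write(flipped);
    return out.toByteArray();
  }

  /** Lexicographic comparison treating bytes as unsigned. */
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  }

  public static void main(String[] args) {
    // The first field dominates: "a" sorts before "ab" regardless of the second field.
    System.out.println(compare(encode("a", 5), encode("ab", -100)) < 0);
    // With equal first fields, the second field decides.
    System.out.println(compare(encode("a", -1), encode("a", 1)) < 0);
  }
}
```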
Implementation (HBASE-8089)
Implementation: Encoding
o.a.h.h.util.Bytes
• numeric
• boolean
• int16, int32, int64
• float32, float64
• variable-length text
o.a.h.h.util.OrderedBytes
• null
• numeric, +/-Inf, NaN
• int8, int16, int32, int64
• float32, float64
• variable-length text
• variable-length blob
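For the float32 entry, a sketch of how an IEEE 754 value can be made to sort correctly as unsigned bytes. This illustrates the general bit trick only; the real OrderedBytes encodings carry additional details that are omitted here, and the class name is hypothetical. Non-negative floats get their sign bit flipped; negative floats get every bit inverted, which reverses their order so that more negative values sort first.

```java
/** Illustration only: order-preserving bytes for a 32-bit float, without any type header. */
class OrderPreservingFloat {

  static byte[] encode(float value) {
    int bits = Float.floatToIntBits(value);
    // Negative: (bits >> 31) is all ones, so every bit is inverted.
    // Non-negative: (bits >> 31) is zero, so only the sign bit is flipped.
    bits ^= (bits >> 31) | Integer.MIN_VALUE;
    return new byte[] {
      (byte) (bits >>> 24), (byte) (bits >>> 16), (byte) (bits >>> 8), (byte) bits
    };
  }

  static float decode(byte[] b) {
    int bits = ((b[0] & 0xff) << 24) | ((b[1] & 0xff) << 16)
        | ((b[2] & 0xff) << 8) | (b[3] & 0xff);
    // Encoded top bit set means the original was non-negative: flip the sign bit back.
    // Otherwise the original was negative: invert every bit back.
    bits ^= (~bits >> 31) | Integer.MIN_VALUE;
    return Float.intBitsToFloat(bits);
  }

  /** Lexicographic comparison treating bytes as unsigned. */
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  }

  public static void main(String[] args) {
    float[] sorted = {Float.NEGATIVE_INFINITY, -2.5f, -1.0f, 0.0f, 1.5f, Float.POSITIVE_INFINITY};
    for (int i = 0; i + 1 < sorted.length; i++) {
      boolean ok = compare(encode(sorted[i]), encode(sorted[i + 1])) < 0;
      System.out.println(sorted[i] + " sorts before " + sorted[i + 1] + ": " + ok);
    }
  }
}
```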
Implementation: API
interface DataType<T>
• decode()
• encode()
• encodedClass()
• encodedLength()
• getOrder()
• isNullable()
• isOrderPreserving()
• isSkippable()
• skip()
implements DataType
• OrderedXXX
• RawXXX
• Struct
  – StructBuilder
  – StructIterator
  – TerminatedWrapper
  – FixedLengthWrapper
• Union{2,3,4}
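To show how the encode/decode/skip trio fits together, here is a cut-down, hypothetical analogue of the interface. SimpleType and TerminatedUtf8 are invented names, java.nio.ByteBuffer stands in for PositionedByteRange, and only a subset of the methods listed above is modeled. The point is the contract: each call advances the buffer position past exactly one value, and skip() does so without materializing it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical, simplified stand-in for DataType<T>; not the HBase interface. */
interface SimpleType<T> {
  int encode(ByteBuffer dst, T val); // writes val at dst's position, returns bytes written
  T decode(ByteBuffer src);          // reads one value, advancing src past it
  int skip(ByteBuffer src);          // advances src past one value, returns bytes skipped
  int encodedLength(T val);          // bytes encode() would write for val
  Class<T> encodedClass();           // runtime metadata: the Java type handled
}

/** A 0x00-terminated UTF-8 string; assumes values contain no 0x00 byte. */
class TerminatedUtf8 implements SimpleType<String> {

  @Override
  public int encode(ByteBuffer dst, String val) {
    byte[] b = val.getBytes(StandardCharsets.UTF_8);
    dst.put(b).put((byte) 0);
    return b.length + 1;
  }

  @Override
  public String decode(ByteBuffer src) {
    int start = src.position();
    int len = skip(src) - 1; // exclude the terminator
    // Assumes a heap (array-backed) buffer, e.g. one from ByteBuffer.allocate.
    return new String(src.array(), src.arrayOffset() + start, len, StandardCharsets.UTF_8);
  }

  @Override
  public int skip(ByteBuffer src) {
    int start = src.position();
    while (src.get() != 0) {
      // scan forward to the terminator
    }
    return src.position() - start;
  }

  @Override
  public int encodedLength(String val) {
    return val.getBytes(StandardCharsets.UTF_8).length + 1;
  }

  @Override
  public Class<String> encodedClass() {
    return String.class;
  }

  public static void main(String[] args) {
    TerminatedUtf8 type = new TerminatedUtf8();
    ByteBuffer buf = ByteBuffer.allocate(16);
    type.encode(buf, "foo");
    type.encode(buf, "bar");
    buf.flip();                           // switch from writing to reading
    type.skip(buf);                       // jump over "foo" without decoding it
    System.out.println(type.decode(buf)); // prints bar
  }
}
```

Skipping matters for compound values: a reader that wants only the third field of a key can skip the first two instead of decoding them.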
Up Next
• “Default” types
• More complex types
  – Arrays/Lists
  – Maps/Dicts
• Tool integration
  – Apache Phoenix
  – Cloudera Kite
• Performance audit, HBASE-8694
• Improved metadata, HBASE-8863
  – isCastableTo
  – isCoercableTo
  – isComparableTo
• TypedTable, HBASE-7941
• Beyond Java, HBASE-10091
  – REST
  – Thrift
  – Shell
• ImportTsv, HBASE-8593
• User documentation
• Coprocessors?
• Filters?
• CAS?
• DataBlockEncoders?
Examples
A case for TypedTable
Put p = new Put(Bytes.toBytes(u.user));
p.add(INFO_FAM, USER_COL, Bytes.toBytes(u.user));
p.add(INFO_FAM, NAME_COL, Bytes.toBytes(u.name));
p.add(INFO_FAM, EMAIL_COL, Bytes.toBytes(u.email));
p.add(INFO_FAM, PASS_COL, Bytes.toBytes(u.password));
A case for TypedTable
static final RawString ENC_STR = new RawString();
static final RawLong ENC_LONG = new RawLong();
--

SimplePositionedByteRange pbr =
    new SimplePositionedByteRange(100);
ENC_STR.encode(pbr, u.user);
Put p = new Put(Bytes.copy(pbr.getBytes(), pbr.getOffset(), pbr.getPosition()));
p.add(INFO_FAM, USER_COL, Bytes.copy(pbr.getBytes(), ...));
pbr.setPosition(0);
ENC_STR.encode(pbr, u.name);
p.add(INFO_FAM, NAME_COL, Bytes.copy(pbr.getBytes(), ...));
...
Structs: writing
Struct struct = new StructBuilder()
    .add(OrderedNumeric.ASCENDING)
    .add(OrderedString.ASCENDING)
    .toStruct();
PositionedByteRange buf1 =
    new SimplePositionedByteRange(7);
struct.encode(buf1,
    new Object[] { BigDecimal.ONE, "foo" });
Structs: reading
buf1.setPosition(0);
StructIterator it = struct.iterator(buf1);
while (it.hasNext()) {
  System.out.print(it.next() + ", ");
}

> BigDecimal.ONE, foo
Structs: schema migration
Struct addedFields = new StructBuilder()
    .add(OrderedNumeric.ASCENDING)
    .add(OrderedString.ASCENDING)
    .add(OrderedString.ASCENDING)
    .add(OrderedNumeric.ASCENDING)
    .toStruct();

buf1.setPosition(0);
StructIterator it = addedFields.iterator(buf1);
while (it.hasNext()) {
  System.out.print(it.next() + ", ");
}
> BigDecimal.ONE, foo, null, null
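The behavior shown above, where fields added to the end of a schema read back as null from older data, can be sketched without HBase. This is a guess at the idea, not the StructIterator implementation: TrailingNullReader is a hypothetical class, and every field is modeled as a 0x00-terminated UTF-8 string for simplicity. A reader that knows the field count simply yields null once the buffer runs out.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Illustration only: trailing fields that are absent from the buffer are read as null. */
class TrailingNullReader {

  /** Writes each field as UTF-8 followed by a 0x00 terminator and returns a readable buffer. */
  static ByteBuffer write(String... fields) {
    int size = 0;
    for (String f : fields) {
      size += f.getBytes(StandardCharsets.UTF_8).length + 1;
    }
    ByteBuffer buf = ByteBuffer.allocate(size);
    for (String f : fields) {
      buf.put(f.getBytes(StandardCharsets.UTF_8)).put((byte) 0);
    }
    buf.flip();
    return buf;
  }

  /** Reads fieldCount fields; once the buffer is exhausted, the remaining fields are null. */
  static List<String> read(ByteBuffer src, int fieldCount) {
    List<String> out = new ArrayList<>();
    for (int i = 0; i < fieldCount; i++) {
      if (!src.hasRemaining()) {
        out.add(null); // field was added to the schema after this value was written
        continue;
      }
      int start = src.position();
      while (src.get() != 0) {
        // scan forward to the terminator
      }
      int len = src.position() - start - 1;
      out.add(new String(src.array(), src.arrayOffset() + start, len, StandardCharsets.UTF_8));
    }
    return out;
  }

  public static void main(String[] args) {
    ByteBuffer old = write("1", "foo");  // written with the two-field schema
    System.out.println(read(old, 2));    // prints [1, foo]
    old.rewind();
    System.out.println(read(old, 4));    // prints [1, foo, null, null]
  }
}
```

This only works when new fields are appended at the end; inserting a field in the middle would shift every later field.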
Protobuf (HBASE-11161)
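class PBKeyValue extends PBType<CellProtos.KeyValue> {

  @Override
  public int encode(PositionedByteRange dst, CellProtos.KeyValue val) {
    CodedOutputStream os = outputStreamFromByteRange(dst);
    try {
      int before = os.spaceLeft(), after, written;
      val.writeTo(os);  // declared to throw IOException
      after = os.spaceLeft();
      written = before - after;
      // advance the range by the number of bytes protobuf wrote
      dst.setPosition(dst.getPosition() + written);
      return written;
    } catch (IOException e) {
      throw new RuntimeException("Error while encoding type.", e);
    }
  }
  // remaining DataType methods not shown
}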
More Examples: https://gist.github.com/ndimiduk/bcf33f09cc7e4408f684
Thanks!
HBase in Action (Manning)
Nick Dimiduk, Amandeep Khurana
Foreword by Michael Stack
hbaseinaction.com
Nick Dimiduk github.com/ndimiduk
@xefyr
n10k.com
http://s.apache.org/bGN