How do I specify a row key in the HBase shell that has a tab in it?


6

In our infinite wisdom, we decided our rows would be keyed with a tab in the middle:

item_id <tab> location

For example:

000001  http://www.url.com/page

Using the HBase shell, we cannot perform a get command because the tab character is not entered properly on the input line. We tried

get 'tableName', '000001\thttp://www.url.com/page'

without success. What should we do?

2012-04-04 21:59
by whiterook6


12

I had the same issue for binary values: \x00. This was my separator.

For the shell to accept your binary values, you need to provide them in double quotes (") instead of single quotes (').

put 'MyTable', "MyKey", 'Family:Qualifier', "\x00\x00\x00\x00\x00\x00\x00\x06Hello from shell"

Check how your tab is being encoded. My best bet would be that it is UTF-8 encoded, in which case it is the single byte \x09 from the ASCII table, so the key would be "000001\x09http://www.url.com/page".
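
To avoid hand-typing escapes, the double-quoted form can be generated from the raw key. This is only a sketch in Python: `hbase_shell_literal` is a made-up helper name, and it assumes the shell reads double-quoted strings the way Ruby does, so it also hex-escapes `"`, `\` and `#`.

```python
def hbase_shell_literal(key: bytes) -> str:
    """Render raw key bytes as a double-quoted string for the HBase shell."""
    parts = []
    for b in key:
        ch = chr(b)
        # Keep plain printable ASCII as-is; hex-escape everything else, plus
        # the characters that are special inside a Ruby double-quoted string.
        if 0x20 <= b <= 0x7E and ch not in '"\\#':
            parts.append(ch)
        else:
            parts.append("\\x%02X" % b)
    return '"' + "".join(parts) + '"'


row_key = "000001\thttp://www.url.com/page".encode("utf-8")
print("get 'tableName', " + hbase_shell_literal(row_key))
```

For the key in the question this prints a get command with the tab written as \x09. Bytes at 0x80 and above are escaped the same way, but see the comment below about those not surviving in the shell.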

On a side note, you should use a null character as your separator; it will help you when scanning.
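
Why the null byte helps: HBase orders rows by comparing the key bytes, and \x00 is the smallest possible byte, so all rows for one item sit together and can be bounded by a simple start/stop pair. A small Python sketch of that ordering (the `scan_range` helper and the sample keys are made up for illustration; the two values it returns are what you would pass as the scan's start and stop rows):

```python
SEP = b"\x00"  # null separator, as suggested above


def scan_range(item_id: bytes):
    """Start row (inclusive) and stop row (exclusive) covering one item."""
    return item_id + SEP, item_id + b"\x01"


# Python compares bytes lexicographically, like HBase compares row keys.
keys = sorted([
    b"000001" + SEP + b"http://www.url.com/page",
    b"000001" + SEP + b"http://www.url.com/other",
    b"0000010" + SEP + b"http://www.url.com/page",
    b"000002" + SEP + b"http://www.url.com/page",
])

start, stop = scan_range(b"000001")
matches = [k for k in keys if start <= k < stop]
print(matches)
```

Only the two 000001 rows fall inside the range. The 0000010 row is excluded even though it shares the same leading digits, because its seventh byte ("0") sorts after \x01.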

2012-07-26 00:17
by Pierre-Luc Bertrand
This seems to work, especially since it addresses the general question of escaping characters in the HBase shell. What we wound up doing, instead, was md5ing our keys and using that, which not only provides very boring keys (all hex characters) but also spreads them across our table to hit all the regions nicely - whiterook6 2012-08-01 17:27
'\xC0' does not work and is converted to \xEF\xBF\xBD, which I found is the UTF-8 encoding of U+FFFD, the replacement character used to flag a decoding error. The same happens for any other byte in the range 80 to FF - Constantino Cronemberger 2018-10-22 16:23
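
The \xEF\xBF\xBD pattern from the last comment is easy to reproduce outside HBase. The Python sketch below only shows where those three bytes come from when a lone high byte is treated as UTF-8 text. That the shell does something equivalent internally is an assumption, not something confirmed here.

```python
raw = b"\xC0"  # a lone byte that is not valid UTF-8 on its own

# errors="replace" substitutes U+FFFD for anything that cannot be decoded.
text = raw.decode("utf-8", errors="replace")
round_tripped = text.encode("utf-8")

print(repr(text))
print(round_tripped)

# Every single byte from 0x80 to 0xFF is invalid UTF-8 when it stands alone.
lone_high_bytes_replaced = all(
    bytes([b]).decode("utf-8", errors="replace") == "\ufffd"
    for b in range(0x80, 0x100)
)
print(lone_high_bytes_replaced)
```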


0

Hope you can change the tab character. :) Yeah, that's a bad idea, since MapReduce jobs use the tab as a delimiter, and it's generally a bad idea to use a tab or space as a delimiter.

You could use a double colon (::) as a delimiter. But wait, what if the URL itself contains a double colon? Well, URL-encode the URL when you store it in HBase - that way, you have a standard delimiter, and the URL part of the key will not conflict with the delimiter.

In Python:

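# Python 3; in Python 2 this was urllib.quote_plus
from urllib.parse import quote_plus

DELIMITER = "::"
item_id = "000001"                    # sample values from the question
location = "http://www.url.com/page"

# Percent-encode the URL so it cannot contain the "::" delimiter
urlkey = quote_plus(location)

rowkey = item_id + DELIMITER + urlkey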
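
Going the other way, a key built like this can be split back into its parts. A rough Python 3 sketch, assuming the item id never contains "::" itself; `make_rowkey` and `parse_rowkey` are names made up for this example:

```python
from urllib.parse import quote_plus, unquote_plus

DELIMITER = "::"


def make_rowkey(item_id: str, location: str) -> str:
    # quote_plus percent-encodes ":" so the URL part cannot contain "::"
    return item_id + DELIMITER + quote_plus(location)


def parse_rowkey(rowkey: str):
    # Split on the first delimiter only; the rest is the encoded URL.
    item_id, urlkey = rowkey.split(DELIMITER, 1)
    return item_id, unquote_plus(urlkey)


rowkey = make_rowkey("000001", "http://www.url.com/a::b?q=x y")
print(rowkey)
print(parse_rowkey(rowkey))
```
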
2012-07-27 15:47
by Suman
My team uses tab as a delimiter all the time in HBase, for data we know can't have tabs in it. No problems with Map Reduce or anything else - Tony 2017-07-24 21:37