YCSB run against HBase 0.92 on Amazon Elastic MapReduce
September 16, 2012 by Krystian Nowak
In this post we show how, in a few simple steps, you can use the Yahoo! Cloud Serving Benchmark (https://github.com/dataminelab/YCSB) to run benchmarks against an HBase 0.92 cluster deployed automatically by Amazon Elastic MapReduce, and what measurements and comparisons you can obtain when choosing among the available instance types.
We will create EMR HBase clusters using the tooling provided by Amazon:
http://elasticmapreduce.s3.amazonaws.com/elastic-mapreduce-ruby.zip
Note: As you can see in commands.rb, default_hadoop_version is set to 0.20(.x), but in our tests Hadoop 1.0.3 gave a significant performance gain. Therefore, when creating the EMR cluster, we explicitly set this version.
Let’s create one:
elastic-mapreduce --create \
    --hbase \
    --name "EMR HBase YCSB" \
    --num-instances 2 \
    --instance-type m1.large \
    --hadoop-version 1.0.3
Created job flow j-1PP3JU6UJ0HQ1
elastic-mapreduce --list --active
j-1PP3JU6UJ0HQ1     WAITING     ec2-23-22-19-48.compute-1.amazonaws.com     EMR HBase YCSB
   COMPLETED     Start HBase
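The master host name from the listing is needed for every later scp and ssh call, so it can be handy to pull it out programmatically. The sketch below is only an illustration: master_dns is a hypothetical helper name, and it assumes the listing keeps the column order shown above (job flow id, state, master DNS name, cluster name).

```shell
# Hypothetical helper (not part of the EMR tooling): print the master public
# DNS name for a given job flow id. It reads `elastic-mapreduce --list --active`
# style output on stdin and assumes the column order:
#   job flow id, state, master DNS name, cluster name.
master_dns() {
    awk -v id="$1" '$1 == id { print $3 }'
}

# Usage, assuming elastic-mapreduce is on the PATH:
#   MASTER=$(elastic-mapreduce --list --active | master_dns j-1PP3JU6UJ0HQ1)
#   ssh -i ~/.ssh/dataminelab-ec2.pem "hadoop@$MASTER"
```

The step lines (such as "COMPLETED Start HBase") are skipped because their first column never matches a job flow id.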
Build the project (the HBase master server variables should now default to localhost (127.0.0.1)).
git clone git@github.com:dataminelab/YCSB.git
cd YCSB
export MAVEN_OPTS="-Xmx512m -Xms128m -Xss2m"
(see http://jira.codehaus.org/browse/MASSEMBLY-549 for the reason)
mvn clean install -Dcheckstyle.skip=true
cd distribution/target
scp -i ~/.ssh/dataminelab-ec2.pem ycsb-0.1.5-SNAPSHOT.tar.gz \
    hadoop@ec2-23-22-19-48.compute-1.amazonaws.com:/home/hadoop/ycsb.tar.gz
ssh -i ~/.ssh/dataminelab-ec2.pem \
    hadoop@ec2-23-22-19-48.compute-1.amazonaws.com
tar xvzf ycsb.tar.gz
ln -s ycsb-0.1.5-SNAPSHOT ycsb
cd ycsb
Create the working table in HBase (already pre-split):
hbase org.apache.hadoop.hbase.util.RegionSplitter usertable -c 200 -f family
The split is hard to get perfect because https://issues.apache.org/jira/browse/HBASE-4163 is still not in place. Please vote! :)
But it still seems to be better than no split at all!
You might spot:
12/08/25 13:39:16 ERROR metrics.MetricsSaver: Failed SaveRecords hdfs:/mnt/var/lib/hadoop/metrics/raw/i-694c4712_04272_raw.bin Shutdown in progress
as in https://forums.aws.amazon.com/thread.jspa?threadID=100643 but it doesn’t seem to hurt us…
hbase shell
scan '.META.', {COLUMNS => 'info:regioninfo'}
exit
Load initial data into HBase
./bin/ycsb load hbase -p columnfamily=family -P workloads/workloada | tee load.log
Check with your own eyes that the data has been loaded into HBase:
hbase shell
hbase(main):001:0> count 'usertable'
Current count: 1000, row: user995698996184959679
1000 row(s) in 2.3210 seconds
And run the tests – only as a warm-up:
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=10000 \
    -s \
    -threads 10 | tee warm-up-tests.log
And now the real tests with 10 threads:
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 10 | tee real-tests-workload-a.log
cat real-tests-workload-a.log
[OVERALL], RunTime(ms), 47132.0
[OVERALL], Throughput(ops/sec), 2121.700755325469
[UPDATE], Operations, 50209
[UPDATE], AverageLatency(us), 186.93305980999423
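Since every run below produces the same kind of summary, a small filter can pull a single figure out of a log. The following is a minimal sketch, assuming the comma-separated "[SECTION], Metric, value" layout shown above with one record per line; ycsb_metric and expected_throughput are hypothetical helper names, not part of YCSB. The second helper only cross-checks the reported throughput, on the assumption that it equals the executed operations divided by the run time, which agrees with the numbers of this run (100000 operations in 47132 ms).

```shell
# Hypothetical helper: print one summary value from YCSB output read on stdin.
# $1 = section name without brackets (e.g. OVERALL)
# $2 = metric name (e.g. "Throughput(ops/sec)")
ycsb_metric() {
    awk -F', ' -v sec="[$1]" -v name="$2" '$1 == sec && $2 == name { print $3 }'
}

# Hypothetical helper: throughput in ops/sec from an operation count and a
# run time in milliseconds, rounded to two decimals.
expected_throughput() {
    awk -v ops="$1" -v ms="$2" 'BEGIN { printf "%.2f\n", ops * 1000 / ms }'
}

# Usage:
#   ycsb_metric OVERALL 'Throughput(ops/sec)' < real-tests-workload-a.log
#   expected_throughput 100000 47132    # prints 2121.70
```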
And also 10 threads, but for another workload type.
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloadf \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 10 | tee real-tests-workload-f.log
cat real-tests-workload-f.log
[OVERALL], RunTime(ms), 52748.0
[OVERALL], Throughput(ops/sec), 1895.8064760749223
[UPDATE], Operations, 50018
[UPDATE], AverageLatency(us), 11.925006997480907
Now we can check how these workload scenarios behave as the number of threads increases.
Starting with 100 threads.
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 100 | tee real-tests-workload-a-100t.log
cat real-tests-workload-a-100t.log
[OVERALL], RunTime(ms), 24234.0
[OVERALL], Throughput(ops/sec), 4126.433935792688
[UPDATE], Operations, 50063
[UPDATE], AverageLatency(us), 1076.5547010766434
500 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 500 | tee real-tests-workload-a-500t.log
cat real-tests-workload-a-500t.log
[OVERALL], RunTime(ms), 20706.0
[OVERALL], Throughput(ops/sec), 4829.518014102193
[UPDATE], Operations, 50099
[UPDATE], AverageLatency(us), 6167.192359128925
1000 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 1000 | tee real-tests-workload-a-1kt.log
cat real-tests-workload-a-1kt.log
[OVERALL], RunTime(ms), 21484.0
[OVERALL], Throughput(ops/sec), 4654.626698938745
[UPDATE], Operations, 49988
[UPDATE], AverageLatency(us), 9423.208390013604
2000 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 2000 | tee real-tests-workload-a-2kt.log
cat real-tests-workload-a-2kt.log
[OVERALL], RunTime(ms), 24358.0
[OVERALL], Throughput(ops/sec), 4105.427374989737
[UPDATE], Operations, 49957
[UPDATE], AverageLatency(us), 7786.985767760274
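The runs above differ only in the thread count and the log file name, so the sweep lends itself to a loop. Here is one possible sketch: sweep_commands is a hypothetical helper that only prints the command lines so they can be reviewed before anything is executed. Note that it names the logs after the plain thread count (for example real-tests-workload-a-1000t.log), which differs slightly from the 1kt/2kt names used in this post.

```shell
# Hypothetical helper: print (not run) one ycsb command line per thread count.
# $1 = workload letter (a or f); remaining arguments = thread counts.
sweep_commands() {
    workload="$1"
    shift
    for t in "$@"; do
        echo "./bin/ycsb run hbase -P workloads/workload${workload} -p columnfamily=family -p operationcount=100000 -s -threads ${t} | tee real-tests-workload-${workload}-${t}t.log"
    done
}

# Review the generated commands first, then pipe them to a shell, for example:
#   sweep_commands a 100 500 1000 2000
#   sweep_commands a 100 500 1000 2000 | sh
```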
And the same for the other workload scenario now:
100 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloadf \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 100 | tee real-tests-workload-f-100t.log
cat real-tests-workload-f-100t.log
[OVERALL], RunTime(ms), 33924.0
[OVERALL], Throughput(ops/sec), 2947.7655936799906
[UPDATE], Operations, 50136
[UPDATE], AverageLatency(us), 17.44125977341631
1000 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloadf \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 1000 | tee real-tests-workload-f-1kt.log
cat real-tests-workload-f-1kt.log
[OVERALL], RunTime(ms), 29309.0
[OVERALL], Throughput(ops/sec), 3411.921252857484
[UPDATE], Operations, 50127
[UPDATE], AverageLatency(us), 16.611586570111914
2000 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloadf \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 2000 | tee real-tests-workload-f-2kt.log
cat real-tests-workload-f-2kt.log
[OVERALL], RunTime(ms), 29311.0
[OVERALL], Throughput(ops/sec), 3411.688444611238
[UPDATE], Operations, 49951
[UPDATE], AverageLatency(us), 59.80148545574663
3000 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloadf \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 3000 | tee real-tests-workload-f-3kt.log
cat real-tests-workload-f-3kt.log
[OVERALL], RunTime(ms), 32314.0
[OVERALL], Throughput(ops/sec), 3063.6875657609703
[UPDATE], Operations, 49492
[UPDATE], AverageLatency(us), 20.00127293299927
4000 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloadf \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 4000 | tee real-tests-workload-f-4kt.log
cat real-tests-workload-f-4kt.log
[OVERALL], RunTime(ms), 35051.0
[OVERALL], Throughput(ops/sec), 2852.985649482183
[UPDATE], Operations, 50095
[UPDATE], AverageLatency(us), 38.50611837508733
Let’s now try more instances: 4 slaves instead of just one, of the same type as before.
elastic-mapreduce --create \
    --hbase \
    --name "EMR HBase YCSB" \
    --num-instances 5 \
    --instance-type m1.large \
    --hadoop-version 1.0.3
Created job flow j-OE7G6YUHMD2I
elastic-mapreduce --list --active
j-OE7G6YUHMD2I     WAITING     ec2-50-17-100-242.compute-1.amazonaws.com     EMR HBase YCSB
   COMPLETED     Start HBase
Now just copy the already built test suite:
scp -i ~/.ssh/dataminelab-ec2.pem ycsb-0.1.5-SNAPSHOT.tar.gz \
    hadoop@ec2-50-17-100-242.compute-1.amazonaws.com:/home/hadoop/ycsb.tar.gz
ssh -i ~/.ssh/dataminelab-ec2.pem \
    hadoop@ec2-50-17-100-242.compute-1.amazonaws.com
tar xvzf ycsb.tar.gz
ln -s ycsb-0.1.5-SNAPSHOT ycsb
cd ycsb
Initialize table:
hbase org.apache.hadoop.hbase.util.RegionSplitter usertable -c 200 -f family
Load initial data:
./bin/ycsb load hbase \
    -p columnfamily=family \
    -P workloads/workloada | tee load.log
And run tests:
warm-up
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=10000 \
    -s \
    -threads 10 | tee warm-up-tests.log
10 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 10 | tee real-tests-workload-a.log
cat real-tests-workload-a.log
[OVERALL], RunTime(ms), 42609.0
[OVERALL], Throughput(ops/sec), 2346.9220117815485
[UPDATE], Operations, 50073
[UPDATE], AverageLatency(us), 117.53685618996265
100 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 100 | tee real-tests-workload-a-100t.log
cat real-tests-workload-a-100t.log
[OVERALL], RunTime(ms), 23500.0
[OVERALL], Throughput(ops/sec), 4255.31914893617
[UPDATE], Operations, 49837
[UPDATE], AverageLatency(us), 1089.7759295302687
500 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 500 | tee real-tests-workload-a-500t.log
cat real-tests-workload-a-500t.log
[OVERALL], RunTime(ms), 19763.0
[OVERALL], Throughput(ops/sec), 5059.960532307848
[UPDATE], Operations, 50196
[UPDATE], AverageLatency(us), 4854.259104311101
1000 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 1000 | tee real-tests-workload-a-1kt.log
cat real-tests-workload-a-1kt.log
[OVERALL], RunTime(ms), 20028.0
[OVERALL], Throughput(ops/sec), 4993.0097862991815
[UPDATE], Operations, 49904
[UPDATE], AverageLatency(us), 9582.977617024688
2000 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 2000 | tee real-tests-workload-a-2kt.log
cat real-tests-workload-a-2kt.log
[OVERALL], RunTime(ms), 22608.0
[OVERALL], Throughput(ops/sec), 4423.2130219391365
[UPDATE], Operations, 49988
[UPDATE], AverageLatency(us), 6244.29357045691
5000 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 5000 | tee real-tests-workload-a-5kt.log
cat real-tests-workload-a-5kt.log
[OVERALL], RunTime(ms), 24861.0
[OVERALL], Throughput(ops/sec), 4022.3643457624394
[UPDATE], Operations, 50100
[UPDATE], AverageLatency(us), 8150.377125748503
10k threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 10000 | tee real-tests-workload-a-10kt.log
cat real-tests-workload-a-10kt.log
[OVERALL], RunTime(ms), 25336.0
[OVERALL], Throughput(ops/sec), 3946.9529523208084
[UPDATE], Operations, 50176
[UPDATE], AverageLatency(us), 8851.578204719388
workload f, 10 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloadf \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 10 | tee real-tests-workload-f.log
cat real-tests-workload-f.log
[OVERALL], RunTime(ms), 53310.0
[OVERALL], Throughput(ops/sec), 1875.8206715438005
[UPDATE], Operations, 49867
[UPDATE], AverageLatency(us), 12.18058034371428
100 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloadf \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 100 | tee real-tests-workload-f-100t.log
cat real-tests-workload-f-100t.log
[OVERALL], RunTime(ms), 30991.0
[OVERALL], Throughput(ops/sec), 3226.7432480397533
[UPDATE], Operations, 50145
[UPDATE], AverageLatency(us), 13.73040183467943
1k threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloadf \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 1000 | tee real-tests-workload-f-1kt.log
cat real-tests-workload-f-1kt.log
[OVERALL], RunTime(ms), 29185.0
[OVERALL], Throughput(ops/sec), 3426.4176803152304
[UPDATE], Operations, 50047
[UPDATE], AverageLatency(us), 29.82979998801127
2k threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloadf \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 2000 | tee real-tests-workload-f-2kt.log
cat real-tests-workload-f-2kt.log
[OVERALL], RunTime(ms), 31906.0
[OVERALL], Throughput(ops/sec), 3134.206732276061
[UPDATE], Operations, 50111
[UPDATE], AverageLatency(us), 24.55253337590549
3k threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloadf \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 3000 | tee real-tests-workload-f-3kt.log
cat real-tests-workload-f-3kt.log
[OVERALL], RunTime(ms), 34410.0
[OVERALL], Throughput(ops/sec), 2877.070619006103
[UPDATE], Operations, 49607
[UPDATE], AverageLatency(us), 23.37424153849255
Now let’s see how the more powerful instance types offered by AWS behave in this scenario!
m1.xlarge (2 x more memory, 2 x more CPU than m1.large)
elastic-mapreduce --create \
    --hbase \
    --name "EMR HBase YCSB" \
    --num-instances 5 \
    --instance-type m1.xlarge \
    --hadoop-version 1.0.3
Created job flow j-2ICBS9029MJAV
./elastic-mapreduce --list --active
j-2ICBS9029MJAV     WAITING     ec2-107-21-130-111.compute-1.amazonaws.com     EMR HBase YCSB
   COMPLETED     Start HBase
scp -i ~/.ssh/dataminelab-ec2.pem ycsb-0.1.5-SNAPSHOT.tar.gz \
    hadoop@ec2-107-21-130-111.compute-1.amazonaws.com:/home/hadoop/ycsb.tar.gz
ssh -i ~/.ssh/dataminelab-ec2.pem \
    hadoop@ec2-107-21-130-111.compute-1.amazonaws.com
tar xvzf ycsb.tar.gz
ln -s ycsb-0.1.5-SNAPSHOT ycsb
cd ycsb
hbase org.apache.hadoop.hbase.util.RegionSplitter usertable -c 200 -f family
./bin/ycsb load hbase \
    -p columnfamily=family \
    -P workloads/workloada | tee load.log
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=10000 \
    -s \
    -threads 10 | tee warm-up-tests.log
10 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 10 | tee real-tests-workload-a.log
cat real-tests-workload-a.log
[OVERALL], RunTime(ms), 39481.0
[OVERALL], Throughput(ops/sec), 2532.8639092221574
[UPDATE], Operations, 49981
[UPDATE], AverageLatency(us), 62.85440467377604
100 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 100 | tee real-tests-workload-a-100t.log
cat real-tests-workload-a-100t.log
[OVERALL], RunTime(ms), 17877.0
[OVERALL], Throughput(ops/sec), 5593.779716954747
[UPDATE], Operations, 50100
[UPDATE], AverageLatency(us), 640.4568662674651
1k threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 1000 | tee real-tests-workload-a-1kt.log
cat real-tests-workload-a-1kt.log
[OVERALL], RunTime(ms), 13986.0
[OVERALL], Throughput(ops/sec), 7150.00715000715
[UPDATE], Operations, 49750
[UPDATE], AverageLatency(us), 8759.566291457286
2k threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 2000 | tee real-tests-workload-a-2kt.log
cat real-tests-workload-a-2kt.log
[OVERALL], RunTime(ms), 14783.0
[OVERALL], Throughput(ops/sec), 6764.526821348847
[UPDATE], Operations, 50118
[UPDATE], AverageLatency(us), 26718.534857735744
3k threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 3000 | tee real-tests-workload-a-3kt.log
cat real-tests-workload-a-3kt.log
[OVERALL], RunTime(ms), 15477.0
[OVERALL], Throughput(ops/sec), 6396.588486140725
[UPDATE], Operations, 49465
[UPDATE], AverageLatency(us), 12066.01403012231
4k threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 4000 | tee real-tests-workload-a-4kt.log
cat real-tests-workload-a-4kt.log
[OVERALL], RunTime(ms), 15261.0
[OVERALL], Throughput(ops/sec), 6552.650547146321
[UPDATE], Operations, 49883
[UPDATE], AverageLatency(us), 22551.664294449012
another workload, 10 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloadf \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 10 | tee real-tests-workload-f.log
cat real-tests-workload-f.log
[OVERALL], RunTime(ms), 45751.0
[OVERALL], Throughput(ops/sec), 2185.744573889095
[UPDATE], Operations, 49950
[UPDATE], AverageLatency(us), 9.801721721721721
500 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloadf \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 500 | tee real-tests-workload-f-500t.log
cat real-tests-workload-f-500t.log
[OVERALL], RunTime(ms), 21870.0
[OVERALL], Throughput(ops/sec), 4572.473708276178
[UPDATE], Operations, 49678
[UPDATE], AverageLatency(us), 11.18187125085551
1k threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloadf \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 1000 | tee real-tests-workload-f-1kt.log
cat real-tests-workload-f-1kt.log
[OVERALL], RunTime(ms), 19207.0
[OVERALL], Throughput(ops/sec), 5206.435153850159
[UPDATE], Operations, 49879
[UPDATE], AverageLatency(us), 11.812406022574631
2k threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloadf \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 2000 | tee real-tests-workload-f-2kt.log
cat real-tests-workload-f-2kt.log
[OVERALL], RunTime(ms), 20493.0
[OVERALL], Throughput(ops/sec), 4879.715024642561
[UPDATE], Operations, 50114
[UPDATE], AverageLatency(us), 12.770423434569182
And now, more CPU power!
c1.xlarge (same memory, 5 x more CPU than m1.large)
elastic-mapreduce --create \
    --hbase \
    --name "EMR HBase YCSB" \
    --num-instances 5 \
    --instance-type c1.xlarge \
    --hadoop-version 1.0.3
Created job flow j-3KZHQRG2D74AY
./elastic-mapreduce --list --active
j-3KZHQRG2D74AY     WAITING     ec2-75-101-255-226.compute-1.amazonaws.com     EMR HBase YCSB
   COMPLETED     Start HBase
scp -i ~/.ssh/dataminelab-ec2.pem ycsb-0.1.5-SNAPSHOT.tar.gz \
    hadoop@ec2-75-101-255-226.compute-1.amazonaws.com:/home/hadoop/ycsb.tar.gz
ssh -i ~/.ssh/dataminelab-ec2.pem \
    hadoop@ec2-75-101-255-226.compute-1.amazonaws.com
tar xvzf ycsb.tar.gz
ln -s ycsb-0.1.5-SNAPSHOT ycsb
cd ycsb
hbase org.apache.hadoop.hbase.util.RegionSplitter usertable -c 200 -f family
./bin/ycsb load hbase \
    -p columnfamily=family \
    -P workloads/workloada | tee load.log
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=10000 \
    -s \
    -threads 10 | tee warm-up-tests.log
10 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 10 | tee real-tests-workload-a.log
cat real-tests-workload-a.log
[OVERALL], RunTime(ms), 32121.0
[OVERALL], Throughput(ops/sec), 3113.228106223343
[UPDATE], Operations, 49973
[UPDATE], AverageLatency(us), 71.10029415884577
100 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 100 | tee real-tests-workload-a-100t.log
cat real-tests-workload-a-100t.log
[OVERALL], RunTime(ms), 15076.0
[OVERALL], Throughput(ops/sec), 6633.059166887769
[UPDATE], Operations, 50167
[UPDATE], AverageLatency(us), 644.8327187194769
1k threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 1000 | tee real-tests-workload-a-1kt.log
cat real-tests-workload-a-1kt.log
[OVERALL], RunTime(ms), 12864.0
[OVERALL], Throughput(ops/sec), 7773.63184079602
[UPDATE], Operations, 50240
[UPDATE], AverageLatency(us), 9889.390306528663
2k threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 2000 | tee real-tests-workload-a-2kt.log
cat real-tests-workload-a-2kt.log
[OVERALL], RunTime(ms), 14889.0
[OVERALL], Throughput(ops/sec), 6716.367788300087
[UPDATE], Operations, 50216
[UPDATE], AverageLatency(us), 41222.41986617811
3k threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 3000 | tee real-tests-workload-a-3kt.log
cat real-tests-workload-a-3kt.log
[OVERALL], RunTime(ms), 14461.0
[OVERALL], Throughput(ops/sec), 6845.9995850909345
[UPDATE], Operations, 49451
[UPDATE], AverageLatency(us), 51852.53568178601
5k threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 5000 | tee real-tests-workload-a-5kt.log
cat real-tests-workload-a-5kt.log
[OVERALL], RunTime(ms), 17072.0
[OVERALL], Throughput(ops/sec), 5857.544517338331
[UPDATE], Operations, 49835
[UPDATE], AverageLatency(us), 82378.54861041436
10k threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloada \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 10000 | tee real-tests-workload-a-10kt.log
cat real-tests-workload-a-10kt.log
[OVERALL], RunTime(ms), 20226.0
[OVERALL], Throughput(ops/sec), 4944.131316127757
[UPDATE], Operations, 50113
[UPDATE], AverageLatency(us), 49147.25219005049
another workload, 10 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloadf \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 10 | tee real-tests-workload-f.log
cat real-tests-workload-f.log
[OVERALL], RunTime(ms), 40801.0
[OVERALL], Throughput(ops/sec), 2450.920320580378
[UPDATE], Operations, 49966
[UPDATE], AverageLatency(us), 12.13715326421967
400 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloadf \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 400 | tee real-tests-workload-f-400t.log
cat real-tests-workload-f-400t.log
[OVERALL], RunTime(ms), 17856.0
[OVERALL], Throughput(ops/sec), 5600.358422939068
[UPDATE], Operations, 50071
[UPDATE], AverageLatency(us), 14.301591739729584
500 threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloadf \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 500 | tee real-tests-workload-f-500t.log
cat real-tests-workload-f-500t.log
[OVERALL], RunTime(ms), 17909.0
[OVERALL], Throughput(ops/sec), 5583.784689262382
[UPDATE], Operations, 50210
[UPDATE], AverageLatency(us), 16.105915156343357
1k threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloadf \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 1000 | tee real-tests-workload-f-1kt.log
cat real-tests-workload-f-1kt.log
[OVERALL], RunTime(ms), 16982.0
[OVERALL], Throughput(ops/sec), 5888.5879166175955
[UPDATE], Operations, 50088
[UPDATE], AverageLatency(us), 15.313268647180962
2k threads
./bin/ycsb run hbase \
    -p columnfamily=family \
    -P workloads/workloadf \
    -p columnfamily=family \
    -p operationcount=100000 \
    -s \
    -threads 2000 | tee real-tests-workload-f-2kt.log
cat real-tests-workload-f-2kt.log
[OVERALL], RunTime(ms), 17219.0
[OVERALL], Throughput(ops/sec), 5807.538184563564
[UPDATE], Operations, 49989
[UPDATE], AverageLatency(us), 17.61469523295125
Even after running these simple scenarios, we can see how, for a given configuration, the number of threads influences the throughput for each workload type.
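To make the comparison concrete, the workload A figures reported in the runs above can be condensed into a peak-throughput summary per cluster. The sketch below is illustrative only: the throughput values are copied from the logs in this post and rounded to two decimals, while the configuration labels (instance count and type, such as 5xc1.xlarge) and the helper names results_a and peak_per_config are made up for this example.

```shell
# Workload A results copied from the runs above:
# configuration label, threads, throughput (ops/sec, rounded to two decimals).
results_a() {
cat <<'EOF'
2xm1.large 10 2121.70
2xm1.large 100 4126.43
2xm1.large 500 4829.52
2xm1.large 1000 4654.63
2xm1.large 2000 4105.43
5xm1.large 10 2346.92
5xm1.large 100 4255.32
5xm1.large 500 5059.96
5xm1.large 1000 4993.01
5xm1.large 2000 4423.21
5xm1.large 5000 4022.36
5xm1.large 10000 3946.95
5xm1.xlarge 10 2532.86
5xm1.xlarge 100 5593.78
5xm1.xlarge 1000 7150.01
5xm1.xlarge 2000 6764.53
5xm1.xlarge 3000 6396.59
5xm1.xlarge 4000 6552.65
5xc1.xlarge 10 3113.23
5xc1.xlarge 100 6633.06
5xc1.xlarge 1000 7773.63
5xc1.xlarge 2000 6716.37
5xc1.xlarge 3000 6846.00
5xc1.xlarge 5000 5857.54
5xc1.xlarge 10000 4944.13
EOF
}

# For each configuration, print the thread count that gave the highest
# throughput, keeping the configurations in their order of first appearance.
peak_per_config() {
    awk '
        !($1 in best) || ($3 + 0) > best[$1] { best[$1] = $3 + 0; threads[$1] = $2 }
        !seen[$1]++ { order[++n] = $1 }
        END {
            for (i = 1; i <= n; i++)
                printf "%s %d %.2f\n", order[i], threads[order[i]], best[order[i]]
        }
    '
}

results_a | peak_per_config
```

With these figures the peak for workload A sits at 500 threads on both m1.large clusters and at 1000 threads on the m1.xlarge and c1.xlarge clusters, after which throughput falls off again.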
You can now play with other instance types and instance counts. You can also run the YCSB benchmark code from multiple nodes at once and observe possible saturation, either of the master’s CPU or of the network layer.
We also invite you to play with the code or even contribute features and improvements, so that others can benefit from them too – have fun!