Currently we do not support per record compression for SEQUENCEFILE; we do support no
compression and block compression. Per record compression is typically very slow
(since the compressor is invoked per record in the table) and not widely used.
We chose to add support for per record compression as part of our effort to use Impala
for all of our testdata loading infrastructure. We have per record compressed tables
in testdata, so even though there is no customer demand for per record compression,
we need it to migrate our data loading off of Hive.
Change-Id: I6ea98ae0d31cceff8236b4b006c3a9fc00f64131
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5302
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
(cherry picked from commit f62a76f8d00b8dbc2846deb36ee5f65031ad846e)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5322
The sequence writer test had an issue with zlib on certain cluster machines, making
this a flaky test. This has passed several times locally and in private builds. This
re-enables the test because the failures could not be produced in private builds.
Change-Id: I0aeea3a2d000e711e5a84427a7b40592e1eef75b
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5077
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
Avro and Sequence writers are only available if query option
ALLOW_UNSUPPORTED_FORMATS is set to true, prints an error otherwise.
Change-Id: I597039f7c68f708fda10f848531eb557d6910f92
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4539
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
This supports both uncompressed and block compressed formats. Row compressed formats are
not supported. The type of compression is specified using a query parameter
COMPRESSION_CODEC with values NONE, GZIP, BZIP2, and SNAPPY.
Note: this patch only has basic testing. More extensive testing will be done when this
avro writer is used in data loading.
Change-Id: Id284bd4f3a28e27e49d56b1127cdc83c736feb61
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3541
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins