Commit Graph

5 Commits

Author SHA1 Message Date
Victor Bittorf
4339133887 Adding SEQUENCEFILE compressed record format
Currently we do not support per-record compression for SEQUENCEFILE; only uncompressed
and block-compressed files are supported. Per-record compression is typically very slow
(the compressor is invoked once per record in the table) and is not widely used.
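
To illustrate the cost difference described above, here is a minimal Python sketch that
contrasts invoking a compressor once per record with invoking it once per buffered block.
It is not Impala's writer and does not reproduce the SEQUENCEFILE on-disk layout; the
function names, the 4-byte length prefix, and the use of zlib are assumptions made for
illustration only.

```python
import zlib


def compress_per_record(records):
    """Record-compressed style: one compressor invocation per record."""
    return [zlib.compress(r) for r in records]


def compress_block(records):
    """Block-compressed style: buffer all records, compress once."""
    # Hypothetical framing: a 4-byte big-endian length before each record,
    # so the block can be split back into records after decompression.
    buf = b"".join(len(r).to_bytes(4, "big") + r for r in records)
    return zlib.compress(buf)


def decompress_block(blob):
    """Inverse of compress_block: recover the original list of records."""
    data = zlib.decompress(blob)
    out, pos = [], 0
    while pos < len(data):
        n = int.from_bytes(data[pos:pos + 4], "big")
        pos += 4
        out.append(data[pos:pos + n])
        pos += n
    return out


records = [b"row-%d,some repeated payload" % i for i in range(1000)]
per_record = compress_per_record(records)  # 1000 compressor calls
block = compress_block(records)            # 1 compressor call
```

With short, similar records like these, the per-record variant pays the compressor's
fixed overhead on every row and cannot exploit redundancy across rows, which is one
reason the format is slow and rarely chosen.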

We chose to add support for per-record compression as part of our effort to use Impala
for all of our testdata loading infrastructure. We have per-record compressed tables
in testdata, so even though there is no customer demand for per-record compression,
we need it to migrate our data loading off of Hive.

Change-Id: I6ea98ae0d31cceff8236b4b006c3a9fc00f64131
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5302
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
(cherry picked from commit f62a76f8d00b8dbc2846deb36ee5f65031ad846e)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5322
2014-11-19 17:21:36 -08:00
Victor Bittorf
3f75bd6735 Reintroduce SEQUENCEFILE writer tests
The sequence writer test had an issue with zlib on certain cluster machines, which made
it flaky. It has since passed several times locally and in private builds. This change
re-enables the test because the failures could not be reproduced in private builds.

Change-Id: I0aeea3a2d000e711e5a84427a7b40592e1eef75b
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5077
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-11-17 11:19:16 -08:00
Victor Bittorf
dbaf718221 IMPALA-1185: Make Avro and Seq writers unsupported
Avro and Sequence writers are only available if the query option
ALLOW_UNSUPPORTED_FORMATS is set to true; otherwise an error is printed.
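
The gating behavior described above can be sketched roughly as follows. This is an
illustrative Python model, not the code from this commit: only the option name
ALLOW_UNSUPPORTED_FORMATS comes from the message, while the function, the exception
class, the dictionary of query options, and the error text are hypothetical.

```python
class UnsupportedFormatError(Exception):
    """Hypothetical error raised when a gated writer is requested."""


# Formats whose writers are gated behind the query option (per the message above).
GATED_FORMATS = {"AVRO", "SEQUENCEFILE"}


def check_writer_allowed(file_format, query_options):
    """Raise unless the format is ungated or the option is explicitly true."""
    raw = query_options.get("ALLOW_UNSUPPORTED_FORMATS", "false")
    allowed = str(raw).lower() == "true"
    if file_format.upper() in GATED_FORMATS and not allowed:
        raise UnsupportedFormatError(
            "Writing to %s tables requires ALLOW_UNSUPPORTED_FORMATS=true"
            % file_format.upper()
        )


# Ungated format: passes without the option.
check_writer_allowed("PARQUET", {})
# Gated format with the option enabled: passes.
check_writer_allowed("AVRO", {"ALLOW_UNSUPPORTED_FORMATS": "true"})
```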

Change-Id: I597039f7c68f708fda10f848531eb557d6910f92
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4539
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-09-26 12:28:03 -07:00
Nong Li
d52a620737 Add support for writing compressed text.
Change-Id: I314b925594801ae4b5c47248d998801aa0b37270
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4205
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
2014-09-07 22:08:30 -07:00
Victor Bittorf
f2ef06bef1 SEQUENCEFILE: Add support for writing sequence files.
This supports both uncompressed and block-compressed formats. Record-compressed formats
are not supported. The type of compression is specified using the query option
COMPRESSION_CODEC, with values NONE, GZIP, BZIP2, and SNAPPY.

Note: this patch only has basic testing. More extensive testing will be done when this
sequence file writer is used in data loading.

Change-Id: Id284bd4f3a28e27e49d56b1127cdc83c736feb61
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3541
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-08-17 12:45:10 -07:00