Add gzip and zlib support for FixedLengthRecordReader (#8901)
* Switch from InputBuffer to BufferedInputStream for FixedLengthRecordReader

  This commit switches FixedLengthRecordReader from InputBuffer to BufferedInputStream, so that a compressed file can be used as input to FixedLengthRecordReader.

  Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

* Remove the need to find the file size beforehand for FixedLengthRecordReader

  This commit updates the implementation of FixedLengthRecordReader so that it no longer tries to find the file size beforehand. The purpose is to allow FixedLengthRecordReader to take a non-seekable file as input. For a non-seekable file (e.g., a compressed file), it may not be possible to find the file size (i.e., the uncompressed length) in advance.

  This commit is part of the effort to support gzip/zlib compression.

  Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

* Add gzip and zlib support for FixedLengthRecordReader

  This fix adds gzip and zlib support for FixedLengthRecordReader, as discussed in 8856. When FixedLengthRecordReader is used, it checks the encoding flag and uses ZlibInputStream as needed. The usage of InputBuffer in FixedLengthRecordReader has also been changed to BufferedInputStream to match ZlibInputStream.

  This fix fixes 8856.

  Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

* Add additional test cases where num_records=0 and hop_bytes>record_bytes

  This commit adds test cases where num_records=0 and where hop_bytes>record_bytes. It also updates the API golden files so that the API compatibility test passes.

  Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
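
The idea behind reading fixed-length records without knowing the file size can be sketched in plain Python. This is only an illustration using the standard `gzip` module, not the TensorFlow implementation: the helper name `read_fixed_length_records` is hypothetical, and the handling of `hop_bytes` (0 meaning "same as `record_bytes`", otherwise the distance between the starts of consecutive records) is an assumption made for this sketch.

```python
import gzip
import io


def read_fixed_length_records(stream, record_bytes, header_bytes=0, hop_bytes=0):
    """Yield fixed-length records from a forward-only binary stream.

    Hypothetical helper for illustration; it never calls seek() or asks
    for the stream size, so it also works on a decompressing stream.
    """
    # Assumption for this sketch: hop_bytes == 0 means records are contiguous.
    hop = hop_bytes or record_bytes
    if hop < record_bytes:
        raise ValueError("this sketch only handles hop_bytes == 0 or >= record_bytes")

    stream.read(header_bytes)  # skip the header by reading past it
    while True:
        record = stream.read(record_bytes)
        if len(record) < record_bytes:
            return  # end of stream; a trailing partial record is dropped
        yield record
        stream.read(hop - record_bytes)  # skip any gap before the next record


# Usage: a 2-byte header followed by three 4-byte records, gzip-compressed.
payload = b"HD" + b"aaaa" + b"bbbb" + b"cccc"
with gzip.open(io.BytesIO(gzip.compress(payload)), "rb") as f:
    records = list(read_fixed_length_records(f, record_bytes=4, header_bytes=2))
```

Because end of input is detected from a short read rather than from a precomputed size, the same loop covers an empty input (zero records) and gaps between records when `hop_bytes` is larger than `record_bytes`.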