Commit 25952518 authored by Joel Shor, committed by Gunhan Gulsoy

Fix dataset resampling bug introduced by a bug in datasets itself. fixes #16606 (#17896)

* Fixes github issue #16606.

The core issue is that, for certain random Tensors, the following two
snippets are not equivalent:

```
rand_0s_and_1s_ds = ...
gather_ds = rand_0s_and_1s_ds.map(lambda i: tf.gather([0, 1], i))
tup_ds = tf.data.Dataset.zip(gather_ds, rand_0s_and_1s_ds)
```

```
rand_0s_and_1s_ds = ...
tup_ds = rand_0s_and_1s_ds.map(lambda i: (tf.gather([0, 1], i), i))
```

Note that this does NOT fix the underlying issue of drawing multiple
samples from the underlying distribution.
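The difference between the two snippets can be sketched in plain Python, without TensorFlow. This is only an analogue under a stated assumption: that the random source is evaluated afresh for each consumer, so the zip form reads it twice while the single-map form reads it once. `RandomBits` and `gather` are hypothetical stand-ins for `rand_0s_and_1s_ds` and `tf.gather([0, 1], i)`; they are not part of tf.data.

```python
import random


class RandomBits:
    """Hypothetical stand-in for a dataset of random 0s and 1s.

    Assumption: every call to __iter__ starts a fresh, independent random
    stream, mimicking a random source that is re-evaluated per consumer.
    """

    def __init__(self, n, seed):
        self.n = n
        self._seeds = random.Random(seed)

    def __iter__(self):
        # Each new iterator gets its own generator with its own seed.
        rng = random.Random(self._seeds.getrandbits(64))
        return (rng.randint(0, 1) for _ in range(self.n))


def gather(i):
    # Plain-Python analogue of tf.gather([0, 1], i).
    return [0, 1][i]


bits = RandomBits(n=1000, seed=0)

# Form 1: map, then zip with the source. The source is iterated twice,
# so the two tuple components come from different random draws.
zipped = list(zip(map(gather, bits), bits))

# Form 2: build the tuple inside one map. The source is iterated once,
# so both tuple components come from the same draw.
fused = [(gather(i), i) for i in bits]

zip_mismatches = sum(a != b for a, b in zipped)
fused_mismatches = sum(a != b for a, b in fused)
print("zip form mismatches:", zip_mismatches)
print("single-map form mismatches:", fused_mismatches)
```

In this sketch the single-map form never produces a mismatched pair, while the zip form does, because its two components are drawn independently.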

Tested:
With the new test, bazel test :resample_test fails before this change
and succeeds after it.

* Undo a spurious git-induced change.

* Fix indent issue.

* Fix indent issue.
parent 00c90e67