How to batch a variable-length spectrogram in TensorFlow

I have to train a denoising autoencoder, and I need to batch 5-frame windows of the noisy power spectrum together with the corresponding single frame of the clean power spectrum. I don't know how to batch the spectrograms, because my recordings all have different lengths in time.

import tensorflow as tf  # TensorFlow 1.x (this code relies on tf.contrib)

def parse_line(noise_file, clean_file):
    # Noisy input: decode the WAV file, take the STFT, then the log power spectrum.
    noise_binary = tf.read_file(noise_file)
    noise_binary = tf.contrib.ffmpeg.decode_audio(noise_binary, file_format='wav', samples_per_second=16000, channel_count=1)
    noise_stfts = tf.contrib.signal.stft(tf.reshape(noise_binary, [1, -1]), frame_length=512, frame_step=256, fft_length=512)
    noise_powerspectrum = tf.log(tf.abs(noise_stfts)**2)
    # Slide a 5-frame window over time: [1, T, 257] becomes [T - 4, 5, 257] after squeeze.
    noise_data = tf.squeeze(tf.contrib.signal.frame(noise_powerspectrum, frame_length=5, frame_step=1, axis=1))
    # Clean target: same processing, but one frame per example.
    clean_binary = tf.read_file(clean_file)
    clean_binary = tf.contrib.ffmpeg.decode_audio(clean_binary, file_format='wav', samples_per_second=16000, channel_count=1)
    clean_stfts = tf.contrib.signal.stft(tf.reshape(clean_binary, [1, -1]), frame_length=512, frame_step=256, fft_length=512)
    clean_powerspectrum = tf.log(tf.abs(clean_stfts)**2)
    # Drop the last 4 frames so the number of targets matches the T - 4 windows.
    clean_data = tf.squeeze(clean_powerspectrum)[:-4]
    return noise_data, clean_data
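As a sanity check on the shapes produced by parse_line, the frame counts can be worked out by hand. The following is a minimal sketch in plain Python, not TensorFlow. It assumes the STFT and the 5-frame framing both drop incomplete trailing frames (pad_end=False, which I believe is the default), and the helper names num_frames and example_shapes are hypothetical.

```python
def num_frames(n_samples, frame_length, frame_step):
    """Number of complete frames when the tail is not padded."""
    if n_samples < frame_length:
        return 0
    return 1 + (n_samples - frame_length) // frame_step


def example_shapes(n_samples):
    """Shapes of (noise_data, clean_data) for one mono file of n_samples."""
    stft_frames = num_frames(n_samples, 512, 256)  # T time frames
    bins = 512 // 2 + 1                            # 257 bins for fft_length=512
    windows = num_frames(stft_frames, 5, 1)        # T - 4 sliding windows
    clean_rows = max(stft_frames - 4, 0)           # rows left after [:-4]
    return (windows, 5, bins), (clean_rows, bins)


# 114688 samples is about 7.2 s at 16 kHz
print(example_shapes(114688))  # → ((443, 5, 257), (443, 257))
print(example_shapes(72960))   # → ((280, 5, 257), (280, 257))
```

Under these assumptions, window i covers STFT frames i to i + 4, and the [:-4] slice pairs it with clean frame i, the first frame of the window. If the centre frame were the intended target (an assumption about intent, not something stated in the post), a slice of [2:-2] would select it while keeping the same number of rows.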


My tf.data pipeline is shown below:

shuffle_batch = 10
batch_size = 10
dataset = tf.data.Dataset.from_tensor_slices((noise_datalist, clean_datalist))
dataset = dataset.shuffle(shuffle_batch)  # shuffle buffer size (number of files)
dataset = dataset.map(parse_line, num_parallel_calls=8)
dataset = dataset.batch(batch_size)  # cannot stack elements whose first dimensions differ
dataset = dataset.prefetch(tf.contrib.data.AUTOTUNE)
iterator = dataset.make_one_shot_iterator()
next_element = iterator.get_next()

This is the error it produces:

InvalidArgumentError (see above for traceback): Cannot batch tensors with different shapes in component 0. First element had shape [443,5,257] and element 1 had shape [280,5,257].
[[{{node IteratorGetNext}} = IteratorGetNext[output_shapes=[<unknown>, <unknown>], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](OneShotIterator)]]


When I change batch_size to 1 it works and returns one example at a time. How can I batch this variable-length data, or alternatively concatenate all the data along the first axis, e.g. [443,5,257] and [280,5,257] into [723,5,257]?
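One way to read the concatenation idea: every row of a file is already an independent (5-frame window, clean frame) pair, so the file boundary can be dropped and fixed-size batches formed from the stream of rows. The sketch below illustrates that logic with plain Python lists, not TensorFlow. In tf.data the analogous step would presumably be to split each file into its rows before calling batch (for example with flat_map over Dataset.from_tensor_slices); that is an assumption and has not been tested against the pipeline above. The function rebatch and the toy data are hypothetical.

```python
from itertools import chain, islice


def rebatch(files, batch_size):
    """Yield fixed-size batches of (noisy_window, clean_frame) pairs.

    `files` is an iterable of (noisy_rows, clean_rows), one tuple per file.
    The two sequences in a tuple have equal length, but that length varies
    from file to file. The last batch may be smaller than batch_size.
    """
    # Flatten all files into one stream of per-row pairs.
    pairs = chain.from_iterable(zip(noisy, clean) for noisy, clean in files)
    while True:
        batch = list(islice(pairs, batch_size))
        if not batch:
            return
        noisy_batch, clean_batch = zip(*batch)
        yield list(noisy_batch), list(clean_batch)


# Toy data: two "files" with 3 and 2 rows; strings stand in for arrays.
files = [
    (["n0a", "n0b", "n0c"], ["c0a", "c0b", "c0c"]),
    (["n1a", "n1b"], ["c1a", "c1b"]),
]
batches = list(rebatch(files, batch_size=2))
print(batches[1])  # → (['n0c', 'n1a'], ['c0c', 'c1a'])
```

With this layout every batch has the same shape regardless of file length, and the second batch above mixes rows from both files. Shuffling after the flattening step, rather than before it, would mix frames across files instead of only reordering whole files.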

  • Do 443 and 280 correspond to the shapes of your noisy data and clean data, respectively?

    – kvish
    Jan 1 at 10:28

  • @kvish My noisy data are created from the clean data, so the two are the same size. But my WAV files (the clean data) have many different durations, and I read them from noise_datalist and clean_datalist. These are lists of files such as data1.wav and data2.wav with different lengths, as the error in the post shows: for example, data1.wav has 443 frames, data2.wav has 280 frames, and so on.

    – Leow
    Jan 1 at 11:08

  • Oh okay, thanks for clearing that up. Would padding and batching change the context of what you are trying to do? If you do not know the overall maximum length you want to use, padding could give the first component a different length in each batch, depending on the longest element in that batch.

    – kvish
    Jan 1 at 11:34
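To make the trade-off in this comment concrete, here is a minimal plain-Python sketch of padding every element to the longest one in its batch. It is an illustration only, not how tf.data implements it; tf.data does offer Dataset.padded_batch for this purpose. The function pad_batch and the pad value 0.0 are hypothetical choices.

```python
def pad_batch(sequences, pad_value=0.0):
    """Pad each sequence to the length of the longest one in this batch.

    Returns the padded rows together with the original lengths, which are
    needed later to mask the padded positions out of the loss.
    """
    max_len = max(len(seq) for seq in sequences)
    padded = [list(seq) + [pad_value] * (max_len - len(seq)) for seq in sequences]
    lengths = [len(seq) for seq in sequences]
    return padded, lengths


batch_a, len_a = pad_batch([[1.0, 2.0, 3.0], [4.0]])
batch_b, len_b = pad_batch([[5.0, 6.0], [7.0]])
print(batch_a, len_a)  # → [[1.0, 2.0, 3.0], [4.0, 0.0, 0.0]] [3, 1]
print(batch_b, len_b)  # → [[5.0, 6.0], [7.0, 0.0]] [2, 1]
```

The two batches end up with different widths (3 and 2), which is the per-batch variation the comment describes. Padding everything to one known global maximum would give a constant width at the cost of more padded positions.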

python-3.x tensorflow tensorflow-datasets

asked Dec 31 '18 at 6:20
Leow
