DEFLATE: dynamic block structure

I am trying to construct an explicit example of a dynamic block. Please let me know if this is wrong.

Considering this example of lit/len alphabet:

A(0), B(0), C(1), D(0), E(2), F(2), G(2), H(2)
and the rest of symbols having zero code lengths.

The sequence(SQ) of code lengths would be 0...,0,0,1,0,2,2,2,2,...0.

Then we have to compress it further with run-length encoding. So we have to calculate number of repetitions and either use flag 16 to copy the previous code length, or 17 or 18 to repeat code length 0 (using extra bits).

My problem is this. After sending the header information and the sequence of code-length code lengths in the right order 16,17,18,..., the next sequence of information would be something like:

18, some extra bits value,1,0,2,16, some extra bits value 0,18, some extra bits value. (Probably there would be another 18 flag since the maximum repeat count is 138.)

Then we have the same thing with the distance alphabet and finally the inputs data encoded with the canonical Huffman, and extra bits if necessary.

Is it necessary to send the code lengths of 0? If so, why?

If yes, why is it necessary to have hclit and hcdist and not only hclen, knowing that the lengths of the sequences are 286 for lit/len and 30 for distances?

If not what would be the real solution?

Another problem:

In this case we have code length 2 with repetitions (3) extra bits value of 0.

Is this last number also included in the code length tree construction?

If yes I can't understand how: flag 18 has next a maximum possible extra bits value of 127 (1111111) representing 138 repetitions and it couldn't be included into the alphabet symbols of 0-18.

P.S When I say extra bits in this case I mean the factor that it is used to know how many repetitions of the previous length are used.

More precisely 0 - 15 we have 0 bit factor repetition, for 16,17,18 we have 2,3,7 bits repetitions factor. The value of those bits is what I mean with extra bits value.

I think I'm missing something about what Huffman codes are generated by the Huffman code-length alphabet.

edited Dec 30 '18 at 19:53

asked Dec 29 '18 at 21:23

Franco Bosi

add a comment |

I am trying to construct an explicit example of a dynamic block. Please let me know if this is wrong.

Considering this example of lit/len alphabet:

A(0), B(0), C(1), D(0), E(2), F(2), G(2), H(2)
and the rest of symbols having zero code lengths.

The sequence(SQ) of code lengths would be 0...,0,0,1,0,2,2,2,2,...0.

My problem is this. After sending the header information and the sequence of code-length code lengths in the right order 16,17,18,..., the next sequence of information would be something like:

18, some extra bits value,1,0,2,16, some extra bits value 0,18, some extra bits value. (Probably there would be another 18 flag since the maximum repeat count is 138.)

Then we have the same thing with the distance alphabet and finally the inputs data encoded with the canonical Huffman, and extra bits if necessary.

Is it necessary to send the code lengths of 0? If so, why?

If yes, why is it necessary to have hclit and hcdist and not only hclen, knowing that the lengths of the sequences are 286 for lit/len and 30 for distances?

If not what would be the real solution?

Another problem:

In this case we have code length 2 with repetitions (3) extra bits value of 0.

Is this last number also included in the code length tree construction?

If yes I can't understand how: flag 18 has next a maximum possible extra bits value of 127 (1111111) representing 138 repetitions and it couldn't be included into the alphabet symbols of 0-18.

P.S When I say extra bits in this case I mean the factor that it is used to know how many repetitions of the previous length are used.

More precisely 0 - 15 we have 0 bit factor repetition, for 16,17,18 we have 2,3,7 bits repetitions factor. The value of those bits is what I mean with extra bits value.

I think I'm missing something about what Huffman codes are generated by the Huffman code-length alphabet.

edited Dec 30 '18 at 19:53

asked Dec 29 '18 at 21:23

Franco Bosi

add a comment |

I am trying to construct an explicit example of a dynamic block. Please let me know if this is wrong.

Considering this example of lit/len alphabet:

A(0), B(0), C(1), D(0), E(2), F(2), G(2), H(2)
and the rest of symbols having zero code lengths.

The sequence(SQ) of code lengths would be 0...,0,0,1,0,2,2,2,2,...0.

My problem is this. After sending the header information and the sequence of code-length code lengths in the right order 16,17,18,..., the next sequence of information would be something like:

18, some extra bits value,1,0,2,16, some extra bits value 0,18, some extra bits value. (Probably there would be another 18 flag since the maximum repeat count is 138.)

Then we have the same thing with the distance alphabet and finally the inputs data encoded with the canonical Huffman, and extra bits if necessary.

Is it necessary to send the code lengths of 0? If so, why?

If yes, why is it necessary to have hclit and hcdist and not only hclen, knowing that the lengths of the sequences are 286 for lit/len and 30 for distances?

If not what would be the real solution?

Another problem:

In this case we have code length 2 with repetitions (3) extra bits value of 0.

Is this last number also included in the code length tree construction?

If yes I can't understand how: flag 18 has next a maximum possible extra bits value of 127 (1111111) representing 138 repetitions and it couldn't be included into the alphabet symbols of 0-18.

P.S When I say extra bits in this case I mean the factor that it is used to know how many repetitions of the previous length are used.

More precisely 0 - 15 we have 0 bit factor repetition, for 16,17,18 we have 2,3,7 bits repetitions factor. The value of those bits is what I mean with extra bits value.

I think I'm missing something about what Huffman codes are generated by the Huffman code-length alphabet.

edited Dec 30 '18 at 19:53

asked Dec 29 '18 at 21:23

Franco Bosi

I am trying to construct an explicit example of a dynamic block. Please let me know if this is wrong.

Considering this example of lit/len alphabet:

A(0), B(0), C(1), D(0), E(2), F(2), G(2), H(2)
and the rest of symbols having zero code lengths.

The sequence(SQ) of code lengths would be 0...,0,0,1,0,2,2,2,2,...0.

My problem is this. After sending the header information and the sequence of code-length code lengths in the right order 16,17,18,..., the next sequence of information would be something like:

18, some extra bits value,1,0,2,16, some extra bits value 0,18, some extra bits value. (Probably there would be another 18 flag since the maximum repeat count is 138.)

Then we have the same thing with the distance alphabet and finally the inputs data encoded with the canonical Huffman, and extra bits if necessary.

Is it necessary to send the code lengths of 0? If so, why?

If yes, why is it necessary to have hclit and hcdist and not only hclen, knowing that the lengths of the sequences are 286 for lit/len and 30 for distances?

If not what would be the real solution?

Another problem:

In this case we have code length 2 with repetitions (3) extra bits value of 0.

Is this last number also included in the code length tree construction?

If yes I can't understand how: flag 18 has next a maximum possible extra bits value of 127 (1111111) representing 138 repetitions and it couldn't be included into the alphabet symbols of 0-18.

P.S When I say extra bits in this case I mean the factor that it is used to know how many repetitions of the previous length are used.

More precisely 0 - 15 we have 0 bit factor repetition, for 16,17,18 we have 2,3,7 bits repetitions factor. The value of those bits is what I mean with extra bits value.

I think I'm missing something about what Huffman codes are generated by the Huffman code-length alphabet.

deflate

edited Dec 30 '18 at 19:53

asked Dec 29 '18 at 21:23

Franco Bosi

edited Dec 30 '18 at 19:53

asked Dec 29 '18 at 21:23

Franco Bosi

edited Dec 30 '18 at 19:53

asked Dec 29 '18 at 21:23

Franco Bosi

asked Dec 29 '18 at 21:23

Franco Bosi

asked Dec 29 '18 at 21:23

Franco Bosi

add a comment |

1 Answer
1

active

oldest

votes

First off, your example code is invalid, since it oversubscribes the available bit patterns. 2,2,2,2 would use all of the bit patterns, since there are only four possible two-bit patterns. You can't have one more code, of any length. Possible valid codes for five symbols is 1,2,3,4,4, or 2,2,2,3,3.

To answer your questions in order:

You need to send the leading zeros, but you do not need to send the trailing zeros. The HLIT and HDIST counts determine how many lengths of each code type are in the header, where any after those are taken to be zero. You need to send the zeros, since the code lengths are associated with the corresponding symbol by their position in the list.

It saves space in header to have the HLIT and HDIST counts, so that you don't need to provide lengths for all 316 codes in every header.

I don't understand this question, but I guess it doesn't apply.

If I understand your question, extra bits have nothing to do with the descriptions of the Huffman codes in the headers. The extra bits are implied by the symbol. In any case, a repeated length is encoded with code 16, not code 18. So the four twos would be encoded as 2, 16(0), where the (0) represents two extra bits that are zeros.

answered Dec 30 '18 at 16:26

Mark Adler

57.8k763111

I really appreciate your effort answering my question and for this you have my gratitude. For the last question when I say extra bits I mean the bit-factor number that you add after the flag 16 that in this case is 00 as you said, not the actual extra bits that are added afer the huffman codes of data. The 00 in this case is also included as a possible symbol of the huffman code length alphabet? In other words, all the numbers in the sequence after the run-encoding are used as a possible symbol of the code length alphabet?

– Franco Bosi
Dec 30 '18 at 19:42

No. Extra bits are extra bits that go directly in the stream. As I said, they have nothing to do with the Huffman codes.

– Mark Adler
Dec 30 '18 at 20:55

Ok thank you Adler!

– Franco Bosi
Dec 30 '18 at 21:06

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53973435%2fdeflate-dynamic-block-structure%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

To answer your questions in order:

You need to send the leading zeros, but you do not need to send the trailing zeros. The HLIT and HDIST counts determine how many lengths of each code type are in the header, where any after those are taken to be zero. You need to send the zeros, since the code lengths are associated with the corresponding symbol by their position in the list.

It saves space in header to have the HLIT and HDIST counts, so that you don't need to provide lengths for all 316 codes in every header.

I don't understand this question, but I guess it doesn't apply.

If I understand your question, extra bits have nothing to do with the descriptions of the Huffman codes in the headers. The extra bits are implied by the symbol. In any case, a repeated length is encoded with code 16, not code 18. So the four twos would be encoded as 2, 16(0), where the (0) represents two extra bits that are zeros.

answered Dec 30 '18 at 16:26

Mark Adler

57.8k763111

I really appreciate your effort answering my question and for this you have my gratitude. For the last question when I say extra bits I mean the bit-factor number that you add after the flag 16 that in this case is 00 as you said, not the actual extra bits that are added afer the huffman codes of data. The 00 in this case is also included as a possible symbol of the huffman code length alphabet? In other words, all the numbers in the sequence after the run-encoding are used as a possible symbol of the code length alphabet?

– Franco Bosi
Dec 30 '18 at 19:42

No. Extra bits are extra bits that go directly in the stream. As I said, they have nothing to do with the Huffman codes.

– Mark Adler
Dec 30 '18 at 20:55

Ok thank you Adler!

– Franco Bosi
Dec 30 '18 at 21:06

add a comment |

To answer your questions in order:

You need to send the leading zeros, but you do not need to send the trailing zeros. The HLIT and HDIST counts determine how many lengths of each code type are in the header, where any after those are taken to be zero. You need to send the zeros, since the code lengths are associated with the corresponding symbol by their position in the list.

It saves space in header to have the HLIT and HDIST counts, so that you don't need to provide lengths for all 316 codes in every header.

I don't understand this question, but I guess it doesn't apply.

If I understand your question, extra bits have nothing to do with the descriptions of the Huffman codes in the headers. The extra bits are implied by the symbol. In any case, a repeated length is encoded with code 16, not code 18. So the four twos would be encoded as 2, 16(0), where the (0) represents two extra bits that are zeros.

answered Dec 30 '18 at 16:26

Mark Adler

57.8k763111

I really appreciate your effort answering my question and for this you have my gratitude. For the last question when I say extra bits I mean the bit-factor number that you add after the flag 16 that in this case is 00 as you said, not the actual extra bits that are added afer the huffman codes of data. The 00 in this case is also included as a possible symbol of the huffman code length alphabet? In other words, all the numbers in the sequence after the run-encoding are used as a possible symbol of the code length alphabet?

– Franco Bosi
Dec 30 '18 at 19:42

No. Extra bits are extra bits that go directly in the stream. As I said, they have nothing to do with the Huffman codes.

– Mark Adler
Dec 30 '18 at 20:55

Ok thank you Adler!

– Franco Bosi
Dec 30 '18 at 21:06

add a comment |

To answer your questions in order:

You need to send the leading zeros, but you do not need to send the trailing zeros. The HLIT and HDIST counts determine how many lengths of each code type are in the header, where any after those are taken to be zero. You need to send the zeros, since the code lengths are associated with the corresponding symbol by their position in the list.

It saves space in header to have the HLIT and HDIST counts, so that you don't need to provide lengths for all 316 codes in every header.

I don't understand this question, but I guess it doesn't apply.

If I understand your question, extra bits have nothing to do with the descriptions of the Huffman codes in the headers. The extra bits are implied by the symbol. In any case, a repeated length is encoded with code 16, not code 18. So the four twos would be encoded as 2, 16(0), where the (0) represents two extra bits that are zeros.

answered Dec 30 '18 at 16:26

Mark Adler

57.8k763111

To answer your questions in order:

You need to send the leading zeros, but you do not need to send the trailing zeros. The HLIT and HDIST counts determine how many lengths of each code type are in the header, where any after those are taken to be zero. You need to send the zeros, since the code lengths are associated with the corresponding symbol by their position in the list.

It saves space in header to have the HLIT and HDIST counts, so that you don't need to provide lengths for all 316 codes in every header.

I don't understand this question, but I guess it doesn't apply.

If I understand your question, extra bits have nothing to do with the descriptions of the Huffman codes in the headers. The extra bits are implied by the symbol. In any case, a repeated length is encoded with code 16, not code 18. So the four twos would be encoded as 2, 16(0), where the (0) represents two extra bits that are zeros.

answered Dec 30 '18 at 16:26

Mark Adler

57.8k763111

answered Dec 30 '18 at 16:26

Mark Adler

57.8k763111

answered Dec 30 '18 at 16:26

Mark Adler

57.8k763111

answered Dec 30 '18 at 16:26

Mark Adler

57.8k763111

I really appreciate your effort answering my question and for this you have my gratitude. For the last question when I say extra bits I mean the bit-factor number that you add after the flag 16 that in this case is 00 as you said, not the actual extra bits that are added afer the huffman codes of data. The 00 in this case is also included as a possible symbol of the huffman code length alphabet? In other words, all the numbers in the sequence after the run-encoding are used as a possible symbol of the code length alphabet?

– Franco Bosi
Dec 30 '18 at 19:42

No. Extra bits are extra bits that go directly in the stream. As I said, they have nothing to do with the Huffman codes.

– Mark Adler
Dec 30 '18 at 20:55

Ok thank you Adler!

– Franco Bosi
Dec 30 '18 at 21:06

add a comment |

I really appreciate your effort answering my question and for this you have my gratitude. For the last question when I say extra bits I mean the bit-factor number that you add after the flag 16 that in this case is 00 as you said, not the actual extra bits that are added afer the huffman codes of data. The 00 in this case is also included as a possible symbol of the huffman code length alphabet? In other words, all the numbers in the sequence after the run-encoding are used as a possible symbol of the code length alphabet?

– Franco Bosi
Dec 30 '18 at 19:42

No. Extra bits are extra bits that go directly in the stream. As I said, they have nothing to do with the Huffman codes.

– Mark Adler
Dec 30 '18 at 20:55

Ok thank you Adler!

– Franco Bosi
Dec 30 '18 at 21:06

I really appreciate your effort answering my question and for this you have my gratitude. For the last question when I say extra bits I mean the bit-factor number that you add after the flag 16 that in this case is 00 as you said, not the actual extra bits that are added afer the huffman codes of data. The 00 in this case is also included as a possible symbol of the huffman code length alphabet? In other words, all the numbers in the sequence after the run-encoding are used as a possible symbol of the code length alphabet?

– Franco Bosi
Dec 30 '18 at 19:42

No. Extra bits are extra bits that go directly in the stream. As I said, they have nothing to do with the Huffman codes.

– Mark Adler
Dec 30 '18 at 20:55

Ok thank you Adler!

– Franco Bosi
Dec 30 '18 at 21:06

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Bdtjtk