Add support for base64url variant

classic Classic list List threaded Threaded
23 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Add support for base64url variant

Pierre Téchoueyres-2

I'm actually working on an emacs library to work with JWT (JSON Web
Token) as described in RFC 7519 / 7797.  To achive that I must use an
variant of base 64 encoding named base64url as described in RFC4648.

Actually Emacs doesn't provide this variant.  I've attached an attempt
to implement it at C level.

I'm avare that my patch is far from perfect and I would welcome any
help or comments to improve it.

Thanks in advance.  Pierre.


0001-Add-support-for-base64url-variant.patch (24K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Add support for base64url variant

Eli Zaretskii
> From: Pierre Téchoueyres <[hidden email]>
> Date: Wed, 22 May 2019 00:32:19 +0200
>
> I'm actually working on an emacs library to work with JWT (JSON Web
> Token) as described in RFC 7519 / 7797.  To achive that I must use an
> variant of base 64 encoding named base64url as described in RFC4648.
>
> Actually Emacs doesn't provide this variant.  I've attached an attempt
> to implement it at C level.
>
> I'm avare that my patch is far from perfect and I would welcome any
> help or comments to improve it.

Thanks, but please avoid posting such large patches as long as you
don't have a copyright assignment for Emacs on file.  That's because
just by reading the patch, people will remember some of the
implementation, and will then have hard time to come up with a
different one, should your copyright assignment paperwork not come
through.

If you'd like to sign the forms, I will send them to you.  Then we
could revisit the patch when the legal paperwork is done.

Thanks again for working on this.

Reply | Threaded
Open this post in threaded view
|

Re: Add support for base64url variant

Pierre Téchoueyres-2
Hello Eli,
>
> Thanks, but please avoid posting such large patches as long as you
> don't have a copyright assignment for Emacs on file.

I have already signed paperwork for TRAMP and, I think, for Emacs.
But maybe there is something wrong with them ?

>  That's because
> just by reading the patch, people will remember some of the
> implementation, and will then have hard time to come up with a
> different one, should your copyright assignment paperwork not come
> through.
>
> If you'd like to sign the forms, I will send them to you.  Then we
> could revisit the patch when the legal paperwork is done.

If you can't find my previous assignment, then yes send me them back.

> Thanks again for working on this.
Hope this could help.

I would like to have some discussion on how to improve some points :
- Is adding parameter to existing functions the way to go or is it
better to add new ones for base64url
- I would like to improve base64-decode* in a way it could detect the
variant but actually don't know how to do that.
- The documentations is not really what I would like to see (and there
my English is faulty)

Thanks in advance.  Pierre.

Reply | Threaded
Open this post in threaded view
|

Re: Add support for base64url variant

Eli Zaretskii
> Date: Wed, 22 May 2019 11:25:34 +0200
> From: Pierre Téchoueyres <[hidden email]>
> Cc: [hidden email]
>
> Hello Eli,
> >
> > Thanks, but please avoid posting such large patches as long as you
> > don't have a copyright assignment for Emacs on file.
>
> I have already signed paperwork for TRAMP and, I think, for Emacs.
> But maybe there is something wrong with them ?

I do see the assignment for Tramp, but not for Emacs.

> If you can't find my previous assignment, then yes send me them back.

Will send off-list.

> I would like to have some discussion on how to improve some points :
> - Is adding parameter to existing functions the way to go or is it
> better to add new ones for base64url

I think an optional argument will be fine.

> - I would like to improve base64-decode* in a way it could detect the
> variant but actually don't know how to do that.

Maybe someone else will have an idea.  Is such an algorithmic
detection described someplace?

> - The documentations is not really what I would like to see (and there
> my English is faulty)

Don't worry about that, we will polish it when the time comes.  Thanks
for providing the documentation in the first place.

Reply | Threaded
Open this post in threaded view
|

Re: Add support for base64url variant

Pierre Téchoueyres-2
Eli Zaretskii <[hidden email]> writes:

>> ...
>> I would like to have some discussion on how to improve some points :
>> - Is adding parameter to existing functions the way to go or is it
>> better to add new ones for base64url
>
> I think an optional argument will be fine.

Except here I've added two, and one that doesn't really mean anything if
the second isn't set.  ie. you should not (at least without breaking the
RFC) generate an base64 string without padding.  At first many
parameters seemed to me a good thing : limitted patch and flexibility.
But now I'm no more sure ...


>> - I would like to improve base64-decode* in a way it could detect the
>> variant but actually don't know how to do that.
>
> Maybe someone else will have an idea.  Is such an algorithmic
> detection described someplace?
>
None I'm aware of.  I was thinking to something like that :
- define two boolean.  One saying you're on crude base64 another saying
  you're on base64url variant.
- initialize them as false.
- start decoding.
- when finding crude base64  chars (/ or +) set the base64 to true,
- when finding specific url variant chars (- or _) set base64url to true,

try to decode until the end of data.  On parts where padding is checked
do it only if base64 is true.

But this approch could fail on the following cases :
- a mix of chars from both variant without padding (no checks but
  obviously wrong)
- absence of chars from any variant (here I can't decide for the
  necessity of padding).

I would also bring your attention on the part where I dynamically assign
pointers on specialized arrays for encoding (resp. decoding).

ex: line 244 of patch

char const *b64_value_to_char = (url_variant) ? base64url_value_to_char : base64_value_to_char;


Before my change there were static const, so I suppose compiler could
have inlined them or at least stored on some cache.  But now ...
So I'm a little scarried by the possible lost of performance.  If anyone
has some hint on how I could benchmark this (other than by the naive way
which could result in my data where all in cache ...)

Pierre.

Reply | Threaded
Open this post in threaded view
|

Re: Add support for base64url variant

Noam Postavsky
On Thu, 23 May 2019 at 13:51, Pierre Téchoueyres
<[hidden email]> wrote:

>
> Eli Zaretskii <[hidden email]> writes:
>
> >> ...
> >> I would like to have some discussion on how to improve some points :
> >> - Is adding parameter to existing functions the way to go or is it
> >> better to add new ones for base64url
> >
> > I think an optional argument will be fine.
>
> Except here I've added two, and one that doesn't really mean anything if
> the second isn't set.  ie. you should not (at least without breaking the
> RFC) generate an base64 string without padding.  At first many
> parameters seemed to me a good thing : limitted patch and flexibility.
> But now I'm no more sure ...

You could leave out the NO-PAD argument, it's easy enough for the
calling Lisp code to delete a couple of "=" chars if really needed.

> I would also bring your attention on the part where I dynamically assign
> pointers on specialized arrays for encoding (resp. decoding).
>
> ex: line 244 of patch
>
> char const *b64_value_to_char = (url_variant) ? base64url_value_to_char : base64_value_to_char;
>
>
> Before my change there were static const, so I suppose compiler could
> have inlined them or at least stored on some cache.  But now ...
> So I'm a little scarried by the possible lost of performance.  If anyone
> has some hint on how I could benchmark this

I doubt it will have a measurable impact on performance. And even if
it did have some very tiny effect, what would you do about it?
Duplicate the whole encoding function, just to avoid a dynamic choice
of array? The difference would have to be pretty drastic to be worth
that, IMO.

Reply | Threaded
Open this post in threaded view
|

Re: Add support for base64url variant

Eli Zaretskii
In reply to this post by Pierre Téchoueyres-2
> From: Pierre Téchoueyres <[hidden email]>
> Date: Wed, 22 May 2019 00:32:19 +0200
>
> +    NILP(no_pad), !NILP(url_variant),

Please leave one blank between the name of a macro or function and the
following opening parenthesis.

> +  if (pad) {
> +    *e++ = '=';
> +    *e++ = '=';
> +  }

This is not our style of writing blocks in braces.  We use this style:

   if (pad)
     {
       *e++ = '=';
       ...

> +  *e++ = b64_value_to_char[value];
> +  if (pad) {
> +    *e++ = '=';
> +  }

Likewise.

I'd suggest to call the new argument base64url or somesuch, since
this is trhe official name.

Reply | Threaded
Open this post in threaded view
|

Re: Add support for base64url variant

Eli Zaretskii
In reply to this post by Pierre Téchoueyres-2
> From: [hidden email] (Pierre Téchoueyres)
> Cc: [hidden email]
> Date: Thu, 23 May 2019 19:51:00 +0200
>
> > I think an optional argument will be fine.
>
> Except here I've added two, and one that doesn't really mean anything if
> the second isn't set.

Then perhaps switch their order.

> >> - I would like to improve base64-decode* in a way it could detect the
> >> variant but actually don't know how to do that.
> >
> > Maybe someone else will have an idea.  Is such an algorithmic
> > detection described someplace?
> >
> None I'm aware of.  I was thinking to something like that :
> - define two boolean.  One saying you're on crude base64 another saying
>   you're on base64url variant.
> - initialize them as false.
> - start decoding.
> - when finding crude base64  chars (/ or +) set the base64 to true,
> - when finding specific url variant chars (- or _) set base64url to true,

Doesn't sound worth it.  Just an argument passed by the caller should
be fine.  The caller will have to know how the text is encoded.

> I would also bring your attention on the part where I dynamically assign
> pointers on specialized arrays for encoding (resp. decoding).

I don't see a problem with that.

> Before my change there were static const, so I suppose compiler could
> have inlined them or at least stored on some cache.  But now ...
> So I'm a little scarried by the possible lost of performance.  If anyone
> has some hint on how I could benchmark this (other than by the naive way
> which could result in my data where all in cache ...)

Emacs has benchmarking facilities, so you could time the code and see
if there's some performance penalty.  If the performance suffers too
much, we could always have a helper function which receives the array
as a parameter.

Reply | Threaded
Open this post in threaded view
|

Re: Add support for base64url variant

Pierre Téchoueyres-2
In reply to this post by Noam Postavsky
Noam Postavsky <[hidden email]> writes:

> On Thu, 23 May 2019 at 13:51, Pierre Téchoueyres
> <[hidden email]> wrote:
>>
>> Eli Zaretskii <[hidden email]> writes:
>>
>> >> ...
>> >> I would like to have some discussion on how to improve some points :
>> >> - Is adding parameter to existing functions the way to go or is it
>> >> better to add new ones for base64url
>> >
>> > I think an optional argument will be fine.
>>
>> Except here I've added two, and one that doesn't really mean anything if
>> the second isn't set.  ie. you should not (at least without breaking the
>> RFC) generate an base64 string without padding.  At first many
>> parameters seemed to me a good thing : limitted patch and flexibility.
>> But now I'm no more sure ...
>
> You could leave out the NO-PAD argument, it's easy enough for the
> calling Lisp code to delete a couple of "=" chars if really needed.
>
This could be an option ... but I do see this as ugly adding characters
on one side to remove them from the other.

Maybe it could be better to split the functions then ? What about

(base64-encode-string STRING &optional NO-LINE-BREAK)
(base64url-encode-string STRING &optional NO-PAD)

and

(base64-encode-region BEG END &optional NO-LINE-BREAK)
(base64url-encode-region BEG END &optional NO-PAD)



>> I would also bring your attention on the part where I dynamically assign
>> pointers on specialized arrays for encoding (resp. decoding).
>>
>> ex: line 244 of patch
>>
>> char const *b64_value_to_char = (url_variant) ? base64url_value_to_char : base64_value_to_char;
>>
>>
>> Before my change there were static const, so I suppose compiler could
>> have inlined them or at least stored on some cache.  But now ...
>> So I'm a little scarried by the possible lost of performance.  If anyone
>> has some hint on how I could benchmark this
>
> I doubt it will have a measurable impact on performance. And even if
> it did have some very tiny effect, what would you do about it?
> Duplicate the whole encoding function, just to avoid a dynamic choice
> of array? The difference would have to be pretty drastic to be worth
> that, IMO.

Thanks, I hoped for something like your answer.

Reply | Threaded
Open this post in threaded view
|

Re: Add support for base64url variant

Pierre Téchoueyres-2
In reply to this post by Eli Zaretskii
Eli Zaretskii <[hidden email]> writes:

>> From: Pierre Téchoueyres <[hidden email]>
>> Date: Wed, 22 May 2019 00:32:19 +0200
>>
>> +    NILP(no_pad), !NILP(url_variant),
>
> Please leave one blank between the name of a macro or function and the
> following opening parenthesis.

Fixed.

>
>> +  if (pad) {
>> +    *e++ = '=';
>> +    *e++ = '=';
>> +  }
>
> This is not our style of writing blocks in braces.  We use this style:
>
>    if (pad)
>      {
>        *e++ = '=';
>        ...
>
>> +  *e++ = b64_value_to_char[value];
>> +  if (pad) {
>> +    *e++ = '=';
>> +  }
>
> Likewise.
>

Fixed.

> I'd suggest to call the new argument base64url or somesuch, since
> this is trhe official name.
>
You mean in replacement of url_variant or b64_value_to_char ?

How should I send new versions of the patch ? As a full patch in
attachment like previously ?

Reply | Threaded
Open this post in threaded view
|

Re: Add support for base64url variant

Eli Zaretskii
> From: [hidden email] (Pierre Téchoueyres)
> Cc: [hidden email]
> Date: Thu, 23 May 2019 21:37:01 +0200
>
> > I'd suggest to call the new argument base64url or somesuch, since
> > this is trhe official name.
> >
> You mean in replacement of url_variant or b64_value_to_char ?

The former.

> How should I send new versions of the patch ? As a full patch in
> attachment like previously ?

Yes.  And please include the log messages (in the ChangeLog style).

Reply | Threaded
Open this post in threaded view
|

Re: Add support for base64url variant

Pierre Téchoueyres-2
Eli Zaretskii <[hidden email]> writes:

>> From: [hidden email] (Pierre Téchoueyres)
>> Cc: [hidden email]
>> Date: Thu, 23 May 2019 21:37:01 +0200
>>
>> > I'd suggest to call the new argument base64url or somesuch, since
>> > this is trhe official name.
>> >
>> You mean in replacement of url_variant or b64_value_to_char ?
>
> The former.
>
>> How should I send new versions of the patch ? As a full patch in
>> attachment like previously ?
>
> Yes.  And please include the log messages (in the ChangeLog style).
>
>
You'll find two patches attached.
First one contains requested changes and, hoppefully, a valid Changelog


The second patch contains an reworked version wich instead of adding
many parameters to base64-encode-region (resp. base64-encode-string)
function create the base64url-encode-region
(resp. base64url-encode-string) function. Documentation and tests are
also updated consequently.



Tell me what you thing is the way to go.

0001-Add-support-for-base64url-variant-with-parameters.patch (24K) Download Attachment
0001-Add-support-for-base64url-variant-with-new-functions.patch (30K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Add support for base64url variant

Pierre Téchoueyres-2
Had you have a chance to look at my patches ?
Should/Could I do something ?

[hidden email] (Pierre Téchoueyres) writes:

> Eli Zaretskii <[hidden email]> writes:
>
>>> From: [hidden email] (Pierre Téchoueyres)
>>> Cc: [hidden email]
>>> Date: Thu, 23 May 2019 21:37:01 +0200
>>>
>>> > I'd suggest to call the new argument base64url or somesuch, since
>>> > this is trhe official name.
>>> >
>>> You mean in replacement of url_variant or b64_value_to_char ?
>>
>> The former.
>>
>>> How should I send new versions of the patch ? As a full patch in
>>> attachment like previously ?
>>
>> Yes.  And please include the log messages (in the ChangeLog style).
>>
>>
> You'll find two patches attached.
> First one contains requested changes and, hoppefully, a valid Changelog
>
>
>
> The second patch contains an reworked version wich instead of adding
> many parameters to base64-encode-region (resp. base64-encode-string)
> function create the base64url-encode-region
> (resp. base64url-encode-string) function. Documentation and tests are
> also updated consequently.
>
>
>
> Tell me what you thing is the way to go.

Reply | Threaded
Open this post in threaded view
|

Re: Add support for base64url variant

Eli Zaretskii
> From: [hidden email] (Pierre Téchoueyres)
> Cc: [hidden email]
> Date: Fri, 07 Jun 2019 23:04:44 +0200
>
> Had you have a chance to look at my patches ?
> Should/Could I do something ?

Your patches are in my queue (which was unusually long the past 2
weeks).  Sorry for the delay, I will get to that in a day or two.

Reply | Threaded
Open this post in threaded view
|

Re: Add support for base64url variant

Eli Zaretskii
In reply to this post by Pierre Téchoueyres-2
> From: [hidden email] (Pierre Téchoueyres)
> Cc: [hidden email]
> Date: Mon, 27 May 2019 22:30:14 +0200
>
> You'll find two patches attached.

Thanks, I liked the second one better, so I pushed it.

Please note that the doc strings and the manual text included a few
typos, and also used the passive tense too much.  See the follow-up
changes I pushed after your changeset.  In the future, I suggest to
use the Emacs spell-checking capabilities to spell-check the text in
the manual and also the comments/strings in the C/Lisp sources.

Reply | Threaded
Open this post in threaded view
|

Re: Add support for base64url variant

Pierre Téchoueyres-2
Eli Zaretskii <[hidden email]> writes:

>> From: [hidden email] (Pierre Téchoueyres)
>> Cc: [hidden email]
>> Date: Mon, 27 May 2019 22:30:14 +0200
>>
>> You'll find two patches attached.
>
> Thanks, I liked the second one better, so I pushed it.

Thank you.  Honestly I didn't expect it was accepted without further
modifications.

>
> Please note that the doc strings and the manual text included a few
> typos, and also used the passive tense too much.  See the follow-up
> changes I pushed after your changeset.  In the future, I suggest to
> use the Emacs spell-checking capabilities to spell-check the text in
> the manual and also the comments/strings in the C/Lisp sources.

This was the part I was the less confident (and for good reasons).
Thanks again to have (completely) rewritten it.  I'll try to be
more rigorous in the future with the doc, NEWS, etc..

I'll also thanks Paul Eggert for his optimizations of the new functions
in commit 5abaea334cf4c0e004fca2b8b272e091eb5b5444.

Pierre.

P.S.
Would it be any interest if I try to add others bases encoding as
defined in RFC 4648 ?  I think, first, to BASE32 and maybe on BASE16
later.

Reply | Threaded
Open this post in threaded view
|

Re: Add support for base64url variant

Eli Zaretskii
> From: [hidden email] (Pierre Téchoueyres)
> Cc: [hidden email], Paul Eggert <[hidden email]>
> Date: Tue, 11 Jun 2019 20:36:52 +0200
>
> Would it be any interest if I try to add others bases encoding as
> defined in RFC 4648 ?  I think, first, to BASE32 and maybe on BASE16
> later.

I have never needed those.  Let's hear what others have to say.

Reply | Threaded
Open this post in threaded view
|

Re: Add support for base64url variant

Achim Gratz
Eli Zaretskii writes:
>> From: [hidden email] (Pierre Téchoueyres)
>> Cc: [hidden email], Paul Eggert <[hidden email]>
>> Date: Tue, 11 Jun 2019 20:36:52 +0200
>>
>> Would it be any interest if I try to add others bases encoding as
>> defined in RFC 4648 ?  I think, first, to BASE32 and maybe on BASE16
>> later.
>
> I have never needed those.  Let's hear what others have to say.

Base32url is useful for use on case-insensitive filesystems or
with case-insensitive protocols.


Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptation for Waldorf microQ V2.22R2:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada


Reply | Threaded
Open this post in threaded view
|

Re: Add support for base64url variant

Richard Copley-2
Eli Zaretskii writes:

    Please note that the doc strings and the manual text included a few
    typos, and also used the passive tense too much.  See the follow-up
    changes I pushed after your changeset.  In the future, I suggest to
    use the Emacs spell-checking capabilities to spell-check the text in
    the manual and also the comments/strings in the C/Lisp sources.

Two more typos (if/of, inserts/insert):

diff --git a/doc/lispref/text.texi b/doc/lispref/text.texi
index 55f5269849..2e7c497f57 100644
--- a/doc/lispref/text.texi
+++ b/doc/lispref/text.texi
@@ -4560,8 +4560,8 @@ Base 64
 
 @deffn Command base64url-encode-region beg end &optional no-pad
 This function is like @code{base64-encode-region}, but it implements
-the URL variant if base 64 encoding, per RFC 4648, and it doesn't
-inserts newline characters into the encoded text, so the output is
+the URL variant of base 64 encoding, per RFC 4648, and it doesn't
+insert newline characters into the encoded text, so the output is
 just one long line.
 
 If the optional argument @var{no-pad} is non-@code{nil} then this

Reply | Threaded
Open this post in threaded view
|

Re: Add support for base64url variant

Stefan Monnier
In reply to this post by Achim Gratz
>>> Would it be any interest if I try to add others bases encoding as
>>> defined in RFC 4648 ?  I think, first, to BASE32 and maybe on BASE16
>>> later.
>> I have never needed those.  Let's hear what others have to say.

Same here.

> Base32url is useful for use on case-insensitive filesystems or
> with case-insensitive protocols.

That's what it's designed for, yes, but I can't think of any example
where that's useful.  E.g. for case-insensitive filenames, using a hash
instead of a base32 encoding was a better option in all the cases I've
encountered because I only needed "one-way" and it conveniently bounds
the max file name length.


        Stefan


12