bug#34341: 26.1; url-retrieve-synchronously returns blank buffer

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

bug#34341: 26.1; url-retrieve-synchronously returns blank buffer

Laurent Charignon
#### Context

The symptom of this issue first came up in the context of `ghub`, a package
that offers interfacing with Github from emacs (issue:

Multiple users of the `ghub` package started observing that
synchronously retrieving content from the github API through ghub
returned blank responses.

I investigated and narrowed down a minimal repro case of the problem,
which I believe indicate that the problem is in Emacs itself.

#### Steps to reproduce

1) Run emacs -Q
2) Evaluate the following elisp snippet
```
(switch-to-buffer (url-retrieve-synchronously "https://api.github.com/users/charignon/repos")
                  (buffer-string))
```

I expect then to see a buffer with an http response, instead I get a
blank buffer.

#### Investigation / Observation

1) This behavior reproduces consistently with api.github.com, not with
any websites that I tried.

2) `url-retrieve`, the asynchronous way to fetch url, does not seem to be subject to the issue, and works as expected, for
example:
```
(url-retrieve "https://api.github.com/users/charignon/repos" (lambda (x &rest v) x))
```
works

3) Once a connection is established (for example by `url-retrieve`) and
can be reused, then `url-retrieve-synchronously` works. That is, the snippet above returns a buffer with an HTTP response from github. Removing the connection (emacs process managing it) is necessary after you ran `url-retrieve` if you want to reproduce the bug.

4) I used wireshark to look at the network traffic:

Here is the trace for the `url-retrieve-synchronously` call to github
(from steps to reproduce):

I blurred out the info field for DNS packets (not relevant)

```
No.     Time           Source                Destination           Protocol Length Info
      9 1.658218501    vulture.lan           testwifi.here         DNS      74     XXX
     10 1.658260266    vulture.lan           testwifi.here         DNS      74     XXX
     11 1.665053481    testwifi.here         vulture.lan           DNS      158    XXX
     12 1.703946584    testwifi.here         vulture.lan           DNS      106    XXX
     13 1.704485426    vulture.lan           api.github.com        TCP      74     55424 → https(443) [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=2499281666 TSecr=0 WS=128
     14 1.728009793    api.github.com        vulture.lan           TCP      74     https(443) → 55424 [SYN, ACK] Seq=0 Ack=1 Win=28480 Len=0 MSS=1436 SACK_PERM=1 TSval=3029633168 TSecr=2499281666 WS=1024
     15 1.728074018    vulture.lan           api.github.com        TCP      66     55424 → https(443) [ACK] Seq=1 Ack=1 Win=29312 Len=0 TSval=2499281689 TSecr=3029633168
     16 1.748051225    vulture.lan           api.github.com        TLSv1.3  583    Client Hello
     17 1.773082726    api.github.com        vulture.lan           TLSv1.3  1490   Server Hello, Change Cipher Spec, Encrypted Extensions
     18 1.773104270    vulture.lan           api.github.com        TCP      66     55424 → https(443) [ACK] Seq=518 Ack=1425 Win=32128 Len=0 TSval=2499281734 TSecr=3029633179
     19 1.773405428    vulture.lan           api.github.com        TLSv1.3  72     Change Cipher Spec
     20 1.774050525    api.github.com        vulture.lan           TCP      1490   https(443) → 55424 [ACK] Seq=1425 Ack=518 Win=29696 Len=1424 TSval=3029633179 TSecr=2499281709 [TCP segment of a reassembled PDU]
     21 1.775039404    api.github.com        vulture.lan           TLSv1.3  615    Certificate, Certificate Verify, Finished
     22 1.775047977    vulture.lan           api.github.com        TCP      66     55424 → https(443) [ACK] Seq=524 Ack=3398 Win=37760 Len=0 TSval=2499281736 TSecr=3029633179
     23 1.838313183    api.github.com        vulture.lan           TCP      66     https(443) → 55424 [ACK] Seq=3398 Ack=524 Win=29696 Len=0 TSval=3029633196 TSecr=2499281735
     24 1.838358258    vulture.lan           api.github.com        HTTP     388    GET /users/charignon/repos HTTP/1.1 
     25 1.861920124    api.github.com        vulture.lan           TCP      66     https(443) → 55424 [ACK] Seq=3398 Ack=846 Win=30720 Len=0 TSval=3029633201 TSecr=2499281800
     26 1.862284763    api.github.com        vulture.lan           TLSv1.3  145    New Session Ticket
     27 1.862648197    api.github.com        vulture.lan           TLSv1.3  145    New Session Ticket
     28 1.864591679    vulture.lan           api.github.com        TCP      66     55424 → https(443) [RST, ACK] Seq=846 Ack=3556 Win=37760 Len=0 TSval=2499281826 TSecr=3029633201
     29 1.866733948    vulture.lan           testwifi.here         DNS      74     XXX
     30 1.866762442    vulture.lan           testwifi.here         DNS      74     XXX
     31 1.874549249    testwifi.here         vulture.lan           DNS      158    XXX
     32 1.900313841    testwifi.here         vulture.lan           DNS      106    XXX
     33 1.900576751    vulture.lan           api.github.com        TCP      74     43292 → https(443) [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=4190850269 TSecr=0 WS=128
     34 1.903210022    vulture.lan           testwifi.here         DNS      88     XXX
     35 1.903272421    vulture.lan           testwifi.here         DNS      87     XXX
     36 1.903318706    vulture.lan           testwifi.here         DNS      86     XXX
     37 1.903361506    vulture.lan           testwifi.here         DNS      85     XXX
     38 1.906623630    testwifi.here         vulture.lan           DNS      137    XXX
     39 1.906956508    testwifi.here         vulture.lan           DNS      137    XXX
     40 1.907649896    testwifi.here         vulture.lan           DNS      87     XXX
     41 1.908724609    testwifi.here         vulture.lan           DNS      156    XXX
     42 1.925632754    api.github.com        vulture.lan           TCP      74     https(443) → 43292 [SYN, ACK] Seq=0 Ack=1 Win=28480 Len=0 MSS=1436 SACK_PERM=1 TSval=3140518241 TSecr=4190850269 WS=1024
     43 1.925691567    vulture.lan           api.github.com        TCP      66     43292 → https(443) [ACK] Seq=1 Ack=1 Win=29312 Len=0 TSval=4190850294 TSecr=3140518241
     44 1.926622012    vulture.lan           api.github.com        HTTP     308    GET /users/charignon/repos HTTP/1.1 
     45 1.950026748    api.github.com        vulture.lan           TCP      66     https(443) → 43292 [FIN, ACK] Seq=1 Ack=243 Win=29696 Len=0 TSval=3140518247 TSecr=4190850295
     46 1.950219585    vulture.lan           api.github.com        TCP      66     43292 → https(443) [FIN, ACK] Seq=243 Ack=2 Win=29312 Len=0 TSval=4190850318 TSecr=3140518247
     47 1.972892477    api.github.com        vulture.lan           TCP      66     https(443) → 43292 [ACK] Seq=2 Ack=244 Win=29696 Len=0 TSval=3140518253 TSecr=4190850318
```

vulture.lan is my computer.
testwifi.here is my DNS.
starting at packet 33 we talk to a second github server, not like the one we talk to in packet 13.
I configured wireshark to inspect encrypted traffic to see the HTTP
request in there.
What you can notice is that in the synchronous (failing) case, packet 28
is resetting (unexpectedly as far as I can tell) the TCP connection, and
we attempt a retry, which does not work.  

And here is the trace for the `url-retrieve` case, where everything goes well:

```
No.     Time           Source                Destination           Protocol Length Info
      1 0.000000000    vulture.lan           testwifi.here         DNS      74     XXX
      2 0.000068027    vulture.lan           testwifi.here         DNS      74     XXX
      3 0.008919887    testwifi.here         vulture.lan           DNS      158    XXX
      4 0.035184884    testwifi.here         vulture.lan           DNS      106    XXX
      5 0.107029734    vulture.lan           api.github.com        TCP      74     55428 → https(443) [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=2499323470 TSecr=0 WS=128
      6 0.134137564    api.github.com        vulture.lan           TCP      74     https(443) → 55428 [SYN, ACK] Seq=0 Ack=1 Win=28480 Len=0 MSS=1436 SACK_PERM=1 TSval=2535678566 TSecr=2499323470 WS=1024
      7 0.134221939    vulture.lan           api.github.com        TCP      66     55428 → https(443) [ACK] Seq=1 Ack=1 Win=29312 Len=0 TSval=2499323498 TSecr=2535678566
      8 0.134313037    vulture.lan           api.github.com        TLSv1.3  583    Client Hello
      9 0.162958206    api.github.com        vulture.lan           TLSv1.3  1490   Server Hello, Change Cipher Spec, Encrypted Extensions
     10 0.163005821    vulture.lan           api.github.com        TCP      66     55428 → https(443) [ACK] Seq=518 Ack=1425 Win=32128 Len=0 TSval=2499323526 TSecr=2535678573
     11 0.163752256    api.github.com        vulture.lan           TCP      1490   https(443) → 55428 [ACK] Seq=1425 Ack=518 Win=29696 Len=1424 TSval=2535678573 TSecr=2499323498 [TCP segment of a reassembled PDU]
     12 0.163769338    vulture.lan           api.github.com        TLSv1.3  72     Change Cipher Spec
     13 0.164938244    api.github.com        vulture.lan           TLSv1.3  615    Certificate, Certificate Verify, Finished
     14 0.164970615    vulture.lan           api.github.com        TCP      66     55428 → https(443) [ACK] Seq=524 Ack=3398 Win=37760 Len=0 TSval=2499323528 TSecr=2535678573
     15 0.229427678    api.github.com        vulture.lan           TCP      66     https(443) → 55428 [ACK] Seq=3398 Ack=524 Win=29696 Len=0 TSval=2535678590 TSecr=2499323527
     16 0.229496471    vulture.lan           api.github.com        HTTP     388    GET /users/charignon/repos HTTP/1.1 
     17 0.253820590    api.github.com        vulture.lan           TCP      66     https(443) → 55428 [ACK] Seq=3398 Ack=846 Win=30720 Len=0 TSval=2535678596 TSecr=2499323593
     18 0.255187068    api.github.com        vulture.lan           TLSv1.3  145    New Session Ticket
     19 0.256462285    api.github.com        vulture.lan           TLSv1.3  145    New Session Ticket
     20 0.256590338    vulture.lan           api.github.com        TCP      66     55428 → https(443) [ACK] Seq=846 Ack=3556 Win=37760 Len=0 TSval=2499323620 TSecr=2535678596
     21 0.485155486    api.github.com        vulture.lan           TLSv1.3  1458   [SSL segment of a reassembled PDU]
     22 0.485523530    api.github.com        vulture.lan           TLSv1.3  1458   [SSL segment of a reassembled PDU]
     23 0.485567467    vulture.lan           api.github.com        TCP      66     55428 → https(443) [ACK] Seq=846 Ack=6340 Win=43392 Len=0 TSval=2499323849 TSecr=2535678654
     24 0.486210922    api.github.com        vulture.lan           TLSv1.3  1490   [SSL segment of a reassembled PDU]
     25 0.486551493    api.github.com        vulture.lan           TLSv1.3  1276   [SSL segment of a reassembled PDU]
     26 0.486588968    vulture.lan           api.github.com        TCP      66     55428 → https(443) [ACK] Seq=846 Ack=8974 Win=49152 Len=0 TSval=2499323850 TSecr=2535678654
     27 0.487345701    api.github.com        vulture.lan           TLSv1.3  1490   [SSL segment of a reassembled PDU]
     28 0.487710595    api.github.com        vulture.lan           TLSv1.3  1490   [SSL segment of a reassembled PDU] [TCP segment of a reassembled PDU]
     29 0.487741685    vulture.lan           api.github.com        TCP      66     55428 → https(443) [ACK] Seq=846 Ack=11822 Win=54784 Len=0 TSval=2499323851 TSecr=2535678654
     30 0.489279277    api.github.com        vulture.lan           TLSv1.3  1490   [SSL segment of a reassembled PDU] [TCP segment of a reassembled PDU]
     31 0.490560244    api.github.com        vulture.lan           TLSv1.3  1490   [SSL segment of a reassembled PDU] [TCP segment of a reassembled PDU]
     32 0.490600325    vulture.lan           api.github.com        TCP      66     55428 → https(443) [ACK] Seq=846 Ack=14670 Win=60544 Len=0 TSval=2499323854 TSecr=2535678654
     33 0.491903018    api.github.com        vulture.lan           TLSv1.3  1490   [SSL segment of a reassembled PDU] [TCP segment of a reassembled PDU]
     34 0.493248621    api.github.com        vulture.lan           TLSv1.3  1490   [SSL segment of a reassembled PDU] [TCP segment of a reassembled PDU]
     35 0.493286446    vulture.lan           api.github.com        TCP      66     55428 → https(443) [ACK] Seq=846 Ack=17518 Win=66176 Len=0 TSval=2499323857 TSecr=2535678654
     36 0.501270566    vulture.lan           testwifi.here         DNS      86     XXX
     37 0.501341962    vulture.lan           testwifi.here         DNS      85     XXX
     38 0.503268399    testwifi.here         vulture.lan           DNS      137    XXX
     39 0.504365465    testwifi.here         vulture.lan           DNS      137    XXX
     40 0.507593395    api.github.com        vulture.lan           HTTP     75     HTTP/1.1 200 OK  (application/json)
     41 0.549680721    vulture.lan           api.github.com        TCP      66     55428 → https(443) [ACK] Seq=846 Ack=17527 Win=66176 Len=0 TSval=2499323913 TSecr=2535678660
```

5) I noticed that adding `(sit-for 0.5)` in various places throughout the
codepath would consistently make the bug go away. For example in
`open-network-stream` after opening the stream, or at the beginning of
`url-http-create-request`.

6) I couldn't find information online indicating that github changed
anything in their network request handling that broke other tools.

I am using Emacs 26.1:

In GNU Emacs 26.1 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.22.30)
 of 2018-07-05 built on juergen
Windowing system distributor 'The X.Org Foundation', version 11.0.12003000
System Description: Manjaro Linux
Reply | Threaded
Open this post in threaded view
|

bug#34341: Seems fixed in latest master 5259fdf96073febd18e83785960c443bdf02a310

Laurent Charignon
I built the latest master I could find in git (5259fdf96073febd18e83785960c443bdf02a31) and the bug seems fixed there!

From bisecting, I know that the fix is in between 191a5568db (working) and 2a3bd6798e96708 (not working). It is hard to pinpoint further due to segfaults that were introduced then fixed in this range. My best guess is that "Fix unlikely races with GnuTLS, datagrams" (e87e6a24c4) resolved this.
Reply | Threaded
Open this post in threaded view
|

bug#34341: more observations

Mikhail Rudenko
In reply to this post by Laurent Charignon
I've also hit this bug, here's a couple of observations:

(1) Looks like the version of gnutls affects the behavior of url-retrieve-synchronously.
My conclusion is that the bug manifests with gnutls >= 3.6. I confirmed this on Arch with
emacs 26.1 and on Ubuntu bionic/cosmic with emacs 25.2

(2) Weird workaround, which works for me: add (setq gnutls-log-level 1) to the end of init.el