Details
-
Improvement
-
Resolution: Fixed
-
Major
-
3.1.2
-
None
-
Security Level: Public
-
None
-
CBG Sprint 145
-
1
Description
This can cause an error like:
Zipfile built: /tmp/sync-gateway-747784f75c-2mckq-2023-11-02T14:33:52+01:00.zip
Traceback (most recent call last):
  File "tasks.py", line 1090, in do_upload
  File "urllib/request.py", line 517, in open
  File "urllib/request.py", line 534, in _open
  File "urllib/request.py", line 494, in _call_chain
  File "urllib/request.py", line 1389, in https_open
  File "urllib/request.py", line 1346, in do_open
  File "http/client.py", line 1285, in request
  File "http/client.py", line 1331, in _send_request
  File "http/client.py", line 1280, in endheaders
  File "http/client.py", line 1079, in _send_output
  File "http/client.py", line 1001, in send
  File "ssl.py", line 1205, in sendall
  File "ssl.py", line 1174, in send
OverflowError: string longer than 2147483647 bytes
Zipfile deleted: /tmp/sync-gateway-747784f75c-2mckq-2023-11-02T14:33:52+01:00.zip
This code was introduced in https://github.com/couchbase/sync_gateway/pull/1991 / https://github.com/couchbase/sync_gateway/pull/1990
At this time, python3 APIs are available, so we shouldn't mmap the file but instead pass the open file object directly to urllib.request.Request, without reading the entire file.
Curiously, cbcollect_info still uses curl; presumably a version of curl is bundled on Windows in the Couchbase Server distribution? https://github.com/couchbase/ns_server/blob/1e82181a201f58995638769c3a3fc54751bc1fea/cbcollect_info#L2342
It isn't correct to say that it loads the entire file into memory: it mmaps the full size of the on-disk file into virtual memory, and page faults should bring data in and evict it as needed. I hypothesize that either there isn't enough memory on the machine to do this or the eviction doesn't happen fast enough. Regardless, it's easy enough to change in python3.
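The change proposed above could look roughly like the sketch below. It is not the actual sgcollect code: the function names, the use of PUT, and the Content-Type value are assumptions made for illustration. The idea is to hand the open file object to urllib.request.Request and set Content-Length from the file size, so that http.client reads and sends the body in small blocks instead of passing one buffer larger than 2147483647 bytes to the SSL socket.

```python
import os
import urllib.request


def build_upload_request(fileobj, upload_url):
    """Build a request whose body is streamed from an open binary file.

    Hypothetical helper: the PUT method and Content-Type are assumptions.
    """
    size = os.fstat(fileobj.fileno()).st_size
    return urllib.request.Request(
        upload_url,
        data=fileobj,  # file object: http.client reads and sends it block by block
        method="PUT",
        headers={
            # Explicit length, so urllib does not fall back to chunked encoding.
            "Content-Length": str(size),
            "Content-Type": "application/zip",
        },
    )


def stream_upload(zip_path, upload_url):
    """Upload zip_path without mmapping it or reading it fully into memory."""
    with open(zip_path, "rb") as f:
        req = build_upload_request(f, upload_url)
        with urllib.request.urlopen(req) as resp:
            return resp.status
```

If Content-Length is omitted, urllib sends a file-object body with Transfer-Encoding: chunked, which the receiving server may or may not accept, so setting the length explicitly seems the safer default here.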
Issue Links
- Clones
-
CBG-3591 sgcollect loads entire file into mmap for upload, can run out of memory
- Closed