Pod receives 500 internal server error from remote pod when fetching track with authentication, but fetching track with authentication stripped off works
Steps to reproduce
- Set up Funkwhale 1.2.8 with the multi-container setup
- Fix Docker outbound IPv6 by setting up /etc/docker/daemon.json like this:
  {
    "ipv6": true,
    "fixed-cidr-v6": "fec0:1234:ffff:0000:0000:0000:0000:0000/64",
    "experimental": true,
    "ip6tables": true
  }
  and by adding an IPv6 network to the docker-compose file, with ::1 as the gateway, and putting the API server on it. This is needed because the pod I am testing against is only actually reachable over IPv6 today.
- Follow this public library: https://zik.canevas.eu/federation/music/libraries/266b479b-6146-4cb5-9d5e-57af7d8a8768
- Try to play the local version of this track: https://zik.canevas.eu/library/tracks/23131
What happens?
On my pod at least, the track will not play.
When my instance fetches https://zik.canevas.eu/api/v1/listen/f8bd566b-e796-46b2-86e6-39b5f78a165d/?upload=058dfcde-f374-4376-8a63-29095dc74553&download=false and sends auth information for my user, it receives an internal server error (500) back from the remote instance. The response does not appear to have a body I can dump, but I'm not certain of that.
I added some extra logging to the track fetching code, and a fallback to fetch with auth unset, and got this log:
fw-api_1 | 2023-01-25 17:24:05,309 funkwhale_api.music.views INFO Fetching remote track https://zik.canevas.eu/api/v1/listen/2a3028b2-a7bf-4125-89b8-8cfeb09f823b/?upload=6fb28bd0-345f-4fc2-a779-48ced606e88f&download=false as interfect@music.novak.network
fw-api_1 | 2023-01-25 17:24:38,182 plugins DEBUG Calling handlers for filter urls
fw-nginx_1 | 172.16.238.4 - - [25/Jan/2023:17:24:39 +0000] "GET /federation/actors/interfect HTTP/1.1" 200 1626 "-" "python-requests (funkwhale/1.2.5; +https://zik.canevas.eu)" rt="1.092" uct="0.001" uht="1.089" urt="1.092"
fw-api_1 | 2023-01-25 17:24:39,695 funkwhale_api.music.models ERROR Error fetching content. Retrying without auth
fw-api_1 | Traceback (most recent call last):
fw-api_1 | File "/app/funkwhale_api/music/models.py", line 812, in download_audio_from_remote
fw-api_1 | remote_response.raise_for_status()
fw-api_1 | File "/venv/lib/python3.8/site-packages/requests/models.py", line 953, in raise_for_status
fw-api_1 | raise HTTPError(http_error_msg, response=self)
fw-api_1 | requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://zik.canevas.eu/api/v1/listen/2a3028b2-a7bf-4125-89b8-8cfeb09f823b/?upload=6fb28bd0-345f-4fc2-a779-48ced606e88f&download=false
When I modify Upload.download_audio_from_remote() in funkwhale_api/music/models.py to first send the request without authentication (auth=None), and only actually pass along auth to session.get_session().get() if the anonymous call fails, the remote server services the request, and manages it in time for my local pod to accept it and fulfil my browser's request.
What is expected?
When the remote server returns a 500 error to a track fetch attempt, I want to see in my pod's log any other information associated with the error.
If the remote server serves a track properly without authentication information on the request, it should not 500 when authentication information is added. If my user is banned or something (which seems unlikely, but I suppose is possible), it should send a 403 Forbidden.
To debug this, I ought to be able to dump the actual web requests that Funkwhale is sending, ideally as curl commands, so I can reproduce the 500 error from the remote pod without my pod actually being involved.
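As a rough illustration of the kind of dump I mean, here is a minimal sketch that turns a request's method, URL, and headers into a curl command line. The helper name to_curl is hypothetical and nothing like it exists in Funkwhale as far as I know. With the requests library, the inputs could be taken from response.request.method, response.request.url, and response.request.headers. I would also assume that a signed request has to be replayed soon after it is captured, since the signature presumably covers the Date header.

```python
import shlex


def to_curl(method, url, headers, body=None):
    """Render an HTTP request as a shell-safe curl command (hypothetical helper)."""
    # -i prints the response status line and headers along with the body.
    parts = ["curl", "-i", "-X", shlex.quote(method), shlex.quote(url)]
    for name, value in headers.items():
        parts += ["-H", shlex.quote("{}: {}".format(name, value))]
    if body:
        if isinstance(body, bytes):
            body = body.decode("utf-8", "replace")
        parts += ["--data-binary", shlex.quote(body)]
    return " ".join(parts)


if __name__ == "__main__":
    # Example values only; a real dump would include the Signature header.
    print(to_curl("GET", "https://example.org/api/v1/listen/x/?download=false",
                  {"Content-Type": "application/octet-stream"}))
```

Logging this string at DEBUG level just before the request is sent would let the failing call be replayed from any machine that can reach the remote pod.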
Context
Funkwhale version(s) affected: 1.2.8
I feel like this might have something to do with my problems getting /federation/actors to work properly. My authenticated request prompts a remote fetch of /federation/actors/<me>, it times out, remote returns 500, this unblocks some kind of incoming connection limit on my Funkwhale, and then nginx actually manages to serve the /federation/actors/<me> request (too late).
My hack (anonymous first) in models.py:
def download_audio_from_remote(self, actor):
    from funkwhale_api.federation import signing

    if actor:
        auth = signing.get_auth(actor.private_key, actor.private_key_id)
    else:
        auth = None

    remote_response = session.get_session().get(
        self.source,
        auth=None,
        stream=True,
        timeout=20,
        headers={"Content-Type": "application/octet-stream"},
    )
    with remote_response as r:
        try:
            remote_response.raise_for_status()
        except:
            logger.exception('Error fetching content. Retrying with auth')
            remote_response = session.get_session().get(
                self.source,
                auth=auth,
                stream=True,
                timeout=20,
                headers={"Content-Type": "application/octet-stream"},
            )
            with remote_response as r:
                extension = utils.get_ext_from_type(self.mimetype)
                title_parts = []
                title_parts.append(self.track.title)
                if self.track.album:
                    title_parts.append(self.track.album.title)
                title_parts.append(self.track.artist.name)
                title = " - ".join(title_parts)
                filename = "{}.{}".format(title, extension)
                tmp_file = tempfile.TemporaryFile()
                for chunk in r.iter_content(chunk_size=512):
                    tmp_file.write(chunk)
                self.audio_file.save(filename, tmp_file, save=False)
                self.save(update_fields=["audio_file"])
                return
        extension = utils.get_ext_from_type(self.mimetype)
        title_parts = []
        title_parts.append(self.track.title)
        if self.track.album:
            title_parts.append(self.track.album.title)
        title_parts.append(self.track.artist.name)
        title = " - ".join(title_parts)
        filename = "{}.{}".format(title, extension)
        tmp_file = tempfile.TemporaryFile()
        for chunk in r.iter_content(chunk_size=512):
            tmp_file.write(chunk)
        self.audio_file.save(filename, tmp_file, save=False)
        self.save(update_fields=["audio_file"])
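The hack above duplicates the file-saving code in both branches and never checks the status of the retried response. A more compact way to express the same "try anonymous first, then signed" order might look like the sketch below. The names fetch_with_fallback and fetch are hypothetical; in the hack, fetch would wrap the session.get_session().get(self.source, auth=auth, ...) call.

```python
def fetch_with_fallback(fetch, auth_candidates):
    """Call fetch(auth) for each candidate in order; return the first OK response.

    `fetch` is any callable taking an auth object (or None) and returning an
    object with a raise_for_status() method, such as a requests response.
    """
    last_error = None
    for auth in auth_candidates:
        try:
            response = fetch(auth)
            response.raise_for_status()
            return response
        except Exception as error:  # remember the failure, try the next candidate
            last_error = error
    if last_error is None:
        raise ValueError("no auth candidates given")
    raise last_error
```

Called as fetch_with_fallback(fetch, [None, auth]), this tries the anonymous request first and the signed one second, and the file-saving code only needs to appear once, after the call.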
And in views.py for debugging:
@record_downloads
def handle_serve(
    upload, user, format=None, max_bitrate=None, proxy_media=True, download=True
):
    f = upload
    # we update the accessed_date
    now = timezone.now()
    upload.accessed_date = now
    upload.save(update_fields=["accessed_date"])
    f = upload
    if f.audio_file:
        file_path = get_file_path(f.audio_file)
    elif f.source and (
        f.source.startswith("http://") or f.source.startswith("https://")
    ):
        # we need to populate from cache
        with transaction.atomic():
            # why the transaction/select_for_update?
            # this is because browsers may send multiple requests
            # in a short time range, for partial content,
            # thus resulting in multiple downloads from the remote
            qs = f.__class__.objects.select_for_update()
            f = qs.get(pk=f.pk)
            if user.is_authenticated:
                actor = user.actor
            else:
                actor = actors.get_service_actor()
            try:
                logger.info("Fetching remote track %s as %s", f.source, actor)
                f.download_audio_from_remote(actor=actor)
            except requests.exceptions.RequestException as e:
                body = "<no body>"
                if hasattr(e, 'response') and hasattr(e.response, 'body'):
                    body = e.response.body
                logger.exception("Fetching remote track %s as %s failed: %s", f.source, actor, body)
                return Response({"detail": "Remote track {} is unavailable to actor {}: {} with body {}".format(f.source, actor, e, body)}, status=503)
            data = f.get_audio_data()
            if data:
                f.duration = data["duration"]
                f.size = data["size"]
                f.bitrate = data["bitrate"]
                f.save(update_fields=["bitrate", "duration", "size"])
        file_path = get_file_path(f.audio_file)
    elif f.source and f.source.startswith("file://"):
        file_path = get_file_path(f.source.replace("file://", "", 1))
    mt = f.mimetype
    if should_transcode(f, format, max_bitrate=max_bitrate):
        transcoded_version = f.get_transcoded_version(format, max_bitrate=max_bitrate)
        transcoded_version.accessed_date = now
        transcoded_version.save(update_fields=["accessed_date"])
        f = transcoded_version
        file_path = get_file_path(f.audio_file)
        mt = f.mimetype
    if not proxy_media and f.audio_file:
        # we simply issue a 302 redirect to the real URL
        response = Response(status=302)
        response["Location"] = f.audio_file.url
        return response
    if mt:
        response = Response(content_type=mt)
    else:
        response = Response()
    filename = f.filename
    mapping = {"nginx": "X-Accel-Redirect", "apache2": "X-Sendfile"}
    file_header = mapping[settings.REVERSE_PROXY_TYPE]
    response[file_header] = file_path
    if download:
        response["Content-Disposition"] = get_content_disposition(filename)
    if mt:
        response["Content-Type"] = mt
    return response
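One caveat about the debugging code above: as far as I can tell, a requests Response object has no body attribute (the payload is exposed as text and content), so the hasattr check would always fall through to "<no body>". That may be why the remote error body never showed up in my log. A minimal sketch of a helper that reads the body through text instead is below; the name describe_failed_response is hypothetical. Note that a requests Response is falsy for 4xx/5xx statuses, so the sketch compares against None rather than relying on truthiness.

```python
def describe_failed_response(exc, limit=2000):
    """Summarise the HTTP response attached to a request exception, if any."""
    response = getattr(exc, "response", None)
    # Compare with None explicitly: an error response evaluates as False.
    if response is None:
        return "<no response>"
    text = getattr(response, "text", "") or "<empty body>"
    # Truncate so a large error page does not flood the log.
    return "HTTP {}: {}".format(response.status_code, text[:limit])
```

In the except block, body = describe_failed_response(e) would replace the hasattr checks.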