Cannot download video from udemy #1164

Open
tiendungitd opened this issue Oct 4, 2021 · 28 comments
Labels
account-needed Account details are needed to test/fix this can-share-account Someone is willing to provide account details for development site-bug Issue with a specific website

Comments

@tiendungitd

tiendungitd commented Oct 4, 2021

Hi there,
I get a 403 error when downloading a course from Udemy Business. I used this command:
yt-dlp -u udemy@abc.com -p password -P '~/Downloads' -o '%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s' https://abc.udemy.com/course/ielts-vocab-builder-002/
I'm sure that the username and password in the command are correct. Please see the log:

[debug] Command-line config: ['-u', 'PRIVATE', '-p', 'PRIVATE', '-P', "'~/Downloads'", '-o', "'%(playlist)s/%(chapter_number)s", '-', "%(chapter)s/%(title)s.%(ext)s'", 'https://abc.udemy.com/course/ielts-vocab-builder-002/', '-v']
[debug] Encodings: locale cp1252, fs utf-8, out utf-8, pref cp1252
[debug] yt-dlp version 2021.09.25 (exe)
[debug] Python version 3.8.10 (CPython 64bit) - Windows-10-10.0.18363-SP0
[debug] exe versions: none
[debug] Optional libraries: Crypto, mutagen, sqlite, websockets
[debug] Proxy map: {}
[debug] [generic] Extracting URL: -
ERROR: [generic] '-' is not a valid URL. Set --default-search "ytsearch" (or run  yt-dlp "ytsearch:-" ) to search YouTube
Traceback (most recent call last):
  File "yt_dlp\extractor\common.py", line 585, in extract
  File "yt_dlp\extractor\generic.py", line 2490, in _real_extract
yt_dlp.utils.ExtractorError: '-' is not a valid URL. Set --default-search "ytsearch" (or run  yt-dlp "ytsearch:-" ) to search YouTube
Traceback (most recent call last):
  File "yt_dlp\extractor\common.py", line 585, in extract
  File "yt_dlp\extractor\generic.py", line 2490, in _real_extract
yt_dlp.utils.ExtractorError: '-' is not a valid URL. Set --default-search "ytsearch" (or run  yt-dlp "ytsearch:-" ) to search YouTube

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "yt_dlp\YoutubeDL.py", line 1227, in wrapper
  File "yt_dlp\YoutubeDL.py", line 1252, in __extract_info
  File "yt_dlp\extractor\common.py", line 601, in extract
yt_dlp.utils.ExtractorError: [generic] '-' is not a valid URL. Set --default-search "ytsearch" (or run  yt-dlp "ytsearch:-" ) to search YouTube

[debug] [generic] Extracting URL: %(chapter)s/%(title)s.%(ext)s'
ERROR: [generic] "%(chapter)s/%(title)s.%(ext)s'" is not a valid URL. Set --default-search "ytsearch" (or run  yt-dlp "ytsearch:%(chapter)s/%(title)s.%(ext)s'" ) to search YouTube
Traceback (most recent call last):
  File "yt_dlp\extractor\common.py", line 585, in extract
  File "yt_dlp\extractor\generic.py", line 2490, in _real_extract
yt_dlp.utils.ExtractorError: "%(chapter)s/%(title)s.%(ext)s'" is not a valid URL. Set --default-search "ytsearch" (or run  yt-dlp "ytsearch:%(chapter)s/%(title)s.%(ext)s'" ) to search YouTube
Traceback (most recent call last):
  File "yt_dlp\extractor\common.py", line 585, in extract
  File "yt_dlp\extractor\generic.py", line 2490, in _real_extract
yt_dlp.utils.ExtractorError: "%(chapter)s/%(title)s.%(ext)s'" is not a valid URL. Set --default-search "ytsearch" (or run  yt-dlp "ytsearch:%(chapter)s/%(title)s.%(ext)s'" ) to search YouTube

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "yt_dlp\YoutubeDL.py", line 1227, in wrapper
  File "yt_dlp\YoutubeDL.py", line 1252, in __extract_info
  File "yt_dlp\extractor\common.py", line 601, in extract
yt_dlp.utils.ExtractorError: [generic] "%(chapter)s/%(title)s.%(ext)s'" is not a valid URL. Set --default-search "ytsearch" (or run  yt-dlp "ytsearch:%(chapter)s/%(title)s.%(ext)s'" ) to search YouTube

[udemy:course] Downloading login popup
ERROR: [udemy:course] course: Unable to download webpage: HTTP Error 403: Forbidden (caused by <HTTPError 403: 'Forbidden'>); please report this issue on  https://github.com/yt-dlp/yt-dlp . Make sure you are using the latest version; see  https://github.com/yt-dlp/yt-dlp  on how to update. Be sure to call yt-dlp with the --verbose flag and include its complete output.
  File "yt_dlp\extractor\common.py", line 694, in _request_webpage
  File "yt_dlp\YoutubeDL.py", line 3256, in urlopen
  File "urllib\request.py", line 531, in open
  File "urllib\request.py", line 640, in http_response
  File "urllib\request.py", line 569, in error
  File "urllib\request.py", line 502, in _call_chain
  File "urllib\request.py", line 649, in http_error_default
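
A side note on the first two errors in this log, which are separate from the 403: the "Command-line config" line shows the -o template arriving as three arguments, with the single quotes still attached. That is what you would expect if the shell treats only double quotes as grouping characters, as the Windows command prompt does. The sketch below is a simplified model of that splitting, not the real Windows parsing rules; the function name is hypothetical and backslash escapes are ignored.

```python
def split_double_quotes_only(cmdline):
    """Split a command line, treating only double quotes as grouping characters.

    Simplified model (assumption): backslash escapes and other special cases
    of real Windows argument parsing are ignored.
    """
    args, current, in_quotes, has_token = [], [], False, False
    for ch in cmdline:
        if ch == '"':
            # Double quotes toggle grouping and are not kept in the argument.
            in_quotes = not in_quotes
            has_token = True
        elif ch.isspace() and not in_quotes:
            # Unquoted whitespace ends the current argument.
            if has_token:
                args.append("".join(current))
                current, has_token = [], False
        else:
            # Everything else, including single quotes, is an ordinary character.
            current.append(ch)
            has_token = True
    if has_token:
        args.append("".join(current))
    return args


TEMPLATE = "%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s"
URL = "https://abc.udemy.com/course/ielts-vocab-builder-002/"

single_quoted = split_double_quotes_only(f"-P '~/Downloads' -o '{TEMPLATE}' {URL}")
double_quoted = split_double_quotes_only(f'-P "~/Downloads" -o "{TEMPLATE}" {URL}')

print(single_quoted)  # template split into three arguments, quotes kept
print(double_quoted)  # template kept as one argument
```

With double quotes the template stays in one piece, so the stray "-" and the template fragment would no longer be passed to yt-dlp as URLs. This does not address the 403 itself.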
@tiendungitd tiendungitd added the question Question label Oct 4, 2021
@pukkandan pukkandan added account-needed Account details are needed to test/fix this and removed question Question account-needed Account details are needed to test/fix this labels Oct 5, 2021
@pukkandan pukkandan mentioned this issue Oct 23, 2021
7 tasks
@pukkandan pukkandan added the site-bug Issue with a specific website label Oct 23, 2021
@pukkandan pukkandan added the can-share-account Someone is willing to provide account details for development label Dec 11, 2021
@antoine-iut

Account shared.
Answer from pukkandan: Cloudflare captcha is causing the issue. Will need to investigate further to find a solution

@pukkandan
Member

No, nobody has a solution yet. If/when there is any update, you'll see a commit/PR

@ThisLimn0

ThisLimn0 commented Apr 26, 2022

Wouldn't it be possible to forward the CAPTCHA to a browser, as JDownloader does?

@johnroyer

Same issue with version 2022.05.18:

$ yt-dlp --verbose -u 'xxx@gmail.com' -p 'secret' -P './' -o'%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s' 'https://www.udemy.com/course/docker-kubernetes-the-practical-guide/'
[debug] Command-line config: ['--verbose', '-u', 'PRIVATE', '-p', 'PRIVATE', '-P', './', '-o', '%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s', 'https://www.udemy.com/course/docker-kubernetes-the-practical-guide/']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.05.18 [b14d52355] (source)
[debug] Lazy loading extractors is disabled
[debug] Plugins: ['SamplePluginIE', 'SamplePluginPP']
[debug] Git HEAD: 926ccc84e
[debug] Python version 3.8.10 (CPython 64bit) - Linux-5.4.0-1058-raspi-aarch64-with-glibc2.29
[debug] Checking exe version: ffprobe -bsfs
[debug] Checking exe version: ffmpeg -bsfs
[debug] exe versions: ffmpeg 4.2.7, ffprobe 4.2.7
[debug] Optional libraries: Cryptodome-3.14.1, brotli-1.0.9, certifi-2019.11.28, mutagen-1.45.1, secretstorage-2.3.1, sqlite3-2.6.0, websockets-10.2
[debug] Proxy map: {}
[udemy:course] Downloading login popup
ERROR: [udemy:course] course: Unable to download webpage: HTTP Error 403: Forbidden (caused by <HTTPError 403: 'Forbidden'>); please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issuetemplate. Confirm you are on the latest version using  yt-dlp -U
  File "/home/johnroyer/devel/yt-dlp/yt_dlp/extractor/common.py", line 640, in extract
    self.initialize()
  File "/home/johnroyer/devel/yt-dlp/yt_dlp/extractor/common.py", line 545, in initialize
    self._perform_login(username, password)
  File "/home/johnroyer/devel/yt-dlp/yt_dlp/extractor/udemy.py", line 169, in _perform_login
    login_popup = self._download_webpage(
  File "/home/johnroyer/devel/yt-dlp/yt_dlp/extractor/common.py", line 933, in _download_webpage
    res = self._download_webpage_handle(
  File "/home/johnroyer/devel/yt-dlp/yt_dlp/extractor/udemy.py", line 131, in _download_webpage_handle
    ret = super(UdemyIE, self)._download_webpage_handle(
  File "/home/johnroyer/devel/yt-dlp/yt_dlp/extractor/common.py", line 801, in _download_webpage_handle
    urlh = self._request_webpage(url_or_request, video_id, note, errnote, fatal, data=data, headers=headers, query=query, expected_status=expected_status)
  File "/home/johnroyer/devel/yt-dlp/yt_dlp/extractor/common.py", line 786, in _request_webpage
    raise ExtractorError(errmsg, cause=err)

  File "/home/johnroyer/devel/yt-dlp/yt_dlp/extractor/common.py", line 768, in _request_webpage
    return self._downloader.urlopen(url_or_request)
  File "/home/johnroyer/devel/yt-dlp/yt_dlp/YoutubeDL.py", line 3596, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "/usr/lib/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/lib/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.8/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

@hwiorn

hwiorn commented Jun 29, 2022

I want to watch Udemy videos in mpv for studying and note-taking.
I tested 2022.06.29 and found some issues and workarounds.

  1. The login check logic needs to be updated in udemy.py:L168-L171
        def is_logged(webpage):
            return any(re.search(p, webpage) for p in (
                r'href=["\'](?:https://www\.udemy\.com)?/user/logout/',
                r'>Logout<',
                r'"is_authenticated":true',  # added
                fr'"email":"{re.escape(username)}"'))  # added; username is escaped so it matches literally

The Udemy login popup has changed, so this check needs to be updated.
But after a few successful logins, I could no longer log in. Udemy just returned the login popup HTML.
I don't know why. Maybe a restriction on Udemy's side?

  2. The Cloudflare CAPTCHA page has changed. yt-dlp needs to be updated in udemy.py:L132-L139
        if any(p in webpage for p in (
                '>Please verify you are a human',
                'Access to this page has been denied because we believe you are using automation tools to browse the website',
                '"_pxCaptcha"',
                'cf-captcha-container')):  # added
            raise ExtractorError(
                'Udemy asks you to solve a CAPTCHA. Login with browser, '
                'solve CAPTCHA, then export cookies and pass cookie file to '
                'yt-dlp with --cookies.', expected=True)
  3. The course ID is wrong in udemy.py:L205
    def _real_extract(self, url):
        lecture_id = self._match_id(url)

        webpage = self._download_webpage(url, lecture_id)

        course_id, _ = self._extract_course_info(webpage, lecture_id)
        # course_id = "3833504"  # If I pass the correct course ID, it gets the correct lecture info.

        try:
            lecture = self._download_lecture(course_id, lecture_id)

_real_extract passes the wrong course_id to _download_lecture, so it always gets 403 Forbidden.
If course_id is correct, yt-dlp downloads the video properly.
I wanted to make a PR, but I couldn't fix issue 3 because I couldn't pass the playlist URL, which has the real course_id, to _real_extract.
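
For anyone who wants to try the first two checks outside of yt-dlp, here is a minimal standalone sketch. The patterns and markers are the ones quoted above; the function signatures, the constant names, and the sample HTML fragments are hypothetical and are not yt-dlp's actual code. The username is passed through re.escape, on the assumption that it should be matched literally (an address containing "+" would otherwise be read as a regex quantifier).

```python
import re

# Patterns from the login check quoted above.
LOGIN_PATTERNS = (
    r'href=["\'](?:https://www\.udemy\.com)?/user/logout/',
    r'>Logout<',
    r'"is_authenticated":true',
)

# Markers from the CAPTCHA check quoted above.
CAPTCHA_MARKERS = (
    '>Please verify you are a human',
    'Access to this page has been denied because we believe you are '
    'using automation tools to browse the website',
    '"_pxCaptcha"',
    'cf-captcha-container',
)


def is_logged(webpage, username=None):
    """Return True if the page looks like it was served to a logged-in user."""
    patterns = list(LOGIN_PATTERNS)
    if username:
        # Escape the username so characters such as "+" and "." match literally.
        patterns.append(r'"email":"%s"' % re.escape(username))
    return any(re.search(p, webpage) for p in patterns)


def has_captcha(webpage):
    """Return True if the page contains one of the known CAPTCHA markers."""
    return any(marker in webpage for marker in CAPTCHA_MARKERS)


# Hypothetical page fragments, for illustration only.
print(is_logged('{"email":"a+b@example.com"}', 'a+b@example.com'))
print(has_captcha('<div class="cf-captcha-container"></div>'))
```

Saving a real response body to a file and running both functions on it is a quick way to tell whether a 403 page is a CAPTCHA challenge or something else.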

@ashishmahajansipl

ashishmahajansipl commented Jul 29, 2022

@hwiorn, I've added the above changes with the specific course ID, but it's still not working.

Here is the error I'm getting:

sm@Sndps-MacBook-Pro yt-dlp-master % yt-dlp -u xxxx -p xxxx -P '/Downloads' -o '%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s' https://www.udemy.com/course/learn-flutter-dart-to-build-ios-android-apps/ --verbose

[debug] Command-line config: ['-u', 'PRIVATE', '-p', 'PRIVATE', '-P', '/Downloads', '-o', '%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s', 'https://www.udemy.com/course/learn-flutter-dart-to-build-ios-android-apps/', '--verbose']

[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.07.18 [135f05e]
[debug] Python 3.9.13 (CPython 64bit) - macOS-12.4-x86_64-i386-64bit
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
[debug] exe versions: ffmpeg N-107363-gc6fdbe26ef (setts), ffprobe N-107363-gc6fdbe26ef
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[udemy:course] Downloading login popup
ERROR: [udemy:course] course: Unable to download webpage: HTTP Error 403: Forbidden (caused by <HTTPError 403: 'Forbidden'>); please report this issue on https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using yt-dlp -U

  File "/usr/local/Cellar/yt-dlp/2022.7.18/libexec/lib/python3.9/site-packages/yt_dlp/extractor/common.py", line 642, in extract
    self.initialize()
  File "/usr/local/Cellar/yt-dlp/2022.7.18/libexec/lib/python3.9/site-packages/yt_dlp/extractor/common.py", line 548, in initialize
    self._perform_login(username, password)
  File "/usr/local/Cellar/yt-dlp/2022.7.18/libexec/lib/python3.9/site-packages/yt_dlp/extractor/udemy.py", line 165, in _perform_login
    login_popup = self._download_webpage(
  File "/usr/local/Cellar/yt-dlp/2022.7.18/libexec/lib/python3.9/site-packages/yt_dlp/extractor/common.py", line 1053, in _download_webpage
    return self.__download_webpage(url_or_request, video_id, note, errnote, None, fatal, *args, **kwargs)
  File "/usr/local/Cellar/yt-dlp/2022.7.18/libexec/lib/python3.9/site-packages/yt_dlp/extractor/common.py", line 1004, in download_content
    res = getattr(self, download_handle.__name__)(url_or_request, video_id, **kwargs)
  File "/usr/local/Cellar/yt-dlp/2022.7.18/libexec/lib/python3.9/site-packages/yt_dlp/extractor/udemy.py", line 127, in _download_webpage_handle
    ret = super(UdemyIE, self)._download_webpage_handle(
  File "/usr/local/Cellar/yt-dlp/2022.7.18/libexec/lib/python3.9/site-packages/yt_dlp/extractor/common.py", line 838, in _download_webpage_handle
    urlh = self._request_webpage(url_or_request, video_id, note, errnote, fatal, data=data, headers=headers, query=query, expected_status=expected_status)
  File "/usr/local/Cellar/yt-dlp/2022.7.18/libexec/lib/python3.9/site-packages/yt_dlp/extractor/common.py", line 795, in _request_webpage
    raise ExtractorError(errmsg, cause=err)

  File "/usr/local/Cellar/yt-dlp/2022.7.18/libexec/lib/python3.9/site-packages/yt_dlp/extractor/common.py", line 777, in _request_webpage
    return self._downloader.urlopen(self._create_request(url_or_request, data, headers, query))
  File "/usr/local/Cellar/yt-dlp/2022.7.18/libexec/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py", line 3639, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "/usr/local/Cellar/python@3.9/3.9.13_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 523, in open
    response = meth(req, response)
  File "/usr/local/Cellar/python@3.9/3.9.13_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 632, in http_response
    response = self.parent.error(
  File "/usr/local/Cellar/python@3.9/3.9.13_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 561, in error
    return self._call_chain(*args)
  File "/usr/local/Cellar/python@3.9/3.9.13_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/usr/local/Cellar/python@3.9/3.9.13_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 641, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

@openfnord

Account shared.
Answer from pukkandan: Cloudflare captcha is causing the issue. Will need to investigate further to find a solution
So yt-dlp could run a browser in the background to solve the CAPTCHA and exchange the session cookies with that browser.

@openfnord

BTW: why would Cloudflare ask for a CAPTCHA when yt-dlp is used, but not when a web browser is used on the same machine with the same originating IP?

@vendelin8

@hwiorn I tried your method, but it doesn't seem to work. I'm not that into Python; I tried adding print('HERE----') before your changes, but the output didn't show up in the console, and I got the same 403 error. This is with the latest yt-dlp, 2022-09-01. Does your method still work for you?

@bashonly bashonly mentioned this issue Oct 28, 2022
8 tasks
@gamedazed

I was just playing around with this and found a simple work-around, namely --cookies-from-browser firefox

I don't know whether firefox specifically is a necessary argument, but I think the Cloudflare issue must be triggered by some difference between --cookies-from-browser and --cookies. It seems plausible that Udemy may be directly parsing the cookies file for the header lines:

# Netscape HTTP Cookie File
# This file is generated by yt-dlp.  Do not edit.
yt-dlp  --user-agent 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:105.0) Gecko/20100101 Firefox/105.0' --cookies-from-browser firefox -P . -o "%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s" "https://www.udemy.com/course/docker-mastery"
[Cookies] Extracting cookies from firefox
[Cookies] Extracted 1909 cookies from firefox
[udemy:course] Extracting URL: https://www.udemy.com/course/docker-mastery
[udemy:course] course: Downloading webpage
[udemy:course] 1035000: Downloading course curriculum
[download] Downloading playlist: 1035000
[udemy:course] Playlist 1035000: Downloading 170 items of 170
[download] Downloading item 1 of 170
[udemy] Extracting URL: https://www.udemy.com/course/learn/v4/t/lecture/32367182#__youtubedl_smuggle=%7B%22course_id%22%3A+%221035000%22%7D
[udemy] 32367182: Downloading lecture JSON
[udemy] 41522200: Downloading m3u8 information
[info] 41522200: Downloading 1 format(s): hls-3215
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 99
[download] Destination: 1035000\1 - Quick Start!\What is Docker in 2022? The Three Innovations.mp4
[download] 100% of  227.58MiB in 00:01:07 at 3.35MiB/s
[FixupM3u8] Fixing MPEG-TS in MP4 container of "1035000\1 - Quick Start!\What is Docker in 2022? The Three Innovations.mp4"
[download] Downloading item 2 of 170
[udemy] Extracting URL: https://www.udemy.com/course/learn/v4/t/lecture/32367184#__youtubedl_smuggle=%7B%22course_id%22%3A+%221035000%22%7D
[udemy] 32367184: Downloading lecture JSON
[udemy] 41522954: Downloading m3u8 information
[info] 41522954: Downloading 1 format(s): hls-2580
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 109
[...]

I have not tried to modify the netscape-formatted cookies file to see if it changes behavior, just a shot in the dark.
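
For people driving yt-dlp from Python rather than the command line, the command above can be sketched roughly as follows. It assumes the yt_dlp package is installed and uses the documented YoutubeDL options cookiesfrombrowser, paths and outtmpl; the helper names build_options and download_course are hypothetical, the --user-agent override is left out, and nothing here guarantees that the 403 goes away.

```python
def build_options(browser="firefox", download_dir="."):
    """Build YoutubeDL options mirroring the command-line flags used above."""
    return {
        # --cookies-from-browser firefox
        "cookiesfrombrowser": (browser,),
        # -P .
        "paths": {"home": download_dir},
        # -o "<template>"
        "outtmpl": "%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s",
    }


def download_course(url, **kwargs):
    """Download a course URL with the options above (requires yt-dlp)."""
    # Imported lazily so that build_options() can be inspected without yt-dlp.
    import yt_dlp

    with yt_dlp.YoutubeDL(build_options(**kwargs)) as ydl:
        return ydl.download([url])


print(build_options())
```

Calling download_course("https://www.udemy.com/course/docker-mastery") would then read cookies from the local Firefox profile, so you need to be logged in to Udemy in that browser first.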

@azec-pdx

The workaround that @gamedazed provided a few days ago does not work for me. Also, it seems that Udemy Business is using AWS CloudFront now. It stopped working all of a sudden around May 14th / May 15th (relative to sun).

@mahdisky

Has anyone found a solution?

@bashonly
Member

Possible workaround is to use --legacy-server-connect

@mahdisky

Possible workaround is to use --legacy-server-connect

This does not work.

@SacrilegeTx

Has anyone found a solution or working workaround?

@devsairam

I was hoping someone had found a solution for this. :)

@yt-dlp yt-dlp locked as spam and limited conversation to collaborators May 26, 2023