Merge branch 'tiktok' of https://github.com/skyme5/youtube-dl into skyme5-tiktok

author Tom-Oliver Heidel <redacted>

Sat, 12 Sep 2020 03:49:52 +0000 (05:49 +0200)

committer Tom-Oliver Heidel <redacted>

Sat, 12 Sep 2020 03:49:52 +0000 (05:49 +0200)
diff --git a/.github/ISSUE_TEMPLATE/1_broken_site.md b/.github/ISSUE_TEMPLATE/1_broken_site.md
index 3a94bd621ff4c68eed4af3b1d7eb3764d6e85c35..f05aa66e67ac3b2625aa80028812c28a1ac108b4 100644
--- a/.github/ISSUE_TEMPLATE/1_broken_site.md
+++ b/.github/ISSUE_TEMPLATE/1_broken_site.md
@@ -18,7 +18,7 @@ ## Checklist
  
  <!--
  Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
-- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.11.28. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.06. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
  - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
  - Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
  - Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
@@ -26,7 +26,7 @@ ## Checklist
  -->
  
  - [ ] I'm reporting a broken site support
-- [ ] I've verified that I'm running youtube-dl version **2019.11.28**
+- [ ] I've verified that I'm running youtube-dl version **2020.09.06**
  - [ ] I've checked that all provided URLs are alive and playable in a browser
  - [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
  - [ ] I've searched the bugtracker for similar issues including closed ones
@@ -41,7 +41,7 @@ ## Verbose log
   [debug] User config: []
   [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
   [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
- [debug] youtube-dl version 2019.11.28
+ [debug] youtube-dl version 2020.09.06
   [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
   [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
   [debug] Proxy map: {}
diff --git a/.github/ISSUE_TEMPLATE/2_site_support_request.md b/.github/ISSUE_TEMPLATE/2_site_support_request.md
index 72bee12aa2cde7b5c434dfaab92175259beebec8..29beaf437eba4dd93a3b56485eaf42717a276b92 100644
--- a/.github/ISSUE_TEMPLATE/2_site_support_request.md
+++ b/.github/ISSUE_TEMPLATE/2_site_support_request.md
@@ -19,7 +19,7 @@ ## Checklist
  
  <!--
  Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
-- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.11.28. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.06. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
  - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
  - Make sure that site you are requesting is not dedicated to copyright infringement, see https://yt-dl.org/copyright-infringement. youtube-dl does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
  - Search the bugtracker for similar site support requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
@@ -27,7 +27,7 @@ ## Checklist
  -->
  
  - [ ] I'm reporting a new site support request
-- [ ] I've verified that I'm running youtube-dl version **2019.11.28**
+- [ ] I've verified that I'm running youtube-dl version **2020.09.06**
  - [ ] I've checked that all provided URLs are alive and playable in a browser
  - [ ] I've checked that none of provided URLs violate any copyrights
  - [ ] I've searched the bugtracker for similar site support requests including closed ones
diff --git a/.github/ISSUE_TEMPLATE/3_site_feature_request.md b/.github/ISSUE_TEMPLATE/3_site_feature_request.md
index ddf67e95183c09de5b25c8ee826d760118a2c2a0..f96b8d2bb504cc40fac267c9aef4d11ae0e46253 100644
--- a/.github/ISSUE_TEMPLATE/3_site_feature_request.md
+++ b/.github/ISSUE_TEMPLATE/3_site_feature_request.md
@@ -18,13 +18,13 @@ ## Checklist
  
  <!--
  Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
-- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.11.28. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.06. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
  - Search the bugtracker for similar site feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
  - Finally, put x into all relevant boxes (like this [x])
  -->
  
  - [ ] I'm reporting a site feature request
-- [ ] I've verified that I'm running youtube-dl version **2019.11.28**
+- [ ] I've verified that I'm running youtube-dl version **2020.09.06**
  - [ ] I've searched the bugtracker for similar site feature requests including closed ones
  
  
diff --git a/.github/ISSUE_TEMPLATE/4_bug_report.md b/.github/ISSUE_TEMPLATE/4_bug_report.md
index 7122e2714dd92fe4c9655c9022c0c76c123ce4d4..3a175aa4d00327cf4ec7e145eedf9f7b645b6283 100644
--- a/.github/ISSUE_TEMPLATE/4_bug_report.md
+++ b/.github/ISSUE_TEMPLATE/4_bug_report.md
@@ -18,7 +18,7 @@ ## Checklist
  
  <!--
  Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
-- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.11.28. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.06. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
  - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
  - Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
  - Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
@@ -27,7 +27,7 @@ ## Checklist
  -->
  
  - [ ] I'm reporting a broken site support issue
-- [ ] I've verified that I'm running youtube-dl version **2019.11.28**
+- [ ] I've verified that I'm running youtube-dl version **2020.09.06**
  - [ ] I've checked that all provided URLs are alive and playable in a browser
  - [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
  - [ ] I've searched the bugtracker for similar bug reports including closed ones
@@ -43,7 +43,7 @@ ## Verbose log
   [debug] User config: []
   [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
   [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
- [debug] youtube-dl version 2019.11.28
+ [debug] youtube-dl version 2020.09.06
   [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
   [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
   [debug] Proxy map: {}
diff --git a/.github/ISSUE_TEMPLATE/5_feature_request.md b/.github/ISSUE_TEMPLATE/5_feature_request.md
index a93882b39dace5577bb9bb58effe1086319e625c..4977079deb765e2ccfe6ffa83aae9795c30103c4 100644
--- a/.github/ISSUE_TEMPLATE/5_feature_request.md
+++ b/.github/ISSUE_TEMPLATE/5_feature_request.md
@@ -19,13 +19,13 @@ ## Checklist
  
  <!--
  Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
-- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.11.28. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.06. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
  - Search the bugtracker for similar feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
  - Finally, put x into all relevant boxes (like this [x])
  -->
  
  - [ ] I'm reporting a feature request
-- [ ] I've verified that I'm running youtube-dl version **2019.11.28**
+- [ ] I've verified that I'm running youtube-dl version **2020.09.06**
  - [ ] I've searched the bugtracker for similar feature requests including closed ones
  
  
diff --git a/.github/workflows/python-publish.yml b/.github/workflows/python-publish.yml
new file mode 100644
index 0000000..224a002
--- /dev/null
+++ b/.github/workflows/python-publish.yml
@@ -0,0 +1,33 @@
+# This workflows will upload a Python Package using Twine when a release is created
+# For more information see: https://help.github.com/en/actions/language-and-framework-guides/using-python-with-github-actions#publishing-to-package-registries
+
+name: Upload Python Package
+
+on:
+  push:
+    branches:
+      - release
+
+jobs:
+  deploy:
+
+    runs-on: ubuntu-latest
+
+    steps:
+    - uses: actions/checkout@v2
+    - name: Set up Python
+      uses: actions/setup-python@v2
+      with:
+        python-version: '3.x'
+    - name: Install dependencies
+      run: |
+        python -m pip install --upgrade pip
+        pip install setuptools wheel twine
+    - name: Build and publish
+      env:
+        TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
+        TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
+      run: |
+        rm -rf dist/*
+        python setup.py sdist bdist_wheel
+        twine upload dist/*
diff --git a/.gitignore b/.gitignore
index c4870a6baf478ed5526c76cf4088dabf09ecba6f..9d371d9978fe7fa089524604ae5d412c79e11ad8 100644
--- a/.gitignore
+++ b/.gitignore
@@ -11,12 +11,20 @@ dist/
  MANIFEST
  README.txt
  youtube-dl.1
+youtube-dlc.1
  youtube-dl.bash-completion
+youtube-dlc.bash-completion
  youtube-dl.fish
+youtube-dlc.fish
  youtube_dl/extractor/lazy_extractors.py
+youtube_dlc/extractor/lazy_extractors.py
  youtube-dl
+youtube-dlc
  youtube-dl.exe
+youtube-dlc.exe
  youtube-dl.tar.gz
+youtube-dlc.tar.gz
+youtube-dlc.spec
  .coverage
  cover/
  updates_key.pem
@@ -41,6 +49,7 @@ updates_key.pem
  test/local_parameters.json
  .tox
  youtube-dl.zsh
+youtube-dlc.zsh
  
  # IntelliJ related files
  .idea
diff --git a/.travis.yml b/.travis.yml
index 14d95fa84c105e3c310b2e0ecf984a7688bfb172..fb499845e4b652edf2661a724bb9f7e515d4f056 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -12,34 +12,27 @@ python:
  dist: trusty
  env:
    - YTDL_TEST_SET=core
-  - YTDL_TEST_SET=download
-matrix:
+jobs:
    include:
      - python: 3.7
        dist: xenial
        env: YTDL_TEST_SET=core
-    - python: 3.7
-      dist: xenial
-      env: YTDL_TEST_SET=download
      - python: 3.8
        dist: xenial
        env: YTDL_TEST_SET=core
-    - python: 3.8
-      dist: xenial
-      env: YTDL_TEST_SET=download
      - python: 3.8-dev
        dist: xenial
        env: YTDL_TEST_SET=core
-    - python: 3.8-dev
-      dist: xenial
-      env: YTDL_TEST_SET=download
      - env: JYTHON=true; YTDL_TEST_SET=core
-    - env: JYTHON=true; YTDL_TEST_SET=download
+    - name: flake8
+      python: 3.8
+      dist: xenial
+      install: pip install flake8
+      script: flake8 .
    fast_finish: true
    allow_failures:
      - env: YTDL_TEST_SET=download
      - env: JYTHON=true; YTDL_TEST_SET=core
-    - env: JYTHON=true; YTDL_TEST_SET=download
  before_install:
    - if [ "$JYTHON" == "true" ]; then ./devscripts/install_jython.sh; export PATH="$HOME/jython/bin:$PATH"; fi
  script: ./devscripts/run_tests.sh
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index ac759ddc4ee356adc2eb5081d8bdd4325a9d14ff..58ab3a4b8947d5dbadf5f8be1e4bb0868868afec 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -153,7 +153,7 @@ ### Adding support for a new site
  5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/extractors.py).
  6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in.
  7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/ytdl-org/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303). Add tests and code for as many as you want.
-8. Make sure your code follows [youtube-dl coding conventions](#youtube-dl-coding-conventions) and check the code with [flake8](http://flake8.pycqa.org/en/latest/index.html#quickstart):
+8. Make sure your code follows [youtube-dl coding conventions](#youtube-dl-coding-conventions) and check the code with [flake8](https://flake8.pycqa.org/en/latest/index.html#quickstart):
  
          $ flake8 youtube_dl/extractor/yourextractor.py
  
diff --git a/ChangeLog b/ChangeLog
index d2f17ee067c9215cdea2f6d1df5c30cee8a4b5ed..86b0e8ccbbd977f68e9ac7013e11d44c03b9fc33 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,10 +1,337 @@
-version <unreleased>
+version 2020.09.06
+
+Core
++ [utils] Recognize wav mimetype (#26463)
+
+Extractors
+* [nrktv:episode] Improve video id extraction (#25594, #26369, #26409)
+* [youtube] Fix age gate content detection (#26100, #26152, #26311, #26384)
+* [youtube:user] Extend URL regular expression (#26443)
+* [xhamster] Improve initials regular expression (#26526, #26353)
+* [svtplay] Fix video id extraction (#26425, #26428, #26438)
+* [twitch] Rework extractors (#12297, #20414, #20604, #21811, #21812, #22979,
+  #24263, #25010, #25553, #25606)
+    * Switch to GraphQL
+    + Add support for collections
+    + Add support for clips and collections playlists
+* [biqle] Improve video ext extraction
+* [xhamster] Fix extraction (#26157, #26254)
+* [xhamster] Extend URL regular expression (#25789, #25804, #25927))
+
+
+version 2020.07.28
+
+Extractors
+* [youtube] Fix sigfunc name extraction (#26134, #26135, #26136, #26137)
+* [youtube] Improve description extraction (#25937, #25980)
+* [wistia] Restrict embed regular expression (#25969)
+* [youtube] Prevent excess HTTP 301 (#25786)
++ [youtube:playlists] Extend URL regular expression (#25810)
++ [bellmedia] Add support for cp24.com clip URLs (#25764)
+* [brightcove] Improve embed detection (#25674)
+
+
+version 2020.06.16.1
+
+Extractors
+* [youtube] Force old layout (#25682, #25683, #25680, #25686)
+* [youtube] Fix categories and improve tags extraction
+
+
+version 2020.06.16
+
+Extractors
+* [youtube] Fix uploader id and uploader URL extraction
+* [youtube] Improve view count extraction
+* [youtube] Fix upload date extraction (#25677)
+* [youtube] Fix thumbnails extraction (#25676)
+* [youtube] Fix playlist and feed extraction (#25675)
++ [facebook] Add support for single-video ID links
++ [youtube] Extract chapters from JSON (#24819)
++ [kaltura] Add support for multiple embeds on a webpage (#25523)
+
+
+version 2020.06.06
+
+Extractors
+* [tele5] Bypass geo restriction
++ [jwplatform] Add support for bypass geo restriction
+* [tele5] Prefer jwplatform over nexx (#25533)
+* [twitch:stream] Expect 400 and 410 HTTP errors from API
+* [twitch:stream] Fix extraction (#25528)
+* [twitch] Fix thumbnails extraction (#25531)
++ [twitch] Pass v5 Accept HTTP header (#25531)
+* [brightcove] Fix subtitles extraction (#25540)
++ [malltv] Add support for sk.mall.tv (#25445)
+* [periscope] Fix untitled broadcasts (#25482)
+* [jwplatform] Improve embeds extraction (#25467)
+
+
+version 2020.05.29
+
+Core
+* [postprocessor/ffmpeg] Embed series metadata with --add-metadata
+* [utils] Fix file permissions in write_json_file (#12471, #25122)
+
+Extractors
+* [ard:beta] Extend URL regular expression (#25405)
++ [youtube] Add support for more invidious instances (#25417)
+* [giantbomb] Extend URL regular expression (#25222)
+* [ard] Improve URL regular expression (#25134, #25198)
+* [redtube] Improve formats extraction and extract m3u8 formats (#25311,
+  #25321)
+* [indavideo] Switch to HTTPS for API request (#25191)
+* [redtube] Improve title extraction (#25208)
+* [vimeo] Improve format extraction and sorting (#25285)
+* [soundcloud] Reduce API playlist page limit (#25274)
++ [youtube] Add support for yewtu.be (#25226)
+* [mailru] Fix extraction (#24530, #25239)
+* [bellator] Fix mgid extraction (#25195)
+
+
+version 2020.05.08
+
+Core
+* [downloader/http] Request last data block of exact remaining size
+* [downloader/http] Finish downloading once received data length matches
+  expected
+* [extractor/common] Use compat_cookiejar_Cookie for _set_cookie to always
+  ensure cookie name and value are bytestrings on python 2 (#23256, #24776)
++ [compat] Introduce compat_cookiejar_Cookie
+* [utils] Improve cookie files support
+    + Add support for UTF-8 in cookie files
+    * Skip malformed cookie file entries instead of crashing (invalid entry
+      length, invalid expires at)
+
+Extractors
+* [youtube] Improve signature cipher extraction (#25187, #25188)
+* [iprima] Improve extraction (#25138)
+* [uol] Fix extraction (#22007)
++ [orf] Add support for more radio stations (#24938, #24968)
+* [dailymotion] Fix typo
+- [puhutv] Remove no longer available HTTP formats (#25124)
+
+
+version 2020.05.03
+
+Core
++ [extractor/common] Extract multiple JSON-LD entries
+* [options] Clarify doc on --exec command (#19087, #24883)
+* [extractor/common] Skip malformed ISM manifest XMLs while extracting
+  ISM formats (#24667)
+
+Extractors
+* [crunchyroll] Fix and improve extraction (#25096, #25060)
+* [youtube] Improve player id extraction
+* [youtube] Use redirected video id if any (#25063)
+* [yahoo] Fix GYAO Player extraction and relax URL regular expression
+  (#24178, #24778)
+* [tvplay] Fix Viafree extraction (#15189, #24473, #24789)
+* [tenplay] Relax URL regular expression (#25001)
++ [prosiebensat1] Extract series metadata
+* [prosiebensat1] Improve extraction and remove 7tv.de support (#24948)
+- [prosiebensat1] Remove 7tv.de support (#24948)
+* [youtube] Fix DRM videos detection (#24736)
+* [thisoldhouse] Fix video id extraction (#24548, #24549)
++ [soundcloud] Extract AAC format (#19173, #24708)
+* [youtube] Skip broken multifeed videos (#24711)
+* [nova:embed] Fix extraction (#24700)
+* [motherless] Fix extraction (#24699)
+* [twitch:clips] Extend URL regular expression (#24290, #24642)
+* [tv4] Fix ISM formats extraction (#24667)
+* [tele5] Fix extraction (#24553)
++ [mofosex] Add support for generic embeds (#24633)
++ [youporn] Add support for generic embeds
++ [spankwire] Add support for generic embeds (#24633)
+* [spankwire] Fix extraction (#18924, #20648)
+
+
+version 2020.03.24
+
+Core
+- [utils] Revert support for cookie files with spaces used instead of tabs
+
+Extractors
+* [teachable] Update upskillcourses and gns3 domains
+* [generic] Look for teachable embeds before wistia
++ [teachable] Extract chapter metadata (#24421)
++ [bilibili] Add support for player.bilibili.com (#24402)
++ [bilibili] Add support for new URL schema with BV ids (#24439, #24442)
+* [limelight] Remove disabled API requests (#24255)
+* [soundcloud] Fix download URL extraction (#24394)
++ [cbc:watch] Add support for authentication (#19160)
+* [hellporno] Fix extraction (#24399)
+* [xtube] Fix formats extraction (#24348)
+* [ndr] Fix extraction (#24326)
+* [nhk] Update m3u8 URL and use native HLS downloader (#24329)
+- [nhk] Remove obsolete rtmp formats (#24329)
+* [nhk] Relax URL regular expression (#24329)
+- [vimeo] Revert fix showcase password protected video extraction (#24224)
+
+
+version 2020.03.08
+
+Core
++ [utils] Add support for cookie files with spaces used instead of tabs
+
+Extractors
++ [pornhub] Add support for pornhubpremium.com (#24288)
+- [youtube] Remove outdated code and unnecessary requests
+* [youtube] Improve extraction in 429 HTTP error conditions (#24283)
+* [nhk] Update API version (#24270)
+
+
+version 2020.03.06
+
+Extractors
+* [youtube] Fix age-gated videos support without login (#24248)
+* [vimeo] Fix showcase password protected video extraction (#24224)
+* [pornhub] Improve title extraction (#24184)
+* [peertube] Improve extraction (#23657)
++ [servus] Add support for new URL schema (#23475, #23583, #24142)
+* [vimeo] Fix subtitles URLs (#24209)
+
+
+version 2020.03.01
+
+Core
+* [YoutubeDL] Force redirect URL to unicode on python 2
+- [options] Remove duplicate short option -v for --version (#24162)
+
+Extractors
+* [xhamster] Fix extraction (#24205)
+* [franceculture] Fix extraction (#24204)
++ [telecinco] Add support for article opening videos
+* [telecinco] Fix extraction (#24195)
+* [xtube] Fix metadata extraction (#21073, #22455)
+* [youjizz] Fix extraction (#24181)
+- Remove no longer needed compat_str around geturl
+* [pornhd] Fix extraction (#24128)
++ [teachable] Add support for multiple videos per lecture (#24101)
++ [wistia] Add support for multiple generic embeds (#8347, 11385)
+* [imdb] Fix extraction (#23443)
+* [tv2dk:bornholm:play] Fix extraction (#24076)
+
+
+version 2020.02.16
+
+Core
+* [YoutubeDL] Fix playlist entry indexing with --playlist-items (#10591,
+  #10622)
+* [update] Fix updating via symlinks (#23991)
++ [compat] Introduce compat_realpath (#23991)
+
+Extractors
++ [npr] Add support for streams (#24042)
++ [24video] Add support for porn.24video.net (#23779, #23784)
+- [jpopsuki] Remove extractor (#23858)
+* [nova] Improve extraction (#23690)
+* [nova:embed] Improve (#23690)
+* [nova:embed] Fix extraction (#23672)
++ [abc:iview] Add support for 720p (#22907, #22921)
+* [nytimes] Improve format sorting (#24010)
++ [toggle] Add support for mewatch.sg (#23895, #23930)
+* [thisoldhouse] Fix extraction (#23951)
++ [popcorntimes] Add support for popcorntimes.tv (#23949)
+* [sportdeutschland] Update to new API
+* [twitch:stream] Lowercase channel id for stream request (#23917)
+* [tv5mondeplus] Fix extraction (#23907, #23911)
+* [tva] Relax URL regular expression (#23903)
+* [vimeo] Fix album extraction (#23864)
+* [viewlift] Improve extraction
+    * Fix extraction (#23851)
+    + Add support for authentication
+    + Add support for more domains
+* [svt] Fix series extraction (#22297)
+* [svt] Fix article extraction (#22897, #22919)
+* [soundcloud] Imporve private playlist/set tracks extraction (#3707)
+
+
+version 2020.01.24
+
+Extractors
+* [youtube] Fix sigfunc name extraction (#23819)
+* [stretchinternet] Fix extraction (#4319)
+* [voicerepublic] Fix extraction
+* [azmedien] Fix extraction (#23783)
+* [businessinsider] Fix jwplatform id extraction (#22929, #22954)
++ [24video] Add support for 24video.vip (#23753)
+* [ivi:compilation] Fix entries extraction (#23770)
+* [ard] Improve extraction (#23761)
+    * Simplify extraction
+    + Extract age limit and series
+    * Bypass geo-restriction
++ [nbc] Add support for nbc multi network URLs (#23049)
+* [americastestkitchen] Fix extraction
+* [zype] Improve extraction
+    + Extract subtitles (#21258)
+    + Support URLs with alternative keys/tokens (#21258)
+    + Extract more metadata
+* [orf:tvthek] Improve geo restricted videos detection (#23741)
+* [soundcloud] Restore previews extraction (#23739)
+
+
+version 2020.01.15
+
+Extractors
+* [yourporn] Fix extraction (#21645, #22255, #23459)
++ [canvas] Add support for new API endpoint (#17680, #18629)
+* [ndr:base:embed] Improve thumbnails extraction (#23731)
++ [vodplatform] Add support for embed.kwikmotion.com domain
++ [twitter] Add support for promo_video_website cards (#23711)
+* [orf:radio] Clean description and improve extraction
+* [orf:fm4] Fix extraction (#23599)
+* [safari] Fix kaltura session extraction (#23679, #23670)
+* [lego] Fix extraction and extract subtitle (#23687)
+* [cloudflarestream] Improve extraction
+    + Add support for bytehighway.net domain
+    + Add support for signed URLs
+    + Extract thumbnail
+* [naver] Improve extraction
+    * Improve geo-restriction handling
+    + Extract automatic captions
+    + Extract uploader metadata
+    + Extract VLive HLS formats
+    * Improve metadata extraction
+- [pandatv] Remove extractor (#23630)
+* [dctp] Fix format extraction (#23656)
++ [scrippsnetworks] Add support for www.discovery.com videos
+* [discovery] Fix anonymous token extraction (#23650)
+* [nrktv:seriebase] Fix extraction (#23625, #23537)
+* [wistia] Improve format extraction and extract subtitles (#22590)
+* [vice] Improve extraction (#23631)
+* [redtube] Detect private videos (#23518)
+
+
+version 2020.01.01
+
+Extractors
+* [brightcove] Invalidate policy key cache on failing requests
+* [pornhub] Improve locked videos detection (#22449, #22780)
++ [pornhub] Add support for m3u8 formats
+* [pornhub] Fix extraction (#22749, #23082)
+* [brightcove] Update policy key on failing requests
+* [spankbang] Improve removed video detection (#23423)
+* [spankbang] Fix extraction (#23307, #23423, #23444)
+* [soundcloud] Automatically update client id on failing requests
+* [prosiebensat1] Improve geo restriction handling (#23571)
+* [brightcove] Cache brightcove player policy keys
+* [teachable] Fail with error message if no video URL found
+* [teachable] Improve locked lessons detection (#23528)
++ [scrippsnetworks] Add support for Scripps Networks sites (#19857, #22981)
+* [mitele] Fix extraction (#21354, #23456)
+* [soundcloud] Update client id (#23516)
+* [mailru] Relax URL regular expressions (#23509)
+
+
+version 2019.12.25
  
  Core
  * [utils] Improve str_to_int
  + [downloader/hls] Add ability to override AES decryption key URL (#17521)
  
  Extractors
+* [mediaset] Fix parse formats (#23508)
  + [tv2dk:bornholm:play] Add support for play.tv2bornholm.dk (#23291)
  + [slideslive] Add support for url and vimeo service names (#23414)
  * [slideslive] Fix extraction (#23413)
diff --git a/MANIFEST.in b/MANIFEST.in
index 4e43e99f394dfc8506447ea4e8328467f5f6a8f5..d2cce9a1cc718446475b4a7e95919d41fbf70731 100644
--- a/MANIFEST.in
+++ b/MANIFEST.in
@@ -2,8 +2,8 @@ include README.md
  include LICENSE
  include AUTHORS
  include ChangeLog
-include youtube-dl.bash-completion
-include youtube-dl.fish
-include youtube-dl.1
+include youtube-dlc.bash-completion
+include youtube-dlc.fish
+include youtube-dlc.1
  recursive-include docs Makefile conf.py *.rst
  recursive-include test *
diff --git a/Makefile b/Makefile
index 3e17365b83d62cf320f70bc45189e78937ec89d3..9588657c15b0360f258a807f767ce22981218230 100644
--- a/Makefile
+++ b/Makefile
@@ -1,7 +1,7 @@
-all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites
+all: youtube-dlc README.md CONTRIBUTING.md README.txt youtube-dlc.1 youtube-dlc.bash-completion youtube-dlc.zsh youtube-dlc.fish supportedsites
  
  clean:
-       rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.ytdl *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.ape *.swf *.jpg *.png CONTRIBUTING.md.tmp youtube-dl youtube-dl.exe
+       rm -rf youtube-dlc.1.temp.md youtube-dlc.1 youtube-dlc.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dlc.tar.gz youtube-dlc.zsh youtube-dlc.fish youtube_dlc/extractor/lazy_extractors.py *.dump *.part* *.ytdl *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.ape *.swf *.jpg *.png CONTRIBUTING.md.tmp youtube-dlc youtube-dlc.exe
         find . -name "*.pyc" -delete
         find . -name "*.class" -delete
  
@@ -17,23 +17,23 @@ SYSCONFDIR = $(shell if [ $(PREFIX) = /usr -o $(PREFIX) = /usr/local ]; then ech
  # set markdown input format to "markdown-smart" for pandoc version 2 and to "markdown" for pandoc prior to version 2
  MARKDOWN = $(shell if [ `pandoc -v | head -n1 | cut -d" " -f2 | head -c1` = "2" ]; then echo markdown-smart; else echo markdown; fi)
  
-install: youtube-dl youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish
+install: youtube-dlc youtube-dlc.1 youtube-dlc.bash-completion youtube-dlc.zsh youtube-dlc.fish
         install -d $(DESTDIR)$(BINDIR)
-       install -m 755 youtube-dl $(DESTDIR)$(BINDIR)
+       install -m 755 youtube-dlc $(DESTDIR)$(BINDIR)
         install -d $(DESTDIR)$(MANDIR)/man1
-       install -m 644 youtube-dl.1 $(DESTDIR)$(MANDIR)/man1
+       install -m 644 youtube-dlc.1 $(DESTDIR)$(MANDIR)/man1
         install -d $(DESTDIR)$(SYSCONFDIR)/bash_completion.d
-       install -m 644 youtube-dl.bash-completion $(DESTDIR)$(SYSCONFDIR)/bash_completion.d/youtube-dl
+       install -m 644 youtube-dlc.bash-completion $(DESTDIR)$(SYSCONFDIR)/bash_completion.d/youtube-dlc
         install -d $(DESTDIR)$(SHAREDIR)/zsh/site-functions
-       install -m 644 youtube-dl.zsh $(DESTDIR)$(SHAREDIR)/zsh/site-functions/_youtube-dl
+       install -m 644 youtube-dlc.zsh $(DESTDIR)$(SHAREDIR)/zsh/site-functions/_youtube-dlc
         install -d $(DESTDIR)$(SYSCONFDIR)/fish/completions
-       install -m 644 youtube-dl.fish $(DESTDIR)$(SYSCONFDIR)/fish/completions/youtube-dl.fish
+       install -m 644 youtube-dlc.fish $(DESTDIR)$(SYSCONFDIR)/fish/completions/youtube-dlc.fish
  
  codetest:
         flake8 .
  
  test:
-       #nosetests --with-coverage --cover-package=youtube_dl --cover-html --verbose --processes 4 test
+       #nosetests --with-coverage --cover-package=youtube_dlc --cover-html --verbose --processes 4 test
         nosetests --verbose test
         $(MAKE) codetest
  
@@ -51,34 +51,34 @@ offlinetest: codetest
                 --exclude test_youtube_lists.py \
                 --exclude test_youtube_signature.py
  
-tar: youtube-dl.tar.gz
+tar: youtube-dlc.tar.gz
  
  .PHONY: all clean install test tar bash-completion pypi-files zsh-completion fish-completion ot offlinetest codetest supportedsites
  
-pypi-files: youtube-dl.bash-completion README.txt youtube-dl.1 youtube-dl.fish
+pypi-files: youtube-dlc.bash-completion README.txt youtube-dlc.1 youtube-dlc.fish
  
-youtube-dl: youtube_dl/*.py youtube_dl/*/*.py
+youtube-dlc: youtube_dlc/*.py youtube_dlc/*/*.py
         mkdir -p zip
-       for d in youtube_dl youtube_dl/downloader youtube_dl/extractor youtube_dl/postprocessor ; do \
+       for d in youtube_dlc youtube_dlc/downloader youtube_dlc/extractor youtube_dlc/postprocessor ; do \
           mkdir -p zip/$$d ;\
           cp -pPR $$d/*.py zip/$$d/ ;\
         done
-       touch -t 200001010101 zip/youtube_dl/*.py zip/youtube_dl/*/*.py
-       mv zip/youtube_dl/__main__.py zip/
-       cd zip ; zip -q ../youtube-dl youtube_dl/*.py youtube_dl/*/*.py __main__.py
+       touch -t 200001010101 zip/youtube_dlc/*.py zip/youtube_dlc/*/*.py
+       mv zip/youtube_dlc/__main__.py zip/
+       cd zip ; zip -q ../youtube-dlc youtube_dlc/*.py youtube_dlc/*/*.py __main__.py
         rm -rf zip
-       echo '#!$(PYTHON)' > youtube-dl
-       cat youtube-dl.zip >> youtube-dl
-       rm youtube-dl.zip
-       chmod a+x youtube-dl
+       echo '#!$(PYTHON)' > youtube-dlc
+       cat youtube-dlc.zip >> youtube-dlc
+       rm youtube-dlc.zip
+       chmod a+x youtube-dlc
  
-README.md: youtube_dl/*.py youtube_dl/*/*.py
-       COLUMNS=80 $(PYTHON) youtube_dl/__main__.py --help | $(PYTHON) devscripts/make_readme.py
+README.md: youtube_dlc/*.py youtube_dlc/*/*.py
+       COLUMNS=80 $(PYTHON) youtube_dlc/__main__.py --help | $(PYTHON) devscripts/make_readme.py
  
  CONTRIBUTING.md: README.md
         $(PYTHON) devscripts/make_contributing.py README.md CONTRIBUTING.md
  
-issuetemplates: devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/1_broken_site.md .github/ISSUE_TEMPLATE_tmpl/2_site_support_request.md .github/ISSUE_TEMPLATE_tmpl/3_site_feature_request.md .github/ISSUE_TEMPLATE_tmpl/4_bug_report.md .github/ISSUE_TEMPLATE_tmpl/5_feature_request.md youtube_dl/version.py
+issuetemplates: devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/1_broken_site.md .github/ISSUE_TEMPLATE_tmpl/2_site_support_request.md .github/ISSUE_TEMPLATE_tmpl/3_site_feature_request.md .github/ISSUE_TEMPLATE_tmpl/4_bug_report.md .github/ISSUE_TEMPLATE_tmpl/5_feature_request.md youtube_dlc/version.py
         $(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/1_broken_site.md .github/ISSUE_TEMPLATE/1_broken_site.md
         $(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/2_site_support_request.md .github/ISSUE_TEMPLATE/2_site_support_request.md
         $(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/3_site_feature_request.md .github/ISSUE_TEMPLATE/3_site_feature_request.md
@@ -91,34 +91,34 @@ supportedsites:
  README.txt: README.md
         pandoc -f $(MARKDOWN) -t plain README.md -o README.txt
  
-youtube-dl.1: README.md
-       $(PYTHON) devscripts/prepare_manpage.py youtube-dl.1.temp.md
-       pandoc -s -f $(MARKDOWN) -t man youtube-dl.1.temp.md -o youtube-dl.1
-       rm -f youtube-dl.1.temp.md
+youtube-dlc.1: README.md
+       $(PYTHON) devscripts/prepare_manpage.py youtube-dlc.1.temp.md
+       pandoc -s -f $(MARKDOWN) -t man youtube-dlc.1.temp.md -o youtube-dlc.1
+       rm -f youtube-dlc.1.temp.md
  
-youtube-dl.bash-completion: youtube_dl/*.py youtube_dl/*/*.py devscripts/bash-completion.in
+youtube-dlc.bash-completion: youtube_dlc/*.py youtube_dlc/*/*.py devscripts/bash-completion.in
         $(PYTHON) devscripts/bash-completion.py
  
-bash-completion: youtube-dl.bash-completion
+bash-completion: youtube-dlc.bash-completion
  
-youtube-dl.zsh: youtube_dl/*.py youtube_dl/*/*.py devscripts/zsh-completion.in
+youtube-dlc.zsh: youtube_dlc/*.py youtube_dlc/*/*.py devscripts/zsh-completion.in
         $(PYTHON) devscripts/zsh-completion.py
  
-zsh-completion: youtube-dl.zsh
+zsh-completion: youtube-dlc.zsh
  
-youtube-dl.fish: youtube_dl/*.py youtube_dl/*/*.py devscripts/fish-completion.in
+youtube-dlc.fish: youtube_dlc/*.py youtube_dlc/*/*.py devscripts/fish-completion.in
         $(PYTHON) devscripts/fish-completion.py
  
-fish-completion: youtube-dl.fish
+fish-completion: youtube-dlc.fish
  
-lazy-extractors: youtube_dl/extractor/lazy_extractors.py
+lazy-extractors: youtube_dlc/extractor/lazy_extractors.py
  
-_EXTRACTOR_FILES = $(shell find youtube_dl/extractor -iname '*.py' -and -not -iname 'lazy_extractors.py')
-youtube_dl/extractor/lazy_extractors.py: devscripts/make_lazy_extractors.py devscripts/lazy_load_template.py $(_EXTRACTOR_FILES)
+_EXTRACTOR_FILES = $(shell find youtube_dlc/extractor -iname '*.py' -and -not -iname 'lazy_extractors.py')
+youtube_dlc/extractor/lazy_extractors.py: devscripts/make_lazy_extractors.py devscripts/lazy_load_template.py $(_EXTRACTOR_FILES)
         $(PYTHON) devscripts/make_lazy_extractors.py $@
  
-youtube-dl.tar.gz: youtube-dl README.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish ChangeLog AUTHORS
-       @tar -czf youtube-dl.tar.gz --transform "s|^|youtube-dl/|" --owner 0 --group 0 \
+youtube-dlc.tar.gz: youtube-dlc README.md README.txt youtube-dlc.1 youtube-dlc.bash-completion youtube-dlc.zsh youtube-dlc.fish ChangeLog AUTHORS
+       @tar -czf youtube-dlc.tar.gz --transform "s|^|youtube-dlc/|" --owner 0 --group 0 \
                 --exclude '*.DS_Store' \
                 --exclude '*.kate-swp' \
                 --exclude '*.pyc' \
@@ -128,8 +128,8 @@ youtube-dl.tar.gz: youtube-dl README.md README.txt youtube-dl.1 youtube-dl.bash-
                 --exclude '.git' \
                 --exclude 'docs/_build' \
                 -- \
-               bin devscripts test youtube_dl docs \
+               bin devscripts test youtube_dlc docs \
                 ChangeLog AUTHORS LICENSE README.md README.txt \
-               Makefile MANIFEST.in youtube-dl.1 youtube-dl.bash-completion \
-               youtube-dl.zsh youtube-dl.fish setup.py setup.cfg \
-               youtube-dl
+               Makefile MANIFEST.in youtube-dlc.1 youtube-dlc.bash-completion \
+               youtube-dlc.zsh youtube-dlc.fish setup.py setup.cfg \
+               youtube-dlc
diff --git a/README.md b/README.md

index 01f975958c8370016a39c9f3fb872241c977c62b..7692982b62ee83b0c8d963679913ee6e058ade61 100644
--- a/README.md
+++ b/README.md
@@ -1,54 +1,60 @@
-[![Build Status](https://travis-ci.org/ytdl-org/youtube-dl.svg?branch=master)](https://travis-ci.org/ytdl-org/youtube-dl)
+[![PyPi](https://img.shields.io/pypi/v/youtube-dlc.svg)](https://pypi.org/project/youtube-dlc)
+[![Build Status](https://travis-ci.com/blackjack4494/youtube-dlc.svg?branch=master)](https://travis-ci.com/blackjack4494/youtube-dlc)
+[![Downloads](https://pepy.tech/badge/youtube-dlc)](https://pepy.tech/project/youtube-dlc)
  
-youtube-dl - download videos from youtube.com or other video platforms
+[![Gitter chat](https://badges.gitter.im/youtube-dlc/gitter.png)](https://gitter.im/youtube-dlc) 
+[![License: Unlicense](https://img.shields.io/badge/license-Unlicense-blue.svg)](https://github.com/blackjack4494/youtube-dlc/blob/master/LICENSE)
+
+youtube-dlc - download videos from youtube.com or other video platforms
  
  - [INSTALLATION](#installation)
  - [DESCRIPTION](#description)
  - [OPTIONS](#options)
-- [CONFIGURATION](#configuration)
-- [OUTPUT TEMPLATE](#output-template)
-- [FORMAT SELECTION](#format-selection)
-- [VIDEO SELECTION](#video-selection)
-- [FAQ](#faq)
-- [DEVELOPER INSTRUCTIONS](#developer-instructions)
-- [EMBEDDING YOUTUBE-DL](#embedding-youtube-dl)
-- [BUGS](#bugs)
  - [COPYRIGHT](#copyright)
  
  # INSTALLATION
  
-To install it right away for all UNIX users (Linux, macOS, etc.), type:
+**All Platforms**  
+Preferred way using pip:  
+You may want to use `python3` instead of `python`
  
-    sudo curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl
-    sudo chmod a+rx /usr/local/bin/youtube-dl
+    python -m pip install --upgrade youtube-dlc
  
-If you do not have curl, you can alternatively use a recent wget:
+**UNIX** (Linux, macOS, etc.)  
+Using wget:
  
-    sudo wget https://yt-dl.org/downloads/latest/youtube-dl -O /usr/local/bin/youtube-dl
-    sudo chmod a+rx /usr/local/bin/youtube-dl
+    sudo wget https://github.com/blackjack4494/youtube-dlc/releases/latest/download/youtube-dlc -O /usr/local/bin/youtube-dlc
+    sudo chmod a+rx /usr/local/bin/youtube-dlc
  
-Windows users can [download an .exe file](https://yt-dl.org/latest/youtube-dl.exe) and place it in any location on their [PATH](https://en.wikipedia.org/wiki/PATH_%28variable%29) except for `%SYSTEMROOT%\System32` (e.g. **do not** put in `C:\Windows\System32`).
+Using curl:
  
-You can also use pip:
+    sudo curl -L https://github.com/blackjack4494/youtube-dlc/releases/latest/download/youtube-dlc -o /usr/local/bin/youtube-dlc
+    sudo chmod a+rx /usr/local/bin/youtube-dlc
  
-    sudo -H pip install --upgrade youtube-dl
-    
-This command will update youtube-dl if you have already installed it. See the [pypi page](https://pypi.python.org/pypi/youtube_dl) for more information.
  
-macOS users can install youtube-dl with [Homebrew](https://brew.sh/):
+**Windows** users can download [youtube-dlc.exe](https://github.com/blackjack4494/youtube-dlc/releases/latest/download/youtube-dlc.exe) (**do not** put in `C:\Windows\System32`!).  
+
+**Compile**
+To build the Windows executable yourself
  
-    brew install youtube-dl
+    python -m pip install --upgrade pyinstaller
+    pyinstaller.exe youtube_dlc\__main__.py --onefile --name youtube-dlc
+    
+Or simply execute the `make_win.bat` if pyinstaller is installed.
+There will be a `youtube-dlc.exe` in `/dist`  
  
-Or with [MacPorts](https://www.macports.org/):
+For Unix:
+You will need the required build tools  
+python, make (GNU), pandoc, zip, nosetests  
+Then simply type this
  
-    sudo port install youtube-dl
+    make
  
-Alternatively, refer to the [developer instructions](#developer-instructions) for how to check out and work with the git repository. For further options, including PGP signatures, see the [youtube-dl Download Page](https://ytdl-org.github.io/youtube-dl/download.html).
  
  # DESCRIPTION
-**youtube-dl** is a command-line program to download videos from YouTube.com and a few more sites. It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on macOS. It is released to the public domain, which means you can modify it, redistribute it or use it however you like.
+**youtube-dlc** is a command-line program to download videos from YouTube.com and a few more sites. It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on macOS. It is released to the public domain, which means you can modify it, redistribute it or use it however you like.
  
-    youtube-dl [OPTIONS] URL [URL...]
+    youtube-dlc [OPTIONS] URL [URL...]
  
  # OPTIONS
      -h, --help                       Print this help text and exit
@@ -69,19 +75,19 @@ # OPTIONS
                                       extractor
      --default-search PREFIX          Use this prefix for unqualified URLs. For
                                       example "gvsearch2:" downloads two videos
-                                     from google videos for youtube-dl "large
+                                     from google videos for youtube-dlc "large
                                       apple". Use the value "auto" to let
-                                     youtube-dl guess ("auto_warning" to emit a
+                                     youtube-dlc guess ("auto_warning" to emit a
                                       warning when guessing). "error" just throws
                                       an error. The default value "fixup_error"
                                       repairs broken URLs, but emits an error if
                                       this is not possible instead of searching.
      --ignore-config                  Do not read configuration files. When given
                                       in the global configuration file
-                                     /etc/youtube-dl.conf: Do not read the user
+                                     /etc/youtube-dlc.conf: Do not read the user
                                       configuration in ~/.config/youtube-
-                                     dl/config (%APPDATA%/youtube-dl/config.txt
-                                     on Windows)
+                                     dlc/config (%APPDATA%/youtube-
+                                     dlc/config.txt on Windows)
      --config-location PATH           Location of the configuration file; either
                                       the path to the config or its containing
                                       directory.
@@ -238,7 +244,7 @@ ## Filesystem Options:
                                       filenames
      -w, --no-overwrites              Do not overwrite files
      -c, --continue                   Force resume of partially downloaded files.
-                                     By default, youtube-dl will resume
+                                     By default, youtube-dlc will resume
                                       downloads if possible.
      --no-continue                    Do not resume partially downloaded files
                                       (restart from beginning)
@@ -256,11 +262,11 @@ ## Filesystem Options:
                                       option)
      --cookies FILE                   File to read cookies from and dump cookie
                                       jar in
-    --cache-dir DIR                  Location in the filesystem where youtube-dl
-                                     can store some downloaded information
+    --cache-dir DIR                  Location in the filesystem where youtube-
+                                     dlc can store some downloaded information
                                       permanently. By default
-                                     $XDG_CACHE_HOME/youtube-dl or
-                                     ~/.cache/youtube-dl . At the moment, only
+                                     $XDG_CACHE_HOME/youtube-dlc or
+                                     ~/.cache/youtube-dlc . At the moment, only
                                       YouTube player files (for videos with
                                       obfuscated signatures) are cached, but that
                                       may change.
@@ -306,8 +312,9 @@ ## Verbosity / Simulation Options:
                                       files in the current directory to debug
                                       problems
      --print-traffic                  Display sent and read HTTP traffic
-    -C, --call-home                  Contact the youtube-dl server for debugging
-    --no-call-home                   Do NOT contact the youtube-dl server for
+    -C, --call-home                  Contact the youtube-dlc server for
+                                     debugging
+    --no-call-home                   Do NOT contact the youtube-dlc server for
                                       debugging
  
  ## Workarounds:
@@ -368,7 +375,7 @@ ## Subtitle Options:
  ## Authentication Options:
      -u, --username USERNAME          Login with this account ID
      -p, --password PASSWORD          Account password. If this option is left
-                                     out, youtube-dl will ask interactively.
+                                     out, youtube-dlc will ask interactively.
      -2, --twofactor TWOFACTOR        Two-factor authentication code
      -n, --netrc                      Use .netrc authentication data
      --video-password PASSWORD        Video password (vimeo, smotri, youku)
@@ -379,8 +386,8 @@ ## Adobe Pass Options:
                                       a list of available MSOs
      --ap-username USERNAME           Multiple-system operator account login
      --ap-password PASSWORD           Multiple-system operator account password.
-                                     If this option is left out, youtube-dl will
-                                     ask interactively.
+                                     If this option is left out, youtube-dlc
+                                     will ask interactively.
      --ap-list-mso                    List all supported multiple-system
                                       operators
  
@@ -434,1011 +441,9 @@ ## Post-processing Options:
                                       either the path to the binary or its
                                       containing directory.
      --exec CMD                       Execute a command on the file after
-                                     downloading, similar to find's -exec
-                                     syntax. Example: --exec 'adb push {}
-                                     /sdcard/Music/ && rm {}'
+                                     downloading and post-processing, similar to
+                                     find's -exec syntax. Example: --exec 'adb
+                                     push {} /sdcard/Music/ && rm {}'
      --convert-subs FORMAT            Convert the subtitles to other format
                                       (currently supported: srt|ass|vtt|lrc)
  
-# CONFIGURATION
-
-You can configure youtube-dl by placing any supported command line option to a configuration file. On Linux and macOS, the system wide configuration file is located at `/etc/youtube-dl.conf` and the user wide configuration file at `~/.config/youtube-dl/config`. On Windows, the user wide configuration file locations are `%APPDATA%\youtube-dl\config.txt` or `C:\Users\<user name>\youtube-dl.conf`. Note that by default configuration file may not exist so you may need to create it yourself.
-
-For example, with the following configuration file youtube-dl will always extract the audio, not copy the mtime, use a proxy and save all videos under `Movies` directory in your home directory:
-```
-# Lines starting with # are comments
-
-# Always extract audio
--x
-
-# Do not copy the mtime
---no-mtime
-
-# Use this proxy
---proxy 127.0.0.1:3128
-
-# Save all videos under Movies directory in your home directory
--o ~/Movies/%(title)s.%(ext)s
-```
-
-Note that options in configuration file are just the same options aka switches used in regular command line calls thus there **must be no whitespace** after `-` or `--`, e.g. `-o` or `--proxy` but not `- o` or `-- proxy`.
-
-You can use `--ignore-config` if you want to disable the configuration file for a particular youtube-dl run.
-
-You can also use `--config-location` if you want to use custom configuration file for a particular youtube-dl run.
-
-### Authentication with `.netrc` file
-
-You may also want to configure automatic credentials storage for extractors that support authentication (by providing login and password with `--username` and `--password`) in order not to pass credentials as command line arguments on every youtube-dl execution and prevent tracking plain text passwords in the shell command history. You can achieve this using a [`.netrc` file](https://stackoverflow.com/tags/.netrc/info) on a per extractor basis. For that you will need to create a `.netrc` file in your `$HOME` and restrict permissions to read/write by only you:
-```
-touch $HOME/.netrc
-chmod a-rwx,u+rw $HOME/.netrc
-```
-After that you can add credentials for an extractor in the following format, where *extractor* is the name of the extractor in lowercase:
-```
-machine <extractor> login <login> password <password>
-```
-For example:
-```
-machine youtube login myaccount@gmail.com password my_youtube_password
-machine twitch login my_twitch_account_name password my_twitch_password
-```
-To activate authentication with the `.netrc` file you should pass `--netrc` to youtube-dl or place it in the [configuration file](#configuration).
-
-On Windows you may also need to setup the `%HOME%` environment variable manually. For example:
-```
-set HOME=%USERPROFILE%
-```
-
-# OUTPUT TEMPLATE
-
-The `-o` option allows users to indicate a template for the output file names.
-
-**tl;dr:** [navigate me to examples](#output-template-examples).
-
-The basic usage is not to set any template arguments when downloading a single file, like in `youtube-dl -o funny_video.flv "https://some/video"`. However, it may contain special sequences that will be replaced when downloading each video. The special sequences may be formatted according to [python string formatting operations](https://docs.python.org/2/library/stdtypes.html#string-formatting). For example, `%(NAME)s` or `%(NAME)05d`. To clarify, that is a percent symbol followed by a name in parentheses, followed by formatting operations. Allowed names along with sequence type are:
-
- - `id` (string): Video identifier
- - `title` (string): Video title
- - `url` (string): Video URL
- - `ext` (string): Video filename extension
- - `alt_title` (string): A secondary title of the video
- - `display_id` (string): An alternative identifier for the video
- - `uploader` (string): Full name of the video uploader
- - `license` (string): License name the video is licensed under
- - `creator` (string): The creator of the video
- - `release_date` (string): The date (YYYYMMDD) when the video was released
- - `timestamp` (numeric): UNIX timestamp of the moment the video became available
- - `upload_date` (string): Video upload date (YYYYMMDD)
- - `uploader_id` (string): Nickname or id of the video uploader
- - `channel` (string): Full name of the channel the video is uploaded on
- - `channel_id` (string): Id of the channel
- - `location` (string): Physical location where the video was filmed
- - `duration` (numeric): Length of the video in seconds
- - `view_count` (numeric): How many users have watched the video on the platform
- - `like_count` (numeric): Number of positive ratings of the video
- - `dislike_count` (numeric): Number of negative ratings of the video
- - `repost_count` (numeric): Number of reposts of the video
- - `average_rating` (numeric): Average rating give by users, the scale used depends on the webpage
- - `comment_count` (numeric): Number of comments on the video
- - `age_limit` (numeric): Age restriction for the video (years)
- - `is_live` (boolean): Whether this video is a live stream or a fixed-length video
- - `start_time` (numeric): Time in seconds where the reproduction should start, as specified in the URL
- - `end_time` (numeric): Time in seconds where the reproduction should end, as specified in the URL
- - `format` (string): A human-readable description of the format 
- - `format_id` (string): Format code specified by `--format`
- - `format_note` (string): Additional info about the format
- - `width` (numeric): Width of the video
- - `height` (numeric): Height of the video
- - `resolution` (string): Textual description of width and height
- - `tbr` (numeric): Average bitrate of audio and video in KBit/s
- - `abr` (numeric): Average audio bitrate in KBit/s
- - `acodec` (string): Name of the audio codec in use
- - `asr` (numeric): Audio sampling rate in Hertz
- - `vbr` (numeric): Average video bitrate in KBit/s
- - `fps` (numeric): Frame rate
- - `vcodec` (string): Name of the video codec in use
- - `container` (string): Name of the container format
- - `filesize` (numeric): The number of bytes, if known in advance
- - `filesize_approx` (numeric): An estimate for the number of bytes
- - `protocol` (string): The protocol that will be used for the actual download
- - `extractor` (string): Name of the extractor
- - `extractor_key` (string): Key name of the extractor
- - `epoch` (numeric): Unix epoch when creating the file
- - `autonumber` (numeric): Five-digit number that will be increased with each download, starting at zero
- - `playlist` (string): Name or id of the playlist that contains the video
- - `playlist_index` (numeric): Index of the video in the playlist padded with leading zeros according to the total length of the playlist
- - `playlist_id` (string): Playlist identifier
- - `playlist_title` (string): Playlist title
- - `playlist_uploader` (string): Full name of the playlist uploader
- - `playlist_uploader_id` (string): Nickname or id of the playlist uploader
-
-Available for the video that belongs to some logical chapter or section:
-
- - `chapter` (string): Name or title of the chapter the video belongs to
- - `chapter_number` (numeric): Number of the chapter the video belongs to
- - `chapter_id` (string): Id of the chapter the video belongs to
-
-Available for the video that is an episode of some series or programme:
-
- - `series` (string): Title of the series or programme the video episode belongs to
- - `season` (string): Title of the season the video episode belongs to
- - `season_number` (numeric): Number of the season the video episode belongs to
- - `season_id` (string): Id of the season the video episode belongs to
- - `episode` (string): Title of the video episode
- - `episode_number` (numeric): Number of the video episode within a season
- - `episode_id` (string): Id of the video episode
-
-Available for the media that is a track or a part of a music album:
-
- - `track` (string): Title of the track
- - `track_number` (numeric): Number of the track within an album or a disc
- - `track_id` (string): Id of the track
- - `artist` (string): Artist(s) of the track
- - `genre` (string): Genre(s) of the track
- - `album` (string): Title of the album the track belongs to
- - `album_type` (string): Type of the album
- - `album_artist` (string): List of all artists appeared on the album
- - `disc_number` (numeric): Number of the disc or other physical medium the track belongs to
- - `release_year` (numeric): Year (YYYY) when the album was released
-
-Each aforementioned sequence when referenced in an output template will be replaced by the actual value corresponding to the sequence name. Note that some of the sequences are not guaranteed to be present since they depend on the metadata obtained by a particular extractor. Such sequences will be replaced with `NA`.
-
-For example for `-o %(title)s-%(id)s.%(ext)s` and an mp4 video with title `youtube-dl test video` and id `BaW_jenozKcj`, this will result in a `youtube-dl test video-BaW_jenozKcj.mp4` file created in the current directory.
-
-For numeric sequences you can use numeric related formatting, for example, `%(view_count)05d` will result in a string with view count padded with zeros up to 5 characters, like in `00042`.
-
-Output templates can also contain arbitrary hierarchical path, e.g. `-o '%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s'` which will result in downloading each video in a directory corresponding to this path template. Any missing directory will be automatically created for you.
-
-To use percent literals in an output template use `%%`. To output to stdout use `-o -`.
-
-The current default template is `%(title)s-%(id)s.%(ext)s`.
-
-In some cases, you don't want special characters such as 中, spaces, or &, such as when transferring the downloaded filename to a Windows system or the filename through an 8bit-unsafe channel. In these cases, add the `--restrict-filenames` flag to get a shorter title:
-
-#### Output template and Windows batch files
-
-If you are using an output template inside a Windows batch file then you must escape plain percent characters (`%`) by doubling, so that `-o "%(title)s-%(id)s.%(ext)s"` should become `-o "%%(title)s-%%(id)s.%%(ext)s"`. However you should not touch `%`'s that are not plain characters, e.g. environment variables for expansion should stay intact: `-o "C:\%HOMEPATH%\Desktop\%%(title)s.%%(ext)s"`.
-
-#### Output template examples
-
-Note that on Windows you may need to use double quotes instead of single.
-
-```bash
-$ youtube-dl --get-filename -o '%(title)s.%(ext)s' BaW_jenozKc
-youtube-dl test video ''_ä↭𝕐.mp4    # All kinds of weird characters
-
-$ youtube-dl --get-filename -o '%(title)s.%(ext)s' BaW_jenozKc --restrict-filenames
-youtube-dl_test_video_.mp4          # A simple file name
-
-# Download YouTube playlist videos in separate directory indexed by video order in a playlist
-$ youtube-dl -o '%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s' https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re
-
-# Download all playlists of YouTube channel/user keeping each playlist in separate directory:
-$ youtube-dl -o '%(uploader)s/%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s' https://www.youtube.com/user/TheLinuxFoundation/playlists
-
-# Download Udemy course keeping each chapter in separate directory under MyVideos directory in your home
-$ youtube-dl -u user -p password -o '~/MyVideos/%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s' https://www.udemy.com/java-tutorial/
-
-# Download entire series season keeping each series and each season in separate directory under C:/MyVideos
-$ youtube-dl -o "C:/MyVideos/%(series)s/%(season_number)s - %(season)s/%(episode_number)s - %(episode)s.%(ext)s" https://videomore.ru/kino_v_detalayah/5_sezon/367617
-
-# Stream the video being downloaded to stdout
-$ youtube-dl -o - BaW_jenozKc
-```
-
-# FORMAT SELECTION
-
-By default youtube-dl tries to download the best available quality, i.e. if you want the best quality you **don't need** to pass any special options, youtube-dl will guess it for you by **default**.
-
-But sometimes you may want to download in a different format, for example when you are on a slow or intermittent connection. The key mechanism for achieving this is so-called *format selection* based on which you can explicitly specify desired format, select formats based on some criterion or criteria, setup precedence and much more.
-
-The general syntax for format selection is `--format FORMAT` or shorter `-f FORMAT` where `FORMAT` is a *selector expression*, i.e. an expression that describes format or formats you would like to download.
-
-**tl;dr:** [navigate me to examples](#format-selection-examples).
-
-The simplest case is requesting a specific format, for example with `-f 22` you can download the format with format code equal to 22. You can get the list of available format codes for particular video using `--list-formats` or `-F`. Note that these format codes are extractor specific. 
-
-You can also use a file extension (currently `3gp`, `aac`, `flv`, `m4a`, `mp3`, `mp4`, `ogg`, `wav`, `webm` are supported) to download the best quality format of a particular file extension served as a single file, e.g. `-f webm` will download the best quality format with the `webm` extension served as a single file.
-
-You can also use special names to select particular edge case formats:
-
- - `best`: Select the best quality format represented by a single file with video and audio.
- - `worst`: Select the worst quality format represented by a single file with video and audio.
- - `bestvideo`: Select the best quality video-only format (e.g. DASH video). May not be available.
- - `worstvideo`: Select the worst quality video-only format. May not be available.
- - `bestaudio`: Select the best quality audio only-format. May not be available.
- - `worstaudio`: Select the worst quality audio only-format. May not be available.
-
-For example, to download the worst quality video-only format you can use `-f worstvideo`.
-
-If you want to download multiple videos and they don't have the same formats available, you can specify the order of preference using slashes. Note that slash is left-associative, i.e. formats on the left hand side are preferred, for example `-f 22/17/18` will download format 22 if it's available, otherwise it will download format 17 if it's available, otherwise it will download format 18 if it's available, otherwise it will complain that no suitable formats are available for download.
-
-If you want to download several formats of the same video use a comma as a separator, e.g. `-f 22,17,18` will download all these three formats, of course if they are available. Or a more sophisticated example combined with the precedence feature: `-f 136/137/mp4/bestvideo,140/m4a/bestaudio`.
-
-You can also filter the video formats by putting a condition in brackets, as in `-f "best[height=720]"` (or `-f "[filesize>10M]"`).
-
-The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `>=`, `=` (equals), `!=` (not equals):
-
- - `filesize`: The number of bytes, if known in advance
- - `width`: Width of the video, if known
- - `height`: Height of the video, if known
- - `tbr`: Average bitrate of audio and video in KBit/s
- - `abr`: Average audio bitrate in KBit/s
- - `vbr`: Average video bitrate in KBit/s
- - `asr`: Audio sampling rate in Hertz
- - `fps`: Frame rate
-
-Filtering also works with the comparisons `=` (equals), `^=` (starts with), `$=` (ends with), `*=` (contains) and the following string meta fields:
-
- - `ext`: File extension
- - `acodec`: Name of the audio codec in use
- - `vcodec`: Name of the video codec in use
- - `container`: Name of the container format
- - `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `http_dash_segments`, `m3u8`, or `m3u8_native`)
- - `format_id`: A short description of the format
-
-Any string comparison may be prefixed with negation `!` in order to produce an opposite comparison, e.g. `!*=` (does not contain).
-
-Note that none of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by particular extractor, i.e. the metadata offered by the video hoster.
-
-Formats for which the value is not known are excluded unless you put a question mark (`?`) after the operator. You can combine format filters, so `-f "[height <=? 720][tbr>500]"` selects up to 720p videos (or videos where the height is not known) with a bitrate of at least 500 KBit/s.
-
-You can merge the video and audio of two formats into a single file using `-f <video-format>+<audio-format>` (requires ffmpeg or avconv installed), for example `-f bestvideo+bestaudio` will download the best video-only format, the best audio-only format and mux them together with ffmpeg/avconv.
-
-Format selectors can also be grouped using parentheses, for example if you want to download the best mp4 and webm formats with a height lower than 480 you can use `-f '(mp4,webm)[height<480]'`.
-
-Since the end of April 2015 and version 2015.04.26, youtube-dl uses `-f bestvideo+bestaudio/best` as the default format selection (see [#5447](https://github.com/ytdl-org/youtube-dl/issues/5447), [#5456](https://github.com/ytdl-org/youtube-dl/issues/5456)). If ffmpeg or avconv are installed this results in downloading `bestvideo` and `bestaudio` separately and muxing them together into a single file giving the best overall quality available. Otherwise it falls back to `best` and results in downloading the best available quality served as a single file. `best` is also needed for videos that don't come from YouTube because they don't provide the audio and video in two different files. If you want to only download some DASH formats (for example if you are not interested in getting videos with a resolution higher than 1080p), you can add `-f bestvideo[height<=?1080]+bestaudio/best` to your configuration file. Note that if you use youtube-dl to stream to `stdout` (and most likely to pipe it to your media player then), i.e. you explicitly specify output template as `-o -`, youtube-dl still uses `-f best` format selection in order to start content delivery immediately to your player and not to wait until `bestvideo` and `bestaudio` are downloaded and muxed.
-
-If you want to preserve the old format selection behavior (prior to youtube-dl 2015.04.26), i.e. you want to download the best available quality media served as a single file, you should explicitly specify your choice with `-f best`. You may want to add it to the [configuration file](#configuration) in order not to type it every time you run youtube-dl.
-
-#### Format selection examples
-
-Note that on Windows you may need to use double quotes instead of single.
-
-```bash
-# Download best mp4 format available or any other best if no mp4 available
-$ youtube-dl -f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best'
-
-# Download best format available but no better than 480p
-$ youtube-dl -f 'bestvideo[height<=480]+bestaudio/best[height<=480]'
-
-# Download best format (single file with video and audio) but no bigger than 50 MB
-$ youtube-dl -f 'best[filesize<50M]'
-
-# Download best format available via direct link over HTTP/HTTPS protocol
-$ youtube-dl -f '(bestvideo+bestaudio/best)[protocol^=http]'
-
-# Download the best video format and the best audio format without merging them
-$ youtube-dl -f 'bestvideo,bestaudio' -o '%(title)s.f%(format_id)s.%(ext)s'
-```
-Note that in the last example, an output template is recommended as bestvideo and bestaudio may have the same file name.
-
-
-# VIDEO SELECTION
-
-Videos can be filtered by their upload date using the options `--date`, `--datebefore` or `--dateafter`. They accept dates in two formats:
-
- - Absolute dates: Dates in the format `YYYYMMDD`.
- - Relative dates: Dates in the format `(now|today)[+-][0-9](day|week|month|year)(s)?`
- 
-Examples:
-
-```bash
-# Download only the videos uploaded in the last 6 months
-$ youtube-dl --dateafter now-6months
-
-# Download only the videos uploaded on January 1, 1970
-$ youtube-dl --date 19700101
-
-# Download only the videos uploaded in the 200x decade
-$ youtube-dl --dateafter 20000101 --datebefore 20091231
-```
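To make the two accepted date formats concrete, here is a minimal sketch of how such a value could be resolved to a calendar date. It is not youtube-dl's own parser: `resolve_date` is a hypothetical name, multi-digit counts are allowed, and months and years are approximated as 30 and 365 days, all of which are assumptions of this example.

```python
import datetime
import re

# Modelled on the relative-date pattern described above. Allowing several
# digits and treating month/year as 30/365 days are assumptions of this sketch.
_RELATIVE = re.compile(r'(now|today)([+-])([0-9]+)(day|week|month|year)s?$')
_DAYS = {'day': 1, 'week': 7, 'month': 30, 'year': 365}


def resolve_date(spec, today=None):
    """Turn 'YYYYMMDD' or a relative spec like 'now-6months' into a date."""
    today = today or datetime.date.today()
    match = _RELATIVE.match(spec)
    if match:
        _, sign, amount, unit = match.groups()
        delta = datetime.timedelta(days=int(amount) * _DAYS[unit])
        return today - delta if sign == '-' else today + delta
    # Anything else is treated as an absolute date.
    return datetime.datetime.strptime(spec, '%Y%m%d').date()


reference = datetime.date(2020, 9, 12)
print(resolve_date('now-6months', reference))
print(resolve_date('19700101'))
```

Passing `today` explicitly keeps the function deterministic, which is convenient when checking a date range before starting a long download.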
-
-# FAQ
-
-### How do I update youtube-dl?
-
-If you've followed [our manual installation instructions](https://ytdl-org.github.io/youtube-dl/download.html), you can simply run `youtube-dl -U` (or, on Linux, `sudo youtube-dl -U`).
-
-If you have used pip, a simple `sudo pip install -U youtube-dl` is sufficient to update.
-
-If you have installed youtube-dl using a package manager like *apt-get* or *yum*, use the standard system update mechanism to update. Note that distribution packages are often outdated. As a rule of thumb, youtube-dl releases at least once a month, and often weekly or even daily. Simply go to https://yt-dl.org to find out the current version. Unfortunately, there is nothing we youtube-dl developers can do if your distribution serves a really outdated version. You can (and should) complain to your distribution in their bugtracker or support forum.
-
-As a last resort, you can also uninstall the version installed by your package manager and follow our manual installation instructions. For that, remove the distribution's package, with a line like
-
-    sudo apt-get remove -y youtube-dl
-
-Afterwards, simply follow [our manual installation instructions](https://ytdl-org.github.io/youtube-dl/download.html):
-
-```
-sudo wget https://yt-dl.org/downloads/latest/youtube-dl -O /usr/local/bin/youtube-dl
-sudo chmod a+rx /usr/local/bin/youtube-dl
-hash -r
-```
-
-Again, from then on you'll be able to update with `sudo youtube-dl -U`.
-
-### youtube-dl is extremely slow to start on Windows
-
-Add a file exclusion for `youtube-dl.exe` in Windows Defender settings.
-
-### I'm getting an error `Unable to extract OpenGraph title` on YouTube playlists
-
-YouTube changed their playlist format in March 2014 and later on, so you'll need at least youtube-dl 2014.07.25 to download all YouTube videos.
-
-If you have installed youtube-dl with a package manager, pip, setup.py or a tarball, please use that to update. Note that Ubuntu packages do not seem to get updated anymore. Since we are not affiliated with Ubuntu, there is little we can do. Feel free to [report bugs](https://bugs.launchpad.net/ubuntu/+source/youtube-dl/+filebug) to the [Ubuntu packaging people](mailto:ubuntu-motu@lists.ubuntu.com?subject=outdated%20version%20of%20youtube-dl) - all they have to do is update the package to a somewhat recent version. See above for a way to update.
-
-### I'm getting an error when trying to use output template: `error: using output template conflicts with using title, video ID or auto number`
-
-Make sure you are not using `-o` with any of these options `-t`, `--title`, `--id`, `-A` or `--auto-number` set in command line or in a configuration file. Remove the latter if any.
-
-### Do I always have to pass `-citw`?
-
-By default, youtube-dl intends to have the best options (incidentally, if you have a convincing case that these should be different, [please file an issue where you explain that](https://yt-dl.org/bug)). Therefore, it is unnecessary and sometimes harmful to copy long option strings from webpages. In particular, the only option out of `-citw` that is regularly useful is `-i`.
-
-### Can you please put the `-b` option back?
-
-Most people asking this question are not aware that youtube-dl now defaults to downloading the highest available quality as reported by YouTube, which will be 1080p or 720p in some cases, so you no longer need the `-b` option. For some specific videos, maybe YouTube does not report them to be available in a specific high quality format you're interested in. In that case, simply request it with the `-f` option and youtube-dl will try to download it.
-
-### I get HTTP error 402 when trying to download a video. What's this?
-
-Apparently YouTube requires you to pass a CAPTCHA test if you download too much. We're [considering providing a way to let you solve the CAPTCHA](https://github.com/ytdl-org/youtube-dl/issues/154), but at the moment, your best course of action is pointing a web browser to the YouTube URL, solving the CAPTCHA, and restarting youtube-dl.
-
-### Do I need any other programs?
-
-youtube-dl works fine on its own on most sites. However, if you want to convert video/audio, you'll need [avconv](https://libav.org/) or [ffmpeg](https://www.ffmpeg.org/). On some sites - most notably YouTube - videos can be retrieved in a higher quality format without sound. youtube-dl will detect whether avconv/ffmpeg is present and automatically pick the best option.
-
-Videos or video formats streamed via RTMP protocol can only be downloaded when [rtmpdump](https://rtmpdump.mplayerhq.hu/) is installed. Downloading MMS and RTSP videos requires either [mplayer](https://mplayerhq.hu/) or [mpv](https://mpv.io/) to be installed.
-
-### I have downloaded a video but how can I play it?
-
-Once the video is fully downloaded, use any video player, such as [mpv](https://mpv.io/), [vlc](https://www.videolan.org/) or [mplayer](https://www.mplayerhq.hu/).
-
-### I extracted a video URL with `-g`, but it does not play on another machine / in my web browser.
-
-It depends a lot on the service. In many cases, requests for the video (to download/play it) must come from the same IP address and with the same cookies and/or HTTP headers. Use the `--cookies` option to write the required cookies into a file, and advise your downloader to read cookies from that file. Some sites also require a common user agent to be used; use `--dump-user-agent` to see the one in use by youtube-dl. You can also get the necessary cookies and HTTP headers from the JSON output obtained with `--dump-json`.
-
-It may be beneficial to use IPv6; in some cases, the restrictions are only applied to IPv4. Some services (sometimes only for a subset of videos) do not restrict the video URL by IP address, cookie, or user-agent, but these are the exception rather than the rule.
-
-Please bear in mind that some URL protocols are **not** supported by browsers out of the box, including RTMP. If you are using `-g`, your own downloader must support these as well.
-
-If you want to play the video on a machine that is not running youtube-dl, you can relay the video content from the machine that runs youtube-dl. You can use `-o -` to let youtube-dl stream a video to stdout, or simply allow the player to download the files written by youtube-dl in turn.
-
-### ERROR: no fmt_url_map or conn information found in video info
-
-YouTube has switched to a new video info format in July 2011 which is not supported by old versions of youtube-dl. See [above](#how-do-i-update-youtube-dl) for how to update youtube-dl.
-
-### ERROR: unable to download video
-
-YouTube requires an additional signature since September 2012 which is not supported by old versions of youtube-dl. See [above](#how-do-i-update-youtube-dl) for how to update youtube-dl.
-
-### Video URL contains an ampersand and I'm getting some strange output `[1] 2839` or `'v' is not recognized as an internal or external command`
-
-That's actually the output from your shell. Since the ampersand is one of the special shell characters, it is interpreted by the shell, which prevents you from passing the whole URL to youtube-dl. To stop your shell from interpreting the ampersands (or any other special characters), either put the whole URL in quotes or escape them with a backslash (which approach works depends on your shell).
-
-For example, if your URL is https://www.youtube.com/watch?t=4&v=BaW_jenozKc you should end up with the following command:
-
-```youtube-dl 'https://www.youtube.com/watch?t=4&v=BaW_jenozKc'```
-
-or
-
-```youtube-dl https://www.youtube.com/watch?t=4\&v=BaW_jenozKc```
-
-For Windows you have to use the double quotes:
-
-```youtube-dl "https://www.youtube.com/watch?t=4&v=BaW_jenozKc"```
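When the command line is assembled by a script rather than typed by hand, the quoting can be delegated to the standard library. The sketch below assumes Python 3.3 or later (where `shlex.quote` is available) and a POSIX shell; it does not apply to `cmd.exe` on Windows.

```python
import shlex

url = 'https://www.youtube.com/watch?t=4&v=BaW_jenozKc'

# shlex.quote() wraps the URL in single quotes because it contains
# characters ("?" and "&") that a POSIX shell would otherwise interpret.
command = 'youtube-dl ' + shlex.quote(url)
print(command)
```

Alternatively, passing the arguments as a list to `subprocess.run` bypasses the shell entirely, so no quoting is needed at all.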
-
-### ExtractorError: Could not find JS function u'OF'
-
-In February 2015, the new YouTube player contained a character sequence in a string that was misinterpreted by old versions of youtube-dl. See [above](#how-do-i-update-youtube-dl) for how to update youtube-dl.
-
-### HTTP Error 429: Too Many Requests or 402: Payment Required
-
-These two error codes indicate that the service is blocking your IP address because of overuse. Contact the service and ask them to unblock your IP address, or - if you have acquired a whitelisted IP address already - use the [`--proxy` or `--source-address` options](#network-options) to select another IP address.
-
-### SyntaxError: Non-ASCII character
-
-The error
-
-    File "youtube-dl", line 2
-    SyntaxError: Non-ASCII character '\x93' ...
-
-means you're using an outdated version of Python. Please update to Python 2.6 or 2.7.
-
-### What is this binary file? Where has the code gone?
-
-Since June 2012 ([#342](https://github.com/ytdl-org/youtube-dl/issues/342)) youtube-dl is packed as an executable zipfile. Simply unzip it (it might need renaming to `youtube-dl.zip` first on some systems) or clone the git repository, as laid out above. If you modify the code, you can run it by executing the `__main__.py` file. To recompile the executable, run `make youtube-dl`.
-
-### The exe throws an error due to missing `MSVCR100.dll`
-
-To run the exe you first need to install the [Microsoft Visual C++ 2010 Redistributable Package (x86)](https://www.microsoft.com/en-US/download/details.aspx?id=5555).
-
-### On Windows, how should I set up ffmpeg and youtube-dl? Where should I put the exe files?
-
-If you put youtube-dl and ffmpeg in the same directory that you're running the command from, it will work, but that's rather cumbersome.
-
-To make a different directory work - either for ffmpeg, or for youtube-dl, or for both - simply create the directory (say, `C:\bin`, or `C:\Users\<User name>\bin`), put all the executables directly in there, and then [set your PATH environment variable](https://www.java.com/en/download/help/path.xml) to include that directory.
-
-From then on, after restarting your shell, you will be able to access both youtube-dl and ffmpeg (and youtube-dl will be able to find ffmpeg) by simply typing `youtube-dl` or `ffmpeg`, no matter what directory you're in.
-
-### How do I put downloads into a specific folder?
-
-Use the `-o` option to specify an [output template](#output-template), for example `-o "/home/user/videos/%(title)s-%(id)s.%(ext)s"`. If you want this for all of your downloads, put the option into your [configuration file](#configuration).
-
-### How do I download a video starting with a `-`?
-
-Either prepend `https://www.youtube.com/watch?v=` or separate the ID from the options with `--`:
-
-    youtube-dl -- -wNyEUrxzFU
-    youtube-dl "https://www.youtube.com/watch?v=-wNyEUrxzFU"
-
-### How do I pass cookies to youtube-dl?
-
-Use the `--cookies` option, for example `--cookies /path/to/cookies/file.txt`.
-
-In order to extract cookies from a browser, use any conforming browser extension for exporting cookies. For example, [cookies.txt](https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg) (for Chrome) or [cookies.txt](https://addons.mozilla.org/en-US/firefox/addon/cookies-txt/) (for Firefox).
-
-Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows and `LF` (`\n`) for Unix and Unix-like systems (Linux, macOS, etc.). `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
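The two requirements above (a recognised first line and OS-appropriate newlines) can be checked before handing the file to `--cookies`. The following is a minimal sketch; `check_cookie_file` and its `linesep` parameter are hypothetical names introduced for illustration, and the check is deliberately shallow.

```python
import os

VALID_HEADERS = ('# HTTP Cookie File', '# Netscape HTTP Cookie File')


def check_cookie_file(data, linesep=os.linesep):
    """Return a list of problems found in the raw bytes of a cookies file."""
    problems = []
    lines = data.splitlines()
    first_line = lines[0].decode('utf-8', 'replace').strip() if lines else ''
    if first_line not in VALID_HEADERS:
        problems.append('unexpected first line: %r' % first_line)
    # CRLF is expected on Windows, bare LF on Unix-like systems.
    has_crlf = b'\r\n' in data
    if data and has_crlf != (linesep == '\r\n'):
        problems.append('newline format does not match this OS')
    return problems


sample = b'# Netscape HTTP Cookie File\n.example.com\tTRUE\t/\tFALSE\t0\tname\tvalue\n'
print(check_cookie_file(sample, linesep='\n'))
```

To check a real file, read it in binary mode (`open(path, 'rb').read()`) so that the newline bytes reach the function unchanged.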
-
-Passing cookies to youtube-dl is a good way to work around login when a particular extractor does not implement it explicitly. Another use case is working around a [CAPTCHA](https://en.wikipedia.org/wiki/CAPTCHA) that some websites require you to solve in particular cases in order to get access (e.g. YouTube, CloudFlare).
-
-### How do I stream directly to media player?
-
-You will first need to tell youtube-dl to stream media to stdout with `-o -`, and also tell your media player to read from stdin (it must be capable of this for streaming), and then pipe the former to the latter. For example, streaming to [vlc](https://www.videolan.org/) can be achieved with:
-
-    youtube-dl -o - "https://www.youtube.com/watch?v=BaW_jenozKc" | vlc -
-
-### How do I download only new videos from a playlist?
-
-Use the download-archive feature. With this feature you should initially download the complete playlist with `--download-archive /path/to/download/archive/file.txt`, which records the identifiers of all the videos in a special file. Each subsequent run with the same `--download-archive` will download only new videos and skip all videos that have been downloaded before. Note that only successful downloads are recorded in the file.
-
-For example, at first,
-
-    youtube-dl --download-archive archive.txt "https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re"
-
-will download the complete `PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re` playlist and create a file `archive.txt`. Each subsequent run will only download new videos if any:
-
-    youtube-dl --download-archive archive.txt "https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re"
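The behaviour across runs can be pictured with a toy model. This is a sketch of the idea only, not of youtube-dl's implementation: the in-memory set stands in for the archive file, and `download_new` and the `download` callable are hypothetical.

```python
# Toy model of the download-archive idea. The set plays the role of the
# archive file; `download` is a hypothetical callable returning True on success.

def download_new(playlist_ids, archive, download):
    """Download every ID not yet in `archive`; return the IDs fetched now."""
    fetched = []
    for video_id in playlist_ids:
        if video_id in archive:
            continue  # already downloaded on an earlier run
        if download(video_id):
            archive.add(video_id)  # only successful downloads are recorded
            fetched.append(video_id)
    return fetched


archive = set()
# First run: 'b' fails, so it is not recorded.
first_run = download_new(['a', 'b'], archive, lambda video_id: video_id != 'b')
# Second run: 'a' is skipped, 'b' is retried, 'c' is new.
second_run = download_new(['a', 'b', 'c'], archive, lambda video_id: True)
print(first_run, second_run)
```

Because failures are never recorded, a video that failed once is retried automatically on the next run.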
-
-### Should I add `--hls-prefer-native` into my config?
-
-When youtube-dl detects an HLS video, it can download it either with the built-in downloader or ffmpeg. Since many HLS streams are slightly invalid and ffmpeg/youtube-dl each handle some invalid cases better than the other, there is an option to switch the downloader if needed.
-
-When youtube-dl knows that one particular downloader works better for a given website, that downloader will be picked. Otherwise, youtube-dl will pick the best downloader for general compatibility, which at the moment happens to be ffmpeg. This choice may change in future versions of youtube-dl, with improvements of the built-in downloader and/or ffmpeg.
-
-In particular, the generic extractor (used when your website is not in the [list of supported sites by youtube-dl](https://ytdl-org.github.io/youtube-dl/supportedsites.html)) cannot mandate one specific downloader.
-
-If you put either `--hls-prefer-native` or `--hls-prefer-ffmpeg` into your configuration, a different subset of videos will fail to download correctly. Instead, it is much better to [file an issue](https://yt-dl.org/bug) or a pull request which details why the native or the ffmpeg HLS downloader is a better choice for your use case.
-
-### Can you add support for this anime video site, or site which shows current movies for free?
-
-As a matter of policy (as well as legality), youtube-dl does not include support for services that specialize in infringing copyright. As a rule of thumb, if you cannot easily find a video that the service is quite obviously allowed to distribute (i.e. that has been uploaded by the creator, the creator's distributor, or is published under a free license), the service is probably unfit for inclusion to youtube-dl.
-
-A note on the service that they don't host the infringing content, but just link to those who do, is evidence that the service should **not** be included into youtube-dl. The same goes for any DMCA note when the whole front page of the service is filled with videos they are not allowed to distribute. A "fair use" note is equally unconvincing if the service shows copyright-protected videos in full without authorization.
-
-Support requests for services that **do** purchase the rights to distribute their content are perfectly fine though. If in doubt, you can simply include a source that mentions the legitimate purchase of content.
-
-### How can I speed up work on my issue?
-
-(Also known as: Help, my important issue is not being solved!) The youtube-dl core developer team is quite small. While we do our best to solve as many issues as possible, sometimes that can take quite a while. To speed up your issue, here's what you can do:
-
-First of all, please do report the issue [at our issue tracker](https://yt-dl.org/bugs). That allows us to coordinate all efforts by users and developers, and serves as a unified point. Unfortunately, the youtube-dl project has grown too large to use personal email as an effective communication channel.
-
-Please read the [bug reporting instructions](#bugs) below. A lot of bugs lack all the necessary information. If you can, offer proxy, VPN, or shell access to the youtube-dl developers. If you are able to, test the issue from multiple computers in multiple countries to exclude local censorship or misconfiguration issues.
-
-If nobody is interested in solving your issue, you are welcome to take matters into your own hands and submit a pull request (or coerce/pay somebody else to do so).
-
-Feel free to bump the issue from time to time by writing a small comment ("Issue is still present in youtube-dl version ...from France, but fixed from Belgium"), but please not more than once a month. Please do not declare your issue as `important` or `urgent`.
-
-### How can I detect whether a given URL is supported by youtube-dl?
-
-For one, have a look at the [list of supported sites](docs/supportedsites.md). Note that it can sometimes happen that the site changes its URL scheme (say, from https://example.com/video/1234567 to https://example.com/v/1234567 ) and youtube-dl reports a URL of a service in that list as unsupported. In that case, simply report a bug.
-
-It is *not* possible to detect whether a URL is supported or not. That's because youtube-dl contains a generic extractor which matches **all** URLs. You may be tempted to disable, exclude, or remove the generic extractor, but the generic extractor not only allows users to extract videos from lots of websites that embed a video from another service, but may also be used to extract video from a service that it's hosting itself. Therefore, we neither recommend nor support disabling, excluding, or removing the generic extractor.
-
-If you want to find out whether a given URL is supported, simply call youtube-dl with it. If you get no videos back, chances are the URL is either not referring to a video or unsupported. You can find out which by examining the output (if you run youtube-dl on the console) or catching an `UnsupportedError` exception if you run it from a Python program.
-
-### Why do I need to go through that much red tape when filing bugs?
-
-Before we had the issue template, despite our extensive [bug reporting instructions](#bugs), about 80% of the issue reports we got were useless. Typical reasons: people used ancient versions hundreds of releases old; made simple syntactic errors (not in youtube-dl but in general shell usage); reported a problem that had already been reported multiple times; did not actually read an error message, even if it said "please install ffmpeg"; or did not mention the URL they were trying to download. There were many more simple, easy-to-avoid problems, many of which were totally unrelated to youtube-dl.
-
-youtube-dl is an open-source project manned by too few volunteers, so we'd rather spend time fixing bugs where we are certain none of those simple problems apply, and where we can be reasonably confident to be able to reproduce the issue without asking the reporter repeatedly. As such, the output of `youtube-dl -v YOUR_URL_HERE` is really all that's required to file an issue. The issue template also guides you through some basic steps you can do, such as checking that your version of youtube-dl is current.
-
-# DEVELOPER INSTRUCTIONS
-
-Most users do not need to build youtube-dl and can [download the builds](https://ytdl-org.github.io/youtube-dl/download.html) or get them from their distribution.
-
-To run youtube-dl as a developer, you don't need to build anything either. Simply execute
-
-    python -m youtube_dl
-
-To run the tests, simply invoke your favorite test runner, or execute a test file directly; any of the following work:
-
-    python -m unittest discover
-    python test/test_download.py
-    nosetests
-
-See item 6 of the [new extractor tutorial](#adding-support-for-a-new-site) for how to run extractor-specific test cases.
-
-If you want to create a build of youtube-dl yourself, you'll need
-
-* python
-* make (only GNU make is supported)
-* pandoc
-* zip
-* nosetests
-
-### Adding support for a new site
-
-If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](README.md#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites, so pull requests adding support for them **will be rejected**.
-
-After you have ensured this site is distributing its content legally, you can follow this quick list (assuming your service is called `yourextractor`):
-
-1. [Fork this repository](https://github.com/ytdl-org/youtube-dl/fork)
-2. Check out the source code with:
-
-        git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git
-
-3. Start a new git branch with
-
-        cd youtube-dl
-        git checkout -b yourextractor
-
-4. Start with this simple template and save it to `youtube_dl/extractor/yourextractor.py`:
-
-    ```python
-    # coding: utf-8
-    from __future__ import unicode_literals
-
-    from .common import InfoExtractor
-
-
-    class YourExtractorIE(InfoExtractor):
-        _VALID_URL = r'https?://(?:www\.)?yourextractor\.com/watch/(?P<id>[0-9]+)'
-        _TEST = {
-            'url': 'https://yourextractor.com/watch/42',
-            'md5': 'TODO: md5 sum of the first 10241 bytes of the video file (use --test)',
-            'info_dict': {
-                'id': '42',
-                'ext': 'mp4',
-                'title': 'Video title goes here',
-                'thumbnail': r're:^https?://.*\.jpg$',
-                # TODO more properties, either as:
-                # * A value
-                # * MD5 checksum; start the string with md5:
-                # * A regular expression; start the string with re:
-                # * Any Python type (for example int or float)
-            }
-        }
-
-        def _real_extract(self, url):
-            video_id = self._match_id(url)
-            webpage = self._download_webpage(url, video_id)
-
-            # TODO more code goes here, for example ...
-            title = self._html_search_regex(r'<h1>(.+?)</h1>', webpage, 'title')
-
-            return {
-                'id': video_id,
-                'title': title,
-                'description': self._og_search_description(webpage),
-                'uploader': self._search_regex(r'<div[^>]+id="uploader"[^>]*>([^<]+)<', webpage, 'uploader', fatal=False),
-                # TODO more properties (see youtube_dl/extractor/common.py)
-            }
-    ```
-5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/extractors.py).
-6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in.
-7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/ytdl-org/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303). Add tests and code for as many as you want.
-8. Make sure your code follows [youtube-dl coding conventions](#youtube-dl-coding-conventions) and check the code with [flake8](http://flake8.pycqa.org/en/latest/index.html#quickstart):
-
-        $ flake8 youtube_dl/extractor/yourextractor.py
-
-9. Make sure your code works under all [Python](https://www.python.org/) versions claimed supported by youtube-dl, namely 2.6, 2.7, and 3.2+.
-10. When the tests pass, [add](https://git-scm.com/docs/git-add) the new files and [commit](https://git-scm.com/docs/git-commit) them and [push](https://git-scm.com/docs/git-push) the result, like this:
-
-        $ git add youtube_dl/extractor/extractors.py
-        $ git add youtube_dl/extractor/yourextractor.py
-        $ git commit -m '[yourextractor] Add new extractor'
-        $ git push origin yourextractor
-
-11. Finally, [create a pull request](https://help.github.com/articles/creating-a-pull-request). We'll then review and merge it.
-
-In any case, thank you very much for your contributions!
-
-## youtube-dl coding conventions
-
-This section introduces guidelines for writing idiomatic, robust and future-proof extractor code.
-
-Extractors are very fragile by nature since they depend on the layout of the source data provided by 3rd party media hosters out of your control and this layout tends to change. As an extractor implementer your task is not only to write code that will extract media links and metadata correctly but also to minimize dependency on the source's layout and even to make the code foresee potential future changes and be ready for that. This is important because it will allow the extractor not to break on minor layout changes thus keeping old youtube-dl versions working. Even though this breakage issue is easily fixed by emitting a new version of youtube-dl with a fix incorporated, all the previous versions become broken in all repositories and distros' packages that may not be so prompt in fetching the update from us. Needless to say, some non rolling release distros may never receive an update at all.
-
-### Mandatory and optional metafields
-
-For extraction to work youtube-dl relies on metadata your extractor extracts and provides to youtube-dl expressed by an [information dictionary](https://github.com/ytdl-org/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303) or simply *info dict*. Only the following meta fields in the *info dict* are considered mandatory for a successful extraction process by youtube-dl:
-
- - `id` (media identifier)
- - `title` (media title)
- - `url` (media download URL) or `formats`
-
-In fact only the last option is technically mandatory (i.e. if you can't figure out the download location of the media the extraction does not make any sense). But by convention youtube-dl also treats `id` and `title` as mandatory. Thus the aforementioned metafields are the critical data that the extraction does not make any sense without and if any of them fail to be extracted then the extractor is considered completely broken.
-
-[Any field](https://github.com/ytdl-org/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L188-L303) apart from the aforementioned ones is considered **optional**. That means that extraction should be **tolerant** of situations where sources for these fields may be unavailable (even if they are always available at the moment) and **future-proof**, so that a missing optional field does not break the extraction of the mandatory fields.
-
-#### Example
-
-Say you have some source dictionary `meta` that you've fetched as JSON with HTTP request and it has a key `summary`:
-
-```python
-meta = self._download_json(url, video_id)
-```
-    
-Assume at this point `meta`'s layout is:
-
-```python
-{
-    ...
-    "summary": "some fancy summary text",
-    ...
-}
-```
-
-Assume you want to extract `summary` and put it into the resulting info dict as `description`. Since `description` is an optional meta field, you should be prepared for this key to be missing from the `meta` dict, so you should extract it like:
-
-```python
-description = meta.get('summary')  # correct
-```
-
-and not like:
-
-```python
-description = meta['summary']  # incorrect
-```
-
-The latter will break extraction process with `KeyError` if `summary` disappears from `meta` at some later time but with the former approach extraction will just go ahead with `description` set to `None` which is perfectly fine (remember `None` is equivalent to the absence of data).
-
-Similarly, you should pass `fatal=False` when extracting optional data from a webpage with `_search_regex`, `_html_search_regex` or similar methods, for instance:
-
-```python
-description = self._search_regex(
-    r'<span[^>]+id="title"[^>]*>([^<]+)<',
-    webpage, 'description', fatal=False)
-```
-
-With `fatal` set to `False` if `_search_regex` fails to extract `description` it will emit a warning and continue extraction.
-
-You can also pass `default=<some fallback value>`, for example:
-
-```python
-description = self._search_regex(
-    r'<span[^>]+id="title"[^>]*>([^<]+)<',
-    webpage, 'description', default=None)
-```
-
-On failure this code will silently continue the extraction with `description` set to `None`. That is useful for metafields that may or may not be present.
- 
-### Provide fallbacks
-
-When extracting metadata try to do so from multiple sources. For example if `title` is present in several places, try extracting from at least some of them. This makes it more future-proof in case some of the sources become unavailable.
-
-#### Example
-
-Say `meta` from the previous example has a `title` and you are about to extract it. Since `title` is a mandatory meta field you should end up with something like:
-
-```python
-title = meta['title']
-```
-
-If `title` disappears from `meta` in future due to some changes on the hoster's side the extraction would fail since `title` is mandatory. That's expected.
-
-Assume that you have another source you can extract `title` from, for example the `og:title` HTML meta of a `webpage`. In this case you can provide a fallback scenario:
-
-```python
-title = meta.get('title') or self._og_search_title(webpage)
-```
-
-This code will try to extract from `meta` first and if it fails it will try extracting `og:title` from a `webpage`.
-
-### Regular expressions
-
-#### Don't capture groups you don't use
-
-A capturing group must indicate that it is used somewhere in the code. Any group that is not used must be non-capturing.
-
-##### Example
-
-Don't capture id attribute name here since you can't use it for anything anyway.
-
-Correct:
-
-```python
-r'(?:id|ID)=(?P<id>\d+)'
-```
-
-Incorrect:
-```python
-r'(id|ID)=(?P<id>\d+)'
-```
-
-
-#### Make regular expressions relaxed and flexible
-
-When using regular expressions try to write them fuzzy, relaxed and flexible, skipping insignificant parts that are more likely to change, allowing both single and double quotes for quoted values and so on.
- 
-##### Example
-
-Say you need to extract `title` from the following HTML code:
-
-```html
-<span style="position: absolute; left: 910px; width: 90px; float: right; z-index: 9999;" class="title">some fancy title</span>
-```
-
-The code for that task should look similar to:
-
-```python
-title = self._search_regex(
-    r'<span[^>]+class="title"[^>]*>([^<]+)', webpage, 'title')
-```
-
-Or even better:
-
-```python
-title = self._search_regex(
-    r'<span[^>]+class=(["\'])title\1[^>]*>(?P<title>[^<]+)',
-    webpage, 'title', group='title')
-```
-
-Note how you tolerate potential changes in the `style` attribute's value or a switch from double quotes to single quotes for the `class` attribute.
-
-The code definitely should not look like:
-
-```python
-title = self._search_regex(
-    r'<span style="position: absolute; left: 910px; width: 90px; float: right; z-index: 9999;" class="title">(.*?)</span>',
-    webpage, 'title', group='title')
-```
-
-### Long lines policy
-
-There is a soft limit to keep lines of code under 80 characters long. This means it should be respected if possible and if it does not make readability and code maintenance worse.
-
-For example, you should **never** split long string literals like URLs or some other often copied entities over multiple lines to fit this limit:
-
-Correct:
-
-```python
-'https://www.youtube.com/watch?v=FqZTN594JQw&list=PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4'
-```
-
-Incorrect:
-
-```python
-'https://www.youtube.com/watch?v=FqZTN594JQw&list='
-'PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4'
-```
-
-### Inline values
-
-Extracting variables is acceptable for reducing code duplication and improving readability of complex expressions. However, you should avoid extracting variables used only once and moving them to opposite parts of the extractor file, which makes reading the linear flow difficult.
-
-#### Example
-
-Correct:
-
-```python
-title = self._html_search_regex(r'<title>([^<]+)</title>', webpage, 'title')
-```
-
-Incorrect:
-
-```python
-TITLE_RE = r'<title>([^<]+)</title>'
-# ...some lines of code...
-title = self._html_search_regex(TITLE_RE, webpage, 'title')
-```
-
-### Collapse fallbacks
-
-Multiple fallback values can quickly become unwieldy. Collapse multiple fallback values into a single expression via a list of patterns.
-
-#### Example
-
-Good:
-
-```python
-description = self._html_search_meta(
-    ['og:description', 'description', 'twitter:description'],
-    webpage, 'description', default=None)
-```
-
-Unwieldy:
-
-```python
-description = (
-    self._og_search_description(webpage, default=None)
-    or self._html_search_meta('description', webpage, default=None)
-    or self._html_search_meta('twitter:description', webpage, default=None))
-```
-
-Methods supporting list of patterns are: `_search_regex`, `_html_search_regex`, `_og_search_property`, `_html_search_meta`.
-
-### Trailing parentheses
-
-Always move trailing parentheses after the last argument.
-
-#### Example
-
-Correct:
-
-```python
-    lambda x: x['ResultSet']['Result'][0]['VideoUrlSet']['VideoUrl'],
-    list)
-```
-
-Incorrect:
-
-```python
-    lambda x: x['ResultSet']['Result'][0]['VideoUrlSet']['VideoUrl'],
-    list,
-)
-```
-
-### Use convenience conversion and parsing functions
-
-Wrap all extracted numeric data into safe functions from [`youtube_dl/utils.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/utils.py): `int_or_none`, `float_or_none`. Use them for string to number conversions as well.
-
-Use `url_or_none` for safe URL processing.
-
-Use `try_get` for safe metadata extraction from parsed JSON.
-
-Use `unified_strdate` for uniform `upload_date` or any `YYYYMMDD` meta field extraction, `unified_timestamp` for uniform `timestamp` extraction, `parse_filesize` for `filesize` extraction, `parse_count` for count meta fields extraction, `parse_resolution`, `parse_duration` for `duration` extraction, `parse_age_limit` for `age_limit` extraction. 
-
-Explore [`youtube_dl/utils.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/utils.py) for more useful convenience functions.
-
-#### More examples
-
-##### Safely extract optional description from parsed JSON
-```python
-description = try_get(response, lambda x: x['result']['video'][0]['summary'], compat_str)
-```
-
-##### Safely extract more optional metadata
-```python
-video = try_get(response, lambda x: x['result']['video'][0], dict) or {}
-description = video.get('summary')
-duration = float_or_none(video.get('durationMs'), scale=1000)
-view_count = int_or_none(video.get('views'))
-```
-
-# EMBEDDING YOUTUBE-DL
-
-youtube-dl makes the best effort to be a good command-line program, and thus should be callable from any programming language. If you encounter any problems parsing its output, feel free to [create a report](https://github.com/ytdl-org/youtube-dl/issues/new).
-
-From a Python program, you can embed youtube-dl in a more powerful fashion, like this:
-
-```python
-from __future__ import unicode_literals
-import youtube_dl
-
-ydl_opts = {}
-with youtube_dl.YoutubeDL(ydl_opts) as ydl:
-    ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc'])
-```
-
-Most likely, you'll want to use various options. For a list of options available, have a look at [`youtube_dl/YoutubeDL.py`](https://github.com/ytdl-org/youtube-dl/blob/3e4cedf9e8cd3157df2457df7274d0c842421945/youtube_dl/YoutubeDL.py#L137-L312). For a start, if you want to intercept youtube-dl's output, set a `logger` object.
-
-Here's a more complete example of a program that outputs only errors (and a short message after the download is finished), and downloads/converts the video to an mp3 file:
-
-```python
-from __future__ import unicode_literals
-import youtube_dl
-
-
-class MyLogger(object):
-    def debug(self, msg):
-        pass
-
-    def warning(self, msg):
-        pass
-
-    def error(self, msg):
-        print(msg)
-
-
-def my_hook(d):
-    if d['status'] == 'finished':
-        print('Done downloading, now converting ...')
-
-
-ydl_opts = {
-    'format': 'bestaudio/best',
-    'postprocessors': [{
-        'key': 'FFmpegExtractAudio',
-        'preferredcodec': 'mp3',
-        'preferredquality': '192',
-    }],
-    'logger': MyLogger(),
-    'progress_hooks': [my_hook],
-}
-with youtube_dl.YoutubeDL(ydl_opts) as ydl:
-    ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc'])
-```
-
-# BUGS
-
-Bugs and suggestions should be reported at: <https://github.com/ytdl-org/youtube-dl/issues>. Unless you were prompted to or there is another pertinent reason (e.g. GitHub fails to accept the bug report), please do not send bug reports via personal email. For discussions, join us in the IRC channel [#youtube-dl](irc://chat.freenode.net/#youtube-dl) on freenode ([webchat](https://webchat.freenode.net/?randomnick=1&channels=youtube-dl)).
-
-**Please include the full output of youtube-dl when run with `-v`**, i.e. **add** `-v` flag to **your command line**, copy the **whole** output and post it in the issue body wrapped in \`\`\` for better formatting. It should look similar to this:
-```
-$ youtube-dl -v <your command line>
-[debug] System config: []
-[debug] User config: []
-[debug] Command-line args: [u'-v', u'https://www.youtube.com/watch?v=BaW_jenozKcj']
-[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2015.12.06
-[debug] Git HEAD: 135392e
-[debug] Python version 2.6.6 - Windows-2003Server-5.2.3790-SP2
-[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
-[debug] Proxy map: {}
-...
-```
-**Do not post screenshots of verbose logs; only plain text is acceptable.**
-
-The output (including the first lines) contains important debugging information. Issues without the full output are often not reproducible and therefore do not get solved in short order, if ever.
-
-Please re-read your issue once again to avoid a couple of common mistakes (you can and should use this as a checklist):
-
-### Is the description of the issue itself sufficient?
-
-We often get issue reports that we cannot really decipher. While in most cases we eventually get the required information after asking back multiple times, this poses an unnecessary drain on our resources. Many contributors, including myself, are also not native speakers, so we may misread some parts.
-
-So please elaborate on what feature you are requesting, or what bug you want to be fixed. Make sure that it's obvious
-
-- What the problem is
-- How it could be fixed
-- What your proposed solution would look like
-
-If your report is shorter than two lines, it is almost certainly missing some of these, which makes it hard for us to respond to it. We're often too polite to close the issue outright, but the missing info makes misinterpretation likely. As a committer myself, I often get frustrated by these issues, since the only possible way for me to move forward on them is to ask for clarification over and over.
-
-For bug reports, this means that your report should contain the *complete* output of youtube-dl when called with the `-v` flag. The error message you get for (most) bugs even says so, but you would not believe how many of our bug reports do not contain this information.
-
-If your server has multiple IPs or you suspect censorship, adding `--call-home` may be a good idea to get more diagnostics. If the error is `ERROR: Unable to extract ...` and you cannot reproduce it from multiple countries, add `--dump-pages` (warning: this will yield a rather large output, redirect it to the file `log.txt` by adding `>log.txt 2>&1` to your command-line) or upload the `.dump` files you get when you add `--write-pages` [somewhere](https://gist.github.com/).
-
-**Site support requests must contain an example URL**. An example URL is a URL you might want to download, like `https://www.youtube.com/watch?v=BaW_jenozKc`. There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. `https://www.youtube.com/`) is *not* an example URL.
-
-###  Are you using the latest version?
-
-Before reporting any issue, type `youtube-dl -U`. This should report that you're up-to-date. About 20% of the reports we receive are already fixed, but people are using outdated versions. This goes for feature requests as well.
-
-###  Is the issue already documented?
-
-Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or browse the [GitHub Issues](https://github.com/ytdl-org/youtube-dl/search?type=Issues) of this repository. If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity.
-
-###  Why are existing options not enough?
-
-Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/ytdl-org/youtube-dl/blob/master/README.md#options). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem.
-
-###  Is there enough context in your bug report?
-
-People want to solve problems, and often think they do us a favor by breaking down their larger problems (e.g. wanting to skip already downloaded files) to a specific request (e.g. requesting us to look whether the file exists before downloading the info page). However, what often happens is that they break down the problem into two steps: One simple, and one impossible (or extremely complicated one).
-
-We are then presented with a very complicated request when the original problem could be solved far easier, e.g. by recording the downloaded video IDs in a separate file. To avoid this, you must include the greater context where it is non-obvious. In particular, every feature request that does not consist of adding support for a new site should contain a use case scenario that explains in what situation the missing feature would be useful.
-
-###  Does the issue involve one problem, and one problem only?
-
-Some of our users seem to think there is a limit of issues they can or should open. There is no limit of issues they can or should open. While it may seem appealing to be able to dump all your issues into one ticket, that means that someone who solves one of your issues cannot mark the issue as closed. Typically, reporting a bunch of issues leads to the ticket lingering since nobody wants to attack that behemoth, until someone mercifully splits the issue into multiple ones.
-
-In particular, every site support request issue should only pertain to services at one site (generally under a common domain, but always using the same backend technology). Do not request support for vimeo user videos, White house podcasts, and Google Plus pages in the same issue. Also, make sure that you don't post bug reports alongside feature requests. As a rule of thumb, a feature request does not include outputs of youtube-dl that are not immediately related to the feature at hand. Do not post reports of a network error alongside the request for a new video service.
-
-###  Is anyone going to need the feature?
-
-Only post features that you (or an incapacitated friend you can personally talk to) require. Do not post features because they seem like a good idea. If they are really useful, they will be requested by someone who requires them.
-
-###  Is your question about youtube-dl?
-
-It may sound strange, but some bug reports we receive are completely unrelated to youtube-dl and relate to a different, or even the reporter's own, application. Please make sure that you are actually using youtube-dl. If you are using a UI for youtube-dl, report the bug to the maintainer of the actual application providing the UI. On the other hand, if your UI for youtube-dl fails in some way you believe is related to youtube-dl, by all means, go ahead and report the bug.
-
-# COPYRIGHT
-
-youtube-dl is released into the public domain by the copyright holders.
-
-This README file was originally written by [Daniel Bolton](https://github.com/dbbolton) and is likewise released into the public domain.
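The coding conventions removed with the README above (tolerant handling of optional fields, `try_get`, `int_or_none`, `float_or_none`) can be illustrated with a small standalone sketch. The three helpers below are simplified, hypothetical stand-ins written for this note; they are not the implementations from `youtube_dl/utils.py`, and the `response` dict is made up.

```python
# Simplified, hypothetical stand-ins for the helpers named in the README;
# they are NOT the implementations from youtube_dl/utils.py.


def try_get(src, getter, expected_type=None):
    """Return getter(src), or None if the lookup fails or has the wrong type."""
    try:
        value = getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        return None
    if expected_type is None or isinstance(value, expected_type):
        return value
    return None


def int_or_none(value):
    """Convert value to int, or return None if it cannot be converted."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def float_or_none(value, scale=1):
    """Convert value to float divided by scale, or return None on failure."""
    try:
        return float(value) / scale
    except (TypeError, ValueError):
        return None


# Made-up response in which the optional 'views' field is missing.
response = {'result': {'video': [{'summary': 'some fancy summary text',
                                  'durationMs': '93500'}]}}

video = try_get(response, lambda x: x['result']['video'][0], dict) or {}
description = video.get('summary')
duration = float_or_none(video.get('durationMs'), scale=1000)
view_count = int_or_none(video.get('views'))  # None: the key is absent
```

The missing `views` key yields `None` rather than an exception, which is the behaviour the conventions ask for with optional metadata.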
diff --git a/bin/youtube-dl b/bin/youtube-dl

deleted file mode 100755 (executable)

index fc3cc8a..0000000
--- a/bin/youtube-dl
+++ /dev/null
@@ -1,6 +0,0 @@
-#!/usr/bin/env python
-
-import youtube_dl
-
-if __name__ == '__main__':
-    youtube_dl.main()
diff --git a/devscripts/bash-completion.in b/devscripts/bash-completion.in

index 28bd237278da5c0ade9ed96a4cb722c1b6973cf3..1bf41f2ccf9f576bfdcd6d521869e00b452107c4 100644 (file)
--- a/devscripts/bash-completion.in
+++ b/devscripts/bash-completion.in
@@ -1,4 +1,4 @@
-__youtube_dl()
+__youtube_dlc()
  {
      local cur prev opts fileopts diropts keywords
      COMPREPLY=()
@@ -26,4 +26,4 @@ __youtube_dl()
      fi
  }
  
-complete -F __youtube_dl youtube-dl
+complete -F __youtube_dlc youtube-dlc
diff --git a/devscripts/bash-completion.py b/devscripts/bash-completion.py

index 3d1391334bd38a23c7024192c6c36522acaa5613..d68c9b1ccec6498a798b965c04122cafd94b3c88 100755 (executable)
--- a/devscripts/bash-completion.py
+++ b/devscripts/bash-completion.py
@@ -6,9 +6,9 @@
  import sys
  
  sys.path.insert(0, dirn(dirn((os.path.abspath(__file__)))))
-import youtube_dl
+import youtube_dlc
  
-BASH_COMPLETION_FILE = "youtube-dl.bash-completion"
+BASH_COMPLETION_FILE = "youtube-dlc.bash-completion"
  BASH_COMPLETION_TEMPLATE = "devscripts/bash-completion.in"
  
  
@@ -26,5 +26,5 @@ def build_completion(opt_parser):
          f.write(filled_template)
  
  
-parser = youtube_dl.parseOpts()[0]
+parser = youtube_dlc.parseOpts()[0]
  build_completion(parser)
diff --git a/devscripts/buildserver.py b/devscripts/buildserver.py

index 4a4295ba9cd3b8acf739c983248494c989781e63..62dbd2cb17590bbd2e05fc37a1d7ba6e24af5c46 100644 (file)
--- a/devscripts/buildserver.py
+++ b/devscripts/buildserver.py
@@ -12,7 +12,7 @@
  import os.path
  
  sys.path.insert(0, os.path.dirname(os.path.dirname((os.path.abspath(__file__)))))
-from youtube_dl.compat import (
+from youtube_dlc.compat import (
      compat_input,
      compat_http_server,
      compat_str,
@@ -325,7 +325,7 @@ class YoutubeDLBuilder(object):
      authorizedUsers = ['fraca7', 'phihag', 'rg3', 'FiloSottile', 'ytdl-org']
  
      def __init__(self, **kwargs):
-        if self.repoName != 'youtube-dl':
+        if self.repoName != 'youtube-dlc':
              raise BuildError('Invalid repository "%s"' % self.repoName)
          if self.user not in self.authorizedUsers:
              raise HTTPError('Unauthorized user "%s"' % self.user, 401)
diff --git a/devscripts/check-porn.py b/devscripts/check-porn.py

index 740f04de0f22ad3ac6352b114b0e8f99cf717a9f..68a33d823f2308496123fd2e08c21ff39249008d 100644 (file)
--- a/devscripts/check-porn.py
+++ b/devscripts/check-porn.py
@@ -15,8 +15,8 @@
  sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
  
  from test.helper import gettestcases
-from youtube_dl.utils import compat_urllib_parse_urlparse
-from youtube_dl.utils import compat_urllib_request
+from youtube_dlc.utils import compat_urllib_parse_urlparse
+from youtube_dlc.utils import compat_urllib_request
  
  if len(sys.argv) > 1:
      METHOD = 'LIST'
diff --git a/devscripts/create-github-release.py b/devscripts/create-github-release.py

index 428111b3f0e893d9ae53da648844833e87dd72b3..4714d81a6ee17ea117a2a4378dd8b6c5f80fb968 100644 (file)
--- a/devscripts/create-github-release.py
+++ b/devscripts/create-github-release.py
@@ -1,7 +1,6 @@
  #!/usr/bin/env python
  from __future__ import unicode_literals
  
-import base64
  import io
  import json
  import mimetypes
@@ -13,14 +12,13 @@
  
  sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
  
-from youtube_dl.compat import (
+from youtube_dlc.compat import (
      compat_basestring,
-    compat_input,
      compat_getpass,
      compat_print,
      compat_urllib_request,
  )
-from youtube_dl.utils import (
+from youtube_dlc.utils import (
      make_HTTPS_handler,
      sanitized_Request,
  )
@@ -40,28 +38,20 @@ def _init_github_account(self):
          try:
              info = netrc.netrc().authenticators(self._NETRC_MACHINE)
              if info is not None:
-                self._username = info[0]
-                self._password = info[2]
+                self._token = info[2]
                  compat_print('Using GitHub credentials found in .netrc...')
                  return
              else:
                  compat_print('No GitHub credentials found in .netrc')
          except (IOError, netrc.NetrcParseError):
              compat_print('Unable to parse .netrc')
-        self._username = compat_input(
-            'Type your GitHub username or email address and press [Return]: ')
-        self._password = compat_getpass(
-            'Type your GitHub password and press [Return]: ')
+        self._token = compat_getpass(
+            'Type your GitHub PAT (personal access token) and press [Return]: ')
  
      def _call(self, req):
          if isinstance(req, compat_basestring):
              req = sanitized_Request(req)
-        # Authorizing manually since GitHub does not response with 401 with
-        # WWW-Authenticate header set (see
-        # https://developer.github.com/v3/#basic-authentication)
-        b64 = base64.b64encode(
-            ('%s:%s' % (self._username, self._password)).encode('utf-8')).decode('ascii')
-        req.add_header('Authorization', 'Basic %s' % b64)
+        req.add_header('Authorization', 'token %s' % self._token)
          response = self._opener.open(req).read().decode('utf-8')
          return json.loads(response)
  
@@ -108,7 +98,7 @@ def main():
      releaser = GitHubReleaser()
  
      new_release = releaser.create_release(
-        version, name='youtube-dl %s' % version, body=body)
+        version, name='youtube-dlc %s' % version, body=body)
      release_id = new_release['id']
  
      for asset in os.listdir(build_path):
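The `_call` change above replaces HTTP Basic credentials with a personal access token sent as `Authorization: token <PAT>`. A minimal sketch of that header construction, assuming Python 3's `urllib.request` in place of the script's compat wrappers; `build_authorized_request`, the URL and the token value are placeholders for illustration, and no request is actually sent.

```python
import urllib.request


def build_authorized_request(url, token):
    """Return a Request carrying the token in its Authorization header."""
    req = urllib.request.Request(url)
    # Same header format as the '+' line in _call(): 'token <PAT>'.
    req.add_header('Authorization', 'token %s' % token)
    return req


# Placeholder URL and token; nothing is sent over the network here.
req = build_authorized_request(
    'https://api.github.com/user', 'PLACEHOLDER_TOKEN')
auth_header = req.get_header('Authorization')
```

With a token there is no username to collect, which is why the hunk also drops `compat_input` and the `base64` import.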
diff --git a/devscripts/fish-completion.in b/devscripts/fish-completion.in

index eb79765da20b795fed35f5c02e41fb9ada630626..4f08b6d4a427eedec81d3cc78f34eb119df1ff6b 100644 (file)
--- a/devscripts/fish-completion.in
+++ b/devscripts/fish-completion.in
@@ -2,4 +2,4 @@
  {{commands}}
  
  
-complete --command youtube-dl --arguments ":ytfavorites :ytrecommended :ytsubscriptions :ytwatchlater :ythistory"
+complete --command youtube-dlc --arguments ":ytfavorites :ytrecommended :ytsubscriptions :ytwatchlater :ythistory"
diff --git a/devscripts/fish-completion.py b/devscripts/fish-completion.py

index 51d19dd33d3bf5c05fc86f3c63e23c00871fda90..a27ef44f8fe1474176291f69b31ff5a3ea46337c 100755 (executable)
--- a/devscripts/fish-completion.py
+++ b/devscripts/fish-completion.py
@@ -7,10 +7,10 @@
  import sys
  
  sys.path.insert(0, dirn(dirn((os.path.abspath(__file__)))))
-import youtube_dl
-from youtube_dl.utils import shell_quote
+import youtube_dlc
+from youtube_dlc.utils import shell_quote
  
-FISH_COMPLETION_FILE = 'youtube-dl.fish'
+FISH_COMPLETION_FILE = 'youtube-dlc.fish'
  FISH_COMPLETION_TEMPLATE = 'devscripts/fish-completion.in'
  
  EXTRA_ARGS = {
@@ -30,7 +30,7 @@ def build_completion(opt_parser):
      for group in opt_parser.option_groups:
          for option in group.option_list:
              long_option = option.get_opt_string().strip('-')
-            complete_cmd = ['complete', '--command', 'youtube-dl', '--long-option', long_option]
+            complete_cmd = ['complete', '--command', 'youtube-dlc', '--long-option', long_option]
              if option._short_opts:
                  complete_cmd += ['--short-option', option._short_opts[0].strip('-')]
              if option.help != optparse.SUPPRESS_HELP:
@@ -45,5 +45,5 @@ def build_completion(opt_parser):
          f.write(filled_template)
  
  
-parser = youtube_dl.parseOpts()[0]
+parser = youtube_dlc.parseOpts()[0]
  build_completion(parser)
diff --git a/devscripts/generate_aes_testdata.py b/devscripts/generate_aes_testdata.py

index e3df42cc2da6c99d9104c9bd2bac776af5a61c46..c89bb547e78dc4b2cd615d7ed26c9faa4a708da6 100644 (file)
--- a/devscripts/generate_aes_testdata.py
+++ b/devscripts/generate_aes_testdata.py
@@ -7,8 +7,8 @@
  import sys
  sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
  
-from youtube_dl.utils import intlist_to_bytes
-from youtube_dl.aes import aes_encrypt, key_expansion
+from youtube_dlc.utils import intlist_to_bytes
+from youtube_dlc.aes import aes_encrypt, key_expansion
  
  secret_msg = b'Secret message goes here'
  
diff --git a/devscripts/gh-pages/add-version.py b/devscripts/gh-pages/add-version.py

index 867ea0048fb88f1ca1382f11b1c60b17110d4fc8..04588a5eec75e21035fd2d81c866a15d3afc7dcc 100755 (executable)
--- a/devscripts/gh-pages/add-version.py
+++ b/devscripts/gh-pages/add-version.py
@@ -22,9 +22,9 @@
  new_version = {}
  
  filenames = {
-    'bin': 'youtube-dl',
-    'exe': 'youtube-dl.exe',
-    'tar': 'youtube-dl-%s.tar.gz' % version}
+    'bin': 'youtube-dlc',
+    'exe': 'youtube-dlc.exe',
+    'tar': 'youtube-dlc-%s.tar.gz' % version}
  build_dir = os.path.join('..', '..', 'build', version)
  for key, filename in filenames.items():
      url = 'https://yt-dl.org/downloads/%s/%s' % (version, filename)
diff --git a/devscripts/gh-pages/update-feed.py b/devscripts/gh-pages/update-feed.py

index 506a623772e0c2195f6b3692575f5942e84c046c..b07f1e830c6191710ab3ea67d5a1c2b0e49bf54f 100755 (executable)
--- a/devscripts/gh-pages/update-feed.py
+++ b/devscripts/gh-pages/update-feed.py
@@ -11,24 +11,24 @@
      <?xml version="1.0" encoding="utf-8"?>
      <feed xmlns="http://www.w3.org/2005/Atom">
          <link rel="self" href="http://ytdl-org.github.io/youtube-dl/update/releases.atom" />
-        <title>youtube-dl releases</title>
-        <id>https://yt-dl.org/feed/youtube-dl-updates-feed</id>
+        <title>youtube-dlc releases</title>
+        <id>https://yt-dl.org/feed/youtube-dlc-updates-feed</id>
          <updated>@TIMESTAMP@</updated>
          @ENTRIES@
      </feed>""")
  
  entry_template = textwrap.dedent("""
      <entry>
-        <id>https://yt-dl.org/feed/youtube-dl-updates-feed/youtube-dl-@VERSION@</id>
+        <id>https://yt-dl.org/feed/youtube-dlc-updates-feed/youtube-dlc-@VERSION@</id>
          <title>New version @VERSION@</title>
-        <link href="http://ytdl-org.github.io/youtube-dl" />
+        <link href="http://ytdl-org.github.io/youtube-dlc" />
          <content type="xhtml">
              <div xmlns="http://www.w3.org/1999/xhtml">
                  Downloads available at <a href="https://yt-dl.org/downloads/@VERSION@/">https://yt-dl.org/downloads/@VERSION@/</a>
              </div>
          </content>
          <author>
-            <name>The youtube-dl maintainers</name>
+            <name>The youtube-dlc maintainers</name>
          </author>
          <updated>@TIMESTAMP@</updated>
      </entry>
diff --git a/devscripts/gh-pages/update-sites.py b/devscripts/gh-pages/update-sites.py

index 531c93c7089c1847a7e9018fcda5ca177f68547e..38acb5d9a2df82b9b5e892bb85d02b416f15c36c 100755 (executable)
--- a/devscripts/gh-pages/update-sites.py
+++ b/devscripts/gh-pages/update-sites.py
@@ -5,10 +5,10 @@
  import os
  import textwrap
  
-# We must be able to import youtube_dl
+# We must be able to import youtube_dlc
  sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))))
  
-import youtube_dl
+import youtube_dlc
  
  
  def main():
@@ -16,7 +16,7 @@ def main():
          template = tmplf.read()
  
      ie_htmls = []
-    for ie in youtube_dl.list_extractors(age_limit=None):
+    for ie in youtube_dlc.list_extractors(age_limit=None):
          ie_html = '<b>{}</b>'.format(ie.IE_NAME)
          ie_desc = getattr(ie, 'IE_DESC', None)
          if ie_desc is False:
diff --git a/devscripts/make_contributing.py b/devscripts/make_contributing.py

index 226d1a5d6644953982db6346a00a21ec45f9b089..80426fb0ace3f76ed2158ca7dd0cef0b1cef1358 100755 (executable)
--- a/devscripts/make_contributing.py
+++ b/devscripts/make_contributing.py
@@ -1,9 +1,9 @@
  #!/usr/bin/env python
  from __future__ import unicode_literals
  
-import io
+# import io
  import optparse
-import re
+# import re
  
  
  def main():
@@ -12,22 +12,22 @@ def main():
      if len(args) != 2:
          parser.error('Expected an input and an output filename')
  
-    infile, outfile = args
+
+"""     infile, outfile = args
  
      with io.open(infile, encoding='utf-8') as inf:
          readme = inf.read()
  
-    bug_text = re.search(
-        r'(?s)#\s*BUGS\s*[^\n]*\s*(.*?)#\s*COPYRIGHT', readme).group(1)
-    dev_text = re.search(
-        r'(?s)(#\s*DEVELOPER INSTRUCTIONS.*?)#\s*EMBEDDING YOUTUBE-DL',
-        readme).group(1)
+    bug_text = re.search( """
+# r'(?s)#\s*BUGS\s*[^\n]*\s*(.*?)#\s*COPYRIGHT', readme).group(1)
+# dev_text = re.search(
+# r'(?s)(#\s*DEVELOPER INSTRUCTIONS.*?)#\s*EMBEDDING youtube-dlc',
+"""         readme).group(1)
  
      out = bug_text + dev_text
  
      with io.open(outfile, 'w', encoding='utf-8') as outf:
-        outf.write(out)
-
+        outf.write(out) """
  
  if __name__ == '__main__':
      main()
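The hunk above disables the body of `main()` by wrapping it in string literals and comments, so the script now only validates its arguments. For reference, the extraction that the `-` lines performed can be sketched as a standalone function. The regular expressions are copied from the removed lines; `extract_contributing` and the sample README text are hypothetical and exist only for this sketch.

```python
import re


def extract_contributing(readme):
    """Join the BUGS section body and the DEVELOPER INSTRUCTIONS section."""
    # Text between the BUGS heading and the COPYRIGHT heading. The
    # '\s*[^\n]*\s*' part also consumes the first non-blank line after
    # the heading, so that line is not part of the captured group.
    bug_text = re.search(
        r'(?s)#\s*BUGS\s*[^\n]*\s*(.*?)#\s*COPYRIGHT', readme).group(1)
    # Everything from the DEVELOPER INSTRUCTIONS heading up to the
    # EMBEDDING heading, including the heading itself.
    dev_text = re.search(
        r'(?s)(#\s*DEVELOPER INSTRUCTIONS.*?)#\s*EMBEDDING YOUTUBE-DL',
        readme).group(1)
    return bug_text + dev_text


# Made-up README with the same heading order as the real one.
sample_readme = (
    '# DEVELOPER INSTRUCTIONS\n\ndev notes\n\n'
    '# EMBEDDING YOUTUBE-DL\n\nembed notes\n\n'
    '# BUGS\n\nReport bugs at the tracker.\n\nInclude -v output.\n\n'
    '# COPYRIGHT\n\npublic domain\n')

contributing = extract_contributing(sample_readme)
```

Both `re.search` calls return `None` when a heading is missing, so the chained `.group(1)` raises `AttributeError` on a README that no longer contains these sections.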
diff --git a/devscripts/make_issue_template.py b/devscripts/make_issue_template.py

index b7ad23d8363005c7f62056822219d7772e1d91d5..37cb0d4ee11987ae690b5f61306b5de1517f0a02 100644 (file)
--- a/devscripts/make_issue_template.py
+++ b/devscripts/make_issue_template.py
@@ -16,9 +16,9 @@ def main():
      with io.open(infile, encoding='utf-8') as inf:
          issue_template_tmpl = inf.read()
  
-    # Get the version from youtube_dl/version.py without importing the package
-    exec(compile(open('youtube_dl/version.py').read(),
-                 'youtube_dl/version.py', 'exec'))
+    # Get the version from youtube_dlc/version.py without importing the package
+    exec(compile(open('youtube_dlc/version.py').read(),
+                 'youtube_dlc/version.py', 'exec'))
  
      out = issue_template_tmpl % {'version': locals()['__version__']}
  
diff --git a/devscripts/make_lazy_extractors.py b/devscripts/make_lazy_extractors.py

index 0a1762dbce85adf9049529ec15bdb61510787d0a..e6de72b33a2ad4e5bf3b8a2153b16119bf106634 100644 (file)
--- a/devscripts/make_lazy_extractors.py
+++ b/devscripts/make_lazy_extractors.py
@@ -14,8 +14,8 @@
  if os.path.exists(lazy_extractors_filename):
      os.remove(lazy_extractors_filename)
  
-from youtube_dl.extractor import _ALL_CLASSES
-from youtube_dl.extractor.common import InfoExtractor, SearchInfoExtractor
+from youtube_dlc.extractor import _ALL_CLASSES
+from youtube_dlc.extractor.common import InfoExtractor, SearchInfoExtractor
  
  with open('devscripts/lazy_load_template.py', 'rt') as f:
      module_template = f.read()
diff --git a/devscripts/make_readme.py b/devscripts/make_readme.py

index 8fbce07967c177217f5d39162d9f0958f1d41bf5..73f203582aea4ef1f7f31576cfd11b50dc96ebfe 100755 (executable)
--- a/devscripts/make_readme.py
+++ b/devscripts/make_readme.py
@@ -14,7 +14,7 @@
      oldreadme = f.read()
  
  header = oldreadme[:oldreadme.index('# OPTIONS')]
-footer = oldreadme[oldreadme.index('# CONFIGURATION'):]
+# footer = oldreadme[oldreadme.index('# CONFIGURATION'):]
  
  options = helptext[helptext.index('  General Options:') + 19:]
  options = re.sub(r'(?m)^  (\w.+)$', r'## \1', options)
@@ -23,4 +23,4 @@
  with io.open(README_FILE, 'w', encoding='utf-8') as f:
      f.write(header)
      f.write(options)
-    f.write(footer)
+    # f.write(footer)
diff --git a/devscripts/make_supportedsites.py b/devscripts/make_supportedsites.py

index 764795bc5b1e560b033c2e9a0c395cecb10b1242..0ae6f8aa3092041bec374cee522cd5de4129ab77 100644 (file)
--- a/devscripts/make_supportedsites.py
+++ b/devscripts/make_supportedsites.py
@@ -7,10 +7,10 @@
  import sys
  
  
-# Import youtube_dl
+# Import youtube_dlc
  ROOT_DIR = os.path.join(os.path.dirname(__file__), '..')
  sys.path.insert(0, ROOT_DIR)
-import youtube_dl
+import youtube_dlc
  
  
  def main():
@@ -33,7 +33,7 @@ def gen_ies_md(ies):
                  ie_md += ' (Currently broken)'
              yield ie_md
  
-    ies = sorted(youtube_dl.gen_extractors(), key=lambda i: i.IE_NAME.lower())
+    ies = sorted(youtube_dlc.gen_extractors(), key=lambda i: i.IE_NAME.lower())
      out = '# Supported sites\n' + ''.join(
          ' - ' + md + '\n'
          for md in gen_ies_md(ies))
diff --git a/devscripts/prepare_manpage.py b/devscripts/prepare_manpage.py

index 76bf873e1bd70b7e5c3a20caf2ea80f0941a2dea..843ade482ef443ba0fb44dd4fdaaa88f41548b41 100644 (file)
--- a/devscripts/prepare_manpage.py
+++ b/devscripts/prepare_manpage.py
@@ -8,7 +8,7 @@
  ROOT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
  README_FILE = os.path.join(ROOT_DIR, 'README.md')
  
-PREFIX = r'''%YOUTUBE-DL(1)
+PREFIX = r'''%youtube-dlc(1)
  
  # NAME
  
@@ -16,7 +16,7 @@
  
  # SYNOPSIS
  
-**youtube-dl** \[OPTIONS\] URL [URL...]
+**youtube-dlc** \[OPTIONS\] URL [URL...]
  
  '''
  
@@ -33,7 +33,7 @@ def main():
          readme = f.read()
  
      readme = re.sub(r'(?s)^.*?(?=# DESCRIPTION)', '', readme)
-    readme = re.sub(r'\s+youtube-dl \[OPTIONS\] URL \[URL\.\.\.\]', '', readme)
+    readme = re.sub(r'\s+youtube-dlc \[OPTIONS\] URL \[URL\.\.\.\]', '', readme)
      readme = PREFIX + readme
  
      readme = filter_options(readme)
diff --git a/devscripts/release.sh b/devscripts/release.sh

index f2411c92724f3df995d981f31eccb0a08d390cd0..04cb7fec1b6d185ddd126564804cf949a6f5dd63 100755 (executable)
--- a/devscripts/release.sh
+++ b/devscripts/release.sh
@@ -53,8 +53,8 @@ fi
  
  if [ ! -z "`git tag | grep "$version"`" ]; then echo 'ERROR: version already present'; exit 1; fi
  if [ ! -z "`git status --porcelain | grep -v CHANGELOG`" ]; then echo 'ERROR: the working directory is not clean; commit or stash changes'; exit 1; fi
-useless_files=$(find youtube_dl -type f -not -name '*.py')
-if [ ! -z "$useless_files" ]; then echo "ERROR: Non-.py files in youtube_dl: $useless_files"; exit 1; fi
+useless_files=$(find youtube_dlc -type f -not -name '*.py')
+if [ ! -z "$useless_files" ]; then echo "ERROR: Non-.py files in youtube_dlc: $useless_files"; exit 1; fi
  if [ ! -f "updates_key.pem" ]; then echo 'ERROR: updates_key.pem missing'; exit 1; fi
  if ! type pandoc >/dev/null 2>/dev/null; then echo 'ERROR: pandoc is missing'; exit 1; fi
  if ! python3 -c 'import rsa' 2>/dev/null; then echo 'ERROR: python3-rsa is missing'; exit 1; fi
@@ -68,18 +68,18 @@ make clean
  if $skip_tests ; then
      echo 'SKIPPING TESTS'
  else
-    nosetests --verbose --with-coverage --cover-package=youtube_dl --cover-html test --stop || exit 1
+    nosetests --verbose --with-coverage --cover-package=youtube_dlc --cover-html test --stop || exit 1
  fi
  
  /bin/echo -e "\n### Changing version in version.py..."
-sed -i "s/__version__ = '.*'/__version__ = '$version'/" youtube_dl/version.py
+sed -i "s/__version__ = '.*'/__version__ = '$version'/" youtube_dlc/version.py
  
  /bin/echo -e "\n### Changing version in ChangeLog..."
  sed -i "s/<unreleased>/$version/" ChangeLog
  
-/bin/echo -e "\n### Committing documentation, templates and youtube_dl/version.py..."
+/bin/echo -e "\n### Committing documentation, templates and youtube_dlc/version.py..."
  make README.md CONTRIBUTING.md issuetemplates supportedsites
-git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE/1_broken_site.md .github/ISSUE_TEMPLATE/2_site_support_request.md .github/ISSUE_TEMPLATE/3_site_feature_request.md .github/ISSUE_TEMPLATE/4_bug_report.md .github/ISSUE_TEMPLATE/5_feature_request.md .github/ISSUE_TEMPLATE/6_question.md docs/supportedsites.md youtube_dl/version.py ChangeLog
+git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE/1_broken_site.md .github/ISSUE_TEMPLATE/2_site_support_request.md .github/ISSUE_TEMPLATE/3_site_feature_request.md .github/ISSUE_TEMPLATE/4_bug_report.md .github/ISSUE_TEMPLATE/5_feature_request.md .github/ISSUE_TEMPLATE/6_question.md docs/supportedsites.md youtube_dlc/version.py ChangeLog
  git commit $gpg_sign_commits -m "release $version"
  
  /bin/echo -e "\n### Now tagging, signing and pushing..."
@@ -94,13 +94,13 @@ git push origin "$version"
  
  /bin/echo -e "\n### OK, now it is time to build the binaries..."
  REV=$(git rev-parse HEAD)
-make youtube-dl youtube-dl.tar.gz
+make youtube-dlc youtube-dlc.tar.gz
  read -p "VM running? (y/n) " -n 1
-wget "http://$buildserver/build/ytdl-org/youtube-dl/youtube-dl.exe?rev=$REV" -O youtube-dl.exe
+wget "http://$buildserver/build/ytdl-org/youtube-dl/youtube-dlc.exe?rev=$REV" -O youtube-dlc.exe
  mkdir -p "build/$version"
-mv youtube-dl youtube-dl.exe "build/$version"
-mv youtube-dl.tar.gz "build/$version/youtube-dl-$version.tar.gz"
-RELEASE_FILES="youtube-dl youtube-dl.exe youtube-dl-$version.tar.gz"
+mv youtube-dlc youtube-dlc.exe "build/$version"
+mv youtube-dlc.tar.gz "build/$version/youtube-dlc-$version.tar.gz"
+RELEASE_FILES="youtube-dlc youtube-dlc.exe youtube-dlc-$version.tar.gz"
  (cd build/$version/ && md5sum $RELEASE_FILES > MD5SUMS)
  (cd build/$version/ && sha1sum $RELEASE_FILES > SHA1SUMS)
  (cd build/$version/ && sha256sum $RELEASE_FILES > SHA2-256SUMS)
diff --git a/devscripts/show-downloads-statistics.py b/devscripts/show-downloads-statistics.py

index 6c8d1cc2d29219997de1e216bae08de2ef8a02fc..ef90a56ab2e01a9dfc71267fd242fdcc3f70bfa0 100644 (file)
--- a/devscripts/show-downloads-statistics.py
+++ b/devscripts/show-downloads-statistics.py
@@ -9,11 +9,11 @@
  
  sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
  
-from youtube_dl.compat import (
+from youtube_dlc.compat import (
      compat_print,
      compat_urllib_request,
  )
-from youtube_dl.utils import format_bytes
+from youtube_dlc.utils import format_bytes
  
  
  def format_size(bytes):
@@ -36,9 +36,9 @@ def format_size(bytes):
              asset_name = asset['name']
              total_bytes += asset['download_count'] * asset['size']
              if all(not re.match(p, asset_name) for p in (
-                    r'^youtube-dl$',
-                    r'^youtube-dl-\d{4}\.\d{2}\.\d{2}(?:\.\d+)?\.tar\.gz$',
-                    r'^youtube-dl\.exe$')):
+                    r'^youtube-dlc$',
+                    r'^youtube-dlc-\d{4}\.\d{2}\.\d{2}(?:\.\d+)?\.tar\.gz$',
+                    r'^youtube-dlc\.exe$')):
                  continue
              compat_print(
                  ' %s size: %s downloads: %d'
diff --git a/devscripts/zsh-completion.in b/devscripts/zsh-completion.in

index b394a1ae7447797273cda4178de08b27ced26735..bb021862fbec13415e8db41b755e5796ea74e741 100644 (file)
--- a/devscripts/zsh-completion.in
+++ b/devscripts/zsh-completion.in
@@ -1,6 +1,6 @@
-#compdef youtube-dl
+#compdef youtube-dlc
  
-__youtube_dl() {
+__youtube_dlc() {
      local curcontext="$curcontext" fileopts diropts cur prev
      typeset -A opt_args
      fileopts="{{fileopts}}"
@@ -25,4 +25,4 @@ __youtube_dl() {
      esac
  }
  
-__youtube_dl
-\ No newline at end of file
+__youtube_dlc
+\ No newline at end of file
diff --git a/devscripts/zsh-completion.py b/devscripts/zsh-completion.py

index 60aaf76cc3297adc6e80984890e33e4267b95c2b..8b957144f9c1dbb53328865e12e74c1780b57000 100755 (executable)
--- a/devscripts/zsh-completion.py
+++ b/devscripts/zsh-completion.py
@@ -6,9 +6,9 @@
  import sys
  
  sys.path.insert(0, dirn(dirn((os.path.abspath(__file__)))))
-import youtube_dl
+import youtube_dlc
  
-ZSH_COMPLETION_FILE = "youtube-dl.zsh"
+ZSH_COMPLETION_FILE = "youtube-dlc.zsh"
  ZSH_COMPLETION_TEMPLATE = "devscripts/zsh-completion.in"
  
  
@@ -45,5 +45,5 @@ def build_completion(opt_parser):
          f.write(template)
  
  
-parser = youtube_dl.parseOpts()[0]
+parser = youtube_dlc.parseOpts()[0]
  build_completion(parser)
diff --git a/docs/Makefile b/docs/Makefile

index 712218045524bd65fe4e36870fee67ee8557ae9d..a7159ff4595eed29dab5f524b9f6064dc918b9c7 100644 (file)
--- a/docs/Makefile
+++ b/docs/Makefile
@@ -85,17 +85,17 @@ qthelp:
         @echo
         @echo "Build finished; now you can run "qcollectiongenerator" with the" \
               ".qhcp project file in $(BUILDDIR)/qthelp, like this:"
-       @echo "# qcollectiongenerator $(BUILDDIR)/qthelp/youtube-dl.qhcp"
+       @echo "# qcollectiongenerator $(BUILDDIR)/qthelp/youtube-dlc.qhcp"
         @echo "To view the help file:"
-       @echo "# assistant -collectionFile $(BUILDDIR)/qthelp/youtube-dl.qhc"
+       @echo "# assistant -collectionFile $(BUILDDIR)/qthelp/youtube-dlc.qhc"
  
  devhelp:
         $(SPHINXBUILD) -b devhelp $(ALLSPHINXOPTS) $(BUILDDIR)/devhelp
         @echo
         @echo "Build finished."
         @echo "To view the help file:"
-       @echo "# mkdir -p $$HOME/.local/share/devhelp/youtube-dl"
-       @echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/youtube-dl"
+       @echo "# mkdir -p $$HOME/.local/share/devhelp/youtube-dlc"
+       @echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/youtube-dlc"
         @echo "# devhelp"
  
  epub:
diff --git a/docs/conf.py b/docs/conf.py

index 0aaf1b8fcf8220301d63250e83cb1587b618388c..fa616ebbb6e19daa6f5d240614a5b89f2ac3fd00 100644 (file)
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -1,6 +1,6 @@
  # coding: utf-8
  #
-# youtube-dl documentation build configuration file, created by
+# youtube-dlc documentation build configuration file, created by
  # sphinx-quickstart on Fri Mar 14 21:05:43 2014.
  #
  # This file is execfile()d with the current directory set to its
@@ -14,7 +14,7 @@
  
  import sys
  import os
-# Allows to import youtube_dl
+# Allows to import youtube_dlc
  sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
  
  # -- General configuration ------------------------------------------------
@@ -36,7 +36,7 @@
  master_doc = 'index'
  
  # General information about the project.
-project = u'youtube-dl'
+project = u'youtube-dlc'
  copyright = u'2014, Ricardo Garcia Gonzalez'
  
  # The version info for the project you're documenting, acts as replacement for
@@ -44,7 +44,7 @@
  # built documents.
  #
  # The short X.Y version.
-from youtube_dl.version import __version__
+from youtube_dlc.version import __version__
  version = __version__
  # The full version, including alpha/beta/rc tags.
  release = version
@@ -68,4 +68,4 @@
  html_static_path = ['_static']
  
  # Output file base name for HTML help builder.
-htmlhelp_basename = 'youtube-dldoc'
+htmlhelp_basename = 'youtube-dlcdoc'
diff --git a/docs/index.rst b/docs/index.rst

index b746ff95baf6679b93a5836a305dfb6fe7a92000..afa26fef1e7fe0df485be5933f04e8fa5bb45675 100644 (file)
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -1,13 +1,13 @@
-Welcome to youtube-dl's documentation!
+Welcome to youtube-dlc's documentation!
  ======================================
  
-*youtube-dl* is a command-line program to download videos from YouTube.com and more sites.
+*youtube-dlc* is a command-line program to download videos from YouTube.com and more sites.
  It can also be used in Python code.
  
  Developer guide
  ---------------
  
-This section contains information for using *youtube-dl* from Python programs.
+This section contains information for using *youtube-dlc* from Python programs.
  
  .. toctree::
      :maxdepth: 2
diff --git a/docs/module_guide.rst b/docs/module_guide.rst

index 03d72882e02f04866db41ad2b0bcb573b8c9f1dc..6413659cfdcf54b44fb6928e8fc58aa948496945 100644 (file)
--- a/docs/module_guide.rst
+++ b/docs/module_guide.rst
@@ -1,11 +1,11 @@
-Using the ``youtube_dl`` module
+Using the ``youtube_dlc`` module
  ===============================
  
-When using the ``youtube_dl`` module, you start by creating an instance of :class:`YoutubeDL` and adding all the available extractors:
+When using the ``youtube_dlc`` module, you start by creating an instance of :class:`YoutubeDL` and adding all the available extractors:
  
  .. code-block:: python
  
-    >>> from youtube_dl import YoutubeDL
+    >>> from youtube_dlc import YoutubeDL
      >>> ydl = YoutubeDL()
      >>> ydl.add_default_info_extractors()
  
@@ -22,7 +22,7 @@ You use the :meth:`YoutubeDL.extract_info` method for getting the video informat
      [youtube] BaW_jenozKc: Downloading video info webpage
      [youtube] BaW_jenozKc: Extracting video information
      >>> info['title']
-    'youtube-dl test video "\'/\\ä↭𝕐'
+    'youtube-dlc test video "\'/\\ä↭𝕐'
      >>> info['height'], info['width']
      (720, 1280)
  
diff --git a/docs/supportedsites.md b/docs/supportedsites.md

index 2744dfca846d59520e8944231d569e0d1744ad2c..70f1bd8c2e632d4853347c31ea85ed6feff860c0 100644 (file)
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@@ -6,7 +6,6 @@ # Supported sites
   - **23video**
   - **24video**
   - **3qsdn**: 3Q SDN
- - **3sat**
   - **4tube**
   - **56.com**
   - **5min**
@@ -28,10 +27,11 @@ # Supported sites
   - **acast:channel**
   - **ADN**: Anime Digital Network
   - **AdobeConnect**
- - **AdobeTV**
- - **AdobeTVChannel**
- - **AdobeTVShow**
- - **AdobeTVVideo**
+ - **adobetv**
+ - **adobetv:channel**
+ - **adobetv:embed**
+ - **adobetv:show**
+ - **adobetv:video**
   - **AdultSwim**
   - **aenetworks**: A+E Networks: A&E, Lifetime, History.com, FYI Network and History Vault
   - **afreecatv**: afreecatv.com
@@ -97,6 +97,7 @@ # Supported sites
   - **BiliBili**
   - **BilibiliAudio**
   - **BilibiliAudioAlbum**
+ - **BiliBiliPlayer**
   - **BioBioChileTV**
   - **BIQLE**
   - **BitChute**
@@ -222,6 +223,7 @@ # Supported sites
   - **Disney**
   - **dlive:stream**
   - **dlive:vod**
+ - **DoodStream**
   - **Dotsub**
   - **DouyuShow**
   - **DouyuTV**: 斗鱼
@@ -350,6 +352,7 @@ # Supported sites
   - **hotstar:playlist**
   - **Howcast**
   - **HowStuffWorks**
+ - **hrfernsehen**
   - **HRTi**
   - **HRTiPlaylist**
   - **Huajiao**: 花椒直播
@@ -388,7 +391,6 @@ # Supported sites
   - **JeuxVideo**
   - **Joj**
   - **Jove**
- - **jpopsuki.tv**
   - **JWPlatform**
   - **Kakao**
   - **Kaltura**
@@ -396,6 +398,7 @@ # Supported sites
   - **Kankan**
   - **Karaoketv**
   - **KarriereVideos**
+ - **Katsomo**
   - **KeezMovies**
   - **Ketnet**
   - **KhanAcademy**
@@ -403,7 +406,6 @@ # Supported sites
   - **KinjaEmbed**
   - **KinoPoisk**
   - **KonserthusetPlay**
- - **kontrtube**: KontrTube.ru - Труба зовёт
   - **KrasView**: Красвью
   - **Ku6**
   - **KUSI**
@@ -496,6 +498,7 @@ # Supported sites
   - **MNetTV**
   - **MoeVideo**: LetitBit video services: moevideo.net, playreplay.net and videochart.net
   - **Mofosex**
+ - **MofosexEmbed**
   - **Mojvideo**
   - **Morningstar**: morningstar.com
   - **Motherless**
@@ -513,7 +516,6 @@ # Supported sites
   - **mtvjapan**
   - **mtvservices:embedded**
   - **MuenchenTV**: münchen.tv
- - **MusicPlayOn**
   - **mva**: Microsoft Virtual Academy videos
   - **mva:course**: Microsoft Virtual Academy courses
   - **Mwave**
@@ -619,16 +621,25 @@ # Supported sites
   - **Ooyala**
   - **OoyalaExternal**
   - **OraTV**
+ - **orf:burgenland**: Radio Burgenland
   - **orf:fm4**: radio FM4
   - **orf:fm4:story**: fm4.orf.at stories
   - **orf:iptv**: iptv.ORF.at
+ - **orf:kaernten**: Radio Kärnten
+ - **orf:noe**: Radio Niederösterreich
+ - **orf:oberoesterreich**: Radio Oberösterreich
   - **orf:oe1**: Radio Österreich 1
+ - **orf:oe3**: Radio Österreich 3
+ - **orf:salzburg**: Radio Salzburg
+ - **orf:steiermark**: Radio Steiermark
+ - **orf:tirol**: Radio Tirol
   - **orf:tvthek**: ORF TVthek
+ - **orf:vorarlberg**: Radio Vorarlberg
+ - **orf:wien**: Radio Wien
   - **OsnatelTV**
   - **OutsideTV**
   - **PacktPub**
   - **PacktPubCourse**
- - **PandaTV**: 熊猫TV
   - **pandora.tv**: 판도라TV
   - **ParamountNetwork**
   - **parliamentlive.tv**: UK parliament videos
@@ -662,8 +673,10 @@ # Supported sites
   - **plus.google**: Google Plus
   - **podomatic**
   - **Pokemon**
+ - **PokemonWatch**
   - **PolskieRadio**
   - **PolskieRadioCategory**
+ - **Popcorntimes**
   - **PopcornTV**
   - **PornCom**
   - **PornerBros**
@@ -761,6 +774,7 @@ # Supported sites
   - **screen.yahoo:search**: Yahoo screen search
   - **Screencast**
   - **ScreencastOMatic**
+ - **ScrippsNetworks**
   - **scrippsnetworks:watch**
   - **SCTE**
   - **SCTECourse**
@@ -823,6 +837,9 @@ # Supported sites
   - **stanfordoc**: Stanford Open ClassRoom
   - **Steam**
   - **Stitcher**
+ - **StoryFire**
+ - **StoryFireSeries**
+ - **StoryFireUser**
   - **Streamable**
   - **streamcloud.eu**
   - **StreamCZ**
@@ -913,6 +930,7 @@ # Supported sites
   - **tv2.hu**
   - **TV2Article**
   - **TV2DK**
+ - **TV2DKBornholmPlay**
   - **TV4**: tv4.se and tv4play.se
   - **TV5MondePlus**: TV5MONDE+
   - **TVA**
@@ -937,16 +955,13 @@ # Supported sites
   - **TVPlayHome**
   - **Tweakers**
   - **TwitCasting**
- - **twitch:chapter**
   - **twitch:clips**
- - **twitch:profile**
   - **twitch:stream**
- - **twitch:video**
- - **twitch:videos:all**
- - **twitch:videos:highlights**
- - **twitch:videos:past-broadcasts**
- - **twitch:videos:uploads**
   - **twitch:vod**
+ - **TwitchCollection**
+ - **TwitchVideos**
+ - **TwitchVideosClips**
+ - **TwitchVideosCollections**
   - **twitter**
   - **twitter:amplify**
   - **twitter:broadcast**
@@ -954,6 +969,7 @@ # Supported sites
   - **udemy**
   - **udemy:course**
   - **UDNEmbed**: 聯合影音
+ - **UFCArabia**
   - **UFCTV**
   - **UKTVPlay**
   - **umg:de**: Universal Music Deutschland
@@ -993,7 +1009,6 @@ # Supported sites
   - **videomore**
   - **videomore:season**
   - **videomore:video**
- - **VideoPremium**
   - **VideoPress**
   - **Vidio**
   - **VidLii**
@@ -1003,8 +1018,8 @@ # Supported sites
   - **Vidzi**
   - **vier**: vier.be and vijf.be
   - **vier:videos**
- - **ViewLift**
- - **ViewLiftEmbed**
+ - **viewlift**
+ - **viewlift:embed**
   - **Viidea**
   - **viki**
   - **viki:channel**
@@ -1137,7 +1152,7 @@ # Supported sites
   - **Zaq1**
   - **Zattoo**
   - **ZattooLive**
- - **ZDF**
+ - **ZDF-3sat**
   - **ZDFChannel**
   - **zingmp3**: mp3.zing.vn
   - **Zype**
diff --git a/make_win.bat b/make_win.bat

new file mode 100644 (file)

index 0000000..a63130f
--- /dev/null
+++ b/make_win.bat
@@ -0,0 +1 @@
+pyinstaller.exe youtube_dlc\__main__.py --onefile --name youtube-dlc --version-file win\ver.txt --icon win\icon\cloud.ico
+\ No newline at end of file
diff --git a/setup.cfg b/setup.cfg

index da78a9c471d548a01711d37012e209f421134a37..f658aaa0ace11ac7599e26fac1c6138bea0d18aa 100644 (file)
--- a/setup.cfg
+++ b/setup.cfg
@@ -2,5 +2,5 @@
  universal = True
  
  [flake8]
-exclude = youtube_dl/extractor/__init__.py,devscripts/buildserver.py,devscripts/lazy_load_template.py,devscripts/make_issue_template.py,setup.py,build,.git,venv
+exclude = youtube_dlc/extractor/__init__.py,devscripts/buildserver.py,devscripts/lazy_load_template.py,devscripts/make_issue_template.py,setup.py,build,.git,venv
  ignore = E402,E501,E731,E741,W503
diff --git a/setup.py b/setup.py

index af68b485ef787f217fab474fbadbba2408707dc6..f5f0bae62401cecaa63d26ff7c72982b9812f182 100644 (file)
--- a/setup.py
+++ b/setup.py
@@ -1,68 +1,27 @@
  #!/usr/bin/env python
  # coding: utf-8
  
-from __future__ import print_function
-
+from setuptools import setup, Command, find_packages
  import os.path
  import warnings
  import sys
-
-try:
-    from setuptools import setup, Command
-    setuptools_available = True
-except ImportError:
-    from distutils.core import setup, Command
-    setuptools_available = False
  from distutils.spawn import spawn
  
-try:
-    # This will create an exe that needs Microsoft Visual C++ 2008
-    # Redistributable Package
-    import py2exe
-except ImportError:
-    if len(sys.argv) >= 2 and sys.argv[1] == 'py2exe':
-        print('Cannot import py2exe', file=sys.stderr)
-        exit(1)
-
-py2exe_options = {
-    'bundle_files': 1,
-    'compressed': 1,
-    'optimize': 2,
-    'dist_dir': '.',
-    'dll_excludes': ['w9xpopen.exe', 'crypt32.dll'],
-}
-
-# Get the version from youtube_dl/version.py without importing the package
-exec(compile(open('youtube_dl/version.py').read(),
-             'youtube_dl/version.py', 'exec'))
-
-DESCRIPTION = 'YouTube video downloader'
-LONG_DESCRIPTION = 'Command-line program to download videos from YouTube.com and other video sites'
+# Get the version from youtube_dlc/version.py without importing the package
+exec(compile(open('youtube_dlc/version.py').read(),
+             'youtube_dlc/version.py', 'exec'))
  
-py2exe_console = [{
-    'script': './youtube_dl/__main__.py',
-    'dest_base': 'youtube-dl',
-    'version': __version__,
-    'description': DESCRIPTION,
-    'comments': LONG_DESCRIPTION,
-    'product_name': 'youtube-dl',
-    'product_version': __version__,
-}]
-
-py2exe_params = {
-    'console': py2exe_console,
-    'options': {'py2exe': py2exe_options},
-    'zipfile': None
-}
+DESCRIPTION = 'Media downloader supporting various sites such as youtube'
+LONG_DESCRIPTION = 'Command-line program to download videos from YouTube.com and other video sites. Based on a more active community fork.'
  
  if len(sys.argv) >= 2 and sys.argv[1] == 'py2exe':
-    params = py2exe_params
+    print("inv")
  else:
      files_spec = [
-        ('etc/bash_completion.d', ['youtube-dl.bash-completion']),
-        ('etc/fish/completions', ['youtube-dl.fish']),
-        ('share/doc/youtube_dl', ['README.txt']),
-        ('share/man/man1', ['youtube-dl.1'])
+        ('etc/bash_completion.d', ['youtube-dlc.bash-completion']),
+        ('etc/fish/completions', ['youtube-dlc.fish']),
+        ('share/doc/youtube_dlc', ['README.txt']),
+        ('share/man/man1', ['youtube-dlc.1'])
      ]
      root = os.path.dirname(os.path.abspath(__file__))
      data_files = []
@@ -78,10 +37,10 @@
      params = {
          'data_files': data_files,
      }
-    if setuptools_available:
-        params['entry_points'] = {'console_scripts': ['youtube-dl = youtube_dl:main']}
-    else:
-        params['scripts'] = ['bin/youtube-dl']
+    #if setuptools_available:
+    params['entry_points'] = {'console_scripts': ['youtube-dlc = youtube_dlc:main']}
+    #else:
+    #    params['scripts'] = ['bin/youtube-dlc']
  
  class build_lazy_extractors(Command):
      description = 'Build the extractor lazy loading module'
@@ -95,54 +54,50 @@ def finalize_options(self):
  
      def run(self):
          spawn(
-            [sys.executable, 'devscripts/make_lazy_extractors.py', 'youtube_dl/extractor/lazy_extractors.py'],
+            [sys.executable, 'devscripts/make_lazy_extractors.py', 'youtube_dlc/extractor/lazy_extractors.py'],
              dry_run=self.dry_run,
          )
  
  setup(
-    name='youtube_dl',
+    name="youtube_dlc",
      version=__version__,
+    maintainer="Tom-Oliver Heidel",
+    maintainer_email="theidel@uni-bremen.de",
      description=DESCRIPTION,
      long_description=LONG_DESCRIPTION,
-    url='https://github.com/ytdl-org/youtube-dl',
-    author='Ricardo Garcia',
-    author_email='ytdl@yt-dl.org',
-    maintainer='Sergey M.',
-    maintainer_email='dstftw@gmail.com',
-    license='Unlicense',
-    packages=[
-        'youtube_dl',
-        'youtube_dl.extractor', 'youtube_dl.downloader',
-        'youtube_dl.postprocessor'],
-
-    # Provokes warning on most systems (why?!)
-    # test_suite = 'nose.collector',
-    # test_requires = ['nosetest'],
-
+    # long_description_content_type="text/markdown",
+    url="https://github.com/blackjack4494/youtube-dlc",
+    packages=find_packages(exclude=("youtube_dl",)),
+       #packages=[
+    #    'youtube_dlc',
+    #    'youtube_dlc.extractor', 'youtube_dlc.downloader',
+    #    'youtube_dlc.postprocessor'],
      classifiers=[
-        'Topic :: Multimedia :: Video',
-        'Development Status :: 5 - Production/Stable',
-        'Environment :: Console',
-        'License :: Public Domain',
-        'Programming Language :: Python',
-        'Programming Language :: Python :: 2',
-        'Programming Language :: Python :: 2.6',
-        'Programming Language :: Python :: 2.7',
-        'Programming Language :: Python :: 3',
-        'Programming Language :: Python :: 3.2',
-        'Programming Language :: Python :: 3.3',
-        'Programming Language :: Python :: 3.4',
-        'Programming Language :: Python :: 3.5',
-        'Programming Language :: Python :: 3.6',
-        'Programming Language :: Python :: 3.7',
-        'Programming Language :: Python :: 3.8',
-        'Programming Language :: Python :: Implementation',
-        'Programming Language :: Python :: Implementation :: CPython',
-        'Programming Language :: Python :: Implementation :: IronPython',
-        'Programming Language :: Python :: Implementation :: Jython',
-        'Programming Language :: Python :: Implementation :: PyPy',
+           "Topic :: Multimedia :: Video",
+        "Development Status :: 5 - Production/Stable",
+        "Environment :: Console",
+        "Programming Language :: Python",
+        "Programming Language :: Python :: 2",
+        "Programming Language :: Python :: 2.6",
+        "Programming Language :: Python :: 2.7",
+        "Programming Language :: Python :: 3",
+        "Programming Language :: Python :: 3.2",
+        "Programming Language :: Python :: 3.3",
+        "Programming Language :: Python :: 3.4",
+        "Programming Language :: Python :: 3.5",
+        "Programming Language :: Python :: 3.6",
+        "Programming Language :: Python :: 3.7",
+        "Programming Language :: Python :: 3.8",
+        "Programming Language :: Python :: Implementation",
+        "Programming Language :: Python :: Implementation :: CPython",
+        "Programming Language :: Python :: Implementation :: IronPython",
+        "Programming Language :: Python :: Implementation :: Jython",
+        "Programming Language :: Python :: Implementation :: PyPy",
+        "License :: Public Domain",
+        "Operating System :: OS Independent",
      ],
-
-    cmdclass={'build_lazy_extractors': build_lazy_extractors},
+    python_requires='>=2.6',
+       
+       cmdclass={'build_lazy_extractors': build_lazy_extractors},
      **params
-)
+)
+\ No newline at end of file
diff --git a/test/helper.py b/test/helper.py

index e62aab11e777cca955bb8a7a2149d7216430dbca..f45818b0f124d7e5ecaf70818176bc27fa46be5d 100644 (file)
--- a/test/helper.py
+++ b/test/helper.py
@@ -10,13 +10,13 @@
  import ssl
  import sys
  
-import youtube_dl.extractor
-from youtube_dl import YoutubeDL
-from youtube_dl.compat import (
+import youtube_dlc.extractor
+from youtube_dlc import YoutubeDL
+from youtube_dlc.compat import (
      compat_os_name,
      compat_str,
  )
-from youtube_dl.utils import (
+from youtube_dlc.utils import (
      preferredencoding,
      write_string,
  )
@@ -90,7 +90,7 @@ def report_warning(self, message):
  
  
  def gettestcases(include_onlymatching=False):
-    for ie in youtube_dl.extractor.gen_extractors():
+    for ie in youtube_dlc.extractor.gen_extractors():
          for tc in ie.get_testcases(include_onlymatching):
              yield tc
  
diff --git a/test/test_InfoExtractor.py b/test/test_InfoExtractor.py

index 71f6608feae4a5bcad37e96d74e795e7535a5dc7..bdd01e41a3e767b0e75dfc23a204ecd7e727d535 100644 (file)
--- a/test/test_InfoExtractor.py
+++ b/test/test_InfoExtractor.py
@@ -10,10 +10,10 @@
  sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
  
  from test.helper import FakeYDL, expect_dict, expect_value, http_server_port
-from youtube_dl.compat import compat_etree_fromstring, compat_http_server
-from youtube_dl.extractor.common import InfoExtractor
-from youtube_dl.extractor import YoutubeIE, get_info_extractor
-from youtube_dl.utils import encode_data_uri, strip_jsonp, ExtractorError, RegexNotFoundError
+from youtube_dlc.compat import compat_etree_fromstring, compat_http_server
+from youtube_dlc.extractor.common import InfoExtractor
+from youtube_dlc.extractor import YoutubeIE, get_info_extractor
+from youtube_dlc.utils import encode_data_uri, strip_jsonp, ExtractorError, RegexNotFoundError
  import threading
  
  
diff --git a/test/test_YoutubeDL.py b/test/test_YoutubeDL.py

index ce96661716c42ae0bf9c6a8ccb9ddf48c715e0a2..6d02c2a54dbca10024ff75c7599ac2e267f63512 100644 (file)
--- a/test/test_YoutubeDL.py
+++ b/test/test_YoutubeDL.py
@@ -12,12 +12,12 @@
  import copy
  
  from test.helper import FakeYDL, assertRegexpMatches
-from youtube_dl import YoutubeDL
-from youtube_dl.compat import compat_str, compat_urllib_error
-from youtube_dl.extractor import YoutubeIE
-from youtube_dl.extractor.common import InfoExtractor
-from youtube_dl.postprocessor.common import PostProcessor
-from youtube_dl.utils import ExtractorError, match_filter_func
+from youtube_dlc import YoutubeDL
+from youtube_dlc.compat import compat_str, compat_urllib_error
+from youtube_dlc.extractor import YoutubeIE
+from youtube_dlc.extractor.common import InfoExtractor
+from youtube_dlc.postprocessor.common import PostProcessor
+from youtube_dlc.utils import ExtractorError, match_filter_func
  
  TEST_URL = 'http://localhost/sample.mp4'
  
@@ -816,11 +816,15 @@ def test_playlist_items_selection(self):
              'webpage_url': 'http://example.com',
          }
  
-        def get_ids(params):
+        def get_downloaded_info_dicts(params):
              ydl = YDL(params)
-            # make a copy because the dictionary can be modified
-            ydl.process_ie_result(playlist.copy())
-            return [int(v['id']) for v in ydl.downloaded_info_dicts]
+            # make a deep copy because the dictionary and nested entries
+            # can be modified
+            ydl.process_ie_result(copy.deepcopy(playlist))
+            return ydl.downloaded_info_dicts
+
+        def get_ids(params):
+            return [int(v['id']) for v in get_downloaded_info_dicts(params)]
  
          result = get_ids({})
          self.assertEqual(result, [1, 2, 3, 4])
@@ -852,6 +856,22 @@ def get_ids(params):
          result = get_ids({'playlist_items': '2-4,3-4,3'})
          self.assertEqual(result, [2, 3, 4])
  
+        # Tests for https://github.com/ytdl-org/youtube-dl/issues/10591
+        # @{
+        result = get_downloaded_info_dicts({'playlist_items': '2-4,3-4,3'})
+        self.assertEqual(result[0]['playlist_index'], 2)
+        self.assertEqual(result[1]['playlist_index'], 3)
+
+        result = get_downloaded_info_dicts({'playlist_items': '2-4,3-4,3'})
+        self.assertEqual(result[0]['playlist_index'], 2)
+        self.assertEqual(result[1]['playlist_index'], 3)
+        self.assertEqual(result[2]['playlist_index'], 4)
+
+        result = get_downloaded_info_dicts({'playlist_items': '4,2'})
+        self.assertEqual(result[0]['playlist_index'], 4)
+        self.assertEqual(result[1]['playlist_index'], 2)
+        # @}
+
      def test_urlopen_no_file_protocol(self):
          # see https://github.com/ytdl-org/youtube-dl/issues/8227
          ydl = YDL()
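Review note on the `copy.deepcopy` change in `test_playlist_items_selection` above: the updated comment says nested entries can be modified during processing. The sketch below uses a made-up playlist dictionary (an assumption, not the test's actual fixture) to illustrate why a shallow `dict.copy()` would not protect the nested entries between calls.

```python
import copy

# Hypothetical stand-in for the test's playlist fixture.
playlist = {
    '_type': 'playlist',
    'entries': [{'id': '1'}, {'id': '2'}],
}

# A shallow copy shares the 'entries' list and the dicts inside it.
shallow = playlist.copy()
shallow['entries'][0]['playlist_index'] = 1
shallow_leaked = 'playlist_index' in playlist['entries'][0]

# Undo the leak, then repeat the same mutation on a deep copy.
del playlist['entries'][0]['playlist_index']
deep = copy.deepcopy(playlist)
deep['entries'][0]['playlist_index'] = 1
deep_leaked = 'playlist_index' in playlist['entries'][0]
```

With the shallow copy the mutation reaches the original entries; with the deep copy it does not, which is presumably why the helper needs `deepcopy` once it is called several times against the same fixture.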
diff --git a/test/test_YoutubeDLCookieJar.py b/test/test_YoutubeDLCookieJar.py
index f959798deb595165fddac6a8e555570c0420e454..615d8a9d882d751ad9d5640da971d1353320cf0e 100644
--- a/test/test_YoutubeDLCookieJar.py
+++ b/test/test_YoutubeDLCookieJar.py
@@ -10,7 +10,7 @@
  import unittest
  sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
  
-from youtube_dl.utils import YoutubeDLCookieJar
+from youtube_dlc.utils import YoutubeDLCookieJar
  
  
  class TestYoutubeDLCookieJar(unittest.TestCase):
@@ -39,6 +39,13 @@ def assert_cookie_has_value(key):
          assert_cookie_has_value('HTTPONLY_COOKIE')
          assert_cookie_has_value('JS_ACCESSIBLE_COOKIE')
  
+    def test_malformed_cookies(self):
+        cookiejar = YoutubeDLCookieJar('./test/testdata/cookies/malformed_cookies.txt')
+        cookiejar.load(ignore_discard=True, ignore_expires=True)
+        # Cookies should be empty since all malformed cookie file entries
+        # will be ignored
+        self.assertFalse(cookiejar._cookies)
+
  
  if __name__ == '__main__':
      unittest.main()
diff --git a/test/test_aes.py b/test/test_aes.py
index cc89fb6ab2770ebeb4b47b5458e20399ec37e765..ef1e1b189ccd31f88fef40339f2cdf9680860bad 100644
--- a/test/test_aes.py
+++ b/test/test_aes.py
@@ -8,8 +8,8 @@
  import unittest
  sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
  
-from youtube_dl.aes import aes_decrypt, aes_encrypt, aes_cbc_decrypt, aes_cbc_encrypt, aes_decrypt_text
-from youtube_dl.utils import bytes_to_intlist, intlist_to_bytes
+from youtube_dlc.aes import aes_decrypt, aes_encrypt, aes_cbc_decrypt, aes_cbc_encrypt, aes_decrypt_text
+from youtube_dlc.utils import bytes_to_intlist, intlist_to_bytes
  import base64
  
  # the encrypted data can be generate with 'devscripts/generate_aes_testdata.py'
diff --git a/test/test_age_restriction.py b/test/test_age_restriction.py
index 6f5513faa2c5551ce3f3c9d2967a55b28adad858..b73bdd7674cf79f40cb54792e007d806ab968af7 100644
--- a/test/test_age_restriction.py
+++ b/test/test_age_restriction.py
@@ -10,7 +10,7 @@
  from test.helper import try_rm
  
  
-from youtube_dl import YoutubeDL
+from youtube_dlc import YoutubeDL
  
  
  def _download_restricted(url, filename, age):
diff --git a/test/test_all_urls.py b/test/test_all_urls.py
index 81056a999d2014506b589ea4b31ab67951486f1b..7b6664cac31a0680944d7d85a2de9a1a7723313c 100644
--- a/test/test_all_urls.py
+++ b/test/test_all_urls.py
@@ -12,7 +12,7 @@
  
  from test.helper import gettestcases
  
-from youtube_dl.extractor import (
+from youtube_dlc.extractor import (
      FacebookIE,
      gen_extractors,
      YoutubeIE,
@@ -70,7 +70,7 @@ def test_youtube_show_matching(self):
  
      def test_youtube_search_matching(self):
          self.assertMatch('http://www.youtube.com/results?search_query=making+mustard', ['youtube:search_url'])
-        self.assertMatch('https://www.youtube.com/results?baz=bar&search_query=youtube-dl+test+video&filters=video&lclk=video', ['youtube:search_url'])
+        self.assertMatch('https://www.youtube.com/results?baz=bar&search_query=youtube-dlc+test+video&filters=video&lclk=video', ['youtube:search_url'])
  
      def test_youtube_extract(self):
          assertExtractId = lambda url, id: self.assertEqual(YoutubeIE.extract_id(url), id)
diff --git a/test/test_cache.py b/test/test_cache.py
index a161601420db336f949eacd68e2f846bcf8d3699..1167519d11ba1dae567567d030cbec31a54cd3ac 100644
--- a/test/test_cache.py
+++ b/test/test_cache.py
@@ -13,7 +13,7 @@
  
  
  from test.helper import FakeYDL
-from youtube_dl.cache import Cache
+from youtube_dlc.cache import Cache
  
  
  def _is_empty(d):
diff --git a/test/test_compat.py b/test/test_compat.py
index 86ff389fdfc560b987c2d82e98629a1aa7344002..8c49a001e5eabe6d058af188e1b566f7e3d58474 100644
--- a/test/test_compat.py
+++ b/test/test_compat.py
@@ -10,7 +10,7 @@
  sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
  
  
-from youtube_dl.compat import (
+from youtube_dlc.compat import (
      compat_getenv,
      compat_setenv,
      compat_etree_Element,
@@ -28,11 +28,11 @@
  class TestCompat(unittest.TestCase):
      def test_compat_getenv(self):
          test_str = 'тест'
-        compat_setenv('YOUTUBE_DL_COMPAT_GETENV', test_str)
-        self.assertEqual(compat_getenv('YOUTUBE_DL_COMPAT_GETENV'), test_str)
+        compat_setenv('youtube_dlc_COMPAT_GETENV', test_str)
+        self.assertEqual(compat_getenv('youtube_dlc_COMPAT_GETENV'), test_str)
  
      def test_compat_setenv(self):
-        test_var = 'YOUTUBE_DL_COMPAT_SETENV'
+        test_var = 'youtube_dlc_COMPAT_SETENV'
          test_str = 'тест'
          compat_setenv(test_var, test_str)
          compat_getenv(test_var)
@@ -46,11 +46,11 @@ def test_compat_expanduser(self):
          compat_setenv('HOME', old_home or '')
  
      def test_all_present(self):
-        import youtube_dl.compat
-        all_names = youtube_dl.compat.__all__
+        import youtube_dlc.compat
+        all_names = youtube_dlc.compat.__all__
          present_names = set(filter(
              lambda c: '_' in c and not c.startswith('_'),
-            dir(youtube_dl.compat))) - set(['unicode_literals'])
+            dir(youtube_dlc.compat))) - set(['unicode_literals'])
          self.assertEqual(all_names, sorted(present_names))
  
      def test_compat_urllib_parse_unquote(self):
diff --git a/test/test_download.py b/test/test_download.py
index ebe820dfc1990e4df6758795345375375402900b..bcd3b40417b8e0f2b17b0de10873aa528a96f276 100644
--- a/test/test_download.py
+++ b/test/test_download.py
@@ -24,24 +24,24 @@
  import json
  import socket
  
-import youtube_dl.YoutubeDL
-from youtube_dl.compat import (
+import youtube_dlc.YoutubeDL
+from youtube_dlc.compat import (
      compat_http_client,
      compat_urllib_error,
      compat_HTTPError,
  )
-from youtube_dl.utils import (
+from youtube_dlc.utils import (
      DownloadError,
      ExtractorError,
      format_bytes,
      UnavailableVideoError,
  )
-from youtube_dl.extractor import get_info_extractor
+from youtube_dlc.extractor import get_info_extractor
  
  RETRIES = 3
  
  
-class YoutubeDL(youtube_dl.YoutubeDL):
+class YoutubeDL(youtube_dlc.YoutubeDL):
      def __init__(self, *args, **kwargs):
          self.to_stderr = self.to_screen
          self.processed_info_dicts = []
@@ -92,7 +92,7 @@ def setUp(self):
  def generator(test_case, tname):
  
      def test_template(self):
-        ie = youtube_dl.extractor.get_info_extractor(test_case['name'])()
+        ie = youtube_dlc.extractor.get_info_extractor(test_case['name'])()
          other_ies = [get_info_extractor(ie_key)() for ie_key in test_case.get('add_ie', [])]
          is_playlist = any(k.startswith('playlist') for k in test_case)
          test_cases = test_case.get(
diff --git a/test/test_downloader_http.py b/test/test_downloader_http.py
index 7504722810b4e706f6b1143c7a36208ee0478749..c8e28bd3a3c51136077a1aa8adc11ec43c73b309 100644
--- a/test/test_downloader_http.py
+++ b/test/test_downloader_http.py
@@ -10,10 +10,10 @@
  sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
  
  from test.helper import http_server_port, try_rm
-from youtube_dl import YoutubeDL
-from youtube_dl.compat import compat_http_server
-from youtube_dl.downloader.http import HttpFD
-from youtube_dl.utils import encodeFilename
+from youtube_dlc import YoutubeDL
+from youtube_dlc.compat import compat_http_server
+from youtube_dlc.downloader.http import HttpFD
+from youtube_dlc.utils import encodeFilename
  import threading
  
  TEST_DIR = os.path.dirname(os.path.abspath(__file__))
diff --git a/test/test_execution.py b/test/test_execution.py
index 11661bb68148f4eb229b50c37f67dc744491c7df..b18e63d73f64e0c39066a39fe3c564c050389950 100644
--- a/test/test_execution.py
+++ b/test/test_execution.py
@@ -10,7 +10,7 @@
  import subprocess
  sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
  
-from youtube_dl.utils import encodeArgument
+from youtube_dlc.utils import encodeArgument
  
  rootDir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
  
@@ -23,18 +23,18 @@
  
  class TestExecution(unittest.TestCase):
      def test_import(self):
-        subprocess.check_call([sys.executable, '-c', 'import youtube_dl'], cwd=rootDir)
+        subprocess.check_call([sys.executable, '-c', 'import youtube_dlc'], cwd=rootDir)
  
      def test_module_exec(self):
          if sys.version_info >= (2, 7):  # Python 2.6 doesn't support package execution
-            subprocess.check_call([sys.executable, '-m', 'youtube_dl', '--version'], cwd=rootDir, stdout=_DEV_NULL)
+            subprocess.check_call([sys.executable, '-m', 'youtube_dlc', '--version'], cwd=rootDir, stdout=_DEV_NULL)
  
      def test_main_exec(self):
-        subprocess.check_call([sys.executable, 'youtube_dl/__main__.py', '--version'], cwd=rootDir, stdout=_DEV_NULL)
+        subprocess.check_call([sys.executable, 'youtube_dlc/__main__.py', '--version'], cwd=rootDir, stdout=_DEV_NULL)
  
      def test_cmdline_umlauts(self):
          p = subprocess.Popen(
-            [sys.executable, 'youtube_dl/__main__.py', encodeArgument('ä'), '--version'],
+            [sys.executable, 'youtube_dlc/__main__.py', encodeArgument('ä'), '--version'],
              cwd=rootDir, stdout=_DEV_NULL, stderr=subprocess.PIPE)
          _, stderr = p.communicate()
          self.assertFalse(stderr)
diff --git a/test/test_http.py b/test/test_http.py
index 3ee0a5dda8df4446f915391e031f6d13da486150..55c3c6183d7671cbe619dbb9a641b33a97b4a4aa 100644
--- a/test/test_http.py
+++ b/test/test_http.py
@@ -9,8 +9,8 @@
  sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
  
  from test.helper import http_server_port
-from youtube_dl import YoutubeDL
-from youtube_dl.compat import compat_http_server, compat_urllib_request
+from youtube_dlc import YoutubeDL
+from youtube_dlc.compat import compat_http_server, compat_urllib_request
  import ssl
  import threading
  
diff --git a/test/test_iqiyi_sdk_interpreter.py b/test/test_iqiyi_sdk_interpreter.py
index 789059dbea38026362caea2be08f9d36796a7b1d..303609baa43085a08fe282cab8e5bba6e3b5ee11 100644
--- a/test/test_iqiyi_sdk_interpreter.py
+++ b/test/test_iqiyi_sdk_interpreter.py
@@ -9,7 +9,7 @@
  sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
  
  from test.helper import FakeYDL
-from youtube_dl.extractor import IqiyiIE
+from youtube_dlc.extractor import IqiyiIE
  
  
  class IqiyiIEWithCredentials(IqiyiIE):
diff --git a/test/test_jsinterp.py b/test/test_jsinterp.py
index c24b8ca742acc308ca9c455378564bbac053765d..97fc8d5aa88626f569126dbdba2a21828743cfdc 100644
--- a/test/test_jsinterp.py
+++ b/test/test_jsinterp.py
@@ -8,7 +8,7 @@
  import unittest
  sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
  
-from youtube_dl.jsinterp import JSInterpreter
+from youtube_dlc.jsinterp import JSInterpreter
  
  
  class TestJSInterpreter(unittest.TestCase):
diff --git a/test/test_netrc.py b/test/test_netrc.py
index 7cf3a6a2e672e6300f26dcb8060bebca869f0310..566ba37a643b94b1156a38e15936748c9b2320b8 100644
--- a/test/test_netrc.py
+++ b/test/test_netrc.py
@@ -7,7 +7,7 @@
  sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
  
  
-from youtube_dl.extractor import (
+from youtube_dlc.extractor import (
      gen_extractors,
  )
  
diff --git a/test/test_options.py b/test/test_options.py
index 3a25a6ba37ca2c7b409de640acb8fbf74dc03f0b..dce2533736525f5976469b53ac27b9a98ef38f5f 100644
--- a/test/test_options.py
+++ b/test/test_options.py
@@ -8,7 +8,7 @@
  import unittest
  sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
  
-from youtube_dl.options import _hide_login_info
+from youtube_dlc.options import _hide_login_info
  
  
  class TestOptions(unittest.TestCase):
diff --git a/test/test_postprocessors.py b/test/test_postprocessors.py
index 4209d1d9a0cefa96fc5ea9d26229f6ac44116996..6f538a3da0f823361f86e84f502e58f25540594e 100644
--- a/test/test_postprocessors.py
+++ b/test/test_postprocessors.py
@@ -8,7 +8,7 @@
  import unittest
  sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
  
-from youtube_dl.postprocessor import MetadataFromTitlePP
+from youtube_dlc.postprocessor import MetadataFromTitlePP
  
  
  class TestMetadataFromTitle(unittest.TestCase):
diff --git a/test/test_socks.py b/test/test_socks.py
index 1e68eb0daea04a466f07da4e790a1d25c88b5615..be52e2343e0fe28dd369f0cd4f35da0e4128f20b 100644
--- a/test/test_socks.py
+++ b/test/test_socks.py
@@ -15,7 +15,7 @@
      FakeYDL,
      get_params,
  )
-from youtube_dl.compat import (
+from youtube_dlc.compat import (
      compat_str,
      compat_urllib_request,
  )
diff --git a/test/test_subtitles.py b/test/test_subtitles.py
index 7d57a628e5ef79c5e12d13ccd0a2b515548ffa60..86e20cb4be444c4ab24924eb31e3fa7a266041fd 100644
--- a/test/test_subtitles.py
+++ b/test/test_subtitles.py
@@ -10,7 +10,7 @@
  from test.helper import FakeYDL, md5
  
  
-from youtube_dl.extractor import (
+from youtube_dlc.extractor import (
      YoutubeIE,
      DailymotionIE,
      TEDIE,
@@ -26,7 +26,6 @@
      ThePlatformIE,
      ThePlatformFeedIE,
      RTVEALaCartaIE,
-    FunnyOrDieIE,
      DemocracynowIE,
  )
  
@@ -65,8 +64,8 @@ def test_youtube_allsubtitles(self):
          self.DL.params['allsubtitles'] = True
          subtitles = self.getSubtitles()
          self.assertEqual(len(subtitles.keys()), 13)
-        self.assertEqual(md5(subtitles['en']), '3cb210999d3e021bd6c7f0ea751eab06')
-        self.assertEqual(md5(subtitles['it']), '6d752b98c31f1cf8d597050c7a2cb4b5')
+        self.assertEqual(md5(subtitles['en']), '688dd1ce0981683867e7fe6fde2a224b')
+        self.assertEqual(md5(subtitles['it']), '31324d30b8430b309f7f5979a504a769')
          for lang in ['fr', 'de']:
              self.assertTrue(subtitles.get(lang) is not None, 'Subtitles for \'%s\' not extracted' % lang)
  
@@ -74,13 +73,13 @@ def test_youtube_subtitles_ttml_format(self):
          self.DL.params['writesubtitles'] = True
          self.DL.params['subtitlesformat'] = 'ttml'
          subtitles = self.getSubtitles()
-        self.assertEqual(md5(subtitles['en']), 'e306f8c42842f723447d9f63ad65df54')
+        self.assertEqual(md5(subtitles['en']), 'c97ddf1217390906fa9fbd34901f3da2')
  
      def test_youtube_subtitles_vtt_format(self):
          self.DL.params['writesubtitles'] = True
          self.DL.params['subtitlesformat'] = 'vtt'
          subtitles = self.getSubtitles()
-        self.assertEqual(md5(subtitles['en']), '3cb210999d3e021bd6c7f0ea751eab06')
+        self.assertEqual(md5(subtitles['en']), 'ae1bd34126571a77aabd4d276b28044d')
  
      def test_youtube_automatic_captions(self):
          self.url = '8YoUxe5ncPo'
@@ -89,9 +88,15 @@ def test_youtube_automatic_captions(self):
          subtitles = self.getSubtitles()
          self.assertTrue(subtitles['it'] is not None)
  
+    def test_youtube_no_automatic_captions(self):
+        self.url = 'QRS8MkLhQmM'
+        self.DL.params['writeautomaticsub'] = True
+        subtitles = self.getSubtitles()
+        self.assertTrue(not subtitles)
+
      def test_youtube_translated_subtitles(self):
          # This video has a subtitles track, which can be translated
-        self.url = 'Ky9eprVWzlI'
+        self.url = 'i0ZabxXmH4Y'
          self.DL.params['writeautomaticsub'] = True
          self.DL.params['subtitleslangs'] = ['it']
          subtitles = self.getSubtitles()
@@ -322,18 +327,6 @@ def test_allsubtitles(self):
          self.assertEqual(md5(subtitles['es']), '69e70cae2d40574fb7316f31d6eb7fca')
  
  
-class TestFunnyOrDieSubtitles(BaseTestSubtitles):
-    url = 'http://www.funnyordie.com/videos/224829ff6d/judd-apatow-will-direct-your-vine'
-    IE = FunnyOrDieIE
-
-    def test_allsubtitles(self):
-        self.DL.params['writesubtitles'] = True
-        self.DL.params['allsubtitles'] = True
-        subtitles = self.getSubtitles()
-        self.assertEqual(set(subtitles.keys()), set(['en']))
-        self.assertEqual(md5(subtitles['en']), 'c5593c193eacd353596c11c2d4f9ecc4')
-
-
  class TestDemocracynowSubtitles(BaseTestSubtitles):
      url = 'http://www.democracynow.org/shows/2015/7/3'
      IE = DemocracynowIE
diff --git a/test/test_swfinterp.py b/test/test_swfinterp.py
index 9f18055e629d3c21826ad8159bdf0ae55409bca2..1a8b353e8d29df62dc53afa8f7d6181d30d023d8 100644
--- a/test/test_swfinterp.py
+++ b/test/test_swfinterp.py
@@ -14,7 +14,7 @@
  import re
  import subprocess
  
-from youtube_dl.swfinterp import SWFInterpreter
+from youtube_dlc.swfinterp import SWFInterpreter
  
  
  TEST_DIR = os.path.join(
diff --git a/test/test_update.py b/test/test_update.py
index d9c71511db795a24d72b846ea40be574a7f4e549..1b144c43c42076062d949dcb9a55aa9ab4816524 100644
--- a/test/test_update.py
+++ b/test/test_update.py
@@ -10,7 +10,7 @@
  
  
  import json
-from youtube_dl.update import rsa_verify
+from youtube_dlc.update import rsa_verify
  
  
  class TestUpdate(unittest.TestCase):
diff --git a/test/test_utils.py b/test/test_utils.py
index 0896f41506aa6d6cdb45b1c601203d6e717946d6..95231200b7fc47ae800affcf28d0614a1d962896 100644
--- a/test/test_utils.py
+++ b/test/test_utils.py
@@ -15,7 +15,7 @@
  import json
  import xml.etree.ElementTree
  
-from youtube_dl.utils import (
+from youtube_dlc.utils import (
      age_restricted,
      args_to_str,
      encode_base_n,
@@ -105,7 +105,7 @@
      cli_bool_option,
      parse_codecs,
  )
-from youtube_dl.compat import (
+from youtube_dlc.compat import (
      compat_chr,
      compat_etree_fromstring,
      compat_getenv,
@@ -240,12 +240,12 @@ def test_expand_path(self):
          def env(var):
              return '%{0}%'.format(var) if sys.platform == 'win32' else '${0}'.format(var)
  
-        compat_setenv('YOUTUBE_DL_EXPATH_PATH', 'expanded')
-        self.assertEqual(expand_path(env('YOUTUBE_DL_EXPATH_PATH')), 'expanded')
+        compat_setenv('youtube_dlc_EXPATH_PATH', 'expanded')
+        self.assertEqual(expand_path(env('youtube_dlc_EXPATH_PATH')), 'expanded')
          self.assertEqual(expand_path(env('HOME')), compat_getenv('HOME'))
          self.assertEqual(expand_path('~'), compat_getenv('HOME'))
          self.assertEqual(
-            expand_path('~/%s' % env('YOUTUBE_DL_EXPATH_PATH')),
+            expand_path('~/%s' % env('youtube_dlc_EXPATH_PATH')),
              '%s/expanded' % compat_getenv('HOME'))
  
      def test_prepend_extension(self):
@@ -803,6 +803,8 @@ def test_mimetype2ext(self):
          self.assertEqual(mimetype2ext('text/vtt'), 'vtt')
          self.assertEqual(mimetype2ext('text/vtt;charset=utf-8'), 'vtt')
          self.assertEqual(mimetype2ext('text/html; charset=utf-8'), 'html')
+        self.assertEqual(mimetype2ext('audio/x-wav'), 'wav')
+        self.assertEqual(mimetype2ext('audio/x-wav;codec=pcm'), 'wav')
  
      def test_month_by_name(self):
          self.assertEqual(month_by_name(None), None)
@@ -1388,8 +1390,8 @@ def test_caesar(self):
          self.assertEqual(caesar('ebg', 'acegik', -2), 'abc')
  
      def test_rot47(self):
-        self.assertEqual(rot47('youtube-dl'), r'J@FEF36\5=')
-        self.assertEqual(rot47('YOUTUBE-DL'), r'*~&%&qt\s{')
+        self.assertEqual(rot47('youtube-dlc'), r'J@FEF36\5=4')
+        self.assertEqual(rot47('YOUTUBE-DLC'), r'*~&%&qt\s{r')
  
      def test_urshift(self):
          self.assertEqual(urshift(3, 1), 1)
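Review note on the updated `test_rot47` expectations: the new strings can be checked independently of the package. The function below is a standalone sketch of ROT47 (rotation by 47 within the 94 printable ASCII characters `!` to `~`). `rot47_sketch` is a hypothetical name, and this is not necessarily how `youtube_dlc.utils.rot47` is implemented.

```python
def rot47_sketch(text):
    """Rotate printable ASCII characters ('!'..'~') by 47 positions.

    Characters outside that range are passed through unchanged.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if 33 <= code <= 126:
            out.append(chr(33 + (code - 33 + 47) % 94))
        else:
            out.append(ch)
    return ''.join(out)


lower = rot47_sketch('youtube-dlc')
upper = rot47_sketch('YOUTUBE-DLC')
```

Because 47 is half of 94, applying the rotation twice returns the input, so the same function serves for both encoding and decoding.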
diff --git a/test/test_verbose_output.py b/test/test_verbose_output.py
index c1465fe8c51d8bf3789606fbf6c61da0deabfa90..462f25e03f648d79df48f35fc4a7ddfdbc1f532d 100644
--- a/test/test_verbose_output.py
+++ b/test/test_verbose_output.py
@@ -17,7 +17,7 @@ class TestVerboseOutput(unittest.TestCase):
      def test_private_info_arg(self):
          outp = subprocess.Popen(
              [
-                sys.executable, 'youtube_dl/__main__.py', '-v',
+                sys.executable, 'youtube_dlc/__main__.py', '-v',
                  '--username', 'johnsmith@gmail.com',
                  '--password', 'secret',
              ], cwd=rootDir, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
@@ -30,7 +30,7 @@ def test_private_info_arg(self):
      def test_private_info_shortarg(self):
          outp = subprocess.Popen(
              [
-                sys.executable, 'youtube_dl/__main__.py', '-v',
+                sys.executable, 'youtube_dlc/__main__.py', '-v',
                  '-u', 'johnsmith@gmail.com',
                  '-p', 'secret',
              ], cwd=rootDir, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
@@ -43,7 +43,7 @@ def test_private_info_shortarg(self):
      def test_private_info_eq(self):
          outp = subprocess.Popen(
              [
-                sys.executable, 'youtube_dl/__main__.py', '-v',
+                sys.executable, 'youtube_dlc/__main__.py', '-v',
                  '--username=johnsmith@gmail.com',
                  '--password=secret',
              ], cwd=rootDir, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
@@ -56,7 +56,7 @@ def test_private_info_eq(self):
      def test_private_info_shortarg_eq(self):
          outp = subprocess.Popen(
              [
-                sys.executable, 'youtube_dl/__main__.py', '-v',
+                sys.executable, 'youtube_dlc/__main__.py', '-v',
                  '-u=johnsmith@gmail.com',
                  '-p=secret',
              ], cwd=rootDir, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
diff --git a/test/test_write_annotations.py b/test/test_write_annotations.py
index 41abdfe3b99eaabf562ebabc222fc50fead77631..d98c96c15ffb00e3d495e6d5d78ae50701921d64 100644
--- a/test/test_write_annotations.py
+++ b/test/test_write_annotations.py
@@ -15,11 +15,11 @@
  
  import xml.etree.ElementTree
  
-import youtube_dl.YoutubeDL
-import youtube_dl.extractor
+import youtube_dlc.YoutubeDL
+import youtube_dlc.extractor
  
  
-class YoutubeDL(youtube_dl.YoutubeDL):
+class YoutubeDL(youtube_dlc.YoutubeDL):
      def __init__(self, *args, **kwargs):
          super(YoutubeDL, self).__init__(*args, **kwargs)
          self.to_stderr = self.to_screen
@@ -45,7 +45,7 @@ def setUp(self):
  
      def test_info_json(self):
          expected = list(EXPECTED_ANNOTATIONS)  # Two annotations could have the same text.
-        ie = youtube_dl.extractor.YoutubeIE()
+        ie = youtube_dlc.extractor.YoutubeIE()
          ydl = YoutubeDL(params)
          ydl.add_info_extractor(ie)
          ydl.download([TEST_ID])
diff --git a/test/test_youtube_chapters.py b/test/test_youtube_chapters.py
index 324ca852578531757d9964f2c90cf6f8e1c4d3b1..4529d2e84d0e5f5f427f2078f0de782002c611b6 100644
--- a/test/test_youtube_chapters.py
+++ b/test/test_youtube_chapters.py
@@ -9,7 +9,7 @@
  sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
  
  from test.helper import expect_value
-from youtube_dl.extractor import YoutubeIE
+from youtube_dlc.extractor import YoutubeIE
  
  
  class TestYoutubeChapters(unittest.TestCase):
@@ -267,7 +267,7 @@ def test_youtube_chapters(self):
          for description, duration, expected_chapters in self._TEST_CASES:
              ie = YoutubeIE()
              expect_value(
-                self, ie._extract_chapters(description, duration),
+                self, ie._extract_chapters_from_description(description, duration),
                  expected_chapters, None)
  
  
diff --git a/test/test_youtube_lists.py b/test/test_youtube_lists.py
index c4f0abbeaaacbe3b469320c13a205e8f738c443b..a693963ef9076e0a5bd5634b668e3fc7852acdd0 100644
--- a/test/test_youtube_lists.py
+++ b/test/test_youtube_lists.py
@@ -10,7 +10,7 @@
  from test.helper import FakeYDL
  
  
-from youtube_dl.extractor import (
+from youtube_dlc.extractor import (
      YoutubePlaylistIE,
      YoutubeIE,
  )
diff --git a/test/test_youtube_signature.py b/test/test_youtube_signature.py
index f0c370eeedc8942abc0b8cd8c10e57b4361d00c2..a54b36198fd898de18fd01981317d6462ae8cc58 100644
--- a/test/test_youtube_signature.py
+++ b/test/test_youtube_signature.py
@@ -13,8 +13,8 @@
  import string
  
  from test.helper import FakeYDL
-from youtube_dl.extractor import YoutubeIE
-from youtube_dl.compat import compat_str, compat_urlretrieve
+from youtube_dlc.extractor import YoutubeIE
+from youtube_dlc.compat import compat_str, compat_urlretrieve
  
  _TESTS = [
      (
@@ -74,6 +74,28 @@
  ]
  
  
+class TestPlayerInfo(unittest.TestCase):
+    def test_youtube_extract_player_info(self):
+        PLAYER_URLS = (
+            ('https://www.youtube.com/s/player/64dddad9/player_ias.vflset/en_US/base.js', '64dddad9'),
+            # obsolete
+            ('https://www.youtube.com/yts/jsbin/player_ias-vfle4-e03/en_US/base.js', 'vfle4-e03'),
+            ('https://www.youtube.com/yts/jsbin/player_ias-vfl49f_g4/en_US/base.js', 'vfl49f_g4'),
+            ('https://www.youtube.com/yts/jsbin/player_ias-vflCPQUIL/en_US/base.js', 'vflCPQUIL'),
+            ('https://www.youtube.com/yts/jsbin/player-vflzQZbt7/en_US/base.js', 'vflzQZbt7'),
+            ('https://www.youtube.com/yts/jsbin/player-en_US-vflaxXRn1/base.js', 'vflaxXRn1'),
+            ('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflXGBaUN.js', 'vflXGBaUN'),
+            ('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflKjOTVq/html5player.js', 'vflKjOTVq'),
+            ('http://s.ytimg.com/yt/swfbin/watch_as3-vflrEm9Nq.swf', 'vflrEm9Nq'),
+            ('https://s.ytimg.com/yts/swfbin/player-vflenCdZL/watch_as3.swf', 'vflenCdZL'),
+        )
+        for player_url, expected_player_id in PLAYER_URLS:
+            expected_player_type = player_url.split('.')[-1]
+            player_type, player_id = YoutubeIE._extract_player_info(player_url)
+            self.assertEqual(player_type, expected_player_type)
+            self.assertEqual(player_id, expected_player_id)
+
+
  class TestSignature(unittest.TestCase):
      def setUp(self):
          TEST_DIR = os.path.dirname(os.path.abspath(__file__))
diff --git a/test/testdata/cookies/malformed_cookies.txt b/test/testdata/cookies/malformed_cookies.txt
new file mode 100644
index 0000000..17bc403
--- /dev/null
+++ b/test/testdata/cookies/malformed_cookies.txt
@@ -0,0 +1,9 @@
+# Netscape HTTP Cookie File
+# http://curl.haxx.se/rfc/cookie_spec.html
+# This is a generated file!  Do not edit.
+
+# Cookie file entry with invalid number of fields - 6 instead of 7
+www.foobar.foobar      FALSE   /       FALSE   0       COOKIE
+
+# Cookie file entry with invalid expires at
+www.foobar.foobar      FALSE   /       FALSE   1.7976931348623157e+308 COOKIE  VALUE
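Review note on the fixture above: its two entries are malformed in different ways (six fields instead of seven, and a non-integer expiry). The sketch below shows one way such lines could be classified. `is_well_formed` is a hypothetical helper, the tab separator is assumed from the Netscape cookie file format (the page rendering above shows runs of spaces), and this is not the loader code in `youtube_dlc.utils`.

```python
def is_well_formed(line):
    """Return True if a cookie-file line has 7 tab-separated fields
    and an all-digit expiry in the fifth field."""
    fields = line.rstrip('\n').split('\t')
    if len(fields) != 7:
        return False
    return fields[4].isdigit()


# Entries modelled on the fixture, with tabs assumed as separators.
six_fields = 'www.foobar.foobar\tFALSE\t/\tFALSE\t0\tCOOKIE'
bad_expiry = ('www.foobar.foobar\tFALSE\t/\tFALSE\t'
              '1.7976931348623157e+308\tCOOKIE\tVALUE')
# A hypothetical valid entry for comparison (not part of the fixture).
valid = 'www.foobar.foobar\tFALSE\t/\tFALSE\t0\tCOOKIE\tVALUE'

results = [is_well_formed(entry) for entry in (six_fields, bad_expiry, valid)]
```

Under these assumptions both fixture entries are rejected, which matches the expectation in `test_malformed_cookies` that the jar ends up empty.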
diff --git a/tox.ini b/tox.ini
index 9c4e4a3d1eab285d8def7fce06e7d5ceb108952e..842091d65c1adc2cfdc26b46fc4f6bf6c4ba1445 100644
--- a/tox.ini
+++ b/tox.ini
@@ -10,5 +10,5 @@ defaultargs = test --exclude test_download.py --exclude test_age_restriction.py
      --exclude test_subtitles.py --exclude test_write_annotations.py
      --exclude test_youtube_lists.py --exclude test_iqiyi_sdk_interpreter.py
      --exclude test_socks.py
-commands = nosetests --verbose {posargs:{[testenv]defaultargs}}  # --with-coverage --cover-package=youtube_dl --cover-html
+commands = nosetests --verbose {posargs:{[testenv]defaultargs}}  # --with-coverage --cover-package=youtube_dlc --cover-html
                                                 # test.test_download:TestDownload.test_NowVideo
diff --git a/win/icon/cloud.ico b/win/icon/cloud.ico
new file mode 100644
index 0000000..6d742ce
Binary files /dev/null and b/win/icon/cloud.ico differ
diff --git a/win/ver.txt b/win/ver.txt
new file mode 100644
index 0000000..0ad4344
--- /dev/null
+++ b/win/ver.txt
@@ -0,0 +1,45 @@
+# UTF-8
+#
+# For more details about fixed file info 'ffi' see:
+# http://msdn.microsoft.com/en-us/library/ms646997.aspx
+VSVersionInfo(
+  ffi=FixedFileInfo(
+    # filevers and prodvers should be always a tuple with four items: (1, 2, 3, 4)
+    # Set not needed items to zero 0.
+    filevers=(6, 9, 2020, 0),
+    prodvers=(6, 9, 2020, 0),
+    # Contains a bitmask that specifies the valid bits 'flags'r
+    mask=0x3f,
+    # Contains a bitmask that specifies the Boolean attributes of the file.
+    flags=0x0,
+    # The operating system for which this file was designed.
+    # 0x4 - NT and there is no need to change it.
+    # OS=0x40004,
+       OS=0x4,
+    # The general type of file.
+    # 0x1 - the file is an application.
+    fileType=0x1,
+    # The function of the file.
+    # 0x0 - the function is not defined for this fileType
+    subtype=0x0,
+    # Creation date and time stamp.
+    date=(0, 0)
+    ),
+  kids=[
+    StringFileInfo(
+      [
+      StringTable(
+        u'040904B0',
+        [StringStruct(u'Comments', u'Youtube-dlc Command Line Interface.'),
+       StringStruct(u'CompanyName', u'theidel@uni-bremen.de'),
+        StringStruct(u'FileDescription', u'Media Downloader'),
+        StringStruct(u'FileVersion', u'6.9.2020.0'),
+        StringStruct(u'InternalName', u'youtube-dlc'),
+        StringStruct(u'LegalCopyright', u'theidel@uni-bremen.de | UNLICENSE'),
+        StringStruct(u'OriginalFilename', u'youtube-dlc.exe'),
+        StringStruct(u'ProductName', u'Youtube-dlc'),
+        StringStruct(u'ProductVersion', u'6.9.2020.0 | git.io/JUGsM')])
+      ]), 
+    VarFileInfo([VarStruct(u'Translation', [0, 1200])])
+  ]
+)
diff --git a/youtube-dl.plugin.zsh b/youtube-dl.plugin.zsh
deleted file mode 100644
index 17ab134..0000000
--- a/youtube-dl.plugin.zsh
+++ /dev/null
@@ -1,24 +0,0 @@
-# This allows the youtube-dl command to be installed in ZSH using antigen.
-# Antigen is a bundle manager. It allows you to enhance the functionality of
-# your zsh session by installing bundles and themes easily.
-
-# Antigen documentation:
-# http://antigen.sharats.me/
-# https://github.com/zsh-users/antigen
-
-# Install youtube-dl:
-# antigen bundle ytdl-org/youtube-dl
-# Bundles installed by antigen are available for use immediately.
-
-# Update youtube-dl (and all other antigen bundles):
-# antigen update
-
-# The antigen command will download the git repository to a folder and then
-# execute an enabling script (this file). The complete process for loading the
-# code is documented here:
-# https://github.com/zsh-users/antigen#notes-on-writing-plugins
-
-# This specific script just aliases youtube-dl to the python script that this
-# library provides. This requires updating the PYTHONPATH to ensure that the
-# full set of code can be located.
-alias youtube-dl="PYTHONPATH=$(dirname $0) $(dirname $0)/bin/youtube-dl"
diff --git a/youtube_dl/extractor/ard.py b/youtube_dl/extractor/ard.py
deleted file mode 100644
index 8adae46..0000000
--- a/youtube_dl/extractor/ard.py
+++ /dev/null
@@ -1,400 +0,0 @@
-# coding: utf-8
-from __future__ import unicode_literals
-
-import re
-
-from .common import InfoExtractor
-from .generic import GenericIE
-from ..utils import (
-    determine_ext,
-    ExtractorError,
-    int_or_none,
-    parse_duration,
-    qualities,
-    str_or_none,
-    try_get,
-    unified_strdate,
-    unified_timestamp,
-    update_url_query,
-    url_or_none,
-    xpath_text,
-)
-from ..compat import compat_etree_fromstring
-
-
-class ARDMediathekIE(InfoExtractor):
-    IE_NAME = 'ARD:mediathek'
-    _VALID_URL = r'^https?://(?:(?:(?:www|classic)\.)?ardmediathek\.de|mediathek\.(?:daserste|rbb-online)\.de|one\.ard\.de)/(?:.*/)(?P<video_id>[0-9]+|[^0-9][^/\?]+)[^/\?]*(?:\?.*)?'
-
-    _TESTS = [{
-        # available till 26.07.2022
-        'url': 'http://www.ardmediathek.de/tv/S%C3%9CDLICHT/Was-ist-die-Kunst-der-Zukunft-liebe-Ann/BR-Fernsehen/Video?bcastId=34633636&documentId=44726822',
-        'info_dict': {
-            'id': '44726822',
-            'ext': 'mp4',
-            'title': 'Was ist die Kunst der Zukunft, liebe Anna McCarthy?',
-            'description': 'md5:4ada28b3e3b5df01647310e41f3a62f5',
-            'duration': 1740,
-        },
-        'params': {
-            # m3u8 download
-            'skip_download': True,
-        }
-    }, {
-        'url': 'https://one.ard.de/tv/Mord-mit-Aussicht/Mord-mit-Aussicht-6-39-T%C3%B6dliche-Nach/ONE/Video?bcastId=46384294&documentId=55586872',
-        'only_matching': True,
-    }, {
-        # audio
-        'url': 'http://www.ardmediathek.de/tv/WDR-H%C3%B6rspiel-Speicher/Tod-eines-Fu%C3%9Fballers/WDR-3/Audio-Podcast?documentId=28488308&bcastId=23074086',
-        'only_matching': True,
-    }, {
-        'url': 'http://mediathek.daserste.de/sendungen_a-z/328454_anne-will/22429276_vertrauen-ist-gut-spionieren-ist-besser-geht',
-        'only_matching': True,
-    }, {
-        # audio
-        'url': 'http://mediathek.rbb-online.de/radio/Hörspiel/Vor-dem-Fest/kulturradio/Audio?documentId=30796318&topRessort=radio&bcastId=9839158',
-        'only_matching': True,
-    }, {
-        'url': 'https://classic.ardmediathek.de/tv/Panda-Gorilla-Co/Panda-Gorilla-Co-Folge-274/Das-Erste/Video?bcastId=16355486&documentId=58234698',
-        'only_matching': True,
-    }]
-
-    @classmethod
-    def suitable(cls, url):
-        return False if ARDBetaMediathekIE.suitable(url) else super(ARDMediathekIE, cls).suitable(url)
-
-    def _extract_media_info(self, media_info_url, webpage, video_id):
-        media_info = self._download_json(
-            media_info_url, video_id, 'Downloading media JSON')
-
-        formats = self._extract_formats(media_info, video_id)
-
-        if not formats:
-            if '"fsk"' in webpage:
-                raise ExtractorError(
-                    'This video is only available after 20:00', expected=True)
-            elif media_info.get('_geoblocked'):
-                raise ExtractorError('This video is not available due to geo restriction', expected=True)
-
-        self._sort_formats(formats)
-
-        duration = int_or_none(media_info.get('_duration'))
-        thumbnail = media_info.get('_previewImage')
-        is_live = media_info.get('_isLive') is True
-
-        subtitles = {}
-        subtitle_url = media_info.get('_subtitleUrl')
-        if subtitle_url:
-            subtitles['de'] = [{
-                'ext': 'ttml',
-                'url': subtitle_url,
-            }]
-
-        return {
-            'id': video_id,
-            'duration': duration,
-            'thumbnail': thumbnail,
-            'is_live': is_live,
-            'formats': formats,
-            'subtitles': subtitles,
-        }
-
-    def _extract_formats(self, media_info, video_id):
-        type_ = media_info.get('_type')
-        media_array = media_info.get('_mediaArray', [])
-        formats = []
-        for num, media in enumerate(media_array):
-            for stream in media.get('_mediaStreamArray', []):
-                stream_urls = stream.get('_stream')
-                if not stream_urls:
-                    continue
-                if not isinstance(stream_urls, list):
-                    stream_urls = [stream_urls]
-                quality = stream.get('_quality')
-                server = stream.get('_server')
-                for stream_url in stream_urls:
-                    if not url_or_none(stream_url):
-                        continue
-                    ext = determine_ext(stream_url)
-                    if quality != 'auto' and ext in ('f4m', 'm3u8'):
-                        continue
-                    if ext == 'f4m':
-                        formats.extend(self._extract_f4m_formats(
-                            update_url_query(stream_url, {
-                                'hdcore': '3.1.1',
-                                'plugin': 'aasp-3.1.1.69.124'
-                            }),
-                            video_id, f4m_id='hds', fatal=False))
-                    elif ext == 'm3u8':
-                        formats.extend(self._extract_m3u8_formats(
-                            stream_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
-                    else:
-                        if server and server.startswith('rtmp'):
-                            f = {
-                                'url': server,
-                                'play_path': stream_url,
-                                'format_id': 'a%s-rtmp-%s' % (num, quality),
-                            }
-                        else:
-                            f = {
-                                'url': stream_url,
-                                'format_id': 'a%s-%s-%s' % (num, ext, quality)
-                            }
-                        m = re.search(r'_(?P<width>\d+)x(?P<height>\d+)\.mp4$', stream_url)
-                        if m:
-                            f.update({
-                                'width': int(m.group('width')),
-                                'height': int(m.group('height')),
-                            })
-                        if type_ == 'audio':
-                            f['vcodec'] = 'none'
-                        formats.append(f)
-        return formats
-
-    def _real_extract(self, url):
-        # determine video id from url
-        m = re.match(self._VALID_URL, url)
-
-        document_id = None
-
-        numid = re.search(r'documentId=([0-9]+)', url)
-        if numid:
-            document_id = video_id = numid.group(1)
-        else:
-            video_id = m.group('video_id')
-
-        webpage = self._download_webpage(url, video_id)
-
-        ERRORS = (
-            ('>Leider liegt eine Störung vor.', 'Video %s is unavailable'),
-            ('>Der gewünschte Beitrag ist nicht mehr verfügbar.<',
-             'Video %s is no longer available'),
-        )
-
-        for pattern, message in ERRORS:
-            if pattern in webpage:
-                raise ExtractorError(message % video_id, expected=True)
-
-        if re.search(r'[\?&]rss($|[=&])', url):
-            doc = compat_etree_fromstring(webpage.encode('utf-8'))
-            if doc.tag == 'rss':
-                return GenericIE()._extract_rss(url, video_id, doc)
-
-        title = self._html_search_regex(
-            [r'<h1(?:\s+class="boxTopHeadline")?>(.*?)</h1>',
-             r'<meta name="dcterms\.title" content="(.*?)"/>',
-             r'<h4 class="headline">(.*?)</h4>',
-             r'<title[^>]*>(.*?)</title>'],
-            webpage, 'title')
-        description = self._html_search_meta(
-            'dcterms.abstract', webpage, 'description', default=None)
-        if description is None:
-            description = self._html_search_meta(
-                'description', webpage, 'meta description', default=None)
-        if description is None:
-            description = self._html_search_regex(
-                r'<p\s+class="teasertext">(.+?)</p>',
-                webpage, 'teaser text', default=None)
-
-        # Thumbnail is sometimes not present.
-        # It is in the mobile version, but that seems to use a different URL
-        # structure altogether.
-        thumbnail = self._og_search_thumbnail(webpage, default=None)
-
-        media_streams = re.findall(r'''(?x)
-            mediaCollection\.addMediaStream\([0-9]+,\s*[0-9]+,\s*"[^"]*",\s*
-            "([^"]+)"''', webpage)
-
-        if media_streams:
-            QUALITIES = qualities(['lo', 'hi', 'hq'])
-            formats = []
-            for furl in set(media_streams):
-                if furl.endswith('.f4m'):
-                    fid = 'f4m'
-                else:
-                    fid_m = re.match(r'.*\.([^.]+)\.[^.]+$', furl)
-                    fid = fid_m.group(1) if fid_m else None
-                formats.append({
-                    'quality': QUALITIES(fid),
-                    'format_id': fid,
-                    'url': furl,
-                })
-            self._sort_formats(formats)
-            info = {
-                'formats': formats,
-            }
-        else:  # request JSON file
-            if not document_id:
-                video_id = self._search_regex(
-                    r'/play/(?:config|media)/(\d+)', webpage, 'media id')
-            info = self._extract_media_info(
-                'http://www.ardmediathek.de/play/media/%s' % video_id,
-                webpage, video_id)
-
-        info.update({
-            'id': video_id,
-            'title': self._live_title(title) if info.get('is_live') else title,
-            'description': description,
-            'thumbnail': thumbnail,
-        })
-
-        return info
-
-
-class ARDIE(InfoExtractor):
-    _VALID_URL = r'(?P<mainurl>https?://(www\.)?daserste\.de/[^?#]+/videos/(?P<display_id>[^/?#]+)-(?P<id>[0-9]+))\.html'
-    _TESTS = [{
-        # available till 14.02.2019
-        'url': 'http://www.daserste.de/information/talk/maischberger/videos/das-groko-drama-zerlegen-sich-die-volksparteien-video-102.html',
-        'md5': '8e4ec85f31be7c7fc08a26cdbc5a1f49',
-        'info_dict': {
-            'display_id': 'das-groko-drama-zerlegen-sich-die-volksparteien-video',
-            'id': '102',
-            'ext': 'mp4',
-            'duration': 4435.0,
-            'title': 'Das GroKo-Drama: Zerlegen sich die Volksparteien?',
-            'upload_date': '20180214',
-            'thumbnail': r're:^https?://.*\.jpg$',
-        },
-    }, {
-        'url': 'http://www.daserste.de/information/reportage-dokumentation/dokus/videos/die-story-im-ersten-mission-unter-falscher-flagge-100.html',
-        'only_matching': True,
-    }]
-
-    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        display_id = mobj.group('display_id')
-
-        player_url = mobj.group('mainurl') + '~playerXml.xml'
-        doc = self._download_xml(player_url, display_id)
-        video_node = doc.find('./video')
-        upload_date = unified_strdate(xpath_text(
-            video_node, './broadcastDate'))
-        thumbnail = xpath_text(video_node, './/teaserImage//variant/url')
-
-        formats = []
-        for a in video_node.findall('.//asset'):
-            f = {
-                'format_id': a.attrib['type'],
-                'width': int_or_none(a.find('./frameWidth').text),
-                'height': int_or_none(a.find('./frameHeight').text),
-                'vbr': int_or_none(a.find('./bitrateVideo').text),
-                'abr': int_or_none(a.find('./bitrateAudio').text),
-                'vcodec': a.find('./codecVideo').text,
-                'tbr': int_or_none(a.find('./totalBitrate').text),
-            }
-            if a.find('./serverPrefix').text:
-                f['url'] = a.find('./serverPrefix').text
-                f['playpath'] = a.find('./fileName').text
-            else:
-                f['url'] = a.find('./fileName').text
-            formats.append(f)
-        self._sort_formats(formats)
-
-        return {
-            'id': mobj.group('id'),
-            'formats': formats,
-            'display_id': display_id,
-            'title': video_node.find('./title').text,
-            'duration': parse_duration(video_node.find('./duration').text),
-            'upload_date': upload_date,
-            'thumbnail': thumbnail,
-        }
-
-
-class ARDBetaMediathekIE(InfoExtractor):
-    _VALID_URL = r'https://(?:beta|www)\.ardmediathek\.de/[^/]+/(?:player|live)/(?P<video_id>[a-zA-Z0-9]+)(?:/(?P<display_id>[^/?#]+))?'
-    _TESTS = [{
-        'url': 'https://beta.ardmediathek.de/ard/player/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC9mYmM4NGM1NC0xNzU4LTRmZGYtYWFhZS0wYzcyZTIxNGEyMDE/die-robuste-roswita',
-        'md5': '2d02d996156ea3c397cfc5036b5d7f8f',
-        'info_dict': {
-            'display_id': 'die-robuste-roswita',
-            'id': 'Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC9mYmM4NGM1NC0xNzU4LTRmZGYtYWFhZS0wYzcyZTIxNGEyMDE',
-            'title': 'Tatort: Die robuste Roswita',
-            'description': r're:^Der Mord.*trüber ist als die Ilm.',
-            'duration': 5316,
-            'thumbnail': 'https://img.ardmediathek.de/standard/00/55/43/59/34/-1774185891/16x9/960?mandant=ard',
-            'upload_date': '20180826',
-            'ext': 'mp4',
-        },
-    }, {
-        'url': 'https://www.ardmediathek.de/ard/player/Y3JpZDovL3N3ci5kZS9hZXgvbzEwNzE5MTU/',
-        'only_matching': True,
-    }, {
-        'url': 'https://www.ardmediathek.de/swr/live/Y3JpZDovL3N3ci5kZS8xMzQ4MTA0Mg',
-        'only_matching': True,
-    }]
-
-    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('video_id')
-        display_id = mobj.group('display_id') or video_id
-
-        webpage = self._download_webpage(url, display_id)
-        data_json = self._search_regex(r'window\.__APOLLO_STATE__\s*=\s*(\{.*);\n', webpage, 'json')
-        data = self._parse_json(data_json, display_id)
-
-        res = {
-            'id': video_id,
-            'display_id': display_id,
-        }
-        formats = []
-        subtitles = {}
-        geoblocked = False
-        for widget in data.values():
-            if widget.get('_geoblocked') is True:
-                geoblocked = True
-            if '_duration' in widget:
-                res['duration'] = int_or_none(widget['_duration'])
-            if 'clipTitle' in widget:
-                res['title'] = widget['clipTitle']
-            if '_previewImage' in widget:
-                res['thumbnail'] = widget['_previewImage']
-            if 'broadcastedOn' in widget:
-                res['timestamp'] = unified_timestamp(widget['broadcastedOn'])
-            if 'synopsis' in widget:
-                res['description'] = widget['synopsis']
-            subtitle_url = url_or_none(widget.get('_subtitleUrl'))
-            if subtitle_url:
-                subtitles.setdefault('de', []).append({
-                    'ext': 'ttml',
-                    'url': subtitle_url,
-                })
-            if '_quality' in widget:
-                format_url = url_or_none(try_get(
-                    widget, lambda x: x['_stream']['json'][0]))
-                if not format_url:
-                    continue
-                ext = determine_ext(format_url)
-                if ext == 'f4m':
-                    formats.extend(self._extract_f4m_formats(
-                        format_url + '?hdcore=3.11.0',
-                        video_id, f4m_id='hds', fatal=False))
-                elif ext == 'm3u8':
-                    formats.extend(self._extract_m3u8_formats(
-                        format_url, video_id, 'mp4', m3u8_id='hls',
-                        fatal=False))
-                else:
-                    # HTTP formats are not available when geoblocked is True,
-                    # other formats are fine though
-                    if geoblocked:
-                        continue
-                    quality = str_or_none(widget.get('_quality'))
-                    formats.append({
-                        'format_id': ('http-' + quality) if quality else 'http',
-                        'url': format_url,
-                        'preference': 10,  # Plain HTTP, that's nice
-                    })
-
-        if not formats and geoblocked:
-            self.raise_geo_restricted(
-                msg='This video is not available due to geoblocking',
-                countries=['DE'])
-
-        self._sort_formats(formats)
-        res.update({
-            'subtitles': subtitles,
-            'formats': formats,
-        })
-
-        return res
diff --git a/youtube_dl/extractor/deezer.py b/youtube_dl/extractor/deezer.py
deleted file mode 100644
index a38b268..0000000
--- a/youtube_dl/extractor/deezer.py
+++ /dev/null
@@ -1,91 +0,0 @@
-from __future__ import unicode_literals
-
-import json
-import re
-
-from .common import InfoExtractor
-from ..utils import (
-    ExtractorError,
-    int_or_none,
-    orderedSet,
-)
-
-
-class DeezerPlaylistIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?deezer\.com/playlist/(?P<id>[0-9]+)'
-    _TEST = {
-        'url': 'http://www.deezer.com/playlist/176747451',
-        'info_dict': {
-            'id': '176747451',
-            'title': 'Best!',
-            'uploader': 'Anonymous',
-            'thumbnail': r're:^https?://cdn-images\.deezer\.com/images/cover/.*\.jpg$',
-        },
-        'playlist_count': 30,
-        'skip': 'Only available in .de',
-    }
-
-    def _real_extract(self, url):
-        if 'test' not in self._downloader.params:
-            self._downloader.report_warning('For now, this extractor only supports the 30 second previews. Patches welcome!')
-
-        mobj = re.match(self._VALID_URL, url)
-        playlist_id = mobj.group('id')
-
-        webpage = self._download_webpage(url, playlist_id)
-        geoblocking_msg = self._html_search_regex(
-            r'<p class="soon-txt">(.*?)</p>', webpage, 'geoblocking message',
-            default=None)
-        if geoblocking_msg is not None:
-            raise ExtractorError(
-                'Deezer said: %s' % geoblocking_msg, expected=True)
-
-        data_json = self._search_regex(
-            (r'__DZR_APP_STATE__\s*=\s*({.+?})\s*</script>',
-             r'naboo\.display\(\'[^\']+\',\s*(.*?)\);\n'),
-            webpage, 'data JSON')
-        data = json.loads(data_json)
-
-        playlist_title = data.get('DATA', {}).get('TITLE')
-        playlist_uploader = data.get('DATA', {}).get('PARENT_USERNAME')
-        playlist_thumbnail = self._search_regex(
-            r'<img id="naboo_playlist_image".*?src="([^"]+)"', webpage,
-            'playlist thumbnail')
-
-        preview_pattern = self._search_regex(
-            r"var SOUND_PREVIEW_GATEWAY\s*=\s*'([^']+)';", webpage,
-            'preview URL pattern', fatal=False)
-        entries = []
-        for s in data['SONGS']['data']:
-            puid = s['MD5_ORIGIN']
-            preview_video_url = preview_pattern.\
-                replace('{0}', puid[0]).\
-                replace('{1}', puid).\
-                replace('{2}', s['MEDIA_VERSION'])
-            formats = [{
-                'format_id': 'preview',
-                'url': preview_video_url,
-                'preference': -100,  # Only the first 30 seconds
-                'ext': 'mp3',
-            }]
-            self._sort_formats(formats)
-            artists = ', '.join(
-                orderedSet(a['ART_NAME'] for a in s['ARTISTS']))
-            entries.append({
-                'id': s['SNG_ID'],
-                'duration': int_or_none(s.get('DURATION')),
-                'title': '%s - %s' % (artists, s['SNG_TITLE']),
-                'uploader': s['ART_NAME'],
-                'uploader_id': s['ART_ID'],
-                'age_limit': 16 if s.get('EXPLICIT_LYRICS') == '1' else 0,
-                'formats': formats,
-            })
-
-        return {
-            '_type': 'playlist',
-            'id': playlist_id,
-            'title': playlist_title,
-            'uploader': playlist_uploader,
-            'thumbnail': playlist_thumbnail,
-            'entries': entries,
-        }
diff --git a/youtube_dl/extractor/dreisat.py b/youtube_dl/extractor/dreisat.py
deleted file mode 100644
index 848d387..0000000
--- a/youtube_dl/extractor/dreisat.py
+++ /dev/null
@@ -1,193 +0,0 @@
-from __future__ import unicode_literals
-
-import re
-
-from .common import InfoExtractor
-from ..utils import (
-    int_or_none,
-    unified_strdate,
-    xpath_text,
-    determine_ext,
-    float_or_none,
-    ExtractorError,
-)
-
-
-class DreiSatIE(InfoExtractor):
-    IE_NAME = '3sat'
-    _GEO_COUNTRIES = ['DE']
-    _VALID_URL = r'https?://(?:www\.)?3sat\.de/mediathek/(?:(?:index|mediathek)\.php)?\?(?:(?:mode|display)=[^&]+&)*obj=(?P<id>[0-9]+)'
-    _TESTS = [
-        {
-            'url': 'http://www.3sat.de/mediathek/index.php?mode=play&obj=45918',
-            'md5': 'be37228896d30a88f315b638900a026e',
-            'info_dict': {
-                'id': '45918',
-                'ext': 'mp4',
-                'title': 'Waidmannsheil',
-                'description': 'md5:cce00ca1d70e21425e72c86a98a56817',
-                'uploader': 'SCHWEIZWEIT',
-                'uploader_id': '100000210',
-                'upload_date': '20140913'
-            },
-            'params': {
-                'skip_download': True,  # m3u8 downloads
-            }
-        },
-        {
-            'url': 'http://www.3sat.de/mediathek/mediathek.php?mode=play&obj=51066',
-            'only_matching': True,
-        },
-    ]
-
-    def _parse_smil_formats(self, smil, smil_url, video_id, namespace=None, f4m_params=None, transform_rtmp_url=None):
-        param_groups = {}
-        for param_group in smil.findall(self._xpath_ns('./head/paramGroup', namespace)):
-            group_id = param_group.get(self._xpath_ns(
-                'id', 'http://www.w3.org/XML/1998/namespace'))
-            params = {}
-            for param in param_group:
-                params[param.get('name')] = param.get('value')
-            param_groups[group_id] = params
-
-        formats = []
-        for video in smil.findall(self._xpath_ns('.//video', namespace)):
-            src = video.get('src')
-            if not src:
-                continue
-            bitrate = int_or_none(self._search_regex(r'_(\d+)k', src, 'bitrate', None)) or float_or_none(video.get('system-bitrate') or video.get('systemBitrate'), 1000)
-            group_id = video.get('paramGroup')
-            param_group = param_groups[group_id]
-            for proto in param_group['protocols'].split(','):
-                formats.append({
-                    'url': '%s://%s' % (proto, param_group['host']),
-                    'app': param_group['app'],
-                    'play_path': src,
-                    'ext': 'flv',
-                    'format_id': '%s-%d' % (proto, bitrate),
-                    'tbr': bitrate,
-                })
-        self._sort_formats(formats)
-        return formats
-
-    def extract_from_xml_url(self, video_id, xml_url):
-        doc = self._download_xml(
-            xml_url, video_id,
-            note='Downloading video info',
-            errnote='Failed to download video info')
-
-        status_code = xpath_text(doc, './status/statuscode')
-        if status_code and status_code != 'ok':
-            if status_code == 'notVisibleAnymore':
-                message = 'Video %s is not available' % video_id
-            else:
-                message = '%s returned error: %s' % (self.IE_NAME, status_code)
-            raise ExtractorError(message, expected=True)
-
-        title = xpath_text(doc, './/information/title', 'title', True)
-
-        urls = []
-        formats = []
-        for fnode in doc.findall('.//formitaeten/formitaet'):
-            video_url = xpath_text(fnode, 'url')
-            if not video_url or video_url in urls:
-                continue
-            urls.append(video_url)
-
-            is_available = 'http://www.metafilegenerator' not in video_url
-            geoloced = 'static_geoloced_online' in video_url
-            if not is_available or geoloced:
-                continue
-
-            format_id = fnode.attrib['basetype']
-            format_m = re.match(r'''(?x)
-                (?P<vcodec>[^_]+)_(?P<acodec>[^_]+)_(?P<container>[^_]+)_
-                (?P<proto>[^_]+)_(?P<index>[^_]+)_(?P<indexproto>[^_]+)
-            ''', format_id)
-
-            ext = determine_ext(video_url, None) or format_m.group('container')
-
-            if ext == 'meta':
-                continue
-            elif ext == 'smil':
-                formats.extend(self._extract_smil_formats(
-                    video_url, video_id, fatal=False))
-            elif ext == 'm3u8':
-                # the certificates are misconfigured (see
-                # https://github.com/ytdl-org/youtube-dl/issues/8665)
-                if video_url.startswith('https://'):
-                    continue
-                formats.extend(self._extract_m3u8_formats(
-                    video_url, video_id, 'mp4', 'm3u8_native',
-                    m3u8_id=format_id, fatal=False))
-            elif ext == 'f4m':
-                formats.extend(self._extract_f4m_formats(
-                    video_url, video_id, f4m_id=format_id, fatal=False))
-            else:
-                quality = xpath_text(fnode, './quality')
-                if quality:
-                    format_id += '-' + quality
-
-                abr = int_or_none(xpath_text(fnode, './audioBitrate'), 1000)
-                vbr = int_or_none(xpath_text(fnode, './videoBitrate'), 1000)
-
-                tbr = int_or_none(self._search_regex(
-                    r'_(\d+)k', video_url, 'bitrate', None))
-                if tbr and vbr and not abr:
-                    abr = tbr - vbr
-
-                formats.append({
-                    'format_id': format_id,
-                    'url': video_url,
-                    'ext': ext,
-                    'acodec': format_m.group('acodec'),
-                    'vcodec': format_m.group('vcodec'),
-                    'abr': abr,
-                    'vbr': vbr,
-                    'tbr': tbr,
-                    'width': int_or_none(xpath_text(fnode, './width')),
-                    'height': int_or_none(xpath_text(fnode, './height')),
-                    'filesize': int_or_none(xpath_text(fnode, './filesize')),
-                    'protocol': format_m.group('proto').lower(),
-                })
-
-        geolocation = xpath_text(doc, './/details/geolocation')
-        if not formats and geolocation and geolocation != 'none':
-            self.raise_geo_restricted(countries=self._GEO_COUNTRIES)
-
-        self._sort_formats(formats)
-
-        thumbnails = []
-        for node in doc.findall('.//teaserimages/teaserimage'):
-            thumbnail_url = node.text
-            if not thumbnail_url:
-                continue
-            thumbnail = {
-                'url': thumbnail_url,
-            }
-            thumbnail_key = node.get('key')
-            if thumbnail_key:
-                m = re.match('^([0-9]+)x([0-9]+)$', thumbnail_key)
-                if m:
-                    thumbnail['width'] = int(m.group(1))
-                    thumbnail['height'] = int(m.group(2))
-            thumbnails.append(thumbnail)
-
-        upload_date = unified_strdate(xpath_text(doc, './/details/airtime'))
-
-        return {
-            'id': video_id,
-            'title': title,
-            'description': xpath_text(doc, './/information/detail'),
-            'duration': int_or_none(xpath_text(doc, './/details/lengthSec')),
-            'thumbnails': thumbnails,
-            'uploader': xpath_text(doc, './/details/originChannelTitle'),
-            'uploader_id': xpath_text(doc, './/details/originChannelId'),
-            'upload_date': upload_date,
-            'formats': formats,
-        }
-
-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-        details_url = 'http://www.3sat.de/mediathek/xmlservice/web/beitragsDetails?id=%s' % video_id
-        return self.extract_from_xml_url(video_id, details_url)
diff --git a/youtube_dl/extractor/hellporno.py b/youtube_dl/extractor/hellporno.py
deleted file mode 100644
index 0ee8ea7..0000000
--- a/youtube_dl/extractor/hellporno.py
+++ /dev/null
@@ -1,75 +0,0 @@
-from __future__ import unicode_literals
-
-import re
-
-from .common import InfoExtractor
-from ..utils import (
-    js_to_json,
-    remove_end,
-    determine_ext,
-)
-
-
-class HellPornoIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?hellporno\.(?:com/videos|net/v)/(?P<id>[^/]+)'
-    _TESTS = [{
-        'url': 'http://hellporno.com/videos/dixie-is-posing-with-naked-ass-very-erotic/',
-        'md5': '1fee339c610d2049699ef2aa699439f1',
-        'info_dict': {
-            'id': '149116',
-            'display_id': 'dixie-is-posing-with-naked-ass-very-erotic',
-            'ext': 'mp4',
-            'title': 'Dixie is posing with naked ass very erotic',
-            'thumbnail': r're:https?://.*\.jpg$',
-            'age_limit': 18,
-        }
-    }, {
-        'url': 'http://hellporno.net/v/186271/',
-        'only_matching': True,
-    }]
-
-    def _real_extract(self, url):
-        display_id = self._match_id(url)
-
-        webpage = self._download_webpage(url, display_id)
-
-        title = remove_end(self._html_search_regex(
-            r'<title>([^<]+)</title>', webpage, 'title'), ' - Hell Porno')
-
-        flashvars = self._parse_json(self._search_regex(
-            r'var\s+flashvars\s*=\s*({.+?});', webpage, 'flashvars'),
-            display_id, transform_source=js_to_json)
-
-        video_id = flashvars.get('video_id')
-        thumbnail = flashvars.get('preview_url')
-        ext = determine_ext(flashvars.get('postfix'), 'mp4')
-
-        formats = []
-        for video_url_key in ['video_url', 'video_alt_url']:
-            video_url = flashvars.get(video_url_key)
-            if not video_url:
-                continue
-            video_text = flashvars.get('%s_text' % video_url_key)
-            fmt = {
-                'url': video_url,
-                'ext': ext,
-                'format_id': video_text,
-            }
-            m = re.search(r'^(?P<height>\d+)[pP]', video_text)
-            if m:
-                fmt['height'] = int(m.group('height'))
-            formats.append(fmt)
-        self._sort_formats(formats)
-
-        categories = self._html_search_meta(
-            'keywords', webpage, 'categories', default='').split(',')
-
-        return {
-            'id': video_id,
-            'display_id': display_id,
-            'title': title,
-            'thumbnail': thumbnail,
-            'categories': categories,
-            'age_limit': 18,
-            'formats': formats,
-        }
diff --git a/youtube_dl/extractor/jpopsukitv.py b/youtube_dl/extractor/jpopsukitv.py
deleted file mode 100644
index 4b5f346..0000000
--- a/youtube_dl/extractor/jpopsukitv.py
+++ /dev/null
@@ -1,68 +0,0 @@
-# coding: utf-8
-from __future__ import unicode_literals
-
-from .common import InfoExtractor
-from ..utils import (
-    int_or_none,
-    unified_strdate,
-)
-
-
-class JpopsukiIE(InfoExtractor):
-    IE_NAME = 'jpopsuki.tv'
-    _VALID_URL = r'https?://(?:www\.)?jpopsuki\.tv/(?:category/)?video/[^/]+/(?P<id>\S+)'
-
-    _TEST = {
-        'url': 'http://www.jpopsuki.tv/video/ayumi-hamasaki---evolution/00be659d23b0b40508169cdee4545771',
-        'md5': '88018c0c1a9b1387940e90ec9e7e198e',
-        'info_dict': {
-            'id': '00be659d23b0b40508169cdee4545771',
-            'ext': 'mp4',
-            'title': 'ayumi hamasaki - evolution',
-            'description': 'Release date: 2001.01.31\r\n浜崎あゆみ - evolution',
-            'thumbnail': 'http://www.jpopsuki.tv/cache/89722c74d2a2ebe58bcac65321c115b2.jpg',
-            'uploader': 'plama_chan',
-            'uploader_id': '404',
-            'upload_date': '20121101'
-        }
-    }
-
-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-
-        webpage = self._download_webpage(url, video_id)
-
-        video_url = 'http://www.jpopsuki.tv' + self._html_search_regex(
-            r'<source src="(.*?)" type', webpage, 'video url')
-
-        video_title = self._og_search_title(webpage)
-        description = self._og_search_description(webpage)
-        thumbnail = self._og_search_thumbnail(webpage)
-        uploader = self._html_search_regex(
-            r'<li>from: <a href="/user/view/user/(.*?)/uid/',
-            webpage, 'video uploader', fatal=False)
-        uploader_id = self._html_search_regex(
-            r'<li>from: <a href="/user/view/user/\S*?/uid/(\d*)',
-            webpage, 'video uploader_id', fatal=False)
-        upload_date = unified_strdate(self._html_search_regex(
-            r'<li>uploaded: (.*?)</li>', webpage, 'video upload_date',
-            fatal=False))
-        view_count_str = self._html_search_regex(
-            r'<li>Hits: ([0-9]+?)</li>', webpage, 'video view_count',
-            fatal=False)
-        comment_count_str = self._html_search_regex(
-            r'<h2>([0-9]+?) comments</h2>', webpage, 'video comment_count',
-            fatal=False)
-
-        return {
-            'id': video_id,
-            'url': video_url,
-            'title': video_title,
-            'description': description,
-            'thumbnail': thumbnail,
-            'uploader': uploader,
-            'uploader_id': uploader_id,
-            'upload_date': upload_date,
-            'view_count': int_or_none(view_count_str),
-            'comment_count': int_or_none(comment_count_str),
-        }
diff --git a/youtube_dl/extractor/lego.py b/youtube_dl/extractor/lego.py

deleted file mode 100644

index b312e77..0000000
--- a/youtube_dl/extractor/lego.py
+++ /dev/null
@@ -1,128 +0,0 @@
-# coding: utf-8
-from __future__ import unicode_literals
-
-import re
-
-from .common import InfoExtractor
-from ..compat import compat_str
-from ..utils import (
-    unescapeHTML,
-    parse_duration,
-    get_element_by_class,
-)
-
-
-class LEGOIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?lego\.com/(?P<locale>[^/]+)/(?:[^/]+/)*videos/(?:[^/]+/)*[^/?#]+-(?P<id>[0-9a-f]+)'
-    _TESTS = [{
-        'url': 'http://www.lego.com/en-us/videos/themes/club/blocumentary-kawaguchi-55492d823b1b4d5e985787fa8c2973b1',
-        'md5': 'f34468f176cfd76488767fc162c405fa',
-        'info_dict': {
-            'id': '55492d823b1b4d5e985787fa8c2973b1',
-            'ext': 'mp4',
-            'title': 'Blocumentary Great Creations: Akiyuki Kawaguchi',
-            'description': 'Blocumentary Great Creations: Akiyuki Kawaguchi',
-        },
-    }, {
-        # geo-restricted but the contentUrl contain a valid url
-        'url': 'http://www.lego.com/nl-nl/videos/themes/nexoknights/episode-20-kingdom-of-heroes-13bdc2299ab24d9685701a915b3d71e7##sp=399',
-        'md5': '4c3fec48a12e40c6e5995abc3d36cc2e',
-        'info_dict': {
-            'id': '13bdc2299ab24d9685701a915b3d71e7',
-            'ext': 'mp4',
-            'title': 'Aflevering 20 - Helden van het koninkrijk',
-            'description': 'md5:8ee499aac26d7fa8bcb0cedb7f9c3941',
-        },
-    }, {
-        # special characters in title
-        'url': 'http://www.lego.com/en-us/starwars/videos/lego-star-wars-force-surprise-9685ee9d12e84ff38e84b4e3d0db533d',
-        'info_dict': {
-            'id': '9685ee9d12e84ff38e84b4e3d0db533d',
-            'ext': 'mp4',
-            'title': 'Force Surprise – LEGO® Star Wars™ Microfighters',
-            'description': 'md5:9c673c96ce6f6271b88563fe9dc56de3',
-        },
-        'params': {
-            'skip_download': True,
-        },
-    }]
-    _BITRATES = [256, 512, 1024, 1536, 2560]
-
-    def _real_extract(self, url):
-        locale, video_id = re.match(self._VALID_URL, url).groups()
-        webpage = self._download_webpage(url, video_id)
-        title = get_element_by_class('video-header', webpage).strip()
-        progressive_base = 'https://lc-mediaplayerns-live-s.legocdn.com/'
-        streaming_base = 'http://legoprod-f.akamaihd.net/'
-        content_url = self._html_search_meta('contentUrl', webpage)
-        path = self._search_regex(
-            r'(?:https?:)?//[^/]+/(?:[iz]/s/)?public/(.+)_[0-9,]+\.(?:mp4|webm)',
-            content_url, 'video path', default=None)
-        if not path:
-            player_url = self._proto_relative_url(self._search_regex(
-                r'<iframe[^>]+src="((?:https?)?//(?:www\.)?lego\.com/[^/]+/mediaplayer/video/[^"]+)',
-                webpage, 'player url', default=None))
-            if not player_url:
-                base_url = self._proto_relative_url(self._search_regex(
-                    r'data-baseurl="([^"]+)"', webpage, 'base url',
-                    default='http://www.lego.com/%s/mediaplayer/video/' % locale))
-                player_url = base_url + video_id
-            player_webpage = self._download_webpage(player_url, video_id)
-            video_data = self._parse_json(unescapeHTML(self._search_regex(
-                r"video='([^']+)'", player_webpage, 'video data')), video_id)
-            progressive_base = self._search_regex(
-                r'data-video-progressive-url="([^"]+)"',
-                player_webpage, 'progressive base', default='https://lc-mediaplayerns-live-s.legocdn.com/')
-            streaming_base = self._search_regex(
-                r'data-video-streaming-url="([^"]+)"',
-                player_webpage, 'streaming base', default='http://legoprod-f.akamaihd.net/')
-            item_id = video_data['ItemId']
-
-            net_storage_path = video_data.get('NetStoragePath') or '/'.join([item_id[:2], item_id[2:4]])
-            base_path = '_'.join([item_id, video_data['VideoId'], video_data['Locale'], compat_str(video_data['VideoVersion'])])
-            path = '/'.join([net_storage_path, base_path])
-        streaming_path = ','.join(map(lambda bitrate: compat_str(bitrate), self._BITRATES))
-
-        formats = self._extract_akamai_formats(
-            '%si/s/public/%s_,%s,.mp4.csmil/master.m3u8' % (streaming_base, path, streaming_path), video_id)
-        m3u8_formats = list(filter(
-            lambda f: f.get('protocol') == 'm3u8_native' and f.get('vcodec') != 'none',
-            formats))
-        if len(m3u8_formats) == len(self._BITRATES):
-            self._sort_formats(m3u8_formats)
-            for bitrate, m3u8_format in zip(self._BITRATES, m3u8_formats):
-                progressive_base_url = '%spublic/%s_%d.' % (progressive_base, path, bitrate)
-                mp4_f = m3u8_format.copy()
-                mp4_f.update({
-                    'url': progressive_base_url + 'mp4',
-                    'format_id': m3u8_format['format_id'].replace('hls', 'mp4'),
-                    'protocol': 'http',
-                })
-                web_f = {
-                    'url': progressive_base_url + 'webm',
-                    'format_id': m3u8_format['format_id'].replace('hls', 'webm'),
-                    'width': m3u8_format['width'],
-                    'height': m3u8_format['height'],
-                    'tbr': m3u8_format.get('tbr'),
-                    'ext': 'webm',
-                }
-                formats.extend([web_f, mp4_f])
-        else:
-            for bitrate in self._BITRATES:
-                for ext in ('web', 'mp4'):
-                    formats.append({
-                        'format_id': '%s-%s' % (ext, bitrate),
-                        'url': '%spublic/%s_%d.%s' % (progressive_base, path, bitrate, ext),
-                        'tbr': bitrate,
-                        'ext': ext,
-                    })
-        self._sort_formats(formats)
-
-        return {
-            'id': video_id,
-            'title': title,
-            'description': self._html_search_meta('description', webpage),
-            'thumbnail': self._html_search_meta('thumbnail', webpage),
-            'duration': parse_duration(self._html_search_meta('duration', webpage)),
-            'formats': formats,
-        }
diff --git a/youtube_dl/extractor/mitele.py b/youtube_dl/extractor/mitele.py

deleted file mode 100644

index 40f214a..0000000
--- a/youtube_dl/extractor/mitele.py
+++ /dev/null
@@ -1,120 +0,0 @@
-# coding: utf-8
-from __future__ import unicode_literals
-
-from .common import InfoExtractor
-from ..utils import (
-    int_or_none,
-    smuggle_url,
-    parse_duration,
-)
-
-
-class MiTeleIE(InfoExtractor):
-    IE_DESC = 'mitele.es'
-    _VALID_URL = r'https?://(?:www\.)?mitele\.es/(?:[^/]+/)+(?P<id>[^/]+)/player'
-
-    _TESTS = [{
-        'url': 'http://www.mitele.es/programas-tv/diario-de/57b0dfb9c715da65618b4afa/player',
-        'info_dict': {
-            'id': 'FhYW1iNTE6J6H7NkQRIEzfne6t2quqPg',
-            'ext': 'mp4',
-            'title': 'Tor, la web invisible',
-            'description': 'md5:3b6fce7eaa41b2d97358726378d9369f',
-            'series': 'Diario de',
-            'season': 'La redacción',
-            'season_number': 14,
-            'season_id': 'diario_de_t14_11981',
-            'episode': 'Programa 144',
-            'episode_number': 3,
-            'thumbnail': r're:(?i)^https?://.*\.jpg$',
-            'duration': 2913,
-        },
-        'add_ie': ['Ooyala'],
-    }, {
-        # no explicit title
-        'url': 'http://www.mitele.es/programas-tv/cuarto-milenio/57b0de3dc915da14058b4876/player',
-        'info_dict': {
-            'id': 'oyNG1iNTE6TAPP-JmCjbwfwJqqMMX3Vq',
-            'ext': 'mp4',
-            'title': 'Cuarto Milenio Temporada 6 Programa 226',
-            'description': 'md5:5ff132013f0cd968ffbf1f5f3538a65f',
-            'series': 'Cuarto Milenio',
-            'season': 'Temporada 6',
-            'season_number': 6,
-            'season_id': 'cuarto_milenio_t06_12715',
-            'episode': 'Programa 226',
-            'episode_number': 24,
-            'thumbnail': r're:(?i)^https?://.*\.jpg$',
-            'duration': 7313,
-        },
-        'params': {
-            'skip_download': True,
-        },
-        'add_ie': ['Ooyala'],
-    }, {
-        'url': 'http://www.mitele.es/series-online/la-que-se-avecina/57aac5c1c915da951a8b45ed/player',
-        'only_matching': True,
-    }]
-
-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-
-        paths = self._download_json(
-            'https://www.mitele.es/amd/agp/web/metadata/general_configuration',
-            video_id, 'Downloading paths JSON')
-
-        ooyala_s = paths['general_configuration']['api_configuration']['ooyala_search']
-        base_url = ooyala_s.get('base_url', 'cdn-search-mediaset.carbyne.ps.ooyala.com')
-        full_path = ooyala_s.get('full_path', '/search/v1/full/providers/')
-        source = self._download_json(
-            '%s://%s%s%s/docs/%s' % (
-                ooyala_s.get('protocol', 'https'), base_url, full_path,
-                ooyala_s.get('provider_id', '104951'), video_id),
-            video_id, 'Downloading data JSON', query={
-                'include_titles': 'Series,Season',
-                'product_name': ooyala_s.get('product_name', 'test'),
-                'format': 'full',
-            })['hits']['hits'][0]['_source']
-
-        embedCode = source['offers'][0]['embed_codes'][0]
-        titles = source['localizable_titles'][0]
-
-        title = titles.get('title_medium') or titles['title_long']
-
-        description = titles.get('summary_long') or titles.get('summary_medium')
-
-        def get(key1, key2):
-            value1 = source.get(key1)
-            if not value1 or not isinstance(value1, list):
-                return
-            if not isinstance(value1[0], dict):
-                return
-            return value1[0].get(key2)
-
-        series = get('localizable_titles_series', 'title_medium')
-
-        season = get('localizable_titles_season', 'title_medium')
-        season_number = int_or_none(source.get('season_number'))
-        season_id = source.get('season_id')
-
-        episode = titles.get('title_sort_name')
-        episode_number = int_or_none(source.get('episode_number'))
-
-        duration = parse_duration(get('videos', 'duration'))
-
-        return {
-            '_type': 'url_transparent',
-            # for some reason only HLS is supported
-            'url': smuggle_url('ooyala:' + embedCode, {'supportedformats': 'm3u8,dash'}),
-            'id': video_id,
-            'title': title,
-            'description': description,
-            'series': series,
-            'season': season,
-            'season_number': season_number,
-            'season_id': season_id,
-            'episode': episode,
-            'episode_number': episode_number,
-            'duration': duration,
-            'thumbnail': get('images', 'url'),
-        }
diff --git a/youtube_dl/extractor/pandatv.py b/youtube_dl/extractor/pandatv.py

deleted file mode 100644

index 4219802..0000000
--- a/youtube_dl/extractor/pandatv.py
+++ /dev/null
@@ -1,99 +0,0 @@
-# coding: utf-8
-from __future__ import unicode_literals
-
-from .common import InfoExtractor
-from ..utils import (
-    ExtractorError,
-    qualities,
-)
-
-
-class PandaTVIE(InfoExtractor):
-    IE_DESC = '熊猫TV'
-    _VALID_URL = r'https?://(?:www\.)?panda\.tv/(?P<id>[0-9]+)'
-    _TESTS = [{
-        'url': 'http://www.panda.tv/66666',
-        'info_dict': {
-            'id': '66666',
-            'title': 're:.+',
-            'uploader': '刘杀鸡',
-            'ext': 'flv',
-            'is_live': True,
-        },
-        'params': {
-            'skip_download': True,
-        },
-        'skip': 'Live stream is offline',
-    }, {
-        'url': 'https://www.panda.tv/66666',
-        'only_matching': True,
-    }]
-
-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-
-        config = self._download_json(
-            'https://www.panda.tv/api_room_v2?roomid=%s' % video_id, video_id)
-
-        error_code = config.get('errno', 0)
-        if error_code != 0:
-            raise ExtractorError(
-                '%s returned error %s: %s'
-                % (self.IE_NAME, error_code, config['errmsg']),
-                expected=True)
-
-        data = config['data']
-        video_info = data['videoinfo']
-
-        # 2 = live, 3 = offline
-        if video_info.get('status') != '2':
-            raise ExtractorError(
-                'Live stream is offline', expected=True)
-
-        title = data['roominfo']['name']
-        uploader = data.get('hostinfo', {}).get('name')
-        room_key = video_info['room_key']
-        stream_addr = video_info.get(
-            'stream_addr', {'OD': '1', 'HD': '1', 'SD': '1'})
-
-        # Reverse engineered from web player swf
-        # (http://s6.pdim.gs/static/07153e425f581151.swf at the moment of
-        # writing).
-        plflag0, plflag1 = video_info['plflag'].split('_')
-        plflag0 = int(plflag0) - 1
-        if plflag1 == '21':
-            plflag0 = 10
-            plflag1 = '4'
-        live_panda = 'live_panda' if plflag0 < 1 else ''
-
-        plflag_auth = self._parse_json(video_info['plflag_list'], video_id)
-        sign = plflag_auth['auth']['sign']
-        ts = plflag_auth['auth']['time']
-        rid = plflag_auth['auth']['rid']
-
-        quality_key = qualities(['OD', 'HD', 'SD'])
-        suffix = ['_small', '_mid', '']
-        formats = []
-        for k, v in stream_addr.items():
-            if v != '1':
-                continue
-            quality = quality_key(k)
-            if quality <= 0:
-                continue
-            for pref, (ext, pl) in enumerate((('m3u8', '-hls'), ('flv', ''))):
-                formats.append({
-                    'url': 'https://pl%s%s.live.panda.tv/live_panda/%s%s%s.%s?sign=%s&ts=%s&rid=%s'
-                    % (pl, plflag1, room_key, live_panda, suffix[quality], ext, sign, ts, rid),
-                    'format_id': '%s-%s' % (k, ext),
-                    'quality': quality,
-                    'source_preference': pref,
-                })
-        self._sort_formats(formats)
-
-        return {
-            'id': video_id,
-            'title': self._live_title(title),
-            'uploader': uploader,
-            'formats': formats,
-            'is_live': True,
-        }
diff --git a/youtube_dl/extractor/phoenix.py b/youtube_dl/extractor/phoenix.py

deleted file mode 100644

index e435c28..0000000
--- a/youtube_dl/extractor/phoenix.py
+++ /dev/null
@@ -1,45 +0,0 @@
-from __future__ import unicode_literals
-
-from .dreisat import DreiSatIE
-
-
-class PhoenixIE(DreiSatIE):
-    IE_NAME = 'phoenix.de'
-    _VALID_URL = r'''(?x)https?://(?:www\.)?phoenix\.de/content/
-        (?:
-            phoenix/die_sendungen/(?:[^/]+/)?
-        )?
-        (?P<id>[0-9]+)'''
-    _TESTS = [
-        {
-            'url': 'http://www.phoenix.de/content/884301',
-            'md5': 'ed249f045256150c92e72dbb70eadec6',
-            'info_dict': {
-                'id': '884301',
-                'ext': 'mp4',
-                'title': 'Michael Krons mit Hans-Werner Sinn',
-                'description': 'Im Dialog - Sa. 25.10.14, 00.00 - 00.35 Uhr',
-                'upload_date': '20141025',
-                'uploader': 'Im Dialog',
-            }
-        },
-        {
-            'url': 'http://www.phoenix.de/content/phoenix/die_sendungen/869815',
-            'only_matching': True,
-        },
-        {
-            'url': 'http://www.phoenix.de/content/phoenix/die_sendungen/diskussionen/928234',
-            'only_matching': True,
-        },
-    ]
-
-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-        webpage = self._download_webpage(url, video_id)
-
-        internal_id = self._search_regex(
-            r'<div class="phx_vod" id="phx_vod_([0-9]+)"',
-            webpage, 'internal video ID')
-
-        api_url = 'http://www.phoenix.de/php/mediaplayer/data/beitrags_details.php?ak=web&id=%s' % internal_id
-        return self.extract_from_xml_url(video_id, api_url)
diff --git a/youtube_dl/extractor/pokemon.py b/youtube_dl/extractor/pokemon.py

deleted file mode 100644

index dd5f17f..0000000
--- a/youtube_dl/extractor/pokemon.py
+++ /dev/null
@@ -1,75 +0,0 @@
-# coding: utf-8
-from __future__ import unicode_literals
-
-import re
-
-from .common import InfoExtractor
-from ..utils import (
-    extract_attributes,
-    int_or_none,
-)
-
-
-class PokemonIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?pokemon\.com/[a-z]{2}(?:.*?play=(?P<id>[a-z0-9]{32})|/(?:[^/]+/)+(?P<display_id>[^/?#&]+))'
-    _TESTS = [{
-        'url': 'https://www.pokemon.com/us/pokemon-episodes/20_30-the-ol-raise-and-switch/',
-        'md5': '2fe8eaec69768b25ef898cda9c43062e',
-        'info_dict': {
-            'id': 'afe22e30f01c41f49d4f1d9eab5cd9a4',
-            'ext': 'mp4',
-            'title': 'The Ol’ Raise and Switch!',
-            'description': 'md5:7db77f7107f98ba88401d3adc80ff7af',
-            'timestamp': 1511824728,
-            'upload_date': '20171127',
-        },
-        'add_id': ['LimelightMedia'],
-    }, {
-        # no data-video-title
-        'url': 'https://www.pokemon.com/us/pokemon-episodes/pokemon-movies/pokemon-the-rise-of-darkrai-2008',
-        'info_dict': {
-            'id': '99f3bae270bf4e5097274817239ce9c8',
-            'ext': 'mp4',
-            'title': 'Pokémon: The Rise of Darkrai',
-            'description': 'md5:ea8fbbf942e1e497d54b19025dd57d9d',
-            'timestamp': 1417778347,
-            'upload_date': '20141205',
-        },
-        'add_id': ['LimelightMedia'],
-        'params': {
-            'skip_download': True,
-        },
-    }, {
-        'url': 'http://www.pokemon.com/uk/pokemon-episodes/?play=2e8b5c761f1d4a9286165d7748c1ece2',
-        'only_matching': True,
-    }, {
-        'url': 'http://www.pokemon.com/fr/episodes-pokemon/18_09-un-hiver-inattendu/',
-        'only_matching': True,
-    }, {
-        'url': 'http://www.pokemon.com/de/pokemon-folgen/01_20-bye-bye-smettbo/',
-        'only_matching': True,
-    }]
-
-    def _real_extract(self, url):
-        video_id, display_id = re.match(self._VALID_URL, url).groups()
-        webpage = self._download_webpage(url, video_id or display_id)
-        video_data = extract_attributes(self._search_regex(
-            r'(<[^>]+data-video-id="%s"[^>]*>)' % (video_id if video_id else '[a-z0-9]{32}'),
-            webpage, 'video data element'))
-        video_id = video_data['data-video-id']
-        title = video_data.get('data-video-title') or self._html_search_meta(
-            'pkm-title', webpage, ' title', default=None) or self._search_regex(
-            r'<h1[^>]+\bclass=["\']us-title[^>]+>([^<]+)', webpage, 'title')
-        return {
-            '_type': 'url_transparent',
-            'id': video_id,
-            'url': 'limelight:media:%s' % video_id,
-            'title': title,
-            'description': video_data.get('data-video-summary'),
-            'thumbnail': video_data.get('data-video-poster'),
-            'series': 'Pokémon',
-            'season_number': int_or_none(video_data.get('data-video-season')),
-            'episode': title,
-            'episode_number': int_or_none(video_data.get('data-video-episode')),
-            'ie_key': 'LimelightMedia',
-        }
diff --git a/youtube_dl/extractor/spankwire.py b/youtube_dl/extractor/spankwire.py

deleted file mode 100644

index 44d8fa5..0000000
--- a/youtube_dl/extractor/spankwire.py
+++ /dev/null
@@ -1,127 +0,0 @@
-from __future__ import unicode_literals
-
-import re
-
-from .common import InfoExtractor
-from ..compat import (
-    compat_urllib_parse_unquote,
-    compat_urllib_parse_urlparse,
-)
-from ..utils import (
-    sanitized_Request,
-    str_to_int,
-    unified_strdate,
-)
-from ..aes import aes_decrypt_text
-
-
-class SpankwireIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?(?P<url>spankwire\.com/[^/]*/video(?P<id>[0-9]+)/?)'
-    _TESTS = [{
-        # download URL pattern: */<height>P_<tbr>K_<video_id>.mp4
-        'url': 'http://www.spankwire.com/Buckcherry-s-X-Rated-Music-Video-Crazy-Bitch/video103545/',
-        'md5': '8bbfde12b101204b39e4b9fe7eb67095',
-        'info_dict': {
-            'id': '103545',
-            'ext': 'mp4',
-            'title': 'Buckcherry`s X Rated Music Video Crazy Bitch',
-            'description': 'Crazy Bitch X rated music video.',
-            'uploader': 'oreusz',
-            'uploader_id': '124697',
-            'upload_date': '20070507',
-            'age_limit': 18,
-        }
-    }, {
-        # download URL pattern: */mp4_<format_id>_<video_id>.mp4
-        'url': 'http://www.spankwire.com/Titcums-Compiloation-I/video1921551/',
-        'md5': '09b3c20833308b736ae8902db2f8d7e6',
-        'info_dict': {
-            'id': '1921551',
-            'ext': 'mp4',
-            'title': 'Titcums Compiloation I',
-            'description': 'cum on tits',
-            'uploader': 'dannyh78999',
-            'uploader_id': '3056053',
-            'upload_date': '20150822',
-            'age_limit': 18,
-        },
-    }]
-
-    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-
-        req = sanitized_Request('http://www.' + mobj.group('url'))
-        req.add_header('Cookie', 'age_verified=1')
-        webpage = self._download_webpage(req, video_id)
-
-        title = self._html_search_regex(
-            r'<h1>([^<]+)', webpage, 'title')
-        description = self._html_search_regex(
-            r'(?s)<div\s+id="descriptionContent">(.+?)</div>',
-            webpage, 'description', fatal=False)
-        thumbnail = self._html_search_regex(
-            r'playerData\.screenShot\s*=\s*["\']([^"\']+)["\']',
-            webpage, 'thumbnail', fatal=False)
-
-        uploader = self._html_search_regex(
-            r'by:\s*<a [^>]*>(.+?)</a>',
-            webpage, 'uploader', fatal=False)
-        uploader_id = self._html_search_regex(
-            r'by:\s*<a href="/(?:user/viewProfile|Profile\.aspx)\?.*?UserId=(\d+).*?"',
-            webpage, 'uploader id', fatal=False)
-        upload_date = unified_strdate(self._html_search_regex(
-            r'</a> on (.+?) at \d+:\d+',
-            webpage, 'upload date', fatal=False))
-
-        view_count = str_to_int(self._html_search_regex(
-            r'<div id="viewsCounter"><span>([\d,\.]+)</span> views</div>',
-            webpage, 'view count', fatal=False))
-        comment_count = str_to_int(self._html_search_regex(
-            r'<span\s+id="spCommentCount"[^>]*>([\d,\.]+)</span>',
-            webpage, 'comment count', fatal=False))
-
-        videos = re.findall(
-            r'playerData\.cdnPath([0-9]{3,})\s*=\s*(?:encodeURIComponent\()?["\']([^"\']+)["\']', webpage)
-        heights = [int(video[0]) for video in videos]
-        video_urls = list(map(compat_urllib_parse_unquote, [video[1] for video in videos]))
-        if webpage.find(r'flashvars\.encrypted = "true"') != -1:
-            password = self._search_regex(
-                r'flashvars\.video_title = "([^"]+)',
-                webpage, 'password').replace('+', ' ')
-            video_urls = list(map(
-                lambda s: aes_decrypt_text(s, password, 32).decode('utf-8'),
-                video_urls))
-
-        formats = []
-        for height, video_url in zip(heights, video_urls):
-            path = compat_urllib_parse_urlparse(video_url).path
-            m = re.search(r'/(?P<height>\d+)[pP]_(?P<tbr>\d+)[kK]', path)
-            if m:
-                tbr = int(m.group('tbr'))
-                height = int(m.group('height'))
-            else:
-                tbr = None
-            formats.append({
-                'url': video_url,
-                'format_id': '%dp' % height,
-                'height': height,
-                'tbr': tbr,
-            })
-        self._sort_formats(formats)
-
-        age_limit = self._rta_search(webpage)
-
-        return {
-            'id': video_id,
-            'title': title,
-            'description': description,
-            'thumbnail': thumbnail,
-            'uploader': uploader,
-            'uploader_id': uploader_id,
-            'upload_date': upload_date,
-            'view_count': view_count,
-            'comment_count': comment_count,
-            'formats': formats,
-            'age_limit': age_limit,
-        }
diff --git a/youtube_dl/extractor/stretchinternet.py b/youtube_dl/extractor/stretchinternet.py

deleted file mode 100644

index ae2ac1b..0000000
--- a/youtube_dl/extractor/stretchinternet.py
+++ /dev/null
@@ -1,48 +0,0 @@
-from __future__ import unicode_literals
-
-from .common import InfoExtractor
-from ..utils import int_or_none
-
-
-class StretchInternetIE(InfoExtractor):
-    _VALID_URL = r'https?://portal\.stretchinternet\.com/[^/]+/portal\.htm\?.*?\beventId=(?P<id>\d+)'
-    _TEST = {
-        'url': 'https://portal.stretchinternet.com/umary/portal.htm?eventId=313900&streamType=video',
-        'info_dict': {
-            'id': '313900',
-            'ext': 'mp4',
-            'title': 'Augustana (S.D.) Baseball vs University of Mary',
-            'description': 'md5:7578478614aae3bdd4a90f578f787438',
-            'timestamp': 1490468400,
-            'upload_date': '20170325',
-        }
-    }
-
-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-
-        stream = self._download_json(
-            'https://neo-client.stretchinternet.com/streamservice/v1/media/stream/v%s'
-            % video_id, video_id)
-
-        video_url = 'https://%s' % stream['source']
-
-        event = self._download_json(
-            'https://neo-client.stretchinternet.com/portal-ws/getEvent.json',
-            video_id, query={
-                'clientID': 99997,
-                'eventID': video_id,
-                'token': 'asdf',
-            })['event']
-
-        title = event.get('title') or event['mobileTitle']
-        description = event.get('customText')
-        timestamp = int_or_none(event.get('longtime'))
-
-        return {
-            'id': video_id,
-            'title': title,
-            'description': description,
-            'timestamp': timestamp,
-            'url': video_url,
-        }
diff --git a/youtube_dl/extractor/tele5.py b/youtube_dl/extractor/tele5.py

deleted file mode 100644

index 33a7208..0000000
--- a/youtube_dl/extractor/tele5.py
+++ /dev/null
@@ -1,57 +0,0 @@
-# coding: utf-8
-from __future__ import unicode_literals
-
-from .common import InfoExtractor
-from .nexx import NexxIE
-from ..compat import compat_urlparse
-
-
-class Tele5IE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?tele5\.de/(?:[^/]+/)*(?P<id>[^/?#&]+)'
-    _TESTS = [{
-        'url': 'https://www.tele5.de/mediathek/filme-online/videos?vid=1549416',
-        'info_dict': {
-            'id': '1549416',
-            'ext': 'mp4',
-            'upload_date': '20180814',
-            'timestamp': 1534290623,
-            'title': 'Pandorum',
-        },
-        'params': {
-            'skip_download': True,
-        },
-    }, {
-        'url': 'https://www.tele5.de/kalkofes-mattscheibe/video-clips/politik-und-gesellschaft?ve_id=1551191',
-        'only_matching': True,
-    }, {
-        'url': 'https://www.tele5.de/video-clip/?ve_id=1609440',
-        'only_matching': True,
-    }, {
-        'url': 'https://www.tele5.de/filme/schlefaz-dragon-crusaders/',
-        'only_matching': True,
-    }, {
-        'url': 'https://www.tele5.de/filme/making-of/avengers-endgame/',
-        'only_matching': True,
-    }, {
-        'url': 'https://www.tele5.de/star-trek/raumschiff-voyager/ganze-folge/das-vinculum/',
-        'only_matching': True,
-    }, {
-        'url': 'https://www.tele5.de/anders-ist-sevda/',
-        'only_matching': True,
-    }]
-
-    def _real_extract(self, url):
-        qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
-        video_id = (qs.get('vid') or qs.get('ve_id') or [None])[0]
-
-        if not video_id:
-            display_id = self._match_id(url)
-            webpage = self._download_webpage(url, display_id)
-            video_id = self._html_search_regex(
-                (r'id\s*=\s*["\']video-player["\'][^>]+data-id\s*=\s*["\'](\d+)',
-                 r'\s+id\s*=\s*["\']player_(\d{6,})',
-                 r'\bdata-id\s*=\s*["\'](\d{6,})'), webpage, 'video id')
-
-        return self.url_result(
-            'https://api.nexx.cloud/v3/759/videos/byid/%s' % video_id,
-            ie=NexxIE.ie_key(), video_id=video_id)
diff --git a/youtube_dl/extractor/trunews.py b/youtube_dl/extractor/trunews.py

deleted file mode 100644

index b0c7caa..0000000
--- a/youtube_dl/extractor/trunews.py
+++ /dev/null
@@ -1,75 +0,0 @@
-from __future__ import unicode_literals
-
-from .common import InfoExtractor
-from ..utils import (
-    dict_get,
-    float_or_none,
-    int_or_none,
-    unified_timestamp,
-    update_url_query,
-    url_or_none,
-)
-
-
-class TruNewsIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?trunews\.com/stream/(?P<id>[^/?#&]+)'
-    _TEST = {
-        'url': 'https://www.trunews.com/stream/will-democrats-stage-a-circus-during-president-trump-s-state-of-the-union-speech',
-        'md5': 'a19c024c3906ff954fac9b96ce66bb08',
-        'info_dict': {
-            'id': '5c5a21e65d3c196e1c0020cc',
-            'display_id': 'will-democrats-stage-a-circus-during-president-trump-s-state-of-the-union-speech',
-            'ext': 'mp4',
-            'title': "Will Democrats Stage a Circus During President Trump's State of the Union Speech?",
-            'description': 'md5:c583b72147cc92cf21f56a31aff7a670',
-            'duration': 3685,
-            'timestamp': 1549411440,
-            'upload_date': '20190206',
-        },
-        'add_ie': ['Zype'],
-    }
-
-    def _real_extract(self, url):
-        display_id = self._match_id(url)
-
-        video = self._download_json(
-            'https://api.zype.com/videos', display_id, query={
-                'app_key': 'PUVKp9WgGUb3-JUw6EqafLx8tFVP6VKZTWbUOR-HOm__g4fNDt1bCsm_LgYf_k9H',
-                'per_page': 1,
-                'active': 'true',
-                'friendly_title': display_id,
-            })['response'][0]
-
-        zype_id = video['_id']
-
-        thumbnails = []
-        thumbnails_list = video.get('thumbnails')
-        if isinstance(thumbnails_list, list):
-            for thumbnail in thumbnails_list:
-                if not isinstance(thumbnail, dict):
-                    continue
-                thumbnail_url = url_or_none(thumbnail.get('url'))
-                if not thumbnail_url:
-                    continue
-                thumbnails.append({
-                    'url': thumbnail_url,
-                    'width': int_or_none(thumbnail.get('width')),
-                    'height': int_or_none(thumbnail.get('height')),
-                })
-
-        return {
-            '_type': 'url_transparent',
-            'url': update_url_query(
-                'https://player.zype.com/embed/%s.js' % zype_id,
-                {'api_key': 'X5XnahkjCwJrT_l5zUqypnaLEObotyvtUKJWWlONxDoHVjP8vqxlArLV8llxMbyt'}),
-            'ie_key': 'Zype',
-            'id': zype_id,
-            'display_id': display_id,
-            'title': video.get('title'),
-            'description': dict_get(video, ('description', 'ott_description', 'short_description')),
-            'duration': int_or_none(video.get('duration')),
-            'timestamp': unified_timestamp(video.get('published_at')),
-            'average_rating': float_or_none(video.get('rating')),
-            'view_count': int_or_none(video.get('request_count')),
-            'thumbnails': thumbnails,
-        }
diff --git a/youtube_dl/extractor/tv5mondeplus.py b/youtube_dl/extractor/tv5mondeplus.py
deleted file mode 100644
index 88b6baa..0000000
--- a/youtube_dl/extractor/tv5mondeplus.py
+++ /dev/null
@@ -1,79 +0,0 @@
-# coding: utf-8
-from __future__ import unicode_literals
-
-from .common import InfoExtractor
-from ..utils import (
-    clean_html,
-    determine_ext,
-    extract_attributes,
-    get_element_by_class,
-    int_or_none,
-    parse_duration,
-    parse_iso8601,
-)
-
-
-class TV5MondePlusIE(InfoExtractor):
-    IE_DESC = 'TV5MONDE+'
-    _VALID_URL = r'https?://(?:www\.)?tv5mondeplus\.com/toutes-les-videos/[^/]+/(?P<id>[^/?#]+)'
-    _TEST = {
-        'url': 'http://www.tv5mondeplus.com/toutes-les-videos/documentaire/tdah-mon-amour-tele-quebec-tdah-mon-amour-ep001-enfants',
-        'md5': '12130fc199f020673138a83466542ec6',
-        'info_dict': {
-            'id': 'tdah-mon-amour-tele-quebec-tdah-mon-amour-ep001-enfants',
-            'ext': 'mp4',
-            'title': 'Tdah, mon amour - Enfants',
-            'description': 'md5:230e3aca23115afcf8006d1bece6df74',
-            'upload_date': '20170401',
-            'timestamp': 1491022860,
-        }
-    }
-    _GEO_BYPASS = False
-
-    def _real_extract(self, url):
-        display_id = self._match_id(url)
-        webpage = self._download_webpage(url, display_id)
-
-        if ">Ce programme n'est malheureusement pas disponible pour votre zone géographique.<" in webpage:
-            self.raise_geo_restricted(countries=['FR'])
-
-        series = get_element_by_class('video-detail__title', webpage)
-        title = episode = get_element_by_class(
-            'video-detail__subtitle', webpage) or series
-        if series and series != title:
-            title = '%s - %s' % (series, title)
-        vpl_data = extract_attributes(self._search_regex(
-            r'(<[^>]+class="video_player_loader"[^>]+>)',
-            webpage, 'video player loader'))
-
-        video_files = self._parse_json(
-            vpl_data['data-broadcast'], display_id).get('files', [])
-        formats = []
-        for video_file in video_files:
-            v_url = video_file.get('url')
-            if not v_url:
-                continue
-            video_format = video_file.get('format') or determine_ext(v_url)
-            if video_format == 'm3u8':
-                formats.extend(self._extract_m3u8_formats(
-                    v_url, display_id, 'mp4', 'm3u8_native',
-                    m3u8_id='hls', fatal=False))
-            else:
-                formats.append({
-                    'url': v_url,
-                    'format_id': video_format,
-                })
-        self._sort_formats(formats)
-
-        return {
-            'id': display_id,
-            'display_id': display_id,
-            'title': title,
-            'description': clean_html(get_element_by_class('video-detail__description', webpage)),
-            'thumbnail': vpl_data.get('data-image'),
-            'duration': int_or_none(vpl_data.get('data-duration')) or parse_duration(self._html_search_meta('duration', webpage)),
-            'timestamp': parse_iso8601(self._html_search_meta('uploadDate', webpage)),
-            'formats': formats,
-            'episode': episode,
-            'series': series,
-        }
diff --git a/youtube_dl/extractor/viewlift.py b/youtube_dl/extractor/viewlift.py
deleted file mode 100644
index 851ad93..0000000
--- a/youtube_dl/extractor/viewlift.py
+++ /dev/null
@@ -1,302 +0,0 @@
-from __future__ import unicode_literals
-
-import base64
-import re
-
-from .common import InfoExtractor
-from ..compat import compat_urllib_parse_unquote
-from ..utils import (
-    ExtractorError,
-    clean_html,
-    determine_ext,
-    int_or_none,
-    js_to_json,
-    parse_age_limit,
-    parse_duration,
-    try_get,
-)
-
-
-class ViewLiftBaseIE(InfoExtractor):
-    _DOMAINS_REGEX = r'(?:(?:main\.)?snagfilms|snagxtreme|funnyforfree|kiddovid|winnersview|(?:monumental|lax)sportsnetwork|vayafilm)\.com|hoichoi\.tv'
-
-
-class ViewLiftEmbedIE(ViewLiftBaseIE):
-    _VALID_URL = r'https?://(?:(?:www|embed)\.)?(?:%s)/embed/player\?.*\bfilmId=(?P<id>[\da-f]{8}-(?:[\da-f]{4}-){3}[\da-f]{12})' % ViewLiftBaseIE._DOMAINS_REGEX
-    _TESTS = [{
-        'url': 'http://embed.snagfilms.com/embed/player?filmId=74849a00-85a9-11e1-9660-123139220831&w=500',
-        'md5': '2924e9215c6eff7a55ed35b72276bd93',
-        'info_dict': {
-            'id': '74849a00-85a9-11e1-9660-123139220831',
-            'ext': 'mp4',
-            'title': '#whilewewatch',
-        }
-    }, {
-        # invalid labels, 360p is better that 480p
-        'url': 'http://www.snagfilms.com/embed/player?filmId=17ca0950-a74a-11e0-a92a-0026bb61d036',
-        'md5': '882fca19b9eb27ef865efeeaed376a48',
-        'info_dict': {
-            'id': '17ca0950-a74a-11e0-a92a-0026bb61d036',
-            'ext': 'mp4',
-            'title': 'Life in Limbo',
-        }
-    }, {
-        'url': 'http://www.snagfilms.com/embed/player?filmId=0000014c-de2f-d5d6-abcf-ffef58af0017',
-        'only_matching': True,
-    }]
-
-    @staticmethod
-    def _extract_url(webpage):
-        mobj = re.search(
-            r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//(?:embed\.)?(?:%s)/embed/player.+?)\1' % ViewLiftBaseIE._DOMAINS_REGEX,
-            webpage)
-        if mobj:
-            return mobj.group('url')
-
-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-
-        webpage = self._download_webpage(url, video_id)
-
-        if '>This film is not playable in your area.<' in webpage:
-            raise ExtractorError(
-                'Film %s is not playable in your area.' % video_id, expected=True)
-
-        formats = []
-        has_bitrate = False
-        sources = self._parse_json(self._search_regex(
-            r'(?s)sources:\s*(\[.+?\]),', webpage,
-            'sources', default='[]'), video_id, js_to_json)
-        for source in sources:
-            file_ = source.get('file')
-            if not file_:
-                continue
-            type_ = source.get('type')
-            ext = determine_ext(file_)
-            format_id = source.get('label') or ext
-            if all(v in ('m3u8', 'hls') for v in (type_, ext)):
-                formats.extend(self._extract_m3u8_formats(
-                    file_, video_id, 'mp4', 'm3u8_native',
-                    m3u8_id='hls', fatal=False))
-            else:
-                bitrate = int_or_none(self._search_regex(
-                    [r'(\d+)kbps', r'_\d{1,2}x\d{1,2}_(\d{3,})\.%s' % ext],
-                    file_, 'bitrate', default=None))
-                if not has_bitrate and bitrate:
-                    has_bitrate = True
-                height = int_or_none(self._search_regex(
-                    r'^(\d+)[pP]$', format_id, 'height', default=None))
-                formats.append({
-                    'url': file_,
-                    'format_id': 'http-%s%s' % (format_id, ('-%dk' % bitrate if bitrate else '')),
-                    'tbr': bitrate,
-                    'height': height,
-                })
-        if not formats:
-            hls_url = self._parse_json(self._search_regex(
-                r'filmInfo\.src\s*=\s*({.+?});',
-                webpage, 'src'), video_id, js_to_json)['src']
-            formats = self._extract_m3u8_formats(
-                hls_url, video_id, 'mp4', 'm3u8_native',
-                m3u8_id='hls', fatal=False)
-        field_preference = None if has_bitrate else ('height', 'tbr', 'format_id')
-        self._sort_formats(formats, field_preference)
-
-        title = self._search_regex(
-            [r"title\s*:\s*'([^']+)'", r'<title>([^<]+)</title>'],
-            webpage, 'title')
-
-        return {
-            'id': video_id,
-            'title': title,
-            'formats': formats,
-        }
-
-
-class ViewLiftIE(ViewLiftBaseIE):
-    _VALID_URL = r'https?://(?:www\.)?(?P<domain>%s)(?:/(?:films/title|show|(?:news/)?videos?))?/(?P<id>[^?#]+)' % ViewLiftBaseIE._DOMAINS_REGEX
-    _TESTS = [{
-        'url': 'http://www.snagfilms.com/films/title/lost_for_life',
-        'md5': '19844f897b35af219773fd63bdec2942',
-        'info_dict': {
-            'id': '0000014c-de2f-d5d6-abcf-ffef58af0017',
-            'display_id': 'lost_for_life',
-            'ext': 'mp4',
-            'title': 'Lost for Life',
-            'description': 'md5:ea10b5a50405ae1f7b5269a6ec594102',
-            'thumbnail': r're:^https?://.*\.jpg',
-            'duration': 4489,
-            'categories': 'mincount:3',
-            'age_limit': 14,
-            'upload_date': '20150421',
-            'timestamp': 1429656820,
-        }
-    }, {
-        'url': 'http://www.snagfilms.com/show/the_world_cut_project/india',
-        'md5': 'e6292e5b837642bbda82d7f8bf3fbdfd',
-        'info_dict': {
-            'id': '00000145-d75c-d96e-a9c7-ff5c67b20000',
-            'display_id': 'the_world_cut_project/india',
-            'ext': 'mp4',
-            'title': 'India',
-            'description': 'md5:5c168c5a8f4719c146aad2e0dfac6f5f',
-            'thumbnail': r're:^https?://.*\.jpg',
-            'duration': 979,
-            'timestamp': 1399478279,
-            'upload_date': '20140507',
-        }
-    }, {
-        'url': 'http://main.snagfilms.com/augie_alone/s_2_ep_12_love',
-        'info_dict': {
-            'id': '00000148-7b53-de26-a9fb-fbf306f70020',
-            'display_id': 'augie_alone/s_2_ep_12_love',
-            'ext': 'mp4',
-            'title': 'Augie, Alone:S. 2 Ep. 12 - Love',
-            'description': 'md5:db2a5c72d994f16a780c1eb353a8f403',
-            'thumbnail': r're:^https?://.*\.jpg',
-            'duration': 107,
-        },
-        'params': {
-            'skip_download': True,
-        },
-    }, {
-        'url': 'http://main.snagfilms.com/films/title/the_freebie',
-        'only_matching': True,
-    }, {
-        # Film is not playable in your area.
-        'url': 'http://www.snagfilms.com/films/title/inside_mecca',
-        'only_matching': True,
-    }, {
-        # Film is not available.
-        'url': 'http://www.snagfilms.com/show/augie_alone/flirting',
-        'only_matching': True,
-    }, {
-        'url': 'http://www.winnersview.com/videos/the-good-son',
-        'only_matching': True,
-    }, {
-        # Was once Kaltura embed
-        'url': 'https://www.monumentalsportsnetwork.com/videos/john-carlson-postgame-2-25-15',
-        'only_matching': True,
-    }]
-
-    @classmethod
-    def suitable(cls, url):
-        return False if ViewLiftEmbedIE.suitable(url) else super(ViewLiftIE, cls).suitable(url)
-
-    def _real_extract(self, url):
-        domain, display_id = re.match(self._VALID_URL, url).groups()
-
-        webpage = self._download_webpage(url, display_id)
-
-        if ">Sorry, the Film you're looking for is not available.<" in webpage:
-            raise ExtractorError(
-                'Film %s is not available.' % display_id, expected=True)
-
-        initial_store_state = self._search_regex(
-            r"window\.initialStoreState\s*=.*?JSON\.parse\(unescape\(atob\('([^']+)'\)\)\)",
-            webpage, 'Initial Store State', default=None)
-        if initial_store_state:
-            modules = self._parse_json(compat_urllib_parse_unquote(base64.b64decode(
-                initial_store_state).decode()), display_id)['page']['data']['modules']
-            content_data = next(m['contentData'][0] for m in modules if m.get('moduleType') == 'VideoDetailModule')
-            gist = content_data['gist']
-            film_id = gist['id']
-            title = gist['title']
-            video_assets = try_get(
-                content_data, lambda x: x['streamingInfo']['videoAssets'], dict)
-            if not video_assets:
-                token = self._download_json(
-                    'https://prod-api.viewlift.com/identity/anonymous-token',
-                    film_id, 'Downloading authorization token',
-                    query={'site': 'snagfilms'})['authorizationToken']
-                video_assets = self._download_json(
-                    'https://prod-api.viewlift.com/entitlement/video/status',
-                    film_id, headers={
-                        'Authorization': token,
-                        'Referer': url,
-                    }, query={
-                        'id': film_id
-                    })['video']['streamingInfo']['videoAssets']
-
-            formats = []
-            mpeg_video_assets = video_assets.get('mpeg') or []
-            for video_asset in mpeg_video_assets:
-                video_asset_url = video_asset.get('url')
-                if not video_asset:
-                    continue
-                bitrate = int_or_none(video_asset.get('bitrate'))
-                height = int_or_none(self._search_regex(
-                    r'^_?(\d+)[pP]$', video_asset.get('renditionValue'),
-                    'height', default=None))
-                formats.append({
-                    'url': video_asset_url,
-                    'format_id': 'http%s' % ('-%d' % bitrate if bitrate else ''),
-                    'tbr': bitrate,
-                    'height': height,
-                    'vcodec': video_asset.get('codec'),
-                })
-
-            hls_url = video_assets.get('hls')
-            if hls_url:
-                formats.extend(self._extract_m3u8_formats(
-                    hls_url, film_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False))
-            self._sort_formats(formats, ('height', 'tbr', 'format_id'))
-
-            info = {
-                'id': film_id,
-                'display_id': display_id,
-                'title': title,
-                'description': gist.get('description'),
-                'thumbnail': gist.get('videoImageUrl'),
-                'duration': int_or_none(gist.get('runtime')),
-                'age_limit': parse_age_limit(content_data.get('parentalRating')),
-                'timestamp': int_or_none(gist.get('publishDate'), 1000),
-                'formats': formats,
-            }
-            for k in ('categories', 'tags'):
-                info[k] = [v['title'] for v in content_data.get(k, []) if v.get('title')]
-            return info
-        else:
-            film_id = self._search_regex(r'filmId=([\da-f-]{36})"', webpage, 'film id')
-
-            snag = self._parse_json(
-                self._search_regex(
-                    r'Snag\.page\.data\s*=\s*(\[.+?\]);', webpage, 'snag', default='[]'),
-                display_id)
-
-            for item in snag:
-                if item.get('data', {}).get('film', {}).get('id') == film_id:
-                    data = item['data']['film']
-                    title = data['title']
-                    description = clean_html(data.get('synopsis'))
-                    thumbnail = data.get('image')
-                    duration = int_or_none(data.get('duration') or data.get('runtime'))
-                    categories = [
-                        category['title'] for category in data.get('categories', [])
-                        if category.get('title')]
-                    break
-            else:
-                title = self._html_search_regex(
-                    (r'itemprop="title">([^<]+)<',
-                     r'(?s)itemprop="title">(.+?)<div'), webpage, 'title')
-                description = self._html_search_regex(
-                    r'(?s)<div itemprop="description" class="film-synopsis-inner ">(.+?)</div>',
-                    webpage, 'description', default=None) or self._og_search_description(webpage)
-                thumbnail = self._og_search_thumbnail(webpage)
-                duration = parse_duration(self._search_regex(
-                    r'<span itemprop="duration" class="film-duration strong">([^<]+)<',
-                    webpage, 'duration', fatal=False))
-                categories = re.findall(r'<a href="/movies/[^"]+">([^<]+)</a>', webpage)
-
-            return {
-                '_type': 'url_transparent',
-                'url': 'http://%s/embed/player?filmId=%s' % (domain, film_id),
-                'id': film_id,
-                'display_id': display_id,
-                'title': title,
-                'description': description,
-                'thumbnail': thumbnail,
-                'duration': duration,
-                'categories': categories,
-                'ie_key': 'ViewLiftEmbed',
-            }
diff --git a/youtube_dl/extractor/voicerepublic.py b/youtube_dl/extractor/voicerepublic.py
deleted file mode 100644
index 59e1359..0000000
--- a/youtube_dl/extractor/voicerepublic.py
+++ /dev/null
@@ -1,100 +0,0 @@
-from __future__ import unicode_literals
-
-import re
-
-from .common import InfoExtractor
-from ..compat import (
-    compat_str,
-    compat_urlparse,
-)
-from ..utils import (
-    ExtractorError,
-    determine_ext,
-    int_or_none,
-    sanitized_Request,
-)
-
-
-class VoiceRepublicIE(InfoExtractor):
-    _VALID_URL = r'https?://voicerepublic\.com/(?:talks|embed)/(?P<id>[0-9a-z-]+)'
-    _TESTS = [{
-        'url': 'http://voicerepublic.com/talks/watching-the-watchers-building-a-sousveillance-state',
-        'md5': 'b9174d651323f17783000876347116e3',
-        'info_dict': {
-            'id': '2296',
-            'display_id': 'watching-the-watchers-building-a-sousveillance-state',
-            'ext': 'm4a',
-            'title': 'Watching the Watchers: Building a Sousveillance State',
-            'description': 'Secret surveillance programs have metadata too. The people and companies that operate secret surveillance programs can be surveilled.',
-            'thumbnail': r're:^https?://.*\.(?:png|jpg)$',
-            'duration': 1800,
-            'view_count': int,
-        }
-    }, {
-        'url': 'http://voicerepublic.com/embed/watching-the-watchers-building-a-sousveillance-state',
-        'only_matching': True,
-    }]
-
-    def _real_extract(self, url):
-        display_id = self._match_id(url)
-
-        req = sanitized_Request(
-            compat_urlparse.urljoin(url, '/talks/%s' % display_id))
-        # Older versions of Firefox get redirected to an "upgrade browser" page
-        req.add_header('User-Agent', 'youtube-dl')
-        webpage = self._download_webpage(req, display_id)
-
-        if '>Queued for processing, please stand by...<' in webpage:
-            raise ExtractorError(
-                'Audio is still queued for processing', expected=True)
-
-        config = self._search_regex(
-            r'(?s)return ({.+?});\s*\n', webpage,
-            'data', default=None)
-        data = self._parse_json(config, display_id, fatal=False) if config else None
-        if data:
-            title = data['title']
-            description = data.get('teaser')
-            talk_id = compat_str(data.get('talk_id') or display_id)
-            talk = data['talk']
-            duration = int_or_none(talk.get('duration'))
-            formats = [{
-                'url': compat_urlparse.urljoin(url, talk_url),
-                'format_id': format_id,
-                'ext': determine_ext(talk_url) or format_id,
-                'vcodec': 'none',
-            } for format_id, talk_url in talk['links'].items()]
-        else:
-            title = self._og_search_title(webpage)
-            description = self._html_search_regex(
-                r"(?s)<div class='talk-teaser'[^>]*>(.+?)</div>",
-                webpage, 'description', fatal=False)
-            talk_id = self._search_regex(
-                [r"id='jc-(\d+)'", r"data-shareable-id='(\d+)'"],
-                webpage, 'talk id', default=None) or display_id
-            duration = None
-            player = self._search_regex(
-                r"class='vr-player jp-jplayer'([^>]+)>", webpage, 'player')
-            formats = [{
-                'url': compat_urlparse.urljoin(url, talk_url),
-                'format_id': format_id,
-                'ext': determine_ext(talk_url) or format_id,
-                'vcodec': 'none',
-            } for format_id, talk_url in re.findall(r"data-([^=]+)='([^']+)'", player)]
-        self._sort_formats(formats)
-
-        thumbnail = self._og_search_thumbnail(webpage)
-        view_count = int_or_none(self._search_regex(
-            r"class='play-count[^']*'>\s*(\d+) plays",
-            webpage, 'play count', fatal=False))
-
-        return {
-            'id': talk_id,
-            'display_id': display_id,
-            'title': title,
-            'description': description,
-            'thumbnail': thumbnail,
-            'duration': duration,
-            'view_count': view_count,
-            'formats': formats,
-        }
diff --git a/youtube_dl/extractor/wistia.py b/youtube_dl/extractor/wistia.py
deleted file mode 100644
index 0fbc888..0000000
--- a/youtube_dl/extractor/wistia.py
+++ /dev/null
@@ -1,127 +0,0 @@
-from __future__ import unicode_literals
-
-import re
-
-from .common import InfoExtractor
-from ..utils import (
-    ExtractorError,
-    int_or_none,
-    float_or_none,
-    unescapeHTML,
-)
-
-
-class WistiaIE(InfoExtractor):
-    _VALID_URL = r'(?:wistia:|https?://(?:fast\.)?wistia\.(?:net|com)/embed/(?:iframe|medias)/)(?P<id>[a-z0-9]{10})'
-    _API_URL = 'http://fast.wistia.com/embed/medias/%s.json'
-    _IFRAME_URL = 'http://fast.wistia.net/embed/iframe/%s'
-
-    _TESTS = [{
-        'url': 'http://fast.wistia.net/embed/iframe/sh7fpupwlt',
-        'md5': 'cafeb56ec0c53c18c97405eecb3133df',
-        'info_dict': {
-            'id': 'sh7fpupwlt',
-            'ext': 'mov',
-            'title': 'Being Resourceful',
-            'description': 'a Clients From Hell Video Series video from worldwidewebhosting',
-            'upload_date': '20131204',
-            'timestamp': 1386185018,
-            'duration': 117,
-        },
-    }, {
-        'url': 'wistia:sh7fpupwlt',
-        'only_matching': True,
-    }, {
-        # with hls video
-        'url': 'wistia:807fafadvk',
-        'only_matching': True,
-    }, {
-        'url': 'http://fast.wistia.com/embed/iframe/sh7fpupwlt',
-        'only_matching': True,
-    }, {
-        'url': 'http://fast.wistia.net/embed/medias/sh7fpupwlt.json',
-        'only_matching': True,
-    }]
-
-    # https://wistia.com/support/embed-and-share/video-on-your-website
-    @staticmethod
-    def _extract_url(webpage):
-        match = re.search(
-            r'<(?:meta[^>]+?content|(?:iframe|script)[^>]+?src)=["\'](?P<url>(?:https?:)?//(?:fast\.)?wistia\.(?:net|com)/embed/(?:iframe|medias)/[a-z0-9]{10})', webpage)
-        if match:
-            return unescapeHTML(match.group('url'))
-
-        match = re.search(
-            r'''(?sx)
-                <script[^>]+src=(["'])(?:https?:)?//fast\.wistia\.com/assets/external/E-v1\.js\1[^>]*>.*?
-                <div[^>]+class=(["']).*?\bwistia_async_(?P<id>[a-z0-9]{10})\b.*?\2
-            ''', webpage)
-        if match:
-            return 'wistia:%s' % match.group('id')
-
-        match = re.search(r'(?:data-wistia-?id=["\']|Wistia\.embed\(["\']|id=["\']wistia_)(?P<id>[a-z0-9]{10})', webpage)
-        if match:
-            return 'wistia:%s' % match.group('id')
-
-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-
-        data_json = self._download_json(
-            self._API_URL % video_id, video_id,
-            # Some videos require this.
-            headers={
-                'Referer': url if url.startswith('http') else self._IFRAME_URL % video_id,
-            })
-
-        if data_json.get('error'):
-            raise ExtractorError(
-                'Error while getting the playlist', expected=True)
-
-        data = data_json['media']
-        title = data['name']
-
-        formats = []
-        thumbnails = []
-        for a in data['assets']:
-            aurl = a.get('url')
-            if not aurl:
-                continue
-            astatus = a.get('status')
-            atype = a.get('type')
-            if (astatus is not None and astatus != 2) or atype in ('preview', 'storyboard'):
-                continue
-            elif atype in ('still', 'still_image'):
-                thumbnails.append({
-                    'url': aurl,
-                    'width': int_or_none(a.get('width')),
-                    'height': int_or_none(a.get('height')),
-                })
-            else:
-                aext = a.get('ext')
-                is_m3u8 = a.get('container') == 'm3u8' or aext == 'm3u8'
-                formats.append({
-                    'format_id': atype,
-                    'url': aurl,
-                    'tbr': int_or_none(a.get('bitrate')),
-                    'vbr': int_or_none(a.get('opt_vbitrate')),
-                    'width': int_or_none(a.get('width')),
-                    'height': int_or_none(a.get('height')),
-                    'filesize': int_or_none(a.get('size')),
-                    'vcodec': a.get('codec'),
-                    'container': a.get('container'),
-                    'ext': 'mp4' if is_m3u8 else aext,
-                    'protocol': 'm3u8' if is_m3u8 else None,
-                    'preference': 1 if atype == 'original' else None,
-                })
-
-        self._sort_formats(formats)
-
-        return {
-            'id': video_id,
-            'title': title,
-            'description': data.get('seoDescription'),
-            'formats': formats,
-            'thumbnails': thumbnails,
-            'duration': float_or_none(data.get('duration')),
-            'timestamp': int_or_none(data.get('createdAt')),
-        }
diff --git a/youtube_dl/extractor/zype.py b/youtube_dl/extractor/zype.py
deleted file mode 100644
index 3b16e70..0000000
--- a/youtube_dl/extractor/zype.py
+++ /dev/null
@@ -1,57 +0,0 @@
-# coding: utf-8
-from __future__ import unicode_literals
-
-import re
-
-from .common import InfoExtractor
-
-
-class ZypeIE(InfoExtractor):
-    _VALID_URL = r'https?://player\.zype\.com/embed/(?P<id>[\da-fA-F]+)\.js\?.*?api_key=[^&]+'
-    _TEST = {
-        'url': 'https://player.zype.com/embed/5b400b834b32992a310622b9.js?api_key=jZ9GUhRmxcPvX7M3SlfejB6Hle9jyHTdk2jVxG7wOHPLODgncEKVdPYBhuz9iWXQ&autoplay=false&controls=true&da=false',
-        'md5': 'eaee31d474c76a955bdaba02a505c595',
-        'info_dict': {
-            'id': '5b400b834b32992a310622b9',
-            'ext': 'mp4',
-            'title': 'Smoky Barbecue Favorites',
-            'thumbnail': r're:^https?://.*\.jpe?g',
-        },
-    }
-
-    @staticmethod
-    def _extract_urls(webpage):
-        return [
-            mobj.group('url')
-            for mobj in re.finditer(
-                r'<script[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//player\.zype\.com/embed/[\da-fA-F]+\.js\?.*?api_key=.+?)\1',
-                webpage)]
-
-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-
-        webpage = self._download_webpage(url, video_id)
-
-        title = self._search_regex(
-            r'video_title\s*[:=]\s*(["\'])(?P<value>(?:(?!\1).)+)\1', webpage,
-            'title', group='value')
-
-        m3u8_url = self._search_regex(
-            r'(["\'])(?P<url>(?:(?!\1).)+\.m3u8(?:(?!\1).)*)\1', webpage,
-            'm3u8 url', group='url')
-
-        formats = self._extract_m3u8_formats(
-            m3u8_url, video_id, 'mp4', entry_protocol='m3u8_native',
-            m3u8_id='hls')
-        self._sort_formats(formats)
-
-        thumbnail = self._search_regex(
-            r'poster\s*[:=]\s*(["\'])(?P<url>(?:(?!\1).)+)\1', webpage, 'thumbnail',
-            default=False, group='url')
-
-        return {
-            'id': video_id,
-            'title': title,
-            'thumbnail': thumbnail,
-            'formats': formats,
-        }
diff --git a/youtube_dl/YoutubeDL.py b/youtube_dlc/YoutubeDL.py
old mode 100755
new mode 100644
similarity index 98%
rename from youtube_dl/YoutubeDL.py
rename to youtube_dlc/YoutubeDL.py
index f5cb463..f79d31d
--- a/youtube_dl/YoutubeDL.py
+++ b/youtube_dlc/YoutubeDL.py
@@ -92,6 +92,7 @@
      YoutubeDLCookieJar,
      YoutubeDLCookieProcessor,
      YoutubeDLHandler,
+    YoutubeDLRedirectHandler,
  )
  from .cache import Cache
  from .extractor import get_info_extractor, gen_extractor_classes, _LAZY_LOADER
@@ -227,7 +228,7 @@ class YoutubeDL(object):
                         playlist items.
      postprocessors:    A list of dictionaries, each with an entry
                         * key:  The name of the postprocessor. See
-                               youtube_dl/postprocessor/__init__.py for a list.
+                               youtube_dlc/postprocessor/__init__.py for a list.
                         as well as any further keyword arguments for the
                         postprocessor.
      progress_hooks:    A list of functions that get called on download
@@ -263,7 +264,7 @@ class YoutubeDL(object):
                                             about it, warn otherwise (default)
      source_address:    Client-side IP address to bind to.
      call_home:         Boolean, true iff we are allowed to contact the
-                       youtube-dl servers for debugging.
+                       youtube-dlc servers for debugging.
      sleep_interval:    Number of seconds to sleep before each download when
                         used alone or a lower bound of a range for randomized
                         sleep before each download (minimum possible number
@@ -300,7 +301,7 @@ class YoutubeDL(object):
                         use downloader suggested by extractor if None.
  
      The following parameters are not used by YoutubeDL itself, they are used by
-    the downloader (see youtube_dl/downloader/common.py):
+    the downloader (see youtube_dlc/downloader/common.py):
      nopart, updatetime, buffersize, ratelimit, min_filesize, max_filesize, test,
      noresizebuffer, retries, continuedl, noprogress, consoletitle,
      xattr_set_filesize, external_downloader_args, hls_use_mpegts,
@@ -440,7 +441,7 @@ def warn_if_short_id(self, argv):
              if re.match(r'^-[0-9A-Za-z_-]{10}$', a)]
          if idxs:
              correct_argv = (
-                ['youtube-dl']
+                ['youtube-dlc']
                  + [a for i, a in enumerate(argv) if i not in idxs]
                  + ['--'] + [argv[i] for i in idxs]
              )
@@ -990,7 +991,7 @@ def report_download(num_entries):
                      'playlist_title': ie_result.get('title'),
                      'playlist_uploader': ie_result.get('uploader'),
                      'playlist_uploader_id': ie_result.get('uploader_id'),
-                    'playlist_index': i + playliststart,
+                    'playlist_index': playlistitems[i - 1] if playlistitems else i + playliststart,
                      'extractor': ie_result['extractor'],
                      'webpage_url': ie_result['webpage_url'],
                      'webpage_url_basename': url_basename(ie_result['webpage_url']),
@@ -1804,6 +1805,14 @@ def ensure_dir_exists(path):
                      self.report_error('Cannot write annotations file: ' + annofn)
                      return
  
+        def dl(name, info):
+            fd = get_suitable_downloader(info, self.params)(self, self.params)
+            for ph in self._progress_hooks:
+                fd.add_progress_hook(ph)
+            if self.params.get('verbose'):
+                self.to_stdout('[debug] Invoking downloader on %r' % info.get('url'))
+            return fd.download(name, info)
+
          subtitles_are_requested = any([self.params.get('writesubtitles', False),
                                         self.params.get('writeautomaticsub')])
  
@@ -1811,14 +1820,12 @@ def ensure_dir_exists(path):
              # subtitles download errors are already managed as troubles in relevant IE
              # that way it will silently go on when used with unsupporting IE
              subtitles = info_dict['requested_subtitles']
-            ie = self.get_info_extractor(info_dict['extractor_key'])
              for sub_lang, sub_info in subtitles.items():
                  sub_format = sub_info['ext']
                  sub_filename = subtitles_filename(filename, sub_lang, sub_format, info_dict.get('ext'))
                  if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(sub_filename)):
                      self.to_screen('[info] Video subtitle %s.%s is already present' % (sub_lang, sub_format))
                  else:
-                    self.to_screen('[info] Writing video subtitles to: ' + sub_filename)
                      if sub_info.get('data') is not None:
                          try:
                              # Use newline='' to prevent conversion of newline characters
@@ -1830,11 +1837,11 @@ def ensure_dir_exists(path):
                              return
                      else:
                          try:
-                            sub_data = ie._request_webpage(
-                                sub_info['url'], info_dict['id'], note=False).read()
-                            with io.open(encodeFilename(sub_filename), 'wb') as subfile:
-                                subfile.write(sub_data)
-                        except (ExtractorError, IOError, OSError, ValueError) as err:
+                            dl(sub_filename, sub_info)
+                        except (ExtractorError, IOError, OSError, ValueError,
+                                compat_urllib_error.URLError,
+                                compat_http_client.HTTPException,
+                                socket.error) as err:
                              self.report_warning('Unable to download subtitle for "%s": %s' %
                                                  (sub_lang, error_to_compat_str(err)))
                              continue
@@ -1855,14 +1862,6 @@ def ensure_dir_exists(path):
  
          if not self.params.get('skip_download', False):
              try:
-                def dl(name, info):
-                    fd = get_suitable_downloader(info, self.params)(self, self.params)
-                    for ph in self._progress_hooks:
-                        fd.add_progress_hook(ph)
-                    if self.params.get('verbose'):
-                        self.to_stdout('[debug] Invoking downloader on %r' % info.get('url'))
-                    return fd.download(name, info)
-
                  if info_dict.get('requested_formats') is not None:
                      downloaded = []
                      success = True
@@ -2255,7 +2254,7 @@ def print_debug_header(self):
                  self.get_encoding()))
          write_string(encoding_str, encoding=None)
  
-        self._write_string('[debug] youtube-dl version ' + __version__ + '\n')
+        self._write_string('[debug] youtube-dlc version ' + __version__ + '\n')
          if _LAZY_LOADER:
              self._write_string('[debug] Lazy loading extractors enabled' + '\n')
          try:
@@ -2343,6 +2342,7 @@ def _setup_opener(self):
          debuglevel = 1 if self.params.get('debug_printtraffic') else 0
          https_handler = make_HTTPS_handler(self.params, debuglevel=debuglevel)
          ydlh = YoutubeDLHandler(self.params, debuglevel=debuglevel)
+        redirect_handler = YoutubeDLRedirectHandler()
          data_handler = compat_urllib_request_DataHandler()
  
          # When passing our own FileHandler instance, build_opener won't add the
@@ -2352,11 +2352,11 @@ def _setup_opener(self):
          file_handler = compat_urllib_request.FileHandler()
  
          def file_open(*args, **kwargs):
-            raise compat_urllib_error.URLError('file:// scheme is explicitly disabled in youtube-dl for security reasons')
+            raise compat_urllib_error.URLError('file:// scheme is explicitly disabled in youtube-dlc for security reasons')
          file_handler.file_open = file_open
  
          opener = compat_urllib_request.build_opener(
-            proxy_handler, https_handler, cookie_processor, ydlh, data_handler, file_handler)
+            proxy_handler, https_handler, cookie_processor, ydlh, redirect_handler, data_handler, file_handler)
  
          # Delete the default user-agent header, which would otherwise apply in
          # cases where our custom HTTP handler doesn't come into play
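Note on the `playlist_index` hunk above: the new expression reports the item number the user asked for when `playlistitems` is given, and falls back to the running counter otherwise. A minimal standalone sketch of that mapping follows. The function name `playlist_indices` is hypothetical, and it assumes (not shown in this diff) that the loop counter `i` is 1-based and that `playliststart` is a zero-based offset.

```python
def playlist_indices(num_entries, playliststart=0, playlistitems=None):
    """Return the playlist_index each downloaded entry would be tagged with.

    Assumptions (not confirmed by the diff): the loop counter is 1-based,
    and playliststart is a zero-based offset into the playlist.
    """
    indices = []
    for i in range(1, num_entries + 1):
        # Same expression as the added line in the hunk.
        indices.append(playlistitems[i - 1] if playlistitems else i + playliststart)
    return indices
```

With `playlistitems=[2, 5, 9]` the three downloaded entries keep the indices 2, 5 and 9 instead of being renumbered 1, 2 and 3.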
diff --git a/youtube_dl/__init__.py b/youtube_dlc/__init__.py

similarity index 99%

rename from youtube_dl/__init__.py

rename to youtube_dlc/__init__.py

index 9a659fc654d2a3af5d63e47363f8cdfdcdc0c333..a663417dab292caf7c6e4a72599a6419391a108c 100644 (file)
--- a/youtube_dl/__init__.py
+++ b/youtube_dlc/__init__.py
@@ -53,7 +53,7 @@ def _real_main(argv=None):
  
      workaround_optparse_bug9161()
  
-    setproctitle('youtube-dl')
+    setproctitle('youtube-dlc')
  
      parser, opts, args = parseOpts(argv)
  
@@ -455,7 +455,7 @@ def parse_retries(retries):
              ydl.warn_if_short_id(sys.argv[1:] if argv is None else argv)
              parser.error(
                  'You must provide at least one URL.\n'
-                'Type youtube-dl --help to see a list of all options.')
+                'Type youtube-dlc --help to see a list of all options.')
  
          try:
              if opts.load_info_filename is not None:
diff --git a/youtube_dl/__main__.py b/youtube_dlc/__main__.py

old mode 100755 (executable)

new mode 100644 (file)

similarity index 73%

rename from youtube_dl/__main__.py

rename to youtube_dlc/__main__.py

index 138f5fb..0e76016
--- a/youtube_dl/__main__.py
+++ b/youtube_dlc/__main__.py
@@ -2,8 +2,8 @@
  from __future__ import unicode_literals
  
  # Execute with
-# $ python youtube_dl/__main__.py (2.6+)
-# $ python -m youtube_dl          (2.7+)
+# $ python youtube_dlc/__main__.py (2.6+)
+# $ python -m youtube_dlc          (2.7+)
  
  import sys
  
@@ -13,7 +13,7 @@
      path = os.path.realpath(os.path.abspath(__file__))
      sys.path.insert(0, os.path.dirname(os.path.dirname(path)))
  
-import youtube_dl
+import youtube_dlc
  
  if __name__ == '__main__':
-    youtube_dl.main()
+    youtube_dlc.main()
diff --git a/youtube_dl/aes.py b/youtube_dlc/aes.py

similarity index 100%

rename from youtube_dl/aes.py

rename to youtube_dlc/aes.py
diff --git a/youtube_dl/cache.py b/youtube_dlc/cache.py

similarity index 98%

rename from youtube_dl/cache.py

rename to youtube_dlc/cache.py

index 7bdade1bdb49a7406457688400830a91a98ef186..ada6aa1f28ba39f4e6f22df92e4327d81d7cd0f7 100644 (file)
--- a/youtube_dl/cache.py
+++ b/youtube_dlc/cache.py
@@ -23,7 +23,7 @@ def _get_root_dir(self):
          res = self._ydl.params.get('cachedir')
          if res is None:
              cache_root = compat_getenv('XDG_CACHE_HOME', '~/.cache')
-            res = os.path.join(cache_root, 'youtube-dl')
+            res = os.path.join(cache_root, 'youtube-dlc')
          return expand_path(res)
  
      def _get_cache_fn(self, section, key, dtype):
diff --git a/youtube_dl/compat.py b/youtube_dlc/compat.py

similarity index 98%

rename from youtube_dl/compat.py

rename to youtube_dlc/compat.py

index c75ab131b9955cec1367ec42aa41d8dadde423da..1cf7efed615af3a4ec5bcdfe084ff0bf197b8b3c 100644 (file)
--- a/youtube_dl/compat.py
+++ b/youtube_dlc/compat.py
@@ -57,6 +57,17 @@
  except ImportError:  # Python 2
      import cookielib as compat_cookiejar
  
+if sys.version_info[0] == 2:
+    class compat_cookiejar_Cookie(compat_cookiejar.Cookie):
+        def __init__(self, version, name, value, *args, **kwargs):
+            if isinstance(name, compat_str):
+                name = name.encode()
+            if isinstance(value, compat_str):
+                value = value.encode()
+            compat_cookiejar.Cookie.__init__(self, version, name, value, *args, **kwargs)
+else:
+    compat_cookiejar_Cookie = compat_cookiejar.Cookie
+
  try:
      import http.cookies as compat_cookies
  except ImportError:  # Python 2
@@ -2754,6 +2765,17 @@ def compat_expanduser(path):
          compat_expanduser = os.path.expanduser
  
  
+if compat_os_name == 'nt' and sys.version_info < (3, 8):
+    # os.path.realpath on Windows does not follow symbolic links
+    # prior to Python 3.8 (see https://bugs.python.org/issue9949)
+    def compat_realpath(path):
+        while os.path.islink(path):
+            path = os.path.abspath(os.readlink(path))
+        return path
+else:
+    compat_realpath = os.path.realpath
+
+
  if sys.version_info < (3, 0):
      def compat_print(s):
          from .utils import preferredencoding
@@ -2951,7 +2973,7 @@ def compat_b64decode(s, *args, **kwargs):
  
  if platform.python_implementation() == 'PyPy' and sys.pypy_version_info < (5, 4, 0):
      # PyPy2 prior to version 5.4.0 expects byte strings as Windows function
-    # names, see the original PyPy issue [1] and the youtube-dl one [2].
+    # names, see the original PyPy issue [1] and the youtube-dlc one [2].
      # 1. https://bitbucket.org/pypy/pypy/issues/2360/windows-ctypescdll-typeerror-function-name
      # 2. https://github.com/ytdl-org/youtube-dl/pull/4392
      def compat_ctypes_WINFUNCTYPE(*args, **kwargs):
@@ -2976,6 +2998,7 @@ def compat_ctypes_WINFUNCTYPE(*args, **kwargs):
      'compat_basestring',
      'compat_chr',
      'compat_cookiejar',
+    'compat_cookiejar_Cookie',
      'compat_cookies',
      'compat_ctypes_WINFUNCTYPE',
      'compat_etree_Element',
@@ -2998,6 +3021,7 @@ def compat_ctypes_WINFUNCTYPE(*args, **kwargs):
      'compat_os_name',
      'compat_parse_qs',
      'compat_print',
+    'compat_realpath',
      'compat_setenv',
      'compat_shlex_quote',
      'compat_shlex_split',
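Note on `compat_realpath` above: the fallback follows a chain of symbolic links one hop at a time. The sketch below shows the same idea as a standalone function. It is not the code from this commit: the name `resolve_symlink_chain` and the `max_hops` limit are hypothetical, and, unlike the hunk, it resolves a relative link target against the directory that contains the link rather than against the current working directory.

```python
import os


def resolve_symlink_chain(path, max_hops=40):
    """Follow symbolic links hop by hop and return the final target path.

    Only the last path component is resolved at each hop, which mirrors
    the loop in the diff; intermediate directories are not resolved.
    """
    path = os.path.abspath(path)
    for _ in range(max_hops):
        if not os.path.islink(path):
            return path
        target = os.readlink(path)
        # os.path.join discards the first argument when target is absolute.
        path = os.path.normpath(os.path.join(os.path.dirname(path), target))
    raise OSError('too many levels of symbolic links: %r' % path)
```

The hop limit is there only to stop on cyclic links; the value 40 is an arbitrary choice for this sketch.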
diff --git a/youtube_dl/downloader/__init__.py b/youtube_dlc/downloader/__init__.py

similarity index 93%

rename from youtube_dl/downloader/__init__.py

rename to youtube_dlc/downloader/__init__.py

index 2e485df9dac09e197af6183ded3570e57805fad2..4ae81f516e63958c30c798c917e8c5df44aa867f 100644 (file)
--- a/youtube_dl/downloader/__init__.py
+++ b/youtube_dlc/downloader/__init__.py
@@ -8,6 +8,7 @@
  from .dash import DashSegmentsFD
  from .rtsp import RtspFD
  from .ism import IsmFD
+from .youtube_live_chat import YoutubeLiveChatReplayFD
  from .external import (
      get_external_downloader,
      FFmpegFD,
@@ -26,6 +27,7 @@
      'f4m': F4mFD,
      'http_dash_segments': DashSegmentsFD,
      'ism': IsmFD,
+    'youtube_live_chat_replay': YoutubeLiveChatReplayFD,
  }
  
  
diff --git a/youtube_dl/downloader/common.py b/youtube_dlc/downloader/common.py

similarity index 99%

rename from youtube_dl/downloader/common.py

rename to youtube_dlc/downloader/common.py

index 1cdba89cd9b093c1cc45071d34e307028aac02ea..31c2864584b917d70cbbc903823a100d21a114dd 100644 (file)
--- a/youtube_dl/downloader/common.py
+++ b/youtube_dlc/downloader/common.py
@@ -243,7 +243,7 @@ def _report_progress_status(self, msg, is_last_line=False):
              else:
                  clear_line = ('\r\x1b[K' if sys.stderr.isatty() else '\r')
              self.to_screen(clear_line + fullmsg, skip_eol=not is_last_line)
-        self.to_console_title('youtube-dl ' + msg)
+        self.to_console_title('youtube-dlc ' + msg)
  
      def report_progress(self, s):
          if s['status'] == 'finished':
diff --git a/youtube_dl/downloader/dash.py b/youtube_dlc/downloader/dash.py

similarity index 100%

rename from youtube_dl/downloader/dash.py

rename to youtube_dlc/downloader/dash.py
diff --git a/youtube_dl/downloader/external.py b/youtube_dlc/downloader/external.py

similarity index 100%

rename from youtube_dl/downloader/external.py

rename to youtube_dlc/downloader/external.py
diff --git a/youtube_dl/downloader/f4m.py b/youtube_dlc/downloader/f4m.py

similarity index 100%

rename from youtube_dl/downloader/f4m.py

rename to youtube_dlc/downloader/f4m.py
diff --git a/youtube_dl/downloader/fragment.py b/youtube_dlc/downloader/fragment.py

similarity index 98%

rename from youtube_dl/downloader/fragment.py

rename to youtube_dlc/downloader/fragment.py

index 02f35459e82ddb0e39dad3b7790f2db7a90d85fa..9339b3a62c4bd4d360bf719f0e1e3e16ebb0f840 100644 (file)
--- a/youtube_dl/downloader/fragment.py
+++ b/youtube_dlc/downloader/fragment.py
@@ -32,9 +32,9 @@ class FragmentFD(FileDownloader):
      keep_fragments:     Keep downloaded fragments on disk after downloading is
                          finished
  
-    For each incomplete fragment download youtube-dl keeps on disk a special
+    For each incomplete fragment download youtube-dlc keeps on disk a special
      bookkeeping file with download state and metadata (in future such files will
-    be used for any incomplete download handled by youtube-dl). This file is
+    be used for any incomplete download handled by youtube-dlc). This file is
      used to properly handle resuming, check download file consistency and detect
      potential errors. The file has a .ytdl extension and represents a standard
      JSON file of the following format:
diff --git a/youtube_dl/downloader/hls.py b/youtube_dlc/downloader/hls.py

similarity index 100%

rename from youtube_dl/downloader/hls.py

rename to youtube_dlc/downloader/hls.py
diff --git a/youtube_dl/downloader/http.py b/youtube_dlc/downloader/http.py

similarity index 98%

rename from youtube_dl/downloader/http.py

rename to youtube_dlc/downloader/http.py

index 3c72ea18b2304befd5221960503ff5b6141304c3..5046878dfcd874013e737e85d32764a95737406e 100644 (file)
--- a/youtube_dl/downloader/http.py
+++ b/youtube_dlc/downloader/http.py
@@ -227,7 +227,7 @@ def retry(e):
              while True:
                  try:
                      # Download and write
-                    data_block = ctx.data.read(block_size if not is_test else min(block_size, data_len - byte_counter))
+                    data_block = ctx.data.read(block_size if data_len is None else min(block_size, data_len - byte_counter))
                  # socket.timeout is a subclass of socket.error but may not have
                  # errno set
                  except socket.timeout as e:
@@ -299,7 +299,7 @@ def retry(e):
                      'elapsed': now - ctx.start_time,
                  })
  
-                if is_test and byte_counter == data_len:
+                if data_len is not None and byte_counter == data_len:
                      break
  
              if not is_test and ctx.chunk_size and ctx.data_len is not None and byte_counter < ctx.data_len:
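Note on the two `http.py` hunks above: the read size is now capped by the remaining byte count whenever `data_len` is known, not only in test mode, and the loop stops once exactly `data_len` bytes have arrived. A minimal sketch of that loop in isolation, assuming a file-like `stream`; the helper name `read_in_blocks` is hypothetical and the retry and progress handling of the real downloader is left out.

```python
import io


def read_in_blocks(stream, block_size, data_len=None):
    """Read from stream in blocks, never past data_len when it is known."""
    byte_counter = 0
    chunks = []
    while True:
        # Same size calculation as the changed line in the first hunk.
        want = block_size if data_len is None else min(block_size, data_len - byte_counter)
        block = stream.read(want)
        if not block:
            break  # end of stream
        byte_counter += len(block)
        chunks.append(block)
        # Same stop condition as the changed line in the second hunk.
        if data_len is not None and byte_counter == data_len:
            break
    return b''.join(chunks)
```

For example, `read_in_blocks(io.BytesIO(b'abcdefghij'), 4, data_len=7)` reads 4 bytes and then 3 bytes and stops without touching the rest of the stream.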
diff --git a/youtube_dl/downloader/ism.py b/youtube_dlc/downloader/ism.py

similarity index 100%

rename from youtube_dl/downloader/ism.py

rename to youtube_dlc/downloader/ism.py
diff --git a/youtube_dl/downloader/rtmp.py b/youtube_dlc/downloader/rtmp.py

similarity index 100%

rename from youtube_dl/downloader/rtmp.py

rename to youtube_dlc/downloader/rtmp.py
diff --git a/youtube_dl/downloader/rtsp.py b/youtube_dlc/downloader/rtsp.py

similarity index 100%

rename from youtube_dl/downloader/rtsp.py

rename to youtube_dlc/downloader/rtsp.py
diff --git a/youtube_dlc/downloader/youtube_live_chat.py b/youtube_dlc/downloader/youtube_live_chat.py

new file mode 100644 (file)

index 0000000..4932dd9
--- /dev/null
+++ b/youtube_dlc/downloader/youtube_live_chat.py
@@ -0,0 +1,94 @@
+from __future__ import division, unicode_literals
+
+import re
+import json
+
+from .fragment import FragmentFD
+
+
+class YoutubeLiveChatReplayFD(FragmentFD):
+    """ Downloads YouTube live chat replays fragment by fragment """
+
+    FD_NAME = 'youtube_live_chat_replay'
+
+    def real_download(self, filename, info_dict):
+        video_id = info_dict['video_id']
+        self.to_screen('[%s] Downloading live chat' % self.FD_NAME)
+
+        test = self.params.get('test', False)
+
+        ctx = {
+            'filename': filename,
+            'live': True,
+            'total_frags': None,
+        }
+
+        def dl_fragment(url):
+            headers = info_dict.get('http_headers', {})
+            return self._download_fragment(ctx, url, info_dict, headers)
+
+        def parse_yt_initial_data(data):
+            window_patt = b'window\\["ytInitialData"\\]\\s*=\\s*(.*?)(?<=});'
+            var_patt = b'var\\s+ytInitialData\\s*=\\s*(.*?)(?<=});'
+            for patt in window_patt, var_patt:
+                try:
+                    raw_json = re.search(patt, data).group(1)
+                    return json.loads(raw_json)
+                except AttributeError:
+                    continue
+
+        self._prepare_and_start_frag_download(ctx)
+
+        success, raw_fragment = dl_fragment(
+            'https://www.youtube.com/watch?v={}'.format(video_id))
+        if not success:
+            return False
+        data = parse_yt_initial_data(raw_fragment)
+        continuation_id = data['contents']['twoColumnWatchNextResults']['conversationBar']['liveChatRenderer']['continuations'][0]['reloadContinuationData']['continuation']
+        # no data yet but required to call _append_fragment
+        self._append_fragment(ctx, b'')
+
+        first = True
+        offset = None
+        while continuation_id is not None:
+            data = None
+            if first:
+                url = 'https://www.youtube.com/live_chat_replay?continuation={}'.format(continuation_id)
+                success, raw_fragment = dl_fragment(url)
+                if not success:
+                    return False
+                data = parse_yt_initial_data(raw_fragment)
+            else:
+                url = ('https://www.youtube.com/live_chat_replay/get_live_chat_replay'
+                       + '?continuation={}'.format(continuation_id)
+                       + '&playerOffsetMs={}'.format(offset - 5000)
+                       + '&hidden=false'
+                       + '&pbj=1')
+                success, raw_fragment = dl_fragment(url)
+                if not success:
+                    return False
+                data = json.loads(raw_fragment)['response']
+
+            first = False
+            continuation_id = None
+
+            live_chat_continuation = data['continuationContents']['liveChatContinuation']
+            offset = None
+            processed_fragment = bytearray()
+            if 'actions' in live_chat_continuation:
+                for action in live_chat_continuation['actions']:
+                    if 'replayChatItemAction' in action:
+                        replay_chat_item_action = action['replayChatItemAction']
+                        offset = int(replay_chat_item_action['videoOffsetTimeMsec'])
+                    processed_fragment.extend(
+                        json.dumps(action, ensure_ascii=False).encode('utf-8') + b'\n')
+                continuation_id = live_chat_continuation['continuations'][0]['liveChatReplayContinuationData']['continuation']
+
+            self._append_fragment(ctx, processed_fragment)
+
+            if test or offset is None:
+                break
+
+        self._finish_frag_download(ctx)
+
+        return True
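Note on `parse_yt_initial_data` in the new downloader above: it tries two patterns in turn and decodes the first JSON object it finds. The sketch below restates that as a standalone Python 3 function. The sample page in the usage note is made up for illustration, and the explicit `return None` only spells out what the original does implicitly when neither pattern matches.

```python
import json
import re

# The two patterns from the diff, written as raw bytes literals (Python 3).
_YT_INITIAL_DATA_PATTERNS = (
    rb'window\["ytInitialData"\]\s*=\s*(.*?)(?<=});',
    rb'var\s+ytInitialData\s*=\s*(.*?)(?<=});',
)


def parse_yt_initial_data(data):
    """Return the decoded ytInitialData object from page bytes, or None."""
    for patt in _YT_INITIAL_DATA_PATTERNS:
        match = re.search(patt, data)
        if match:
            # The lazy group ends at the first ';' that directly follows '}'.
            return json.loads(match.group(1).decode('utf-8'))
    return None
```

Calling it on `b'<script>var ytInitialData = {"a": 1};</script>'` yields the dictionary `{'a': 1}`. A caller that indexes the result directly, as `real_download` does, may want to check for `None` first.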
diff --git a/youtube_dl/extractor/__init__.py b/youtube_dlc/extractor/__init__.py

similarity index 100%

rename from youtube_dl/extractor/__init__.py

rename to youtube_dlc/extractor/__init__.py
diff --git a/youtube_dl/extractor/abc.py b/youtube_dlc/extractor/abc.py

similarity index 56%

rename from youtube_dl/extractor/abc.py

rename to youtube_dlc/extractor/abc.py

index 4ac323bf6de6d17016c2425c133aad460072cadd..3e202168ed39947baa38ad710622f12b8bed71e3 100644 (file)
--- a/youtube_dl/extractor/abc.py
+++ b/youtube_dlc/extractor/abc.py
@@ -12,6 +12,7 @@
      js_to_json,
      int_or_none,
      parse_iso8601,
+    str_or_none,
      try_get,
      unescapeHTML,
      update_url_query,
@@ -20,7 +21,7 @@
  
  class ABCIE(InfoExtractor):
      IE_NAME = 'abc.net.au'
-    _VALID_URL = r'https?://(?:www\.)?abc\.net\.au/news/(?:[^/]+/){1,2}(?P<id>\d+)'
+    _VALID_URL = r'https?://(?:www\.)?abc\.net\.au/(?:news|btn)/(?:[^/]+/){1,4}(?P<id>\d{5,})'
  
      _TESTS = [{
          'url': 'http://www.abc.net.au/news/2014-11-05/australia-to-staff-ebola-treatment-centre-in-sierra-leone/5868334',
@@ -34,7 +35,7 @@ class ABCIE(InfoExtractor):
          'skip': 'this video has expired',
      }, {
          'url': 'http://www.abc.net.au/news/2015-08-17/warren-entsch-introduces-same-sex-marriage-bill/6702326',
-        'md5': 'db2a5369238b51f9811ad815b69dc086',
+        'md5': '4ebd61bdc82d9a8b722f64f1f4b4d121',
          'info_dict': {
              'id': 'NvqvPeNZsHU',
              'ext': 'mp4',
@@ -58,39 +59,102 @@ class ABCIE(InfoExtractor):
      }, {
          'url': 'http://www.abc.net.au/news/2015-10-19/6866214',
          'only_matching': True,
+    }, {
+        'url': 'https://www.abc.net.au/btn/classroom/wwi-centenary/10527914',
+        'info_dict': {
+            'id': '10527914',
+            'ext': 'mp4',
+            'title': 'WWI Centenary',
+            'description': 'md5:c2379ec0ca84072e86b446e536954546',
+        }
+    }, {
+        'url': 'https://www.abc.net.au/news/programs/the-world/2020-06-10/black-lives-matter-protests-spawn-support-for/12342074',
+        'info_dict': {
+            'id': '12342074',
+            'ext': 'mp4',
+            'title': 'Black Lives Matter protests spawn support for Papuans in Indonesia',
+            'description': 'md5:2961a17dc53abc558589ccd0fb8edd6f',
+        }
+    }, {
+        'url': 'https://www.abc.net.au/btn/newsbreak/btn-newsbreak-20200814/12560476',
+        'info_dict': {
+            'id': 'tDL8Ld4dK_8',
+            'ext': 'mp4',
+            'title': 'Fortnite Banned From Apple and Google App Stores',
+            'description': 'md5:a6df3f36ce8f816b74af4bd6462f5651',
+            'upload_date': '20200813',
+            'uploader': 'Behind the News',
+            'uploader_id': 'behindthenews',
+        }
      }]
  
      def _real_extract(self, url):
          video_id = self._match_id(url)
          webpage = self._download_webpage(url, video_id)
  
-        mobj = re.search(
-            r'inline(?P<type>Video|Audio|YouTube)Data\.push\((?P<json_data>[^)]+)\);',
-            webpage)
+        mobj = re.search(r'<a\s+href="(?P<url>[^"]+)"\s+data-duration="\d+"\s+title="Download audio directly">', webpage)
+        if mobj:
+            urls_info = mobj.groupdict()
+            youtube = False
+            video = False
+        else:
+            mobj = re.search(r'<a href="(?P<url>http://www\.youtube\.com/watch\?v=[^"]+)"><span><strong>External Link:</strong>',
+                             webpage)
+            if mobj is None:
+                mobj = re.search(r'<iframe width="100%" src="(?P<url>//www\.youtube-nocookie\.com/embed/[^?"]+)', webpage)
+            if mobj:
+                urls_info = mobj.groupdict()
+                youtube = True
+                video = True
+
          if mobj is None:
-            expired = self._html_search_regex(r'(?s)class="expired-(?:video|audio)".+?<span>(.+?)</span>', webpage, 'expired', None)
-            if expired:
-                raise ExtractorError('%s said: %s' % (self.IE_NAME, expired), expected=True)
-            raise ExtractorError('Unable to extract video urls')
+            mobj = re.search(r'(?P<type>)"sources": (?P<json_data>\[[^\]]+\]),', webpage)
+            if mobj is None:
+                mobj = re.search(
+                    r'inline(?P<type>Video|Audio|YouTube)Data\.push\((?P<json_data>[^)]+)\);',
+                    webpage)
+                if mobj is None:
+                    expired = self._html_search_regex(r'(?s)class="expired-(?:video|audio)".+?<span>(.+?)</span>', webpage, 'expired', None)
+                    if expired:
+                        raise ExtractorError('%s said: %s' % (self.IE_NAME, expired), expected=True)
+                    raise ExtractorError('Unable to extract video urls')
  
-        urls_info = self._parse_json(
-            mobj.group('json_data'), video_id, transform_source=js_to_json)
+            urls_info = self._parse_json(
+                mobj.group('json_data'), video_id, transform_source=js_to_json)
+            youtube = mobj.group('type') == 'YouTube'
+            video = mobj.group('type') == 'Video' or urls_info[0]['contentType'] == 'video/mp4'
  
          if not isinstance(urls_info, list):
              urls_info = [urls_info]
  
-        if mobj.group('type') == 'YouTube':
+        if youtube:
              return self.playlist_result([
                  self.url_result(url_info['url']) for url_info in urls_info])
  
-        formats = [{
-            'url': url_info['url'],
-            'vcodec': url_info.get('codec') if mobj.group('type') == 'Video' else 'none',
-            'width': int_or_none(url_info.get('width')),
-            'height': int_or_none(url_info.get('height')),
-            'tbr': int_or_none(url_info.get('bitrate')),
-            'filesize': int_or_none(url_info.get('filesize')),
-        } for url_info in urls_info]
+        formats = []
+        for url_info in urls_info:
+            height = int_or_none(url_info.get('height'))
+            bitrate = int_or_none(url_info.get('bitrate'))
+            width = int_or_none(url_info.get('width'))
+            format_id = None
+            mobj = re.search(r'_(?:(?P<height>\d+)|(?P<bitrate>\d+)k)\.mp4$', url_info['url'])
+            if mobj:
+                height_from_url = mobj.group('height')
+                if height_from_url:
+                    height = height or int_or_none(height_from_url)
+                    width = width or int_or_none(url_info.get('label'))
+                else:
+                    bitrate = bitrate or int_or_none(mobj.group('bitrate'))
+                    format_id = str_or_none(url_info.get('label'))
+            formats.append({
+                'url': url_info['url'],
+                'vcodec': url_info.get('codec') if video else 'none',
+                'width': width,
+                'height': height,
+                'tbr': bitrate,
+                'filesize': int_or_none(url_info.get('filesize')),
+                'format_id': format_id
+            })
  
          self._sort_formats(formats)
  
@@ -110,17 +174,17 @@ class ABCIViewIE(InfoExtractor):
  
      # ABC iview programs are normally available for 14 days only.
      _TESTS = [{
-        'url': 'https://iview.abc.net.au/show/ben-and-hollys-little-kingdom/series/0/video/ZX9371A050S00',
-        'md5': 'cde42d728b3b7c2b32b1b94b4a548afc',
+        'url': 'https://iview.abc.net.au/show/gruen/series/11/video/LE1927H001S00',
+        'md5': '67715ce3c78426b11ba167d875ac6abf',
          'info_dict': {
-            'id': 'ZX9371A050S00',
+            'id': 'LE1927H001S00',
              'ext': 'mp4',
-            'title': "Gaston's Birthday",
-            'series': "Ben And Holly's Little Kingdom",
-            'description': 'md5:f9de914d02f226968f598ac76f105bcf',
-            'upload_date': '20180604',
-            'uploader_id': 'abc4kids',
-            'timestamp': 1528140219,
+            'title': "Series 11 Ep 1",
+            'series': "Gruen",
+            'description': 'md5:52cc744ad35045baf6aded2ce7287f67',
+            'upload_date': '20190925',
+            'uploader_id': 'abc1',
+            'timestamp': 1569445289,
          },
          'params': {
              'skip_download': True,
@@ -148,7 +212,7 @@ def tokenize_url(url, token):
                  'hdnea': token,
              })
  
-        for sd in ('sd', 'sd-low'):
+        for sd in ('720', 'sd', 'sd-low'):
              sd_url = try_get(
                  stream, lambda x: x['streams']['hls'][sd], compat_str)
              if not sd_url:
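Note on the format loop added to `ABCIE._real_extract` above: when the JSON lacks a height or bitrate, the value is recovered from the file name suffix of the media URL. A minimal sketch of just that regular expression, using the pattern from the diff; the helper name `hints_from_url` and the example.com URLs in the usage note are hypothetical.

```python
import re

# Pattern from the diff: "_<height>.mp4" or "_<bitrate>k.mp4" at the end of the URL.
_SUFFIX_RE = re.compile(r'_(?:(?P<height>\d+)|(?P<bitrate>\d+)k)\.mp4$')


def hints_from_url(url):
    """Return {'height': n} or {'tbr': n} parsed from the URL suffix, else {}."""
    mobj = _SUFFIX_RE.search(url)
    if not mobj:
        return {}
    if mobj.group('height'):
        return {'height': int(mobj.group('height'))}
    return {'tbr': int(mobj.group('bitrate'))}
```

A URL ending in `_720.mp4` gives a height of 720, one ending in `_1000k.mp4` gives a bitrate of 1000, and any other suffix gives an empty dictionary.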
diff --git a/youtube_dl/extractor/abcnews.py b/youtube_dlc/extractor/abcnews.py

similarity index 100%

rename from youtube_dl/extractor/abcnews.py

rename to youtube_dlc/extractor/abcnews.py
diff --git a/youtube_dl/extractor/abcotvs.py b/youtube_dlc/extractor/abcotvs.py

similarity index 100%

rename from youtube_dl/extractor/abcotvs.py

rename to youtube_dlc/extractor/abcotvs.py
diff --git a/youtube_dl/extractor/academicearth.py b/youtube_dlc/extractor/academicearth.py

similarity index 100%

rename from youtube_dl/extractor/academicearth.py

rename to youtube_dlc/extractor/academicearth.py
diff --git a/youtube_dl/extractor/acast.py b/youtube_dlc/extractor/acast.py

similarity index 100%

rename from youtube_dl/extractor/acast.py

rename to youtube_dlc/extractor/acast.py
diff --git a/youtube_dl/extractor/adn.py b/youtube_dlc/extractor/adn.py

similarity index 100%

rename from youtube_dl/extractor/adn.py

rename to youtube_dlc/extractor/adn.py
diff --git a/youtube_dl/extractor/adobeconnect.py b/youtube_dlc/extractor/adobeconnect.py

similarity index 100%

rename from youtube_dl/extractor/adobeconnect.py

rename to youtube_dlc/extractor/adobeconnect.py
diff --git a/youtube_dl/extractor/adobepass.py b/youtube_dlc/extractor/adobepass.py

similarity index 100%

rename from youtube_dl/extractor/adobepass.py

rename to youtube_dlc/extractor/adobepass.py
diff --git a/youtube_dl/extractor/adobetv.py b/youtube_dlc/extractor/adobetv.py

similarity index 100%

rename from youtube_dl/extractor/adobetv.py

rename to youtube_dlc/extractor/adobetv.py
diff --git a/youtube_dl/extractor/adultswim.py b/youtube_dlc/extractor/adultswim.py

similarity index 100%

rename from youtube_dl/extractor/adultswim.py

rename to youtube_dlc/extractor/adultswim.py
diff --git a/youtube_dl/extractor/aenetworks.py b/youtube_dlc/extractor/aenetworks.py

similarity index 100%

rename from youtube_dl/extractor/aenetworks.py

rename to youtube_dlc/extractor/aenetworks.py
diff --git a/youtube_dl/extractor/afreecatv.py b/youtube_dlc/extractor/afreecatv.py

similarity index 100%

rename from youtube_dl/extractor/afreecatv.py

rename to youtube_dlc/extractor/afreecatv.py
diff --git a/youtube_dl/extractor/airmozilla.py b/youtube_dlc/extractor/airmozilla.py

similarity index 100%

rename from youtube_dl/extractor/airmozilla.py

rename to youtube_dlc/extractor/airmozilla.py
diff --git a/youtube_dl/extractor/aliexpress.py b/youtube_dlc/extractor/aliexpress.py

similarity index 100%

rename from youtube_dl/extractor/aliexpress.py

rename to youtube_dlc/extractor/aliexpress.py
diff --git a/youtube_dl/extractor/aljazeera.py b/youtube_dlc/extractor/aljazeera.py

similarity index 100%

rename from youtube_dl/extractor/aljazeera.py

rename to youtube_dlc/extractor/aljazeera.py
diff --git a/youtube_dl/extractor/allocine.py b/youtube_dlc/extractor/allocine.py

similarity index 100%

rename from youtube_dl/extractor/allocine.py

rename to youtube_dlc/extractor/allocine.py
diff --git a/youtube_dl/extractor/alphaporno.py b/youtube_dlc/extractor/alphaporno.py

similarity index 100%

rename from youtube_dl/extractor/alphaporno.py

rename to youtube_dlc/extractor/alphaporno.py
diff --git a/youtube_dlc/extractor/alura.py b/youtube_dlc/extractor/alura.py

new file mode 100644

index 0000000..36b4d95
--- /dev/null
+++ b/youtube_dlc/extractor/alura.py
@@ -0,0 +1,180 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+
+from ..compat import (
+    compat_urlparse,
+)
+
+from ..utils import (
+    urlencode_postdata,
+    urljoin,
+    int_or_none,
+    clean_html,
+    ExtractorError
+)
+
+
+class AluraIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:cursos\.)?alura\.com\.br/course/(?P<course_name>[^/]+)/task/(?P<id>\d+)'
+    _LOGIN_URL = 'https://cursos.alura.com.br/loginForm?urlAfterLogin=/loginForm'
+    _VIDEO_URL = 'https://cursos.alura.com.br/course/%s/task/%s/video'
+    _NETRC_MACHINE = 'alura'
+    _TESTS = [{
+        'url': 'https://cursos.alura.com.br/course/clojure-mutabilidade-com-atoms-e-refs/task/60095',
+        'info_dict': {
+            'id': '60095',
+            'ext': 'mp4',
+            'title': 'Referências, ref-set e alter'
+        },
+        'skip': 'Requires alura account credentials'},
+        {
+            # URL without video
+            'url': 'https://cursos.alura.com.br/course/clojure-mutabilidade-com-atoms-e-refs/task/60098',
+            'only_matching': True},
+        {
+            'url': 'https://cursos.alura.com.br/course/fundamentos-market-digital/task/55219',
+            'only_matching': True}
+    ]
+
+    def _real_extract(self, url):
+
+        video_id = self._match_id(url)
+        course = self._search_regex(self._VALID_URL, url, 'course name', group='course_name')
+        video_url = self._VIDEO_URL % (course, video_id)
+
+        video_dict = self._download_json(video_url, video_id, 'Searching for videos')
+
+        if video_dict:
+            webpage = self._download_webpage(url, video_id)
+            video_title = clean_html(self._search_regex(
+                r'<span[^>]+class=(["\'])task-body-header-title-text\1[^>]*>(?P<title>[^<]+)',
+                webpage, 'title', group='title'))
+
+            formats = []
+            for video_obj in video_dict:
+                video_url_m3u8 = video_obj.get('link')
+                video_format = self._extract_m3u8_formats(
+                    video_url_m3u8, None, 'mp4', entry_protocol='m3u8_native',
+                    m3u8_id='hls', fatal=False)
+                for f in video_format:
+                    m = re.search(r'^[\w \W]*-(?P<res>\w*)\.mp4[\W \w]*', f['url'])
+                    if m:
+                        if not f.get('height'):
+                            f['height'] = 720 if m.group('res') == 'hd' else 480
+                formats.extend(video_format)
+
+            self._sort_formats(formats, field_preference=('height', 'width', 'tbr', 'format_id'))
+
+            return {
+                'id': video_id,
+                'title': video_title,
+                'formats': formats
+            }
+
+    def _real_initialize(self):
+        self._login()
+
+    def _login(self):
+        username, password = self._get_login_info()
+        if username is None:
+            # No credentials supplied; skip the login step.
+            return
+
+        login_page = self._download_webpage(
+            self._LOGIN_URL, None, 'Downloading login popup')
+
+        def is_logged(webpage):
+            return any(re.search(p, webpage) for p in (
+                r'href=["\']?/signout["\']',
+                r'>Logout<'))
+
+        # already logged in
+        if is_logged(login_page):
+            return
+
+        login_form = self._hidden_inputs(login_page)
+
+        login_form.update({
+            'username': username,
+            'password': password,
+        })
+
+        post_url = self._search_regex(
+            r'<form[^>]+class=["\']signin-form["\'] action=["\'](?P<url>.+?)["\']', login_page,
+            'post url', default=self._LOGIN_URL, group='url')
+
+        if not post_url.startswith('http'):
+            post_url = compat_urlparse.urljoin(self._LOGIN_URL, post_url)
+
+        response = self._download_webpage(
+            post_url, None, 'Logging in',
+            data=urlencode_postdata(login_form),
+            headers={'Content-Type': 'application/x-www-form-urlencoded'})
+
+        if not is_logged(response):
+            error = self._html_search_regex(
+                r'(?s)<p[^>]+class="alert-message[^"]*">(.+?)</p>',
+                response, 'error message', default=None)
+            if error:
+                raise ExtractorError('Unable to login: %s' % error, expected=True)
+            raise ExtractorError('Unable to log in')
+
+
+class AluraCourseIE(AluraIE):
+
+    _VALID_URL = r'https?://(?:cursos\.)?alura\.com\.br/course/(?P<id>[^/]+)'
+    _LOGIN_URL = 'https://cursos.alura.com.br/loginForm?urlAfterLogin=/loginForm'
+    _NETRC_MACHINE = 'aluracourse'
+    _TESTS = [{
+        'url': 'https://cursos.alura.com.br/course/clojure-mutabilidade-com-atoms-e-refs',
+        'only_matching': True,
+    }]
+
+    @classmethod
+    def suitable(cls, url):
+        return False if AluraIE.suitable(url) else super(AluraCourseIE, cls).suitable(url)
+
+    def _real_extract(self, url):
+
+        course_path = self._match_id(url)
+        webpage = self._download_webpage(url, course_path)
+
+        course_title = self._search_regex(
+            r'<h1.*?>(.*?)<strong>(?P<course_title>.*?)</strong></h[0-9]>', webpage,
+            'course title', default=course_path, group='course_title')
+
+        entries = []
+        if webpage:
+            for path in re.findall(r'<a\b(?=[^>]* class="[^"]*(?<=[" ])courseSectionList-section[" ])(?=[^>]* href="([^"]*))', webpage):
+                page_url = urljoin(url, path)
+                section_path = self._download_webpage(page_url, course_path)
+                for path_video in re.findall(r'<a\b(?=[^>]* class="[^"]*(?<=[" ])task-menu-nav-item-link-VIDEO[" ])(?=[^>]* href="([^"]*))', section_path):
+                    chapter = clean_html(
+                        self._search_regex(
+                            r'<h3[^>]+class=(["\'])task-menu-section-title-text\1[^>]*>(?P<chapter>[^<]+)',
+                            section_path,
+                            'chapter',
+                            group='chapter'))
+
+                    chapter_number = int_or_none(
+                        self._search_regex(
+                            r'<span[^>]+class=(["\'])task-menu-section-title-number[^>]*>(.*?)<strong>(?P<chapter_number>[^<]+)</strong>',
+                            section_path,
+                            'chapter number',
+                            group='chapter_number'))
+                    video_url = urljoin(url, path_video)
+
+                    entry = {
+                        '_type': 'url_transparent',
+                        'id': self._match_id(video_url),
+                        'url': video_url,
+                        'ie_key': AluraIE.ie_key(),
+                        'chapter': chapter,
+                        'chapter_number': chapter_number
+                    }
+                    entries.append(entry)
+        return self.playlist_result(entries, course_path, course_title)
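
Review note on alura.py: the format loop in AluraIE._real_extract fills in a missing height from the stream URL. The following is a minimal standalone sketch of that step, not the extractor's own code. The helper name infer_height, the simplified pattern, and the example.com URLs are assumptions made for illustration; the "hd" to 720 and otherwise 480 mapping is taken from the diff above.

```python
import re

# Simplified pattern (an assumption, not the extractor's exact regex):
# capture the token between the last '-' before '.mp4' and the extension.
_RES_RE = re.compile(r'-(?P<res>\w*)\.mp4')


def infer_height(fmt):
    """Return a copy of fmt with 'height' set from its URL when missing."""
    fmt = dict(fmt)
    if fmt.get('height'):
        # A height reported by the manifest always wins.
        return fmt
    m = _RES_RE.search(fmt.get('url', ''))
    if m:
        # Same mapping as the diff: 'hd' means 720, anything else 480.
        fmt['height'] = 720 if m.group('res') == 'hd' else 480
    return fmt
```

Returning a copy instead of mutating in place keeps the sketch easy to test; the extractor itself updates each format dict directly.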
diff --git a/youtube_dl/extractor/amcnetworks.py b/youtube_dlc/extractor/amcnetworks.py

similarity index 100%

rename from youtube_dl/extractor/amcnetworks.py

rename to youtube_dlc/extractor/amcnetworks.py
diff --git a/youtube_dl/extractor/americastestkitchen.py b/youtube_dlc/extractor/americastestkitchen.py

similarity index 62%

rename from youtube_dl/extractor/americastestkitchen.py

rename to youtube_dlc/extractor/americastestkitchen.py

index 8b32aa886e9696e9334f73a777a70264f28c9433..9c9d77ae107e0b822b46368d89445f21e9e830a6 100644
--- a/youtube_dl/extractor/americastestkitchen.py
+++ b/youtube_dlc/extractor/americastestkitchen.py
@@ -5,6 +5,7 @@
  from ..utils import (
      clean_html,
      int_or_none,
+    js_to_json,
      try_get,
      unified_strdate,
  )
@@ -13,22 +14,21 @@
  class AmericasTestKitchenIE(InfoExtractor):
      _VALID_URL = r'https?://(?:www\.)?americastestkitchen\.com/(?:episode|videos)/(?P<id>\d+)'
      _TESTS = [{
-        'url': 'https://www.americastestkitchen.com/episode/548-summer-dinner-party',
+        'url': 'https://www.americastestkitchen.com/episode/582-weeknight-japanese-suppers',
          'md5': 'b861c3e365ac38ad319cfd509c30577f',
          'info_dict': {
-            'id': '1_5g5zua6e',
-            'title': 'Summer Dinner Party',
+            'id': '5b400b9ee338f922cb06450c',
+            'title': 'Weeknight Japanese Suppers',
              'ext': 'mp4',
-            'description': 'md5:858d986e73a4826979b6a5d9f8f6a1ec',
-            'thumbnail': r're:^https?://.*\.jpg',
-            'timestamp': 1497285541,
-            'upload_date': '20170612',
-            'uploader_id': 'roger.metcalf@americastestkitchen.com',
-            'release_date': '20170617',
+            'description': 'md5:3d0c1a44bb3b27607ce82652db25b4a8',
+            'thumbnail': r're:^https?://',
+            'timestamp': 1523664000,
+            'upload_date': '20180414',
+            'release_date': '20180414',
              'series': "America's Test Kitchen",
-            'season_number': 17,
-            'episode': 'Summer Dinner Party',
-            'episode_number': 24,
+            'season_number': 18,
+            'episode': 'Weeknight Japanese Suppers',
+            'episode_number': 15,
          },
          'params': {
              'skip_download': True,
@@ -47,7 +47,7 @@ def _real_extract(self, url):
              self._search_regex(
                  r'window\.__INITIAL_STATE__\s*=\s*({.+?})\s*;\s*</script>',
                  webpage, 'initial context'),
-            video_id)
+            video_id, js_to_json)
  
          ep_data = try_get(
              video_data,
@@ -55,17 +55,7 @@ def _real_extract(self, url):
               lambda x: x['videoDetail']['content']['data']), dict)
          ep_meta = ep_data.get('full_video', {})
  
-        zype_id = ep_meta.get('zype_id')
-        if zype_id:
-            embed_url = 'https://player.zype.com/embed/%s.js?api_key=jZ9GUhRmxcPvX7M3SlfejB6Hle9jyHTdk2jVxG7wOHPLODgncEKVdPYBhuz9iWXQ' % zype_id
-            ie_key = 'Zype'
-        else:
-            partner_id = self._search_regex(
-                r'src=["\'](?:https?:)?//(?:[^/]+\.)kaltura\.com/(?:[^/]+/)*(?:p|partner_id)/(\d+)',
-                webpage, 'kaltura partner id')
-            external_id = ep_data.get('external_id') or ep_meta['external_id']
-            embed_url = 'kaltura:%s:%s' % (partner_id, external_id)
-            ie_key = 'Kaltura'
+        zype_id = ep_data.get('zype_id') or ep_meta['zype_id']
  
          title = ep_data.get('title') or ep_meta.get('title')
          description = clean_html(ep_meta.get('episode_description') or ep_data.get(
@@ -79,8 +69,8 @@ def _real_extract(self, url):
  
          return {
              '_type': 'url_transparent',
-            'url': embed_url,
-            'ie_key': ie_key,
+            'url': 'https://player.zype.com/embed/%s.js?api_key=jZ9GUhRmxcPvX7M3SlfejB6Hle9jyHTdk2jVxG7wOHPLODgncEKVdPYBhuz9iWXQ' % zype_id,
+            'ie_key': 'Zype',
              'title': title,
              'description': description,
              'thumbnail': thumbnail,
diff --git a/youtube_dl/extractor/amp.py b/youtube_dlc/extractor/amp.py

similarity index 100%

rename from youtube_dl/extractor/amp.py

rename to youtube_dlc/extractor/amp.py
diff --git a/youtube_dl/extractor/animeondemand.py b/youtube_dlc/extractor/animeondemand.py

similarity index 100%

rename from youtube_dl/extractor/animeondemand.py

rename to youtube_dlc/extractor/animeondemand.py
diff --git a/youtube_dl/extractor/anvato.py b/youtube_dlc/extractor/anvato.py

similarity index 100%

rename from youtube_dl/extractor/anvato.py

rename to youtube_dlc/extractor/anvato.py
diff --git a/youtube_dl/extractor/aol.py b/youtube_dlc/extractor/aol.py

similarity index 100%

rename from youtube_dl/extractor/aol.py

rename to youtube_dlc/extractor/aol.py
diff --git a/youtube_dl/extractor/apa.py b/youtube_dlc/extractor/apa.py

similarity index 100%

rename from youtube_dl/extractor/apa.py

rename to youtube_dlc/extractor/apa.py
diff --git a/youtube_dl/extractor/aparat.py b/youtube_dlc/extractor/aparat.py

similarity index 100%

rename from youtube_dl/extractor/aparat.py

rename to youtube_dlc/extractor/aparat.py
diff --git a/youtube_dl/extractor/appleconnect.py b/youtube_dlc/extractor/appleconnect.py

similarity index 100%

rename from youtube_dl/extractor/appleconnect.py

rename to youtube_dlc/extractor/appleconnect.py
diff --git a/youtube_dl/extractor/appletrailers.py b/youtube_dlc/extractor/appletrailers.py

similarity index 99%

rename from youtube_dl/extractor/appletrailers.py

rename to youtube_dlc/extractor/appletrailers.py

index a9ef733e011237338d904f956c4324cc6dd7a72b..b5ed2b88b3e227216db12da1936b9d4c3466efa8 100644
--- a/youtube_dl/extractor/appletrailers.py
+++ b/youtube_dlc/extractor/appletrailers.py
@@ -199,7 +199,7 @@ def _clean_json(m):
                  'upload_date': upload_date,
                  'uploader_id': uploader_id,
                  'http_headers': {
-                    'User-Agent': 'QuickTime compatible (youtube-dl)',
+                    'User-Agent': 'QuickTime compatible (youtube-dlc)',
                  },
              })
  
diff --git a/youtube_dl/extractor/archiveorg.py b/youtube_dlc/extractor/archiveorg.py

similarity index 100%

rename from youtube_dl/extractor/archiveorg.py

rename to youtube_dlc/extractor/archiveorg.py
diff --git a/youtube_dlc/extractor/ard.py b/youtube_dlc/extractor/ard.py

new file mode 100644

index 0000000..6f1e477
--- /dev/null
+++ b/youtube_dlc/extractor/ard.py
@@ -0,0 +1,574 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import json
+import re
+
+from .common import InfoExtractor
+from .generic import GenericIE
+from ..utils import (
+    determine_ext,
+    ExtractorError,
+    int_or_none,
+    parse_duration,
+    qualities,
+    str_or_none,
+    try_get,
+    unified_strdate,
+    unified_timestamp,
+    update_url_query,
+    url_or_none,
+    xpath_text,
+)
+from ..compat import compat_etree_fromstring
+
+
+class ARDMediathekBaseIE(InfoExtractor):
+    _GEO_COUNTRIES = ['DE']
+
+    def _extract_media_info(self, media_info_url, webpage, video_id):
+        media_info = self._download_json(
+            media_info_url, video_id, 'Downloading media JSON')
+        return self._parse_media_info(media_info, video_id, '"fsk"' in webpage)
+
+    def _parse_media_info(self, media_info, video_id, fsk):
+        formats = self._extract_formats(media_info, video_id)
+
+        if not formats:
+            if fsk:
+                raise ExtractorError(
+                    'This video is only available after 20:00', expected=True)
+            elif media_info.get('_geoblocked'):
+                self.raise_geo_restricted(
+                    'This video is not available due to geoblocking',
+                    countries=self._GEO_COUNTRIES)
+
+        self._sort_formats(formats)
+
+        subtitles = {}
+        subtitle_url = media_info.get('_subtitleUrl')
+        if subtitle_url:
+            subtitles['de'] = [{
+                'ext': 'ttml',
+                'url': subtitle_url,
+            }]
+
+        return {
+            'id': video_id,
+            'duration': int_or_none(media_info.get('_duration')),
+            'thumbnail': media_info.get('_previewImage'),
+            'is_live': media_info.get('_isLive') is True,
+            'formats': formats,
+            'subtitles': subtitles,
+        }
+
+    def _ARD_extract_episode_info(self, title):
+        """Try to extract season/episode data from the title."""
+        res = {}
+        if not title:
+            return res
+
+        for pattern in [
+            # Pattern for title like "Homo sapiens (S06/E07) - Originalversion"
+            # from: https://www.ardmediathek.de/one/sendung/doctor-who/Y3JpZDovL3dkci5kZS9vbmUvZG9jdG9yIHdobw
+            r'.*(?P<ep_info> \(S(?P<season_number>\d+)/E(?P<episode_number>\d+)\)).*',
+            # E.g.: title="Fritjof aus Norwegen (2) (AD)"
+            # from: https://www.ardmediathek.de/ard/sammlung/der-krieg-und-ich/68cMkqJdllm639Skj4c7sS/
+            r'.*(?P<ep_info> \((?:Folge |Teil )?(?P<episode_number>\d+)(?:/\d+)?\)).*',
+            r'.*(?P<ep_info>Folge (?P<episode_number>\d+)(?:\:| -|) )\"(?P<episode>.+)\".*',
+            # E.g.: title="Folge 25/42: Symmetrie"
+            # from: https://www.ardmediathek.de/ard/video/grips-mathe/folge-25-42-symmetrie/ard-alpha/Y3JpZDovL2JyLmRlL3ZpZGVvLzMyYzI0ZjczLWQ1N2MtNDAxNC05ZmZhLTFjYzRkZDA5NDU5OQ/
+            # E.g.: title="Folge 1063 - Vertrauen"
+            # from: https://www.ardmediathek.de/ard/sendung/die-fallers/Y3JpZDovL3N3ci5kZS8yMzAyMDQ4/
+            r'.*(?P<ep_info>Folge (?P<episode_number>\d+)(?:/\d+)?(?:\:| -|) ).*',
+        ]:
+            m = re.match(pattern, title)
+            if m:
+                groupdict = m.groupdict()
+                res['season_number'] = int_or_none(groupdict.get('season_number'))
+                res['episode_number'] = int_or_none(groupdict.get('episode_number'))
+                res['episode'] = str_or_none(groupdict.get('episode'))
+                # Build the episode title by removing numeric episode information:
+                if groupdict.get('ep_info') and not res['episode']:
+                    res['episode'] = str_or_none(
+                        title.replace(groupdict.get('ep_info'), ''))
+                if res['episode']:
+                    res['episode'] = res['episode'].strip()
+                break
+
+        # As a fallback use the whole title as the episode name:
+        if not res.get('episode'):
+            res['episode'] = title.strip()
+        return res
+
+    def _extract_formats(self, media_info, video_id):
+        type_ = media_info.get('_type')
+        media_array = media_info.get('_mediaArray', [])
+        formats = []
+        for num, media in enumerate(media_array):
+            for stream in media.get('_mediaStreamArray', []):
+                stream_urls = stream.get('_stream')
+                if not stream_urls:
+                    continue
+                if not isinstance(stream_urls, list):
+                    stream_urls = [stream_urls]
+                quality = stream.get('_quality')
+                server = stream.get('_server')
+                for stream_url in stream_urls:
+                    if not url_or_none(stream_url):
+                        continue
+                    ext = determine_ext(stream_url)
+                    if quality != 'auto' and ext in ('f4m', 'm3u8'):
+                        continue
+                    if ext == 'f4m':
+                        formats.extend(self._extract_f4m_formats(
+                            update_url_query(stream_url, {
+                                'hdcore': '3.1.1',
+                                'plugin': 'aasp-3.1.1.69.124'
+                            }), video_id, f4m_id='hds', fatal=False))
+                    elif ext == 'm3u8':
+                        formats.extend(self._extract_m3u8_formats(
+                            stream_url, video_id, 'mp4', 'm3u8_native',
+                            m3u8_id='hls', fatal=False))
+                    else:
+                        if server and server.startswith('rtmp'):
+                            f = {
+                                'url': server,
+                                'play_path': stream_url,
+                                'format_id': 'a%s-rtmp-%s' % (num, quality),
+                            }
+                        else:
+                            f = {
+                                'url': stream_url,
+                                'format_id': 'a%s-%s-%s' % (num, ext, quality)
+                            }
+                        m = re.search(
+                            r'_(?P<width>\d+)x(?P<height>\d+)\.mp4$',
+                            stream_url)
+                        if m:
+                            f.update({
+                                'width': int(m.group('width')),
+                                'height': int(m.group('height')),
+                            })
+                        if type_ == 'audio':
+                            f['vcodec'] = 'none'
+                        formats.append(f)
+        return formats
+
+
+class ARDMediathekIE(ARDMediathekBaseIE):
+    IE_NAME = 'ARD:mediathek'
+    _VALID_URL = r'^https?://(?:(?:(?:www|classic)\.)?ardmediathek\.de|mediathek\.(?:daserste|rbb-online)\.de|one\.ard\.de)/(?:.*/)(?P<video_id>[0-9]+|[^0-9][^/\?]+)[^/\?]*(?:\?.*)?'
+
+    _TESTS = [{
+        # available till 26.07.2022
+        'url': 'http://www.ardmediathek.de/tv/S%C3%9CDLICHT/Was-ist-die-Kunst-der-Zukunft-liebe-Ann/BR-Fernsehen/Video?bcastId=34633636&documentId=44726822',
+        'info_dict': {
+            'id': '44726822',
+            'ext': 'mp4',
+            'title': 'Was ist die Kunst der Zukunft, liebe Anna McCarthy?',
+            'description': 'md5:4ada28b3e3b5df01647310e41f3a62f5',
+            'duration': 1740,
+        },
+        'params': {
+            # m3u8 download
+            'skip_download': True,
+        }
+    }, {
+        'url': 'https://one.ard.de/tv/Mord-mit-Aussicht/Mord-mit-Aussicht-6-39-T%C3%B6dliche-Nach/ONE/Video?bcastId=46384294&documentId=55586872',
+        'only_matching': True,
+    }, {
+        # audio
+        'url': 'http://www.ardmediathek.de/tv/WDR-H%C3%B6rspiel-Speicher/Tod-eines-Fu%C3%9Fballers/WDR-3/Audio-Podcast?documentId=28488308&bcastId=23074086',
+        'only_matching': True,
+    }, {
+        'url': 'http://mediathek.daserste.de/sendungen_a-z/328454_anne-will/22429276_vertrauen-ist-gut-spionieren-ist-besser-geht',
+        'only_matching': True,
+    }, {
+        # audio
+        'url': 'http://mediathek.rbb-online.de/radio/Hörspiel/Vor-dem-Fest/kulturradio/Audio?documentId=30796318&topRessort=radio&bcastId=9839158',
+        'only_matching': True,
+    }, {
+        'url': 'https://classic.ardmediathek.de/tv/Panda-Gorilla-Co/Panda-Gorilla-Co-Folge-274/Das-Erste/Video?bcastId=16355486&documentId=58234698',
+        'only_matching': True,
+    }]
+
+    @classmethod
+    def suitable(cls, url):
+        return False if ARDBetaMediathekIE.suitable(url) else super(ARDMediathekIE, cls).suitable(url)
+
+    def _real_extract(self, url):
+        # determine video id from url
+        m = re.match(self._VALID_URL, url)
+
+        document_id = None
+
+        numid = re.search(r'documentId=([0-9]+)', url)
+        if numid:
+            document_id = video_id = numid.group(1)
+        else:
+            video_id = m.group('video_id')
+
+        webpage = self._download_webpage(url, video_id)
+
+        ERRORS = (
+            ('>Leider liegt eine Störung vor.', 'Video %s is unavailable'),
+            ('>Der gewünschte Beitrag ist nicht mehr verfügbar.<',
+             'Video %s is no longer available'),
+        )
+
+        for pattern, message in ERRORS:
+            if pattern in webpage:
+                raise ExtractorError(message % video_id, expected=True)
+
+        if re.search(r'[\?&]rss($|[=&])', url):
+            doc = compat_etree_fromstring(webpage.encode('utf-8'))
+            if doc.tag == 'rss':
+                return GenericIE()._extract_rss(url, video_id, doc)
+
+        title = self._html_search_regex(
+            [r'<h1(?:\s+class="boxTopHeadline")?>(.*?)</h1>',
+             r'<meta name="dcterms\.title" content="(.*?)"/>',
+             r'<h4 class="headline">(.*?)</h4>',
+             r'<title[^>]*>(.*?)</title>'],
+            webpage, 'title')
+        description = self._html_search_meta(
+            'dcterms.abstract', webpage, 'description', default=None)
+        if description is None:
+            description = self._html_search_meta(
+                'description', webpage, 'meta description', default=None)
+        if description is None:
+            description = self._html_search_regex(
+                r'<p\s+class="teasertext">(.+?)</p>',
+                webpage, 'teaser text', default=None)
+
+        # Thumbnail is sometimes not present.
+        # It is in the mobile version, but that seems to use a different URL
+        # structure altogether.
+        thumbnail = self._og_search_thumbnail(webpage, default=None)
+
+        media_streams = re.findall(r'''(?x)
+            mediaCollection\.addMediaStream\([0-9]+,\s*[0-9]+,\s*"[^"]*",\s*
+            "([^"]+)"''', webpage)
+
+        if media_streams:
+            QUALITIES = qualities(['lo', 'hi', 'hq'])
+            formats = []
+            for furl in set(media_streams):
+                if furl.endswith('.f4m'):
+                    fid = 'f4m'
+                else:
+                    fid_m = re.match(r'.*\.([^.]+)\.[^.]+$', furl)
+                    fid = fid_m.group(1) if fid_m else None
+                formats.append({
+                    'quality': QUALITIES(fid),
+                    'format_id': fid,
+                    'url': furl,
+                })
+            self._sort_formats(formats)
+            info = {
+                'formats': formats,
+            }
+        else:  # request JSON file
+            if not document_id:
+                video_id = self._search_regex(
+                    r'/play/(?:config|media)/(\d+)', webpage, 'media id')
+            info = self._extract_media_info(
+                'http://www.ardmediathek.de/play/media/%s' % video_id,
+                webpage, video_id)
+
+        info.update({
+            'id': video_id,
+            'title': self._live_title(title) if info.get('is_live') else title,
+            'description': description,
+            'thumbnail': thumbnail,
+        })
+        info.update(self._ARD_extract_episode_info(info['title']))
+
+        return info
+
+
+class ARDIE(InfoExtractor):
+    _VALID_URL = r'(?P<mainurl>https?://(www\.)?daserste\.de/[^?#]+/videos(?:extern)?/(?P<display_id>[^/?#]+)-(?P<id>[0-9]+))\.html'
+    _TESTS = [{
+        # available till 14.02.2019
+        'url': 'http://www.daserste.de/information/talk/maischberger/videos/das-groko-drama-zerlegen-sich-die-volksparteien-video-102.html',
+        'md5': '8e4ec85f31be7c7fc08a26cdbc5a1f49',
+        'info_dict': {
+            'display_id': 'das-groko-drama-zerlegen-sich-die-volksparteien-video',
+            'id': '102',
+            'ext': 'mp4',
+            'duration': 4435.0,
+            'title': 'Das GroKo-Drama: Zerlegen sich die Volksparteien?',
+            'upload_date': '20180214',
+            'thumbnail': r're:^https?://.*\.jpg$',
+        },
+    }, {
+        'url': 'https://www.daserste.de/information/reportage-dokumentation/erlebnis-erde/videosextern/woelfe-und-herdenschutzhunde-ungleiche-brueder-102.html',
+        'only_matching': True,
+    }, {
+        'url': 'http://www.daserste.de/information/reportage-dokumentation/dokus/videos/die-story-im-ersten-mission-unter-falscher-flagge-100.html',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        display_id = mobj.group('display_id')
+
+        player_url = mobj.group('mainurl') + '~playerXml.xml'
+        doc = self._download_xml(player_url, display_id)
+        video_node = doc.find('./video')
+        upload_date = unified_strdate(xpath_text(
+            video_node, './broadcastDate'))
+        thumbnail = xpath_text(video_node, './/teaserImage//variant/url')
+
+        formats = []
+        for a in video_node.findall('.//asset'):
+            f = {
+                'format_id': a.attrib['type'],
+                'width': int_or_none(a.find('./frameWidth').text),
+                'height': int_or_none(a.find('./frameHeight').text),
+                'vbr': int_or_none(a.find('./bitrateVideo').text),
+                'abr': int_or_none(a.find('./bitrateAudio').text),
+                'vcodec': a.find('./codecVideo').text,
+                'tbr': int_or_none(a.find('./totalBitrate').text),
+            }
+            if a.find('./serverPrefix').text:
+                f['url'] = a.find('./serverPrefix').text
+                f['play_path'] = a.find('./fileName').text
+            else:
+                f['url'] = a.find('./fileName').text
+            formats.append(f)
+        self._sort_formats(formats)
+
+        return {
+            'id': mobj.group('id'),
+            'formats': formats,
+            'display_id': display_id,
+            'title': video_node.find('./title').text,
+            'duration': parse_duration(video_node.find('./duration').text),
+            'upload_date': upload_date,
+            'thumbnail': thumbnail,
+        }
+
+
+class ARDBetaMediathekIE(ARDMediathekBaseIE):
+    _VALID_URL = r'https://(?:(?:beta|www)\.)?ardmediathek\.de/(?P<client>[^/]+)/(?P<mode>player|live|video|sendung|sammlung)/(?P<display_id>(?:[^/]+/)*)(?P<video_id>[a-zA-Z0-9]+)'
+    _TESTS = [{
+        'url': 'https://ardmediathek.de/ard/video/die-robuste-roswita/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC9mYmM4NGM1NC0xNzU4LTRmZGYtYWFhZS0wYzcyZTIxNGEyMDE',
+        'md5': 'dfdc87d2e7e09d073d5a80770a9ce88f',
+        'info_dict': {
+            'display_id': 'die-robuste-roswita',
+            'id': '70153354',
+            'title': 'Die robuste Roswita',
+            'description': r're:^Der Mord.*trüber ist als die Ilm.',
+            'duration': 5316,
+            'thumbnail': 'https://img.ardmediathek.de/standard/00/70/15/33/90/-1852531467/16x9/960?mandant=ard',
+            'timestamp': 1577047500,
+            'upload_date': '20191222',
+            'ext': 'mp4',
+        },
+    }, {
+        'url': 'https://beta.ardmediathek.de/ard/video/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC9mYmM4NGM1NC0xNzU4LTRmZGYtYWFhZS0wYzcyZTIxNGEyMDE',
+        'only_matching': True,
+    }, {
+        'url': 'https://ardmediathek.de/ard/video/saartalk/saartalk-gesellschaftsgift-haltung-gegen-hass/sr-fernsehen/Y3JpZDovL3NyLW9ubGluZS5kZS9TVF84MTY4MA/',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.ardmediathek.de/ard/video/trailer/private-eyes-s01-e01/one/Y3JpZDovL3dkci5kZS9CZWl0cmFnLTE1MTgwYzczLWNiMTEtNGNkMS1iMjUyLTg5MGYzOWQxZmQ1YQ/',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.ardmediathek.de/ard/player/Y3JpZDovL3N3ci5kZS9hZXgvbzEwNzE5MTU/',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.ardmediathek.de/swr/live/Y3JpZDovL3N3ci5kZS8xMzQ4MTA0Mg',
+        'only_matching': True,
+    }, {
+        # playlist of type 'sendung'
+        'url': 'https://www.ardmediathek.de/ard/sendung/doctor-who/Y3JpZDovL3dkci5kZS9vbmUvZG9jdG9yIHdobw/',
+        'only_matching': True,
+    }, {
+        # playlist of type 'sammlung'
+        'url': 'https://www.ardmediathek.de/ard/sammlung/team-muenster/5JpTzLSbWUAK8184IOvEir/',
+        'only_matching': True,
+    }]
+
+    def _ARD_load_playlist_snipped(self, playlist_id, display_id, client, mode, pageNumber):
+        """ Query the ARD server for playlist information
+        and return the data in "raw" format """
+        if mode == 'sendung':
+            graphQL = json.dumps({
+                'query': '''{
+                    showPage(
+                        client: "%s"
+                        showId: "%s"
+                        pageNumber: %d
+                    ) {
+                        pagination {
+                            pageSize
+                            totalElements
+                        }
+                        teasers {        # Array
+                            mediumTitle
+                            links { target { id href title } }
+                            type
+                        }
+                    }}''' % (client, playlist_id, pageNumber),
+            }).encode()
+        else:  # mode == 'sammlung'
+            graphQL = json.dumps({
+                'query': '''{
+                    morePage(
+                        client: "%s"
+                        compilationId: "%s"
+                        pageNumber: %d
+                    ) {
+                        widget {
+                            pagination {
+                                pageSize
+                                totalElements
+                            }
+                            teasers {        # Array
+                                mediumTitle
+                                links { target { id href title } }
+                                type
+                            }
+                        }
+                    }}''' % (client, playlist_id, pageNumber),
+            }).encode()
+        # Resources for ARD GraphQL debugging:
+        # https://api-test.ardmediathek.de/public-gateway
+        show_page = self._download_json(
+            'https://api.ardmediathek.de/public-gateway',
+            '[Playlist] %s' % display_id,
+            data=graphQL,
+            headers={'Content-Type': 'application/json'})['data']
+        # align the structure of the returned data:
+        if mode == 'sendung':
+            show_page = show_page['showPage']
+        else:  # mode == 'sammlung'
+            show_page = show_page['morePage']['widget']
+        return show_page
+
+    def _ARD_extract_playlist(self, url, playlist_id, display_id, client, mode):
+        """ Collects all playlist entries and returns them as info dict.
+        Supports playlists of mode 'sendung' and 'sammlung', and also nested
+        playlists. """
+        entries = []
+        pageNumber = 0
+        while True:  # iterate by pageNumber
+            show_page = self._ARD_load_playlist_snipped(
+                playlist_id, display_id, client, mode, pageNumber)
+            for teaser in show_page['teasers']:  # process playlist items
+                if '/compilation/' in teaser['links']['target']['href']:
+                    # alternative condition: teaser['type'] == "compilation"
+                    # => This is a nested compilation, e.g.:
+                    # https://www.ardmediathek.de/ard/sammlung/die-kirche-bleibt-im-dorf/5eOHzt8XB2sqeFXbIoJlg2/
+                    link_mode = 'sammlung'
+                else:
+                    link_mode = 'video'
+
+                item_url = 'https://www.ardmediathek.de/%s/%s/%s/%s/%s' % (
+                    client, link_mode, display_id,
+                    # slugify the episode title the way ARD does:
+                    re.sub('^-|-$', '',  # strip a leading/trailing '-'
+                           re.sub('[^a-zA-Z0-9]+', '-',  # replace special chars with '-'
+                                  teaser['links']['target']['title'].lower()
+                                  .replace('ä', 'ae').replace('ö', 'oe')
+                                  .replace('ü', 'ue').replace('ß', 'ss'))),
+                    teaser['links']['target']['id'])
+                entries.append(self.url_result(
+                    item_url,
+                    ie=ARDBetaMediathekIE.ie_key()))
+
+            if (show_page['pagination']['pageSize'] * (pageNumber + 1)
+               >= show_page['pagination']['totalElements']):
+                # we've processed enough pages to get all playlist entries
+                break
+            pageNumber = pageNumber + 1
+
+        return self.playlist_result(entries, playlist_title=display_id)
+
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        video_id = mobj.group('video_id')
+        display_id = mobj.group('display_id')
+        if display_id:
+            display_id = display_id.rstrip('/')
+        if not display_id:
+            display_id = video_id
+
+        if mobj.group('mode') in ('sendung', 'sammlung'):
+            # this is a playlist-URL
+            return self._ARD_extract_playlist(
+                url, video_id, display_id,
+                mobj.group('client'),
+                mobj.group('mode'))
+
+        player_page = self._download_json(
+            'https://api.ardmediathek.de/public-gateway',
+            display_id, data=json.dumps({
+                'query': '''{
+  playerPage(client:"%s", clipId: "%s") {
+    blockedByFsk
+    broadcastedOn
+    maturityContentRating
+    mediaCollection {
+      _duration
+      _geoblocked
+      _isLive
+      _mediaArray {
+        _mediaStreamArray {
+          _quality
+          _server
+          _stream
+        }
+      }
+      _previewImage
+      _subtitleUrl
+      _type
+    }
+    show {
+      title
+    }
+    synopsis
+    title
+    tracking {
+      atiCustomVars {
+        contentId
+      }
+    }
+  }
+}''' % (mobj.group('client'), video_id),
+            }).encode(), headers={
+                'Content-Type': 'application/json'
+            })['data']['playerPage']
+        title = player_page['title']
+        content_id = str_or_none(try_get(
+            player_page, lambda x: x['tracking']['atiCustomVars']['contentId']))
+        media_collection = player_page.get('mediaCollection') or {}
+        if not media_collection and content_id:
+            media_collection = self._download_json(
+                'https://www.ardmediathek.de/play/media/' + content_id,
+                content_id, fatal=False) or {}
+        info = self._parse_media_info(
+            media_collection, content_id or video_id,
+            player_page.get('blockedByFsk'))
+        age_limit = None
+        description = player_page.get('synopsis')
+        maturity_content_rating = player_page.get('maturityContentRating')
+        if maturity_content_rating:
+            age_limit = int_or_none(maturity_content_rating.lstrip('FSK'))
+        if not age_limit and description:
+            age_limit = int_or_none(self._search_regex(
+                r'\(FSK\s*(\d+)\)\s*$', description, 'age limit', default=None))
+        info.update({
+            'age_limit': age_limit,
+            'display_id': display_id,
+            'title': title,
+            'description': description,
+            'timestamp': unified_timestamp(player_page.get('broadcastedOn')),
+            'series': try_get(player_page, lambda x: x['show']['title']),
+        })
+        info.update(self._ARD_extract_episode_info(info['title']))
+        return info
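Note on the ARD playlist hunk above: the item URL is built from a slug of the teaser title, and the page loop stops once the pages fetched so far cover `totalElements`. A minimal standalone sketch of both steps follows. The helper names `ard_slug` and `is_last_page` are hypothetical and do not exist in the extractor; the regexes and the stop condition are copied from the hunk.

```python
import re


def ard_slug(title):
    """Approximate the URL slug ARD derives from an episode title."""
    slug = title.lower()
    # Transliterate German umlauts and sharp s before stripping other characters.
    for char, ascii_form in (('ä', 'ae'), ('ö', 'oe'), ('ü', 'ue'), ('ß', 'ss')):
        slug = slug.replace(char, ascii_form)
    # Collapse every run of non-alphanumeric characters into a single '-'.
    slug = re.sub(r'[^a-zA-Z0-9]+', '-', slug)
    # Drop a '-' left over at the start or end.
    return re.sub(r'^-|-$', '', slug)


def is_last_page(page_size, page_number, total_elements):
    """True once pages 0..page_number cover all playlist entries."""
    return page_size * (page_number + 1) >= total_elements
```

With a page size of 10 and 25 entries, pages 0 and 1 are not the last page and page 2 is, so the loop in `_ARD_extract_playlist` would issue three requests.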
diff --git a/youtube_dl/extractor/arkena.py b/youtube_dlc/extractor/arkena.py

similarity index 100%

rename from youtube_dl/extractor/arkena.py

rename to youtube_dlc/extractor/arkena.py
diff --git a/youtube_dl/extractor/arte.py b/youtube_dlc/extractor/arte.py

similarity index 100%

rename from youtube_dl/extractor/arte.py

rename to youtube_dlc/extractor/arte.py
diff --git a/youtube_dl/extractor/asiancrush.py b/youtube_dlc/extractor/asiancrush.py

similarity index 100%

rename from youtube_dl/extractor/asiancrush.py

rename to youtube_dlc/extractor/asiancrush.py
diff --git a/youtube_dl/extractor/atresplayer.py b/youtube_dlc/extractor/atresplayer.py

similarity index 100%

rename from youtube_dl/extractor/atresplayer.py

rename to youtube_dlc/extractor/atresplayer.py
diff --git a/youtube_dl/extractor/atttechchannel.py b/youtube_dlc/extractor/atttechchannel.py

similarity index 100%

rename from youtube_dl/extractor/atttechchannel.py

rename to youtube_dlc/extractor/atttechchannel.py
diff --git a/youtube_dl/extractor/atvat.py b/youtube_dlc/extractor/atvat.py

similarity index 100%

rename from youtube_dl/extractor/atvat.py

rename to youtube_dlc/extractor/atvat.py
diff --git a/youtube_dl/extractor/audimedia.py b/youtube_dlc/extractor/audimedia.py

similarity index 100%

rename from youtube_dl/extractor/audimedia.py

rename to youtube_dlc/extractor/audimedia.py
diff --git a/youtube_dl/extractor/audioboom.py b/youtube_dlc/extractor/audioboom.py

similarity index 100%

rename from youtube_dl/extractor/audioboom.py

rename to youtube_dlc/extractor/audioboom.py
diff --git a/youtube_dl/extractor/audiomack.py b/youtube_dlc/extractor/audiomack.py

similarity index 100%

rename from youtube_dl/extractor/audiomack.py

rename to youtube_dlc/extractor/audiomack.py
diff --git a/youtube_dl/extractor/awaan.py b/youtube_dlc/extractor/awaan.py

similarity index 100%

rename from youtube_dl/extractor/awaan.py

rename to youtube_dlc/extractor/awaan.py
diff --git a/youtube_dl/extractor/aws.py b/youtube_dlc/extractor/aws.py

similarity index 100%

rename from youtube_dl/extractor/aws.py

rename to youtube_dlc/extractor/aws.py
diff --git a/youtube_dl/extractor/azmedien.py b/youtube_dlc/extractor/azmedien.py

similarity index 59%

rename from youtube_dl/extractor/azmedien.py

rename to youtube_dlc/extractor/azmedien.py

index fcbdc71b98d98076852e0f88559f4a2ed428d7af..b1e20def5343e6b1a077ff3ba0b36f6a96c4f2c4 100644
--- a/youtube_dl/extractor/azmedien.py
+++ b/youtube_dlc/extractor/azmedien.py
@@ -47,39 +47,19 @@ class AZMedienIE(InfoExtractor):
          'url': 'https://www.telebaern.tv/telebaern-news/montag-1-oktober-2018-ganze-sendung-133531189#video=0_7xjo9lf1',
          'only_matching': True
      }]
-
+    _API_TEMPL = 'https://www.%s/api/pub/gql/%s/NewsArticleTeaser/cb9f2f81ed22e9b47f4ca64ea3cc5a5d13e88d1d'
      _PARTNER_ID = '1719221'
  
      def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        host = mobj.group('host')
-        video_id = mobj.group('id')
-        entry_id = mobj.group('kaltura_id')
+        host, display_id, article_id, entry_id = re.match(self._VALID_URL, url).groups()
  
          if not entry_id:
-            api_url = 'https://www.%s/api/pub/gql/%s' % (host, host.split('.')[0])
-            payload = {
-                'query': '''query VideoContext($articleId: ID!) {
-                    article: node(id: $articleId) {
-                      ... on Article {
-                        mainAssetRelation {
-                          asset {
-                            ... on VideoAsset {
-                              kalturaId
-                            }
-                          }
-                        }
-                      }
-                    }
-                  }''',
-                'variables': {'articleId': 'Article:%s' % mobj.group('article_id')},
-            }
-            json_data = self._download_json(
-                api_url, video_id, headers={
-                    'Content-Type': 'application/json',
-                },
-                data=json.dumps(payload).encode())
-            entry_id = json_data['data']['article']['mainAssetRelation']['asset']['kalturaId']
+            entry_id = self._download_json(
+                self._API_TEMPL % (host, host.split('.')[0]), display_id, query={
+                    'variables': json.dumps({
+                        'contextId': 'NewsArticle:' + article_id,
+                    }),
+                })['data']['context']['mainAsset']['video']['kaltura']['kalturaId']
  
          return self.url_result(
              'kaltura:%s:%s' % (self._PARTNER_ID, entry_id),
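Note on the AZMedien hunk above: the POSTed GraphQL document is replaced by a GET against a fixed query endpoint, with the article passed as a JSON-encoded `variables` query parameter. The sketch below shows roughly what URL results. `build_api_url` is a hypothetical helper, and plain `urlencode` is assumed as a stand-in for what `_download_json(..., query=...)` does internally.

```python
import json
from urllib.parse import urlencode

# Template copied from the hunk; the trailing hash identifies the stored query.
API_TEMPL = ('https://www.%s/api/pub/gql/%s/NewsArticleTeaser/'
             'cb9f2f81ed22e9b47f4ca64ea3cc5a5d13e88d1d')


def build_api_url(host, article_id):
    """Build the GET URL for one article (hypothetical helper)."""
    # The second placeholder is the first label of the host, e.g. 'telebaern'.
    base = API_TEMPL % (host, host.split('.')[0])
    variables = json.dumps({'contextId': 'NewsArticle:' + article_id})
    return base + '?' + urlencode({'variables': variables})
```

The Kaltura entry ID is then read from `['data']['context']['mainAsset']['video']['kaltura']['kalturaId']` in the response, as the hunk shows.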
diff --git a/youtube_dl/extractor/baidu.py b/youtube_dlc/extractor/baidu.py

similarity index 100%

rename from youtube_dl/extractor/baidu.py

rename to youtube_dlc/extractor/baidu.py
diff --git a/youtube_dl/extractor/bandcamp.py b/youtube_dlc/extractor/bandcamp.py

similarity index 98%

rename from youtube_dl/extractor/bandcamp.py

rename to youtube_dlc/extractor/bandcamp.py

index f14b407dc82cf8f945f581ab059bd6218866eb57..b8a57e6a50b26ba61c1137b4c8c72d5fc603a8d6 100644
--- a/youtube_dl/extractor/bandcamp.py
+++ b/youtube_dlc/extractor/bandcamp.py
@@ -28,12 +28,12 @@
  class BandcampIE(InfoExtractor):
      _VALID_URL = r'https?://[^/]+\.bandcamp\.com/track/(?P<title>[^/?#&]+)'
      _TESTS = [{
-        'url': 'http://youtube-dl.bandcamp.com/track/youtube-dl-test-song',
+        'url': 'http://youtube-dlc.bandcamp.com/track/youtube-dlc-test-song',
          'md5': 'c557841d5e50261777a6585648adf439',
          'info_dict': {
              'id': '1812978515',
              'ext': 'mp3',
-            'title': "youtube-dl  \"'/\\\u00e4\u21ad - youtube-dl test song \"'/\\\u00e4\u21ad",
+            'title': "youtube-dlc  \"'/\\\u00e4\u21ad - youtube-dlc test song \"'/\\\u00e4\u21ad",
              'duration': 9.8485,
          },
          '_skip': 'There is a limit of 200 free downloads / month for the test song'
diff --git a/youtube_dl/extractor/bbc.py b/youtube_dlc/extractor/bbc.py

similarity index 99%

rename from youtube_dl/extractor/bbc.py

rename to youtube_dlc/extractor/bbc.py

index 901c5a54fb6f9d3320fbd0827a222c2c9e9676f0..002c39c394bf5d8bd3b74600d92a594e1a8ca2b2 100644
--- a/youtube_dl/extractor/bbc.py
+++ b/youtube_dlc/extractor/bbc.py
@@ -528,7 +528,7 @@ def _extract_from_legacy_playlist(self, playlist, playlist_id):
  
              def get_programme_id(item):
                  def get_from_attributes(item):
-                    for p in('identifier', 'group'):
+                    for p in ('identifier', 'group'):
                          value = item.get(p)
                          if value and re.match(r'^[pb][\da-z]{7}$', value):
                              return value
diff --git a/youtube_dl/extractor/beampro.py b/youtube_dlc/extractor/beampro.py

similarity index 100%

rename from youtube_dl/extractor/beampro.py

rename to youtube_dlc/extractor/beampro.py
diff --git a/youtube_dl/extractor/beatport.py b/youtube_dlc/extractor/beatport.py

similarity index 100%

rename from youtube_dl/extractor/beatport.py

rename to youtube_dlc/extractor/beatport.py
diff --git a/youtube_dl/extractor/beeg.py b/youtube_dlc/extractor/beeg.py

similarity index 100%

rename from youtube_dl/extractor/beeg.py

rename to youtube_dlc/extractor/beeg.py
diff --git a/youtube_dl/extractor/behindkink.py b/youtube_dlc/extractor/behindkink.py

similarity index 100%

rename from youtube_dl/extractor/behindkink.py

rename to youtube_dlc/extractor/behindkink.py
diff --git a/youtube_dl/extractor/bellmedia.py b/youtube_dlc/extractor/bellmedia.py

similarity index 92%

rename from youtube_dl/extractor/bellmedia.py

rename to youtube_dlc/extractor/bellmedia.py

index 485173774d9f9c2534f9b18f1668a8d5fb204dc9..9f9de96c61332ac405b33bfc1f5758f2c8fd6456 100644
--- a/youtube_dl/extractor/bellmedia.py
+++ b/youtube_dlc/extractor/bellmedia.py
@@ -25,8 +25,8 @@ class BellMediaIE(InfoExtractor):
                  etalk|
                  marilyn
              )\.ca|
-            much\.com
-        )/.*?(?:\bvid(?:eoid)?=|-vid|~|%7E|/(?:episode)?)(?P<id>[0-9]{6,})'''
+            (?:much|cp24)\.com
+        )/.*?(?:\b(?:vid(?:eoid)?|clipId)=|-vid|~|%7E|/(?:episode)?)(?P<id>[0-9]{6,})'''
      _TESTS = [{
          'url': 'https://www.bnnbloomberg.ca/video/david-cockfield-s-top-picks~1403070',
          'md5': '36d3ef559cfe8af8efe15922cd3ce950',
@@ -62,6 +62,9 @@ class BellMediaIE(InfoExtractor):
      }, {
          'url': 'http://www.etalk.ca/video?videoid=663455',
          'only_matching': True,
+    }, {
+        'url': 'https://www.cp24.com/video?clipId=1982548',
+        'only_matching': True,
      }]
      _DOMAINS = {
          'thecomedynetwork': 'comedy',
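Note on the BellMedia hunk above: the pattern gains `cp24.com` as a host and `clipId=` as an ID marker. The sketch below checks the new URL form against a reduced pattern. The host alternation is an assumption that covers only the hosts visible in the hunk (the full list is elided in the diff context); the part after the host is copied from the added line.

```python
import re

# Reduced pattern: the host list is abbreviated for illustration only.
REDUCED_VALID_URL = (
    r'https?://(?:www\.)?(?:(?:etalk|marilyn)\.ca|(?:much|cp24)\.com)'
    r'/.*?(?:\b(?:vid(?:eoid)?|clipId)=|-vid|~|%7E|/(?:episode)?)'
    r'(?P<id>[0-9]{6,})'
)


def match_video_id(url):
    """Return the numeric video ID, or None if the URL does not match."""
    mobj = re.match(REDUCED_VALID_URL, url)
    return mobj.group('id') if mobj else None
```

Both the existing `videoid=` form and the new `clipId=` form resolve to the same `id` group, so the rest of the extractor needs no change.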
diff --git a/youtube_dl/extractor/bet.py b/youtube_dlc/extractor/bet.py

similarity index 100%

rename from youtube_dl/extractor/bet.py

rename to youtube_dlc/extractor/bet.py
diff --git a/youtube_dl/extractor/bfi.py b/youtube_dlc/extractor/bfi.py

similarity index 100%

rename from youtube_dl/extractor/bfi.py

rename to youtube_dlc/extractor/bfi.py
diff --git a/youtube_dl/extractor/bigflix.py b/youtube_dlc/extractor/bigflix.py

similarity index 100%

rename from youtube_dl/extractor/bigflix.py

rename to youtube_dlc/extractor/bigflix.py
diff --git a/youtube_dl/extractor/bild.py b/youtube_dlc/extractor/bild.py

similarity index 100%

rename from youtube_dl/extractor/bild.py

rename to youtube_dlc/extractor/bild.py
diff --git a/youtube_dl/extractor/bilibili.py b/youtube_dlc/extractor/bilibili.py

similarity index 92%

rename from youtube_dl/extractor/bilibili.py

rename to youtube_dlc/extractor/bilibili.py

index 80bd696e21f3a4af3c996e9899ce439116e13d19..d39ee8ffeca763c882747d15e82d4813503ba26c 100644
--- a/youtube_dl/extractor/bilibili.py
+++ b/youtube_dlc/extractor/bilibili.py
@@ -24,7 +24,18 @@
  
  
  class BiliBiliIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.|bangumi\.|)bilibili\.(?:tv|com)/(?:video/av|anime/(?P<anime_id>\d+)/play#)(?P<id>\d+)'
+    _VALID_URL = r'''(?x)
+                    https?://
+                        (?:(?:www|bangumi)\.)?
+                        bilibili\.(?:tv|com)/
+                        (?:
+                            (?:
+                                video/[aA][vV]|
+                                anime/(?P<anime_id>\d+)/play\#
+                            )(?P<id_bv>\d+)|
+                            video/[bB][vV](?P<id>[^/?#&]+)
+                        )
+                    '''
  
      _TESTS = [{
          'url': 'http://www.bilibili.tv/video/av1074402/',
@@ -92,6 +103,10 @@ class BiliBiliIE(InfoExtractor):
                  'skip_download': True,  # Test metadata only
              },
          }]
+    }, {
+        # new BV video id format
+        'url': 'https://www.bilibili.com/video/BV1JE411F741',
+        'only_matching': True,
      }]
  
      _APP_KEY = 'iVGUTjsxvpLeuDCf'
@@ -109,7 +124,7 @@ def _real_extract(self, url):
          url, smuggled_data = unsmuggle_url(url, {})
  
          mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
+        video_id = mobj.group('id') or mobj.group('id_bv')
          anime_id = mobj.group('anime_id')
          webpage = self._download_webpage(url, video_id)
  
@@ -124,7 +139,7 @@ def _real_extract(self, url):
                  webpage, 'player parameters'))['cid'][0]
          else:
              if 'no_bangumi_tip' not in smuggled_data:
-                self.to_screen('Downloading episode %s. To download all videos in anime %s, re-run youtube-dl with %s' % (
+                self.to_screen('Downloading episode %s. To download all videos in anime %s, re-run youtube-dlc with %s' % (
                      video_id, anime_id, compat_urlparse.urljoin(url, '//bangumi.bilibili.com/anime/%s' % anime_id)))
              headers = {
                  'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
@@ -419,3 +434,17 @@ def _real_extract(self, url):
                      entries, am_id, album_title, album_data.get('intro'))
  
          return self.playlist_result(entries, am_id)
+
+
+class BiliBiliPlayerIE(InfoExtractor):
+    _VALID_URL = r'https?://player\.bilibili\.com/player\.html\?.*?\baid=(?P<id>\d+)'
+    _TEST = {
+        'url': 'http://player.bilibili.com/player.html?aid=92494333&cid=157926707&page=1',
+        'only_matching': True,
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        return self.url_result(
+            'http://www.bilibili.tv/video/av%s/' % video_id,
+            ie=BiliBiliIE.ie_key(), video_id=video_id)
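Note on the BiliBili hunks above: the group names read backwards at first glance. `id_bv` captures the numeric `av` ID, while `id` captures the new `BV` ID, which is why `_real_extract` falls back from one to the other. The sketch below exercises the verbose pattern and the player redirect outside the extractor. `extract_video_id` and `player_to_av_url` are hypothetical helpers; the two patterns are copied from the diff.

```python
import re

VALID_URL = r'''(?x)
    https?://
        (?:(?:www|bangumi)\.)?
        bilibili\.(?:tv|com)/
        (?:
            (?:
                video/[aA][vV]|
                anime/(?P<anime_id>\d+)/play\#
            )(?P<id_bv>\d+)|
            video/[bB][vV](?P<id>[^/?#&]+)
        )
    '''

PLAYER_URL = r'https?://player\.bilibili\.com/player\.html\?.*?\baid=(?P<id>\d+)'


def extract_video_id(url):
    """Return the av number or the BV ID, whichever group matched."""
    mobj = re.match(VALID_URL, url)
    if not mobj:
        return None
    return mobj.group('id') or mobj.group('id_bv')


def player_to_av_url(url):
    """Map an embedded-player URL to the canonical av page URL."""
    mobj = re.match(PLAYER_URL, url)
    if not mobj:
        return None
    return 'http://www.bilibili.tv/video/av%s/' % mobj.group('id')
```

In verbose mode the `#` in `play\#` has to be escaped, while the `#` inside the character class `[^/?#&]` is taken literally.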
diff --git a/youtube_dl/extractor/biobiochiletv.py b/youtube_dlc/extractor/biobiochiletv.py

similarity index 100%

rename from youtube_dl/extractor/biobiochiletv.py

rename to youtube_dlc/extractor/biobiochiletv.py
diff --git a/youtube_dl/extractor/biqle.py b/youtube_dlc/extractor/biqle.py

similarity index 83%

rename from youtube_dl/extractor/biqle.py

rename to youtube_dlc/extractor/biqle.py

index af21e3ee5e53fbfdfafa5ee219c541ee6ca97de3..17ebbb25766bb500e6401f55b6105c37fcfd25f5 100644
--- a/youtube_dl/extractor/biqle.py
+++ b/youtube_dlc/extractor/biqle.py
@@ -3,10 +3,11 @@
  
  from .common import InfoExtractor
  from .vk import VKIE
-from ..utils import (
-    HEADRequest,
-    int_or_none,
+from ..compat import (
+    compat_b64decode,
+    compat_urllib_parse_unquote,
  )
+from ..utils import int_or_none
  
  
  class BIQLEIE(InfoExtractor):
@@ -47,9 +48,16 @@ def _real_extract(self, url):
          if VKIE.suitable(embed_url):
              return self.url_result(embed_url, VKIE.ie_key(), video_id)
  
-        self._request_webpage(
-            HEADRequest(embed_url), video_id, headers={'Referer': url})
-        video_id, sig, _, access_token = self._get_cookies(embed_url)['video_ext'].value.split('%3A')
+        embed_page = self._download_webpage(
+            embed_url, video_id, headers={'Referer': url})
+        video_ext = self._get_cookies(embed_url).get('video_ext')
+        if video_ext:
+            video_ext = compat_urllib_parse_unquote(video_ext.value)
+        if not video_ext:
+            video_ext = compat_b64decode(self._search_regex(
+                r'video_ext\s*:\s*[\'"]([A-Za-z0-9+/=]+)',
+                embed_page, 'video_ext')).decode()
+        video_id, sig, _, access_token = video_ext.split(':')
          item = self._download_json(
              'https://api.vk.com/method/video.get', video_id,
              headers={'User-Agent': 'okhttp/3.4.1'}, query={
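Note on the BIQLE hunk above: `video_ext` now comes from one of two places, a percent-encoded cookie or a base64 string embedded in the page, and both decode to the same colon-separated tuple. The sketch below shows that fallback using only the standard library. `parse_video_ext` is a hypothetical helper, and `urllib.parse.unquote` and `base64.b64decode` are assumed to be equivalent to the `compat_*` wrappers used in the diff.

```python
import base64
import re
from urllib.parse import unquote


def parse_video_ext(cookie_value=None, embed_page=''):
    """Return (video_id, sig, access_token) from the cookie or the page."""
    # Preferred source: the cookie, whose ':' separators arrive as '%3A'.
    video_ext = unquote(cookie_value) if cookie_value else None
    if not video_ext:
        # Fallback: a base64 literal assigned to video_ext in the embed page.
        mobj = re.search(
            r'video_ext\s*:\s*[\'"]([A-Za-z0-9+/=]+)', embed_page)
        if not mobj:
            raise ValueError('video_ext not found')
        video_ext = base64.b64decode(mobj.group(1)).decode()
    # The third field is unused by the extractor.
    video_id, sig, _, access_token = video_ext.split(':')
    return video_id, sig, access_token
```

Decoding before splitting lets a single `split(':')` serve both sources, where the old code split the raw cookie on `%3A`.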
diff --git a/youtube_dl/extractor/bitchute.py b/youtube_dlc/extractor/bitchute.py

similarity index 91%

rename from youtube_dl/extractor/bitchute.py

rename to youtube_dlc/extractor/bitchute.py

index 0c773e66e1c7349802b4bb8425e844163d18da75..92fc70b5aa910d3d17acaaaafaf16484d5499788 100644
--- a/youtube_dl/extractor/bitchute.py
+++ b/youtube_dlc/extractor/bitchute.py
@@ -6,6 +6,8 @@
  
  from .common import InfoExtractor
  from ..utils import (
+    ExtractorError,
+    GeoRestrictedError,
      orderedSet,
      unified_strdate,
      urlencode_postdata,
@@ -59,8 +61,14 @@ def _real_extract(self, url):
              for format_url in orderedSet(format_urls)]
  
          if not formats:
-            formats = self._parse_html5_media_entries(
-                url, webpage, video_id)[0]['formats']
+            entries = self._parse_html5_media_entries(
+                url, webpage, video_id)
+            if not entries:
+                error = self._html_search_regex(r'<h1 class="page-title">([^<]+)</h1>', webpage, 'error', default='Cannot find video')
+                if error == 'Video Unavailable':
+                    raise GeoRestrictedError(error)
+                raise ExtractorError(error)
+            formats = entries[0]['formats']
  
          self._check_formats(formats, video_id)
          self._sort_formats(formats)
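Note on the BitChute hunk above: when no HTML5 media entries are found, the page title decides which error is raised instead of letting `[0]` fail with an IndexError. The sketch below isolates that decision. The two exception classes are hypothetical stand-ins for `ExtractorError` and `GeoRestrictedError`, `classify_missing_media` is a hypothetical helper, and a plain `re.search` replaces `_html_search_regex`.

```python
import re


class ExtractionFailed(Exception):
    """Stand-in for youtube-dl's ExtractorError."""


class GeoRestricted(ExtractionFailed):
    """Stand-in for youtube-dl's GeoRestrictedError."""


def classify_missing_media(webpage):
    """Return the exception to raise when a page has no playable media."""
    mobj = re.search(r'<h1 class="page-title">([^<]+)</h1>', webpage)
    # Same default message as the diff when the page has no title block.
    error = mobj.group(1).strip() if mobj else 'Cannot find video'
    if error == 'Video Unavailable':
        return GeoRestricted(error)
    return ExtractionFailed(error)
```

A caller would `raise classify_missing_media(webpage)` only after the entries list comes back empty.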
diff --git a/youtube_dl/extractor/bleacherreport.py b/youtube_dlc/extractor/bleacherreport.py

similarity index 100%

rename from youtube_dl/extractor/bleacherreport.py

rename to youtube_dlc/extractor/bleacherreport.py
diff --git a/youtube_dl/extractor/blinkx.py b/youtube_dlc/extractor/blinkx.py

similarity index 100%

rename from youtube_dl/extractor/blinkx.py

rename to youtube_dlc/extractor/blinkx.py
diff --git a/youtube_dl/extractor/bloomberg.py b/youtube_dlc/extractor/bloomberg.py

similarity index 100%

rename from youtube_dl/extractor/bloomberg.py

rename to youtube_dlc/extractor/bloomberg.py
diff --git a/youtube_dl/extractor/bokecc.py b/youtube_dlc/extractor/bokecc.py

similarity index 100%

rename from youtube_dl/extractor/bokecc.py

rename to youtube_dlc/extractor/bokecc.py
diff --git a/youtube_dl/extractor/bostonglobe.py b/youtube_dlc/extractor/bostonglobe.py

similarity index 100%

rename from youtube_dl/extractor/bostonglobe.py

rename to youtube_dlc/extractor/bostonglobe.py
diff --git a/youtube_dl/extractor/bpb.py b/youtube_dlc/extractor/bpb.py

similarity index 100%

rename from youtube_dl/extractor/bpb.py

rename to youtube_dlc/extractor/bpb.py
diff --git a/youtube_dl/extractor/br.py b/youtube_dlc/extractor/br.py

similarity index 100%

rename from youtube_dl/extractor/br.py

rename to youtube_dlc/extractor/br.py
diff --git a/youtube_dl/extractor/bravotv.py b/youtube_dlc/extractor/bravotv.py

similarity index 100%

rename from youtube_dl/extractor/bravotv.py

rename to youtube_dlc/extractor/bravotv.py
diff --git a/youtube_dl/extractor/breakcom.py b/youtube_dlc/extractor/breakcom.py

similarity index 100%

rename from youtube_dl/extractor/breakcom.py

rename to youtube_dlc/extractor/breakcom.py
diff --git a/youtube_dl/extractor/brightcove.py b/youtube_dlc/extractor/brightcove.py

similarity index 90%

rename from youtube_dl/extractor/brightcove.py

rename to youtube_dlc/extractor/brightcove.py

index 8e2f7217ab85a81a58d1bb902af02b6e62ec2ab6..2aa9f4782e0dfdb2b78225c2d1fe83a8568effe3 100644
--- a/youtube_dl/extractor/brightcove.py
+++ b/youtube_dlc/extractor/brightcove.py
@@ -5,32 +5,34 @@
  import re
  import struct
  
-from .common import InfoExtractor
  from .adobepass import AdobePassIE
+from .common import InfoExtractor
  from ..compat import (
      compat_etree_fromstring,
+    compat_HTTPError,
      compat_parse_qs,
      compat_urllib_parse_urlparse,
      compat_urlparse,
      compat_xml_parse_error,
-    compat_HTTPError,
  )
  from ..utils import (
-    ExtractorError,
+    clean_html,
      extract_attributes,
+    ExtractorError,
      find_xpath_attr,
      fix_xml_ampersands,
      float_or_none,
-    js_to_json,
      int_or_none,
+    js_to_json,
+    mimetype2ext,
      parse_iso8601,
      smuggle_url,
+    str_or_none,
      unescapeHTML,
      unsmuggle_url,
-    update_url_query,
-    clean_html,
-    mimetype2ext,
      UnsupportedError,
+    update_url_query,
+    url_or_none,
  )
  
  
@@ -424,7 +426,7 @@ def _extract_urls(ie, webpage):
          # [2] looks like:
          for video, script_tag, account_id, player_id, embed in re.findall(
                  r'''(?isx)
-                    (<video\s+[^>]*\bdata-video-id\s*=\s*['"]?[^>]+>)
+                    (<video(?:-js)?\s+[^>]*\bdata-video-id\s*=\s*['"]?[^>]+>)
                      (?:.*?
                          (<script[^>]+
                              src=["\'](?:https?:)?//players\.brightcove\.net/
@@ -553,10 +555,16 @@ def build_format_id(kind):
  
          subtitles = {}
          for text_track in json_data.get('text_tracks', []):
-            if text_track.get('src'):
-                subtitles.setdefault(text_track.get('srclang'), []).append({
-                    'url': text_track['src'],
-                })
+            if text_track.get('kind') != 'captions':
+                continue
+            text_track_url = url_or_none(text_track.get('src'))
+            if not text_track_url:
+                continue
+            lang = (str_or_none(text_track.get('srclang'))
+                    or str_or_none(text_track.get('label')) or 'en').lower()
+            subtitles.setdefault(lang, []).append({
+                'url': text_track_url,
+            })
  
          is_live = False
          duration = float_or_none(json_data.get('duration'), 1000)
@@ -586,45 +594,63 @@ def _real_extract(self, url):
  
          account_id, player_id, embed, content_type, video_id = re.match(self._VALID_URL, url).groups()
  
-        webpage = self._download_webpage(
-            'http://players.brightcove.net/%s/%s_%s/index.min.js'
-            % (account_id, player_id, embed), video_id)
+        policy_key_id = '%s_%s' % (account_id, player_id)
+        policy_key = self._downloader.cache.load('brightcove', policy_key_id)
+        policy_key_extracted = False
+        store_pk = lambda x: self._downloader.cache.store('brightcove', policy_key_id, x)
  
-        policy_key = None
+        def extract_policy_key():
+            webpage = self._download_webpage(
+                'http://players.brightcove.net/%s/%s_%s/index.min.js'
+                % (account_id, player_id, embed), video_id)
  
-        catalog = self._search_regex(
-            r'catalog\(({.+?})\);', webpage, 'catalog', default=None)
-        if catalog:
-            catalog = self._parse_json(
-                js_to_json(catalog), video_id, fatal=False)
+            policy_key = None
+
+            catalog = self._search_regex(
+                r'catalog\(({.+?})\);', webpage, 'catalog', default=None)
              if catalog:
-                policy_key = catalog.get('policyKey')
+                catalog = self._parse_json(
+                    js_to_json(catalog), video_id, fatal=False)
+                if catalog:
+                    policy_key = catalog.get('policyKey')
+
+            if not policy_key:
+                policy_key = self._search_regex(
+                    r'policyKey\s*:\s*(["\'])(?P<pk>.+?)\1',
+                    webpage, 'policy key', group='pk')
  
-        if not policy_key:
-            policy_key = self._search_regex(
-                r'policyKey\s*:\s*(["\'])(?P<pk>.+?)\1',
-                webpage, 'policy key', group='pk')
+            store_pk(policy_key)
+            return policy_key
  
          api_url = 'https://edge.api.brightcove.com/playback/v1/accounts/%s/%ss/%s' % (account_id, content_type, video_id)
-        headers = {
-            'Accept': 'application/json;pk=%s' % policy_key,
-        }
+        headers = {}
          referrer = smuggled_data.get('referrer')
          if referrer:
              headers.update({
                  'Referer': referrer,
                  'Origin': re.search(r'https?://[^/]+', referrer).group(0),
              })
-        try:
-            json_data = self._download_json(api_url, video_id, headers=headers)
-        except ExtractorError as e:
-            if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
-                json_data = self._parse_json(e.cause.read().decode(), video_id)[0]
-                message = json_data.get('message') or json_data['error_code']
-                if json_data.get('error_subcode') == 'CLIENT_GEO':
-                    self.raise_geo_restricted(msg=message)
-                raise ExtractorError(message, expected=True)
-            raise
+
+        for _ in range(2):
+            if not policy_key:
+                policy_key = extract_policy_key()
+                policy_key_extracted = True
+            headers['Accept'] = 'application/json;pk=%s' % policy_key
+            try:
+                json_data = self._download_json(api_url, video_id, headers=headers)
+                break
+            except ExtractorError as e:
+                if isinstance(e.cause, compat_HTTPError) and e.cause.code in (401, 403):
+                    json_data = self._parse_json(e.cause.read().decode(), video_id)[0]
+                    message = json_data.get('message') or json_data['error_code']
+                    if json_data.get('error_subcode') == 'CLIENT_GEO':
+                        self.raise_geo_restricted(msg=message)
+                    elif json_data.get('error_code') == 'INVALID_POLICY_KEY' and not policy_key_extracted:
+                        policy_key = None
+                        store_pk(None)
+                        continue
+                    raise ExtractorError(message, expected=True)
+                raise
  
          errors = json_data.get('errors')
          if errors and errors[0].get('error_subcode') == 'TVE_AUTH':
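Note on the retry loop in the hunk above: it tries the cached policy key once and, if the server reports it as invalid, clears it, extracts a fresh key, and tries exactly one more time. Below is a minimal standalone sketch of that pattern, not the extractor's actual code. `InvalidKeyError`, `call_with_key_retry`, the `fetch` and `extract_key` callables, and the plain dict used as a cache are all hypothetical stand-ins. Storing the fresh key in the cache is also an assumption of the sketch.

```python
class InvalidKeyError(Exception):
    """Hypothetical error raised when the server rejects the key."""


def call_with_key_retry(fetch, extract_key, cache):
    """Call fetch(key); retry once with a fresh key if a cached key is rejected."""
    key = cache.get('pk')
    freshly_extracted = False
    for _ in range(2):
        if not key:
            # No usable cached key: extract a new one and remember it (assumption).
            key = extract_key()
            cache['pk'] = key
            freshly_extracted = True
        try:
            return fetch(key)
        except InvalidKeyError:
            if freshly_extracted:
                # A freshly extracted key was rejected too; retrying would not help.
                raise
            # The cached key is stale: drop it and let the next pass re-extract.
            key = None
            cache.pop('pk', None)
    raise RuntimeError('unreachable: the second pass always returns or raises')
```

The `freshly_extracted` flag plays the role of `policy_key_extracted` in the diff: it bounds the loop so that a key rejected immediately after extraction surfaces as an error instead of triggering another attempt.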
diff --git a/youtube_dl/extractor/businessinsider.py b/youtube_dlc/extractor/businessinsider.py

similarity index 59%

rename from youtube_dl/extractor/businessinsider.py

rename to youtube_dlc/extractor/businessinsider.py

index dfcf9bc6b50b9274d2e45ff7e0b6d1af9920cab0..73a57b1e4db835ab09ac308704bd105d796628ac 100644 (file)
--- a/youtube_dl/extractor/businessinsider.py
+++ b/youtube_dlc/extractor/businessinsider.py
@@ -9,21 +9,26 @@ class BusinessInsiderIE(InfoExtractor):
      _VALID_URL = r'https?://(?:[^/]+\.)?businessinsider\.(?:com|nl)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
      _TESTS = [{
          'url': 'http://uk.businessinsider.com/how-much-radiation-youre-exposed-to-in-everyday-life-2016-6',
-        'md5': 'ca237a53a8eb20b6dc5bd60564d4ab3e',
+        'md5': 'ffed3e1e12a6f950aa2f7d83851b497a',
          'info_dict': {
-            'id': 'hZRllCfw',
+            'id': 'cjGDb0X9',
              'ext': 'mp4',
-            'title': "Here's how much radiation you're exposed to in everyday life",
-            'description': 'md5:9a0d6e2c279948aadaa5e84d6d9b99bd',
-            'upload_date': '20170709',
-            'timestamp': 1499606400,
-        },
-        'params': {
-            'skip_download': True,
+            'title': "Bananas give you more radiation exposure than living next to a nuclear power plant",
+            'description': 'md5:0175a3baf200dd8fa658f94cade841b3',
+            'upload_date': '20160611',
+            'timestamp': 1465675620,
          },
      }, {
          'url': 'https://www.businessinsider.nl/5-scientifically-proven-things-make-you-less-attractive-2017-7/',
-        'only_matching': True,
+        'md5': '43f438dbc6da0b89f5ac42f68529d84a',
+        'info_dict': {
+            'id': '5zJwd4FK',
+            'ext': 'mp4',
+            'title': 'Deze dingen zorgen ervoor dat je minder snel een date scoort',
+            'description': 'md5:2af8975825d38a4fed24717bbe51db49',
+            'upload_date': '20170705',
+            'timestamp': 1499270528,
+        },
      }, {
          'url': 'http://www.businessinsider.com/excel-index-match-vlookup-video-how-to-2015-2?IR=T',
          'only_matching': True,
@@ -35,7 +40,8 @@ def _real_extract(self, url):
          jwplatform_id = self._search_regex(
              (r'data-media-id=["\']([a-zA-Z0-9]{8})',
               r'id=["\']jwplayer_([a-zA-Z0-9]{8})',
-             r'id["\']?\s*:\s*["\']?([a-zA-Z0-9]{8})'),
+             r'id["\']?\s*:\s*["\']?([a-zA-Z0-9]{8})',
+             r'(?:jwplatform\.com/players/|jwplayer_)([a-zA-Z0-9]{8})'),
              webpage, 'jwplatform id')
          return self.url_result(
              'jwplatform:%s' % jwplatform_id, ie=JWPlatformIE.ie_key(),
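Note on the pattern tuple in the hunk above: the patterns are tried in order and the first match wins, so the newly added pattern only acts as a last-resort fallback. The sketch below reproduces that behaviour with plain `re` and the four patterns copied from the diff. `find_media_id` is a hypothetical helper name.

```python
import re

# The four patterns from the hunk, in the same order.
PATTERNS = (
    r'data-media-id=["\']([a-zA-Z0-9]{8})',
    r'id=["\']jwplayer_([a-zA-Z0-9]{8})',
    r'id["\']?\s*:\s*["\']?([a-zA-Z0-9]{8})',
    r'(?:jwplatform\.com/players/|jwplayer_)([a-zA-Z0-9]{8})',
)


def find_media_id(webpage):
    """Return the first 8-character media id matched by any pattern, else None."""
    for pattern in PATTERNS:
        mobj = re.search(pattern, webpage)
        if mobj:
            return mobj.group(1)
    return None
```

Because the more specific patterns come first, a page that still carries a `data-media-id` attribute is handled exactly as before the change.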
diff --git a/youtube_dl/extractor/buzzfeed.py b/youtube_dlc/extractor/buzzfeed.py

similarity index 100%

rename from youtube_dl/extractor/buzzfeed.py

rename to youtube_dlc/extractor/buzzfeed.py
diff --git a/youtube_dl/extractor/byutv.py b/youtube_dlc/extractor/byutv.py

similarity index 100%

rename from youtube_dl/extractor/byutv.py

rename to youtube_dlc/extractor/byutv.py
diff --git a/youtube_dl/extractor/c56.py b/youtube_dlc/extractor/c56.py

similarity index 100%

rename from youtube_dl/extractor/c56.py

rename to youtube_dlc/extractor/c56.py
diff --git a/youtube_dl/extractor/camdemy.py b/youtube_dlc/extractor/camdemy.py

similarity index 100%

rename from youtube_dl/extractor/camdemy.py

rename to youtube_dlc/extractor/camdemy.py
diff --git a/youtube_dl/extractor/cammodels.py b/youtube_dlc/extractor/cammodels.py

similarity index 100%

rename from youtube_dl/extractor/cammodels.py

rename to youtube_dlc/extractor/cammodels.py
diff --git a/youtube_dl/extractor/camtube.py b/youtube_dlc/extractor/camtube.py

similarity index 100%

rename from youtube_dl/extractor/camtube.py

rename to youtube_dlc/extractor/camtube.py
diff --git a/youtube_dl/extractor/camwithher.py b/youtube_dlc/extractor/camwithher.py

similarity index 100%

rename from youtube_dl/extractor/camwithher.py

rename to youtube_dlc/extractor/camwithher.py
diff --git a/youtube_dl/extractor/canalc2.py b/youtube_dlc/extractor/canalc2.py

similarity index 100%

rename from youtube_dl/extractor/canalc2.py

rename to youtube_dlc/extractor/canalc2.py
diff --git a/youtube_dl/extractor/canalplus.py b/youtube_dlc/extractor/canalplus.py

similarity index 100%

rename from youtube_dl/extractor/canalplus.py

rename to youtube_dlc/extractor/canalplus.py
diff --git a/youtube_dl/extractor/canvas.py b/youtube_dlc/extractor/canvas.py

similarity index 79%

rename from youtube_dl/extractor/canvas.py

rename to youtube_dlc/extractor/canvas.py

index c506bc5dd2402a95752bdf3223fe4a24cf9d06ae..8667a0d0457cccfc145cc52bc1eb1c7816aa04b8 100644 (file)
--- a/youtube_dl/extractor/canvas.py
+++ b/youtube_dlc/extractor/canvas.py
@@ -13,6 +13,8 @@
      int_or_none,
      merge_dicts,
      parse_iso8601,
+    str_or_none,
+    url_or_none,
  )
  
  
@@ -20,15 +22,15 @@ class CanvasIE(InfoExtractor):
      _VALID_URL = r'https?://mediazone\.vrt\.be/api/v1/(?P<site_id>canvas|een|ketnet|vrt(?:video|nieuws)|sporza)/assets/(?P<id>[^/?#&]+)'
      _TESTS = [{
          'url': 'https://mediazone.vrt.be/api/v1/ketnet/assets/md-ast-4ac54990-ce66-4d00-a8ca-9eac86f4c475',
-        'md5': '90139b746a0a9bd7bb631283f6e2a64e',
+        'md5': '68993eda72ef62386a15ea2cf3c93107',
          'info_dict': {
              'id': 'md-ast-4ac54990-ce66-4d00-a8ca-9eac86f4c475',
              'display_id': 'md-ast-4ac54990-ce66-4d00-a8ca-9eac86f4c475',
-            'ext': 'flv',
+            'ext': 'mp4',
              'title': 'Nachtwacht: De Greystook',
-            'description': 'md5:1db3f5dc4c7109c821261e7512975be7',
+            'description': 'Nachtwacht: De Greystook',
              'thumbnail': r're:^https?://.*\.jpg$',
-            'duration': 1468.03,
+            'duration': 1468.04,
          },
          'expected_warnings': ['is not a supported codec', 'Unknown MIME type'],
      }, {
@@ -39,23 +41,45 @@ class CanvasIE(InfoExtractor):
          'HLS': 'm3u8_native',
          'HLS_AES': 'm3u8',
      }
+    _REST_API_BASE = 'https://media-services-public.vrt.be/vualto-video-aggregator-web/rest/external/v1'
  
      def _real_extract(self, url):
          mobj = re.match(self._VALID_URL, url)
          site_id, video_id = mobj.group('site_id'), mobj.group('id')
  
+        # Old API endpoint, serves more formats but may fail for some videos
          data = self._download_json(
              'https://mediazone.vrt.be/api/v1/%s/assets/%s'
-            % (site_id, video_id), video_id)
+            % (site_id, video_id), video_id, 'Downloading asset JSON',
+            'Unable to download asset JSON', fatal=False)
+
+        # New API endpoint
+        if not data:
+            token = self._download_json(
+                '%s/tokens' % self._REST_API_BASE, video_id,
+                'Downloading token', data=b'',
+                headers={'Content-Type': 'application/json'})['vrtPlayerToken']
+            data = self._download_json(
+                '%s/videos/%s' % (self._REST_API_BASE, video_id),
+                video_id, 'Downloading video JSON', fatal=False, query={
+                    'vrtPlayerToken': token,
+                    'client': '%s@PROD' % site_id,
+                }, expected_status=400)
+            message = data.get('message')
+            if message and not data.get('title'):
+                if data.get('code') == 'AUTHENTICATION_REQUIRED':
+                    self.raise_login_required(message)
+                raise ExtractorError(message, expected=True)
  
          title = data['title']
          description = data.get('description')
  
          formats = []
          for target in data['targetUrls']:
-            format_url, format_type = target.get('url'), target.get('type')
+            format_url, format_type = url_or_none(target.get('url')), str_or_none(target.get('type'))
              if not format_url or not format_type:
                  continue
+            format_type = format_type.upper()
              if format_type in self._HLS_ENTRY_PROTOCOLS_MAP:
                  formats.extend(self._extract_m3u8_formats(
                      format_url, video_id, 'mp4', self._HLS_ENTRY_PROTOCOLS_MAP[format_type],
@@ -134,20 +158,20 @@ class CanvasEenIE(InfoExtractor):
          },
          'skip': 'Pagina niet gevonden',
      }, {
-        'url': 'https://www.een.be/sorry-voor-alles/herbekijk-sorry-voor-alles',
+        'url': 'https://www.een.be/thuis/emma-pakt-thilly-aan',
          'info_dict': {
-            'id': 'mz-ast-11a587f8-b921-4266-82e2-0bce3e80d07f',
-            'display_id': 'herbekijk-sorry-voor-alles',
+            'id': 'md-ast-3a24ced2-64d7-44fb-b4ed-ed1aafbf90b8',
+            'display_id': 'emma-pakt-thilly-aan',
              'ext': 'mp4',
-            'title': 'Herbekijk Sorry voor alles',
-            'description': 'md5:8bb2805df8164e5eb95d6a7a29dc0dd3',
+            'title': 'Emma pakt Thilly aan',
+            'description': 'md5:c5c9b572388a99b2690030afa3f3bad7',
              'thumbnail': r're:^https?://.*\.jpg$',
-            'duration': 3788.06,
+            'duration': 118.24,
          },
          'params': {
              'skip_download': True,
          },
-        'skip': 'Episode no longer available',
+        'expected_warnings': ['is not a supported codec'],
      }, {
          'url': 'https://www.canvas.be/check-point/najaar-2016/de-politie-uw-vriend',
          'only_matching': True,
@@ -183,19 +207,44 @@ class VrtNUIE(GigyaBaseIE):
      IE_DESC = 'VrtNU.be'
      _VALID_URL = r'https?://(?:www\.)?vrt\.be/(?P<site_id>vrtnu)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
      _TESTS = [{
+        # Available via old API endpoint
          'url': 'https://www.vrt.be/vrtnu/a-z/postbus-x/1/postbus-x-s1a1/',
          'info_dict': {
              'id': 'pbs-pub-2e2d8c27-df26-45c9-9dc6-90c78153044d$vid-90c932b1-e21d-4fb8-99b1-db7b49cf74de',
-            'ext': 'flv',
+            'ext': 'mp4',
              'title': 'De zwarte weduwe',
-            'description': 'md5:d90c21dced7db869a85db89a623998d4',
+            'description': 'md5:db1227b0f318c849ba5eab1fef895ee4',
              'duration': 1457.04,
              'thumbnail': r're:^https?://.*\.jpg$',
-            'season': '1',
+            'season': 'Season 1',
              'season_number': 1,
              'episode_number': 1,
          },
-        'skip': 'This video is only available for registered users'
+        'skip': 'This video is only available for registered users',
+        'params': {
+            'username': '<snip>',
+            'password': '<snip>',
+        },
+        'expected_warnings': ['is not a supported codec'],
+    }, {
+        # Only available via new API endpoint
+        'url': 'https://www.vrt.be/vrtnu/a-z/kamp-waes/1/kamp-waes-s1a5/',
+        'info_dict': {
+            'id': 'pbs-pub-0763b56c-64fb-4d38-b95b-af60bf433c71$vid-ad36a73c-4735-4f1f-b2c0-a38e6e6aa7e1',
+            'ext': 'mp4',
+            'title': 'Aflevering 5',
+            'description': 'Wie valt door de mand tijdens een missie?',
+            'duration': 2967.06,
+            'season': 'Season 1',
+            'season_number': 1,
+            'episode_number': 5,
+        },
+        'skip': 'This video is only available for registered users',
+        'params': {
+            'username': '<snip>',
+            'password': '<snip>',
+        },
+        'expected_warnings': ['Unable to download asset JSON', 'is not a supported codec', 'Unknown MIME type'],
      }]
      _NETRC_MACHINE = 'vrtnu'
      _APIKEY = '3_0Z2HujMtiWq_pkAjgnS2Md2E11a1AwZjYiBETtwNE-EoEHDINgtnvcAOpNgmrVGy'
diff --git a/youtube_dl/extractor/carambatv.py b/youtube_dlc/extractor/carambatv.py

similarity index 100%

rename from youtube_dl/extractor/carambatv.py

rename to youtube_dlc/extractor/carambatv.py
diff --git a/youtube_dl/extractor/cartoonnetwork.py b/youtube_dlc/extractor/cartoonnetwork.py

similarity index 100%

rename from youtube_dl/extractor/cartoonnetwork.py

rename to youtube_dlc/extractor/cartoonnetwork.py
diff --git a/youtube_dl/extractor/cbc.py b/youtube_dlc/extractor/cbc.py

similarity index 90%

rename from youtube_dl/extractor/cbc.py

rename to youtube_dlc/extractor/cbc.py

index 751a3a8f26c94ecb19c130503593515312bac6c7..fd5ec6033b80513012cf2615fc56e80c7e82cadc 100644 (file)
--- a/youtube_dl/extractor/cbc.py
+++ b/youtube_dlc/extractor/cbc.py
@@ -1,8 +1,10 @@
  # coding: utf-8
  from __future__ import unicode_literals
  
+import hashlib
  import json
  import re
+from xml.sax.saxutils import escape
  
  from .common import InfoExtractor
  from ..compat import (
@@ -216,6 +218,29 @@ class CBCWatchBaseIE(InfoExtractor):
          'clearleap': 'http://www.clearleap.com/namespace/clearleap/1.0/',
      }
      _GEO_COUNTRIES = ['CA']
+    _LOGIN_URL = 'https://api.loginradius.com/identity/v2/auth/login'
+    _TOKEN_URL = 'https://cloud-api.loginradius.com/sso/jwt/api/token'
+    _API_KEY = '3f4beddd-2061-49b0-ae80-6f1f2ed65b37'
+    _NETRC_MACHINE = 'cbcwatch'
+
+    def _signature(self, email, password):
+        data = json.dumps({
+            'email': email,
+            'password': password,
+        }).encode()
+        headers = {'content-type': 'application/json'}
+        query = {'apikey': self._API_KEY}
+        resp = self._download_json(self._LOGIN_URL, None, data=data, headers=headers, query=query)
+        access_token = resp['access_token']
+
+        # token
+        query = {
+            'access_token': access_token,
+            'apikey': self._API_KEY,
+            'jwtapp': 'jwt',
+        }
+        resp = self._download_json(self._TOKEN_URL, None, headers=headers, query=query)
+        return resp['signature']
  
      def _call_api(self, path, video_id):
          url = path if path.startswith('http') else self._API_BASE_URL + path
@@ -239,7 +264,8 @@ def _call_api(self, path, video_id):
      def _real_initialize(self):
          if self._valid_device_token():
              return
-        device = self._downloader.cache.load('cbcwatch', 'device') or {}
+        device = self._downloader.cache.load(
+            'cbcwatch', self._cache_device_key()) or {}
          self._device_id, self._device_token = device.get('id'), device.get('token')
          if self._valid_device_token():
              return
@@ -248,16 +274,30 @@ def _real_initialize(self):
      def _valid_device_token(self):
          return self._device_id and self._device_token
  
+    def _cache_device_key(self):
+        email, _ = self._get_login_info()
+        return '%s_device' % hashlib.sha256(email.encode()).hexdigest() if email else 'device'
+
      def _register_device(self):
-        self._device_id = self._device_token = None
          result = self._download_xml(
              self._API_BASE_URL + 'device/register',
              None, 'Acquiring device token',
              data=b'<device><type>web</type></device>')
          self._device_id = xpath_text(result, 'deviceId', fatal=True)
-        self._device_token = xpath_text(result, 'deviceToken', fatal=True)
+        email, password = self._get_login_info()
+        if email and password:
+            signature = self._signature(email, password)
+            data = '<login><token>{0}</token><device><deviceId>{1}</deviceId><type>web</type></device></login>'.format(
+                escape(signature), escape(self._device_id)).encode()
+            url = self._API_BASE_URL + 'device/login'
+            result = self._download_xml(
+                url, None, data=data,
+                headers={'content-type': 'application/xml'})
+            self._device_token = xpath_text(result, 'token', fatal=True)
+        else:
+            self._device_token = xpath_text(result, 'deviceToken', fatal=True)
          self._downloader.cache.store(
-            'cbcwatch', 'device', {
+            'cbcwatch', self._cache_device_key(), {
                  'id': self._device_id,
                  'token': self._device_token,
              })
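Note on `_cache_device_key` in the hunk above: the device token is now cached under a key derived from the account e-mail, so tokens obtained for different accounts, or for an anonymous session, no longer overwrite each other. A minimal standalone sketch of the key derivation follows. The free function `cache_device_key` is a hypothetical name, while the hashing expression mirrors the one in the diff.

```python
import hashlib


def cache_device_key(email=None):
    """Return a per-account cache key, or the shared key for anonymous use."""
    if not email:
        return 'device'
    # Hash the e-mail so the address itself never appears in the cache key.
    return '%s_device' % hashlib.sha256(email.encode()).hexdigest()
```

Hashing rather than embedding the raw address keeps the cache key free of personal data while still being stable for a given account.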
diff --git a/youtube_dl/extractor/cbs.py b/youtube_dlc/extractor/cbs.py

similarity index 100%

rename from youtube_dl/extractor/cbs.py

rename to youtube_dlc/extractor/cbs.py
diff --git a/youtube_dl/extractor/cbsinteractive.py b/youtube_dlc/extractor/cbsinteractive.py

similarity index 100%

rename from youtube_dl/extractor/cbsinteractive.py

rename to youtube_dlc/extractor/cbsinteractive.py
diff --git a/youtube_dl/extractor/cbslocal.py b/youtube_dlc/extractor/cbslocal.py

similarity index 100%

rename from youtube_dl/extractor/cbslocal.py

rename to youtube_dlc/extractor/cbslocal.py
diff --git a/youtube_dl/extractor/cbsnews.py b/youtube_dlc/extractor/cbsnews.py

similarity index 100%

rename from youtube_dl/extractor/cbsnews.py

rename to youtube_dlc/extractor/cbsnews.py
diff --git a/youtube_dl/extractor/cbssports.py b/youtube_dlc/extractor/cbssports.py

similarity index 100%

rename from youtube_dl/extractor/cbssports.py

rename to youtube_dlc/extractor/cbssports.py
diff --git a/youtube_dl/extractor/ccc.py b/youtube_dlc/extractor/ccc.py

similarity index 100%

rename from youtube_dl/extractor/ccc.py

rename to youtube_dlc/extractor/ccc.py
diff --git a/youtube_dl/extractor/ccma.py b/youtube_dlc/extractor/ccma.py

similarity index 100%

rename from youtube_dl/extractor/ccma.py

rename to youtube_dlc/extractor/ccma.py
diff --git a/youtube_dl/extractor/cctv.py b/youtube_dlc/extractor/cctv.py

similarity index 100%

rename from youtube_dl/extractor/cctv.py

rename to youtube_dlc/extractor/cctv.py
diff --git a/youtube_dl/extractor/cda.py b/youtube_dlc/extractor/cda.py

similarity index 100%

rename from youtube_dl/extractor/cda.py

rename to youtube_dlc/extractor/cda.py
diff --git a/youtube_dl/extractor/ceskatelevize.py b/youtube_dlc/extractor/ceskatelevize.py

similarity index 100%

rename from youtube_dl/extractor/ceskatelevize.py

rename to youtube_dlc/extractor/ceskatelevize.py
diff --git a/youtube_dl/extractor/channel9.py b/youtube_dlc/extractor/channel9.py

similarity index 100%

rename from youtube_dl/extractor/channel9.py

rename to youtube_dlc/extractor/channel9.py
diff --git a/youtube_dl/extractor/charlierose.py b/youtube_dlc/extractor/charlierose.py

similarity index 100%

rename from youtube_dl/extractor/charlierose.py

rename to youtube_dlc/extractor/charlierose.py
diff --git a/youtube_dl/extractor/chaturbate.py b/youtube_dlc/extractor/chaturbate.py

similarity index 100%

rename from youtube_dl/extractor/chaturbate.py

rename to youtube_dlc/extractor/chaturbate.py
diff --git a/youtube_dl/extractor/chilloutzone.py b/youtube_dlc/extractor/chilloutzone.py

similarity index 100%

rename from youtube_dl/extractor/chilloutzone.py

rename to youtube_dlc/extractor/chilloutzone.py
diff --git a/youtube_dl/extractor/chirbit.py b/youtube_dlc/extractor/chirbit.py

similarity index 100%

rename from youtube_dl/extractor/chirbit.py

rename to youtube_dlc/extractor/chirbit.py
diff --git a/youtube_dl/extractor/cinchcast.py b/youtube_dlc/extractor/cinchcast.py

similarity index 100%

rename from youtube_dl/extractor/cinchcast.py

rename to youtube_dlc/extractor/cinchcast.py
diff --git a/youtube_dl/extractor/cinemax.py b/youtube_dlc/extractor/cinemax.py

similarity index 100%

rename from youtube_dl/extractor/cinemax.py

rename to youtube_dlc/extractor/cinemax.py
diff --git a/youtube_dl/extractor/ciscolive.py b/youtube_dlc/extractor/ciscolive.py

similarity index 100%

rename from youtube_dl/extractor/ciscolive.py

rename to youtube_dlc/extractor/ciscolive.py
diff --git a/youtube_dl/extractor/cjsw.py b/youtube_dlc/extractor/cjsw.py

similarity index 100%

rename from youtube_dl/extractor/cjsw.py

rename to youtube_dlc/extractor/cjsw.py
diff --git a/youtube_dl/extractor/cliphunter.py b/youtube_dlc/extractor/cliphunter.py

similarity index 100%

rename from youtube_dl/extractor/cliphunter.py

rename to youtube_dlc/extractor/cliphunter.py
diff --git a/youtube_dl/extractor/clippit.py b/youtube_dlc/extractor/clippit.py

similarity index 100%

rename from youtube_dl/extractor/clippit.py

rename to youtube_dlc/extractor/clippit.py
diff --git a/youtube_dl/extractor/cliprs.py b/youtube_dlc/extractor/cliprs.py

similarity index 100%

rename from youtube_dl/extractor/cliprs.py

rename to youtube_dlc/extractor/cliprs.py
diff --git a/youtube_dl/extractor/clipsyndicate.py b/youtube_dlc/extractor/clipsyndicate.py

similarity index 100%

rename from youtube_dl/extractor/clipsyndicate.py

rename to youtube_dlc/extractor/clipsyndicate.py
diff --git a/youtube_dl/extractor/closertotruth.py b/youtube_dlc/extractor/closertotruth.py

similarity index 100%

rename from youtube_dl/extractor/closertotruth.py

rename to youtube_dlc/extractor/closertotruth.py
diff --git a/youtube_dl/extractor/cloudflarestream.py b/youtube_dlc/extractor/cloudflarestream.py

similarity index 59%

rename from youtube_dl/extractor/cloudflarestream.py

rename to youtube_dlc/extractor/cloudflarestream.py

index 8ff2c6531570ee3a06210832cf967c7033d6ab9f..2fdcfbb3af1fbffb9e66abff56b86e31762ad449 100644 (file)
--- a/youtube_dl/extractor/cloudflarestream.py
+++ b/youtube_dlc/extractor/cloudflarestream.py
@@ -1,20 +1,24 @@
  # coding: utf-8
  from __future__ import unicode_literals
  
+import base64
  import re
  
  from .common import InfoExtractor
  
  
  class CloudflareStreamIE(InfoExtractor):
+    _DOMAIN_RE = r'(?:cloudflarestream\.com|(?:videodelivery|bytehighway)\.net)'
+    _EMBED_RE = r'embed\.%s/embed/[^/]+\.js\?.*?\bvideo=' % _DOMAIN_RE
+    _ID_RE = r'[\da-f]{32}|[\w-]+\.[\w-]+\.[\w-]+'
      _VALID_URL = r'''(?x)
                      https?://
                          (?:
-                            (?:watch\.)?(?:cloudflarestream\.com|videodelivery\.net)/|
-                            embed\.(?:cloudflarestream\.com|videodelivery\.net)/embed/[^/]+\.js\?.*?\bvideo=
+                            (?:watch\.)?%s/|
+                            %s
                          )
-                        (?P<id>[\da-f]+)
-                    '''
+                        (?P<id>%s)
+                    ''' % (_DOMAIN_RE, _EMBED_RE, _ID_RE)
      _TESTS = [{
          'url': 'https://embed.cloudflarestream.com/embed/we4g.fla9.latest.js?video=31c9291ab41fac05471db4e73aa11717',
          'info_dict': {
@@ -41,23 +45,28 @@ def _extract_urls(webpage):
          return [
              mobj.group('url')
              for mobj in re.finditer(
-                r'<script[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//embed\.(?:cloudflarestream\.com|videodelivery\.net)/embed/[^/]+\.js\?.*?\bvideo=[\da-f]+?.*?)\1',
+                r'<script[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//%s(?:%s).*?)\1' % (CloudflareStreamIE._EMBED_RE, CloudflareStreamIE._ID_RE),
                  webpage)]
  
      def _real_extract(self, url):
          video_id = self._match_id(url)
+        domain = 'bytehighway.net' if 'bytehighway.net/' in url else 'videodelivery.net'
+        base_url = 'https://%s/%s/' % (domain, video_id)
+        if '.' in video_id:
+            video_id = self._parse_json(base64.urlsafe_b64decode(
+                video_id.split('.')[1]), video_id)['sub']
+        manifest_base_url = base_url + 'manifest/video.'
  
          formats = self._extract_m3u8_formats(
-            'https://cloudflarestream.com/%s/manifest/video.m3u8' % video_id,
-            video_id, 'mp4', entry_protocol='m3u8_native', m3u8_id='hls',
-            fatal=False)
+            manifest_base_url + 'm3u8', video_id, 'mp4',
+            'm3u8_native', m3u8_id='hls', fatal=False)
          formats.extend(self._extract_mpd_formats(
-            'https://cloudflarestream.com/%s/manifest/video.mpd' % video_id,
-            video_id, mpd_id='dash', fatal=False))
+            manifest_base_url + 'mpd', video_id, mpd_id='dash', fatal=False))
          self._sort_formats(formats)
  
          return {
              'id': video_id,
              'title': video_id,
+            'thumbnail': base_url + 'thumbnails/thumbnail.jpg',
              'formats': formats,
          }
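Note on the token handling in the hunk above: when the matched id contains dots it is treated as a three-segment signed token, and the real video id is read from the `sub` field of the base64url-encoded middle segment. The sketch below shows that decoding step in isolation. `video_id_from_token` is a hypothetical helper name, and the padding step is an addition of this sketch (such tokens usually omit the trailing `=` characters), not something the diff does.

```python
import base64
import json


def video_id_from_token(token):
    """Return the `sub` field from the payload segment of a dotted token."""
    payload = token.split('.')[1]
    # Assumption of this sketch: restore the '=' padding that the token
    # omits, so that urlsafe_b64decode accepts any payload length.
    payload += '=' * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload).decode('utf-8'))['sub']
```

The original dotted string is still what goes into `base_url` in the diff; only the reported `id` and `title` use the decoded value.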
diff --git a/youtube_dl/extractor/cloudy.py b/youtube_dlc/extractor/cloudy.py

similarity index 100%

rename from youtube_dl/extractor/cloudy.py

rename to youtube_dlc/extractor/cloudy.py
diff --git a/youtube_dl/extractor/clubic.py b/youtube_dlc/extractor/clubic.py

similarity index 100%

rename from youtube_dl/extractor/clubic.py

rename to youtube_dlc/extractor/clubic.py
diff --git a/youtube_dl/extractor/clyp.py b/youtube_dlc/extractor/clyp.py

similarity index 100%

rename from youtube_dl/extractor/clyp.py

rename to youtube_dlc/extractor/clyp.py
diff --git a/youtube_dl/extractor/cmt.py b/youtube_dlc/extractor/cmt.py

similarity index 100%

rename from youtube_dl/extractor/cmt.py

rename to youtube_dlc/extractor/cmt.py
diff --git a/youtube_dl/extractor/cnbc.py b/youtube_dlc/extractor/cnbc.py

similarity index 100%

rename from youtube_dl/extractor/cnbc.py

rename to youtube_dlc/extractor/cnbc.py
diff --git a/youtube_dl/extractor/cnn.py b/youtube_dlc/extractor/cnn.py

similarity index 100%

rename from youtube_dl/extractor/cnn.py

rename to youtube_dlc/extractor/cnn.py
diff --git a/youtube_dl/extractor/comedycentral.py b/youtube_dlc/extractor/comedycentral.py

similarity index 100%

rename from youtube_dl/extractor/comedycentral.py

rename to youtube_dlc/extractor/comedycentral.py
diff --git a/youtube_dl/extractor/common.py b/youtube_dlc/extractor/common.py

similarity index 98%

rename from youtube_dl/extractor/common.py

rename to youtube_dlc/extractor/common.py

index eaae5e484f99311ccf018301f773c87f9c8cf544..c1ea5d84603f9e821e789648bac743a7c6d2c9b6 100644 (file)
--- a/youtube_dl/extractor/common.py
+++ b/youtube_dlc/extractor/common.py
@@ -15,7 +15,7 @@
  import math
  
  from ..compat import (
-    compat_cookiejar,
+    compat_cookiejar_Cookie,
      compat_cookies,
      compat_etree_Element,
      compat_etree_fromstring,
@@ -269,7 +269,7 @@ class InfoExtractor(object):
                                       Set to "root" to indicate that this is a
                                       comment to the original video.
      age_limit:      Age restriction for the video, as an integer (years)
-    webpage_url:    The URL to the video webpage, if given to youtube-dl it
+    webpage_url:    The URL to the video webpage, if given to youtube-dlc it
                      should allow to get the same result again. (It will be set
                      by YoutubeDL if it's missing)
      categories:     A list of categories that the video falls in, for example
@@ -1182,16 +1182,33 @@ def _twitter_search_player(self, html):
                                        'twitter card player')
  
      def _search_json_ld(self, html, video_id, expected_type=None, **kwargs):
-        json_ld = self._search_regex(
-            JSON_LD_RE, html, 'JSON-LD', group='json_ld', **kwargs)
+        json_ld_list = list(re.finditer(JSON_LD_RE, html))
          default = kwargs.get('default', NO_DEFAULT)
-        if not json_ld:
-            return default if default is not NO_DEFAULT else {}
          # JSON-LD may be malformed and thus `fatal` should be respected.
          # At the same time `default` may be passed that assumes `fatal=False`
          # for _search_regex. Let's simulate the same behavior here as well.
          fatal = kwargs.get('fatal', True) if default == NO_DEFAULT else False
-        return self._json_ld(json_ld, video_id, fatal=fatal, expected_type=expected_type)
+        json_ld = []
+        for mobj in json_ld_list:
+            json_ld_item = self._parse_json(
+                mobj.group('json_ld'), video_id, fatal=fatal)
+            if not json_ld_item:
+                continue
+            if isinstance(json_ld_item, dict):
+                json_ld.append(json_ld_item)
+            elif isinstance(json_ld_item, (list, tuple)):
+                json_ld.extend(json_ld_item)
+        if json_ld:
+            json_ld = self._json_ld(json_ld, video_id, fatal=fatal, expected_type=expected_type)
+        if json_ld:
+            return json_ld
+        if default is not NO_DEFAULT:
+            return default
+        elif fatal:
+            raise RegexNotFoundError('Unable to extract JSON-LD')
+        else:
+            self._downloader.report_warning('unable to extract JSON-LD %s' % bug_reports_message())
+            return {}
  
      def _json_ld(self, json_ld, video_id, fatal=True, expected_type=None):
          if isinstance(json_ld, compat_str):
@@ -1256,10 +1273,10 @@ def extract_video_object(e):
              extract_interaction_statistic(e)
  
          for e in json_ld:
-            if isinstance(e.get('@context'), compat_str) and re.match(r'^https?://schema.org/?$', e.get('@context')):
+            if '@context' in e:
                  item_type = e.get('@type')
                  if expected_type is not None and expected_type != item_type:
-                    return info
+                    continue
                  if item_type in ('TVEpisode', 'Episode'):
                      episode_name = unescapeHTML(e.get('name'))
                      info.update({
@@ -1293,11 +1310,17 @@ def extract_video_object(e):
                      })
                  elif item_type == 'VideoObject':
                      extract_video_object(e)
-                    continue
+                    if expected_type is None:
+                        continue
+                    else:
+                        break
                  video = e.get('video')
                  if isinstance(video, dict) and video.get('@type') == 'VideoObject':
                      extract_video_object(video)
-                break
+                if expected_type is None:
+                    continue
+                else:
+                    break
          return dict((k, v) for k, v in info.items() if v is not None)
  
      @staticmethod
@@ -1477,7 +1500,7 @@ def _parse_f4m_formats(self, manifest, manifest_url, video_id, preference=None,
          if not isinstance(manifest, compat_etree_Element) and not fatal:
              return []
  
-        # currently youtube-dl cannot decode the playerVerificationChallenge as Akamai uses Adobe Alchemy
+        # currently youtube-dlc cannot decode the playerVerificationChallenge as Akamai uses Adobe Alchemy
          akamai_pv = manifest.find('{http://ns.adobe.com/f4m/1.0}pv-2.0')
          if akamai_pv is not None and ';' in akamai_pv.text:
              playerVerificationChallenge = akamai_pv.text.split(';')[0]
@@ -2340,6 +2363,8 @@ def _extract_ism_formats(self, ism_url, video_id, ism_id=None, note=None, errnot
          if res is False:
              return []
          ism_doc, urlh = res
+        if ism_doc is None:
+            return []
  
          return self._parse_ism_formats(ism_doc, urlh.geturl(), ism_id)
  
@@ -2818,7 +2843,7 @@ def _float(self, v, name, fatal=False, **kwargs):
  
      def _set_cookie(self, domain, name, value, expire_time=None, port=None,
                      path='/', secure=False, discard=False, rest={}, **kwargs):
-        cookie = compat_cookiejar.Cookie(
+        cookie = compat_cookiejar_Cookie(
              0, name, value, port, port is not None, domain, True,
              domain.startswith('.'), path, True, secure, expire_time,
              discard, None, None, rest)
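Note on the `_search_json_ld` change in the hunks above: instead of parsing only the first JSON-LD block, the new code walks every match, parses each one, and flattens dicts and lists into a single list before handing it to `_json_ld`. The sketch below illustrates the collection step on its own. `SCRIPT_RE` is a simplified stand-in (the real `JSON_LD_RE` is not shown in this diff), `collect_json_ld` is a hypothetical name, and silently skipping malformed blocks corresponds only to the non-fatal path.

```python
import json
import re

# Simplified stand-in for JSON_LD_RE, which is defined elsewhere in common.py.
SCRIPT_RE = (r'(?is)<script[^>]+type=(["\'])application/ld\+json\1[^>]*>'
             r'(?P<json_ld>.+?)</script>')


def collect_json_ld(html):
    """Parse every JSON-LD script block and flatten the results into one list."""
    items = []
    for mobj in re.finditer(SCRIPT_RE, html):
        try:
            item = json.loads(mobj.group('json_ld'))
        except ValueError:
            # Malformed block: skip it (non-fatal behaviour only).
            continue
        if not item:
            continue
        if isinstance(item, dict):
            items.append(item)
        elif isinstance(item, (list, tuple)):
            items.extend(item)
    return items
```

With all blocks collected, the later loop in `_json_ld` can `continue` past entries of the wrong `@type` rather than returning early, which is what the `return info` to `continue` change in the diff relies on.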
diff --git a/youtube_dl/extractor/commonmistakes.py b/youtube_dlc/extractor/commonmistakes.py

similarity index 92%

rename from youtube_dl/extractor/commonmistakes.py

rename to youtube_dlc/extractor/commonmistakes.py

index 7e12499b1e83ab39026fb0448673dfdacddaf968..933b89eb3e5e649acb98764b7dcb3aaa3c78db16 100644 (file)
--- a/youtube_dl/extractor/commonmistakes.py
+++ b/youtube_dlc/extractor/commonmistakes.py
@@ -22,12 +22,12 @@ class CommonMistakesIE(InfoExtractor):
  
      def _real_extract(self, url):
          msg = (
-            'You\'ve asked youtube-dl to download the URL "%s". '
+            'You\'ve asked youtube-dlc to download the URL "%s". '
              'That doesn\'t make any sense. '
              'Simply remove the parameter in your command or configuration.'
          ) % url
          if not self._downloader.params.get('verbose'):
-            msg += ' Add -v to the command line to see what arguments and configuration youtube-dl got.'
+            msg += ' Add -v to the command line to see what arguments and configuration youtube-dlc got.'
          raise ExtractorError(msg, expected=True)
  
  
diff --git a/youtube_dl/extractor/commonprotocols.py b/youtube_dlc/extractor/commonprotocols.py

similarity index 100%

rename from youtube_dl/extractor/commonprotocols.py

rename to youtube_dlc/extractor/commonprotocols.py
diff --git a/youtube_dl/extractor/condenast.py b/youtube_dlc/extractor/condenast.py

similarity index 100%

rename from youtube_dl/extractor/condenast.py

rename to youtube_dlc/extractor/condenast.py
diff --git a/youtube_dl/extractor/contv.py b/youtube_dlc/extractor/contv.py

similarity index 100%

rename from youtube_dl/extractor/contv.py

rename to youtube_dlc/extractor/contv.py
diff --git a/youtube_dl/extractor/corus.py b/youtube_dlc/extractor/corus.py

similarity index 100%

rename from youtube_dl/extractor/corus.py

rename to youtube_dlc/extractor/corus.py
diff --git a/youtube_dl/extractor/coub.py b/youtube_dlc/extractor/coub.py

similarity index 100%

rename from youtube_dl/extractor/coub.py

rename to youtube_dlc/extractor/coub.py
diff --git a/youtube_dl/extractor/cracked.py b/youtube_dlc/extractor/cracked.py

similarity index 100%

rename from youtube_dl/extractor/cracked.py

rename to youtube_dlc/extractor/cracked.py
diff --git a/youtube_dl/extractor/crackle.py b/youtube_dlc/extractor/crackle.py

similarity index 100%

rename from youtube_dl/extractor/crackle.py

rename to youtube_dlc/extractor/crackle.py
diff --git a/youtube_dl/extractor/crooksandliars.py b/youtube_dlc/extractor/crooksandliars.py

similarity index 100%

rename from youtube_dl/extractor/crooksandliars.py

rename to youtube_dlc/extractor/crooksandliars.py
diff --git a/youtube_dl/extractor/crunchyroll.py b/youtube_dlc/extractor/crunchyroll.py

similarity index 94%

rename from youtube_dl/extractor/crunchyroll.py

rename to youtube_dlc/extractor/crunchyroll.py

index 85a9a577f645395d6edde83e7b72a1f4001561f8..bc2d1fa8b041e3ec1bbc4d6d1b5f055ac31ee140 100644
--- a/youtube_dl/extractor/crunchyroll.py
+++ b/youtube_dlc/extractor/crunchyroll.py
@@ -13,6 +13,7 @@
      compat_b64decode,
      compat_etree_Element,
      compat_etree_fromstring,
+    compat_str,
      compat_urllib_parse_urlencode,
      compat_urllib_request,
      compat_urlparse,
@@ -25,9 +26,9 @@
      intlist_to_bytes,
      int_or_none,
      lowercase_escape,
+    merge_dicts,
      remove_end,
      sanitized_Request,
-    unified_strdate,
      urlencode_postdata,
      xpath_text,
  )
@@ -136,6 +137,7 @@ class CrunchyrollIE(CrunchyrollBaseIE, VRVIE):
              # rtmp
              'skip_download': True,
          },
+        'skip': 'Video gone',
      }, {
          'url': 'http://www.crunchyroll.com/media-589804/culture-japan-1',
          'info_dict': {
@@ -157,11 +159,12 @@ class CrunchyrollIE(CrunchyrollBaseIE, VRVIE):
          'info_dict': {
              'id': '702409',
              'ext': 'mp4',
-            'title': 'Re:ZERO -Starting Life in Another World- Episode 5 – The Morning of Our Promise Is Still Distant',
-            'description': 'md5:97664de1ab24bbf77a9c01918cb7dca9',
+            'title': compat_str,
+            'description': compat_str,
              'thumbnail': r're:^https?://.*\.jpg$',
-            'uploader': 'TV TOKYO',
-            'upload_date': '20160508',
+            'uploader': 'Re:Zero Partners',
+            'timestamp': 1462098900,
+            'upload_date': '20160501',
          },
          'params': {
              # m3u8 download
@@ -172,12 +175,13 @@ class CrunchyrollIE(CrunchyrollBaseIE, VRVIE):
          'info_dict': {
              'id': '727589',
              'ext': 'mp4',
-            'title': "KONOSUBA -God's blessing on this wonderful world! 2 Episode 1 – Give Me Deliverance From This Judicial Injustice!",
-            'description': 'md5:cbcf05e528124b0f3a0a419fc805ea7d',
+            'title': compat_str,
+            'description': compat_str,
              'thumbnail': r're:^https?://.*\.jpg$',
              'uploader': 'Kadokawa Pictures Inc.',
-            'upload_date': '20170118',
-            'series': "KONOSUBA -God's blessing on this wonderful world!",
+            'timestamp': 1484130900,
+            'upload_date': '20170111',
+            'series': compat_str,
              'season': "KONOSUBA -God's blessing on this wonderful world! 2",
              'season_number': 2,
              'episode': 'Give Me Deliverance From This Judicial Injustice!',
@@ -200,10 +204,11 @@ class CrunchyrollIE(CrunchyrollBaseIE, VRVIE):
          'info_dict': {
              'id': '535080',
              'ext': 'mp4',
-            'title': '11eyes Episode 1 – Red Night ~ Piros éjszaka',
-            'description': 'Kakeru and Yuka are thrown into an alternate nightmarish world they call "Red Night".',
+            'title': compat_str,
+            'description': compat_str,
              'uploader': 'Marvelous AQL Inc.',
-            'upload_date': '20091021',
+            'timestamp': 1255512600,
+            'upload_date': '20091014',
          },
          'params': {
              # Just test metadata extraction
@@ -224,15 +229,17 @@ class CrunchyrollIE(CrunchyrollBaseIE, VRVIE):
              # just test metadata extraction
              'skip_download': True,
          },
+        'skip': 'Video gone',
      }, {
          # A video with a vastly different season name compared to the series name
          'url': 'http://www.crunchyroll.com/nyarko-san-another-crawling-chaos/episode-1-test-590532',
          'info_dict': {
              'id': '590532',
              'ext': 'mp4',
-            'title': 'Haiyoru! Nyaruani (ONA) Episode 1 – Test',
-            'description': 'Mahiro and Nyaruko talk about official certification.',
+            'title': compat_str,
+            'description': compat_str,
              'uploader': 'TV TOKYO',
+            'timestamp': 1330956000,
              'upload_date': '20120305',
              'series': 'Nyarko-san: Another Crawling Chaos',
              'season': 'Haiyoru! Nyaruani (ONA)',
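Note on the updated test expectations: each new `timestamp` should agree with the `upload_date` listed beside it. The sketch below checks that with the standard library only. `upload_date_from_timestamp` is a hypothetical helper written for this note, not a youtube-dlc function, and it assumes the dates are meant to be read in UTC.

```python
from datetime import datetime, timezone


def upload_date_from_timestamp(timestamp):
    """Format a Unix timestamp as YYYYMMDD in UTC (illustrative helper)."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime('%Y%m%d')


# (timestamp, upload_date) pairs copied from the test cases in these hunks
PAIRS = [
    (1462098900, '20160501'),
    (1484130900, '20170111'),
    (1255512600, '20091014'),
    (1330956000, '20120305'),
]

for ts, expected in PAIRS:
    assert upload_date_from_timestamp(ts) == expected
```

All four pairs are consistent under that assumption.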
@@ -442,23 +449,21 @@ def _real_extract(self, url):
              webpage, 'language', default=None, group='lang')
  
          video_title = self._html_search_regex(
-            r'(?s)<h1[^>]*>((?:(?!<h1).)*?<span[^>]+itemprop=["\']title["\'][^>]*>(?:(?!<h1).)+?)</h1>',
-            webpage, 'video_title')
+            (r'(?s)<h1[^>]*>((?:(?!<h1).)*?<(?:span[^>]+itemprop=["\']title["\']|meta[^>]+itemprop=["\']position["\'])[^>]*>(?:(?!<h1).)+?)</h1>',
+             r'<title>(.+?),\s+-\s+.+? Crunchyroll'),
+            webpage, 'video_title', default=None)
+        if not video_title:
+            video_title = re.sub(r'^Watch\s+', '', self._og_search_description(webpage))
          video_title = re.sub(r' {2,}', ' ', video_title)
          video_description = (self._parse_json(self._html_search_regex(
              r'<script[^>]*>\s*.+?\[media_id=%s\].+?({.+?"description"\s*:.+?})\);' % video_id,
              webpage, 'description', default='{}'), video_id) or media_metadata).get('description')
          if video_description:
              video_description = lowercase_escape(video_description.replace(r'\r\n', '\n'))
-        video_upload_date = self._html_search_regex(
-            [r'<div>Availability for free users:(.+?)</div>', r'<div>[^<>]+<span>\s*(.+?\d{4})\s*</span></div>'],
-            webpage, 'video_upload_date', fatal=False, flags=re.DOTALL)
-        if video_upload_date:
-            video_upload_date = unified_strdate(video_upload_date)
          video_uploader = self._html_search_regex(
              # try looking for both an uploader that's a link and one that's not
              [r'<a[^>]+href="/publisher/[^"]+"[^>]*>([^<]+)</a>', r'<div>\s*Publisher:\s*<span>\s*(.+?)\s*</span>\s*</div>'],
-            webpage, 'video_uploader', fatal=False)
+            webpage, 'video_uploader', default=False)
  
          formats = []
          for stream in media.get('streams', []):
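Note on the title change above: the extractor now tries several patterns in order and, if none match, falls back to the Open Graph description with a leading "Watch " removed. A minimal sketch of that fallback chain using plain `re` follows. The helper names, the simplified patterns and the sample strings are assumptions made for illustration; they are not the patterns from the hunk and do not use the `InfoExtractor` API.

```python
import re


def search_first(patterns, text, default=None):
    """Return group 1 of the first pattern that matches, else default."""
    for pattern in patterns:
        match = re.search(pattern, text)
        if match:
            return match.group(1)
    return default


def extract_title(webpage, og_description):
    """Try page patterns first, then fall back to the description."""
    title = search_first(
        (r'<h1[^>]*>([^<]+)</h1>',            # simplified stand-in pattern
         r'<title>(.+?)\s+-\s+.+?Crunchyroll'),
        webpage)
    if not title:
        # Same clean-up as in the hunk: drop a leading "Watch "
        title = re.sub(r'^Watch\s+', '', og_description)
    # Collapse runs of spaces, as the extractor does
    return re.sub(r' {2,}', ' ', title)
```

With this shape a page that loses its `<h1>` markup still yields a usable title as long as the description is present.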
@@ -611,14 +616,15 @@ def _real_extract(self, url):
              r'(?s)<h\d[^>]+id=["\']showmedia_about_episode_num[^>]+>.+?</h\d>\s*<h4>\s*Season (\d+)',
              webpage, 'season number', default=None))
  
-        return {
+        info = self._search_json_ld(webpage, video_id, default={})
+
+        return merge_dicts({
              'id': video_id,
              'title': video_title,
              'description': video_description,
              'duration': duration,
              'thumbnail': thumbnail,
              'uploader': video_uploader,
-            'upload_date': video_upload_date,
              'series': series,
              'season': season,
              'season_number': season_number,
@@ -626,7 +632,7 @@ def _real_extract(self, url):
              'episode_number': episode_number,
              'subtitles': subtitles,
              'formats': formats,
-        }
+        }, info)
  
  
  class CrunchyrollShowPlaylistIE(CrunchyrollBaseIE):
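Note on the `merge_dicts({...}, info)` return above: the page-derived fields are passed first and the JSON-LD result second, so page values take precedence and JSON-LD only fills what is missing. The sketch below illustrates that precedence with a simplified stand-in. `merge_dicts_sketch` is not the helper from `utils` (the real one has extra handling, for example around empty strings, that is omitted here), and the sample dictionaries are made up.

```python
def merge_dicts_sketch(*dicts):
    """Simplified stand-in: earlier dicts win and None values are skipped."""
    merged = {}
    for a_dict in dicts:
        for key, value in a_dict.items():
            if value is None:
                continue  # None never fills or overrides a field
            if key not in merged:
                merged[key] = value
    return merged


# Illustrative inputs only
page_info = {'id': '702409', 'title': 'Page title', 'uploader': None}
json_ld_info = {'title': 'JSON-LD title', 'uploader': 'Example Uploader',
                'timestamp': 1462098900}

result = merge_dicts_sketch(page_info, json_ld_info)
```

Here `result` keeps the page title, while `uploader` and `timestamp` come from the second dictionary.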
diff --git a/youtube_dl/extractor/cspan.py b/youtube_dlc/extractor/cspan.py

similarity index 100%

rename from youtube_dl/extractor/cspan.py

rename to youtube_dlc/extractor/cspan.py
diff --git a/youtube_dl/extractor/ctsnews.py b/youtube_dlc/extractor/ctsnews.py

similarity index 100%

rename from youtube_dl/extractor/ctsnews.py

rename to youtube_dlc/extractor/ctsnews.py
diff --git a/youtube_dl/extractor/ctvnews.py b/youtube_dlc/extractor/ctvnews.py

similarity index 100%

rename from youtube_dl/extractor/ctvnews.py

rename to youtube_dlc/extractor/ctvnews.py
diff --git a/youtube_dl/extractor/cultureunplugged.py b/youtube_dlc/extractor/cultureunplugged.py

similarity index 100%

rename from youtube_dl/extractor/cultureunplugged.py

rename to youtube_dlc/extractor/cultureunplugged.py
diff --git a/youtube_dl/extractor/curiositystream.py b/youtube_dlc/extractor/curiositystream.py

similarity index 100%

rename from youtube_dl/extractor/curiositystream.py

rename to youtube_dlc/extractor/curiositystream.py
diff --git a/youtube_dl/extractor/cwtv.py b/youtube_dlc/extractor/cwtv.py

similarity index 100%

rename from youtube_dl/extractor/cwtv.py

rename to youtube_dlc/extractor/cwtv.py
diff --git a/youtube_dl/extractor/dailymail.py b/youtube_dlc/extractor/dailymail.py

similarity index 100%

rename from youtube_dl/extractor/dailymail.py

rename to youtube_dlc/extractor/dailymail.py
diff --git a/youtube_dl/extractor/dailymotion.py b/youtube_dlc/extractor/dailymotion.py

similarity index 99%

rename from youtube_dl/extractor/dailymotion.py

rename to youtube_dlc/extractor/dailymotion.py

index 327fdb04a71215b8de170314d7080093ced3bbba..b8529050c45c8ee1479e2624ba26aac3f002b9fe 100644
--- a/youtube_dl/extractor/dailymotion.py
+++ b/youtube_dlc/extractor/dailymotion.py
@@ -32,7 +32,7 @@ def _get_dailymotion_cookies(self):
  
      @staticmethod
      def _get_cookie_value(cookies, name):
-        cookie = cookies.get('name')
+        cookie = cookies.get(name)
          if cookie:
              return cookie.value
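Note on the one-line fix above: the old code looked up the literal key `'name'` instead of the `name` argument, so it returned `None` for every cookie not actually called "name". A small reproduction follows; `FakeCookie` and the cookie key are hypothetical stand-ins for illustration.

```python
class FakeCookie(object):
    """Minimal stand-in exposing the .value attribute the method reads."""

    def __init__(self, value):
        self.value = value


def get_cookie_value_buggy(cookies, name):
    cookie = cookies.get('name')  # bug: string literal, ignores the argument
    if cookie:
        return cookie.value


def get_cookie_value_fixed(cookies, name):
    cookie = cookies.get(name)  # fix: use the argument
    if cookie:
        return cookie.value


cookies = {'example_cookie': FakeCookie('on')}
```

Calling both functions with `'example_cookie'` shows the difference: the buggy version returns `None`, the fixed one returns the stored value.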
  
diff --git a/youtube_dl/extractor/daum.py b/youtube_dlc/extractor/daum.py

similarity index 100%

rename from youtube_dl/extractor/daum.py

rename to youtube_dlc/extractor/daum.py
diff --git a/youtube_dl/extractor/dbtv.py b/youtube_dlc/extractor/dbtv.py

similarity index 100%

rename from youtube_dl/extractor/dbtv.py

rename to youtube_dlc/extractor/dbtv.py
diff --git a/youtube_dl/extractor/dctp.py b/youtube_dlc/extractor/dctp.py

similarity index 71%

rename from youtube_dl/extractor/dctp.py

rename to youtube_dlc/extractor/dctp.py

index 04ff214f727826a60bbdde5ec17bb48ba004a91e..e700f8d86531415da0f1db0f2ccdaef6ea10ac53 100644
--- a/youtube_dl/extractor/dctp.py
+++ b/youtube_dlc/extractor/dctp.py
@@ -16,10 +16,11 @@ class DctpTvIE(InfoExtractor):
      _TESTS = [{
          # 4x3
          'url': 'http://www.dctp.tv/filme/videoinstallation-fuer-eine-kaufhausfassade/',
+        'md5': '3ffbd1556c3fe210724d7088fad723e3',
          'info_dict': {
              'id': '95eaa4f33dad413aa17b4ee613cccc6c',
              'display_id': 'videoinstallation-fuer-eine-kaufhausfassade',
-            'ext': 'flv',
+            'ext': 'm4v',
              'title': 'Videoinstallation für eine Kaufhausfassade',
              'description': 'Kurzfilm',
              'thumbnail': r're:^https?://.*\.jpg$',
@@ -27,10 +28,6 @@ class DctpTvIE(InfoExtractor):
              'timestamp': 1302172322,
              'upload_date': '20110407',
          },
-        'params': {
-            # rtmp download
-            'skip_download': True,
-        },
      }, {
          # 16x9
          'url': 'http://www.dctp.tv/filme/sind-youtuber-die-besseren-lehrer/',
@@ -59,33 +56,26 @@ def _real_extract(self, url):
  
          uuid = media['uuid']
          title = media['title']
-        ratio = '16x9' if media.get('is_wide') else '4x3'
-        play_path = 'mp4:%s_dctp_0500_%s.m4v' % (uuid, ratio)
-
-        servers = self._download_json(
-            'http://www.dctp.tv/streaming_servers/', display_id,
-            note='Downloading server list JSON', fatal=False)
-
-        if servers:
-            endpoint = next(
-                server['endpoint']
-                for server in servers
-                if url_or_none(server.get('endpoint'))
-                and 'cloudfront' in server['endpoint'])
-        else:
-            endpoint = 'rtmpe://s2pqqn4u96e4j8.cloudfront.net/cfx/st/'
-
-        app = self._search_regex(
-            r'^rtmpe?://[^/]+/(?P<app>.*)$', endpoint, 'app')
-
-        formats = [{
-            'url': endpoint,
-            'app': app,
-            'play_path': play_path,
-            'page_url': url,
-            'player_url': 'http://svm-prod-dctptv-static.s3.amazonaws.com/dctptv-relaunch2012-110.swf',
-            'ext': 'flv',
-        }]
+        is_wide = media.get('is_wide')
+        formats = []
+
+        def add_formats(suffix):
+            templ = 'https://%%s/%s_dctp_%s.m4v' % (uuid, suffix)
+            formats.extend([{
+                'format_id': 'hls-' + suffix,
+                'url': templ % 'cdn-segments.dctp.tv' + '/playlist.m3u8',
+                'protocol': 'm3u8_native',
+            }, {
+                'format_id': 's3-' + suffix,
+                'url': templ % 'completed-media.s3.amazonaws.com',
+            }, {
+                'format_id': 'http-' + suffix,
+                'url': templ % 'cdn-media.dctp.tv',
+            }])
+
+        add_formats('0500_' + ('16x9' if is_wide else '4x3'))
+        if is_wide:
+            add_formats('720p')
  
          thumbnails = []
          images = media.get('images')
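Note on `add_formats` above: the template uses `%%s` so that the first `%` formatting fills in the UUID and suffix while leaving a literal `%s` placeholder for the host, which is filled in a second pass. The sketch below reproduces that two-stage formatting outside the extractor. `build_format_urls` is a hypothetical function name; the hosts and the UUID are taken from the hunk and its test case.

```python
def build_format_urls(uuid, suffix):
    """Return the three URLs that add_formats would emit for one suffix."""
    # First pass: '%%s' becomes a literal '%s' that is filled in below
    templ = 'https://%%s/%s_dctp_%s.m4v' % (uuid, suffix)
    return [
        # '%' binds tighter than '+', so the host is substituted first
        templ % 'cdn-segments.dctp.tv' + '/playlist.m3u8',
        templ % 'completed-media.s3.amazonaws.com',
        templ % 'cdn-media.dctp.tv',
    ]


urls = build_format_urls('95eaa4f33dad413aa17b4ee613cccc6c', '0500_4x3')
```

Each suffix therefore yields one HLS playlist URL and two direct HTTP URLs that differ only in the host.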
diff --git a/youtube_dlc/extractor/deezer.py b/youtube_dlc/extractor/deezer.py

new file mode 100644

index 0000000..3031671
--- /dev/null
+++ b/youtube_dlc/extractor/deezer.py
@@ -0,0 +1,147 @@
+from __future__ import unicode_literals
+
+import json
+import re
+
+from .common import InfoExtractor
+from ..utils import (
+    ExtractorError,
+    int_or_none,
+    orderedSet,
+)
+
+
+class DeezerBaseInfoExtractor(InfoExtractor):
+    def get_data(self, url):
+        if not self._downloader.params.get('test'):
+            self._downloader.report_warning('For now, this extractor only supports the 30 second previews. Patches welcome!')
+
+        mobj = re.match(self._VALID_URL, url)
+        data_id = mobj.group('id')
+
+        webpage = self._download_webpage(url, data_id)
+        geoblocking_msg = self._html_search_regex(
+            r'<p class="soon-txt">(.*?)</p>', webpage, 'geoblocking message',
+            default=None)
+        if geoblocking_msg is not None:
+            raise ExtractorError(
+                'Deezer said: %s' % geoblocking_msg, expected=True)
+
+        data_json = self._search_regex(
+            (r'__DZR_APP_STATE__\s*=\s*({.+?})\s*</script>',
+             r'naboo\.display\(\'[^\']+\',\s*(.*?)\);\n'),
+            webpage, 'data JSON')
+        data = json.loads(data_json)
+        return data_id, webpage, data
+
+
+class DeezerPlaylistIE(DeezerBaseInfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?deezer\.com/(../)?playlist/(?P<id>[0-9]+)'
+    _TEST = {
+        'url': 'http://www.deezer.com/playlist/176747451',
+        'info_dict': {
+            'id': '176747451',
+            'title': 'Best!',
+            'uploader': 'anonymous',
+            'thumbnail': r're:^https?://(e-)?cdns-images\.dzcdn\.net/images/cover/.*\.jpg$',
+        },
+        'playlist_count': 29,
+    }
+
+    def _real_extract(self, url):
+        playlist_id, webpage, data = self.get_data(url)
+
+        playlist_title = data.get('DATA', {}).get('TITLE')
+        playlist_uploader = data.get('DATA', {}).get('PARENT_USERNAME')
+        playlist_thumbnail = self._search_regex(
+            r'<img id="naboo_playlist_image".*?src="([^"]+)"', webpage,
+            'playlist thumbnail')
+
+        entries = []
+        for s in data.get('SONGS', {}).get('data'):
+            formats = [{
+                'format_id': 'preview',
+                'url': s.get('MEDIA', [{}])[0].get('HREF'),
+                'preference': -100,  # Only the first 30 seconds
+                'ext': 'mp3',
+            }]
+            self._sort_formats(formats)
+            artists = ', '.join(
+                orderedSet(a.get('ART_NAME') for a in s.get('ARTISTS')))
+            entries.append({
+                'id': s.get('SNG_ID'),
+                'duration': int_or_none(s.get('DURATION')),
+                'title': '%s - %s' % (artists, s.get('SNG_TITLE')),
+                'uploader': s.get('ART_NAME'),
+                'uploader_id': s.get('ART_ID'),
+                'age_limit': 16 if s.get('EXPLICIT_LYRICS') == '1' else 0,
+                'formats': formats,
+            })
+
+        return {
+            '_type': 'playlist',
+            'id': playlist_id,
+            'title': playlist_title,
+            'uploader': playlist_uploader,
+            'thumbnail': playlist_thumbnail,
+            'entries': entries,
+        }
+
+
+class DeezerAlbumIE(DeezerBaseInfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?deezer\.com/(../)?album/(?P<id>[0-9]+)'
+    _TEST = {
+        'url': 'https://www.deezer.com/fr/album/67505622',
+        'info_dict': {
+            'id': '67505622',
+            'title': 'Last Week',
+            'uploader': 'Home Brew',
+            'thumbnail': r're:^https?://(e-)?cdns-images\.dzcdn\.net/images/cover/.*\.jpg$',
+        },
+        'playlist_count': 7,
+    }
+
+    def _real_extract(self, url):
+        album_id, webpage, data = self.get_data(url)
+
+        album_title = data.get('DATA', {}).get('ALB_TITLE')
+        album_uploader = data.get('DATA', {}).get('ART_NAME')
+        album_thumbnail = self._search_regex(
+            r'<img id="naboo_album_image".*?src="([^"]+)"', webpage,
+            'album thumbnail')
+
+        entries = []
+        for s in data.get('SONGS', {}).get('data'):
+            formats = [{
+                'format_id': 'preview',
+                'url': s.get('MEDIA', [{}])[0].get('HREF'),
+                'preference': -100,  # Only the first 30 seconds
+                'ext': 'mp3',
+            }]
+            self._sort_formats(formats)
+            artists = ', '.join(
+                orderedSet(a.get('ART_NAME') for a in s.get('ARTISTS')))
+            entries.append({
+                'id': s.get('SNG_ID'),
+                'duration': int_or_none(s.get('DURATION')),
+                'title': '%s - %s' % (artists, s.get('SNG_TITLE')),
+                'uploader': s.get('ART_NAME'),
+                'uploader_id': s.get('ART_ID'),
+                'age_limit': 16 if s.get('EXPLICIT_LYRICS') == '1' else 0,
+                'formats': formats,
+                'track': s.get('SNG_TITLE'),
+                'track_number': int_or_none(s.get('TRACK_NUMBER')),
+                'track_id': s.get('SNG_ID'),
+                'artist': album_uploader,
+                'album': album_title,
+                'album_artist': album_uploader,
+            })
+
+        return {
+            '_type': 'playlist',
+            'id': album_id,
+            'title': album_title,
+            'uploader': album_uploader,
+            'thumbnail': album_thumbnail,
+            'entries': entries,
+        }
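Note on the nested lookups in both Deezer extractors: `s.get('MEDIA', [{}])[0]` raises `IndexError` when `MEDIA` is present but empty, and iterating `data.get('SONGS', {}).get('data')` raises `TypeError` when `data` is missing. The sketch below shows one defensive variant. It is not part of this commit; `preview_url` and `iter_songs` are hypothetical helpers, and only the JSON key names come from the diff.

```python
def preview_url(song):
    """Return the preview HREF, or None if MEDIA is missing or empty."""
    media = song.get('MEDIA') or [{}]
    return media[0].get('HREF')


def iter_songs(data):
    """Return the song list, or an empty list if any level is missing."""
    return (data.get('SONGS') or {}).get('data') or []
```

Whether the Deezer page can actually produce such payloads is an assumption; the sketch only shows how the lookups could tolerate them.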
diff --git a/youtube_dl/extractor/defense.py b/youtube_dlc/extractor/defense.py

similarity index 100%

rename from youtube_dl/extractor/defense.py

rename to youtube_dlc/extractor/defense.py
diff --git a/youtube_dl/extractor/democracynow.py b/youtube_dlc/extractor/democracynow.py

similarity index 100%

rename from youtube_dl/extractor/democracynow.py

rename to youtube_dlc/extractor/democracynow.py
diff --git a/youtube_dl/extractor/dfb.py b/youtube_dlc/extractor/dfb.py

similarity index 100%

rename from youtube_dl/extractor/dfb.py

rename to youtube_dlc/extractor/dfb.py
diff --git a/youtube_dl/extractor/dhm.py b/youtube_dlc/extractor/dhm.py

similarity index 100%

rename from youtube_dl/extractor/dhm.py

rename to youtube_dlc/extractor/dhm.py
diff --git a/youtube_dl/extractor/digg.py b/youtube_dlc/extractor/digg.py

similarity index 100%

rename from youtube_dl/extractor/digg.py

rename to youtube_dlc/extractor/digg.py
diff --git a/youtube_dl/extractor/digiteka.py b/youtube_dlc/extractor/digiteka.py

similarity index 100%

rename from youtube_dl/extractor/digiteka.py

rename to youtube_dlc/extractor/digiteka.py
diff --git a/youtube_dl/extractor/discovery.py b/youtube_dlc/extractor/discovery.py

similarity index 95%

rename from youtube_dl/extractor/discovery.py

rename to youtube_dlc/extractor/discovery.py

index 6a2712cc50429b7297a9d4fe9e1ec2d80177986e..e0139cc862d74bc3c20a9d6567747b96b642c730 100644
--- a/youtube_dl/extractor/discovery.py
+++ b/youtube_dlc/extractor/discovery.py
@@ -13,8 +13,8 @@
  class DiscoveryIE(DiscoveryGoBaseIE):
      _VALID_URL = r'''(?x)https?://
          (?P<site>
-            (?:(?:www|go)\.)?discovery|
-            (?:www\.)?
+            go\.discovery|
+            www\.
                  (?:
                      investigationdiscovery|
                      discoverylife|
@@ -22,8 +22,7 @@ class DiscoveryIE(DiscoveryGoBaseIE):
                      ahctv|
                      destinationamerica|
                      sciencechannel|
-                    tlc|
-                    velocity
+                    tlc
                  )|
              watch\.
                  (?:
@@ -83,7 +82,7 @@ def _real_extract(self, url):
                      'authRel': 'authorization',
                      'client_id': '3020a40c2356a645b4b4',
                      'nonce': ''.join([random.choice(string.ascii_letters) for _ in range(32)]),
-                    'redirectUri': 'https://fusion.ddmcdn.com/app/mercury-sdk/180/redirectHandler.html?https://www.%s.com' % site,
+                    'redirectUri': 'https://www.discovery.com/',
                  })['access_token']
  
          headers = self.geo_verification_headers()
diff --git a/youtube_dl/extractor/discoverygo.py b/youtube_dlc/extractor/discoverygo.py

similarity index 100%

rename from youtube_dl/extractor/discoverygo.py

rename to youtube_dlc/extractor/discoverygo.py
diff --git a/youtube_dl/extractor/discoverynetworks.py b/youtube_dlc/extractor/discoverynetworks.py

similarity index 100%

rename from youtube_dl/extractor/discoverynetworks.py

rename to youtube_dlc/extractor/discoverynetworks.py
diff --git a/youtube_dl/extractor/discoveryvr.py b/youtube_dlc/extractor/discoveryvr.py

similarity index 100%

rename from youtube_dl/extractor/discoveryvr.py

rename to youtube_dlc/extractor/discoveryvr.py
diff --git a/youtube_dl/extractor/disney.py b/youtube_dlc/extractor/disney.py

similarity index 100%

rename from youtube_dl/extractor/disney.py

rename to youtube_dlc/extractor/disney.py
diff --git a/youtube_dl/extractor/dispeak.py b/youtube_dlc/extractor/dispeak.py

similarity index 100%

rename from youtube_dl/extractor/dispeak.py

rename to youtube_dlc/extractor/dispeak.py
diff --git a/youtube_dl/extractor/dlive.py b/youtube_dlc/extractor/dlive.py

similarity index 100%

rename from youtube_dl/extractor/dlive.py

rename to youtube_dlc/extractor/dlive.py
diff --git a/youtube_dlc/extractor/doodstream.py b/youtube_dlc/extractor/doodstream.py

new file mode 100644

index 0000000..2c9ea68
--- /dev/null
+++ b/youtube_dlc/extractor/doodstream.py
@@ -0,0 +1,71 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import string
+import random
+import time
+
+from .common import InfoExtractor
+
+
+class DoodStreamIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?dood\.(?:to|watch)/[ed]/(?P<id>[a-z0-9]+)'
+    _TESTS = [{
+        'url': 'http://dood.to/e/5s1wmbdacezb',
+        'md5': '4568b83b31e13242b3f1ff96c55f0595',
+        'info_dict': {
+            'id': '5s1wmbdacezb',
+            'ext': 'mp4',
+            'title': 'Kat Wonders - Monthly May 2020',
+            'description': 'Kat Wonders - Monthly May 2020 | DoodStream.com',
+            'thumbnail': 'https://img.doodcdn.com/snaps/flyus84qgl2fsk4g.jpg',
+        }
+    }, {
+        'url': 'https://dood.to/d/jzrxn12t2s7n',
+        'md5': '3207e199426eca7c2aa23c2872e6728a',
+        'info_dict': {
+            'id': 'jzrxn12t2s7n',
+            'ext': 'mp4',
+            'title': 'Stacy Cruz Cute ALLWAYSWELL',
+            'description': 'Stacy Cruz Cute ALLWAYSWELL | DoodStream.com',
+            'thumbnail': 'https://img.doodcdn.com/snaps/8edqd5nppkac3x8u.jpg',
+        }
+    }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        webpage = self._download_webpage(url, video_id)
+
+        if '/d/' in url:
+            url = "https://dood.to" + self._html_search_regex(
+                r'<iframe src="(/e/[a-z0-9]+)"', webpage, 'embed')
+            video_id = self._match_id(url)
+            webpage = self._download_webpage(url, video_id)
+
+        title = self._html_search_meta(['og:title', 'twitter:title'],
+                                       webpage, default=None)
+        thumb = self._html_search_meta(['og:image', 'twitter:image'],
+                                       webpage, default=None)
+        token = self._html_search_regex(r'[?&]token=([a-z0-9]+)[&\']', webpage, 'token')
+        description = self._html_search_meta(
+            ['og:description', 'description', 'twitter:description'],
+            webpage, default=None)
+        auth_url = 'https://dood.to' + self._html_search_regex(
+            r'(/pass_md5.*?)\'', webpage, 'pass_md5')
+        headers = {
+            'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:53.0) Gecko/20100101 Firefox/66.0',
+            'referer': url
+        }
+
+        webpage = self._download_webpage(auth_url, video_id, headers=headers)
+        final_url = webpage + ''.join([random.choice(string.ascii_letters + string.digits) for _ in range(10)]) + "?token=" + token + "&expiry=" + str(int(time.time() * 1000))
+
+        return {
+            'id': video_id,
+            'title': title,
+            'url': final_url,
+            'http_headers': headers,
+            'ext': 'mp4',
+            'description': description,
+            'thumbnail': thumb,
+        }
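Note on `final_url` above: the extractor appends ten random alphanumeric characters to the body returned by the `pass_md5` request, then adds the page token and the current time in milliseconds. The sketch below isolates that string construction so it can be checked without network access. `build_final_url`, its `now` and `rng` parameters, and the sample base URL are assumptions for illustration.

```python
import random
import string
import time


def build_final_url(base_url, token, now=None, rng=random):
    """Compose the media URL the way the extractor's one-liner does."""
    alphabet = string.ascii_letters + string.digits
    suffix = ''.join(rng.choice(alphabet) for _ in range(10))
    # Expiry is the current time in milliseconds, as in the extractor
    expiry = int((time.time() if now is None else now) * 1000)
    return '%s%s?token=%s&expiry=%d' % (base_url, suffix, token, expiry)


example = build_final_url('https://example.invalid/video/', 'abc123',
                          now=1600000000.0)
```

Passing `now` makes the result deterministic apart from the random suffix, which is convenient in tests.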
diff --git a/youtube_dl/extractor/dotsub.py b/youtube_dlc/extractor/dotsub.py

similarity index 100%

rename from youtube_dl/extractor/dotsub.py

rename to youtube_dlc/extractor/dotsub.py
diff --git a/youtube_dl/extractor/douyutv.py b/youtube_dlc/extractor/douyutv.py

similarity index 100%

rename from youtube_dl/extractor/douyutv.py

rename to youtube_dlc/extractor/douyutv.py
diff --git a/youtube_dl/extractor/dplay.py b/youtube_dlc/extractor/dplay.py

similarity index 100%

rename from youtube_dl/extractor/dplay.py

rename to youtube_dlc/extractor/dplay.py
diff --git a/youtube_dl/extractor/drbonanza.py b/youtube_dlc/extractor/drbonanza.py

similarity index 100%

rename from youtube_dl/extractor/drbonanza.py

rename to youtube_dlc/extractor/drbonanza.py
diff --git a/youtube_dl/extractor/dropbox.py b/youtube_dlc/extractor/dropbox.py

similarity index 90%

rename from youtube_dl/extractor/dropbox.py

rename to youtube_dlc/extractor/dropbox.py

index 14b6c00b0bd1c4d3a306b5513477e2bb6c4cd52d..9dc6614c582811a34198302929313b853045f707 100644
--- a/youtube_dl/extractor/dropbox.py
+++ b/youtube_dlc/extractor/dropbox.py
@@ -13,11 +13,11 @@ class DropboxIE(InfoExtractor):
      _VALID_URL = r'https?://(?:www\.)?dropbox[.]com/sh?/(?P<id>[a-zA-Z0-9]{15})/.*'
      _TESTS = [
          {
-            'url': 'https://www.dropbox.com/s/nelirfsxnmcfbfh/youtube-dl%20test%20video%20%27%C3%A4%22BaW_jenozKc.mp4?dl=0',
+            'url': 'https://www.dropbox.com/s/nelirfsxnmcfbfh/youtube-dlc%20test%20video%20%27%C3%A4%22BaW_jenozKc.mp4?dl=0',
              'info_dict': {
                  'id': 'nelirfsxnmcfbfh',
                  'ext': 'mp4',
-                'title': 'youtube-dl test video \'ä"BaW_jenozKc'
+                'title': 'youtube-dlc test video \'ä"BaW_jenozKc'
              }
          }, {
              'url': 'https://www.dropbox.com/sh/662glsejgzoj9sr/AAByil3FGH9KFNZ13e08eSa1a/Pregame%20Ceremony%20Program%20PA%2020140518.m4v',
diff --git a/youtube_dl/extractor/drtuber.py b/youtube_dlc/extractor/drtuber.py

similarity index 100%

rename from youtube_dl/extractor/drtuber.py

rename to youtube_dlc/extractor/drtuber.py
diff --git a/youtube_dl/extractor/drtv.py b/youtube_dlc/extractor/drtv.py

similarity index 100%

rename from youtube_dl/extractor/drtv.py

rename to youtube_dlc/extractor/drtv.py
diff --git a/youtube_dl/extractor/dtube.py b/youtube_dlc/extractor/dtube.py

similarity index 100%

rename from youtube_dl/extractor/dtube.py

rename to youtube_dlc/extractor/dtube.py
diff --git a/youtube_dlc/extractor/duboku.py b/youtube_dlc/extractor/duboku.py

new file mode 100644

index 0000000..fdc695b
--- /dev/null
+++ b/youtube_dlc/extractor/duboku.py
@@ -0,0 +1,242 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..compat import compat_urlparse
+from ..utils import (
+    clean_html,
+    extract_attributes,
+    ExtractorError,
+    get_elements_by_class,
+    int_or_none,
+    js_to_json,
+    smuggle_url,
+    unescapeHTML,
+)
+
+
+def _get_elements_by_tag_and_attrib(html, tag=None, attribute=None, value=None, escape_value=True):
+    """Return the content of the tag with the specified attribute in the passed HTML document"""
+
+    if tag is None:
+        tag = '[a-zA-Z0-9:._-]+'
+    if attribute is None:
+        attribute = ''
+    else:
+        attribute = r'\s+(?P<attribute>%s)' % re.escape(attribute)
+    if value is None:
+        value = ''
+    else:
+        value = re.escape(value) if escape_value else value
+        value = '=[\'"]?(?P<value>%s)[\'"]?' % value
+
+    retlist = []
+    for m in re.finditer(r'''(?xs)
+        <(?P<tag>%s)
+         (?:\s+[a-zA-Z0-9:._-]+(?:=[a-zA-Z0-9:._-]*|="[^"]*"|='[^']*'|))*?
+         %s%s
+         (?:\s+[a-zA-Z0-9:._-]+(?:=[a-zA-Z0-9:._-]*|="[^"]*"|='[^']*'|))*?
+        \s*>
+        (?P<content>.*?)
+        </\1>
+    ''' % (tag, attribute, value), html):
+        retlist.append(m)
+
+    return retlist
+
+
+def _get_element_by_tag_and_attrib(html, tag=None, attribute=None, value=None, escape_value=True):
+    retval = _get_elements_by_tag_and_attrib(html, tag, attribute, value, escape_value)
+    return retval[0] if retval else None
+
+
+class DubokuIE(InfoExtractor):
+    IE_NAME = 'duboku'
+    IE_DESC = 'www.duboku.co'
+
+    _VALID_URL = r'(?:https?://[^/]+\.duboku\.co/vodplay/)(?P<id>[0-9]+-[0-9-]+)\.html.*'
+    _TESTS = [{
+        'url': 'https://www.duboku.co/vodplay/1575-1-1.html',
+        'info_dict': {
+            'id': '1575-1-1',
+            'ext': 'ts',
+            'series': '白色月光',
+            'title': 'contains:白色月光',
+            'season_number': 1,
+            'episode_number': 1,
+        },
+        'params': {
+            'skip_download': 'm3u8 download',
+        },
+    }, {
+        'url': 'https://www.duboku.co/vodplay/1588-1-1.html',
+        'info_dict': {
+            'id': '1588-1-1',
+            'ext': 'ts',
+            'series': '亲爱的自己',
+            'title': 'contains:预告片',
+            'season_number': 1,
+            'episode_number': 1,
+        },
+        'params': {
+            'skip_download': 'm3u8 download',
+        },
+    }]
+
+    _PLAYER_DATA_PATTERN = r'player_data\s*=\s*(\{\s*(.*)})\s*;?\s*</script'
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        temp = video_id.split('-')
+        series_id = temp[0]
+        season_id = temp[1]
+        episode_id = temp[2]
+
+        webpage_url = 'https://www.duboku.co/vodplay/%s.html' % video_id
+        webpage_html = self._download_webpage(webpage_url, video_id)
+
+        # extract video url
+
+        player_data = self._search_regex(
+            self._PLAYER_DATA_PATTERN, webpage_html, 'player_data')
+        player_data = self._parse_json(player_data, video_id, js_to_json)
+
+        # extract title
+
+        temp = get_elements_by_class('title', webpage_html)
+        series_title = None
+        title = None
+        for html in temp:
+            mobj = re.search(r'<a\s+.*>(.*)</a>', html)
+            if mobj:
+                href = extract_attributes(mobj.group(0)).get('href')
+                if href:
+                    mobj1 = re.search(r'/(\d+)\.html', href)
+                    if mobj1 and mobj1.group(1) == series_id:
+                        series_title = clean_html(mobj.group(0))
+                        series_title = re.sub(r'[\s\r\n\t]+', ' ', series_title)
+                        title = clean_html(html)
+                        title = re.sub(r'[\s\r\n\t]+', ' ', title)
+                        break
+
+        data_url = player_data.get('url')
+        if not data_url:
+            raise ExtractorError('Cannot find url in player_data')
+        data_from = player_data.get('from')
+
+        # if it is an embedded iframe, maybe it's an external source
+        if data_from == 'iframe':
+            # use _type url_transparent to retain the meaningful details
+            # of the video.
+            return {
+                '_type': 'url_transparent',
+                'url': smuggle_url(data_url, {'http_headers': {'Referer': webpage_url}}),
+                'id': video_id,
+                'title': title,
+                'series': series_title,
+                'season_number': int_or_none(season_id),
+                'season_id': season_id,
+                'episode_number': int_or_none(episode_id),
+                'episode_id': episode_id,
+            }
+
+        formats = self._extract_m3u8_formats(data_url, video_id, 'mp4')
+
+        return {
+            'id': video_id,
+            'title': title,
+            'series': series_title,
+            'season_number': int_or_none(season_id),
+            'season_id': season_id,
+            'episode_number': int_or_none(episode_id),
+            'episode_id': episode_id,
+            'formats': formats,
+            'http_headers': {'Referer': 'https://www.duboku.co/static/player/videojs.html'}
+        }
+
+
+class DubokuPlaylistIE(InfoExtractor):
+    IE_NAME = 'duboku:list'
+    IE_DESC = 'www.duboku.co entire series'
+
+    _VALID_URL = r'(?:https?://[^/]+\.duboku\.co/voddetail/)(?P<id>[0-9]+)\.html.*'
+    _TESTS = [{
+        'url': 'https://www.duboku.co/voddetail/1575.html',
+        'info_dict': {
+            'id': 'startswith:1575',
+            'title': '白色月光',
+        },
+        'playlist_count': 12,
+    }, {
+        'url': 'https://www.duboku.co/voddetail/1554.html',
+        'info_dict': {
+            'id': 'startswith:1554',
+            'title': '以家人之名',
+        },
+        'playlist_mincount': 30,
+    }, {
+        'url': 'https://www.duboku.co/voddetail/1554.html#playlist2',
+        'info_dict': {
+            'id': '1554#playlist2',
+            'title': '以家人之名',
+        },
+        'playlist_mincount': 27,
+    }]
+
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        if mobj is None:
+            raise ExtractorError('Invalid URL: %s' % url)
+        series_id = mobj.group('id')
+        fragment = compat_urlparse.urlparse(url).fragment
+
+        webpage_url = 'https://www.duboku.co/voddetail/%s.html' % series_id
+        webpage_html = self._download_webpage(webpage_url, series_id)
+
+        # extract title
+
+        title = _get_element_by_tag_and_attrib(webpage_html, 'h1', 'class', 'title')
+        title = unescapeHTML(title.group('content')) if title else None
+        if not title:
+            title = self._html_search_meta('keywords', webpage_html)
+        if not title:
+            title = _get_element_by_tag_and_attrib(webpage_html, 'title')
+            title = unescapeHTML(title.group('content')) if title else None
+
+        # extract playlists
+
+        playlists = {}
+        for div in _get_elements_by_tag_and_attrib(
+                webpage_html, attribute='id', value='playlist\\d+', escape_value=False):
+            playlist_id = div.group('value')
+            playlist = []
+            for a in _get_elements_by_tag_and_attrib(
+                    div.group('content'), 'a', 'href', value='[^\'"]+?', escape_value=False):
+                playlist.append({
+                    'href': unescapeHTML(a.group('value')),
+                    'title': unescapeHTML(a.group('content'))
+                })
+            playlists[playlist_id] = playlist
+
+        # select the specified playlist if url fragment exists
+        playlist = None
+        playlist_id = None
+        if fragment:
+            playlist = playlists.get(fragment)
+            playlist_id = fragment
+        else:
+            first = next(iter(playlists.items()), None)
+            if first:
+                (playlist_id, playlist) = first
+        if not playlist:
+            raise ExtractorError(
+                'Cannot find %s' % fragment if fragment else 'Cannot extract playlist')
+
+        # return url results
+        return self.playlist_result([
+            self.url_result(
+                compat_urlparse.urljoin('https://www.duboku.co', x['href']),
+                ie=DubokuIE.ie_key(), video_title=x.get('title'))
+            for x in playlist], series_id + '#' + playlist_id, title)
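The playlist selection step in DubokuPlaylistIE above can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the extractor's code: `select_playlist` is a hypothetical helper name, it uses the Python 3 standard library `urllib.parse` in place of `compat_urlparse`, and it raises `ValueError` where the extractor raises `ExtractorError`. It also relies on dicts preserving insertion order (guaranteed from Python 3.7), which is what makes "the first playlist" mean the first one found on the page.

```python
from urllib.parse import urlparse


def select_playlist(url, playlists):
    """Pick the playlist named by the URL fragment, else the first one.

    `playlists` maps ids such as 'playlist1' to lists of episode entries.
    Returns (playlist_id, entries); raises ValueError if nothing usable is found.
    """
    fragment = urlparse(url).fragment
    if fragment:
        # An explicit '#playlistN' fragment wins over the default.
        playlist_id, entries = fragment, playlists.get(fragment)
    else:
        # No fragment: fall back to the first playlist, if there is one.
        playlist_id, entries = next(iter(playlists.items()), (None, None))
    if not entries:
        raise ValueError(
            'Cannot find %s' % fragment if fragment else 'Cannot extract playlist')
    return playlist_id, entries
```

With this shape, a URL ending in `#playlist2` selects that playlist only, while a bare series URL falls back to whichever playlist was collected first.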
diff --git a/youtube_dl/extractor/dumpert.py b/youtube_dlc/extractor/dumpert.py

similarity index 100%

rename from youtube_dl/extractor/dumpert.py

rename to youtube_dlc/extractor/dumpert.py
diff --git a/youtube_dl/extractor/dvtv.py b/youtube_dlc/extractor/dvtv.py

similarity index 100%

rename from youtube_dl/extractor/dvtv.py

rename to youtube_dlc/extractor/dvtv.py
diff --git a/youtube_dl/extractor/dw.py b/youtube_dlc/extractor/dw.py

similarity index 100%

rename from youtube_dl/extractor/dw.py

rename to youtube_dlc/extractor/dw.py
diff --git a/youtube_dl/extractor/eagleplatform.py b/youtube_dlc/extractor/eagleplatform.py

similarity index 100%

rename from youtube_dl/extractor/eagleplatform.py

rename to youtube_dlc/extractor/eagleplatform.py
diff --git a/youtube_dl/extractor/ebaumsworld.py b/youtube_dlc/extractor/ebaumsworld.py

similarity index 100%

rename from youtube_dl/extractor/ebaumsworld.py

rename to youtube_dlc/extractor/ebaumsworld.py
diff --git a/youtube_dl/extractor/echomsk.py b/youtube_dlc/extractor/echomsk.py

similarity index 100%

rename from youtube_dl/extractor/echomsk.py

rename to youtube_dlc/extractor/echomsk.py
diff --git a/youtube_dl/extractor/egghead.py b/youtube_dlc/extractor/egghead.py

similarity index 100%

rename from youtube_dl/extractor/egghead.py

rename to youtube_dlc/extractor/egghead.py
diff --git a/youtube_dl/extractor/ehow.py b/youtube_dlc/extractor/ehow.py

similarity index 100%

rename from youtube_dl/extractor/ehow.py

rename to youtube_dlc/extractor/ehow.py
diff --git a/youtube_dl/extractor/eighttracks.py b/youtube_dlc/extractor/eighttracks.py

similarity index 85%

rename from youtube_dl/extractor/eighttracks.py

rename to youtube_dlc/extractor/eighttracks.py

index 9a44f89f3fe801047ef283e9b1fa410c8b88cb37..5ededd31da475ac997dc257f466bb3a0eff2ea2c 100644 (file)
--- a/youtube_dl/extractor/eighttracks.py
+++ b/youtube_dlc/extractor/eighttracks.py
@@ -18,12 +18,12 @@ class EightTracksIE(InfoExtractor):
      _VALID_URL = r'https?://8tracks\.com/(?P<user>[^/]+)/(?P<id>[^/#]+)(?:#.*)?$'
      _TEST = {
          'name': 'EightTracks',
-        'url': 'http://8tracks.com/ytdl/youtube-dl-test-tracks-a',
+        'url': 'http://8tracks.com/ytdl/youtube-dlc-test-tracks-a',
          'info_dict': {
              'id': '1336550',
-            'display_id': 'youtube-dl-test-tracks-a',
+            'display_id': 'youtube-dlc-test-tracks-a',
              'description': "test chars:  \"'/\\ä↭",
-            'title': "youtube-dl test tracks \"'/\\ä↭<>",
+            'title': "youtube-dlc test tracks \"'/\\ä↭<>",
          },
          'playlist': [
              {
@@ -31,7 +31,7 @@ class EightTracksIE(InfoExtractor):
                  'info_dict': {
                      'id': '11885610',
                      'ext': 'm4a',
-                    'title': "youtue-dl project<>\"' - youtube-dl test track 1 \"'/\\\u00e4\u21ad",
+                    'title': "youtue-dl project<>\"' - youtube-dlc test track 1 \"'/\\\u00e4\u21ad",
                      'uploader_id': 'ytdl'
                  }
              },
@@ -40,7 +40,7 @@ class EightTracksIE(InfoExtractor):
                  'info_dict': {
                      'id': '11885608',
                      'ext': 'm4a',
-                    'title': "youtube-dl project - youtube-dl test track 2 \"'/\\\u00e4\u21ad",
+                    'title': "youtube-dlc project - youtube-dlc test track 2 \"'/\\\u00e4\u21ad",
                      'uploader_id': 'ytdl'
                  }
              },
@@ -49,7 +49,7 @@ class EightTracksIE(InfoExtractor):
                  'info_dict': {
                      'id': '11885679',
                      'ext': 'm4a',
-                    'title': "youtube-dl project as well - youtube-dl test track 3 \"'/\\\u00e4\u21ad",
+                    'title': "youtube-dlc project as well - youtube-dlc test track 3 \"'/\\\u00e4\u21ad",
                      'uploader_id': 'ytdl'
                  }
              },
@@ -58,7 +58,7 @@ class EightTracksIE(InfoExtractor):
                  'info_dict': {
                      'id': '11885680',
                      'ext': 'm4a',
-                    'title': "youtube-dl project as well - youtube-dl test track 4 \"'/\\\u00e4\u21ad",
+                    'title': "youtube-dlc project as well - youtube-dlc test track 4 \"'/\\\u00e4\u21ad",
                      'uploader_id': 'ytdl'
                  }
              },
@@ -67,7 +67,7 @@ class EightTracksIE(InfoExtractor):
                  'info_dict': {
                      'id': '11885682',
                      'ext': 'm4a',
-                    'title': "PH - youtube-dl test track 5 \"'/\\\u00e4\u21ad",
+                    'title': "PH - youtube-dlc test track 5 \"'/\\\u00e4\u21ad",
                      'uploader_id': 'ytdl'
                  }
              },
@@ -76,7 +76,7 @@ class EightTracksIE(InfoExtractor):
                  'info_dict': {
                      'id': '11885683',
                      'ext': 'm4a',
-                    'title': "PH - youtube-dl test track 6 \"'/\\\u00e4\u21ad",
+                    'title': "PH - youtube-dlc test track 6 \"'/\\\u00e4\u21ad",
                      'uploader_id': 'ytdl'
                  }
              },
@@ -85,7 +85,7 @@ class EightTracksIE(InfoExtractor):
                  'info_dict': {
                      'id': '11885684',
                      'ext': 'm4a',
-                    'title': "phihag - youtube-dl test track 7 \"'/\\\u00e4\u21ad",
+                    'title': "phihag - youtube-dlc test track 7 \"'/\\\u00e4\u21ad",
                      'uploader_id': 'ytdl'
                  }
              },
@@ -94,7 +94,7 @@ class EightTracksIE(InfoExtractor):
                  'info_dict': {
                      'id': '11885685',
                      'ext': 'm4a',
-                    'title': "phihag - youtube-dl test track 8 \"'/\\\u00e4\u21ad",
+                    'title': "phihag - youtube-dlc test track 8 \"'/\\\u00e4\u21ad",
                      'uploader_id': 'ytdl'
                  }
              }
diff --git a/youtube_dl/extractor/einthusan.py b/youtube_dlc/extractor/einthusan.py

similarity index 100%

rename from youtube_dl/extractor/einthusan.py

rename to youtube_dlc/extractor/einthusan.py
diff --git a/youtube_dl/extractor/eitb.py b/youtube_dlc/extractor/eitb.py

similarity index 100%

rename from youtube_dl/extractor/eitb.py

rename to youtube_dlc/extractor/eitb.py
diff --git a/youtube_dl/extractor/ellentube.py b/youtube_dlc/extractor/ellentube.py

similarity index 100%

rename from youtube_dl/extractor/ellentube.py

rename to youtube_dlc/extractor/ellentube.py
diff --git a/youtube_dl/extractor/elpais.py b/youtube_dlc/extractor/elpais.py

similarity index 100%

rename from youtube_dl/extractor/elpais.py

rename to youtube_dlc/extractor/elpais.py
diff --git a/youtube_dl/extractor/embedly.py b/youtube_dlc/extractor/embedly.py

similarity index 100%

rename from youtube_dl/extractor/embedly.py

rename to youtube_dlc/extractor/embedly.py
diff --git a/youtube_dl/extractor/engadget.py b/youtube_dlc/extractor/engadget.py

similarity index 100%

rename from youtube_dl/extractor/engadget.py

rename to youtube_dlc/extractor/engadget.py
diff --git a/youtube_dl/extractor/eporner.py b/youtube_dlc/extractor/eporner.py

similarity index 97%

rename from youtube_dl/extractor/eporner.py

rename to youtube_dlc/extractor/eporner.py

index c050bf9df3fb7ececed5b3e03a70aea9b2c37417..fe42821c731c711e8f0974fd4ce48f5c9aee8e8f 100644 (file)
--- a/youtube_dl/extractor/eporner.py
+++ b/youtube_dlc/extractor/eporner.py
@@ -4,7 +4,6 @@
  import re
  
  from .common import InfoExtractor
-from ..compat import compat_str
  from ..utils import (
      encode_base_n,
      ExtractorError,
@@ -55,7 +54,7 @@ def _real_extract(self, url):
  
          webpage, urlh = self._download_webpage_handle(url, display_id)
  
-        video_id = self._match_id(compat_str(urlh.geturl()))
+        video_id = self._match_id(urlh.geturl())
  
          hash = self._search_regex(
              r'hash\s*:\s*["\']([\da-f]{32})', webpage, 'hash')
diff --git a/youtube_dl/extractor/eroprofile.py b/youtube_dlc/extractor/eroprofile.py

similarity index 100%

rename from youtube_dl/extractor/eroprofile.py

rename to youtube_dlc/extractor/eroprofile.py
diff --git a/youtube_dl/extractor/escapist.py b/youtube_dlc/extractor/escapist.py

similarity index 100%

rename from youtube_dl/extractor/escapist.py

rename to youtube_dlc/extractor/escapist.py
diff --git a/youtube_dl/extractor/espn.py b/youtube_dlc/extractor/espn.py

similarity index 100%

rename from youtube_dl/extractor/espn.py

rename to youtube_dlc/extractor/espn.py
diff --git a/youtube_dl/extractor/esri.py b/youtube_dlc/extractor/esri.py

similarity index 100%

rename from youtube_dl/extractor/esri.py

rename to youtube_dlc/extractor/esri.py
diff --git a/youtube_dl/extractor/europa.py b/youtube_dlc/extractor/europa.py

similarity index 100%

rename from youtube_dl/extractor/europa.py

rename to youtube_dlc/extractor/europa.py
diff --git a/youtube_dl/extractor/everyonesmixtape.py b/youtube_dlc/extractor/everyonesmixtape.py

similarity index 100%

rename from youtube_dl/extractor/everyonesmixtape.py

rename to youtube_dlc/extractor/everyonesmixtape.py
diff --git a/youtube_dl/extractor/expotv.py b/youtube_dlc/extractor/expotv.py

similarity index 100%

rename from youtube_dl/extractor/expotv.py

rename to youtube_dlc/extractor/expotv.py
diff --git a/youtube_dl/extractor/expressen.py b/youtube_dlc/extractor/expressen.py

similarity index 100%

rename from youtube_dl/extractor/expressen.py

rename to youtube_dlc/extractor/expressen.py
diff --git a/youtube_dl/extractor/extractors.py b/youtube_dlc/extractor/extractors.py

similarity index 97%

rename from youtube_dl/extractor/extractors.py

rename to youtube_dlc/extractor/extractors.py

index 50f69f0b6c153d2be19250b829bf2a28af25cec0..af1bc6e31d0c25693faaa0cfed62f657cda03ee9 100644 (file)
--- a/youtube_dl/extractor/extractors.py
+++ b/youtube_dlc/extractor/extractors.py
@@ -36,6 +36,10 @@
  from .airmozilla import AirMozillaIE
  from .aljazeera import AlJazeeraIE
  from .alphaporno import AlphaPornoIE
+from .alura import (
+    AluraIE,
+    AluraCourseIE
+)
  from .amcnetworks import AMCNetworksIE
  from .americastestkitchen import AmericasTestKitchenIE
  from .animeondemand import AnimeOnDemandIE
@@ -105,6 +109,7 @@
      BiliBiliBangumiIE,
      BilibiliAudioIE,
      BilibiliAudioAlbumIE,
+    BiliBiliPlayerIE,
  )
  from .biobiochiletv import BioBioChileTVIE
  from .bitchute import (
@@ -261,7 +266,10 @@
  )
  from .dbtv import DBTVIE
  from .dctp import DctpTvIE
-from .deezer import DeezerPlaylistIE
+from .deezer import (
+    DeezerPlaylistIE,
+    DeezerAlbumIE,
+)
  from .democracynow import DemocracynowIE
  from .dfb import DFBIE
  from .dhm import DHMIE
@@ -272,7 +280,6 @@
      DouyuTVIE,
  )
  from .dplay import DPlayIE
-from .dreisat import DreiSatIE
  from .drbonanza import DRBonanzaIE
  from .drtuber import DrTuberIE
  from .drtv import (
@@ -281,6 +288,10 @@
  )
  from .dtube import DTubeIE
  from .dvtv import DVTVIE
+from .duboku import (
+    DubokuIE,
+    DubokuPlaylistIE
+)
  from .dumpert import DumpertIE
  from .defense import DefenseGouvFrIE
  from .discovery import DiscoveryIE
@@ -292,6 +303,7 @@
  from .discoveryvr import DiscoveryVRIE
  from .disney import DisneyIE
  from .dispeak import DigitallySpeakingIE
+from .doodstream import DoodStreamIE
  from .dropbox import DropboxIE
  from .dw import (
      DWIE,
@@ -439,6 +451,7 @@
  )
  from .howcast import HowcastIE
  from .howstuffworks import HowStuffWorksIE
+from .hrfensehen import HRFernsehenIE
  from .hrti import (
      HRTiIE,
      HRTiPlaylistIE,
@@ -497,7 +510,6 @@
  from .jove import JoveIE
  from .joj import JojIE
  from .jwplatform import JWPlatformIE
-from .jpopsukitv import JpopsukiIE
  from .kakao import KakaoIE
  from .kaltura import KalturaIE
  from .kanalplay import KanalPlayIE
@@ -585,6 +597,7 @@
      LyndaCourseIE
  )
  from .m6 import M6IE
+from .magentamusik360 import MagentaMusik360IE
  from .mailru import (
      MailRuIE,
      MailRuMusicIE,
@@ -636,7 +649,10 @@
  from .mlb import MLBIE
  from .mnet import MnetIE
  from .moevideo import MoeVideoIE
-from .mofosex import MofosexIE
+from .mofosex import (
+    MofosexIE,
+    MofosexEmbedIE,
+)
  from .mojvideo import MojvideoIE
  from .morningstar import MorningstarIE
  from .motherless import (
@@ -664,6 +680,7 @@
      MyviIE,
      MyviEmbedIE,
  )
+from .myvideoge import MyVideoGeIE
  from .myvidster import MyVidsterIE
  from .nationalgeographic import (
      NationalGeographicVideoIE,
@@ -801,6 +818,16 @@
      ORFFM4IE,
      ORFFM4StoryIE,
      ORFOE1IE,
+    ORFOE3IE,
+    ORFNOEIE,
+    ORFWIEIE,
+    ORFBGLIE,
+    ORFOOEIE,
+    ORFSTMIE,
+    ORFKTNIE,
+    ORFSBGIE,
+    ORFTIRIE,
+    ORFVBGIE,
      ORFIPTVIE,
  )
  from .outsidetv import OutsideTVIE
@@ -808,7 +835,6 @@
      PacktPubIE,
      PacktPubCourseIE,
  )
-from .pandatv import PandaTVIE
  from .pandoratv import PandoraTVIE
  from .parliamentliveuk import ParliamentLiveUKIE
  from .patreon import PatreonIE
@@ -846,11 +872,15 @@
      PluralsightCourseIE,
  )
  from .podomatic import PodomaticIE
-from .pokemon import PokemonIE
+from .pokemon import (
+    PokemonIE,
+    PokemonWatchIE,
+)
  from .polskieradio import (
      PolskieRadioIE,
      PolskieRadioCategoryIE,
  )
+from .popcorntimes import PopcorntimesIE
  from .popcorntv import PopcornTVIE
  from .porn91 import Porn91IE
  from .porncom import PornComIE
@@ -963,7 +993,10 @@
  from .sbs import SBSIE
  from .screencast import ScreencastIE
  from .screencastomatic import ScreencastOMaticIE
-from .scrippsnetworks import ScrippsNetworksWatchIE
+from .scrippsnetworks import (
+    ScrippsNetworksWatchIE,
+    ScrippsNetworksIE,
+)
  from .scte import (
      SCTEIE,
      SCTECourseIE,
@@ -1041,6 +1074,11 @@
      BellatorIE,
      ParamountNetworkIE,
  )
+from .storyfire import (
+    StoryFireIE,
+    StoryFireUserIE,
+    StoryFireSeriesIE,
+)
  from .stitcher import StitcherIE
  from .sport5 import Sport5IE
  from .sportbox import SportBoxIE
@@ -1188,6 +1226,7 @@
  from .tvnoe import TVNoeIE
  from .tvnow import (
      TVNowIE,
+    TVNowFilmIE,
      TVNowNewIE,
      TVNowSeasonIE,
      TVNowAnnualIE,
@@ -1210,14 +1249,11 @@
  from .twentythreevideo import TwentyThreeVideoIE
  from .twitcasting import TwitCastingIE
  from .twitch import (
-    TwitchVideoIE,
-    TwitchChapterIE,
      TwitchVodIE,
-    TwitchProfileIE,
-    TwitchAllVideosIE,
-    TwitchUploadsIE,
-    TwitchPastBroadcastsIE,
-    TwitchHighlightsIE,
+    TwitchCollectionIE,
+    TwitchVideosIE,
+    TwitchVideosClipsIE,
+    TwitchVideosCollectionsIE,
      TwitchStreamIE,
      TwitchClipsIE,
  )
diff --git a/youtube_dl/extractor/extremetube.py b/youtube_dlc/extractor/extremetube.py

similarity index 100%

rename from youtube_dl/extractor/extremetube.py

rename to youtube_dlc/extractor/extremetube.py
diff --git a/youtube_dl/extractor/eyedotv.py b/youtube_dlc/extractor/eyedotv.py

similarity index 100%

rename from youtube_dl/extractor/eyedotv.py

rename to youtube_dlc/extractor/eyedotv.py
diff --git a/youtube_dl/extractor/facebook.py b/youtube_dlc/extractor/facebook.py

similarity index 96%

rename from youtube_dl/extractor/facebook.py

rename to youtube_dlc/extractor/facebook.py

index ce64e26831fdafceb97b6d8ae919c00a78f0f90f..610d6674592384922f9df7af4da5958592ce56bd 100644 (file)
--- a/youtube_dl/extractor/facebook.py
+++ b/youtube_dlc/extractor/facebook.py
@@ -466,15 +466,18 @@ def _real_extract(self, url):
              return info_dict
  
          if '/posts/' in url:
-            entries = [
-                self.url_result('facebook:%s' % vid, FacebookIE.ie_key())
-                for vid in self._parse_json(
-                    self._search_regex(
-                        r'(["\'])video_ids\1\s*:\s*(?P<ids>\[.+?\])',
-                        webpage, 'video ids', group='ids'),
-                    video_id)]
-
-            return self.playlist_result(entries, video_id)
+            video_id_json = self._search_regex(
+                r'(["\'])video_ids\1\s*:\s*(?P<ids>\[.+?\])', webpage, 'video ids', group='ids',
+                default='')
+            if video_id_json:
+                entries = [
+                    self.url_result('facebook:%s' % vid, FacebookIE.ie_key())
+                    for vid in self._parse_json(video_id_json, video_id)]
+                return self.playlist_result(entries, video_id)
+
+            # No video_ids list found; fall back to a single video id
+            video_id = self._search_regex(r'video_id:\s*"([0-9]+)"', webpage, 'single video id')
+            return self.url_result('facebook:%s' % video_id, FacebookIE.ie_key())
          else:
              _, info_dict = self._extract_from_url(
                  self._VIDEO_PAGE_TEMPLATE % video_id,
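The control flow introduced in the facebook.py hunk above (try the `video_ids` list first, then fall back to a single `video_id`) can be sketched without the extractor framework. This is only an illustration: `find_post_video_ids` is a hypothetical name, plain `re.search` and `json.loads` stand in for `_search_regex` and `_parse_json`, and `ValueError` stands in for the extractor's own error. The two regular expressions are copied from the hunk.

```python
import json
import re


def find_post_video_ids(webpage):
    """Return the video ids referenced by a post page as a list of strings."""
    # First choice: a JSON list of ids, e.g. "video_ids":["1","2"].
    m = re.search(r'(["\'])video_ids\1\s*:\s*(?P<ids>\[.+?\])', webpage)
    if m:
        return [str(vid) for vid in json.loads(m.group('ids'))]
    # Fallback: a single id, e.g. video_id: "123".
    m = re.search(r'video_id:\s*"([0-9]+)"', webpage)
    if m:
        return [m.group(1)]
    raise ValueError('Unable to extract video ids')
```

The first lookup is optional and the second is mandatory, which mirrors the hunk: it passes `default=''` only to the first `_search_regex` call, so a page matching neither pattern still fails loudly.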
diff --git a/youtube_dl/extractor/faz.py b/youtube_dlc/extractor/faz.py

similarity index 100%

rename from youtube_dl/extractor/faz.py

rename to youtube_dlc/extractor/faz.py
diff --git a/youtube_dl/extractor/fc2.py b/youtube_dlc/extractor/fc2.py

similarity index 100%

rename from youtube_dl/extractor/fc2.py

rename to youtube_dlc/extractor/fc2.py
diff --git a/youtube_dl/extractor/fczenit.py b/youtube_dlc/extractor/fczenit.py

similarity index 100%

rename from youtube_dl/extractor/fczenit.py

rename to youtube_dlc/extractor/fczenit.py
diff --git a/youtube_dl/extractor/filmon.py b/youtube_dlc/extractor/filmon.py

similarity index 100%

rename from youtube_dl/extractor/filmon.py

rename to youtube_dlc/extractor/filmon.py
diff --git a/youtube_dl/extractor/filmweb.py b/youtube_dlc/extractor/filmweb.py

similarity index 100%

rename from youtube_dl/extractor/filmweb.py

rename to youtube_dlc/extractor/filmweb.py
diff --git a/youtube_dl/extractor/firsttv.py b/youtube_dlc/extractor/firsttv.py

similarity index 100%

rename from youtube_dl/extractor/firsttv.py

rename to youtube_dlc/extractor/firsttv.py
diff --git a/youtube_dl/extractor/fivemin.py b/youtube_dlc/extractor/fivemin.py

similarity index 100%

rename from youtube_dl/extractor/fivemin.py

rename to youtube_dlc/extractor/fivemin.py
diff --git a/youtube_dl/extractor/fivetv.py b/youtube_dlc/extractor/fivetv.py

similarity index 100%

rename from youtube_dl/extractor/fivetv.py

rename to youtube_dlc/extractor/fivetv.py
diff --git a/youtube_dl/extractor/flickr.py b/youtube_dlc/extractor/flickr.py

similarity index 100%

rename from youtube_dl/extractor/flickr.py

rename to youtube_dlc/extractor/flickr.py
diff --git a/youtube_dl/extractor/folketinget.py b/youtube_dlc/extractor/folketinget.py

similarity index 100%

rename from youtube_dl/extractor/folketinget.py

rename to youtube_dlc/extractor/folketinget.py
diff --git a/youtube_dl/extractor/footyroom.py b/youtube_dlc/extractor/footyroom.py

similarity index 100%

rename from youtube_dl/extractor/footyroom.py

rename to youtube_dlc/extractor/footyroom.py
diff --git a/youtube_dl/extractor/formula1.py b/youtube_dlc/extractor/formula1.py

similarity index 100%

rename from youtube_dl/extractor/formula1.py

rename to youtube_dlc/extractor/formula1.py
diff --git a/youtube_dl/extractor/fourtube.py b/youtube_dlc/extractor/fourtube.py

similarity index 100%

rename from youtube_dl/extractor/fourtube.py

rename to youtube_dlc/extractor/fourtube.py
diff --git a/youtube_dl/extractor/fox.py b/youtube_dlc/extractor/fox.py

similarity index 100%

rename from youtube_dl/extractor/fox.py

rename to youtube_dlc/extractor/fox.py
diff --git a/youtube_dl/extractor/fox9.py b/youtube_dlc/extractor/fox9.py

similarity index 100%

rename from youtube_dl/extractor/fox9.py

rename to youtube_dlc/extractor/fox9.py
diff --git a/youtube_dl/extractor/foxgay.py b/youtube_dlc/extractor/foxgay.py

similarity index 100%

rename from youtube_dl/extractor/foxgay.py

rename to youtube_dlc/extractor/foxgay.py
diff --git a/youtube_dl/extractor/foxnews.py b/youtube_dlc/extractor/foxnews.py

similarity index 100%

rename from youtube_dl/extractor/foxnews.py

rename to youtube_dlc/extractor/foxnews.py
diff --git a/youtube_dl/extractor/foxsports.py b/youtube_dlc/extractor/foxsports.py

similarity index 100%

rename from youtube_dl/extractor/foxsports.py

rename to youtube_dlc/extractor/foxsports.py
diff --git a/youtube_dl/extractor/franceculture.py b/youtube_dlc/extractor/franceculture.py

similarity index 88%

rename from youtube_dl/extractor/franceculture.py

rename to youtube_dlc/extractor/franceculture.py

index b8fa175880f47d6050e4fa3908994bd9777131ac..306b45fc99a4c3495a233d8fb3c649032641d87a 100644 (file)
--- a/youtube_dl/extractor/franceculture.py
+++ b/youtube_dlc/extractor/franceculture.py
@@ -31,7 +31,13 @@ def _real_extract(self, url):
          webpage = self._download_webpage(url, display_id)
  
          video_data = extract_attributes(self._search_regex(
-            r'(?s)<div[^>]+class="[^"]*?(?:title-zone-diffusion|heading-zone-(?:wrapper|player-button))[^"]*?"[^>]*>.*?(<button[^>]+data-asset-source="[^"]+"[^>]+>)',
+            r'''(?sx)
+                (?:
+                    </h1>|
+                    <div[^>]+class="[^"]*?(?:title-zone-diffusion|heading-zone-(?:wrapper|player-button))[^"]*?"[^>]*>
+                ).*?
+                (<button[^>]+data-asset-source="[^"]+"[^>]+>)
+            ''',
              webpage, 'video data'))
  
          video_url = video_data['data-asset-source']
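The verbose-mode pattern added in the franceculture.py hunk above can be exercised on its own. The sketch below copies the pattern from the hunk; the sample markup, the `find_asset_source` helper, and the simple attribute regex (used here instead of `extract_attributes`) are assumptions made for illustration.

```python
import re

# Pattern copied from the hunk: either a closing </h1> or one of the known
# wrapper <div> classes, followed (lazily) by the first matching <button>.
BUTTON_RE = re.compile(r'''(?sx)
    (?:
        </h1>|
        <div[^>]+class="[^"]*?(?:title-zone-diffusion|heading-zone-(?:wrapper|player-button))[^"]*?"[^>]*>
    ).*?
    (<button[^>]+data-asset-source="[^"]+"[^>]+>)
''')


def find_asset_source(webpage):
    """Return the data-asset-source value of the first matching button, or None."""
    m = BUTTON_RE.search(webpage)
    if not m:
        return None
    # Simplified attribute lookup; the extractor uses extract_attributes().
    attr = re.search(r'data-asset-source="([^"]+)"', m.group(1))
    return attr.group(1) if attr else None
```

The `x` flag lets the alternation be laid out over several lines, and the `s` flag lets `.*?` cross line breaks between the heading and the button.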
diff --git a/youtube_dl/extractor/franceinter.py b/youtube_dlc/extractor/franceinter.py

similarity index 100%

rename from youtube_dl/extractor/franceinter.py

rename to youtube_dlc/extractor/franceinter.py
diff --git a/youtube_dl/extractor/francetv.py b/youtube_dlc/extractor/francetv.py

similarity index 98%

rename from youtube_dl/extractor/francetv.py

rename to youtube_dlc/extractor/francetv.py

index 81b468c7d1e030f7ba67fed2a7ef2562c8164c76..e340cddba8f5118ac0f5fa0fee43c088bd080498 100644 (file)
--- a/youtube_dl/extractor/francetv.py
+++ b/youtube_dlc/extractor/francetv.py
@@ -316,13 +316,14 @@ class FranceTVInfoIE(FranceTVBaseInfoExtractor):
      _VALID_URL = r'https?://(?:www|mobile|france3-regions)\.francetvinfo\.fr/(?:[^/]+/)*(?P<id>[^/?#&.]+)'
  
      _TESTS = [{
-        'url': 'http://www.francetvinfo.fr/replay-jt/france-3/soir-3/jt-grand-soir-3-lundi-26-aout-2013_393427.html',
+        'url': 'https://www.francetvinfo.fr/replay-jt/france-3/soir-3/jt-grand-soir-3-jeudi-22-aout-2019_3561461.html',
          'info_dict': {
-            'id': '84981923',
+            'id': 'd12458ee-5062-48fe-bfdd-a30d6a01b793',
              'ext': 'mp4',
              'title': 'Soir 3',
-            'upload_date': '20130826',
-            'timestamp': 1377548400,
+            'upload_date': '20190822',
+            'timestamp': 1566510900,
+            'description': 'md5:72d167097237701d6e8452ff03b83c00',
              'subtitles': {
                  'fr': 'mincount:2',
              },
@@ -374,7 +375,8 @@ def _real_extract(self, url):
          video_id = self._search_regex(
              (r'player\.load[^;]+src:\s*["\']([^"\']+)',
               r'id-video=([^@]+@[^"]+)',
-             r'<a[^>]+href="(?:https?:)?//videos\.francetv\.fr/video/([^@]+@[^"]+)"'),
+             r'<a[^>]+href="(?:https?:)?//videos\.francetv\.fr/video/([^@]+@[^"]+)"',
+             r'data-id="([^"]+)"'),
              webpage, 'video id')
  
          return self._make_url_result(video_id)
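The francetv.py hunk above appends a fourth pattern to the tuple passed to `_search_regex`, which tries each pattern in order and uses the first that matches. A framework-free sketch of that fallback chain follows; `first_match` is a hypothetical helper, the sample markup in the usage note is invented, and the patterns are copied from the hunk.

```python
import re

VIDEO_ID_PATTERNS = (
    r'player\.load[^;]+src:\s*["\']([^"\']+)',
    r'id-video=([^@]+@[^"]+)',
    r'<a[^>]+href="(?:https?:)?//videos\.francetv\.fr/video/([^@]+@[^"]+)"',
    r'data-id="([^"]+)"',  # the pattern added by this change
)


def first_match(patterns, text):
    """Return group 1 of the first pattern that matches, or None."""
    for pattern in patterns:
        m = re.search(pattern, text)
        if m:
            return m.group(1)
    return None
```

Because order matters, the loose `data-id` pattern sits last: a page that still carries a `videos.francetv.fr` link resolves through the more specific pattern first.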
diff --git a/youtube_dl/extractor/freesound.py b/youtube_dlc/extractor/freesound.py

similarity index 100%

rename from youtube_dl/extractor/freesound.py

rename to youtube_dlc/extractor/freesound.py
diff --git a/youtube_dl/extractor/freespeech.py b/youtube_dlc/extractor/freespeech.py

similarity index 100%

rename from youtube_dl/extractor/freespeech.py

rename to youtube_dlc/extractor/freespeech.py
diff --git a/youtube_dl/extractor/freshlive.py b/youtube_dlc/extractor/freshlive.py

similarity index 100%

rename from youtube_dl/extractor/freshlive.py

rename to youtube_dlc/extractor/freshlive.py
diff --git a/youtube_dl/extractor/frontendmasters.py b/youtube_dlc/extractor/frontendmasters.py

similarity index 100%

rename from youtube_dl/extractor/frontendmasters.py

rename to youtube_dlc/extractor/frontendmasters.py
diff --git a/youtube_dl/extractor/funimation.py b/youtube_dlc/extractor/funimation.py

similarity index 100%

rename from youtube_dl/extractor/funimation.py

rename to youtube_dlc/extractor/funimation.py
diff --git a/youtube_dl/extractor/funk.py b/youtube_dlc/extractor/funk.py

similarity index 100%

rename from youtube_dl/extractor/funk.py

rename to youtube_dlc/extractor/funk.py
diff --git a/youtube_dl/extractor/fusion.py b/youtube_dlc/extractor/fusion.py

similarity index 100%

rename from youtube_dl/extractor/fusion.py

rename to youtube_dlc/extractor/fusion.py
diff --git a/youtube_dl/extractor/fxnetworks.py b/youtube_dlc/extractor/fxnetworks.py

similarity index 100%

rename from youtube_dl/extractor/fxnetworks.py

rename to youtube_dlc/extractor/fxnetworks.py
diff --git a/youtube_dl/extractor/gaia.py b/youtube_dlc/extractor/gaia.py

similarity index 100%

rename from youtube_dl/extractor/gaia.py

rename to youtube_dlc/extractor/gaia.py
diff --git a/youtube_dl/extractor/gameinformer.py b/youtube_dlc/extractor/gameinformer.py

similarity index 100%

rename from youtube_dl/extractor/gameinformer.py

rename to youtube_dlc/extractor/gameinformer.py
diff --git a/youtube_dl/extractor/gamespot.py b/youtube_dlc/extractor/gamespot.py

similarity index 100%

rename from youtube_dl/extractor/gamespot.py

rename to youtube_dlc/extractor/gamespot.py
diff --git a/youtube_dl/extractor/gamestar.py b/youtube_dlc/extractor/gamestar.py

similarity index 100%

rename from youtube_dl/extractor/gamestar.py

rename to youtube_dlc/extractor/gamestar.py
diff --git a/youtube_dl/extractor/gaskrank.py b/youtube_dlc/extractor/gaskrank.py

similarity index 100%

rename from youtube_dl/extractor/gaskrank.py

rename to youtube_dlc/extractor/gaskrank.py
diff --git a/youtube_dl/extractor/gazeta.py b/youtube_dlc/extractor/gazeta.py

similarity index 100%

rename from youtube_dl/extractor/gazeta.py

rename to youtube_dlc/extractor/gazeta.py
diff --git a/youtube_dl/extractor/gdcvault.py b/youtube_dlc/extractor/gdcvault.py

similarity index 100%

rename from youtube_dl/extractor/gdcvault.py

rename to youtube_dlc/extractor/gdcvault.py
diff --git a/youtube_dl/extractor/generic.py b/youtube_dlc/extractor/generic.py

similarity index 98%

rename from youtube_dl/extractor/generic.py

rename to youtube_dlc/extractor/generic.py

index 743ef47dbe2b6a534a65be67dee28e78aa8fe215..aba06b328e19fad48faee2d83eb42494af672f94 100644 (file)
--- a/youtube_dl/extractor/generic.py
+++ b/youtube_dlc/extractor/generic.py
@@ -60,6 +60,9 @@
  from .drtuber import DrTuberIE
  from .redtube import RedTubeIE
  from .tube8 import Tube8IE
+from .mofosex import MofosexEmbedIE
+from .spankwire import SpankwireIE
+from .youporn import YouPornIE
  from .vimeo import VimeoIE
  from .dailymotion import DailymotionIE
  from .dailymail import DailyMailIE
@@ -1705,6 +1708,15 @@ class GenericIE(InfoExtractor):
              },
              'add_ie': ['Kaltura'],
          },
+        {
+            # multiple kaltura embeds, nsfw
+            'url': 'https://www.quartier-rouge.be/prive/femmes/kamila-avec-video-jaime-sadomie.html',
+            'info_dict': {
+                'id': 'kamila-avec-video-jaime-sadomie',
+                'title': "Kamila avec vídeo “J'aime sadomie”",
+            },
+            'playlist_count': 8,
+        },
          {
              # Non-standard Vimeo embed
              'url': 'https://openclassrooms.com/courses/understanding-the-web',
@@ -1935,7 +1947,7 @@ class GenericIE(InfoExtractor):
          },
          {
              # vshare embed
-            'url': 'https://youtube-dl-demo.neocities.org/vshare.html',
+            'url': 'https://youtube-dlc-demo.neocities.org/vshare.html',
              'md5': '17b39f55b5497ae8b59f5fbce8e35886',
              'info_dict': {
                  'id': '0f64ce6',
@@ -2098,6 +2110,9 @@ class GenericIE(InfoExtractor):
                  'ext': 'mp4',
                  'title': 'Smoky Barbecue Favorites',
                  'thumbnail': r're:^https?://.*\.jpe?g',
+                'description': 'md5:5ff01e76316bd8d46508af26dc86023b',
+                'upload_date': '20170909',
+                'timestamp': 1504915200,
              },
              'add_ie': [ZypeIE.ie_key()],
              'params': {
@@ -2248,7 +2263,7 @@ def _real_extract(self, url):
                      if default_search == 'auto_warning':
                          if re.match(r'^(?:url|URL)$', url):
                              raise ExtractorError(
-                                'Invalid URL:  %r . Call youtube-dl like this:  youtube-dl -v "https://www.youtube.com/watch?v=BaW_jenozKc"  ' % url,
+                                'Invalid URL:  %r . Call youtube-dlc like this:  youtube-dlc -v "https://www.youtube.com/watch?v=BaW_jenozKc"  ' % url,
                                  expected=True)
                          else:
                              self._downloader.report_warning(
@@ -2258,7 +2273,7 @@ def _real_extract(self, url):
              if default_search in ('error', 'fixup_error'):
                  raise ExtractorError(
                      '%r is not a valid URL. '
-                    'Set --default-search "ytsearch" (or run  youtube-dl "ytsearch:%s" ) to search YouTube'
+                    'Set --default-search "ytsearch" (or run  youtube-dlc "ytsearch:%s" ) to search YouTube'
                      % (url, url), expected=True)
              else:
                  if ':' not in default_search:
@@ -2284,7 +2299,7 @@ def _real_extract(self, url):
  
          if head_response is not False:
              # Check for redirect
-            new_url = compat_str(head_response.geturl())
+            new_url = head_response.geturl()
              if url != new_url:
                  self.report_following_redirect(new_url)
                  if force_videoid:
@@ -2334,7 +2349,7 @@ def _real_extract(self, url):
              request = sanitized_Request(url)
              # Some webservers may serve compressed content of rather big size (e.g. gzipped flac)
              # making it impossible to download only chunk of the file (yet we need only 512kB to
-            # test whether it's HTML or not). According to youtube-dl default Accept-Encoding
+            # test whether it's HTML or not). According to youtube-dlc default Accept-Encoding
              # that will always result in downloading the whole file that is not desirable.
              # Therefore for extraction pass we have to override Accept-Encoding to any in order
              # to accept raw bytes and being able to download only a chunk.
@@ -2384,12 +2399,12 @@ def _real_extract(self, url):
                  return self.playlist_result(
                      self._parse_xspf(
                          doc, video_id, xspf_url=url,
-                        xspf_base_url=compat_str(full_response.geturl())),
+                        xspf_base_url=full_response.geturl()),
                      video_id)
              elif re.match(r'(?i)^(?:{[^}]+})?MPD$', doc.tag):
                  info_dict['formats'] = self._parse_mpd_formats(
                      doc,
-                    mpd_base_url=compat_str(full_response.geturl()).rpartition('/')[0],
+                    mpd_base_url=full_response.geturl().rpartition('/')[0],
                      mpd_url=url)
                  self._sort_formats(info_dict['formats'])
                  return info_dict
@@ -2533,15 +2548,21 @@ def _real_extract(self, url):
              return self.playlist_from_matches(
                  dailymail_urls, video_id, video_title, ie=DailyMailIE.ie_key())
  
+        # Look for Teachable embeds, must be before Wistia
+        teachable_url = TeachableIE._extract_url(webpage, url)
+        if teachable_url:
+            return self.url_result(teachable_url)
+
          # Look for embedded Wistia player
-        wistia_url = WistiaIE._extract_url(webpage)
-        if wistia_url:
-            return {
-                '_type': 'url_transparent',
-                'url': self._proto_relative_url(wistia_url),
-                'ie_key': WistiaIE.ie_key(),
-                'uploader': video_uploader,
-            }
+        wistia_urls = WistiaIE._extract_urls(webpage)
+        if wistia_urls:
+            playlist = self.playlist_from_matches(wistia_urls, video_id, video_title, ie=WistiaIE.ie_key())
+            for entry in playlist['entries']:
+                entry.update({
+                    '_type': 'url_transparent',
+                    'uploader': video_uploader,
+                })
+            return playlist
  
          # Look for SVT player
          svt_url = SVTIE._extract_url(webpage)
@@ -2706,6 +2727,21 @@ def _real_extract(self, url):
          if tube8_urls:
              return self.playlist_from_matches(tube8_urls, video_id, video_title, ie=Tube8IE.ie_key())
  
+        # Look for embedded Mofosex player
+        mofosex_urls = MofosexEmbedIE._extract_urls(webpage)
+        if mofosex_urls:
+            return self.playlist_from_matches(mofosex_urls, video_id, video_title, ie=MofosexEmbedIE.ie_key())
+
+        # Look for embedded Spankwire player
+        spankwire_urls = SpankwireIE._extract_urls(webpage)
+        if spankwire_urls:
+            return self.playlist_from_matches(spankwire_urls, video_id, video_title, ie=SpankwireIE.ie_key())
+
+        # Look for embedded YouPorn player
+        youporn_urls = YouPornIE._extract_urls(webpage)
+        if youporn_urls:
+            return self.playlist_from_matches(youporn_urls, video_id, video_title, ie=YouPornIE.ie_key())
+
          # Look for embedded Tvigle player
          mobj = re.search(
              r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//cloud\.tvigle\.ru/video/.+?)\1', webpage)
@@ -2817,9 +2853,12 @@ def _real_extract(self, url):
              return self.url_result(mobj.group('url'), 'Zapiks')
  
          # Look for Kaltura embeds
-        kaltura_url = KalturaIE._extract_url(webpage)
-        if kaltura_url:
-            return self.url_result(smuggle_url(kaltura_url, {'source_url': url}), KalturaIE.ie_key())
+        kaltura_urls = KalturaIE._extract_urls(webpage)
+        if kaltura_urls:
+            return self.playlist_from_matches(
+                kaltura_urls, video_id, video_title,
+                getter=lambda x: smuggle_url(x, {'source_url': url}),
+                ie=KalturaIE.ie_key())
  
          # Look for EaglePlatform embeds
          eagleplatform_url = EaglePlatformIE._extract_url(webpage)
@@ -2960,7 +2999,7 @@ def _real_extract(self, url):
  
          # Look for VODPlatform embeds
          mobj = re.search(
-            r'<iframe[^>]+src=(["\'])(?P<url>(?:https?:)?//(?:www\.)?vod-platform\.net/[eE]mbed/.+?)\1',
+            r'<iframe[^>]+src=(["\'])(?P<url>(?:https?:)?//(?:(?:www\.)?vod-platform\.net|embed\.kwikmotion\.com)/[eE]mbed/.+?)\1',
              webpage)
          if mobj is not None:
              return self.url_result(
@@ -3137,10 +3176,6 @@ def _real_extract(self, url):
              return self.playlist_from_matches(
                  peertube_urls, video_id, video_title, ie=PeerTubeIE.ie_key())
  
-        teachable_url = TeachableIE._extract_url(webpage, url)
-        if teachable_url:
-            return self.url_result(teachable_url)
-
          indavideo_urls = IndavideoEmbedIE._extract_urls(webpage)
          if indavideo_urls:
              return self.playlist_from_matches(
@@ -3337,7 +3372,7 @@ def filter_video(urls):
  
          if not found:
              # twitter:player is a https URL to iframe player that may or may not
-            # be supported by youtube-dl thus this is checked the very last (see
+            # be supported by youtube-dlc thus this is checked the very last (see
              # https://dev.twitter.com/cards/types/player#On_twitter.com_via_desktop_browser)
              embed_url = self._html_search_meta('twitter:player', webpage, default=None)
              if embed_url and embed_url != url:
diff --git a/youtube_dl/extractor/gfycat.py b/youtube_dlc/extractor/gfycat.py

similarity index 100%

rename from youtube_dl/extractor/gfycat.py

rename to youtube_dlc/extractor/gfycat.py
diff --git a/youtube_dl/extractor/giantbomb.py b/youtube_dlc/extractor/giantbomb.py

similarity index 90%

rename from youtube_dl/extractor/giantbomb.py

rename to youtube_dlc/extractor/giantbomb.py

index 6a1b1e96ebf4dc59f7f5e13dc18e3ee08ef1110c..c6477958d2766704ade1ba25bc2ed68676655889 100644 (file)
--- a/youtube_dl/extractor/giantbomb.py
+++ b/youtube_dlc/extractor/giantbomb.py
@@ -13,10 +13,10 @@
  
  
  class GiantBombIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?giantbomb\.com/videos/(?P<display_id>[^/]+)/(?P<id>\d+-\d+)'
-    _TEST = {
+    _VALID_URL = r'https?://(?:www\.)?giantbomb\.com/(?:videos|shows)/(?P<display_id>[^/]+)/(?P<id>\d+-\d+)'
+    _TESTS = [{
          'url': 'http://www.giantbomb.com/videos/quick-look-destiny-the-dark-below/2300-9782/',
-        'md5': 'c8ea694254a59246a42831155dec57ac',
+        'md5': '132f5a803e7e0ab0e274d84bda1e77ae',
          'info_dict': {
              'id': '2300-9782',
              'display_id': 'quick-look-destiny-the-dark-below',
@@ -26,7 +26,10 @@ class GiantBombIE(InfoExtractor):
              'duration': 2399,
              'thumbnail': r're:^https?://.*\.jpg$',
          }
-    }
+    }, {
+        'url': 'https://www.giantbomb.com/shows/ben-stranding/2970-20212',
+        'only_matching': True,
+    }]
  
      def _real_extract(self, url):
          mobj = re.match(self._VALID_URL, url)
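Note on the giantbomb.py hunk above: the widened `_VALID_URL` can be checked outside the extractor. The following is a minimal sketch, assuming only the pattern and the two test URLs shown in the hunk; the helper name `parse_giantbomb_url` is hypothetical and does not exist in youtube-dlc.

```python
import re

# Pattern copied verbatim from the updated _VALID_URL in the hunk above.
VALID_URL = (
    r'https?://(?:www\.)?giantbomb\.com/(?:videos|shows)/'
    r'(?P<display_id>[^/]+)/(?P<id>\d+-\d+)'
)


def parse_giantbomb_url(url):
    """Hypothetical helper: return (display_id, video_id), or None if no match."""
    mobj = re.match(VALID_URL, url)
    if mobj is None:
        return None
    return mobj.group('display_id'), mobj.group('id')


if __name__ == '__main__':
    # Both URLs come from the _TESTS list in the hunk.
    print(parse_giantbomb_url(
        'http://www.giantbomb.com/videos/quick-look-destiny-the-dark-below/2300-9782/'))
    print(parse_giantbomb_url(
        'https://www.giantbomb.com/shows/ben-stranding/2970-20212'))
```

The `only_matching` test entry in the hunk exercises the same thing: it only asserts that the `/shows/` URL is accepted by the pattern, without downloading anything.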
diff --git a/youtube_dl/extractor/giga.py b/youtube_dlc/extractor/giga.py

similarity index 100%

rename from youtube_dl/extractor/giga.py

rename to youtube_dlc/extractor/giga.py
diff --git a/youtube_dl/extractor/gigya.py b/youtube_dlc/extractor/gigya.py

similarity index 100%

rename from youtube_dl/extractor/gigya.py

rename to youtube_dlc/extractor/gigya.py
diff --git a/youtube_dl/extractor/glide.py b/youtube_dlc/extractor/glide.py

similarity index 100%

rename from youtube_dl/extractor/glide.py

rename to youtube_dlc/extractor/glide.py
diff --git a/youtube_dl/extractor/globo.py b/youtube_dlc/extractor/globo.py

similarity index 100%

rename from youtube_dl/extractor/globo.py

rename to youtube_dlc/extractor/globo.py
diff --git a/youtube_dl/extractor/go.py b/youtube_dlc/extractor/go.py

similarity index 100%

rename from youtube_dl/extractor/go.py

rename to youtube_dlc/extractor/go.py
diff --git a/youtube_dl/extractor/godtube.py b/youtube_dlc/extractor/godtube.py

similarity index 100%

rename from youtube_dl/extractor/godtube.py

rename to youtube_dlc/extractor/godtube.py
diff --git a/youtube_dl/extractor/golem.py b/youtube_dlc/extractor/golem.py

similarity index 100%

rename from youtube_dl/extractor/golem.py

rename to youtube_dlc/extractor/golem.py
diff --git a/youtube_dl/extractor/googledrive.py b/youtube_dlc/extractor/googledrive.py

similarity index 99%

rename from youtube_dl/extractor/googledrive.py

rename to youtube_dlc/extractor/googledrive.py

index 589e4d5c371480d590b504dd1a3738a858c80790..886fdd5328dee16ee68edc292a12cdabc81f4a06 100644 (file)
--- a/youtube_dl/extractor/googledrive.py
+++ b/youtube_dlc/extractor/googledrive.py
@@ -265,6 +265,8 @@ def add_source_format(src_url):
              subtitles_id = ttsurl.encode('utf-8').decode(
                  'unicode_escape').split('=')[-1]
  
+        self._downloader.cookiejar.clear(domain='.google.com', path='/', name='NID')
+
          return {
              'id': video_id,
              'title': title,
diff --git a/youtube_dl/extractor/googleplus.py b/youtube_dlc/extractor/googleplus.py

similarity index 100%

rename from youtube_dl/extractor/googleplus.py

rename to youtube_dlc/extractor/googleplus.py
diff --git a/youtube_dl/extractor/googlesearch.py b/youtube_dlc/extractor/googlesearch.py

similarity index 100%

rename from youtube_dl/extractor/googlesearch.py

rename to youtube_dlc/extractor/googlesearch.py
diff --git a/youtube_dl/extractor/goshgay.py b/youtube_dlc/extractor/goshgay.py

similarity index 100%

rename from youtube_dl/extractor/goshgay.py

rename to youtube_dlc/extractor/goshgay.py
diff --git a/youtube_dl/extractor/gputechconf.py b/youtube_dlc/extractor/gputechconf.py

similarity index 100%

rename from youtube_dl/extractor/gputechconf.py

rename to youtube_dlc/extractor/gputechconf.py
diff --git a/youtube_dl/extractor/groupon.py b/youtube_dlc/extractor/groupon.py

similarity index 100%

rename from youtube_dl/extractor/groupon.py

rename to youtube_dlc/extractor/groupon.py
diff --git a/youtube_dl/extractor/hbo.py b/youtube_dlc/extractor/hbo.py

similarity index 100%

rename from youtube_dl/extractor/hbo.py

rename to youtube_dlc/extractor/hbo.py
diff --git a/youtube_dl/extractor/hearthisat.py b/youtube_dlc/extractor/hearthisat.py

similarity index 100%

rename from youtube_dl/extractor/hearthisat.py

rename to youtube_dlc/extractor/hearthisat.py
diff --git a/youtube_dl/extractor/heise.py b/youtube_dlc/extractor/heise.py

similarity index 100%

rename from youtube_dl/extractor/heise.py

rename to youtube_dlc/extractor/heise.py
diff --git a/youtube_dlc/extractor/hellporno.py b/youtube_dlc/extractor/hellporno.py

new file mode 100644 (file)

index 0000000..fae4251
--- /dev/null
+++ b/youtube_dlc/extractor/hellporno.py
@@ -0,0 +1,76 @@
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    int_or_none,
+    merge_dicts,
+    remove_end,
+    unified_timestamp,
+)
+
+
+class HellPornoIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?hellporno\.(?:com/videos|net/v)/(?P<id>[^/]+)'
+    _TESTS = [{
+        'url': 'http://hellporno.com/videos/dixie-is-posing-with-naked-ass-very-erotic/',
+        'md5': 'f0a46ebc0bed0c72ae8fe4629f7de5f3',
+        'info_dict': {
+            'id': '149116',
+            'display_id': 'dixie-is-posing-with-naked-ass-very-erotic',
+            'ext': 'mp4',
+            'title': 'Dixie is posing with naked ass very erotic',
+            'description': 'md5:9a72922749354edb1c4b6e540ad3d215',
+            'categories': list,
+            'thumbnail': r're:https?://.*\.jpg$',
+            'duration': 240,
+            'timestamp': 1398762720,
+            'upload_date': '20140429',
+            'view_count': int,
+            'age_limit': 18,
+        },
+    }, {
+        'url': 'http://hellporno.net/v/186271/',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, display_id)
+
+        title = remove_end(self._html_search_regex(
+            r'<title>([^<]+)</title>', webpage, 'title'), ' - Hell Porno')
+
+        info = self._parse_html5_media_entries(url, webpage, display_id)[0]
+        self._sort_formats(info['formats'])
+
+        video_id = self._search_regex(
+            (r'chs_object\s*=\s*["\'](\d+)',
+             r'params\[["\']video_id["\']\]\s*=\s*(\d+)'), webpage, 'video id',
+            default=display_id)
+        description = self._search_regex(
+            r'class=["\']desc_video_view_v2[^>]+>([^<]+)', webpage,
+            'description', fatal=False)
+        categories = [
+            c.strip()
+            for c in self._html_search_meta(
+                'keywords', webpage, 'categories', default='').split(',')
+            if c.strip()]
+        duration = int_or_none(self._og_search_property(
+            'video:duration', webpage, fatal=False))
+        timestamp = unified_timestamp(self._og_search_property(
+            'video:release_date', webpage, fatal=False))
+        view_count = int_or_none(self._search_regex(
+            r'>Views\s+(\d+)', webpage, 'view count', fatal=False))
+
+        return merge_dicts(info, {
+            'id': video_id,
+            'display_id': display_id,
+            'title': title,
+            'description': description,
+            'categories': categories,
+            'duration': duration,
+            'timestamp': timestamp,
+            'view_count': view_count,
+            'age_limit': 18,
+        })
diff --git a/youtube_dl/extractor/helsinki.py b/youtube_dlc/extractor/helsinki.py

similarity index 100%

rename from youtube_dl/extractor/helsinki.py

rename to youtube_dlc/extractor/helsinki.py
diff --git a/youtube_dl/extractor/hentaistigma.py b/youtube_dlc/extractor/hentaistigma.py

similarity index 100%

rename from youtube_dl/extractor/hentaistigma.py

rename to youtube_dlc/extractor/hentaistigma.py
diff --git a/youtube_dl/extractor/hgtv.py b/youtube_dlc/extractor/hgtv.py

similarity index 100%

rename from youtube_dl/extractor/hgtv.py

rename to youtube_dlc/extractor/hgtv.py
diff --git a/youtube_dl/extractor/hidive.py b/youtube_dlc/extractor/hidive.py

similarity index 100%

rename from youtube_dl/extractor/hidive.py

rename to youtube_dlc/extractor/hidive.py
diff --git a/youtube_dl/extractor/historicfilms.py b/youtube_dlc/extractor/historicfilms.py

similarity index 100%

rename from youtube_dl/extractor/historicfilms.py

rename to youtube_dlc/extractor/historicfilms.py
diff --git a/youtube_dl/extractor/hitbox.py b/youtube_dlc/extractor/hitbox.py

similarity index 100%

rename from youtube_dl/extractor/hitbox.py

rename to youtube_dlc/extractor/hitbox.py
diff --git a/youtube_dl/extractor/hitrecord.py b/youtube_dlc/extractor/hitrecord.py

similarity index 100%

rename from youtube_dl/extractor/hitrecord.py

rename to youtube_dlc/extractor/hitrecord.py
diff --git a/youtube_dl/extractor/hketv.py b/youtube_dlc/extractor/hketv.py

similarity index 100%

rename from youtube_dl/extractor/hketv.py

rename to youtube_dlc/extractor/hketv.py
diff --git a/youtube_dl/extractor/hornbunny.py b/youtube_dlc/extractor/hornbunny.py

similarity index 100%

rename from youtube_dl/extractor/hornbunny.py

rename to youtube_dlc/extractor/hornbunny.py
diff --git a/youtube_dl/extractor/hotnewhiphop.py b/youtube_dlc/extractor/hotnewhiphop.py

similarity index 100%

rename from youtube_dl/extractor/hotnewhiphop.py

rename to youtube_dlc/extractor/hotnewhiphop.py
diff --git a/youtube_dl/extractor/hotstar.py b/youtube_dlc/extractor/hotstar.py

similarity index 86%

rename from youtube_dl/extractor/hotstar.py

rename to youtube_dlc/extractor/hotstar.py

index f97eefa3d6789cf58b05a6e6bf91d290463cf444..4d27fea4e8c068c482fec1a707068879931de836 100644 (file)
--- a/youtube_dl/extractor/hotstar.py
+++ b/youtube_dlc/extractor/hotstar.py
@@ -6,6 +6,7 @@
  import re
  import time
  import uuid
+import json
  
  from .common import InfoExtractor
  from ..compat import (
@@ -30,16 +31,25 @@ def _call_api_impl(self, path, video_id, query):
          exp = st + 6000
          auth = 'st=%d~exp=%d~acl=/*' % (st, exp)
          auth += '~hmac=' + hmac.new(self._AKAMAI_ENCRYPTION_KEY, auth.encode(), hashlib.sha256).hexdigest()
+        token = self._download_json(
+            'https://api.hotstar.com/in/aadhar/v2/web/in/user/guest-signup',
+            video_id, note='Downloading token',
+            data=json.dumps({"idType": "device", "id": compat_str(uuid.uuid4())}).encode('utf-8'),
+            headers={
+                'hotstarauth': auth,
+                'Content-Type': 'application/json',
+            })['description']['userIdentity']
          response = self._download_json(
              'https://api.hotstar.com/' + path, video_id, headers={
                  'hotstarauth': auth,
-                'x-country-code': 'IN',
-                'x-platform-code': 'JIO',
+                'x-hs-appversion': '6.72.2',
+                'x-hs-platform': 'web',
+                'x-hs-usertoken': token,
              }, query=query)
-        if response['statusCode'] != 'OK':
+        if response['message'] != "Playback URL's fetched successfully":
              raise ExtractorError(
-                response['body']['message'], expected=True)
-        return response['body']['results']
+                response['message'], expected=True)
+        return response['data']
  
      def _call_api(self, path, video_id, query_name='contentId'):
          return self._call_api_impl(path, video_id, {
@@ -49,13 +59,11 @@ def _call_api(self, path, video_id, query_name='contentId'):
  
      def _call_api_v2(self, path, video_id):
          return self._call_api_impl(
-            '%s/in/contents/%s' % (path, video_id), video_id, {
-                'desiredConfig': 'encryption:plain;ladder:phone,tv;package:hls,dash',
-                'client': 'mweb',
-                'clientVersion': '6.18.0',
-                'deviceId': compat_str(uuid.uuid4()),
-                'osName': 'Windows',
-                'osVersion': '10',
+            '%s/content/%s' % (path, video_id), video_id, {
+                'desired-config': 'encryption:plain;ladder:phone,tv;package:hls,dash',
+                'device-id': compat_str(uuid.uuid4()),
+                'os-name': 'Windows',
+                'os-version': '10',
              })
  
  
@@ -121,7 +129,7 @@ def _real_extract(self, url):
          headers = {'Referer': url}
          formats = []
          geo_restricted = False
-        playback_sets = self._call_api_v2('h/v2/play', video_id)['playBackSets']
+        playback_sets = self._call_api_v2('play/v1/playback', video_id)['playBackSets']
          for playback_set in playback_sets:
              if not isinstance(playback_set, dict):
                  continue
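Note on the hotstar.py hunks above: the `hotstarauth` header is built by signing the `st=...~exp=...~acl=/*` prefix with HMAC-SHA256 and appending the hex digest. The sketch below reproduces only that step with the standard library. The function name `build_hotstar_auth` and the key value are placeholders (the real `_AKAMAI_ENCRYPTION_KEY` is not part of this diff), and the start time is passed in explicitly so the result is reproducible.

```python
import hashlib
import hmac


def build_hotstar_auth(key, start_time, lifetime=6000):
    """Hypothetical helper mirroring the token layout in _call_api_impl.

    key        -- bytes; placeholder for the extractor's encryption key
    start_time -- integer Unix timestamp ("st" in the token)
    lifetime   -- validity in seconds; the hunk uses st + 6000
    """
    exp = start_time + lifetime
    auth = 'st=%d~exp=%d~acl=/*' % (start_time, exp)
    # The digest covers the prefix only; it is appended afterwards.
    digest = hmac.new(key, auth.encode(), hashlib.sha256).hexdigest()
    return auth + '~hmac=' + digest


if __name__ == '__main__':
    # Placeholder key, not the real one.
    print(build_hotstar_auth(b'placeholder-key', 1000))
```

In the updated extractor the same header value is sent twice: once to the guest-signup endpoint to obtain the user token, and once with the playback request.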
diff --git a/youtube_dl/extractor/howcast.py b/youtube_dlc/extractor/howcast.py

similarity index 100%

rename from youtube_dl/extractor/howcast.py

rename to youtube_dlc/extractor/howcast.py
diff --git a/youtube_dl/extractor/howstuffworks.py b/youtube_dlc/extractor/howstuffworks.py

similarity index 100%

rename from youtube_dl/extractor/howstuffworks.py

rename to youtube_dlc/extractor/howstuffworks.py
diff --git a/youtube_dlc/extractor/hrfensehen.py b/youtube_dlc/extractor/hrfensehen.py

new file mode 100644 (file)

index 0000000..805345e
--- /dev/null
+++ b/youtube_dlc/extractor/hrfensehen.py
@@ -0,0 +1,102 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import json
+import re
+
+from youtube_dlc.utils import int_or_none, unified_timestamp, unescapeHTML
+from .common import InfoExtractor
+
+
+class HRFernsehenIE(InfoExtractor):
+    IE_NAME = 'hrfernsehen'
+    _VALID_URL = r'^https?://www\.(?:hr-fernsehen|hessenschau)\.de/.*,video-(?P<id>[0-9]{6})\.html'
+
+    _TESTS = [{
+        'url': 'https://www.hessenschau.de/tv-sendung/hessenschau-vom-26082020,video-130546.html',
+        'md5': '5c4e0ba94677c516a2f65a84110fc536',
+        'info_dict': {
+            'id': '130546',
+            'ext': 'mp4',
+            'description': 'Sturmtief Kirsten fegt über Hessen / Die Corona-Pandemie – eine Chronologie / '
+                           'Sterbehilfe: Die Lage in Hessen / Miss Hessen leitet zwei eigene Unternehmen / '
+                           'Pop-Up Museum zeigt Schwarze Unterhaltung und Black Music',
+            'subtitles': {'de': [{
+                'url': 'https://hr-a.akamaihd.net/video/as/hessenschau/2020_08/hrLogo_200826200407_L385592_512x288-25p-500kbit.vtt'
+            }]},
+            'timestamp': 1598470200,
+            'upload_date': '20200826',
+            'thumbnails': [{
+                'url': 'https://www.hessenschau.de/tv-sendung/hs_ganz-1554~_t-1598465545029_v-16to9.jpg',
+                'id': '0'
+            }, {
+                'url': 'https://www.hessenschau.de/tv-sendung/hs_ganz-1554~_t-1598465545029_v-16to9__medium.jpg',
+                'id': '1'
+            }],
+            'title': 'hessenschau vom 26.08.2020'
+        }
+    }, {
+        'url': 'https://www.hr-fernsehen.de/sendungen-a-z/mex/sendungen/fair-und-gut---was-hinter-aldis-eigenem-guetesiegel-steckt,video-130544.html',
+        'only_matching': True
+    }]
+
+    _GEO_COUNTRIES = ['DE']
+
+    def extract_airdate(self, loader_data):
+        airdate_str = loader_data.get('mediaMetadata', {}).get('agf', {}).get('airdate')
+
+        if airdate_str is None:
+            return None
+
+        return unified_timestamp(airdate_str)
+
+    def extract_formats(self, loader_data):
+        stream_formats = []
+        for stream_obj in loader_data["videoResolutionLevels"]:
+            stream_format = {
+                'format_id': str(stream_obj['verticalResolution']) + "p",
+                'height': stream_obj['verticalResolution'],
+                'url': stream_obj['url'],
+            }
+
+            quality_information = re.search(r'([0-9]{3,4})x([0-9]{3,4})-([0-9]{2})p-([0-9]{3,4})kbit',
+                                            stream_obj['url'])
+            if quality_information:
+                stream_format['width'] = int_or_none(quality_information.group(1))
+                stream_format['height'] = int_or_none(quality_information.group(2))
+                stream_format['fps'] = int_or_none(quality_information.group(3))
+                stream_format['tbr'] = int_or_none(quality_information.group(4))
+
+            stream_formats.append(stream_format)
+
+        self._sort_formats(stream_formats)
+        return stream_formats
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        webpage = self._download_webpage(url, video_id)
+
+        title = self._html_search_meta(
+            ['og:title', 'twitter:title', 'name'], webpage)
+        description = self._html_search_meta(
+            ['description'], webpage)
+
+        loader_str = unescapeHTML(self._search_regex(r"data-hr-mediaplayer-loader='([^']*)'", webpage, "ardloader"))
+        loader_data = json.loads(loader_str)
+
+        info = {
+            'id': video_id,
+            'title': title,
+            'description': description,
+            'formats': self.extract_formats(loader_data),
+            'timestamp': self.extract_airdate(loader_data)
+        }
+
+        if "subtitle" in loader_data:
+            info["subtitles"] = {"de": [{"url": loader_data["subtitle"]}]}
+
+        thumbnails = list(set([t for t in loader_data.get("previewImageUrl", {}).values()]))
+        if len(thumbnails) > 0:
+            info["thumbnails"] = [{"url": t} for t in thumbnails]
+
+        return info
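Note on the new hrfensehen.py file above: `extract_formats` derives width, height, frame rate and bitrate from the stream URL itself. The following is a minimal standalone sketch of that parsing step, assuming the same regular expression; `parse_quality` is a hypothetical name, and the sample URL is the subtitle URL from the extractor's test data, used here only because it carries the same `WxH-FPSp-KBITkbit` naming.

```python
import re

# Same pattern as in extract_formats: width x height - fps "p" - bitrate "kbit".
QUALITY_RE = re.compile(
    r'([0-9]{3,4})x([0-9]{3,4})-([0-9]{2})p-([0-9]{3,4})kbit')


def parse_quality(url):
    """Hypothetical helper: return format fields found in url, or {} if none."""
    mobj = QUALITY_RE.search(url)
    if not mobj:
        return {}
    width, height, fps, tbr = (int(group) for group in mobj.groups())
    return {'width': width, 'height': height, 'fps': fps, 'tbr': tbr}


if __name__ == '__main__':
    sample = ('https://hr-a.akamaihd.net/video/as/hessenschau/2020_08/'
              'hrLogo_200826200407_L385592_512x288-25p-500kbit.vtt')
    print(parse_quality(sample))
```

When the URL does not follow this naming, the extractor keeps the `height` taken from `verticalResolution`, which corresponds to the empty-dict case in the sketch.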
diff --git a/youtube_dl/extractor/hrti.py b/youtube_dlc/extractor/hrti.py

similarity index 100%

rename from youtube_dl/extractor/hrti.py

rename to youtube_dlc/extractor/hrti.py
diff --git a/youtube_dl/extractor/huajiao.py b/youtube_dlc/extractor/huajiao.py

similarity index 100%

rename from youtube_dl/extractor/huajiao.py

rename to youtube_dlc/extractor/huajiao.py
diff --git a/youtube_dl/extractor/huffpost.py b/youtube_dlc/extractor/huffpost.py

similarity index 100%

rename from youtube_dl/extractor/huffpost.py

rename to youtube_dlc/extractor/huffpost.py
diff --git a/youtube_dl/extractor/hungama.py b/youtube_dlc/extractor/hungama.py

similarity index 100%

rename from youtube_dl/extractor/hungama.py

rename to youtube_dlc/extractor/hungama.py
diff --git a/youtube_dl/extractor/hypem.py b/youtube_dlc/extractor/hypem.py

similarity index 100%

rename from youtube_dl/extractor/hypem.py

rename to youtube_dlc/extractor/hypem.py
diff --git a/youtube_dl/extractor/ign.py b/youtube_dlc/extractor/ign.py

similarity index 100%

rename from youtube_dl/extractor/ign.py

rename to youtube_dlc/extractor/ign.py
diff --git a/youtube_dl/extractor/imdb.py b/youtube_dlc/extractor/imdb.py

similarity index 71%

rename from youtube_dl/extractor/imdb.py

rename to youtube_dlc/extractor/imdb.py

index 436759da5480da347a578aba6cb388cb01e97448..a31301985b0c7d212886a2e6e495c7d705714041 100644 (file)
--- a/youtube_dl/extractor/imdb.py
+++ b/youtube_dlc/extractor/imdb.py
@@ -1,5 +1,7 @@
  from __future__ import unicode_literals
  
+import base64
+import json
  import re
  
  from .common import InfoExtractor
@@ -8,6 +10,7 @@
      mimetype2ext,
      parse_duration,
      qualities,
+    try_get,
      url_or_none,
  )
  
@@ -15,15 +18,16 @@
  class ImdbIE(InfoExtractor):
      IE_NAME = 'imdb'
      IE_DESC = 'Internet Movie Database trailers'
-    _VALID_URL = r'https?://(?:www|m)\.imdb\.com/(?:video|title|list).+?[/-]vi(?P<id>\d+)'
+    _VALID_URL = r'https?://(?:www|m)\.imdb\.com/(?:video|title|list).*?[/-]vi(?P<id>\d+)'
  
      _TESTS = [{
          'url': 'http://www.imdb.com/video/imdb/vi2524815897',
          'info_dict': {
              'id': '2524815897',
              'ext': 'mp4',
-            'title': 'No. 2 from Ice Age: Continental Drift (2012)',
+            'title': 'No. 2',
              'description': 'md5:87bd0bdc61e351f21f20d2d7441cb4e7',
+            'duration': 152,
          }
      }, {
          'url': 'http://www.imdb.com/video/_/vi2524815897',
@@ -47,21 +51,23 @@ class ImdbIE(InfoExtractor):
  
      def _real_extract(self, url):
          video_id = self._match_id(url)
-        webpage = self._download_webpage(
-            'https://www.imdb.com/videoplayer/vi' + video_id, video_id)
-        video_metadata = self._parse_json(self._search_regex(
-            r'window\.IMDbReactInitialState\.push\(({.+?})\);', webpage,
-            'video metadata'), video_id)['videos']['videoMetadata']['vi' + video_id]
-        title = self._html_search_meta(
-            ['og:title', 'twitter:title'], webpage) or self._html_search_regex(
-            r'<title>(.+?)</title>', webpage, 'title', fatal=False) or video_metadata['title']
+
+        data = self._download_json(
+            'https://www.imdb.com/ve/data/VIDEO_PLAYBACK_DATA', video_id,
+            query={
+                'key': base64.b64encode(json.dumps({
+                    'type': 'VIDEO_PLAYER',
+                    'subType': 'FORCE_LEGACY',
+                    'id': 'vi%s' % video_id,
+                }).encode()).decode(),
+            })[0]
  
          quality = qualities(('SD', '480p', '720p', '1080p'))
          formats = []
-        for encoding in video_metadata.get('encodings', []):
+        for encoding in data['videoLegacyEncodings']:
              if not encoding or not isinstance(encoding, dict):
                  continue
-            video_url = url_or_none(encoding.get('videoUrl'))
+            video_url = url_or_none(encoding.get('url'))
              if not video_url:
                  continue
              ext = mimetype2ext(encoding.get(
@@ -69,7 +75,7 @@ def _real_extract(self, url):
              if ext == 'm3u8':
                  formats.extend(self._extract_m3u8_formats(
                      video_url, video_id, 'mp4', entry_protocol='m3u8_native',
-                    m3u8_id='hls', fatal=False))
+                    preference=1, m3u8_id='hls', fatal=False))
                  continue
              format_id = encoding.get('definition')
              formats.append({
@@ -80,13 +86,33 @@ def _real_extract(self, url):
              })
          self._sort_formats(formats)
  
+        webpage = self._download_webpage(
+            'https://www.imdb.com/video/vi' + video_id, video_id)
+        video_metadata = self._parse_json(self._search_regex(
+            r'args\.push\(\s*({.+?})\s*\)\s*;', webpage,
+            'video metadata'), video_id)
+
+        video_info = video_metadata.get('VIDEO_INFO')
+        if video_info and isinstance(video_info, dict):
+            info = try_get(
+                video_info, lambda x: x[list(video_info.keys())[0]][0], dict)
+        else:
+            info = {}
+
+        title = self._html_search_meta(
+            ['og:title', 'twitter:title'], webpage) or self._html_search_regex(
+            r'<title>(.+?)</title>', webpage, 'title',
+            default=None) or info['videoTitle']
+
          return {
              'id': video_id,
              'title': title,
+            'alt_title': info.get('videoSubTitle'),
              'formats': formats,
-            'description': video_metadata.get('description'),
-            'thumbnail': video_metadata.get('slate', {}).get('url'),
-            'duration': parse_duration(video_metadata.get('duration')),
+            'description': info.get('videoDescription'),
+            'thumbnail': url_or_none(try_get(
+                video_metadata, lambda x: x['videoSlate']['source'])),
+            'duration': parse_duration(info.get('videoRuntime')),
          }
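
Note on the imdb.py change above: the new `_real_extract` sends a `key` query parameter that is the base64 encoding of a small JSON object. The following is a minimal standalone sketch of that encoding step using only the standard library. The helper name `build_playback_key` is hypothetical and does not exist in the extractor; it only mirrors the inline expression from the diff.

```python
import base64
import json


def build_playback_key(video_id):
    # Hypothetical helper: same payload and encoding as the inline
    # expression passed as query={'key': ...} in the diff.
    payload = {
        'type': 'VIDEO_PLAYER',
        'subType': 'FORCE_LEGACY',
        'id': 'vi%s' % video_id,
    }
    # json.dumps gives text, .encode() gives bytes for b64encode,
    # and .decode() turns the base64 bytes back into a text string.
    return base64.b64encode(json.dumps(payload).encode()).decode()


key = build_playback_key('2524815897')
# Round-trip the key to confirm it carries the expected JSON object.
decoded = json.loads(base64.b64decode(key).decode())
```

Decoding the key again, as in the last line, is a quick way to inspect what is actually sent to the endpoint.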
  
  
diff --git a/youtube_dl/extractor/imggaming.py b/youtube_dlc/extractor/imggaming.py

similarity index 100%

rename from youtube_dl/extractor/imggaming.py

rename to youtube_dlc/extractor/imggaming.py
diff --git a/youtube_dl/extractor/imgur.py b/youtube_dlc/extractor/imgur.py

similarity index 97%

rename from youtube_dl/extractor/imgur.py

rename to youtube_dlc/extractor/imgur.py

index a5ba03efae57e64cb0d05caaff8e9dd700aa6a0e..4dc7b0b5c0ede5df3e433890655377c22602eafa 100644 (file)
--- a/youtube_dl/extractor/imgur.py
+++ b/youtube_dlc/extractor/imgur.py
@@ -60,7 +60,7 @@ def _real_extract(self, url):
                  'width': width,
                  'height': height,
                  'http_headers': {
-                    'User-Agent': 'youtube-dl (like wget)',
+                    'User-Agent': 'youtube-dlc (like wget)',
                  },
              })
  
@@ -82,7 +82,7 @@ def _real_extract(self, url):
                  'url': self._proto_relative_url(gifd['gifUrl']),
                  'filesize': gifd.get('size'),
                  'http_headers': {
-                    'User-Agent': 'youtube-dl (like wget)',
+                    'User-Agent': 'youtube-dlc (like wget)',
                  },
              })
  
diff --git a/youtube_dl/extractor/ina.py b/youtube_dlc/extractor/ina.py

similarity index 100%

rename from youtube_dl/extractor/ina.py

rename to youtube_dlc/extractor/ina.py
diff --git a/youtube_dl/extractor/inc.py b/youtube_dlc/extractor/inc.py

similarity index 100%

rename from youtube_dl/extractor/inc.py

rename to youtube_dlc/extractor/inc.py
diff --git a/youtube_dl/extractor/indavideo.py b/youtube_dlc/extractor/indavideo.py

similarity index 97%

rename from youtube_dl/extractor/indavideo.py

rename to youtube_dlc/extractor/indavideo.py

index 2b5b2b5b0b303aa4c1b6bdb4a6e1226dea11e218..4c16243ec1976676391a5e07a3b35e1140a5ec7a 100644 (file)
--- a/youtube_dl/extractor/indavideo.py
+++ b/youtube_dlc/extractor/indavideo.py
@@ -58,7 +58,7 @@ def _real_extract(self, url):
          video_id = self._match_id(url)
  
          video = self._download_json(
-            'http://amfphp.indavideo.hu/SYm0json.php/player.playerHandler.getVideoData/%s' % video_id,
+            'https://amfphp.indavideo.hu/SYm0json.php/player.playerHandler.getVideoData/%s' % video_id,
              video_id)['data']
  
          title = video['title']
diff --git a/youtube_dl/extractor/infoq.py b/youtube_dlc/extractor/infoq.py

similarity index 100%

rename from youtube_dl/extractor/infoq.py

rename to youtube_dlc/extractor/infoq.py
diff --git a/youtube_dl/extractor/instagram.py b/youtube_dlc/extractor/instagram.py

similarity index 100%

rename from youtube_dl/extractor/instagram.py

rename to youtube_dlc/extractor/instagram.py
diff --git a/youtube_dl/extractor/internazionale.py b/youtube_dlc/extractor/internazionale.py

similarity index 100%

rename from youtube_dl/extractor/internazionale.py

rename to youtube_dlc/extractor/internazionale.py
diff --git a/youtube_dl/extractor/internetvideoarchive.py b/youtube_dlc/extractor/internetvideoarchive.py

similarity index 100%

rename from youtube_dl/extractor/internetvideoarchive.py

rename to youtube_dlc/extractor/internetvideoarchive.py
diff --git a/youtube_dl/extractor/iprima.py b/youtube_dlc/extractor/iprima.py

similarity index 81%

rename from youtube_dl/extractor/iprima.py

rename to youtube_dlc/extractor/iprima.py

index 11bbeb5922a9d85e05977196c822a076c8b45ec3..53a550c11e4407813deb12f646a0c714436862b5 100644 (file)
--- a/youtube_dl/extractor/iprima.py
+++ b/youtube_dlc/extractor/iprima.py
@@ -16,12 +16,22 @@ class IPrimaIE(InfoExtractor):
      _GEO_BYPASS = False
  
      _TESTS = [{
-        'url': 'http://play.iprima.cz/gondici-s-r-o-33',
+        'url': 'https://prima.iprima.cz/particka/92-epizoda',
          'info_dict': {
-            'id': 'p136534',
+            'id': 'p51388',
              'ext': 'mp4',
-            'title': 'Gondíci s. r. o. (34)',
-            'description': 'md5:16577c629d006aa91f59ca8d8e7f99bd',
+            'title': 'Partička (92)',
+            'description': 'md5:859d53beae4609e6dd7796413f1b6cac',
+        },
+        'params': {
+            'skip_download': True,  # m3u8 download
+        },
+    }, {
+        'url': 'https://cnn.iprima.cz/videa/70-epizoda',
+        'info_dict': {
+            'id': 'p681554',
+            'ext': 'mp4',
+            'title': 'HLAVNÍ ZPRÁVY 3.5.2020',
          },
          'params': {
              'skip_download': True,  # m3u8 download
@@ -68,9 +78,15 @@ def _real_extract(self, url):
  
          webpage = self._download_webpage(url, video_id)
  
+        title = self._og_search_title(
+            webpage, default=None) or self._search_regex(
+            r'<h1>([^<]+)', webpage, 'title')
+
          video_id = self._search_regex(
              (r'<iframe[^>]+\bsrc=["\'](?:https?:)?//(?:api\.play-backend\.iprima\.cz/prehravac/embedded|prima\.iprima\.cz/[^/]+/[^/]+)\?.*?\bid=(p\d+)',
-             r'data-product="([^"]+)">'),
+             r'data-product="([^"]+)">',
+             r'id=["\']player-(p\d+)"',
+             r'playerId\s*:\s*["\']player-(p\d+)'),
              webpage, 'real id')
  
          playerpage = self._download_webpage(
@@ -125,8 +141,8 @@ def extract_formats(format_url, format_key=None, lang=None):
  
          return {
              'id': video_id,
-            'title': self._og_search_title(webpage),
-            'thumbnail': self._og_search_thumbnail(webpage),
+            'title': title,
+            'thumbnail': self._og_search_thumbnail(webpage, default=None),
              'formats': formats,
-            'description': self._og_search_description(webpage),
+            'description': self._og_search_description(webpage, default=None),
          }
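
Note on the iprima.py change above: `_search_regex` accepts a tuple of patterns and returns the first capture group of the first pattern that matches. The sketch below imitates that fallback with plain `re`, using the three shorter patterns from the diff (the long iframe pattern is left out for brevity). The function name `find_real_id` and the HTML snippets are made up for illustration.

```python
import re

# Patterns copied from the diff, in the same order (iframe pattern omitted).
ID_PATTERNS = (
    r'data-product="([^"]+)">',
    r'id=["\']player-(p\d+)"',
    r'playerId\s*:\s*["\']player-(p\d+)',
)


def find_real_id(webpage):
    # Hypothetical stand-in for _search_regex with a tuple of patterns:
    # try each pattern in order and return the first captured group.
    for pattern in ID_PATTERNS:
        mobj = re.search(pattern, webpage)
        if mobj:
            return mobj.group(1)
    return None


# Invented page fragments, one for each of the two newly added patterns.
from_element_id = find_real_id('<div id="player-p51388"></div>')
from_player_config = find_real_id("playerId: 'player-p681554',")
no_match = find_real_id('<p>no player here</p>')
```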
diff --git a/youtube_dl/extractor/iqiyi.py b/youtube_dlc/extractor/iqiyi.py

similarity index 100%

rename from youtube_dl/extractor/iqiyi.py

rename to youtube_dlc/extractor/iqiyi.py
diff --git a/youtube_dl/extractor/ir90tv.py b/youtube_dlc/extractor/ir90tv.py

similarity index 100%

rename from youtube_dl/extractor/ir90tv.py

rename to youtube_dlc/extractor/ir90tv.py
diff --git a/youtube_dl/extractor/itv.py b/youtube_dlc/extractor/itv.py

similarity index 100%

rename from youtube_dl/extractor/itv.py

rename to youtube_dlc/extractor/itv.py
diff --git a/youtube_dl/extractor/ivi.py b/youtube_dlc/extractor/ivi.py

similarity index 98%

rename from youtube_dl/extractor/ivi.py

rename to youtube_dlc/extractor/ivi.py

index a502e88066850b14de284e6e3ec7d47ea9397f3d..b9cb5a8e6bba0e7d6d4de35b2e69d98d56fda67a 100644 (file)
--- a/youtube_dl/extractor/ivi.py
+++ b/youtube_dlc/extractor/ivi.py
@@ -142,7 +142,7 @@ def _real_extract(self, url):
                      continue
                  elif bundled:
                      raise ExtractorError(
-                        'This feature does not work from bundled exe. Run youtube-dl from sources.',
+                        'This feature does not work from bundled exe. Run youtube-dlc from sources.',
                          expected=True)
                  elif not pycryptodomex_found:
                      raise ExtractorError(
@@ -239,7 +239,7 @@ def _extract_entries(self, html, compilation_id):
              self.url_result(
                  'http://www.ivi.ru/watch/%s/%s' % (compilation_id, serie), IviIE.ie_key())
              for serie in re.findall(
-                r'<a href="/watch/%s/(\d+)"[^>]+data-id="\1"' % compilation_id, html)]
+                r'<a\b[^>]+\bhref=["\']/watch/%s/(\d+)["\']' % compilation_id, html)]
  
      def _real_extract(self, url):
          mobj = re.match(self._VALID_URL, url)
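
Note on the `_extract_entries` change in ivi.py above: the old pattern required `href` to be the first attribute, double quotes, and a matching `data-id` attribute, while the new one accepts `href` anywhere in the tag with either quote style. A small comparison with plain `re` follows; the compilation id and the HTML are invented sample data.

```python
import re

compilation_id = 'example_series'  # hypothetical sample value

# Old and new patterns as they appear in the diff.
OLD = r'<a href="/watch/%s/(\d+)"[^>]+data-id="\1"' % compilation_id
NEW = r'<a\b[^>]+\bhref=["\']/watch/%s/(\d+)["\']' % compilation_id

# First link uses the old markup; second has href after another
# attribute, single quotes, and no data-id attribute.
html = (
    '<a href="/watch/example_series/101" class="x" data-id="101">1</a>'
    "<a class='x' href='/watch/example_series/102'>2</a>"
)

old_ids = re.findall(OLD, html)
new_ids = re.findall(NEW, html)
```

With this sample input the old pattern finds only the first link, and the new pattern finds both.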
diff --git a/youtube_dl/extractor/ivideon.py b/youtube_dlc/extractor/ivideon.py

similarity index 100%

rename from youtube_dl/extractor/ivideon.py

rename to youtube_dlc/extractor/ivideon.py
diff --git a/youtube_dl/extractor/iwara.py b/youtube_dlc/extractor/iwara.py

similarity index 100%

rename from youtube_dl/extractor/iwara.py

rename to youtube_dlc/extractor/iwara.py
diff --git a/youtube_dl/extractor/izlesene.py b/youtube_dlc/extractor/izlesene.py

similarity index 100%

rename from youtube_dl/extractor/izlesene.py

rename to youtube_dlc/extractor/izlesene.py
diff --git a/youtube_dl/extractor/jamendo.py b/youtube_dlc/extractor/jamendo.py

similarity index 100%

rename from youtube_dl/extractor/jamendo.py

rename to youtube_dlc/extractor/jamendo.py
diff --git a/youtube_dl/extractor/jeuxvideo.py b/youtube_dlc/extractor/jeuxvideo.py

similarity index 100%

rename from youtube_dl/extractor/jeuxvideo.py

rename to youtube_dlc/extractor/jeuxvideo.py
diff --git a/youtube_dl/extractor/joj.py b/youtube_dlc/extractor/joj.py

similarity index 97%

rename from youtube_dl/extractor/joj.py

rename to youtube_dlc/extractor/joj.py

index 62b28e9809856abaca23c4690c4670cacc96965a..63761818350b764742a3f8057f11d0e8f350fb6e 100644 (file)
--- a/youtube_dl/extractor/joj.py
+++ b/youtube_dlc/extractor/joj.py
@@ -1,108 +1,108 @@
-# coding: utf-8\r
-from __future__ import unicode_literals\r
-\r
-import re\r
-\r
-from .common import InfoExtractor\r
-from ..compat import compat_str\r
-from ..utils import (\r
-    int_or_none,\r
-    js_to_json,\r
-    try_get,\r
-)\r
-\r
-\r
-class JojIE(InfoExtractor):\r
-    _VALID_URL = r'''(?x)\r
-                    (?:\r
-                        joj:|\r
-                        https?://media\.joj\.sk/embed/\r
-                    )\r
-                    (?P<id>[^/?#^]+)\r
-                '''\r
-    _TESTS = [{\r
-        'url': 'https://media.joj.sk/embed/a388ec4c-6019-4a4a-9312-b1bee194e932',\r
-        'info_dict': {\r
-            'id': 'a388ec4c-6019-4a4a-9312-b1bee194e932',\r
-            'ext': 'mp4',\r
-            'title': 'NOVÉ BÝVANIE',\r
-            'thumbnail': r're:^https?://.*\.jpg$',\r
-            'duration': 3118,\r
-        }\r
-    }, {\r
-        'url': 'https://media.joj.sk/embed/9i1cxv',\r
-        'only_matching': True,\r
-    }, {\r
-        'url': 'joj:a388ec4c-6019-4a4a-9312-b1bee194e932',\r
-        'only_matching': True,\r
-    }, {\r
-        'url': 'joj:9i1cxv',\r
-        'only_matching': True,\r
-    }]\r
-\r
-    @staticmethod\r
-    def _extract_urls(webpage):\r
-        return [\r
-            mobj.group('url')\r
-            for mobj in re.finditer(\r
-                r'<iframe\b[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//media\.joj\.sk/embed/(?:(?!\1).)+)\1',\r
-                webpage)]\r
-\r
-    def _real_extract(self, url):\r
-        video_id = self._match_id(url)\r
-\r
-        webpage = self._download_webpage(\r
-            'https://media.joj.sk/embed/%s' % video_id, video_id)\r
-\r
-        title = self._search_regex(\r
-            (r'videoTitle\s*:\s*(["\'])(?P<title>(?:(?!\1).)+)\1',\r
-             r'<title>(?P<title>[^<]+)'), webpage, 'title',\r
-            default=None, group='title') or self._og_search_title(webpage)\r
-\r
-        bitrates = self._parse_json(\r
-            self._search_regex(\r
-                r'(?s)(?:src|bitrates)\s*=\s*({.+?});', webpage, 'bitrates',\r
-                default='{}'),\r
-            video_id, transform_source=js_to_json, fatal=False)\r
-\r
-        formats = []\r
-        for format_url in try_get(bitrates, lambda x: x['mp4'], list) or []:\r
-            if isinstance(format_url, compat_str):\r
-                height = self._search_regex(\r
-                    r'(\d+)[pP]\.', format_url, 'height', default=None)\r
-                formats.append({\r
-                    'url': format_url,\r
-                    'format_id': '%sp' % height if height else None,\r
-                    'height': int(height),\r
-                })\r
-        if not formats:\r
-            playlist = self._download_xml(\r
-                'https://media.joj.sk/services/Video.php?clip=%s' % video_id,\r
-                video_id)\r
-            for file_el in playlist.findall('./files/file'):\r
-                path = file_el.get('path')\r
-                if not path:\r
-                    continue\r
-                format_id = file_el.get('id') or file_el.get('label')\r
-                formats.append({\r
-                    'url': 'http://n16.joj.sk/storage/%s' % path.replace(\r
-                        'dat/', '', 1),\r
-                    'format_id': format_id,\r
-                    'height': int_or_none(self._search_regex(\r
-                        r'(\d+)[pP]', format_id or path, 'height',\r
-                        default=None)),\r
-                })\r
-        self._sort_formats(formats)\r
-\r
-        thumbnail = self._og_search_thumbnail(webpage)\r
-\r
-        duration = int_or_none(self._search_regex(\r
-            r'videoDuration\s*:\s*(\d+)', webpage, 'duration', fatal=False))\r
-\r
-        return {\r
-            'id': video_id,\r
-            'title': title,\r
-            'thumbnail': thumbnail,\r
-            'duration': duration,\r
-            'formats': formats,\r
-        }\r
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..compat import compat_str
+from ..utils import (
+    int_or_none,
+    js_to_json,
+    try_get,
+)
+
+
+class JojIE(InfoExtractor):
+    _VALID_URL = r'''(?x)
+                    (?:
+                        joj:|
+                        https?://media\.joj\.sk/embed/
+                    )
+                    (?P<id>[^/?#^]+)
+                '''
+    _TESTS = [{
+        'url': 'https://media.joj.sk/embed/a388ec4c-6019-4a4a-9312-b1bee194e932',
+        'info_dict': {
+            'id': 'a388ec4c-6019-4a4a-9312-b1bee194e932',
+            'ext': 'mp4',
+            'title': 'NOVÉ BÝVANIE',
+            'thumbnail': r're:^https?://.*\.jpg$',
+            'duration': 3118,
+        }
+    }, {
+        'url': 'https://media.joj.sk/embed/9i1cxv',
+        'only_matching': True,
+    }, {
+        'url': 'joj:a388ec4c-6019-4a4a-9312-b1bee194e932',
+        'only_matching': True,
+    }, {
+        'url': 'joj:9i1cxv',
+        'only_matching': True,
+    }]
+
+    @staticmethod
+    def _extract_urls(webpage):
+        return [
+            mobj.group('url')
+            for mobj in re.finditer(
+                r'<iframe\b[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//media\.joj\.sk/embed/(?:(?!\1).)+)\1',
+                webpage)]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        webpage = self._download_webpage(
+            'https://media.joj.sk/embed/%s' % video_id, video_id)
+
+        title = self._search_regex(
+            (r'videoTitle\s*:\s*(["\'])(?P<title>(?:(?!\1).)+)\1',
+             r'<title>(?P<title>[^<]+)'), webpage, 'title',
+            default=None, group='title') or self._og_search_title(webpage)
+
+        bitrates = self._parse_json(
+            self._search_regex(
+                r'(?s)(?:src|bitrates)\s*=\s*({.+?});', webpage, 'bitrates',
+                default='{}'),
+            video_id, transform_source=js_to_json, fatal=False)
+
+        formats = []
+        for format_url in try_get(bitrates, lambda x: x['mp4'], list) or []:
+            if isinstance(format_url, compat_str):
+                height = self._search_regex(
+                    r'(\d+)[pP]\.', format_url, 'height', default=None)
+                formats.append({
+                    'url': format_url,
+                    'format_id': '%sp' % height if height else None,
+                    'height': int(height),
+                })
+        if not formats:
+            playlist = self._download_xml(
+                'https://media.joj.sk/services/Video.php?clip=%s' % video_id,
+                video_id)
+            for file_el in playlist.findall('./files/file'):
+                path = file_el.get('path')
+                if not path:
+                    continue
+                format_id = file_el.get('id') or file_el.get('label')
+                formats.append({
+                    'url': 'http://n16.joj.sk/storage/%s' % path.replace(
+                        'dat/', '', 1),
+                    'format_id': format_id,
+                    'height': int_or_none(self._search_regex(
+                        r'(\d+)[pP]', format_id or path, 'height',
+                        default=None)),
+                })
+        self._sort_formats(formats)
+
+        thumbnail = self._og_search_thumbnail(webpage)
+
+        duration = int_or_none(self._search_regex(
+            r'videoDuration\s*:\s*(\d+)', webpage, 'duration', fatal=False))
+
+        return {
+            'id': video_id,
+            'title': title,
+            'thumbnail': thumbnail,
+            'duration': duration,
+            'formats': formats,
+        }
diff --git a/youtube_dl/extractor/jove.py b/youtube_dlc/extractor/jove.py

similarity index 100%

rename from youtube_dl/extractor/jove.py

rename to youtube_dlc/extractor/jove.py
diff --git a/youtube_dl/extractor/jwplatform.py b/youtube_dlc/extractor/jwplatform.py

similarity index 80%

rename from youtube_dl/extractor/jwplatform.py

rename to youtube_dlc/extractor/jwplatform.py

index 2aabd98b5bbaf36efecdc76415bf52854705bd1e..c34b5f5e6bd9e7d38e762f5d82f3669ac2c438a2 100644 (file)
--- a/youtube_dl/extractor/jwplatform.py
+++ b/youtube_dlc/extractor/jwplatform.py
@@ -4,6 +4,7 @@
  import re
  
  from .common import InfoExtractor
+from ..utils import unsmuggle_url
  
  
  class JWPlatformIE(InfoExtractor):
@@ -32,10 +33,14 @@ def _extract_url(webpage):
      @staticmethod
      def _extract_urls(webpage):
          return re.findall(
-            r'<(?:script|iframe)[^>]+?src=["\']((?:https?:)?//content\.jwplatform\.com/players/[a-zA-Z0-9]{8})',
+            r'<(?:script|iframe)[^>]+?src=["\']((?:https?:)?//(?:content\.jwplatform|cdn\.jwplayer)\.com/players/[a-zA-Z0-9]{8})',
              webpage)
  
      def _real_extract(self, url):
+        url, smuggled_data = unsmuggle_url(url, {})
+        self._initialize_geo_bypass({
+            'countries': smuggled_data.get('geo_countries'),
+        })
          video_id = self._match_id(url)
          json_data = self._download_json('https://cdn.jwplayer.com/v2/media/' + video_id, video_id)
          return self._parse_jwplayer_data(json_data, video_id)
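
Note on the `_extract_urls` change in jwplatform.py above: the widened host alternation lets the same pattern pick up embeds served from `cdn.jwplayer.com` in addition to `content.jwplatform.com`. The sketch below runs the new pattern over two invented embed tags; the 8-character media ids are placeholders.

```python
import re

# New pattern from the diff.
EMBED_RE = (
    r'<(?:script|iframe)[^>]+?src=["\']'
    r'((?:https?:)?//(?:content\.jwplatform|cdn\.jwplayer)\.com'
    r'/players/[a-zA-Z0-9]{8})'
)

# Hypothetical page containing one embed per host.
webpage = (
    '<script src="https://cdn.jwplayer.com/players/abcd1234-efgh5678.js"></script>'
    '<iframe src="//content.jwplatform.com/players/WXYZ9876-efgh5678.html"></iframe>'
)

# Each captured URL stops after the 8-character id.
urls = re.findall(EMBED_RE, webpage)
```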
diff --git a/youtube_dl/extractor/kakao.py b/youtube_dlc/extractor/kakao.py

similarity index 100%

rename from youtube_dl/extractor/kakao.py

rename to youtube_dlc/extractor/kakao.py
diff --git a/youtube_dl/extractor/kaltura.py b/youtube_dlc/extractor/kaltura.py

similarity index 97%

rename from youtube_dl/extractor/kaltura.py

rename to youtube_dlc/extractor/kaltura.py

index 2d38b758b72a852c6d9718f0537c62e7c215e903..49d13460df7f0edd4d2a08f97deaf831ba9d6a46 100644 (file)
--- a/youtube_dl/extractor/kaltura.py
+++ b/youtube_dlc/extractor/kaltura.py
@@ -113,9 +113,14 @@ class KalturaIE(InfoExtractor):
  
      @staticmethod
      def _extract_url(webpage):
+        urls = KalturaIE._extract_urls(webpage)
+        return urls[0] if urls else None
+
+    @staticmethod
+    def _extract_urls(webpage):
          # Embed codes: https://knowledge.kaltura.com/embedding-kaltura-media-players-your-site
-        mobj = (
-            re.search(
+        finditer = (
+            re.finditer(
                  r"""(?xs)
                      kWidget\.(?:thumb)?[Ee]mbed\(
                      \{.*?
@@ -124,7 +129,7 @@ def _extract_url(webpage):
                          (?P<q3>['"])entry_?[Ii]d(?P=q3)\s*:\s*
                          (?P<q4>['"])(?P<id>(?:(?!(?P=q4)).)+)(?P=q4)(?:,|\s*\})
                  """, webpage)
-            or re.search(
+            or re.finditer(
                  r'''(?xs)
                      (?P<q1>["'])
                          (?:https?:)?//cdnapi(?:sec)?\.kaltura\.com(?::\d+)?/(?:(?!(?P=q1)).)*\b(?:p|partner_id)/(?P<partner_id>\d+)(?:(?!(?P=q1)).)*
@@ -138,7 +143,7 @@ def _extract_url(webpage):
                      )
                      (?P<q3>["'])(?P<id>(?:(?!(?P=q3)).)+)(?P=q3)
                  ''', webpage)
-            or re.search(
+            or re.finditer(
                  r'''(?xs)
                      <(?:iframe[^>]+src|meta[^>]+\bcontent)=(?P<q1>["'])
                        (?:https?:)?//(?:(?:www|cdnapi(?:sec)?)\.)?kaltura\.com/(?:(?!(?P=q1)).)*\b(?:p|partner_id)/(?P<partner_id>\d+)
@@ -148,7 +153,8 @@ def _extract_url(webpage):
                      (?P=q1)
                  ''', webpage)
          )
-        if mobj:
+        urls = []
+        for mobj in finditer:
              embed_info = mobj.groupdict()
              for k, v in embed_info.items():
                  if v:
@@ -160,7 +166,8 @@ def _extract_url(webpage):
                  webpage)
              if service_mobj:
                  url = smuggle_url(url, {'service_url': service_mobj.group('id')})
-            return url
+            urls.append(url)
+        return urls
  
      def _kaltura_api_call(self, video_id, actions, service_url=None, *args, **kwargs):
          params = actions[0]
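
Note on the `_extract_urls` change in kaltura.py above, as a point worth verifying rather than a confirmed defect: `re.search` returns `None` when nothing matches, so chaining calls with `or` falls through to the next pattern. `re.finditer` returns an iterator object, which is truthy even when it yields nothing, so the same `or` chain would appear to always pick the first `finditer`. The sketch below shows the difference on a toy string. `itertools.chain` is shown only as one possible alternative, not something the commit uses, and it gathers matches from every pattern instead of stopping at the first pattern that matches.

```python
import itertools
import re

text = 'entry_id: "abc"'  # toy input, not a real Kaltura embed

# The first finditer yields nothing, yet the iterator object is truthy,
# so `or` never evaluates the second call.
first = re.finditer(r'no-such-token', text)
chosen = first or re.finditer(r'entry_id', text)
or_matches = [m.group(0) for m in chosen]

# One possible alternative: iterate over all patterns in sequence.
chained = itertools.chain(
    re.finditer(r'no-such-token', text),
    re.finditer(r'entry_id', text))
chain_matches = [m.group(0) for m in chained]
```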
diff --git a/youtube_dl/extractor/kanalplay.py b/youtube_dlc/extractor/kanalplay.py

similarity index 100%

rename from youtube_dl/extractor/kanalplay.py

rename to youtube_dlc/extractor/kanalplay.py
diff --git a/youtube_dl/extractor/kankan.py b/youtube_dlc/extractor/kankan.py

similarity index 100%

rename from youtube_dl/extractor/kankan.py

rename to youtube_dlc/extractor/kankan.py
diff --git a/youtube_dl/extractor/karaoketv.py b/youtube_dlc/extractor/karaoketv.py

similarity index 100%

rename from youtube_dl/extractor/karaoketv.py

rename to youtube_dlc/extractor/karaoketv.py
diff --git a/youtube_dl/extractor/karrierevideos.py b/youtube_dlc/extractor/karrierevideos.py

similarity index 100%

rename from youtube_dl/extractor/karrierevideos.py

rename to youtube_dlc/extractor/karrierevideos.py
diff --git a/youtube_dl/extractor/keezmovies.py b/youtube_dlc/extractor/keezmovies.py

similarity index 100%

rename from youtube_dl/extractor/keezmovies.py

rename to youtube_dlc/extractor/keezmovies.py
diff --git a/youtube_dl/extractor/ketnet.py b/youtube_dlc/extractor/ketnet.py

similarity index 100%

rename from youtube_dl/extractor/ketnet.py

rename to youtube_dlc/extractor/ketnet.py
diff --git a/youtube_dl/extractor/khanacademy.py b/youtube_dlc/extractor/khanacademy.py

similarity index 100%

rename from youtube_dl/extractor/khanacademy.py

rename to youtube_dlc/extractor/khanacademy.py
diff --git a/youtube_dl/extractor/kickstarter.py b/youtube_dlc/extractor/kickstarter.py

similarity index 100%

rename from youtube_dl/extractor/kickstarter.py

rename to youtube_dlc/extractor/kickstarter.py
diff --git a/youtube_dl/extractor/kinja.py b/youtube_dlc/extractor/kinja.py

similarity index 100%

rename from youtube_dl/extractor/kinja.py

rename to youtube_dlc/extractor/kinja.py
diff --git a/youtube_dl/extractor/kinopoisk.py b/youtube_dlc/extractor/kinopoisk.py

similarity index 100%

rename from youtube_dl/extractor/kinopoisk.py

rename to youtube_dlc/extractor/kinopoisk.py
diff --git a/youtube_dl/extractor/konserthusetplay.py b/youtube_dlc/extractor/konserthusetplay.py

similarity index 100%

rename from youtube_dl/extractor/konserthusetplay.py

rename to youtube_dlc/extractor/konserthusetplay.py
diff --git a/youtube_dl/extractor/krasview.py b/youtube_dlc/extractor/krasview.py

similarity index 100%

rename from youtube_dl/extractor/krasview.py

rename to youtube_dlc/extractor/krasview.py
diff --git a/youtube_dl/extractor/ku6.py b/youtube_dlc/extractor/ku6.py

similarity index 100%

rename from youtube_dl/extractor/ku6.py

rename to youtube_dlc/extractor/ku6.py
diff --git a/youtube_dl/extractor/kusi.py b/youtube_dlc/extractor/kusi.py

similarity index 100%

rename from youtube_dl/extractor/kusi.py

rename to youtube_dlc/extractor/kusi.py
diff --git a/youtube_dl/extractor/kuwo.py b/youtube_dlc/extractor/kuwo.py

similarity index 100%

rename from youtube_dl/extractor/kuwo.py

rename to youtube_dlc/extractor/kuwo.py
diff --git a/youtube_dl/extractor/la7.py b/youtube_dlc/extractor/la7.py

similarity index 100%

rename from youtube_dl/extractor/la7.py

rename to youtube_dlc/extractor/la7.py
diff --git a/youtube_dl/extractor/laola1tv.py b/youtube_dlc/extractor/laola1tv.py

similarity index 100%

rename from youtube_dl/extractor/laola1tv.py

rename to youtube_dlc/extractor/laola1tv.py
diff --git a/youtube_dl/extractor/lci.py b/youtube_dlc/extractor/lci.py

similarity index 100%

rename from youtube_dl/extractor/lci.py

rename to youtube_dlc/extractor/lci.py
diff --git a/youtube_dl/extractor/lcp.py b/youtube_dlc/extractor/lcp.py

similarity index 100%

rename from youtube_dl/extractor/lcp.py

rename to youtube_dlc/extractor/lcp.py
diff --git a/youtube_dl/extractor/lecture2go.py b/youtube_dlc/extractor/lecture2go.py

similarity index 100%

rename from youtube_dl/extractor/lecture2go.py

rename to youtube_dlc/extractor/lecture2go.py
diff --git a/youtube_dl/extractor/lecturio.py b/youtube_dlc/extractor/lecturio.py

similarity index 98%

rename from youtube_dl/extractor/lecturio.py

rename to youtube_dlc/extractor/lecturio.py

index 6ed7da4abaa7a2a45f924b4bf9f919261a40bec9..1b2dcef46621237fd7c7ce376165a6bc5c674606 100644 (file)
--- a/youtube_dl/extractor/lecturio.py
+++ b/youtube_dlc/extractor/lecturio.py
@@ -4,7 +4,6 @@
  import re
  
  from .common import InfoExtractor
-from ..compat import compat_str
  from ..utils import (
      clean_html,
      determine_ext,
@@ -36,7 +35,7 @@ def _login(self):
              self._LOGIN_URL, None, 'Downloading login popup')
  
          def is_logged(url_handle):
-            return self._LOGIN_URL not in compat_str(url_handle.geturl())
+            return self._LOGIN_URL not in url_handle.geturl()
  
          # Already logged in
          if is_logged(urlh):
diff --git a/youtube_dl/extractor/leeco.py b/youtube_dlc/extractor/leeco.py

similarity index 100%

rename from youtube_dl/extractor/leeco.py

rename to youtube_dlc/extractor/leeco.py
diff --git a/youtube_dlc/extractor/lego.py b/youtube_dlc/extractor/lego.py

new file mode 100644 (file)

index 0000000..1e3c19d
--- /dev/null
+++ b/youtube_dlc/extractor/lego.py
@@ -0,0 +1,149 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+import uuid
+
+from .common import InfoExtractor
+from ..compat import compat_HTTPError
+from ..utils import (
+    ExtractorError,
+    int_or_none,
+    qualities,
+)
+
+
+class LEGOIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?lego\.com/(?P<locale>[a-z]{2}-[a-z]{2})/(?:[^/]+/)*videos/(?:[^/]+/)*[^/?#]+-(?P<id>[0-9a-f]{32})'
+    _TESTS = [{
+        'url': 'http://www.lego.com/en-us/videos/themes/club/blocumentary-kawaguchi-55492d823b1b4d5e985787fa8c2973b1',
+        'md5': 'f34468f176cfd76488767fc162c405fa',
+        'info_dict': {
+            'id': '55492d82-3b1b-4d5e-9857-87fa8c2973b1_en-US',
+            'ext': 'mp4',
+            'title': 'Blocumentary Great Creations: Akiyuki Kawaguchi',
+            'description': 'Blocumentary Great Creations: Akiyuki Kawaguchi',
+        },
+    }, {
+        # geo-restricted but the contentUrl contains a valid url
+        'url': 'http://www.lego.com/nl-nl/videos/themes/nexoknights/episode-20-kingdom-of-heroes-13bdc2299ab24d9685701a915b3d71e7##sp=399',
+        'md5': 'c7420221f7ffd03ff056f9db7f8d807c',
+        'info_dict': {
+            'id': '13bdc229-9ab2-4d96-8570-1a915b3d71e7_nl-NL',
+            'ext': 'mp4',
+            'title': 'Aflevering 20:  Helden van het koninkrijk',
+            'description': 'md5:8ee499aac26d7fa8bcb0cedb7f9c3941',
+            'age_limit': 5,
+        },
+    }, {
+        # with subtitle
+        'url': 'https://www.lego.com/nl-nl/kids/videos/classic/creative-storytelling-the-little-puppy-aa24f27c7d5242bc86102ebdc0f24cba',
+        'info_dict': {
+            'id': 'aa24f27c-7d52-42bc-8610-2ebdc0f24cba_nl-NL',
+            'ext': 'mp4',
+            'title': 'De kleine puppy',
+            'description': 'md5:5b725471f849348ac73f2e12cfb4be06',
+            'age_limit': 1,
+            'subtitles': {
+                'nl': [{
+                    'ext': 'srt',
+                    'url': r're:^https://.+\.srt$',
+                }],
+            },
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }]
+    _QUALITIES = {
+        'Lowest': (64, 180, 320),
+        'Low': (64, 270, 480),
+        'Medium': (96, 360, 640),
+        'High': (128, 540, 960),
+        'Highest': (128, 720, 1280),
+    }
+
+    def _real_extract(self, url):
+        locale, video_id = re.match(self._VALID_URL, url).groups()
+        countries = [locale.split('-')[1].upper()]
+        self._initialize_geo_bypass({
+            'countries': countries,
+        })
+
+        try:
+            item = self._download_json(
+                # https://contentfeed.services.lego.com/api/v2/item/[VIDEO_ID]?culture=[LOCALE]&contentType=Video
+                'https://services.slingshot.lego.com/mediaplayer/v2',
+                video_id, query={
+                    'videoId': '%s_%s' % (uuid.UUID(video_id), locale),
+                }, headers=self.geo_verification_headers())
+        except ExtractorError as e:
+            if isinstance(e.cause, compat_HTTPError) and e.cause.code == 451:
+                self.raise_geo_restricted(countries=countries)
+            raise
+
+        video = item['Video']
+        video_id = video['Id']
+        title = video['Title']
+
+        q = qualities(['Lowest', 'Low', 'Medium', 'High', 'Highest'])
+        formats = []
+        for video_source in item.get('VideoFormats', []):
+            video_source_url = video_source.get('Url')
+            if not video_source_url:
+                continue
+            video_source_format = video_source.get('Format')
+            if video_source_format == 'F4M':
+                formats.extend(self._extract_f4m_formats(
+                    video_source_url, video_id,
+                    f4m_id=video_source_format, fatal=False))
+            elif video_source_format == 'M3U8':
+                formats.extend(self._extract_m3u8_formats(
+                    video_source_url, video_id, 'mp4', 'm3u8_native',
+                    m3u8_id=video_source_format, fatal=False))
+            else:
+                video_source_quality = video_source.get('Quality')
+                format_id = []
+                for v in (video_source_format, video_source_quality):
+                    if v:
+                        format_id.append(v)
+                f = {
+                    'format_id': '-'.join(format_id),
+                    'quality': q(video_source_quality),
+                    'url': video_source_url,
+                }
+                quality = self._QUALITIES.get(video_source_quality)
+                if quality:
+                    f.update({
+                        'abr': quality[0],
+                        'height': quality[1],
+                        'width': quality[2],
+                    })
+                formats.append(f)
+        self._sort_formats(formats)
+
+        subtitles = {}
+        sub_file_id = video.get('SubFileId')
+        if sub_file_id and sub_file_id != '00000000-0000-0000-0000-000000000000':
+            net_storage_path = video.get('NetstoragePath')
+            invariant_id = video.get('InvariantId')
+            video_file_id = video.get('VideoFileId')
+            video_version = video.get('VideoVersion')
+            if net_storage_path and invariant_id and video_file_id and video_version:
+                subtitles.setdefault(locale[:2], []).append({
+                    'url': 'https://lc-mediaplayerns-live-s.legocdn.com/public/%s/%s_%s_%s_%s_sub.srt' % (net_storage_path, invariant_id, video_file_id, locale, video_version),
+                })
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': video.get('Description'),
+            'thumbnail': video.get('GeneratedCoverImage') or video.get('GeneratedThumbnail'),
+            'duration': int_or_none(video.get('Length')),
+            'formats': formats,
+            'subtitles': subtitles,
+            'age_limit': int_or_none(video.get('AgeFrom')),
+            'season': video.get('SeasonTitle'),
+            'season_number': int_or_none(video.get('Season')) or None,
+            'episode_number': int_or_none(video.get('Episode')) or None,
+        }
diff --git a/youtube_dl/extractor/lemonde.py b/youtube_dlc/extractor/lemonde.py

similarity index 100%

rename from youtube_dl/extractor/lemonde.py

rename to youtube_dlc/extractor/lemonde.py
diff --git a/youtube_dl/extractor/lenta.py b/youtube_dlc/extractor/lenta.py

similarity index 100%

rename from youtube_dl/extractor/lenta.py

rename to youtube_dlc/extractor/lenta.py
diff --git a/youtube_dl/extractor/libraryofcongress.py b/youtube_dlc/extractor/libraryofcongress.py

similarity index 100%

rename from youtube_dl/extractor/libraryofcongress.py

rename to youtube_dlc/extractor/libraryofcongress.py
diff --git a/youtube_dl/extractor/libsyn.py b/youtube_dlc/extractor/libsyn.py

similarity index 100%

rename from youtube_dl/extractor/libsyn.py

rename to youtube_dlc/extractor/libsyn.py
diff --git a/youtube_dl/extractor/lifenews.py b/youtube_dlc/extractor/lifenews.py

similarity index 100%

rename from youtube_dl/extractor/lifenews.py

rename to youtube_dlc/extractor/lifenews.py
diff --git a/youtube_dl/extractor/limelight.py b/youtube_dlc/extractor/limelight.py

similarity index 77%

rename from youtube_dl/extractor/limelight.py

rename to youtube_dlc/extractor/limelight.py

index 729d8de50fab70cd69bab41fae9db0cba4d7da9b..39f74d2822bc7296df8a5c16e5edfce3298e82ab 100644 (file)
--- a/youtube_dl/extractor/limelight.py
+++ b/youtube_dlc/extractor/limelight.py
@@ -18,7 +18,6 @@
  
  class LimelightBaseIE(InfoExtractor):
      _PLAYLIST_SERVICE_URL = 'http://production-ps.lvp.llnw.net/r/PlaylistService/%s/%s/%s'
-    _API_URL = 'http://api.video.limelight.com/rest/organizations/%s/%s/%s/%s.json'
  
      @classmethod
      def _extract_urls(cls, webpage, source_url):
@@ -70,7 +69,8 @@ def _call_playlist_service(self, item_id, method, fatal=True, referer=None):
          try:
              return self._download_json(
                  self._PLAYLIST_SERVICE_URL % (self._PLAYLIST_SERVICE_PATH, item_id, method),
-                item_id, 'Downloading PlaylistService %s JSON' % method, fatal=fatal, headers=headers)
+                item_id, 'Downloading PlaylistService %s JSON' % method,
+                fatal=fatal, headers=headers)
          except ExtractorError as e:
              if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
                  error = self._parse_json(e.cause.read().decode(), item_id)['detail']['contentAccessPermission']
@@ -79,22 +79,22 @@ def _call_playlist_service(self, item_id, method, fatal=True, referer=None):
                  raise ExtractorError(error, expected=True)
              raise
  
-    def _call_api(self, organization_id, item_id, method):
-        return self._download_json(
-            self._API_URL % (organization_id, self._API_PATH, item_id, method),
-            item_id, 'Downloading API %s JSON' % method)
-
-    def _extract(self, item_id, pc_method, mobile_method, meta_method, referer=None):
+    def _extract(self, item_id, pc_method, mobile_method, referer=None):
          pc = self._call_playlist_service(item_id, pc_method, referer=referer)
-        metadata = self._call_api(pc['orgId'], item_id, meta_method)
-        mobile = self._call_playlist_service(item_id, mobile_method, fatal=False, referer=referer)
-        return pc, mobile, metadata
+        mobile = self._call_playlist_service(
+            item_id, mobile_method, fatal=False, referer=referer)
+        return pc, mobile
+
+    def _extract_info(self, pc, mobile, i, referer):
+        get_item = lambda x, y: try_get(x, lambda x: x[y][i], dict) or {}
+        pc_item = get_item(pc, 'playlistItems')
+        mobile_item = get_item(mobile, 'mediaList')
+        video_id = pc_item.get('mediaId') or mobile_item['mediaId']
+        title = pc_item.get('title') or mobile_item['title']
  
-    def _extract_info(self, streams, mobile_urls, properties):
-        video_id = properties['media_id']
          formats = []
          urls = []
-        for stream in streams:
+        for stream in pc_item.get('streams', []):
              stream_url = stream.get('url')
              if not stream_url or stream.get('drmProtected') or stream_url in urls:
                  continue
@@ -155,7 +155,7 @@ def _extract_info(self, streams, mobile_urls, properties):
                      })
                  formats.append(fmt)
  
-        for mobile_url in mobile_urls:
+        for mobile_url in mobile_item.get('mobileUrls', []):
              media_url = mobile_url.get('mobileUrl')
              format_id = mobile_url.get('targetMediaPlatform')
              if not media_url or format_id in ('Widevine', 'SmoothStreaming') or media_url in urls:
@@ -179,54 +179,34 @@ def _extract_info(self, streams, mobile_urls, properties):
  
          self._sort_formats(formats)
  
-        title = properties['title']
-        description = properties.get('description')
-        timestamp = int_or_none(properties.get('publish_date') or properties.get('create_date'))
-        duration = float_or_none(properties.get('duration_in_milliseconds'), 1000)
-        filesize = int_or_none(properties.get('total_storage_in_bytes'))
-        categories = [properties.get('category')]
-        tags = properties.get('tags', [])
-        thumbnails = [{
-            'url': thumbnail['url'],
-            'width': int_or_none(thumbnail.get('width')),
-            'height': int_or_none(thumbnail.get('height')),
-        } for thumbnail in properties.get('thumbnails', []) if thumbnail.get('url')]
-
          subtitles = {}
-        for caption in properties.get('captions', []):
-            lang = caption.get('language_code')
-            subtitles_url = caption.get('url')
-            if lang and subtitles_url:
-                subtitles.setdefault(lang, []).append({
-                    'url': subtitles_url,
-                })
-        closed_captions_url = properties.get('closed_captions_url')
-        if closed_captions_url:
-            subtitles.setdefault('en', []).append({
-                'url': closed_captions_url,
-                'ext': 'ttml',
-            })
+        for flag in mobile_item.get('flags') or []:
+            if flag == 'ClosedCaptions':
+                closed_captions = self._call_playlist_service(
+                    video_id, 'getClosedCaptionsDetailsByMediaId',
+                    False, referer) or []
+                for cc in closed_captions:
+                    cc_url = cc.get('webvttFileUrl')
+                    if not cc_url:
+                        continue
+                    lang = cc.get('languageCode') or self._search_regex(r'/([a-z]{2})\.vtt', cc_url, 'lang', default='en')
+                    subtitles.setdefault(lang, []).append({
+                        'url': cc_url,
+                    })
+                break
+
+        get_meta = lambda x: pc_item.get(x) or mobile_item.get(x)
  
          return {
              'id': video_id,
              'title': title,
-            'description': description,
+            'description': get_meta('description'),
              'formats': formats,
-            'timestamp': timestamp,
-            'duration': duration,
-            'filesize': filesize,
-            'categories': categories,
-            'tags': tags,
-            'thumbnails': thumbnails,
+            'duration': float_or_none(get_meta('durationInMilliseconds'), 1000),
+            'thumbnail': get_meta('previewImageUrl') or get_meta('thumbnailImageUrl'),
              'subtitles': subtitles,
          }
  
-    def _extract_info_helper(self, pc, mobile, i, metadata):
-        return self._extract_info(
-            try_get(pc, lambda x: x['playlistItems'][i]['streams'], list) or [],
-            try_get(mobile, lambda x: x['mediaList'][i]['mobileUrls'], list) or [],
-            metadata)
-
  
  class LimelightMediaIE(LimelightBaseIE):
      IE_NAME = 'limelight'
@@ -251,8 +231,6 @@ class LimelightMediaIE(LimelightBaseIE):
              'description': 'md5:8005b944181778e313d95c1237ddb640',
              'thumbnail': r're:^https?://.*\.jpeg$',
              'duration': 144.23,
-            'timestamp': 1244136834,
-            'upload_date': '20090604',
          },
          'params': {
              # m3u8 download
@@ -268,30 +246,29 @@ class LimelightMediaIE(LimelightBaseIE):
              'title': '3Play Media Overview Video',
              'thumbnail': r're:^https?://.*\.jpeg$',
              'duration': 78.101,
-            'timestamp': 1338929955,
-            'upload_date': '20120605',
-            'subtitles': 'mincount:9',
+            # TODO: extract all languages that were accessible via API
+            # 'subtitles': 'mincount:9',
+            'subtitles': 'mincount:1',
          },
      }, {
          'url': 'https://assets.delvenetworks.com/player/loader.swf?mediaId=8018a574f08d416e95ceaccae4ba0452',
          'only_matching': True,
      }]
      _PLAYLIST_SERVICE_PATH = 'media'
-    _API_PATH = 'media'
  
      def _real_extract(self, url):
          url, smuggled_data = unsmuggle_url(url, {})
          video_id = self._match_id(url)
+        source_url = smuggled_data.get('source_url')
          self._initialize_geo_bypass({
              'countries': smuggled_data.get('geo_countries'),
          })
  
-        pc, mobile, metadata = self._extract(
+        pc, mobile = self._extract(
              video_id, 'getPlaylistByMediaId',
-            'getMobilePlaylistByMediaId', 'properties',
-            smuggled_data.get('source_url'))
+            'getMobilePlaylistByMediaId', source_url)
  
-        return self._extract_info_helper(pc, mobile, 0, metadata)
+        return self._extract_info(pc, mobile, 0, source_url)
  
  
  class LimelightChannelIE(LimelightBaseIE):
@@ -313,6 +290,7 @@ class LimelightChannelIE(LimelightBaseIE):
          'info_dict': {
              'id': 'ab6a524c379342f9b23642917020c082',
              'title': 'Javascript Sample Code',
+            'description': 'Javascript Sample Code - http://www.delvenetworks.com/sample-code/playerCode-demo.html',
          },
          'playlist_mincount': 3,
      }, {
@@ -320,22 +298,23 @@ class LimelightChannelIE(LimelightBaseIE):
          'only_matching': True,
      }]
      _PLAYLIST_SERVICE_PATH = 'channel'
-    _API_PATH = 'channels'
  
      def _real_extract(self, url):
          url, smuggled_data = unsmuggle_url(url, {})
          channel_id = self._match_id(url)
+        source_url = smuggled_data.get('source_url')
  
-        pc, mobile, medias = self._extract(
+        pc, mobile = self._extract(
              channel_id, 'getPlaylistByChannelId',
              'getMobilePlaylistWithNItemsByChannelId?begin=0&count=-1',
-            'media', smuggled_data.get('source_url'))
+            source_url)
  
          entries = [
-            self._extract_info_helper(pc, mobile, i, medias['media_list'][i])
-            for i in range(len(medias['media_list']))]
+            self._extract_info(pc, mobile, i, source_url)
+            for i in range(len(pc['playlistItems']))]
  
-        return self.playlist_result(entries, channel_id, pc['title'])
+        return self.playlist_result(
+            entries, channel_id, pc.get('title'), (mobile or {}).get('description'))
  
  
  class LimelightChannelListIE(LimelightBaseIE):
@@ -368,10 +347,12 @@ class LimelightChannelListIE(LimelightBaseIE):
      def _real_extract(self, url):
          channel_list_id = self._match_id(url)
  
-        channel_list = self._call_playlist_service(channel_list_id, 'getMobileChannelListById')
+        channel_list = self._call_playlist_service(
+            channel_list_id, 'getMobileChannelListById')
  
          entries = [
              self.url_result('limelight:channel:%s' % channel['id'], 'LimelightChannel')
              for channel in channel_list['channelList']]
  
-        return self.playlist_result(entries, channel_list_id, channel_list['title'])
+        return self.playlist_result(
+            entries, channel_list_id, channel_list['title'])
diff --git a/youtube_dl/extractor/line.py b/youtube_dlc/extractor/line.py

similarity index 100%

rename from youtube_dl/extractor/line.py

rename to youtube_dlc/extractor/line.py
diff --git a/youtube_dl/extractor/linkedin.py b/youtube_dlc/extractor/linkedin.py

similarity index 100%

rename from youtube_dl/extractor/linkedin.py

rename to youtube_dlc/extractor/linkedin.py
diff --git a/youtube_dl/extractor/linuxacademy.py b/youtube_dlc/extractor/linuxacademy.py

similarity index 97%

rename from youtube_dl/extractor/linuxacademy.py

rename to youtube_dlc/extractor/linuxacademy.py

index a78c6556e105220a09b66dac94d8fc27780e3dc5..23ca965d977b1ec682101f048684f20f1b70834c 100644 (file)
--- a/youtube_dl/extractor/linuxacademy.py
+++ b/youtube_dlc/extractor/linuxacademy.py
@@ -8,7 +8,6 @@
  from ..compat import (
      compat_b64decode,
      compat_HTTPError,
-    compat_str,
  )
  from ..utils import (
      ExtractorError,
@@ -99,7 +98,7 @@ def random_string():
              'sso': 'true',
          })
  
-        login_state_url = compat_str(urlh.geturl())
+        login_state_url = urlh.geturl()
  
          try:
              login_page = self._download_webpage(
@@ -129,7 +128,7 @@ def random_string():
              })
  
          access_token = self._search_regex(
-            r'access_token=([^=&]+)', compat_str(urlh.geturl()),
+            r'access_token=([^=&]+)', urlh.geturl(),
              'access token')
  
          self._download_webpage(
diff --git a/youtube_dl/extractor/litv.py b/youtube_dlc/extractor/litv.py

similarity index 100%

rename from youtube_dl/extractor/litv.py

rename to youtube_dlc/extractor/litv.py
diff --git a/youtube_dl/extractor/livejournal.py b/youtube_dlc/extractor/livejournal.py

similarity index 100%

rename from youtube_dl/extractor/livejournal.py

rename to youtube_dlc/extractor/livejournal.py
diff --git a/youtube_dl/extractor/liveleak.py b/youtube_dlc/extractor/liveleak.py

similarity index 100%

rename from youtube_dl/extractor/liveleak.py

rename to youtube_dlc/extractor/liveleak.py
diff --git a/youtube_dl/extractor/livestream.py b/youtube_dlc/extractor/livestream.py

similarity index 100%

rename from youtube_dl/extractor/livestream.py

rename to youtube_dlc/extractor/livestream.py
diff --git a/youtube_dl/extractor/lnkgo.py b/youtube_dlc/extractor/lnkgo.py

similarity index 100%

rename from youtube_dl/extractor/lnkgo.py

rename to youtube_dlc/extractor/lnkgo.py
diff --git a/youtube_dl/extractor/localnews8.py b/youtube_dlc/extractor/localnews8.py

similarity index 100%

rename from youtube_dl/extractor/localnews8.py

rename to youtube_dlc/extractor/localnews8.py
diff --git a/youtube_dl/extractor/lovehomeporn.py b/youtube_dlc/extractor/lovehomeporn.py

similarity index 100%

rename from youtube_dl/extractor/lovehomeporn.py

rename to youtube_dlc/extractor/lovehomeporn.py
diff --git a/youtube_dl/extractor/lrt.py b/youtube_dlc/extractor/lrt.py

similarity index 100%

rename from youtube_dl/extractor/lrt.py

rename to youtube_dlc/extractor/lrt.py
diff --git a/youtube_dl/extractor/lynda.py b/youtube_dlc/extractor/lynda.py

similarity index 100%

rename from youtube_dl/extractor/lynda.py

rename to youtube_dlc/extractor/lynda.py
diff --git a/youtube_dl/extractor/m6.py b/youtube_dlc/extractor/m6.py

similarity index 100%

rename from youtube_dl/extractor/m6.py

rename to youtube_dlc/extractor/m6.py
diff --git a/youtube_dlc/extractor/magentamusik360.py b/youtube_dlc/extractor/magentamusik360.py

new file mode 100644 (file)

index 0000000..5c27490
--- /dev/null
+++ b/youtube_dlc/extractor/magentamusik360.py
@@ -0,0 +1,61 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+
+
+class MagentaMusik360IE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?magenta-musik-360\.de/([a-z0-9-]+-(?P<id>[0-9]+)|festivals/.+)'
+    _TESTS = [{
+        'url': 'https://www.magenta-musik-360.de/within-temptation-wacken-2019-1-9208205928595185932',
+        'md5': '65b6f060b40d90276ec6fb9b992c1216',
+        'info_dict': {
+            'id': '9208205928595185932',
+            'ext': 'm3u8',
+            'title': 'WITHIN TEMPTATION',
+            'description': 'Robert Westerholt und Sharon Janny den Adel gründeten die Symphonic Metal-Band. Privat sind die Niederländer ein Paar und haben zwei Kinder. Die Single Ice Queen brachte ihnen Platin und Gold und verhalf 2002 zum internationalen Durchbruch. Charakteristisch für die Band war Anfangs der hohe Gesang von Frontfrau Sharon. Stilistisch fing die Band im Gothic Metal an. Mit neuem Sound, schnellen Gitarrenriffs und Gitarrensoli, avancierte Within Temptation zur erfolgreichen Rockband. Auch dieses Jahr wird die Band ihre Fangemeinde wieder mitreißen.',
+        }
+    }, {
+        'url': 'https://www.magenta-musik-360.de/festivals/wacken-world-wide-2020-body-count-feat-ice-t',
+        'md5': '81010d27d7cab3f7da0b0f681b983b7e',
+        'info_dict': {
+            'id': '9208205928595231363',
+            'ext': 'm3u8',
+            'title': 'Body Count feat. Ice-T',
+            'description': 'Body Count feat. Ice-T konnten bereits im vergangenen Jahr auf dem „Holy Ground“ in Wacken überzeugen. 2020 gehen die Crossover-Metaller aus einem Club in Los Angeles auf Sendung und bringen mit ihrer Mischung aus Metal und Hip-Hop Abwechslung und ordentlich Alarm zum WWW. Bereits seit 1990 stehen die beiden Gründer Ice-T (Gesang) und Ernie C (Gitarre) auf der Bühne. Sieben Studioalben hat die Gruppe bis jetzt veröffentlicht, darunter das Debüt „Body Count“ (1992) mit dem kontroversen Track „Cop Killer“.',
+        }
+    }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        # _match_id casts its result to a string; "None" is never a valid video_id
+        # for magenta, so comparing against it cannot be confused with a real id
+        if video_id == "None":
+            webpage = self._download_webpage(url, video_id)
+            video_id = self._html_search_regex(r'data-asset-id="([^"]+)"', webpage, 'video_id')
+        json = self._download_json("https://wcps.t-online.de/cvss/magentamusic/vodplayer/v3/player/58935/%s/Main%%20Movie" % video_id, video_id)
+        xml_url = json['content']['feature']['representations'][0]['contentPackages'][0]['media']['href']
+        metadata = json['content']['feature'].get('metadata')
+        title = None
+        description = None
+        duration = None
+        thumbnails = []
+        if metadata:
+            title = metadata.get('title')
+            description = metadata.get('fullDescription')
+            duration = metadata.get('runtimeInSeconds')
+            for img_key in ('teaserImageWide', 'smallCoverImage'):
+                if img_key in metadata:
+                    thumbnails.append({'url': metadata[img_key].get('href')})
+
+        xml = self._download_xml(xml_url, video_id)
+        final_url = xml[0][0][0].attrib['src']
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': description,
+            'url': final_url,
+            'duration': duration,
+            'thumbnails': thumbnails
+        }
diff --git a/youtube_dl/extractor/mailru.py b/youtube_dlc/extractor/mailru.py

similarity index 89%

rename from youtube_dl/extractor/mailru.py

rename to youtube_dlc/extractor/mailru.py

index 6b0e64b7f1032159262220dcf77c6ffaa358d014..6fdf70aa680bb7584eb957e85520f7d845ca0c25 100644 (file)
--- a/youtube_dl/extractor/mailru.py
+++ b/youtube_dlc/extractor/mailru.py
@@ -20,10 +20,10 @@ class MailRuIE(InfoExtractor):
      IE_DESC = 'Видео@Mail.Ru'
      _VALID_URL = r'''(?x)
                      https?://
-                        (?:(?:www|m)\.)?my\.mail\.ru/
+                        (?:(?:www|m|videoapi)\.)?my\.mail\.ru/+
                          (?:
                              video/.*\#video=/?(?P<idv1>(?:[^/]+/){3}\d+)|
-                            (?:(?P<idv2prefix>(?:[^/]+/){2})video/(?P<idv2suffix>[^/]+/\d+))\.html|
+                            (?:videos/embed/)?(?:(?P<idv2prefix>(?:[^/]+/+){2})(?:video/(?:embed/)?)?(?P<idv2suffix>[^/]+/\d+))(?:\.html)?|
                              (?:video/embed|\+/video/meta)/(?P<metaid>\d+)
                          )
                      '''
@@ -85,6 +85,14 @@ class MailRuIE(InfoExtractor):
          {
              'url': 'http://my.mail.ru/+/video/meta/7949340477499637815',
              'only_matching': True,
+        },
+        {
+            'url': 'https://my.mail.ru//list/sinyutin10/video/_myvideo/4.html',
+            'only_matching': True,
+        },
+        {
+            'url': 'https://my.mail.ru//list//sinyutin10/video/_myvideo/4.html',
+            'only_matching': True,
          }
      ]
  
@@ -100,15 +108,21 @@ def _real_extract(self, url):
              if not video_id:
                  video_id = mobj.group('idv2prefix') + mobj.group('idv2suffix')
              webpage = self._download_webpage(url, video_id)
-            page_config = self._parse_json(self._search_regex(
+            page_config = self._parse_json(self._search_regex([
                  r'(?s)<script[^>]+class="sp-video__page-config"[^>]*>(.+?)</script>',
+                r'(?s)"video":\s*(\{.+?\}),'],
                  webpage, 'page config', default='{}'), video_id, fatal=False)
              if page_config:
-                meta_url = page_config.get('metaUrl') or page_config.get('video', {}).get('metaUrl')
+                meta_url = page_config.get('metaUrl') or page_config.get('video', {}).get('metaUrl') or page_config.get('metadataUrl')
              else:
                  meta_url = None
  
          video_data = None
+
+        # fix meta_url if missing the host address
+        if meta_url and re.match(r'^\/\+\/', meta_url):
+            meta_url = 'https://my.mail.ru' + meta_url
+
          if meta_url:
              video_data = self._download_json(
                  meta_url, video_id or meta_id, 'Downloading video meta JSON',
@@ -120,6 +134,12 @@ def _real_extract(self, url):
                  'http://api.video.mail.ru/videos/%s.json?new=1' % video_id,
                  video_id, 'Downloading video JSON')
  
+        headers = {}
+
+        video_key = self._get_cookies('https://my.mail.ru').get('video_key')
+        if video_key:
+            headers['Cookie'] = 'video_key=%s' % video_key.value
+
          formats = []
          for f in video_data['videos']:
              video_url = f.get('url')
@@ -132,6 +152,7 @@ def _real_extract(self, url):
                  'url': video_url,
                  'format_id': format_id,
                  'height': height,
+                'http_headers': headers,
              })
          self._sort_formats(formats)
  
@@ -237,7 +258,7 @@ def _extract_track(t, fatal=True):
  class MailRuMusicIE(MailRuMusicSearchBaseIE):
      IE_NAME = 'mailru:music'
      IE_DESC = 'Музыка@Mail.Ru'
-    _VALID_URL = r'https?://my\.mail\.ru/music/songs/[^/?#&]+-(?P<id>[\da-f]+)'
+    _VALID_URL = r'https?://my\.mail\.ru/+music/+songs/+[^/?#&]+-(?P<id>[\da-f]+)'
      _TESTS = [{
          'url': 'https://my.mail.ru/music/songs/%D0%BC8%D0%BB8%D1%82%D1%85-l-a-h-luciferian-aesthetics-of-herrschaft-single-2017-4e31f7125d0dfaef505d947642366893',
          'md5': '0f8c22ef8c5d665b13ac709e63025610',
@@ -273,7 +294,7 @@ def _real_extract(self, url):
  class MailRuMusicSearchIE(MailRuMusicSearchBaseIE):
      IE_NAME = 'mailru:music:search'
      IE_DESC = 'Музыка@Mail.Ru'
-    _VALID_URL = r'https?://my\.mail\.ru/music/search/(?P<id>[^/?#&]+)'
+    _VALID_URL = r'https?://my\.mail\.ru/+music/+search/+(?P<id>[^/?#&]+)'
      _TESTS = [{
          'url': 'https://my.mail.ru/music/search/black%20shadow',
          'info_dict': {
diff --git a/youtube_dl/extractor/malltv.py b/youtube_dlc/extractor/malltv.py

similarity index 90%

rename from youtube_dl/extractor/malltv.py

rename to youtube_dlc/extractor/malltv.py

index e13c2e11a5baf301d32a34d8343776ab249ea821..6f4fd927fa3c5a607cb7caee632f2d7aed2471d5 100644 (file)
--- a/youtube_dl/extractor/malltv.py
+++ b/youtube_dlc/extractor/malltv.py
@@ -8,7 +8,7 @@
  
  
  class MallTVIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?mall\.tv/(?:[^/]+/)*(?P<id>[^/?#&]+)'
+    _VALID_URL = r'https?://(?:(?:www|sk)\.)?mall\.tv/(?:[^/]+/)*(?P<id>[^/?#&]+)'
      _TESTS = [{
          'url': 'https://www.mall.tv/18-miliard-pro-neziskovky-opravdu-jsou-sportovci-nebo-clovek-v-tisni-pijavice',
          'md5': '1c4a37f080e1f3023103a7b43458e518',
@@ -26,6 +26,9 @@ class MallTVIE(InfoExtractor):
      }, {
          'url': 'https://www.mall.tv/kdo-to-plati/18-miliard-pro-neziskovky-opravdu-jsou-sportovci-nebo-clovek-v-tisni-pijavice',
          'only_matching': True,
+    }, {
+        'url': 'https://sk.mall.tv/gejmhaus/reklamacia-nehreje-vyrobnik-tepla-alebo-spekacka',
+        'only_matching': True,
      }]
  
      def _real_extract(self, url):
diff --git a/youtube_dl/extractor/mangomolo.py b/youtube_dlc/extractor/mangomolo.py

similarity index 100%

rename from youtube_dl/extractor/mangomolo.py

rename to youtube_dlc/extractor/mangomolo.py
diff --git a/youtube_dl/extractor/manyvids.py b/youtube_dlc/extractor/manyvids.py

similarity index 100%

rename from youtube_dl/extractor/manyvids.py

rename to youtube_dlc/extractor/manyvids.py
diff --git a/youtube_dl/extractor/markiza.py b/youtube_dlc/extractor/markiza.py

similarity index 100%

rename from youtube_dl/extractor/markiza.py

rename to youtube_dlc/extractor/markiza.py
diff --git a/youtube_dl/extractor/massengeschmacktv.py b/youtube_dlc/extractor/massengeschmacktv.py

similarity index 100%

rename from youtube_dl/extractor/massengeschmacktv.py

rename to youtube_dlc/extractor/massengeschmacktv.py
diff --git a/youtube_dl/extractor/matchtv.py b/youtube_dlc/extractor/matchtv.py

similarity index 100%

rename from youtube_dl/extractor/matchtv.py

rename to youtube_dlc/extractor/matchtv.py
diff --git a/youtube_dl/extractor/mdr.py b/youtube_dlc/extractor/mdr.py

similarity index 100%

rename from youtube_dl/extractor/mdr.py

rename to youtube_dlc/extractor/mdr.py
diff --git a/youtube_dl/extractor/medialaan.py b/youtube_dlc/extractor/medialaan.py

similarity index 100%

rename from youtube_dl/extractor/medialaan.py

rename to youtube_dlc/extractor/medialaan.py
diff --git a/youtube_dl/extractor/mediaset.py b/youtube_dlc/extractor/mediaset.py

similarity index 97%

rename from youtube_dl/extractor/mediaset.py

rename to youtube_dlc/extractor/mediaset.py

index f976506f416b9f5461dbfb85f988e2816c43b334..933df14952d5cc16857485e306be07f2d32384d3 100644 (file)
--- a/youtube_dl/extractor/mediaset.py
+++ b/youtube_dlc/extractor/mediaset.py
@@ -6,7 +6,6 @@
  from .theplatform import ThePlatformBaseIE
  from ..compat import (
      compat_parse_qs,
-    compat_str,
      compat_urllib_parse_urlparse,
  )
  from ..utils import (
@@ -114,7 +113,7 @@ def _program_guid(qs):
                  continue
              urlh = ie._request_webpage(
                  embed_url, video_id, note='Following embed URL redirect')
-            embed_url = compat_str(urlh.geturl())
+            embed_url = urlh.geturl()
              program_guid = _program_guid(_qs(embed_url))
              if program_guid:
                  entries.append(embed_url)
@@ -123,7 +122,7 @@ def _program_guid(qs):
      def _parse_smil_formats(self, smil, smil_url, video_id, namespace=None, f4m_params=None, transform_rtmp_url=None):
          for video in smil.findall(self._xpath_ns('.//video', namespace)):
              video.attrib['src'] = re.sub(r'(https?://vod05)t(-mediaset-it\.akamaized\.net/.+?.mpd)\?.+', r'\1\2', video.attrib['src'])
-        return super()._parse_smil_formats(smil, smil_url, video_id, namespace, f4m_params, transform_rtmp_url)
+        return super(MediasetIE, self)._parse_smil_formats(smil, smil_url, video_id, namespace, f4m_params, transform_rtmp_url)
  
      def _real_extract(self, url):
          guid = self._match_id(url)
diff --git a/youtube_dl/extractor/mediasite.py b/youtube_dlc/extractor/mediasite.py

similarity index 99%

rename from youtube_dl/extractor/mediasite.py

rename to youtube_dlc/extractor/mediasite.py

index 694a264d672288b47c2700b9265bfc0635158ff2..d6eb1574065dece67e28a4b36fa43478dd48dfa3 100644 (file)
--- a/youtube_dl/extractor/mediasite.py
+++ b/youtube_dlc/extractor/mediasite.py
@@ -129,7 +129,7 @@ def _real_extract(self, url):
          query = mobj.group('query')
  
          webpage, urlh = self._download_webpage_handle(url, resource_id)  # XXX: add UrlReferrer?
-        redirect_url = compat_str(urlh.geturl())
+        redirect_url = urlh.geturl()
  
          # XXX: might have also extracted UrlReferrer and QueryString from the html
          service_path = compat_urlparse.urljoin(redirect_url, self._html_search_regex(
diff --git a/youtube_dl/extractor/medici.py b/youtube_dlc/extractor/medici.py

similarity index 100%

rename from youtube_dl/extractor/medici.py

rename to youtube_dlc/extractor/medici.py
diff --git a/youtube_dl/extractor/megaphone.py b/youtube_dlc/extractor/megaphone.py

similarity index 100%

rename from youtube_dl/extractor/megaphone.py

rename to youtube_dlc/extractor/megaphone.py
diff --git a/youtube_dl/extractor/meipai.py b/youtube_dlc/extractor/meipai.py

similarity index 100%

rename from youtube_dl/extractor/meipai.py

rename to youtube_dlc/extractor/meipai.py
diff --git a/youtube_dl/extractor/melonvod.py b/youtube_dlc/extractor/melonvod.py

similarity index 100%

rename from youtube_dl/extractor/melonvod.py

rename to youtube_dlc/extractor/melonvod.py
diff --git a/youtube_dl/extractor/meta.py b/youtube_dlc/extractor/meta.py

similarity index 100%

rename from youtube_dl/extractor/meta.py

rename to youtube_dlc/extractor/meta.py
diff --git a/youtube_dl/extractor/metacafe.py b/youtube_dlc/extractor/metacafe.py

similarity index 100%

rename from youtube_dl/extractor/metacafe.py

rename to youtube_dlc/extractor/metacafe.py
diff --git a/youtube_dl/extractor/metacritic.py b/youtube_dlc/extractor/metacritic.py

similarity index 100%

rename from youtube_dl/extractor/metacritic.py

rename to youtube_dlc/extractor/metacritic.py
diff --git a/youtube_dl/extractor/mgoon.py b/youtube_dlc/extractor/mgoon.py

similarity index 100%

rename from youtube_dl/extractor/mgoon.py

rename to youtube_dlc/extractor/mgoon.py
diff --git a/youtube_dl/extractor/mgtv.py b/youtube_dlc/extractor/mgtv.py

similarity index 100%

rename from youtube_dl/extractor/mgtv.py

rename to youtube_dlc/extractor/mgtv.py
diff --git a/youtube_dl/extractor/miaopai.py b/youtube_dlc/extractor/miaopai.py

similarity index 100%

rename from youtube_dl/extractor/miaopai.py

rename to youtube_dlc/extractor/miaopai.py
diff --git a/youtube_dl/extractor/microsoftvirtualacademy.py b/youtube_dlc/extractor/microsoftvirtualacademy.py

similarity index 100%

rename from youtube_dl/extractor/microsoftvirtualacademy.py

rename to youtube_dlc/extractor/microsoftvirtualacademy.py
diff --git a/youtube_dl/extractor/ministrygrid.py b/youtube_dlc/extractor/ministrygrid.py

similarity index 100%

rename from youtube_dl/extractor/ministrygrid.py

rename to youtube_dlc/extractor/ministrygrid.py
diff --git a/youtube_dl/extractor/minoto.py b/youtube_dlc/extractor/minoto.py

similarity index 100%

rename from youtube_dl/extractor/minoto.py

rename to youtube_dlc/extractor/minoto.py
diff --git a/youtube_dl/extractor/miomio.py b/youtube_dlc/extractor/miomio.py

similarity index 100%

rename from youtube_dl/extractor/miomio.py

rename to youtube_dlc/extractor/miomio.py
diff --git a/youtube_dl/extractor/mit.py b/youtube_dlc/extractor/mit.py

similarity index 100%

rename from youtube_dl/extractor/mit.py

rename to youtube_dlc/extractor/mit.py
diff --git a/youtube_dlc/extractor/mitele.py b/youtube_dlc/extractor/mitele.py

new file mode 100644 (file)

index 0000000..ad9da96
--- /dev/null
+++ b/youtube_dlc/extractor/mitele.py
@@ -0,0 +1,93 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    int_or_none,
+    parse_iso8601,
+    smuggle_url,
+)
+
+
+class MiTeleIE(InfoExtractor):
+    IE_DESC = 'mitele.es'
+    _VALID_URL = r'https?://(?:www\.)?mitele\.es/(?:[^/]+/)+(?P<id>[^/]+)/player'
+
+    _TESTS = [{
+        'url': 'http://www.mitele.es/programas-tv/diario-de/57b0dfb9c715da65618b4afa/player',
+        'info_dict': {
+            'id': 'FhYW1iNTE6J6H7NkQRIEzfne6t2quqPg',
+            'ext': 'mp4',
+            'title': 'Diario de La redacción Programa 144',
+            'description': 'md5:07c35a7b11abb05876a6a79185b58d27',
+            'series': 'Diario de',
+            'season': 'Season 14',
+            'season_number': 14,
+            'episode': 'Tor, la web invisible',
+            'episode_number': 3,
+            'thumbnail': r're:(?i)^https?://.*\.jpg$',
+            'duration': 2913,
+            'age_limit': 16,
+            'timestamp': 1471209401,
+            'upload_date': '20160814',
+        },
+        'add_ie': ['Ooyala'],
+    }, {
+        # no explicit title
+        'url': 'http://www.mitele.es/programas-tv/cuarto-milenio/57b0de3dc915da14058b4876/player',
+        'info_dict': {
+            'id': 'oyNG1iNTE6TAPP-JmCjbwfwJqqMMX3Vq',
+            'ext': 'mp4',
+            'title': 'Cuarto Milenio Temporada 6 Programa 226',
+            'description': 'md5:5ff132013f0cd968ffbf1f5f3538a65f',
+            'series': 'Cuarto Milenio',
+            'season': 'Season 6',
+            'season_number': 6,
+            'episode': 'Episode 24',
+            'episode_number': 24,
+            'thumbnail': r're:(?i)^https?://.*\.jpg$',
+            'duration': 7313,
+            'age_limit': 12,
+            'timestamp': 1471209021,
+            'upload_date': '20160814',
+        },
+        'params': {
+            'skip_download': True,
+        },
+        'add_ie': ['Ooyala'],
+    }, {
+        'url': 'http://www.mitele.es/series-online/la-que-se-avecina/57aac5c1c915da951a8b45ed/player',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.mitele.es/programas-tv/diario-de/la-redaccion/programa-144-40_1006364575251/player/',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+        webpage = self._download_webpage(url, display_id)
+        pre_player = self._parse_json(self._search_regex(
+            r'window\.\$REACTBASE_STATE\.prePlayer_mtweb\s*=\s*({.+})',
+            webpage, 'Pre Player'), display_id)['prePlayer']
+        title = pre_player['title']
+        video = pre_player['video']
+        video_id = video['dataMediaId']
+        content = pre_player.get('content') or {}
+        info = content.get('info') or {}
+
+        return {
+            '_type': 'url_transparent',
+            # for some reason only HLS is supported
+            'url': smuggle_url('ooyala:' + video_id, {'supportedformats': 'm3u8,dash'}),
+            'id': video_id,
+            'title': title,
+            'description': info.get('synopsis'),
+            'series': content.get('title'),
+            'season_number': int_or_none(info.get('season_number')),
+            'episode': content.get('subtitle'),
+            'episode_number': int_or_none(info.get('episode_number')),
+            'duration': int_or_none(info.get('duration')),
+            'thumbnail': video.get('dataPoster'),
+            'age_limit': int_or_none(info.get('rating')),
+            'timestamp': parse_iso8601(pre_player.get('publishedTime')),
+        }
diff --git a/youtube_dl/extractor/mixcloud.py b/youtube_dlc/extractor/mixcloud.py

similarity index 100%

rename from youtube_dl/extractor/mixcloud.py

rename to youtube_dlc/extractor/mixcloud.py
diff --git a/youtube_dl/extractor/mlb.py b/youtube_dlc/extractor/mlb.py

similarity index 100%

rename from youtube_dl/extractor/mlb.py

rename to youtube_dlc/extractor/mlb.py
diff --git a/youtube_dl/extractor/mnet.py b/youtube_dlc/extractor/mnet.py

similarity index 100%

rename from youtube_dl/extractor/mnet.py

rename to youtube_dlc/extractor/mnet.py
diff --git a/youtube_dl/extractor/moevideo.py b/youtube_dlc/extractor/moevideo.py

similarity index 100%

rename from youtube_dl/extractor/moevideo.py

rename to youtube_dlc/extractor/moevideo.py
diff --git a/youtube_dl/extractor/mofosex.py b/youtube_dlc/extractor/mofosex.py

similarity index 73%

rename from youtube_dl/extractor/mofosex.py

rename to youtube_dlc/extractor/mofosex.py

index 1c652813adb96b994c3ba517db994805e8ea8eb3..5234cac02632d9cddde53ebb10e2a0d91c4ec508 100644 (file)
--- a/youtube_dl/extractor/mofosex.py
+++ b/youtube_dlc/extractor/mofosex.py
@@ -1,5 +1,8 @@
  from __future__ import unicode_literals
  
+import re
+
+from .common import InfoExtractor
  from ..utils import (
      int_or_none,
      str_to_int,
@@ -54,3 +57,23 @@ def _real_extract(self, url):
          })
  
          return info
+
+
+class MofosexEmbedIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?mofosex\.com/embed/?\?.*?\bvideoid=(?P<id>\d+)'
+    _TESTS = [{
+        'url': 'https://www.mofosex.com/embed/?videoid=318131&referrer=KM',
+        'only_matching': True,
+    }]
+
+    @staticmethod
+    def _extract_urls(webpage):
+        return re.findall(
+            r'<iframe[^>]+\bsrc=["\']((?:https?:)?//(?:www\.)?mofosex\.com/embed/?\?.*?\bvideoid=\d+)',
+            webpage)
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        return self.url_result(
+            'http://www.mofosex.com/videos/{0}/{0}.html'.format(video_id),
+            ie=MofosexIE.ie_key(), video_id=video_id)
diff --git a/youtube_dl/extractor/mojvideo.py b/youtube_dlc/extractor/mojvideo.py

similarity index 100%

rename from youtube_dl/extractor/mojvideo.py

rename to youtube_dlc/extractor/mojvideo.py
diff --git a/youtube_dl/extractor/morningstar.py b/youtube_dlc/extractor/morningstar.py

similarity index 100%

rename from youtube_dl/extractor/morningstar.py

rename to youtube_dlc/extractor/morningstar.py
diff --git a/youtube_dl/extractor/motherless.py b/youtube_dlc/extractor/motherless.py

similarity index 92%

rename from youtube_dl/extractor/motherless.py

rename to youtube_dlc/extractor/motherless.py

index 43fd70f112005a893377c8e5cf489291fb8cc812..b1615b4d8e4bce8b580942f717477e6ed57ee92e 100644 (file)
--- a/youtube_dl/extractor/motherless.py
+++ b/youtube_dlc/extractor/motherless.py
@@ -26,7 +26,7 @@ class MotherlessIE(InfoExtractor):
              'categories': ['Gaming', 'anal', 'reluctant', 'rough', 'Wife'],
              'upload_date': '20100913',
              'uploader_id': 'famouslyfuckedup',
-            'thumbnail': r're:http://.*\.jpg',
+            'thumbnail': r're:https?://.*\.jpg',
              'age_limit': 18,
          }
      }, {
@@ -40,7 +40,7 @@ class MotherlessIE(InfoExtractor):
                             'game', 'hairy'],
              'upload_date': '20140622',
              'uploader_id': 'Sulivana7x',
-            'thumbnail': r're:http://.*\.jpg',
+            'thumbnail': r're:https?://.*\.jpg',
              'age_limit': 18,
          },
          'skip': '404',
@@ -54,7 +54,7 @@ class MotherlessIE(InfoExtractor):
              'categories': ['superheroine heroine  superher'],
              'upload_date': '20140827',
              'uploader_id': 'shade0230',
-            'thumbnail': r're:http://.*\.jpg',
+            'thumbnail': r're:https?://.*\.jpg',
              'age_limit': 18,
          }
      }, {
@@ -76,7 +76,8 @@ def _real_extract(self, url):
              raise ExtractorError('Video %s is for friends only' % video_id, expected=True)
  
          title = self._html_search_regex(
-            r'id="view-upload-title">\s+([^<]+)<', webpage, 'title')
+            (r'(?s)<div[^>]+\bclass=["\']media-meta-title[^>]+>(.+?)</div>',
+             r'id="view-upload-title">\s+([^<]+)<'), webpage, 'title')
          video_url = (self._html_search_regex(
              (r'setup\(\{\s*["\']file["\']\s*:\s*(["\'])(?P<url>(?:(?!\1).)+)\1',
               r'fileurl\s*=\s*(["\'])(?P<url>(?:(?!\1).)+)\1'),
@@ -84,14 +85,15 @@ def _real_extract(self, url):
              or 'http://cdn4.videos.motherlessmedia.com/videos/%s.mp4?fs=opencloud' % video_id)
          age_limit = self._rta_search(webpage)
          view_count = str_to_int(self._html_search_regex(
-            r'<strong>Views</strong>\s+([^<]+)<',
+            (r'>(\d+)\s+Views<', r'<strong>Views</strong>\s+([^<]+)<'),
              webpage, 'view count', fatal=False))
          like_count = str_to_int(self._html_search_regex(
-            r'<strong>Favorited</strong>\s+([^<]+)<',
+            (r'>(\d+)\s+Favorites<', r'<strong>Favorited</strong>\s+([^<]+)<'),
              webpage, 'like count', fatal=False))
  
          upload_date = self._html_search_regex(
-            r'<strong>Uploaded</strong>\s+([^<]+)<', webpage, 'upload date')
+            (r'class=["\']count[^>]+>(\d+\s+[a-zA-Z]{3}\s+\d{4})<',
+             r'<strong>Uploaded</strong>\s+([^<]+)<'), webpage, 'upload date')
          if 'Ago' in upload_date:
              days = int(re.search(r'([0-9]+)', upload_date).group(1))
              upload_date = (datetime.datetime.now() - datetime.timedelta(days=days)).strftime('%Y%m%d')
diff --git a/youtube_dl/extractor/motorsport.py b/youtube_dlc/extractor/motorsport.py

similarity index 100%

rename from youtube_dl/extractor/motorsport.py

rename to youtube_dlc/extractor/motorsport.py
diff --git a/youtube_dl/extractor/movieclips.py b/youtube_dlc/extractor/movieclips.py

similarity index 100%

rename from youtube_dl/extractor/movieclips.py

rename to youtube_dlc/extractor/movieclips.py
diff --git a/youtube_dl/extractor/moviezine.py b/youtube_dlc/extractor/moviezine.py

similarity index 100%

rename from youtube_dl/extractor/moviezine.py

rename to youtube_dlc/extractor/moviezine.py
diff --git a/youtube_dl/extractor/movingimage.py b/youtube_dlc/extractor/movingimage.py

similarity index 100%

rename from youtube_dl/extractor/movingimage.py

rename to youtube_dlc/extractor/movingimage.py
diff --git a/youtube_dl/extractor/msn.py b/youtube_dlc/extractor/msn.py

similarity index 100%

rename from youtube_dl/extractor/msn.py

rename to youtube_dlc/extractor/msn.py
diff --git a/youtube_dl/extractor/mtv.py b/youtube_dlc/extractor/mtv.py

similarity index 100%

rename from youtube_dl/extractor/mtv.py

rename to youtube_dlc/extractor/mtv.py
diff --git a/youtube_dl/extractor/muenchentv.py b/youtube_dlc/extractor/muenchentv.py

similarity index 100%

rename from youtube_dl/extractor/muenchentv.py

rename to youtube_dlc/extractor/muenchentv.py
diff --git a/youtube_dl/extractor/mwave.py b/youtube_dlc/extractor/mwave.py

similarity index 100%

rename from youtube_dl/extractor/mwave.py

rename to youtube_dlc/extractor/mwave.py
diff --git a/youtube_dl/extractor/mychannels.py b/youtube_dlc/extractor/mychannels.py

similarity index 100%

rename from youtube_dl/extractor/mychannels.py

rename to youtube_dlc/extractor/mychannels.py
diff --git a/youtube_dl/extractor/myspace.py b/youtube_dlc/extractor/myspace.py

similarity index 100%

rename from youtube_dl/extractor/myspace.py

rename to youtube_dlc/extractor/myspace.py
diff --git a/youtube_dl/extractor/myspass.py b/youtube_dlc/extractor/myspass.py

similarity index 100%

rename from youtube_dl/extractor/myspass.py

rename to youtube_dlc/extractor/myspass.py
diff --git a/youtube_dl/extractor/myvi.py b/youtube_dlc/extractor/myvi.py

similarity index 100%

rename from youtube_dl/extractor/myvi.py

rename to youtube_dlc/extractor/myvi.py
diff --git a/youtube_dlc/extractor/myvideoge.py b/youtube_dlc/extractor/myvideoge.py

new file mode 100644 (file)

index 0000000..0a1d7d0
--- /dev/null
+++ b/youtube_dlc/extractor/myvideoge.py
@@ -0,0 +1,56 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import js_to_json
+
+
+class MyVideoGeIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?myvideo\.ge/v/(?P<id>[0-9]+)'
+    _TEST = {
+        'url': 'https://www.myvideo.ge/v/3941048',
+        'md5': '8c192a7d2b15454ba4f29dc9c9a52ea9',
+        'info_dict': {
+            'id': '3941048',
+            'ext': 'mp4',
+            'title': 'The best prikol',
+            'thumbnail': r're:^https?://.*\.jpg$',
+            'uploader': 'md5:d72addd357b0dd914e704781f7f777d8',
+            'description': 'md5:5c0371f540f5888d603ebfedd46b6df3'
+        }
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        webpage = self._download_webpage(url, video_id)
+
+        title = self._html_search_regex(r'<h1[^>]*>([^<]+)</h1>', webpage, 'title')
+        description = self._og_search_description(webpage)
+        thumbnail = self._html_search_meta(['og:image'], webpage)
+        uploader = self._search_regex(r'<a[^>]+class="mv_user_name"[^>]*>([^<]+)<', webpage, 'uploader', fatal=False)
+
+        jwplayer_sources = self._parse_json(
+            self._search_regex(
+                r"(?s)jwplayer\(\"mvplayer\"\)\.setup\(.*?sources: (.*?])", webpage, 'jwplayer sources'),
+            video_id, transform_source=js_to_json)
+
+        def _formats_key(f):
+            if f.get('label') == 'SD':
+                return -1
+            elif f.get('label') == 'HD':
+                return 1
+            else:
+                return 0
+
+        jwplayer_sources = sorted(jwplayer_sources, key=_formats_key)
+
+        formats = self._parse_jwplayer_formats(jwplayer_sources, video_id)
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': description,
+            'uploader': uploader,
+            'formats': formats,
+            'thumbnail': thumbnail
+        }
diff --git a/youtube_dl/extractor/myvidster.py b/youtube_dlc/extractor/myvidster.py

similarity index 100%

rename from youtube_dl/extractor/myvidster.py

rename to youtube_dlc/extractor/myvidster.py
diff --git a/youtube_dl/extractor/nationalgeographic.py b/youtube_dlc/extractor/nationalgeographic.py

similarity index 100%

rename from youtube_dl/extractor/nationalgeographic.py

rename to youtube_dlc/extractor/nationalgeographic.py
diff --git a/youtube_dl/extractor/naver.py b/youtube_dlc/extractor/naver.py

similarity index 50%

rename from youtube_dl/extractor/naver.py

rename to youtube_dlc/extractor/naver.py

index bb3d944133d6a1e2685779b86a7565ad9b0985f0..61fc59126f61ef3599f69ac7ad1d920e9bc87817 100644 (file)
--- a/youtube_dl/extractor/naver.py
+++ b/youtube_dlc/extractor/naver.py
@@ -1,68 +1,33 @@
  # coding: utf-8
  from __future__ import unicode_literals
  
+import re
+
  from .common import InfoExtractor
  from ..utils import (
+    clean_html,
+    dict_get,
      ExtractorError,
      int_or_none,
+    parse_duration,
+    try_get,
      update_url_query,
  )
  
  
-class NaverIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:m\.)?tv(?:cast)?\.naver\.com/v/(?P<id>\d+)'
+class NaverBaseIE(InfoExtractor):
+    _CAPTION_EXT_RE = r'\.(?:ttml|vtt)'
  
-    _TESTS = [{
-        'url': 'http://tv.naver.com/v/81652',
-        'info_dict': {
-            'id': '81652',
-            'ext': 'mp4',
-            'title': '[9월 모의고사 해설강의][수학_김상희] 수학 A형 16~20번',
-            'description': '합격불변의 법칙 메가스터디 | 메가스터디 수학 김상희 선생님이 9월 모의고사 수학A형 16번에서 20번까지 해설강의를 공개합니다.',
-            'upload_date': '20130903',
-        },
-    }, {
-        'url': 'http://tv.naver.com/v/395837',
-        'md5': '638ed4c12012c458fefcddfd01f173cd',
-        'info_dict': {
-            'id': '395837',
-            'ext': 'mp4',
-            'title': '9년이 지나도 아픈 기억, 전효성의 아버지',
-            'description': 'md5:5bf200dcbf4b66eb1b350d1eb9c753f7',
-            'upload_date': '20150519',
-        },
-        'skip': 'Georestricted',
-    }, {
-        'url': 'http://tvcast.naver.com/v/81652',
-        'only_matching': True,
-    }]
-
-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-        webpage = self._download_webpage(url, video_id)
-
-        vid = self._search_regex(
-            r'videoId["\']\s*:\s*(["\'])(?P<value>(?:(?!\1).)+)\1', webpage,
-            'video id', fatal=None, group='value')
-        in_key = self._search_regex(
-            r'inKey["\']\s*:\s*(["\'])(?P<value>(?:(?!\1).)+)\1', webpage,
-            'key', default=None, group='value')
-
-        if not vid or not in_key:
-            error = self._html_search_regex(
-                r'(?s)<div class="(?:nation_error|nation_box|error_box)">\s*(?:<!--.*?-->)?\s*<p class="[^"]+">(?P<msg>.+?)</p>\s*</div>',
-                webpage, 'error', default=None)
-            if error:
-                raise ExtractorError(error, expected=True)
-            raise ExtractorError('couldn\'t extract vid and key')
+    def _extract_video_info(self, video_id, vid, key):
          video_data = self._download_json(
              'http://play.rmcnmv.naver.com/vod/play/v2.0/' + vid,
              video_id, query={
-                'key': in_key,
+                'key': key,
              })
          meta = video_data['meta']
          title = meta['subject']
          formats = []
+        get_list = lambda x: try_get(video_data, lambda y: y[x + 's']['list'], list) or []
  
          def extract_formats(streams, stream_type, query={}):
              for stream in streams:
@@ -73,7 +38,7 @@ def extract_formats(streams, stream_type, query={}):
                  encoding_option = stream.get('encodingOption', {})
                  bitrate = stream.get('bitrate', {})
                  formats.append({
-                    'format_id': '%s_%s' % (stream.get('type') or stream_type, encoding_option.get('id') or encoding_option.get('name')),
+                    'format_id': '%s_%s' % (stream.get('type') or stream_type, dict_get(encoding_option, ('name', 'id'))),
                      'url': stream_url,
                      'width': int_or_none(encoding_option.get('width')),
                      'height': int_or_none(encoding_option.get('height')),
@@ -83,7 +48,7 @@ def extract_formats(streams, stream_type, query={}):
                      'protocol': 'm3u8_native' if stream_type == 'HLS' else None,
                  })
  
-        extract_formats(video_data.get('videos', {}).get('list', []), 'H264')
+        extract_formats(get_list('video'), 'H264')
          for stream_set in video_data.get('streams', []):
              query = {}
              for param in stream_set.get('keys', []):
@@ -101,28 +66,101 @@ def extract_formats(streams, stream_type, query={}):
                      'mp4', 'm3u8_native', m3u8_id=stream_type, fatal=False))
          self._sort_formats(formats)
  
+        replace_ext = lambda x, y: re.sub(self._CAPTION_EXT_RE, '.' + y, x)
+
+        def get_subs(caption_url):
+            if re.search(self._CAPTION_EXT_RE, caption_url):
+                return [{
+                    'url': replace_ext(caption_url, 'ttml'),
+                }, {
+                    'url': replace_ext(caption_url, 'vtt'),
+                }]
+            else:
+                return [{'url': caption_url}]
+
+        automatic_captions = {}
          subtitles = {}
-        for caption in video_data.get('captions', {}).get('list', []):
+        for caption in get_list('caption'):
              caption_url = caption.get('source')
              if not caption_url:
                  continue
-            subtitles.setdefault(caption.get('language') or caption.get('locale'), []).append({
-                'url': caption_url,
-            })
+            sub_dict = automatic_captions if caption.get('type') == 'auto' else subtitles
+            sub_dict.setdefault(dict_get(caption, ('locale', 'language')), []).extend(get_subs(caption_url))
  
-        upload_date = self._search_regex(
-            r'<span[^>]+class="date".*?(\d{4}\.\d{2}\.\d{2})',
-            webpage, 'upload date', fatal=False)
-        if upload_date:
-            upload_date = upload_date.replace('.', '')
+        user = meta.get('user', {})
  
          return {
              'id': video_id,
              'title': title,
              'formats': formats,
              'subtitles': subtitles,
-            'description': self._og_search_description(webpage),
-            'thumbnail': meta.get('cover', {}).get('source') or self._og_search_thumbnail(webpage),
+            'automatic_captions': automatic_captions,
+            'thumbnail': try_get(meta, lambda x: x['cover']['source']),
              'view_count': int_or_none(meta.get('count')),
-            'upload_date': upload_date,
+            'uploader_id': user.get('id'),
+            'uploader': user.get('name'),
+            'uploader_url': user.get('url'),
          }
+
+
+class NaverIE(NaverBaseIE):
+    _VALID_URL = r'https?://(?:m\.)?tv(?:cast)?\.naver\.com/(?:v|embed)/(?P<id>\d+)'
+    _GEO_BYPASS = False
+    _TESTS = [{
+        'url': 'http://tv.naver.com/v/81652',
+        'info_dict': {
+            'id': '81652',
+            'ext': 'mp4',
+            'title': '[9월 모의고사 해설강의][수학_김상희] 수학 A형 16~20번',
+            'description': '메가스터디 수학 김상희 선생님이 9월 모의고사 수학A형 16번에서 20번까지 해설강의를 공개합니다.',
+            'timestamp': 1378200754,
+            'upload_date': '20130903',
+            'uploader': '메가스터디, 합격불변의 법칙',
+            'uploader_id': 'megastudy',
+        },
+    }, {
+        'url': 'http://tv.naver.com/v/395837',
+        'md5': '8a38e35354d26a17f73f4e90094febd3',
+        'info_dict': {
+            'id': '395837',
+            'ext': 'mp4',
+            'title': '9년이 지나도 아픈 기억, 전효성의 아버지',
+            'description': 'md5:eb6aca9d457b922e43860a2a2b1984d3',
+            'timestamp': 1432030253,
+            'upload_date': '20150519',
+            'uploader': '4가지쇼 시즌2',
+            'uploader_id': 'wrappinguser29',
+        },
+        'skip': 'Georestricted',
+    }, {
+        'url': 'http://tvcast.naver.com/v/81652',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        content = self._download_json(
+            'https://tv.naver.com/api/json/v/' + video_id,
+            video_id, headers=self.geo_verification_headers())
+        player_info_json = content.get('playerInfoJson') or {}
+        current_clip = player_info_json.get('currentClip') or {}
+
+        vid = current_clip.get('videoId')
+        in_key = current_clip.get('inKey')
+
+        if not vid or not in_key:
+            player_auth = try_get(player_info_json, lambda x: x['playerOption']['auth'])
+            if player_auth == 'notCountry':
+                self.raise_geo_restricted(countries=['KR'])
+            elif player_auth == 'notLogin':
+                self.raise_login_required()
+            raise ExtractorError('couldn\'t extract vid and key')
+        info = self._extract_video_info(video_id, vid, in_key)
+        info.update({
+            'description': clean_html(current_clip.get('description')),
+            'timestamp': int_or_none(current_clip.get('firstExposureTime'), 1000),
+            'duration': parse_duration(current_clip.get('displayPlayTime')),
+            'like_count': int_or_none(current_clip.get('recommendPoint')),
+            'age_limit': 19 if current_clip.get('adult') else None,
+        })
+        return info
diff --git a/youtube_dl/extractor/nba.py b/youtube_dlc/extractor/nba.py

similarity index 100%

rename from youtube_dl/extractor/nba.py

rename to youtube_dlc/extractor/nba.py
diff --git a/youtube_dl/extractor/nbc.py b/youtube_dlc/extractor/nbc.py

similarity index 96%

rename from youtube_dl/extractor/nbc.py

rename to youtube_dlc/extractor/nbc.py

index 5bc39d00242c78e7dca8d479dad38b8df8728a0e..6f3cb30034da7f5fcebb99fc6dec05f1ff3cd8e4 100644 (file)
--- a/youtube_dl/extractor/nbc.py
+++ b/youtube_dlc/extractor/nbc.py
@@ -87,11 +87,25 @@ class NBCIE(AdobePassIE):
      def _real_extract(self, url):
          permalink, video_id = re.match(self._VALID_URL, url).groups()
          permalink = 'http' + compat_urllib_parse_unquote(permalink)
-        response = self._download_json(
+        video_data = self._download_json(
              'https://friendship.nbc.co/v2/graphql', video_id, query={
-                'query': '''{
-  page(name: "%s", platform: web, type: VIDEO, userId: "0") {
-    data {
+                'query': '''query bonanzaPage(
+  $app: NBCUBrands! = nbc
+  $name: String!
+  $oneApp: Boolean
+  $platform: SupportedPlatforms! = web
+  $type: EntityPageType! = VIDEO
+  $userId: String!
+) {
+  bonanzaPage(
+    app: $app
+    name: $name
+    oneApp: $oneApp
+    platform: $platform
+    type: $type
+    userId: $userId
+  ) {
+    metadata {
        ... on VideoPageData {
          description
          episodeNumber
@@ -100,15 +114,20 @@ def _real_extract(self, url):
          mpxAccountId
          mpxGuid
          rating
+        resourceId
          seasonNumber
          secondaryTitle
          seriesShortTitle
        }
      }
    }
-}''' % permalink,
-            })
-        video_data = response['data']['page']['data']
+}''',
+                'variables': json.dumps({
+                    'name': permalink,
+                    'oneApp': True,
+                    'userId': '0',
+                }),
+            })['data']['bonanzaPage']['metadata']
          query = {
              'mbr': 'true',
              'manifest': 'm3u',
@@ -117,8 +136,8 @@ def _real_extract(self, url):
          title = video_data['secondaryTitle']
          if video_data.get('locked'):
              resource = self._get_mvpd_resource(
-                'nbcentertainment', title, video_id,
-                video_data.get('rating'))
+                video_data.get('resourceId') or 'nbcentertainment',
+                title, video_id, video_data.get('rating'))
              query['auth'] = self._extract_mvpd_auth(
                  url, video_id, 'nbcentertainment', resource)
          theplatform_url = smuggle_url(update_url_query(
diff --git a/youtube_dl/extractor/ndr.py b/youtube_dlc/extractor/ndr.py

similarity index 93%

rename from youtube_dl/extractor/ndr.py

rename to youtube_dlc/extractor/ndr.py

index aec2ea1331f3c909957e50d4166e7657618fa1a6..2447c812e021e73991082aefab4bd98e6dd000a1 100644 (file)
--- a/youtube_dl/extractor/ndr.py
+++ b/youtube_dlc/extractor/ndr.py
@@ -7,8 +7,11 @@
  from ..utils import (
      determine_ext,
      int_or_none,
+    merge_dicts,
      parse_iso8601,
      qualities,
+    try_get,
+    urljoin,
  )
  
  
@@ -85,21 +88,25 @@ class NDRIE(NDRBaseIE):
  
      def _extract_embed(self, webpage, display_id):
          embed_url = self._html_search_meta(
-            'embedURL', webpage, 'embed URL', fatal=True)
+            'embedURL', webpage, 'embed URL',
+            default=None) or self._search_regex(
+            r'\bembedUrl["\']\s*:\s*(["\'])(?P<url>(?:(?!\1).)+)\1', webpage,
+            'embed URL', group='url')
          description = self._search_regex(
              r'<p[^>]+itemprop="description">([^<]+)</p>',
              webpage, 'description', default=None) or self._og_search_description(webpage)
          timestamp = parse_iso8601(
              self._search_regex(
                  r'<span[^>]+itemprop="(?:datePublished|uploadDate)"[^>]+content="([^"]+)"',
-                webpage, 'upload date', fatal=False))
-        return {
+                webpage, 'upload date', default=None))
+        info = self._search_json_ld(webpage, display_id, default={})
+        return merge_dicts({
              '_type': 'url_transparent',
              'url': embed_url,
              'display_id': display_id,
              'description': description,
              'timestamp': timestamp,
-        }
+        }, info)
  
  
  class NJoyIE(NDRBaseIE):
@@ -220,11 +227,17 @@ def _real_extract(self, url):
          upload_date = ppjson.get('config', {}).get('publicationDate')
          duration = int_or_none(config.get('duration'))
  
-        thumbnails = [{
-            'id': thumbnail.get('quality') or thumbnail_id,
-            'url': thumbnail['src'],
-            'preference': quality_key(thumbnail.get('quality')),
-        } for thumbnail_id, thumbnail in config.get('poster', {}).items() if thumbnail.get('src')]
+        thumbnails = []
+        poster = try_get(config, lambda x: x['poster'], dict) or {}
+        for thumbnail_id, thumbnail in poster.items():
+            thumbnail_url = urljoin(url, thumbnail.get('src'))
+            if not thumbnail_url:
+                continue
+            thumbnails.append({
+                'id': thumbnail.get('quality') or thumbnail_id,
+                'url': thumbnail_url,
+                'preference': quality_key(thumbnail.get('quality')),
+            })
  
          return {
              'id': video_id,
diff --git a/youtube_dl/extractor/ndtv.py b/youtube_dlc/extractor/ndtv.py

similarity index 100%

rename from youtube_dl/extractor/ndtv.py

rename to youtube_dlc/extractor/ndtv.py
diff --git a/youtube_dl/extractor/nerdcubed.py b/youtube_dlc/extractor/nerdcubed.py

similarity index 100%

rename from youtube_dl/extractor/nerdcubed.py

rename to youtube_dlc/extractor/nerdcubed.py
diff --git a/youtube_dl/extractor/neteasemusic.py b/youtube_dlc/extractor/neteasemusic.py

similarity index 100%

rename from youtube_dl/extractor/neteasemusic.py

rename to youtube_dlc/extractor/neteasemusic.py
diff --git a/youtube_dl/extractor/netzkino.py b/youtube_dlc/extractor/netzkino.py

similarity index 100%

rename from youtube_dl/extractor/netzkino.py

rename to youtube_dlc/extractor/netzkino.py
diff --git a/youtube_dl/extractor/newgrounds.py b/youtube_dlc/extractor/newgrounds.py

similarity index 100%

rename from youtube_dl/extractor/newgrounds.py

rename to youtube_dlc/extractor/newgrounds.py
diff --git a/youtube_dl/extractor/newstube.py b/youtube_dlc/extractor/newstube.py

similarity index 100%

rename from youtube_dl/extractor/newstube.py

rename to youtube_dlc/extractor/newstube.py
diff --git a/youtube_dl/extractor/nextmedia.py b/youtube_dlc/extractor/nextmedia.py

similarity index 100%

rename from youtube_dl/extractor/nextmedia.py

rename to youtube_dlc/extractor/nextmedia.py
diff --git a/youtube_dl/extractor/nexx.py b/youtube_dlc/extractor/nexx.py

similarity index 100%

rename from youtube_dl/extractor/nexx.py

rename to youtube_dlc/extractor/nexx.py
diff --git a/youtube_dl/extractor/nfl.py b/youtube_dlc/extractor/nfl.py

similarity index 100%

rename from youtube_dl/extractor/nfl.py

rename to youtube_dlc/extractor/nfl.py
diff --git a/youtube_dl/extractor/nhk.py b/youtube_dlc/extractor/nhk.py

similarity index 85%

rename from youtube_dl/extractor/nhk.py

rename to youtube_dlc/extractor/nhk.py

index 6a2c6cb7bb6d039c56fcf7325de422846c437ab5..de6a707c4265c4fc61a57db117a432a95468ab54 100644 (file)
--- a/youtube_dl/extractor/nhk.py
+++ b/youtube_dlc/extractor/nhk.py
@@ -6,7 +6,7 @@
  
  
  class NhkVodIE(InfoExtractor):
-    _VALID_URL = r'https?://www3\.nhk\.or\.jp/nhkworld/(?P<lang>[a-z]{2})/ondemand/(?P<type>video|audio)/(?P<id>\d{7}|[a-z]+-\d{8}-\d+)'
+    _VALID_URL = r'https?://www3\.nhk\.or\.jp/nhkworld/(?P<lang>[a-z]{2})/ondemand/(?P<type>video|audio)/(?P<id>\d{7}|[^/]+?-\d{8}-\d+)'
      # Content available only for a limited period of time. Visit
      # https://www3.nhk.or.jp/nhkworld/en/ondemand/ for working samples.
      _TESTS = [{
@@ -30,8 +30,11 @@ class NhkVodIE(InfoExtractor):
      }, {
          'url': 'https://www3.nhk.or.jp/nhkworld/fr/ondemand/audio/plugin-20190404-1/',
          'only_matching': True,
+    }, {
+        'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/audio/j_art-20150903-1/',
+        'only_matching': True,
      }]
-    _API_URL_TEMPLATE = 'https://api.nhk.or.jp/nhkworld/%sod%slist/v7/episode/%s/%s/all%s.json'
+    _API_URL_TEMPLATE = 'https://api.nhk.or.jp/nhkworld/%sod%slist/v7a/episode/%s/%s/all%s.json'
  
      def _real_extract(self, url):
          lang, m_type, episode_id = re.match(self._VALID_URL, url).groups()
@@ -82,15 +85,9 @@ def get_clean_field(key):
              audio = episode['audio']
              audio_path = audio['audio']
              info['formats'] = self._extract_m3u8_formats(
-                'https://nhks-vh.akamaihd.net/i%s/master.m3u8' % audio_path,
-                episode_id, 'm4a', m3u8_id='hls', fatal=False)
-            for proto in ('rtmpt', 'rtmp'):
-                info['formats'].append({
-                    'ext': 'flv',
-                    'format_id': proto,
-                    'url': '%s://flv.nhk.or.jp/ondemand/mp4:flv%s' % (proto, audio_path),
-                    'vcodec': 'none',
-                })
+                'https://nhkworld-vh.akamaihd.net/i%s/master.m3u8' % audio_path,
+                episode_id, 'm4a', entry_protocol='m3u8_native',
+                m3u8_id='hls', fatal=False)
              for f in info['formats']:
                  f['language'] = lang
          return info
diff --git a/youtube_dl/extractor/nhl.py b/youtube_dlc/extractor/nhl.py

similarity index 100%

rename from youtube_dl/extractor/nhl.py

rename to youtube_dlc/extractor/nhl.py
diff --git a/youtube_dl/extractor/nick.py b/youtube_dlc/extractor/nick.py

similarity index 100%

rename from youtube_dl/extractor/nick.py

rename to youtube_dlc/extractor/nick.py
diff --git a/youtube_dl/extractor/niconico.py b/youtube_dlc/extractor/niconico.py

similarity index 100%

rename from youtube_dl/extractor/niconico.py

rename to youtube_dlc/extractor/niconico.py
diff --git a/youtube_dl/extractor/ninecninemedia.py b/youtube_dlc/extractor/ninecninemedia.py

similarity index 100%

rename from youtube_dl/extractor/ninecninemedia.py

rename to youtube_dlc/extractor/ninecninemedia.py
diff --git a/youtube_dl/extractor/ninegag.py b/youtube_dlc/extractor/ninegag.py

similarity index 100%

rename from youtube_dl/extractor/ninegag.py

rename to youtube_dlc/extractor/ninegag.py
diff --git a/youtube_dl/extractor/ninenow.py b/youtube_dlc/extractor/ninenow.py

similarity index 100%

rename from youtube_dl/extractor/ninenow.py

rename to youtube_dlc/extractor/ninenow.py
diff --git a/youtube_dl/extractor/nintendo.py b/youtube_dlc/extractor/nintendo.py

similarity index 100%

rename from youtube_dl/extractor/nintendo.py

rename to youtube_dlc/extractor/nintendo.py
diff --git a/youtube_dl/extractor/njpwworld.py b/youtube_dlc/extractor/njpwworld.py

similarity index 100%

rename from youtube_dl/extractor/njpwworld.py

rename to youtube_dlc/extractor/njpwworld.py
diff --git a/youtube_dl/extractor/nobelprize.py b/youtube_dlc/extractor/nobelprize.py

similarity index 100%

rename from youtube_dl/extractor/nobelprize.py

rename to youtube_dlc/extractor/nobelprize.py
diff --git a/youtube_dl/extractor/noco.py b/youtube_dlc/extractor/noco.py

similarity index 100%

rename from youtube_dl/extractor/noco.py

rename to youtube_dlc/extractor/noco.py
diff --git a/youtube_dl/extractor/nonktube.py b/youtube_dlc/extractor/nonktube.py

similarity index 100%

rename from youtube_dl/extractor/nonktube.py

rename to youtube_dlc/extractor/nonktube.py
diff --git a/youtube_dl/extractor/noovo.py b/youtube_dlc/extractor/noovo.py

similarity index 100%

rename from youtube_dl/extractor/noovo.py

rename to youtube_dlc/extractor/noovo.py
diff --git a/youtube_dl/extractor/normalboots.py b/youtube_dlc/extractor/normalboots.py

similarity index 100%

rename from youtube_dl/extractor/normalboots.py

rename to youtube_dlc/extractor/normalboots.py
diff --git a/youtube_dl/extractor/nosvideo.py b/youtube_dlc/extractor/nosvideo.py

similarity index 100%

rename from youtube_dl/extractor/nosvideo.py

rename to youtube_dlc/extractor/nosvideo.py
diff --git a/youtube_dl/extractor/nova.py b/youtube_dlc/extractor/nova.py

similarity index 67%

rename from youtube_dl/extractor/nova.py

rename to youtube_dlc/extractor/nova.py

index 901f44b54f40c2c02e120c636a80b0b5bfb4ea2e..47b9748f0202d76ddef8ee5f96257bebe88e4169 100644 (file)
--- a/youtube_dl/extractor/nova.py
+++ b/youtube_dlc/extractor/nova.py
@@ -6,6 +6,7 @@
  from .common import InfoExtractor
  from ..utils import (
      clean_html,
+    determine_ext,
      int_or_none,
      js_to_json,
      qualities,
@@ -18,7 +19,7 @@ class NovaEmbedIE(InfoExtractor):
      _VALID_URL = r'https?://media\.cms\.nova\.cz/embed/(?P<id>[^/?#&]+)'
      _TEST = {
          'url': 'https://media.cms.nova.cz/embed/8o0n0r?autoplay=1',
-        'md5': 'b3834f6de5401baabf31ed57456463f7',
+        'md5': 'ee009bafcc794541570edd44b71cbea3',
          'info_dict': {
              'id': '8o0n0r',
              'ext': 'mp4',
@@ -33,36 +34,76 @@ def _real_extract(self, url):
  
          webpage = self._download_webpage(url, video_id)
  
-        bitrates = self._parse_json(
+        duration = None
+        formats = []
+
+        player = self._parse_json(
              self._search_regex(
-                r'(?s)(?:src|bitrates)\s*=\s*({.+?})\s*;', webpage, 'formats'),
-            video_id, transform_source=js_to_json)
+                r'Player\.init\s*\([^,]+,\s*({.+?})\s*,\s*{.+?}\s*\)\s*;',
+                webpage, 'player', default='{}'), video_id, fatal=False)
+        if player:
+            for format_id, format_list in player['tracks'].items():
+                if not isinstance(format_list, list):
+                    format_list = [format_list]
+                for format_dict in format_list:
+                    if not isinstance(format_dict, dict):
+                        continue
+                    format_url = url_or_none(format_dict.get('src'))
+                    format_type = format_dict.get('type')
+                    ext = determine_ext(format_url)
+                    if (format_type == 'application/x-mpegURL'
+                            or format_id == 'HLS' or ext == 'm3u8'):
+                        formats.extend(self._extract_m3u8_formats(
+                            format_url, video_id, 'mp4',
+                            entry_protocol='m3u8_native', m3u8_id='hls',
+                            fatal=False))
+                    elif (format_type == 'application/dash+xml'
+                          or format_id == 'DASH' or ext == 'mpd'):
+                        formats.extend(self._extract_mpd_formats(
+                            format_url, video_id, mpd_id='dash', fatal=False))
+                    else:
+                        formats.append({
+                            'url': format_url,
+                        })
+            duration = int_or_none(player.get('duration'))
+        else:
+            # Old path, no longer current as of 08.04.2020
+            bitrates = self._parse_json(
+                self._search_regex(
+                    r'(?s)(?:src|bitrates)\s*=\s*({.+?})\s*;', webpage, 'formats'),
+                video_id, transform_source=js_to_json)
  
-        QUALITIES = ('lq', 'mq', 'hq', 'hd')
-        quality_key = qualities(QUALITIES)
+            QUALITIES = ('lq', 'mq', 'hq', 'hd')
+            quality_key = qualities(QUALITIES)
+
+            for format_id, format_list in bitrates.items():
+                if not isinstance(format_list, list):
+                    format_list = [format_list]
+                for format_url in format_list:
+                    format_url = url_or_none(format_url)
+                    if not format_url:
+                        continue
+                    if format_id == 'hls':
+                        formats.extend(self._extract_m3u8_formats(
+                            format_url, video_id, ext='mp4',
+                            entry_protocol='m3u8_native', m3u8_id='hls',
+                            fatal=False))
+                        continue
+                    f = {
+                        'url': format_url,
+                    }
+                    f_id = format_id
+                    for quality in QUALITIES:
+                        if '%s.mp4' % quality in format_url:
+                            f_id += '-%s' % quality
+                            f.update({
+                                'quality': quality_key(quality),
+                                'format_note': quality.upper(),
+                            })
+                            break
+                    f['format_id'] = f_id
+                    formats.append(f)
  
-        formats = []
-        for format_id, format_list in bitrates.items():
-            if not isinstance(format_list, list):
-                continue
-            for format_url in format_list:
-                format_url = url_or_none(format_url)
-                if not format_url:
-                    continue
-                f = {
-                    'url': format_url,
-                }
-                f_id = format_id
-                for quality in QUALITIES:
-                    if '%s.mp4' % quality in format_url:
-                        f_id += '-%s' % quality
-                        f.update({
-                            'quality': quality_key(quality),
-                            'format_note': quality.upper(),
-                        })
-                        break
-                f['format_id'] = f_id
-                formats.append(f)
          self._sort_formats(formats)
  
          title = self._og_search_title(
@@ -75,7 +116,8 @@ def _real_extract(self, url):
              r'poster\s*:\s*(["\'])(?P<value>(?:(?!\1).)+)\1', webpage,
              'thumbnail', fatal=False, group='value')
          duration = int_or_none(self._search_regex(
-            r'videoDuration\s*:\s*(\d+)', webpage, 'duration', fatal=False))
+            r'videoDuration\s*:\s*(\d+)', webpage, 'duration',
+            default=duration))
  
          return {
              'id': video_id,
@@ -91,7 +133,7 @@ class NovaIE(InfoExtractor):
      _VALID_URL = r'https?://(?:[^.]+\.)?(?P<site>tv(?:noviny)?|tn|novaplus|vymena|fanda|krasna|doma|prask)\.nova\.cz/(?:[^/]+/)+(?P<id>[^/]+?)(?:\.html|/|$)'
      _TESTS = [{
          'url': 'http://tn.nova.cz/clanek/tajemstvi-ukryte-v-podzemi-specialni-nemocnice-v-prazske-krci.html#player_13260',
-        'md5': '1dd7b9d5ea27bc361f110cd855a19bd3',
+        'md5': '249baab7d0104e186e78b0899c7d5f28',
          'info_dict': {
              'id': '1757139',
              'display_id': 'tajemstvi-ukryte-v-podzemi-specialni-nemocnice-v-prazske-krci',
@@ -113,7 +155,8 @@ class NovaIE(InfoExtractor):
          'params': {
              # rtmp download
              'skip_download': True,
-        }
+        },
+        'skip': 'gone',
      }, {
          # media.cms.nova.cz embed
          'url': 'https://novaplus.nova.cz/porad/ulice/epizoda/18760-2180-dil',
@@ -128,6 +171,7 @@ class NovaIE(InfoExtractor):
              'skip_download': True,
          },
          'add_ie': [NovaEmbedIE.ie_key()],
+        'skip': 'CHYBA 404: STRÁNKA NENALEZENA',
      }, {
          'url': 'http://sport.tn.nova.cz/clanek/sport/hokej/nhl/zivot-jde-dal-hodnotil-po-vyrazeni-z-playoff-jiri-sekac.html',
          'only_matching': True,
@@ -152,14 +196,29 @@ def _real_extract(self, url):
  
          webpage = self._download_webpage(url, display_id)
  
+        description = clean_html(self._og_search_description(webpage, default=None))
+        if site == 'novaplus':
+            upload_date = unified_strdate(self._search_regex(
+                r'(\d{1,2}-\d{1,2}-\d{4})$', display_id, 'upload date', default=None))
+        elif site == 'fanda':
+            upload_date = unified_strdate(self._search_regex(
+                r'<span class="date_time">(\d{1,2}\.\d{1,2}\.\d{4})', webpage, 'upload date', default=None))
+        else:
+            upload_date = None
+
          # novaplus
          embed_id = self._search_regex(
              r'<iframe[^>]+\bsrc=["\'](?:https?:)?//media\.cms\.nova\.cz/embed/([^/?#&]+)',
              webpage, 'embed url', default=None)
          if embed_id:
-            return self.url_result(
-                'https://media.cms.nova.cz/embed/%s' % embed_id,
-                ie=NovaEmbedIE.ie_key(), video_id=embed_id)
+            return {
+                '_type': 'url_transparent',
+                'url': 'https://media.cms.nova.cz/embed/%s' % embed_id,
+                'ie_key': NovaEmbedIE.ie_key(),
+                'id': embed_id,
+                'description': description,
+                'upload_date': upload_date
+            }
  
          video_id = self._search_regex(
              [r"(?:media|video_id)\s*:\s*'(\d+)'",
@@ -233,18 +292,8 @@ def _real_extract(self, url):
          self._sort_formats(formats)
  
          title = mediafile.get('meta', {}).get('title') or self._og_search_title(webpage)
-        description = clean_html(self._og_search_description(webpage, default=None))
          thumbnail = config.get('poster')
  
-        if site == 'novaplus':
-            upload_date = unified_strdate(self._search_regex(
-                r'(\d{1,2}-\d{1,2}-\d{4})$', display_id, 'upload date', default=None))
-        elif site == 'fanda':
-            upload_date = unified_strdate(self._search_regex(
-                r'<span class="date_time">(\d{1,2}\.\d{1,2}\.\d{4})', webpage, 'upload date', default=None))
-        else:
-            upload_date = None
-
          return {
              'id': video_id,
              'display_id': display_id,
diff --git a/youtube_dl/extractor/nowness.py b/youtube_dlc/extractor/nowness.py

similarity index 98%

rename from youtube_dl/extractor/nowness.py

rename to youtube_dlc/extractor/nowness.py

index f26dafb8f03db4c937ace6607f2ddc795fb245de..c136bc8c0bbbdebae101aa1bfc5928435c4f45ca 100644 (file)
--- a/youtube_dl/extractor/nowness.py
+++ b/youtube_dlc/extractor/nowness.py
@@ -37,7 +37,7 @@ def _extract_url_result(self, post):
                      elif source == 'youtube':
                          return self.url_result(video_id, 'Youtube')
                      elif source == 'cinematique':
-                        # youtube-dl currently doesn't support cinematique
+                        # youtube-dlc currently doesn't support cinematique
                          # return self.url_result('http://cinematique.com/embed/%s' % video_id, 'Cinematique')
                          pass
  
diff --git a/youtube_dl/extractor/noz.py b/youtube_dlc/extractor/noz.py

similarity index 100%

rename from youtube_dl/extractor/noz.py

rename to youtube_dlc/extractor/noz.py
diff --git a/youtube_dl/extractor/npo.py b/youtube_dlc/extractor/npo.py

similarity index 100%

rename from youtube_dl/extractor/npo.py

rename to youtube_dlc/extractor/npo.py
diff --git a/youtube_dl/extractor/npr.py b/youtube_dlc/extractor/npr.py

similarity index 85%

rename from youtube_dl/extractor/npr.py

rename to youtube_dlc/extractor/npr.py

index a5e8baa7e2542f4e8d6a8c83dea7bddecf82413d..53acc6e574c0743a223d552d95d8806dc071ed41 100644 (file)
--- a/youtube_dl/extractor/npr.py
+++ b/youtube_dlc/extractor/npr.py
@@ -4,6 +4,7 @@
  from ..utils import (
      int_or_none,
      qualities,
+    url_or_none,
  )
  
  
@@ -48,6 +49,10 @@ class NprIE(InfoExtractor):
              },
          }],
          'expected_warnings': ['Failed to download m3u8 information'],
+    }, {
+        # multimedia, no formats, stream
+        'url': 'https://www.npr.org/2020/02/14/805476846/laura-stevenson-tiny-desk-concert',
+        'only_matching': True,
      }]
  
      def _real_extract(self, url):
@@ -95,6 +100,17 @@ def _real_extract(self, url):
                              'format_id': format_id,
                              'quality': quality(format_id),
                          })
+            for stream_id, stream_entry in media.get('stream', {}).items():
+                if not isinstance(stream_entry, dict):
+                    continue
+                if stream_id != 'hlsUrl':
+                    continue
+                stream_url = url_or_none(stream_entry.get('$text'))
+                if not stream_url:
+                    continue
+                formats.extend(self._extract_m3u8_formats(
+                    stream_url, stream_id, 'mp4', 'm3u8_native',
+                    m3u8_id='hls', fatal=False))
              self._sort_formats(formats)
  
              entries.append({
diff --git a/youtube_dl/extractor/nrk.py b/youtube_dlc/extractor/nrk.py

similarity index 92%

rename from youtube_dl/extractor/nrk.py

rename to youtube_dlc/extractor/nrk.py

index 60933f069c4ca4d38cfb19cad742fd2cb1d1b537..84aacbcda77e699fd9cca81663c857801f529b76 100644 (file)
--- a/youtube_dl/extractor/nrk.py
+++ b/youtube_dlc/extractor/nrk.py
@@ -11,7 +11,7 @@
  from ..utils import (
      ExtractorError,
      int_or_none,
-    JSON_LD_RE,
+    js_to_json,
      NO_DEFAULT,
      parse_age_limit,
      parse_duration,
@@ -105,6 +105,7 @@ def video_id_and_title(idx):
              MESSAGES = {
                  'ProgramRightsAreNotReady': 'Du kan dessverre ikke se eller høre programmet',
                  'ProgramRightsHasExpired': 'Programmet har gått ut',
+                'NoProgramRights': 'Ikke tilgjengelig',
                  'ProgramIsGeoBlocked': 'NRK har ikke rettigheter til å vise dette programmet utenfor Norge',
              }
              message_type = data.get('messageType', '')
@@ -255,6 +256,17 @@ class NRKTVIE(NRKBaseIE):
                      ''' % _EPISODE_RE
      _API_HOSTS = ('psapi-ne.nrk.no', 'psapi-we.nrk.no')
      _TESTS = [{
+        'url': 'https://tv.nrk.no/program/MDDP12000117',
+        'md5': '8270824df46ec629b66aeaa5796b36fb',
+        'info_dict': {
+            'id': 'MDDP12000117AA',
+            'ext': 'mp4',
+            'title': 'Alarm Trolltunga',
+            'description': 'md5:46923a6e6510eefcce23d5ef2a58f2ce',
+            'duration': 2223,
+            'age_limit': 6,
+        },
+    }, {
          'url': 'https://tv.nrk.no/serie/20-spoersmaal-tv/MUHH48000314/23-05-2014',
          'md5': '9a167e54d04671eb6317a37b7bc8a280',
          'info_dict': {
@@ -266,6 +278,7 @@ class NRKTVIE(NRKBaseIE):
              'series': '20 spørsmål',
              'episode': '23.05.2014',
          },
+        'skip': 'NoProgramRights',
      }, {
          'url': 'https://tv.nrk.no/program/mdfp15000514',
          'info_dict': {
@@ -370,7 +383,24 @@ class NRKTVIE(NRKBaseIE):
  
  class NRKTVEpisodeIE(InfoExtractor):
      _VALID_URL = r'https?://tv\.nrk\.no/serie/(?P<id>[^/]+/sesong/\d+/episode/\d+)'
-    _TEST = {
+    _TESTS = [{
+        'url': 'https://tv.nrk.no/serie/hellums-kro/sesong/1/episode/2',
+        'info_dict': {
+            'id': 'MUHH36005220BA',
+            'ext': 'mp4',
+            'title': 'Kro, krig og kjærlighet 2:6',
+            'description': 'md5:b32a7dc0b1ed27c8064f58b97bda4350',
+            'duration': 1563,
+            'series': 'Hellums kro',
+            'season_number': 1,
+            'episode_number': 2,
+            'episode': '2:6',
+            'age_limit': 6,
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }, {
          'url': 'https://tv.nrk.no/serie/backstage/sesong/1/episode/8',
          'info_dict': {
              'id': 'MSUI14000816AA',
@@ -386,20 +416,28 @@ class NRKTVEpisodeIE(InfoExtractor):
          'params': {
              'skip_download': True,
          },
-    }
+        'skip': 'ProgramRightsHasExpired',
+    }]
  
      def _real_extract(self, url):
          display_id = self._match_id(url)
  
          webpage = self._download_webpage(url, display_id)
  
-        nrk_id = self._parse_json(
-            self._search_regex(JSON_LD_RE, webpage, 'JSON-LD', group='json_ld'),
-            display_id)['@id']
-
+        info = self._search_json_ld(webpage, display_id, default={})
+        nrk_id = info.get('@id') or self._html_search_meta(
+            'nrk:program-id', webpage, default=None) or self._search_regex(
+            r'data-program-id=["\'](%s)' % NRKTVIE._EPISODE_RE, webpage,
+            'nrk id')
          assert re.match(NRKTVIE._EPISODE_RE, nrk_id)
-        return self.url_result(
-            'nrk:%s' % nrk_id, ie=NRKIE.ie_key(), video_id=nrk_id)
+
+        info.update({
+            '_type': 'url_transparent',
+            'id': nrk_id,
+            'url': 'nrk:%s' % nrk_id,
+            'ie_key': NRKIE.ie_key(),
+        })
+        return info
  
  
  class NRKTVSerieBaseIE(InfoExtractor):
@@ -409,7 +447,7 @@ def _extract_series(self, webpage, display_id, fatal=True):
                  (r'INITIAL_DATA(?:_V\d)?_*\s*=\s*({.+?})\s*;',
                   r'({.+?})\s*,\s*"[^"]+"\s*\)\s*</script>'),
                  webpage, 'config', default='{}' if not fatal else NO_DEFAULT),
-            display_id, fatal=False)
+            display_id, fatal=False, transform_source=js_to_json)
          if not config:
              return
          return try_get(
@@ -479,6 +517,14 @@ class NRKTVSeriesIE(NRKTVSerieBaseIE):
      _VALID_URL = r'https?://(?:tv|radio)\.nrk(?:super)?\.no/serie/(?P<id>[^/]+)'
      _ITEM_RE = r'(?:data-season=["\']|id=["\']season-)(?P<id>\d+)'
      _TESTS = [{
+        'url': 'https://tv.nrk.no/serie/blank',
+        'info_dict': {
+            'id': 'blank',
+            'title': 'Blank',
+            'description': 'md5:7664b4e7e77dc6810cd3bca367c25b6e',
+        },
+        'playlist_mincount': 30,
+    }, {
          # new layout, seasons
          'url': 'https://tv.nrk.no/serie/backstage',
          'info_dict': {
@@ -648,7 +694,7 @@ class NRKSkoleIE(InfoExtractor):
  
      _TESTS = [{
          'url': 'https://www.nrk.no/skole/?page=search&q=&mediaId=14099',
-        'md5': '6bc936b01f9dd8ed45bc58b252b2d9b6',
+        'md5': '18c12c3d071953c3bf8d54ef6b2587b7',
          'info_dict': {
              'id': '6021',
              'ext': 'mp4',
diff --git a/youtube_dl/extractor/nrl.py b/youtube_dlc/extractor/nrl.py

similarity index 100%

rename from youtube_dl/extractor/nrl.py

rename to youtube_dlc/extractor/nrl.py
diff --git a/youtube_dl/extractor/ntvcojp.py b/youtube_dlc/extractor/ntvcojp.py

similarity index 100%

rename from youtube_dl/extractor/ntvcojp.py

rename to youtube_dlc/extractor/ntvcojp.py
diff --git a/youtube_dl/extractor/ntvde.py b/youtube_dlc/extractor/ntvde.py

similarity index 100%

rename from youtube_dl/extractor/ntvde.py

rename to youtube_dlc/extractor/ntvde.py
diff --git a/youtube_dl/extractor/ntvru.py b/youtube_dlc/extractor/ntvru.py

similarity index 100%

rename from youtube_dl/extractor/ntvru.py

rename to youtube_dlc/extractor/ntvru.py
diff --git a/youtube_dl/extractor/nuevo.py b/youtube_dlc/extractor/nuevo.py

similarity index 100%

rename from youtube_dl/extractor/nuevo.py

rename to youtube_dlc/extractor/nuevo.py
diff --git a/youtube_dl/extractor/nuvid.py b/youtube_dlc/extractor/nuvid.py

similarity index 100%

rename from youtube_dl/extractor/nuvid.py

rename to youtube_dlc/extractor/nuvid.py
diff --git a/youtube_dl/extractor/nytimes.py b/youtube_dlc/extractor/nytimes.py

similarity index 98%

rename from youtube_dl/extractor/nytimes.py

rename to youtube_dlc/extractor/nytimes.py

index 2bb77ab249239163d8318a57e8fd0fdb57d2e32a..fc78ca56c90d37b00c1f396aee7c896d54fb91c9 100644 (file)
--- a/youtube_dl/extractor/nytimes.py
+++ b/youtube_dlc/extractor/nytimes.py
@@ -69,10 +69,10 @@ def get_file_size(file_size):
                      'width': int_or_none(video.get('width')),
                      'height': int_or_none(video.get('height')),
                      'filesize': get_file_size(video.get('file_size') or video.get('fileSize')),
-                    'tbr': int_or_none(video.get('bitrate'), 1000),
+                    'tbr': int_or_none(video.get('bitrate'), 1000) or None,
                      'ext': ext,
                  })
-        self._sort_formats(formats)
+        self._sort_formats(formats, ('height', 'width', 'filesize', 'tbr', 'fps', 'format_id'))
  
          thumbnails = []
          for image in video_data.get('images', []):
diff --git a/youtube_dl/extractor/nzz.py b/youtube_dlc/extractor/nzz.py

similarity index 100%

rename from youtube_dl/extractor/nzz.py

rename to youtube_dlc/extractor/nzz.py
diff --git a/youtube_dl/extractor/odatv.py b/youtube_dlc/extractor/odatv.py

similarity index 100%

rename from youtube_dl/extractor/odatv.py

rename to youtube_dlc/extractor/odatv.py
diff --git a/youtube_dl/extractor/odnoklassniki.py b/youtube_dlc/extractor/odnoklassniki.py

similarity index 100%

rename from youtube_dl/extractor/odnoklassniki.py

rename to youtube_dlc/extractor/odnoklassniki.py
diff --git a/youtube_dl/extractor/oktoberfesttv.py b/youtube_dlc/extractor/oktoberfesttv.py

similarity index 100%

rename from youtube_dl/extractor/oktoberfesttv.py

rename to youtube_dlc/extractor/oktoberfesttv.py
diff --git a/youtube_dl/extractor/once.py b/youtube_dlc/extractor/once.py

similarity index 100%

rename from youtube_dl/extractor/once.py

rename to youtube_dlc/extractor/once.py
diff --git a/youtube_dl/extractor/ondemandkorea.py b/youtube_dlc/extractor/ondemandkorea.py

similarity index 53%

rename from youtube_dl/extractor/ondemandkorea.py

rename to youtube_dlc/extractor/ondemandkorea.py

index df1ce3c1db1eaa22d03609ddb55748e404b1f4a9..cc3c587bc452922160f13517196e916d2dd9c846 100644 (file)
--- a/youtube_dl/extractor/ondemandkorea.py
+++ b/youtube_dlc/extractor/ondemandkorea.py
@@ -11,18 +11,34 @@
  class OnDemandKoreaIE(InfoExtractor):
      _VALID_URL = r'https?://(?:www\.)?ondemandkorea\.com/(?P<id>[^/]+)\.html'
      _GEO_COUNTRIES = ['US', 'CA']
-    _TEST = {
-        'url': 'http://www.ondemandkorea.com/ask-us-anything-e43.html',
+    _TESTS = [{
+        'url': 'https://www.ondemandkorea.com/ask-us-anything-e43.html',
          'info_dict': {
              'id': 'ask-us-anything-e43',
              'ext': 'mp4',
-            'title': 'Ask Us Anything : E43',
+            'title': 'Ask Us Anything : Gain, Ji Soo - 09/24/2016',
+            'description': 'A talk show/game show with a school theme where celebrity guests appear as “transfer students.”',
              'thumbnail': r're:^https?://.*\.jpg$',
          },
          'params': {
              'skip_download': 'm3u8 download'
          }
-    }
+    }, {
+        'url': 'https://www.ondemandkorea.com/confession-e01-1.html',
+        'info_dict': {
+            'id': 'confession-e01-1',
+            'ext': 'mp4',
+            'title': 'Confession : E01',
+            'description': 'Choi Do-hyun, a criminal attorney, is the son of a death row convict. Ever since Choi Pil-su got arrested for murder, Do-hyun has wanted to solve his ',
+            'thumbnail': r're:^https?://.*\.jpg$',
+            'subtitles': {
+                'English': 'mincount:1',
+            },
+        },
+        'params': {
+            'skip_download': 'm3u8 download'
+        }
+    }]
  
      def _real_extract(self, url):
          video_id = self._match_id(url)
@@ -44,11 +60,18 @@ def _real_extract(self, url):
                  'This video is only available to ODK PLUS members.',
                  expected=True)
  
-        title = self._og_search_title(webpage)
+        if 'ODK PREMIUM Members Only' in webpage:
+            raise ExtractorError(
+                'This video is only available to ODK PREMIUM members.',
+                expected=True)
+
+        title = self._search_regex(
+            r'class=["\']episode_title["\'][^>]*>([^<]+)',
+            webpage, 'episode_title', fatal=False) or self._og_search_title(webpage)
  
          jw_config = self._parse_json(
              self._search_regex(
-                r'(?s)jwplayer\(([\'"])(?:(?!\1).)+\1\)\.setup\s*\((?P<options>.+?)\);',
+                r'(?s)odkPlayer\.init.*?(?P<options>{[^;]+}).*?;',
                  webpage, 'jw config', group='options'),
              video_id, transform_source=js_to_json)
          info = self._parse_jwplayer_data(
@@ -57,6 +80,7 @@ def _real_extract(self, url):
  
          info.update({
              'title': title,
-            'thumbnail': self._og_search_thumbnail(webpage),
+            'description': self._og_search_description(webpage),
+            'thumbnail': self._og_search_thumbnail(webpage)
          })
          return info
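Note on the new `odkPlayer.init` pattern: a minimal sketch of how the regex from the hunk above picks out the options object. The sample page string and the helper name `find_player_options` are hypothetical, and the sample is already valid JSON, so `json.loads` stands in for the `js_to_json` transform the extractor applies.

```python
import json
import re

# Pattern copied from the hunk above; everything else here is illustrative.
ODK_CONFIG_RE = r'(?s)odkPlayer\.init.*?(?P<options>{[^;]+}).*?;'


def find_player_options(webpage):
    """Return the raw options object passed to odkPlayer.init, or None."""
    match = re.search(ODK_CONFIG_RE, webpage)
    return match.group('options') if match else None


# Hypothetical page fragment, not taken from the real site.
sample_page = (
    'odkPlayer.init("player", '
    '{"file": "https://example.com/v.m3u8", "image": "https://example.com/t.jpg"});'
)
options = json.loads(find_player_options(sample_page))
print(options['file'])
```

Because the group is `{[^;]+}`, an options object that itself contains a semicolon would not be captured by this pattern.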
diff --git a/youtube_dl/extractor/onet.py b/youtube_dlc/extractor/onet.py

similarity index 100%

rename from youtube_dl/extractor/onet.py

rename to youtube_dlc/extractor/onet.py
diff --git a/youtube_dl/extractor/onionstudios.py b/youtube_dlc/extractor/onionstudios.py

similarity index 100%

rename from youtube_dl/extractor/onionstudios.py

rename to youtube_dlc/extractor/onionstudios.py
diff --git a/youtube_dl/extractor/ooyala.py b/youtube_dlc/extractor/ooyala.py

similarity index 100%

rename from youtube_dl/extractor/ooyala.py

rename to youtube_dlc/extractor/ooyala.py
diff --git a/youtube_dl/extractor/openload.py b/youtube_dlc/extractor/openload.py

similarity index 100%

rename from youtube_dl/extractor/openload.py

rename to youtube_dlc/extractor/openload.py
diff --git a/youtube_dl/extractor/ora.py b/youtube_dlc/extractor/ora.py

similarity index 100%

rename from youtube_dl/extractor/ora.py

rename to youtube_dlc/extractor/ora.py
diff --git a/youtube_dl/extractor/orf.py b/youtube_dlc/extractor/orf.py

similarity index 74%

rename from youtube_dl/extractor/orf.py

rename to youtube_dlc/extractor/orf.py

index 3425f76024c04cdb937502361480a81e0b958c16..700ce448c4b8faa925aa1dae179f030acaa0b4f6 100644 (file)
--- a/youtube_dl/extractor/orf.py
+++ b/youtube_dlc/extractor/orf.py
@@ -6,12 +6,14 @@
  from .common import InfoExtractor
  from ..compat import compat_str
  from ..utils import (
+    clean_html,
      determine_ext,
      float_or_none,
      HEADRequest,
      int_or_none,
      orderedSet,
      remove_end,
+    str_or_none,
      strip_jsonp,
      unescapeHTML,
      unified_strdate,
@@ -88,8 +90,11 @@ def _real_extract(self, url):
                  format_id = '-'.join(format_id_list)
                  ext = determine_ext(src)
                  if ext == 'm3u8':
-                    formats.extend(self._extract_m3u8_formats(
-                        src, video_id, 'mp4', m3u8_id=format_id, fatal=False))
+                    m3u8_formats = self._extract_m3u8_formats(
+                        src, video_id, 'mp4', m3u8_id=format_id, fatal=False)
+                    if any('/geoprotection' in f['url'] for f in m3u8_formats):
+                        self.raise_geo_restricted()
+                    formats.extend(m3u8_formats)
                  elif ext == 'f4m':
                      formats.extend(self._extract_f4m_formats(
                          src, video_id, f4m_id=format_id, fatal=False))
@@ -157,48 +162,53 @@ def _real_extract(self, url):
  class ORFRadioIE(InfoExtractor):
      def _real_extract(self, url):
          mobj = re.match(self._VALID_URL, url)
-        station = mobj.group('station')
          show_date = mobj.group('date')
          show_id = mobj.group('show')
  
-        if station == 'fm4':
-            show_id = '4%s' % show_id
-
          data = self._download_json(
-            'http://audioapi.orf.at/%s/api/json/current/broadcast/%s/%s' % (station, show_id, show_date),
-            show_id
-        )
-
-        def extract_entry_dict(info, title, subtitle):
-            return {
-                'id': info['loopStreamId'].replace('.mp3', ''),
-                'url': 'http://loopstream01.apa.at/?channel=%s&id=%s' % (station, info['loopStreamId']),
+            'http://audioapi.orf.at/%s/api/json/current/broadcast/%s/%s'
+            % (self._API_STATION, show_id, show_date), show_id)
+
+        entries = []
+        for info in data['streams']:
+            loop_stream_id = str_or_none(info.get('loopStreamId'))
+            if not loop_stream_id:
+                continue
+            title = str_or_none(data.get('title'))
+            if not title:
+                continue
+            start = int_or_none(info.get('start'), scale=1000)
+            end = int_or_none(info.get('end'), scale=1000)
+            duration = end - start if end and start else None
+            entries.append({
+                'id': loop_stream_id.replace('.mp3', ''),
+                'url': 'http://loopstream01.apa.at/?channel=%s&id=%s' % (self._LOOP_STATION, loop_stream_id),
                  'title': title,
-                'description': subtitle,
-                'duration': (info['end'] - info['start']) / 1000,
-                'timestamp': info['start'] / 1000,
+                'description': clean_html(data.get('subtitle')),
+                'duration': duration,
+                'timestamp': start,
                  'ext': 'mp3',
-                'series': data.get('programTitle')
-            }
-
-        entries = [extract_entry_dict(t, data['title'], data['subtitle']) for t in data['streams']]
+                'series': data.get('programTitle'),
+            })
  
          return {
              '_type': 'playlist',
              'id': show_id,
-            'title': data['title'],
-            'description': data['subtitle'],
-            'entries': entries
+            'title': data.get('title'),
+            'description': clean_html(data.get('subtitle')),
+            'entries': entries,
          }
  
  
  class ORFFM4IE(ORFRadioIE):
      IE_NAME = 'orf:fm4'
      IE_DESC = 'radio FM4'
-    _VALID_URL = r'https?://(?P<station>fm4)\.orf\.at/player/(?P<date>[0-9]+)/(?P<show>\w+)'
+    _VALID_URL = r'https?://(?P<station>fm4)\.orf\.at/player/(?P<date>[0-9]+)/(?P<show>4\w+)'
+    _API_STATION = 'fm4'
+    _LOOP_STATION = 'fm4'
  
      _TEST = {
-        'url': 'http://fm4.orf.at/player/20170107/CC',
+        'url': 'http://fm4.orf.at/player/20170107/4CC',
          'md5': '2b0be47375432a7ef104453432a19212',
          'info_dict': {
              'id': '2017-01-07_2100_tl_54_7DaysSat18_31295',
@@ -209,7 +219,138 @@ class ORFFM4IE(ORFRadioIE):
              'timestamp': 1483819257,
              'upload_date': '20170107',
          },
-        'skip': 'Shows from ORF radios are only available for 7 days.'
+        'skip': 'Shows from ORF radios are only available for 7 days.',
+        'only_matching': True,
+    }
+
+
+class ORFNOEIE(ORFRadioIE):
+    IE_NAME = 'orf:noe'
+    IE_DESC = 'Radio Niederösterreich'
+    _VALID_URL = r'https?://(?P<station>noe)\.orf\.at/player/(?P<date>[0-9]+)/(?P<show>\w+)'
+    _API_STATION = 'noe'
+    _LOOP_STATION = 'oe2n'
+
+    _TEST = {
+        'url': 'https://noe.orf.at/player/20200423/NGM',
+        'only_matching': True,
+    }
+
+
+class ORFWIEIE(ORFRadioIE):
+    IE_NAME = 'orf:wien'
+    IE_DESC = 'Radio Wien'
+    _VALID_URL = r'https?://(?P<station>wien)\.orf\.at/player/(?P<date>[0-9]+)/(?P<show>\w+)'
+    _API_STATION = 'wie'
+    _LOOP_STATION = 'oe2w'
+
+    _TEST = {
+        'url': 'https://wien.orf.at/player/20200423/WGUM',
+        'only_matching': True,
+    }
+
+
+class ORFBGLIE(ORFRadioIE):
+    IE_NAME = 'orf:burgenland'
+    IE_DESC = 'Radio Burgenland'
+    _VALID_URL = r'https?://(?P<station>burgenland)\.orf\.at/player/(?P<date>[0-9]+)/(?P<show>\w+)'
+    _API_STATION = 'bgl'
+    _LOOP_STATION = 'oe2b'
+
+    _TEST = {
+        'url': 'https://burgenland.orf.at/player/20200423/BGM',
+        'only_matching': True,
+    }
+
+
+class ORFOOEIE(ORFRadioIE):
+    IE_NAME = 'orf:oberoesterreich'
+    IE_DESC = 'Radio Oberösterreich'
+    _VALID_URL = r'https?://(?P<station>ooe)\.orf\.at/player/(?P<date>[0-9]+)/(?P<show>\w+)'
+    _API_STATION = 'ooe'
+    _LOOP_STATION = 'oe2o'
+
+    _TEST = {
+        'url': 'https://ooe.orf.at/player/20200423/OGMO',
+        'only_matching': True,
+    }
+
+
+class ORFSTMIE(ORFRadioIE):
+    IE_NAME = 'orf:steiermark'
+    IE_DESC = 'Radio Steiermark'
+    _VALID_URL = r'https?://(?P<station>steiermark)\.orf\.at/player/(?P<date>[0-9]+)/(?P<show>\w+)'
+    _API_STATION = 'stm'
+    _LOOP_STATION = 'oe2st'
+
+    _TEST = {
+        'url': 'https://steiermark.orf.at/player/20200423/STGMS',
+        'only_matching': True,
+    }
+
+
+class ORFKTNIE(ORFRadioIE):
+    IE_NAME = 'orf:kaernten'
+    IE_DESC = 'Radio Kärnten'
+    _VALID_URL = r'https?://(?P<station>kaernten)\.orf\.at/player/(?P<date>[0-9]+)/(?P<show>\w+)'
+    _API_STATION = 'ktn'
+    _LOOP_STATION = 'oe2k'
+
+    _TEST = {
+        'url': 'https://kaernten.orf.at/player/20200423/KGUMO',
+        'only_matching': True,
+    }
+
+
+class ORFSBGIE(ORFRadioIE):
+    IE_NAME = 'orf:salzburg'
+    IE_DESC = 'Radio Salzburg'
+    _VALID_URL = r'https?://(?P<station>salzburg)\.orf\.at/player/(?P<date>[0-9]+)/(?P<show>\w+)'
+    _API_STATION = 'sbg'
+    _LOOP_STATION = 'oe2s'
+
+    _TEST = {
+        'url': 'https://salzburg.orf.at/player/20200423/SGUM',
+        'only_matching': True,
+    }
+
+
+class ORFTIRIE(ORFRadioIE):
+    IE_NAME = 'orf:tirol'
+    IE_DESC = 'Radio Tirol'
+    _VALID_URL = r'https?://(?P<station>tirol)\.orf\.at/player/(?P<date>[0-9]+)/(?P<show>\w+)'
+    _API_STATION = 'tir'
+    _LOOP_STATION = 'oe2t'
+
+    _TEST = {
+        'url': 'https://tirol.orf.at/player/20200423/TGUMO',
+        'only_matching': True,
+    }
+
+
+class ORFVBGIE(ORFRadioIE):
+    IE_NAME = 'orf:vorarlberg'
+    IE_DESC = 'Radio Vorarlberg'
+    _VALID_URL = r'https?://(?P<station>vorarlberg)\.orf\.at/player/(?P<date>[0-9]+)/(?P<show>\w+)'
+    _API_STATION = 'vbg'
+    _LOOP_STATION = 'oe2v'
+
+    _TEST = {
+        'url': 'https://vorarlberg.orf.at/player/20200423/VGUM',
+        'only_matching': True,
+    }
+
+
+class ORFOE3IE(ORFRadioIE):
+    IE_NAME = 'orf:oe3'
+    IE_DESC = 'Radio Österreich 3'
+    _VALID_URL = r'https?://(?P<station>oe3)\.orf\.at/player/(?P<date>[0-9]+)/(?P<show>\w+)'
+    _API_STATION = 'oe3'
+    _LOOP_STATION = 'oe3'
+
+    _TEST = {
+        'url': 'https://oe3.orf.at/player/20200424/3WEK',
+        'only_matching': True,
      }
  
  
@@ -217,6 +358,8 @@ class ORFOE1IE(ORFRadioIE):
      IE_NAME = 'orf:oe1'
      IE_DESC = 'Radio Österreich 1'
      _VALID_URL = r'https?://(?P<station>oe1)\.orf\.at/player/(?P<date>[0-9]+)/(?P<show>\w+)'
+    _API_STATION = 'oe1'
+    _LOOP_STATION = 'oe1'
  
      _TEST = {
          'url': 'http://oe1.orf.at/player/20170108/456544',
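Note on the reworked `ORFRadioIE` loop: a stdlib-only sketch of the entry building, assuming the API response shape implied by the hunk above (`title`, `streams`, `loopStreamId`, `start` and `end` in milliseconds). `to_seconds` is a simplified stand-in for `int_or_none(value, scale=1000)`, and `build_entries` and the sample data are hypothetical.

```python
def to_seconds(value_ms):
    """Simplified stand-in for int_or_none(value, scale=1000)."""
    try:
        return int(value_ms) // 1000
    except (TypeError, ValueError):
        return None


def build_entries(data, loop_station):
    """Build one entry per stream, skipping streams without an id or title."""
    entries = []
    title = data.get('title')
    for info in data.get('streams') or []:
        loop_stream_id = info.get('loopStreamId')
        if not loop_stream_id or not title:
            continue
        start = to_seconds(info.get('start'))
        end = to_seconds(info.get('end'))
        entries.append({
            'id': loop_stream_id.replace('.mp3', ''),
            'url': 'http://loopstream01.apa.at/?channel=%s&id=%s'
                   % (loop_station, loop_stream_id),
            'title': title,
            # Duration is only known when both bounds are present.
            'duration': end - start if end and start else None,
            'timestamp': start,
        })
    return entries


# Hypothetical response: the second stream has no loopStreamId and is skipped.
sample = {
    'title': 'Example Show',
    'streams': [
        {'loopStreamId': 'show_1.mp3', 'start': 1483819257000, 'end': 1483819317000},
        {'start': 1483819317000},
    ],
}
entries = build_entries(sample, 'fm4')
print(entries[0]['id'], entries[0]['duration'])
```

Passing the loop station as an argument mirrors the `_LOOP_STATION` class attribute, which differs from `_API_STATION` for the regional stations.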
diff --git a/youtube_dl/extractor/outsidetv.py b/youtube_dlc/extractor/outsidetv.py

similarity index 100%

rename from youtube_dl/extractor/outsidetv.py

rename to youtube_dlc/extractor/outsidetv.py
diff --git a/youtube_dl/extractor/packtpub.py b/youtube_dlc/extractor/packtpub.py

similarity index 100%

rename from youtube_dl/extractor/packtpub.py

rename to youtube_dlc/extractor/packtpub.py
diff --git a/youtube_dl/extractor/pandoratv.py b/youtube_dlc/extractor/pandoratv.py

similarity index 100%

rename from youtube_dl/extractor/pandoratv.py

rename to youtube_dlc/extractor/pandoratv.py
diff --git a/youtube_dl/extractor/parliamentliveuk.py b/youtube_dlc/extractor/parliamentliveuk.py

similarity index 100%

rename from youtube_dl/extractor/parliamentliveuk.py

rename to youtube_dlc/extractor/parliamentliveuk.py
diff --git a/youtube_dl/extractor/patreon.py b/youtube_dlc/extractor/patreon.py

similarity index 100%

rename from youtube_dl/extractor/patreon.py

rename to youtube_dlc/extractor/patreon.py
diff --git a/youtube_dl/extractor/pbs.py b/youtube_dlc/extractor/pbs.py

similarity index 100%

rename from youtube_dl/extractor/pbs.py

rename to youtube_dlc/extractor/pbs.py
diff --git a/youtube_dl/extractor/pearvideo.py b/youtube_dlc/extractor/pearvideo.py

similarity index 100%

rename from youtube_dl/extractor/pearvideo.py

rename to youtube_dlc/extractor/pearvideo.py
diff --git a/youtube_dl/extractor/peertube.py b/youtube_dlc/extractor/peertube.py

similarity index 87%

rename from youtube_dl/extractor/peertube.py

rename to youtube_dlc/extractor/peertube.py

index d3a83ea2bb5215e34e72ad85bf99867697e2e1b2..48fb9541693c35878317f22ed9dd6e2da4412ced 100644 (file)
--- a/youtube_dl/extractor/peertube.py
+++ b/youtube_dlc/extractor/peertube.py
@@ -8,6 +8,7 @@
  from ..utils import (
      int_or_none,
      parse_resolution,
+    str_or_none,
      try_get,
      unified_timestamp,
      url_or_none,
@@ -415,6 +416,7 @@ class PeerTubeIE(InfoExtractor):
                              peertube\.cpy\.re
                          )'''
      _UUID_RE = r'[\da-fA-F]{8}-[\da-fA-F]{4}-[\da-fA-F]{4}-[\da-fA-F]{4}-[\da-fA-F]{12}'
+    _API_BASE = 'https://%s/api/v1/videos/%s/%s'
      _VALID_URL = r'''(?x)
                      (?:
                          peertube:(?P<host>[^:]+):|
@@ -423,26 +425,30 @@ class PeerTubeIE(InfoExtractor):
                      (?P<id>%s)
                      ''' % (_INSTANCES_RE, _UUID_RE)
      _TESTS = [{
-        'url': 'https://peertube.cpy.re/videos/watch/2790feb0-8120-4e63-9af3-c943c69f5e6c',
-        'md5': '80f24ff364cc9d333529506a263e7feb',
+        'url': 'https://framatube.org/videos/watch/9c9de5e8-0a1e-484a-b099-e80766180a6d',
+        'md5': '9bed8c0137913e17b86334e5885aacff',
          'info_dict': {
-            'id': '2790feb0-8120-4e63-9af3-c943c69f5e6c',
+            'id': '9c9de5e8-0a1e-484a-b099-e80766180a6d',
              'ext': 'mp4',
-            'title': 'wow',
-            'description': 'wow such video, so gif',
+            'title': 'What is PeerTube?',
+            'description': 'md5:3fefb8dde2b189186ce0719fda6f7b10',
              'thumbnail': r're:https?://.*\.(?:jpg|png)',
-            'timestamp': 1519297480,
-            'upload_date': '20180222',
-            'uploader': 'Luclu7',
-            'uploader_id': '7fc42640-efdb-4505-a45d-a15b1a5496f1',
-            'uploder_url': 'https://peertube.nsa.ovh/accounts/luclu7',
-            'license': 'Unknown',
-            'duration': 3,
+            'timestamp': 1538391166,
+            'upload_date': '20181001',
+            'uploader': 'Framasoft',
+            'uploader_id': '3',
+            'uploader_url': 'https://framatube.org/accounts/framasoft',
+            'channel': 'Les vidéos de Framasoft',
+            'channel_id': '2',
+            'channel_url': 'https://framatube.org/video-channels/bf54d359-cfad-4935-9d45-9d6be93f63e8',
+            'language': 'en',
+            'license': 'Attribution - Share Alike',
+            'duration': 113,
              'view_count': int,
              'like_count': int,
              'dislike_count': int,
-            'tags': list,
-            'categories': list,
+            'tags': ['framasoft', 'peertube'],
+            'categories': ['Science & Technology'],
          }
      }, {
          'url': 'https://peertube.tamanoir.foucry.net/videos/watch/0b04f13d-1e18-4f1d-814e-4979aa7c9c44',
@@ -484,13 +490,38 @@ def _extract_urls(webpage, source_url):
                  entries = [peertube_url]
          return entries
  
+    def _call_api(self, host, video_id, path, note=None, errnote=None, fatal=True):
+        return self._download_json(
+            self._API_BASE % (host, video_id, path), video_id,
+            note=note, errnote=errnote, fatal=fatal)
+
+    def _get_subtitles(self, host, video_id):
+        captions = self._call_api(
+            host, video_id, 'captions', note='Downloading captions JSON',
+            fatal=False)
+        if not isinstance(captions, dict):
+            return
+        data = captions.get('data')
+        if not isinstance(data, list):
+            return
+        subtitles = {}
+        for e in data:
+            language_id = try_get(e, lambda x: x['language']['id'], compat_str)
+            caption_url = urljoin('https://%s' % host, e.get('captionPath'))
+            if not caption_url:
+                continue
+            subtitles.setdefault(language_id or 'en', []).append({
+                'url': caption_url,
+            })
+        return subtitles
+
      def _real_extract(self, url):
          mobj = re.match(self._VALID_URL, url)
          host = mobj.group('host') or mobj.group('host_2')
          video_id = mobj.group('id')
  
-        video = self._download_json(
-            'https://%s/api/v1/videos/%s' % (host, video_id), video_id)
+        video = self._call_api(
+            host, video_id, '', note='Downloading video JSON')
  
          title = video['name']
  
@@ -513,10 +544,28 @@ def _real_extract(self, url):
              formats.append(f)
          self._sort_formats(formats)
  
-        def account_data(field):
-            return try_get(video, lambda x: x['account'][field], compat_str)
+        full_description = self._call_api(
+            host, video_id, 'description', note='Downloading description JSON',
+            fatal=False)
+
+        description = None
+        if isinstance(full_description, dict):
+            description = str_or_none(full_description.get('description'))
+        if not description:
+            description = video.get('description')
+
+        subtitles = self.extract_subtitles(host, video_id)
+
+        def data(section, field, type_):
+            return try_get(video, lambda x: x[section][field], type_)
+
+        def account_data(field, type_):
+            return data('account', field, type_)
+
+        def channel_data(field, type_):
+            return data('channel', field, type_)
  
-        category = try_get(video, lambda x: x['category']['label'], compat_str)
+        category = data('category', 'label', compat_str)
          categories = [category] if category else None
  
          nsfw = video.get('nsfw')
@@ -528,14 +577,17 @@ def account_data(field):
          return {
              'id': video_id,
              'title': title,
-            'description': video.get('description'),
+            'description': description,
              'thumbnail': urljoin(url, video.get('thumbnailPath')),
              'timestamp': unified_timestamp(video.get('publishedAt')),
-            'uploader': account_data('displayName'),
-            'uploader_id': account_data('uuid'),
-            'uploder_url': account_data('url'),
-            'license': try_get(
-                video, lambda x: x['licence']['label'], compat_str),
+            'uploader': account_data('displayName', compat_str),
+            'uploader_id': str_or_none(account_data('id', int)),
+            'uploader_url': url_or_none(account_data('url', compat_str)),
+            'channel': channel_data('displayName', compat_str),
+            'channel_id': str_or_none(channel_data('id', int)),
+            'channel_url': url_or_none(channel_data('url', compat_str)),
+            'language': data('language', 'id', compat_str),
+            'license': data('licence', 'label', compat_str),
              'duration': int_or_none(video.get('duration')),
              'view_count': int_or_none(video.get('views')),
              'like_count': int_or_none(video.get('likes')),
@@ -544,4 +596,5 @@ def account_data(field):
              'tags': try_get(video, lambda x: x['tags'], list),
              'categories': categories,
              'formats': formats,
+            'subtitles': subtitles
          }
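Note on `_get_subtitles`: a stdlib-only sketch (Python 3) of how the captions response is grouped by language, assuming the response shape used in the hunk above (`data`, `language.id`, `captionPath`). `build_subtitles` and the sample payload are hypothetical. Unlike the extractor method, which returns `None` for an unusable response, this sketch returns an empty dict.

```python
from urllib.parse import urljoin


def build_subtitles(host, captions):
    """Group caption URLs by language id, defaulting to 'en'."""
    if not isinstance(captions, dict):
        return {}
    data = captions.get('data')
    if not isinstance(data, list):
        return {}
    subtitles = {}
    for entry in data:
        if not isinstance(entry, dict):
            continue
        path = entry.get('captionPath')
        # Skip entries without a usable path instead of joining against None.
        if not isinstance(path, str) or not path:
            continue
        language = entry.get('language')
        language_id = language.get('id') if isinstance(language, dict) else None
        if not isinstance(language_id, str):
            language_id = None
        subtitles.setdefault(language_id or 'en', []).append({
            'url': urljoin('https://%s' % host, path),
        })
    return subtitles


# Hypothetical payload: one French caption, one without a language id,
# and one without a path (dropped).
sample = {'data': [
    {'language': {'id': 'fr'}, 'captionPath': '/static/video-captions/x-fr.vtt'},
    {'language': {}, 'captionPath': '/static/video-captions/x.vtt'},
    {'language': {'id': 'de'}},
]}
subs = build_subtitles('framatube.org', sample)
print(sorted(subs))
```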
diff --git a/youtube_dl/extractor/people.py b/youtube_dlc/extractor/people.py

similarity index 100%

rename from youtube_dl/extractor/people.py

rename to youtube_dlc/extractor/people.py
diff --git a/youtube_dl/extractor/performgroup.py b/youtube_dlc/extractor/performgroup.py

similarity index 100%

rename from youtube_dl/extractor/performgroup.py

rename to youtube_dlc/extractor/performgroup.py
diff --git a/youtube_dl/extractor/periscope.py b/youtube_dlc/extractor/periscope.py

similarity index 99%

rename from youtube_dl/extractor/periscope.py

rename to youtube_dlc/extractor/periscope.py

index c02e34abac8720361f94b7085e5f7fb3814df312..b15906390d07715494b5653dce5499ca0ad72141 100644 (file)
--- a/youtube_dl/extractor/periscope.py
+++ b/youtube_dlc/extractor/periscope.py
@@ -18,7 +18,7 @@ def _call_api(self, method, query, item_id):
              item_id, query=query)
  
      def _parse_broadcast_data(self, broadcast, video_id):
-        title = broadcast['status']
+        title = broadcast.get('status') or 'Periscope Broadcast'
          uploader = broadcast.get('user_display_name') or broadcast.get('username')
          title = '%s - %s' % (uploader, title) if uploader else title
          is_live = broadcast.get('state').lower() == 'running'
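Note on the title fallback: a small sketch of the title logic after this change, using plain dicts. The function name `broadcast_title` and the sample broadcasts are hypothetical.

```python
def broadcast_title(broadcast):
    """Compose the title, tolerating a missing or empty 'status' field."""
    title = broadcast.get('status') or 'Periscope Broadcast'
    uploader = broadcast.get('user_display_name') or broadcast.get('username')
    return '%s - %s' % (uploader, title) if uploader else title


# A broadcast without 'status' no longer raises KeyError.
print(broadcast_title({'username': 'someone'}))
```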
diff --git a/youtube_dl/extractor/philharmoniedeparis.py b/youtube_dlc/extractor/philharmoniedeparis.py

similarity index 100%

rename from youtube_dl/extractor/philharmoniedeparis.py

rename to youtube_dlc/extractor/philharmoniedeparis.py
diff --git a/youtube_dlc/extractor/phoenix.py b/youtube_dlc/extractor/phoenix.py

new file mode 100644 (file)

index 0000000..8d52ad3
--- /dev/null
+++ b/youtube_dlc/extractor/phoenix.py
@@ -0,0 +1,52 @@
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import ExtractorError
+
+
+class PhoenixIE(InfoExtractor):
+    IE_NAME = 'phoenix.de'
+    _VALID_URL = r'''https?://(?:www\.)?phoenix\.de/\D+(?P<id>\d+)\.html'''
+    _TESTS = [
+        {
+            'url': 'https://www.phoenix.de/sendungen/dokumentationen/unsere-welt-in-zukunft---stadt-a-1283620.html',
+            'md5': '5e765e838aa3531c745a4f5b249ee3e3',
+            'info_dict': {
+                'id': '0OB4HFc43Ns',
+                'ext': 'mp4',
+                'title': 'Unsere Welt in Zukunft - Stadt',
+                'description': 'md5:9bfb6fd498814538f953b2dcad7ce044',
+                'upload_date': '20190912',
+                'uploader': 'phoenix',
+                'uploader_id': 'phoenix',
+            }
+        },
+        {
+            'url': 'https://www.phoenix.de/drohnenangriffe-in-saudi-arabien-a-1286995.html?ref=aktuelles',
+            'only_matching': True,
+        },
+        # an older page: https://www.phoenix.de/sendungen/gespraeche/phoenix-persoenlich/im-dialog-a-177727.html
+        # seems to not have an embedded video, even though it's uploaded on youtube: https://www.youtube.com/watch?v=4GxnoUHvOkM
+    ]
+
+    def extract_from_json_api(self, video_id, api_url):
+        doc = self._download_json(
+            api_url, video_id,
+            note="Downloading webpage metadata",
+            errnote="Failed to load webpage metadata")
+
+        for a in doc["absaetze"]:
+            if a["typ"] == "video-youtube":
+                return {
+                    '_type': 'url_transparent',
+                    'id': a["id"],
+                    'title': doc["titel"],
+                    'url': "https://www.youtube.com/watch?v=%s" % a["id"],
+                    'ie_key': 'Youtube',
+                }
+        raise ExtractorError("No downloadable video found", expected=True)
+
+    def _real_extract(self, url):
+        page_id = self._match_id(url)
+        api_url = 'https://www.phoenix.de/response/id/%s' % page_id
+        return self.extract_from_json_api(page_id, api_url)
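Note on the page id and API URL: a sketch of how the numeric page id is taken from the URL and turned into the metadata endpoint. The sketch uses the pattern with the dot in `phoenix\.de` escaped, so that only the literal host matches. The helper names are hypothetical; the two URLs come from the tests in this file.

```python
import re

# URL pattern with the dot in the host name escaped.
VALID_URL = r'https?://(?:www\.)?phoenix\.de/\D+(?P<id>\d+)\.html'
API_URL = 'https://www.phoenix.de/response/id/%s'


def match_page_id(url):
    """Return the numeric page id, or None if the URL does not match."""
    match = re.match(VALID_URL, url)
    return match.group('id') if match else None


def api_url_for(url):
    """Return the JSON metadata URL for a page URL, or None."""
    page_id = match_page_id(url)
    return API_URL % page_id if page_id else None


print(api_url_for(
    'https://www.phoenix.de/sendungen/dokumentationen/'
    'unsere-welt-in-zukunft---stadt-a-1283620.html'))
```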
diff --git a/youtube_dl/extractor/photobucket.py b/youtube_dlc/extractor/photobucket.py

similarity index 100%

rename from youtube_dl/extractor/photobucket.py

rename to youtube_dlc/extractor/photobucket.py
diff --git a/youtube_dl/extractor/picarto.py b/youtube_dlc/extractor/picarto.py

similarity index 100%

rename from youtube_dl/extractor/picarto.py

rename to youtube_dlc/extractor/picarto.py
diff --git a/youtube_dl/extractor/piksel.py b/youtube_dlc/extractor/piksel.py

similarity index 100%

rename from youtube_dl/extractor/piksel.py

rename to youtube_dlc/extractor/piksel.py
diff --git a/youtube_dl/extractor/pinkbike.py b/youtube_dlc/extractor/pinkbike.py

similarity index 100%

rename from youtube_dl/extractor/pinkbike.py

rename to youtube_dlc/extractor/pinkbike.py
diff --git a/youtube_dl/extractor/pladform.py b/youtube_dlc/extractor/pladform.py

similarity index 100%

rename from youtube_dl/extractor/pladform.py

rename to youtube_dlc/extractor/pladform.py
diff --git a/youtube_dl/extractor/platzi.py b/youtube_dlc/extractor/platzi.py

similarity index 99%

rename from youtube_dl/extractor/platzi.py

rename to youtube_dlc/extractor/platzi.py

index 602207bebdd6a01d7f33dbf08302ab5a75ccf207..23c8256b59dab4a92ae79ef48dc8e3b0adf0ff68 100644 (file)
--- a/youtube_dl/extractor/platzi.py
+++ b/youtube_dlc/extractor/platzi.py
@@ -46,7 +46,7 @@ def _login(self):
              headers={'Referer': self._LOGIN_URL})
  
          # login succeeded
-        if 'platzi.com/login' not in compat_str(urlh.geturl()):
+        if 'platzi.com/login' not in urlh.geturl():
              return
  
          login_error = self._webpage_read_content(
diff --git a/youtube_dl/extractor/playfm.py b/youtube_dlc/extractor/playfm.py

similarity index 100%

rename from youtube_dl/extractor/playfm.py

rename to youtube_dlc/extractor/playfm.py
diff --git a/youtube_dl/extractor/playplustv.py b/youtube_dlc/extractor/playplustv.py

similarity index 100%

rename from youtube_dl/extractor/playplustv.py

rename to youtube_dlc/extractor/playplustv.py
diff --git a/youtube_dl/extractor/plays.py b/youtube_dlc/extractor/plays.py

similarity index 100%

rename from youtube_dl/extractor/plays.py

rename to youtube_dlc/extractor/plays.py
diff --git a/youtube_dl/extractor/playtvak.py b/youtube_dlc/extractor/playtvak.py

similarity index 100%

rename from youtube_dl/extractor/playtvak.py

rename to youtube_dlc/extractor/playtvak.py
diff --git a/youtube_dl/extractor/playvid.py b/youtube_dlc/extractor/playvid.py

similarity index 100%

rename from youtube_dl/extractor/playvid.py

rename to youtube_dlc/extractor/playvid.py
diff --git a/youtube_dl/extractor/playwire.py b/youtube_dlc/extractor/playwire.py

similarity index 100%

rename from youtube_dl/extractor/playwire.py

rename to youtube_dlc/extractor/playwire.py
diff --git a/youtube_dl/extractor/pluralsight.py b/youtube_dlc/extractor/pluralsight.py

similarity index 100%

rename from youtube_dl/extractor/pluralsight.py

rename to youtube_dlc/extractor/pluralsight.py
diff --git a/youtube_dl/extractor/podomatic.py b/youtube_dlc/extractor/podomatic.py

similarity index 100%

rename from youtube_dl/extractor/podomatic.py

rename to youtube_dlc/extractor/podomatic.py
diff --git a/youtube_dlc/extractor/pokemon.py b/youtube_dlc/extractor/pokemon.py

new file mode 100644 (file)

index 0000000..14ee1a7
--- /dev/null
+++ b/youtube_dlc/extractor/pokemon.py
@@ -0,0 +1,138 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..utils import (
+    ExtractorError,
+    extract_attributes,
+    int_or_none,
+    js_to_json,
+    merge_dicts,
+)
+
+
+class PokemonIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?pokemon\.com/[a-z]{2}(?:.*?play=(?P<id>[a-z0-9]{32})|/(?:[^/]+/)+(?P<display_id>[^/?#&]+))'
+    _TESTS = [{
+        'url': 'https://www.pokemon.com/us/pokemon-episodes/20_30-the-ol-raise-and-switch/',
+        'md5': '2fe8eaec69768b25ef898cda9c43062e',
+        'info_dict': {
+            'id': 'afe22e30f01c41f49d4f1d9eab5cd9a4',
+            'ext': 'mp4',
+            'title': 'The Ol’ Raise and Switch!',
+            'description': 'md5:7db77f7107f98ba88401d3adc80ff7af',
+        },
+        'add_id': ['LimelightMedia'],
+    }, {
+        # no data-video-title
+        'url': 'https://www.pokemon.com/fr/episodes-pokemon/films-pokemon/pokemon-lascension-de-darkrai-2008',
+        'info_dict': {
+            'id': 'dfbaf830d7e54e179837c50c0c6cc0e1',
+            'ext': 'mp4',
+            'title': "Pokémon : L'ascension de Darkrai",
+            'description': 'md5:d1dbc9e206070c3e14a06ff557659fb5',
+        },
+        'add_id': ['LimelightMedia'],
+        'params': {
+            'skip_download': True,
+        },
+    }, {
+        'url': 'http://www.pokemon.com/uk/pokemon-episodes/?play=2e8b5c761f1d4a9286165d7748c1ece2',
+        'only_matching': True,
+    }, {
+        'url': 'http://www.pokemon.com/fr/episodes-pokemon/18_09-un-hiver-inattendu/',
+        'only_matching': True,
+    }, {
+        'url': 'http://www.pokemon.com/de/pokemon-folgen/01_20-bye-bye-smettbo/',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        video_id, display_id = re.match(self._VALID_URL, url).groups()
+        webpage = self._download_webpage(url, video_id or display_id)
+        video_data = extract_attributes(self._search_regex(
+            r'(<[^>]+data-video-id="%s"[^>]*>)' % (video_id if video_id else '[a-z0-9]{32}'),
+            webpage, 'video data element'))
+        video_id = video_data['data-video-id']
+        title = video_data.get('data-video-title') or self._html_search_meta(
+            'pkm-title', webpage, 'title', default=None) or self._search_regex(
+            r'<h1[^>]+\bclass=["\']us-title[^>]+>([^<]+)', webpage, 'title')
+        return {
+            '_type': 'url_transparent',
+            'id': video_id,
+            'url': 'limelight:media:%s' % video_id,
+            'title': title,
+            'description': video_data.get('data-video-summary'),
+            'thumbnail': video_data.get('data-video-poster'),
+            'series': 'Pokémon',
+            'season_number': int_or_none(video_data.get('data-video-season')),
+            'episode': title,
+            'episode_number': int_or_none(video_data.get('data-video-episode')),
+            'ie_key': 'LimelightMedia',
+        }
+
+
+class PokemonWatchIE(InfoExtractor):
+    _VALID_URL = r'https?://watch\.pokemon\.com/[a-z]{2}-[a-z]{2}/player\.html\?id=(?P<id>[a-z0-9]{32})'
+    _API_URL = 'https://www.pokemon.com/api/pokemontv/v2/channels/{0:}'
+    _TESTS = [{
+        'url': 'https://watch.pokemon.com/en-us/player.html?id=8309a40969894a8e8d5bc1311e9c5667',
+        'md5': '62833938a31e61ab49ada92f524c42ff',
+        'info_dict': {
+            'id': '8309a40969894a8e8d5bc1311e9c5667',
+            'ext': 'mp4',
+            'title': 'Lillier and the Staff!',
+            'description': 'md5:338841b8c21b283d24bdc9b568849f04',
+        }
+    }, {
+        'url': 'https://watch.pokemon.com/de-de/player.html?id=b3c402e111a4459eb47e12160ab0ba07',
+        'only_matching': True
+    }]
+
+    def _extract_media(self, channel_array, video_id):
+        for channel in channel_array:
+            for media in channel.get('media'):
+                if media.get('id') == video_id:
+                    return media
+        return None
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        info = {
+            '_type': 'url',
+            'id': video_id,
+            'url': 'limelight:media:%s' % video_id,
+            'ie_key': 'LimelightMedia',
+        }
+
+        # API call can be avoided entirely if we are listing formats
+        if self._downloader.params.get('listformats', False):
+            return info
+
+        webpage = self._download_webpage(url, video_id)
+        build_vars = self._parse_json(self._search_regex(
+            r'(?s)buildVars\s*=\s*({.*?})', webpage, 'build vars'),
+            video_id, transform_source=js_to_json)
+        region = build_vars.get('region')
+        channel_array = self._download_json(self._API_URL.format(region), video_id)
+        video_data = self._extract_media(channel_array, video_id)
+
+        if video_data is None:
+            raise ExtractorError(
+                'Video %s does not exist' % video_id, expected=True)
+
+        info['_type'] = 'url_transparent'
+        images = video_data.get('images') or {}
+
+        return merge_dicts(info, {
+            'title': video_data.get('title'),
+            'description': video_data.get('description'),
+            'thumbnail': images.get('medium') or images.get('small'),
+            'series': 'Pokémon',
+            'season_number': int_or_none(video_data.get('season')),
+            'episode': video_data.get('title'),
+            'episode_number': int_or_none(video_data.get('episode')),
+        })
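Note on the channel lookup in `PokemonWatchIE`: a sketch of the media search over the channel list, assuming the API returns a list of channels that each carry a `media` list of dicts with an `id`. `find_media` and the sample data are hypothetical; the sketch tolerates channels with a missing `media` key.

```python
def find_media(channel_array, video_id):
    """Return the first media dict whose id equals video_id, else None."""
    for channel in channel_array or []:
        # A channel without a 'media' list is skipped rather than iterated.
        for media in channel.get('media') or []:
            if media.get('id') == video_id:
                return media
    return None


# Hypothetical channel list: the first channel has no media at all.
channels = [
    {'channel_id': 'a'},
    {'channel_id': 'b', 'media': [
        {'id': '8309a40969894a8e8d5bc1311e9c5667', 'title': 'Example episode',
         'images': {'small': 'https://example.com/s.jpg'}},
    ]},
]
media = find_media(channels, '8309a40969894a8e8d5bc1311e9c5667')
images = media.get('images') or {}
print(media['title'], images.get('medium') or images.get('small'))
```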
diff --git a/youtube_dl/extractor/polskieradio.py b/youtube_dlc/extractor/polskieradio.py

similarity index 100%

rename from youtube_dl/extractor/polskieradio.py

rename to youtube_dlc/extractor/polskieradio.py
diff --git a/youtube_dlc/extractor/popcorntimes.py b/youtube_dlc/extractor/popcorntimes.py

new file mode 100644 (file)

index 0000000..7bf7f98
--- /dev/null
+++ b/youtube_dlc/extractor/popcorntimes.py
@@ -0,0 +1,99 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..compat import (
+    compat_b64decode,
+    compat_chr,
+)
+from ..utils import int_or_none
+
+
+class PopcorntimesIE(InfoExtractor):
+    _VALID_URL = r'https?://popcorntimes\.tv/[^/]+/m/(?P<id>[^/]+)/(?P<display_id>[^/?#&]+)'
+    _TEST = {
+        'url': 'https://popcorntimes.tv/de/m/A1XCFvz/haensel-und-gretel-opera-fantasy',
+        'md5': '93f210991ad94ba8c3485950a2453257',
+        'info_dict': {
+            'id': 'A1XCFvz',
+            'display_id': 'haensel-und-gretel-opera-fantasy',
+            'ext': 'mp4',
+            'title': 'Hänsel und Gretel',
+            'description': 'md5:1b8146791726342e7b22ce8125cf6945',
+            'thumbnail': r're:^https?://.*\.jpg$',
+            'creator': 'John Paul',
+            'release_date': '19541009',
+            'duration': 4260,
+            'tbr': 5380,
+            'width': 720,
+            'height': 540,
+        },
+    }
+
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        video_id, display_id = mobj.group('id', 'display_id')
+
+        webpage = self._download_webpage(url, display_id)
+
+        title = self._search_regex(
+            r'<h1>([^<]+)', webpage, 'title',
+            default=None) or self._html_search_meta(
+            'ya:ovs:original_name', webpage, 'title', fatal=True)
+
+        loc = self._search_regex(
+            r'PCTMLOC\s*=\s*(["\'])(?P<value>(?:(?!\1).)+)\1', webpage, 'loc',
+            group='value')
+
+        loc_b64 = ''
+        for c in loc:
+            c_ord = ord(c)
+            if ord('a') <= c_ord <= ord('z') or ord('A') <= c_ord <= ord('Z'):
+                upper = ord('Z') if c_ord <= ord('Z') else ord('z')
+                c_ord += 13
+                if upper < c_ord:
+                    c_ord -= 26
+            loc_b64 += compat_chr(c_ord)
+
+        video_url = compat_b64decode(loc_b64).decode('utf-8')
+
+        description = self._html_search_regex(
+            r'(?s)<div[^>]+class=["\']pt-movie-desc[^>]+>(.+?)</div>', webpage,
+            'description', fatal=False)
+
+        thumbnail = self._search_regex(
+            r'<img[^>]+class=["\']video-preview[^>]+\bsrc=(["\'])(?P<value>(?:(?!\1).)+)\1',
+            webpage, 'thumbnail', default=None,
+            group='value') or self._og_search_thumbnail(webpage)
+
+        creator = self._html_search_meta(
+            'video:director', webpage, 'creator', default=None)
+
+        release_date = self._html_search_meta(
+            'video:release_date', webpage, default=None)
+        if release_date:
+            release_date = release_date.replace('-', '')
+
+        def int_meta(name):
+            return int_or_none(self._html_search_meta(
+                name, webpage, default=None))
+
+        return {
+            'id': video_id,
+            'display_id': display_id,
+            'url': video_url,
+            'title': title,
+            'description': description,
+            'thumbnail': thumbnail,
+            'creator': creator,
+            'release_date': release_date,
+            'duration': int_meta('video:duration'),
+            'tbr': int_meta('ya:ovs:bitrate'),
+            'width': int_meta('og:video:width'),
+            'height': int_meta('og:video:height'),
+            'http_headers': {
+                'Referer': url,
+            },
+        }
diff --git a/youtube_dl/extractor/popcorntv.py b/youtube_dlc/extractor/popcorntv.py

similarity index 100%

rename from youtube_dl/extractor/popcorntv.py

rename to youtube_dlc/extractor/popcorntv.py
diff --git a/youtube_dl/extractor/porn91.py b/youtube_dlc/extractor/porn91.py

similarity index 100%

rename from youtube_dl/extractor/porn91.py

rename to youtube_dlc/extractor/porn91.py
diff --git a/youtube_dl/extractor/porncom.py b/youtube_dlc/extractor/porncom.py

similarity index 100%

rename from youtube_dl/extractor/porncom.py

rename to youtube_dlc/extractor/porncom.py
diff --git a/youtube_dl/extractor/pornhd.py b/youtube_dlc/extractor/pornhd.py

similarity index 77%

rename from youtube_dl/extractor/pornhd.py

rename to youtube_dlc/extractor/pornhd.py

index 27d65d4b9cdcf1e068b0d6971502eaa0caccf895..c6052ac9f966f332d0cfb1f7acfe68b0a143d2b7 100644 (file)
--- a/youtube_dl/extractor/pornhd.py
+++ b/youtube_dlc/extractor/pornhd.py
@@ -8,6 +8,7 @@
      ExtractorError,
      int_or_none,
      js_to_json,
+    merge_dicts,
      urljoin,
  )
  
@@ -27,23 +28,22 @@ class PornHdIE(InfoExtractor):
              'view_count': int,
              'like_count': int,
              'age_limit': 18,
-        }
+        },
+        'skip': 'HTTP Error 404: Not Found',
      }, {
-        # removed video
          'url': 'http://www.pornhd.com/videos/1962/sierra-day-gets-his-cum-all-over-herself-hd-porn-video',
-        'md5': '956b8ca569f7f4d8ec563e2c41598441',
+        'md5': '1b7b3a40b9d65a8e5b25f7ab9ee6d6de',
          'info_dict': {
              'id': '1962',
              'display_id': 'sierra-day-gets-his-cum-all-over-herself-hd-porn-video',
              'ext': 'mp4',
-            'title': 'Sierra loves doing laundry',
+            'title': 'md5:98c6f8b2d9c229d0f0fde47f61a1a759',
              'description': 'md5:8ff0523848ac2b8f9b065ba781ccf294',
              'thumbnail': r're:^https?://.*\.jpg',
              'view_count': int,
              'like_count': int,
              'age_limit': 18,
          },
-        'skip': 'Not available anymore',
      }]
  
      def _real_extract(self, url):
@@ -61,7 +61,13 @@ def _real_extract(self, url):
              r"(?s)sources'?\s*[:=]\s*(\{.+?\})",
              webpage, 'sources', default='{}')), video_id)
  
+        info = {}
          if not sources:
+            entries = self._parse_html5_media_entries(url, webpage, video_id)
+            if entries:
+                info = entries[0]
+
+        if not sources and not info:
              message = self._html_search_regex(
                  r'(?s)<(div|p)[^>]+class="no-video"[^>]*>(?P<value>.+?)</\1',
                  webpage, 'error message', group='value')
@@ -80,23 +86,29 @@ def _real_extract(self, url):
                  'format_id': format_id,
                  'height': height,
              })
-        self._sort_formats(formats)
+        if formats:
+            info['formats'] = formats
+        self._sort_formats(info['formats'])
  
          description = self._html_search_regex(
-            r'<(div|p)[^>]+class="description"[^>]*>(?P<value>[^<]+)</\1',
-            webpage, 'description', fatal=False, group='value')
+            (r'(?s)<section[^>]+class=["\']video-description[^>]+>(?P<value>.+?)</section>',
+             r'<(div|p)[^>]+class="description"[^>]*>(?P<value>[^<]+)</\1'),
+            webpage, 'description', fatal=False,
+            group='value') or self._html_search_meta(
+            'description', webpage, default=None) or self._og_search_description(webpage)
          view_count = int_or_none(self._html_search_regex(
              r'(\d+) views\s*<', webpage, 'view count', fatal=False))
          thumbnail = self._search_regex(
              r"poster'?\s*:\s*([\"'])(?P<url>(?:(?!\1).)+)\1", webpage,
-            'thumbnail', fatal=False, group='url')
+            'thumbnail', default=None, group='url')
  
          like_count = int_or_none(self._search_regex(
-            (r'(\d+)\s*</11[^>]+>(?:&nbsp;|\s)*\blikes',
+            (r'(\d+)</span>\s*likes',
+             r'(\d+)\s*</11[^>]+>(?:&nbsp;|\s)*\blikes',
               r'class=["\']save-count["\'][^>]*>\s*(\d+)'),
              webpage, 'like count', fatal=False))
  
-        return {
+        return merge_dicts(info, {
              'id': video_id,
              'display_id': display_id,
              'title': title,
@@ -106,4 +118,4 @@ def _real_extract(self, url):
              'like_count': like_count,
              'formats': formats,
              'age_limit': 18,
-        }
+        })
diff --git a/youtube_dl/extractor/pornhub.py b/youtube_dlc/extractor/pornhub.py

similarity index 87%

rename from youtube_dl/extractor/pornhub.py

rename to youtube_dlc/extractor/pornhub.py

index ba0ad7da29d188f5e920376805bf7532a1613bee..3567a32839eef2f75123a3f1b939038cf3eaf678 100644 (file)
--- a/youtube_dl/extractor/pornhub.py
+++ b/youtube_dlc/extractor/pornhub.py
@@ -17,6 +17,7 @@
      determine_ext,
      ExtractorError,
      int_or_none,
+    NO_DEFAULT,
      orderedSet,
      remove_quotes,
      str_to_int,
@@ -51,7 +52,7 @@ class PornHubIE(PornHubBaseIE):
      _VALID_URL = r'''(?x)
                      https?://
                          (?:
-                            (?:[^/]+\.)?(?P<host>pornhub\.(?:com|net))/(?:(?:view_video\.php|video/show)\?viewkey=|embed/)|
+                            (?:[^/]+\.)?(?P<host>pornhub(?:premium)?\.(?:com|net))/(?:(?:view_video\.php|video/show)\?viewkey=|embed/)|
                              (?:www\.)?thumbzilla\.com/video/
                          )
                          (?P<id>[\da-z]+)
@@ -148,6 +149,9 @@ class PornHubIE(PornHubBaseIE):
      }, {
          'url': 'https://www.pornhub.net/view_video.php?viewkey=203640933',
          'only_matching': True,
+    }, {
+        'url': 'https://www.pornhubpremium.com/view_video.php?viewkey=ph5e4acdae54a82',
+        'only_matching': True,
      }]
  
      @staticmethod
@@ -165,6 +169,13 @@ def _real_extract(self, url):
          host = mobj.group('host') or 'pornhub.com'
          video_id = mobj.group('id')
  
+        if 'premium' in host:
+            if not self._downloader.params.get('cookiefile'):
+                raise ExtractorError(
+                    'PornHub Premium requires authentication.'
+                    ' You may want to use --cookies.',
+                    expected=True)
+
          self._set_cookie(host, 'age_verified', '1')
  
          def dl_webpage(platform):
@@ -188,10 +199,10 @@ def dl_webpage(platform):
          # http://www.pornhub.com/view_video.php?viewkey=1331683002), not relying
          # on that anymore.
          title = self._html_search_meta(
-            'twitter:title', webpage, default=None) or self._search_regex(
-            (r'<h1[^>]+class=["\']title["\'][^>]*>(?P<title>[^<]+)',
-             r'<div[^>]+data-video-title=(["\'])(?P<title>.+?)\1',
-             r'shareTitle\s*=\s*(["\'])(?P<title>.+?)\1'),
+            'twitter:title', webpage, default=None) or self._html_search_regex(
+            (r'(?s)<h1[^>]+class=["\']title["\'][^>]*>(?P<title>.+?)</h1>',
+             r'<div[^>]+data-video-title=(["\'])(?P<title>(?:(?!\1).)+)\1',
+             r'shareTitle["\']\s*[=:]\s*(["\'])(?P<title>(?:(?!\1).)+)\1'),
              webpage, 'title', group='title')
  
          video_urls = []
@@ -227,12 +238,13 @@ def dl_webpage(platform):
          else:
              thumbnail, duration = [None] * 2
  
-        if not video_urls:
-            tv_webpage = dl_webpage('tv')
-
+        def extract_js_vars(webpage, pattern, default=NO_DEFAULT):
              assignments = self._search_regex(
-                r'(var.+?mediastring.+?)</script>', tv_webpage,
-                'encoded url').split(';')
+                pattern, webpage, 'encoded url', default=default)
+            if not assignments:
+                return {}
+
+            assignments = assignments.split(';')
  
              js_vars = {}
  
@@ -254,11 +266,35 @@ def parse_js_value(inp):
                  assn = re.sub(r'var\s+', '', assn)
                  vname, value = assn.split('=', 1)
                  js_vars[vname] = parse_js_value(value)
+            return js_vars
  
-            video_url = js_vars['mediastring']
-            if video_url not in video_urls_set:
-                video_urls.append((video_url, None))
-                video_urls_set.add(video_url)
+        def add_video_url(video_url):
+            v_url = url_or_none(video_url)
+            if not v_url:
+                return
+            if v_url in video_urls_set:
+                return
+            video_urls.append((v_url, None))
+            video_urls_set.add(v_url)
+
+        if not video_urls:
+            FORMAT_PREFIXES = ('media', 'quality')
+            js_vars = extract_js_vars(
+                webpage, r'(var\s+(?:%s)_.+)' % '|'.join(FORMAT_PREFIXES),
+                default=None)
+            if js_vars:
+                for key, format_url in js_vars.items():
+                    if any(key.startswith(p) for p in FORMAT_PREFIXES):
+                        add_video_url(format_url)
+            if not video_urls and re.search(
+                    r'<[^>]+\bid=["\']lockedPlayer', webpage):
+                raise ExtractorError(
+                    'Video %s is locked' % video_id, expected=True)
+
+        if not video_urls:
+            js_vars = extract_js_vars(
+                dl_webpage('tv'), r'(var.+?mediastring.+?)</script>')
+            add_video_url(js_vars['mediastring'])
  
          for mobj in re.finditer(
                  r'<a[^>]+\bclass=["\']downloadBtn\b[^>]+\bhref=(["\'])(?P<url>(?:(?!\1).)+)\1',
@@ -276,10 +312,16 @@ def parse_js_value(inp):
                      r'/(\d{6}/\d{2})/', video_url, 'upload data', default=None)
                  if upload_date:
                      upload_date = upload_date.replace('/', '')
-            if determine_ext(video_url) == 'mpd':
+            ext = determine_ext(video_url)
+            if ext == 'mpd':
                  formats.extend(self._extract_mpd_formats(
                      video_url, video_id, mpd_id='dash', fatal=False))
                  continue
+            elif ext == 'm3u8':
+                formats.extend(self._extract_m3u8_formats(
+                    video_url, video_id, 'mp4', entry_protocol='m3u8_native',
+                    m3u8_id='hls', fatal=False))
+                continue
              tbr = None
              mobj = re.search(r'(?P<height>\d+)[pP]?_(?P<tbr>\d+)[kK]', video_url)
              if mobj:
@@ -373,7 +415,7 @@ def _real_extract(self, url):
  
  
  class PornHubUserIE(PornHubPlaylistBaseIE):
-    _VALID_URL = r'(?P<url>https?://(?:[^/]+\.)?pornhub\.(?:com|net)/(?:(?:user|channel)s|model|pornstar)/(?P<id>[^/?#&]+))(?:[?#&]|/(?!videos)|$)'
+    _VALID_URL = r'(?P<url>https?://(?:[^/]+\.)?(?P<host>pornhub(?:premium)?\.(?:com|net))/(?:(?:user|channel)s|model|pornstar)/(?P<id>[^/?#&]+))(?:[?#&]|/(?!videos)|$)'
      _TESTS = [{
          'url': 'https://www.pornhub.com/model/zoe_ph',
          'playlist_mincount': 118,
@@ -441,7 +483,7 @@ def _real_extract(self, url):
  
  
  class PornHubPagedVideoListIE(PornHubPagedPlaylistBaseIE):
-    _VALID_URL = r'https?://(?:[^/]+\.)?(?P<host>pornhub\.(?:com|net))/(?P<id>(?:[^/]+/)*[^/?#&]+)'
+    _VALID_URL = r'https?://(?:[^/]+\.)?(?P<host>pornhub(?:premium)?\.(?:com|net))/(?P<id>(?:[^/]+/)*[^/?#&]+)'
      _TESTS = [{
          'url': 'https://www.pornhub.com/model/zoe_ph/videos',
          'only_matching': True,
@@ -556,7 +598,7 @@ def suitable(cls, url):
  
  
  class PornHubUserVideosUploadIE(PornHubPagedPlaylistBaseIE):
-    _VALID_URL = r'(?P<url>https?://(?:[^/]+\.)?(?P<host>pornhub\.(?:com|net))/(?:(?:user|channel)s|model|pornstar)/(?P<id>[^/]+)/videos/upload)'
+    _VALID_URL = r'(?P<url>https?://(?:[^/]+\.)?(?P<host>pornhub(?:premium)?\.(?:com|net))/(?:(?:user|channel)s|model|pornstar)/(?P<id>[^/]+)/videos/upload)'
      _TESTS = [{
          'url': 'https://www.pornhub.com/pornstar/jenny-blighe/videos/upload',
          'info_dict': {
diff --git a/youtube_dl/extractor/pornotube.py b/youtube_dlc/extractor/pornotube.py

similarity index 100%

rename from youtube_dl/extractor/pornotube.py

rename to youtube_dlc/extractor/pornotube.py
diff --git a/youtube_dl/extractor/pornovoisines.py b/youtube_dlc/extractor/pornovoisines.py

similarity index 100%

rename from youtube_dl/extractor/pornovoisines.py

rename to youtube_dlc/extractor/pornovoisines.py
diff --git a/youtube_dl/extractor/pornoxo.py b/youtube_dlc/extractor/pornoxo.py

similarity index 100%

rename from youtube_dl/extractor/pornoxo.py

rename to youtube_dlc/extractor/pornoxo.py
diff --git a/youtube_dl/extractor/presstv.py b/youtube_dlc/extractor/presstv.py

similarity index 100%

rename from youtube_dl/extractor/presstv.py

rename to youtube_dlc/extractor/presstv.py
diff --git a/youtube_dl/extractor/prosiebensat1.py b/youtube_dlc/extractor/prosiebensat1.py

similarity index 93%

rename from youtube_dl/extractor/prosiebensat1.py

rename to youtube_dlc/extractor/prosiebensat1.py

index e19a470a5eee5efda48fe88a694c7e3a62010963..e470882922ffe8f22fad735f73899e0104a4a561 100644 (file)
--- a/youtube_dl/extractor/prosiebensat1.py
+++ b/youtube_dlc/extractor/prosiebensat1.py
@@ -11,12 +11,13 @@
      determine_ext,
      float_or_none,
      int_or_none,
+    merge_dicts,
      unified_strdate,
  )
  
  
  class ProSiebenSat1BaseIE(InfoExtractor):
-    _GEO_COUNTRIES = ['DE']
+    _GEO_BYPASS = False
      _ACCESS_ID = None
      _SUPPORTED_PROTOCOLS = 'dash:clear,hls:clear,progressive:clear'
      _V4_BASE_URL = 'https://vas-v4.p7s1video.net/4.0/get'
@@ -39,14 +40,18 @@ def _extract_video_info(self, url, clip_id):
          formats = []
          if self._ACCESS_ID:
              raw_ct = self._ENCRYPTION_KEY + clip_id + self._IV + self._ACCESS_ID
-            server_token = (self._download_json(
+            protocols = self._download_json(
                  self._V4_BASE_URL + 'protocols', clip_id,
                  'Downloading protocols JSON',
                  headers=self.geo_verification_headers(), query={
                      'access_id': self._ACCESS_ID,
                      'client_token': sha1((raw_ct).encode()).hexdigest(),
                      'video_id': clip_id,
-                }, fatal=False) or {}).get('server_token')
+                }, fatal=False, expected_status=(403,)) or {}
+            error = protocols.get('error') or {}
+            if error.get('title') == 'Geo check failed':
+                self.raise_geo_restricted(countries=['AT', 'CH', 'DE'])
+            server_token = protocols.get('server_token')
              if server_token:
                  urls = (self._download_json(
                      self._V4_BASE_URL + 'urls', clip_id, 'Downloading urls JSON', query={
@@ -171,7 +176,7 @@ class ProSiebenSat1IE(ProSiebenSat1BaseIE):
                          (?:
                              (?:beta\.)?
                              (?:
-                                prosieben(?:maxx)?|sixx|sat1(?:gold)?|kabeleins(?:doku)?|the-voice-of-germany|7tv|advopedia
+                                prosieben(?:maxx)?|sixx|sat1(?:gold)?|kabeleins(?:doku)?|the-voice-of-germany|advopedia
                              )\.(?:de|at|ch)|
                              ran\.de|fem\.com|advopedia\.de|galileo\.tv/video
                          )
@@ -189,10 +194,14 @@ class ProSiebenSat1IE(ProSiebenSat1BaseIE):
              'info_dict': {
                  'id': '2104602',
                  'ext': 'mp4',
-                'title': 'Episode 18 - Staffel 2',
+                'title': 'CIRCUS HALLIGALLI - Episode 18 - Staffel 2',
                  'description': 'md5:8733c81b702ea472e069bc48bb658fc1',
                  'upload_date': '20131231',
                  'duration': 5845.04,
+                'series': 'CIRCUS HALLIGALLI',
+                'season_number': 2,
+                'episode': 'Episode 18 - Staffel 2',
+                'episode_number': 18,
              },
          },
          {
@@ -296,8 +305,9 @@ class ProSiebenSat1IE(ProSiebenSat1BaseIE):
              'info_dict': {
                  'id': '2572814',
                  'ext': 'mp4',
-                'title': 'Andreas Kümmert: Rocket Man',
+                'title': 'The Voice of Germany - Andreas Kümmert: Rocket Man',
                  'description': 'md5:6ddb02b0781c6adf778afea606652e38',
+                'timestamp': 1382041620,
                  'upload_date': '20131017',
                  'duration': 469.88,
              },
@@ -306,7 +316,7 @@ class ProSiebenSat1IE(ProSiebenSat1BaseIE):
              },
          },
          {
-            'url': 'http://www.fem.com/wellness/videos/wellness-video-clip-kurztripps-zum-valentinstag.html',
+            'url': 'http://www.fem.com/videos/beauty-lifestyle/kurztrips-zum-valentinstag',
              'info_dict': {
                  'id': '2156342',
                  'ext': 'mp4',
@@ -328,19 +338,6 @@ class ProSiebenSat1IE(ProSiebenSat1BaseIE):
              'playlist_count': 2,
              'skip': 'This video is unavailable',
          },
-        {
-            'url': 'http://www.7tv.de/circus-halligalli/615-best-of-circus-halligalli-ganze-folge',
-            'info_dict': {
-                'id': '4187506',
-                'ext': 'mp4',
-                'title': 'Best of Circus HalliGalli',
-                'description': 'md5:8849752efd90b9772c9db6fdf87fb9e9',
-                'upload_date': '20151229',
-            },
-            'params': {
-                'skip_download': True,
-            },
-        },
          {
              # title in <h2 class="subtitle">
              'url': 'http://www.prosieben.de/stars/oscar-award/videos/jetzt-erst-enthuellt-das-geheimnis-von-emma-stones-oscar-robe-clip',
@@ -417,7 +414,6 @@ class ProSiebenSat1IE(ProSiebenSat1BaseIE):
          r'<div[^>]+id="veeseoDescription"[^>]*>(.+?)</div>',
      ]
      _UPLOAD_DATE_REGEXES = [
-        r'<meta property="og:published_time" content="(.+?)">',
          r'<span>\s*(\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}) \|\s*<span itemprop="duration"',
          r'<footer>\s*(\d{2}\.\d{2}\.\d{4}) \d{2}:\d{2} Uhr',
          r'<span style="padding-left: 4px;line-height:20px; color:#404040">(\d{2}\.\d{2}\.\d{4})</span>',
@@ -447,17 +443,21 @@ def _extract_clip(self, url, webpage):
          if description is None:
              description = self._og_search_description(webpage)
          thumbnail = self._og_search_thumbnail(webpage)
-        upload_date = unified_strdate(self._html_search_regex(
-            self._UPLOAD_DATE_REGEXES, webpage, 'upload date', default=None))
+        upload_date = unified_strdate(
+            self._html_search_meta('og:published_time', webpage,
+                                   'upload date', default=None)
+            or self._html_search_regex(self._UPLOAD_DATE_REGEXES,
+                                       webpage, 'upload date', default=None))
+
+        json_ld = self._search_json_ld(webpage, clip_id, default={})
  
-        info.update({
+        return merge_dicts(info, {
              'id': clip_id,
              'title': title,
              'description': description,
              'thumbnail': thumbnail,
              'upload_date': upload_date,
-        })
-        return info
+        }, json_ld)
  
      def _extract_playlist(self, url, webpage):
          playlist_id = self._html_search_regex(
diff --git a/youtube_dl/extractor/puhutv.py b/youtube_dlc/extractor/puhutv.py

similarity index 91%

rename from youtube_dl/extractor/puhutv.py

rename to youtube_dlc/extractor/puhutv.py

index fb704a3c4390b9da5b6fa3a5dad027c6e812b7eb..ca71665e0fabf958738192b497130ee12d6ad1f6 100644 (file)
--- a/youtube_dl/extractor/puhutv.py
+++ b/youtube_dlc/extractor/puhutv.py
@@ -82,17 +82,6 @@ def _real_extract(self, url):
          urls = []
          formats = []
  
-        def add_http_from_hls(m3u8_f):
-            http_url = m3u8_f['url'].replace('/hls/', '/mp4/').replace('/chunklist.m3u8', '.mp4')
-            if http_url != m3u8_f['url']:
-                f = m3u8_f.copy()
-                f.update({
-                    'format_id': f['format_id'].replace('hls', 'http'),
-                    'protocol': 'http',
-                    'url': http_url,
-                })
-                formats.append(f)
-
          for video in videos['data']['videos']:
              media_url = url_or_none(video.get('url'))
              if not media_url or media_url in urls:
@@ -101,12 +90,9 @@ def add_http_from_hls(m3u8_f):
  
              playlist = video.get('is_playlist')
              if (video.get('stream_type') == 'hls' and playlist is True) or 'playlist.m3u8' in media_url:
-                m3u8_formats = self._extract_m3u8_formats(
+                formats.extend(self._extract_m3u8_formats(
                      media_url, video_id, 'mp4', entry_protocol='m3u8_native',
-                    m3u8_id='hls', fatal=False)
-                for m3u8_f in m3u8_formats:
-                    formats.append(m3u8_f)
-                    add_http_from_hls(m3u8_f)
+                    m3u8_id='hls', fatal=False))
                  continue
  
              quality = int_or_none(video.get('quality'))
@@ -128,8 +114,6 @@ def add_http_from_hls(m3u8_f):
                  format_id += '-%sp' % quality
              f['format_id'] = format_id
              formats.append(f)
-            if is_hls:
-                add_http_from_hls(f)
          self._sort_formats(formats)
  
          creator = try_get(
diff --git a/youtube_dl/extractor/puls4.py b/youtube_dlc/extractor/puls4.py

similarity index 100%

rename from youtube_dl/extractor/puls4.py

rename to youtube_dlc/extractor/puls4.py
diff --git a/youtube_dl/extractor/pyvideo.py b/youtube_dlc/extractor/pyvideo.py

similarity index 100%

rename from youtube_dl/extractor/pyvideo.py

rename to youtube_dlc/extractor/pyvideo.py
diff --git a/youtube_dl/extractor/qqmusic.py b/youtube_dlc/extractor/qqmusic.py

similarity index 100%

rename from youtube_dl/extractor/qqmusic.py

rename to youtube_dlc/extractor/qqmusic.py
diff --git a/youtube_dl/extractor/r7.py b/youtube_dlc/extractor/r7.py

similarity index 100%

rename from youtube_dl/extractor/r7.py

rename to youtube_dlc/extractor/r7.py
diff --git a/youtube_dl/extractor/radiobremen.py b/youtube_dlc/extractor/radiobremen.py

similarity index 100%

rename from youtube_dl/extractor/radiobremen.py

rename to youtube_dlc/extractor/radiobremen.py
diff --git a/youtube_dl/extractor/radiocanada.py b/youtube_dlc/extractor/radiocanada.py

similarity index 100%

rename from youtube_dl/extractor/radiocanada.py

rename to youtube_dlc/extractor/radiocanada.py
diff --git a/youtube_dl/extractor/radiode.py b/youtube_dlc/extractor/radiode.py

similarity index 100%

rename from youtube_dl/extractor/radiode.py

rename to youtube_dlc/extractor/radiode.py
diff --git a/youtube_dl/extractor/radiofrance.py b/youtube_dlc/extractor/radiofrance.py

similarity index 100%

rename from youtube_dl/extractor/radiofrance.py

rename to youtube_dlc/extractor/radiofrance.py
diff --git a/youtube_dl/extractor/radiojavan.py b/youtube_dlc/extractor/radiojavan.py

similarity index 100%

rename from youtube_dl/extractor/radiojavan.py

rename to youtube_dlc/extractor/radiojavan.py
diff --git a/youtube_dl/extractor/rai.py b/youtube_dlc/extractor/rai.py

similarity index 81%

rename from youtube_dl/extractor/rai.py

rename to youtube_dlc/extractor/rai.py

index 207a6c247f3c78ff260bd8a326af1b1fb59c550c..51a310f5c35379f13fb8293247522ef7becfd974 100644 (file)
--- a/youtube_dl/extractor/rai.py
+++ b/youtube_dlc/extractor/rai.py
@@ -1,3 +1,4 @@
+# coding: utf-8
  from __future__ import unicode_literals
  
  import re
@@ -17,7 +18,6 @@
      parse_duration,
      strip_or_none,
      try_get,
-    unescapeHTML,
      unified_strdate,
      unified_timestamp,
      update_url_query,
@@ -30,6 +30,7 @@ class RaiBaseIE(InfoExtractor):
      _UUID_RE = r'[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12}'
      _GEO_COUNTRIES = ['IT']
      _GEO_BYPASS = False
+    _BASE_URL = 'https://www.raiplay.it'
  
      def _extract_relinker_info(self, relinker_url, video_id):
          if not re.match(r'https?://', relinker_url):
@@ -122,41 +123,19 @@ def _extract_subtitles(url, subtitle_url):
  
  
  class RaiPlayIE(RaiBaseIE):
-    _VALID_URL = r'(?P<url>https?://(?:www\.)?raiplay\.it/.+?-(?P<id>%s)\.html)' % RaiBaseIE._UUID_RE
+    _VALID_URL = r'(?P<url>(?P<base>https?://(?:www\.)?raiplay\.it/.+?-)(?P<id>%s)(?P<ext>\.(?:html|json)))' % RaiBaseIE._UUID_RE
      _TESTS = [{
-        'url': 'http://www.raiplay.it/video/2016/10/La-Casa-Bianca-e06118bb-59a9-4636-b914-498e4cfd2c66.html?source=twitter',
-        'md5': '340aa3b7afb54bfd14a8c11786450d76',
-        'info_dict': {
-            'id': 'e06118bb-59a9-4636-b914-498e4cfd2c66',
-            'ext': 'mp4',
-            'title': 'La Casa Bianca',
-            'alt_title': 'S2016 - Puntata del 23/10/2016',
-            'description': 'md5:a09d45890850458077d1f68bb036e0a5',
-            'thumbnail': r're:^https?://.*\.jpg$',
-            'uploader': 'Rai 3',
-            'creator': 'Rai 3',
-            'duration': 3278,
-            'timestamp': 1477764300,
-            'upload_date': '20161029',
-            'series': 'La Casa Bianca',
-            'season': '2016',
-        },
-    }, {
          'url': 'http://www.raiplay.it/video/2014/04/Report-del-07042014-cb27157f-9dd0-4aee-b788-b1f67643a391.html',
          'md5': '8970abf8caf8aef4696e7b1f2adfc696',
          'info_dict': {
              'id': 'cb27157f-9dd0-4aee-b788-b1f67643a391',
              'ext': 'mp4',
              'title': 'Report del 07/04/2014',
-            'alt_title': 'S2013/14 - Puntata del 07/04/2014',
-            'description': 'md5:f27c544694cacb46a078db84ec35d2d9',
+            'alt_title': 'St 2013/14 - Espresso nel caffè - 07/04/2014 ',
+            'description': 'md5:d730c168a58f4bb35600fc2f881ec04e',
              'thumbnail': r're:^https?://.*\.jpg$',
-            'uploader': 'Rai 5',
-            'creator': 'Rai 5',
+            'uploader': 'Rai Gulp',
              'duration': 6160,
-            'series': 'Report',
-            'season_number': 5,
-            'season': '2013/14',
          },
          'params': {
              'skip_download': True,
@@ -168,16 +147,15 @@ class RaiPlayIE(RaiBaseIE):
  
      def _real_extract(self, url):
          mobj = re.match(self._VALID_URL, url)
-        url, video_id = mobj.group('url', 'id')
+        url, base, video_id, ext = mobj.group('url', 'base', 'id', 'ext')
  
          media = self._download_json(
-            '%s?json' % url, video_id, 'Downloading video JSON')
+            '%s%s.json' % (base, video_id), video_id, 'Downloading video JSON')
  
          title = media['name']
-
          video = media['video']
  
-        relinker_info = self._extract_relinker_info(video['contentUrl'], video_id)
+        relinker_info = self._extract_relinker_info(video['content_url'], video_id)
          self._sort_formats(relinker_info['formats'])
  
          thumbnails = []
@@ -185,7 +163,7 @@ def _real_extract(self, url):
              for _, value in media.get('images').items():
                  if value:
                      thumbnails.append({
-                        'url': value.replace('[RESOLUTION]', '600x400')
+                        'url': urljoin(RaiBaseIE._BASE_URL, value.replace('[RESOLUTION]', '600x400'))
                      })
  
          timestamp = unified_timestamp(try_get(
@@ -225,7 +203,7 @@ class RaiPlayLiveIE(RaiBaseIE):
              'display_id': 'rainews24',
              'ext': 'mp4',
              'title': 're:^Diretta di Rai News 24 [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
-            'description': 'md5:6eca31500550f9376819f174e5644754',
+            'description': 'md5:4d00bcf6dc98b27c6ec480de329d1497',
              'uploader': 'Rai News 24',
              'creator': 'Rai News 24',
              'is_live': True,
@@ -238,20 +216,32 @@ class RaiPlayLiveIE(RaiBaseIE):
      def _real_extract(self, url):
          display_id = self._match_id(url)
  
-        webpage = self._download_webpage(url, display_id)
+        media = self._download_json(
+            '%s.json' % urljoin(RaiBaseIE._BASE_URL, 'dirette/' + display_id),
+            display_id, 'Downloading channel JSON')
+
+        title = media['name']
+        video = media['video']
+        video_id = media['id'].replace('ContentItem-', '')
  
-        video_id = self._search_regex(
-            r'data-uniquename=["\']ContentItem-(%s)' % RaiBaseIE._UUID_RE,
-            webpage, 'content id')
+        relinker_info = self._extract_relinker_info(video['content_url'], video_id)
+        self._sort_formats(relinker_info['formats'])
  
-        return {
-            '_type': 'url_transparent',
-            'ie_key': RaiPlayIE.ie_key(),
-            'url': 'http://www.raiplay.it/dirette/ContentItem-%s.html' % video_id,
+        info = {
              'id': video_id,
              'display_id': display_id,
+            'title': self._live_title(title) if relinker_info.get(
+                'is_live') else title,
+            'alt_title': media.get('subtitle'),
+            'description': media.get('description'),
+            'uploader': strip_or_none(media.get('channel')),
+            'creator': strip_or_none(media.get('editor')),
+            'duration': parse_duration(video.get('duration')),
          }
  
+        info.update(relinker_info)
+        return info
+
  
  class RaiPlayPlaylistIE(InfoExtractor):
      _VALID_URL = r'https?://(?:www\.)?raiplay\.it/programmi/(?P<id>[^/?#&]+)'
@@ -260,7 +250,7 @@ class RaiPlayPlaylistIE(InfoExtractor):
          'info_dict': {
              'id': 'nondirloalmiocapo',
              'title': 'Non dirlo al mio capo',
-            'description': 'md5:9f3d603b2947c1c7abb098f3b14fac86',
+            'description': 'md5:98ab6b98f7f44c2843fd7d6f045f153b',
          },
          'playlist_mincount': 12,
      }]
@@ -268,21 +258,25 @@ class RaiPlayPlaylistIE(InfoExtractor):
      def _real_extract(self, url):
          playlist_id = self._match_id(url)
  
-        webpage = self._download_webpage(url, playlist_id)
+        media = self._download_json(
+            '%s.json' % urljoin(RaiBaseIE._BASE_URL, 'programmi/' + playlist_id),
+            playlist_id, 'Downloading program JSON')
  
-        title = self._html_search_meta(
-            ('programma', 'nomeProgramma'), webpage, 'title')
-        description = unescapeHTML(self._html_search_meta(
-            ('description', 'og:description'), webpage, 'description'))
+        title = media['name']
+        description = media['program_info']['description']
+
+        content_sets = [s['id'] for b in media['blocks'] for s in b['sets']]
  
          entries = []
-        for mobj in re.finditer(
-                r'<a\b[^>]+\bhref=(["\'])(?P<path>/raiplay/video/.+?)\1',
-                webpage):
-            video_url = urljoin(url, mobj.group('path'))
-            entries.append(self.url_result(
-                video_url, ie=RaiPlayIE.ie_key(),
-                video_id=RaiPlayIE._match_id(video_url)))
+        for cs in content_sets:
+            medias = self._download_json(
+                '%s/%s.json' % (urljoin(RaiBaseIE._BASE_URL, 'programmi/' + playlist_id), cs),
+                cs, 'Downloading content set JSON')
+            for m in medias['items']:
+                video_url = urljoin(url, m['path_id'])
+                entries.append(self.url_result(
+                    video_url, ie=RaiPlayIE.ie_key(),
+                    video_id=RaiPlayIE._match_id(video_url)))
  
          return self.playlist_result(entries, playlist_id, title, description)
  
@@ -316,7 +310,7 @@ class RaiIE(RaiBaseIE):
      }, {
          # with ContentItem in og:url
          'url': 'http://www.rai.it/dl/RaiTV/programmi/media/ContentItem-efb17665-691c-45d5-a60c-5301333cbb0c.html',
-        'md5': '11959b4e44fa74de47011b5799490adf',
+        'md5': '6865dd00cf0bbf5772fdd89d59bd768a',
          'info_dict': {
              'id': 'efb17665-691c-45d5-a60c-5301333cbb0c',
              'ext': 'mp4',
@@ -326,18 +320,6 @@ class RaiIE(RaiBaseIE):
              'duration': 2214,
              'upload_date': '20161103',
          }
-    }, {
-        # drawMediaRaiTV(...)
-        'url': 'http://www.report.rai.it/dl/Report/puntata/ContentItem-0c7a664b-d0f4-4b2c-8835-3f82e46f433e.html',
-        'md5': '2dd727e61114e1ee9c47f0da6914e178',
-        'info_dict': {
-            'id': '59d69d28-6bb6-409d-a4b5-ed44096560af',
-            'ext': 'mp4',
-            'title': 'Il pacco',
-            'description': 'md5:4b1afae1364115ce5d78ed83cd2e5b3a',
-            'thumbnail': r're:^https?://.*\.jpg$',
-            'upload_date': '20141221',
-        },
      }, {
          # initEdizione('ContentItem-...'
          'url': 'http://www.tg1.rai.it/dl/tg1/2010/edizioni/ContentSet-9b6e0cba-4bef-4aef-8cf0-9f7f665b7dfb-tg1.html?item=undefined',
@@ -349,17 +331,6 @@ class RaiIE(RaiBaseIE):
              'upload_date': '20170401',
          },
          'skip': 'Changes daily',
-    }, {
-        # HDS live stream with only relinker URL
-        'url': 'http://www.rai.tv/dl/RaiTV/dirette/PublishingBlock-1912dbbf-3f96-44c3-b4cf-523681fbacbc.html?channel=EuroNews',
-        'info_dict': {
-            'id': '1912dbbf-3f96-44c3-b4cf-523681fbacbc',
-            'ext': 'flv',
-            'title': 'EuroNews',
-        },
-        'params': {
-            'skip_download': True,
-        },
      }, {
          # HLS live stream with ContentItem in og:url
          'url': 'http://www.rainews.it/dl/rainews/live/ContentItem-3156f2f2-dc70-4953-8e2f-70d7489d4ce9.html',
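
Note on the RaiPlayPlaylistIE hunk above: the new code flattens the content-set ids out of `media['blocks']` with a nested comprehension and then requests one JSON document per set. The following is a minimal standalone sketch of that URL construction only. It assumes a placeholder base URL (the value of `RaiBaseIE._BASE_URL` is not shown in this section), uses the standard library `urljoin` in place of the youtube-dl helper, and runs on made-up sample data.

```python
from urllib.parse import urljoin

# Assumption: placeholder value; the real RaiBaseIE._BASE_URL is not shown here.
BASE_URL = 'https://www.raiplay.it/'


def content_set_urls(media, playlist_id, base_url=BASE_URL):
    """Return one content-set JSON URL per set, in block order."""
    # Same nested comprehension as in the hunk: every set of every block.
    content_sets = [s['id'] for b in media['blocks'] for s in b['sets']]
    program_url = urljoin(base_url, 'programmi/' + playlist_id)
    return ['%s/%s.json' % (program_url, cs) for cs in content_sets]


# Hypothetical program JSON with two blocks (illustration only).
sample = {
    'blocks': [
        {'sets': [{'id': 'set-a'}, {'id': 'set-b'}]},
        {'sets': [{'id': 'set-c'}]},
    ]
}
urls = content_set_urls(sample, 'nondirloalmiocapo')
```

The comprehension reads left to right like the equivalent nested `for` loops, so the sets keep the order in which their blocks appear.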
diff --git a/youtube_dl/extractor/raywenderlich.py b/youtube_dlc/extractor/raywenderlich.py

similarity index 100%

rename from youtube_dl/extractor/raywenderlich.py

rename to youtube_dlc/extractor/raywenderlich.py
diff --git a/youtube_dl/extractor/rbmaradio.py b/youtube_dlc/extractor/rbmaradio.py

similarity index 100%

rename from youtube_dl/extractor/rbmaradio.py

rename to youtube_dlc/extractor/rbmaradio.py
diff --git a/youtube_dl/extractor/rds.py b/youtube_dlc/extractor/rds.py

similarity index 100%

rename from youtube_dl/extractor/rds.py

rename to youtube_dlc/extractor/rds.py
diff --git a/youtube_dl/extractor/redbulltv.py b/youtube_dlc/extractor/redbulltv.py

similarity index 100%

rename from youtube_dl/extractor/redbulltv.py

rename to youtube_dlc/extractor/redbulltv.py
diff --git a/youtube_dl/extractor/reddit.py b/youtube_dlc/extractor/reddit.py

similarity index 100%

rename from youtube_dl/extractor/reddit.py

rename to youtube_dlc/extractor/reddit.py
diff --git a/youtube_dl/extractor/redtube.py b/youtube_dlc/extractor/redtube.py

similarity index 81%

rename from youtube_dl/extractor/redtube.py

rename to youtube_dlc/extractor/redtube.py

index 5c84028ef97e8220d494633f5ada42804fa2ae7f..2d2f6a98c97dba8605cb9f640c7c73d860caa1d0 100644 (file)
--- a/youtube_dl/extractor/redtube.py
+++ b/youtube_dlc/extractor/redtube.py
@@ -4,6 +4,7 @@
  
  from .common import InfoExtractor
  from ..utils import (
+    determine_ext,
      ExtractorError,
      int_or_none,
      merge_dicts,
@@ -43,14 +44,21 @@ def _real_extract(self, url):
          webpage = self._download_webpage(
              'http://www.redtube.com/%s' % video_id, video_id)
  
-        if any(s in webpage for s in ['video-deleted-info', '>This video has been removed']):
-            raise ExtractorError('Video %s has been removed' % video_id, expected=True)
+        ERRORS = (
+            (('video-deleted-info', '>This video has been removed'), 'has been removed'),
+            (('private_video_text', '>This video is private', '>Send a friend request to its owner to be able to view it'), 'is private'),
+        )
+
+        for patterns, message in ERRORS:
+            if any(p in webpage for p in patterns):
+                raise ExtractorError(
+                    'Video %s %s' % (video_id, message), expected=True)
  
          info = self._search_json_ld(webpage, video_id, default={})
  
          if not info.get('title'):
              info['title'] = self._html_search_regex(
-                (r'<h(\d)[^>]+class="(?:video_title_text|videoTitle)[^"]*">(?P<title>(?:(?!\1).)+)</h\1>',
+                (r'<h(\d)[^>]+class="(?:video_title_text|videoTitle|video_title)[^"]*">(?P<title>(?:(?!\1).)+)</h\1>',
                   r'(?:videoTitle|title)\s*:\s*(["\'])(?P<title>(?:(?!\1).)+)\1',),
                  webpage, 'title', group='title',
                  default=None) or self._og_search_title(webpage)
@@ -70,7 +78,7 @@ def _real_extract(self, url):
                      })
          medias = self._parse_json(
              self._search_regex(
-                r'mediaDefinition\s*:\s*(\[.+?\])', webpage,
+                r'mediaDefinition["\']?\s*:\s*(\[.+?}\s*\])', webpage,
                  'media definitions', default='{}'),
              video_id, fatal=False)
          if medias and isinstance(medias, list):
@@ -78,6 +86,12 @@ def _real_extract(self, url):
                  format_url = url_or_none(media.get('videoUrl'))
                  if not format_url:
                      continue
+                if media.get('format') == 'hls' or determine_ext(format_url) == 'm3u8':
+                    formats.extend(self._extract_m3u8_formats(
+                        format_url, video_id, 'mp4',
+                        entry_protocol='m3u8_native', m3u8_id='hls',
+                        fatal=False))
+                    continue
                  format_id = media.get('quality')
                  formats.append({
                      'url': format_url,
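
Note on the redtube.py hunks above: the removed/private check becomes a table of (patterns, message) pairs, and the `mediaDefinition` regex now requires a closing `}` before the final `]`. The sketch below reproduces both ideas outside the extractor. The helper names `find_error` and `parse_media_definitions` are hypothetical, the sample page is invented, and the reading that the stricter regex is meant to survive nested arrays is an inference from the pattern, not something the commit states.

```python
import json
import re

# (substrings to look for, message suffix), mirroring the ERRORS tuple in the hunk.
ERRORS = (
    (('video-deleted-info', '>This video has been removed'), 'has been removed'),
    (('private_video_text', '>This video is private'), 'is private'),
)

OLD_MEDIA_RE = r'mediaDefinition\s*:\s*(\[.+?\])'
NEW_MEDIA_RE = r'mediaDefinition["\']?\s*:\s*(\[.+?}\s*\])'


def find_error(webpage, video_id):
    """Return an error message if the page matches a known error marker."""
    for patterns, message in ERRORS:
        if any(p in webpage for p in patterns):
            return 'Video %s %s' % (video_id, message)
    return None


def parse_media_definitions(webpage, pattern=NEW_MEDIA_RE):
    """Extract and decode the mediaDefinition array, or return []."""
    mobj = re.search(pattern, webpage)
    if not mobj:
        return []
    try:
        return json.loads(mobj.group(1))
    except ValueError:
        return []


# Invented page fragment: quoted key and a nested array inside the first entry.
page = ('"mediaDefinition": [{"format": "hls", "videoUrl": '
        '"https://example.com/a.m3u8", "tags": ["a"]}, '
        '{"quality": "480", "videoUrl": "https://example.com/b.mp4"}]')
medias = parse_media_definitions(page)
```

On this sample the old pattern does not match at all because of the quoted key, and even without the quote its lazy `.+?\]` would stop at the inner `["a"]`.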
diff --git a/youtube_dl/extractor/regiotv.py b/youtube_dlc/extractor/regiotv.py

similarity index 100%

rename from youtube_dl/extractor/regiotv.py

rename to youtube_dlc/extractor/regiotv.py
diff --git a/youtube_dl/extractor/rentv.py b/youtube_dlc/extractor/rentv.py

similarity index 100%

rename from youtube_dl/extractor/rentv.py

rename to youtube_dlc/extractor/rentv.py
diff --git a/youtube_dl/extractor/restudy.py b/youtube_dlc/extractor/restudy.py

similarity index 100%

rename from youtube_dl/extractor/restudy.py

rename to youtube_dlc/extractor/restudy.py
diff --git a/youtube_dl/extractor/reuters.py b/youtube_dlc/extractor/reuters.py

similarity index 100%

rename from youtube_dl/extractor/reuters.py

rename to youtube_dlc/extractor/reuters.py
diff --git a/youtube_dl/extractor/reverbnation.py b/youtube_dlc/extractor/reverbnation.py

similarity index 100%

rename from youtube_dl/extractor/reverbnation.py

rename to youtube_dlc/extractor/reverbnation.py
diff --git a/youtube_dl/extractor/rice.py b/youtube_dlc/extractor/rice.py

similarity index 100%

rename from youtube_dl/extractor/rice.py

rename to youtube_dlc/extractor/rice.py
diff --git a/youtube_dl/extractor/rmcdecouverte.py b/youtube_dlc/extractor/rmcdecouverte.py

similarity index 100%

rename from youtube_dl/extractor/rmcdecouverte.py

rename to youtube_dlc/extractor/rmcdecouverte.py
diff --git a/youtube_dl/extractor/ro220.py b/youtube_dlc/extractor/ro220.py

similarity index 100%

rename from youtube_dl/extractor/ro220.py

rename to youtube_dlc/extractor/ro220.py
diff --git a/youtube_dl/extractor/rockstargames.py b/youtube_dlc/extractor/rockstargames.py

similarity index 100%

rename from youtube_dl/extractor/rockstargames.py

rename to youtube_dlc/extractor/rockstargames.py
diff --git a/youtube_dl/extractor/roosterteeth.py b/youtube_dlc/extractor/roosterteeth.py

similarity index 100%

rename from youtube_dl/extractor/roosterteeth.py

rename to youtube_dlc/extractor/roosterteeth.py
diff --git a/youtube_dl/extractor/rottentomatoes.py b/youtube_dlc/extractor/rottentomatoes.py

similarity index 100%

rename from youtube_dl/extractor/rottentomatoes.py

rename to youtube_dlc/extractor/rottentomatoes.py
diff --git a/youtube_dl/extractor/roxwel.py b/youtube_dlc/extractor/roxwel.py

similarity index 100%

rename from youtube_dl/extractor/roxwel.py

rename to youtube_dlc/extractor/roxwel.py
diff --git a/youtube_dl/extractor/rozhlas.py b/youtube_dlc/extractor/rozhlas.py

similarity index 100%

rename from youtube_dl/extractor/rozhlas.py

rename to youtube_dlc/extractor/rozhlas.py
diff --git a/youtube_dl/extractor/rtbf.py b/youtube_dlc/extractor/rtbf.py

similarity index 100%

rename from youtube_dl/extractor/rtbf.py

rename to youtube_dlc/extractor/rtbf.py
diff --git a/youtube_dl/extractor/rte.py b/youtube_dlc/extractor/rte.py

similarity index 100%

rename from youtube_dl/extractor/rte.py

rename to youtube_dlc/extractor/rte.py
diff --git a/youtube_dl/extractor/rtl2.py b/youtube_dlc/extractor/rtl2.py

similarity index 100%

rename from youtube_dl/extractor/rtl2.py

rename to youtube_dlc/extractor/rtl2.py
diff --git a/youtube_dl/extractor/rtlnl.py b/youtube_dlc/extractor/rtlnl.py

similarity index 89%

rename from youtube_dl/extractor/rtlnl.py

rename to youtube_dlc/extractor/rtlnl.py

index fadca8c175475e9b2bf7cb4c9bc0a7e2ed163fcc..8be5ca236e1cf3f5351e47cc68ba3420888b7d34 100644 (file)
--- a/youtube_dl/extractor/rtlnl.py
+++ b/youtube_dlc/extractor/rtlnl.py
@@ -15,11 +15,25 @@ class RtlNlIE(InfoExtractor):
          https?://(?:(?:www|static)\.)?
          (?:
              rtlxl\.nl/[^\#]*\#!/[^/]+/|
+            rtlxl\.nl/programma/[^/]+/|
              rtl\.nl/(?:(?:system/videoplayer/(?:[^/]+/)+(?:video_)?embed\.html|embed)\b.+?\buuid=|video/)
          )
          (?P<id>[0-9a-f-]+)'''
  
      _TESTS = [{
+        'url': 'https://www.rtlxl.nl/programma/rtl-nieuws/0bd1384d-d970-3086-98bb-5c104e10c26f',
+        'md5': '490428f1187b60d714f34e1f2e3af0b6',
+        'info_dict': {
+            'id': '0bd1384d-d970-3086-98bb-5c104e10c26f',
+            'ext': 'mp4',
+            'title': 'RTL Nieuws',
+            'description': 'md5:d41d8cd98f00b204e9800998ecf8427e',
+            'timestamp': 1593293400,
+            'upload_date': '20200627',
+            'duration': 661.08,
+        },
+    }, {
+        # old url pattern. Tests does not pass

          'url': 'http://www.rtlxl.nl/#!/rtl-nieuws-132237/82b1aad1-4a14-3d7b-b554-b0aed1b2c416',
          'md5': '473d1946c1fdd050b2c0161a4b13c373',
          'info_dict': {
diff --git a/youtube_dl/extractor/rtp.py b/youtube_dlc/extractor/rtp.py

similarity index 100%

rename from youtube_dl/extractor/rtp.py

rename to youtube_dlc/extractor/rtp.py
diff --git a/youtube_dl/extractor/rts.py b/youtube_dlc/extractor/rts.py

similarity index 100%

rename from youtube_dl/extractor/rts.py

rename to youtube_dlc/extractor/rts.py
diff --git a/youtube_dl/extractor/rtve.py b/youtube_dlc/extractor/rtve.py

similarity index 100%

rename from youtube_dl/extractor/rtve.py

rename to youtube_dlc/extractor/rtve.py
diff --git a/youtube_dl/extractor/rtvnh.py b/youtube_dlc/extractor/rtvnh.py

similarity index 100%

rename from youtube_dl/extractor/rtvnh.py

rename to youtube_dlc/extractor/rtvnh.py
diff --git a/youtube_dl/extractor/rtvs.py b/youtube_dlc/extractor/rtvs.py

similarity index 100%

rename from youtube_dl/extractor/rtvs.py

rename to youtube_dlc/extractor/rtvs.py
diff --git a/youtube_dl/extractor/ruhd.py b/youtube_dlc/extractor/ruhd.py

similarity index 100%

rename from youtube_dl/extractor/ruhd.py

rename to youtube_dlc/extractor/ruhd.py
diff --git a/youtube_dl/extractor/rutube.py b/youtube_dlc/extractor/rutube.py

similarity index 100%

rename from youtube_dl/extractor/rutube.py

rename to youtube_dlc/extractor/rutube.py
diff --git a/youtube_dl/extractor/rutv.py b/youtube_dlc/extractor/rutv.py

similarity index 98%

rename from youtube_dl/extractor/rutv.py

rename to youtube_dlc/extractor/rutv.py

index d2713c19a053cba19448ad772525157144b19efa..aceb35994c6577c04c89c76beacbf62491c296cf 100644 (file)
--- a/youtube_dl/extractor/rutv.py
+++ b/youtube_dlc/extractor/rutv.py
@@ -139,7 +139,7 @@ def _real_extract(self, url):
          is_live = video_type == 'live'
  
          json_data = self._download_json(
-            'http://player.rutv.ru/iframe/data%s/id/%s' % ('live' if is_live else 'video', video_id),
+            'http://player.vgtrk.com/iframe/data%s/id/%s' % ('live' if is_live else 'video', video_id),
              video_id, 'Downloading JSON')
  
          if json_data['errors']:
diff --git a/youtube_dl/extractor/ruutu.py b/youtube_dlc/extractor/ruutu.py

similarity index 100%

rename from youtube_dl/extractor/ruutu.py

rename to youtube_dlc/extractor/ruutu.py
diff --git a/youtube_dl/extractor/ruv.py b/youtube_dlc/extractor/ruv.py

similarity index 100%

rename from youtube_dl/extractor/ruv.py

rename to youtube_dlc/extractor/ruv.py
diff --git a/youtube_dl/extractor/safari.py b/youtube_dlc/extractor/safari.py

similarity index 98%

rename from youtube_dl/extractor/safari.py

rename to youtube_dlc/extractor/safari.py

index bd9ee1647d47d47bfc8d8341139c2bf953ecf158..2cc66512241dbc6f65589e52d842cf70b7250ccb 100644 (file)
--- a/youtube_dl/extractor/safari.py
+++ b/youtube_dlc/extractor/safari.py
@@ -8,7 +8,6 @@
  
  from ..compat import (
      compat_parse_qs,
-    compat_str,
      compat_urlparse,
  )
  from ..utils import (
@@ -39,13 +38,13 @@ def _login(self):
              'Downloading login page')
  
          def is_logged(urlh):
-            return 'learning.oreilly.com/home/' in compat_str(urlh.geturl())
+            return 'learning.oreilly.com/home/' in urlh.geturl()
  
          if is_logged(urlh):
              self.LOGGED_IN = True
              return
  
-        redirect_url = compat_str(urlh.geturl())
+        redirect_url = urlh.geturl()
          parsed_url = compat_urlparse.urlparse(redirect_url)
          qs = compat_parse_qs(parsed_url.query)
          next_uri = compat_urlparse.urljoin(
@@ -165,7 +164,8 @@ def _real_extract(self, url):
              kaltura_session = self._download_json(
                  '%s/player/kaltura_session/?reference_id=%s' % (self._API_BASE, reference_id),
                  video_id, 'Downloading kaltura session JSON',
-                'Unable to download kaltura session JSON', fatal=False)
+                'Unable to download kaltura session JSON', fatal=False,
+                headers={'Accept': 'application/json'})
              if kaltura_session:
                  session = kaltura_session.get('session')
                  if session:
diff --git a/youtube_dl/extractor/sapo.py b/youtube_dlc/extractor/sapo.py

similarity index 100%

rename from youtube_dl/extractor/sapo.py

rename to youtube_dlc/extractor/sapo.py
diff --git a/youtube_dl/extractor/savefrom.py b/youtube_dlc/extractor/savefrom.py

similarity index 100%

rename from youtube_dl/extractor/savefrom.py

rename to youtube_dlc/extractor/savefrom.py
diff --git a/youtube_dl/extractor/sbs.py b/youtube_dlc/extractor/sbs.py

similarity index 100%

rename from youtube_dl/extractor/sbs.py

rename to youtube_dlc/extractor/sbs.py
diff --git a/youtube_dl/extractor/screencast.py b/youtube_dlc/extractor/screencast.py

similarity index 100%

rename from youtube_dl/extractor/screencast.py

rename to youtube_dlc/extractor/screencast.py
diff --git a/youtube_dl/extractor/screencastomatic.py b/youtube_dlc/extractor/screencastomatic.py

similarity index 100%

rename from youtube_dl/extractor/screencastomatic.py

rename to youtube_dlc/extractor/screencastomatic.py
diff --git a/youtube_dl/extractor/scrippsnetworks.py b/youtube_dlc/extractor/scrippsnetworks.py

similarity index 65%

rename from youtube_dl/extractor/scrippsnetworks.py

rename to youtube_dlc/extractor/scrippsnetworks.py

index 8b3275735b1638b98c9c64f8360a6498c525a8f7..b40b4c4afded1b6f9541d60b3b2d3fb5fe0c5973 100644 (file)
--- a/youtube_dl/extractor/scrippsnetworks.py
+++ b/youtube_dlc/extractor/scrippsnetworks.py
@@ -7,6 +7,7 @@
  
  from .aws import AWSIE
  from .anvato import AnvatoIE
+from .common import InfoExtractor
  from ..utils import (
      smuggle_url,
      urlencode_postdata,
@@ -102,3 +103,50 @@ def get(key):
                  'anvato:anvato_scripps_app_web_prod_0837996dbe373629133857ae9eb72e740424d80a:%s' % mcp_id,
                  {'geo_countries': ['US']}),
              AnvatoIE.ie_key(), video_id=mcp_id)
+
+
+class ScrippsNetworksIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?(?P<site>cookingchanneltv|discovery|(?:diy|food)network|hgtv|travelchannel)\.com/videos/[0-9a-z-]+-(?P<id>\d+)'
+    _TESTS = [{
+        'url': 'https://www.cookingchanneltv.com/videos/the-best-of-the-best-0260338',
+        'info_dict': {
+            'id': '0260338',
+            'ext': 'mp4',
+            'title': 'The Best of the Best',
+            'description': 'Catch a new episode of MasterChef Canada Tuedsay at 9/8c.',
+            'timestamp': 1475678834,
+            'upload_date': '20161005',
+            'uploader': 'SCNI-SCND',
+        },
+        'add_ie': ['ThePlatform'],
+    }, {
+        'url': 'https://www.diynetwork.com/videos/diy-barnwood-tablet-stand-0265790',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.foodnetwork.com/videos/chocolate-strawberry-cake-roll-7524591',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.hgtv.com/videos/cookie-decorating-101-0301929',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.travelchannel.com/videos/two-climates-one-bag-5302184',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.discovery.com/videos/guardians-of-the-glades-cooking-with-tom-cobb-5578368',
+        'only_matching': True,
+    }]
+    _ACCOUNT_MAP = {
+        'cookingchanneltv': 2433005105,
+        'discovery': 2706091867,
+        'diynetwork': 2433004575,
+        'foodnetwork': 2433005105,
+        'hgtv': 2433004575,
+        'travelchannel': 2433005739,
+    }
+    _TP_TEMPL = 'https://link.theplatform.com/s/ip77QC/media/guid/%d/%s?mbr=true'
+
+    def _real_extract(self, url):
+        site, guid = re.match(self._VALID_URL, url).groups()
+        return self.url_result(smuggle_url(
+            self._TP_TEMPL % (self._ACCOUNT_MAP[site], guid),
+            {'force_smil_url': True}), 'ThePlatform', guid)
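
Note on the new ScrippsNetworksIE above: the extractor only maps the site name captured by `_VALID_URL` to an account number and fills the ThePlatform URL template. A minimal sketch of that mapping follows, using the regex, account map and template copied from the hunk. The function name `theplatform_url` is hypothetical, and the `smuggle_url` / `url_result` wrapping is left out.

```python
import re

VALID_URL = (r'https?://(?:www\.)?(?P<site>cookingchanneltv|discovery|'
             r'(?:diy|food)network|hgtv|travelchannel)\.com/videos/'
             r'[0-9a-z-]+-(?P<id>\d+)')

ACCOUNT_MAP = {
    'cookingchanneltv': 2433005105,
    'discovery': 2706091867,
    'diynetwork': 2433004575,
    'foodnetwork': 2433005105,
    'hgtv': 2433004575,
    'travelchannel': 2433005739,
}

TP_TEMPL = 'https://link.theplatform.com/s/ip77QC/media/guid/%d/%s?mbr=true'


def theplatform_url(url):
    """Build the ThePlatform media URL for a supported page URL, else None."""
    mobj = re.match(VALID_URL, url)
    if mobj is None:
        return None
    # Named groups avoid depending on the position of the capture groups.
    site, guid = mobj.group('site', 'id')
    return TP_TEMPL % (ACCOUNT_MAP[site], guid)


result = theplatform_url(
    'https://www.cookingchanneltv.com/videos/the-best-of-the-best-0260338')
```

Because `[0-9a-z-]+` is greedy and must be followed by `-` and digits, the id is always the digit run after the last hyphen of the slug.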
diff --git a/youtube_dl/extractor/scte.py b/youtube_dlc/extractor/scte.py

similarity index 100%

rename from youtube_dl/extractor/scte.py

rename to youtube_dlc/extractor/scte.py
diff --git a/youtube_dl/extractor/seeker.py b/youtube_dlc/extractor/seeker.py

similarity index 100%

rename from youtube_dl/extractor/seeker.py

rename to youtube_dlc/extractor/seeker.py
diff --git a/youtube_dl/extractor/senateisvp.py b/youtube_dlc/extractor/senateisvp.py

similarity index 100%

rename from youtube_dl/extractor/senateisvp.py

rename to youtube_dlc/extractor/senateisvp.py
diff --git a/youtube_dl/extractor/sendtonews.py b/youtube_dlc/extractor/sendtonews.py

similarity index 100%

rename from youtube_dl/extractor/sendtonews.py

rename to youtube_dlc/extractor/sendtonews.py
diff --git a/youtube_dl/extractor/servus.py b/youtube_dlc/extractor/servus.py

similarity index 79%

rename from youtube_dl/extractor/servus.py

rename to youtube_dlc/extractor/servus.py

index e579d42cf525b56500b98c84272b67b279e36baa..9401bf2cf7fcdad2eb218f5a3d072399932fea9a 100644 (file)
--- a/youtube_dl/extractor/servus.py
+++ b/youtube_dlc/extractor/servus.py
@@ -7,9 +7,18 @@
  
  
  class ServusIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?servus\.com/(?:(?:at|de)/p/[^/]+|tv/videos)/(?P<id>[aA]{2}-\w+|\d+-\d+)'
+    _VALID_URL = r'''(?x)
+                    https?://
+                        (?:www\.)?
+                        (?:
+                            servus\.com/(?:(?:at|de)/p/[^/]+|tv/videos)|
+                            servustv\.com/videos
+                        )
+                        /(?P<id>[aA]{2}-\w+|\d+-\d+)
+                    '''
      _TESTS = [{
-        'url': 'https://www.servus.com/de/p/Die-Gr%C3%BCnen-aus-Sicht-des-Volkes/AA-1T6VBU5PW1W12/',
+        # new URL schema
+        'url': 'https://www.servustv.com/videos/aa-1t6vbu5pw1w12/',
          'md5': '3e1dd16775aa8d5cbef23628cfffc1f4',
          'info_dict': {
              'id': 'AA-1T6VBU5PW1W12',
@@ -18,6 +27,10 @@ class ServusIE(InfoExtractor):
              'description': 'md5:1247204d85783afe3682644398ff2ec4',
              'thumbnail': r're:^https?://.*\.jpg',
          }
+    }, {
+        # old URL schema
+        'url': 'https://www.servus.com/de/p/Die-Gr%C3%BCnen-aus-Sicht-des-Volkes/AA-1T6VBU5PW1W12/',
+        'only_matching': True,
      }, {
          'url': 'https://www.servus.com/at/p/Wie-das-Leben-beginnt/1309984137314-381415152/',
          'only_matching': True,
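
Note on the servus.py hunk above: `_VALID_URL` is rewritten as a verbose (`(?x)`) pattern so the old servus.com paths and the new servustv.com path can sit on separate lines. The sketch below checks the pattern from the hunk against the three test URLs with plain `re`. The helper name `match_id` is hypothetical, and it returns the id exactly as it appears in the URL; any case normalisation done by the extractor lies outside this hunk.

```python
import re

VALID_URL = r'''(?x)
    https?://
        (?:www\.)?
        (?:
            servus\.com/(?:(?:at|de)/p/[^/]+|tv/videos)|
            servustv\.com/videos
        )
        /(?P<id>[aA]{2}-\w+|\d+-\d+)
    '''


def match_id(url):
    """Return the id group for a matching URL, or None."""
    mobj = re.match(VALID_URL, url)
    return mobj.group('id') if mobj else None


new_id = match_id('https://www.servustv.com/videos/aa-1t6vbu5pw1w12/')
old_id = match_id(
    'https://www.servus.com/de/p/Die-Gr%C3%BCnen-aus-Sicht-des-Volkes/AA-1T6VBU5PW1W12/')
numeric_id = match_id(
    'https://www.servus.com/at/p/Wie-das-Leben-beginnt/1309984137314-381415152/')
```

In verbose mode unescaped whitespace and line breaks inside the pattern are ignored, so the indentation is purely for readability.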
diff --git a/youtube_dl/extractor/sevenplus.py b/youtube_dlc/extractor/sevenplus.py

similarity index 100%

rename from youtube_dl/extractor/sevenplus.py

rename to youtube_dlc/extractor/sevenplus.py
diff --git a/youtube_dl/extractor/sexu.py b/youtube_dlc/extractor/sexu.py

similarity index 100%

rename from youtube_dl/extractor/sexu.py

rename to youtube_dlc/extractor/sexu.py
diff --git a/youtube_dl/extractor/seznamzpravy.py b/youtube_dlc/extractor/seznamzpravy.py

similarity index 100%

rename from youtube_dl/extractor/seznamzpravy.py

rename to youtube_dlc/extractor/seznamzpravy.py
diff --git a/youtube_dl/extractor/shahid.py b/youtube_dlc/extractor/shahid.py

similarity index 100%

rename from youtube_dl/extractor/shahid.py

rename to youtube_dlc/extractor/shahid.py
diff --git a/youtube_dl/extractor/shared.py b/youtube_dlc/extractor/shared.py

similarity index 100%

rename from youtube_dl/extractor/shared.py

rename to youtube_dlc/extractor/shared.py
diff --git a/youtube_dl/extractor/showroomlive.py b/youtube_dlc/extractor/showroomlive.py

similarity index 100%

rename from youtube_dl/extractor/showroomlive.py

rename to youtube_dlc/extractor/showroomlive.py
diff --git a/youtube_dl/extractor/sina.py b/youtube_dlc/extractor/sina.py

similarity index 100%

rename from youtube_dl/extractor/sina.py

rename to youtube_dlc/extractor/sina.py
diff --git a/youtube_dl/extractor/sixplay.py b/youtube_dlc/extractor/sixplay.py

similarity index 100%

rename from youtube_dl/extractor/sixplay.py

rename to youtube_dlc/extractor/sixplay.py
diff --git a/youtube_dl/extractor/sky.py b/youtube_dlc/extractor/sky.py

similarity index 100%

rename from youtube_dl/extractor/sky.py

rename to youtube_dlc/extractor/sky.py
diff --git a/youtube_dl/extractor/skylinewebcams.py b/youtube_dlc/extractor/skylinewebcams.py

similarity index 100%

rename from youtube_dl/extractor/skylinewebcams.py

rename to youtube_dlc/extractor/skylinewebcams.py
diff --git a/youtube_dl/extractor/skynewsarabia.py b/youtube_dlc/extractor/skynewsarabia.py

similarity index 100%

rename from youtube_dl/extractor/skynewsarabia.py

rename to youtube_dlc/extractor/skynewsarabia.py
diff --git a/youtube_dl/extractor/slideshare.py b/youtube_dlc/extractor/slideshare.py

similarity index 100%

rename from youtube_dl/extractor/slideshare.py

rename to youtube_dlc/extractor/slideshare.py
diff --git a/youtube_dl/extractor/slideslive.py b/youtube_dlc/extractor/slideslive.py

similarity index 100%

rename from youtube_dl/extractor/slideslive.py

rename to youtube_dlc/extractor/slideslive.py
diff --git a/youtube_dl/extractor/slutload.py b/youtube_dlc/extractor/slutload.py

similarity index 100%

rename from youtube_dl/extractor/slutload.py

rename to youtube_dlc/extractor/slutload.py
diff --git a/youtube_dl/extractor/smotri.py b/youtube_dlc/extractor/smotri.py

similarity index 100%

rename from youtube_dl/extractor/smotri.py

rename to youtube_dlc/extractor/smotri.py
diff --git a/youtube_dl/extractor/snotr.py b/youtube_dlc/extractor/snotr.py

similarity index 100%

rename from youtube_dl/extractor/snotr.py

rename to youtube_dlc/extractor/snotr.py
diff --git a/youtube_dl/extractor/sohu.py b/youtube_dlc/extractor/sohu.py

similarity index 99%

rename from youtube_dl/extractor/sohu.py

rename to youtube_dlc/extractor/sohu.py

index a62ed84f1e34b75a721914d68a101ad2a0bb8983..76b3cc6b6b8ef639aa7185be60e68b3383e4c2fb 100644 (file)
--- a/youtube_dl/extractor/sohu.py
+++ b/youtube_dlc/extractor/sohu.py
@@ -77,7 +77,7 @@ class SohuIE(InfoExtractor):
          'info_dict': {
              'id': '78932792',
              'ext': 'mp4',
-            'title': 'youtube-dl testing video',
+            'title': 'youtube-dlc testing video',
          },
          'params': {
              'skip_download': True
diff --git a/youtube_dl/extractor/sonyliv.py b/youtube_dlc/extractor/sonyliv.py

similarity index 100%

rename from youtube_dl/extractor/sonyliv.py

rename to youtube_dlc/extractor/sonyliv.py
diff --git a/youtube_dl/extractor/soundcloud.py b/youtube_dlc/extractor/soundcloud.py

similarity index 73%

rename from youtube_dl/extractor/soundcloud.py

rename to youtube_dlc/extractor/soundcloud.py

index c2ee54457e90909fea418f38320d9e5480ccc379..04b70c1193b0d8aa5c74ff5cb00fe32941a88d6f 100644 (file)
--- a/youtube_dl/extractor/soundcloud.py
+++ b/youtube_dlc/extractor/soundcloud.py
@@ -3,16 +3,21 @@
  
  import itertools
  import re
+import json
+import random
  
  from .common import (
      InfoExtractor,
      SearchInfoExtractor
  )
  from ..compat import (
+    compat_HTTPError,
+    compat_kwargs,
      compat_str,
      compat_urlparse,
  )
  from ..utils import (
+    error_to_compat_str,
      ExtractorError,
      float_or_none,
      HEADRequest,
@@ -24,6 +29,8 @@
      unified_timestamp,
      update_url_query,
      url_or_none,
+    urlhandle_detect_ext,
+    sanitized_Request,
  )
  
  
@@ -93,7 +100,7 @@ class SoundcloudIE(InfoExtractor):
                  'repost_count': int,
              }
          },
-        # not streamable song
+        # geo-restricted
          {
              'url': 'https://soundcloud.com/the-concept-band/goldrushed-mastered?in=the-concept-band/sets/the-royal-concept-ep',
              'info_dict': {
@@ -105,22 +112,17 @@ class SoundcloudIE(InfoExtractor):
                  'uploader_id': '9615865',
                  'timestamp': 1337635207,
                  'upload_date': '20120521',
-                'duration': 30,
+                'duration': 227.155,
                  'license': 'all-rights-reserved',
                  'view_count': int,
                  'like_count': int,
                  'comment_count': int,
                  'repost_count': int,
              },
-            'params': {
-                # rtmp
-                'skip_download': True,
-            },
-            'skip': 'Preview',
          },
          # private link
          {
-            'url': 'https://soundcloud.com/jaimemf/youtube-dl-test-video-a-y-baw/s-8Pjrp',
+            'url': 'https://soundcloud.com/jaimemf/youtube-dlc-test-video-a-y-baw/s-8Pjrp',
              'md5': 'aa0dd32bfea9b0c5ef4f02aacd080604',
              'info_dict': {
                  'id': '123998367',
@@ -227,7 +229,6 @@ class SoundcloudIE(InfoExtractor):
                  'skip_download': True,
              },
          },
-        # not available via api.soundcloud.com/i1/tracks/id/streams
          {
              'url': 'https://soundcloud.com/giovannisarani/mezzo-valzer',
              'md5': 'e22aecd2bc88e0e4e432d7dcc0a1abf7',
@@ -236,7 +237,7 @@ class SoundcloudIE(InfoExtractor):
                  'ext': 'mp3',
                  'title': 'Mezzo Valzer',
                  'description': 'md5:4138d582f81866a530317bae316e8b61',
-                'uploader': 'Giovanni Sarani',
+                'uploader': 'Micronie',
                  'uploader_id': '3352531',
                  'timestamp': 1551394171,
                  'upload_date': '20190228',
@@ -248,14 +249,21 @@ class SoundcloudIE(InfoExtractor):
                  'comment_count': int,
                  'repost_count': int,
              },
-            'expected_warnings': ['Unable to download JSON metadata'],
-        }
+        },
+        {
+            # AAC HQ format available (account with active subscription needed)
+            'url': 'https://soundcloud.com/wandw/the-chainsmokers-ft-daya-dont-let-me-down-ww-remix-1',
+            'only_matching': True,
+        },
+        {
+            # Go+ (account with active subscription needed)
+            'url': 'https://soundcloud.com/taylorswiftofficial/look-what-you-made-me-do',
+            'only_matching': True,
+        },
      ]
  
-    _API_BASE = 'https://api.soundcloud.com/'
      _API_V2_BASE = 'https://api-v2.soundcloud.com/'
      _BASE_URL = 'https://soundcloud.com/'
-    _CLIENT_ID = 'UW9ajvMgVdMMW3cdeBi8lPfN6dvOVGji'
      _IMAGE_REPL_RE = r'-([0-9a-z]+)\.jpg'
  
      _ARTWORK_MAP = {
@@ -271,14 +279,127 @@ class SoundcloudIE(InfoExtractor):
          'original': 0,
      }
  
+    def _store_client_id(self, client_id):
+        self._downloader.cache.store('soundcloud', 'client_id', client_id)
+
+    def _update_client_id(self):
+        webpage = self._download_webpage('https://soundcloud.com/', None)
+        for src in reversed(re.findall(r'<script[^>]+src="([^"]+)"', webpage)):
+            script = self._download_webpage(src, None, fatal=False)
+            if script:
+                client_id = self._search_regex(
+                    r'client_id\s*:\s*"([0-9a-zA-Z]{32})"',
+                    script, 'client id', default=None)
+                if client_id:
+                    self._CLIENT_ID = client_id
+                    self._store_client_id(client_id)
+                    return
+        raise ExtractorError('Unable to extract client id')
+
+    def _download_json(self, *args, **kwargs):
+        non_fatal = kwargs.get('fatal') is False
+        if non_fatal:
+            del kwargs['fatal']
+        query = kwargs.get('query', {}).copy()
+        for _ in range(2):
+            query['client_id'] = self._CLIENT_ID
+            kwargs['query'] = query
+            try:
+                return super(SoundcloudIE, self)._download_json(*args, **compat_kwargs(kwargs))
+            except ExtractorError as e:
+                if isinstance(e.cause, compat_HTTPError) and e.cause.code == 401:
+                    self._store_client_id(None)
+                    self._update_client_id()
+                    continue
+                elif non_fatal:
+                    self._downloader.report_warning(error_to_compat_str(e))
+                    return False
+                raise
+
+    def _real_initialize(self):
+        self._CLIENT_ID = self._downloader.cache.load('soundcloud', 'client_id') or "T5R4kgWS2PRf6lzLyIravUMnKlbIxQag"  # 'EXLwg5lHTO2dslU5EePe3xkw0m1h86Cd' # 'YUKXoArFcqrlQn9tfNHvvyfnDISj04zk'
+        self._login()
+
+    _USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36"
+    _API_AUTH_QUERY_TEMPLATE = '?client_id=%s'
+    _API_AUTH_URL_PW = 'https://api-auth.soundcloud.com/web-auth/sign-in/password%s'
+    _access_token = None
+    _HEADERS = {}
+    _NETRC_MACHINE = 'soundcloud'
+
+    def _login(self):
+        username, password = self._get_login_info()
+        if username is None:
+            return
+
+        def genDevId():
+            def genNumBlock():
+                return ''.join([str(random.randrange(10)) for i in range(6)])
+            return '-'.join([genNumBlock() for i in range(4)])
+
+        payload = {
+            'client_id': self._CLIENT_ID,
+            'recaptcha_pubkey': 'null',
+            'recaptcha_response': 'null',
+            'credentials': {
+                'identifier': username,
+                'password': password
+            },
+            'signature': self.sign(username, password, self._CLIENT_ID),
+            'device_id': genDevId(),
+            'user_agent': self._USER_AGENT
+        }
+
+        query = self._API_AUTH_QUERY_TEMPLATE % self._CLIENT_ID
+        login = sanitized_Request(self._API_AUTH_URL_PW % query, json.dumps(payload).encode('utf-8'))
+        response = self._download_json(login, None)
+        self._access_token = (response.get('session') or {}).get('access_token')
+        if not self._access_token:
+            self.report_warning('Unable to get access token, login may have failed')
+        else:
+            self._HEADERS = {'Authorization': 'OAuth ' + self._access_token}
+
+    # signature generation
+    def sign(self, user, pw, clid):
+        a = 33
+        i = 1
+        s = 440123
+        w = 117
+        u = 1800000
+        l = 1042
+        b = 37
+        k = 37
+        c = 5
+        n = "0763ed7314c69015fd4a0dc16bbf4b90"  # _KEY
+        y = "8"  # _REV
+        r = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36"  # _USER_AGENT
+        e = user  # _USERNAME
+        t = clid  # _CLIENT_ID
+
+        d = '-'.join([str(mInt) for mInt in [a, i, s, w, u, l, b, k]])
+        p = n + y + d + r + e + t + d + n
+        h = p
+
+        m = 8011470
+        f = 0
+
+        for f in range(f, len(h)):
+            m = (m >> 1) + ((1 & m) << 23)
+            m += ord(h[f])
+            m &= 16777215
+
+        # c is not even needed
+        out = str(y) + ':' + str(d) + ':' + format(m, 'x') + ':' + str(c)
+
+        return out
+
      @classmethod
      def _resolv_url(cls, url):
-        return SoundcloudIE._API_V2_BASE + 'resolve?url=' + url + '&client_id=' + cls._CLIENT_ID
+        return SoundcloudIE._API_V2_BASE + 'resolve?url=' + url
  
-    def _extract_info_dict(self, info, full_title=None, secret_token=None, version=2):
+    def _extract_info_dict(self, info, full_title=None, secret_token=None):
          track_id = compat_str(info['id'])
          title = info['title']
-        track_base_url = self._API_BASE + 'tracks/%s' % track_id
  
          format_urls = set()
          formats = []
@@ -287,26 +408,27 @@ def _extract_info_dict(self, info, full_title=None, secret_token=None, version=2
              query['secret_token'] = secret_token
  
          if info.get('downloadable') and info.get('has_downloads_left'):
-            format_url = update_url_query(
-                info.get('download_url') or track_base_url + '/download', query)
-            format_urls.add(format_url)
-            if version == 2:
-                v1_info = self._download_json(
-                    track_base_url, track_id, query=query, fatal=False) or {}
-            else:
-                v1_info = info
-            formats.append({
-                'format_id': 'download',
-                'ext': v1_info.get('original_format') or 'mp3',
-                'filesize': int_or_none(v1_info.get('original_content_size')),
-                'url': format_url,
-                'preference': 10,
-            })
+            download_url = update_url_query(
+                self._API_V2_BASE + 'tracks/' + track_id + '/download', query)
+            redirect_url = (self._download_json(download_url, track_id, fatal=False) or {}).get('redirectUri')
+            if redirect_url:
+                urlh = self._request_webpage(
+                    HEADRequest(redirect_url), track_id, fatal=False)
+                if urlh:
+                    format_url = urlh.geturl()
+                    format_urls.add(format_url)
+                    formats.append({
+                        'format_id': 'download',
+                        'ext': urlhandle_detect_ext(urlh) or 'mp3',
+                        'filesize': int_or_none(urlh.headers.get('Content-Length')),
+                        'url': format_url,
+                        'preference': 10,
+                    })
  
          def invalid_url(url):
-            return not url or url in format_urls or re.search(r'/(?:preview|playlist)/0/30/', url)
+            return not url or url in format_urls
  
-        def add_format(f, protocol):
+        def add_format(f, protocol, is_preview=False):
              mobj = re.search(r'\.(?P<abr>\d+)\.(?P<ext>[0-9a-z]{3,4})(?=[/?])', stream_url)
              if mobj:
                  for k, v in mobj.groupdict().items():
@@ -315,16 +437,27 @@ def add_format(f, protocol):
              format_id_list = []
              if protocol:
                  format_id_list.append(protocol)
+            ext = f.get('ext')
+            if ext == 'aac':
+                f['abr'] = '256'
              for k in ('ext', 'abr'):
                  v = f.get(k)
                  if v:
                      format_id_list.append(v)
+            preview = is_preview or re.search(r'/(?:preview|playlist)/0/30/', f['url'])
+            if preview:
+                format_id_list.append('preview')
              abr = f.get('abr')
              if abr:
                  f['abr'] = int(abr)
+            if protocol == 'hls':
+                protocol = 'm3u8' if ext == 'aac' else 'm3u8_native'
+            else:
+                protocol = 'http'
              f.update({
                  'format_id': '_'.join(format_id_list),
-                'protocol': 'm3u8_native' if protocol == 'hls' else 'http',
+                'protocol': protocol,
+                'preference': -10 if preview else None,
              })
              formats.append(f)
  
@@ -335,10 +468,10 @@ def add_format(f, protocol):
              if not isinstance(t, dict):
                  continue
              format_url = url_or_none(t.get('url'))
-            if not format_url or t.get('snipped') or '/preview/' in format_url:
+            if not format_url:
                  continue
              stream = self._download_json(
-                format_url, track_id, query=query, fatal=False)
+                format_url, track_id, query=query, fatal=False, headers=self._HEADERS)
              if not isinstance(stream, dict):
                  continue
              stream_url = url_or_none(stream.get('url'))
@@ -358,44 +491,14 @@ def add_format(f, protocol):
              add_format({
                  'url': stream_url,
                  'ext': ext,
-            }, 'http' if protocol == 'progressive' else protocol)
-
-        if not formats:
-            # Old API, does not work for some tracks (e.g.
-            # https://soundcloud.com/giovannisarani/mezzo-valzer)
-            # and might serve preview URLs (e.g.
-            # http://www.soundcloud.com/snbrn/ele)
-            format_dict = self._download_json(
-                track_base_url + '/streams', track_id,
-                'Downloading track url', query=query, fatal=False) or {}
-
-            for key, stream_url in format_dict.items():
-                if invalid_url(stream_url):
-                    continue
-                format_urls.add(stream_url)
-                mobj = re.search(r'(http|hls)_([^_]+)_(\d+)_url', key)
-                if mobj:
-                    protocol, ext, abr = mobj.groups()
-                    add_format({
-                        'abr': abr,
-                        'ext': ext,
-                        'url': stream_url,
-                    }, protocol)
-
-        if not formats:
-            # We fallback to the stream_url in the original info, this
-            # cannot be always used, sometimes it can give an HTTP 404 error
-            urlh = self._request_webpage(
-                HEADRequest(info.get('stream_url') or track_base_url + '/stream'),
-                track_id, query=query, fatal=False)
-            if urlh:
-                stream_url = urlh.geturl()
-                if not invalid_url(stream_url):
-                    add_format({'url': stream_url}, 'http')
+            }, 'http' if protocol == 'progressive' else protocol,
+                t.get('snipped') or '/preview/' in format_url)
  
          for f in formats:
              f['vcodec'] = 'none'
  
+        if not formats and info.get('policy') == 'BLOCK':
+            self.raise_geo_restricted()
          self._sort_formats(formats)
  
          user = info.get('user') or {}
@@ -451,9 +554,7 @@ def _real_extract(self, url):
  
          track_id = mobj.group('track_id')
  
-        query = {
-            'client_id': self._CLIENT_ID,
-        }
+        query = {}
          if track_id:
              info_json_url = self._API_V2_BASE + 'tracks/' + track_id
              full_title = track_id
@@ -467,20 +568,24 @@ def _real_extract(self, url):
                  resolve_title += '/%s' % token
              info_json_url = self._resolv_url(self._BASE_URL + resolve_title)
  
-        version = 2
          info = self._download_json(
-            info_json_url, full_title, 'Downloading info JSON', query=query, fatal=False)
-        if not info:
-            info = self._download_json(
-                info_json_url.replace(self._API_V2_BASE, self._API_BASE),
-                full_title, 'Downloading info JSON', query=query)
-            version = 1
+            info_json_url, full_title, 'Downloading info JSON', query=query, headers=self._HEADERS)
  
-        return self._extract_info_dict(info, full_title, token, version)
+        return self._extract_info_dict(info, full_title, token)
  
  
  class SoundcloudPlaylistBaseIE(SoundcloudIE):
-    def _extract_track_entries(self, tracks, token=None):
+    def _extract_set(self, playlist, token=None):
+        playlist_id = compat_str(playlist['id'])
+        tracks = playlist.get('tracks') or []
+        if not all([t.get('permalink_url') for t in tracks]) and token:
+            tracks = self._download_json(
+                self._API_V2_BASE + 'tracks', playlist_id,
+                'Downloading tracks', query={
+                    'ids': ','.join([compat_str(t['id']) for t in tracks]),
+                    'playlistId': playlist_id,
+                    'playlistSecretToken': token,
+                }, headers=self._HEADERS)
          entries = []
          for track in tracks:
              track_id = str_or_none(track.get('id'))
@@ -493,22 +598,35 @@ def _extract_track_entries(self, tracks, token=None):
                      url += '?secret_token=' + token
              entries.append(self.url_result(
                  url, SoundcloudIE.ie_key(), track_id))
-        return entries
+        return self.playlist_result(
+            entries, playlist_id,
+            playlist.get('title'),
+            playlist.get('description'))
  
  
  class SoundcloudSetIE(SoundcloudPlaylistBaseIE):
-    _VALID_URL = r'https?://(?:(?:www|m)\.)?soundcloud\.com/(?P<uploader>[\w\d-]+)/sets/(?P<slug_title>[\w\d-]+)(?:/(?P<token>[^?/]+))?'
+    _VALID_URL = r'https?://(?:(?:www|m)\.)?soundcloud\.com/(?P<uploader>[\w\d-]+)/sets/(?P<slug_title>[:\w\d-]+)(?:/(?P<token>[^?/]+))?'
      IE_NAME = 'soundcloud:set'
      _TESTS = [{
          'url': 'https://soundcloud.com/the-concept-band/sets/the-royal-concept-ep',
          'info_dict': {
              'id': '2284613',
              'title': 'The Royal Concept EP',
+            'description': 'md5:71d07087c7a449e8941a70a29e34671e',
          },
          'playlist_mincount': 5,
      }, {
          'url': 'https://soundcloud.com/the-concept-band/sets/the-royal-concept-ep/token',
          'only_matching': True,
+    }, {
+        'url': 'https://soundcloud.com/discover/sets/weekly::flacmatic',
+        'only_matching': True,
+    }, {
+        'url': 'https://soundcloud.com/discover/sets/charts-top:all-music:de',
+        'only_matching': True,
+    }, {
+        'url': 'https://soundcloud.com/discover/sets/charts-top:hiphoprap:kr',
+        'only_matching': True,
      }]
  
      def _real_extract(self, url):
@@ -520,23 +638,19 @@ def _real_extract(self, url):
              full_title += '/' + token
  
          info = self._download_json(self._resolv_url(
-            self._BASE_URL + full_title), full_title)
+            self._BASE_URL + full_title), full_title, headers=self._HEADERS)
  
          if 'errors' in info:
              msgs = (compat_str(err['error_message']) for err in info['errors'])
              raise ExtractorError('unable to download video webpage: %s' % ','.join(msgs))
  
-        entries = self._extract_track_entries(info['tracks'], token)
-
-        return self.playlist_result(
-            entries, str_or_none(info.get('id')), info.get('title'))
+        return self._extract_set(info, token)
  
  
-class SoundcloudPagedPlaylistBaseIE(SoundcloudPlaylistBaseIE):
+class SoundcloudPagedPlaylistBaseIE(SoundcloudIE):
      def _extract_playlist(self, base_url, playlist_id, playlist_title):
          COMMON_QUERY = {
-            'limit': 2000000000,
-            'client_id': self._CLIENT_ID,
+            'limit': 200,
              'linked_partitioning': '1',
          }
  
@@ -549,7 +663,7 @@ def _extract_playlist(self, base_url, playlist_id, playlist_title):
          for i in itertools.count():
              response = self._download_json(
                  next_href, playlist_id,
-                'Downloading track page %s' % (i + 1), query=query)
+                'Downloading track page %s' % (i + 1), query=query, headers=self._HEADERS)
  
              collection = response['collection']
  
@@ -671,7 +785,7 @@ def _real_extract(self, url):
  
          user = self._download_json(
              self._resolv_url(self._BASE_URL + uploader),
-            uploader, 'Downloading user info')
+            uploader, 'Downloading user info', headers=self._HEADERS)
  
          resource = mobj.group('rsrc') or 'all'
  
@@ -696,7 +810,7 @@ class SoundcloudTrackStationIE(SoundcloudPagedPlaylistBaseIE):
      def _real_extract(self, url):
          track_name = self._match_id(url)
  
-        track = self._download_json(self._resolv_url(url), track_name)
+        track = self._download_json(self._resolv_url(url), track_name, headers=self._HEADERS)
          track_id = self._search_regex(
              r'soundcloud:track-stations:(\d+)', track['id'], 'track id')
  
@@ -722,21 +836,16 @@ def _real_extract(self, url):
          mobj = re.match(self._VALID_URL, url)
          playlist_id = mobj.group('id')
  
-        query = {
-            'client_id': self._CLIENT_ID,
-        }
+        query = {}
          token = mobj.group('token')
          if token:
              query['secret_token'] = token
  
          data = self._download_json(
              self._API_V2_BASE + 'playlists/' + playlist_id,
-            playlist_id, 'Downloading playlist', query=query)
-
-        entries = self._extract_track_entries(data['tracks'], token)
+            playlist_id, 'Downloading playlist', query=query, headers=self._HEADERS)
  
-        return self.playlist_result(
-            entries, playlist_id, data.get('title'), data.get('description'))
+        return self._extract_set(data, token)
  
  
  class SoundcloudSearchIE(SearchInfoExtractor, SoundcloudIE):
@@ -761,7 +870,6 @@ def _get_collection(self, endpoint, collection_id, **query):
              self._MAX_RESULTS_PER_PAGE)
          query.update({
              'limit': limit,
-            'client_id': self._CLIENT_ID,
              'linked_partitioning': 1,
              'offset': 0,
          })
@@ -772,7 +880,7 @@ def _get_collection(self, endpoint, collection_id, **query):
          for i in itertools.count(1):
              response = self._download_json(
                  next_url, collection_id, 'Downloading page {0}'.format(i),
-                'Unable to download API page')
+                'Unable to download API page', headers=self._HEADERS)
  
              collection = response.get('collection', [])
              if not collection:
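Note on sign() in the soundcloud.py hunk above: the loop over h is a 24-bit rotate-right-and-add checksum whose result is emitted in hex as the third field of the signature. The sketch below isolates only that arithmetic, assuming nothing beyond what the diff shows. The name rolling_checksum and its seed parameter are hypothetical and are not part of the extractor.

```python
def rolling_checksum(text, seed=8011470):
    """Hypothetical helper mirroring the loop in sign(): 24-bit rotate right, then add."""
    m = seed
    for ch in text:
        # Rotate the 24-bit state right by one bit (bit 0 wraps around to bit 23).
        m = (m >> 1) + ((m & 1) << 23)
        # Add the character code and truncate back to 24 bits.
        m = (m + ord(ch)) & 0xFFFFFF
    return m


if __name__ == '__main__':
    # sign() formats the final state with format(m, 'x').
    print(format(rolling_checksum('example input'), 'x'))
```

Because the state is masked to 24 bits on every step, the hex field is at most six digits long regardless of input length.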
diff --git a/youtube_dl/extractor/soundgasm.py b/youtube_dlc/extractor/soundgasm.py

similarity index 100%

rename from youtube_dl/extractor/soundgasm.py

rename to youtube_dlc/extractor/soundgasm.py
diff --git a/youtube_dl/extractor/southpark.py b/youtube_dlc/extractor/southpark.py

similarity index 100%

rename from youtube_dl/extractor/southpark.py

rename to youtube_dlc/extractor/southpark.py
diff --git a/youtube_dl/extractor/spankbang.py b/youtube_dlc/extractor/spankbang.py

similarity index 83%

rename from youtube_dl/extractor/spankbang.py

rename to youtube_dlc/extractor/spankbang.py

index e040ada29b24542582f72f08f31b843d928af251..61ca902ce286e6274c7d5776bd10c265a023643f 100644 (file)
--- a/youtube_dl/extractor/spankbang.py
+++ b/youtube_dlc/extractor/spankbang.py
@@ -4,6 +4,7 @@
  
  from .common import InfoExtractor
  from ..utils import (
+    determine_ext,
      ExtractorError,
      merge_dicts,
      orderedSet,
@@ -64,7 +65,7 @@ def _real_extract(self, url):
              url.replace('/%s/embed' % video_id, '/%s/video' % video_id),
              video_id, headers={'Cookie': 'country=US'})
  
-        if re.search(r'<[^>]+\bid=["\']video_removed', webpage):
+        if re.search(r'<[^>]+\b(?:id|class)=["\']video_removed', webpage):
              raise ExtractorError(
                  'Video %s is not available' % video_id, expected=True)
  
@@ -75,11 +76,20 @@ def extract_format(format_id, format_url):
              if not f_url:
                  return
              f = parse_resolution(format_id)
-            f.update({
-                'url': f_url,
-                'format_id': format_id,
-            })
-            formats.append(f)
+            ext = determine_ext(f_url)
+            if format_id.startswith('m3u8') or ext == 'm3u8':
+                formats.extend(self._extract_m3u8_formats(
+                    f_url, video_id, 'mp4', entry_protocol='m3u8_native',
+                    m3u8_id='hls', fatal=False))
+            elif format_id.startswith('mpd') or ext == 'mpd':
+                formats.extend(self._extract_mpd_formats(
+                    f_url, video_id, mpd_id='dash', fatal=False))
+            elif ext == 'mp4' or f.get('width') or f.get('height'):
+                f.update({
+                    'url': f_url,
+                    'format_id': format_id,
+                })
+                formats.append(f)
  
          STREAM_URL_PREFIX = 'stream_url_'
  
@@ -93,28 +103,22 @@ def extract_format(format_id, format_url):
                  r'data-streamkey\s*=\s*(["\'])(?P<value>(?:(?!\1).)+)\1',
                  webpage, 'stream key', group='value')
  
-            sb_csrf_session = self._get_cookies(
-                'https://spankbang.com')['sb_csrf_session'].value
-
              stream = self._download_json(
                  'https://spankbang.com/api/videos/stream', video_id,
                  'Downloading stream JSON', data=urlencode_postdata({
                      'id': stream_key,
                      'data': 0,
-                    'sb_csrf_session': sb_csrf_session,
                  }), headers={
                      'Referer': url,
-                    'X-CSRFToken': sb_csrf_session,
+                    'X-Requested-With': 'XMLHttpRequest',
                  })
  
              for format_id, format_url in stream.items():
-                if format_id.startswith(STREAM_URL_PREFIX):
-                    if format_url and isinstance(format_url, list):
-                        format_url = format_url[0]
-                    extract_format(
-                        format_id[len(STREAM_URL_PREFIX):], format_url)
+                if format_url and isinstance(format_url, list):
+                    format_url = format_url[0]
+                extract_format(format_id, format_url)
  
-        self._sort_formats(formats)
+        self._sort_formats(formats, field_preference=('preference', 'height', 'width', 'fps', 'tbr', 'format_id'))
  
          info = self._search_json_ld(webpage, video_id, default={})
  
diff --git a/youtube_dlc/extractor/spankwire.py b/youtube_dlc/extractor/spankwire.py

new file mode 100644 (file)

index 0000000..35ab9ec
--- /dev/null
+++ b/youtube_dlc/extractor/spankwire.py
@@ -0,0 +1,182 @@
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..utils import (
+    float_or_none,
+    int_or_none,
+    merge_dicts,
+    str_or_none,
+    str_to_int,
+    url_or_none,
+)
+
+
+class SpankwireIE(InfoExtractor):
+    _VALID_URL = r'''(?x)
+                    https?://
+                        (?:www\.)?spankwire\.com/
+                        (?:
+                            [^/]+/video|
+                            EmbedPlayer\.aspx/?\?.*?\bArticleId=
+                        )
+                        (?P<id>\d+)
+                    '''
+    _TESTS = [{
+        # download URL pattern: */<height>P_<tbr>K_<video_id>.mp4
+        'url': 'http://www.spankwire.com/Buckcherry-s-X-Rated-Music-Video-Crazy-Bitch/video103545/',
+        'md5': '5aa0e4feef20aad82cbcae3aed7ab7cd',
+        'info_dict': {
+            'id': '103545',
+            'ext': 'mp4',
+            'title': 'Buckcherry`s X Rated Music Video Crazy Bitch',
+            'description': 'Crazy Bitch X rated music video.',
+            'duration': 222,
+            'uploader': 'oreusz',
+            'uploader_id': '124697',
+            'timestamp': 1178587885,
+            'upload_date': '20070508',
+            'average_rating': float,
+            'view_count': int,
+            'comment_count': int,
+            'age_limit': 18,
+            'categories': list,
+            'tags': list,
+        },
+    }, {
+        # download URL pattern: */mp4_<format_id>_<video_id>.mp4
+        'url': 'http://www.spankwire.com/Titcums-Compiloation-I/video1921551/',
+        'md5': '09b3c20833308b736ae8902db2f8d7e6',
+        'info_dict': {
+            'id': '1921551',
+            'ext': 'mp4',
+            'title': 'Titcums Compiloation I',
+            'description': 'cum on tits',
+            'uploader': 'dannyh78999',
+            'uploader_id': '3056053',
+            'upload_date': '20150822',
+            'age_limit': 18,
+        },
+        'params': {
+            'proxy': '127.0.0.1:8118'
+        },
+        'skip': 'removed',
+    }, {
+        'url': 'https://www.spankwire.com/EmbedPlayer.aspx/?ArticleId=156156&autostart=true',
+        'only_matching': True,
+    }]
+
+    @staticmethod
+    def _extract_urls(webpage):
+        return re.findall(
+            r'<iframe[^>]+\bsrc=["\']((?:https?:)?//(?:www\.)?spankwire\.com/EmbedPlayer\.aspx/?\?.*?\bArticleId=\d+)',
+            webpage)
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        video = self._download_json(
+            'https://www.spankwire.com/api/video/%s.json' % video_id, video_id)
+
+        title = video['title']
+
+        formats = []
+        videos = video.get('videos')
+        if isinstance(videos, dict):
+            for format_id, format_url in videos.items():
+                video_url = url_or_none(format_url)
+                if not video_url:
+                    continue
+                height = int_or_none(self._search_regex(
+                    r'(\d+)[pP]', format_id, 'height', default=None))
+                m = re.search(
+                    r'/(?P<height>\d+)[pP]_(?P<tbr>\d+)[kK]', video_url)
+                if m:
+                    tbr = int(m.group('tbr'))
+                    height = height or int(m.group('height'))
+                else:
+                    tbr = None
+                formats.append({
+                    'url': video_url,
+                    'format_id': '%dp' % height if height else format_id,
+                    'height': height,
+                    'tbr': tbr,
+                })
+        m3u8_url = url_or_none(video.get('HLS'))
+        if m3u8_url:
+            formats.extend(self._extract_m3u8_formats(
+                m3u8_url, video_id, 'mp4', entry_protocol='m3u8_native',
+                m3u8_id='hls', fatal=False))
+        self._sort_formats(formats, ('height', 'tbr', 'width', 'format_id'))
+
+        view_count = str_to_int(video.get('viewed'))
+
+        thumbnails = []
+        for preference, t in enumerate(('', '2x'), start=0):
+            thumbnail_url = url_or_none(video.get('poster%s' % t))
+            if not thumbnail_url:
+                continue
+            thumbnails.append({
+                'url': thumbnail_url,
+                'preference': preference,
+            })
+
+        def extract_names(key):
+            entries_list = video.get(key)
+            if not isinstance(entries_list, list):
+                return
+            entries = []
+            for entry in entries_list:
+                name = str_or_none(entry.get('name'))
+                if name:
+                    entries.append(name)
+            return entries
+
+        categories = extract_names('categories')
+        tags = extract_names('tags')
+
+        uploader = None
+        info = {}
+
+        webpage = self._download_webpage(
+            'https://www.spankwire.com/_/video%s/' % video_id, video_id,
+            fatal=False)
+        if webpage:
+            info = self._search_json_ld(webpage, video_id, default={})
+            thumbnail_url = None
+            if 'thumbnail' in info:
+                thumbnail_url = url_or_none(info['thumbnail'])
+                del info['thumbnail']
+            if not thumbnail_url:
+                thumbnail_url = self._og_search_thumbnail(webpage)
+            if thumbnail_url:
+                thumbnails.append({
+                    'url': thumbnail_url,
+                    'preference': 10,
+                })
+            uploader = self._html_search_regex(
+                r'(?s)by\s*<a[^>]+\bclass=["\']uploaded__by[^>]*>(.+?)</a>',
+                webpage, 'uploader', fatal=False)
+            if not view_count:
+                view_count = str_to_int(self._search_regex(
+                    r'data-views=["\']([\d,.]+)', webpage, 'view count',
+                    fatal=False))
+
+        return merge_dicts({
+            'id': video_id,
+            'title': title,
+            'description': video.get('description'),
+            'duration': int_or_none(video.get('duration')),
+            'thumbnails': thumbnails,
+            'uploader': uploader,
+            'uploader_id': str_or_none(video.get('userId')),
+            'timestamp': int_or_none(video.get('time_approved_on')),
+            'average_rating': float_or_none(video.get('rating')),
+            'view_count': view_count,
+            'comment_count': int_or_none(video.get('comments')),
+            'age_limit': 18,
+            'categories': categories,
+            'tags': tags,
+            'formats': formats,
+        }, info)
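Note on the format loop in spankwire.py above: height is taken from the format id first, and the download URL is consulted for the bitrate and as a fallback for height. The sketch below reproduces just that parsing with the two regular expressions from the diff. The function name parse_height_tbr and the example.invalid URLs are hypothetical, chosen only to match the "<height>P_<tbr>K_<video_id>.mp4" pattern mentioned in the test comments.

```python
import re


def parse_height_tbr(format_id, video_url):
    """Hypothetical helper: return (height, tbr) the way the format loop derives them."""
    # Height from the format id, e.g. "quality_720p" gives 720.
    m = re.search(r'(\d+)[pP]', format_id)
    height = int(m.group(1)) if m else None

    # Bitrate (and fallback height) from a ".../<height>P_<tbr>K_..." URL.
    tbr = None
    m = re.search(r'/(?P<height>\d+)[pP]_(?P<tbr>\d+)[kK]', video_url)
    if m:
        tbr = int(m.group('tbr'))
        height = height or int(m.group('height'))
    return height, tbr


if __name__ == '__main__':
    print(parse_height_tbr('quality_720p', 'https://example.invalid/v/720P_4000K_103545.mp4'))
    print(parse_height_tbr('hd', 'https://example.invalid/v/480P_600K_1.mp4'))
```

URLs in the second pattern from the tests ("mp4_<format_id>_<video_id>.mp4") match neither expression, so both values stay None and the extractor falls back to the raw format id.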
diff --git a/youtube_dl/extractor/spiegel.py b/youtube_dlc/extractor/spiegel.py

similarity index 100%

rename from youtube_dl/extractor/spiegel.py

rename to youtube_dlc/extractor/spiegel.py
diff --git a/youtube_dl/extractor/spiegeltv.py b/youtube_dlc/extractor/spiegeltv.py

similarity index 100%

rename from youtube_dl/extractor/spiegeltv.py

rename to youtube_dlc/extractor/spiegeltv.py
diff --git a/youtube_dl/extractor/spike.py b/youtube_dlc/extractor/spike.py

similarity index 85%

rename from youtube_dl/extractor/spike.py

rename to youtube_dlc/extractor/spike.py

index 7c11ea7aaf9306181fee00adab6080d5c40ac1d7..aabff7a3ce78d76b9aec3772a4e88a5bf802d192 100644 (file)
--- a/youtube_dl/extractor/spike.py
+++ b/youtube_dlc/extractor/spike.py
@@ -8,15 +8,10 @@ class BellatorIE(MTVServicesInfoExtractor):
      _TESTS = [{
          'url': 'http://www.bellator.com/fight/atwr7k/bellator-158-michael-page-vs-evangelista-cyborg',
          'info_dict': {
-            'id': 'b55e434e-fde1-4a98-b7cc-92003a034de4',
-            'ext': 'mp4',
-            'title': 'Douglas Lima vs. Paul Daley - Round 1',
-            'description': 'md5:805a8dd29310fd611d32baba2f767885',
-        },
-        'params': {
-            # m3u8 download
-            'skip_download': True,
+            'title': 'Michael Page vs. Evangelista Cyborg',
+            'description': 'md5:0d917fc00ffd72dd92814963fc6cbb05',
          },
+        'playlist_count': 3,
      }, {
          'url': 'http://www.bellator.com/video-clips/bw6k7n/bellator-158-foundations-michael-venom-page',
          'only_matching': True,
@@ -25,6 +20,9 @@ class BellatorIE(MTVServicesInfoExtractor):
      _FEED_URL = 'http://www.bellator.com/feeds/mrss/'
      _GEO_COUNTRIES = ['US']
  
+    def _extract_mgid(self, webpage):
+        return self._extract_triforce_mgid(webpage)
+
  
  class ParamountNetworkIE(MTVServicesInfoExtractor):
      _VALID_URL = r'https?://(?:www\.)?paramountnetwork\.com/[^/]+/[\da-z]{6}(?:[/?#&]|$)'
diff --git a/youtube_dl/extractor/sport5.py b/youtube_dlc/extractor/sport5.py

similarity index 100%

rename from youtube_dl/extractor/sport5.py

rename to youtube_dlc/extractor/sport5.py
diff --git a/youtube_dl/extractor/sportbox.py b/youtube_dlc/extractor/sportbox.py

similarity index 100%

rename from youtube_dl/extractor/sportbox.py

rename to youtube_dlc/extractor/sportbox.py
diff --git a/youtube_dl/extractor/sportdeutschland.py b/youtube_dlc/extractor/sportdeutschland.py

similarity index 61%

rename from youtube_dl/extractor/sportdeutschland.py

rename to youtube_dlc/extractor/sportdeutschland.py

index a3c35a899a2186f1e937771cd0e34df408b2d361..378fc75686313f92a846aaa30579049e9a29eccc 100644 (file)
--- a/youtube_dl/extractor/sportdeutschland.py
+++ b/youtube_dlc/extractor/sportdeutschland.py
@@ -13,36 +13,18 @@
  class SportDeutschlandIE(InfoExtractor):
      _VALID_URL = r'https?://sportdeutschland\.tv/(?P<sport>[^/?#]+)/(?P<id>[^?#/]+)(?:$|[?#])'
      _TESTS = [{
-        'url': 'http://sportdeutschland.tv/badminton/live-li-ning-badminton-weltmeisterschaft-2014-kopenhagen',
+        'url': 'https://sportdeutschland.tv/badminton/re-live-deutsche-meisterschaften-2020-halbfinals?playlistId=0',
          'info_dict': {
-            'id': 'live-li-ning-badminton-weltmeisterschaft-2014-kopenhagen',
+            'id': 're-live-deutsche-meisterschaften-2020-halbfinals',
              'ext': 'mp4',
-            'title': 're:Li-Ning Badminton Weltmeisterschaft 2014 Kopenhagen',
-            'categories': ['Badminton'],
+            'title': 're:Re-live: Deutsche Meisterschaften 2020.*Halbfinals',
+            'categories': ['Badminton-Deutschland'],
              'view_count': int,
-            'thumbnail': r're:^https?://.*\.jpg$',
-            'description': r're:Die Badminton-WM 2014 aus Kopenhagen bei Sportdeutschland\.TV',
+            'thumbnail': r're:^https?://.*\.(?:jpg|png)$',
              'timestamp': int,
-            'upload_date': 're:^201408[23][0-9]$',
+            'upload_date': '20200201',
+            'description': 're:.*',  # meaningless description for THIS video
          },
-        'params': {
-            'skip_download': 'Live stream',
-        },
-    }, {
-        'url': 'http://sportdeutschland.tv/li-ning-badminton-wm-2014/lee-li-ning-badminton-weltmeisterschaft-2014-kopenhagen-herren-einzel-wei-vs',
-        'info_dict': {
-            'id': 'lee-li-ning-badminton-weltmeisterschaft-2014-kopenhagen-herren-einzel-wei-vs',
-            'ext': 'mp4',
-            'upload_date': '20140825',
-            'description': 'md5:60a20536b57cee7d9a4ec005e8687504',
-            'timestamp': 1408976060,
-            'duration': 2732,
-            'title': 'Li-Ning Badminton Weltmeisterschaft 2014 Kopenhagen: Herren Einzel, Wei Lee vs. Keun Lee',
-            'thumbnail': r're:^https?://.*\.jpg$',
-            'view_count': int,
-            'categories': ['Li-Ning Badminton WM 2014'],
-
-        }
      }]
  
      def _real_extract(self, url):
@@ -50,7 +32,7 @@ def _real_extract(self, url):
          video_id = mobj.group('id')
          sport_id = mobj.group('sport')
  
-        api_url = 'http://proxy.vidibusdynamic.net/sportdeutschland.tv/api/permalinks/%s/%s?access_token=true' % (
+        api_url = 'https://proxy.vidibusdynamic.net/ssl/backend.sportdeutschland.tv/api/permalinks/%s/%s?access_token=true' % (
              sport_id, video_id)
          req = sanitized_Request(api_url, headers={
              'Accept': 'application/vnd.vidibus.v2.html+json',
diff --git a/youtube_dl/extractor/springboardplatform.py b/youtube_dlc/extractor/springboardplatform.py

similarity index 100%

rename from youtube_dl/extractor/springboardplatform.py

rename to youtube_dlc/extractor/springboardplatform.py
diff --git a/youtube_dl/extractor/sprout.py b/youtube_dlc/extractor/sprout.py

similarity index 100%

rename from youtube_dl/extractor/sprout.py

rename to youtube_dlc/extractor/sprout.py
diff --git a/youtube_dl/extractor/srgssr.py b/youtube_dlc/extractor/srgssr.py

similarity index 100%

rename from youtube_dl/extractor/srgssr.py

rename to youtube_dlc/extractor/srgssr.py
diff --git a/youtube_dl/extractor/srmediathek.py b/youtube_dlc/extractor/srmediathek.py

similarity index 96%

rename from youtube_dl/extractor/srmediathek.py

rename to youtube_dlc/extractor/srmediathek.py

index 28baf901c9f021c15544f099f78dd5d5a6b9165c..359dadaa3cce4540f5abb2fb58a43596379e0c56 100644 (file)
--- a/youtube_dl/extractor/srmediathek.py
+++ b/youtube_dlc/extractor/srmediathek.py
@@ -1,14 +1,14 @@
  # coding: utf-8
  from __future__ import unicode_literals
  
-from .ard import ARDMediathekIE
+from .ard import ARDMediathekBaseIE
  from ..utils import (
      ExtractorError,
      get_element_by_attribute,
  )
  
  
-class SRMediathekIE(ARDMediathekIE):
+class SRMediathekIE(ARDMediathekBaseIE):
      IE_NAME = 'sr:mediathek'
      IE_DESC = 'Saarländischer Rundfunk'
      _VALID_URL = r'https?://sr-mediathek(?:\.sr-online)?\.de/index\.php\?.*?&id=(?P<id>[0-9]+)'
diff --git a/youtube_dl/extractor/stanfordoc.py b/youtube_dlc/extractor/stanfordoc.py

similarity index 100%

rename from youtube_dl/extractor/stanfordoc.py

rename to youtube_dlc/extractor/stanfordoc.py
diff --git a/youtube_dl/extractor/steam.py b/youtube_dlc/extractor/steam.py

similarity index 100%

rename from youtube_dl/extractor/steam.py

rename to youtube_dlc/extractor/steam.py
diff --git a/youtube_dl/extractor/stitcher.py b/youtube_dlc/extractor/stitcher.py

similarity index 100%

rename from youtube_dl/extractor/stitcher.py

rename to youtube_dlc/extractor/stitcher.py
diff --git a/youtube_dlc/extractor/storyfire.py b/youtube_dlc/extractor/storyfire.py

new file mode 100644 (file)

index 0000000..67457cc
--- /dev/null
+++ b/youtube_dlc/extractor/storyfire.py
@@ -0,0 +1,255 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import itertools
+from .common import InfoExtractor
+
+
+class StoryFireIE(InfoExtractor):
+    _VALID_URL = r'(?:(?:https?://(?:www\.)?storyfire\.com/video-details)|(?:https://storyfire\.app\.link))/(?P<id>[^/\s]+)'
+    _TESTS = [{
+        'url': 'https://storyfire.com/video-details/5df1d132b6378700117f9181',
+        'md5': '560953bfca81a69003cfa5e53ac8a920',
+        'info_dict': {
+            'id': '5df1d132b6378700117f9181',
+            'ext': 'mp4',
+            'title': 'Buzzfeed Teaches You About Memes',
+            'uploader_id': 'ntZAJFECERSgqHSxzonV5K2E89s1',
+            'timestamp': 1576129028,
+            'description': 'Mocking Buzzfeed\'s meme lesson. Reuploaded from YouTube because of their new policies',
+            'uploader': 'whang!',
+            'upload_date': '20191212',
+        },
+        'params': {'format': 'bestvideo'}  # There are no merged formats in the playlist.
+    }, {
+        'url': 'https://storyfire.app.link/5GxAvWOQr8',  # Alternate URL format, with unrelated short ID
+        'md5': '7a2dc6d60c4889edfed459c620fe690d',
+        'info_dict': {
+            'id': '5f1e11ecd78a57b6c702001d',
+            'ext': 'm4a',
+            'title': 'Weird Nintendo Prototype Leaks',
+            'description': 'A stream taking a look at some weird Nintendo Prototypes with Luigi in Mario 64 and weird Yoshis',
+            'timestamp': 1595808576,
+            'upload_date': '20200727',
+            'uploader': 'whang!',
+            'uploader_id': 'ntZAJFECERSgqHSxzonV5K2E89s1',
+        },
+        'params': {'format': 'bestaudio'}  # Verifying audio extraction
+
+    }]
+
+    _aformats = {
+        'audio-medium-audio': {'acodec': 'aac', 'abr': 125, 'preference': -10},
+        'audio-high-audio': {'acodec': 'aac', 'abr': 254, 'preference': -1},
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        webpage = self._download_webpage(url, video_id)
+
+        # Extracting the json blob is mandatory to proceed with extraction.
+        jsontext = self._html_search_regex(
+            r'<script id="__NEXT_DATA__" type="application/json">(.+?)</script>',
+            webpage, 'json_data')
+
+        json = self._parse_json(jsontext, video_id)
+
+        # The currentVideo field in the json is mandatory
+        # because it contains the only link to the m3u playlist
+        video = json['props']['initialState']['video']['currentVideo']
+        videourl = video['vimeoVideoURL']  # Video URL is mandatory
+
+        # Extract other fields from the json in an error tolerant fashion
+        # ID may be incorrect (on short URL format), correct it.
+        parsed_id = video.get('_id')
+        if parsed_id:
+            video_id = parsed_id
+
+        title = video.get('title')
+        description = video.get('description')
+
+        thumbnail = video.get('storyImage')
+        views = video.get('views')
+        likes = video.get('likesCount')
+        comments = video.get('commentsCount')
+        duration = video.get('videoDuration')
+        publishdate = video.get('publishDate')  # Apparently epoch time, day only
+
+        uploader = video.get('username')
+        uploader_id = video.get('hostID')
+        # Construct an uploader URL
+        uploader_url = None
+        if uploader_id:
+            uploader_url = "https://storyfire.com/user/%s/video" % uploader_id
+
+        # Collect root playlist to determine formats
+        formats = self._extract_m3u8_formats(
+            videourl, video_id, 'mp4', 'm3u8_native')
+
+        # Modify formats to fill in missing information about audio codecs
+        for format in formats:
+            aformat = self._aformats.get(format['format_id'])
+            if aformat:
+                format['acodec'] = aformat['acodec']
+                format['abr'] = aformat['abr']
+                format['preference'] = aformat['preference']
+                format['ext'] = 'm4a'
+
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': description,
+            'ext': "mp4",
+            'url': videourl,
+            'formats': formats,
+
+            'thumbnail': thumbnail,
+            'view_count': views,
+            'like_count': likes,
+            'comment_count': comments,
+            'duration': duration,
+            'timestamp': publishdate,
+
+            'uploader': uploader,
+            'uploader_id': uploader_id,
+            'uploader_url': uploader_url,
+
+        }
+
+
+class StoryFireUserIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?storyfire\.com/user/(?P<id>[^/\s]+)/video'
+    _TESTS = [{
+        'url': 'https://storyfire.com/user/ntZAJFECERSgqHSxzonV5K2E89s1/video',
+        'info_dict': {
+            'id': 'ntZAJFECERSgqHSxzonV5K2E89s1',
+            'title': 'whang!',
+        },
+        'playlist_mincount': 18
+    }, {
+        'url': 'https://storyfire.com/user/UQ986nFxmAWIgnkZQ0ftVhq4nOk2/video',
+        'info_dict': {
+            'id': 'UQ986nFxmAWIgnkZQ0ftVhq4nOk2',
+            'title': 'McJuggerNuggets',
+        },
+        'playlist_mincount': 143
+
+    }]
+
+    # Generator for fetching playlist items
+    def _enum_videos(self, baseurl, user_id, firstjson):
+        totalVideos = int(firstjson['videosCount'])
+        haveVideos = 0
+        json = firstjson
+
+        for page in itertools.count(1):
+            for video in json['videos']:
+                id = video['_id']
+                url = "https://storyfire.com/video-details/%s" % id
+                haveVideos += 1
+                yield {
+                    '_type': 'url',
+                    'id': id,
+                    'url': url,
+                    'ie_key': 'StoryFire',
+
+                    'title': video.get('title'),
+                    'description': video.get('description'),
+                    'view_count': video.get('views'),
+                    'comment_count': video.get('commentsCount'),
+                    'duration': video.get('videoDuration'),
+                    'timestamp': video.get('publishDate'),
+                }
+            # Are there more pages we could fetch?
+            if haveVideos < totalVideos:
+                pageurl = baseurl + ("%i" % haveVideos)
+                json = self._download_json(pageurl, user_id,
+                                           note='Downloading page %s' % page)
+
+                # Are there any videos in the new json?
+                videos = json.get('videos')
+                if not videos:
+                    break  # no videos
+
+            else:
+                break  # We have fetched all the videos, stop
+
+    def _real_extract(self, url):
+        user_id = self._match_id(url)
+
+        baseurl = "https://storyfire.com/app/publicVideos/%s?skip=" % user_id
+
+        # Download first page to ensure it can be downloaded, and get user information if available.
+        firstpage = baseurl + "0"
+        firstjson = self._download_json(firstpage, user_id)
+
+        title = None
+        videos = firstjson.get('videos')
+        if videos:
+            title = videos[0].get('username')
+
+        return {
+            '_type': 'playlist',
+            'entries': self._enum_videos(baseurl, user_id, firstjson),
+            'id': user_id,
+            'title': title,
+        }
+
+
+class StoryFireSeriesIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?storyfire\.com/write/series/stories/(?P<id>[^/\s]+)'
+    _TESTS = [{
+        'url': 'https://storyfire.com/write/series/stories/-Lq6MsuIHLODO6d2dDkr/',
+        'info_dict': {
+            'id': '-Lq6MsuIHLODO6d2dDkr',
+        },
+        'playlist_mincount': 13
+    }, {
+        'url': 'https://storyfire.com/write/series/stories/the_mortal_one/',
+        'info_dict': {
+            'id': 'the_mortal_one',
+        },
+        'playlist_count': 0  # This playlist has entries, but no videos.
+    }, {
+        'url': 'https://storyfire.com/write/series/stories/story_time',
+        'info_dict': {
+            'id': 'story_time',
+        },
+        'playlist_mincount': 10
+    }]
+
+    # Generator for returning playlist items
+    # This object is substantially different from the one on the user videos page above
+    def _enum_videos(self, jsonlist):
+        for video in jsonlist:
+            id = video['_id']
+            if video.get('hasVideo'):  # Boolean element
+                url = "https://storyfire.com/video-details/%s" % id
+                yield {
+                    '_type': 'url',
+                    'id': id,
+                    'url': url,
+                    'ie_key': 'StoryFire',
+
+                    'title': video.get('title'),
+                    'description': video.get('description'),
+                    'view_count': video.get('views'),
+                    'like_count': video.get('likesCount'),
+                    'comment_count': video.get('commentsCount'),
+                    'duration': video.get('videoDuration'),
+                    'timestamp': video.get('publishDate'),
+                }
+
+    def _real_extract(self, url):
+        list_id = self._match_id(url)
+
+        listurl = "https://storyfire.com/app/seriesStories/%s/list" % list_id
+        json = self._download_json(listurl, list_id)
+
+        return {
+            '_type': 'playlist',
+            'entries': self._enum_videos(json),
+            'id': list_id
+        }
diff --git a/youtube_dl/extractor/streamable.py b/youtube_dlc/extractor/streamable.py

similarity index 100%

rename from youtube_dl/extractor/streamable.py

rename to youtube_dlc/extractor/streamable.py
diff --git a/youtube_dl/extractor/streamcloud.py b/youtube_dlc/extractor/streamcloud.py

similarity index 93%

rename from youtube_dl/extractor/streamcloud.py

rename to youtube_dlc/extractor/streamcloud.py

index b97bb43741b0861ce3cf9e63f1be116687409854..32eb2b92d3a13a7edc4b728f05534afc98fdf054 100644 (file)
--- a/youtube_dl/extractor/streamcloud.py
+++ b/youtube_dlc/extractor/streamcloud.py
@@ -15,12 +15,12 @@ class StreamcloudIE(InfoExtractor):
      _VALID_URL = r'https?://streamcloud\.eu/(?P<id>[a-zA-Z0-9_-]+)(?:/(?P<fname>[^#?]*)\.html)?'
  
      _TESTS = [{
-        'url': 'http://streamcloud.eu/skp9j99s4bpz/youtube-dl_test_video_____________-BaW_jenozKc.mp4.html',
+        'url': 'http://streamcloud.eu/skp9j99s4bpz/youtube-dlc_test_video_____________-BaW_jenozKc.mp4.html',
          'md5': '6bea4c7fa5daaacc2a946b7146286686',
          'info_dict': {
              'id': 'skp9j99s4bpz',
              'ext': 'mp4',
-            'title': 'youtube-dl test video  \'/\\ ä ↭',
+            'title': 'youtube-dlc test video  \'/\\ ä ↭',
          },
          'skip': 'Only available from the EU'
      }, {
diff --git a/youtube_dl/extractor/streamcz.py b/youtube_dlc/extractor/streamcz.py

similarity index 100%

rename from youtube_dl/extractor/streamcz.py

rename to youtube_dlc/extractor/streamcz.py
diff --git a/youtube_dl/extractor/streetvoice.py b/youtube_dlc/extractor/streetvoice.py

similarity index 100%

rename from youtube_dl/extractor/streetvoice.py

rename to youtube_dlc/extractor/streetvoice.py
diff --git a/youtube_dlc/extractor/stretchinternet.py b/youtube_dlc/extractor/stretchinternet.py

new file mode 100644 (file)

index 0000000..4dbead2
--- /dev/null
+++ b/youtube_dlc/extractor/stretchinternet.py
@@ -0,0 +1,32 @@
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import int_or_none
+
+
+class StretchInternetIE(InfoExtractor):
+    _VALID_URL = r'https?://portal\.stretchinternet\.com/[^/]+/(?:portal|full)\.htm\?.*?\beventId=(?P<id>\d+)'
+    _TEST = {
+        'url': 'https://portal.stretchinternet.com/umary/portal.htm?eventId=573272&streamType=video',
+        'info_dict': {
+            'id': '573272',
+            'ext': 'mp4',
+            'title': 'University of Mary Wrestling vs. Upper Iowa',
+            'timestamp': 1575668361,
+            'upload_date': '20191206',
+        }
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        event = self._download_json(
+            'https://api.stretchinternet.com/trinity/event/tcg/' + video_id,
+            video_id)[0]
+
+        return {
+            'id': video_id,
+            'title': event['title'],
+            'timestamp': int_or_none(event.get('dateCreated'), 1000),
+            'url': 'https://' + event['media'][0]['url'],
+        }
diff --git a/youtube_dl/extractor/stv.py b/youtube_dlc/extractor/stv.py

similarity index 100%

rename from youtube_dl/extractor/stv.py

rename to youtube_dlc/extractor/stv.py
diff --git a/youtube_dl/extractor/sunporno.py b/youtube_dlc/extractor/sunporno.py

similarity index 100%

rename from youtube_dl/extractor/sunporno.py

rename to youtube_dlc/extractor/sunporno.py
diff --git a/youtube_dl/extractor/sverigesradio.py b/youtube_dlc/extractor/sverigesradio.py

similarity index 100%

rename from youtube_dl/extractor/sverigesradio.py

rename to youtube_dlc/extractor/sverigesradio.py
diff --git a/youtube_dl/extractor/svt.py b/youtube_dlc/extractor/svt.py

similarity index 73%

rename from youtube_dl/extractor/svt.py

rename to youtube_dlc/extractor/svt.py

index 0901c3163e6cab4723d451b30df8574e359ba899..8e9ec2ca3cbe1880ee96d83594af3bbb49f90dce 100644 (file)
--- a/youtube_dl/extractor/svt.py
+++ b/youtube_dlc/extractor/svt.py
@@ -4,19 +4,14 @@
  import re
  
  from .common import InfoExtractor
-from ..compat import (
-    compat_parse_qs,
-    compat_urllib_parse_urlparse,
-)
+from ..compat import compat_str
  from ..utils import (
      determine_ext,
      dict_get,
      int_or_none,
-    orderedSet,
+    str_or_none,
      strip_or_none,
      try_get,
-    urljoin,
-    compat_str,
  )
  
  
@@ -229,31 +224,37 @@ def _real_extract(self, url):
                  self._adjust_title(info_dict)
                  return info_dict
  
-        svt_id = self._search_regex(
-            r'<video[^>]+data-video-id=["\']([\da-zA-Z-]+)',
-            webpage, 'video id')
+            svt_id = try_get(
+                data, lambda x: x['statistics']['dataLake']['content']['id'],
+                compat_str)
+
+        if not svt_id:
+            svt_id = self._search_regex(
+                (r'<video[^>]+data-video-id=["\']([\da-zA-Z-]+)',
+                 r'"content"\s*:\s*{.*?"id"\s*:\s*"([\da-zA-Z-]+)"'),
+                webpage, 'video id')
  
          return self._extract_by_video_id(svt_id, webpage)
  
  
  class SVTSeriesIE(SVTPlayBaseIE):
-    _VALID_URL = r'https?://(?:www\.)?svtplay\.se/(?P<id>[^/?&#]+)'
+    _VALID_URL = r'https?://(?:www\.)?svtplay\.se/(?P<id>[^/?&#]+)(?:.+?\btab=(?P<season_slug>[^&#]+))?'
      _TESTS = [{
          'url': 'https://www.svtplay.se/rederiet',
          'info_dict': {
-            'id': 'rederiet',
+            'id': '14445680',
              'title': 'Rederiet',
-            'description': 'md5:505d491a58f4fcf6eb418ecab947e69e',
+            'description': 'md5:d9fdfff17f5d8f73468176ecd2836039',
          },
          'playlist_mincount': 318,
      }, {
-        'url': 'https://www.svtplay.se/rederiet?tab=sasong2',
+        'url': 'https://www.svtplay.se/rederiet?tab=season-2-14445680',
          'info_dict': {
-            'id': 'rederiet-sasong2',
+            'id': 'season-2-14445680',
              'title': 'Rederiet - Säsong 2',
-            'description': 'md5:505d491a58f4fcf6eb418ecab947e69e',
+            'description': 'md5:d9fdfff17f5d8f73468176ecd2836039',
          },
-        'playlist_count': 12,
+        'playlist_mincount': 12,
      }]
  
      @classmethod
@@ -261,83 +262,87 @@ def suitable(cls, url):
          return False if SVTIE.suitable(url) or SVTPlayIE.suitable(url) else super(SVTSeriesIE, cls).suitable(url)
  
      def _real_extract(self, url):
-        series_id = self._match_id(url)
-
-        qs = compat_parse_qs(compat_urllib_parse_urlparse(url).query)
-        season_slug = qs.get('tab', [None])[0]
-
-        if season_slug:
-            series_id += '-%s' % season_slug
-
-        webpage = self._download_webpage(
-            url, series_id, 'Downloading series page')
-
-        root = self._parse_json(
-            self._search_regex(
-                self._SVTPLAY_RE, webpage, 'content', group='json'),
-            series_id)
+        series_slug, season_id = re.match(self._VALID_URL, url).groups()
+
+        series = self._download_json(
+            'https://api.svt.se/contento/graphql', series_slug,
+            'Downloading series page', query={
+                'query': '''{
+  listablesBySlug(slugs: ["%s"]) {
+    associatedContent(include: [productionPeriod, season]) {
+      items {
+        item {
+          ... on Episode {
+            videoSvtId
+          }
+        }
+      }
+      id
+      name
+    }
+    id
+    longDescription
+    name
+    shortDescription
+  }
+}''' % series_slug,
+            })['data']['listablesBySlug'][0]
  
          season_name = None
  
          entries = []
-        for season in root['relatedVideoContent']['relatedVideosAccordion']:
+        for season in series['associatedContent']:
              if not isinstance(season, dict):
                  continue
-            if season_slug:
-                if season.get('slug') != season_slug:
+            if season_id:
+                if season.get('id') != season_id:
                      continue
                  season_name = season.get('name')
-            videos = season.get('videos')
-            if not isinstance(videos, list):
+            items = season.get('items')
+            if not isinstance(items, list):
                  continue
-            for video in videos:
-                content_url = video.get('contentUrl')
-                if not content_url or not isinstance(content_url, compat_str):
+            for item in items:
+                video = item.get('item') or {}
+                content_id = video.get('videoSvtId')
+                if not content_id or not isinstance(content_id, compat_str):
                      continue
-                entries.append(
-                    self.url_result(
-                        urljoin(url, content_url),
-                        ie=SVTPlayIE.ie_key(),
-                        video_title=video.get('title')
-                    ))
-
-        metadata = root.get('metaData')
-        if not isinstance(metadata, dict):
-            metadata = {}
+                entries.append(self.url_result(
+                    'svt:' + content_id, SVTPlayIE.ie_key(), content_id))
  
-        title = metadata.get('title')
-        season_name = season_name or season_slug
+        title = series.get('name')
+        season_name = season_name or season_id
  
          if title and season_name:
              title = '%s - %s' % (title, season_name)
-        elif season_slug:
-            title = season_slug
+        elif season_id:
+            title = season_id
  
          return self.playlist_result(
-            entries, series_id, title, metadata.get('description'))
+            entries, season_id or series.get('id'), title,
+            dict_get(series, ('longDescription', 'shortDescription')))
  
  
  class SVTPageIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?svt\.se/(?:[^/]+/)*(?P<id>[^/?&#]+)'
+    _VALID_URL = r'https?://(?:www\.)?svt\.se/(?P<path>(?:[^/]+/)*(?P<id>[^/?&#]+))'
      _TESTS = [{
-        'url': 'https://www.svt.se/sport/oseedat/guide-sommartraningen-du-kan-gora-var-och-nar-du-vill',
+        'url': 'https://www.svt.se/sport/ishockey/bakom-masken-lehners-kamp-mot-mental-ohalsa',
          'info_dict': {
-            'id': 'guide-sommartraningen-du-kan-gora-var-och-nar-du-vill',
-            'title': 'GUIDE: Sommarträning du kan göra var och när du vill',
+            'id': '25298267',
+            'title': 'Bakom masken – Lehners kamp mot mental ohälsa',
          },
-        'playlist_count': 7,
+        'playlist_count': 4,
      }, {
-        'url': 'https://www.svt.se/nyheter/inrikes/ebba-busch-thor-kd-har-delvis-ratt-om-no-go-zoner',
+        'url': 'https://www.svt.se/nyheter/utrikes/svenska-andrea-ar-en-mil-fran-branderna-i-kalifornien',
          'info_dict': {
-            'id': 'ebba-busch-thor-kd-har-delvis-ratt-om-no-go-zoner',
-            'title': 'Ebba Busch Thor har bara delvis rätt om ”no-go-zoner”',
+            'id': '24243746',
+            'title': 'Svenska Andrea redo att fly sitt hem i Kalifornien',
          },
-        'playlist_count': 1,
+        'playlist_count': 2,
      }, {
          # only programTitle
          'url': 'http://www.svt.se/sport/ishockey/jagr-tacklar-giroux-under-intervjun',
          'info_dict': {
-            'id': '2900353',
+            'id': '8439V2K',
              'ext': 'mp4',
              'title': 'Stjärnorna skojar till det - under SVT-intervjun',
              'duration': 27,
@@ -356,16 +361,26 @@ def suitable(cls, url):
          return False if SVTIE.suitable(url) else super(SVTPageIE, cls).suitable(url)
  
      def _real_extract(self, url):
-        playlist_id = self._match_id(url)
+        path, display_id = re.match(self._VALID_URL, url).groups()
  
-        webpage = self._download_webpage(url, playlist_id)
+        article = self._download_json(
+            'https://api.svt.se/nss-api/page/' + path, display_id,
+            query={'q': 'articles'})['articles']['content'][0]
  
-        entries = [
-            self.url_result(
-                'svt:%s' % video_id, ie=SVTPlayIE.ie_key(), video_id=video_id)
-            for video_id in orderedSet(re.findall(
-                r'data-video-id=["\'](\d+)', webpage))]
+        entries = []
  
-        title = strip_or_none(self._og_search_title(webpage, default=None))
+        def _process_content(content):
+            if content.get('_type') in ('VIDEOCLIP', 'VIDEOEPISODE'):
+                video_id = compat_str(content['image']['svtId'])
+                entries.append(self.url_result(
+                    'svt:' + video_id, SVTPlayIE.ie_key(), video_id))
  
-        return self.playlist_result(entries, playlist_id, title)
+        for media in article.get('media', []):
+            _process_content(media)
+
+        for obj in article.get('structuredBody', []):
+            _process_content(obj.get('content') or {})
+
+        return self.playlist_result(
+            entries, str_or_none(article.get('id')),
+            strip_or_none(article.get('title')))
diff --git a/youtube_dl/extractor/swrmediathek.py b/youtube_dlc/extractor/swrmediathek.py

similarity index 100%

rename from youtube_dl/extractor/swrmediathek.py

rename to youtube_dlc/extractor/swrmediathek.py
diff --git a/youtube_dl/extractor/syfy.py b/youtube_dlc/extractor/syfy.py

similarity index 100%

rename from youtube_dl/extractor/syfy.py

rename to youtube_dlc/extractor/syfy.py
diff --git a/youtube_dl/extractor/sztvhu.py b/youtube_dlc/extractor/sztvhu.py

similarity index 100%

rename from youtube_dl/extractor/sztvhu.py

rename to youtube_dlc/extractor/sztvhu.py
diff --git a/youtube_dl/extractor/tagesschau.py b/youtube_dlc/extractor/tagesschau.py

similarity index 100%

rename from youtube_dl/extractor/tagesschau.py

rename to youtube_dlc/extractor/tagesschau.py
diff --git a/youtube_dl/extractor/tass.py b/youtube_dlc/extractor/tass.py

similarity index 100%

rename from youtube_dl/extractor/tass.py

rename to youtube_dlc/extractor/tass.py
diff --git a/youtube_dl/extractor/tastytrade.py b/youtube_dlc/extractor/tastytrade.py

similarity index 100%

rename from youtube_dl/extractor/tastytrade.py

rename to youtube_dlc/extractor/tastytrade.py
diff --git a/youtube_dl/extractor/tbs.py b/youtube_dlc/extractor/tbs.py

similarity index 100%

rename from youtube_dl/extractor/tbs.py

rename to youtube_dlc/extractor/tbs.py
diff --git a/youtube_dl/extractor/tdslifeway.py b/youtube_dlc/extractor/tdslifeway.py

similarity index 100%

rename from youtube_dl/extractor/tdslifeway.py

rename to youtube_dlc/extractor/tdslifeway.py
diff --git a/youtube_dl/extractor/teachable.py b/youtube_dlc/extractor/teachable.py

similarity index 74%

rename from youtube_dl/extractor/teachable.py

rename to youtube_dlc/extractor/teachable.py

index 7d2e34b3bc4204d3ec999968c1ac76db7687a0c4..a75369dbe8a3582595ae339d58887eaefd220536 100644 (file)
--- a/youtube_dl/extractor/teachable.py
+++ b/youtube_dlc/extractor/teachable.py
@@ -4,11 +4,12 @@
  
  from .common import InfoExtractor
  from .wistia import WistiaIE
-from ..compat import compat_str
  from ..utils import (
      clean_html,
      ExtractorError,
+    int_or_none,
      get_element_by_class,
+    strip_or_none,
      urlencode_postdata,
      urljoin,
  )
@@ -20,8 +21,8 @@ class TeachableBaseIE(InfoExtractor):
  
      _SITES = {
          # Only notable ones here
-        'upskillcourses.com': 'upskill',
-        'academy.gns3.com': 'gns3',
+        'v1.upskillcourses.com': 'upskill',
+        'gns3.teachable.com': 'gns3',
          'academyhacker.com': 'academyhacker',
          'stackskills.com': 'stackskills',
          'market.saleshacker.com': 'saleshacker',
@@ -58,7 +59,7 @@ def is_logged(webpage):
              self._logged_in = True
              return
  
-        login_url = compat_str(urlh.geturl())
+        login_url = urlh.geturl()
  
          login_form = self._hidden_inputs(login_page)
  
@@ -110,27 +111,29 @@ class TeachableIE(TeachableBaseIE):
                      ''' % TeachableBaseIE._VALID_URL_SUB_TUPLE
  
      _TESTS = [{
-        'url': 'http://upskillcourses.com/courses/essential-web-developer-course/lectures/1747100',
+        'url': 'https://gns3.teachable.com/courses/gns3-certified-associate/lectures/6842364',
          'info_dict': {
-            'id': 'uzw6zw58or',
-            'ext': 'mp4',
-            'title': 'Welcome to the Course!',
-            'description': 'md5:65edb0affa582974de4625b9cdea1107',
-            'duration': 138.763,
-            'timestamp': 1479846621,
-            'upload_date': '20161122',
+            'id': 'untlgzk1v7',
+            'ext': 'bin',
+            'title': 'Overview',
+            'description': 'md5:071463ff08b86c208811130ea1c2464c',
+            'duration': 736.4,
+            'timestamp': 1542315762,
+            'upload_date': '20181115',
+            'chapter': 'Welcome',
+            'chapter_number': 1,
          },
          'params': {
              'skip_download': True,
          },
      }, {
-        'url': 'http://upskillcourses.com/courses/119763/lectures/1747100',
+        'url': 'http://v1.upskillcourses.com/courses/119763/lectures/1747100',
          'only_matching': True,
      }, {
-        'url': 'https://academy.gns3.com/courses/423415/lectures/6885939',
+        'url': 'https://gns3.teachable.com/courses/423415/lectures/6885939',
          'only_matching': True,
      }, {
-        'url': 'teachable:https://upskillcourses.com/courses/essential-web-developer-course/lectures/1747100',
+        'url': 'teachable:https://v1.upskillcourses.com/courses/essential-web-developer-course/lectures/1747100',
          'only_matching': True,
      }]
  
@@ -160,22 +163,51 @@ def _real_extract(self, url):
  
          webpage = self._download_webpage(url, video_id)
  
-        wistia_url = WistiaIE._extract_url(webpage)
-        if not wistia_url:
+        wistia_urls = WistiaIE._extract_urls(webpage)
+        if not wistia_urls:
              if any(re.search(p, webpage) for p in (
                      r'class=["\']lecture-contents-locked',
                      r'>\s*Lecture contents locked',
-                    r'id=["\']lecture-locked')):
+                    r'id=["\']lecture-locked',
+                    # https://academy.tailoredtutors.co.uk/courses/108779/lectures/1955313
+                    r'class=["\'](?:inner-)?lesson-locked',
+                    r'>LESSON LOCKED<')):
                  self.raise_login_required('Lecture contents locked')
+            raise ExtractorError('Unable to find video URL')
  
          title = self._og_search_title(webpage, default=None)
  
-        return {
+        chapter = None
+        chapter_number = None
+        section_item = self._search_regex(
+            r'(?s)(?P<li><li[^>]+\bdata-lecture-id=["\']%s[^>]+>.+?</li>)' % video_id,
+            webpage, 'section item', default=None, group='li')
+        if section_item:
+            chapter_number = int_or_none(self._search_regex(
+                r'data-ss-position=["\'](\d+)', section_item, 'section id',
+                default=None))
+            if chapter_number is not None:
+                sections = []
+                for s in re.findall(
+                        r'(?s)<div[^>]+\bclass=["\']section-title[^>]+>(.+?)</div>', webpage):
+                    section = strip_or_none(clean_html(s))
+                    if not section:
+                        sections = []
+                        break
+                    sections.append(section)
+                if chapter_number <= len(sections):
+                    chapter = sections[chapter_number - 1]
+
+        entries = [{
              '_type': 'url_transparent',
              'url': wistia_url,
              'ie_key': WistiaIE.ie_key(),
              'title': title,
-        }
+            'chapter': chapter,
+            'chapter_number': chapter_number,
+        } for wistia_url in wistia_urls]
+
+        return self.playlist_result(entries, video_id, title)
  
  
  class TeachableCourseIE(TeachableBaseIE):
@@ -187,20 +219,20 @@ class TeachableCourseIE(TeachableBaseIE):
                          /(?:courses|p)/(?:enrolled/)?(?P<id>[^/?#&]+)
                      ''' % TeachableBaseIE._VALID_URL_SUB_TUPLE
      _TESTS = [{
-        'url': 'http://upskillcourses.com/courses/essential-web-developer-course/',
+        'url': 'http://v1.upskillcourses.com/courses/essential-web-developer-course/',
          'info_dict': {
              'id': 'essential-web-developer-course',
              'title': 'The Essential Web Developer Course (Free)',
          },
          'playlist_count': 192,
      }, {
-        'url': 'http://upskillcourses.com/courses/119763/',
+        'url': 'http://v1.upskillcourses.com/courses/119763/',
          'only_matching': True,
      }, {
-        'url': 'http://upskillcourses.com/courses/enrolled/119763',
+        'url': 'http://v1.upskillcourses.com/courses/enrolled/119763',
          'only_matching': True,
      }, {
-        'url': 'https://academy.gns3.com/courses/enrolled/423415',
+        'url': 'https://gns3.teachable.com/courses/enrolled/423415',
          'only_matching': True,
      }, {
          'url': 'teachable:https://learn.vrdev.school/p/gear-vr-developer-mini',
diff --git a/youtube_dl/extractor/teachertube.py b/youtube_dlc/extractor/teachertube.py

similarity index 100%

rename from youtube_dl/extractor/teachertube.py

rename to youtube_dlc/extractor/teachertube.py
diff --git a/youtube_dl/extractor/teachingchannel.py b/youtube_dlc/extractor/teachingchannel.py

similarity index 100%

rename from youtube_dl/extractor/teachingchannel.py

rename to youtube_dlc/extractor/teachingchannel.py
diff --git a/youtube_dl/extractor/teamcoco.py b/youtube_dlc/extractor/teamcoco.py

similarity index 100%

rename from youtube_dl/extractor/teamcoco.py

rename to youtube_dlc/extractor/teamcoco.py
diff --git a/youtube_dl/extractor/teamtreehouse.py b/youtube_dlc/extractor/teamtreehouse.py

similarity index 100%

rename from youtube_dl/extractor/teamtreehouse.py

rename to youtube_dlc/extractor/teamtreehouse.py
diff --git a/youtube_dl/extractor/techtalks.py b/youtube_dlc/extractor/techtalks.py

similarity index 100%

rename from youtube_dl/extractor/techtalks.py

rename to youtube_dlc/extractor/techtalks.py
diff --git a/youtube_dl/extractor/ted.py b/youtube_dlc/extractor/ted.py

similarity index 100%

rename from youtube_dl/extractor/ted.py

rename to youtube_dlc/extractor/ted.py
diff --git a/youtube_dl/extractor/tele13.py b/youtube_dlc/extractor/tele13.py

similarity index 100%

rename from youtube_dl/extractor/tele13.py

rename to youtube_dlc/extractor/tele13.py
diff --git a/youtube_dlc/extractor/tele5.py b/youtube_dlc/extractor/tele5.py

new file mode 100644 (file)

index 0000000..3e1a7a9
--- /dev/null
+++ b/youtube_dlc/extractor/tele5.py
@@ -0,0 +1,108 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from .jwplatform import JWPlatformIE
+from .nexx import NexxIE
+from ..compat import compat_urlparse
+from ..utils import (
+    NO_DEFAULT,
+    smuggle_url,
+)
+
+
+class Tele5IE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?tele5\.de/(?:[^/]+/)*(?P<id>[^/?#&]+)'
+    _GEO_COUNTRIES = ['DE']
+    _TESTS = [{
+        'url': 'https://www.tele5.de/mediathek/filme-online/videos?vid=1549416',
+        'info_dict': {
+            'id': '1549416',
+            'ext': 'mp4',
+            'upload_date': '20180814',
+            'timestamp': 1534290623,
+            'title': 'Pandorum',
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }, {
+        # jwplatform, nexx unavailable
+        'url': 'https://www.tele5.de/filme/ghoul-das-geheimnis-des-friedhofmonsters/',
+        'info_dict': {
+            'id': 'WJuiOlUp',
+            'ext': 'mp4',
+            'upload_date': '20200603',
+            'timestamp': 1591214400,
+            'title': 'Ghoul - Das Geheimnis des Friedhofmonsters',
+            'description': 'md5:42002af1d887ff3d5b2b3ca1f8137d97',
+        },
+        'params': {
+            'skip_download': True,
+        },
+        'add_ie': [JWPlatformIE.ie_key()],
+    }, {
+        'url': 'https://www.tele5.de/kalkofes-mattscheibe/video-clips/politik-und-gesellschaft?ve_id=1551191',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.tele5.de/video-clip/?ve_id=1609440',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.tele5.de/filme/schlefaz-dragon-crusaders/',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.tele5.de/filme/making-of/avengers-endgame/',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.tele5.de/star-trek/raumschiff-voyager/ganze-folge/das-vinculum/',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.tele5.de/anders-ist-sevda/',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
+        video_id = (qs.get('vid') or qs.get('ve_id') or [None])[0]
+
+        NEXX_ID_RE = r'\d{6,}'
+        JWPLATFORM_ID_RE = r'[a-zA-Z0-9]{8}'
+
+        def nexx_result(nexx_id):
+            return self.url_result(
+                'https://api.nexx.cloud/v3/759/videos/byid/%s' % nexx_id,
+                ie=NexxIE.ie_key(), video_id=nexx_id)
+
+        nexx_id = jwplatform_id = None
+
+        if video_id:
+            if re.match(NEXX_ID_RE, video_id):
+                return nexx_result(video_id)
+            elif re.match(JWPLATFORM_ID_RE, video_id):
+                jwplatform_id = video_id
+
+        if not nexx_id:
+            display_id = self._match_id(url)
+            webpage = self._download_webpage(url, display_id)
+
+            def extract_id(pattern, name, default=NO_DEFAULT):
+                return self._html_search_regex(
+                    (r'id\s*=\s*["\']video-player["\'][^>]+data-id\s*=\s*["\'](%s)' % pattern,
+                     r'\s+id\s*=\s*["\']player_(%s)' % pattern,
+                     r'\bdata-id\s*=\s*["\'](%s)' % pattern), webpage, name,
+                    default=default)
+
+            nexx_id = extract_id(NEXX_ID_RE, 'nexx id', default=None)
+            if nexx_id:
+                return nexx_result(nexx_id)
+
+            if not jwplatform_id:
+                jwplatform_id = extract_id(JWPLATFORM_ID_RE, 'jwplatform id')
+
+        return self.url_result(
+            smuggle_url(
+                'jwplatform:%s' % jwplatform_id,
+                {'geo_countries': self._GEO_COUNTRIES}),
+            ie=JWPlatformIE.ie_key(), video_id=jwplatform_id)
diff --git a/youtube_dl/extractor/telebruxelles.py b/youtube_dlc/extractor/telebruxelles.py

similarity index 100%

rename from youtube_dl/extractor/telebruxelles.py

rename to youtube_dlc/extractor/telebruxelles.py
diff --git a/youtube_dl/extractor/telecinco.py b/youtube_dlc/extractor/telecinco.py

similarity index 77%

rename from youtube_dl/extractor/telecinco.py

rename to youtube_dlc/extractor/telecinco.py

index d37e1b0557cf3ba241a25e7e56d28c8dc679b1d0..9ba3da341dac65d18a599a790bff9c95b0e52eb8 100644 (file)
--- a/youtube_dl/extractor/telecinco.py
+++ b/youtube_dlc/extractor/telecinco.py
@@ -11,6 +11,7 @@
      determine_ext,
      int_or_none,
      str_or_none,
+    try_get,
      urljoin,
  )
  
@@ -24,7 +25,7 @@ class TelecincoIE(InfoExtractor):
          'info_dict': {
              'id': '1876350223',
              'title': 'Bacalao con kokotxas al pil-pil',
-            'description': 'md5:1382dacd32dd4592d478cbdca458e5bb',
+            'description': 'md5:716caf5601e25c3c5ab6605b1ae71529',
          },
          'playlist': [{
              'md5': 'adb28c37238b675dad0f042292f209a7',
@@ -55,6 +56,26 @@ class TelecincoIE(InfoExtractor):
              'description': 'md5:2771356ff7bfad9179c5f5cd954f1477',
              'duration': 50,
          },
+    }, {
+        # video in opening's content
+        'url': 'https://www.telecinco.es/vivalavida/fiorella-sobrina-edmundo-arrocet-entrevista_18_2907195140.html',
+        'info_dict': {
+            'id': '2907195140',
+            'title': 'La surrealista entrevista a la sobrina de Edmundo Arrocet: "No puedes venir aquí y tomarnos por tontos"',
+            'description': 'md5:73f340a7320143d37ab895375b2bf13a',
+        },
+        'playlist': [{
+            'md5': 'adb28c37238b675dad0f042292f209a7',
+            'info_dict': {
+                'id': 'TpI2EttSDAReWpJ1o0NVh2',
+                'ext': 'mp4',
+                'title': 'La surrealista entrevista a la sobrina de Edmundo Arrocet: "No puedes venir aquí y tomarnos por tontos"',
+                'duration': 1015,
+            },
+        }],
+        'params': {
+            'skip_download': True,
+        },
      }, {
          'url': 'http://www.telecinco.es/informativos/nacional/Pablo_Iglesias-Informativos_Telecinco-entrevista-Pedro_Piqueras_2_1945155182.html',
          'only_matching': True,
@@ -135,17 +156,28 @@ def _real_extract(self, url):
          display_id = self._match_id(url)
          webpage = self._download_webpage(url, display_id)
          article = self._parse_json(self._search_regex(
-            r'window\.\$REACTBASE_STATE\.article\s*=\s*({.+})',
+            r'window\.\$REACTBASE_STATE\.article(?:_multisite)?\s*=\s*({.+})',
              webpage, 'article'), display_id)['article']
          title = article.get('title')
-        description = clean_html(article.get('leadParagraph'))
+        description = clean_html(article.get('leadParagraph')) or ''
          if article.get('editorialType') != 'VID':
              entries = []
-            for p in article.get('body', []):
+            body = [article.get('opening')]
+            body.extend(try_get(article, lambda x: x['body'], list) or [])
+            for p in body:
+                if not isinstance(p, dict):
+                    continue
                  content = p.get('content')
-                if p.get('type') != 'video' or not content:
+                if not content:
+                    continue
+                type_ = p.get('type')
+                if type_ == 'paragraph':
+                    content_str = str_or_none(content)
+                    if content_str:
+                        description += content_str
                      continue
-                entries.append(self._parse_content(content, url))
+                if type_ == 'video' and isinstance(content, dict):
+                    entries.append(self._parse_content(content, url))
              return self.playlist_result(
                  entries, str_or_none(article.get('id')), title, description)
          content = article['opening']['content']
diff --git a/youtube_dl/extractor/telegraaf.py b/youtube_dlc/extractor/telegraaf.py

similarity index 100%

rename from youtube_dl/extractor/telegraaf.py

rename to youtube_dlc/extractor/telegraaf.py
diff --git a/youtube_dl/extractor/telemb.py b/youtube_dlc/extractor/telemb.py

similarity index 100%

rename from youtube_dl/extractor/telemb.py

rename to youtube_dlc/extractor/telemb.py
diff --git a/youtube_dl/extractor/telequebec.py b/youtube_dlc/extractor/telequebec.py

similarity index 98%

rename from youtube_dl/extractor/telequebec.py

rename to youtube_dlc/extractor/telequebec.py

index ae9f66787439462967baa63dc58f39870fb89382..c82c94b3a0009da2cf0938c92910feec84de018b 100644 (file)
--- a/youtube_dl/extractor/telequebec.py
+++ b/youtube_dlc/extractor/telequebec.py
@@ -38,8 +38,6 @@ class TeleQuebecIE(TeleQuebecBaseIE):
              'ext': 'mp4',
              'title': 'Un petit choc et puis repart!',
              'description': 'md5:b04a7e6b3f74e32d7b294cffe8658374',
-            'upload_date': '20180222',
-            'timestamp': 1519326631,
          },
          'params': {
              'skip_download': True,
diff --git a/youtube_dl/extractor/teletask.py b/youtube_dlc/extractor/teletask.py

similarity index 100%

rename from youtube_dl/extractor/teletask.py

rename to youtube_dlc/extractor/teletask.py
diff --git a/youtube_dl/extractor/telewebion.py b/youtube_dlc/extractor/telewebion.py

similarity index 100%

rename from youtube_dl/extractor/telewebion.py

rename to youtube_dlc/extractor/telewebion.py
diff --git a/youtube_dl/extractor/tennistv.py b/youtube_dlc/extractor/tennistv.py

similarity index 100%

rename from youtube_dl/extractor/tennistv.py

rename to youtube_dlc/extractor/tennistv.py
diff --git a/youtube_dl/extractor/tenplay.py b/youtube_dlc/extractor/tenplay.py

similarity index 87%

rename from youtube_dl/extractor/tenplay.py

rename to youtube_dlc/extractor/tenplay.py

index dff44a4e2f4f5e4105def539c186dc630b411b34..af325fea8fcd68ce5cf9b8bb8ec33975950b5c32 100644 (file)
--- a/youtube_dl/extractor/tenplay.py
+++ b/youtube_dlc/extractor/tenplay.py
@@ -10,8 +10,8 @@
  
  
  class TenPlayIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?10play\.com\.au/[^/]+/episodes/[^/]+/[^/]+/(?P<id>tpv\d{6}[a-z]{5})'
-    _TEST = {
+    _VALID_URL = r'https?://(?:www\.)?10play\.com\.au/(?:[^/]+/)+(?P<id>tpv\d{6}[a-z]{5})'
+    _TESTS = [{
          'url': 'https://10play.com.au/masterchef/episodes/season-1/masterchef-s1-ep-1/tpv190718kwzga',
          'info_dict': {
              'id': '6060533435001',
@@ -27,7 +27,10 @@ class TenPlayIE(InfoExtractor):
              'format': 'bestvideo',
              'skip_download': True,
          }
-    }
+    }, {
+        'url': 'https://10play.com.au/how-to-stay-married/web-extras/season-1/terrys-talks-ep-1-embracing-change/tpv190915ylupc',
+        'only_matching': True,
+    }]
      BRIGHTCOVE_URL_TEMPLATE = 'https://players.brightcove.net/2199827728001/cN6vRtRQt_default/index.html?videoId=%s'
  
      def _real_extract(self, url):
diff --git a/youtube_dl/extractor/testurl.py b/youtube_dlc/extractor/testurl.py

similarity index 100%

rename from youtube_dl/extractor/testurl.py

rename to youtube_dlc/extractor/testurl.py
diff --git a/youtube_dl/extractor/tf1.py b/youtube_dlc/extractor/tf1.py

similarity index 100%

rename from youtube_dl/extractor/tf1.py

rename to youtube_dlc/extractor/tf1.py
diff --git a/youtube_dl/extractor/tfo.py b/youtube_dlc/extractor/tfo.py

similarity index 92%

rename from youtube_dl/extractor/tfo.py

rename to youtube_dlc/extractor/tfo.py

index 0e2370cd828f78a2e1a708852a392a05d96e3039..0631cb7aba8a7068a291fb8e67de0d5e04acf482 100644 (file)
--- a/youtube_dl/extractor/tfo.py
+++ b/youtube_dlc/extractor/tfo.py
@@ -17,14 +17,12 @@ class TFOIE(InfoExtractor):
      _VALID_URL = r'https?://(?:www\.)?tfo\.org/(?:en|fr)/(?:[^/]+/){2}(?P<id>\d+)'
      _TEST = {
          'url': 'http://www.tfo.org/en/universe/tfo-247/100463871/video-game-hackathon',
-        'md5': '47c987d0515561114cf03d1226a9d4c7',
+        'md5': 'cafbe4f47a8dae0ca0159937878100d6',
          'info_dict': {
-            'id': '100463871',
+            'id': '7da3d50e495c406b8fc0b997659cc075',
              'ext': 'mp4',
              'title': 'Video Game Hackathon',
              'description': 'md5:558afeba217c6c8d96c60e5421795c07',
-            'upload_date': '20160212',
-            'timestamp': 1455310233,
          }
      }
  
diff --git a/youtube_dl/extractor/theintercept.py b/youtube_dlc/extractor/theintercept.py

similarity index 100%

rename from youtube_dl/extractor/theintercept.py

rename to youtube_dlc/extractor/theintercept.py
diff --git a/youtube_dl/extractor/theplatform.py b/youtube_dlc/extractor/theplatform.py

similarity index 100%

rename from youtube_dl/extractor/theplatform.py

rename to youtube_dlc/extractor/theplatform.py
diff --git a/youtube_dl/extractor/thescene.py b/youtube_dlc/extractor/thescene.py

similarity index 100%

rename from youtube_dl/extractor/thescene.py

rename to youtube_dlc/extractor/thescene.py
diff --git a/youtube_dl/extractor/thestar.py b/youtube_dlc/extractor/thestar.py

similarity index 100%

rename from youtube_dl/extractor/thestar.py

rename to youtube_dlc/extractor/thestar.py
diff --git a/youtube_dl/extractor/thesun.py b/youtube_dlc/extractor/thesun.py

similarity index 100%

rename from youtube_dl/extractor/thesun.py

rename to youtube_dlc/extractor/thesun.py
diff --git a/youtube_dl/extractor/theweatherchannel.py b/youtube_dlc/extractor/theweatherchannel.py

similarity index 100%

rename from youtube_dl/extractor/theweatherchannel.py

rename to youtube_dlc/extractor/theweatherchannel.py
diff --git a/youtube_dl/extractor/thisamericanlife.py b/youtube_dlc/extractor/thisamericanlife.py

similarity index 100%

rename from youtube_dl/extractor/thisamericanlife.py

rename to youtube_dlc/extractor/thisamericanlife.py
diff --git a/youtube_dl/extractor/thisav.py b/youtube_dlc/extractor/thisav.py

similarity index 100%

rename from youtube_dl/extractor/thisav.py

rename to youtube_dlc/extractor/thisav.py
diff --git a/youtube_dl/extractor/thisoldhouse.py b/youtube_dlc/extractor/thisoldhouse.py

similarity index 51%

rename from youtube_dl/extractor/thisoldhouse.py

rename to youtube_dlc/extractor/thisoldhouse.py

index 6ab147ad726306ba9250599d34491a50e64e82d0..a3d9b4017b93b1d9381b896967fa9e68da59eaec 100644 (file)
--- a/youtube_dl/extractor/thisoldhouse.py
+++ b/youtube_dlc/extractor/thisoldhouse.py
@@ -2,43 +2,46 @@
  from __future__ import unicode_literals
  
  from .common import InfoExtractor
-from ..compat import compat_str
-from ..utils import try_get
  
  
  class ThisOldHouseIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?thisoldhouse\.com/(?:watch|how-to|tv-episode)/(?P<id>[^/?#]+)'
+    _VALID_URL = r'https?://(?:www\.)?thisoldhouse\.com/(?:watch|how-to|tv-episode|(?:[^/]+/)?\d+)/(?P<id>[^/?#]+)'
      _TESTS = [{
          'url': 'https://www.thisoldhouse.com/how-to/how-to-build-storage-bench',
-        'md5': '568acf9ca25a639f0c4ff905826b662f',
          'info_dict': {
-            'id': '2REGtUDQ',
+            'id': '5dcdddf673c3f956ef5db202',
              'ext': 'mp4',
              'title': 'How to Build a Storage Bench',
              'description': 'In the workshop, Tom Silva and Kevin O\'Connor build a storage bench for an entryway.',
              'timestamp': 1442548800,
              'upload_date': '20150918',
-        }
+        },
+        'params': {
+            'skip_download': True,
+        },
      }, {
          'url': 'https://www.thisoldhouse.com/watch/arlington-arts-crafts-arts-and-crafts-class-begins',
          'only_matching': True,
      }, {
          'url': 'https://www.thisoldhouse.com/tv-episode/ask-toh-shelf-rough-electric',
          'only_matching': True,
+    }, {
+        'url': 'https://www.thisoldhouse.com/furniture/21017078/how-to-build-a-storage-bench',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.thisoldhouse.com/21113884/s41-e13-paradise-lost',
+        'only_matching': True,
+    }, {
+        # iframe www.thisoldhouse.com
+        'url': 'https://www.thisoldhouse.com/21083431/seaside-transformation-the-westerly-project',
+        'only_matching': True,
      }]
+    _ZYPE_TMPL = 'https://player.zype.com/embed/%s.html?api_key=hsOk_yMSPYNrT22e9pu8hihLXjaZf0JW5jsOWv4ZqyHJFvkJn6rtToHl09tbbsbe'
  
      def _real_extract(self, url):
          display_id = self._match_id(url)
          webpage = self._download_webpage(url, display_id)
          video_id = self._search_regex(
-            (r'data-mid=(["\'])(?P<id>(?:(?!\1).)+)\1',
-             r'id=(["\'])inline-video-player-(?P<id>(?:(?!\1).)+)\1'),
-            webpage, 'video id', default=None, group='id')
-        if not video_id:
-            drupal_settings = self._parse_json(self._search_regex(
-                r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);',
-                webpage, 'drupal settings'), display_id)
-            video_id = try_get(
-                drupal_settings, lambda x: x['jwplatform']['video_id'],
-                compat_str) or list(drupal_settings['comScore'])[0]
-        return self.url_result('jwplatform:' + video_id, 'JWPlatform', video_id)
+            r'<iframe[^>]+src=[\'"](?:https?:)?//(?:www\.)?thisoldhouse\.(?:chorus\.build|com)/videos/zype/([0-9a-f]{24})',
+            webpage, 'video id')
+        return self.url_result(self._ZYPE_TMPL % video_id, 'Zype', video_id)
diff --git a/youtube_dl/extractor/threeqsdn.py b/youtube_dlc/extractor/threeqsdn.py

similarity index 100%

rename from youtube_dl/extractor/threeqsdn.py

rename to youtube_dlc/extractor/threeqsdn.py
diff --git a/youtube_dl/extractor/tiktok.py b/youtube_dlc/extractor/tiktok.py

similarity index 100%

rename from youtube_dl/extractor/tiktok.py

rename to youtube_dlc/extractor/tiktok.py
diff --git a/youtube_dl/extractor/tinypic.py b/youtube_dlc/extractor/tinypic.py

similarity index 100%

rename from youtube_dl/extractor/tinypic.py

rename to youtube_dlc/extractor/tinypic.py
diff --git a/youtube_dl/extractor/tmz.py b/youtube_dlc/extractor/tmz.py

similarity index 100%

rename from youtube_dl/extractor/tmz.py

rename to youtube_dlc/extractor/tmz.py
diff --git a/youtube_dl/extractor/tnaflix.py b/youtube_dlc/extractor/tnaflix.py

similarity index 100%

rename from youtube_dl/extractor/tnaflix.py

rename to youtube_dlc/extractor/tnaflix.py
diff --git a/youtube_dl/extractor/toggle.py b/youtube_dlc/extractor/toggle.py

similarity index 87%

rename from youtube_dl/extractor/toggle.py

rename to youtube_dlc/extractor/toggle.py

index 5e5efda0f0780fb98b7c37b788ad2734a837e90d..ca2e36efe4216ad66d46252c662ea4cc5395c3ca 100644 (file)
--- a/youtube_dl/extractor/toggle.py
+++ b/youtube_dlc/extractor/toggle.py
@@ -17,9 +17,9 @@
  
  class ToggleIE(InfoExtractor):
      IE_NAME = 'toggle'
-    _VALID_URL = r'https?://video\.toggle\.sg/(?:en|zh)/(?:[^/]+/){2,}(?P<id>[0-9]+)'
+    _VALID_URL = r'https?://(?:(?:www\.)?mewatch|video\.toggle)\.sg/(?:en|zh)/(?:[^/]+/){2,}(?P<id>[0-9]+)'
      _TESTS = [{
-        'url': 'http://video.toggle.sg/en/series/lion-moms-tif/trailers/lion-moms-premier/343115',
+        'url': 'http://www.mewatch.sg/en/series/lion-moms-tif/trailers/lion-moms-premier/343115',
          'info_dict': {
              'id': '343115',
              'ext': 'mp4',
@@ -33,7 +33,7 @@ class ToggleIE(InfoExtractor):
          }
      }, {
          'note': 'DRM-protected video',
-        'url': 'http://video.toggle.sg/en/movies/dug-s-special-mission/341413',
+        'url': 'http://www.mewatch.sg/en/movies/dug-s-special-mission/341413',
          'info_dict': {
              'id': '341413',
              'ext': 'wvm',
@@ -48,7 +48,7 @@ class ToggleIE(InfoExtractor):
      }, {
          # this also tests correct video id extraction
          'note': 'm3u8 links are geo-restricted, but Android/mp4 is okay',
-        'url': 'http://video.toggle.sg/en/series/28th-sea-games-5-show/28th-sea-games-5-show-ep11/332861',
+        'url': 'http://www.mewatch.sg/en/series/28th-sea-games-5-show/28th-sea-games-5-show-ep11/332861',
          'info_dict': {
              'id': '332861',
              'ext': 'mp4',
@@ -65,19 +65,22 @@ class ToggleIE(InfoExtractor):
          'url': 'http://video.toggle.sg/en/clips/seraph-sun-aloysius-will-suddenly-sing-some-old-songs-in-high-pitch-on-set/343331',
          'only_matching': True,
      }, {
-        'url': 'http://video.toggle.sg/zh/series/zero-calling-s2-hd/ep13/336367',
+        'url': 'http://www.mewatch.sg/en/clips/seraph-sun-aloysius-will-suddenly-sing-some-old-songs-in-high-pitch-on-set/343331',
          'only_matching': True,
      }, {
-        'url': 'http://video.toggle.sg/en/series/vetri-s2/webisodes/jeeva-is-an-orphan-vetri-s2-webisode-7/342302',
+        'url': 'http://www.mewatch.sg/zh/series/zero-calling-s2-hd/ep13/336367',
          'only_matching': True,
      }, {
-        'url': 'http://video.toggle.sg/en/movies/seven-days/321936',
+        'url': 'http://www.mewatch.sg/en/series/vetri-s2/webisodes/jeeva-is-an-orphan-vetri-s2-webisode-7/342302',
          'only_matching': True,
      }, {
-        'url': 'https://video.toggle.sg/en/tv-show/news/may-2017-cna-singapore-tonight/fri-19-may-2017/512456',
+        'url': 'http://www.mewatch.sg/en/movies/seven-days/321936',
          'only_matching': True,
      }, {
-        'url': 'http://video.toggle.sg/en/channels/eleven-plus/401585',
+        'url': 'https://www.mewatch.sg/en/tv-show/news/may-2017-cna-singapore-tonight/fri-19-may-2017/512456',
+        'only_matching': True,
+    }, {
+        'url': 'http://www.mewatch.sg/en/channels/eleven-plus/401585',
          'only_matching': True,
      }]
  
diff --git a/youtube_dl/extractor/tonline.py b/youtube_dlc/extractor/tonline.py

similarity index 100%

rename from youtube_dl/extractor/tonline.py

rename to youtube_dlc/extractor/tonline.py
diff --git a/youtube_dl/extractor/toongoggles.py b/youtube_dlc/extractor/toongoggles.py

similarity index 100%

rename from youtube_dl/extractor/toongoggles.py

rename to youtube_dlc/extractor/toongoggles.py
diff --git a/youtube_dl/extractor/toutv.py b/youtube_dlc/extractor/toutv.py

similarity index 100%

rename from youtube_dl/extractor/toutv.py

rename to youtube_dlc/extractor/toutv.py
diff --git a/youtube_dl/extractor/toypics.py b/youtube_dlc/extractor/toypics.py

similarity index 100%

rename from youtube_dl/extractor/toypics.py

rename to youtube_dlc/extractor/toypics.py
diff --git a/youtube_dl/extractor/traileraddict.py b/youtube_dlc/extractor/traileraddict.py

similarity index 100%

rename from youtube_dl/extractor/traileraddict.py

rename to youtube_dlc/extractor/traileraddict.py
diff --git a/youtube_dl/extractor/trilulilu.py b/youtube_dlc/extractor/trilulilu.py

similarity index 100%

rename from youtube_dl/extractor/trilulilu.py

rename to youtube_dlc/extractor/trilulilu.py
diff --git a/youtube_dlc/extractor/trunews.py b/youtube_dlc/extractor/trunews.py

new file mode 100644 (file)

index 0000000..cca5b5c
--- /dev/null
+++ b/youtube_dlc/extractor/trunews.py
@@ -0,0 +1,34 @@
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+
+
+class TruNewsIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?trunews\.com/stream/(?P<id>[^/?#&]+)'
+    _TEST = {
+        'url': 'https://www.trunews.com/stream/will-democrats-stage-a-circus-during-president-trump-s-state-of-the-union-speech',
+        'info_dict': {
+            'id': '5c5a21e65d3c196e1c0020cc',
+            'display_id': 'will-democrats-stage-a-circus-during-president-trump-s-state-of-the-union-speech',
+            'ext': 'mp4',
+            'title': "Will Democrats Stage a Circus During President Trump's State of the Union Speech?",
+            'description': 'md5:c583b72147cc92cf21f56a31aff7a670',
+            'duration': 3685,
+            'timestamp': 1549411440,
+            'upload_date': '20190206',
+        },
+        'add_ie': ['Zype'],
+    }
+    _ZYPE_TEMPL = 'https://player.zype.com/embed/%s.js?api_key=X5XnahkjCwJrT_l5zUqypnaLEObotyvtUKJWWlONxDoHVjP8vqxlArLV8llxMbyt'
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+
+        zype_id = self._download_json(
+            'https://api.zype.com/videos', display_id, query={
+                'app_key': 'PUVKp9WgGUb3-JUw6EqafLx8tFVP6VKZTWbUOR-HOm__g4fNDt1bCsm_LgYf_k9H',
+                'per_page': 1,
+                'active': 'true',
+                'friendly_title': display_id,
+            })['response'][0]['_id']
+        return self.url_result(self._ZYPE_TEMPL % zype_id, 'Zype', zype_id)
diff --git a/youtube_dl/extractor/trutv.py b/youtube_dlc/extractor/trutv.py

similarity index 100%

rename from youtube_dl/extractor/trutv.py

rename to youtube_dlc/extractor/trutv.py
diff --git a/youtube_dl/extractor/tube8.py b/youtube_dlc/extractor/tube8.py

similarity index 100%

rename from youtube_dl/extractor/tube8.py

rename to youtube_dlc/extractor/tube8.py
diff --git a/youtube_dl/extractor/tubitv.py b/youtube_dlc/extractor/tubitv.py

similarity index 100%

rename from youtube_dl/extractor/tubitv.py

rename to youtube_dlc/extractor/tubitv.py
diff --git a/youtube_dl/extractor/tudou.py b/youtube_dlc/extractor/tudou.py

similarity index 100%

rename from youtube_dl/extractor/tudou.py

rename to youtube_dlc/extractor/tudou.py
diff --git a/youtube_dl/extractor/tumblr.py b/youtube_dlc/extractor/tumblr.py

similarity index 98%

rename from youtube_dl/extractor/tumblr.py

rename to youtube_dlc/extractor/tumblr.py

index edbb0aa6944ba82b36415875f2d99e570b3373fc..ae584ad697bdf3f460eff033b8f43e75776942ee 100644 (file)
--- a/youtube_dl/extractor/tumblr.py
+++ b/youtube_dlc/extractor/tumblr.py
@@ -4,7 +4,6 @@
  import re
  
  from .common import InfoExtractor
-from ..compat import compat_str
  from ..utils import (
      ExtractorError,
      int_or_none,
@@ -151,7 +150,7 @@ def _real_extract(self, url):
          url = 'http://%s.tumblr.com/post/%s/' % (blog, video_id)
          webpage, urlh = self._download_webpage_handle(url, video_id)
  
-        redirect_url = compat_str(urlh.geturl())
+        redirect_url = urlh.geturl()
          if 'tumblr.com/safe-mode' in redirect_url or redirect_url.startswith('/safe-mode'):
              raise ExtractorError(
                  'This Tumblr may contain sensitive media. '
diff --git a/youtube_dl/extractor/tunein.py b/youtube_dlc/extractor/tunein.py

similarity index 100%

rename from youtube_dl/extractor/tunein.py

rename to youtube_dlc/extractor/tunein.py
diff --git a/youtube_dl/extractor/tunepk.py b/youtube_dlc/extractor/tunepk.py

similarity index 100%

rename from youtube_dl/extractor/tunepk.py

rename to youtube_dlc/extractor/tunepk.py
diff --git a/youtube_dl/extractor/turbo.py b/youtube_dlc/extractor/turbo.py

similarity index 100%

rename from youtube_dl/extractor/turbo.py

rename to youtube_dlc/extractor/turbo.py
diff --git a/youtube_dl/extractor/turner.py b/youtube_dlc/extractor/turner.py

similarity index 100%

rename from youtube_dl/extractor/turner.py

rename to youtube_dlc/extractor/turner.py
diff --git a/youtube_dl/extractor/tv2.py b/youtube_dlc/extractor/tv2.py

similarity index 100%

rename from youtube_dl/extractor/tv2.py

rename to youtube_dlc/extractor/tv2.py
diff --git a/youtube_dl/extractor/tv2dk.py b/youtube_dlc/extractor/tv2dk.py

similarity index 98%

rename from youtube_dl/extractor/tv2dk.py

rename to youtube_dlc/extractor/tv2dk.py

index 611fdc0c6c7002c1669200c7ace75bf498a85c6c..8bda9348d723073b894d2d77b6556b51d89dad80 100644 (file)
--- a/youtube_dl/extractor/tv2dk.py
+++ b/youtube_dlc/extractor/tv2dk.py
@@ -106,7 +106,7 @@ def _real_extract(self, url):
          video_id = self._match_id(url)
  
          video = self._download_json(
-            'http://play.tv2bornholm.dk/controls/AJAX.aspx/specifikVideo', video_id,
+            'https://play.tv2bornholm.dk/controls/AJAX.aspx/specifikVideo', video_id,
              data=json.dumps({
                  'playlist_id': video_id,
                  'serienavn': '',
diff --git a/youtube_dl/extractor/tv2hu.py b/youtube_dlc/extractor/tv2hu.py

similarity index 100%

rename from youtube_dl/extractor/tv2hu.py

rename to youtube_dlc/extractor/tv2hu.py
diff --git a/youtube_dl/extractor/tv4.py b/youtube_dlc/extractor/tv4.py

similarity index 98%

rename from youtube_dl/extractor/tv4.py

rename to youtube_dlc/extractor/tv4.py

index a819d048c613929b79f090facc4a82a097e1cb73..c498b0191623220071d38764f04d3ba1fc114558 100644 (file)
--- a/youtube_dl/extractor/tv4.py
+++ b/youtube_dlc/extractor/tv4.py
@@ -99,7 +99,7 @@ def _real_extract(self, url):
              manifest_url.replace('.m3u8', '.f4m'),
              video_id, f4m_id='hds', fatal=False))
          formats.extend(self._extract_ism_formats(
-            re.sub(r'\.ism/.+?\.m3u8', r'.ism/Manifest', manifest_url),
+            re.sub(r'\.ism/.*?\.m3u8', r'.ism/Manifest', manifest_url),
              video_id, ism_id='mss', fatal=False))
  
          if not formats and info.get('is_geo_restricted'):
diff --git a/youtube_dlc/extractor/tv5mondeplus.py b/youtube_dlc/extractor/tv5mondeplus.py

new file mode 100644 (file)

index 0000000..b7fe082
--- /dev/null
+++ b/youtube_dlc/extractor/tv5mondeplus.py
@@ -0,0 +1,117 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    determine_ext,
+    extract_attributes,
+    int_or_none,
+    parse_duration,
+)
+
+
+class TV5MondePlusIE(InfoExtractor):
+    IE_DESC = 'TV5MONDE+'
+    _VALID_URL = r'https?://(?:www\.)?(?:tv5mondeplus|revoir\.tv5monde)\.com/toutes-les-videos/[^/]+/(?P<id>[^/?#]+)'
+    _TESTS = [{
+        # movie
+        'url': 'https://revoir.tv5monde.com/toutes-les-videos/cinema/rendez-vous-a-atlit',
+        'md5': '8cbde5ea7b296cf635073e27895e227f',
+        'info_dict': {
+            'id': '822a4756-0712-7329-1859-a13ac7fd1407',
+            'display_id': 'rendez-vous-a-atlit',
+            'ext': 'mp4',
+            'title': 'Rendez-vous à Atlit',
+            'description': 'md5:2893a4c5e1dbac3eedff2d87956e4efb',
+            'upload_date': '20200130',
+        },
+    }, {
+        # series episode
+        'url': 'https://revoir.tv5monde.com/toutes-les-videos/series-fictions/c-est-la-vie-ennemie-juree',
+        'info_dict': {
+            'id': '0df7007c-4900-3936-c601-87a13a93a068',
+            'display_id': 'c-est-la-vie-ennemie-juree',
+            'ext': 'mp4',
+            'title': "C'est la vie - Ennemie jurée",
+            'description': 'md5:dfb5c63087b6f35fe0cc0af4fe44287e',
+            'upload_date': '20200130',
+            'series': "C'est la vie",
+            'episode': 'Ennemie jurée',
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }, {
+        'url': 'https://revoir.tv5monde.com/toutes-les-videos/series-fictions/neuf-jours-en-hiver-neuf-jours-en-hiver',
+        'only_matching': True,
+    }, {
+        'url': 'https://revoir.tv5monde.com/toutes-les-videos/info-societe/le-journal-de-la-rts-edition-du-30-01-20-19h30',
+        'only_matching': True,
+    }]
+    _GEO_BYPASS = False
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+        webpage = self._download_webpage(url, display_id)
+
+        if ">Ce programme n'est malheureusement pas disponible pour votre zone géographique.<" in webpage:
+            self.raise_geo_restricted(countries=['FR'])
+
+        title = episode = self._html_search_regex(r'<h1>([^<]+)', webpage, 'title')
+        vpl_data = extract_attributes(self._search_regex(
+            r'(<[^>]+class="video_player_loader"[^>]+>)',
+            webpage, 'video player loader'))
+
+        video_files = self._parse_json(
+            vpl_data['data-broadcast'], display_id).get('files', [])
+        formats = []
+        for video_file in video_files:
+            v_url = video_file.get('url')
+            if not v_url:
+                continue
+            video_format = video_file.get('format') or determine_ext(v_url)
+            if video_format == 'm3u8':
+                formats.extend(self._extract_m3u8_formats(
+                    v_url, display_id, 'mp4', 'm3u8_native',
+                    m3u8_id='hls', fatal=False))
+            else:
+                formats.append({
+                    'url': v_url,
+                    'format_id': video_format,
+                })
+        self._sort_formats(formats)
+
+        description = self._html_search_regex(
+            r'(?s)<div[^>]+class=["\']episode-texte[^>]+>(.+?)</div>', webpage,
+            'description', fatal=False)
+
+        series = self._html_search_regex(
+            r'<p[^>]+class=["\']episode-emission[^>]+>([^<]+)', webpage,
+            'series', default=None)
+
+        if series and series != title:
+            title = '%s - %s' % (series, title)
+
+        upload_date = self._search_regex(
+            r'(?:date_publication|publish_date)["\']\s*:\s*["\'](\d{4}_\d{2}_\d{2})',
+            webpage, 'upload date', default=None)
+        if upload_date:
+            upload_date = upload_date.replace('_', '')
+
+        video_id = self._search_regex(
+            (r'data-guid=["\']([\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})',
+             r'id_contenu["\']\s:\s*(\d+)'), webpage, 'video id',
+            default=display_id)
+
+        return {
+            'id': video_id,
+            'display_id': display_id,
+            'title': title,
+            'description': description,
+            'thumbnail': vpl_data.get('data-image'),
+            'duration': int_or_none(vpl_data.get('data-duration')) or parse_duration(self._html_search_meta('duration', webpage)),
+            'upload_date': upload_date,
+            'formats': formats,
+            'series': series,
+            'episode': episode,
+        }
diff --git a/youtube_dl/extractor/tva.py b/youtube_dlc/extractor/tva.py

similarity index 90%

rename from youtube_dl/extractor/tva.py

rename to youtube_dlc/extractor/tva.py

index 0b863df2ff4ad214162c6187ac7aaa65fe3fc6c9..443f46e8a3537165d620c2db8863634e9f922ab6 100644 (file)
--- a/youtube_dl/extractor/tva.py
+++ b/youtube_dlc/extractor/tva.py
@@ -9,8 +9,8 @@
  
  
  class TVAIE(InfoExtractor):
-    _VALID_URL = r'https?://videos\.tva\.ca/details/_(?P<id>\d+)'
-    _TEST = {
+    _VALID_URL = r'https?://videos?\.tva\.ca/details/_(?P<id>\d+)'
+    _TESTS = [{
          'url': 'https://videos.tva.ca/details/_5596811470001',
          'info_dict': {
              'id': '5596811470001',
@@ -24,7 +24,10 @@ class TVAIE(InfoExtractor):
              # m3u8 download
              'skip_download': True,
          }
-    }
+    }, {
+        'url': 'https://video.tva.ca/details/_5596811470001',
+        'only_matching': True,
+    }]
      BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/5481942443001/default_default/index.html?videoId=%s'
  
      def _real_extract(self, url):
diff --git a/youtube_dl/extractor/tvanouvelles.py b/youtube_dlc/extractor/tvanouvelles.py

similarity index 100%

rename from youtube_dl/extractor/tvanouvelles.py

rename to youtube_dlc/extractor/tvanouvelles.py
diff --git a/youtube_dl/extractor/tvc.py b/youtube_dlc/extractor/tvc.py

similarity index 100%

rename from youtube_dl/extractor/tvc.py

rename to youtube_dlc/extractor/tvc.py
diff --git a/youtube_dl/extractor/tvigle.py b/youtube_dlc/extractor/tvigle.py

similarity index 100%

rename from youtube_dl/extractor/tvigle.py

rename to youtube_dlc/extractor/tvigle.py
diff --git a/youtube_dl/extractor/tvland.py b/youtube_dlc/extractor/tvland.py

similarity index 100%

rename from youtube_dl/extractor/tvland.py

rename to youtube_dlc/extractor/tvland.py
diff --git a/youtube_dl/extractor/tvn24.py b/youtube_dlc/extractor/tvn24.py

similarity index 100%

rename from youtube_dl/extractor/tvn24.py

rename to youtube_dlc/extractor/tvn24.py
diff --git a/youtube_dl/extractor/tvnet.py b/youtube_dlc/extractor/tvnet.py

similarity index 100%

rename from youtube_dl/extractor/tvnet.py

rename to youtube_dlc/extractor/tvnet.py
diff --git a/youtube_dl/extractor/tvnoe.py b/youtube_dlc/extractor/tvnoe.py

similarity index 100%

rename from youtube_dl/extractor/tvnoe.py

rename to youtube_dlc/extractor/tvnoe.py
diff --git a/youtube_dl/extractor/tvnow.py b/youtube_dlc/extractor/tvnow.py

similarity index 76%

rename from youtube_dl/extractor/tvnow.py

rename to youtube_dlc/extractor/tvnow.py

index 9c8a8a0dc3944bdf616d7edfde0c4aaa1a9890ef..e2bb62ae85e74f9958d07b3cc86e110949130691 100644 (file)
--- a/youtube_dl/extractor/tvnow.py
+++ b/youtube_dlc/extractor/tvnow.py
@@ -7,10 +7,12 @@
  from ..compat import compat_str
  from ..utils import (
      ExtractorError,
+    get_element_by_id,
      int_or_none,
      parse_iso8601,
      parse_duration,
      str_or_none,
+    try_get,
      update_url_query,
      urljoin,
  )
@@ -204,6 +206,86 @@ def _real_extract(self, url):
              ie=TVNowIE.ie_key(), video_id=mobj.group('id'))
  
  
+class TVNowFilmIE(TVNowBaseIE):
+    _VALID_URL = r'''(?x)
+                    (?P<base_url>https?://
+                        (?:www\.)?tvnow\.(?:de|at|ch)/
+                        (?:filme))/
+                        (?P<title>[^/?$&]+)-(?P<id>\d+)
+                    '''
+    _TESTS = [{
+        'url': 'https://www.tvnow.de/filme/lord-of-war-haendler-des-todes-7959',
+        'info_dict': {
+            'id': '1426690',
+            'display_id': 'lord-of-war-haendler-des-todes',
+            'ext': 'mp4',
+            'title': 'Lord of War',
+            'description': 'md5:5eda15c0d5b8cb70dac724c8a0ff89a9',
+            'timestamp': 1550010000,
+            'upload_date': '20190212',
+            'duration': 7016,
+        },
+    }, {
+        'url': 'https://www.tvnow.de/filme/the-machinist-12157',
+        'info_dict': {
+            'id': '328160',
+            'display_id': 'the-machinist',
+            'ext': 'mp4',
+            'title': 'The Machinist',
+            'description': 'md5:9a0e363fdd74b3a9e1cdd9e21d0ecc28',
+            'timestamp': 1496469720,
+            'upload_date': '20170603',
+            'duration': 5836,
+        },
+    }, {
+        'url': 'https://www.tvnow.de/filme/horst-schlaemmer-isch-kandidiere-17777',
+        'only_matching': True,  # DRM protected
+    }]
+
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        display_id = mobj.group('title')
+
+        webpage = self._download_webpage(url, display_id, fatal=False)
+        if not webpage:
+            raise ExtractorError('Cannot download "%s"' % url, expected=True)
+
+        json_text = get_element_by_id('now-web-state', webpage)
+        if not json_text:
+            raise ExtractorError('Cannot read video data', expected=True)
+
+        json_data = self._parse_json(
+            json_text,
+            display_id,
+            transform_source=lambda x: x.replace('&q;', '"'),
+            fatal=False)
+        if not json_data:
+            raise ExtractorError('Cannot read video data', expected=True)
+
+        player_key = next(
+            (key for key in json_data.keys() if 'module/player' in key),
+            None)
+        page_key = next(
+            (key for key in json_data.keys() if 'page/filme' in key),
+            None)
+        movie_id = try_get(
+            json_data,
+            [
+                lambda x: x[player_key]['body']['id'],
+                lambda x: x[page_key]['body']['modules'][0]['id'],
+                lambda x: x[page_key]['body']['modules'][1]['id']],
+            int)
+        if not movie_id:
+            raise ExtractorError('Cannot extract movie ID', expected=True)
+
+        info = self._call_api(
+            'movies/%d' % movie_id,
+            display_id,
+            query={'fields': ','.join(self._VIDEO_FIELDS)})
+
+        return self._extract_video(info, display_id)
+
+
  class TVNowNewBaseIE(InfoExtractor):
      def _call_api(self, path, video_id, query={}):
          result = self._download_json(
@@ -345,6 +427,82 @@ def _real_extract(self, url):
          display_id, video_id = re.match(self._VALID_URL, url).groups()
          info = self._call_api('player/' + video_id, video_id)
          return self._extract_video(info, video_id, display_id)
+
+
+class TVNowFilmIE(TVNowIE):
+    _VALID_URL = r'''(?x)
+                    (?P<base_url>https?://
+                        (?:www\.)?tvnow\.(?:de|at|ch)/
+                        (?:filme))/
+                        (?P<title>[^/?$&]+)-(?P<id>\d+)
+                    '''
+    _TESTS = [{
+        'url': 'https://www.tvnow.de/filme/lord-of-war-haendler-des-todes-7959',
+        'info_dict': {
+            'id': '1426690',
+            'display_id': 'lord-of-war-haendler-des-todes',
+            'ext': 'mp4',
+            'title': 'Lord of War',
+            'description': 'md5:5eda15c0d5b8cb70dac724c8a0ff89a9',
+            'timestamp': 1550010000,
+            'upload_date': '20190212',
+            'duration': 7016,
+        },
+    }, {
+        'url': 'https://www.tvnow.de/filme/the-machinist-12157',
+        'info_dict': {
+            'id': '328160',
+            'display_id': 'the-machinist',
+            'ext': 'mp4',
+            'title': 'The Machinist',
+            'description': 'md5:9a0e363fdd74b3a9e1cdd9e21d0ecc28',
+            'timestamp': 1496469720,
+            'upload_date': '20170603',
+            'duration': 5836,
+        },
+    }, {
+        'url': 'https://www.tvnow.de/filme/horst-schlaemmer-isch-kandidiere-17777',
+        'only_matching': True,  # DRM protected
+    }]
+
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        display_id = mobj.group('title')
+
+        webpage = self._download_webpage(url, display_id, fatal=False)
+        if not webpage:
+            raise ExtractorError('Cannot download "%s"' % url, expected=True)
+
+        json_text = get_element_by_id('now-web-state', webpage)
+        if not json_text:
+            raise ExtractorError('Cannot read video data', expected=True)
+
+        json_data = self._parse_json(
+            json_text,
+            display_id,
+            transform_source=lambda x: x.replace('&q;', '"'),
+            fatal=False)
+        if not json_data:
+            raise ExtractorError('Cannot read video data', expected=True)
+
+        player_key = next(
+            (key for key in json_data.keys() if 'module/player' in key),
+            None)
+        page_key = next(
+            (key for key in json_data.keys() if 'page/filme' in key),
+            None)
+        movie_id = try_get(
+            json_data,
+            [
+                lambda x: x[player_key]['body']['id'],
+                lambda x: x[page_key]['body']['modules'][0]['id'],
+                lambda x: x[page_key]['body']['modules'][1]['id']],
+            int)
+        if not movie_id:
+            raise ExtractorError('Cannot extract movie ID', expected=True)
+
+        info = self._call_api('player/%d' % movie_id, display_id)
+        return self._extract_video(info, url, display_id)
  """
  
  
diff --git a/youtube_dl/extractor/tvp.py b/youtube_dlc/extractor/tvp.py

similarity index 100%

rename from youtube_dl/extractor/tvp.py

rename to youtube_dlc/extractor/tvp.py
diff --git a/youtube_dl/extractor/tvplay.py b/youtube_dlc/extractor/tvplay.py

similarity index 82%

rename from youtube_dl/extractor/tvplay.py

rename to youtube_dlc/extractor/tvplay.py

index d82d48f94ecc026629479d84aa8ca513eaf0db26..3c2450dd0c8733d3a96a1a842c54505294b70ca6 100644 (file)
--- a/youtube_dl/extractor/tvplay.py
+++ b/youtube_dlc/extractor/tvplay.py
@@ -6,7 +6,6 @@
  from .common import InfoExtractor
  from ..compat import (
      compat_HTTPError,
-    compat_str,
      compat_urlparse,
  )
  from ..utils import (
@@ -15,9 +14,7 @@
      int_or_none,
      parse_iso8601,
      qualities,
-    smuggle_url,
      try_get,
-    unsmuggle_url,
      update_url_query,
      url_or_none,
  )
@@ -235,11 +232,6 @@ class TVPlayIE(InfoExtractor):
      ]
  
      def _real_extract(self, url):
-        url, smuggled_data = unsmuggle_url(url, {})
-        self._initialize_geo_bypass({
-            'countries': smuggled_data.get('geo_countries'),
-        })
-
          video_id = self._match_id(url)
          geo_country = self._search_regex(
              r'https?://[^/]+\.([a-z]{2})', url,
@@ -285,8 +277,6 @@ def _real_extract(self, url):
                      'ext': ext,
                  }
                  if video_url.startswith('rtmp'):
-                    if smuggled_data.get('skip_rtmp'):
-                        continue
                      m = re.search(
                          r'^(?P<url>rtmp://[^/]+/(?P<app>[^/]+))/(?P<playpath>.+)$', video_url)
                      if not m:
@@ -347,115 +337,80 @@ class ViafreeIE(InfoExtractor):
      _VALID_URL = r'''(?x)
                      https?://
                          (?:www\.)?
-                        viafree\.
-                        (?:
-                            (?:dk|no)/programmer|
-                            se/program
-                        )
-                        /(?:[^/]+/)+(?P<id>[^/?#&]+)
+                        viafree\.(?P<country>dk|no|se)
+                        /(?P<id>program(?:mer)?/(?:[^/]+/)+[^/?#&]+)
                      '''
      _TESTS = [{
-        'url': 'http://www.viafree.se/program/livsstil/husraddarna/sasong-2/avsnitt-2',
+        'url': 'http://www.viafree.no/programmer/underholdning/det-beste-vorspielet/sesong-2/episode-1',
          'info_dict': {
-            'id': '395375',
+            'id': '757786',
              'ext': 'mp4',
-            'title': 'Husräddarna S02E02',
-            'description': 'md5:4db5c933e37db629b5a2f75dfb34829e',
-            'series': 'Husräddarna',
-            'season': 'Säsong 2',
+            'title': 'Det beste vorspielet - Sesong 2 - Episode 1',
+            'description': 'md5:b632cb848331404ccacd8cd03e83b4c3',
+            'series': 'Det beste vorspielet',
              'season_number': 2,
-            'duration': 2576,
-            'timestamp': 1400596321,
-            'upload_date': '20140520',
+            'duration': 1116,
+            'timestamp': 1471200600,
+            'upload_date': '20160814',
          },
          'params': {
              'skip_download': True,
          },
-        'add_ie': [TVPlayIE.ie_key()],
      }, {
          # with relatedClips
          'url': 'http://www.viafree.se/program/reality/sommaren-med-youtube-stjarnorna/sasong-1/avsnitt-1',
-        'info_dict': {
-            'id': '758770',
-            'ext': 'mp4',
-            'title': 'Sommaren med YouTube-stjärnorna S01E01',
-            'description': 'md5:2bc69dce2c4bb48391e858539bbb0e3f',
-            'series': 'Sommaren med YouTube-stjärnorna',
-            'season': 'Säsong 1',
-            'season_number': 1,
-            'duration': 1326,
-            'timestamp': 1470905572,
-            'upload_date': '20160811',
-        },
-        'params': {
-            'skip_download': True,
-        },
-        'add_ie': [TVPlayIE.ie_key()],
+        'only_matching': True,
      }, {
          # Different og:image URL schema
          'url': 'http://www.viafree.se/program/reality/sommaren-med-youtube-stjarnorna/sasong-1/avsnitt-2',
          'only_matching': True,
      }, {
-        'url': 'http://www.viafree.no/programmer/underholdning/det-beste-vorspielet/sesong-2/episode-1',
+        'url': 'http://www.viafree.se/program/livsstil/husraddarna/sasong-2/avsnitt-2',
          'only_matching': True,
      }, {
          'url': 'http://www.viafree.dk/programmer/reality/paradise-hotel/saeson-7/episode-5',
          'only_matching': True,
      }]
+    _GEO_BYPASS = False
  
      @classmethod
      def suitable(cls, url):
          return False if TVPlayIE.suitable(url) else super(ViafreeIE, cls).suitable(url)
  
      def _real_extract(self, url):
-        video_id = self._match_id(url)
+        country, path = re.match(self._VALID_URL, url).groups()
+        content = self._download_json(
+            'https://viafree-content.mtg-api.com/viafree-content/v1/%s/path/%s' % (country, path), path)
+        program = content['_embedded']['viafreeBlocks'][0]['_embedded']['program']
+        guid = program['guid']
+        meta = content['meta']
+        title = meta['title']
  
-        webpage = self._download_webpage(url, video_id)
+        try:
+            stream_href = self._download_json(
+                program['_links']['streamLink']['href'], guid,
+                headers=self.geo_verification_headers())['embedded']['prioritizedStreams'][0]['links']['stream']['href']
+        except ExtractorError as e:
+            if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
+                self.raise_geo_restricted(countries=[country])
+            raise
+
+        formats = self._extract_m3u8_formats(stream_href, guid, 'mp4')
+        self._sort_formats(formats)
+        episode = program.get('episode') or {}
  
-        data = self._parse_json(
-            self._search_regex(
-                r'(?s)window\.App\s*=\s*({.+?})\s*;\s*</script',
-                webpage, 'data', default='{}'),
-            video_id, transform_source=lambda x: re.sub(
-                r'(?s)function\s+[a-zA-Z_][\da-zA-Z_]*\s*\([^)]*\)\s*{[^}]*}\s*',
-                'null', x), fatal=False)
-
-        video_id = None
-
-        if data:
-            video_id = try_get(
-                data, lambda x: x['context']['dispatcher']['stores'][
-                    'ContentPageProgramStore']['currentVideo']['id'],
-                compat_str)
-
-        # Fallback #1 (extract from og:image URL schema)
-        if not video_id:
-            thumbnail = self._og_search_thumbnail(webpage, default=None)
-            if thumbnail:
-                video_id = self._search_regex(
-                    # Patterns seen:
-                    #  http://cdn.playapi.mtgx.tv/imagecache/600x315/cloud/content-images/inbox/765166/a2e95e5f1d735bab9f309fa345cc3f25.jpg
-                    #  http://cdn.playapi.mtgx.tv/imagecache/600x315/cloud/content-images/seasons/15204/758770/4a5ba509ca8bc043e1ebd1a76131cdf2.jpg
-                    r'https?://[^/]+/imagecache/(?:[^/]+/)+(\d{6,})/',
-                    thumbnail, 'video id', default=None)
-
-        # Fallback #2. Extract from raw JSON string.
-        # May extract wrong video id if relatedClips is present.
-        if not video_id:
-            video_id = self._search_regex(
-                r'currentVideo["\']\s*:\s*.+?["\']id["\']\s*:\s*["\'](\d{6,})',
-                webpage, 'video id')
-
-        return self.url_result(
-            smuggle_url(
-                'mtg:%s' % video_id,
-                {
-                    'geo_countries': [
-                        compat_urlparse.urlparse(url).netloc.rsplit('.', 1)[-1]],
-                    # rtmp host mtgfs.fplive.net for viafree is unresolvable
-                    'skip_rtmp': True,
-                }),
-            ie=TVPlayIE.ie_key(), video_id=video_id)
+        return {
+            'id': guid,
+            'title': title,
+            'thumbnail': meta.get('image'),
+            'description': meta.get('description'),
+            'series': episode.get('seriesTitle'),
+            'episode_number': int_or_none(episode.get('episodeNumber')),
+            'season_number': int_or_none(episode.get('seasonNumber')),
+            'duration': int_or_none(try_get(program, lambda x: x['video']['duration']['milliseconds']), 1000),
+            'timestamp': parse_iso8601(try_get(program, lambda x: x['availability']['start'])),
+            'formats': formats,
+        }
  
  
  class TVPlayHomeIE(InfoExtractor):
diff --git a/youtube_dl/extractor/tvplayer.py b/youtube_dlc/extractor/tvplayer.py

similarity index 100%

rename from youtube_dl/extractor/tvplayer.py

rename to youtube_dlc/extractor/tvplayer.py
diff --git a/youtube_dl/extractor/tweakers.py b/youtube_dlc/extractor/tweakers.py

similarity index 100%

rename from youtube_dl/extractor/tweakers.py

rename to youtube_dlc/extractor/tweakers.py
diff --git a/youtube_dl/extractor/twentyfourvideo.py b/youtube_dlc/extractor/twentyfourvideo.py

similarity index 92%

rename from youtube_dl/extractor/twentyfourvideo.py

rename to youtube_dlc/extractor/twentyfourvideo.py

index 1d66eeaff6e80cd1c629f79ad98a1a68f1d52564..74d14049b482a702bf464a40f2e5f361dc7cd72a 100644 (file)
--- a/youtube_dl/extractor/twentyfourvideo.py
+++ b/youtube_dlc/extractor/twentyfourvideo.py
@@ -17,8 +17,8 @@ class TwentyFourVideoIE(InfoExtractor):
      _VALID_URL = r'''(?x)
                      https?://
                          (?P<host>
-                            (?:(?:www|porno)\.)?24video\.
-                            (?:net|me|xxx|sexy?|tube|adult|site)
+                            (?:(?:www|porno?)\.)?24video\.
+                            (?:net|me|xxx|sexy?|tube|adult|site|vip)
                          )/
                          (?:
                              video/(?:(?:view|xml)/)?|
@@ -59,6 +59,12 @@ class TwentyFourVideoIE(InfoExtractor):
      }, {
          'url': 'https://porno.24video.net/video/2640421-vsya-takaya-gibkaya-i-v-masle',
          'only_matching': True,
+    }, {
+        'url': 'https://www.24video.vip/video/view/1044982',
+        'only_matching': True,
+    }, {
+        'url': 'https://porn.24video.net/video/2640421-vsya-takay',
+        'only_matching': True,
      }]
  
      def _real_extract(self, url):
diff --git a/youtube_dl/extractor/twentymin.py b/youtube_dlc/extractor/twentymin.py

similarity index 100%

rename from youtube_dl/extractor/twentymin.py

rename to youtube_dlc/extractor/twentymin.py
diff --git a/youtube_dl/extractor/twentythreevideo.py b/youtube_dlc/extractor/twentythreevideo.py

similarity index 100%

rename from youtube_dl/extractor/twentythreevideo.py

rename to youtube_dlc/extractor/twentythreevideo.py
diff --git a/youtube_dl/extractor/twitcasting.py b/youtube_dlc/extractor/twitcasting.py

similarity index 100%

rename from youtube_dl/extractor/twitcasting.py

rename to youtube_dlc/extractor/twitcasting.py
diff --git a/youtube_dl/extractor/twitch.py b/youtube_dlc/extractor/twitch.py

similarity index 51%

rename from youtube_dl/extractor/twitch.py

rename to youtube_dlc/extractor/twitch.py

index a8c2502af8132834a34b8ef9c8ade935dd432604..eadc48c6d88d4099c4bd5961d78f9bb5d717925d 100644 (file)
--- a/youtube_dl/extractor/twitch.py
+++ b/youtube_dlc/extractor/twitch.py
@@ -1,26 +1,30 @@
  # coding: utf-8
  from __future__ import unicode_literals
  
+import collections
  import itertools
-import re
-import random
  import json
+import random
+import re
  
  from .common import InfoExtractor
  from ..compat import (
      compat_kwargs,
      compat_parse_qs,
      compat_str,
+    compat_urlparse,
      compat_urllib_parse_urlencode,
      compat_urllib_parse_urlparse,
  )
  from ..utils import (
      clean_html,
      ExtractorError,
+    float_or_none,
      int_or_none,
-    orderedSet,
      parse_duration,
      parse_iso8601,
+    qualities,
+    str_or_none,
      try_get,
      unified_timestamp,
      update_url_query,
@@ -50,8 +54,14 @@ def _handle_error(self, response):
  
      def _call_api(self, path, item_id, *args, **kwargs):
          headers = kwargs.get('headers', {}).copy()
-        headers['Client-ID'] = self._CLIENT_ID
-        kwargs['headers'] = headers
+        headers.update({
+            'Accept': 'application/vnd.twitchtv.v5+json; charset=UTF-8',
+            'Client-ID': self._CLIENT_ID,
+        })
+        kwargs.update({
+            'headers': headers,
+            'expected_status': (400, 410),
+        })
          response = self._download_json(
              '%s/%s' % (self._API_BASE, path), item_id,
              *args, **compat_kwargs(kwargs))
@@ -142,105 +152,16 @@ def _prefer_source(self, formats):
                      })
          self._sort_formats(formats)
  
+    def _download_access_token(self, channel_name):
+        return self._call_api(
+            'api/channels/%s/access_token' % channel_name, channel_name,
+            'Downloading access token JSON')
  
-class TwitchItemBaseIE(TwitchBaseIE):
-    def _download_info(self, item, item_id):
-        return self._extract_info(self._call_api(
-            'kraken/videos/%s%s' % (item, item_id), item_id,
-            'Downloading %s info JSON' % self._ITEM_TYPE))
-
-    def _extract_media(self, item_id):
-        info = self._download_info(self._ITEM_SHORTCUT, item_id)
-        response = self._call_api(
-            'api/videos/%s%s' % (self._ITEM_SHORTCUT, item_id), item_id,
-            'Downloading %s playlist JSON' % self._ITEM_TYPE)
-        entries = []
-        chunks = response['chunks']
-        qualities = list(chunks.keys())
-        for num, fragment in enumerate(zip(*chunks.values()), start=1):
-            formats = []
-            for fmt_num, fragment_fmt in enumerate(fragment):
-                format_id = qualities[fmt_num]
-                fmt = {
-                    'url': fragment_fmt['url'],
-                    'format_id': format_id,
-                    'quality': 1 if format_id == 'live' else 0,
-                }
-                m = re.search(r'^(?P<height>\d+)[Pp]', format_id)
-                if m:
-                    fmt['height'] = int(m.group('height'))
-                formats.append(fmt)
-            self._sort_formats(formats)
-            entry = dict(info)
-            entry['id'] = '%s_%d' % (entry['id'], num)
-            entry['title'] = '%s part %d' % (entry['title'], num)
-            entry['formats'] = formats
-            entries.append(entry)
-        return self.playlist_result(entries, info['id'], info['title'])
-
-    def _extract_info(self, info):
-        status = info.get('status')
-        if status == 'recording':
-            is_live = True
-        elif status == 'recorded':
-            is_live = False
-        else:
-            is_live = None
-        return {
-            'id': info['_id'],
-            'title': info.get('title') or 'Untitled Broadcast',
-            'description': info.get('description'),
-            'duration': int_or_none(info.get('length')),
-            'thumbnail': info.get('preview'),
-            'uploader': info.get('channel', {}).get('display_name'),
-            'uploader_id': info.get('channel', {}).get('name'),
-            'timestamp': parse_iso8601(info.get('recorded_at')),
-            'view_count': int_or_none(info.get('views')),
-            'is_live': is_live,
-        }
-
-    def _real_extract(self, url):
-        return self._extract_media(self._match_id(url))
-
+    def _extract_channel_id(self, token, channel_name):
+        return compat_str(self._parse_json(token, channel_name)['channel_id'])
  
-class TwitchVideoIE(TwitchItemBaseIE):
-    IE_NAME = 'twitch:video'
-    _VALID_URL = r'%s/[^/]+/b/(?P<id>\d+)' % TwitchBaseIE._VALID_URL_BASE
-    _ITEM_TYPE = 'video'
-    _ITEM_SHORTCUT = 'a'
  
-    _TEST = {
-        'url': 'http://www.twitch.tv/riotgames/b/577357806',
-        'info_dict': {
-            'id': 'a577357806',
-            'title': 'Worlds Semifinals - Star Horn Royal Club vs. OMG',
-        },
-        'playlist_mincount': 12,
-        'skip': 'HTTP Error 404: Not Found',
-    }
-
-
-class TwitchChapterIE(TwitchItemBaseIE):
-    IE_NAME = 'twitch:chapter'
-    _VALID_URL = r'%s/[^/]+/c/(?P<id>\d+)' % TwitchBaseIE._VALID_URL_BASE
-    _ITEM_TYPE = 'chapter'
-    _ITEM_SHORTCUT = 'c'
-
-    _TESTS = [{
-        'url': 'http://www.twitch.tv/acracingleague/c/5285812',
-        'info_dict': {
-            'id': 'c5285812',
-            'title': 'ACRL Off Season - Sports Cars @ Nordschleife',
-        },
-        'playlist_mincount': 3,
-        'skip': 'HTTP Error 404: Not Found',
-    }, {
-        'url': 'http://www.twitch.tv/tsm_theoddone/c/2349361',
-        'only_matching': True,
-    }]
-
-
-class TwitchVodIE(TwitchItemBaseIE):
+class TwitchVodIE(TwitchBaseIE):
      IE_NAME = 'twitch:vod'
      _VALID_URL = r'''(?x)
                      https?://
@@ -309,17 +230,60 @@ class TwitchVodIE(TwitchItemBaseIE):
          'only_matching': True,
      }]
  
+    def _download_info(self, item_id):
+        return self._extract_info(
+            self._call_api(
+                'kraken/videos/%s' % item_id, item_id,
+                'Downloading video info JSON'))
+
+    @staticmethod
+    def _extract_info(info):
+        status = info.get('status')
+        if status == 'recording':
+            is_live = True
+        elif status == 'recorded':
+            is_live = False
+        else:
+            is_live = None
+        _QUALITIES = ('small', 'medium', 'large')
+        quality_key = qualities(_QUALITIES)
+        thumbnails = []
+        preview = info.get('preview')
+        if isinstance(preview, dict):
+            for thumbnail_id, thumbnail_url in preview.items():
+                thumbnail_url = url_or_none(thumbnail_url)
+                if not thumbnail_url:
+                    continue
+                if thumbnail_id not in _QUALITIES:
+                    continue
+                thumbnails.append({
+                    'url': thumbnail_url,
+                    'preference': quality_key(thumbnail_id),
+                })
+        return {
+            'id': info['_id'],
+            'title': info.get('title') or 'Untitled Broadcast',
+            'description': info.get('description'),
+            'duration': int_or_none(info.get('length')),
+            'thumbnails': thumbnails,
+            'uploader': info.get('channel', {}).get('display_name'),
+            'uploader_id': info.get('channel', {}).get('name'),
+            'timestamp': parse_iso8601(info.get('recorded_at')),
+            'view_count': int_or_none(info.get('views')),
+            'is_live': is_live,
+        }
+
      def _real_extract(self, url):
-        item_id = self._match_id(url)
+        vod_id = self._match_id(url)
  
-        info = self._download_info(self._ITEM_SHORTCUT, item_id)
+        info = self._download_info(vod_id)
          access_token = self._call_api(
-            'api/vods/%s/access_token' % item_id, item_id,
+            'api/vods/%s/access_token' % vod_id, vod_id,
              'Downloading %s access token' % self._ITEM_TYPE)
  
          formats = self._extract_m3u8_formats(
              '%s/vod/%s.m3u8?%s' % (
-                self._USHER_BASE, item_id,
+                self._USHER_BASE, vod_id,
                  compat_urllib_parse_urlencode({
                      'allow_source': 'true',
                      'allow_audio_only': 'true',
@@ -329,7 +293,7 @@ def _real_extract(self, url):
                      'nauth': access_token['token'],
                      'nauthsig': access_token['sig'],
                  })),
-            item_id, 'mp4', entry_protocol='m3u8_native')
+            vod_id, 'mp4', entry_protocol='m3u8_native')
  
          self._prefer_source(formats)
          info['formats'] = formats
@@ -343,7 +307,7 @@ def _real_extract(self, url):
              info['subtitles'] = {
                  'rechat': [{
                      'url': update_url_query(
-                        'https://api.twitch.tv/v5/videos/%s/comments' % item_id, {
+                        'https://api.twitch.tv/v5/videos/%s/comments' % vod_id, {
                              'client_id': self._CLIENT_ID,
                          }),
                      'ext': 'json',
@@ -353,164 +317,405 @@ def _real_extract(self, url):
          return info
  
  
-class TwitchPlaylistBaseIE(TwitchBaseIE):
-    _PLAYLIST_PATH = 'kraken/channels/%s/videos/?offset=%d&limit=%d'
+def _make_video_result(node):
+    assert isinstance(node, dict)
+    video_id = node.get('id')
+    if not video_id:
+        return
+    return {
+        '_type': 'url_transparent',
+        'ie_key': TwitchVodIE.ie_key(),
+        'id': video_id,
+        'url': 'https://www.twitch.tv/videos/%s' % video_id,
+        'title': node.get('title'),
+        'thumbnail': node.get('previewThumbnailURL'),
+        'duration': float_or_none(node.get('lengthSeconds')),
+        'view_count': int_or_none(node.get('viewCount')),
+    }
+
+
+class TwitchGraphQLBaseIE(TwitchBaseIE):
      _PAGE_LIMIT = 100
  
-    def _extract_playlist(self, channel_id):
-        info = self._call_api(
-            'kraken/channels/%s' % channel_id,
-            channel_id, 'Downloading channel info JSON')
-        channel_name = info.get('display_name') or info.get('name')
+    def _download_gql(self, video_id, op, variables, sha256_hash, note, fatal=True):
+        return self._download_json(
+            'https://gql.twitch.tv/gql', video_id, note,
+            data=json.dumps({
+                'operationName': op,
+                'variables': variables,
+                'extensions': {
+                    'persistedQuery': {
+                        'version': 1,
+                        'sha256Hash': sha256_hash,
+                    }
+                }
+            }).encode(),
+            headers={
+                'Content-Type': 'text/plain;charset=UTF-8',
+                'Client-ID': self._CLIENT_ID,
+            }, fatal=fatal)
+
+
+class TwitchCollectionIE(TwitchGraphQLBaseIE):
+    _VALID_URL = r'https?://(?:(?:www|go|m)\.)?twitch\.tv/collections/(?P<id>[^/]+)'
+
+    _TESTS = [{
+        'url': 'https://www.twitch.tv/collections/wlDCoH0zEBZZbQ',
+        'info_dict': {
+            'id': 'wlDCoH0zEBZZbQ',
+            'title': 'Overthrow Nook, capitalism for children',
+        },
+        'playlist_mincount': 13,
+    }]
+
+    _OPERATION_NAME = 'CollectionSideBar'
+    _SHA256_HASH = '27111f1b382effad0b6def325caef1909c733fe6a4fbabf54f8d491ef2cf2f14'
+
+    def _real_extract(self, url):
+        collection_id = self._match_id(url)
+        collection = self._download_gql(
+            collection_id, self._OPERATION_NAME,
+            {'collectionID': collection_id}, self._SHA256_HASH,
+            'Downloading collection GraphQL')['data']['collection']
+        title = collection.get('title')
          entries = []
+        for edge in collection['items']['edges']:
+            if not isinstance(edge, dict):
+                continue
+            node = edge.get('node')
+            if not isinstance(node, dict):
+                continue
+            video = _make_video_result(node)
+            if video:
+                entries.append(video)
+        return self.playlist_result(
+            entries, playlist_id=collection_id, playlist_title=title)
+
+
+class TwitchPlaylistBaseIE(TwitchGraphQLBaseIE):
+    def _entries(self, channel_name, *args):
+        cursor = None
+        variables_common = self._make_variables(channel_name, *args)
+        entries_key = '%ss' % self._ENTRY_KIND
+        for page_num in itertools.count(1):
+            variables = variables_common.copy()
+            variables['limit'] = self._PAGE_LIMIT
+            if cursor:
+                variables['cursor'] = cursor
+            page = self._download_gql(
+                channel_name, self._OPERATION_NAME, variables,
+                self._SHA256_HASH,
+                'Downloading %ss GraphQL page %s' % (self._NODE_KIND, page_num),
+                fatal=False)
+            if not page:
+                break
+            edges = try_get(
+                page, lambda x: x['data']['user'][entries_key]['edges'], list)
+            if not edges:
+                break
+            for edge in edges:
+                if not isinstance(edge, dict):
+                    continue
+                if edge.get('__typename') != self._EDGE_KIND:
+                    continue
+                node = edge.get('node')
+                if not isinstance(node, dict):
+                    continue
+                if node.get('__typename') != self._NODE_KIND:
+                    continue
+                entry = self._extract_entry(node)
+                if entry:
+                    cursor = edge.get('cursor')
+                    yield entry
+            if not cursor or not isinstance(cursor, compat_str):
+                break
+
+    # Deprecated kraken v5 API
+    def _entries_kraken(self, channel_name, broadcast_type, sort):
+        access_token = self._download_access_token(channel_name)
+        channel_id = self._extract_channel_id(access_token['token'], channel_name)
          offset = 0
-        limit = self._PAGE_LIMIT
-        broken_paging_detected = False
          counter_override = None
          for counter in itertools.count(1):
              response = self._call_api(
-                self._PLAYLIST_PATH % (channel_id, offset, limit),
+                'kraken/channels/%s/videos/' % channel_id,
                  channel_id,
-                'Downloading %s JSON page %s'
-                % (self._PLAYLIST_TYPE, counter_override or counter))
-            page_entries = self._extract_playlist_page(response)
-            if not page_entries:
+                'Downloading video JSON page %s' % (counter_override or counter),
+                query={
+                    'offset': offset,
+                    'limit': self._PAGE_LIMIT,
+                    'broadcast_type': broadcast_type,
+                    'sort': sort,
+                })
+            videos = response.get('videos')
+            if not isinstance(videos, list):
                  break
+            for video in videos:
+                if not isinstance(video, dict):
+                    continue
+                video_url = url_or_none(video.get('url'))
+                if not video_url:
+                    continue
+                yield {
+                    '_type': 'url_transparent',
+                    'ie_key': TwitchVodIE.ie_key(),
+                    'id': video.get('_id'),
+                    'url': video_url,
+                    'title': video.get('title'),
+                    'description': video.get('description'),
+                    'timestamp': unified_timestamp(video.get('published_at')),
+                    'duration': float_or_none(video.get('length')),
+                    'view_count': int_or_none(video.get('views')),
+                    'language': video.get('language'),
+                }
+            offset += self._PAGE_LIMIT
              total = int_or_none(response.get('_total'))
-            # Since the beginning of March 2016 twitch's paging mechanism
-            # is completely broken on the twitch side. It simply ignores
-            # a limit and returns the whole offset number of videos.
-            # Working around by just requesting all videos at once.
-            # Upd: pagination bug was fixed by twitch on 15.03.2016.
-            if not broken_paging_detected and total and len(page_entries) > limit:
-                self.report_warning(
-                    'Twitch pagination is broken on twitch side, requesting all videos at once',
-                    channel_id)
-                broken_paging_detected = True
-                offset = total
-                counter_override = '(all at once)'
-                continue
-            entries.extend(page_entries)
-            if broken_paging_detected or total and len(page_entries) >= total:
+            if total and offset >= total:
                  break
-            offset += limit
-        return self.playlist_result(
-            [self._make_url_result(entry) for entry in orderedSet(entries)],
-            channel_id, channel_name)
-
-    def _make_url_result(self, url):
-        try:
-            video_id = 'v%s' % TwitchVodIE._match_id(url)
-            return self.url_result(url, TwitchVodIE.ie_key(), video_id=video_id)
-        except AssertionError:
-            return self.url_result(url)
-
-    def _extract_playlist_page(self, response):
-        videos = response.get('videos')
-        return [video['url'] for video in videos] if videos else []
  
-    def _real_extract(self, url):
-        return self._extract_playlist(self._match_id(url))
  
-
-class TwitchProfileIE(TwitchPlaylistBaseIE):
-    IE_NAME = 'twitch:profile'
-    _VALID_URL = r'%s/(?P<id>[^/]+)/profile/?(?:\#.*)?$' % TwitchBaseIE._VALID_URL_BASE
-    _PLAYLIST_TYPE = 'profile'
+class TwitchVideosIE(TwitchPlaylistBaseIE):
+    _VALID_URL = r'https?://(?:(?:www|go|m)\.)?twitch\.tv/(?P<id>[^/]+)/(?:videos|profile)'
  
      _TESTS = [{
-        'url': 'http://www.twitch.tv/vanillatv/profile',
+        # All Videos sorted by Date
+        'url': 'https://www.twitch.tv/spamfish/videos?filter=all',
          'info_dict': {
-            'id': 'vanillatv',
-            'title': 'VanillaTV',
+            'id': 'spamfish',
+            'title': 'spamfish - All Videos sorted by Date',
          },
-        'playlist_mincount': 412,
+        'playlist_mincount': 924,
      }, {
-        'url': 'http://m.twitch.tv/vanillatv/profile',
-        'only_matching': True,
-    }]
-
-
-class TwitchVideosBaseIE(TwitchPlaylistBaseIE):
-    _VALID_URL_VIDEOS_BASE = r'%s/(?P<id>[^/]+)/videos' % TwitchBaseIE._VALID_URL_BASE
-    _PLAYLIST_PATH = TwitchPlaylistBaseIE._PLAYLIST_PATH + '&broadcast_type='
-
-
-class TwitchAllVideosIE(TwitchVideosBaseIE):
-    IE_NAME = 'twitch:videos:all'
-    _VALID_URL = r'%s/all' % TwitchVideosBaseIE._VALID_URL_VIDEOS_BASE
-    _PLAYLIST_PATH = TwitchVideosBaseIE._PLAYLIST_PATH + 'archive,upload,highlight'
-    _PLAYLIST_TYPE = 'all videos'
-
-    _TESTS = [{
-        'url': 'https://www.twitch.tv/spamfish/videos/all',
+        # All Videos sorted by Popular
+        'url': 'https://www.twitch.tv/spamfish/videos?filter=all&sort=views',
          'info_dict': {
              'id': 'spamfish',
-            'title': 'Spamfish',
+            'title': 'spamfish - All Videos sorted by Popular',
          },
-        'playlist_mincount': 869,
+        'playlist_mincount': 931,
      }, {
-        'url': 'https://m.twitch.tv/spamfish/videos/all',
-        'only_matching': True,
-    }]
-
-
-class TwitchUploadsIE(TwitchVideosBaseIE):
-    IE_NAME = 'twitch:videos:uploads'
-    _VALID_URL = r'%s/uploads' % TwitchVideosBaseIE._VALID_URL_VIDEOS_BASE
-    _PLAYLIST_PATH = TwitchVideosBaseIE._PLAYLIST_PATH + 'upload'
-    _PLAYLIST_TYPE = 'uploads'
-
-    _TESTS = [{
-        'url': 'https://www.twitch.tv/spamfish/videos/uploads',
+        # Past Broadcasts sorted by Date
+        'url': 'https://www.twitch.tv/spamfish/videos?filter=archives',
          'info_dict': {
              'id': 'spamfish',
-            'title': 'Spamfish',
+            'title': 'spamfish - Past Broadcasts sorted by Date',
          },
-        'playlist_mincount': 0,
+        'playlist_mincount': 27,
      }, {
-        'url': 'https://m.twitch.tv/spamfish/videos/uploads',
+        # Highlights sorted by Date
+        'url': 'https://www.twitch.tv/spamfish/videos?filter=highlights',
+        'info_dict': {
+            'id': 'spamfish',
+            'title': 'spamfish - Highlights sorted by Date',
+        },
+        'playlist_mincount': 901,
+    }, {
+        # Uploads sorted by Date
+        'url': 'https://www.twitch.tv/esl_csgo/videos?filter=uploads&sort=time',
+        'info_dict': {
+            'id': 'esl_csgo',
+            'title': 'esl_csgo - Uploads sorted by Date',
+        },
+        'playlist_mincount': 5,
+    }, {
+        # Past Premieres sorted by Date
+        'url': 'https://www.twitch.tv/spamfish/videos?filter=past_premieres',
+        'info_dict': {
+            'id': 'spamfish',
+            'title': 'spamfish - Past Premieres sorted by Date',
+        },
+        'playlist_mincount': 1,
+    }, {
+        'url': 'https://www.twitch.tv/spamfish/videos/all',
+        'only_matching': True,
+    }, {
+        'url': 'https://m.twitch.tv/spamfish/videos/all',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.twitch.tv/spamfish/videos',
          'only_matching': True,
      }]
  
+    Broadcast = collections.namedtuple('Broadcast', ['type', 'label'])
+
+    _DEFAULT_BROADCAST = Broadcast(None, 'All Videos')
+    _BROADCASTS = {
+        'archives': Broadcast('ARCHIVE', 'Past Broadcasts'),
+        'highlights': Broadcast('HIGHLIGHT', 'Highlights'),
+        'uploads': Broadcast('UPLOAD', 'Uploads'),
+        'past_premieres': Broadcast('PAST_PREMIERE', 'Past Premieres'),
+        'all': _DEFAULT_BROADCAST,
+    }
+
+    _DEFAULT_SORTED_BY = 'Date'
+    _SORTED_BY = {
+        'time': _DEFAULT_SORTED_BY,
+        'views': 'Popular',
+    }
+
+    _SHA256_HASH = 'a937f1d22e269e39a03b509f65a7490f9fc247d7f83d6ac1421523e3b68042cb'
+    _OPERATION_NAME = 'FilterableVideoTower_Videos'
+    _ENTRY_KIND = 'video'
+    _EDGE_KIND = 'VideoEdge'
+    _NODE_KIND = 'Video'
+
+    @classmethod
+    def suitable(cls, url):
+        return (False
+                if any(ie.suitable(url) for ie in (
+                    TwitchVideosClipsIE,
+                    TwitchVideosCollectionsIE))
+                else super(TwitchVideosIE, cls).suitable(url))
+
+    @staticmethod
+    def _make_variables(channel_name, broadcast_type, sort):
+        return {
+            'channelOwnerLogin': channel_name,
+            'broadcastType': broadcast_type,
+            'videoSort': sort.upper(),
+        }
+
+    @staticmethod
+    def _extract_entry(node):
+        return _make_video_result(node)
+
+    def _real_extract(self, url):
+        channel_name = self._match_id(url)
+        qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
+        filter = qs.get('filter', ['all'])[0]
+        sort = qs.get('sort', ['time'])[0]
+        broadcast = self._BROADCASTS.get(filter, self._DEFAULT_BROADCAST)
+        return self.playlist_result(
+            self._entries(channel_name, broadcast.type, sort),
+            playlist_id=channel_name,
+            playlist_title='%s - %s sorted by %s'
+            % (channel_name, broadcast.label,
+               self._SORTED_BY.get(sort, self._DEFAULT_SORTED_BY)))
  
-class TwitchPastBroadcastsIE(TwitchVideosBaseIE):
-    IE_NAME = 'twitch:videos:past-broadcasts'
-    _VALID_URL = r'%s/past-broadcasts' % TwitchVideosBaseIE._VALID_URL_VIDEOS_BASE
-    _PLAYLIST_PATH = TwitchVideosBaseIE._PLAYLIST_PATH + 'archive'
-    _PLAYLIST_TYPE = 'past broadcasts'
+
+class TwitchVideosClipsIE(TwitchPlaylistBaseIE):
+    _VALID_URL = r'https?://(?:(?:www|go|m)\.)?twitch\.tv/(?P<id>[^/]+)/(?:clips|videos/*?\?.*?\bfilter=clips)'
  
      _TESTS = [{
-        'url': 'https://www.twitch.tv/spamfish/videos/past-broadcasts',
+        # Clips
+        'url': 'https://www.twitch.tv/vanillatv/clips?filter=clips&range=all',
          'info_dict': {
-            'id': 'spamfish',
-            'title': 'Spamfish',
+            'id': 'vanillatv',
+            'title': 'vanillatv - Clips Top All',
          },
-        'playlist_mincount': 0,
+        'playlist_mincount': 1,
      }, {
-        'url': 'https://m.twitch.tv/spamfish/videos/past-broadcasts',
+        'url': 'https://www.twitch.tv/dota2ruhub/videos?filter=clips&range=7d',
          'only_matching': True,
      }]
  
+    Clip = collections.namedtuple('Clip', ['filter', 'label'])
  
-class TwitchHighlightsIE(TwitchVideosBaseIE):
-    IE_NAME = 'twitch:videos:highlights'
-    _VALID_URL = r'%s/highlights' % TwitchVideosBaseIE._VALID_URL_VIDEOS_BASE
-    _PLAYLIST_PATH = TwitchVideosBaseIE._PLAYLIST_PATH + 'highlight'
-    _PLAYLIST_TYPE = 'highlights'
+    _DEFAULT_CLIP = Clip('LAST_WEEK', 'Top 7D')
+    _RANGE = {
+        '24hr': Clip('LAST_DAY', 'Top 24H'),
+        '7d': _DEFAULT_CLIP,
+        '30d': Clip('LAST_MONTH', 'Top 30D'),
+        'all': Clip('ALL_TIME', 'Top All'),
+    }
+
+    # NB: values other than 20 result in skipped videos
+    _PAGE_LIMIT = 20
+
+    _SHA256_HASH = 'b73ad2bfaecfd30a9e6c28fada15bd97032c83ec77a0440766a56fe0bd632777'
+    _OPERATION_NAME = 'ClipsCards__User'
+    _ENTRY_KIND = 'clip'
+    _EDGE_KIND = 'ClipEdge'
+    _NODE_KIND = 'Clip'
+
+    @staticmethod
+    def _make_variables(channel_name, filter):
+        return {
+            'login': channel_name,
+            'criteria': {
+                'filter': filter,
+            },
+        }
+
+    @staticmethod
+    def _extract_entry(node):
+        assert isinstance(node, dict)
+        clip_url = url_or_none(node.get('url'))
+        if not clip_url:
+            return
+        return {
+            '_type': 'url_transparent',
+            'ie_key': TwitchClipsIE.ie_key(),
+            'id': node.get('id'),
+            'url': clip_url,
+            'title': node.get('title'),
+            'thumbnail': node.get('thumbnailURL'),
+            'duration': float_or_none(node.get('durationSeconds')),
+            'timestamp': unified_timestamp(node.get('createdAt')),
+            'view_count': int_or_none(node.get('viewCount')),
+            'language': node.get('language'),
+        }
+
+    def _real_extract(self, url):
+        channel_name = self._match_id(url)
+        qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
+        range = qs.get('range', ['7d'])[0]
+        clip = self._RANGE.get(range, self._DEFAULT_CLIP)
+        return self.playlist_result(
+            self._entries(channel_name, clip.filter),
+            playlist_id=channel_name,
+            playlist_title='%s - Clips %s' % (channel_name, clip.label))
+
+
+class TwitchVideosCollectionsIE(TwitchPlaylistBaseIE):
+    _VALID_URL = r'https?://(?:(?:www|go|m)\.)?twitch\.tv/(?P<id>[^/]+)/videos/*?\?.*?\bfilter=collections'
  
      _TESTS = [{
-        'url': 'https://www.twitch.tv/spamfish/videos/highlights',
+        # Collections
+        'url': 'https://www.twitch.tv/spamfish/videos?filter=collections',
          'info_dict': {
              'id': 'spamfish',
-            'title': 'Spamfish',
+            'title': 'spamfish - Collections',
          },
-        'playlist_mincount': 805,
-    }, {
-        'url': 'https://m.twitch.tv/spamfish/videos/highlights',
-        'only_matching': True,
+        'playlist_mincount': 3,
      }]
  
+    _SHA256_HASH = '07e3691a1bad77a36aba590c351180439a40baefc1c275356f40fc7082419a84'
+    _OPERATION_NAME = 'ChannelCollectionsContent'
+    _ENTRY_KIND = 'collection'
+    _EDGE_KIND = 'CollectionsItemEdge'
+    _NODE_KIND = 'Collection'
+
+    @staticmethod
+    def _make_variables(channel_name):
+        return {
+            'ownerLogin': channel_name,
+        }
+
+    @staticmethod
+    def _extract_entry(node):
+        assert isinstance(node, dict)
+        collection_id = node.get('id')
+        if not collection_id:
+            return
+        return {
+            '_type': 'url_transparent',
+            'ie_key': TwitchCollectionIE.ie_key(),
+            'id': collection_id,
+            'url': 'https://www.twitch.tv/collections/%s' % collection_id,
+            'title': node.get('title'),
+            'thumbnail': node.get('thumbnailURL'),
+            'duration': float_or_none(node.get('lengthSeconds')),
+            'timestamp': unified_timestamp(node.get('updatedAt')),
+            'view_count': int_or_none(node.get('viewCount')),
+        }
+
+    def _real_extract(self, url):
+        channel_name = self._match_id(url)
+        return self.playlist_result(
+            self._entries(channel_name), playlist_id=channel_name,
+            playlist_title='%s - Collections' % channel_name)
+
  
  class TwitchStreamIE(TwitchBaseIE):
      IE_NAME = 'twitch:stream'
@@ -560,23 +765,25 @@ class TwitchStreamIE(TwitchBaseIE):
      def suitable(cls, url):
          return (False
                  if any(ie.suitable(url) for ie in (
-                    TwitchVideoIE,
-                    TwitchChapterIE,
                      TwitchVodIE,
-                    TwitchProfileIE,
-                    TwitchAllVideosIE,
-                    TwitchUploadsIE,
-                    TwitchPastBroadcastsIE,
-                    TwitchHighlightsIE,
+                    TwitchCollectionIE,
+                    TwitchVideosIE,
+                    TwitchVideosClipsIE,
+                    TwitchVideosCollectionsIE,
                      TwitchClipsIE))
                  else super(TwitchStreamIE, cls).suitable(url))
  
      def _real_extract(self, url):
-        channel_id = self._match_id(url)
+        channel_name = self._match_id(url)
+
+        access_token = self._download_access_token(channel_name)
+
+        token = access_token['token']
+        channel_id = self._extract_channel_id(token, channel_name)
  
          stream = self._call_api(
-            'kraken/streams/%s?stream_type=all' % channel_id, channel_id,
-            'Downloading stream JSON').get('stream')
+            'kraken/streams/%s?stream_type=all' % channel_id,
+            channel_id, 'Downloading stream JSON').get('stream')
  
          if not stream:
              raise ExtractorError('%s is offline' % channel_id, expected=True)
@@ -585,11 +792,9 @@ def _real_extract(self, url):
          # (e.g. http://www.twitch.tv/TWITCHPLAYSPOKEMON) that will lead to constructing
          # an invalid m3u8 URL. Working around by use of original channel name from stream
          # JSON and fallback to lowercase if it's not available.
-        channel_id = stream.get('channel', {}).get('name') or channel_id.lower()
-
-        access_token = self._call_api(
-            'api/channels/%s/access_token' % channel_id, channel_id,
-            'Downloading channel access token')
+        channel_name = try_get(
+            stream, lambda x: x['channel']['name'],
+            compat_str) or channel_name.lower()
  
          query = {
              'allow_source': 'true',
@@ -600,11 +805,11 @@ def _real_extract(self, url):
              'playlist_include_framerate': 'true',
              'segment_preference': '4',
              'sig': access_token['sig'].encode('utf-8'),
-            'token': access_token['token'].encode('utf-8'),
+            'token': token.encode('utf-8'),
          }
          formats = self._extract_m3u8_formats(
              '%s/api/channel/hls/%s.m3u8?%s'
-            % (self._USHER_BASE, channel_id, compat_urllib_parse_urlencode(query)),
+            % (self._USHER_BASE, channel_name, compat_urllib_parse_urlencode(query)),
              channel_id, 'mp4')
          self._prefer_source(formats)
  
@@ -627,8 +832,8 @@ def _real_extract(self, url):
              })
  
          return {
-            'id': compat_str(stream['_id']),
-            'display_id': channel_id,
+            'id': str_or_none(stream.get('_id')) or channel_id,
+            'display_id': channel_name,
              'title': title,
              'description': description,
              'thumbnails': thumbnails,
@@ -643,7 +848,14 @@ def _real_extract(self, url):
  
  class TwitchClipsIE(TwitchBaseIE):
      IE_NAME = 'twitch:clips'
-    _VALID_URL = r'https?://(?:clips\.twitch\.tv/(?:embed\?.*?\bclip=|(?:[^/]+/)*)|(?:www\.)?twitch\.tv/[^/]+/clip/)(?P<id>[^/?#&]+)'
+    _VALID_URL = r'''(?x)
+                    https?://
+                        (?:
+                            clips\.twitch\.tv/(?:embed\?.*?\bclip=|(?:[^/]+/)*)|
+                            (?:(?:www|go|m)\.)?twitch\.tv/[^/]+/clip/
+                        )
+                        (?P<id>[^/?#&]+)
+                    '''
  
      _TESTS = [{
          'url': 'https://clips.twitch.tv/FaintLightGullWholeWheat',
@@ -669,6 +881,12 @@ class TwitchClipsIE(TwitchBaseIE):
      }, {
          'url': 'https://clips.twitch.tv/embed?clip=InquisitiveBreakableYogurtJebaited',
          'only_matching': True,
+    }, {
+        'url': 'https://m.twitch.tv/rossbroadcast/clip/ConfidentBraveHumanChefFrank',
+        'only_matching': True,
+    }, {
+        'url': 'https://go.twitch.tv/rossbroadcast/clip/ConfidentBraveHumanChefFrank',
+        'only_matching': True,
      }]
  
      def _real_extract(self, url):
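Reviewer note on the twitch.py changes above: the new `TwitchPlaylistBaseIE._entries` replaces offset paging with cursor-based paging over a GraphQL persisted query. The sketch below is a minimal standalone illustration of that pattern, not the extractor's code. `build_gql_payload`, `iter_entries`, and the `fetch_page` callback are hypothetical names, and the page shape (a list of edges carrying `cursor` and `node`) is simplified from what the diff reads out of `data.user.<kind>s.edges`. Unlike the diff, the sketch resets the cursor on every page, so a page with no usable edges ends the loop.

```python
import json


def build_gql_payload(op, variables, sha256_hash):
    """Serialize a persisted-query request body shaped like the one
    built in _download_gql (operation name, variables, query hash)."""
    return json.dumps({
        'operationName': op,
        'variables': variables,
        'extensions': {
            'persistedQuery': {
                'version': 1,
                'sha256Hash': sha256_hash,
            },
        },
    }).encode('utf-8')


def iter_entries(fetch_page, base_variables, page_limit=100):
    """Yield nodes page by page, following the cursor of the last usable edge.

    fetch_page is a hypothetical callback: it receives the variables dict
    for one request and returns a list of edge dicts, or None/[] when
    there is nothing left to fetch.
    """
    cursor = None
    while True:
        variables = dict(base_variables, limit=page_limit)
        if cursor:
            variables['cursor'] = cursor
        edges = fetch_page(variables)
        if not edges:
            break
        next_cursor = None  # assumption: reset per page (the diff does not)
        for edge in edges:
            if not isinstance(edge, dict):
                continue
            node = edge.get('node')
            if not isinstance(node, dict):
                continue
            next_cursor = edge.get('cursor')
            yield node
        # Stop when the last usable edge carries no string cursor.
        if not next_cursor or not isinstance(next_cursor, str):
            break
        cursor = next_cursor
```

In the extractor the equivalent of `fetch_page` would POST the payload to the GraphQL endpoint and dig the edge list out of the response; here it is left as a callback so the paging logic can be exercised without network access.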
diff --git a/youtube_dl/extractor/twitter.py b/youtube_dlc/extractor/twitter.py
similarity index 96%
rename from youtube_dl/extractor/twitter.py
rename to youtube_dlc/extractor/twitter.py
index 5f8d90fb4e5c13d19cd0601f86a44eec5568fadc..4284487db4994b25990c4151afca71c48a271751 100644
--- a/youtube_dl/extractor/twitter.py
+++ b/youtube_dlc/extractor/twitter.py
@@ -251,10 +251,10 @@ class TwitterIE(TwitterBaseIE):
          'info_dict': {
              'id': '700207533655363584',
              'ext': 'mp4',
-            'title': 'Simon Vertugo - BEAT PROD: @suhmeduh #Damndaniel',
+            'title': 'simon vetugo - BEAT PROD: @suhmeduh #Damndaniel',
              'description': 'BEAT PROD: @suhmeduh  https://t.co/HBrQ4AfpvZ #Damndaniel https://t.co/byBooq2ejZ',
              'thumbnail': r're:^https?://.*\.jpg',
-            'uploader': 'Simon Vertugo',
+            'uploader': 'simon vetugo',
              'uploader_id': 'simonvertugo',
              'duration': 30.0,
              'timestamp': 1455777459,
@@ -376,6 +376,10 @@ class TwitterIE(TwitterBaseIE):
          # Twitch Clip Embed
          'url': 'https://twitter.com/GunB1g/status/1163218564784017422',
          'only_matching': True,
+    }, {
+        # promo_video_website card
+        'url': 'https://twitter.com/GunB1g/status/1163218564784017422',
+        'only_matching': True,
      }]
  
      def _real_extract(self, url):
@@ -458,10 +462,11 @@ def get_binding_value(k):
                      return try_get(o, lambda x: x[x['type'].lower() + '_value'])
  
                  card_name = card['name'].split(':')[-1]
-                if card_name == 'amplify':
-                    formats = self._extract_formats_from_vmap_url(
-                        get_binding_value('amplify_url_vmap'),
-                        get_binding_value('amplify_content_id') or twid)
+                if card_name in ('amplify', 'promo_video_website'):
+                    is_amplify = card_name == 'amplify'
+                    vmap_url = get_binding_value('amplify_url_vmap') if is_amplify else get_binding_value('player_stream_url')
+                    content_id = get_binding_value('%s_content_id' % (card_name if is_amplify else 'player'))
+                    formats = self._extract_formats_from_vmap_url(vmap_url, content_id or twid)
                      self._sort_formats(formats)
  
                      thumbnails = []
@@ -573,6 +578,18 @@ class TwitterBroadcastIE(TwitterBaseIE, PeriscopeBaseIE):
      IE_NAME = 'twitter:broadcast'
      _VALID_URL = TwitterBaseIE._BASE_REGEX + r'i/broadcasts/(?P<id>[0-9a-zA-Z]{13})'
  
+    _TEST = {
+        # untitled Periscope video
+        'url': 'https://twitter.com/i/broadcasts/1yNGaQLWpejGj',
+        'info_dict': {
+            'id': '1yNGaQLWpejGj',
+            'ext': 'mp4',
+            'title': 'Andrea May Sahouri - Periscope Broadcast',
+            'uploader': 'Andrea May Sahouri',
+            'uploader_id': '1PXEdBZWpGwKe',
+        },
+    }
+
      def _real_extract(self, url):
          broadcast_id = self._match_id(url)
          broadcast = self._call_api(
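Note on the TwitterIE card hunk in twitter.py: the new branch reads the VMAP URL and the content id from different binding keys depending on the card name. The selection can be sketched in isolation as below. `pick_vmap_bindings` is a hypothetical helper written for illustration only; it is not part of the extractor, and it only mirrors the key names visible in the hunk.

```python
def pick_vmap_bindings(card_name):
    """Return the (url_key, content_id_key) pair the hunk would look up.

    Hypothetical helper: mirrors the key selection in the diff, nothing more.
    """
    is_amplify = card_name == 'amplify'
    # 'amplify' cards keep their original keys; 'promo_video_website'
    # cards use the player_* keys instead.
    url_key = 'amplify_url_vmap' if is_amplify else 'player_stream_url'
    content_id_key = '%s_content_id' % (card_name if is_amplify else 'player')
    return url_key, content_id_key


amplify_keys = pick_vmap_bindings('amplify')
promo_keys = pick_vmap_bindings('promo_video_website')
```

In the extractor itself the looked-up content id still falls back to the tweet id (`content_id or twid`) when the binding is missing.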
diff --git a/youtube_dl/extractor/udemy.py b/youtube_dlc/extractor/udemy.py
similarity index 99%
rename from youtube_dl/extractor/udemy.py
rename to youtube_dlc/extractor/udemy.py
index 2a4faecefbc271f1f88dfe38d2fe284e0bbd9158..60e364d301feae058ae2eda4a7b46f7103ec3439 100644
--- a/youtube_dl/extractor/udemy.py
+++ b/youtube_dlc/extractor/udemy.py
@@ -143,7 +143,7 @@ def _download_webpage_handle(self, *args, **kwargs):
              raise ExtractorError(
                  'Udemy asks you to solve a CAPTCHA. Login with browser, '
                  'solve CAPTCHA, then export cookies and pass cookie file to '
-                'youtube-dl with --cookies.', expected=True)
+                'youtube-dlc with --cookies.', expected=True)
          return ret
  
      def _download_json(self, url_or_request, *args, **kwargs):
diff --git a/youtube_dl/extractor/udn.py b/youtube_dlc/extractor/udn.py
similarity index 100%
rename from youtube_dl/extractor/udn.py
rename to youtube_dlc/extractor/udn.py
diff --git a/youtube_dl/extractor/ufctv.py b/youtube_dlc/extractor/ufctv.py
similarity index 100%
rename from youtube_dl/extractor/ufctv.py
rename to youtube_dlc/extractor/ufctv.py
diff --git a/youtube_dl/extractor/uktvplay.py b/youtube_dlc/extractor/uktvplay.py
similarity index 100%
rename from youtube_dl/extractor/uktvplay.py
rename to youtube_dlc/extractor/uktvplay.py
diff --git a/youtube_dl/extractor/umg.py b/youtube_dlc/extractor/umg.py
similarity index 100%
rename from youtube_dl/extractor/umg.py
rename to youtube_dlc/extractor/umg.py
diff --git a/youtube_dl/extractor/unistra.py b/youtube_dlc/extractor/unistra.py
similarity index 100%
rename from youtube_dl/extractor/unistra.py
rename to youtube_dlc/extractor/unistra.py
diff --git a/youtube_dl/extractor/unity.py b/youtube_dlc/extractor/unity.py
similarity index 100%
rename from youtube_dl/extractor/unity.py
rename to youtube_dlc/extractor/unity.py
diff --git a/youtube_dl/extractor/uol.py b/youtube_dlc/extractor/uol.py
similarity index 54%
rename from youtube_dl/extractor/uol.py
rename to youtube_dlc/extractor/uol.py
index 08f0c072e28b09dfbbde2f662b52f5de4cf46f3d..628adf2199ca7a1d7ebbc9ffece05e2447f59133 100644
--- a/youtube_dl/extractor/uol.py
+++ b/youtube_dlc/extractor/uol.py
@@ -2,12 +2,17 @@
  from __future__ import unicode_literals
  
  from .common import InfoExtractor
+from ..compat import (
+    compat_str,
+    compat_urllib_parse_urlencode,
+)
  from ..utils import (
      clean_html,
      int_or_none,
      parse_duration,
+    parse_iso8601,
+    qualities,
      update_url_query,
-    str_or_none,
  )
  
  
@@ -16,21 +21,25 @@ class UOLIE(InfoExtractor):
      _VALID_URL = r'https?://(?:.+?\.)?uol\.com\.br/.*?(?:(?:mediaId|v)=|view/(?:[a-z0-9]+/)?|video(?:=|/(?:\d{4}/\d{2}/\d{2}/)?))(?P<id>\d+|[\w-]+-[A-Z0-9]+)'
      _TESTS = [{
          'url': 'http://player.mais.uol.com.br/player_video_v3.swf?mediaId=15951931',
-        'md5': '25291da27dc45e0afb5718a8603d3816',
+        'md5': '4f1e26683979715ff64e4e29099cf020',
          'info_dict': {
              'id': '15951931',
              'ext': 'mp4',
              'title': 'Miss simpatia é encontrada morta',
              'description': 'md5:3f8c11a0c0556d66daf7e5b45ef823b2',
+            'timestamp': 1470421860,
+            'upload_date': '20160805',
          }
      }, {
          'url': 'http://tvuol.uol.com.br/video/incendio-destroi-uma-das-maiores-casas-noturnas-de-londres-04024E9A3268D4C95326',
-        'md5': 'e41a2fb7b7398a3a46b6af37b15c00c9',
+        'md5': '2850a0e8dfa0a7307e04a96c5bdc5bc2',
          'info_dict': {
              'id': '15954259',
              'ext': 'mp4',
              'title': 'Incêndio destrói uma das maiores casas noturnas de Londres',
              'description': 'Em Londres, um incêndio destruiu uma das maiores boates da cidade. Não há informações sobre vítimas.',
+            'timestamp': 1470674520,
+            'upload_date': '20160808',
          }
      }, {
          'url': 'http://mais.uol.com.br/static/uolplayer/index.html?mediaId=15951931',
@@ -55,91 +64,55 @@ class UOLIE(InfoExtractor):
          'only_matching': True,
      }]
  
-    _FORMATS = {
-        '2': {
-            'width': 640,
-            'height': 360,
-        },
-        '5': {
-            'width': 1280,
-            'height': 720,
-        },
-        '6': {
-            'width': 426,
-            'height': 240,
-        },
-        '7': {
-            'width': 1920,
-            'height': 1080,
-        },
-        '8': {
-            'width': 192,
-            'height': 144,
-        },
-        '9': {
-            'width': 568,
-            'height': 320,
-        },
-        '11': {
-            'width': 640,
-            'height': 360,
-        }
-    }
-
      def _real_extract(self, url):
          video_id = self._match_id(url)
-        media_id = None
-
-        if video_id.isdigit():
-            media_id = video_id
-
-        if not media_id:
-            embed_page = self._download_webpage(
-                'https://jsuol.com.br/c/tv/uol/embed/?params=[embed,%s]' % video_id,
-                video_id, 'Downloading embed page', fatal=False)
-            if embed_page:
-                media_id = self._search_regex(
-                    (r'uol\.com\.br/(\d+)', r'mediaId=(\d+)'),
-                    embed_page, 'media id', default=None)
-
-        if not media_id:
-            webpage = self._download_webpage(url, video_id)
-            media_id = self._search_regex(r'mediaId=(\d+)', webpage, 'media id')
  
          video_data = self._download_json(
-            'http://mais.uol.com.br/apiuol/v3/player/getMedia/%s.json' % media_id,
-            media_id)['item']
+            # https://api.mais.uol.com.br/apiuol/v4/player/data/[MEDIA_ID]
+            'https://api.mais.uol.com.br/apiuol/v3/media/detail/' + video_id,
+            video_id)['item']
+        media_id = compat_str(video_data['mediaId'])
          title = video_data['title']
+        ver = video_data.get('revision', 2)
  
-        query = {
-            'ver': video_data.get('numRevision', 2),
-            'r': 'http://mais.uol.com.br',
-        }
-        for k in ('token', 'sign'):
-            v = video_data.get(k)
-            if v:
-                query[k] = v
-
+        uol_formats = self._download_json(
+            'https://croupier.mais.uol.com.br/v3/formats/%s/jsonp' % media_id,
+            media_id)
+        quality = qualities(['mobile', 'WEBM', '360p', '720p', '1080p'])
          formats = []
-        for f in video_data.get('formats', []):
+        for format_id, f in uol_formats.items():
+            if not isinstance(f, dict):
+                continue
              f_url = f.get('url') or f.get('secureUrl')
              if not f_url:
                  continue
+            query = {
+                'ver': ver,
+                'r': 'http://mais.uol.com.br',
+            }
+            for k in ('token', 'sign'):
+                v = f.get(k)
+                if v:
+                    query[k] = v
              f_url = update_url_query(f_url, query)
-            format_id = str_or_none(f.get('id'))
-            if format_id == '10':
-                formats.extend(self._extract_m3u8_formats(
-                    f_url, video_id, 'mp4', 'm3u8_native',
-                    m3u8_id='hls', fatal=False))
+            format_id = format_id
+            if format_id == 'HLS':
+                m3u8_formats = self._extract_m3u8_formats(
+                    f_url, media_id, 'mp4', 'm3u8_native',
+                    m3u8_id='hls', fatal=False)
+                encoded_query = compat_urllib_parse_urlencode(query)
+                for m3u8_f in m3u8_formats:
+                    m3u8_f['extra_param_to_segment_url'] = encoded_query
+                    m3u8_f['url'] = update_url_query(m3u8_f['url'], query)
+                formats.extend(m3u8_formats)
                  continue
-            fmt = {
+            formats.append({
                  'format_id': format_id,
                  'url': f_url,
-                'source_preference': 1,
-            }
-            fmt.update(self._FORMATS.get(format_id, {}))
-            formats.append(fmt)
-        self._sort_formats(formats, ('height', 'width', 'source_preference', 'tbr', 'ext'))
+                'quality': quality(format_id),
+                'preference': -1,
+            })
+        self._sort_formats(formats)
  
          tags = []
          for tag in video_data.get('tags', []):
@@ -148,12 +121,24 @@ def _real_extract(self, url):
                  continue
              tags.append(tag_description)
  
+        thumbnails = []
+        for q in ('Small', 'Medium', 'Wmedium', 'Large', 'Wlarge', 'Xlarge'):
+            q_url = video_data.get('thumb' + q)
+            if not q_url:
+                continue
+            thumbnails.append({
+                'id': q,
+                'url': q_url,
+            })
+
          return {
              'id': media_id,
              'title': title,
-            'description': clean_html(video_data.get('desMedia')),
-            'thumbnail': video_data.get('thumbnail'),
-            'duration': int_or_none(video_data.get('durationSeconds')) or parse_duration(video_data.get('duration')),
+            'description': clean_html(video_data.get('description')),
+            'thumbnails': thumbnails,
+            'duration': parse_duration(video_data.get('duration')),
              'tags': tags,
              'formats': formats,
+            'timestamp': parse_iso8601(video_data.get('publishDate'), ' '),
+            'view_count': int_or_none(video_data.get('viewsQtty')),
          }
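Note on the uol.py hunks: the rewritten loop ranks each non-HLS format by its position in a fixed quality list and attaches the `ver`/`r` query, plus `token` and `sign` when present, to every format URL. The sketch below reproduces that flow outside the extractor. `qualities` and `add_query` are simplified stand-ins written for this note (assumed to behave like the `qualities` and `update_url_query` helpers imported in the diff), the sample response shape is hypothetical, and the HLS branch is omitted.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def qualities(quality_ids):
    """Stand-in ranking helper: rank is the index in the list, -1 if unknown."""
    def q(qid):
        try:
            return quality_ids.index(qid)
        except ValueError:
            return -1
    return q


def add_query(url, params):
    """Simplified stand-in for update_url_query: merge params into the query string."""
    parts = urlparse(url)
    merged = dict(parse_qsl(parts.query))
    merged.update({k: str(v) for k, v in params.items()})
    return urlunparse(parts._replace(query=urlencode(merged)))


def build_formats(uol_formats, ver=2):
    """Mirror the non-HLS part of the loop in the diff."""
    quality = qualities(['mobile', 'WEBM', '360p', '720p', '1080p'])
    formats = []
    for format_id, f in uol_formats.items():
        if not isinstance(f, dict):
            continue  # skip non-format entries in the response
        f_url = f.get('url') or f.get('secureUrl')
        if not f_url:
            continue
        query = {'ver': ver, 'r': 'http://mais.uol.com.br'}
        for k in ('token', 'sign'):
            if f.get(k):
                query[k] = f[k]
        formats.append({
            'format_id': format_id,
            'url': add_query(f_url, query),
            'quality': quality(format_id),
        })
    return formats


# Hypothetical response shape, for illustration only.
sample = {
    '360p': {'url': 'https://example.invalid/a.mp4', 'token': 't1', 'sign': 's1'},
    '720p': {'secureUrl': 'https://example.invalid/b.mp4'},
    'status': 'ok',
}
formats = build_formats(sample)
```

Building the query per format, rather than once from `video_data` as the removed code did, matters because `token` and `sign` now come from each format entry.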
diff --git a/youtube_dl/extractor/uplynk.py b/youtube_dlc/extractor/uplynk.py
similarity index 100%
rename from youtube_dl/extractor/uplynk.py
rename to youtube_dlc/extractor/uplynk.py
diff --git a/youtube_dl/extractor/urort.py b/youtube_dlc/extractor/urort.py
similarity index 100%
rename from youtube_dl/extractor/urort.py
rename to youtube_dlc/extractor/urort.py
diff --git a/youtube_dl/extractor/urplay.py b/youtube_dlc/extractor/urplay.py
similarity index 100%
rename from youtube_dl/extractor/urplay.py
rename to youtube_dlc/extractor/urplay.py
diff --git a/youtube_dl/extractor/usanetwork.py b/youtube_dlc/extractor/usanetwork.py
similarity index 100%
rename from youtube_dl/extractor/usanetwork.py
rename to youtube_dlc/extractor/usanetwork.py
diff --git a/youtube_dl/extractor/usatoday.py b/youtube_dlc/extractor/usatoday.py
similarity index 100%
rename from youtube_dl/extractor/usatoday.py
rename to youtube_dlc/extractor/usatoday.py
diff --git a/youtube_dl/extractor/ustream.py b/youtube_dlc/extractor/ustream.py
similarity index 100%
rename from youtube_dl/extractor/ustream.py
rename to youtube_dlc/extractor/ustream.py
diff --git a/youtube_dl/extractor/ustudio.py b/youtube_dlc/extractor/ustudio.py
similarity index 100%
rename from youtube_dl/extractor/ustudio.py
rename to youtube_dlc/extractor/ustudio.py
diff --git a/youtube_dl/extractor/varzesh3.py b/youtube_dlc/extractor/varzesh3.py
similarity index 100%
rename from youtube_dl/extractor/varzesh3.py
rename to youtube_dlc/extractor/varzesh3.py
diff --git a/youtube_dl/extractor/vbox7.py b/youtube_dlc/extractor/vbox7.py
similarity index 100%
rename from youtube_dl/extractor/vbox7.py
rename to youtube_dlc/extractor/vbox7.py
diff --git a/youtube_dl/extractor/veehd.py b/youtube_dlc/extractor/veehd.py
similarity index 100%
rename from youtube_dl/extractor/veehd.py
rename to youtube_dlc/extractor/veehd.py
diff --git a/youtube_dl/extractor/veoh.py b/youtube_dlc/extractor/veoh.py
similarity index 100%
rename from youtube_dl/extractor/veoh.py
rename to youtube_dlc/extractor/veoh.py
diff --git a/youtube_dl/extractor/vesti.py b/youtube_dlc/extractor/vesti.py
similarity index 100%
rename from youtube_dl/extractor/vesti.py
rename to youtube_dlc/extractor/vesti.py
diff --git a/youtube_dl/extractor/vevo.py b/youtube_dlc/extractor/vevo.py
similarity index 100%
rename from youtube_dl/extractor/vevo.py
rename to youtube_dlc/extractor/vevo.py
diff --git a/youtube_dl/extractor/vgtv.py b/youtube_dlc/extractor/vgtv.py
similarity index 100%
rename from youtube_dl/extractor/vgtv.py
rename to youtube_dlc/extractor/vgtv.py
diff --git a/youtube_dl/extractor/vh1.py b/youtube_dlc/extractor/vh1.py
similarity index 100%
rename from youtube_dl/extractor/vh1.py
rename to youtube_dlc/extractor/vh1.py
diff --git a/youtube_dl/extractor/vice.py b/youtube_dlc/extractor/vice.py
similarity index 63%
rename from youtube_dl/extractor/vice.py
rename to youtube_dlc/extractor/vice.py
index 8fdfd743d04a6ee4319a64337ea67398d2243896..e37499512856234d9b989aec53dfd59c42647300 100644
--- a/youtube_dl/extractor/vice.py
+++ b/youtube_dlc/extractor/vice.py
@@ -1,35 +1,50 @@
  # coding: utf-8
  from __future__ import unicode_literals
  
-import re
-import time
+import functools
  import hashlib
  import json
  import random
+import re
+import time
  
  from .adobepass import AdobePassIE
-from .youtube import YoutubeIE
  from .common import InfoExtractor
+from .youtube import YoutubeIE
  from ..compat import (
      compat_HTTPError,
      compat_str,
  )
  from ..utils import (
+    clean_html,
      ExtractorError,
      int_or_none,
+    OnDemandPagedList,
      parse_age_limit,
      str_or_none,
      try_get,
  )
  
  
-class ViceIE(AdobePassIE):
+class ViceBaseIE(InfoExtractor):
+    def _call_api(self, resource, resource_key, resource_id, locale, fields, args=''):
+        return self._download_json(
+            'https://video.vice.com/api/v1/graphql', resource_id, query={
+                'query': '''{
+  %s(locale: "%s", %s: "%s"%s) {
+    %s
+  }
+}''' % (resource, locale, resource_key, resource_id, args, fields),
+            })['data'][resource]
+
+
+class ViceIE(ViceBaseIE, AdobePassIE):
      IE_NAME = 'vice'
-    _VALID_URL = r'https?://(?:(?:video|vms)\.vice|(?:www\.)?viceland)\.com/(?P<locale>[^/]+)/(?:video/[^/]+|embed)/(?P<id>[\da-f]+)'
+    _VALID_URL = r'https?://(?:(?:video|vms)\.vice|(?:www\.)?vice(?:land|tv))\.com/(?P<locale>[^/]+)/(?:video/[^/]+|embed)/(?P<id>[\da-f]{24})'
      _TESTS = [{
          'url': 'https://video.vice.com/en_us/video/pet-cremator/58c69e38a55424f1227dc3f7',
          'info_dict': {
-            'id': '5e647f0125e145c9aef2069412c0cbde',
+            'id': '58c69e38a55424f1227dc3f7',
              'ext': 'mp4',
              'title': '10 Questions You Always Wanted To Ask: Pet Cremator',
              'description': 'md5:fe856caacf61fe0e74fab15ce2b07ca5',
@@ -43,17 +58,16 @@ class ViceIE(AdobePassIE):
              # m3u8 download
              'skip_download': True,
          },
-        'add_ie': ['UplynkPreplay'],
      }, {
          # geo restricted to US
          'url': 'https://video.vice.com/en_us/video/the-signal-from-tolva/5816510690b70e6c5fd39a56',
          'info_dict': {
-            'id': '930c0ad1f47141cc955087eecaddb0e2',
+            'id': '5816510690b70e6c5fd39a56',
              'ext': 'mp4',
-            'uploader': 'waypoint',
+            'uploader': 'vice',
              'title': 'The Signal From Tölva',
              'description': 'md5:3927e3c79f9e8094606a2b3c5b5e55d5',
-            'uploader_id': '57f7d621e05ca860fa9ccaf9',
+            'uploader_id': '57a204088cb727dec794c67b',
              'timestamp': 1477941983,
              'upload_date': '20161031',
          },
@@ -61,15 +75,14 @@ class ViceIE(AdobePassIE):
              # m3u8 download
              'skip_download': True,
          },
-        'add_ie': ['UplynkPreplay'],
      }, {
          'url': 'https://video.vice.com/alps/video/ulfs-wien-beruchtigste-grafitti-crew-part-1/581b12b60a0e1f4c0fb6ea2f',
          'info_dict': {
              'id': '581b12b60a0e1f4c0fb6ea2f',
              'ext': 'mp4',
              'title': 'ULFs - Wien berüchtigste Grafitti Crew - Part 1',
-            'description': '<p>Zwischen Hinterzimmer-Tattoos und U-Bahnschächten erzählen uns die Ulfs, wie es ist, "süchtig nach Sachbeschädigung" zu sein.</p>',
-            'uploader': 'VICE',
+            'description': 'Zwischen Hinterzimmer-Tattoos und U-Bahnschächten erzählen uns die Ulfs, wie es ist, "süchtig nach Sachbeschädigung" zu sein.',
+            'uploader': 'vice',
              'uploader_id': '57a204088cb727dec794c67b',
              'timestamp': 1485368119,
              'upload_date': '20170125',
@@ -78,9 +91,7 @@ class ViceIE(AdobePassIE):
          'params': {
              # AES-encrypted m3u8
              'skip_download': True,
-            'proxy': '127.0.0.1:8118',
          },
-        'add_ie': ['UplynkPreplay'],
      }, {
          'url': 'https://video.vice.com/en_us/video/pizza-show-trailer/56d8c9a54d286ed92f7f30e4',
          'only_matching': True,
@@ -98,7 +109,7 @@ class ViceIE(AdobePassIE):
      @staticmethod
      def _extract_urls(webpage):
          return re.findall(
-            r'<iframe\b[^>]+\bsrc=["\']((?:https?:)?//video\.vice\.com/[^/]+/embed/[\da-f]+)',
+            r'<iframe\b[^>]+\bsrc=["\']((?:https?:)?//video\.vice\.com/[^/]+/embed/[\da-f]{24})',
              webpage)
  
      @staticmethod
@@ -109,31 +120,16 @@ def _extract_url(webpage):
      def _real_extract(self, url):
          locale, video_id = re.match(self._VALID_URL, url).groups()
  
-        webpage = self._download_webpage(
-            'https://video.vice.com/%s/embed/%s' % (locale, video_id),
-            video_id)
-
-        video = self._parse_json(
-            self._search_regex(
-                r'PREFETCH_DATA\s*=\s*({.+?})\s*;\s*\n', webpage,
-                'app state'), video_id)['video']
-        video_id = video.get('vms_id') or video.get('id') or video_id
-        title = video['title']
-        is_locked = video.get('locked')
+        video = self._call_api('videos', 'id', video_id, locale, '''body
+    locked
+    rating
+    thumbnail_url
+    title''')[0]
+        title = video['title'].strip()
          rating = video.get('rating')
-        thumbnail = video.get('thumbnail_url')
-        duration = int_or_none(video.get('duration'))
-        series = try_get(
-            video, lambda x: x['episode']['season']['show']['title'],
-            compat_str)
-        episode_number = try_get(
-            video, lambda x: x['episode']['episode_number'])
-        season_number = try_get(
-            video, lambda x: x['episode']['season']['season_number'])
-        uploader = None
  
          query = {}
-        if is_locked:
+        if video.get('locked'):
              resource = self._get_mvpd_resource(
                  'VICELAND', title, video_id, rating)
              query['tvetoken'] = self._extract_mvpd_auth(
@@ -148,12 +144,9 @@ def _real_extract(self, url):
          query.update({
              'exp': exp,
              'sign': hashlib.sha512(('%s:GET:%d' % (video_id, exp)).encode()).hexdigest(),
-            '_ad_blocked': None,
-            '_ad_unit': '',
-            '_debug': '',
+            'skipadstitching': 1,
              'platform': 'desktop',
              'rn': random.randint(10000, 100000),
-            'fbprebidtoken': '',
          })
  
          try:
@@ -169,85 +162,94 @@ def _real_extract(self, url):
              raise
  
          video_data = preplay['video']
-        base = video_data['base']
-        uplynk_preplay_url = preplay['preplayURL']
-        episode = video_data.get('episode', {})
-        channel = video_data.get('channel', {})
+        formats = self._extract_m3u8_formats(
+            preplay['playURL'], video_id, 'mp4', 'm3u8_native')
+        self._sort_formats(formats)
+        episode = video_data.get('episode') or {}
+        channel = video_data.get('channel') or {}
+        season = video_data.get('season') or {}
  
          subtitles = {}
-        cc_url = preplay.get('ccURL')
-        if cc_url:
-            subtitles['en'] = [{
+        for subtitle in preplay.get('subtitleURLs', []):
+            cc_url = subtitle.get('url')
+            if not cc_url:
+                continue
+            language_code = try_get(subtitle, lambda x: x['languages'][0]['language_code'], compat_str) or 'en'
+            subtitles.setdefault(language_code, []).append({
                  'url': cc_url,
-            }]
+            })
  
          return {
-            '_type': 'url_transparent',
-            'url': uplynk_preplay_url,
+            'formats': formats,
              'id': video_id,
              'title': title,
-            'description': base.get('body') or base.get('display_body'),
-            'thumbnail': thumbnail,
-            'duration': int_or_none(video_data.get('video_duration')) or duration,
+            'description': clean_html(video.get('body')),
+            'thumbnail': video.get('thumbnail_url'),
+            'duration': int_or_none(video_data.get('video_duration')),
              'timestamp': int_or_none(video_data.get('created_at'), 1000),
-            'age_limit': parse_age_limit(video_data.get('video_rating')),
-            'series': video_data.get('show_title') or series,
-            'episode_number': int_or_none(episode.get('episode_number') or episode_number),
+            'age_limit': parse_age_limit(video_data.get('video_rating') or rating),
+            'series': try_get(video_data, lambda x: x['show']['base']['display_title'], compat_str),
+            'episode_number': int_or_none(episode.get('episode_number')),
              'episode_id': str_or_none(episode.get('id') or video_data.get('episode_id')),
-            'season_number': int_or_none(season_number),
-            'season_id': str_or_none(episode.get('season_id')),
-            'uploader': channel.get('base', {}).get('title') or channel.get('name') or uploader,
+            'season_number': int_or_none(season.get('season_number')),
+            'season_id': str_or_none(season.get('id') or video_data.get('season_id')),
+            'uploader': channel.get('name'),
              'uploader_id': str_or_none(channel.get('id')),
              'subtitles': subtitles,
-            'ie_key': 'UplynkPreplay',
          }
  
  
-class ViceShowIE(InfoExtractor):
+class ViceShowIE(ViceBaseIE):
      IE_NAME = 'vice:show'
-    _VALID_URL = r'https?://(?:.+?\.)?vice\.com/(?:[^/]+/)?show/(?P<id>[^/?#&]+)'
-
-    _TEST = {
-        'url': 'https://munchies.vice.com/en/show/fuck-thats-delicious-2',
+    _VALID_URL = r'https?://(?:video\.vice|(?:www\.)?vice(?:land|tv))\.com/(?P<locale>[^/]+)/show/(?P<id>[^/?#&]+)'
+    _PAGE_SIZE = 25
+    _TESTS = [{
+        'url': 'https://video.vice.com/en_us/show/fck-thats-delicious',
          'info_dict': {
-            'id': 'fuck-thats-delicious-2',
-            'title': "Fuck, That's Delicious",
-            'description': 'Follow the culinary adventures of rapper Action Bronson during his ongoing world tour.',
+            'id': '57a2040c8cb727dec794c901',
+            'title': 'F*ck, That’s Delicious',
+            'description': 'The life and eating habits of rap’s greatest bon vivant, Action Bronson.',
          },
-        'playlist_count': 17,
-    }
+        'playlist_mincount': 64,
+    }, {
+        'url': 'https://www.vicetv.com/en_us/show/fck-thats-delicious',
+        'only_matching': True,
+    }]
  
-    def _real_extract(self, url):
-        show_id = self._match_id(url)
-        webpage = self._download_webpage(url, show_id)
+    def _fetch_page(self, locale, show_id, page):
+        videos = self._call_api('videos', 'show_id', show_id, locale, '''body
+    id
+    url''', ', page: %d, per_page: %d' % (page + 1, self._PAGE_SIZE))
+        for video in videos:
+            yield self.url_result(
+                video['url'], ViceIE.ie_key(), video.get('id'))
  
-        entries = [
-            self.url_result(video_url, ViceIE.ie_key())
-            for video_url, _ in re.findall(
-                r'<h2[^>]+class="article-title"[^>]+data-id="\d+"[^>]*>\s*<a[^>]+href="(%s.*?)"'
-                % ViceIE._VALID_URL, webpage)]
+    def _real_extract(self, url):
+        locale, display_id = re.match(self._VALID_URL, url).groups()
+        show = self._call_api('shows', 'slug', display_id, locale, '''dek
+    id
+    title''')[0]
+        show_id = show['id']
  
-        title = self._search_regex(
-            r'<title>(.+?)</title>', webpage, 'title', default=None)
-        if title:
-            title = re.sub(r'(.+)\s*\|\s*.+$', r'\1', title).strip()
-        description = self._html_search_meta(
-            'description', webpage, 'description')
+        entries = OnDemandPagedList(
+            functools.partial(self._fetch_page, locale, show_id),
+            self._PAGE_SIZE)
  
-        return self.playlist_result(entries, show_id, title, description)
+        return self.playlist_result(
+            entries, show_id, show.get('title'), show.get('dek'))
  
  
-class ViceArticleIE(InfoExtractor):
+class ViceArticleIE(ViceBaseIE):
      IE_NAME = 'vice:article'
-    _VALID_URL = r'https://www\.vice\.com/[^/]+/article/(?P<id>[^?#]+)'
+    _VALID_URL = r'https://(?:www\.)?vice\.com/(?P<locale>[^/]+)/article/(?:[0-9a-z]{6}/)?(?P<id>[^?#]+)'
  
      _TESTS = [{
          'url': 'https://www.vice.com/en_us/article/on-set-with-the-woman-making-mormon-porn-in-utah',
          'info_dict': {
-            'id': '41eae2a47b174a1398357cec55f1f6fc',
+            'id': '58dc0a3dee202d2a0ccfcbd8',
              'ext': 'mp4',
-            'title': 'Mormon War on Porn ',
-            'description': 'md5:6394a8398506581d0346b9ab89093fef',
+            'title': 'Mormon War on Porn',
+            'description': 'md5:1c5d91fe25fa8aa304f9def118b92dbf',
              'uploader': 'vice',
              'uploader_id': '57a204088cb727dec794c67b',
              'timestamp': 1491883129,
@@ -258,10 +260,10 @@ class ViceArticleIE(InfoExtractor):
              # AES-encrypted m3u8
              'skip_download': True,
          },
-        'add_ie': ['UplynkPreplay'],
+        'add_ie': [ViceIE.ie_key()],
      }, {
          'url': 'https://www.vice.com/en_us/article/how-to-hack-a-car',
-        'md5': '7fe8ebc4fa3323efafc127b82bd821d9',
+        'md5': '13010ee0bc694ea87ec40724397c2349',
          'info_dict': {
              'id': '3jstaBeXgAs',
              'ext': 'mp4',
@@ -271,15 +273,15 @@ class ViceArticleIE(InfoExtractor):
              'uploader_id': 'MotherboardTV',
              'upload_date': '20140529',
          },
-        'add_ie': ['Youtube'],
+        'add_ie': [YoutubeIE.ie_key()],
      }, {
          'url': 'https://www.vice.com/en_us/article/znm9dx/karley-sciortino-slutever-reloaded',
          'md5': 'a7ecf64ee4fa19b916c16f4b56184ae2',
          'info_dict': {
-            'id': 'e2ed435eb67e43efb66e6ef9a6930a88',
+            'id': '57f41d3556a0a80f54726060',
              'ext': 'mp4',
              'title': "Making The World's First Male Sex Doll",
-            'description': 'md5:916078ef0e032d76343116208b6cc2c4',
+            'description': 'md5:19b00b215b99961cf869c40fbe9df755',
              'uploader': 'vice',
              'uploader_id': '57a204088cb727dec794c67b',
              'timestamp': 1476919911,
@@ -288,6 +290,7 @@ class ViceArticleIE(InfoExtractor):
          },
          'params': {
              'skip_download': True,
+            'format': 'bestvideo',
          },
          'add_ie': [ViceIE.ie_key()],
      }, {
@@ -299,14 +302,11 @@ class ViceArticleIE(InfoExtractor):
      }]
  
      def _real_extract(self, url):
-        display_id = self._match_id(url)
-
-        webpage = self._download_webpage(url, display_id)
+        locale, display_id = re.match(self._VALID_URL, url).groups()
  
-        prefetch_data = self._parse_json(self._search_regex(
-            r'__APP_STATE\s*=\s*({.+?})(?:\s*\|\|\s*{}\s*)?;\s*\n',
-            webpage, 'app state'), display_id)['pageData']
-        body = prefetch_data['body']
+        article = self._call_api('articles', 'slug', display_id, locale, '''body
+    embed_code''')[0]
+        body = article['body']
  
          def _url_res(video_url, ie_key):
              return {
@@ -316,7 +316,7 @@ def _url_res(video_url, ie_key):
                  'ie_key': ie_key,
              }
  
-        vice_url = ViceIE._extract_url(webpage)
+        vice_url = ViceIE._extract_url(body)
          if vice_url:
              return _url_res(vice_url, ViceIE.ie_key())
  
@@ -332,6 +332,6 @@ def _url_res(video_url, ie_key):
  
          video_url = self._html_search_regex(
              r'data-video-url="([^"]+)"',
-            prefetch_data['embed_code'], 'video URL')
+            article['embed_code'], 'video URL')
  
          return _url_res(video_url, ViceIE.ie_key())
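Note on the vice.py hunks: `ViceBaseIE._call_api` fills a small GraphQL template by string formatting, and the preplay request is signed with a SHA-512 digest of `'<video_id>:GET:<exp>'`. The sketch below rebuilds both strings without any network access so the resulting text can be inspected. `build_query` and `sign_request` are hypothetical names used only for this note; the template and the digest input are copied from the diff, and the example arguments follow the `_fetch_page` call with page 1 and a page size of 25.

```python
import hashlib


def build_query(resource, locale, resource_key, resource_id, fields, args=''):
    """Fill the GraphQL template the same way _call_api does in the diff."""
    return '''{
  %s(locale: "%s", %s: "%s"%s) {
    %s
  }
}''' % (resource, locale, resource_key, resource_id, args, fields)


def sign_request(video_id, exp):
    """SHA-512 hex digest over '<video_id>:GET:<exp>', as in the query.update hunk."""
    return hashlib.sha512(('%s:GET:%d' % (video_id, exp)).encode()).hexdigest()


# First page of a show listing, mirroring _fetch_page(page=0).
page_query = build_query(
    'videos', 'en_us', 'show_id', '57a2040c8cb727dec794c901',
    'body\n    id\n    url',
    ', page: %d, per_page: %d' % (0 + 1, 25))

signature = sign_request('58c69e38a55424f1227dc3f7', 1600000000)
```

Because the values are interpolated directly into the query text, a slug or id containing a double quote would produce an invalid query; the diff does not escape these values.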
diff --git a/youtube_dl/extractor/vidbit.py b/youtube_dlc/extractor/vidbit.py
similarity index 100%
rename from youtube_dl/extractor/vidbit.py
rename to youtube_dlc/extractor/vidbit.py
diff --git a/youtube_dl/extractor/viddler.py b/youtube_dlc/extractor/viddler.py
similarity index 100%
rename from youtube_dl/extractor/viddler.py
rename to youtube_dlc/extractor/viddler.py
diff --git a/youtube_dl/extractor/videa.py b/youtube_dlc/extractor/videa.py
similarity index 58%
rename from youtube_dl/extractor/videa.py
rename to youtube_dlc/extractor/videa.py
index d0e34c81980c51b6cd464dfc1e1e34ff9b959c44..a03614cc10918301dbd3196de2123b0f22b2a8b8 100644
--- a/youtube_dl/extractor/videa.py
+++ b/youtube_dlc/extractor/videa.py
@@ -2,15 +2,24 @@
  from __future__ import unicode_literals
  
  import re
+import random
+import string
+import struct
  
  from .common import InfoExtractor
  from ..utils import (
+    ExtractorError,
      int_or_none,
      mimetype2ext,
      parse_codecs,
      xpath_element,
      xpath_text,
  )
+from ..compat import (
+    compat_b64decode,
+    compat_ord,
+    compat_parse_qs,
+)
  
  
  class VideaIE(InfoExtractor):
@@ -60,15 +69,63 @@ def _extract_urls(webpage):
              r'<iframe[^>]+src=(["\'])(?P<url>(?:https?:)?//videa\.hu/player\?.*?\bv=.+?)\1',
              webpage)]
  
+    def rc4(self, ciphertext, key):
+        res = b''
+
+        keyLen = len(key)
+        S = list(range(256))
+
+        j = 0
+        for i in range(256):
+            j = (j + S[i] + ord(key[i % keyLen])) % 256
+            S[i], S[j] = S[j], S[i]
+
+        i = 0
+        j = 0
+        for m in range(len(ciphertext)):
+            i = (i + 1) % 256
+            j = (j + S[i]) % 256
+            S[i], S[j] = S[j], S[i]
+            k = S[(S[i] + S[j]) % 256]
+            res += struct.pack("B", k ^ compat_ord(ciphertext[m]))
+
+        return res
+
      def _real_extract(self, url):
          video_id = self._match_id(url)
+        webpage = self._download_webpage(url, video_id, fatal=True)
+        error = self._search_regex(r'<p class="error-text">([^<]+)</p>', webpage, 'error', default=None)
+        if error:
+            raise ExtractorError(error, expected=True)
+
+        video_src_params_raw = self._search_regex(r'<iframe[^>]+id="videa_player_iframe"[^>]+src="/player\?([^"]+)"', webpage, 'video_src_params')
+        video_src_params = compat_parse_qs(video_src_params_raw)
+        player_page = self._download_webpage("https://videa.hu/videojs_player?%s" % video_src_params_raw, video_id, fatal=True)
+        nonce = self._search_regex(r'_xt\s*=\s*"([^"]+)"', player_page, 'nonce')
+        random_seed = ''.join(random.choice(string.ascii_uppercase + string.ascii_lowercase + string.digits) for _ in range(8))
+        static_secret = 'xHb0ZvME5q8CBcoQi6AngerDu3FGO9fkUlwPmLVY_RTzj2hJIS4NasXWKy1td7p'
+        l = nonce[:32]
+        s = nonce[32:]
+        result = ''
+        for i in range(0, 32):
+            result += s[i - (static_secret.index(l[i]) - 31)]
  
-        info = self._download_xml(
+        video_src_params['_s'] = random_seed
+        video_src_params['_t'] = result[:16]
+        encryption_key_stem = result[16:] + random_seed
+
+        [b64_info, handle] = self._download_webpage_handle(
              'http://videa.hu/videaplayer_get_xml.php', video_id,
-            query={'v': video_id})
+            query=video_src_params, fatal=True)
+
+        encrypted_info = compat_b64decode(b64_info)
+        key = encryption_key_stem + handle.info()['x-videa-xs']
+        info_str = self.rc4(encrypted_info, key).decode('utf8')
+        info = self._parse_xml(info_str, video_id)
  
          video = xpath_element(info, './/video', 'video', fatal=True)
          sources = xpath_element(info, './/video_sources', 'sources', fatal=True)
+        hash_values = xpath_element(info, './/hash_values', 'hash_values', fatal=True)
  
          title = xpath_text(video, './title', fatal=True)
  
@@ -77,6 +134,7 @@ def _real_extract(self, url):
              source_url = source.text
              if not source_url:
                  continue
+            source_url += '?md5=%s&expires=%s' % (hash_values.find('hash_value_%s' % source.get('name')).text, source.get('exp'))
              f = parse_codecs(source.get('codecs'))
              f.update({
                  'url': source_url,
diff --git a/youtube_dl/extractor/videodetective.py b/youtube_dlc/extractor/videodetective.py

similarity index 100%

rename from youtube_dl/extractor/videodetective.py

rename to youtube_dlc/extractor/videodetective.py
diff --git a/youtube_dl/extractor/videofyme.py b/youtube_dlc/extractor/videofyme.py

similarity index 100%

rename from youtube_dl/extractor/videofyme.py

rename to youtube_dlc/extractor/videofyme.py
diff --git a/youtube_dl/extractor/videomore.py b/youtube_dlc/extractor/videomore.py

similarity index 100%

rename from youtube_dl/extractor/videomore.py

rename to youtube_dlc/extractor/videomore.py
diff --git a/youtube_dl/extractor/videopress.py b/youtube_dlc/extractor/videopress.py

similarity index 100%

rename from youtube_dl/extractor/videopress.py

rename to youtube_dlc/extractor/videopress.py
diff --git a/youtube_dl/extractor/vidio.py b/youtube_dlc/extractor/vidio.py

similarity index 100%

rename from youtube_dl/extractor/vidio.py

rename to youtube_dlc/extractor/vidio.py
diff --git a/youtube_dl/extractor/vidlii.py b/youtube_dlc/extractor/vidlii.py

similarity index 100%

rename from youtube_dl/extractor/vidlii.py

rename to youtube_dlc/extractor/vidlii.py
diff --git a/youtube_dl/extractor/vidme.py b/youtube_dlc/extractor/vidme.py

similarity index 100%

rename from youtube_dl/extractor/vidme.py

rename to youtube_dlc/extractor/vidme.py
diff --git a/youtube_dl/extractor/vidzi.py b/youtube_dlc/extractor/vidzi.py

similarity index 96%

rename from youtube_dl/extractor/vidzi.py

rename to youtube_dlc/extractor/vidzi.py

index 42ea4952c381b497b6f01abcb44212339563e0a3..4e79a0b84b8ca04ea537646aee4b593872351829 100644 (file)
--- a/youtube_dl/extractor/vidzi.py
+++ b/youtube_dlc/extractor/vidzi.py
@@ -20,7 +20,7 @@ class VidziIE(InfoExtractor):
          'info_dict': {
              'id': 'cghql9yq6emu',
              'ext': 'mp4',
-            'title': 'youtube-dl test video  1\\\\2\'3/4<5\\\\6ä7↭',
+            'title': 'youtube-dlc test video  1\\\\2\'3/4<5\\\\6ä7↭',
          },
          'params': {
              # m3u8 download
diff --git a/youtube_dl/extractor/vier.py b/youtube_dlc/extractor/vier.py

similarity index 100%

rename from youtube_dl/extractor/vier.py

rename to youtube_dlc/extractor/vier.py
diff --git a/youtube_dlc/extractor/viewlift.py b/youtube_dlc/extractor/viewlift.py

new file mode 100644 (file)

index 0000000..d6b92b1
--- /dev/null
+++ b/youtube_dlc/extractor/viewlift.py
@@ -0,0 +1,250 @@
+from __future__ import unicode_literals
+
+import json
+import re
+
+from .common import InfoExtractor
+from ..compat import compat_HTTPError
+from ..utils import (
+    ExtractorError,
+    int_or_none,
+    parse_age_limit,
+)
+
+
+class ViewLiftBaseIE(InfoExtractor):
+    _API_BASE = 'https://prod-api.viewlift.com/'
+    _DOMAINS_REGEX = r'(?:(?:main\.)?snagfilms|snagxtreme|funnyforfree|kiddovid|winnersview|(?:monumental|lax)sportsnetwork|vayafilm|failarmy|ftfnext|lnppass\.legapallacanestro|moviespree|app\.myoutdoortv|neoufitness|pflmma|theidentitytb)\.com|(?:hoichoi|app\.horseandcountry|kronon|marquee|supercrosslive)\.tv'
+    _SITE_MAP = {
+        'ftfnext': 'lax',
+        'funnyforfree': 'snagfilms',
+        'hoichoi': 'hoichoitv',
+        'kiddovid': 'snagfilms',
+        'laxsportsnetwork': 'lax',
+        'legapallacanestro': 'lnp',
+        'marquee': 'marquee-tv',
+        'monumentalsportsnetwork': 'monumental-network',
+        'moviespree': 'bingeflix',
+        'pflmma': 'pfl',
+        'snagxtreme': 'snagfilms',
+        'theidentitytb': 'tampabay',
+        'vayafilm': 'snagfilms',
+    }
+    _TOKENS = {}
+
+    def _call_api(self, site, path, video_id, query):
+        token = self._TOKENS.get(site)
+        if not token:
+            token_query = {'site': site}
+            email, password = self._get_login_info(netrc_machine=site)
+            if email:
+                resp = self._download_json(
+                    self._API_BASE + 'identity/signin', video_id,
+                    'Logging in', query=token_query, data=json.dumps({
+                        'email': email,
+                        'password': password,
+                    }).encode())
+            else:
+                resp = self._download_json(
+                    self._API_BASE + 'identity/anonymous-token', video_id,
+                    'Downloading authorization token', query=token_query)
+            self._TOKENS[site] = token = resp['authorizationToken']
+        return self._download_json(
+            self._API_BASE + path, video_id,
+            headers={'Authorization': token}, query=query)
+
+
+class ViewLiftEmbedIE(ViewLiftBaseIE):
+    IE_NAME = 'viewlift:embed'
+    _VALID_URL = r'https?://(?:(?:www|embed)\.)?(?P<domain>%s)/embed/player\?.*\bfilmId=(?P<id>[\da-f]{8}-(?:[\da-f]{4}-){3}[\da-f]{12})' % ViewLiftBaseIE._DOMAINS_REGEX
+    _TESTS = [{
+        'url': 'http://embed.snagfilms.com/embed/player?filmId=74849a00-85a9-11e1-9660-123139220831&w=500',
+        'md5': '2924e9215c6eff7a55ed35b72276bd93',
+        'info_dict': {
+            'id': '74849a00-85a9-11e1-9660-123139220831',
+            'ext': 'mp4',
+            'title': '#whilewewatch',
+            'description': 'md5:b542bef32a6f657dadd0df06e26fb0c8',
+            'timestamp': 1334350096,
+            'upload_date': '20120413',
+        }
+    }, {
+        # invalid labels, 360p is better than 480p
+        'url': 'http://www.snagfilms.com/embed/player?filmId=17ca0950-a74a-11e0-a92a-0026bb61d036',
+        'md5': '882fca19b9eb27ef865efeeaed376a48',
+        'info_dict': {
+            'id': '17ca0950-a74a-11e0-a92a-0026bb61d036',
+            'ext': 'mp4',
+            'title': 'Life in Limbo',
+        },
+        'skip': 'The video does not exist',
+    }, {
+        'url': 'http://www.snagfilms.com/embed/player?filmId=0000014c-de2f-d5d6-abcf-ffef58af0017',
+        'only_matching': True,
+    }]
+
+    @staticmethod
+    def _extract_url(webpage):
+        mobj = re.search(
+            r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//(?:embed\.)?(?:%s)/embed/player.+?)\1' % ViewLiftBaseIE._DOMAINS_REGEX,
+            webpage)
+        if mobj:
+            return mobj.group('url')
+
+    def _real_extract(self, url):
+        domain, film_id = re.match(self._VALID_URL, url).groups()
+        site = domain.split('.')[-2]
+        if site in self._SITE_MAP:
+            site = self._SITE_MAP[site]
+        try:
+            content_data = self._call_api(
+                site, 'entitlement/video/status', film_id, {
+                    'id': film_id
+                })['video']
+        except ExtractorError as e:
+            if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
+                error_message = self._parse_json(e.cause.read().decode(), film_id).get('errorMessage')
+                if error_message == 'User does not have a valid subscription or has not purchased this content.':
+                    self.raise_login_required()
+                raise ExtractorError(error_message, expected=True)
+            raise
+        gist = content_data['gist']
+        title = gist['title']
+        video_assets = content_data['streamingInfo']['videoAssets']
+
+        formats = []
+        mpeg_video_assets = video_assets.get('mpeg') or []
+        for video_asset in mpeg_video_assets:
+            video_asset_url = video_asset.get('url')
+            if not video_asset_url:
+                continue
+            bitrate = int_or_none(video_asset.get('bitrate'))
+            height = int_or_none(self._search_regex(
+                r'^_?(\d+)[pP]$', video_asset.get('renditionValue'),
+                'height', default=None))
+            formats.append({
+                'url': video_asset_url,
+                'format_id': 'http%s' % ('-%d' % bitrate if bitrate else ''),
+                'tbr': bitrate,
+                'height': height,
+                'vcodec': video_asset.get('codec'),
+            })
+
+        hls_url = video_assets.get('hls')
+        if hls_url:
+            formats.extend(self._extract_m3u8_formats(
+                hls_url, film_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False))
+        self._sort_formats(formats, ('height', 'tbr', 'format_id'))
+
+        info = {
+            'id': film_id,
+            'title': title,
+            'description': gist.get('description'),
+            'thumbnail': gist.get('videoImageUrl'),
+            'duration': int_or_none(gist.get('runtime')),
+            'age_limit': parse_age_limit(content_data.get('parentalRating')),
+            'timestamp': int_or_none(gist.get('publishDate'), 1000),
+            'formats': formats,
+        }
+        for k in ('categories', 'tags'):
+            info[k] = [v['title'] for v in content_data.get(k, []) if v.get('title')]
+        return info
+
+
+class ViewLiftIE(ViewLiftBaseIE):
+    IE_NAME = 'viewlift'
+    _VALID_URL = r'https?://(?:www\.)?(?P<domain>%s)(?P<path>(?:/(?:films/title|show|(?:news/)?videos?|watch))?/(?P<id>[^?#]+))' % ViewLiftBaseIE._DOMAINS_REGEX
+    _TESTS = [{
+        'url': 'http://www.snagfilms.com/films/title/lost_for_life',
+        'md5': '19844f897b35af219773fd63bdec2942',
+        'info_dict': {
+            'id': '0000014c-de2f-d5d6-abcf-ffef58af0017',
+            'display_id': 'lost_for_life',
+            'ext': 'mp4',
+            'title': 'Lost for Life',
+            'description': 'md5:ea10b5a50405ae1f7b5269a6ec594102',
+            'thumbnail': r're:^https?://.*\.jpg',
+            'duration': 4489,
+            'categories': 'mincount:3',
+            'age_limit': 14,
+            'upload_date': '20150421',
+            'timestamp': 1429656820,
+        }
+    }, {
+        'url': 'http://www.snagfilms.com/show/the_world_cut_project/india',
+        'md5': 'e6292e5b837642bbda82d7f8bf3fbdfd',
+        'info_dict': {
+            'id': '00000145-d75c-d96e-a9c7-ff5c67b20000',
+            'display_id': 'the_world_cut_project/india',
+            'ext': 'mp4',
+            'title': 'India',
+            'description': 'md5:5c168c5a8f4719c146aad2e0dfac6f5f',
+            'thumbnail': r're:^https?://.*\.jpg',
+            'duration': 979,
+            'timestamp': 1399478279,
+            'upload_date': '20140507',
+        }
+    }, {
+        'url': 'http://main.snagfilms.com/augie_alone/s_2_ep_12_love',
+        'info_dict': {
+            'id': '00000148-7b53-de26-a9fb-fbf306f70020',
+            'display_id': 'augie_alone/s_2_ep_12_love',
+            'ext': 'mp4',
+            'title': 'S. 2 Ep. 12 - Love',
+            'description': 'Augie finds love.',
+            'thumbnail': r're:^https?://.*\.jpg',
+            'duration': 107,
+            'upload_date': '20141012',
+            'timestamp': 1413129540,
+            'age_limit': 17,
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }, {
+        'url': 'http://main.snagfilms.com/films/title/the_freebie',
+        'only_matching': True,
+    }, {
+        # Film is not playable in your area.
+        'url': 'http://www.snagfilms.com/films/title/inside_mecca',
+        'only_matching': True,
+    }, {
+        # Film is not available.
+        'url': 'http://www.snagfilms.com/show/augie_alone/flirting',
+        'only_matching': True,
+    }, {
+        'url': 'http://www.winnersview.com/videos/the-good-son',
+        'only_matching': True,
+    }, {
+        # Was once Kaltura embed
+        'url': 'https://www.monumentalsportsnetwork.com/videos/john-carlson-postgame-2-25-15',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.marquee.tv/watch/sadlerswells-sacredmonsters',
+        'only_matching': True,
+    }]
+
+    @classmethod
+    def suitable(cls, url):
+        return False if ViewLiftEmbedIE.suitable(url) else super(ViewLiftIE, cls).suitable(url)
+
+    def _real_extract(self, url):
+        domain, path, display_id = re.match(self._VALID_URL, url).groups()
+        site = domain.split('.')[-2]
+        if site in self._SITE_MAP:
+            site = self._SITE_MAP[site]
+        modules = self._call_api(
+            site, 'content/pages', display_id, {
+                'includeContent': 'true',
+                'moduleOffset': 1,
+                'path': path,
+                'site': site,
+            })['modules']
+        film_id = next(m['contentData'][0]['gist']['id'] for m in modules if m.get('moduleType') == 'VideoDetailModule')
+        return {
+            '_type': 'url_transparent',
+            'url': 'http://%s/embed/player?filmId=%s' % (domain, film_id),
+            'id': film_id,
+            'display_id': display_id,
+            'ie_key': 'ViewLiftEmbed',
+        }
diff --git a/youtube_dl/extractor/viidea.py b/youtube_dlc/extractor/viidea.py

similarity index 100%

rename from youtube_dl/extractor/viidea.py

rename to youtube_dlc/extractor/viidea.py
diff --git a/youtube_dl/extractor/viki.py b/youtube_dlc/extractor/viki.py

similarity index 90%

rename from youtube_dl/extractor/viki.py

rename to youtube_dlc/extractor/viki.py

index b0dcdc0e6baced889541e3307ac8314e73f99522..f8e3603385474fc003f2e763f9a910a214897939 100644 (file)
--- a/youtube_dl/extractor/viki.py
+++ b/youtube_dlc/extractor/viki.py
@@ -12,6 +12,7 @@
  from ..utils import (
      ExtractorError,
      int_or_none,
+    HEADRequest,
      parse_age_limit,
      parse_iso8601,
      sanitized_Request,
@@ -56,14 +57,14 @@ def _prepare_call(self, path, timestamp=None, post_data=None):
  
      def _call_api(self, path, video_id, note, timestamp=None, post_data=None):
          resp = self._download_json(
-            self._prepare_call(path, timestamp, post_data), video_id, note)
+            self._prepare_call(path, timestamp, post_data), video_id, note, headers={'x-viki-app-ver': '2.2.5.1428709186'}, expected_status=[200, 400, 404])
  
          error = resp.get('error')
          if error:
              if error == 'invalid timestamp':
                  resp = self._download_json(
                      self._prepare_call(path, int(resp['current_timestamp']), post_data),
-                    video_id, '%s (retry)' % note)
+                    video_id, '%s (retry)' % note, headers={'x-viki-app-ver': '2.2.5.1428709186'}, expected_status=[200, 400, 404])
                  error = resp.get('error')
              if error:
                  self._raise_error(resp['error'])
@@ -220,6 +221,69 @@ def _real_extract(self, url):
          video = self._call_api(
              'videos/%s.json' % video_id, video_id, 'Downloading video JSON')
  
+        streams = self._call_api(
+            'videos/%s/streams.json' % video_id, video_id,
+            'Downloading video streams JSON')
+
+        formats = []
+        for format_id, stream_dict in streams.items():
+            height = int_or_none(self._search_regex(
+                r'^(\d+)[pP]$', format_id, 'height', default=None))
+            for protocol, format_dict in stream_dict.items():
+                # rtmps URLs do not seem to work
+                if protocol == 'rtmps':
+                    continue
+                format_url = format_dict.get('url')
+                format_drms = format_dict.get('drms')
+                format_stream_id = format_dict.get('id')
+                if format_id == 'm3u8':
+                    m3u8_formats = self._extract_m3u8_formats(
+                        format_url, video_id, 'mp4',
+                        entry_protocol='m3u8_native',
+                        m3u8_id='m3u8-%s' % protocol, fatal=False)
+                    # Despite CODECS metadata in m3u8 all video-only formats
+                    # are actually video+audio
+                    for f in m3u8_formats:
+                        if f.get('acodec') == 'none' and f.get('vcodec') != 'none':
+                            f['acodec'] = None
+                    formats.extend(m3u8_formats)
+                elif format_id == 'mpd':
+                    mpd_formats = self._extract_mpd_formats(
+                        format_url, video_id,
+                        mpd_id='mpd-%s' % protocol, fatal=False)
+                    formats.extend(mpd_formats)
+                elif format_id == 'mpd':
+
+                    formats.extend(mpd_formats)
+                elif format_url.startswith('rtmp'):
+                    mobj = re.search(
+                        r'^(?P<url>rtmp://[^/]+/(?P<app>.+?))/(?P<playpath>mp4:.+)$',
+                        format_url)
+                    if not mobj:
+                        continue
+                    formats.append({
+                        'format_id': 'rtmp-%s' % format_id,
+                        'ext': 'flv',
+                        'url': mobj.group('url'),
+                        'play_path': mobj.group('playpath'),
+                        'app': mobj.group('app'),
+                        'page_url': url,
+                        'drms': format_drms,
+                        'stream_id': format_stream_id,
+                    })
+                else:
+                    urlh = self._request_webpage(
+                        HEADRequest(format_url), video_id, 'Checking file size', fatal=False)
+                    formats.append({
+                        'url': format_url,
+                        'format_id': '%s-%s' % (format_id, protocol),
+                        'height': height,
+                        'drms': format_drms,
+                        'stream_id': format_stream_id,
+                        'filesize': int_or_none(urlh.headers.get('Content-Length')) if urlh else None,
+                    })
+        self._sort_formats(formats)
+
          self._check_errors(video)
  
          title = self.dict_selection(video.get('titles', {}), 'en', allow_fallback=False)
@@ -244,12 +308,18 @@ def _real_extract(self, url):
                  'url': thumbnail.get('url'),
              })
  
+        stream_ids = []
+        for f in formats:
+            s_id = f.get('stream_id')
+            if s_id is not None:
+                stream_ids.append(s_id)
+
          subtitles = {}
          for subtitle_lang, _ in video.get('subtitle_completions', {}).items():
              subtitles[subtitle_lang] = [{
                  'ext': subtitles_format,
                  'url': self._prepare_call(
-                    'videos/%s/subtitles/%s.%s' % (video_id, subtitle_lang, subtitles_format)),
+                    'videos/%s/subtitles/%s.%s?stream_id=%s' % (video_id, subtitle_lang, subtitles_format, stream_ids[0])),
              } for subtitles_format in ('srt', 'vtt')]
  
          result = {
@@ -265,10 +335,6 @@ def _real_extract(self, url):
              'subtitles': subtitles,
          }
  
-        streams = self._call_api(
-            'videos/%s/streams.json' % video_id, video_id,
-            'Downloading video streams JSON')
-
          if 'external' in streams:
              result.update({
                  '_type': 'url_transparent',
@@ -276,48 +342,6 @@ def _real_extract(self, url):
              })
              return result
  
-        formats = []
-        for format_id, stream_dict in streams.items():
-            height = int_or_none(self._search_regex(
-                r'^(\d+)[pP]$', format_id, 'height', default=None))
-            for protocol, format_dict in stream_dict.items():
-                # rtmps URLs does not seem to work
-                if protocol == 'rtmps':
-                    continue
-                format_url = format_dict['url']
-                if format_id == 'm3u8':
-                    m3u8_formats = self._extract_m3u8_formats(
-                        format_url, video_id, 'mp4',
-                        entry_protocol='m3u8_native',
-                        m3u8_id='m3u8-%s' % protocol, fatal=False)
-                    # Despite CODECS metadata in m3u8 all video-only formats
-                    # are actually video+audio
-                    for f in m3u8_formats:
-                        if f.get('acodec') == 'none' and f.get('vcodec') != 'none':
-                            f['acodec'] = None
-                    formats.extend(m3u8_formats)
-                elif format_url.startswith('rtmp'):
-                    mobj = re.search(
-                        r'^(?P<url>rtmp://[^/]+/(?P<app>.+?))/(?P<playpath>mp4:.+)$',
-                        format_url)
-                    if not mobj:
-                        continue
-                    formats.append({
-                        'format_id': 'rtmp-%s' % format_id,
-                        'ext': 'flv',
-                        'url': mobj.group('url'),
-                        'play_path': mobj.group('playpath'),
-                        'app': mobj.group('app'),
-                        'page_url': url,
-                    })
-                else:
-                    formats.append({
-                        'url': format_url,
-                        'format_id': '%s-%s' % (format_id, protocol),
-                        'height': height,
-                    })
-        self._sort_formats(formats)
-
          result['formats'] = formats
          return result
  
diff --git a/youtube_dl/extractor/vimeo.py b/youtube_dlc/extractor/vimeo.py

similarity index 93%

rename from youtube_dl/extractor/vimeo.py

rename to youtube_dlc/extractor/vimeo.py

index baa46d5f3513cbde337f144c7143a9c501455ff4..9839657ca3d96fe3775c5bed4391d88e419225f9 100644 (file)
--- a/youtube_dl/extractor/vimeo.py
+++ b/youtube_dlc/extractor/vimeo.py
@@ -33,6 +33,7 @@
      unified_timestamp,
      unsmuggle_url,
      urlencode_postdata,
+    urljoin,
      unescapeHTML,
  )
  
@@ -139,28 +140,28 @@ def _parse_config(self, config, video_id):
              })
  
          # TODO: fix handling of 308 status code returned for live archive manifest requests
+        sep_pattern = r'/sep/video/'
          for files_type in ('hls', 'dash'):
              for cdn_name, cdn_data in config_files.get(files_type, {}).get('cdns', {}).items():
                  manifest_url = cdn_data.get('url')
                  if not manifest_url:
                      continue
                  format_id = '%s-%s' % (files_type, cdn_name)
-                if files_type == 'hls':
-                    formats.extend(self._extract_m3u8_formats(
-                        manifest_url, video_id, 'mp4',
-                        'm3u8' if is_live else 'm3u8_native', m3u8_id=format_id,
-                        note='Downloading %s m3u8 information' % cdn_name,
-                        fatal=False))
-                elif files_type == 'dash':
-                    mpd_pattern = r'/%s/(?:sep/)?video/' % video_id
-                    mpd_manifest_urls = []
-                    if re.search(mpd_pattern, manifest_url):
-                        for suffix, repl in (('', 'video'), ('_sep', 'sep/video')):
-                            mpd_manifest_urls.append((format_id + suffix, re.sub(
-                                mpd_pattern, '/%s/%s/' % (video_id, repl), manifest_url)))
-                    else:
-                        mpd_manifest_urls = [(format_id, manifest_url)]
-                    for f_id, m_url in mpd_manifest_urls:
+                sep_manifest_urls = []
+                if re.search(sep_pattern, manifest_url):
+                    for suffix, repl in (('', 'video'), ('_sep', 'sep/video')):
+                        sep_manifest_urls.append((format_id + suffix, re.sub(
+                            sep_pattern, '/%s/' % repl, manifest_url)))
+                else:
+                    sep_manifest_urls = [(format_id, manifest_url)]
+                for f_id, m_url in sep_manifest_urls:
+                    if files_type == 'hls':
+                        formats.extend(self._extract_m3u8_formats(
+                            m_url, video_id, 'mp4',
+                            'm3u8' if is_live else 'm3u8_native', m3u8_id=f_id,
+                            note='Downloading %s m3u8 information' % cdn_name,
+                            fatal=False))
+                    elif files_type == 'dash':
                          if 'json=1' in m_url:
                              real_m_url = (self._download_json(m_url, video_id, fatal=False) or {}).get('url')
                              if real_m_url:
@@ -169,11 +170,6 @@ def _parse_config(self, config, video_id):
                              m_url.replace('/master.json', '/master.mpd'), video_id, f_id,
                              'Downloading %s MPD information' % cdn_name,
                              fatal=False)
-                        for f in mpd_formats:
-                            if f.get('vcodec') == 'none':
-                                f['preference'] = -50
-                            elif f.get('acodec') == 'none':
-                                f['preference'] = -40
                          formats.extend(mpd_formats)
  
          live_archive = live_event.get('archive') or {}
@@ -185,13 +181,19 @@ def _parse_config(self, config, video_id):
                  'preference': 1,
              })
  
+        for f in formats:
+            if f.get('vcodec') == 'none':
+                f['preference'] = -50
+            elif f.get('acodec') == 'none':
+                f['preference'] = -40
+
          subtitles = {}
          text_tracks = config['request'].get('text_tracks')
          if text_tracks:
              for tt in text_tracks:
                  subtitles[tt['lang']] = [{
                      'ext': 'vtt',
-                    'url': 'https://vimeo.com' + tt['url'],
+                    'url': urljoin('https://vimeo.com', tt['url']),
                  }]
  
          thumbnails = []
@@ -591,14 +593,14 @@ def _real_extract(self, url):
              # Retrieve video webpage to extract further information
              webpage, urlh = self._download_webpage_handle(
                  url, video_id, headers=headers)
-            redirect_url = compat_str(urlh.geturl())
+            redirect_url = urlh.geturl()
          except ExtractorError as ee:
              if isinstance(ee.cause, compat_HTTPError) and ee.cause.code == 403:
                  errmsg = ee.cause.read()
                  if b'Because of its privacy settings, this video cannot be played here' in errmsg:
                      raise ExtractorError(
                          'Cannot download embed-only video without embedding '
-                        'URL. Please call youtube-dl with the URL of the page '
+                        'URL. Please call youtube-dlc with the URL of the page '
                          'that embeds this video.',
                          expected=True)
              raise
@@ -841,33 +843,6 @@ def _extract_list_title(self, webpage):
          return self._TITLE or self._html_search_regex(
              self._TITLE_RE, webpage, 'list title', fatal=False)
  
-    def _login_list_password(self, page_url, list_id, webpage):
-        login_form = self._search_regex(
-            r'(?s)<form[^>]+?id="pw_form"(.*?)</form>',
-            webpage, 'login form', default=None)
-        if not login_form:
-            return webpage
-
-        password = self._downloader.params.get('videopassword')
-        if password is None:
-            raise ExtractorError('This album is protected by a password, use the --video-password option', expected=True)
-        fields = self._hidden_inputs(login_form)
-        token, vuid = self._extract_xsrft_and_vuid(webpage)
-        fields['token'] = token
-        fields['password'] = password
-        post = urlencode_postdata(fields)
-        password_path = self._search_regex(
-            r'action="([^"]+)"', login_form, 'password URL')
-        password_url = compat_urlparse.urljoin(page_url, password_path)
-        password_request = sanitized_Request(password_url, post)
-        password_request.add_header('Content-type', 'application/x-www-form-urlencoded')
-        self._set_vimeo_cookie('vuid', vuid)
-        self._set_vimeo_cookie('xsrft', token)
-
-        return self._download_webpage(
-            password_request, list_id,
-            'Verifying the password', 'Wrong password')
-
      def _title_and_entries(self, list_id, base_url):
          for pagenum in itertools.count(1):
              page_url = self._page_url(base_url, pagenum)
@@ -876,7 +851,6 @@ def _title_and_entries(self, list_id, base_url):
                  'Downloading page %s' % pagenum)
  
              if pagenum == 1:
-                webpage = self._login_list_password(page_url, list_id, webpage)
                  yield self._extract_list_title(webpage)
  
              # Try extracting href first since not all videos are available via
@@ -923,7 +897,7 @@ class VimeoUserIE(VimeoChannelIE):
      _BASE_URL_TEMPL = 'https://vimeo.com/%s'
  
  
-class VimeoAlbumIE(VimeoChannelIE):
+class VimeoAlbumIE(VimeoBaseInfoExtractor):
      IE_NAME = 'vimeo:album'
      _VALID_URL = r'https://vimeo\.com/(?:album|showcase)/(?P<id>\d+)(?:$|[?#]|/(?!video))'
      _TITLE_RE = r'<header id="page_header">\n\s*<h1>(.*?)</h1>'
@@ -973,13 +947,39 @@ def _fetch_page(self, album_id, authorizaion, hashed_pass, page):
      def _real_extract(self, url):
          album_id = self._match_id(url)
          webpage = self._download_webpage(url, album_id)
-        webpage = self._login_list_password(url, album_id, webpage)
-        api_config = self._extract_vimeo_config(webpage, album_id)['api']
+        viewer = self._parse_json(self._search_regex(
+            r'bootstrap_data\s*=\s*({.+?})</script>',
+            webpage, 'bootstrap data'), album_id)['viewer']
+        jwt = viewer['jwt']
+        album = self._download_json(
+            'https://api.vimeo.com/albums/' + album_id,
+            album_id, headers={'Authorization': 'jwt ' + jwt},
+            query={'fields': 'description,name,privacy'})
+        hashed_pass = None
+        if try_get(album, lambda x: x['privacy']['view']) == 'password':
+            password = self._downloader.params.get('videopassword')
+            if not password:
+                raise ExtractorError(
+                    'This album is protected by a password, use the --video-password option',
+                    expected=True)
+            self._set_vimeo_cookie('vuid', viewer['vuid'])
+            try:
+                hashed_pass = self._download_json(
+                    'https://vimeo.com/showcase/%s/auth' % album_id,
+                    album_id, 'Verifying the password', data=urlencode_postdata({
+                        'password': password,
+                        'token': viewer['xsrft'],
+                    }), headers={
+                        'X-Requested-With': 'XMLHttpRequest',
+                    })['hashed_pass']
+            except ExtractorError as e:
+                if isinstance(e.cause, compat_HTTPError) and e.cause.code == 401:
+                    raise ExtractorError('Wrong password', expected=True)
+                raise
          entries = OnDemandPagedList(functools.partial(
-            self._fetch_page, album_id, api_config['jwt'],
-            api_config.get('hashed_pass')), self._PAGE_SIZE)
-        return self.playlist_result(entries, album_id, self._html_search_regex(
-            r'<title>\s*(.+?)(?:\s+on Vimeo)?</title>', webpage, 'title', fatal=False))
+            self._fetch_page, album_id, jwt, hashed_pass), self._PAGE_SIZE)
+        return self.playlist_result(
+            entries, album_id, album.get('name'), album.get('description'))
  
  
  class VimeoGroupsIE(VimeoChannelIE):
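Review note: the reworked `VimeoAlbumIE._real_extract` only asks for a password when the API reports `privacy.view == 'password'`, and reads that value through `try_get` so a missing or malformed `privacy` object does not raise. A minimal sketch of that check follows. `try_get_sketch` and `album_needs_password` are hypothetical names, and `try_get_sketch` is a simplified stand-in, not the helper from `..utils`.

```python
def try_get_sketch(src, getter):
    """Simplified stand-in: return getter(src), or None on lookup errors."""
    try:
        return getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        return None


def album_needs_password(album):
    # Mirrors the condition in the diff: only the literal value
    # 'password' triggers the --video-password requirement.
    return try_get_sketch(album, lambda x: x['privacy']['view']) == 'password'
```

With this shape, an album payload without a `privacy` key is treated as not password protected instead of failing the extraction.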
diff --git a/youtube_dl/extractor/vimple.py b/youtube_dlc/extractor/vimple.py
similarity index 100%
rename from youtube_dl/extractor/vimple.py
rename to youtube_dlc/extractor/vimple.py
diff --git a/youtube_dl/extractor/vine.py b/youtube_dlc/extractor/vine.py
similarity index 100%
rename from youtube_dl/extractor/vine.py
rename to youtube_dlc/extractor/vine.py
diff --git a/youtube_dl/extractor/viqeo.py b/youtube_dlc/extractor/viqeo.py
similarity index 100%
rename from youtube_dl/extractor/viqeo.py
rename to youtube_dlc/extractor/viqeo.py
diff --git a/youtube_dl/extractor/viu.py b/youtube_dlc/extractor/viu.py
similarity index 100%
rename from youtube_dl/extractor/viu.py
rename to youtube_dlc/extractor/viu.py
diff --git a/youtube_dl/extractor/vk.py b/youtube_dlc/extractor/vk.py
similarity index 100%
rename from youtube_dl/extractor/vk.py
rename to youtube_dlc/extractor/vk.py
diff --git a/youtube_dl/extractor/vlive.py b/youtube_dlc/extractor/vlive.py
similarity index 87%
rename from youtube_dl/extractor/vlive.py
rename to youtube_dlc/extractor/vlive.py
index c3429f723ddec36cb89dfc9f329f766f45b953e1..f79531e6f3a2e922b0369706cc0d76a22feb2499 100644
--- a/youtube_dl/extractor/vlive.py
+++ b/youtube_dlc/extractor/vlive.py
@@ -6,22 +6,18 @@
  import itertools
  
  from .common import InfoExtractor
-from ..compat import (
-    compat_urllib_parse_urlencode,
-    compat_str,
-)
+from .naver import NaverBaseIE
+from ..compat import compat_str
  from ..utils import (
-    dict_get,
      ExtractorError,
-    float_or_none,
-    int_or_none,
+    merge_dicts,
      remove_start,
      try_get,
      urlencode_postdata,
  )
  
  
-class VLiveIE(InfoExtractor):
+class VLiveIE(NaverBaseIE):
      IE_NAME = 'vlive'
      _VALID_URL = r'https?://(?:(?:www|m)\.)?vlive\.tv/video/(?P<id>[0-9]+)'
      _NETRC_MACHINE = 'vlive'
@@ -34,6 +30,7 @@ class VLiveIE(InfoExtractor):
              'title': "[V LIVE] Girl's Day's Broadcast",
              'creator': "Girl's Day",
              'view_count': int,
+            'uploader_id': 'muploader_a',
          },
      }, {
          'url': 'http://www.vlive.tv/video/16937',
@@ -44,6 +41,7 @@ class VLiveIE(InfoExtractor):
              'creator': 'EXO',
              'view_count': int,
              'subtitles': 'mincount:12',
+            'uploader_id': 'muploader_j',
          },
          'params': {
              'skip_download': True,
@@ -187,45 +185,9 @@ def _replay(self, video_id, webpage, long_video_id, key):
                      'This video is only available for CH+ subscribers')
              long_video_id, key = video_info['vid'], video_info['inkey']
  
-        playinfo = self._download_json(
-            'http://global.apis.naver.com/rmcnmv/rmcnmv/vod_play_videoInfo.json?%s'
-            % compat_urllib_parse_urlencode({
-                'videoId': long_video_id,
-                'key': key,
-                'ptc': 'http',
-                'doct': 'json',  # document type (xml or json)
-                'cpt': 'vtt',  # captions type (vtt or ttml)
-            }), video_id)
-
-        formats = [{
-            'url': vid['source'],
-            'format_id': vid.get('encodingOption', {}).get('name'),
-            'abr': float_or_none(vid.get('bitrate', {}).get('audio')),
-            'vbr': float_or_none(vid.get('bitrate', {}).get('video')),
-            'width': int_or_none(vid.get('encodingOption', {}).get('width')),
-            'height': int_or_none(vid.get('encodingOption', {}).get('height')),
-            'filesize': int_or_none(vid.get('size')),
-        } for vid in playinfo.get('videos', {}).get('list', []) if vid.get('source')]
-        self._sort_formats(formats)
-
-        view_count = int_or_none(playinfo.get('meta', {}).get('count'))
-
-        subtitles = {}
-        for caption in playinfo.get('captions', {}).get('list', []):
-            lang = dict_get(caption, ('locale', 'language', 'country', 'label'))
-            if lang and caption.get('source'):
-                subtitles[lang] = [{
-                    'ext': 'vtt',
-                    'url': caption['source']}]
-
-        info = self._get_common_fields(webpage)
-        info.update({
-            'id': video_id,
-            'formats': formats,
-            'view_count': view_count,
-            'subtitles': subtitles,
-        })
-        return info
+        return merge_dicts(
+            self._get_common_fields(webpage),
+            self._extract_video_info(video_id, long_video_id, key))
  
      def _download_init_page(self, video_id):
          return self._download_webpage(
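Review note: `_replay` now returns `merge_dicts(common_fields, video_info)` instead of building the info dict by hand. The sketch below illustrates the precedence this implies, assuming first-non-`None`-value-wins semantics. `merge_dicts_sketch` is a hypothetical, simplified stand-in; the real helper in `..utils` may handle additional cases (such as empty strings).

```python
def merge_dicts_sketch(*dicts):
    """Merge dicts left to right; earlier non-None values take precedence."""
    merged = {}
    for d in dicts:
        for key, value in d.items():
            if value is None:
                # A None value never blocks a later dict from filling the key.
                continue
            if key not in merged:
                merged[key] = value
    return merged
```

Under that assumption, fields scraped from the webpage win over those returned by the shared Naver code path, while gaps in the former are filled by the latter.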
diff --git a/youtube_dl/extractor/vodlocker.py b/youtube_dlc/extractor/vodlocker.py
similarity index 100%
rename from youtube_dl/extractor/vodlocker.py
rename to youtube_dlc/extractor/vodlocker.py
diff --git a/youtube_dl/extractor/vodpl.py b/youtube_dlc/extractor/vodpl.py
similarity index 100%
rename from youtube_dl/extractor/vodpl.py
rename to youtube_dlc/extractor/vodpl.py
diff --git a/youtube_dl/extractor/vodplatform.py b/youtube_dlc/extractor/vodplatform.py
similarity index 84%
rename from youtube_dl/extractor/vodplatform.py
rename to youtube_dlc/extractor/vodplatform.py
index 239644340384b60c8e1a80d40b50cabbd0fd2c9e..74d2257e7c059cdb410db69d77c25e1c7e8c9902 100644
--- a/youtube_dl/extractor/vodplatform.py
+++ b/youtube_dlc/extractor/vodplatform.py
@@ -6,8 +6,8 @@
  
  
  class VODPlatformIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?vod-platform\.net/[eE]mbed/(?P<id>[^/?#]+)'
-    _TEST = {
+    _VALID_URL = r'https?://(?:(?:www\.)?vod-platform\.net|embed\.kwikmotion\.com)/[eE]mbed/(?P<id>[^/?#]+)'
+    _TESTS = [{
          # from http://www.lbcgroup.tv/watch/chapter/29143/52844/%D8%A7%D9%84%D9%86%D8%B5%D8%B1%D8%A9-%D9%81%D9%8A-%D8%B6%D9%8A%D8%A7%D9%81%D8%A9-%D8%A7%D9%84%D9%80-cnn/ar
          'url': 'http://vod-platform.net/embed/RufMcytHDolTH1MuKHY9Fw',
          'md5': '1db2b7249ce383d6be96499006e951fc',
@@ -16,7 +16,10 @@ class VODPlatformIE(InfoExtractor):
              'ext': 'mp4',
              'title': 'LBCi News_ النصرة في ضيافة الـ "سي.أن.أن"',
          }
-    }
+    }, {
+        'url': 'http://embed.kwikmotion.com/embed/RufMcytHDolTH1MuKHY9Fw',
+        'only_matching': True,
+    }]
  
      def _real_extract(self, url):
          video_id = self._match_id(url)
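Review note: the widened `_VALID_URL` accepts `embed.kwikmotion.com` in addition to `vod-platform.net`. The standalone check below reuses the pattern from the hunk verbatim; `match_id` is a hypothetical helper that only approximates what `_match_id` does.

```python
import re

# Pattern copied from the new _VALID_URL in this hunk.
VALID_URL = r'https?://(?:(?:www\.)?vod-platform\.net|embed\.kwikmotion\.com)/[eE]mbed/(?P<id>[^/?#]+)'


def match_id(url):
    """Return the video id if the URL is accepted, else None."""
    m = re.match(VALID_URL, url)
    return m.group('id') if m else None
```

Note that the optional `www.` prefix applies only to the `vod-platform.net` alternative, so other kwikmotion subdomains are not matched.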
diff --git a/youtube_dlc/extractor/voicerepublic.py b/youtube_dlc/extractor/voicerepublic.py
new file mode 100644
index 0000000..a52e40a
--- /dev/null
+++ b/youtube_dlc/extractor/voicerepublic.py
@@ -0,0 +1,62 @@
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..compat import compat_str
+from ..utils import (
+    ExtractorError,
+    determine_ext,
+    int_or_none,
+    urljoin,
+)
+
+
+class VoiceRepublicIE(InfoExtractor):
+    _VALID_URL = r'https?://voicerepublic\.com/(?:talks|embed)/(?P<id>[0-9a-z-]+)'
+    _TESTS = [{
+        'url': 'http://voicerepublic.com/talks/watching-the-watchers-building-a-sousveillance-state',
+        'md5': 'b9174d651323f17783000876347116e3',
+        'info_dict': {
+            'id': '2296',
+            'display_id': 'watching-the-watchers-building-a-sousveillance-state',
+            'ext': 'm4a',
+            'title': 'Watching the Watchers: Building a Sousveillance State',
+            'description': 'Secret surveillance programs have metadata too. The people and companies that operate secret surveillance programs can be surveilled.',
+            'duration': 1556,
+            'view_count': int,
+        }
+    }, {
+        'url': 'http://voicerepublic.com/embed/watching-the-watchers-building-a-sousveillance-state',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, display_id)
+
+        if '>Queued for processing, please stand by...<' in webpage:
+            raise ExtractorError(
+                'Audio is still queued for processing', expected=True)
+
+        talk = self._parse_json(self._search_regex(
+            r'initialSnapshot\s*=\s*({.+?});',
+            webpage, 'talk'), display_id)['talk']
+        title = talk['title']
+        formats = [{
+            'url': urljoin(url, talk_url),
+            'format_id': format_id,
+            'ext': determine_ext(talk_url) or format_id,
+            'vcodec': 'none',
+        } for format_id, talk_url in talk['media_links'].items()]
+        self._sort_formats(formats)
+
+        return {
+            'id': compat_str(talk.get('id') or display_id),
+            'display_id': display_id,
+            'title': title,
+            'description': talk.get('teaser'),
+            'thumbnail': talk.get('image_url'),
+            'duration': int_or_none(talk.get('archived_duration')),
+            'view_count': int_or_none(talk.get('play_count')),
+            'formats': formats,
+        }
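Review note: the new extractor reads the talk metadata from an inline `initialSnapshot = {...};` assignment and turns each `media_links` entry into an audio-only format. A rough standalone sketch of those two steps follows. `extract_talk` and `build_formats` are hypothetical names, the extension guess is a simplification of `determine_ext`, and the sample page used in any test is invented.

```python
import json
import posixpath
import re
from urllib.parse import urljoin, urlparse


def extract_talk(webpage):
    """Pull the 'talk' object out of an inline initialSnapshot script."""
    match = re.search(r'initialSnapshot\s*=\s*({.+?});', webpage)
    if not match:
        raise ValueError('initialSnapshot not found')
    return json.loads(match.group(1))['talk']


def build_formats(page_url, media_links):
    """Map {format_id: relative_or_absolute_url} to audio-only formats."""
    formats = []
    for format_id, talk_url in media_links.items():
        # Simplified extension guess; the extractor itself uses determine_ext().
        ext = posixpath.splitext(urlparse(talk_url).path)[1].lstrip('.')
        formats.append({
            'url': urljoin(page_url, talk_url),  # resolves relative links
            'format_id': format_id,
            'ext': ext or format_id,
            'vcodec': 'none',  # marks the format as audio-only
        })
    return formats
```

The lazy `{.+?});` match ends at the first `}` that is directly followed by `;`, so it relies on the snapshot being serialized on a single line without `};` inside string values.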
diff --git a/youtube_dl/extractor/voot.py b/youtube_dlc/extractor/voot.py
similarity index 100%
rename from youtube_dl/extractor/voot.py
rename to youtube_dlc/extractor/voot.py
diff --git a/youtube_dl/extractor/voxmedia.py b/youtube_dlc/extractor/voxmedia.py
similarity index 100%
rename from youtube_dl/extractor/voxmedia.py
rename to youtube_dlc/extractor/voxmedia.py
diff --git a/youtube_dl/extractor/vrak.py b/youtube_dlc/extractor/vrak.py
similarity index 100%
rename from youtube_dl/extractor/vrak.py
rename to youtube_dlc/extractor/vrak.py
diff --git a/youtube_dl/extractor/vrt.py b/youtube_dlc/extractor/vrt.py
similarity index 92%
rename from youtube_dl/extractor/vrt.py
rename to youtube_dlc/extractor/vrt.py
index 422025267573fb655230d434edcb5cb2e2d6f2d2..2b65d2e5f395e5eeab0b72cd8ec45ee17229b98e 100644
--- a/youtube_dl/extractor/vrt.py
+++ b/youtube_dlc/extractor/vrt.py
@@ -55,13 +55,13 @@ def _real_extract(self, url):
          site, display_id = re.match(self._VALID_URL, url).groups()
          webpage = self._download_webpage(url, display_id)
          attrs = extract_attributes(self._search_regex(
-            r'(<[^>]+class="vrtvideo"[^>]*>)', webpage, 'vrt video'))
+            r'(<[^>]+class="vrtvideo( [^"]*)?"[^>]*>)', webpage, 'vrt video'))
  
-        asset_id = attrs['data-videoid']
-        publication_id = attrs.get('data-publicationid')
+        asset_id = attrs['data-video-id']
+        publication_id = attrs.get('data-publication-id')
          if publication_id:
              asset_id = publication_id + '$' + asset_id
-        client = attrs.get('data-client') or self._CLIENT_MAP[site]
+        client = attrs.get('data-client-code') or self._CLIENT_MAP[site]
  
          title = strip_or_none(get_element_by_class(
              'vrt-title', webpage) or self._html_search_meta(
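Review note: the VRT hunk relaxes the tag regex so that `vrtvideo` may be followed by further class names. The comparison below uses both patterns from the diff; the sample tags are invented for illustration.

```python
import re

# Old and new patterns, copied from the removed and added lines.
OLD_RE = r'(<[^>]+class="vrtvideo"[^>]*>)'
NEW_RE = r'(<[^>]+class="vrtvideo( [^"]*)?"[^>]*>)'

# Hypothetical markup: one tag with a single class, one with extra classes.
PLAIN_TAG = '<div class="vrtvideo" data-video-id="abc">'
MULTI_TAG = '<div class="vrtvideo vrtvideo--large" data-video-id="abc">'


def find_tag(pattern, html):
    """Return the matched opening tag, or None."""
    m = re.search(pattern, html)
    return m.group(1) if m else None
```

The new pattern still requires `vrtvideo` to be the first class in the attribute, which matches the hunk as written.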
diff --git a/youtube_dl/extractor/vrv.py b/youtube_dlc/extractor/vrv.py
similarity index 100%
rename from youtube_dl/extractor/vrv.py
rename to youtube_dlc/extractor/vrv.py
diff --git a/youtube_dl/extractor/vshare.py b/youtube_dlc/extractor/vshare.py
similarity index 100%
rename from youtube_dl/extractor/vshare.py
rename to youtube_dlc/extractor/vshare.py
diff --git a/youtube_dl/extractor/vube.py b/youtube_dlc/extractor/vube.py
similarity index 100%
rename from youtube_dl/extractor/vube.py
rename to youtube_dlc/extractor/vube.py
diff --git a/youtube_dl/extractor/vuclip.py b/youtube_dlc/extractor/vuclip.py
similarity index 100%
rename from youtube_dl/extractor/vuclip.py
rename to youtube_dlc/extractor/vuclip.py
diff --git a/youtube_dl/extractor/vvvvid.py b/youtube_dlc/extractor/vvvvid.py
similarity index 100%
rename from youtube_dl/extractor/vvvvid.py
rename to youtube_dlc/extractor/vvvvid.py
diff --git a/youtube_dl/extractor/vyborymos.py b/youtube_dlc/extractor/vyborymos.py
similarity index 100%
rename from youtube_dl/extractor/vyborymos.py
rename to youtube_dlc/extractor/vyborymos.py
diff --git a/youtube_dl/extractor/vzaar.py b/youtube_dlc/extractor/vzaar.py
similarity index 100%
rename from youtube_dl/extractor/vzaar.py
rename to youtube_dlc/extractor/vzaar.py
diff --git a/youtube_dl/extractor/wakanim.py b/youtube_dlc/extractor/wakanim.py
similarity index 100%
rename from youtube_dl/extractor/wakanim.py
rename to youtube_dlc/extractor/wakanim.py
diff --git a/youtube_dl/extractor/walla.py b/youtube_dlc/extractor/walla.py
similarity index 100%
rename from youtube_dl/extractor/walla.py
rename to youtube_dlc/extractor/walla.py
diff --git a/youtube_dl/extractor/washingtonpost.py b/youtube_dlc/extractor/washingtonpost.py
similarity index 100%
rename from youtube_dl/extractor/washingtonpost.py
rename to youtube_dlc/extractor/washingtonpost.py
diff --git a/youtube_dl/extractor/wat.py b/youtube_dlc/extractor/wat.py
similarity index 100%
rename from youtube_dl/extractor/wat.py
rename to youtube_dlc/extractor/wat.py
diff --git a/youtube_dl/extractor/watchbox.py b/youtube_dlc/extractor/watchbox.py
similarity index 100%
rename from youtube_dl/extractor/watchbox.py
rename to youtube_dlc/extractor/watchbox.py
diff --git a/youtube_dl/extractor/watchindianporn.py b/youtube_dlc/extractor/watchindianporn.py
similarity index 100%
rename from youtube_dl/extractor/watchindianporn.py
rename to youtube_dlc/extractor/watchindianporn.py
diff --git a/youtube_dl/extractor/wdr.py b/youtube_dlc/extractor/wdr.py
similarity index 97%
rename from youtube_dl/extractor/wdr.py
rename to youtube_dlc/extractor/wdr.py
index cf6f7c7ed6ab5bce442a6cb2079baf687c324417..44d4a13cac006ac650c6a07b1a604729c25c5205 100644
--- a/youtube_dl/extractor/wdr.py
+++ b/youtube_dlc/extractor/wdr.py
@@ -45,9 +45,18 @@ def _real_extract(self, url):
          media_resource = metadata['mediaResource']
  
          formats = []
+        subtitles = {}
  
          # check if the metadata contains a direct URL to a file
          for kind, media_resource in media_resource.items():
+            if kind == 'captionsHash':
+                for ext, url in media_resource.items():
+                    subtitles.setdefault('de', []).append({
+                        'url': url,
+                        'ext': ext,
+                    })
+                continue
+
              if kind not in ('dflt', 'alt'):
                  continue
  
@@ -81,14 +90,6 @@ def _real_extract(self, url):
  
          self._sort_formats(formats)
  
-        subtitles = {}
-        caption_url = media_resource.get('captionURL')
-        if caption_url:
-            subtitles['de'] = [{
-                'url': caption_url,
-                'ext': 'ttml',
-            }]
-
          title = tracker_data['trackerClipTitle']
  
          return {
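Review note: the WDR change replaces the single `captionURL` lookup with a loop over `captionsHash`, appending one subtitle entry per extension under the `'de'` key. A small sketch of that accumulation follows. `collect_subtitles` is a hypothetical name and the extension keys used in any example are assumptions, not values confirmed by the diff.

```python
def collect_subtitles(media_resource):
    """Collect all captionsHash entries as German subtitle tracks."""
    subtitles = {}
    captions = media_resource.get('captionsHash') or {}
    for ext, caption_url in captions.items():
        # setdefault creates the 'de' list on first use, then appends to it,
        # so several caption formats end up as alternatives for one language.
        subtitles.setdefault('de', []).append({
            'url': caption_url,
            'ext': ext,
        })
    return subtitles
```

The sketch names the loop variable `caption_url` to avoid rebinding the outer `url`; the hunk itself reuses `url` as the loop variable.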
diff --git a/youtube_dl/extractor/webcaster.py b/youtube_dlc/extractor/webcaster.py
similarity index 100%
rename from youtube_dl/extractor/webcaster.py
rename to youtube_dlc/extractor/webcaster.py
diff --git a/youtube_dl/extractor/webofstories.py b/youtube_dlc/extractor/webofstories.py
similarity index 100%
rename from youtube_dl/extractor/webofstories.py
rename to youtube_dlc/extractor/webofstories.py
diff --git a/youtube_dl/extractor/weibo.py b/youtube_dlc/extractor/weibo.py
similarity index 100%
rename from youtube_dl/extractor/weibo.py
rename to youtube_dlc/extractor/weibo.py
diff --git a/youtube_dl/extractor/weiqitv.py b/youtube_dlc/extractor/weiqitv.py
similarity index 100%
rename from youtube_dl/extractor/weiqitv.py
rename to youtube_dlc/extractor/weiqitv.py
diff --git a/youtube_dlc/extractor/wistia.py b/youtube_dlc/extractor/wistia.py
new file mode 100644
index 0000000..77febd2
--- /dev/null
+++ b/youtube_dlc/extractor/wistia.py
@@ -0,0 +1,162 @@
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..utils import (
+    ExtractorError,
+    int_or_none,
+    float_or_none,
+    unescapeHTML,
+)
+
+
+class WistiaIE(InfoExtractor):
+    _VALID_URL = r'(?:wistia:|https?://(?:fast\.)?wistia\.(?:net|com)/embed/(?:iframe|medias)/)(?P<id>[a-z0-9]{10})'
+    _EMBED_BASE_URL = 'http://fast.wistia.com/embed/'
+
+    _TESTS = [{
+        'url': 'http://fast.wistia.net/embed/iframe/sh7fpupwlt',
+        'md5': 'cafeb56ec0c53c18c97405eecb3133df',
+        'info_dict': {
+            'id': 'sh7fpupwlt',
+            'ext': 'mov',
+            'title': 'Being Resourceful',
+            'description': 'a Clients From Hell Video Series video from worldwidewebhosting',
+            'upload_date': '20131204',
+            'timestamp': 1386185018,
+            'duration': 117,
+        },
+    }, {
+        'url': 'wistia:sh7fpupwlt',
+        'only_matching': True,
+    }, {
+        # with hls video
+        'url': 'wistia:807fafadvk',
+        'only_matching': True,
+    }, {
+        'url': 'http://fast.wistia.com/embed/iframe/sh7fpupwlt',
+        'only_matching': True,
+    }, {
+        'url': 'http://fast.wistia.net/embed/medias/sh7fpupwlt.json',
+        'only_matching': True,
+    }]
+
+    # https://wistia.com/support/embed-and-share/video-on-your-website
+    @staticmethod
+    def _extract_url(webpage):
+        urls = WistiaIE._extract_urls(webpage)
+        return urls[0] if urls else None
+
+    @staticmethod
+    def _extract_urls(webpage):
+        urls = []
+        for match in re.finditer(
+                r'<(?:meta[^>]+?content|(?:iframe|script)[^>]+?src)=["\'](?P<url>(?:https?:)?//(?:fast\.)?wistia\.(?:net|com)/embed/(?:iframe|medias)/[a-z0-9]{10})', webpage):
+            urls.append(unescapeHTML(match.group('url')))
+        for match in re.finditer(
+                r'''(?sx)
+                    <div[^>]+class=(["'])(?:(?!\1).)*?\bwistia_async_(?P<id>[a-z0-9]{10})\b(?:(?!\1).)*?\1
+                ''', webpage):
+            urls.append('wistia:%s' % match.group('id'))
+        for match in re.finditer(r'(?:data-wistia-?id=["\']|Wistia\.embed\(["\']|id=["\']wistia_)(?P<id>[a-z0-9]{10})', webpage):
+            urls.append('wistia:%s' % match.group('id'))
+        return urls
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        data_json = self._download_json(
+            self._EMBED_BASE_URL + 'medias/%s.json' % video_id, video_id,
+            # Some videos require this.
+            headers={
+                'Referer': url if url.startswith('http') else self._EMBED_BASE_URL + 'iframe/' + video_id,
+            })
+
+        if data_json.get('error'):
+            raise ExtractorError(
+                'Error while getting the playlist', expected=True)
+
+        data = data_json['media']
+        title = data['name']
+
+        formats = []
+        thumbnails = []
+        for a in data['assets']:
+            aurl = a.get('url')
+            if not aurl:
+                continue
+            astatus = a.get('status')
+            atype = a.get('type')
+            if (astatus is not None and astatus != 2) or atype in ('preview', 'storyboard'):
+                continue
+            elif atype in ('still', 'still_image'):
+                thumbnails.append({
+                    'url': aurl,
+                    'width': int_or_none(a.get('width')),
+                    'height': int_or_none(a.get('height')),
+                    'filesize': int_or_none(a.get('size')),
+                })
+            else:
+                aext = a.get('ext')
+                display_name = a.get('display_name')
+                format_id = atype
+                if atype and atype.endswith('_video') and display_name:
+                    format_id = '%s-%s' % (atype[:-6], display_name)
+                f = {
+                    'format_id': format_id,
+                    'url': aurl,
+                    'tbr': int_or_none(a.get('bitrate')) or None,
+                    'preference': 1 if atype == 'original' else None,
+                }
+                if display_name == 'Audio':
+                    f.update({
+                        'vcodec': 'none',
+                    })
+                else:
+                    f.update({
+                        'width': int_or_none(a.get('width')),
+                        'height': int_or_none(a.get('height')),
+                        'vcodec': a.get('codec'),
+                    })
+                if a.get('container') == 'm3u8' or aext == 'm3u8':
+                    ts_f = f.copy()
+                    ts_f.update({
+                        'ext': 'ts',
+                        'format_id': f['format_id'].replace('hls-', 'ts-'),
+                        'url': f['url'].replace('.bin', '.ts'),
+                    })
+                    formats.append(ts_f)
+                    f.update({
+                        'ext': 'mp4',
+                        'protocol': 'm3u8_native',
+                    })
+                else:
+                    f.update({
+                        'container': a.get('container'),
+                        'ext': aext,
+                        'filesize': int_or_none(a.get('size')),
+                    })
+                formats.append(f)
+
+        self._sort_formats(formats)
+
+        subtitles = {}
+        for caption in data.get('captions', []):
+            language = caption.get('language')
+            if not language:
+                continue
+            subtitles[language] = [{
+                'url': self._EMBED_BASE_URL + 'captions/' + video_id + '.vtt?language=' + language,
+            }]
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': data.get('seoDescription'),
+            'formats': formats,
+            'thumbnails': thumbnails,
+            'duration': float_or_none(data.get('duration')),
+            'timestamp': int_or_none(data.get('createdAt')),
+            'subtitles': subtitles,
+        }
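Review note: `_extract_urls` recognises three embed styles (iframe/script/meta URLs, `wistia_async_<id>` classes, and id-bearing attributes or `Wistia.embed(...)` calls). The standalone function below reuses the three patterns from the diff to show what each one yields. `extract_wistia_urls` is a hypothetical name, the HTML-unescaping step is omitted, and the sample markup in any test is invented.

```python
import re


def extract_wistia_urls(webpage):
    """Return embed URLs and wistia:<id> pseudo-URLs found in the page."""
    urls = []
    # 1) Direct embed URLs in <meta content=...>, <iframe src=...>, <script src=...>.
    for match in re.finditer(
            r'<(?:meta[^>]+?content|(?:iframe|script)[^>]+?src)=["\'](?P<url>(?:https?:)?//(?:fast\.)?wistia\.(?:net|com)/embed/(?:iframe|medias)/[a-z0-9]{10})', webpage):
        urls.append(match.group('url'))
    # 2) Async embeds: <div class="... wistia_async_<id> ...">.
    for match in re.finditer(
            r'''(?sx)
                <div[^>]+class=(["'])(?:(?!\1).)*?\bwistia_async_(?P<id>[a-z0-9]{10})\b(?:(?!\1).)*?\1
            ''', webpage):
        urls.append('wistia:%s' % match.group('id'))
    # 3) data-wistia-id attributes, Wistia.embed("<id>") calls, id="wistia_<id>".
    for match in re.finditer(r'(?:data-wistia-?id=["\']|Wistia\.embed\(["\']|id=["\']wistia_)(?P<id>[a-z0-9]{10})', webpage):
        urls.append('wistia:%s' % match.group('id'))
    return urls
```

The `wistia:<id>` form is what lets the first alternative of `_VALID_URL` route these hits back to the same extractor.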
diff --git a/youtube_dl/extractor/worldstarhiphop.py b/youtube_dlc/extractor/worldstarhiphop.py
similarity index 100%
rename from youtube_dl/extractor/worldstarhiphop.py
rename to youtube_dlc/extractor/worldstarhiphop.py
diff --git a/youtube_dl/extractor/wsj.py b/youtube_dlc/extractor/wsj.py
similarity index 100%
rename from youtube_dl/extractor/wsj.py
rename to youtube_dlc/extractor/wsj.py
diff --git a/youtube_dl/extractor/wwe.py b/youtube_dlc/extractor/wwe.py
similarity index 100%
rename from youtube_dl/extractor/wwe.py
rename to youtube_dlc/extractor/wwe.py
diff --git a/youtube_dl/extractor/xbef.py b/youtube_dlc/extractor/xbef.py
similarity index 100%
rename from youtube_dl/extractor/xbef.py
rename to youtube_dlc/extractor/xbef.py
diff --git a/youtube_dl/extractor/xboxclips.py b/youtube_dlc/extractor/xboxclips.py
similarity index 100%
rename from youtube_dl/extractor/xboxclips.py
rename to youtube_dlc/extractor/xboxclips.py
diff --git a/youtube_dl/extractor/xfileshare.py b/youtube_dlc/extractor/xfileshare.py
similarity index 100%
rename from youtube_dl/extractor/xfileshare.py
rename to youtube_dlc/extractor/xfileshare.py
diff --git a/youtube_dl/extractor/xhamster.py b/youtube_dlc/extractor/xhamster.py
similarity index 94%
rename from youtube_dl/extractor/xhamster.py
rename to youtube_dlc/extractor/xhamster.py
index a5b94d2794166d452464b728ec40b2b258459c64..76aeaf9a46a6f67a054bfbc3313b8e6d4309a7f2 100644
--- a/youtube_dl/extractor/xhamster.py
+++ b/youtube_dlc/extractor/xhamster.py
@@ -20,13 +20,13 @@
  
  
  class XHamsterIE(InfoExtractor):
-    _DOMAINS = r'(?:xhamster\.(?:com|one|desi)|xhms\.pro|xhamster[27]\.com)'
+    _DOMAINS = r'(?:xhamster\.(?:com|one|desi)|xhms\.pro|xhamster\d+\.com)'
      _VALID_URL = r'''(?x)
                      https?://
                          (?:.+?\.)?%s/
                          (?:
-                            movies/(?P<id>\d+)/(?P<display_id>[^/]*)\.html|
-                            videos/(?P<display_id_2>[^/]*)-(?P<id_2>\d+)
+                            movies/(?P<id>[\dA-Za-z]+)/(?P<display_id>[^/]*)\.html|
+                            videos/(?P<display_id_2>[^/]*)-(?P<id_2>[\dA-Za-z]+)
                          )
                      ''' % _DOMAINS
      _TESTS = [{
@@ -99,12 +99,21 @@ class XHamsterIE(InfoExtractor):
      }, {
          'url': 'https://xhamster2.com/videos/femaleagent-shy-beauty-takes-the-bait-1509445',
          'only_matching': True,
+    }, {
+        'url': 'https://xhamster11.com/videos/femaleagent-shy-beauty-takes-the-bait-1509445',
+        'only_matching': True,
+    }, {
+        'url': 'https://xhamster26.com/videos/femaleagent-shy-beauty-takes-the-bait-1509445',
+        'only_matching': True,
      }, {
          'url': 'http://xhamster.com/movies/1509445/femaleagent_shy_beauty_takes_the_bait.html',
          'only_matching': True,
      }, {
          'url': 'http://xhamster.com/movies/2221348/britney_spears_sexy_booty.html?hd',
          'only_matching': True,
+    }, {
+        'url': 'http://de.xhamster.com/videos/skinny-girl-fucks-herself-hard-in-the-forest-xhnBJZx',
+        'only_matching': True,
      }]
  
      def _real_extract(self, url):
@@ -113,7 +122,7 @@ def _real_extract(self, url):
          display_id = mobj.group('display_id') or mobj.group('display_id_2')
  
          desktop_url = re.sub(r'^(https?://(?:.+?\.)?)m\.', r'\1', url)
-        webpage = self._download_webpage(desktop_url, video_id)
+        webpage, urlh = self._download_webpage_handle(desktop_url, video_id)
  
          error = self._html_search_regex(
              r'<div[^>]+id=["\']videoClosed["\'][^>]*>(.+?)</div>',
@@ -129,7 +138,8 @@ def get_height(s):
  
          initials = self._parse_json(
              self._search_regex(
-                r'window\.initials\s*=\s*({.+?})\s*;\s*\n', webpage, 'initials',
+                (r'window\.initials\s*=\s*({.+?})\s*;\s*</script>',
+                 r'window\.initials\s*=\s*({.+?})\s*;'), webpage, 'initials',
                  default='{}'),
              video_id, fatal=False)
          if initials:
@@ -161,6 +171,9 @@ def get_height(s):
                          'ext': determine_ext(format_url, 'mp4'),
                          'height': get_height(quality),
                          'filesize': filesize,
+                        'http_headers': {
+                            'Referer': urlh.geturl(),
+                        },
                      })
              self._sort_formats(formats)
  
diff --git a/youtube_dl/extractor/xiami.py b/youtube_dlc/extractor/xiami.py
similarity index 100%
rename from youtube_dl/extractor/xiami.py
rename to youtube_dlc/extractor/xiami.py
diff --git a/youtube_dl/extractor/ximalaya.py b/youtube_dlc/extractor/ximalaya.py
similarity index 100%
rename from youtube_dl/extractor/ximalaya.py
rename to youtube_dlc/extractor/ximalaya.py
diff --git a/youtube_dl/extractor/xminus.py b/youtube_dlc/extractor/xminus.py
similarity index 100%
rename from youtube_dl/extractor/xminus.py
rename to youtube_dlc/extractor/xminus.py
diff --git a/youtube_dl/extractor/xnxx.py b/youtube_dlc/extractor/xnxx.py
similarity index 100%
rename from youtube_dl/extractor/xnxx.py
rename to youtube_dlc/extractor/xnxx.py
diff --git a/youtube_dl/extractor/xstream.py b/youtube_dlc/extractor/xstream.py
similarity index 100%
rename from youtube_dl/extractor/xstream.py
rename to youtube_dlc/extractor/xstream.py
diff --git a/youtube_dl/extractor/xtube.py b/youtube_dlc/extractor/xtube.py
similarity index 76%
rename from youtube_dl/extractor/xtube.py
rename to youtube_dlc/extractor/xtube.py
index c6c0b3291c8320064fa0a7529be5b5d78f14461c..01b253dcb1e8c92232a06c0b2b4153a545dabcc1 100644
--- a/youtube_dl/extractor/xtube.py
+++ b/youtube_dlc/extractor/xtube.py
@@ -47,7 +47,7 @@ class XTubeIE(InfoExtractor):
              'display_id': 'A-Super-Run-Part-1-YT',
              'ext': 'flv',
              'title': 'A Super Run - Part 1 (YT)',
-            'description': 'md5:ca0d47afff4a9b2942e4b41aa970fd93',
+            'description': 'md5:4cc3af1aa1b0413289babc88f0d4f616',
              'uploader': 'tshirtguy59',
              'duration': 579,
              'view_count': int,
@@ -87,10 +87,24 @@ def _real_extract(self, url):
                  'Cookie': 'age_verified=1; cookiesAccepted=1',
              })
  
-        sources = self._parse_json(self._search_regex(
-            r'(["\'])?sources\1?\s*:\s*(?P<sources>{.+?}),',
-            webpage, 'sources', group='sources'), video_id,
-            transform_source=js_to_json)
+        title, thumbnail, duration, sources = [None] * 4
+
+        config = self._parse_json(self._search_regex(
+            r'playerConf\s*=\s*({.+?})\s*,\s*\n', webpage, 'config',
+            default='{}'), video_id, transform_source=js_to_json, fatal=False)
+        if config:
+            config = config.get('mainRoll')
+            if isinstance(config, dict):
+                title = config.get('title')
+                thumbnail = config.get('poster')
+                duration = int_or_none(config.get('duration'))
+                sources = config.get('sources') or config.get('format')
+
+        if not isinstance(sources, dict):
+            sources = self._parse_json(self._search_regex(
+                r'(["\'])?sources\1?\s*:\s*(?P<sources>{.+?}),',
+                webpage, 'sources', group='sources'), video_id,
+                transform_source=js_to_json)
  
          formats = []
          for format_id, format_url in sources.items():
@@ -102,20 +116,25 @@ def _real_extract(self, url):
          self._remove_duplicate_formats(formats)
          self._sort_formats(formats)
  
-        title = self._search_regex(
-            (r'<h1>\s*(?P<title>[^<]+?)\s*</h1>', r'videoTitle\s*:\s*(["\'])(?P<title>.+?)\1'),
-            webpage, 'title', group='title')
-        description = self._search_regex(
+        if not title:
+            title = self._search_regex(
+                (r'<h1>\s*(?P<title>[^<]+?)\s*</h1>', r'videoTitle\s*:\s*(["\'])(?P<title>.+?)\1'),
+                webpage, 'title', group='title')
+        description = self._og_search_description(
+            webpage, default=None) or self._html_search_meta(
+            'twitter:description', webpage, default=None) or self._search_regex(
              r'</h1>\s*<p>([^<]+)', webpage, 'description', fatal=False)
          uploader = self._search_regex(
              (r'<input[^>]+name="contentOwnerId"[^>]+value="([^"]+)"',
               r'<span[^>]+class="nickname"[^>]*>([^<]+)'),
              webpage, 'uploader', fatal=False)
-        duration = parse_duration(self._search_regex(
-            r'<dt>Runtime:?</dt>\s*<dd>([^<]+)</dd>',
-            webpage, 'duration', fatal=False))
+        if not duration:
+            duration = parse_duration(self._search_regex(
+                r'<dt>Runtime:?</dt>\s*<dd>([^<]+)</dd>',
+                webpage, 'duration', fatal=False))
          view_count = str_to_int(self._search_regex(
-            r'<dt>Views:?</dt>\s*<dd>([\d,\.]+)</dd>',
+            (r'["\']viewsCount["\'][^>]*>(\d+)\s+views',
+             r'<dt>Views:?</dt>\s*<dd>([\d,\.]+)</dd>'),
              webpage, 'view count', fatal=False))
          comment_count = str_to_int(self._html_search_regex(
              r'>Comments? \(([\d,\.]+)\)<',
@@ -126,6 +145,7 @@ def _real_extract(self, url):
              'display_id': display_id,
              'title': title,
              'description': description,
+            'thumbnail': thumbnail,
              'uploader': uploader,
              'duration': duration,
              'view_count': view_count,
@@ -144,7 +164,7 @@ class XTubeUserIE(InfoExtractor):
              'id': 'greenshowers-4056496',
              'age_limit': 18,
          },
-        'playlist_mincount': 155,
+        'playlist_mincount': 154,
      }
  
      def _real_extract(self, url):
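The xtube hunks above read metadata from the player config first and fall back to page markup only when a value is missing. A minimal standalone sketch of that pattern follows, assuming plain `re` and `json` in place of the extractor helpers (`_search_regex`, `_parse_json`, `js_to_json`); the function name `extract_metadata` is hypothetical and not part of the commit.

```python
import json
import re


def extract_metadata(webpage):
    # Give every name a default first, so a missing config cannot leave one unbound.
    title = duration = sources = None

    m = re.search(r'playerConf\s*=\s*({.+?})\s*,\s*\n', webpage)
    if m:
        try:
            parsed = json.loads(m.group(1))
        except ValueError:
            parsed = None
        config = parsed.get('mainRoll') if isinstance(parsed, dict) else None
        if isinstance(config, dict):
            title = config.get('title')
            duration = config.get('duration')
            sources = config.get('sources') or config.get('format')

    # Fall back to page markup only for values the config did not supply.
    if not title:
        m = re.search(r'<h1>\s*([^<]+?)\s*</h1>', webpage)
        title = m.group(1) if m else None

    return {'title': title, 'duration': duration, 'sources': sources}
```

Initialising `sources` together with the other names keeps the later `isinstance(sources, dict)` style check safe when the page carries no player config at all.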
diff --git a/youtube_dl/extractor/xuite.py b/youtube_dlc/extractor/xuite.py
similarity index 100%
rename from youtube_dl/extractor/xuite.py
rename to youtube_dlc/extractor/xuite.py
diff --git a/youtube_dl/extractor/xvideos.py b/youtube_dlc/extractor/xvideos.py
similarity index 100%
rename from youtube_dl/extractor/xvideos.py
rename to youtube_dlc/extractor/xvideos.py
diff --git a/youtube_dl/extractor/xxxymovies.py b/youtube_dlc/extractor/xxxymovies.py
similarity index 100%
rename from youtube_dl/extractor/xxxymovies.py
rename to youtube_dlc/extractor/xxxymovies.py
diff --git a/youtube_dl/extractor/yahoo.py b/youtube_dlc/extractor/yahoo.py
similarity index 94%
rename from youtube_dl/extractor/yahoo.py
rename to youtube_dlc/extractor/yahoo.py
index 238d9cea0c729912351895e5bd6ad453d43b7d31..e4615376c428432f7035c2141d1cbecc738496cc 100644 (file)
--- a/youtube_dl/extractor/yahoo.py
+++ b/youtube_dlc/extractor/yahoo.py
@@ -12,6 +12,7 @@
  )
  from ..utils import (
      clean_html,
+    ExtractorError,
      int_or_none,
      mimetype2ext,
      parse_iso8601,
@@ -368,31 +369,47 @@ class YahooGyaOPlayerIE(InfoExtractor):
          'url': 'https://gyao.yahoo.co.jp/episode/%E3%81%8D%E3%81%AE%E3%81%86%E4%BD%95%E9%A3%9F%E3%81%B9%E3%81%9F%EF%BC%9F%20%E7%AC%AC2%E8%A9%B1%202019%2F4%2F12%E6%94%BE%E9%80%81%E5%88%86/5cb02352-b725-409e-9f8d-88f947a9f682',
          'only_matching': True,
      }]
+    _GEO_BYPASS = False
  
      def _real_extract(self, url):
          video_id = self._match_id(url).replace('/', ':')
-        video = self._download_json(
-            'https://gyao.yahoo.co.jp/dam/v1/videos/' + video_id,
-            video_id, query={
-                'fields': 'longDescription,title,videoId',
-            }, headers={
-                'X-User-Agent': 'Unknown Pc GYAO!/2.0.0 Web',
-            })
+        headers = self.geo_verification_headers()
+        headers['Accept'] = 'application/json'
+        resp = self._download_json(
+            'https://gyao.yahoo.co.jp/apis/playback/graphql', video_id, query={
+                'appId': 'dj00aiZpPUNJeDh2cU1RazU3UCZzPWNvbnN1bWVyc2VjcmV0Jng9NTk-',
+                'query': '''{
+  content(parameter: {contentId: "%s", logicaAgent: PC_WEB}) {
+    video {
+      delivery {
+        id
+      }
+      title
+    }
+  }
+}''' % video_id,
+            }, headers=headers)
+        content = resp['data']['content']
+        if not content:
+            msg = resp['errors'][0]['message']
+            if msg == 'not in japan':
+                self.raise_geo_restricted(countries=['JP'])
+            raise ExtractorError(msg)
+        video = content['video']
          return {
              '_type': 'url_transparent',
              'id': video_id,
              'title': video['title'],
              'url': smuggle_url(
-                'http://players.brightcove.net/4235717419001/SyG5P0gjb_default/index.html?videoId=' + video['videoId'],
+                'http://players.brightcove.net/4235717419001/SyG5P0gjb_default/index.html?videoId=' + video['delivery']['id'],
                  {'geo_countries': ['JP']}),
-            'description': video.get('longDescription'),
              'ie_key': BrightcoveNewIE.ie_key(),
          }
  
  
  class YahooGyaOIE(InfoExtractor):
      IE_NAME = 'yahoo:gyao'
-    _VALID_URL = r'https?://(?:gyao\.yahoo\.co\.jp/(?:p|title/[^/]+)|streaming\.yahoo\.co\.jp/p/y)/(?P<id>\d+/v\d+|[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})'
+    _VALID_URL = r'https?://(?:gyao\.yahoo\.co\.jp/(?:p|title(?:/[^/]+)?)|streaming\.yahoo\.co\.jp/p/y)/(?P<id>\d+/v\d+|[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})'
      _TESTS = [{
          'url': 'https://gyao.yahoo.co.jp/p/00449/v03102/',
          'info_dict': {
@@ -405,6 +422,9 @@ class YahooGyaOIE(InfoExtractor):
      }, {
          'url': 'https://gyao.yahoo.co.jp/title/%E3%81%97%E3%82%83%E3%81%B9%E3%81%8F%E3%82%8A007/5b025a49-b2e5-4dc7-945c-09c6634afacf',
          'only_matching': True,
+    }, {
+        'url': 'https://gyao.yahoo.co.jp/title/5b025a49-b2e5-4dc7-945c-09c6634afacf',
+        'only_matching': True,
      }]
  
      def _real_extract(self, url):
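The `_VALID_URL` change in the yahoo.py hunk above makes the title slug optional, so that a `title/<uuid>` URL without a slug also matches. The sketch below checks the old and new patterns side by side with plain `re.match`; `match_id` is a hypothetical helper and `some-title` is a made-up slug.

```python
import re

# Patterns copied from the removed and added _VALID_URL lines of the hunk.
OLD_VALID_URL = r'https?://(?:gyao\.yahoo\.co\.jp/(?:p|title/[^/]+)|streaming\.yahoo\.co\.jp/p/y)/(?P<id>\d+/v\d+|[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})'
NEW_VALID_URL = r'https?://(?:gyao\.yahoo\.co\.jp/(?:p|title(?:/[^/]+)?)|streaming\.yahoo\.co\.jp/p/y)/(?P<id>\d+/v\d+|[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})'


def match_id(pattern, url):
    """Return the captured id, or None when the URL does not match."""
    m = re.match(pattern, url)
    return m.group('id') if m else None


uuid = '5b025a49-b2e5-4dc7-945c-09c6634afacf'
without_slug = 'https://gyao.yahoo.co.jp/title/' + uuid
with_slug = 'https://gyao.yahoo.co.jp/title/some-title/' + uuid
```

With these inputs, the old pattern rejects `without_slug` while the new one returns the UUID for both URLs.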
diff --git a/youtube_dl/extractor/yandexdisk.py b/youtube_dlc/extractor/yandexdisk.py
similarity index 100%
rename from youtube_dl/extractor/yandexdisk.py
rename to youtube_dlc/extractor/yandexdisk.py
diff --git a/youtube_dl/extractor/yandexmusic.py b/youtube_dlc/extractor/yandexmusic.py
similarity index 99%
rename from youtube_dl/extractor/yandexmusic.py
rename to youtube_dlc/extractor/yandexmusic.py
index 08d35e04c968a497bc28f4c3207e8d63432bd8d4..4358bc83669ffab11e5dab39444ddce1acdab2cd 100644 (file)
--- a/youtube_dl/extractor/yandexmusic.py
+++ b/youtube_dlc/extractor/yandexmusic.py
@@ -27,12 +27,12 @@ def _handle_error(response):
      @staticmethod
      def _raise_captcha():
          raise ExtractorError(
-            'YandexMusic has considered youtube-dl requests automated and '
+            'YandexMusic has considered youtube-dlc requests automated and '
              'asks you to solve a CAPTCHA. You can either wait for some '
              'time until unblocked and optionally use --sleep-interval '
              'in future or alternatively you can go to https://music.yandex.ru/ '
              'solve CAPTCHA, then export cookies and pass cookie file to '
-            'youtube-dl with --cookies',
+            'youtube-dlc with --cookies',
              expected=True)
  
      def _download_webpage_handle(self, *args, **kwargs):
diff --git a/youtube_dl/extractor/yandexvideo.py b/youtube_dlc/extractor/yandexvideo.py
similarity index 100%
rename from youtube_dl/extractor/yandexvideo.py
rename to youtube_dlc/extractor/yandexvideo.py
diff --git a/youtube_dl/extractor/yapfiles.py b/youtube_dlc/extractor/yapfiles.py
similarity index 100%
rename from youtube_dl/extractor/yapfiles.py
rename to youtube_dlc/extractor/yapfiles.py
diff --git a/youtube_dl/extractor/yesjapan.py b/youtube_dlc/extractor/yesjapan.py
similarity index 100%
rename from youtube_dl/extractor/yesjapan.py
rename to youtube_dlc/extractor/yesjapan.py
diff --git a/youtube_dl/extractor/yinyuetai.py b/youtube_dlc/extractor/yinyuetai.py
similarity index 100%
rename from youtube_dl/extractor/yinyuetai.py
rename to youtube_dlc/extractor/yinyuetai.py
diff --git a/youtube_dl/extractor/ynet.py b/youtube_dlc/extractor/ynet.py
similarity index 100%
rename from youtube_dl/extractor/ynet.py
rename to youtube_dlc/extractor/ynet.py
diff --git a/youtube_dl/extractor/youjizz.py b/youtube_dlc/extractor/youjizz.py
similarity index 97%
rename from youtube_dl/extractor/youjizz.py
rename to youtube_dlc/extractor/youjizz.py
index dff69fcb7aca250373fc0e70b2f8278ed2661755..88aabd272c98e944523f3b333174342ba23c9fe1 100644 (file)
--- a/youtube_dl/extractor/youjizz.py
+++ b/youtube_dlc/extractor/youjizz.py
@@ -44,7 +44,7 @@ def _real_extract(self, url):
  
          encodings = self._parse_json(
              self._search_regex(
-                r'encodings\s*=\s*(\[.+?\]);\n', webpage, 'encodings',
+                r'[Ee]ncodings\s*=\s*(\[.+?\]);\n', webpage, 'encodings',
                  default='[]'),
              video_id, fatal=False)
          for encoding in encodings:
diff --git a/youtube_dl/extractor/youku.py b/youtube_dlc/extractor/youku.py
similarity index 100%
rename from youtube_dl/extractor/youku.py
rename to youtube_dlc/extractor/youku.py
diff --git a/youtube_dl/extractor/younow.py b/youtube_dlc/extractor/younow.py
similarity index 100%
rename from youtube_dl/extractor/younow.py
rename to youtube_dlc/extractor/younow.py
diff --git a/youtube_dl/extractor/youporn.py b/youtube_dlc/extractor/youporn.py
similarity index 90%
rename from youtube_dl/extractor/youporn.py
rename to youtube_dlc/extractor/youporn.py
index d4eccb4b2a48efafec0232a451b3ee617e6bc859..e7fca22dec9a17b10ef76efcd43c4b9e0e4da208 100644 (file)
--- a/youtube_dl/extractor/youporn.py
+++ b/youtube_dlc/extractor/youporn.py
@@ -5,7 +5,6 @@
  from .common import InfoExtractor
  from ..utils import (
      int_or_none,
-    sanitized_Request,
      str_to_int,
      unescapeHTML,
      unified_strdate,
@@ -15,7 +14,7 @@
  
  
  class YouPornIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?youporn\.com/watch/(?P<id>\d+)/(?P<display_id>[^/?#&]+)'
+    _VALID_URL = r'https?://(?:www\.)?youporn\.com/(?:watch|embed)/(?P<id>\d+)(?:/(?P<display_id>[^/?#&]+))?'
      _TESTS = [{
          'url': 'http://www.youporn.com/watch/505835/sex-ed-is-it-safe-to-masturbate-daily/',
          'md5': '3744d24c50438cf5b6f6d59feb5055c2',
@@ -57,16 +56,28 @@ class YouPornIE(InfoExtractor):
          'params': {
              'skip_download': True,
          },
+    }, {
+        'url': 'https://www.youporn.com/embed/505835/sex-ed-is-it-safe-to-masturbate-daily/',
+        'only_matching': True,
+    }, {
+        'url': 'http://www.youporn.com/watch/505835',
+        'only_matching': True,
      }]
  
+    @staticmethod
+    def _extract_urls(webpage):
+        return re.findall(
+            r'<iframe[^>]+\bsrc=["\']((?:https?:)?//(?:www\.)?youporn\.com/embed/\d+)',
+            webpage)
+
      def _real_extract(self, url):
          mobj = re.match(self._VALID_URL, url)
          video_id = mobj.group('id')
-        display_id = mobj.group('display_id')
+        display_id = mobj.group('display_id') or video_id
  
-        request = sanitized_Request(url)
-        request.add_header('Cookie', 'age_verified=1')
-        webpage = self._download_webpage(request, display_id)
+        webpage = self._download_webpage(
+            'http://www.youporn.com/watch/%s' % video_id, display_id,
+            headers={'Cookie': 'age_verified=1'})
  
          title = self._html_search_regex(
              r'(?s)<div[^>]+class=["\']watchVideoTitle[^>]+>(.+?)</div>',
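The youporn.py hunks above make `display_id` optional, accept `/embed/` URLs, and always download the canonical `/watch/<id>` page. A small sketch of that normalisation using only `re` follows; `normalize` is a hypothetical helper name and `some-slug` is a made-up slug.

```python
import re

# Pattern copied from the added _VALID_URL line of the hunk.
VALID_URL = r'https?://(?:www\.)?youporn\.com/(?:watch|embed)/(?P<id>\d+)(?:/(?P<display_id>[^/?#&]+))?'


def normalize(url):
    """Return (video_id, display_id, canonical watch URL), or None if no match."""
    m = re.match(VALID_URL, url)
    if not m:
        return None
    video_id = m.group('id')
    # The slug group is optional now, so fall back to the numeric id.
    display_id = m.group('display_id') or video_id
    return video_id, display_id, 'http://www.youporn.com/watch/%s' % video_id
```

Both the embed form and the slug-less watch form resolve to the same canonical page, which is why the extractor no longer needs to request the original URL.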
diff --git a/youtube_dl/extractor/yourporn.py b/youtube_dlc/extractor/yourporn.py
similarity index 75%
rename from youtube_dl/extractor/yourporn.py
rename to youtube_dlc/extractor/yourporn.py
index 8a2d5f63bdb929edd310c250b9242bd2ed409208..98347491ee00b66d2c2f8df69ddf663c1bffbd84 100644 (file)
--- a/youtube_dl/extractor/yourporn.py
+++ b/youtube_dlc/extractor/yourporn.py
@@ -1,6 +1,7 @@
  from __future__ import unicode_literals
  
  from .common import InfoExtractor
+from ..compat import compat_str
  from ..utils import (
      parse_duration,
      urljoin,
@@ -8,9 +9,9 @@
  
  
  class YourPornIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?(?:yourporn\.sexy|sxyprn\.com)/post/(?P<id>[^/?#&.]+)'
+    _VALID_URL = r'https?://(?:www\.)?sxyprn\.com/post/(?P<id>[^/?#&.]+)'
      _TESTS = [{
-        'url': 'https://yourporn.sexy/post/57ffcb2e1179b.html',
+        'url': 'https://sxyprn.com/post/57ffcb2e1179b.html',
          'md5': '6f8682b6464033d87acaa7a8ff0c092e',
          'info_dict': {
              'id': '57ffcb2e1179b',
@@ -33,11 +34,19 @@ def _real_extract(self, url):
  
          webpage = self._download_webpage(url, video_id)
  
-        video_url = urljoin(url, self._parse_json(
+        parts = self._parse_json(
              self._search_regex(
                  r'data-vnfo=(["\'])(?P<data>{.+?})\1', webpage, 'data info',
                  group='data'),
-            video_id)[video_id]).replace('/cdn/', '/cdn5/')
+            video_id)[video_id].split('/')
+
+        num = 0
+        for c in parts[6] + parts[7]:
+            if c.isnumeric():
+                num += int(c)
+        parts[5] = compat_str(int(parts[5]) - num)
+        parts[1] += '8'
+        video_url = urljoin(url, '/'.join(parts))
  
          title = (self._search_regex(
              r'<[^>]+\bclass=["\']PostEditTA[^>]+>([^<]+)', webpage, 'title',
@@ -54,4 +63,5 @@ def _real_extract(self, url):
              'thumbnail': thumbnail,
              'duration': duration,
              'age_limit': 18,
+            'ext': 'mp4',
          }
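The yourporn.py hunk above rewrites the media path: it sums the digits found in path components 6 and 7, subtracts that sum from component 5, and appends `8` to component 1. The sketch below reproduces only that arithmetic on a made-up path; `rewrite_path` is a hypothetical name, the sample path is invented, and `str` stands in for `compat_str`.

```python
def rewrite_path(path):
    """Apply the path transformation from the hunk to a '/'-separated path."""
    parts = path.split('/')

    # Sum every digit that appears in the 6th and 7th components.
    num = 0
    for c in parts[6] + parts[7]:
        if c.isnumeric():
            num += int(c)

    parts[5] = str(int(parts[5]) - num)  # shift the numeric component down
    parts[1] += '8'                      # e.g. 'cdn' becomes 'cdn8'
    return '/'.join(parts)
```

For the invented path `/cdn/c1/abc/def/1000/a1b2/c3.mp4`, the digits 1, 2, 3 and 4 sum to 10, so the result is `/cdn8/c1/abc/def/990/a1b2/c3.mp4`. Note that the digit in the `mp4` extension is counted as well.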
diff --git a/youtube_dl/extractor/yourupload.py b/youtube_dlc/extractor/yourupload.py
similarity index 100%
rename from youtube_dl/extractor/yourupload.py
rename to youtube_dlc/extractor/yourupload.py
diff --git a/youtube_dl/extractor/youtube.py b/youtube_dlc/extractor/youtube.py
similarity index 90%
rename from youtube_dl/extractor/youtube.py
rename to youtube_dlc/extractor/youtube.py
index b913d07a63920de2c56570644f7b8175afd5fd8a..d3ba4c73cb680ab6424359c79d7cd90e10fb71c3 100644 (file)
--- a/youtube_dl/extractor/youtube.py
+++ b/youtube_dlc/extractor/youtube.py
@@ -29,7 +29,6 @@
  from ..utils import (
      bool_or_none,
      clean_html,
-    dict_get,
      error_to_compat_str,
      extract_attributes,
      ExtractorError,
@@ -71,9 +70,14 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
  
      _PLAYLIST_ID_RE = r'(?:PL|LL|EC|UU|FL|RD|UL|TL|PU|OLAK5uy_)[0-9A-Za-z-_]{10,}'
  
+    _YOUTUBE_CLIENT_HEADERS = {
+        'x-youtube-client-name': '1',
+        'x-youtube-client-version': '1.20200609.04.02',
+    }
+
      def _set_language(self):
          self._set_cookie(
-            '.youtube.com', 'PREF', 'f1=50000000&hl=en',
+            '.youtube.com', 'PREF', 'f1=50000000&f6=8&hl=en',
              # YouTube sets the expire time to about two months
              expire_time=time.time() + 2 * 30 * 24 * 3600)
  
@@ -299,10 +303,11 @@ def _entries(self, page, playlist_id):
                      # Downloading page may result in intermittent 5xx HTTP error
                      # that is usually worked around with a retry
                      more = self._download_json(
-                        'https://youtube.com/%s' % mobj.group('more'), playlist_id,
+                        'https://www.youtube.com/%s' % mobj.group('more'), playlist_id,
                          'Downloading page #%s%s'
                          % (page_num, ' (retry #%d)' % count if count else ''),
-                        transform_source=uppercase_escape)
+                        transform_source=uppercase_escape,
+                        headers=self._YOUTUBE_CLIENT_HEADERS)
                      break
                  except ExtractorError as e:
                      if isinstance(e.cause, compat_HTTPError) and e.cause.code in (500, 503):
@@ -389,8 +394,15 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
                              (?:www\.)?invidious\.drycat\.fr/|
                              (?:www\.)?tube\.poal\.co/|
                              (?:www\.)?vid\.wxzm\.sx/|
+                            (?:www\.)?yewtu\.be/|
                              (?:www\.)?yt\.elukerio\.org/|
                              (?:www\.)?yt\.lelux\.fi/|
+                            (?:www\.)?invidious\.ggc-project\.de/|
+                            (?:www\.)?yt\.maisputain\.ovh/|
+                            (?:www\.)?invidious\.13ad\.de/|
+                            (?:www\.)?invidious\.toot\.koeln/|
+                            (?:www\.)?invidious\.fdn\.fr/|
+                            (?:www\.)?watch\.nettohikari\.com/|
                              (?:www\.)?kgg2m7yk5aybusll\.onion/|
                              (?:www\.)?qklhadlycap4cnod\.onion/|
                              (?:www\.)?axqzx4s6s54s32yentfqojs3x5i7faxza6xo3ehd4bzzsg2ii4fv2iid\.onion/|
@@ -398,6 +410,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
                              (?:www\.)?fz253lmuao3strwbfbmx46yu7acac2jz27iwtorgmbqlkurlclmancad\.onion/|
                              (?:www\.)?invidious\.l4qlywnpwqsluw65ts7md3khrivpirse744un3x7mlskqauz5pyuzgqd\.onion/|
                              (?:www\.)?owxfohz4kjyv25fvlqilyxast7inivgiktls3th44jhk3ej3i7ya\.b32\.i2p/|
+                            (?:www\.)?4l2dgddgsrkf2ous66i6seeyi6etzfgrue332grh2n7madpwopotugyd\.onion/|
                              youtube\.googleapis\.com/)                        # the various hostnames, with wildcard subdomains
                           (?:.*?\#/)?                                          # handle anchor (#/) redirect urls
                           (?:                                                  # the various things that can precede the ID:
@@ -427,6 +440,10 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
                       (?(1).+)?                                                # if we found the ID, everything can follow
                       $""" % {'playlist_id': YoutubeBaseInfoExtractor._PLAYLIST_ID_RE}
      _NEXT_URL_RE = r'[\?&]next_url=([^&]+)'
+    _PLAYER_INFO_RE = (
+        r'/(?P<id>[a-zA-Z0-9_-]{8,})/player_ias\.vflset(?:/[a-zA-Z]{2,3}_[a-zA-Z]{2,3})?/base\.(?P<ext>[a-z]+)$',
+        r'\b(?P<id>vfl[a-zA-Z0-9_-]+)\b.*?\.(?P<ext>[a-z]+)$',
+    )
      _formats = {
          '5': {'ext': 'flv', 'width': 400, 'height': 240, 'acodec': 'mp3', 'abr': 64, 'vcodec': 'h263'},
          '6': {'ext': 'flv', 'width': 450, 'height': 270, 'acodec': 'mp3', 'abr': 64, 'vcodec': 'h263'},
@@ -532,7 +549,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
          '396': {'acodec': 'none', 'vcodec': 'av01.0.05M.08'},
          '397': {'acodec': 'none', 'vcodec': 'av01.0.05M.08'},
      }
-    _SUBTITLE_FORMATS = ('srv1', 'srv2', 'srv3', 'ttml', 'vtt')
+    _SUBTITLE_FORMATS = ('srv1', 'srv2', 'srv3', 'ttml', 'vtt', 'json3')
  
      _GEO_BYPASS = False
  
@@ -570,7 +587,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
                  'upload_date': '20120506',
                  'title': 'Icona Pop - I Love It (feat. Charli XCX) [OFFICIAL VIDEO]',
                  'alt_title': 'I Love It (feat. Charli XCX)',
-                'description': 'md5:f3ceb5ef83a08d95b9d146f973157cc8',
+                'description': 'md5:19a2f98d9032b9311e686ed039564f63',
                  'tags': ['Icona Pop i love it', 'sweden', 'pop music', 'big beat records', 'big beat', 'charli',
                           'xcx', 'charli xcx', 'girls', 'hbo', 'i love it', "i don't care", 'icona', 'pop',
                           'iconic ep', 'iconic', 'love', 'it'],
@@ -685,12 +702,11 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
                  'id': 'nfWlot6h_JM',
                  'ext': 'm4a',
                  'title': 'Taylor Swift - Shake It Off',
-                'description': 'md5:bec2185232c05479482cb5a9b82719bf',
+                'description': 'md5:307195cd21ff7fa352270fe884570ef0',
                  'duration': 242,
                  'uploader': 'TaylorSwiftVEVO',
                  'uploader_id': 'TaylorSwiftVEVO',
                  'upload_date': '20140818',
-                'creator': 'Taylor Swift',
              },
              'params': {
                  'youtube_include_dash_manifest': True,
@@ -755,11 +771,11 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
                  'upload_date': '20100430',
                  'uploader_id': 'deadmau5',
                  'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/deadmau5',
-                'creator': 'deadmau5',
+                'creator': 'Dada Life, deadmau5',
                  'description': 'md5:12c56784b8032162bb936a5f76d55360',
                  'uploader': 'deadmau5',
                  'title': 'Deadmau5 - Some Chords (HD)',
-                'alt_title': 'Some Chords',
+                'alt_title': 'This Machine Kills Some Chords',
              },
              'expected_warnings': [
                  'DASH manifest missing',
@@ -1135,6 +1151,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
                  'skip_download': True,
                  'youtube_include_dash_manifest': False,
              },
+            'skip': 'not actual anymore',
          },
          {
              # Youtube Music Auto-generated description
@@ -1145,8 +1162,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
                  'title': 'Voyeur Girl',
                  'description': 'md5:7ae382a65843d6df2685993e90a8628f',
                  'upload_date': '20190312',
-                'uploader': 'Various Artists - Topic',
-                'uploader_id': 'UCVWKBi1ELZn0QX2CBLSkiyw',
+                'uploader': 'Stephen - Topic',
+                'uploader_id': 'UC-pWHpBjdGG69N9mM2auIAA',
                  'artist': 'Stephen',
                  'track': 'Voyeur Girl',
                  'album': 'it\'s too much love to know my dear',
@@ -1210,7 +1227,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
                  'id': '-hcAI0g-f5M',
                  'ext': 'mp4',
                  'title': 'Put It On Me',
-                'description': 'md5:93c55acc682ae7b0c668f2e34e1c069e',
+                'description': 'md5:f6422397c07c4c907c6638e1fee380a5',
                  'upload_date': '20180426',
                  'uploader': 'Matt Maeson - Topic',
                  'uploader_id': 'UCnEkIGqtGcQMLk73Kp-Q5LQ',
@@ -1228,6 +1245,26 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
              'url': 'https://www.youtubekids.com/watch?v=3b8nCWDgZ6Q',
              'only_matching': True,
          },
+        {
+            # invalid -> valid video id redirection
+            'url': 'DJztXj2GPfl',
+            'info_dict': {
+                'id': 'DJztXj2GPfk',
+                'ext': 'mp4',
+                'title': 'Panjabi MC - Mundian To Bach Ke (The Dictator Soundtrack)',
+                'description': 'md5:bf577a41da97918e94fa9798d9228825',
+                'upload_date': '20090125',
+                'uploader': 'Prochorowka',
+                'uploader_id': 'Prochorowka',
+                'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/Prochorowka',
+                'artist': 'Panjabi MC',
+                'track': 'Beware of the Boys (Mundian to Bach Ke) - Motivo Hi-Lectro Remix',
+                'album': 'Beware of the Boys (Mundian To Bach Ke)',
+            },
+            'params': {
+                'skip_download': True,
+            },
+        }
      ]
  
      def __init__(self, *args, **kwargs):
@@ -1254,14 +1291,18 @@ def _signature_cache_id(self, example_sig):
          """ Return a string representation of a signature """
          return '.'.join(compat_str(len(part)) for part in example_sig.split('.'))
  
-    def _extract_signature_function(self, video_id, player_url, example_sig):
-        id_m = re.match(
-            r'.*?-(?P<id>[a-zA-Z0-9_-]+)(?:/watch_as3|/html5player(?:-new)?|(?:/[a-z]{2,3}_[A-Z]{2})?/base)?\.(?P<ext>[a-z]+)$',
-            player_url)
-        if not id_m:
+    @classmethod
+    def _extract_player_info(cls, player_url):
+        for player_re in cls._PLAYER_INFO_RE:
+            id_m = re.search(player_re, player_url)
+            if id_m:
+                break
+        else:
              raise ExtractorError('Cannot identify player %r' % player_url)
-        player_type = id_m.group('ext')
-        player_id = id_m.group('id')
+        return id_m.group('ext'), id_m.group('id')
+
+    def _extract_signature_function(self, video_id, player_url, example_sig):
+        player_type, player_id = self._extract_player_info(player_url)
  
          # Read from filesystem cache
          func_id = '%s_%s_%s' % (
@@ -1343,6 +1384,7 @@ def _parse_sig_js(self, jscode):
          funcname = self._search_regex(
              (r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
               r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
+             r'(?:\b|[^a-zA-Z0-9$])(?P<sig>[a-zA-Z0-9$]{2})\s*=\s*function\(\s*a\s*\)\s*{\s*a\s*=\s*a\.split\(\s*""\s*\)',
               r'(?P<sig>[a-zA-Z0-9$]+)\s*=\s*function\(\s*a\s*\)\s*{\s*a\s*=\s*a\.split\(\s*""\s*\)',
               # Obsolete patterns
               r'(["\'])signature\1\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
@@ -1393,7 +1435,7 @@ def _decrypt_signature(self, s, video_id, player_url, age_gate=False):
              raise ExtractorError(
                  'Signature extraction failed: ' + tb, cause=e)
  
-    def _get_subtitles(self, video_id, webpage):
+    def _get_subtitles(self, video_id, webpage, has_live_chat_replay):
          try:
              subs_doc = self._download_xml(
                  'https://video.google.com/timedtext?hl=en&type=list&v=%s' % video_id,
@@ -1420,6 +1462,14 @@ def _get_subtitles(self, video_id, webpage):
                      'ext': ext,
                  })
              sub_lang_list[lang] = sub_formats
+        if has_live_chat_replay:
+            sub_lang_list['live_chat'] = [
+                {
+                    'video_id': video_id,
+                    'ext': 'json',
+                    'protocol': 'youtube_live_chat_replay',
+                },
+            ]
          if not sub_lang_list:
              self._downloader.report_warning('video doesn\'t have subtitles')
              return {}
@@ -1443,6 +1493,15 @@ def _get_ytplayer_config(self, video_id, webpage):
              return self._parse_json(
                  uppercase_escape(config), video_id, fatal=False)
  
+    def _get_yt_initial_data(self, video_id, webpage):
+        config = self._search_regex(
+            (r'window\["ytInitialData"\]\s*=\s*(.*?)(?<=});',
+             r'var\s+ytInitialData\s*=\s*(.*?)(?<=});'),
+            webpage, 'ytInitialData', default=None)
+        if config:
+            return self._parse_json(
+                uppercase_escape(config), video_id, fatal=False)
+
      def _get_automatic_captions(self, video_id, webpage):
          """We need the webpage for getting the captions url, pass it as an
             argument to speed up the process."""
@@ -1518,14 +1577,21 @@ def make_captions(sub_url, sub_langs):
                      player_response, video_id, fatal=False)
                  if player_response:
                      renderer = player_response['captions']['playerCaptionsTracklistRenderer']
-                    base_url = renderer['captionTracks'][0]['baseUrl']
-                    sub_lang_list = []
-                    for lang in renderer['translationLanguages']:
-                        lang_code = lang.get('languageCode')
-                        if lang_code:
-                            sub_lang_list.append(lang_code)
-                    return make_captions(base_url, sub_lang_list)
-
+                    caption_tracks = renderer['captionTracks']
+                    for caption_track in caption_tracks:
+                        if 'kind' not in caption_track:
+                            # not an automatic transcription
+                            continue
+                        base_url = caption_track['baseUrl']
+                        sub_lang_list = []
+                        for lang in renderer['translationLanguages']:
+                            lang_code = lang.get('languageCode')
+                            if lang_code:
+                                sub_lang_list.append(lang_code)
+                        return make_captions(base_url, sub_lang_list)
+
+                    self._downloader.report_warning("Couldn't find automatic captions for %s" % video_id)
+                    return {}
              # Some videos don't provide ttsurl but rather caption_tracks and
              # caption_translation_languages (e.g. 20LmZk1hakA)
              # Not used anymore as of 22.06.2017
@@ -1616,8 +1682,57 @@ def extract_id(cls, url):
          video_id = mobj.group(2)
          return video_id
  
+    def _extract_chapters_from_json(self, webpage, video_id, duration):
+        if not webpage:
+            return
+        initial_data = self._parse_json(
+            self._search_regex(
+                r'window\["ytInitialData"\] = (.+);\n', webpage,
+                'player args', default='{}'),
+            video_id, fatal=False)
+        if not initial_data or not isinstance(initial_data, dict):
+            return
+        chapters_list = try_get(
+            initial_data,
+            lambda x: x['playerOverlays']
+                       ['playerOverlayRenderer']
+                       ['decoratedPlayerBarRenderer']
+                       ['decoratedPlayerBarRenderer']
+                       ['playerBar']
+                       ['chapteredPlayerBarRenderer']
+                       ['chapters'],
+            list)
+        if not chapters_list:
+            return
+
+        def chapter_time(chapter):
+            return float_or_none(
+                try_get(
+                    chapter,
+                    lambda x: x['chapterRenderer']['timeRangeStartMillis'],
+                    int),
+                scale=1000)
+        chapters = []
+        for next_num, chapter in enumerate(chapters_list, start=1):
+            start_time = chapter_time(chapter)
+            if start_time is None:
+                continue
+            end_time = (chapter_time(chapters_list[next_num])
+                        if next_num < len(chapters_list) else duration)
+            if end_time is None:
+                continue
+            title = try_get(
+                chapter, lambda x: x['chapterRenderer']['title']['simpleText'],
+                compat_str)
+            chapters.append({
+                'start_time': start_time,
+                'end_time': end_time,
+                'title': title,
+            })
+        return chapters
+
      @staticmethod
-    def _extract_chapters(description, duration):
+    def _extract_chapters_from_description(description, duration):
          if not description:
              return None
          chapter_lines = re.findall(
@@ -1651,6 +1766,10 @@ def _extract_chapters(description, duration):
              })
          return chapters
  
+    def _extract_chapters(self, webpage, description, video_id, duration):
+        return (self._extract_chapters_from_json(webpage, video_id, duration)
+                or self._extract_chapters_from_description(description, duration))
+
      def _real_extract(self, url):
          url, smuggled_data = unsmuggle_url(url, {})
  
@@ -1678,7 +1797,10 @@ def _real_extract(self, url):
  
          # Get video webpage
          url = proto + '://www.youtube.com/watch?v=%s&gl=US&hl=en&has_verified=1&bpctr=9999999999' % video_id
-        video_webpage = self._download_webpage(url, video_id)
+        video_webpage, urlh = self._download_webpage_handle(url, video_id)
+
+        qs = compat_parse_qs(compat_urllib_parse_urlparse(urlh.geturl()).query)
+        video_id = qs.get('v', [None])[0] or video_id
  
          # Attempt to extract SWF player URL
          mobj = re.search(r'swfConfig.*?"(https?:\\/\\/.*?watch.*?-.*?\.swf)"', video_webpage)
@@ -1707,9 +1829,6 @@ def add_dash_mpd_pr(pl_response):
          def extract_view_count(v_info):
              return int_or_none(try_get(v_info, lambda x: x['view_count'][0]))
  
-        def extract_token(v_info):
-            return dict_get(v_info, ('account_playback_token', 'accountPlaybackToken', 'token'))
-
          def extract_player_response(player_response, video_id):
              pl_response = str_or_none(player_response)
              if not pl_response:
@@ -1722,8 +1841,10 @@ def extract_player_response(player_response, video_id):
          player_response = {}
  
          # Get video info
+        video_info = {}
          embed_webpage = None
-        if re.search(r'player-age-gate-content">', video_webpage) is not None:
+        if (self._og_search_property('restrictions:age', video_webpage, default=None) == '18+'
+                or re.search(r'player-age-gate-content">', video_webpage) is not None):
              age_gate = True
              # We simulate the access to the video from www.youtube.com/v/{video_id}
              # this can be viewed without login into Youtube
@@ -1736,19 +1857,21 @@ def extract_player_response(player_response, video_id):
                      r'"sts"\s*:\s*(\d+)', embed_webpage, 'sts', default=''),
              })
              video_info_url = proto + '://www.youtube.com/get_video_info?' + data
-            video_info_webpage = self._download_webpage(
-                video_info_url, video_id,
-                note='Refetching age-gated info webpage',
-                errnote='unable to download video info webpage')
-            video_info = compat_parse_qs(video_info_webpage)
-            pl_response = video_info.get('player_response', [None])[0]
-            player_response = extract_player_response(pl_response, video_id)
-            add_dash_mpd(video_info)
-            view_count = extract_view_count(video_info)
+            try:
+                video_info_webpage = self._download_webpage(
+                    video_info_url, video_id,
+                    note='Refetching age-gated info webpage',
+                    errnote='unable to download video info webpage')
+            except ExtractorError:
+                video_info_webpage = None
+            if video_info_webpage:
+                video_info = compat_parse_qs(video_info_webpage)
+                pl_response = video_info.get('player_response', [None])[0]
+                player_response = extract_player_response(pl_response, video_id)
+                add_dash_mpd(video_info)
+                view_count = extract_view_count(video_info)
          else:
              age_gate = False
-            video_info = None
-            sts = None
              # Try looking directly into the video webpage
              ytplayer_config = self._get_ytplayer_config(video_id, video_webpage)
              if ytplayer_config:
@@ -1765,61 +1888,10 @@ def extract_player_response(player_response, video_id):
                          args['ypc_vid'], YoutubeIE.ie_key(), video_id=args['ypc_vid'])
                  if args.get('livestream') == '1' or args.get('live_playback') == 1:
                      is_live = True
-                sts = ytplayer_config.get('sts')
                  if not player_response:
                      player_response = extract_player_response(args.get('player_response'), video_id)
              if not video_info or self._downloader.params.get('youtube_include_dash_manifest', True):
                  add_dash_mpd_pr(player_response)
-                # We also try looking in get_video_info since it may contain different dashmpd
-                # URL that points to a DASH manifest with possibly different itag set (some itags
-                # are missing from DASH manifest pointed by webpage's dashmpd, some - from DASH
-                # manifest pointed by get_video_info's dashmpd).
-                # The general idea is to take a union of itags of both DASH manifests (for example
-                # video with such 'manifest behavior' see https://github.com/ytdl-org/youtube-dl/issues/6093)
-                self.report_video_info_webpage_download(video_id)
-                for el in ('embedded', 'detailpage', 'vevo', ''):
-                    query = {
-                        'video_id': video_id,
-                        'ps': 'default',
-                        'eurl': '',
-                        'gl': 'US',
-                        'hl': 'en',
-                    }
-                    if el:
-                        query['el'] = el
-                    if sts:
-                        query['sts'] = sts
-                    video_info_webpage = self._download_webpage(
-                        '%s://www.youtube.com/get_video_info' % proto,
-                        video_id, note=False,
-                        errnote='unable to download video info webpage',
-                        fatal=False, query=query)
-                    if not video_info_webpage:
-                        continue
-                    get_video_info = compat_parse_qs(video_info_webpage)
-                    if not player_response:
-                        pl_response = get_video_info.get('player_response', [None])[0]
-                        player_response = extract_player_response(pl_response, video_id)
-                    add_dash_mpd(get_video_info)
-                    if view_count is None:
-                        view_count = extract_view_count(get_video_info)
-                    if not video_info:
-                        video_info = get_video_info
-                    get_token = extract_token(get_video_info)
-                    if get_token:
-                        # Different get_video_info requests may report different results, e.g.
-                        # some may report video unavailability, but some may serve it without
-                        # any complaint (see https://github.com/ytdl-org/youtube-dl/issues/7362,
-                        # the original webpage as well as el=info and el=embedded get_video_info
-                        # requests report video unavailability due to geo restriction while
-                        # el=detailpage succeeds and returns valid data). This is probably
-                        # due to YouTube measures against IP ranges of hosting providers.
-                        # Working around by preferring the first succeeded video_info containing
-                        # the token if no such video_info yet was found.
-                        token = extract_token(video_info)
-                        if not token:
-                            video_info = get_video_info
-                        break
  
          def extract_unavailable_message():
              messages = []
@@ -1832,16 +1904,22 @@ def extract_unavailable_message():
              if messages:
                  return '\n'.join(messages)
  
-        if not video_info:
+        if not video_info and not player_response:
              unavailable_message = extract_unavailable_message()
              if not unavailable_message:
                  unavailable_message = 'Unable to extract video data'
              raise ExtractorError(
                  'YouTube said: %s' % unavailable_message, expected=True, video_id=video_id)
  
+        if not isinstance(video_info, dict):
+            video_info = {}
+
          video_details = try_get(
              player_response, lambda x: x['videoDetails'], dict) or {}
  
+        microformat = try_get(
+            player_response, lambda x: x['microformat']['playerMicroformatRenderer'], dict) or {}
+
          video_title = video_info.get('title', [None])[0] or video_details.get('title')
          if not video_title:
              self._downloader.report_warning('Unable to extract video title')
@@ -1871,7 +1949,7 @@ def replace_url(m):
              ''', replace_url, video_description)
              video_description = clean_html(video_description)
          else:
-            video_description = self._html_search_meta('description', video_webpage) or video_details.get('shortDescription')
+            video_description = video_details.get('shortDescription') or self._html_search_meta('description', video_webpage)
  
          if not smuggled_data.get('force_singlefeed', False):
              if not self._downloader.params.get('noplaylist'):
@@ -1888,15 +1966,26 @@ def replace_url(m):
                          # fields may contain comma as well (see
                          # https://github.com/ytdl-org/youtube-dl/issues/8536)
                          feed_data = compat_parse_qs(compat_urllib_parse_unquote_plus(feed))
+
+                        def feed_entry(name):
+                            return try_get(feed_data, lambda x: x[name][0], compat_str)
+
+                        feed_id = feed_entry('id')
+                        if not feed_id:
+                            continue
+                        feed_title = feed_entry('title')
+                        title = video_title
+                        if feed_title:
+                            title += ' (%s)' % feed_title
                          entries.append({
                              '_type': 'url_transparent',
                              'ie_key': 'Youtube',
                              'url': smuggle_url(
                                  '%s://www.youtube.com/watch?v=%s' % (proto, feed_data['id'][0]),
                                  {'force_singlefeed': True}),
-                            'title': '%s (%s)' % (video_title, feed_data['title'][0]),
+                            'title': title,
                          })
-                        feed_ids.append(feed_data['id'][0])
+                        feed_ids.append(feed_id)
                      self.to_screen(
                          'Downloading multifeed video (%s) - add --no-playlist to just download video %s'
                          % (', '.join(feed_ids), video_id))
@@ -1908,10 +1997,21 @@ def replace_url(m):
              view_count = extract_view_count(video_info)
          if view_count is None and video_details:
              view_count = int_or_none(video_details.get('viewCount'))
+        if view_count is None and microformat:
+            view_count = int_or_none(microformat.get('viewCount'))
  
          if is_live is None:
              is_live = bool_or_none(video_details.get('isLive'))
  
+        has_live_chat_replay = False
+        if not is_live:
+            yt_initial_data = self._get_yt_initial_data(video_id, video_webpage)
+            try:
+                yt_initial_data['contents']['twoColumnWatchNextResults']['conversationBar']['liveChatRenderer']['continuations'][0]['reloadContinuationData']['continuation']
+                has_live_chat_replay = True
+            except (KeyError, IndexError, TypeError):
+                pass
+
          # Check for "rental" videos
          if 'ypc_video_rental_bar_text' in video_info and 'author' not in video_info:
              raise ExtractorError('"rental" videos not supported. See https://github.com/ytdl-org/youtube-dl/issues/359 for more information.', expected=True)
@@ -1967,12 +2067,12 @@ def _extract_filesize(media_url):
                  }
  
              for fmt in streaming_formats:
-                if fmt.get('drm_families'):
+                if fmt.get('drmFamilies') or fmt.get('drm_families'):
                      continue
                  url = url_or_none(fmt.get('url'))
  
                  if not url:
-                    cipher = fmt.get('cipher')
+                    cipher = fmt.get('cipher') or fmt.get('signatureCipher')
                      if not cipher:
                          continue
                      url_data = compat_parse_qs(cipher)
@@ -2023,22 +2123,10 @@ def _extract_filesize(media_url):
  
                          if self._downloader.params.get('verbose'):
                              if player_url is None:
-                                player_version = 'unknown'
                                  player_desc = 'unknown'
                              else:
-                                if player_url.endswith('swf'):
-                                    player_version = self._search_regex(
-                                        r'-(.+?)(?:/watch_as3)?\.swf$', player_url,
-                                        'flash player', fatal=False)
-                                    player_desc = 'flash player %s' % player_version
-                                else:
-                                    player_version = self._search_regex(
-                                        [r'html5player-([^/]+?)(?:/html5player(?:-new)?)?\.js',
-                                         r'(?:www|player(?:_ias)?)-([^/]+)(?:/[a-z]{2,3}_[A-Z]{2})?/base\.js'],
-                                        player_url,
-                                        'html5 player', fatal=False)
-                                    player_desc = 'html5 player %s' % player_version
-
+                                player_type, player_version = self._extract_player_info(player_url)
+                                player_desc = '%s player %s' % ('flash' if player_type == 'swf' else 'html5', player_version)
                              parts_sizes = self._signature_cache_id(encrypted_sig)
                              self.to_screen('{%s} signature length %s, %s' %
                                             (format_id, parts_sizes, player_desc))
@@ -2171,7 +2259,12 @@ def _extract_filesize(media_url):
              video_uploader_id = mobj.group('uploader_id')
              video_uploader_url = mobj.group('uploader_url')
          else:
-            self._downloader.report_warning('unable to extract uploader nickname')
+            owner_profile_url = url_or_none(microformat.get('ownerProfileUrl'))
+            if owner_profile_url:
+                video_uploader_id = self._search_regex(
+                    r'(?:user|channel)/([^/]+)', owner_profile_url, 'uploader id',
+                    default=None)
+                video_uploader_url = owner_profile_url
  
          channel_id = (
              str_or_none(video_details.get('channelId'))
@@ -2182,17 +2275,33 @@ def _extract_filesize(media_url):
                  video_webpage, 'channel id', default=None, group='id'))
          channel_url = 'http://www.youtube.com/channel/%s' % channel_id if channel_id else None
  
-        # thumbnail image
-        # We try first to get a high quality image:
-        m_thumb = re.search(r'<span itemprop="thumbnail".*?href="(.*?)">',
-                            video_webpage, re.DOTALL)
-        if m_thumb is not None:
-            video_thumbnail = m_thumb.group(1)
-        elif 'thumbnail_url' not in video_info:
-            self._downloader.report_warning('unable to extract video thumbnail')
+        thumbnails = []
+        thumbnails_list = try_get(
+            video_details, lambda x: x['thumbnail']['thumbnails'], list) or []
+        for t in thumbnails_list:
+            if not isinstance(t, dict):
+                continue
+            thumbnail_url = url_or_none(t.get('url'))
+            if not thumbnail_url:
+                continue
+            thumbnails.append({
+                'url': thumbnail_url,
+                'width': int_or_none(t.get('width')),
+                'height': int_or_none(t.get('height')),
+            })
+
+        if not thumbnails:
              video_thumbnail = None
-        else:   # don't panic if we can't find it
-            video_thumbnail = compat_urllib_parse_unquote_plus(video_info['thumbnail_url'][0])
+            # We try first to get a high quality image:
+            m_thumb = re.search(r'<span itemprop="thumbnail".*?href="(.*?)">',
+                                video_webpage, re.DOTALL)
+            if m_thumb is not None:
+                video_thumbnail = m_thumb.group(1)
+            thumbnail_url = try_get(video_info, lambda x: x['thumbnail_url'][0], compat_str)
+            if thumbnail_url:
+                video_thumbnail = compat_urllib_parse_unquote_plus(thumbnail_url)
+            if video_thumbnail:
+                thumbnails.append({'url': video_thumbnail})
  
          # upload date
          upload_date = self._html_search_meta(
@@ -2202,6 +2311,8 @@ def _extract_filesize(media_url):
                  [r'(?s)id="eow-date.*?>(.*?)</span>',
                   r'(?:id="watch-uploader-info".*?>.*?|["\']simpleText["\']\s*:\s*["\'])(?:Published|Uploaded|Streamed live|Started) on (.+?)[<"\']'],
                  video_webpage, 'upload date', default=None)
+        if not upload_date:
+            upload_date = microformat.get('publishDate') or microformat.get('uploadDate')
          upload_date = unified_strdate(upload_date)
  
          video_license = self._html_search_regex(
@@ -2273,17 +2384,21 @@ def extract_meta(field):
          m_cat_container = self._search_regex(
              r'(?s)<h4[^>]*>\s*Category\s*</h4>\s*<ul[^>]*>(.*?)</ul>',
              video_webpage, 'categories', default=None)
+        category = None
          if m_cat_container:
              category = self._html_search_regex(
                  r'(?s)<a[^<]+>(.*?)</a>', m_cat_container, 'category',
                  default=None)
-            video_categories = None if category is None else [category]
-        else:
-            video_categories = None
+        if not category:
+            category = try_get(
+                microformat, lambda x: x['category'], compat_str)
+        video_categories = None if category is None else [category]
  
          video_tags = [
              unescapeHTML(m.group('content'))
              for m in re.finditer(self._meta_regex('og:video:tag'), video_webpage)]
+        if not video_tags:
+            video_tags = try_get(video_details, lambda x: x['keywords'], list)
  
          def _extract_count(count_name):
              return str_to_int(self._search_regex(
@@ -2304,7 +2419,8 @@ def _extract_count(count_name):
              or try_get(video_info, lambda x: float_or_none(x['avg_rating'][0])))
  
          # subtitles
-        video_subtitles = self.extract_subtitles(video_id, video_webpage)
+        video_subtitles = self.extract_subtitles(
+            video_id, video_webpage, has_live_chat_replay)
          automatic_captions = self.extract_automatic_captions(video_id, video_webpage)
  
          video_duration = try_get(
@@ -2334,7 +2450,7 @@ def _extract_count(count_name):
                      errnote='Unable to download video annotations', fatal=False,
                      data=urlencode_postdata({xsrf_field_name: xsrf_token}))
  
-        chapters = self._extract_chapters(description_original, video_duration)
+        chapters = self._extract_chapters(video_webpage, description_original, video_id, video_duration)
  
          # Look for the DASH manifest
          if self._downloader.params.get('youtube_include_dash_manifest', True):
@@ -2391,30 +2507,23 @@ def decrypt_sig(mobj):
                          f['stretched_ratio'] = ratio
  
          if not formats:
-            token = extract_token(video_info)
-            if not token:
-                if 'reason' in video_info:
-                    if 'The uploader has not made this video available in your country.' in video_info['reason']:
-                        regions_allowed = self._html_search_meta(
-                            'regionsAllowed', video_webpage, default=None)
-                        countries = regions_allowed.split(',') if regions_allowed else None
-                        self.raise_geo_restricted(
-                            msg=video_info['reason'][0], countries=countries)
-                    reason = video_info['reason'][0]
-                    if 'Invalid parameters' in reason:
-                        unavailable_message = extract_unavailable_message()
-                        if unavailable_message:
-                            reason = unavailable_message
-                    raise ExtractorError(
-                        'YouTube said: %s' % reason,
-                        expected=True, video_id=video_id)
-                else:
-                    raise ExtractorError(
-                        '"token" parameter not in video info for unknown reason',
-                        video_id=video_id)
-
-        if not formats and (video_info.get('license_info') or try_get(player_response, lambda x: x['streamingData']['licenseInfos'])):
-            raise ExtractorError('This video is DRM protected.', expected=True)
+            if 'reason' in video_info:
+                if 'The uploader has not made this video available in your country.' in video_info['reason']:
+                    regions_allowed = self._html_search_meta(
+                        'regionsAllowed', video_webpage, default=None)
+                    countries = regions_allowed.split(',') if regions_allowed else None
+                    self.raise_geo_restricted(
+                        msg=video_info['reason'][0], countries=countries)
+                reason = video_info['reason'][0]
+                if 'Invalid parameters' in reason:
+                    unavailable_message = extract_unavailable_message()
+                    if unavailable_message:
+                        reason = unavailable_message
+                raise ExtractorError(
+                    'YouTube said: %s' % reason,
+                    expected=True, video_id=video_id)
+            if video_info.get('license_info') or try_get(player_response, lambda x: x['streamingData']['licenseInfos']):
+                raise ExtractorError('This video is DRM protected.', expected=True)
  
          self._sort_formats(formats)
  
@@ -2432,7 +2541,7 @@ def decrypt_sig(mobj):
              'creator': video_creator or artist,
              'title': video_title,
              'alt_title': video_alt_title or track,
-            'thumbnail': video_thumbnail,
+            'thumbnails': thumbnails,
              'description': video_description,
              'categories': video_categories,
              'tags': video_tags,
@@ -2494,20 +2603,23 @@ class YoutubePlaylistIE(YoutubePlaylistBaseInfoExtractor):
      _VIDEO_RE = _VIDEO_RE_TPL % r'(?P<id>[0-9A-Za-z_-]{11})'
      IE_NAME = 'youtube:playlist'
      _TESTS = [{
-        'url': 'https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re',
+        'url': 'https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc',
          'info_dict': {
-            'title': 'ytdl test PL',
-            'id': 'PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re',
+            'uploader_id': 'UCmlqkdCBesrv2Lak1mF_MxA',
+            'uploader': 'Sergey M.',
+            'id': 'PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc',
+            'title': 'youtube-dl public playlist',
          },
-        'playlist_count': 3,
+        'playlist_count': 1,
      }, {
-        'url': 'https://www.youtube.com/playlist?list=PLtPgu7CB4gbZDA7i_euNxn75ISqxwZPYx',
+        'url': 'https://www.youtube.com/playlist?list=PL4lCao7KL_QFodcLWhDpGCYnngnHtQ-Xf',
          'info_dict': {
-            'id': 'PLtPgu7CB4gbZDA7i_euNxn75ISqxwZPYx',
-            'title': 'YDL_Empty_List',
+            'uploader_id': 'UCmlqkdCBesrv2Lak1mF_MxA',
+            'uploader': 'Sergey M.',
+            'id': 'PL4lCao7KL_QFodcLWhDpGCYnngnHtQ-Xf',
+            'title': 'youtube-dl empty playlist',
          },
          'playlist_count': 0,
-        'skip': 'This playlist is private',
      }, {
          'note': 'Playlist with deleted videos (#651). As a bonus, the video #51 is also twice in this list.',
          'url': 'https://www.youtube.com/playlist?list=PLwP_SiAcdui0KVebT0mU9Apz359a4ubsC',
@@ -2517,7 +2629,7 @@ class YoutubePlaylistIE(YoutubePlaylistBaseInfoExtractor):
              'uploader': 'Christiaan008',
              'uploader_id': 'ChRiStIaAn008',
          },
-        'playlist_count': 95,
+        'playlist_count': 96,
      }, {
          'note': 'issue #673',
          'url': 'PLBB231211A4F62143',
@@ -2693,7 +2805,7 @@ def _extract_mix(self, playlist_id):
          ids = []
          last_id = playlist_id[-11:]
          for n in itertools.count(1):
-            url = 'https://youtube.com/watch?v=%s&list=%s' % (last_id, playlist_id)
+            url = 'https://www.youtube.com/watch?v=%s&list=%s' % (last_id, playlist_id)
              webpage = self._download_webpage(
                  url, playlist_id, 'Downloading page {0} of Youtube mix'.format(n))
              new_ids = orderedSet(re.findall(
@@ -2925,7 +3037,7 @@ def _real_extract(self, url):
  
  class YoutubeUserIE(YoutubeChannelIE):
      IE_DESC = 'YouTube.com user videos (URL or "ytuser" keyword)'
-    _VALID_URL = r'(?:(?:https?://(?:\w+\.)?youtube\.com/(?:(?P<user>user|c)/)?(?!(?:attribution_link|watch|results|shared)(?:$|[^a-z_A-Z0-9-])))|ytuser:)(?!feed/)(?P<id>[A-Za-z0-9_-]+)'
+    _VALID_URL = r'(?:(?:https?://(?:\w+\.)?youtube\.com/(?:(?P<user>user|c)/)?(?!(?:attribution_link|watch|results|shared)(?:$|[^a-z_A-Z0-9%-])))|ytuser:)(?!feed/)(?P<id>[A-Za-z0-9_%-]+)'
      _TEMPLATE_URL = 'https://www.youtube.com/%s/%s/videos'
      IE_NAME = 'youtube:user'
  
@@ -2955,6 +3067,9 @@ class YoutubeUserIE(YoutubeChannelIE):
      }, {
          'url': 'https://www.youtube.com/c/gametrailers',
          'only_matching': True,
+    }, {
+        'url': 'https://www.youtube.com/c/Pawe%C5%82Zadro%C5%BCniak',
+        'only_matching': True,
      }, {
          'url': 'https://www.youtube.com/gametrailers',
          'only_matching': True,
@@ -3033,7 +3148,7 @@ def _real_extract(self, url):
  
  class YoutubePlaylistsIE(YoutubePlaylistsBaseInfoExtractor):
      IE_DESC = 'YouTube.com user/channel playlists'
-    _VALID_URL = r'https?://(?:\w+\.)?youtube\.com/(?:user|channel)/(?P<id>[^/]+)/playlists'
+    _VALID_URL = r'https?://(?:\w+\.)?youtube\.com/(?:user|channel|c)/(?P<id>[^/]+)/playlists'
      IE_NAME = 'youtube:playlists'
  
      _TESTS = [{
@@ -3059,6 +3174,9 @@ class YoutubePlaylistsIE(YoutubePlaylistsBaseInfoExtractor):
              'title': 'Chem Player',
          },
          'skip': 'Blocked',
+    }, {
+        'url': 'https://www.youtube.com/c/ChristophLaimer/playlists',
+        'only_matching': True,
      }]
  
  
@@ -3203,9 +3321,10 @@ def _entries(self, page):
                  break
  
              more = self._download_json(
-                'https://youtube.com/%s' % mobj.group('more'), self._PLAYLIST_TITLE,
+                'https://www.youtube.com/%s' % mobj.group('more'), self._PLAYLIST_TITLE,
                  'Downloading page #%s' % page_num,
-                transform_source=uppercase_escape)
+                transform_source=uppercase_escape,
+                headers=self._YOUTUBE_CLIENT_HEADERS)
              content_html = more['content_html']
              more_widget_html = more['load_more_widget_html']
  
diff --git a/youtube_dl/extractor/zapiks.py b/youtube_dlc/extractor/zapiks.py
similarity index 99%
rename from youtube_dl/extractor/zapiks.py
rename to youtube_dlc/extractor/zapiks.py
index bacb82eeeb2a549edbb0cbf6d0a67e07f28b595b..f6496f5168cf057c9c415cc7461105462ad66370 100644
--- a/youtube_dl/extractor/zapiks.py
+++ b/youtube_dlc/extractor/zapiks.py
@@ -29,7 +29,6 @@ class ZapiksIE(InfoExtractor):
                  'timestamp': 1359044972,
                  'upload_date': '20130124',
                  'view_count': int,
-                'comment_count': int,
              },
          },
          {
diff --git a/youtube_dl/extractor/zaq1.py b/youtube_dlc/extractor/zaq1.py
similarity index 100%
rename from youtube_dl/extractor/zaq1.py
rename to youtube_dlc/extractor/zaq1.py
diff --git a/youtube_dl/extractor/zattoo.py b/youtube_dlc/extractor/zattoo.py
similarity index 100%
rename from youtube_dl/extractor/zattoo.py
rename to youtube_dlc/extractor/zattoo.py
diff --git a/youtube_dl/extractor/zdf.py b/youtube_dlc/extractor/zdf.py
similarity index 95%
rename from youtube_dl/extractor/zdf.py
rename to youtube_dlc/extractor/zdf.py
index 145c123a42fee5e67c0fd8c2750ea13562632666..7b5ad4a6e85398331fe4c7926cbc378f3b624f0f 100644
--- a/youtube_dl/extractor/zdf.py
+++ b/youtube_dlc/extractor/zdf.py
@@ -39,11 +39,23 @@ def _extract_player(self, webpage, video_id, fatal=True):
  
  
  class ZDFIE(ZDFBaseIE):
-    _VALID_URL = r'https?://www\.zdf\.de/(?:[^/]+/)*(?P<id>[^/?]+)\.html'
+    IE_NAME = "ZDF-3sat"
+    _VALID_URL = r'https?://www\.(zdf|3sat)\.de/(?:[^/]+/)*(?P<id>[^/?]+)\.html'
      _QUALITIES = ('auto', 'low', 'med', 'high', 'veryhigh')
      _GEO_COUNTRIES = ['DE']
  
      _TESTS = [{
+        'url': 'https://www.3sat.de/wissen/wissenschaftsdoku/luxusgut-lebensraum-100.html',
+        'info_dict': {
+            'id': 'luxusgut-lebensraum-100',
+            'ext': 'mp4',
+            'title': 'Luxusgut Lebensraum',
+            'description': 'md5:5c09b2f45ac3bc5233d1b50fc543d061',
+            'duration': 2601,
+            'timestamp': 1566497700,
+            'upload_date': '20190822',
+        }
+    }, {
          'url': 'https://www.zdf.de/dokumentation/terra-x/die-magie-der-farben-von-koenigspurpur-und-jeansblau-100.html',
          'info_dict': {
              'id': 'die-magie-der-farben-von-koenigspurpur-und-jeansblau-100',
@@ -244,14 +256,14 @@ class ZDFChannelIE(ZDFBaseIE):
              'id': 'das-aktuelle-sportstudio',
              'title': 'das aktuelle sportstudio | ZDF',
          },
-        'playlist_count': 21,
+        'playlist_mincount': 23,
      }, {
          'url': 'https://www.zdf.de/dokumentation/planet-e',
          'info_dict': {
              'id': 'planet-e',
              'title': 'planet e.',
          },
-        'playlist_count': 4,
+        'playlist_mincount': 50,
      }, {
          'url': 'https://www.zdf.de/filme/taunuskrimi/',
          'only_matching': True,
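
Note on the `ZDFIE._VALID_URL` change above: the host alternation now accepts both `www.zdf.de` and `www.3sat.de`, while the named group `id` still captures the last path segment without the `.html` suffix. A small sketch that exercises the pattern on its own, using the two URLs from `_TESTS`; `match_id` is a hypothetical stand-in for the extractor's `_match_id`, not its actual implementation:

```python
import re

# Pattern copied from the diff; the host group is capturing, but only the
# named group `id` is read below.
VALID_URL = r'https?://www\.(zdf|3sat)\.de/(?:[^/]+/)*(?P<id>[^/?]+)\.html'


def match_id(url):
    """Return the `id` group if url matches VALID_URL, else None."""
    m = re.match(VALID_URL, url)
    return m.group('id') if m else None
```

Hosts other than the two listed do not match, so `match_id` returns `None` for them.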
diff --git a/youtube_dl/extractor/zingmp3.py b/youtube_dlc/extractor/zingmp3.py
similarity index 100%
rename from youtube_dl/extractor/zingmp3.py
rename to youtube_dlc/extractor/zingmp3.py
diff --git a/youtube_dlc/extractor/zype.py b/youtube_dlc/extractor/zype.py
new file mode 100644
index 0000000..2e2e97a
--- /dev/null
+++ b/youtube_dlc/extractor/zype.py
@@ -0,0 +1,134 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..compat import compat_HTTPError
+from ..utils import (
+    dict_get,
+    ExtractorError,
+    int_or_none,
+    js_to_json,
+    parse_iso8601,
+)
+
+
+class ZypeIE(InfoExtractor):
+    _ID_RE = r'[\da-fA-F]+'
+    _COMMON_RE = r'//player\.zype\.com/embed/%s\.(?:js|json|html)\?.*?(?:access_token|(?:ap[ip]|player)_key)='
+    _VALID_URL = r'https?:%s[^&]+' % (_COMMON_RE % ('(?P<id>%s)' % _ID_RE))
+    _TEST = {
+        'url': 'https://player.zype.com/embed/5b400b834b32992a310622b9.js?api_key=jZ9GUhRmxcPvX7M3SlfejB6Hle9jyHTdk2jVxG7wOHPLODgncEKVdPYBhuz9iWXQ&autoplay=false&controls=true&da=false',
+        'md5': 'eaee31d474c76a955bdaba02a505c595',
+        'info_dict': {
+            'id': '5b400b834b32992a310622b9',
+            'ext': 'mp4',
+            'title': 'Smoky Barbecue Favorites',
+            'thumbnail': r're:^https?://.*\.jpe?g',
+            'description': 'md5:5ff01e76316bd8d46508af26dc86023b',
+            'timestamp': 1504915200,
+            'upload_date': '20170909',
+        },
+    }
+
+    @staticmethod
+    def _extract_urls(webpage):
+        return [
+            mobj.group('url')
+            for mobj in re.finditer(
+                r'<script[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?%s.+?)\1' % (ZypeIE._COMMON_RE % ZypeIE._ID_RE),
+                webpage)]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        try:
+            response = self._download_json(re.sub(
+                r'\.(?:js|html)\?', '.json?', url), video_id)['response']
+        except ExtractorError as e:
+            if isinstance(e.cause, compat_HTTPError) and e.cause.code in (400, 401, 403):
+                raise ExtractorError(self._parse_json(
+                    e.cause.read().decode(), video_id)['message'], expected=True)
+            raise
+
+        body = response['body']
+        video = response['video']
+        title = video['title']
+
+        if isinstance(body, dict):
+            formats = []
+            for output in body.get('outputs', []):
+                output_url = output.get('url')
+                if not output_url:
+                    continue
+                name = output.get('name')
+                if name == 'm3u8':
+                    formats = self._extract_m3u8_formats(
+                        output_url, video_id, 'mp4',
+                        'm3u8_native', m3u8_id='hls', fatal=False)
+                else:
+                    f = {
+                        'format_id': name,
+                        'tbr': int_or_none(output.get('bitrate')),
+                        'url': output_url,
+                    }
+                    if name in ('m4a', 'mp3'):
+                        f['vcodec'] = 'none'
+                    else:
+                        f.update({
+                            'height': int_or_none(output.get('height')),
+                            'width': int_or_none(output.get('width')),
+                        })
+                    formats.append(f)
+            text_tracks = body.get('subtitles') or []
+        else:
+            m3u8_url = self._search_regex(
+                r'(["\'])(?P<url>(?:(?!\1).)+\.m3u8(?:(?!\1).)*)\1',
+                body, 'm3u8 url', group='url')
+            formats = self._extract_m3u8_formats(
+                m3u8_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls')
+            text_tracks = self._search_regex(
+                r'textTracks\s*:\s*(\[[^]]+\])',
+                body, 'text tracks', default=None)
+            if text_tracks:
+                text_tracks = self._parse_json(
+                    text_tracks, video_id, js_to_json, False)
+        self._sort_formats(formats)
+
+        subtitles = {}
+        if text_tracks:
+            for text_track in text_tracks:
+                tt_url = dict_get(text_track, ('file', 'src'))
+                if not tt_url:
+                    continue
+                subtitles.setdefault(text_track.get('label') or 'English', []).append({
+                    'url': tt_url,
+                })
+
+        thumbnails = []
+        for thumbnail in video.get('thumbnails', []):
+            thumbnail_url = thumbnail.get('url')
+            if not thumbnail_url:
+                continue
+            thumbnails.append({
+                'url': thumbnail_url,
+                'width': int_or_none(thumbnail.get('width')),
+                'height': int_or_none(thumbnail.get('height')),
+            })
+
+        return {
+            'id': video_id,
+            'display_id': video.get('friendly_title'),
+            'title': title,
+            'thumbnails': thumbnails,
+            'description': dict_get(video, ('description', 'ott_description', 'short_description')),
+            'timestamp': parse_iso8601(video.get('published_at')),
+            'duration': int_or_none(video.get('duration')),
+            'view_count': int_or_none(video.get('request_count')),
+            'average_rating': int_or_none(video.get('rating')),
+            'season_number': int_or_none(video.get('season')),
+            'episode_number': int_or_none(video.get('episode')),
+            'formats': formats,
+            'subtitles': subtitles,
+        }
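Note on the Zype extractor above: `_real_extract` pulls the video id out of the embed URL with `_VALID_URL` and then rewrites a `.js` or `.html` embed URL to its `.json` counterpart before downloading. The following is a minimal standalone sketch of just those two steps, reusing the regular expressions from the diff. The helper names `match_id` and `to_json_url` are hypothetical and are not part of the extractor, and the `api_key` value in the usage example is a placeholder.

```python
import re

# Patterns copied from ZypeIE in the diff above.
ID_RE = r'[\da-fA-F]+'
COMMON_RE = r'//player\.zype\.com/embed/%s\.(?:js|json|html)\?.*?(?:access_token|(?:ap[ip]|player)_key)='
VALID_URL = r'https?:%s[^&]+' % (COMMON_RE % ('(?P<id>%s)' % ID_RE))


def match_id(url):
    """Hypothetical helper: return the hex video id, or None if the URL does not match."""
    mobj = re.match(VALID_URL, url)
    return mobj.group('id') if mobj else None


def to_json_url(url):
    """Hypothetical helper: turn a .js/.html embed URL into the .json variant."""
    return re.sub(r'\.(?:js|html)\?', '.json?', url)


if __name__ == '__main__':
    embed = 'https://player.zype.com/embed/5b400b834b32992a310622b9.js?api_key=PLACEHOLDER&autoplay=false'
    print(match_id(embed))
    print(to_json_url(embed))
```

A URL that already ends in `.json?` passes through `to_json_url` unchanged, because the substitution pattern only matches `.js?` and `.html?`.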
diff --git a/youtube_dl/jsinterp.py b/youtube_dlc/jsinterp.py
similarity index 100%
rename from youtube_dl/jsinterp.py
rename to youtube_dlc/jsinterp.py
diff --git a/youtube_dl/options.py b/youtube_dlc/options.py
similarity index 96%
rename from youtube_dl/options.py
rename to youtube_dlc/options.py
index 1ffabc62bedacb42aeb34f585d04ed7bc3ff8045..2cc5eee742dcef08553b12082c10b0e9f23248fb 100644
--- a/youtube_dl/options.py
+++ b/youtube_dlc/options.py
@@ -57,33 +57,33 @@ def _readOptions(filename_bytes, default=[]):
      def _readUserConf():
          xdg_config_home = compat_getenv('XDG_CONFIG_HOME')
          if xdg_config_home:
-            userConfFile = os.path.join(xdg_config_home, 'youtube-dl', 'config')
+            userConfFile = os.path.join(xdg_config_home, 'youtube-dlc', 'config')
              if not os.path.isfile(userConfFile):
-                userConfFile = os.path.join(xdg_config_home, 'youtube-dl.conf')
+                userConfFile = os.path.join(xdg_config_home, 'youtube-dlc.conf')
          else:
-            userConfFile = os.path.join(compat_expanduser('~'), '.config', 'youtube-dl', 'config')
+            userConfFile = os.path.join(compat_expanduser('~'), '.config', 'youtube-dlc', 'config')
              if not os.path.isfile(userConfFile):
-                userConfFile = os.path.join(compat_expanduser('~'), '.config', 'youtube-dl.conf')
+                userConfFile = os.path.join(compat_expanduser('~'), '.config', 'youtube-dlc.conf')
          userConf = _readOptions(userConfFile, None)
  
          if userConf is None:
              appdata_dir = compat_getenv('appdata')
              if appdata_dir:
                  userConf = _readOptions(
-                    os.path.join(appdata_dir, 'youtube-dl', 'config'),
+                    os.path.join(appdata_dir, 'youtube-dlc', 'config'),
                      default=None)
                  if userConf is None:
                      userConf = _readOptions(
-                        os.path.join(appdata_dir, 'youtube-dl', 'config.txt'),
+                        os.path.join(appdata_dir, 'youtube-dlc', 'config.txt'),
                          default=None)
  
          if userConf is None:
              userConf = _readOptions(
-                os.path.join(compat_expanduser('~'), 'youtube-dl.conf'),
+                os.path.join(compat_expanduser('~'), 'youtube-dlc.conf'),
                  default=None)
          if userConf is None:
              userConf = _readOptions(
-                os.path.join(compat_expanduser('~'), 'youtube-dl.conf.txt'),
+                os.path.join(compat_expanduser('~'), 'youtube-dlc.conf.txt'),
                  default=None)
  
          if userConf is None:
@@ -134,7 +134,7 @@ def _comma_separated_values_options_callback(option, opt_str, value, parser):
          action='help',
          help='Print this help text and exit')
      general.add_option(
-        '-v', '--version',
+        '--version',
          action='version',
          help='Print program version and exit')
      general.add_option(
@@ -168,14 +168,14 @@ def _comma_separated_values_options_callback(option, opt_str, value, parser):
      general.add_option(
          '--default-search',
          dest='default_search', metavar='PREFIX',
-        help='Use this prefix for unqualified URLs. For example "gvsearch2:" downloads two videos from google videos for youtube-dl "large apple". Use the value "auto" to let youtube-dl guess ("auto_warning" to emit a warning when guessing). "error" just throws an error. The default value "fixup_error" repairs broken URLs, but emits an error if this is not possible instead of searching.')
+        help='Use this prefix for unqualified URLs. For example "gvsearch2:" downloads two videos from google videos for youtube-dlc "large apple". Use the value "auto" to let youtube-dlc guess ("auto_warning" to emit a warning when guessing). "error" just throws an error. The default value "fixup_error" repairs broken URLs, but emits an error if this is not possible instead of searching.')
      general.add_option(
          '--ignore-config',
          action='store_true',
          help='Do not read configuration files. '
-        'When given in the global configuration file /etc/youtube-dl.conf: '
-        'Do not read the user configuration in ~/.config/youtube-dl/config '
-        '(%APPDATA%/youtube-dl/config.txt on Windows)')
+        'When given in the global configuration file /etc/youtube-dlc.conf: '
+        'Do not read the user configuration in ~/.config/youtube-dlc/config '
+        '(%APPDATA%/youtube-dlc/config.txt on Windows)')
      general.add_option(
          '--config-location',
          dest='config_location', metavar='PATH',
@@ -357,7 +357,7 @@ def _comma_separated_values_options_callback(option, opt_str, value, parser):
      authentication.add_option(
          '-p', '--password',
          dest='password', metavar='PASSWORD',
-        help='Account password. If this option is left out, youtube-dl will ask interactively.')
+        help='Account password. If this option is left out, youtube-dlc will ask interactively.')
      authentication.add_option(
          '-2', '--twofactor',
          dest='twofactor', metavar='TWOFACTOR',
@@ -383,7 +383,7 @@ def _comma_separated_values_options_callback(option, opt_str, value, parser):
      adobe_pass.add_option(
          '--ap-password',
          dest='ap_password', metavar='PASSWORD',
-        help='Multiple-system operator account password. If this option is left out, youtube-dl will ask interactively.')
+        help='Multiple-system operator account password. If this option is left out, youtube-dlc will ask interactively.')
      adobe_pass.add_option(
          '--ap-list-mso',
          action='store_true', dest='ap_list_mso', default=False,
@@ -670,11 +670,11 @@ def _comma_separated_values_options_callback(option, opt_str, value, parser):
      verbosity.add_option(
          '-C', '--call-home',
          dest='call_home', action='store_true', default=False,
-        help='Contact the youtube-dl server for debugging')
+        help='Contact the youtube-dlc server for debugging')
      verbosity.add_option(
          '--no-call-home',
          dest='call_home', action='store_false', default=False,
-        help='Do NOT contact the youtube-dl server for debugging')
+        help='Do NOT contact the youtube-dlc server for debugging')
  
      filesystem = optparse.OptionGroup(parser, 'Filesystem Options')
      filesystem.add_option(
@@ -720,7 +720,7 @@ def _comma_separated_values_options_callback(option, opt_str, value, parser):
      filesystem.add_option(
          '-c', '--continue',
          action='store_true', dest='continue_dl', default=True,
-        help='Force resume of partially downloaded files. By default, youtube-dl will resume downloads if possible.')
+        help='Force resume of partially downloaded files. By default, youtube-dlc will resume downloads if possible.')
      filesystem.add_option(
          '--no-continue',
          action='store_false', dest='continue_dl',
@@ -755,7 +755,7 @@ def _comma_separated_values_options_callback(option, opt_str, value, parser):
          help='File to read cookies from and dump cookie jar in')
      filesystem.add_option(
          '--cache-dir', dest='cachedir', default=None, metavar='DIR',
-        help='Location in the filesystem where youtube-dl can store some downloaded information permanently. By default $XDG_CACHE_HOME/youtube-dl or ~/.cache/youtube-dl . At the moment, only YouTube player files (for videos with obfuscated signatures) are cached, but that may change.')
+        help='Location in the filesystem where youtube-dlc can store some downloaded information permanently. By default $XDG_CACHE_HOME/youtube-dlc or ~/.cache/youtube-dlc . At the moment, only YouTube player files (for videos with obfuscated signatures) are cached, but that may change.')
      filesystem.add_option(
          '--no-cache-dir', action='store_const', const=False, dest='cachedir',
          help='Disable filesystem caching')
@@ -853,7 +853,7 @@ def _comma_separated_values_options_callback(option, opt_str, value, parser):
      postproc.add_option(
          '--exec',
          metavar='CMD', dest='exec_cmd',
-        help='Execute a command on the file after downloading, similar to find\'s -exec syntax. Example: --exec \'adb push {} /sdcard/Music/ && rm {}\'')
+        help='Execute a command on the file after downloading and post-processing, similar to find\'s -exec syntax. Example: --exec \'adb push {} /sdcard/Music/ && rm {}\'')
      postproc.add_option(
          '--convert-subs', '--convert-subtitles',
          metavar='FORMAT', dest='convertsubtitles', default=None,
@@ -892,14 +892,14 @@ def compat_conf(conf):
          if '--config-location' in command_line_conf:
              location = compat_expanduser(opts.config_location)
              if os.path.isdir(location):
-                location = os.path.join(location, 'youtube-dl.conf')
+                location = os.path.join(location, 'youtube-dlc.conf')
              if not os.path.exists(location):
                  parser.error('config-location %s does not exist.' % location)
              custom_conf = _readOptions(location)
          elif '--ignore-config' in command_line_conf:
              pass
          else:
-            system_conf = _readOptions('/etc/youtube-dl.conf')
+            system_conf = _readOptions('/etc/youtube-dlc.conf')
              if '--ignore-config' not in system_conf:
                  user_conf = _readUserConf()
  
diff --git a/youtube_dl/postprocessor/__init__.py b/youtube_dlc/postprocessor/__init__.py
similarity index 100%
rename from youtube_dl/postprocessor/__init__.py
rename to youtube_dlc/postprocessor/__init__.py
diff --git a/youtube_dl/postprocessor/common.py b/youtube_dlc/postprocessor/common.py
similarity index 100%
rename from youtube_dl/postprocessor/common.py
rename to youtube_dlc/postprocessor/common.py
diff --git a/youtube_dl/postprocessor/embedthumbnail.py b/youtube_dlc/postprocessor/embedthumbnail.py
similarity index 61%
rename from youtube_dl/postprocessor/embedthumbnail.py
rename to youtube_dlc/postprocessor/embedthumbnail.py
index 56be914b8f1b6e98802163ae1013392079d93fb3..e66558ea6fe249bd90d860d818912592be0651e3 100644
--- a/youtube_dl/postprocessor/embedthumbnail.py
+++ b/youtube_dlc/postprocessor/embedthumbnail.py
@@ -41,6 +41,28 @@ def run(self, info):
                  'Skipping embedding the thumbnail because the file is missing.')
              return [], info
  
+        # Check for mislabeled webp file
+        with open(encodeFilename(thumbnail_filename), "rb") as f:
+            b = f.read(16)
+        if b'\x57\x45\x42\x50' in b:  # Binary for WEBP
+            [thumbnail_filename_path, thumbnail_filename_extension] = os.path.splitext(thumbnail_filename)
+            if not thumbnail_filename_extension == ".webp":
+                webp_thumbnail_filename = thumbnail_filename_path + ".webp"
+                os.rename(encodeFilename(thumbnail_filename), encodeFilename(webp_thumbnail_filename))
+                thumbnail_filename = webp_thumbnail_filename
+
+        # If not a jpg or png thumbnail, convert it to jpg using ffmpeg
+        if not os.path.splitext(thumbnail_filename)[1].lower() in ['.jpg', '.png']:
+            jpg_thumbnail_filename = os.path.splitext(thumbnail_filename)[0] + ".jpg"
+            jpg_thumbnail_filename = os.path.join(os.path.dirname(jpg_thumbnail_filename), os.path.basename(jpg_thumbnail_filename).replace('%', '_'))  # ffmpeg interprets % as image sequence
+
+            self._downloader.to_screen('[ffmpeg] Converting thumbnail "%s" to JPEG' % thumbnail_filename)
+
+            self.run_ffmpeg(thumbnail_filename, jpg_thumbnail_filename, ['-bsf:v', 'mjpeg2jpeg'])
+
+            os.remove(encodeFilename(thumbnail_filename))
+            thumbnail_filename = jpg_thumbnail_filename
+
          if info['ext'] == 'mp3':
              options = [
                  '-c', 'copy', '-map', '0', '-map', '1',
@@ -55,6 +77,25 @@ def run(self, info):
              os.remove(encodeFilename(filename))
              os.rename(encodeFilename(temp_filename), encodeFilename(filename))
  
+        elif info['ext'] == 'mkv':
+            os.rename(encodeFilename(thumbnail_filename), encodeFilename('cover.jpg'))
+            old_thumbnail_filename = thumbnail_filename
+            thumbnail_filename = 'cover.jpg'
+
+            options = [
+                '-c', 'copy', '-attach', thumbnail_filename, '-metadata:s:t', 'mimetype=image/jpeg']
+
+            self._downloader.to_screen('[ffmpeg] Adding thumbnail to "%s"' % filename)
+
+            self.run_ffmpeg_multiple_files([filename], temp_filename, options)
+
+            if not self._already_have_thumbnail:
+                os.remove(encodeFilename(thumbnail_filename))
+            else:
+                os.rename(encodeFilename(thumbnail_filename), encodeFilename(old_thumbnail_filename))
+            os.remove(encodeFilename(filename))
+            os.rename(encodeFilename(temp_filename), encodeFilename(filename))
+
          elif info['ext'] in ['m4a', 'mp4']:
              if not check_executable('AtomicParsley', ['-v']):
                  raise EmbedThumbnailPPError('AtomicParsley was not found. Please install.')
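Note on the thumbnail hunk above: the new code reads the first 16 bytes of the thumbnail, looks for the ASCII marker `WEBP`, and renames the file to a `.webp` extension if it carries a different one. The following is a minimal sketch of that decision as pure functions, so it can be exercised without touching the filesystem. The names `has_webp_marker` and `corrected_name` are hypothetical, and the sample header bytes are made up for illustration.

```python
import os


def has_webp_marker(header):
    """Mirror the diff's check: the bytes 'WEBP' appear in the first 16 bytes."""
    return b'\x57\x45\x42\x50' in header[:16]


def corrected_name(path, header):
    """Hypothetical helper: return the name the file should carry given its header."""
    root, ext = os.path.splitext(path)
    if has_webp_marker(header) and ext != '.webp':
        return root + '.webp'
    return path


if __name__ == '__main__':
    # A RIFF container header followed by the WEBP form type (illustrative bytes).
    webp_header = b'RIFF\x24\x00\x00\x00WEBPVP8 '
    jpeg_header = b'\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\x00\x00\x01'
    print(corrected_name('thumb.jpg', webp_header))
    print(corrected_name('thumb.jpg', jpeg_header))
```

The substring test is deliberately as loose as the one in the diff. A stricter variant could additionally require `RIFF` at offset 0 and `WEBP` at offset 8, but that is not what this commit does.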
diff --git a/youtube_dl/postprocessor/execafterdownload.py b/youtube_dlc/postprocessor/execafterdownload.py
similarity index 100%
rename from youtube_dl/postprocessor/execafterdownload.py
rename to youtube_dlc/postprocessor/execafterdownload.py
diff --git a/youtube_dl/postprocessor/ffmpeg.py b/youtube_dlc/postprocessor/ffmpeg.py
similarity index 97%
rename from youtube_dl/postprocessor/ffmpeg.py
rename to youtube_dlc/postprocessor/ffmpeg.py
index fd3f921a8a11da2e8c31573889ea4d7f5a9fea25..dbc736c50930670303783c047ce261067a57c744 100644
--- a/youtube_dl/postprocessor/ffmpeg.py
+++ b/youtube_dlc/postprocessor/ffmpeg.py
@@ -447,6 +447,13 @@ def add(meta_list, info_list=None):
                          metadata[meta_f] = info[info_f]
                      break
  
+        # See [1-4] for some info on media metadata/metadata supported
+        # by ffmpeg.
+        # 1. https://kdenlive.org/en/project/adding-meta-data-to-mp4-video/
+        # 2. https://wiki.multimedia.cx/index.php/FFmpeg_Metadata
+        # 3. https://kodi.wiki/view/Video_file_tagging
+        # 4. http://atomicparsley.sourceforge.net/mpeg-4files.html
+
          add('title', ('track', 'title'))
          add('date', 'upload_date')
          add(('description', 'comment'), 'description')
@@ -457,6 +464,10 @@ def add(meta_list, info_list=None):
          add('album')
          add('album_artist')
          add('disc', 'disc_number')
+        add('show', 'series')
+        add('season_number')
+        add('episode_id', ('episode', 'episode_id'))
+        add('episode_sort', 'episode_number')
  
          if not metadata:
              self._downloader.to_screen('[ffmpeg] There isn\'t any metadata to add')
@@ -522,7 +533,7 @@ def can_merge(self):
          if is_outdated_version(
                  self._versions[self.basename], required_version):
              warning = ('Your copy of %s is outdated and unable to properly mux separate video and audio files, '
-                       'youtube-dl will download single file media. '
+                       'youtube-dlc will download single file media. '
                         'Update %s to version %s or newer to fix this.') % (
                             self.basename, self.basename, required_version)
              if self._downloader:
diff --git a/youtube_dl/postprocessor/metadatafromtitle.py b/youtube_dlc/postprocessor/metadatafromtitle.py
similarity index 100%
rename from youtube_dl/postprocessor/metadatafromtitle.py
rename to youtube_dlc/postprocessor/metadatafromtitle.py
diff --git a/youtube_dl/postprocessor/xattrpp.py b/youtube_dlc/postprocessor/xattrpp.py
similarity index 100%
rename from youtube_dl/postprocessor/xattrpp.py
rename to youtube_dlc/postprocessor/xattrpp.py
diff --git a/youtube_dl/socks.py b/youtube_dlc/socks.py
similarity index 100%
rename from youtube_dl/socks.py
rename to youtube_dlc/socks.py
diff --git a/youtube_dl/swfinterp.py b/youtube_dlc/swfinterp.py
similarity index 100%
rename from youtube_dl/swfinterp.py
rename to youtube_dlc/swfinterp.py
diff --git a/youtube_dl/update.py b/youtube_dlc/update.py
similarity index 90%
rename from youtube_dl/update.py
rename to youtube_dlc/update.py
index 002ea7f3386215c61bcf3bc60419d0059abf2bc2..d95a07c0ca94b01b836bc7b62bfc5fc505f66873 100644
--- a/youtube_dl/update.py
+++ b/youtube_dlc/update.py
@@ -9,6 +9,7 @@
  import sys
  from zipimport import zipimporter
  
+from .compat import compat_realpath
  from .utils import encode_compat_str
  
  from .version import __version__
@@ -37,7 +38,7 @@ def update_self(to_screen, verbose, opener):
      UPDATES_RSA_KEY = (0x9d60ee4d8f805312fdb15a62f87b95bd66177b91df176765d13514a0f1754bcd2057295c5b6f1d35daa6742c3ffc9a82d3e118861c207995a8031e151d863c9927e304576bc80692bc8e094896fcf11b66f3e29e04e3a71e9a11558558acea1840aec37fc396fb6b65dc81a1c4144e03bd1c011de62e3f1357b327d08426fe93, 65537)
  
      if not isinstance(globals().get('__loader__'), zipimporter) and not hasattr(sys, 'frozen'):
-        to_screen('It looks like you installed youtube-dl with a package manager, pip, setup.py or a tarball. Please use that to update.')
+        to_screen('It looks like you installed youtube-dlc with a package manager, pip, setup.py or a tarball. Please use that to update.')
          return
  
      # Check if there is a new version
@@ -49,7 +50,7 @@ def update_self(to_screen, verbose, opener):
          to_screen('ERROR: can\'t find the current version. Please try again later.')
          return
      if newversion == __version__:
-        to_screen('youtube-dl is up-to-date (' + __version__ + ')')
+        to_screen('youtube-dlc is up-to-date (' + __version__ + ')')
          return
  
      # Download and check versions info
@@ -75,7 +76,7 @@ def update_self(to_screen, verbose, opener):
      def version_tuple(version_str):
          return tuple(map(int, version_str.split('.')))
      if version_tuple(__version__) >= version_tuple(version_id):
-        to_screen('youtube-dl is up to date (%s)' % __version__)
+        to_screen('youtube-dlc is up to date (%s)' % __version__)
          return
  
      to_screen('Updating to version ' + version_id + ' ...')
@@ -84,7 +85,9 @@ def version_tuple(version_str):
      print_notes(to_screen, versions_info['versions'])
  
      # sys.executable is set to the full pathname of the exe-file for py2exe
-    filename = sys.executable if hasattr(sys, 'frozen') else sys.argv[0]
+    # though symlinks are not followed so that we need to do this manually
+    # with help of realpath
+    filename = compat_realpath(sys.executable if hasattr(sys, 'frozen') else sys.argv[0])
  
      if not os.access(filename, os.W_OK):
          to_screen('ERROR: no write permissions on %s' % filename)
@@ -123,14 +126,14 @@ def version_tuple(version_str):
              return
  
          try:
-            bat = os.path.join(directory, 'youtube-dl-updater.bat')
+            bat = os.path.join(directory, 'youtube-dlc-updater.bat')
              with io.open(bat, 'w') as batfile:
                  batfile.write('''
  @echo off
  echo Waiting for file handle to be closed ...
  ping 127.0.0.1 -n 5 -w 1000 > NUL
  move /Y "%s.new" "%s" > NUL
-echo Updated youtube-dl to version %s.
+echo Updated youtube-dlc to version %s.
  start /b "" cmd /c del "%%~f0"&exit /b"
                  \n''' % (exe, exe, version_id))
  
@@ -168,7 +171,7 @@ def version_tuple(version_str):
              to_screen('ERROR: unable to overwrite current version')
              return
  
-    to_screen('Updated youtube-dl. Restart youtube-dl to use the new version.')
+    to_screen('Updated youtube-dlc. Restart youtube-dlc to use the new version.')
  
  
  def get_notes(versions, fromVersion):
diff --git a/youtube_dl/utils.py b/youtube_dlc/utils.py
similarity index 97%
rename from youtube_dl/utils.py
rename to youtube_dlc/utils.py
index f6204692a81002cdfc44b02d183126e755283bd9..32b179c6fcbf2a8c5189e51cb27bf7e80dec8670 100644
--- a/youtube_dl/utils.py
+++ b/youtube_dlc/utils.py
@@ -7,6 +7,7 @@
  import binascii
  import calendar
  import codecs
+import collections
  import contextlib
  import ctypes
  import datetime
@@ -30,6 +31,7 @@
  import subprocess
  import sys
  import tempfile
+import time
  import traceback
  import xml.etree.ElementTree
  import zlib
@@ -1835,6 +1837,12 @@ def write_json_file(obj, fn):
                  os.unlink(fn)
              except OSError:
                  pass
+        try:
+            mask = os.umask(0)
+            os.umask(mask)
+            os.chmod(tf.name, 0o666 & ~mask)
+        except OSError:
+            pass
          os.rename(tf.name, fn)
      except Exception:
          try:
@@ -2309,12 +2317,12 @@ def make_HTTPS_handler(params, **kwargs):
  
  def bug_reports_message():
      if ytdl_is_updateable():
-        update_cmd = 'type  youtube-dl -U  to update'
+        update_cmd = 'type  youtube-dlc -U  to update'
      else:
          update_cmd = 'see  https://yt-dl.org/update  on how to update'
      msg = '; please report this issue on https://yt-dl.org/bug .'
      msg += ' Make sure you are using the latest version; %s.' % update_cmd
-    msg += ' Be sure to call youtube-dl with the --verbose flag and include its complete output.'
+    msg += ' Be sure to call youtube-dlc with the --verbose flag and include its complete output.'
      return msg
  
  
@@ -2328,7 +2336,7 @@ class ExtractorError(YoutubeDLError):
  
      def __init__(self, msg, tb=None, expected=False, cause=None, video_id=None):
          """ tb, if given, is the original traceback (so that it can be printed out).
-        If expected is set, this is a normal error message and most likely not a bug in youtube-dl.
+        If expected is set, this is a normal error message and most likely not a bug in youtube-dlc.
          """
  
          if sys.exc_info()[0] in (compat_urllib_error.URLError, socket.timeout, UnavailableVideoError):
@@ -2729,15 +2737,72 @@ def https_open(self, req):
  
  
  class YoutubeDLCookieJar(compat_cookiejar.MozillaCookieJar):
+    """
+    See [1] for cookie file format.
+
+    1. https://curl.haxx.se/docs/http-cookies.html
+    """
      _HTTPONLY_PREFIX = '#HttpOnly_'
+    _ENTRY_LEN = 7
+    _HEADER = '''# Netscape HTTP Cookie File
+# This file is generated by youtube-dlc.  Do not edit.
+
+'''
+    _CookieFileEntry = collections.namedtuple(
+        'CookieFileEntry',
+        ('domain_name', 'include_subdomains', 'path', 'https_only', 'expires_at', 'name', 'value'))
  
      def save(self, filename=None, ignore_discard=False, ignore_expires=False):
+        """
+        Save cookies to a file.
+
+        Most of the code is taken from CPython 3.8 and slightly adapted
+        to support cookie files with UTF-8 in both python 2 and 3.
+        """
+        if filename is None:
+            if self.filename is not None:
+                filename = self.filename
+            else:
+                raise ValueError(compat_cookiejar.MISSING_FILENAME_TEXT)
+
          # Store session cookies with `expires` set to 0 instead of an empty
          # string
          for cookie in self:
              if cookie.expires is None:
                  cookie.expires = 0
-        compat_cookiejar.MozillaCookieJar.save(self, filename, ignore_discard, ignore_expires)
+
+        with io.open(filename, 'w', encoding='utf-8') as f:
+            f.write(self._HEADER)
+            now = time.time()
+            for cookie in self:
+                if not ignore_discard and cookie.discard:
+                    continue
+                if not ignore_expires and cookie.is_expired(now):
+                    continue
+                if cookie.secure:
+                    secure = 'TRUE'
+                else:
+                    secure = 'FALSE'
+                if cookie.domain.startswith('.'):
+                    initial_dot = 'TRUE'
+                else:
+                    initial_dot = 'FALSE'
+                if cookie.expires is not None:
+                    expires = compat_str(cookie.expires)
+                else:
+                    expires = ''
+                if cookie.value is None:
+                    # cookies.txt regards 'Set-Cookie: foo' as a cookie
+                    # with no name, whereas http.cookiejar regards it as a
+                    # cookie with no value.
+                    name = ''
+                    value = cookie.name
+                else:
+                    name = cookie.name
+                    value = cookie.value
+                f.write(
+                    '\t'.join([cookie.domain, initial_dot, cookie.path,
+                               secure, expires, name, value]) + '\n')
  
      def load(self, filename=None, ignore_discard=False, ignore_expires=False):
          """Load cookies from a file."""
@@ -2747,12 +2812,30 @@ def load(self, filename=None, ignore_discard=False, ignore_expires=False):
              else:
                  raise ValueError(compat_cookiejar.MISSING_FILENAME_TEXT)
  
+        def prepare_line(line):
+            if line.startswith(self._HTTPONLY_PREFIX):
+                line = line[len(self._HTTPONLY_PREFIX):]
+            # comments and empty lines are fine
+            if line.startswith('#') or not line.strip():
+                return line
+            cookie_list = line.split('\t')
+            if len(cookie_list) != self._ENTRY_LEN:
+                raise compat_cookiejar.LoadError('invalid length %d' % len(cookie_list))
+            cookie = self._CookieFileEntry(*cookie_list)
+            if cookie.expires_at and not cookie.expires_at.isdigit():
+                raise compat_cookiejar.LoadError('invalid expires at %s' % cookie.expires_at)
+            return line
+
          cf = io.StringIO()
-        with open(filename) as f:
+        with io.open(filename, encoding='utf-8') as f:
              for line in f:
-                if line.startswith(self._HTTPONLY_PREFIX):
-                    line = line[len(self._HTTPONLY_PREFIX):]
-                cf.write(compat_str(line))
+                try:
+                    cf.write(prepare_line(line))
+                except compat_cookiejar.LoadError as e:
+                    write_string(
+                        'WARNING: skipping cookie file entry due to %s: %r\n'
+                        % (e, line), sys.stderr)
+                    continue
          cf.seek(0)
          self._really_load(cf, filename, ignore_discard, ignore_expires)
          # Session cookies are denoted by either `expires` field set to
@@ -2795,6 +2878,15 @@ def http_response(self, request, response):
      https_response = http_response
  
  
+class YoutubeDLRedirectHandler(compat_urllib_request.HTTPRedirectHandler):
+    if sys.version_info[0] < 3:
+        def redirect_request(self, req, fp, code, msg, headers, newurl):
+            # On python 2 urlh.geturl() may sometimes return redirect URL
+            # as byte string instead of unicode. This workaround allows
+            # to force it always return unicode.
+            return compat_urllib_request.HTTPRedirectHandler.redirect_request(self, req, fp, code, msg, headers, compat_str(newurl))
+
+
  def extract_timezone(date_str):
      m = re.search(
          r'^.{8,}?(?P<tz>Z$| ?(?P<sign>\+|-)(?P<hours>[0-9]{2}):?(?P<minutes>[0-9]{2})$)',
@@ -3640,7 +3732,7 @@ def get_exe_version(exe, args=['--version'],
      or False if the executable is not present """
      try:
          # STDIN should be redirected too. On UNIX-like systems, ffmpeg triggers
-        # SIGTTOU if youtube-dl is run in the background.
+        # SIGTTOU if youtube-dlc is run in the background.
          # See https://github.com/ytdl-org/youtube-dl/issues/955#issuecomment-209789656
          out, _ = subprocess.Popen(
              [encodeArgument(exe)] + args,
@@ -4052,7 +4144,7 @@ def is_outdated_version(version, limit, assume_new=True):
  
  
  def ytdl_is_updateable():
-    """ Returns if youtube-dl can be updated with -U """
+    """ Returns if youtube-dlc can be updated with -U """
      from zipimport import zipimporter
  
      return isinstance(globals().get('__loader__'), zipimporter) or hasattr(sys, 'frozen')
@@ -4081,6 +4173,7 @@ def mimetype2ext(mt):
          # Per RFC 3003, audio/mpeg can be .mp1, .mp2 or .mp3. Here use .mp3 as
          # it's the most popular one
          'audio/mpeg': 'mp3',
+        'audio/x-wav': 'wav',
      }.get(mt)
      if ext is not None:
          return ext
@@ -4106,6 +4199,7 @@ def mimetype2ext(mt):
          'vnd.ms-sstr+xml': 'ism',
          'quicktime': 'mov',
          'mp2t': 'ts',
+        'x-wav': 'wav',
      }.get(res, res)
  
  
@@ -5260,7 +5354,7 @@ def proxy_open(self, req, proxy, type):
              return None  # No Proxy
          if compat_urlparse.urlparse(proxy).scheme.lower() in ('socks', 'socks4', 'socks4a', 'socks5'):
              req.add_header('Ytdl-socks-proxy', proxy)
-            # youtube-dl's http/https handlers do wrapping the socket with socks
+            # youtube-dlc's http/https handlers do wrapping the socket with socks
              return None
          return compat_urllib_request.ProxyHandler.proxy_open(
              self, req, proxy, type)
@@ -5533,7 +5627,7 @@ def write_xattr(path, key, value):
                  # TODO: fallback to CLI tools
                  raise XAttrUnavailableError(
                      'python-pyxattr is detected but is too old. '
-                    'youtube-dl requires %s or above while your version is %s. '
+                    'youtube-dlc requires %s or above while your version is %s. '
                      'Falling back to other xattr implementations' % (
                          pyxattr_required_version, xattr.__version__))
  
diff --git a/youtube_dl/version.py b/youtube_dlc/version.py
similarity index 60%
rename from youtube_dl/version.py
rename to youtube_dlc/version.py
index 1227abc0a74c7a7617650e1ac37a7eab7ebbe78c..45b4d329107909e56532b13d0686eb116c2573b3 100644
--- a/youtube_dl/version.py
+++ b/youtube_dlc/version.py
@@ -1,3 +1,3 @@
  from __future__ import unicode_literals
  
-__version__ = '2019.11.28'
+__version__ = '2020.09.06'
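
The two `mimetype2ext` hunks earlier in this diff add a `wav` mapping in two places: once keyed by the full MIME type (`audio/x-wav`) and once keyed by the bare subtype (`x-wav`). A minimal sketch of why both entries matter is given below. The function name `mimetype2ext_sketch` is hypothetical, the tables contain only the entries visible in the hunks, and the step that derives the subtype (`res`) is an assumption, since the diff does not show how `res` is computed.

```python
def mimetype2ext_sketch(mt):
    """Simplified two-stage MIME type to extension lookup (illustrative only)."""
    if mt is None:
        return None

    # Stage 1: exact match on the full MIME type.
    ext = {
        'audio/mpeg': 'mp3',
        'audio/x-wav': 'wav',
    }.get(mt)
    if ext is not None:
        return ext

    # Stage 2 (assumed): take the part after the last '/', drop any
    # ';parameter' suffix, and normalise whitespace and case.
    _, _, res = mt.rpartition('/')
    res = res.split(';')[0].strip().lower()

    # Fall back to the subtype itself when it has no table entry.
    return {
        'vnd.ms-sstr+xml': 'ism',
        'quicktime': 'mov',
        'mp2t': 'ts',
        'x-wav': 'wav',
    }.get(res, res)
```

Under these assumptions, a plain `audio/x-wav` is resolved by the first table, while a value carrying parameters, such as `audio/x-wav; charset=utf-8`, misses the exact match and is resolved by the subtype table instead.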