Merge branch 'subtitles-rework'

author Jaime Marquínez Ferrándiz <redacted>

Mon, 23 Feb 2015 16:13:03 +0000 (17:13 +0100)

committer Jaime Marquínez Ferrándiz <redacted>

Mon, 23 Feb 2015 16:13:03 +0000 (17:13 +0100)
diff --git a/AUTHORS b/AUTHORS

index 47f12a9eefbf2fb0c050c38f605b2bed8170c772..bdd2a15dcf910938857ed3d3fb1161c3fd72b72e 100644 (file)
--- a/AUTHORS
+++ b/AUTHORS
@@ -111,3 +111,4 @@ Paul Hartmann
  Frans de Jonge
  Robin de Rooij
  Ryan Schmidt
+Leslie P. Polzer
diff --git a/Makefile b/Makefile

index 07c90c225aef878d451d059d8097d74096f56282..7087329564aa17db726d22fb3ce1fded54778635 100644 (file)
--- a/Makefile
+++ b/Makefile
@@ -1,7 +1,7 @@
  all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites
  
  clean:
-       rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish *.dump *.part *.info.json *.mp4 *.flv *.mp3 CONTRIBUTING.md.tmp youtube-dl youtube-dl.exe
+       rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish *.dump *.part *.info.json *.mp4 *.flv *.mp3 *.avi CONTRIBUTING.md.tmp youtube-dl youtube-dl.exe
  
  PREFIX ?= /usr/local
  BINDIR ?= $(PREFIX)/bin
diff --git a/README.md b/README.md

index 731cea1e1cfb21534585103cb7a95075afaeac6a..8ea31d6059f05ac179f3879622af26cfc37ac110 100644 (file)
--- a/README.md
+++ b/README.md
@@ -161,6 +161,8 @@ ## Download Options:
      --playlist-reverse               Download playlist videos in reverse order
      --xattr-set-filesize             (experimental) set file xattribute
                                       ytdl.filesize with expected filesize
+    --hls-prefer-native              (experimental) Use the native HLS
+                                     downloader instead of ffmpeg.
      --external-downloader COMMAND    (experimental) Use the specified external
                                       downloader. Currently supports
                                       aria2c,curl,wget
@@ -397,6 +399,9 @@ ## Post-processing Options:
                                       postprocessors (default)
      --prefer-ffmpeg                  Prefer ffmpeg over avconv for running the
                                       postprocessors
+    --ffmpeg-location PATH           Location of the ffmpeg/avconv binary;
+                                     either the path to the binary or its
+                                     containing directory.
      --exec CMD                       Execute a command on the file after
                                       downloading, similar to find's -exec
                                       syntax. Example: --exec 'adb push {}
@@ -510,11 +515,15 @@ ### I extracted a video URL with -g, but it does not play on another machine / i
  
  ### ERROR: no fmt_url_map or conn information found in video info
  
-youtube has switched to a new video info format in July 2011 which is not supported by old versions of youtube-dl. You can update youtube-dl with `sudo youtube-dl --update`.
+YouTube has switched to a new video info format in July 2011 which is not supported by old versions of youtube-dl. See [above](#how-do-i-update-youtube-dl) for how to update youtube-dl.
  
  ### ERROR: unable to download video ###
  
-youtube requires an additional signature since September 2012 which is not supported by old versions of youtube-dl. You can update youtube-dl with `sudo youtube-dl --update`.
+YouTube requires an additional signature since September 2012 which is not supported by old versions of youtube-dl. See [above](#how-do-i-update-youtube-dl) for how to update youtube-dl.
+
+### ExtractorError: Could not find JS function u'OF'
+
+In February 2015, the new YouTube player contained a character sequence in a string that was misinterpreted by old versions of youtube-dl. See [above](#how-do-i-update-youtube-dl) for how to update youtube-dl.
  
  ### SyntaxError: Non-ASCII character ###
  
@@ -562,7 +571,7 @@ ### Can you add support for this anime video site, or site which shows current m
  
  ### How can I detect whether a given URL is supported by youtube-dl?
  
-For one, have a look at the [list of supported sites](docs/supportedsites.md). Note that it can sometimes happen that the site changes its URL scheme (say, from http://example.com/v/1234567 to http://example.com/v/1234567 ) and youtube-dl reports an URL of a service in that list as unsupported. In that case, simply report a bug.
+For one, have a look at the [list of supported sites](docs/supportedsites.md). Note that it can sometimes happen that the site changes its URL scheme (say, from http://example.com/video/1234567 to http://example.com/v/1234567 ) and youtube-dl reports an URL of a service in that list as unsupported. In that case, simply report a bug.
  
  It is *not* possible to detect whether a URL is supported or not. That's because youtube-dl contains a generic extractor which matches **all** URLs. You may be tempted to disable, exclude, or remove the generic extractor, but the generic extractor not only allows users to extract videos from lots of websites that embed a video from another service, but may also be used to extract video from a service that it's hosting itself. Therefore, we neither recommend nor support disabling, excluding, or removing the generic extractor.
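
Note on `--ffmpeg-location` (README hunk above): the option text says PATH may be either the binary itself or the directory containing it. A minimal sketch of how such a value could be expanded into candidate paths follows. The helper name `resolve_binary` and its `names` default are hypothetical and are not taken from this commit.

```python
import os


def resolve_binary(location, names=('ffmpeg', 'avconv')):
    """Hypothetical helper: expand a --ffmpeg-location style value.

    If `location` is an existing directory, return one candidate path per
    binary name inside it; otherwise treat `location` as the binary path.
    """
    if os.path.isdir(location):
        return [os.path.join(location, name) for name in names]
    return [location]
```

A caller would then probe each candidate in order and keep the first one that runs.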
  
diff --git a/devscripts/check-porn.py b/devscripts/check-porn.py

index 216282712c1b38b96c049f74a6cfe8a0fcd30806..6a5bd9eda333246c47064bf84cfc03da09de4caf 100644 (file)
--- a/devscripts/check-porn.py
+++ b/devscripts/check-porn.py
@@ -45,12 +45,12 @@
  
          RESULT = ('.' + domain + '\n' in LIST or '\n' + domain + '\n' in LIST)
  
-    if RESULT and ('info_dict' not in test or 'age_limit' not in test['info_dict']
-                   or test['info_dict']['age_limit'] != 18):
+    if RESULT and ('info_dict' not in test or 'age_limit' not in test['info_dict'] or
+                   test['info_dict']['age_limit'] != 18):
          print('\nPotential missing age_limit check: {0}'.format(test['name']))
  
-    elif not RESULT and ('info_dict' in test and 'age_limit' in test['info_dict']
-                         and test['info_dict']['age_limit'] == 18):
+    elif not RESULT and ('info_dict' in test and 'age_limit' in test['info_dict'] and
+                         test['info_dict']['age_limit'] == 18):
          print('\nPotential false negative: {0}'.format(test['name']))
  
      else:
diff --git a/docs/supportedsites.md b/docs/supportedsites.md

index 8bce8fede2ffd39feaba82c4494bcbbe0b405b40..9f70db80ac39c6eaac088d951eca4382e01946c2 100644 (file)
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@@ -1,4 +1,5 @@
  # Supported sites
+ - **1tv**: Первый канал
   - **1up.com**
   - **220.ro**
   - **24video**
@@ -60,14 +61,19 @@ # Supported sites
   - **Brightcove**
   - **BuzzFeed**
   - **BYUtv**
+ - **Camdemy**
+ - **CamdemyFolder**
   - **Canal13cl**
   - **canalc2.tv**
   - **Canalplus**: canalplus.fr, piwiplus.fr and d8.tv
   - **CBS**
   - **CBSNews**: CBS News
+ - **CBSSports**
   - **CeskaTelevize**
   - **channel9**: Channel 9
   - **Chilloutzone**
+ - **chirbit**
+ - **chirbit:profile**
   - **Cinchcast**
   - **Cinemassacre**
   - **clipfish**
@@ -118,6 +124,7 @@ # Supported sites
   - **EllenTV**
   - **EllenTV:clips**
   - **ElPais**: El País
+ - **Embedly**
   - **EMPFlix**
   - **Engadget**
   - **Eporner**
@@ -134,7 +141,6 @@ # Supported sites
   - **fernsehkritik.tv:postecke**
   - **Firedrive**
   - **Firstpost**
- - **firsttv**: Видеоархив - Первый канал
   - **Flickr**
   - **Folketinget**: Folketinget (ft.dk; Danish parliament)
   - **Foxgay**
@@ -174,6 +180,7 @@ # Supported sites
   - **Helsinki**: helsinki.fi
   - **HentaiStigma**
   - **HistoricFilms**
+ - **History**
   - **hitbox**
   - **hitbox:live**
   - **HornBunny**
@@ -187,6 +194,7 @@ # Supported sites
   - **ign.com**
   - **imdb**: Internet Movie Database trailers
   - **imdb:list**: Internet Movie Database lists
+ - **Imgur**
   - **Ina**
   - **InfoQ**
   - **Instagram**
@@ -259,6 +267,7 @@ # Supported sites
   - **myvideo**
   - **MyVidster**
   - **n-tv.de**
+ - **NationalGeographic**
   - **Naver**
   - **NBA**
   - **NBC**
@@ -287,6 +296,8 @@ # Supported sites
   - **nowvideo**: NowVideo
   - **npo.nl**
   - **npo.nl:live**
+ - **npo.nl:radio**
+ - **npo.nl:radio:fragment**
   - **NRK**
   - **NRKTV**
   - **ntv.ru**
@@ -314,12 +325,14 @@ # Supported sites
   - **podomatic**
   - **PornHd**
   - **PornHub**
+ - **PornHubPlaylist**
   - **Pornotube**
   - **PornoXO**
   - **PromptFile**
   - **prosiebensat1**: ProSiebenSat.1 Digital
   - **Pyvideo**
   - **QuickVid**
+ - **R7**
   - **radio.de**
   - **radiobremen**
   - **radiofrance**
@@ -333,9 +346,9 @@ # Supported sites
   - **Roxwel**
   - **RTBF**
   - **Rte**
+ - **rtl.nl**: rtl.nl and rtlxl.nl
   - **RTL2**
   - **RTLnow**
- - **rtlxl.nl**
   - **RTP**
   - **RTS**: RTS.ch
   - **rtve.es:alacarta**: RTVE a la carta
@@ -347,6 +360,7 @@ # Supported sites
   - **rutube:movie**: Rutube movies
   - **rutube:person**: Rutube person videos
   - **RUTV**: RUTV.RU
+ - **Sandia**: Sandia National Laboratories
   - **Sapo**: SAPO Vídeos
   - **savefrom.net**
   - **SBS**: sbs.com.au
@@ -374,7 +388,8 @@ # Supported sites
   - **soundcloud:playlist**
   - **soundcloud:set**
   - **soundcloud:user**
- - **Soundgasm**
+ - **soundgasm**
+ - **soundgasm:profile**
   - **southpark.cc.com**
   - **southpark.de**
   - **Space**
@@ -440,6 +455,7 @@ # Supported sites
   - **Turbo**
   - **Tutv**
   - **tv.dfb.de**
+ - **TV4**: tv4.se and tv4play.se
   - **tvigle**: Интернет-телевидение Tvigle.ru
   - **tvp.pl**
   - **tvp.pl:Series**
@@ -527,6 +543,7 @@ # Supported sites
   - **XVideos**
   - **XXXYMovies**
   - **Yahoo**: Yahoo screen and movies
+ - **Yam**
   - **YesJapan**
   - **Ynet**
   - **YouJizz**
@@ -546,6 +563,7 @@ # Supported sites
   - **youtube:subscriptions**: YouTube.com subscriptions feed, "ytsubs" keyword (requires authentication)
   - **youtube:user**: YouTube.com user videos (URL or "ytuser" keyword)
   - **youtube:watch_later**: Youtube watch later list, ":ytwatchlater" for short (requires authentication)
+ - **Zapiks**
   - **ZDF**
   - **ZDFChannel**
   - **zingmp3:album**: mp3.zing.vn albums
diff --git a/test/helper.py b/test/helper.py

index 651ef99b983973dab1f17ab8619849af2b9fe9b1..12afdf184f0215e9947515cd3a8516ccad2e480e 100644 (file)
--- a/test/helper.py
+++ b/test/helper.py
@@ -113,6 +113,16 @@ def expect_info_dict(self, got_dict, expected_dict):
              self.assertTrue(
                  got.startswith(start_str),
                  'field %s (value: %r) should start with %r' % (info_field, got, start_str))
+        elif isinstance(expected, compat_str) and expected.startswith('contains:'):
+            got = got_dict.get(info_field)
+            contains_str = expected[len('contains:'):]
+            self.assertTrue(
+                isinstance(got, compat_str),
+                'Expected a %s object, but got %s for field %s' % (
+                    compat_str.__name__, type(got).__name__, info_field))
+            self.assertTrue(
+                contains_str in got,
+                'field %s (value: %r) should contain %r' % (info_field, got, contains_str))
          elif isinstance(expected, type):
              got = got_dict.get(info_field)
              self.assertTrue(isinstance(got, expected),
@@ -163,12 +173,14 @@ def _repr(v):
              info_dict_str += ''.join(
                  '    %s: %s,\n' % (_repr(k), _repr(v))
                  for k, v in test_info_dict.items() if k not in missing_keys)
-            info_dict_str += '\n'
+
+            if info_dict_str:
+                info_dict_str += '\n'
          info_dict_str += ''.join(
              '    %s: %s,\n' % (_repr(k), _repr(test_info_dict[k]))
              for k in missing_keys)
          write_string(
-            '\n\'info_dict\': {\n' + info_dict_str + '}\n', out=sys.stderr)
+            '\n\'info_dict\': {\n' + info_dict_str + '},\n', out=sys.stderr)
          self.assertFalse(
              missing_keys,
              'Missing keys in test definition: %s' % (
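
Note on the `contains:` expectation added to test/helper.py above: an expected value of the form `contains:TEXT` passes when the extracted field is a string that includes TEXT. The sketch below restates that check outside of `unittest`. The function name `check_expected` is hypothetical, and plain `str` stands in for `compat_str` on the assumption of Python 3.

```python
def check_expected(got, expected):
    """Hypothetical stand-in for the 'contains:' branch of expect_info_dict."""
    prefix = 'contains:'
    if isinstance(expected, str) and expected.startswith(prefix):
        needle = expected[len(prefix):]
        # The field must be a string before a substring test makes sense.
        if not isinstance(got, str):
            raise AssertionError(
                'Expected a str object, but got %s' % type(got).__name__)
        if needle not in got:
            raise AssertionError(
                'value %r should contain %r' % (got, needle))
        return True
    # Any other expectation falls back to plain equality in this sketch.
    return got == expected
```

In a test definition this would correspond to an entry such as `'description': 'contains:Sandia'`.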
diff --git a/test/test_jsinterp.py b/test/test_jsinterp.py

index b91b8c4924339ab13e90980a42f5ca0a29ba7d32..fc73e5dc29a5c8faab88f4604f99df4ee9de6b2e 100644 (file)
--- a/test/test_jsinterp.py
+++ b/test/test_jsinterp.py
@@ -70,6 +70,8 @@ def test_assignments(self):
          self.assertEqual(jsi.call_function('f'), -11)
  
      def test_comments(self):
+        'Skipping: Not yet fully implemented'
+        return
          jsi = JSInterpreter('''
          function x() {
              var x = /* 1 + */ 2;
@@ -80,6 +82,15 @@ def test_comments(self):
          ''')
          self.assertEqual(jsi.call_function('x'), 52)
  
+        jsi = JSInterpreter('''
+        function f() {
+            var x = "/*";
+            var y = 1 /* comment */ + 2;
+            return y;
+        }
+        ''')
+        self.assertEqual(jsi.call_function('f'), 3)
+
      def test_precedence(self):
          jsi = JSInterpreter('''
          function x() {
diff --git a/test/test_swfinterp.py b/test/test_swfinterp.py

index 9f18055e629d3c21826ad8159bdf0ae55409bca2..f1e8998192b131613cb9d26a1167ce35e0a61e9f 100644 (file)
--- a/test/test_swfinterp.py
+++ b/test/test_swfinterp.py
@@ -34,8 +34,8 @@ def _make_testfunc(testfile):
      def test_func(self):
          as_file = os.path.join(TEST_DIR, testfile)
          swf_file = os.path.join(TEST_DIR, test_id + '.swf')
-        if ((not os.path.exists(swf_file))
-                or os.path.getmtime(swf_file) < os.path.getmtime(as_file)):
+        if ((not os.path.exists(swf_file)) or
+                os.path.getmtime(swf_file) < os.path.getmtime(as_file)):
              # Recompile
              try:
                  subprocess.check_call([
diff --git a/test/test_utils.py b/test/test_utils.py

index 1c29d0889b97991e20beed3fa9840bb4eb43a6da..c7373af1e2f64b8bc0bf0003d162b4e4b697eb87 100644 (file)
--- a/test/test_utils.py
+++ b/test/test_utils.py
@@ -370,6 +370,10 @@ def test_js_to_json_realworld(self):
              "playlist":[{"controls":{"all":null}}]
          }''')
  
+        inp = '"SAND Number: SAND 2013-7800P\\nPresenter: Tom Russo\\nHabanero Software Training - Xyce Software\\nXyce, Sandia\\u0027s"'
+        json_code = js_to_json(inp)
+        self.assertEqual(json.loads(json_code), json.loads(inp))
+
      def test_js_to_json_edgecases(self):
          on = js_to_json("{abc_def:'1\\'\\\\2\\\\\\'3\"4'}")
          self.assertEqual(json.loads(on), {"abc_def": "1'\\2\\'3\"4"})
diff --git a/test/test_youtube_signature.py b/test/test_youtube_signature.py

index 09696e19a29ded636f295dba334506dae403d3b4..060864434fe2ab81839dcde17475e6e9f61db0f2 100644 (file)
--- a/test/test_youtube_signature.py
+++ b/test/test_youtube_signature.py
@@ -64,6 +64,12 @@
          'js',
          '4646B5181C6C3020DF1D9C7FCFEA.AD80ABF70C39BD369CCCAE780AFBB98FA6B6CB42766249D9488C288',
          '82C8849D94266724DC6B6AF89BBFA087EACCD963.B93C07FBA084ACAEFCF7C9D1FD0203C6C1815B6B'
+    ),
+    (
+        'https://s.ytimg.com/yts/jsbin/html5player-en_US-vflKjOTVq/html5player.js',
+        'js',
+        '312AA52209E3623129A412D56A40F11CB0AF14AE.3EE09501CB14E3BCDC3B2AE808BF3F1D14E7FBF12',
+        '112AA5220913623229A412D56A40F11CB0AF14AE.3EE0950FCB14EEBCDC3B2AE808BF331D14E7FBF3',
      )
  ]
  
diff --git a/youtube_dl/YoutubeDL.py b/youtube_dl/YoutubeDL.py

index 70b364c9bb3e90afa063208925e7f2a2501e603a..76fc394bcff44f30ae6fa383ea54621a654a0864 100755 (executable)
--- a/youtube_dl/YoutubeDL.py
+++ b/youtube_dl/YoutubeDL.py
@@ -199,18 +199,25 @@ class YoutubeDL(object):
                         postprocessor.
      progress_hooks:    A list of functions that get called on download
                         progress, with a dictionary with the entries
-                       * status: One of "downloading" and "finished".
+                       * status: One of "downloading", "error", or "finished".
                                   Check this first and ignore unknown values.
  
-                       If status is one of "downloading" or "finished", the
+                       If status is one of "downloading", or "finished", the
                         following properties may also be present:
                         * filename: The final filename (always present)
+                       * tmpfilename: The filename we're currently writing to
                         * downloaded_bytes: Bytes on disk
                         * total_bytes: Size of the whole file, None if unknown
-                       * tmpfilename: The filename we're currently writing to
+                       * total_bytes_estimate: Guess of the eventual file size,
+                                               None if unavailable.
+                       * elapsed: The number of seconds since download started.
                         * eta: The estimated time in seconds, None if unknown
                         * speed: The download speed in bytes/second, None if
                                  unknown
+                       * fragment_index: The counter of the currently
+                                         downloaded video fragment.
+                       * fragment_count: The number of fragments (= individual
+                                         files that will be merged)
  
                         Progress hooks are guaranteed to be called at least once
                         (with status "finished") if the download is successful.
@@ -225,7 +232,6 @@ class YoutubeDL(object):
      call_home:         Boolean, true iff we are allowed to contact the
                         youtube-dl servers for debugging.
      sleep_interval:    Number of seconds to sleep before each download.
-    external_downloader:  Executable of the external downloader to call.
      listformats:       Print an overview of available video formats and exit.
      list_thumbnails:   Print a table of all thumbnails and exit.
      match_filter:      A function that gets called with the info_dict of
@@ -235,6 +241,10 @@ class YoutubeDL(object):
                         match_filter_func in utils.py is one example for this.
      no_color:          Do not emit color codes in output.
  
+    The following options determine which downloader is picked:
+    external_downloader: Executable of the external downloader to call.
+                       None or unset for standard (built-in) downloader.
+    hls_prefer_native: Use the native HLS downloader instead of ffmpeg/avconv.
  
      The following parameters are not used by YoutubeDL itself, they are used by
      the FileDownloader:
@@ -298,8 +308,8 @@ def __init__(self, params=None, auto_init=True):
                      raise
  
          if (sys.version_info >= (3,) and sys.platform != 'win32' and
-                sys.getfilesystemencoding() in ['ascii', 'ANSI_X3.4-1968']
-                and not params.get('restrictfilenames', False)):
+                sys.getfilesystemencoding() in ['ascii', 'ANSI_X3.4-1968'] and
+                not params.get('restrictfilenames', False)):
              # On Python 3, the Unicode filesystem API will throw errors (#1474)
              self.report_warning(
                  'Assuming --restrict-filenames since file system encoding '
@@ -951,30 +961,9 @@ def _calc_headers(self, info_dict):
          return res
  
      def _calc_cookies(self, info_dict):
-        class _PseudoRequest(object):
-            def __init__(self, url):
-                self.url = url
-                self.headers = {}
-                self.unverifiable = False
-
-            def add_unredirected_header(self, k, v):
-                self.headers[k] = v
-
-            def get_full_url(self):
-                return self.url
-
-            def is_unverifiable(self):
-                return self.unverifiable
-
-            def has_header(self, h):
-                return h in self.headers
-
-            def get_header(self, h, default=None):
-                return self.headers.get(h, default)
-
-        pr = _PseudoRequest(info_dict['url'])
+        pr = compat_urllib_request.Request(info_dict['url'])
          self.cookiejar.add_cookie_header(pr)
-        return pr.headers.get('Cookie')
+        return pr.get_header('Cookie')
  
      def process_video_result(self, info_dict, download=True):
          assert info_dict.get('_type', 'video') == 'video'
@@ -1363,7 +1352,7 @@ def dl(name, info):
                      downloaded = []
                      success = True
                      merger = FFmpegMergerPP(self, not self.params.get('keepvideo'))
-                    if not merger._executable:
+                    if not merger.available:
                          postprocessors = []
                          self.report_warning('You have requested multiple '
                                              'formats but ffmpeg or avconv are not installed.'
@@ -1442,8 +1431,8 @@ def download(self, url_list):
          """Download a given list of URLs."""
          outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL)
          if (len(url_list) > 1 and
-                '%' not in outtmpl
-                and self.params.get('max_downloads') != 1):
+                '%' not in outtmpl and
+                self.params.get('max_downloads') != 1):
              raise SameFileError(outtmpl)
  
          for url in url_list:
@@ -1610,29 +1599,18 @@ def _format_note(self, fdict):
          return res
  
      def list_formats(self, info_dict):
-        def line(format, idlen=20):
-            return (('%-' + compat_str(idlen + 1) + 's%-10s%-12s%s') % (
-                format['format_id'],
-                format['ext'],
-                self.format_resolution(format),
-                self._format_note(format),
-            ))
-
          formats = info_dict.get('formats', [info_dict])
-        idlen = max(len('format code'),
-                    max(len(f['format_id']) for f in formats))
-        formats_s = [
-            line(f, idlen) for f in formats
+        table = [
+            [f['format_id'], f['ext'], self.format_resolution(f), self._format_note(f)]
+            for f in formats
              if f.get('preference') is None or f['preference'] >= -1000]
          if len(formats) > 1:
-            formats_s[-1] += (' ' if self._format_note(formats[-1]) else '') + '(best)'
+            table[-1][-1] += (' ' if table[-1][-1] else '') + '(best)'
  
-        header_line = line({
-            'format_id': 'format code', 'ext': 'extension',
-            'resolution': 'resolution', 'format_note': 'note'}, idlen=idlen)
+        header_line = ['format code', 'extension', 'resolution', 'note']
          self.to_screen(
-            '[info] Available formats for %s:\n%s\n%s' %
-            (info_dict['id'], header_line, '\n'.join(formats_s)))
+            '[info] Available formats for %s:\n%s' %
+            (info_dict['id'], render_table(header_line, table)))
  
      def list_thumbnails(self, info_dict):
          thumbnails = info_dict.get('thumbnails')
@@ -1723,7 +1701,7 @@ def print_debug_header(self):
          self._write_string('[debug] Python version %s - %s\n' % (
              platform.python_version(), platform_name()))
  
-        exe_versions = FFmpegPostProcessor.get_versions()
+        exe_versions = FFmpegPostProcessor.get_versions(self)
          exe_versions['rtmpdump'] = rtmpdump_version()
          exe_str = ', '.join(
              '%s %s' % (exe, v)
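
Note on the `progress_hooks` docstring changed above: hooks now may see an "error" status and the new `total_bytes_estimate`, `elapsed`, `fragment_index` and `fragment_count` keys. The sketch below shows one way a hook could consume that dictionary. The function name `describe_progress` is hypothetical, and it returns a string only so that it is easy to test; a real hook would print or log instead.

```python
def describe_progress(status):
    """Hypothetical progress hook body: summarise a status dict as text."""
    state = status.get('status')
    if state == 'finished':
        return 'done: %s' % status.get('filename')
    if state == 'error':
        return 'failed'
    if state != 'downloading':
        # The docstring advises ignoring unknown status values.
        return None
    # Prefer the exact size and fall back to the estimate if it is missing.
    total = status.get('total_bytes') or status.get('total_bytes_estimate')
    done = status.get('downloaded_bytes')
    if total and done is not None:
        return '%.1f%% of %s' % (100.0 * done / total, status.get('filename'))
    return 'downloading %s' % status.get('filename')
```

Going by the docstring, a hook wrapping this function would be passed in the `progress_hooks` list of the `YoutubeDL` parameters.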
diff --git a/youtube_dl/__init__.py b/youtube_dl/__init__.py

index 5f25850033a2d42e29297e1f4e59e5c8d2093e95..5ce20180098faf91adf598fac07e1c4553f3c746 100644 (file)
--- a/youtube_dl/__init__.py
+++ b/youtube_dl/__init__.py
@@ -189,14 +189,14 @@ def _real_main(argv=None):
          # In Python 2, sys.argv is a bytestring (also note http://bugs.python.org/issue2128 for Windows systems)
          if opts.outtmpl is not None:
              opts.outtmpl = opts.outtmpl.decode(preferredencoding())
-    outtmpl = ((opts.outtmpl is not None and opts.outtmpl)
-               or (opts.format == '-1' and opts.usetitle and '%(title)s-%(id)s-%(format)s.%(ext)s')
-               or (opts.format == '-1' and '%(id)s-%(format)s.%(ext)s')
-               or (opts.usetitle and opts.autonumber and '%(autonumber)s-%(title)s-%(id)s.%(ext)s')
-               or (opts.usetitle and '%(title)s-%(id)s.%(ext)s')
-               or (opts.useid and '%(id)s.%(ext)s')
-               or (opts.autonumber and '%(autonumber)s-%(id)s.%(ext)s')
-               or DEFAULT_OUTTMPL)
+    outtmpl = ((opts.outtmpl is not None and opts.outtmpl) or
+               (opts.format == '-1' and opts.usetitle and '%(title)s-%(id)s-%(format)s.%(ext)s') or
+               (opts.format == '-1' and '%(id)s-%(format)s.%(ext)s') or
+               (opts.usetitle and opts.autonumber and '%(autonumber)s-%(title)s-%(id)s.%(ext)s') or
+               (opts.usetitle and '%(title)s-%(id)s.%(ext)s') or
+               (opts.useid and '%(id)s.%(ext)s') or
+               (opts.autonumber and '%(autonumber)s-%(id)s.%(ext)s') or
+               DEFAULT_OUTTMPL)
      if not os.path.splitext(outtmpl)[1] and opts.extractaudio:
          parser.error('Cannot download a video and extract audio into the same'
                       ' file! Use "{0}.%(ext)s" instead of "{0}" as the output'
@@ -349,6 +349,8 @@ def _real_main(argv=None):
          'xattr_set_filesize': opts.xattr_set_filesize,
          'match_filter': match_filter,
          'no_color': opts.no_color,
+        'ffmpeg_location': opts.ffmpeg_location,
+        'hls_prefer_native': opts.hls_prefer_native,
      }
  
      with YoutubeDL(ydl_opts) as ydl:
diff --git a/youtube_dl/downloader/__init__.py b/youtube_dl/downloader/__init__.py

index eff1122c5c09eff494ad34af835b06e33c9e4751..9fb66e2f7f680a71c05fdd866c72b0db2dd91a77 100644 (file)
--- a/youtube_dl/downloader/__init__.py
+++ b/youtube_dl/downloader/__init__.py
@@ -34,6 +34,9 @@ def get_suitable_downloader(info_dict, params={}):
          if ed.supports(info_dict):
              return ed
  
+    if protocol == 'm3u8' and params.get('hls_prefer_native'):
+        return NativeHlsFD
+
      return PROTOCOL_MAP.get(protocol, HttpFD)
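
Note on the `hls_prefer_native` check above: it is consulted before the protocol map, so an `m3u8` download goes to the native downloader only when the option is set. The sketch below isolates that ordering. The three classes are empty placeholders, the contents of `PROTOCOL_MAP` are an assumption (the real map is not shown in this diff), and `pick_downloader` is a hypothetical name.

```python
class HttpFD(object):
    """Placeholder for the default HTTP downloader."""


class HlsFD(object):
    """Placeholder for the ffmpeg/avconv based HLS downloader."""


class NativeHlsFD(object):
    """Placeholder for the native HLS downloader."""


# Assumed mapping for illustration only.
PROTOCOL_MAP = {'m3u8': HlsFD}


def pick_downloader(protocol, params=None):
    """Return the downloader class for a protocol, honouring the new option."""
    params = params or {}
    if protocol == 'm3u8' and params.get('hls_prefer_native'):
        return NativeHlsFD
    return PROTOCOL_MAP.get(protocol, HttpFD)
```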
  
  
diff --git a/youtube_dl/downloader/common.py b/youtube_dl/downloader/common.py

index 7bb3a948d2ebd0eaca46ca72f72dfaa2e7ffd1bd..3ae90021a28e661ab532a2d42a7c4e0826d1f46f 100644 (file)
--- a/youtube_dl/downloader/common.py
+++ b/youtube_dl/downloader/common.py
@@ -1,4 +1,4 @@
-from __future__ import unicode_literals
+from __future__ import division, unicode_literals
  
  import os
  import re
@@ -54,6 +54,7 @@ def __init__(self, ydl, params):
          self.ydl = ydl
          self._progress_hooks = []
          self.params = params
+        self.add_progress_hook(self.report_progress)
  
      @staticmethod
      def format_seconds(seconds):
@@ -226,42 +227,64 @@ def _report_progress_status(self, msg, is_last_line=False):
              self.to_screen(clear_line + fullmsg, skip_eol=not is_last_line)
          self.to_console_title('youtube-dl ' + msg)
  
-    def report_progress(self, percent, data_len_str, speed, eta):
-        """Report download progress."""
-        if self.params.get('noprogress', False):
+    def report_progress(self, s):
+        if s['status'] == 'finished':
+            if self.params.get('noprogress', False):
+                self.to_screen('[download] Download completed')
+            else:
+                s['_total_bytes_str'] = format_bytes(s['total_bytes'])
+                if s.get('elapsed') is not None:
+                    s['_elapsed_str'] = self.format_seconds(s['elapsed'])
+                    msg_template = '100%% of %(_total_bytes_str)s in %(_elapsed_str)s'
+                else:
+                    msg_template = '100%% of %(_total_bytes_str)s'
+                self._report_progress_status(
+                    msg_template % s, is_last_line=True)
+
+        if self.params.get('noprogress'):
              return
-        if eta is not None:
-            eta_str = self.format_eta(eta)
-        else:
-            eta_str = 'Unknown ETA'
-        if percent is not None:
-            percent_str = self.format_percent(percent)
+
+        if s['status'] != 'downloading':
+            return
+
+        if s.get('eta') is not None:
+            s['_eta_str'] = self.format_eta(s['eta'])
          else:
-            percent_str = 'Unknown %'
-        speed_str = self.format_speed(speed)
+            s['_eta_str'] = 'Unknown ETA'
  
-        msg = ('%s of %s at %s ETA %s' %
-               (percent_str, data_len_str, speed_str, eta_str))
-        self._report_progress_status(msg)
+        if s.get('total_bytes') and s.get('downloaded_bytes') is not None:
+            s['_percent_str'] = self.format_percent(100 * s['downloaded_bytes'] / s['total_bytes'])
+        elif s.get('total_bytes_estimate') and s.get('downloaded_bytes') is not None:
+            s['_percent_str'] = self.format_percent(100 * s['downloaded_bytes'] / s['total_bytes_estimate'])
+        else:
+            if s.get('downloaded_bytes') == 0:
+                s['_percent_str'] = self.format_percent(0)
+            else:
+                s['_percent_str'] = 'Unknown %'
  
-    def report_progress_live_stream(self, downloaded_data_len, speed, elapsed):
-        if self.params.get('noprogress', False):
-            return
-        downloaded_str = format_bytes(downloaded_data_len)
-        speed_str = self.format_speed(speed)
-        elapsed_str = FileDownloader.format_seconds(elapsed)
-        msg = '%s at %s (%s)' % (downloaded_str, speed_str, elapsed_str)
-        self._report_progress_status(msg)
-
-    def report_finish(self, data_len_str, tot_time):
-        """Report download finished."""
-        if self.params.get('noprogress', False):
-            self.to_screen('[download] Download completed')
+        if s.get('speed') is not None:
+            s['_speed_str'] = self.format_speed(s['speed'])
+        else:
+            s['_speed_str'] = 'Unknown speed'
+
+        if s.get('total_bytes') is not None:
+            s['_total_bytes_str'] = format_bytes(s['total_bytes'])
+            msg_template = '%(_percent_str)s of %(_total_bytes_str)s at %(_speed_str)s ETA %(_eta_str)s'
+        elif s.get('total_bytes_estimate') is not None:
+            s['_total_bytes_estimate_str'] = format_bytes(s['total_bytes_estimate'])
+            msg_template = '%(_percent_str)s of ~%(_total_bytes_estimate_str)s at %(_speed_str)s ETA %(_eta_str)s'
          else:
-            self._report_progress_status(
-                ('100%% of %s in %s' %
-                 (data_len_str, self.format_seconds(tot_time))),
-                is_last_line=True)
+            if s.get('downloaded_bytes') is not None:
+                s['_downloaded_bytes_str'] = format_bytes(s['downloaded_bytes'])
+                if s.get('elapsed'):
+                    s['_elapsed_str'] = self.format_seconds(s['elapsed'])
+                    msg_template = '%(_downloaded_bytes_str)s at %(_speed_str)s (%(_elapsed_str)s)'
+                else:
+                    msg_template = '%(_downloaded_bytes_str)s at %(_speed_str)s'
+            else:
+                msg_template = '%(_percent_str)s %% at %(_speed_str)s ETA %(_eta_str)s'
+
+        self._report_progress_status(msg_template % s)
  
      def report_resuming_byte(self, resume_len):
          """Report attempt to resume at given byte."""
@@ -288,14 +311,14 @@ def download(self, filename, info_dict):
          """
  
          nooverwrites_and_exists = (
-            self.params.get('nooverwrites', False)
-            and os.path.exists(encodeFilename(filename))
+            self.params.get('nooverwrites', False) and
+            os.path.exists(encodeFilename(filename))
          )
  
          continuedl_and_exists = (
-            self.params.get('continuedl', False)
-            and os.path.isfile(encodeFilename(filename))
-            and not self.params.get('nopart', False)
+            self.params.get('continuedl', False) and
+            os.path.isfile(encodeFilename(filename)) and
+            not self.params.get('nopart', False)
          )
  
          # Check file already present
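Review note: the branch order in the progress-message hunk above can be exercised in isolation. What follows is a minimal sketch, not the code from this commit. It assumes a simplified byte formatter (`format_bytes_simple` is a hypothetical stand-in, not youtube-dl's `format_bytes`), a hypothetical `render_progress` wrapper, and assumed fallback strings for fields the caller did not pre-format.

```python
def format_bytes_simple(num_bytes):
    """Hypothetical stand-in for a byte formatter: binary units, two decimals."""
    units = ['B', 'KiB', 'MiB', 'GiB', 'TiB']
    value = float(num_bytes)
    for unit in units[:-1]:
        if value < 1024:
            return '%.2f%s' % (value, unit)
        value /= 1024
    return '%.2f%s' % (value, units[-1])


def render_progress(status):
    """Pick a message template based on which fields the status dict carries."""
    s = dict(status)  # copy so the caller's dict is not mutated
    # Assumed fallbacks for fields the caller did not pre-format.
    s.setdefault('_percent_str', 'Unknown %')
    s.setdefault('_speed_str', 'Unknown speed')
    s.setdefault('_eta_str', 'Unknown ETA')

    if s.get('total_bytes') is not None:
        # Exact size known: "<percent> of <size> at <speed> ETA <eta>"
        s['_total_bytes_str'] = format_bytes_simple(s['total_bytes'])
        template = '%(_percent_str)s of %(_total_bytes_str)s at %(_speed_str)s ETA %(_eta_str)s'
    elif s.get('total_bytes_estimate') is not None:
        # Only an estimate: same layout, size prefixed with "~"
        s['_total_bytes_estimate_str'] = format_bytes_simple(s['total_bytes_estimate'])
        template = '%(_percent_str)s of ~%(_total_bytes_estimate_str)s at %(_speed_str)s ETA %(_eta_str)s'
    elif s.get('downloaded_bytes') is not None:
        # Size unknown (e.g. a live stream): report bytes so far
        s['_downloaded_bytes_str'] = format_bytes_simple(s['downloaded_bytes'])
        if s.get('elapsed'):
            s['_elapsed_str'] = '%ds' % s['elapsed']
            template = '%(_downloaded_bytes_str)s at %(_speed_str)s (%(_elapsed_str)s)'
        else:
            template = '%(_downloaded_bytes_str)s at %(_speed_str)s'
    else:
        # Nothing but percent, speed and ETA placeholders available
        template = '%(_percent_str)s at %(_speed_str)s ETA %(_eta_str)s'

    return template % s
```

Calling `render_progress({'downloaded_bytes': 512, 'elapsed': 3})` takes the third branch, which is the case the removed `report_progress_live_stream` helper used to cover.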
diff --git a/youtube_dl/downloader/external.py b/youtube_dl/downloader/external.py
index ff031d2e04253b775e517347866fe3ee75a666c6..51c41c70462674ee3a07aae6f645c06ae7c88c71 100644
--- a/youtube_dl/downloader/external.py
+++ b/youtube_dl/downloader/external.py
@@ -75,7 +75,7 @@ def _call_downloader(self, tmpfilename, info_dict):
  
  class CurlFD(ExternalFD):
      def _make_cmd(self, tmpfilename, info_dict):
-        cmd = [self.exe, '-o', tmpfilename]
+        cmd = [self.exe, '--location', '-o', tmpfilename]
          for key, val in info_dict['http_headers'].items():
              cmd += ['--header', '%s: %s' % (key, val)]
          cmd += self._source_address('--interface')
diff --git a/youtube_dl/downloader/f4m.py b/youtube_dl/downloader/f4m.py
index 0e7a1c20075499e58b977da4154ce287b144f958..7b8fe8cf57cfb57672f153512d60c93d4fe18b62 100644
--- a/youtube_dl/downloader/f4m.py
+++ b/youtube_dl/downloader/f4m.py
@@ -1,4 +1,4 @@
-from __future__ import unicode_literals
+from __future__ import division, unicode_literals
  
  import base64
  import io
@@ -15,7 +15,6 @@
  from ..utils import (
      struct_pack,
      struct_unpack,
-    format_bytes,
      encodeFilename,
      sanitize_open,
      xpath_text,
@@ -252,17 +251,6 @@ def real_download(self, filename, info_dict):
          requested_bitrate = info_dict.get('tbr')
          self.to_screen('[download] Downloading f4m manifest')
          manifest = self.ydl.urlopen(man_url).read()
-        self.report_destination(filename)
-        http_dl = HttpQuietDownloader(
-            self.ydl,
-            {
-                'continuedl': True,
-                'quiet': True,
-                'noprogress': True,
-                'ratelimit': self.params.get('ratelimit', None),
-                'test': self.params.get('test', False),
-            }
-        )
  
          doc = etree.fromstring(manifest)
          formats = [(int(f.attrib.get('bitrate', -1)), f)
@@ -298,39 +286,65 @@ def real_download(self, filename, info_dict):
          # For some akamai manifests we'll need to add a query to the fragment url
          akamai_pv = xpath_text(doc, _add_ns('pv-2.0'))
  
+        self.report_destination(filename)
+        http_dl = HttpQuietDownloader(
+            self.ydl,
+            {
+                'continuedl': True,
+                'quiet': True,
+                'noprogress': True,
+                'ratelimit': self.params.get('ratelimit', None),
+                'test': self.params.get('test', False),
+            }
+        )
          tmpfilename = self.temp_name(filename)
          (dest_stream, tmpfilename) = sanitize_open(tmpfilename, 'wb')
+
          write_flv_header(dest_stream)
          write_metadata_tag(dest_stream, metadata)
  
          # This dict stores the download progress, it's updated by the progress
          # hook
          state = {
+            'status': 'downloading',
              'downloaded_bytes': 0,
-            'frag_counter': 0,
+            'frag_index': 0,
+            'frag_count': total_frags,
+            'filename': filename,
+            'tmpfilename': tmpfilename,
          }
          start = time.time()
  
-        def frag_progress_hook(status):
-            frag_total_bytes = status.get('total_bytes', 0)
-            estimated_size = (state['downloaded_bytes'] +
-                              (total_frags - state['frag_counter']) * frag_total_bytes)
-            if status['status'] == 'finished':
+        def frag_progress_hook(s):
+            if s['status'] not in ('downloading', 'finished'):
+                return
+
+            frag_total_bytes = s.get('total_bytes', 0)
+            if s['status'] == 'finished':
                  state['downloaded_bytes'] += frag_total_bytes
-                state['frag_counter'] += 1
-                progress = self.calc_percent(state['frag_counter'], total_frags)
-                byte_counter = state['downloaded_bytes']
+                state['frag_index'] += 1
+
+            estimated_size = (
+                (state['downloaded_bytes'] + frag_total_bytes) /
+                (state['frag_index'] + 1) * total_frags)
+            time_now = time.time()
+            state['total_bytes_estimate'] = estimated_size
+            state['elapsed'] = time_now - start
+
+            if s['status'] == 'finished':
+                progress = self.calc_percent(state['frag_index'], total_frags)
              else:
-                frag_downloaded_bytes = status['downloaded_bytes']
-                byte_counter = state['downloaded_bytes'] + frag_downloaded_bytes
+                frag_downloaded_bytes = s['downloaded_bytes']
                  frag_progress = self.calc_percent(frag_downloaded_bytes,
                                                    frag_total_bytes)
-                progress = self.calc_percent(state['frag_counter'], total_frags)
+                progress = self.calc_percent(state['frag_index'], total_frags)
                  progress += frag_progress / float(total_frags)
  
-            eta = self.calc_eta(start, time.time(), estimated_size, byte_counter)
-            self.report_progress(progress, format_bytes(estimated_size),
-                                 status.get('speed'), eta)
+                state['eta'] = self.calc_eta(
+                    start, time_now, estimated_size, state['downloaded_bytes'] + frag_downloaded_bytes)
+                state['speed'] = s.get('speed')
+            self._hook_progress(state)
+
          http_dl.add_progress_hook(frag_progress_hook)
  
          frags_filenames = []
@@ -354,8 +368,8 @@ def frag_progress_hook(status):
              frags_filenames.append(frag_filename)
  
          dest_stream.close()
-        self.report_finish(format_bytes(state['downloaded_bytes']), time.time() - start)
  
+        elapsed = time.time() - start
          self.try_rename(tmpfilename, filename)
          for frag_file in frags_filenames:
              os.remove(frag_file)
@@ -366,6 +380,7 @@ def frag_progress_hook(status):
              'total_bytes': fsize,
              'filename': filename,
              'status': 'finished',
+            'elapsed': elapsed,
          })
  
          return True
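Review note: the arithmetic in the new `frag_progress_hook` can be checked outside the downloader. The sketch below restates the size estimate and the combined percentage from the hunk as standalone functions. The function names and the `calc_percent` helper here are hypothetical and only approximate what the downloader class provides; explicit `float()` conversion is used so the result does not depend on `from __future__ import division`.

```python
def calc_percent(done, total):
    """Hypothetical helper: percentage of done over total, or None if total is unknown."""
    if not total:
        return None
    return float(done) / float(total) * 100.0


def estimate_total_size(downloaded_bytes, frag_total_bytes, frag_index, total_frags):
    """Average bytes per fragment seen so far, scaled to the full fragment count."""
    return float(downloaded_bytes + frag_total_bytes) / (frag_index + 1) * total_frags


def overall_percent(frag_index, total_frags, frag_downloaded_bytes, frag_total_bytes):
    """Whole fragments completed plus the share contributed by the current fragment."""
    whole = calc_percent(frag_index, total_frags)
    if whole is None:
        return None
    partial = calc_percent(frag_downloaded_bytes, frag_total_bytes) or 0.0
    return whole + partial / float(total_frags)
```

For example, with three fragments of 100 bytes finished out of ten and a fourth of the same size in flight, `estimate_total_size(300, 100, 3, 10)` gives 1000.0. The estimate is only as good as the assumption that fragments are of similar size.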
diff --git a/youtube_dl/downloader/hls.py b/youtube_dl/downloader/hls.py
index e527ee425365a096b50f541b1c75c82dcb9013fb..8be4f424907e55adfac91af5eb587b62b54b8487 100644
--- a/youtube_dl/downloader/hls.py
+++ b/youtube_dl/downloader/hls.py
@@ -23,15 +23,14 @@ def real_download(self, filename, info_dict):
          tmpfilename = self.temp_name(filename)
  
          ffpp = FFmpegPostProcessor(downloader=self)
-        program = ffpp._executable
-        if program is None:
+        if not ffpp.available:
              self.report_error('m3u8 download detected but ffmpeg or avconv could not be found. Please install one.')
              return False
          ffpp.check_version()
  
          args = [
              encodeArgument(opt)
-            for opt in (program, '-y', '-i', url, '-f', 'mp4', '-c', 'copy', '-bsf:a', 'aac_adtstoasc')]
+            for opt in (ffpp.executable, '-y', '-i', url, '-f', 'mp4', '-c', 'copy', '-bsf:a', 'aac_adtstoasc')]
          args.append(encodeFilename(tmpfilename, True))
  
          retval = subprocess.call(args)
@@ -48,7 +47,7 @@ def real_download(self, filename, info_dict):
              return True
          else:
              self.to_stderr('\n')
-            self.report_error('%s exited with code %d' % (program, retval))
+            self.report_error('%s exited with code %d' % (ffpp.basename, retval))
              return False
  
  
diff --git a/youtube_dl/downloader/http.py b/youtube_dl/downloader/http.py
index 49170cf9d47634602efe7832b235e4a751e25817..2e3dac8251dbaf5d8b3e1a90bc459f362d14f72e 100644
--- a/youtube_dl/downloader/http.py
+++ b/youtube_dl/downloader/http.py
@@ -1,11 +1,10 @@
  from __future__ import unicode_literals
  
+import errno
  import os
+import socket
  import time
  
-from socket import error as SocketError
-import errno
-
  from .common import FileDownloader
  from ..compat import (
      compat_urllib_request,
@@ -15,7 +14,6 @@
      ContentTooShortError,
      encodeFilename,
      sanitize_open,
-    format_bytes,
  )
  
  
@@ -102,7 +100,7 @@ def real_download(self, filename, info_dict):
                              resume_len = 0
                              open_mode = 'wb'
                              break
-            except SocketError as e:
+            except socket.error as e:
                  if e.errno != errno.ECONNRESET:
                      # Connection reset is no problem, just retry
                      raise
@@ -137,7 +135,6 @@ def real_download(self, filename, info_dict):
                  self.to_screen('\r[download] File is larger than max-filesize (%s bytes > %s bytes). Aborting.' % (data_len, max_data_len))
                  return False
  
-        data_len_str = format_bytes(data_len)
          byte_counter = 0 + resume_len
          block_size = self.params.get('buffersize', 1024)
          start = time.time()
@@ -196,20 +193,19 @@ def real_download(self, filename, info_dict):
              # Progress message
              speed = self.calc_speed(start, now, byte_counter - resume_len)
              if data_len is None:
-                eta = percent = None
+                eta = None
              else:
-                percent = self.calc_percent(byte_counter, data_len)
                  eta = self.calc_eta(start, time.time(), data_len - resume_len, byte_counter - resume_len)
-            self.report_progress(percent, data_len_str, speed, eta)
  
              self._hook_progress({
+                'status': 'downloading',
                  'downloaded_bytes': byte_counter,
                  'total_bytes': data_len,
                  'tmpfilename': tmpfilename,
                  'filename': filename,
-                'status': 'downloading',
                  'eta': eta,
                  'speed': speed,
+                'elapsed': now - start,
              })
  
              if is_test and byte_counter == data_len:
@@ -221,7 +217,13 @@ def real_download(self, filename, info_dict):
              return False
          if tmpfilename != '-':
              stream.close()
-        self.report_finish(data_len_str, (time.time() - start))
+
+        self._hook_progress({
+            'downloaded_bytes': byte_counter,
+            'total_bytes': data_len,
+            'tmpfilename': tmpfilename,
+            'status': 'error',
+        })
          if data_len is not None and byte_counter != data_len:
              raise ContentTooShortError(byte_counter, int(data_len))
          self.try_rename(tmpfilename, filename)
@@ -235,6 +237,7 @@ def real_download(self, filename, info_dict):
              'total_bytes': byte_counter,
              'filename': filename,
              'status': 'finished',
+            'elapsed': time.time() - start,
          })
  
          return True
diff --git a/youtube_dl/downloader/rtmp.py b/youtube_dl/downloader/rtmp.py
index f7eeb6f43f09670e8ecb6cba1791d49d09ecbf15..0a52c34c72dd5a24e31e69229b990efc11adcdb4 100644
--- a/youtube_dl/downloader/rtmp.py
+++ b/youtube_dl/downloader/rtmp.py
@@ -11,7 +11,6 @@
  from ..utils import (
      check_executable,
      encodeFilename,
-    format_bytes,
      get_exe_version,
  )
  
@@ -51,23 +50,23 @@ def run_rtmpdump(args):
                      if not resume_percent:
                          resume_percent = percent
                          resume_downloaded_data_len = downloaded_data_len
-                    eta = self.calc_eta(start, time.time(), 100 - resume_percent, percent - resume_percent)
-                    speed = self.calc_speed(start, time.time(), downloaded_data_len - resume_downloaded_data_len)
+                    time_now = time.time()
+                    eta = self.calc_eta(start, time_now, 100 - resume_percent, percent - resume_percent)
+                    speed = self.calc_speed(start, time_now, downloaded_data_len - resume_downloaded_data_len)
                      data_len = None
                      if percent > 0:
                          data_len = int(downloaded_data_len * 100 / percent)
-                    data_len_str = '~' + format_bytes(data_len)
-                    self.report_progress(percent, data_len_str, speed, eta)
-                    cursor_in_new_line = False
                      self._hook_progress({
+                        'status': 'downloading',
                          'downloaded_bytes': downloaded_data_len,
-                        'total_bytes': data_len,
+                        'total_bytes_estimate': data_len,
                          'tmpfilename': tmpfilename,
                          'filename': filename,
-                        'status': 'downloading',
                          'eta': eta,
+                        'elapsed': time_now - start,
                          'speed': speed,
                      })
+                    cursor_in_new_line = False
                  else:
                      # no percent for live streams
                      mobj = re.search(r'([0-9]+\.[0-9]{3}) kB / [0-9]+\.[0-9]{2} sec', line)
@@ -75,15 +74,15 @@ def run_rtmpdump(args):
                          downloaded_data_len = int(float(mobj.group(1)) * 1024)
                          time_now = time.time()
                          speed = self.calc_speed(start, time_now, downloaded_data_len)
-                        self.report_progress_live_stream(downloaded_data_len, speed, time_now - start)
-                        cursor_in_new_line = False
                          self._hook_progress({
                              'downloaded_bytes': downloaded_data_len,
                              'tmpfilename': tmpfilename,
                              'filename': filename,
                              'status': 'downloading',
+                            'elapsed': time_now - start,
                              'speed': speed,
                          })
+                        cursor_in_new_line = False
                      elif self.params.get('verbose', False):
                          if not cursor_in_new_line:
                              self.to_screen('')
diff --git a/youtube_dl/extractor/__init__.py b/youtube_dl/extractor/__init__.py
index 13292073c2499d9b74a9157f1f80fc108a1762e2..40fc92cf77c793ed3581ee5f179106dffb16d3b8 100644
--- a/youtube_dl/extractor/__init__.py
+++ b/youtube_dl/extractor/__init__.py
@@ -58,10 +58,15 @@
  from .canalc2 import Canalc2IE
  from .cbs import CBSIE
  from .cbsnews import CBSNewsIE
+from .cbssports import CBSSportsIE
  from .ccc import CCCIE
  from .ceskatelevize import CeskaTelevizeIE
  from .channel9 import Channel9IE
  from .chilloutzone import ChilloutzoneIE
+from .chirbit import (
+    ChirbitIE,
+    ChirbitProfileIE,
+)
  from .cinchcast import CinchcastIE
  from .clipfish import ClipfishIE
  from .cliphunter import CliphunterIE
@@ -121,6 +126,7 @@
      EllenTVClipsIE,
  )
  from .elpais import ElPaisIE
+from .embedly import EmbedlyIE
  from .empflix import EMPFlixIE
  from .engadget import EngadgetIE
  from .eporner import EpornerIE
@@ -204,6 +210,7 @@
      ImdbIE,
      ImdbListIE
  )
+from .imgur import ImgurIE
  from .ina import InaIE
  from .infoq import InfoQIE
  from .instagram import InstagramIE, InstagramUserIE
@@ -282,6 +289,7 @@
  from .myspass import MySpassIE
  from .myvideo import MyVideoIE
  from .myvidster import MyVidsterIE
+from .nationalgeographic import NationalGeographicIE
  from .naver import NaverIE
  from .nba import NBAIE
  from .nbc import (
@@ -350,13 +358,17 @@
  from .playvid import PlayvidIE
  from .podomatic import PodomaticIE
  from .pornhd import PornHdIE
-from .pornhub import PornHubIE
+from .pornhub import (
+    PornHubIE,
+    PornHubPlaylistIE,
+)
  from .pornotube import PornotubeIE
  from .pornoxo import PornoXOIE
  from .promptfile import PromptFileIE
  from .prosiebensat1 import ProSiebenSat1IE
  from .pyvideo import PyvideoIE
  from .quickvid import QuickVidIE
+from .r7 import R7IE
  from .radiode import RadioDeIE
  from .radiobremen import RadioBremenIE
  from .radiofrance import RadioFranceIE
@@ -371,7 +383,7 @@
  from .roxwel import RoxwelIE
  from .rtbf import RTBFIE
  from .rte import RteIE
-from .rtlnl import RtlXlIE
+from .rtlnl import RtlNlIE
  from .rtlnow import RTLnowIE
  from .rtl2 import RTL2IE
  from .rtp import RTPIE
@@ -386,6 +398,7 @@
      RutubePersonIE,
  )
  from .rutv import RUTVIE
+from .sandia import SandiaIE
  from .sapo import SapoIE
  from .savefrom import SaveFromIE
  from .sbs import SBSIE
@@ -416,7 +429,10 @@
      SoundcloudUserIE,
      SoundcloudPlaylistIE
  )
-from .soundgasm import SoundgasmIE
+from .soundgasm import (
+    SoundgasmIE,
+    SoundgasmProfileIE
+)
  from .southpark import (
      SouthParkIE,
      SouthparkDeIE,
@@ -482,6 +498,7 @@
  from .tunein import TuneInIE
  from .turbo import TurboIE
  from .tutv import TutvIE
+from .tv4 import TV4IE
  from .tvigle import TvigleIE
  from .tvp import TvpIE, TvpSeriesIE
  from .tvplay import TVPlayIE
@@ -579,6 +596,7 @@
      YahooIE,
      YahooSearchIE,
  )
+from .yam import YamIE
  from .yesjapan import YesJapanIE
  from .ynet import YnetIE
  from .youjizz import YouJizzIE
@@ -602,6 +620,7 @@
      YoutubeUserIE,
      YoutubeWatchLaterIE,
  )
+from .zapiks import ZapiksIE
  from .zdf import ZDFIE, ZDFChannelIE
  from .zingmp3 import (
      ZingMp3SongIE,
diff --git a/youtube_dl/extractor/adobetv.py b/youtube_dl/extractor/adobetv.py
index 28e07f8b04ed89fe7c79f445f3454adfb04d0561..97d12856092975a094ec18a5fd7ecafef39c255a 100644
--- a/youtube_dl/extractor/adobetv.py
+++ b/youtube_dl/extractor/adobetv.py
@@ -28,7 +28,6 @@ class AdobeTVIE(InfoExtractor):
  
      def _real_extract(self, url):
          video_id = self._match_id(url)
-
          webpage = self._download_webpage(url, video_id)
  
          player = self._parse_json(
@@ -44,8 +43,10 @@ def _real_extract(self, url):
              self._html_search_meta('datepublished', webpage, 'upload date'))
  
          duration = parse_duration(
-            self._html_search_meta('duration', webpage, 'duration')
-            or self._search_regex(r'Runtime:\s*(\d{2}:\d{2}:\d{2})', webpage, 'duration'))
+            self._html_search_meta('duration', webpage, 'duration') or
+            self._search_regex(
+                r'Runtime:\s*(\d{2}:\d{2}:\d{2})',
+                webpage, 'duration', fatal=False))
  
          view_count = str_to_int(self._search_regex(
              r'<div class="views">\s*Views?:\s*([\d,.]+)\s*</div>',
diff --git a/youtube_dl/extractor/adultswim.py b/youtube_dl/extractor/adultswim.py
index 502a9c25ad8fd6ab8fed0a46f2f52077f988aad9..34b8b01157bb930937f6f69c4950d8d01c39ed6e 100644
--- a/youtube_dl/extractor/adultswim.py
+++ b/youtube_dl/extractor/adultswim.py
@@ -38,6 +38,7 @@ class AdultSwimIE(InfoExtractor):
              },
          ],
          'info_dict': {
+            'id': 'rQxZvXQ4ROaSOqq-or2Mow',
              'title': 'Rick and Morty - Pilot',
              'description': "Rick moves in with his daughter's family and establishes himself as a bad influence on his grandson, Morty. "
          }
@@ -55,6 +56,7 @@ class AdultSwimIE(InfoExtractor):
              }
          ],
          'info_dict': {
+            'id': '-t8CamQlQ2aYZ49ItZCFog',
              'title': 'American Dad - Putting Francine Out of Business',
              'description': 'Stan hatches a plan to get Francine out of the real estate business.Watch more American Dad on [adult swim].'
          },
diff --git a/youtube_dl/extractor/appletrailers.py b/youtube_dl/extractor/appletrailers.py
index 287f71e076e91a44ea331c995410fbe8b40d178d..576f03b5b71115771555e1d8d46f4a108eb9de93 100644
--- a/youtube_dl/extractor/appletrailers.py
+++ b/youtube_dl/extractor/appletrailers.py
@@ -11,9 +11,12 @@
  
  
  class AppleTrailersIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?trailers\.apple\.com/trailers/(?P<company>[^/]+)/(?P<movie>[^/]+)'
-    _TEST = {
+    _VALID_URL = r'https?://(?:www\.)?trailers\.apple\.com/(?:trailers|ca)/(?P<company>[^/]+)/(?P<movie>[^/]+)'
+    _TESTS = [{
          "url": "http://trailers.apple.com/trailers/wb/manofsteel/",
+        'info_dict': {
+            'id': 'manofsteel',
+        },
          "playlist": [
              {
                  "md5": "d97a8e575432dbcb81b7c3acb741f8a8",
@@ -60,7 +63,10 @@ class AppleTrailersIE(InfoExtractor):
                  },
              },
          ]
-    }
+    }, {
+        'url': 'http://trailers.apple.com/ca/metropole/autrui/',
+        'only_matching': True,
+    }]
  
      _JSON_RE = r'iTunes.playURL\((.*?)\);'
  
diff --git a/youtube_dl/extractor/bandcamp.py b/youtube_dl/extractor/bandcamp.py
index 490cc961a204d40d41fbb4e0306a66611f161a09..86929496708fccf3bc0febe78cd1e599fda1ab97 100644
--- a/youtube_dl/extractor/bandcamp.py
+++ b/youtube_dl/extractor/bandcamp.py
@@ -109,7 +109,7 @@ def _real_extract(self, url):
  
  class BandcampAlbumIE(InfoExtractor):
      IE_NAME = 'Bandcamp:album'
-    _VALID_URL = r'https?://(?:(?P<subdomain>[^.]+)\.)?bandcamp\.com(?:/album/(?P<title>[^?#]+)|/?(?:$|[?#]))'
+    _VALID_URL = r'https?://(?:(?P<subdomain>[^.]+)\.)?bandcamp\.com(?:/album/(?P<album_id>[^?#]+)|/?(?:$|[?#]))'
  
      _TESTS = [{
          'url': 'http://blazo.bandcamp.com/album/jazz-format-mixtape-vol-1',
@@ -133,31 +133,37 @@ class BandcampAlbumIE(InfoExtractor):
          ],
          'info_dict': {
              'title': 'Jazz Format Mixtape vol.1',
+            'id': 'jazz-format-mixtape-vol-1',
+            'uploader_id': 'blazo',
          },
          'params': {
              'playlistend': 2
          },
-        'skip': 'Bandcamp imposes download limits. See test_playlists:test_bandcamp_album for the playlist test'
+        'skip': 'Bandcamp imposes download limits.'
      }, {
          'url': 'http://nightbringer.bandcamp.com/album/hierophany-of-the-open-grave',
          'info_dict': {
              'title': 'Hierophany of the Open Grave',
+            'uploader_id': 'nightbringer',
+            'id': 'hierophany-of-the-open-grave',
          },
          'playlist_mincount': 9,
      }, {
          'url': 'http://dotscale.bandcamp.com',
          'info_dict': {
              'title': 'Loom',
+            'id': 'dotscale',
+            'uploader_id': 'dotscale',
          },
          'playlist_mincount': 7,
      }]
  
      def _real_extract(self, url):
          mobj = re.match(self._VALID_URL, url)
-        playlist_id = mobj.group('subdomain')
-        title = mobj.group('title')
-        display_id = title or playlist_id
-        webpage = self._download_webpage(url, display_id)
+        uploader_id = mobj.group('subdomain')
+        album_id = mobj.group('album_id')
+        playlist_id = album_id or uploader_id
+        webpage = self._download_webpage(url, playlist_id)
          tracks_paths = re.findall(r'<a href="(.*?)" itemprop="url">', webpage)
          if not tracks_paths:
              raise ExtractorError('The page doesn\'t contain any tracks')
@@ -168,8 +174,8 @@ def _real_extract(self, url):
              r'album_title\s*:\s*"(.*?)"', webpage, 'title', fatal=False)
          return {
              '_type': 'playlist',
+            'uploader_id': uploader_id,
              'id': playlist_id,
-            'display_id': display_id,
              'title': title,
              'entries': entries,
          }
diff --git a/youtube_dl/extractor/blinkx.py b/youtube_dl/extractor/blinkx.py
index 3e461e715e141b1ff4a294eb01b7657d16f05d4b..3b8eabe8f4e42283eaa8a2288413f971fdcd5b35 100644
--- a/youtube_dl/extractor/blinkx.py
+++ b/youtube_dl/extractor/blinkx.py
@@ -1,40 +1,35 @@
  from __future__ import unicode_literals
  
  import json
-import re
  
  from .common import InfoExtractor
-from ..utils import remove_start
+from ..utils import (
+    remove_start,
+    int_or_none,
+)
  
  
  class BlinkxIE(InfoExtractor):
-    _VALID_URL = r'^(?:https?://(?:www\.)blinkx\.com/#?ce/|blinkx:)(?P<id>[^?]+)'
+    _VALID_URL = r'(?:https?://(?:www\.)blinkx\.com/#?ce/|blinkx:)(?P<id>[^?]+)'
      IE_NAME = 'blinkx'
  
      _TEST = {
-        'url': 'http://www.blinkx.com/ce/8aQUy7GVFYgFzpKhT0oqsilwOGFRVXk3R1ZGWWdGenBLaFQwb3FzaWx3OGFRVXk3R1ZGWWdGenB',
-        'md5': '2e9a07364af40163a908edbf10bb2492',
+        'url': 'http://www.blinkx.com/ce/Da0Gw3xc5ucpNduzLuDDlv4WC9PuI4fDi1-t6Y3LyfdY2SZS5Urbvn-UPJvrvbo8LTKTc67Wu2rPKSQDJyZeeORCR8bYkhs8lI7eqddznH2ofh5WEEdjYXnoRtj7ByQwt7atMErmXIeYKPsSDuMAAqJDlQZ-3Ff4HJVeH_s3Gh8oQ',
+        'md5': '337cf7a344663ec79bf93a526a2e06c7',
          'info_dict': {
-            'id': '8aQUy7GV',
+            'id': 'Da0Gw3xc',
              'ext': 'mp4',
-            'title': 'Police Car Rolls Away',
-            'uploader': 'stupidvideos.com',
-            'upload_date': '20131215',
-            'timestamp': 1387068000,
-            'description': 'A police car gently rolls away from a fight. Maybe it felt weird being around a confrontation and just had to get out of there!',
-            'duration': 14.886,
-            'thumbnails': [{
-                'width': 100,
-                'height': 76,
-                'resolution': '100x76',
-                'url': 'http://cdn.blinkx.com/stream/b/41/StupidVideos/20131215/1873969261/1873969261_tn_0.jpg',
-            }],
+            'title': 'No Daily Show for John Oliver; HBO Show Renewed - IGN News',
+            'uploader': 'IGN News',
+            'upload_date': '20150217',
+            'timestamp': 1424215740,
+            'description': 'HBO has renewed Last Week Tonight With John Oliver for two more seasons.',
+            'duration': 47.743333,
          },
      }
  
-    def _real_extract(self, rl):
-        m = re.match(self._VALID_URL, rl)
-        video_id = m.group('id')
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
          display_id = video_id[:8]
  
          api_url = ('https://apib4.blinkx.com/api.php?action=play_video&' +
@@ -60,18 +55,20 @@ def _real_extract(self, rl):
              elif m['type'] in ('flv', 'mp4'):
                  vcodec = remove_start(m['vcodec'], 'ff')
                  acodec = remove_start(m['acodec'], 'ff')
-                tbr = (int(m['vbr']) + int(m['abr'])) // 1000
+                vbr = int_or_none(m.get('vbr') or m.get('vbitrate'), 1000)
+                abr = int_or_none(m.get('abr') or m.get('abitrate'), 1000)
+                tbr = vbr + abr if vbr and abr else None
                  format_id = '%s-%sk-%s' % (vcodec, tbr, m['w'])
                  formats.append({
                      'format_id': format_id,
                      'url': m['link'],
                      'vcodec': vcodec,
                      'acodec': acodec,
-                    'abr': int(m['abr']) // 1000,
-                    'vbr': int(m['vbr']) // 1000,
+                    'abr': abr,
+                    'vbr': vbr,
                      'tbr': tbr,
-                    'width': int(m['w']),
-                    'height': int(m['h']),
+                    'width': int_or_none(m.get('w')),
+                    'height': int_or_none(m.get('h')),
                  })
  
          self._sort_formats(formats)
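Review note: the blinkx hunk above replaces bare `int(m['vbr'])` lookups with tolerant ones that accept either key spelling and survive missing values. A minimal sketch of that pattern follows. The `int_or_none` defined here is a simplified stand-in written for this note, and `bitrates` is a hypothetical wrapper; neither is claimed to match the library's own helper beyond the "scale by 1000, pass None through" behaviour the hunk relies on.

```python
def int_or_none(value, scale=1):
    """Simplified stand-in: int(value) // scale, or None when value is None."""
    if value is None:
        return None
    return int(value) // scale


def bitrates(media):
    """Return (vbr, abr, tbr) in kbit/s from a media dict using either key spelling."""
    vbr = int_or_none(media.get('vbr') or media.get('vbitrate'), 1000)
    abr = int_or_none(media.get('abr') or media.get('abitrate'), 1000)
    # Total bitrate is only reported when both components are known and non-zero.
    tbr = vbr + abr if vbr and abr else None
    return vbr, abr, tbr
```

One consequence worth noting: when either bitrate is absent, `tbr` is None, so a format string such as `'%s-%sk-%s' % (vcodec, tbr, m['w'])` would render the literal text `None` in the format id.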
diff --git a/youtube_dl/extractor/brightcove.py b/youtube_dl/extractor/brightcove.py
index ea0969d4d259a99653bebbcabcebb0e1f87719f3..0733bece7c45880ab5c20b916d5bd8c9700da548 100644
--- a/youtube_dl/extractor/brightcove.py
+++ b/youtube_dl/extractor/brightcove.py
@@ -95,6 +95,7 @@ class BrightcoveIE(InfoExtractor):
              'url': 'http://c.brightcove.com/services/viewer/htmlFederated?playerID=3550052898001&playerKey=AQ%7E%7E%2CAAABmA9XpXk%7E%2C-Kp7jNgisre1fG5OdqpAFUTcs0lP_ZoL',
              'info_dict': {
                  'title': 'Sealife',
+                'id': '3550319591001',
              },
              'playlist_mincount': 7,
          },
@@ -247,7 +248,7 @@ def _get_playlist_info(self, player_key):
          playlist_info = json_data['videoList']
          videos = [self._extract_video_info(video_info) for video_info in playlist_info['mediaCollectionDTO']['videoDTOs']]
  
-        return self.playlist_result(videos, playlist_id=playlist_info['id'],
+        return self.playlist_result(videos, playlist_id='%s' % playlist_info['id'],
                                      playlist_title=playlist_info['mediaCollectionDTO']['displayName'])
  
      def _extract_video_info(self, video_info):
diff --git a/youtube_dl/extractor/buzzfeed.py b/youtube_dl/extractor/buzzfeed.py
index a5d2af1749f188a086e5384b1de6a2441e624902..df503ecc0f50283f0cc77a867353912a47eee5dd 100644
--- a/youtube_dl/extractor/buzzfeed.py
+++ b/youtube_dl/extractor/buzzfeed.py
@@ -33,6 +33,7 @@ class BuzzFeedIE(InfoExtractor):
              'skip_download': True,  # Got enough YouTube download tests
          },
          'info_dict': {
+            'id': 'look-at-this-cute-dog-omg',
              'description': 're:Munchkin the Teddy Bear is back ?!',
              'title': 'You Need To Stop What You\'re Doing And Watching This Dog Walk On A Treadmill',
          },
@@ -42,8 +43,8 @@ class BuzzFeedIE(InfoExtractor):
                  'ext': 'mp4',
                  'upload_date': '20141124',
                  'uploader_id': 'CindysMunchkin',
-                'description': 're:© 2014 Munchkin the Shih Tzu',
-                'uploader': 'Munchkin the Shih Tzu',
+                'description': 're:© 2014 Munchkin the',
+                'uploader': 're:^Munchkin the',
                  'title': 're:Munchkin the Teddy Bear gets her exercise',
              },
          }]
diff --git a/youtube_dl/extractor/cbs.py b/youtube_dl/extractor/cbs.py
index e43756ec69b1d7f1872e45cc6901b41752ec6ef6..1ceb9d8d9df6c0268e33de5e34c01a245e134e05 100644
--- a/youtube_dl/extractor/cbs.py
+++ b/youtube_dl/extractor/cbs.py
@@ -1,7 +1,5 @@
  from __future__ import unicode_literals
  
-import re
-
  from .common import InfoExtractor
  
  
@@ -39,8 +37,7 @@ class CBSIE(InfoExtractor):
      }]
  
      def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
+        video_id = self._match_id(url)
          webpage = self._download_webpage(url, video_id)
          real_id = self._search_regex(
              r"video\.settings\.pid\s*=\s*'([^']+)';",
diff --git a/youtube_dl/extractor/cbssports.py b/youtube_dl/extractor/cbssports.py
new file mode 100644
index 0000000..ae47e74
--- /dev/null
+++ b/youtube_dl/extractor/cbssports.py
@@ -0,0 +1,30 @@
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+
+
+class CBSSportsIE(InfoExtractor):
+    _VALID_URL = r'http://www\.cbssports\.com/video/player/(?P<section>[^/]+)/(?P<id>[^/]+)'
+
+    _TEST = {
+        'url': 'http://www.cbssports.com/video/player/tennis/318462531970/0/us-open-flashbacks-1990s',
+        'info_dict': {
+            'id': '_d5_GbO8p1sT',
+            'ext': 'flv',
+            'title': 'US Open flashbacks: 1990s',
+            'description': 'Bill Macatee relives the best moments in US Open history from the 1990s.',
+        },
+    }
+
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        section = mobj.group('section')
+        video_id = mobj.group('id')
+        all_videos = self._download_json(
+            'http://www.cbssports.com/data/video/player/getVideos/%s?as=json' % section,
+            video_id)
+        # The json file contains the info of all the videos in the section
+        video_info = next(v for v in all_videos if v['pcid'] == video_id)
+        return self.url_result('theplatform:%s' % video_info['pid'], 'ThePlatform')
diff --git a/youtube_dl/extractor/chirbit.py b/youtube_dl/extractor/chirbit.py

new file mode 100644 (file)

index 0000000..b1eeaf1
--- /dev/null
+++ b/youtube_dl/extractor/chirbit.py
@@ -0,0 +1,84 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    parse_duration,
+    int_or_none,
+)
+
+
+class ChirbitIE(InfoExtractor):
+    IE_NAME = 'chirbit'
+    _VALID_URL = r'https?://(?:www\.)?chirb\.it/(?:(?:wp|pl)/|fb_chirbit_player\.swf\?key=)?(?P<id>[\da-zA-Z]+)'
+    _TESTS = [{
+        'url': 'http://chirb.it/PrIPv5',
+        'md5': '9847b0dad6ac3e074568bf2cfb197de8',
+        'info_dict': {
+            'id': 'PrIPv5',
+            'ext': 'mp3',
+            'title': 'Фасадстрой',
+            'duration': 52,
+            'view_count': int,
+            'comment_count': int,
+        }
+    }, {
+        'url': 'https://chirb.it/fb_chirbit_player.swf?key=PrIPv5',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        audio_id = self._match_id(url)
+
+        webpage = self._download_webpage(
+            'http://chirb.it/%s' % audio_id, audio_id)
+
+        audio_url = self._search_regex(
+            r'"setFile"\s*,\s*"([^"]+)"', webpage, 'audio url')
+
+        title = self._search_regex(
+            r'itemprop="name">([^<]+)', webpage, 'title')
+        duration = parse_duration(self._html_search_meta(
+            'duration', webpage, 'duration', fatal=False))
+        view_count = int_or_none(self._search_regex(
+            r'itemprop="playCount"\s*>(\d+)', webpage,
+            'listen count', fatal=False))
+        comment_count = int_or_none(self._search_regex(
+            r'>(\d+) Comments?:', webpage,
+            'comment count', fatal=False))
+
+        return {
+            'id': audio_id,
+            'url': audio_url,
+            'title': title,
+            'duration': duration,
+            'view_count': view_count,
+            'comment_count': comment_count,
+        }
+
+
+class ChirbitProfileIE(InfoExtractor):
+    IE_NAME = 'chirbit:profile'
+    _VALID_URL = r'https?://(?:www\.)?chirbit\.com/(?:rss/)?(?P<id>[^/]+)'
+    _TEST = {
+        'url': 'http://chirbit.com/ScarletBeauty',
+        'info_dict': {
+            'id': 'ScarletBeauty',
+            'title': 'Chirbits by ScarletBeauty',
+        },
+        'playlist_mincount': 3,
+    }
+
+    def _real_extract(self, url):
+        profile_id = self._match_id(url)
+
+        rss = self._download_xml(
+            'http://chirbit.com/rss/%s' % profile_id, profile_id)
+
+        entries = [
+            self.url_result(audio_url.text, 'Chirbit')
+            for audio_url in rss.findall('./channel/item/link')]
+
+        title = rss.find('./channel/title').text
+
+        return self.playlist_result(entries, profile_id, title)
diff --git a/youtube_dl/extractor/common.py b/youtube_dl/extractor/common.py

index 7d8ce18085758469e833c555d9dd675264600ae7..87fce9cd89425150baff91577199f706db2a1e81 100644 (file)
--- a/youtube_dl/extractor/common.py
+++ b/youtube_dl/extractor/common.py
@@ -27,7 +27,6 @@
      compiled_regex_type,
      ExtractorError,
      float_or_none,
-    HEADRequest,
      int_or_none,
      RegexNotFoundError,
      sanitize_filename,
@@ -398,6 +397,16 @@ def _webpage_read_content(self, urlh, url_or_request, video_id, note=None, errno
              if blocked_iframe:
                  msg += ' Visit %s for more details' % blocked_iframe
              raise ExtractorError(msg, expected=True)
+        if '<title>The URL you requested has been blocked</title>' in content[:512]:
+            msg = (
+                'Access to this webpage has been blocked by Indian censorship. '
+                'Use a VPN or proxy server (with --proxy) to route around it.')
+            block_msg = self._html_search_regex(
+                r'</h1><p>(.*?)</p>',
+                content, 'block message', default=None)
+            if block_msg:
+                msg += ' (Message: "%s")' % block_msg.replace('\n', ' ')
+            raise ExtractorError(msg, expected=True)
  
          return content
  
@@ -735,6 +744,7 @@ def _formats_key(f):
                  f.get('language_preference') if f.get('language_preference') is not None else -1,
                  f.get('quality') if f.get('quality') is not None else -1,
                  f.get('tbr') if f.get('tbr') is not None else -1,
+                f.get('filesize') if f.get('filesize') is not None else -1,
                  f.get('vbr') if f.get('vbr') is not None else -1,
                  f.get('height') if f.get('height') is not None else -1,
                  f.get('width') if f.get('width') is not None else -1,
@@ -742,7 +752,6 @@ def _formats_key(f):
                  f.get('abr') if f.get('abr') is not None else -1,
                  audio_ext_preference,
                  f.get('fps') if f.get('fps') is not None else -1,
-                f.get('filesize') if f.get('filesize') is not None else -1,
                  f.get('filesize_approx') if f.get('filesize_approx') is not None else -1,
                  f.get('source_preference') if f.get('source_preference') is not None else -1,
                  f.get('format_id'),
@@ -759,9 +768,7 @@ def _check_formats(self, formats, video_id):
  
      def _is_valid_url(self, url, video_id, item='video'):
          try:
-            self._request_webpage(
-                HEADRequest(url), video_id,
-                'Checking %s URL' % item)
+            self._request_webpage(url, video_id, 'Checking %s URL' % item)
              return True
          except ExtractorError as e:
              if isinstance(e.cause, compat_HTTPError):
@@ -807,8 +814,8 @@ def _extract_f4m_formats(self, manifest_url, video_id, preference=None, f4m_id=N
              media_nodes = manifest.findall('{http://ns.adobe.com/f4m/2.0}media')
          for i, media_el in enumerate(media_nodes):
              if manifest_version == '2.0':
-                manifest_url = ('/'.join(manifest_url.split('/')[:-1]) + '/'
-                                + (media_el.attrib.get('href') or media_el.attrib.get('url')))
+                manifest_url = ('/'.join(manifest_url.split('/')[:-1]) + '/' +
+                                (media_el.attrib.get('href') or media_el.attrib.get('url')))
              tbr = int_or_none(media_el.attrib.get('bitrate'))
              formats.append({
                  'format_id': '-'.join(filter(None, [f4m_id, 'f4m-%d' % (i if tbr is None else tbr)])),
@@ -832,7 +839,7 @@ def _extract_m3u8_formats(self, m3u8_url, video_id, ext=None,
              'url': m3u8_url,
              'ext': ext,
              'protocol': 'm3u8',
-            'preference': -1,
+            'preference': preference - 1 if preference else -1,
              'resolution': 'multiple',
              'format_note': 'Quality selection URL',
          }]
@@ -847,6 +854,7 @@ def _extract_m3u8_formats(self, m3u8_url, video_id, ext=None,
              note='Downloading m3u8 information',
              errnote='Failed to download m3u8 information')
          last_info = None
+        last_media = None
          kv_rex = re.compile(
              r'(?P<key>[a-zA-Z_-]+)=(?P<val>"[^"]+"|[^",]+)(?:,|$)')
          for line in m3u8_doc.splitlines():
@@ -857,6 +865,13 @@ def _extract_m3u8_formats(self, m3u8_url, video_id, ext=None,
                      if v.startswith('"'):
                          v = v[1:-1]
                      last_info[m.group('key')] = v
+            elif line.startswith('#EXT-X-MEDIA:'):
+                last_media = {}
+                for m in kv_rex.finditer(line):
+                    v = m.group('val')
+                    if v.startswith('"'):
+                        v = v[1:-1]
+                    last_media[m.group('key')] = v
              elif line.startswith('#') or not line.strip():
                  continue
              else:
@@ -885,6 +900,9 @@ def _extract_m3u8_formats(self, m3u8_url, video_id, ext=None,
                      width_str, height_str = resolution.split('x')
                      f['width'] = int(width_str)
                      f['height'] = int(height_str)
+                if last_media is not None:
+                    f['m3u8_media'] = last_media
+                    last_media = None
                  formats.append(f)
                  last_info = {}
          self._sort_formats(formats)
diff --git a/youtube_dl/extractor/dailymotion.py b/youtube_dl/extractor/dailymotion.py

index 4ca8929263c165249c222723cf6992eedccd2fff..42b20a46ddefc1e4a7e66aacd0d959a1e062618f 100644 (file)
--- a/youtube_dl/extractor/dailymotion.py
+++ b/youtube_dl/extractor/dailymotion.py
@@ -190,6 +190,7 @@ class DailymotionPlaylistIE(DailymotionBaseInfoExtractor):
          'url': 'http://www.dailymotion.com/playlist/xv4bw_nqtv_sport/1#video=xl8v3q',
          'info_dict': {
              'title': 'SPORT',
+            'id': 'xv4bw_nqtv_sport',
          },
          'playlist_mincount': 20,
      }]
diff --git a/youtube_dl/extractor/defense.py b/youtube_dl/extractor/defense.py

index 2b90bf4fc2fcba04fe7e164602196586713d4225..98e3aedfd08ada1300cbf3114a41022949062402 100644 (file)
--- a/youtube_dl/extractor/defense.py
+++ b/youtube_dl/extractor/defense.py
@@ -25,8 +25,9 @@ def _real_extract(self, url):
              r"flashvars.pvg_id=\"(\d+)\";",
              webpage, 'ID')
  
-        json_url = ('http://static.videos.gouv.fr/brightcovehub/export/json/'
-                    + video_id)
+        json_url = (
+            'http://static.videos.gouv.fr/brightcovehub/export/json/%s' %
+            video_id)
          info = self._download_json(json_url, title, 'Downloading JSON config')
          video_url = info['renditions'][0]['url']
  
diff --git a/youtube_dl/extractor/embedly.py b/youtube_dl/extractor/embedly.py

new file mode 100644 (file)

index 0000000..1cdb11e
--- /dev/null
+++ b/youtube_dl/extractor/embedly.py
@@ -0,0 +1,16 @@
+# encoding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..compat import compat_urllib_parse_unquote
+
+
+class EmbedlyIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:(?:www|cdn)\.)?embedly\.com/widgets/media\.html\?(?:[^#]*?&)?url=(?P<id>[^#&]+)'
+    _TESTS = [{
+        'url': 'https://cdn.embedly.com/widgets/media.html?src=http%3A%2F%2Fwww.youtube.com%2Fembed%2Fvideoseries%3Flist%3DUUGLim4T2loE5rwCMdpCIPVg&url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DSU4fj_aEMVw%26list%3DUUGLim4T2loE5rwCMdpCIPVg&image=http%3A%2F%2Fi.ytimg.com%2Fvi%2FSU4fj_aEMVw%2Fhqdefault.jpg&key=8ee8a2e6a8cc47aab1a5ee67f9a178e0&type=text%2Fhtml&schema=youtube&autoplay=1',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        return self.url_result(compat_urllib_parse_unquote(self._match_id(url)))
diff --git a/youtube_dl/extractor/escapist.py b/youtube_dl/extractor/escapist.py

index 4303feccdaedd865e2c769a282173de6a4d5ffce..b49b9869f2356eaa7e09916046a8f4b30f4f5d47 100644 (file)
--- a/youtube_dl/extractor/escapist.py
+++ b/youtube_dl/extractor/escapist.py
@@ -22,6 +22,7 @@ class EscapistIE(InfoExtractor):
              'uploader_id': 'the-escapist-presents',
              'uploader': 'The Escapist Presents',
              'title': "Breaking Down Baldur's Gate",
+            'thumbnail': 're:^https?://.*\.jpg$',
          }
      }
  
@@ -30,19 +31,18 @@ def _real_extract(self, url):
          webpage = self._download_webpage(url, video_id)
  
          uploader_id = self._html_search_regex(
-            r"<h1 class='headline'><a href='/videos/view/(.*?)'",
+            r"<h1\s+class='headline'>\s*<a\s+href='/videos/view/(.*?)'",
              webpage, 'uploader ID', fatal=False)
          uploader = self._html_search_regex(
-            r"<h1 class='headline'>(.*?)</a>",
+            r"<h1\s+class='headline'>(.*?)</a>",
              webpage, 'uploader', fatal=False)
          description = self._html_search_meta('description', webpage)
  
          raw_title = self._html_search_meta('title', webpage, fatal=True)
          title = raw_title.partition(' : ')[2]
  
-        player_url = self._og_search_video_url(webpage, name='player URL')
-        config_url = compat_urllib_parse.unquote(self._search_regex(
-            r'config=(.*)$', player_url, 'config URL'))
+        config_url = compat_urllib_parse.unquote(self._html_search_regex(
+            r'<param\s+name="flashvars"\s+value="config=([^"&]+)', webpage, 'config URL'))
  
          formats = []
  
@@ -81,5 +81,4 @@ def _add_format(name, cfgurl, quality):
              'title': title,
              'thumbnail': self._og_search_thumbnail(webpage),
              'description': description,
-            'player_url': player_url,
          }
diff --git a/youtube_dl/extractor/fivemin.py b/youtube_dl/extractor/fivemin.py

index 5b24b921c13d497d09474fa405df5b164451dd80..157094e8c99a598a66a98e51ee70e70502494057 100644 (file)
--- a/youtube_dl/extractor/fivemin.py
+++ b/youtube_dl/extractor/fivemin.py
@@ -14,6 +14,7 @@ class FiveMinIE(InfoExtractor):
      IE_NAME = '5min'
      _VALID_URL = r'''(?x)
          (?:https?://[^/]*?5min\.com/Scripts/PlayerSeed\.js\?(?:.*?&)?playList=|
+            https?://(?:(?:massively|www)\.)?joystiq\.com/video/|
              5min:)
          (?P<id>\d+)
          '''
diff --git a/youtube_dl/extractor/gdcvault.py b/youtube_dl/extractor/gdcvault.py

index fed968f5179ebf6159212da5ab75b024b3bc0a03..f7b467b0aff8f46aa028d1898f5909277e973318 100644 (file)
--- a/youtube_dl/extractor/gdcvault.py
+++ b/youtube_dl/extractor/gdcvault.py
@@ -7,6 +7,7 @@
      compat_urllib_parse,
      compat_urllib_request,
  )
+from ..utils import remove_end
  
  
  class GDCVaultIE(InfoExtractor):
@@ -65,10 +66,12 @@ def _parse_mp4(self, xml_description):
  
      def _parse_flv(self, xml_description):
          video_formats = []
-        akami_url = xml_description.find('./metadata/akamaiHost').text
+        akamai_url = xml_description.find('./metadata/akamaiHost').text
          slide_video_path = xml_description.find('./metadata/slideVideo').text
          video_formats.append({
-            'url': 'rtmp://' + akami_url + '/' + slide_video_path,
+            'url': 'rtmp://%s/ondemand?ovpfv=1.1' % akamai_url,
+            'play_path': remove_end(slide_video_path, '.flv'),
+            'ext': 'flv',
              'format_note': 'slide deck video',
              'quality': -2,
              'preference': -2,
@@ -76,7 +79,9 @@ def _parse_flv(self, xml_description):
          })
          speaker_video_path = xml_description.find('./metadata/speakerVideo').text
          video_formats.append({
-            'url': 'rtmp://' + akami_url + '/' + speaker_video_path,
+            'url': 'rtmp://%s/ondemand?ovpfv=1.1' % akamai_url,
+            'play_path': remove_end(speaker_video_path, '.flv'),
+            'ext': 'flv',
              'format_note': 'speaker video',
              'quality': -1,
              'preference': -1,
diff --git a/youtube_dl/extractor/generic.py b/youtube_dl/extractor/generic.py

index f4500e931ba1a2c72fa6e4e87e120e317e236e56..875e1bf05ff274a41f46518c48e990954b7e12e5 100644 (file)
--- a/youtube_dl/extractor/generic.py
+++ b/youtube_dl/extractor/generic.py
@@ -473,6 +473,7 @@ class GenericIE(InfoExtractor):
          {
              'url': 'http://discourse.ubuntu.com/t/unity-8-desktop-mode-windows-on-mir/1986',
              'info_dict': {
+                'id': '1986',
                  'title': 'Unity 8 desktop-mode windows on Mir! - Ubuntu Discourse',
              },
              'playlist_mincount': 2,
@@ -531,13 +532,31 @@ class GenericIE(InfoExtractor):
              'info_dict': {
                  'id': 'Mrj4DVp2zeA',
                  'ext': 'mp4',
-                'upload_date': '20150204',
+                'upload_date': '20150212',
                  'uploader': 'The National Archives UK',
                  'description': 'md5:a236581cd2449dd2df4f93412f3f01c6',
                  'uploader_id': 'NationalArchives08',
                  'title': 'Webinar: Using Discovery, The National Archives’ online catalogue',
              },
-        }
+        },
+        # rtl.nl embed
+        {
+            'url': 'http://www.rtlnieuws.nl/nieuws/buitenland/aanslagen-kopenhagen',
+            'playlist_mincount': 5,
+            'info_dict': {
+                'id': 'aanslagen-kopenhagen',
+                'title': 'Aanslagen Kopenhagen | RTL Nieuws',
+            }
+        },
+        # Zapiks embed
+        {
+            'url': 'http://www.skipass.com/news/116090-bon-appetit-s5ep3-baqueira-mi-cor.html',
+            'info_dict': {
+                'id': '118046',
+                'ext': 'mp4',
+                'title': 'EP3S5 - Bon Appétit - Baqueira Mi Corazon !',
+            }
+        },
      ]
  
      def report_following_redirect(self, new_url):
@@ -782,6 +801,13 @@ def _playlist_from_matches(matches, getter=None, ie=None):
                  'entries': entries,
              }
  
+        # Look for embedded rtl.nl player
+        matches = re.findall(
+            r'<iframe\s+(?:[a-zA-Z-]+="[^"]+"\s+)*?src="((?:https?:)?//(?:www\.)?rtl\.nl/system/videoplayer/[^"]+video_embed[^"]+)"',
+            webpage)
+        if matches:
+            return _playlist_from_matches(matches, ie='RtlNl')
+
          # Look for embedded (iframe) Vimeo player
          mobj = re.search(
              r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//player\.vimeo\.com/video/.+?)\1', webpage)
@@ -789,7 +815,6 @@ def _playlist_from_matches(matches, getter=None, ie=None):
              player_url = unescapeHTML(mobj.group('url'))
              surl = smuggle_url(player_url, {'Referer': url})
              return self.url_result(surl)
-
          # Look for embedded (swf embed) Vimeo player
          mobj = re.search(
              r'<embed[^>]+?src="((?:https?:)?//(?:www\.)?vimeo\.com/moogaloop\.swf.+?)"', webpage)
@@ -1082,6 +1107,12 @@ def _playlist_from_matches(matches, getter=None, ie=None):
          if mobj is not None:
              return self.url_result(mobj.group('url'), 'Livestream')
  
+        # Look for Zapiks embed
+        mobj = re.search(
+            r'<iframe[^>]+src="(?P<url>https?://(?:www\.)?zapiks\.fr/index\.php\?.+?)"', webpage)
+        if mobj is not None:
+            return self.url_result(mobj.group('url'), 'Zapiks')
+
          def check_video(vurl):
              if YoutubeIE.suitable(vurl):
                  return True
diff --git a/youtube_dl/extractor/ign.py b/youtube_dl/extractor/ign.py

index 3db668cd0297ea0ff3c0168c2b3f5db1491a0db4..3aade9e740673da3193324add6a8a3ac4eff8b1f 100644 (file)
--- a/youtube_dl/extractor/ign.py
+++ b/youtube_dl/extractor/ign.py
@@ -34,6 +34,9 @@ class IGNIE(InfoExtractor):
          },
          {
              'url': 'http://me.ign.com/en/feature/15775/100-little-things-in-gta-5-that-will-blow-your-mind',
+            'info_dict': {
+                'id': '100-little-things-in-gta-5-that-will-blow-your-mind',
+            },
              'playlist': [
                  {
                      'info_dict': {
diff --git a/youtube_dl/extractor/imgur.py b/youtube_dl/extractor/imgur.py

new file mode 100644 (file)

index 0000000..fe5d95e
--- /dev/null
+++ b/youtube_dl/extractor/imgur.py
@@ -0,0 +1,97 @@
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..utils import (
+    int_or_none,
+    js_to_json,
+    mimetype2ext,
+    ExtractorError,
+)
+
+
+class ImgurIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:i\.)?imgur\.com/(?P<id>[a-zA-Z0-9]+)(?:\.mp4|\.gifv)?'
+
+    _TESTS = [{
+        'url': 'https://i.imgur.com/A61SaA1.gifv',
+        'info_dict': {
+            'id': 'A61SaA1',
+            'ext': 'mp4',
+            'title': 're:Imgur GIF$|MRW gifv is up and running without any bugs$',
+            'description': 're:The origin of the Internet\'s most viral images$|The Internet\'s visual storytelling community\. Explore, share, and discuss the best visual stories the Internet has to offer\.$',
+        },
+    }, {
+        'url': 'https://imgur.com/A61SaA1',
+        'info_dict': {
+            'id': 'A61SaA1',
+            'ext': 'mp4',
+            'title': 're:Imgur GIF$|MRW gifv is up and running without any bugs$',
+            'description': 're:The origin of the Internet\'s most viral images$|The Internet\'s visual storytelling community\. Explore, share, and discuss the best visual stories the Internet has to offer\.$',
+        },
+    }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        webpage = self._download_webpage(url, video_id)
+
+        width = int_or_none(self._search_regex(
+            r'<param name="width" value="([0-9]+)"',
+            webpage, 'width', fatal=False))
+        height = int_or_none(self._search_regex(
+            r'<param name="height" value="([0-9]+)"',
+            webpage, 'height', fatal=False))
+
+        video_elements = self._search_regex(
+            r'(?s)<div class="video-elements">(.*?)</div>',
+            webpage, 'video elements', default=None)
+        if not video_elements:
+            raise ExtractorError(
+                'No sources found for video %s. Maybe an image?' % video_id,
+                expected=True)
+
+        formats = []
+        for m in re.finditer(r'<source\s+src="(?P<src>[^"]+)"\s+type="(?P<type>[^"]+)"', video_elements):
+            formats.append({
+                'format_id': m.group('type').partition('/')[2],
+                'url': self._proto_relative_url(m.group('src')),
+                'ext': mimetype2ext(m.group('type')),
+                'acodec': 'none',
+                'width': width,
+                'height': height,
+                'http_headers': {
+                    'User-Agent': 'youtube-dl (like wget)',
+                },
+            })
+
+        gif_json = self._search_regex(
+            r'(?s)var\s+videoItem\s*=\s*(\{.*?\})',
+            webpage, 'GIF code', fatal=False)
+        if gif_json:
+            gifd = self._parse_json(
+                gif_json, video_id, transform_source=js_to_json)
+            formats.append({
+                'format_id': 'gif',
+                'preference': -10,
+                'width': width,
+                'height': height,
+                'ext': 'gif',
+                'acodec': 'none',
+                'vcodec': 'gif',
+                'container': 'gif',
+                'url': self._proto_relative_url(gifd['gifUrl']),
+                'filesize': gifd.get('size'),
+                'http_headers': {
+                    'User-Agent': 'youtube-dl (like wget)',
+                },
+            })
+
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'formats': formats,
+            'description': self._og_search_description(webpage),
+            'title': self._og_search_title(webpage),
+        }
diff --git a/youtube_dl/extractor/livestream.py b/youtube_dl/extractor/livestream.py

index 5247c6f58500e301dab50ed48039df0c070b493a..3642089f7802238d77ec5c18e4f96b5cb21e3d72 100644 (file)
--- a/youtube_dl/extractor/livestream.py
+++ b/youtube_dl/extractor/livestream.py
@@ -37,6 +37,7 @@ class LivestreamIE(InfoExtractor):
          'url': 'http://new.livestream.com/tedx/cityenglish',
          'info_dict': {
              'title': 'TEDCity2.0 (English)',
+            'id': '2245590',
          },
          'playlist_mincount': 4,
      }, {
@@ -148,7 +149,8 @@ def is_relevant(vdata, vid):
                    if is_relevant(video_data, video_id)]
          if video_id is None:
              # This is an event page:
-            return self.playlist_result(videos, info['id'], info['full_name'])
+            return self.playlist_result(
+                videos, '%s' % info['id'], info['full_name'])
          else:
              if not videos:
                  raise ExtractorError('Cannot find video %s' % video_id)
diff --git a/youtube_dl/extractor/nationalgeographic.py b/youtube_dl/extractor/nationalgeographic.py

new file mode 100644 (file)

index 0000000..c18640c
--- /dev/null
+++ b/youtube_dl/extractor/nationalgeographic.py
@@ -0,0 +1,38 @@
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    smuggle_url,
+    url_basename,
+)
+
+
+class NationalGeographicIE(InfoExtractor):
+    _VALID_URL = r'http://video\.nationalgeographic\.com/video/.*?'
+
+    _TEST = {
+        'url': 'http://video.nationalgeographic.com/video/news/150210-news-crab-mating-vin?source=featuredvideo',
+        'info_dict': {
+            'id': '4DmDACA6Qtk_',
+            'ext': 'flv',
+            'title': 'Mating Crabs Busted by Sharks',
+            'description': 'md5:16f25aeffdeba55aaa8ec37e093ad8b3',
+        },
+        'add_ie': ['ThePlatform'],
+    }
+
+    def _real_extract(self, url):
+        name = url_basename(url)
+
+        webpage = self._download_webpage(url, name)
+        feed_url = self._search_regex(r'data-feed-url="([^"]+)"', webpage, 'feed url')
+        guid = self._search_regex(r'data-video-guid="([^"]+)"', webpage, 'guid')
+
+        feed = self._download_xml('%s?byGuid=%s' % (feed_url, guid), name)
+        content = feed.find('.//{http://search.yahoo.com/mrss/}content')
+        theplatform_id = url_basename(content.attrib.get('url'))
+
+        return self.url_result(smuggle_url(
+            'http://link.theplatform.com/s/ngs/%s?format=SMIL&formats=MPEG4&manifest=f4m' % theplatform_id,
+            # For some reason, the normal links don't work and we must force the use of f4m
+            {'force_smil_url': True}))
diff --git a/youtube_dl/extractor/nbc.py b/youtube_dl/extractor/nbc.py

index 89a2845fe204695d206d53122585799f9a3e99cd..3645d3033f74ae174e3eaa85ad55bbe677d9daba 100644 (file)
--- a/youtube_dl/extractor/nbc.py
+++ b/youtube_dl/extractor/nbc.py
@@ -18,13 +18,13 @@ class NBCIE(InfoExtractor):
  
      _TESTS = [
          {
-            'url': 'http://www.nbc.com/chicago-fire/video/i-am-a-firefighter/2734188',
+            'url': 'http://www.nbc.com/the-tonight-show/segments/112966',
              # md5 checksum is not stable
              'info_dict': {
-                'id': 'bTmnLCvIbaaH',
+                'id': 'c9xnCo0YPOPH',
                  'ext': 'flv',
-                'title': 'I Am a Firefighter',
-                'description': 'An emergency puts Dawson\'sf irefighter skills to the ultimate test in this four-part digital series.',
+                'title': 'Jimmy Fallon Surprises Fans at Ben & Jerry\'s',
+                'description': 'Jimmy gives out free scoops of his new "Tonight Dough" ice cream flavor by surprising customers at the Ben & Jerry\'s scoop shop.',
              },
          },
          {
diff --git a/youtube_dl/extractor/netzkino.py b/youtube_dl/extractor/netzkino.py

index 93567d1e38bc7da5ea2e621cf1f3adb848ef3461..bc17e20aa9d736eb9e4ba0a39929f20db47d8465 100644 (file)
--- a/youtube_dl/extractor/netzkino.py
+++ b/youtube_dl/extractor/netzkino.py
@@ -29,6 +29,9 @@ class NetzkinoIE(InfoExtractor):
              'timestamp': 1344858571,
              'age_limit': 12,
          },
+        'params': {
+            'skip_download': 'Download only works from Germany',
+        }
      }
  
      def _real_extract(self, url):
diff --git a/youtube_dl/extractor/patreon.py b/youtube_dl/extractor/patreon.py

index 5429592a75a9f66fdc9f0e9fb908af9e67559aae..f179ea2008636f061c6a4cdad6fc69841a291076 100644 (file)
--- a/youtube_dl/extractor/patreon.py
+++ b/youtube_dl/extractor/patreon.py
@@ -1,9 +1,6 @@
  # encoding: utf-8
  from __future__ import unicode_literals
  
-import json
-import re
-
  from .common import InfoExtractor
  from ..utils import (
      js_to_json,
@@ -11,7 +8,7 @@
  
  
  class PatreonIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?patreon\.com/creation\?hid=(.+)'
+    _VALID_URL = r'https?://(?:www\.)?patreon\.com/creation\?hid=(?P<id>[^&#]+)'
      _TESTS = [
          {
              'url': 'http://www.patreon.com/creation?hid=743933',
@@ -35,6 +32,23 @@ class PatreonIE(InfoExtractor):
                  'thumbnail': 're:^https?://.*$',
              },
          },
+        {
+            'url': 'https://www.patreon.com/creation?hid=1682498',
+            'info_dict': {
+                'id': 'SU4fj_aEMVw',
+                'ext': 'mp4',
+                'title': 'I\'m on Patreon!',
+                'uploader': 'TraciJHines',
+                'thumbnail': 're:^https?://.*$',
+                'upload_date': '20150211',
+                'description': 'md5:c5a706b1f687817a3de09db1eb93acd4',
+                'uploader_id': 'TraciJHines',
+            },
+            'params': {
+                'noplaylist': True,
+                'skip_download': True,
+            }
+        }
      ]
  
      # Currently Patreon exposes download URL via hidden CSS, so login is not
@@ -65,26 +79,29 @@ def _real_initialize(self):
      '''
  
      def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group(1)
-
+        video_id = self._match_id(url)
          webpage = self._download_webpage(url, video_id)
          title = self._og_search_title(webpage).strip()
  
          attach_fn = self._html_search_regex(
              r'<div class="attach"><a target="_blank" href="([^"]+)">',
              webpage, 'attachment URL', default=None)
+        embed = self._html_search_regex(
+            r'<div id="watchCreation">\s*<iframe class="embedly-embed" src="([^"]+)"',
+            webpage, 'embedded URL', default=None)
+
          if attach_fn is not None:
              video_url = 'http://www.patreon.com' + attach_fn
              thumbnail = self._og_search_thumbnail(webpage)
              uploader = self._html_search_regex(
                  r'<strong>(.*?)</strong> is creating', webpage, 'uploader')
+        elif embed is not None:
+            return self.url_result(embed)
          else:
-            playlist_js = self._search_regex(
+            playlist = self._parse_json(self._search_regex(
                  r'(?s)new\s+jPlayerPlaylist\(\s*\{\s*[^}]*},\s*(\[.*?,?\s*\])',
-                webpage, 'playlist JSON')
-            playlist_json = js_to_json(playlist_js)
-            playlist = json.loads(playlist_json)
+                webpage, 'playlist JSON'),
+                video_id, transform_source=js_to_json)
              data = playlist[0]
              video_url = self._proto_relative_url(data['mp3'])
              thumbnail = self._proto_relative_url(data.get('cover'))
diff --git a/youtube_dl/extractor/pornhub.py b/youtube_dl/extractor/pornhub.py

index fb2032832e4757e328d016ab289e892721d73af2..3a27e37890dc78b26af866c9884807c97c56ccb9 100644 (file)
--- a/youtube_dl/extractor/pornhub.py
+++ b/youtube_dl/extractor/pornhub.py
@@ -56,7 +56,7 @@ def _real_extract(self, url):
  
          video_title = self._html_search_regex(r'<h1 [^>]+>([^<]+)', webpage, 'title')
          video_uploader = self._html_search_regex(
-            r'(?s)From:&nbsp;.+?<(?:a href="/users/|a href="/channels/|<span class="username)[^>]+>(.+?)<',
+            r'(?s)From:&nbsp;.+?<(?:a href="/users/|a href="/channels/|span class="username)[^>]+>(.+?)<',
              webpage, 'uploader', fatal=False)
          thumbnail = self._html_search_regex(r'"image_url":"([^"]+)', webpage, 'thumbnail', fatal=False)
          if thumbnail:
@@ -110,3 +110,33 @@ def _real_extract(self, url):
              'formats': formats,
              'age_limit': 18,
          }
+
+
+class PornHubPlaylistIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?pornhub\.com/playlist/(?P<id>\d+)'
+    _TESTS = [{
+        'url': 'http://www.pornhub.com/playlist/6201671',
+        'info_dict': {
+            'id': '6201671',
+            'title': 'P0p4',
+        },
+        'playlist_mincount': 35,
+    }]
+
+    def _real_extract(self, url):
+        playlist_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, playlist_id)
+
+        entries = [
+            self.url_result('http://www.pornhub.com/%s' % video_url, 'PornHub')
+            for video_url in set(re.findall(r'href="/?(view_video\.php\?viewkey=\d+[^"]*)"', webpage))
+        ]
+
+        playlist = self._parse_json(
+            self._search_regex(
+                r'playlistObject\s*=\s*({.+?});', webpage, 'playlist'),
+            playlist_id)
+
+        return self.playlist_result(
+            entries, playlist_id, playlist.get('title'), playlist.get('description'))
diff --git a/youtube_dl/extractor/r7.py b/youtube_dl/extractor/r7.py

new file mode 100644 (file)

index 0000000..976c8fe
--- /dev/null
+++ b/youtube_dl/extractor/r7.py
@@ -0,0 +1,88 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    js_to_json,
+    unescapeHTML,
+    int_or_none,
+)
+
+
+class R7IE(InfoExtractor):
+    _VALID_URL = r'''(?x)https?://
+                        (?:
+                            (?:[a-zA-Z]+)\.r7\.com(?:/[^/]+)+/idmedia/|
+                            noticias\.r7\.com(?:/[^/]+)+/[^/]+-|
+                            player\.r7\.com/video/i/
+                        )
+                        (?P<id>[\da-f]{24})
+                        '''
+    _TESTS = [{
+        'url': 'http://videos.r7.com/policiais-humilham-suspeito-a-beira-da-morte-morre-com-dignidade-/idmedia/54e7050b0cf2ff57e0279389.html',
+        'md5': '403c4e393617e8e8ddc748978ee8efde',
+        'info_dict': {
+            'id': '54e7050b0cf2ff57e0279389',
+            'ext': 'mp4',
+            'title': 'Policiais humilham suspeito à beira da morte: "Morre com dignidade"',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'duration': 98,
+            'like_count': int,
+            'view_count': int,
+        },
+    }, {
+        'url': 'http://esportes.r7.com/videos/cigano-manda-recado-aos-fas/idmedia/4e176727b51a048ee6646a1b.html',
+        'only_matching': True,
+    }, {
+        'url': 'http://noticias.r7.com/record-news/video/representante-do-instituto-sou-da-paz-fala-sobre-fim-do-estatuto-do-desarmamento-5480fc580cf2285b117f438d/',
+        'only_matching': True,
+    }, {
+        'url': 'http://player.r7.com/video/i/54e7050b0cf2ff57e0279389?play=true&video=http://vsh.r7.com/54e7050b0cf2ff57e0279389/ER7_RE_BG_MORTE_JOVENS_570kbps_2015-02-2009f17818-cc82-4c8f-86dc-89a66934e633-ATOS_copy.mp4&linkCallback=http://videos.r7.com/policiais-humilham-suspeito-a-beira-da-morte-morre-com-dignidade-/idmedia/54e7050b0cf2ff57e0279389.html&thumbnail=http://vtb.r7.com/ER7_RE_BG_MORTE_JOVENS_570kbps_2015-02-2009f17818-cc82-4c8f-86dc-89a66934e633-thumb.jpg&idCategory=192&share=true&layout=full&full=true',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        webpage = self._download_webpage(
+            'http://player.r7.com/video/i/%s' % video_id, video_id)
+
+        item = self._parse_json(js_to_json(self._search_regex(
+            r'(?s)var\s+item\s*=\s*({.+?});', webpage, 'player')), video_id)
+
+        title = unescapeHTML(item['title'])
+        thumbnail = item.get('init', {}).get('thumbUri')
+        duration = None
+
+        statistics = item.get('statistics', {})
+        like_count = int_or_none(statistics.get('likes'))
+        view_count = int_or_none(statistics.get('views'))
+
+        formats = []
+        for format_key, format_dict in item['playlist'][0].items():
+            src = format_dict.get('src')
+            if not src:
+                continue
+            format_id = format_dict.get('format') or format_key
+            if duration is None:
+                duration = format_dict.get('duration')
+            if '.f4m' in src:
+                formats.extend(self._extract_f4m_formats(src, video_id, preference=-1))
+            elif src.endswith('.m3u8'):
+                formats.extend(self._extract_m3u8_formats(src, video_id, 'mp4', preference=-2))
+            else:
+                formats.append({
+                    'url': src,
+                    'format_id': format_id,
+                })
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'title': title,
+            'thumbnail': thumbnail,
+            'duration': duration,
+            'like_count': like_count,
+            'view_count': view_count,
+            'formats': formats,
+        }
diff --git a/youtube_dl/extractor/radiode.py b/youtube_dl/extractor/radiode.py

index f95bc9454334b9ca15c5f74cf034cebd26487841..aa5f6f8ad41d1dcdb3cb975e2fcf883c8d3ac7f9 100644 (file)
--- a/youtube_dl/extractor/radiode.py
+++ b/youtube_dl/extractor/radiode.py
@@ -1,7 +1,5 @@
  from __future__ import unicode_literals
  
-import json
-
  from .common import InfoExtractor
  
  
@@ -10,13 +8,13 @@ class RadioDeIE(InfoExtractor):
      _VALID_URL = r'https?://(?P<id>.+?)\.(?:radio\.(?:de|at|fr|pt|es|pl|it)|rad\.io)'
      _TEST = {
          'url': 'http://ndr2.radio.de/',
-        'md5': '3b4cdd011bc59174596b6145cda474a4',
          'info_dict': {
              'id': 'ndr2',
              'ext': 'mp3',
              'title': 're:^NDR 2 [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
              'description': 'md5:591c49c702db1a33751625ebfb67f273',
              'thumbnail': 're:^https?://.*\.png',
+            'is_live': True,
          },
          'params': {
              'skip_download': True,
@@ -25,16 +23,15 @@ class RadioDeIE(InfoExtractor):
  
      def _real_extract(self, url):
          radio_id = self._match_id(url)
-
          webpage = self._download_webpage(url, radio_id)
+        jscode = self._search_regex(
+            r"'components/station/stationService':\s*\{\s*'?station'?:\s*(\{.*?\s*\}),\n",
+            webpage, 'broadcast')
  
-        broadcast = json.loads(self._search_regex(
-            r'_getBroadcast\s*=\s*function\(\s*\)\s*{\s*return\s+({.+?})\s*;\s*}',
-            webpage, 'broadcast'))
-
+        broadcast = self._parse_json(jscode, radio_id)
          title = self._live_title(broadcast['name'])
          description = broadcast.get('description') or broadcast.get('shortDescription')
-        thumbnail = broadcast.get('picture4Url') or broadcast.get('picture4TransUrl')
+        thumbnail = broadcast.get('picture4Url') or broadcast.get('picture4TransUrl') or broadcast.get('logo100x100')
  
          formats = [{
              'url': stream['streamUrl'],
diff --git a/youtube_dl/extractor/rtlnl.py b/youtube_dl/extractor/rtlnl.py

index a3ca79f2ccfd2e00c09a4f9b2a9503fa85669b65..cfce4550ada568cfe13fae859a2bb745671074b5 100644 (file)
--- a/youtube_dl/extractor/rtlnl.py
+++ b/youtube_dl/extractor/rtlnl.py
@@ -1,16 +1,25 @@
+# coding: utf-8
  from __future__ import unicode_literals
  
-import re
-
  from .common import InfoExtractor
-from ..utils import parse_duration
+from ..utils import (
+    int_or_none,
+    parse_duration,
+)
  
  
-class RtlXlIE(InfoExtractor):
-    IE_NAME = 'rtlxl.nl'
-    _VALID_URL = r'https?://(www\.)?rtlxl\.nl/#!/[^/]+/(?P<uuid>[^/?]+)'
+class RtlNlIE(InfoExtractor):
+    IE_NAME = 'rtl.nl'
+    IE_DESC = 'rtl.nl and rtlxl.nl'
+    _VALID_URL = r'''(?x)
+        https?://(www\.)?
+        (?:
+            rtlxl\.nl/\#!/[^/]+/|
+            rtl\.nl/system/videoplayer/[^?#]+?/video_embed\.html\#uuid=
+        )
+        (?P<id>[0-9a-f-]+)'''
  
-    _TEST = {
+    _TESTS = [{
          'url': 'http://www.rtlxl.nl/#!/rtl-nieuws-132237/6e4203a6-0a5e-3596-8424-c599a59e0677',
          'md5': 'cc16baa36a6c169391f0764fa6b16654',
          'info_dict': {
@@ -22,21 +31,30 @@ class RtlXlIE(InfoExtractor):
              'upload_date': '20140814',
              'duration': 576.880,
          },
-    }
+    }, {
+        'url': 'http://www.rtl.nl/system/videoplayer/derden/rtlnieuws/video_embed.html#uuid=84ae5571-ac25-4225-ae0c-ef8d9efb2aed/autoplay=false',
+        'md5': 'dea7474214af1271d91ef332fb8be7ea',
+        'info_dict': {
+            'id': '84ae5571-ac25-4225-ae0c-ef8d9efb2aed',
+            'ext': 'mp4',
+            'timestamp': 1424039400,
+            'title': 'RTL Nieuws - Nieuwe beelden Kopenhagen: chaos direct na aanslag',
+            'thumbnail': 're:^https?://screenshots\.rtl\.nl/system/thumb/sz=[0-9]+x[0-9]+/uuid=84ae5571-ac25-4225-ae0c-ef8d9efb2aed$',
+            'upload_date': '20150215',
+            'description': 'Er zijn nieuwe beelden vrijgegeven die vlak na de aanslag in Kopenhagen zijn gemaakt. Op de video is goed te zien hoe omstanders zich bekommeren om één van de slachtoffers, terwijl de eerste agenten ter plaatse komen.',
+        }
+    }]
  
      def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        uuid = mobj.group('uuid')
-
+        uuid = self._match_id(url)
          info = self._download_json(
              'http://www.rtl.nl/system/s4m/vfd/version=2/uuid=%s/fmt=flash/' % uuid,
              uuid)
  
          material = info['material'][0]
-        episode_info = info['episodes'][0]
-
          progname = info['abstracts'][0]['name']
          subtitle = material['title'] or info['episodes'][0]['name']
+        description = material.get('synopsis') or info['episodes'][0]['synopsis']
  
          # Use unencrypted m3u8 streams (See https://github.com/rg3/youtube-dl/issues/4118)
          videopath = material['videopath'].replace('.f4m', '.m3u8')
@@ -58,14 +76,29 @@ def _real_extract(self, url):
                  'quality': 0,
              }
          ])
-
          self._sort_formats(formats)
  
+        thumbnails = []
+        meta = info.get('meta', {})
+        for p in ('poster_base_url', 'thumb_base_url'):
+            if not meta.get(p):
+                continue
+
+            thumbnails.append({
+                'url': self._proto_relative_url(meta[p] + uuid),
+                'width': int_or_none(self._search_regex(
+                    r'/sz=([0-9]+)', meta[p], 'thumbnail width', fatal=False)),
+                'height': int_or_none(self._search_regex(
+                    r'/sz=[0-9]+x([0-9]+)',
+                    meta[p], 'thumbnail height', fatal=False))
+            })
+
          return {
              'id': uuid,
              'title': '%s - %s' % (progname, subtitle),
              'formats': formats,
              'timestamp': material['original_date'],
-            'description': episode_info['synopsis'],
+            'description': description,
              'duration': parse_duration(material.get('duration')),
+            'thumbnails': thumbnails,
          }
diff --git a/youtube_dl/extractor/rtve.py b/youtube_dl/extractor/rtve.py

index 3469d9578f5317222a404ce8f5918dd133d6f381..e60f85b5b4842d90b49aeec9aa87da8def92d4f9 100644 (file)
--- a/youtube_dl/extractor/rtve.py
+++ b/youtube_dl/extractor/rtve.py
@@ -6,6 +6,7 @@
  import time
  
  from .common import InfoExtractor
+from ..compat import compat_urlparse
  from ..utils import (
      struct_unpack,
      remove_end,
@@ -96,12 +97,10 @@ def _real_extract(self, url):
              ).replace('.net.rtve', '.multimedia.cdn.rtve')
              video_path = self._download_webpage(
                  auth_url, video_id, 'Getting video url')
-            # Use mvod.akcdn instead of flash.akamaihd.multimedia.cdn to get
+            # Use mvod1.akcdn instead of flash.akamaihd.multimedia.cdn to get
              # the right Content-Length header and the mp4 format
-            video_url = (
-                'http://mvod.akcdn.rtve.es/{0}&v=2.6.8'
-                '&fp=MAC%2016,0,0,296&r=MRUGG&g=OEOJWFXNFGCP'.format(video_path)
-            )
+            video_url = compat_urlparse.urljoin(
+                'http://mvod1.akcdn.rtve.es/', video_path)
  
          return {
              'id': video_id,
diff --git a/youtube_dl/extractor/sandia.py b/youtube_dl/extractor/sandia.py

new file mode 100644 (file)

index 0000000..9c88167
--- /dev/null
+++ b/youtube_dl/extractor/sandia.py
@@ -0,0 +1,117 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import itertools
+import json
+import re
+
+from .common import InfoExtractor
+from ..compat import (
+    compat_urllib_request,
+    compat_urlparse,
+)
+from ..utils import (
+    int_or_none,
+    js_to_json,
+    mimetype2ext,
+    unified_strdate,
+)
+
+
+class SandiaIE(InfoExtractor):
+    IE_DESC = 'Sandia National Laboratories'
+    _VALID_URL = r'https?://digitalops\.sandia\.gov/Mediasite/Play/(?P<id>[0-9a-f]+)'
+    _TEST = {
+        'url': 'http://digitalops.sandia.gov/Mediasite/Play/24aace4429fc450fb5b38cdbf424a66e1d',
+        'md5': '9422edc9b9a60151727e4b6d8bef393d',
+        'info_dict': {
+            'id': '24aace4429fc450fb5b38cdbf424a66e1d',
+            'ext': 'mp4',
+            'title': 'Xyce Software Training - Section 1',
+            'description': 're:(?s)SAND Number: SAND 2013-7800.{200,}',
+            'upload_date': '20120904',
+            'duration': 7794,
+        }
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        req = compat_urllib_request.Request(url)
+        req.add_header('Cookie', 'MediasitePlayerCaps=ClientPlugins=4')
+        webpage = self._download_webpage(req, video_id)
+
+        js_path = self._search_regex(
+            r'<script type="text/javascript" src="(/Mediasite/FileServer/Presentation/[^"]+)"',
+            webpage, 'JS code URL')
+        js_url = compat_urlparse.urljoin(url, js_path)
+
+        js_code = self._download_webpage(
+            js_url, video_id, note='Downloading player')
+
+        def extract_str(key, **args):
+            return self._search_regex(
+                r'Mediasite\.PlaybackManifest\.%s\s*=\s*(.+);\s*?\n' % re.escape(key),
+                js_code, key, **args)
+
+        def extract_data(key, **args):
+            data_json = extract_str(key, **args)
+            if data_json is None:
+                return data_json
+            return self._parse_json(
+                data_json, video_id, transform_source=js_to_json)
+
+        formats = []
+        for i in itertools.count():
+            fd = extract_data('VideoUrls[%d]' % i, default=None)
+            if fd is None:
+                break
+            formats.append({
+                'format_id': '%s' % i,
+                'format_note': fd['MimeType'].partition('/')[2],
+                'ext': mimetype2ext(fd['MimeType']),
+                'url': fd['Location'],
+                'protocol': 'f4m' if fd['MimeType'] == 'video/x-mp4-fragmented' else None,
+            })
+        self._sort_formats(formats)
+
+        slide_baseurl = compat_urlparse.urljoin(
+            url, extract_data('SlideBaseUrl'))
+        slide_template = slide_baseurl + re.sub(
+            r'\{0:D?([0-9]+)\}', r'%0\1d', extract_data('SlideImageFileNameTemplate'))
+        slides = []
+        last_slide_time = 0
+        for i in itertools.count(1):
+            sd = extract_str('Slides[%d]' % i, default=None)
+            if sd is None:
+                break
+            timestamp = int_or_none(self._search_regex(
+                r'^Mediasite\.PlaybackManifest\.CreateSlide\("[^"]*"\s*,\s*([0-9]+),',
+                sd, 'slide %s timestamp' % i, fatal=False))
+            slides.append({
+                'url': slide_template % i,
+                'duration': timestamp - last_slide_time,
+            })
+            last_slide_time = timestamp
+        formats.append({
+            'format_id': 'slides',
+            'protocol': 'slideshow',
+            'url': json.dumps(slides),
+            'preference': -10000,  # Downloader not yet written
+        })
+        self._sort_formats(formats)
+
+        title = extract_data('Title')
+        description = extract_data('Description', fatal=False)
+        duration = int_or_none(extract_data(
+            'Duration', fatal=False), scale=1000)
+        upload_date = unified_strdate(extract_data('AirDate', fatal=False))
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': description,
+            'formats': formats,
+            'upload_date': upload_date,
+            'duration': duration,
+        }
diff --git a/youtube_dl/extractor/sockshare.py b/youtube_dl/extractor/sockshare.py

index 7d3c0e93783afeac3d8e939e0cf317177df4ca9f..b5fa6f1da203c993873622a9ee80c923300eebb2 100644 (file)
--- a/youtube_dl/extractor/sockshare.py
+++ b/youtube_dl/extractor/sockshare.py
@@ -25,7 +25,6 @@ class SockshareIE(InfoExtractor):
              'id': '437BE28B89D799D7',
              'title': 'big_buck_bunny_720p_surround.avi',
              'ext': 'avi',
-            'thumbnail': 're:^http://.*\.jpg$',
          }
      }
  
@@ -45,7 +44,7 @@ def _real_extract(self, url):
              ''', webpage, 'hash')
  
          fields = {
-            "hash": confirm_hash,
+            "hash": confirm_hash.encode('utf-8'),
              "confirm": "Continue as Free User"
          }
  
@@ -68,7 +67,7 @@ def _real_extract(self, url):
              webpage, 'title', default=None)
          thumbnail = self._html_search_regex(
              r'<img\s+src="([^"]*)".+?name="bg"',
-            webpage, 'thumbnail')
+            webpage, 'thumbnail', default=None)
  
          formats = [{
              'format_id': 'sd',
diff --git a/youtube_dl/extractor/soundgasm.py b/youtube_dl/extractor/soundgasm.py

index a4f8ce6c3c8cce1854e5695783908a7804af0cac..3a4ddf57ea369a0b250a4d786738e0ea4db9e1dd 100644 (file)
--- a/youtube_dl/extractor/soundgasm.py
+++ b/youtube_dl/extractor/soundgasm.py
@@ -7,6 +7,7 @@
  
  
  class SoundgasmIE(InfoExtractor):
+    IE_NAME = 'soundgasm'
      _VALID_URL = r'https?://(?:www\.)?soundgasm\.net/u/(?P<user>[0-9a-zA-Z_\-]+)/(?P<title>[0-9a-zA-Z_\-]+)'
      _TEST = {
          'url': 'http://soundgasm.net/u/ytdl/Piano-sample',
@@ -38,3 +39,26 @@ def _real_extract(self, url):
              'title': audio_title,
              'description': description
          }
+
+
+class SoundgasmProfileIE(InfoExtractor):
+    IE_NAME = 'soundgasm:profile'
+    _VALID_URL = r'https?://(?:www\.)?soundgasm\.net/u/(?P<id>[^/]+)/?(?:\#.*)?$'
+    _TEST = {
+        'url': 'http://soundgasm.net/u/ytdl',
+        'info_dict': {
+            'id': 'ytdl',
+        },
+        'playlist_count': 1,
+    }
+
+    def _real_extract(self, url):
+        profile_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, profile_id)
+
+        entries = [
+            self.url_result(audio_url, 'Soundgasm')
+            for audio_url in re.findall(r'href="([^"]+/u/%s/[^"]+)' % profile_id, webpage)]
+
+        return self.playlist_result(entries, profile_id)
diff --git a/youtube_dl/extractor/teamcoco.py b/youtube_dl/extractor/teamcoco.py

index a73da1c9c0d92657bd90f302b03e9fa8404c2dcf..5793dbc1085a86fdf573a432805be129dc62de94 100644 (file)
--- a/youtube_dl/extractor/teamcoco.py
+++ b/youtube_dl/extractor/teamcoco.py
@@ -1,8 +1,10 @@
  from __future__ import unicode_literals
  
+import base64
  import re
  
  from .common import InfoExtractor
+from ..utils import qualities
  
  
  class TeamcocoIE(InfoExtractor):
@@ -24,8 +26,8 @@ class TeamcocoIE(InfoExtractor):
              'info_dict': {
                  'id': '19705',
                  'ext': 'mp4',
-                "description": "Louis C.K. got starstruck by George W. Bush, so what? Part one.",
-                "title": "Louis C.K. Interview Pt. 1 11/3/11",
+                'description': 'Louis C.K. got starstruck by George W. Bush, so what? Part one.',
+                'title': 'Louis C.K. Interview Pt. 1 11/3/11',
                  'age_limit': 0,
              }
          }
@@ -42,42 +44,39 @@ def _real_extract(self, url):
          display_id = mobj.group('display_id')
          webpage = self._download_webpage(url, display_id)
  
-        video_id = mobj.group("video_id")
+        video_id = mobj.group('video_id')
          if not video_id:
              video_id = self._html_search_regex(
                  self._VIDEO_ID_REGEXES, webpage, 'video id')
  
-        data_url = 'http://teamcoco.com/cvp/2.0/%s.xml' % video_id
-        data = self._download_xml(
-            data_url, display_id, 'Downloading data webpage')
+        embed_url = 'http://teamcoco.com/embed/v/%s' % video_id
+        embed = self._download_webpage(
+            embed_url, video_id, 'Downloading embed page')
+
+        encoded_data = self._search_regex(
+            r'"preload"\s*:\s*"([^"]+)"', embed, 'encoded data')
+        data = self._parse_json(
+            base64.b64decode(encoded_data.encode('ascii')).decode('utf-8'), video_id)
  
-        qualities = ['500k', '480p', '1000k', '720p', '1080p']
          formats = []
-        for filed in data.findall('files/file'):
-            if filed.attrib.get('playmode') == 'all':
-                # it just duplicates one of the entries
-                break
-            file_url = filed.text
-            m_format = re.search(r'(\d+(k|p))\.mp4', file_url)
+        get_quality = qualities(['500k', '480p', '1000k', '720p', '1080p'])
+        for filed in data['files']:
+            m_format = re.search(r'(\d+(k|p))\.mp4', filed['url'])
              if m_format is not None:
                  format_id = m_format.group(1)
              else:
-                format_id = filed.attrib['bitrate']
+                format_id = filed['bitrate']
              tbr = (
-                int(filed.attrib['bitrate'])
-                if filed.attrib['bitrate'].isdigit()
+                int(filed['bitrate'])
+                if filed['bitrate'].isdigit()
                  else None)
  
-            try:
-                quality = qualities.index(format_id)
-            except ValueError:
-                quality = -1
              formats.append({
-                'url': file_url,
+                'url': filed['url'],
                  'ext': 'mp4',
                  'tbr': tbr,
                  'format_id': format_id,
-                'quality': quality,
+                'quality': get_quality(format_id),
              })
  
          self._sort_formats(formats)
@@ -86,8 +85,8 @@ def _real_extract(self, url):
              'id': video_id,
              'display_id': display_id,
              'formats': formats,
-            'title': self._og_search_title(webpage),
-            'thumbnail': self._og_search_thumbnail(webpage),
-            'description': self._og_search_description(webpage),
+            'title': data['title'],
+            'thumbnail': data.get('thumb', {}).get('href'),
+            'description': data.get('teaser'),
              'age_limit': self._family_friendly_search(webpage),
          }
diff --git a/youtube_dl/extractor/ted.py b/youtube_dl/extractor/ted.py

index 0c38c8f899b218b9d663c1fe908e6431aceb0631..4cec06f8bd6e2a18ac3062e916225746f5153c93 100644 (file)
--- a/youtube_dl/extractor/ted.py
+++ b/youtube_dl/extractor/ted.py
@@ -83,6 +83,22 @@ class TEDIE(InfoExtractor):
          'params': {
              'skip_download': True,
          },
+    }, {
+        # YouTube video
+        'url': 'http://www.ted.com/talks/jeffrey_kluger_the_sibling_bond',
+        'add_ie': ['Youtube'],
+        'info_dict': {
+            'id': 'aFBIPO-P7LM',
+            'ext': 'mp4',
+            'title': 'The hidden power of siblings: Jeff Kluger at TEDxAsheville',
+            'description': 'md5:3d7a4f50d95ca5dd67104e2a20f43fe1',
+            'uploader': 'TEDx Talks',
+            'uploader_id': 'TEDxTalks',
+            'upload_date': '20111216',
+        },
+        'params': {
+            'skip_download': True,
+        },
      }]
  
      _NATIVE_FORMATS = {
@@ -132,11 +148,16 @@ def _talk_info(self, url, video_name):
  
          talk_info = self._extract_info(webpage)['talks'][0]
  
-        if talk_info.get('external') is not None:
-            self.to_screen('Found video from %s' % talk_info['external']['service'])
+        external = talk_info.get('external')
+        if external:
+            service = external['service']
+            self.to_screen('Found video from %s' % service)
+            ext_url = None
+            if service.lower() == 'youtube':
+                ext_url = external.get('code')
              return {
                  '_type': 'url',
-                'url': talk_info['external']['uri'],
+                'url': ext_url or external['uri'],
              }
  
          formats = [{
diff --git a/youtube_dl/extractor/theonion.py b/youtube_dl/extractor/theonion.py

index b65d8e03f7741a712001099c601ee354830a74a1..10239c906201e460ed288386709dffc5b7f6efbc 100644 (file)
--- a/youtube_dl/extractor/theonion.py
+++ b/youtube_dl/extractor/theonion.py
@@ -4,11 +4,10 @@
  import re
  
  from .common import InfoExtractor
-from ..utils import ExtractorError
  
  
  class TheOnionIE(InfoExtractor):
-    _VALID_URL = r'(?x)https?://(?:www\.)?theonion\.com/video/[^,]+,(?P<article_id>[0-9]+)/?'
+    _VALID_URL = r'https?://(?:www\.)?theonion\.com/video/[^,]+,(?P<id>[0-9]+)/?'
      _TEST = {
          'url': 'http://www.theonion.com/video/man-wearing-mm-jacket-gods-image,36918/',
          'md5': '19eaa9a39cf9b9804d982e654dc791ee',
@@ -22,10 +21,8 @@ class TheOnionIE(InfoExtractor):
      }
  
      def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        article_id = mobj.group('article_id')
-
-        webpage = self._download_webpage(url, article_id)
+        display_id = self._match_id(url)
+        webpage = self._download_webpage(url, display_id)
  
          video_id = self._search_regex(
              r'"videoId":\s(\d+),', webpage, 'video ID')
@@ -34,10 +31,6 @@ def _real_extract(self, url):
          thumbnail = self._og_search_thumbnail(webpage)
  
          sources = re.findall(r'<source src="([^"]+)" type="([^"]+)"', webpage)
-        if not sources:
-            raise ExtractorError(
-                'No sources found for video %s' % video_id, expected=True)
-
          formats = []
          for src, type_ in sources:
              if type_ == 'video/mp4':
@@ -54,15 +47,15 @@ def _real_extract(self, url):
                  })
              elif type_ == 'application/x-mpegURL':
                  formats.extend(
-                    self._extract_m3u8_formats(src, video_id, preference=-1))
+                    self._extract_m3u8_formats(src, display_id, preference=-1))
              else:
                  self.report_warning(
                      'Encountered unexpected format: %s' % type_)
-
          self._sort_formats(formats)
  
          return {
              'id': video_id,
+            'display_id': display_id,
              'title': title,
              'formats': formats,
              'thumbnail': thumbnail,
diff --git a/youtube_dl/extractor/theplatform.py b/youtube_dl/extractor/theplatform.py

index 5f24189cca8e11bbdc67665e672b0ac929ebed34..feac666f78baff49f4fb312a147acad67d320bc2 100644 (file)
--- a/youtube_dl/extractor/theplatform.py
+++ b/youtube_dl/extractor/theplatform.py
@@ -71,7 +71,9 @@ def _real_extract(self, url):
          if not provider_id:
              provider_id = 'dJ5BDC'
  
-        if mobj.group('config'):
+        if smuggled_data.get('force_smil_url', False):
+            smil_url = url
+        elif mobj.group('config'):
              config_url = url + '&form=json'
              config_url = config_url.replace('swf/', 'config/')
              config_url = config_url.replace('onsite/', 'onsite/config/')
diff --git a/youtube_dl/extractor/tv4.py b/youtube_dl/extractor/tv4.py

new file mode 100644 (file)

index 0000000..1c4b6d6
--- /dev/null
+++ b/youtube_dl/extractor/tv4.py
@@ -0,0 +1,100 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    ExtractorError,
+    parse_iso8601,
+)
+
+
+class TV4IE(InfoExtractor):
+    IE_DESC = 'tv4.se and tv4play.se'
+    _VALID_URL = r'''(?x)https?://(?:www\.)?
+        (?:
+            tv4\.se/(?:[^/]+)/klipp/(?:.*)-|
+            tv4play\.se/
+            (?:
+                (?:program|barn)/(?:[^\?]+)\?video_id=|
+                iframe/video/|
+                film/|
+                sport/|
+            )
+        )(?P<id>[0-9]+)'''
+    _TESTS = [
+        {
+            'url': 'http://www.tv4.se/kalla-fakta/klipp/kalla-fakta-5-english-subtitles-2491650',
+            'md5': '909d6454b87b10a25aa04c4bdd416a9b',
+            'info_dict': {
+                'id': '2491650',
+                'ext': 'mp4',
+                'title': 'Kalla Fakta 5 (english subtitles)',
+                'thumbnail': 're:^https?://.*\.jpg$',
+                'timestamp': int,
+                'upload_date': '20131125',
+            },
+        },
+        {
+            'url': 'http://www.tv4play.se/iframe/video/3054113',
+            'md5': '77f851c55139ffe0ebd41b6a5552489b',
+            'info_dict': {
+                'id': '3054113',
+                'ext': 'mp4',
+                'title': 'Så här jobbar ficktjuvarna - se avslöjande bilder',
+                'thumbnail': 're:^https?://.*\.jpg$',
+                'description': 'Unika bilder avslöjar hur turisternas fickor vittjas mitt på Stockholms central. Två experter på ficktjuvarna avslöjar knepen du ska se upp för.',
+                'timestamp': int,
+                'upload_date': '20150130',
+            },
+        },
+        {
+            'url': 'http://www.tv4play.se/sport/3060959',
+            'only_matching': True,
+        },
+        {
+            'url': 'http://www.tv4play.se/film/2378136',
+            'only_matching': True,
+        },
+        {
+            'url': 'http://www.tv4play.se/barn/looney-tunes?video_id=3062412',
+            'only_matching': True,
+        },
+    ]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        info = self._download_json(
+            'http://www.tv4play.se/player/assets/%s.json' % video_id, video_id, 'Downloading video info JSON')
+
+        # If is_geo_restricted is true, it doesn't necessarily mean we can't download it
+        if info['is_geo_restricted']:
+            self.report_warning('This content might not be available in your country due to licensing restrictions.')
+        if info['requires_subscription']:
+            raise ExtractorError('This content requires subscription.', expected=True)
+
+        sources_data = self._download_json(
+            'https://prima.tv4play.se/api/web/asset/%s/play.json?protocol=http&videoFormat=MP4' % video_id, video_id, 'Downloading sources JSON')
+        sources = sources_data['playback']
+
+        formats = []
+        for item in sources.get('items', {}).get('item', []):
+            ext, bitrate = item['mediaFormat'], item['bitrate']
+            formats.append({
+                'format_id': '%s_%s' % (ext, bitrate),
+                'tbr': bitrate,
+                'ext': ext,
+                'url': item['url'],
+            })
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'title': info['title'],
+            'formats': formats,
+            'description': info.get('description'),
+            'timestamp': parse_iso8601(info.get('broadcast_date_time')),
+            'duration': info.get('duration'),
+            'thumbnail': info.get('image'),
+            'is_live': sources.get('live'),
+        }
diff --git a/youtube_dl/extractor/twitch.py b/youtube_dl/extractor/twitch.py

index 87290d002e44850e6b3584a97ff2a3e1be7c1a0f..4b0d8988d9cc0f866120d7f31a6facc25d47afec 100644 (file)
--- a/youtube_dl/extractor/twitch.py
+++ b/youtube_dl/extractor/twitch.py
@@ -349,6 +349,13 @@ def _real_extract(self, url):
              % (self._USHER_BASE, channel_id, compat_urllib_parse.urlencode(query).encode('utf-8')),
              channel_id, 'mp4')
  
+        # prefer the 'source' stream, the others are limited to 30 fps
+        def _sort_source(f):
+            if f.get('m3u8_media') is not None and f['m3u8_media'].get('NAME') == 'Source':
+                return 1
+            return 0
+        formats = sorted(formats, key=_sort_source)
+
          view_count = stream.get('viewers')
          timestamp = parse_iso8601(stream.get('created_at'))
  
diff --git a/youtube_dl/extractor/videolecturesnet.py b/youtube_dl/extractor/videolecturesnet.py

index ebd2a3dca3ac0e7bd812226c80c356a19b3677ab..d6a7eb2033e58a92df09be6c042c91d6e932f8b7 100644 (file)
--- a/youtube_dl/extractor/videolecturesnet.py
+++ b/youtube_dl/extractor/videolecturesnet.py
@@ -49,15 +49,31 @@ def _real_extract(self, url):
          thumbnail = (
              None if thumbnail_el is None else thumbnail_el.attrib.get('src'))
  
-        formats = [{
-            'url': v.attrib['src'],
-            'width': int_or_none(v.attrib.get('width')),
-            'height': int_or_none(v.attrib.get('height')),
-            'filesize': int_or_none(v.attrib.get('size')),
-            'tbr': int_or_none(v.attrib.get('systemBitrate')) / 1000.0,
-            'ext': v.attrib.get('ext'),
-        } for v in switch.findall('./video')
-            if v.attrib.get('proto') == 'http']
+        formats = []
+        for v in switch.findall('./video'):
+            proto = v.attrib.get('proto')
+            if proto not in ['http', 'rtmp']:
+                continue
+            f = {
+                'width': int_or_none(v.attrib.get('width')),
+                'height': int_or_none(v.attrib.get('height')),
+                'filesize': int_or_none(v.attrib.get('size')),
+                'tbr': int_or_none(v.attrib.get('systemBitrate')) / 1000.0,
+                'ext': v.attrib.get('ext'),
+            }
+            src = v.attrib['src']
+            if proto == 'http':
+                if self._is_valid_url(src, video_id):
+                    f['url'] = src
+                    formats.append(f)
+            elif proto == 'rtmp':
+                f.update({
+                    'url': v.attrib['streamer'],
+                    'play_path': src,
+                    'rtmp_real_time': True,
+                })
+                formats.append(f)
+        self._sort_formats(formats)
  
          return {
              'id': video_id,
diff --git a/youtube_dl/extractor/vimeo.py b/youtube_dl/extractor/vimeo.py

index 5930d598415a7bb4cd944d58a0831aa1e2bfd27d..8f540f5780570d06fa10e695555026c537b7c0f0 100644 (file)
--- a/youtube_dl/extractor/vimeo.py
+++ b/youtube_dl/extractor/vimeo.py
@@ -4,6 +4,7 @@
  import json
  import re
  import itertools
+import hashlib
  
  from .common import InfoExtractor
  from ..compat import (
@@ -17,6 +18,7 @@
      InAdvancePagedList,
      int_or_none,
      RegexNotFoundError,
+    smuggle_url,
      std_headers,
      unsmuggle_url,
      urlencode_postdata,
@@ -173,7 +175,7 @@ class VimeoIE(VimeoBaseInfoExtractor):
      def _verify_video_password(self, url, video_id, webpage):
          password = self._downloader.params.get('videopassword', None)
          if password is None:
-            raise ExtractorError('This video is protected by a password, use the --video-password option')
+            raise ExtractorError('This video is protected by a password, use the --video-password option', expected=True)
          token = self._search_regex(r'xsrft: \'(.*?)\'', webpage, 'login token')
          data = compat_urllib_parse.urlencode({
              'password': password,
@@ -223,6 +225,11 @@ def _real_extract(self, url):
          if mobj.group('pro') or mobj.group('player'):
              url = 'http://player.vimeo.com/video/' + video_id
  
+        password = self._downloader.params.get('videopassword', None)
+        if password:
+            headers['Cookie'] = '%s_password=%s' % (
+                video_id, hashlib.md5(password.encode('utf-8')).hexdigest())
+
          # Retrieve video webpage to extract further information
          request = compat_urllib_request.Request(url, None, headers)
          try:
@@ -266,8 +273,11 @@ def _real_extract(self, url):
                  raise ExtractorError('The author has restricted the access to this video, try with the "--referer" option')
  
              if re.search(r'<form[^>]+?id="pw_form"', webpage) is not None:
+                if data and '_video_password_verified' in data:
+                    raise ExtractorError('video password verification failed!')
                  self._verify_video_password(url, video_id, webpage)
-                return self._real_extract(url)
+                return self._real_extract(
+                    smuggle_url(url, {'_video_password_verified': 'verified'}))
              else:
                  raise ExtractorError('Unable to extract info section',
                                       cause=e)
@@ -398,6 +408,7 @@ class VimeoChannelIE(InfoExtractor):
      _TESTS = [{
          'url': 'http://vimeo.com/channels/tributes',
          'info_dict': {
+            'id': 'tributes',
              'title': 'Vimeo Tributes',
          },
          'playlist_mincount': 25,
@@ -476,6 +487,7 @@ class VimeoUserIE(VimeoChannelIE):
          'url': 'http://vimeo.com/nkistudio/videos',
          'info_dict': {
              'title': 'Nki',
+            'id': 'nkistudio',
          },
          'playlist_mincount': 66,
      }]
@@ -493,6 +505,7 @@ class VimeoAlbumIE(VimeoChannelIE):
      _TESTS = [{
          'url': 'http://vimeo.com/album/2632481',
          'info_dict': {
+            'id': '2632481',
              'title': 'Staff Favorites: November 2013',
          },
          'playlist_mincount': 13,
@@ -523,6 +536,7 @@ class VimeoGroupsIE(VimeoAlbumIE):
      _TESTS = [{
          'url': 'http://vimeo.com/groups/rolexawards',
          'info_dict': {
+            'id': 'rolexawards',
              'title': 'Rolex Awards for Enterprise',
          },
          'playlist_mincount': 73,
@@ -605,6 +619,7 @@ class VimeoLikesIE(InfoExtractor):
          'url': 'https://vimeo.com/user755559/likes/',
          'playlist_mincount': 293,
          "info_dict": {
+            'id': 'user755559_likes',
              "description": "See all the videos urza likes",
              "title": 'Videos urza likes',
          },
diff --git a/youtube_dl/extractor/vk.py b/youtube_dl/extractor/vk.py

index 81e02a6244d83327b05c6b76c490391b97b15f92..7dea8c59d2a30673b93ae181bab128b8ef0a8b58 100644 (file)
--- a/youtube_dl/extractor/vk.py
+++ b/youtube_dl/extractor/vk.py
@@ -217,6 +217,9 @@ class VKUserVideosIE(InfoExtractor):
      _TEMPLATE_URL = 'https://vk.com/videos'
      _TEST = {
          'url': 'http://vk.com/videos205387401',
+        'info_dict': {
+            'id': '205387401',
+        },
          'playlist_mincount': 4,
      }
  
diff --git a/youtube_dl/extractor/webofstories.py b/youtube_dl/extractor/webofstories.py

index 396cf4e8312ca73f90f45b3e24f3fb3561f54fa8..73077a312549f6b883fdf549a2b364f6de35db9f 100644 (file)
--- a/youtube_dl/extractor/webofstories.py
+++ b/youtube_dl/extractor/webofstories.py
@@ -45,19 +45,17 @@ def _real_extract(self, url):
          description = self._html_search_meta('description', webpage)
          thumbnail = self._og_search_thumbnail(webpage)
  
-        story_filename = self._search_regex(
-            r'\.storyFileName\("([^"]+)"\)', webpage, 'story filename')
-        speaker_id = self._search_regex(
-            r'\.speakerId\("([^"]+)"\)', webpage, 'speaker ID')
-        story_id = self._search_regex(
-            r'\.storyId\((\d+)\)', webpage, 'story ID')
-        speaker_type = self._search_regex(
-            r'\.speakerType\("([^"]+)"\)', webpage, 'speaker type')
-        great_life = self._search_regex(
-            r'isGreatLifeStory\s*=\s*(true|false)', webpage, 'great life story')
+        embed_params = [s.strip(" \r\n\t'") for s in self._search_regex(
+            r'(?s)\$\("#embedCode"\).html\(getEmbedCode\((.*?)\)',
+            webpage, 'embed params').split(',')]
+
+        (
+            _, speaker_id, story_id, story_duration,
+            speaker_type, great_life, _thumbnail, _has_subtitles,
+            story_filename, _story_order) = embed_params
+
          is_great_life_series = great_life == 'true'
-        duration = int_or_none(self._search_regex(
-            r'\.duration\((\d+)\)', webpage, 'duration', fatal=False))
+        duration = int_or_none(story_duration)
  
          # URL building, see: http://www.webofstories.com/scripts/player.js
          ms_prefix = ''
diff --git a/youtube_dl/extractor/wsj.py b/youtube_dl/extractor/wsj.py

index cbe3dc7bec5c982df8ec53431acdbb0fcd7e1d3a..2ddf29a694ec6365e9089bc18536320489b4d2c3 100644 (file)
--- a/youtube_dl/extractor/wsj.py
+++ b/youtube_dl/extractor/wsj.py
@@ -18,8 +18,8 @@ class WSJIE(InfoExtractor):
              'id': '1BD01A4C-BFE8-40A5-A42F-8A8AF9898B1A',
              'ext': 'mp4',
              'upload_date': '20150202',
-            'uploader_id': 'bbright',
-            'creator': 'bbright',
+            'uploader_id': 'jdesai',
+            'creator': 'jdesai',
              'categories': list,  # a long list
              'duration': 90,
              'title': 'Bills Coach Rex Ryan Updates His Old Jets Tattoo',
diff --git a/youtube_dl/extractor/xtube.py b/youtube_dl/extractor/xtube.py

index e8490b028e53080b8e685be13577a05603a4af9e..1644f53c876329f053406be3d3dc1aa463cddc1b 100644 (file)
--- a/youtube_dl/extractor/xtube.py
+++ b/youtube_dl/extractor/xtube.py
@@ -22,7 +22,7 @@ class XTubeIE(InfoExtractor):
              'id': 'kVTUy_G222_',
              'ext': 'mp4',
              'title': 'strange erotica',
-            'description': 'http://www.xtube.com an ET kind of thing',
+            'description': 'contains:an ET kind of thing',
              'uploader': 'greenshowers',
              'duration': 450,
              'age_limit': 18,
diff --git a/youtube_dl/extractor/yahoo.py b/youtube_dl/extractor/yahoo.py

index f8e7041a08d042ac44c13338439b5568bf4caac6..97dbac4cce53d7fe956b074fddbe40993fd5681f 100644 (file)
--- a/youtube_dl/extractor/yahoo.py
+++ b/youtube_dl/extractor/yahoo.py
@@ -24,7 +24,6 @@ class YahooIE(InfoExtractor):
      _TESTS = [
          {
              'url': 'http://screen.yahoo.com/julian-smith-travis-legg-watch-214727115.html',
-            'md5': '4962b075c08be8690a922ee026d05e69',
              'info_dict': {
                  'id': '2d25e626-2378-391f-ada0-ddaf1417e588',
                  'ext': 'mp4',
diff --git a/youtube_dl/extractor/yam.py b/youtube_dl/extractor/yam.py

new file mode 100644 (file)

index 0000000..b294767
--- /dev/null
+++ b/youtube_dl/extractor/yam.py
@@ -0,0 +1,81 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..compat import compat_urlparse
+from ..utils import (
+    float_or_none,
+    month_by_abbreviation,
+)
+
+
+class YamIE(InfoExtractor):
+    _VALID_URL = r'http://mymedia.yam.com/m/(?P<id>\d+)'
+
+    _TESTS = [{
+        # An audio hosted on Yam
+        'url': 'http://mymedia.yam.com/m/2283921',
+        'md5': 'c011b8e262a52d5473d9c2e3c9963b9c',
+        'info_dict': {
+            'id': '2283921',
+            'ext': 'mp3',
+            'title': '發現 - 趙薇 京華煙雲主題曲',
+            'uploader_id': 'princekt',
+            'upload_date': '20080807',
+            'duration': 313.0,
+        }
+    }, {
+        # An external video hosted on YouTube
+        'url': 'http://mymedia.yam.com/m/3598173',
+        'md5': '0238ceec479c654e8c2f1223755bf3e9',
+        'info_dict': {
+            'id': 'pJ2Deys283c',
+            'ext': 'mp4',
+            'upload_date': '20150202',
+            'uploader': '新莊社大瑜伽社',
+            'description': 'md5:f5cc72f0baf259a70fb731654b0d2eff',
+            'uploader_id': '2323agoy',
+            'title': '外婆的澎湖灣KTV-潘安邦',
+        }
+    }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        page = self._download_webpage(url, video_id)
+
+        # Is it hosted externally on YouTube?
+        youtube_url = self._html_search_regex(
+            r'<embed src="(http://www.youtube.com/[^"]+)"',
+            page, 'YouTube url', default=None)
+        if youtube_url:
+            return self.url_result(youtube_url, 'Youtube')
+
+        api_page = self._download_webpage(
+            'http://mymedia.yam.com/api/a/?pID=' + video_id, video_id,
+            note='Downloading API page')
+        api_result_obj = compat_urlparse.parse_qs(api_page)
+
+        uploader_id = self._html_search_regex(
+            r'<!-- 發表作者 -->：[\n ]+<a href="/([a-z]+)"',
+            page, 'uploader id', fatal=False)
+        mobj = re.search(r'<!-- 發表於 -->(?P<mon>[A-Z][a-z]{2})  ' +
+                         r'(?P<day>\d{1,2}), (?P<year>\d{4})', page)
+        if mobj:
+            upload_date = '%s%02d%02d' % (
+                mobj.group('year'),
+                month_by_abbreviation(mobj.group('mon')),
+                int(mobj.group('day')))
+        else:
+            upload_date = None
+        duration = float_or_none(api_result_obj['totaltime'][0], scale=1000)
+
+        return {
+            'id': video_id,
+            'url': api_result_obj['mp3file'][0],
+            'title': self._html_search_meta('description', page),
+            'duration': duration,
+            'uploader_id': uploader_id,
+            'upload_date': upload_date,
+        }
diff --git a/youtube_dl/extractor/youtube.py b/youtube_dl/extractor/youtube.py

index 1b2dbf2765b64ddd7d4f1cfab2698a4b03f6571f..22db896b16066bff193bc1ef7eddab214b9440a9 100644 (file)
--- a/youtube_dl/extractor/youtube.py
+++ b/youtube_dl/extractor/youtube.py
@@ -540,26 +540,30 @@ def _extract_signature_function(self, video_id, player_url, example_sig):
          if cache_spec is not None:
              return lambda s: ''.join(s[i] for i in cache_spec)
  
+        download_note = (
+            'Downloading player %s' % player_url
+            if self._downloader.params.get('verbose') else
+            'Downloading %s player %s' % (player_type, player_id)
+        )
          if player_type == 'js':
              code = self._download_webpage(
                  player_url, video_id,
-                note='Downloading %s player %s' % (player_type, player_id),
+                note=download_note,
                  errnote='Download of %s failed' % player_url)
              res = self._parse_sig_js(code)
          elif player_type == 'swf':
              urlh = self._request_webpage(
                  player_url, video_id,
-                note='Downloading %s player %s' % (player_type, player_id),
+                note=download_note,
                  errnote='Download of %s failed' % player_url)
              code = urlh.read()
              res = self._parse_sig_swf(code)
          else:
              assert False, 'Invalid player type %r' % player_type
  
-        if cache_spec is None:
-            test_string = ''.join(map(compat_chr, range(len(example_sig))))
-            cache_res = res(test_string)
-            cache_spec = [ord(c) for c in cache_res]
+        test_string = ''.join(map(compat_chr, range(len(example_sig))))
+        cache_res = res(test_string)
+        cache_spec = [ord(c) for c in cache_res]
  
          self._downloader.cache.store('youtube-sigfuncs', func_id, cache_spec)
          return res
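Note on the hunk above: the signature cache stores a list of indices rather than the player code. A minimal sketch of that idea follows, assuming the signature function only reorders or drops characters. The helper names `build_cache_spec` and `replay_cache_spec` are hypothetical, and the built-in `chr` stands in for `compat_chr` on Python 3.

```python
def build_cache_spec(sig_func, sig_len):
    """Probe sig_func with a string whose i-th character has code point i,
    then read the surviving code points back as source indices."""
    test_string = ''.join(map(chr, range(sig_len)))
    return [ord(c) for c in sig_func(test_string)]


def replay_cache_spec(cache_spec):
    """Rebuild an equivalent function from the stored index list."""
    return lambda s: ''.join(s[i] for i in cache_spec)


def example_sig(s):
    # Hypothetical signature function: reverse, then drop two characters.
    return s[::-1][2:]


spec = build_cache_spec(example_sig, 6)
cached = replay_cache_spec(spec)
```

The replayed function only agrees with the original for inputs of the probed length, which is consistent with the cache being keyed by `func_id` in the surrounding code.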
diff --git a/youtube_dl/extractor/zapiks.py b/youtube_dl/extractor/zapiks.py

new file mode 100644 (file)

index 0000000..22a9a57
--- /dev/null
+++ b/youtube_dl/extractor/zapiks.py
@@ -0,0 +1,110 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..utils import (
+    parse_duration,
+    parse_iso8601,
+    xpath_with_ns,
+    xpath_text,
+    int_or_none,
+)
+
+
+class ZapiksIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?zapiks\.(?:fr|com)/(?:(?:[a-z]{2}/)?(?P<display_id>.+?)\.html|index\.php\?.*\bmedia_id=(?P<id>\d+))'
+    _TESTS = [
+        {
+            'url': 'http://www.zapiks.fr/ep2s3-bon-appetit-eh-be-viva.html',
+            'md5': 'aeb3c473b2d564b2d46d664d28d5f050',
+            'info_dict': {
+                'id': '80798',
+                'ext': 'mp4',
+                'title': 'EP2S3 - Bon Appétit - Eh bé viva les pyrénées con!',
+                'description': 'md5:7054d6f6f620c6519be1fe710d4da847',
+                'thumbnail': 're:^https?://.*\.jpg$',
+                'duration': 528,
+                'timestamp': 1359044972,
+                'upload_date': '20130124',
+                'view_count': int,
+                'comment_count': int,
+            },
+        },
+        {
+            'url': 'http://www.zapiks.com/ep3s5-bon-appetit-baqueira-m-1.html',
+            'only_matching': True,
+        },
+        {
+            'url': 'http://www.zapiks.com/nl/ep3s5-bon-appetit-baqueira-m-1.html',
+            'only_matching': True,
+        },
+        {
+            'url': 'http://www.zapiks.fr/index.php?action=playerIframe&amp;media_id=118046&amp;width=640&amp;height=360&amp;autoStart=false&amp;language=fr',
+            'only_matching': True,
+        },
+    ]
+
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        video_id = mobj.group('id')
+        display_id = mobj.group('display_id') or video_id
+
+        webpage = self._download_webpage(url, display_id)
+
+        if not video_id:
+            video_id = self._search_regex(
+                r'data-media-id="(\d+)"', webpage, 'video id')
+
+        playlist = self._download_xml(
+            'http://www.zapiks.fr/view/index.php?action=playlist&media_id=%s&lang=en' % video_id,
+            display_id)
+
+        NS_MAP = {
+            'jwplayer': 'http://rss.jwpcdn.com/'
+        }
+
+        def ns(path):
+            return xpath_with_ns(path, NS_MAP)
+
+        item = playlist.find('./channel/item')
+
+        title = xpath_text(item, 'title', 'title') or self._og_search_title(webpage)
+        description = self._og_search_description(webpage, default=None)
+        thumbnail = xpath_text(
+            item, ns('./jwplayer:image'), 'thumbnail') or self._og_search_thumbnail(webpage, default=None)
+        duration = parse_duration(self._html_search_meta(
+            'duration', webpage, 'duration', default=None))
+        timestamp = parse_iso8601(self._html_search_meta(
+            'uploadDate', webpage, 'upload date', default=None), ' ')
+
+        view_count = int_or_none(self._search_regex(
+            r'UserPlays:(\d+)', webpage, 'view count', default=None))
+        comment_count = int_or_none(self._search_regex(
+            r'UserComments:(\d+)', webpage, 'comment count', default=None))
+
+        formats = []
+        for source in item.findall(ns('./jwplayer:source')):
+            format_id = source.attrib['label']
+            f = {
+                'url': source.attrib['file'],
+                'format_id': format_id,
+            }
+            m = re.search(r'^(?P<height>\d+)[pP]', format_id)
+            if m:
+                f['height'] = int(m.group('height'))
+            formats.append(f)
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': description,
+            'thumbnail': thumbnail,
+            'duration': duration,
+            'timestamp': timestamp,
+            'view_count': view_count,
+            'comment_count': comment_count,
+            'formats': formats,
+        }
diff --git a/youtube_dl/jsinterp.py b/youtube_dl/jsinterp.py

index 453e2732cc4faa453a98b153356c2188feef1d35..0e0c7d90d5aa2fbb8039dddf642ac4692f2974a7 100644 (file)
--- a/youtube_dl/jsinterp.py
+++ b/youtube_dl/jsinterp.py
@@ -30,13 +30,10 @@ class JSInterpreter(object):
      def __init__(self, code, objects=None):
          if objects is None:
              objects = {}
-        self.code = self._remove_comments(code)
+        self.code = code
          self._functions = {}
          self._objects = objects
  
-    def _remove_comments(self, code):
-        return re.sub(r'(?s)/\*.*?\*/', '', code)
-
      def interpret_statement(self, stmt, local_vars, allow_recursion=100):
          if allow_recursion < 0:
              raise ExtractorError('Recursion limit reached')
diff --git a/youtube_dl/options.py b/youtube_dl/options.py

index 4fcf8c83dd1574347c0e3e5cc0473ee293b4c323..5c2d153b13b9e060e4f5b4d1d9991d10a72413c4 100644 (file)
--- a/youtube_dl/options.py
+++ b/youtube_dl/options.py
@@ -424,6 +424,10 @@ def _hide_login_info(opts):
          '--xattr-set-filesize',
          dest='xattr_set_filesize', action='store_true',
          help='(experimental) set file xattribute ytdl.filesize with expected filesize')
+    downloader.add_option(
+        '--hls-prefer-native',
+        dest='hls_prefer_native', action='store_true',
+        help='(experimental) Use the native HLS downloader instead of ffmpeg.')
      downloader.add_option(
          '--external-downloader',
          dest='external_downloader', metavar='COMMAND',
@@ -735,6 +739,10 @@ def _hide_login_info(opts):
          '--prefer-ffmpeg',
          action='store_true', dest='prefer_ffmpeg',
          help='Prefer ffmpeg over avconv for running the postprocessors')
+    postproc.add_option(
+        '--ffmpeg-location', '--avconv-location', metavar='PATH',
+        dest='ffmpeg_location',
+        help='Location of the ffmpeg/avconv binary; either the path to the binary or its containing directory.')
      postproc.add_option(
          '--exec',
          metavar='CMD', dest='exec_cmd',
diff --git a/youtube_dl/postprocessor/ffmpeg.py b/youtube_dl/postprocessor/ffmpeg.py

index e42298f0e8c1a0d97d7c3a465ec6aea337ef380c..398fe050ede3d7da8678fd1453bc1ae475419362 100644 (file)
--- a/youtube_dl/postprocessor/ffmpeg.py
+++ b/youtube_dl/postprocessor/ffmpeg.py
@@ -30,54 +30,95 @@ class FFmpegPostProcessorError(PostProcessingError):
  class FFmpegPostProcessor(PostProcessor):
      def __init__(self, downloader=None, deletetempfiles=False):
          PostProcessor.__init__(self, downloader)
-        self._versions = self.get_versions()
          self._deletetempfiles = deletetempfiles
+        self._determine_executables()
  
      def check_version(self):
-        if not self._executable:
+        if not self.available:
              raise FFmpegPostProcessorError('ffmpeg or avconv not found. Please install one.')
  
-        required_version = '10-0' if self._uses_avconv() else '1.0'
+        required_version = '10-0' if self.basename == 'avconv' else '1.0'
          if is_outdated_version(
-                self._versions[self._executable], required_version):
+                self._versions[self.basename], required_version):
              warning = 'Your copy of %s is outdated, update %s to version %s or newer if you encounter any errors.' % (
-                self._executable, self._executable, required_version)
+                self.basename, self.basename, required_version)
              if self._downloader:
                  self._downloader.report_warning(warning)
  
      @staticmethod
-    def get_versions():
-        programs = ['avprobe', 'avconv', 'ffmpeg', 'ffprobe']
-        return dict((p, get_exe_version(p, args=['-version'])) for p in programs)
-
-    @property
-    def available(self):
-        return self._executable is not None
+    def get_versions(downloader=None):
+        return FFmpegPostProcessor(downloader)._versions
  
-    @property
-    def _executable(self):
-        if self._downloader.params.get('prefer_ffmpeg', False):
+    def _determine_executables(self):
+        programs = ['avprobe', 'avconv', 'ffmpeg', 'ffprobe']
+        prefer_ffmpeg = self._downloader.params.get('prefer_ffmpeg', False)
+
+        self.basename = None
+        self.probe_basename = None
+
+        self._paths = None
+        self._versions = None
+        if self._downloader:
+            location = self._downloader.params.get('ffmpeg_location')
+            if location is not None:
+                if not os.path.exists(location):
+                    self._downloader.report_warning(
+                        'ffmpeg-location %s does not exist! '
+                        'Continuing without avconv/ffmpeg.' % (location))
+                    self._versions = {}
+                    return
+                elif not os.path.isdir(location):
+                    basename = os.path.splitext(os.path.basename(location))[0]
+                    if basename not in programs:
+                        self._downloader.report_warning(
+                            'Cannot identify executable %s, its basename should be one of %s. '
+                            'Continuing without avconv/ffmpeg.' %
+                            (location, ', '.join(programs)))
+                        self._versions = {}
+                        return None
+                    location = os.path.dirname(os.path.abspath(location))
+                    if basename in ('ffmpeg', 'ffprobe'):
+                        prefer_ffmpeg = True
+
+                self._paths = dict(
+                    (p, os.path.join(location, p)) for p in programs)
+                self._versions = dict(
+                    (p, get_exe_version(self._paths[p], args=['-version']))
+                    for p in programs)
+        if self._versions is None:
+            self._versions = dict(
+                (p, get_exe_version(p, args=['-version'])) for p in programs)
+            self._paths = dict((p, p) for p in programs)
+
+        if prefer_ffmpeg:
              prefs = ('ffmpeg', 'avconv')
          else:
              prefs = ('avconv', 'ffmpeg')
          for p in prefs:
              if self._versions[p]:
-                return p
-        return None
+                self.basename = p
+                break
  
-    @property
-    def _probe_executable(self):
-        if self._downloader.params.get('prefer_ffmpeg', False):
+        if prefer_ffmpeg:
              prefs = ('ffprobe', 'avprobe')
          else:
              prefs = ('avprobe', 'ffprobe')
          for p in prefs:
              if self._versions[p]:
-                return p
-        return None
+                self.probe_basename = p
+                break
+
+    @property
+    def available(self):
+        return self.basename is not None
  
-    def _uses_avconv(self):
-        return self._executable == 'avconv'
+    @property
+    def executable(self):
+        return self._paths[self.basename]
+
+    @property
+    def probe_executable(self):
+        return self._paths[self.probe_basename]
  
      def run_ffmpeg_multiple_files(self, input_paths, out_path, opts):
          self.check_version()
@@ -88,7 +129,7 @@ def run_ffmpeg_multiple_files(self, input_paths, out_path, opts):
          files_cmd = []
          for path in input_paths:
              files_cmd.extend([encodeArgument('-i'), encodeFilename(path, True)])
-        cmd = ([encodeFilename(self._executable, True), encodeArgument('-y')] +
+        cmd = ([encodeFilename(self.executable, True), encodeArgument('-y')] +
                 files_cmd +
                 [encodeArgument(o) for o in opts] +
                 [encodeFilename(self._ffmpeg_filename_argument(out_path), True)])
@@ -127,13 +168,15 @@ def __init__(self, downloader=None, preferredcodec=None, preferredquality=None,
  
      def get_audio_codec(self, path):
  
-        if not self._probe_executable:
+        if not self.probe_executable:
              raise PostProcessingError('ffprobe or avprobe not found. Please install one.')
          try:
              cmd = [
-                encodeFilename(self._probe_executable, True),
+                encodeFilename(self.probe_executable, True),
                  encodeArgument('-show_streams'),
                  encodeFilename(self._ffmpeg_filename_argument(path), True)]
+            if self._downloader.params.get('verbose', False):
+                self._downloader.to_screen('[debug] %s command line: %s' % (self.basename, shell_quote(cmd)))
              handle = subprocess.Popen(cmd, stderr=compat_subprocess_get_DEVNULL(), stdout=subprocess.PIPE, stdin=subprocess.PIPE)
              output = handle.communicate()[0]
              if handle.wait() != 0:
@@ -223,14 +266,14 @@ def run(self, information):
              if self._nopostoverwrites and os.path.exists(encodeFilename(new_path)):
                  self._downloader.to_screen('[youtube] Post-process file %s exists, skipping' % new_path)
              else:
-                self._downloader.to_screen('[' + self._executable + '] Destination: ' + new_path)
+                self._downloader.to_screen('[' + self.basename + '] Destination: ' + new_path)
                  self.run_ffmpeg(path, new_path, acodec, more_opts)
          except:
              etype, e, tb = sys.exc_info()
              if isinstance(e, AudioConversionError):
                  msg = 'audio conversion failed: ' + e.msg
              else:
-                msg = 'error running ' + self._executable
+                msg = 'error running ' + self.basename
              raise PostProcessingError(msg)
  
          # Try to update the date time for extracted audio file.
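Note on `_determine_executables` above: the `--ffmpeg-location` value may be either a directory or the path to one of the four binaries. The sketch below restates that resolution as a standalone function under simplifying assumptions: the name `resolve_tool_paths` is hypothetical, warnings are replaced by returning `None`, and `prefer_ffmpeg` starts at `False` instead of being read from the downloader parameters.

```python
import os

PROGRAMS = ['avprobe', 'avconv', 'ffmpeg', 'ffprobe']


def resolve_tool_paths(location, programs=PROGRAMS):
    """Return (paths, prefer_ffmpeg) for a location value, or None if unusable."""
    if not os.path.exists(location):
        return None  # the diff warns and continues without ffmpeg/avconv
    prefer_ffmpeg = False
    if not os.path.isdir(location):
        # A file was given: identify it by basename, ignoring any extension.
        basename = os.path.splitext(os.path.basename(location))[0]
        if basename not in programs:
            return None
        # All four tools are then looked up next to the given binary.
        location = os.path.dirname(os.path.abspath(location))
        prefer_ffmpeg = basename in ('ffmpeg', 'ffprobe')
    paths = dict((p, os.path.join(location, p)) for p in programs)
    return paths, prefer_ffmpeg
```

Pointing the option at an `ffmpeg` or `ffprobe` binary therefore also implies `--prefer-ffmpeg`, while pointing it at a directory leaves the preference unchanged.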
diff --git a/youtube_dl/utils.py b/youtube_dl/utils.py

index 54fa17c388aa326cbbc83b1f2a7829a00deddb0a..475fad3c903f9a2923def9f186c746d067807a68 100644 (file)
--- a/youtube_dl/utils.py
+++ b/youtube_dl/utils.py
@@ -62,6 +62,11 @@
  }
  
  
+ENGLISH_MONTH_NAMES = [
+    'January', 'February', 'March', 'April', 'May', 'June',
+    'July', 'August', 'September', 'October', 'November', 'December']
+
+
  def preferredencoding():
      """Get preferred encoding.
  
@@ -895,8 +900,8 @@ def _windows_write_string(s, out):
      def not_a_console(handle):
          if handle == INVALID_HANDLE_VALUE or handle is None:
              return True
-        return ((GetFileType(handle) & ~FILE_TYPE_REMOTE) != FILE_TYPE_CHAR
-                or GetConsoleMode(handle, ctypes.byref(ctypes.wintypes.DWORD())) == 0)
+        return ((GetFileType(handle) & ~FILE_TYPE_REMOTE) != FILE_TYPE_CHAR or
+                GetConsoleMode(handle, ctypes.byref(ctypes.wintypes.DWORD())) == 0)
  
      if not_a_console(h):
          return False
@@ -1185,11 +1190,18 @@ def get_term_width():
  def month_by_name(name):
      """ Return the number of a month by (locale-independently) English name """
  
-    ENGLISH_NAMES = [
-        'January', 'February', 'March', 'April', 'May', 'June',
-        'July', 'August', 'September', 'October', 'November', 'December']
      try:
-        return ENGLISH_NAMES.index(name) + 1
+        return ENGLISH_MONTH_NAMES.index(name) + 1
+    except ValueError:
+        return None
+
+
+def month_by_abbreviation(abbrev):
+    """ Return the number of a month by (locale-independently) English
+        abbreviations """
+
+    try:
+        return [s[:3] for s in ENGLISH_MONTH_NAMES].index(abbrev) + 1
      except ValueError:
          return None
  
@@ -1548,8 +1560,8 @@ def fix_kv(m):
          return '"%s"' % v
  
      res = re.sub(r'''(?x)
-        "(?:[^"\\]*(?:\\\\|\\")?)*"|
-        '(?:[^'\\]*(?:\\\\|\\')?)*'|
+        "(?:[^"\\]*(?:\\\\|\\['"nu]))*[^"\\]*"|
+        '(?:[^'\\]*(?:\\\\|\\['"nu]))*[^'\\]*'|
          [a-zA-Z_][.a-zA-Z_0-9]*
          ''', fix_kv, code)
      res = re.sub(r',(\s*\])', lambda m: m.group(1), res)
@@ -1604,6 +1616,15 @@ def args_to_str(args):
      return ' '.join(shlex_quote(a) for a in args)
  
  
+def mimetype2ext(mt):
+    _, _, res = mt.rpartition('/')
+
+    return {
+        'x-ms-wmv': 'wmv',
+        'x-mp4-fragmented': 'mp4',
+    }.get(res, res)
+
+
  def urlhandle_detect_ext(url_handle):
      try:
          url_handle.headers
@@ -1619,7 +1640,7 @@ def urlhandle_detect_ext(url_handle):
              if e:
                  return e
  
-    return getheader('Content-Type').split("/")[1]
+    return mimetype2ext(getheader('Content-Type'))
  
  
  def age_restricted(content_limit, age_limit):
diff --git a/youtube_dl/version.py b/youtube_dl/version.py

index 492ddf2ea0dc55df54f0ca787cd1bb66da26d22a..17317b29c2ce841a6babc4d08a931f6b75020c33 100644 (file)
--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@@ -1,3 +1,3 @@
  from __future__ import unicode_literals
  
-__version__ = '2015.02.11'
+__version__ = '2015.02.23'
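
Note on the youtube_dl/utils.py hunks above: the sketch below copies the two new helpers, `month_by_abbreviation` and `mimetype2ext`, out of the diff so they can be tried without installing youtube-dl. The function bodies follow the added lines. The docstrings, comments, and sample inputs are illustrative assumptions and are not part of the commit.

```python
# Standalone copies of helpers added to youtube_dl/utils.py in this merge.
# The sample inputs at the bottom are illustrative, not taken from the test suite.
ENGLISH_MONTH_NAMES = [
    'January', 'February', 'March', 'April', 'May', 'June',
    'July', 'August', 'September', 'October', 'November', 'December']


def month_by_abbreviation(abbrev):
    """Return the month number for a three-letter English abbreviation, or None."""
    try:
        # Build the abbreviations from the first three letters of each name.
        return [s[:3] for s in ENGLISH_MONTH_NAMES].index(abbrev) + 1
    except ValueError:
        return None


def mimetype2ext(mt):
    """Map a MIME type to a file extension, falling back to the subtype."""
    # rpartition keeps everything after the last '/', or the whole string
    # when there is no '/' at all.
    _, _, res = mt.rpartition('/')
    return {
        'x-ms-wmv': 'wmv',
        'x-mp4-fragmented': 'mp4',
    }.get(res, res)


print(month_by_abbreviation('Feb'))    # 2
print(month_by_abbreviation('feb'))    # None: the lookup is case-sensitive
print(mimetype2ext('video/x-ms-wmv'))  # wmv
print(mimetype2ext('audio/mpeg'))      # mpeg
```

Compared with the removed `getheader('Content-Type').split("/")[1]`, the `rpartition`-based version does not raise `IndexError` when the header value contains no slash; it returns the value unchanged instead.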
Files changed:
AUTHORS
Makefile
README.md
devscripts/check-porn.py
docs/supportedsites.md
test/helper.py
test/test_jsinterp.py
test/test_swfinterp.py
test/test_utils.py
test/test_youtube_signature.py
youtube_dl/YoutubeDL.py
youtube_dl/__init__.py
youtube_dl/downloader/__init__.py
youtube_dl/downloader/common.py
youtube_dl/downloader/external.py
youtube_dl/downloader/f4m.py
youtube_dl/downloader/hls.py
youtube_dl/downloader/http.py
youtube_dl/downloader/rtmp.py
youtube_dl/extractor/__init__.py
youtube_dl/extractor/adobetv.py
youtube_dl/extractor/adultswim.py
youtube_dl/extractor/appletrailers.py
youtube_dl/extractor/bandcamp.py
youtube_dl/extractor/blinkx.py
youtube_dl/extractor/brightcove.py
youtube_dl/extractor/buzzfeed.py
youtube_dl/extractor/cbs.py
youtube_dl/extractor/cbssports.py [new file with mode: 0644]
youtube_dl/extractor/chirbit.py [new file with mode: 0644]
youtube_dl/extractor/common.py
youtube_dl/extractor/dailymotion.py
youtube_dl/extractor/defense.py
youtube_dl/extractor/embedly.py [new file with mode: 0644]
youtube_dl/extractor/escapist.py
youtube_dl/extractor/fivemin.py
youtube_dl/extractor/gdcvault.py
youtube_dl/extractor/generic.py
youtube_dl/extractor/ign.py
youtube_dl/extractor/imgur.py [new file with mode: 0644]
youtube_dl/extractor/livestream.py
youtube_dl/extractor/nationalgeographic.py [new file with mode: 0644]
youtube_dl/extractor/nbc.py
youtube_dl/extractor/netzkino.py
youtube_dl/extractor/patreon.py
youtube_dl/extractor/pornhub.py
youtube_dl/extractor/r7.py [new file with mode: 0644]
youtube_dl/extractor/radiode.py
youtube_dl/extractor/rtlnl.py
youtube_dl/extractor/rtve.py
youtube_dl/extractor/sandia.py [new file with mode: 0644]
youtube_dl/extractor/sockshare.py
youtube_dl/extractor/soundgasm.py
youtube_dl/extractor/teamcoco.py
youtube_dl/extractor/ted.py
youtube_dl/extractor/theonion.py
youtube_dl/extractor/theplatform.py
youtube_dl/extractor/tv4.py [new file with mode: 0644]
youtube_dl/extractor/twitch.py
youtube_dl/extractor/videolecturesnet.py
youtube_dl/extractor/vimeo.py
youtube_dl/extractor/vk.py
youtube_dl/extractor/webofstories.py
youtube_dl/extractor/wsj.py
youtube_dl/extractor/xtube.py
youtube_dl/extractor/yahoo.py
youtube_dl/extractor/yam.py [new file with mode: 0644]
youtube_dl/extractor/youtube.py
youtube_dl/extractor/zapiks.py [new file with mode: 0644]
youtube_dl/jsinterp.py
youtube_dl/options.py
youtube_dl/postprocessor/ffmpeg.py
youtube_dl/utils.py
youtube_dl/version.py