RSS feeds from iTunesU

25 Jul 2014

Westminster Theological Seminary seem to have renewed their desire to release their courses on iTunesU. Within the last few weeks and months they have released new lecture courses:

Unfortunately there is not an easy way (that I have found) to subscribe to this content via podcast unless one has bought into the Apple ecosystem.

But with the help of the iTunesU website and some Python code we can generate our own RSS feeds for import into any podcast app.

The first step is to scrape the content of the site:

import urllib2
from lxml import etree
itunesu_url = "https://itunes.apple.com/us/itunes-u/nt133-biblical-theology/id900461639"
response = urllib2.urlopen(itunesu_url)
htmlparser = etree.HTMLParser()
tree = etree.parse(response, htmlparser)

Using Chrome’s “Inspect Element” tool allows us to find the XPath of the table containing all the mp3 links:

'//*[@id="content"]/div/div[2]/div/div/table/tbody/tr'
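
Each row matched by this XPath carries its metadata as attributes on the tr element; the code in this post reads preview-title, preview-album and audio-preview-url from them. As a rough illustration of that idea, here is a small sketch that runs without touching the network. The table markup and the example.com URLs are made up for the example, and it uses the standard library's ElementTree under Python 3 rather than lxml, so treat it as a picture of the technique and not as the real page structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: a cut-down imitation of the lecture table.
SAMPLE_TABLE = """
<table>
  <tr preview-title="Lecture 1" preview-album="Sample Course"
      audio-preview-url="http://example.com/lecture1.mp3"><td>1</td></tr>
  <tr preview-title="Lecture 2" preview-album="Sample Course"
      audio-preview-url="http://example.com/lecture2.mp3"><td>2</td></tr>
</table>
"""


def rows_to_episodes(markup):
    """Return one dict of attributes per <tr> element in the markup."""
    root = ET.fromstring(markup.strip())
    # row.items() yields (attribute name, value) pairs, so dict() gives
    # us a simple lookup table for each lecture.
    return [dict(row.items()) for row in root.iter("tr")]


episodes = rows_to_episodes(SAMPLE_TABLE)
for episode in episodes:
    print(episode["preview-title"], episode["audio-preview-url"])
```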

Now we need to loop over every row in the table and pull out the required information: the name of the lecture, the URL of the mp3 and the date it was published. Once we have this information we use the very handy PyRSS2Gen module to create our RSS items:

import datetime
import PyRSS2Gen
feed_items = []
for row in tree.xpath('//*[@id="content"]/div/div[2]/div/div/table/tbody/tr'):
    # each row carries its metadata as attributes on the <tr> element
    episode = dict(row.items())
    # print episode['preview-title'].encode('utf-8')
    mp3_url = episode['audio-preview-url']
    # request the mp3 once and reuse its headers for the size and the date
    headers = urllib2.urlopen(mp3_url).info()
    feed_items.append(PyRSS2Gen.RSSItem(
        title=episode['preview-title'],
        link=mp3_url,
        # enclose the mp3 file so it is picked up by podcast apps:
        enclosure=PyRSS2Gen.Enclosure(mp3_url, headers['Content-Length'], "audio/mpeg"),
        guid=PyRSS2Gen.Guid(mp3_url),
        # the server's Date header stands in for the publication date
        pubDate=datetime.datetime.strptime(headers['Date'],
                                           '%a, %d %b %Y %H:%M:%S %Z')))
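
The date handling is the fiddliest part of that loop, because strptime with %a and %b depends on the current locale matching the English day and month names in the header. If you are on Python 3 (the snippets in this post are Python 2), the standard library's email.utils.parsedate_to_datetime is one alternative worth considering. The sketch below uses a made-up header value, and parse_http_date is just a name chosen for the example.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def parse_http_date(header_value):
    """Parse an HTTP-style date header into a datetime object."""
    # parsedate_to_datetime understands the RFC 2822 date format that
    # HTTP headers use, independent of the current locale.
    return parsedate_to_datetime(header_value)


# Hypothetical header value, in the same shape as an HTTP Date header.
published = parse_http_date("Fri, 25 Jul 2014 10:30:00 GMT")
print(published.isoformat())
```

For a header ending in GMT the result carries a zero UTC offset, so it can be compared safely with other timezone-aware datetimes.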

Finally, we put all the RSS items into a feed and write it to a file:

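# assumption: slugify comes from the python-slugify package;
# the post does not say where it is defined
from slugify import slugify
rss = PyRSS2Gen.RSS2(
    title=episode['preview-album'],  # use info from final episode
    link=itunesu_url,
    description=episode['preview-album'],
    lastBuildDate=datetime.datetime.utcnow(),  # PyRSS2Gen expects dates in GMT
    items=feed_items)
# make a nice file name:
with open(slugify(episode['preview-album']) + ".xml", "w") as feed_file:
    rss.write_xml(feed_file)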
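
The post never shows where slugify is defined. For readers without a ready-made helper, a minimal stand-in using only the standard library might look like the sketch below (written for Python 3). It is an assumption about what the helper needs to do here, namely turn a course title into a safe file name, and not necessarily the function used to produce the original feeds.

```python
import re
import unicodedata


def slugify(text):
    """Turn a title into a lowercase, hyphen-separated file name stem."""
    # Decompose accented characters and drop anything that is not ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower())
    # Trim hyphens left over at either end.
    return slug.strip("-")


print(slugify("NT133: Biblical Theology") + ".xml")
```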

Now you can provide this file to your podcast app of choice. For example, you can put it in your Public Dropbox folder and share the link with a podcast app on your phone.
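
Before handing the file to a podcast app, it can be worth checking that every item really has an enclosure, since that is the element podcast apps look for. The following is a small sketch of such a check using the standard library's ElementTree under Python 3. The sample feed and its example.com URL are invented for the illustration, and list_enclosures is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: one item with an enclosure and one without.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Sample Course</title>
    <item>
      <title>Lecture 1</title>
      <enclosure url="http://example.com/lecture1.mp3"
                 length="12345" type="audio/mpeg"/>
    </item>
    <item>
      <title>Notes only</title>
    </item>
  </channel>
</rss>"""


def list_enclosures(feed_xml):
    """Return (title, mp3 url) pairs for every item with an enclosure."""
    channel = ET.fromstring(feed_xml).find("channel")
    found = []
    for item in channel.findall("item"):
        enclosure = item.find("enclosure")
        if enclosure is not None:
            found.append((item.findtext("title"), enclosure.get("url")))
    return found


for title, url in list_enclosures(SAMPLE_FEED):
    print(title, url)
```

To check a generated file instead of the sample string, read its contents and pass them to list_enclosures in the same way.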

Alternatively, here are the RSS feeds for the three courses mentioned above:

(Unfortunately this does not seem to play nicely with Lane Tipton’s Survey of Reformed Theology course, as the iTunes website tries to open the desktop application. I can only guess that there is a flag somewhere that needs to be set to allow the preview page that we are scraping to be displayed.)

Putting all the code together gives us a nice command-line tool that can be run with python itunesu2rss.py url1 url2 url3: