23-09-2018 - Broken SSL Certs Courtesy of DCU

A Little Backstory

Between December 2017 and March 2018 I was working pretty heavily on a command line tool called dcurooms. Very simple in design, it was built to display room information and to book and request rooms.

For the most part it worked pretty well, supporting Python versions from 2.7 up to 3.6 (there was no 3.7 at that time). In all honesty, it’s just a bunch of Python scripts sewn together quite badly. That said, it works. It makes use of mechanicalsoup, bs4 and requests to scrape DCU location timetables, such as this one for example.

I had some pretty nifty ideas for it going forward. Like, instead of constantly scraping the web, which indeed requires a connection to said web, there would be another option, -u or --update. This would scrape the entirety of any timetable you wanted and store it in a file which dcurooms could access. From there on, you’d just be parsing a file. I had other ideas but they failed to materialize. Why? Well…
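A rough sketch of how that -u / --update idea could look, using only the standard library. Everything here is an assumption made for illustration: the cache file name, the JSON layout and the fetch callable (a stand-in for whatever scraping function dcurooms would use) are hypothetical and not taken from the real codebase.

```python
import argparse
import json
from pathlib import Path

# Hypothetical cache location; the real tool might pick somewhere else.
CACHE_FILE = Path("timetable_cache.json")


def update_cache(rooms, fetch, cache_file=CACHE_FILE):
    """Fetch each room's timetable once and store them all in one JSON file.

    `fetch` is any callable taking a room name and returning
    JSON-serialisable data (a stand-in for the real scraper).
    """
    data = {room: fetch(room) for room in rooms}
    cache_file.write_text(json.dumps(data, indent=2))
    return data


def load_cache(cache_file=CACHE_FILE):
    """Read the timetables back from disk, no network needed."""
    return json.loads(cache_file.read_text())


def build_parser():
    """Command line parser exposing the -u / --update flag."""
    parser = argparse.ArgumentParser(prog="dcurooms")
    parser.add_argument(
        "-u", "--update", action="store_true",
        help="scrape the timetables and refresh the local cache file",
    )
    return parser
```

With something like this in place, a normal run would call load_cache() and only go near the network when --update is passed.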


The Breaking

I can’t remember what day it was exactly, though I’m sure an admin of the 2017-18 Redbrick Committee might know, but a bunch of DCU’s SSL certs broke, or at least they failed to verify. If I remember correctly this also affected Redbrick at the time, but again, that’s one for the admins of 2017-18. To make a long story short, this rendered dcurooms useless. Well, that’s not entirely true: you still had the room booking functionality. Aside from that, however, the whole thing was borked. If you ran any other command, you were greeted with something that looked like this:

requests.exceptions.SSLError: HTTPSConnectionPool(host='www101.dcu.ie', port=443):
Max retries exceeded with url: /timetables/feed.php?room=GLA.LG25&week1=10&hour=11&day=4&template=location
(Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')],)",),))

That’s the very end of the error message, the original was about 73 lines or so. dcurooms had been brought to its knees.


What Happened Next?

This is the part I guess I should feel guilty about. What happened when dcurooms died a small death? Well, I let it. I just kind of stopped caring at the time. I knew it wasn’t even a big fix, I just had moved on from it when it broke. Between Redbrick and other things (neo) I really had no time. No seriously, I had no time, we were in the home stretch of the SISTEM event. I just had no chance to catch my breath, let alone fix dcurooms’ SSL errors.

That was, until today…


Fixing dcurooms

So today, the day before I start third year, I had some time to catch my breath. I sat down, looked at a list of things I’ve put off for a while, and knowing I was about to be acquainted with Python once again, I said I’d go fix this issue. I installed dcurooms as per the README, but when I ran dcurooms --help I got this:

/usr/lib/python3/dist-packages/requests/__init__.py:80: RequestsDependencyWarning: urllib3 (1.23) or chardet (3.0.4) doesn't match a supported version!
  RequestsDependencyWarning)

Huh? I thought to myself. That wasn’t there the last time. I actually wasn’t sure what this was telling me at first. Was the specified version of requests in requirements.txt too old? No, that makes no sense. Was it that I needed to add urllib3 and chardet to requirements.txt? No, that wouldn’t make a difference.
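That warning comes from requests comparing the installed urllib3 and chardet against the versions it supports, so a calmer first step is to look at what is actually installed. A minimal sketch, assuming Python 3.8 or newer (for importlib.metadata); the helper name installed_versions is made up for this example:

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # The three packages named in the warning.
    found = installed_versions(["requests", "urllib3", "chardet"])
    for name, version in found.items():
        print(f"{name}: {version or 'not installed'}")
```

If the numbers differ from what the system package manager installed, that usually points at two copies of a library (one from pip, one from apt) fighting each other.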

So, like the idiot I am, I somewhat panicked and tried to re-install the Python requests module. Off I hit with my:

sudo pip3 uninstall requests

While that totally works now, it absolutely did not at the time. So I just banged out a:

sudo apt-get remove python3-requests
sudo apt-get install python3-requests

and lo and behold it worked. I fixed the thing which was stopping me from fixing the other thing. Programming is great. Now to fix the issue itself. Well, it turns out that ignoring an SSL cert is incredibly easy with requests. When using requests.get("https://www.dcu.ie") you just add the param verify=False:

requests.get("https://www.dcu.ie", verify=False)

And would you look at that it… oh, it’s.. why isn’t it working?


The Real Issue

Kids.. no matter how long the error message is.. make sure you read it. Here I was, thinking it was a problem requests was having with the broken certs, when in reality it wasn’t getting near them at all. Why? Because the error I was getting was occurring when the mechanicalsoup browser tried to open the timetable link. So that’s where dcurooms was having its issue. It’s the exact same problem, just in a different part of the codebase.
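One way to make a 73-line traceback harder to misread is to pull out only the frames you care about, for example those whose file path mentions your own project or a particular library. A small standard-library sketch; frames_of_interest, show_room and open_timetable are hypothetical names for illustration, not dcurooms functions:

```python
import traceback


def frames_of_interest(exc, path_contains=""):
    """List (filename, line number, function name) for each traceback frame
    whose filename contains `path_contains`, outermost first.

    The default empty string matches every frame.
    """
    frames = traceback.extract_tb(exc.__traceback__)
    return [
        (frame.filename, frame.lineno, frame.name)
        for frame in frames
        if path_contains in frame.filename
    ]


def open_timetable():
    # Hypothetical stand-in for the call that really failed.
    raise ConnectionError("certificate verify failed")


def show_room():
    # Hypothetical stand-in for a command that triggers it.
    open_timetable()


if __name__ == "__main__":
    try:
        show_room()
    except ConnectionError as exc:
        for filename, lineno, name in frames_of_interest(exc):
            print(f"{filename}:{lineno} in {name}")
```

Filtering with something like path_contains="mechanicalsoup" would show straight away whether that library sits in the failing call chain.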

So… what fixes that? Off I go reading the mechanicalsoup docs, looking for what I can. I spent a fair amount of time scouring that documentation. I should have just read the overview on the mechanicalsoup GitHub repo. mechanicalsoup is built on top of requests. The solution, therefore, is this:

browser.open("https://www.dcu.ie", verify=False)

The exact same parameter.
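Since both code paths end up in requests, the setting can also live in one place: a requests.Session whose verify attribute is set once. The sketch below is an illustration rather than what dcurooms does; make_session and the ca_bundle parameter are invented names. Passing a path to a CA bundle is the safer route, because verify=False switches off certificate checking altogether.

```python
import requests
import urllib3


def make_session(ca_bundle=None):
    """Build a requests session with its certificate handling set once.

    ca_bundle: optional path to a CA bundle file to trust. When omitted,
    verification is disabled entirely, which is insecure and only
    reasonable as a stopgap.
    """
    session = requests.Session()
    if ca_bundle is not None:
        # requests accepts a file path here and verifies against it.
        session.verify = ca_bundle
    else:
        session.verify = False
        # Silence the InsecureRequestWarning emitted on unverified requests.
        urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
    return session
```

As far as the mechanicalsoup docs describe it, StatefulBrowser accepts a session keyword argument, so a session built this way could be handed to the browser instead of adding verify=False to every call.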


Conclusions

Today’s lessons - always read your error message properly, and don’t make every problem a new one. You’ve probably dealt with something like it beforehand. Had I thought logically about it, I wouldn’t have needed to dive into the mechanicalsoup docs.

The good news is now I can actually implement some of my ideas from about 7 months ago.

By James McDermott

