I started working on a mini-project yesterday bto reinforce the different topics we’re covering in Flatiron. While it’s not done-enough to go over in detail yet I wanted to cover a few bits of SQL that I found particularly useful in getting the program to its current state.
The SQL side of the program exists to keep a unique record of articles off of Hacker News that meet certain criteria. The scrape can be run multiple times over multiple days, weeks, etc. so it was important that the SQL side of the program not continuously re-enter the same article over multiple scrapes. Here’s what the code looks like…
1 2 3 4 5 6 7 8 9 10 11 12 13
There are a few bits of this code I would like to highlight.
“CREATE TABLE if NOT EXISTS” – This one should be familiar to most Flatiron students already, but since some of our projects and examples have assumed the existance of a table I thought it would be worth mentioning. The ‘if NOT EXISTS’ part is important to make sure that you are not creating a new table every time the code runs, this code will only execute the first time the program runs, unless the ‘articles’ table is deleted from the database (this would have to be done manually by the user).
“title TEXT UNIQUE” – This is one that I had to look up what I wanted SQL to do before I could figure out how to do it. This tells the database that while the ‘title’ column is not the primary key, that it is unique and that SQL should raise an error if something attempts to add an item with the same title as something else to the table.
“REPLACE INTO” – The work-horse piece of code, normally we would use ‘INSERT INTO’ but in this case, I chose to use ‘REPLACE’ for two reasons. First, it resolves any conflicting titles by replacing the old entry with the newer one, which makes sure that we only have unique entries in the table. Second since it replaces the old version with the new one, it will update any other data about the article that has changed since the last time the scrape was run (for instance, if the number of points were to change).
These three peices that were previously unfamiliar to me do a lot of work in very little space. Because I don’t need to write methods to check for some of the things these take care of automatically, we save a lot of space and the code is easy to read. Thanks for reading, I hope if you’re working with SQL these little bits are helpful!