PenguinTV optimization

I’ve been going through the crufty and scary ptvDB.py looking for ways to improve performance. Armed with the trust profile and pstats modules, I cut polling time in half by removing redundant and unnecessary database hits.

I then got the idea of generating a print statement every time I hit the DB, so I can see exactly what the database is doing during regular operation.

The result is not always pretty:

Basic article read:

SELECT title, creator, link, description, feed_id, date FROM entries WHERE id=? (69443,) ok
SELECT id,entry_id,url,file,download_status,viewed,length,mimetype FROM media WHERE entry_id = ? (69443,) ok
SELECT read FROM entries WHERE id=? (69443,) ok

Ok that’s fine, we get the entry, any associated media, and then the read state.

What about an unread entry?
SELECT title, creator, link, description, feed_id, date FROM entries WHERE id=? (67523,) ok
SELECT id,entry_id,url,file,download_status,viewed,length,mimetype FROM media WHERE entry_id = ? (67523,) ok
SELECT read FROM entries WHERE id=? (67523,) ok
UPDATE entries SET read=? WHERE id=? (1, 67523) ok
UPDATE media SET viewed=? WHERE entry_id=? (1, 67523) ok
SELECT read FROM entries WHERE id=? (67523,) ok
SELECT id,entry_id,url,file,download_status,viewed,length,mimetype FROM media WHERE entry_id = ? (67523,) ok
SELECT title, creator, link, description, feed_id, date FROM entries WHERE id=? (67523,) ok
SELECT id,entry_id,url,file,download_status,viewed,length,mimetype FROM media WHERE entry_id = ? (67523,) ok
SELECT read FROM entries WHERE id=? (67523,) ok
SELECT feed_pointer,description FROM feeds WHERE id=? (233,) ok
SELECT id,title,fakedate FROM entries WHERE feed_id=? ORDER BY fakedate DESC (233,) ok
SELECT id FROM entries WHERE feed_id=? (233,) ok
SELECT feed_pointer FROM feeds WHERE id=? (233,) ok
SELECT pollfail FROM feeds WHERE id=? (233,) ok
SELECT feed_pointer FROM feeds WHERE id=? (233,) ok
SELECT read FROM entries WHERE feed_id=? (233,) ok
SELECT id FROM feeds WHERE feed_pointer=? (233,) ok

Oops. It’s whacking the database three times for the entry data, and updates the whole feed as well. Time to figure out what I’m doing wrong!