I have released version 0.4.4 of my web-based aggregator, Temboz. Apart from cosmetic fixes, the new version improves filtering by supplying convenience functions to simplify writing rules. For instance:
content_any_words('foot', 'football', 'tennis', 'rugby')
is equivalent to the older syntax:
('foot' in content_words or 'football' in content_words or 'tennis' in content_words or 'rugby' in content_words)
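To make the equivalence concrete, here is a minimal sketch of how such a helper could be written. It is not Temboz's actual source: it assumes `content_words` is a set of lowercase words extracted from the article, and the factory function `make_content_any_words` is a hypothetical name used only for this illustration.

```python
import re


def make_content_any_words(content_words):
    """Bind a helper to one article's set of words (hypothetical factory)."""
    def content_any_words(*words):
        # True as soon as any of the given words occurs in the article
        return any(word in content_words for word in words)
    return content_any_words


# Assumption: the article text is lowercased and split into a set of words
content_words = set(re.findall(r"\w+", "Rugby world cup final tonight".lower()))
content_any_words = make_content_any_words(content_words)
```

With this article text, `content_any_words('foot', 'football', 'tennis', 'rugby')` is true because 'rugby' is in the set, exactly as the longer chain of `or` tests would be.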
This release also includes an automatic garbage collection mechanism to keep the database size manageable. Uninteresting articles (those flagged “thumbs down”) are purged of their content every day between 3AM and 4AM. The article metadata itself (title, timestamps, permalinks) is kept so that the articles do not reappear if they are still present in the feeds (some infrequently updated feeds keep really old articles in their XML). By default, only articles older than 7 days are purged. The threshold is configurable in param.py; if you are upgrading, you will need to update that file to add the corresponding parameter.
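The purge described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code shipped in Temboz: the `articles` table, its columns, the `rating = -1` convention for “thumbs down” and the `GARBAGE_DAYS` name are all hypothetical, and scheduling the job for the 3AM to 4AM window is left out.

```python
import sqlite3
import time

GARBAGE_DAYS = 7  # hypothetical name for the configurable threshold

# Hypothetical schema: metadata columns plus the bulky content column
SCHEMA = """
CREATE TABLE articles (
    id INTEGER PRIMARY KEY,
    title TEXT,
    content TEXT,
    rating INTEGER,  -- assumption: -1 means "thumbs down"
    created REAL     -- Unix timestamp
)
"""


def purge_uninteresting(db, max_age_days=GARBAGE_DAYS, now=None):
    """Blank the content of old thumbs-down articles; keep their metadata."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    cur = db.execute(
        "UPDATE articles SET content = NULL "
        "WHERE rating = -1 AND content IS NOT NULL AND created < ?",
        (cutoff,),
    )
    db.commit()
    return cur.rowcount  # number of articles purged
```

Because the row itself is updated rather than deleted, the title and timestamp remain available for recognizing an article that a feed serves again.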