Using FreshRSS's HTML + XPath with GoComics
FreshRSS can scrape a site with XPath to generate new items for a feed. GoComics does not provide RSS feeds for its comics, but XPath can be run against a comic’s page to pick up each day’s comic.
Set the feed URL to the comic’s page (e.g. https://www.gocomics.com/fminus for F Minus).
Under “Type of feed source”, set these options:
- Type of feed source: HTML + XPath (Web scraping)
- finding news items:
//div[contains(@class, "gc-deck--cta-0")]//picture[contains(@class, "gc-card__image")]
- item title:
parent::h4
- item content:
./img
- item link (URL):
ancestor::a/@href
- item thumbnail:
descendant::img/@src
- item author:
"F Minus"
- Custom date/time format:
Y.m.d
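To get a feel for what these expressions select, here is a minimal Python sketch that mimics them against a tiny, made-up page. The markup, URLs, and the `scrape` and `has_class_fragment` helpers are assumptions for illustration only; the real GoComics page is much larger and may be structured differently. The standard library's ElementTree only supports a subset of XPath (no `contains()`, no `ancestor::` axis), so the sketch emulates those parts by hand rather than running the expressions verbatim.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup (NOT copied from GoComics).
SAMPLE = """
<html><body>
  <div class="gc-deck gc-deck--cta-0">
    <a href="/fminus/2024/01/15">
      <picture class="gc-card__image">
        <img src="https://example.com/fminus-2024-01-15.png" alt="F Minus"/>
      </picture>
    </a>
  </div>
  <div class="gc-deck gc-deck--other">
    <a href="/unrelated">
      <picture class="gc-card__image">
        <img src="https://example.com/other.png" alt="Other"/>
      </picture>
    </a>
  </div>
</body></html>
"""


def has_class_fragment(element, fragment):
    """Mimic XPath contains(@class, fragment): a plain substring test."""
    return fragment in element.get("class", "")


def scrape(markup):
    root = ET.fromstring(markup.strip())
    # ElementTree has no ancestor axis, so record each element's parent.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    items = []
    seen = set()  # avoid duplicates if matching divs are nested
    # //div[contains(@class, "gc-deck--cta-0")]
    for deck in root.iter("div"):
        if not has_class_fragment(deck, "gc-deck--cta-0"):
            continue
        # //picture[contains(@class, "gc-card__image")]
        for picture in deck.iter("picture"):
            if picture in seen or not has_class_fragment(picture, "gc-card__image"):
                continue
            seen.add(picture)

            # ancestor::a/@href: walk up until an enclosing <a> is found.
            node = parent_of.get(picture)
            while node is not None and node.tag != "a":
                node = parent_of.get(node)
            link = node.get("href") if node is not None else None

            # descendant::img/@src: first <img> below the <picture>.
            img = next(picture.iter("img"), None)
            thumbnail = img.get("src") if img is not None else None

            # parent::h4: only matches if the direct parent is an <h4>.
            parent = parent_of.get(picture)
            if parent is not None and parent.tag == "h4":
                title = "".join(parent.itertext()).strip()
            else:
                title = None

            items.append({"title": title, "link": link, "thumbnail": thumbnail})
    return items


if __name__ == "__main__":
    for item in scrape(SAMPLE):
        print(item)
```

In this made-up sample the `<picture>` sits directly inside an `<a>`, so the `parent::h4` title expression selects nothing. Whether the real page behaves the same way has not been checked here.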
This is enough to regularly check the GoComics page and fetch each day’s comic.
The title won’t reflect the date and will instead be urn:sha1:<sha1 hash>.
I haven’t figured out why that is, but I also don’t care enough to fix it.