New plan for feed caching and stuff
===================================

When we make a request for a section of a feed, we break up the
information we received into many separate facts:

- "id X has contents Y" (in our existing cache)
- "the earliest thing before id X is id Y"
- ditto "after"

Those predecessor/successor facts live in some new cached object (I'll
wait to decide what it should be indexed by). They expire at the same
time as the "contents" ones.

To convert a set of those facts back into a list of ids for sending to
File:

- iterate over the pred/succ links in recency order (so that more
  up-to-date info always takes priority)
- make a collection of sequences of consecutive items we've built up so
  far
- for each link saying A comes just before B, try to link the sequence
  ending with A to the one starting with B
  + creating each sequence as a singleton if we didn't have it already
  + and if the attempt fails, because A isn't at the end of a sequence
    or B isn't at the start of one (or they're the same sequence!),
    then assume that link was incompatible with more recent data,
    ignore it, and move on
- once we've done the best we can, output the connected component
  corresponding to the first (i.e. most recent) link we iterated over.
  That is our feed.

Problems with this:

- what about expiry, if something in the middle expires before the
  ends?
  + What I _want_ is to get a rough estimate of how many things in the
    middle go between the two ends, so I can use it for File
    percentages.
  + Perhaps what we do here is: don't actually _delete_ expired cache
    entries. Instead, still use them to build this list. But then
    delete the expired sections at the end of the process, and replace
    them with a placeholder saying 'about N things go here but we would
    have to re-retrieve them'.

---------

I had also had the plan of abandoning Link: headers and constructing
next/prevs myself based on knowing the id of the segment I was trying
to retrieve.
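(For my own benefit, the conversion procedure sketched out as code. All
of this is made up for illustration: `merge_links` and the run
bookkeeping don't exist yet, ids are plain `String`s, and links arrive
as (A, B) pairs meaning "A comes just before B", most recent first.)

```rust
use std::collections::{HashMap, HashSet};

fn merge_links(links: &[(String, String)]) -> Vec<String> {
    let mut runs: Vec<Vec<String>> = Vec::new(); // sequences built so far
    let mut seen: HashSet<String> = HashSet::new(); // every id placed in a run
    let mut head: HashMap<String, usize> = HashMap::new(); // id -> run it starts
    let mut tail: HashMap<String, usize> = HashMap::new(); // id -> run it ends

    for (a, b) in links {
        // Create singleton runs for ids we haven't seen at all.
        for id in [a, b] {
            if seen.insert(id.clone()) {
                runs.push(vec![id.clone()]);
                head.insert(id.clone(), runs.len() - 1);
                tail.insert(id.clone(), runs.len() - 1);
            }
        }
        // Join the run ending with A to the run starting with B. If A
        // isn't at the end of a run, or B isn't at the start of one, or
        // they're the same run, the link lost to more recent data: skip.
        match (tail.get(a).copied(), head.get(b).copied()) {
            (Some(ra), Some(rb)) if ra != rb => {
                let moved = std::mem::take(&mut runs[rb]);
                head.remove(b);
                tail.remove(a);
                if let Some(last) = moved.last() {
                    tail.insert(last.clone(), ra);
                }
                runs[ra].extend(moved);
            }
            _ => {} // incompatible link: ignore it and move on
        }
    }

    // Output the run containing the most recent link's endpoints.
    links
        .first()
        .and_then(|(a, _)| runs.iter().find(|r| r.iter().any(|id| id == a)))
        .cloned()
        .unwrap_or_default()
}
```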
But that plan won't work, because some feed types don't _show_ their
index ids. Example: listing your followees via accounts/NNN/following:
you get back a plain array of Account, so the only id you have for each
element is the _account_ id, but the next/prev links make it clear that
they're indices into some other kind of 'follow event' list which is
not exposed in the response.

So if I want to re-retrieve a section of one of those feeds, I have no
option but to reuse a Link header I was previously sent. So I guess we
have to do something nasty involving _also_ storing those as extra
facts? Perhaps along the lines of:

- At time t, id s was returned as part of link url U
- At time t, link url U was returned as the way to find out what came
  before/after id s

---------

OK, so internal APIs: Client::fetch_feed() is the current _public_
entry point for feed retrieval. You give it a feed id, and an enum
saying whether you want to fetch this feed for the first time (just
'make sure we have some of it at all'), or extend it into the past
(triggered from File), or into the future (triggered from a stream
update). It returns a boolean indicating whether anything new turned
up.

That has a subroutine base_feed_req(), which I doubt we need to mess
with, and decode_feed_response(). The latter is parametric, but only in
that it takes a Foo which at different call sites might want to be a
Vec or a VecDeque. It also sticks whatever useful stuff it can find
into caches.

The data retrieved by fetch_feed() goes into the Feed struct, which
currently contains a pub VecDeque, directly accessed by FileDataSource.
So perhaps step 1 is to break that dependency, making Feed::ids private
and providing a new API for things to find out what the feed currently
contains. Then, once Feed is properly opaque, we can reorganise its
internals and simultaneously replace the way Client::fetch_feed shoves
things into it.
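(Step 1 might look something like this. The field is real, but the
accessor names `from_ids` and `iter_ids` are my guesses, not existing
code, and ids are simplified to `String`s.)

```rust
use std::collections::VecDeque;

pub struct Feed {
    ids: VecDeque<String>, // was `pub`, directly poked by FileDataSource
}

impl Feed {
    // Hypothetical constructor, so callers never touch the VecDeque.
    pub fn from_ids(ids: impl IntoIterator<Item = String>) -> Self {
        Feed { ids: ids.into_iter().collect() }
    }

    // One possible replacement API: iterate the ids the feed currently
    // contains, without exposing the VecDeque itself. Once this is the
    // only access path, the internals are free to change shape.
    pub fn iter_ids(&self) -> impl Iterator<Item = &str> + '_ {
        self.ids.iter().map(|s| s.as_str())
    }
}
```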
---------

OK, so we're now kind of ready to start looking at stuff:
iter_ids_futurewards is going to have to switch to returning a mixture
of actual ids and placeholders saying "about N missing here".

It's obvious how we start filling in the end of a placeholder, if the
user viewing a File tries to scroll into it from known territory. But
what if they jump to a percentage point in the middle?

---------

2024-05-17 remaining plan:

I've done the thing where Feed stores a list of separate pred/succ
relations, and I've rebased my old cache code on to it. I think the
plan is now:

- simplify CacheMap drastically, because it won't need the expire
  method any more, and I think that also means it won't need both of
  its internal hashmaps
- give every ActivityState a method where it can make a list of all the
  'cacheable facts' its currently displayed screenful depends on
- on a periodic timer, retrieve that data from the active ActivityState
  (or both of them, if there's an overlay), and go and check whether
  all those facts are within their cache validity period. If not,
  schedule some re-fetches, and once those complete, redraw the
  display.

That way we never have to deal with the total _absence_ of data in the
cache, only with outdated data, for which our handling is 'display it
pro tem, and use that as a placeholder display while re-fetching'.

(Possibly we might also need to develop a different bottom-line busy
indicator, to make it clear that cache expiry is the cause? In fact it
might already be nice to distinguish 'processing network activity' from
'processing keystroke', hmmm. But that can be a separate thing.)
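(A possible item type for the iter_ids_futurewards change, plus the way
File could fold the gap estimates into its percentage arithmetic. All
names here are hypothetical, not existing code.)

```rust
#[derive(Debug, PartialEq)]
pub enum FeedEntry {
    Id(String),                  // a concrete id we still have
    Gap { approx_count: usize }, // 'about N things go here'
}

// Each Gap contributes its rough estimate to the total feed length, so
// percentage positions stay meaningful across expired middle sections.
pub fn estimated_len(entries: &[FeedEntry]) -> usize {
    entries
        .iter()
        .map(|e| match e {
            FeedEntry::Id(_) => 1,
            FeedEntry::Gap { approx_count } => *approx_count,
        })
        .sum()
}
```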