Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

We are building a content app using Firestore. The basic requirement is that there is one master collection, let's say 'content'. The number of documents could run into the thousands.

content1, content2, content3 ... content9999

We want to serve our users with content from this collection, making sure they don't see the same content twice, and every time they are in the app there's new content for them. At the same time, we don't want the same sequence of our content to be served to each user. Some randomisation would be good.

user1: content9, content123, content17, content33, content902 .. and so on
user2: content854, content79, content190, content567 ... and so on

I have been racking my brain over how to achieve this without duplicating the master collection. Duplicating it would do the job, but would be very expensive.

Also, how can we possibly write cost-effective and performance-optimised queries especially when we want to maintain randomisation in the sequence of these content pieces?



1 Answer

Here is my suggestion. Please view it as pseudo-code as I did not run it.

If the content document ids are not predictable

You have to store and maintain which user has seen which content, for example in one document per user, /userSeen/{uid}, as in the code below (one document per pair, /seen/uid_contentId, is another option).

There is a clever way to get a random document from a collection: pick a random offset and query for the first document at or after it. You need to know the size of the collection, which you can store as a document in another collection. Here is how you could do it:

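// Assumes the namespaced Firebase SDK (`firebase`, `firestore`) is already initialised.
const arrayUnion = (...ids) => firebase.firestore.FieldValue.arrayUnion(...ids);

const snapshot = await firestore.doc(`/userSeen/${uid}`).get(); // do it only once
const alreadySeen = (snapshot.exists && snapshot.data().contents) || [];

async function getContent(uid) {
  for (let trials = 0; trials < 10; trials++) { // limit the cost
    const startAt = Math.floor(Math.random() * contentCollectionSize);
    // Assumption: each content document stores a numeric `index` field in
    // [0, contentCollectionSize); startAt() only works together with orderBy().
    const snapshot = await firestore.collection("/contents")
      .orderBy("index").startAt(startAt).limit(1).get();
    const document = snapshot.empty ? null : snapshot.docs[0]; // a random content

    if (document && !alreadySeen.includes(document.id)) {
      alreadySeen.push(document.id);
      // merge: true keeps the ids already stored in the document
      await firestore.doc(`/userSeen/${uid}`)
        .set({ contents: arrayUnion(document.id) }, { merge: true }); // mark it as seen
      return document;
    }
  }

  return null; // all 10 trials hit content that was already seen
}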

Here you may have to make several queries to Firestore (capped at 10 to limit the cost), because the content document ids cannot be computed on the client side.
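
To get a feel for how often the cap of 10 trials comes up empty, here is a rough sketch. It assumes every trial picks a content uniformly at random and independently of the others, which the query above only approximates. `missProbability` is a name made up for this illustration, not part of any SDK.

```javascript
// Probability that every one of `trials` uniform random picks lands on
// content the user has already seen, so getContent() would return null.
function missProbability(seenCount, collectionSize, trials = 10) {
  if (collectionSize <= 0) return 1; // nothing to serve at all
  const seenFraction = Math.min(seenCount / collectionSize, 1);
  return Math.pow(seenFraction, trials);
}

console.log(missProbability(5000, 10000)); // user has seen half the collection
console.log(missProbability(9000, 10000)); // user has seen 90% of it
```

With half the collection seen, the chance of coming up empty is about 0.1%. With 90% seen it is roughly 35%, so heavy users would hit the cap often and pay for 10 reads each time.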

If the content document ids follow a simple pattern: 1, 2, 3, ...

To reduce cost and improve performance, you should store all the seen contents for each user in a single document. The limit is 1 MiB per document, and Firestore counts 8 bytes per integer, so that leaves room for roughly 130,000 integers. Then you download this document once per user, and check on the client side whether a random content was already seen.

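// Assumes the namespaced Firebase SDK (`firebase`, `firestore`) is already initialised.
const arrayUnion = (...ids) => firebase.firestore.FieldValue.arrayUnion(...ids);

const snapshot = await firestore.doc(`/userSeen/${uid}`).get(); // do it only once
const alreadySeen = (snapshot.exists && snapshot.data().contents) || [];

async function getContent(uid) {
  // Random starting point; ids are assumed to be the integers 1..contentCollectionSize
  let idx = Math.floor(Math.random() * contentCollectionSize);

  for (let trials = 0; trials < contentCollectionSize; trials++) {
    idx = (idx % contentCollectionSize) + 1; // next id, wrapping back to 1 after the last one

    if (alreadySeen.includes(idx)) continue; // this shortcut reduces the number of Firestore queries

    const document = await firestore.doc(`/contents/${idx}`).get();

    if (document.exists) {
      alreadySeen.push(idx);
      // merge: true keeps the ids already stored in the document
      await firestore.doc(`/userSeen/${uid}`)
        .set({ contents: arrayUnion(idx) }, { merge: true }); // mark it as seen
      return document;
    }
  }

  return null; // every content has been seen
}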

As you can see, this is much cheaper if you use predictable document ids for your content. But perhaps someone will have a better idea.
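
One possible refinement of the predictable-id approach, offered as a sketch rather than a tested solution: the loop above walks forward from a random start, so an id that follows a long run of seen ids gets picked more often than the others. If the ids really are the integers 1 to contentCollectionSize, you could instead pick uniformly among the unseen ids on the client before touching Firestore. `pickUnseenId` and its `rng` parameter are hypothetical names, and the function does not check that the document still exists.

```javascript
// Pick an id uniformly at random among those in 1..collectionSize that are
// not in `alreadySeen`. Returns null when everything has been seen.
// `rng` is injectable for testing and defaults to Math.random.
function pickUnseenId(collectionSize, alreadySeen, rng = Math.random) {
  const seen = new Set(alreadySeen); // O(1) membership checks
  const unseen = [];
  for (let id = 1; id <= collectionSize; id++) {
    if (!seen.has(id)) unseen.push(id);
  }
  if (unseen.length === 0) return null;
  return unseen[Math.floor(rng() * unseen.length)];
}

const id = pickUnseenId(5, [1, 2, 4]);
console.log(id); // either 3 or 5
```

You would then read `/contents/${id}` once and mark it as seen as before. Building the list costs time proportional to the collection size on every call, which should be fine for a few thousand documents.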

