
Suggestions Feed Spec, initial draft

Open Agate requested to merge suggestions-spec into master
2 unresolved threads
1 file
+ 388
0
# Retribute Suggestions Feed Format Specification
**This document is a draft**
## Abstract
Retribute is an effort to ease the act of supporting and financing creators. We plan to do so
by providing standard, open, simple and easy-to-use tools to creators and contributors, to automate
the most tedious tasks: finding who to support, where, how, how much, etc.
Retribute needs to be compatible with the widest possible range of existing applications and services. This is a strict requirement, as the project can only get traction and adoption if it's widely supported.
To ease the adoption process and ensure a consistent and stable ecosystem, Retribute involves a set of lightweight standards. The first one is the Retribute Suggestions Feed Format, described in the current document.
The goal of this specification is to describe the content, structure and format of the data Retribute clients need to pull from content platforms for the purpose of displaying meaningful suggestions to potential contributors.
## Glossary
**Creator**
A person or organization providing content, a piece of work or a service to an audience.
Examples:
- a blog writer
- a music band
- a person posting nude pictures on social media
- a game developer
- a newspaper
- an organization developing free and open-source software
- an individual operating and maintaining online services for their community
**Content platform**
A place where creators publish their content or provide their service. Those platforms can be web services maintained by third parties, such as video or audio hosting platforms (YouTube, Vimeo, SoundCloud, PeerTube, Funkwhale), but they can also be operated by the creators themselves (Mastodon servers, personal blogs or websites, etc.).
**Contributor**
Contributors are typically members of the audience of a creator who want to support their work. They
usually enjoy the provided content or service through the content platform, but this is not necessarily the case.
**Contribution platforms**
Places advertised by creators to receive donations and other support expressions from their community.
Those platforms can be operated by a third party (e.g. a PayPal, Patreon, Liberapay or bank account) but also by
the creators themselves (e.g. a cryptocurrency wallet, a postal address to receive checks, etc.).
**Retribute client**
Retribute clients are operated and used by contributors to automate the work of sending contributions to important creators in their life.
To do so, those clients provide suggestions, based on user configuration and historical data gathered from the content platforms where contributors are active.
## Design principles
The current document describes what kind of data a content platform sends to the client, how that data is structured and how it is transmitted.
**Simplicity**
Because this data is intended to be produced by a wide range of content platforms, it must be easy to read and understand, as well as simple and low-footprint to produce.
**Privacy**
The data itself can contain sensitive historical information about a contributor's activity. To mitigate the risk of leaks, we put an emphasis on exposing only the data that is strictly required by Retribute clients, and nothing more.
We also recommend using Object Capability URLs to expose the data as a means to enforce authorized access (see below).
**Flexibility**
The data is used to provide suggestions to contributors. As there is no "one-size-fits-all" solution, special care is given to supporting fine-grained customization of the suggestions by Retribute clients and contributors.
## Simple use-case
Alice wants to start using Retribute to support the network of creators she follows on the Fediverse.
She installs a Retribute client on her phone. The client asks her to subscribe to her first Retribute Feed. Since she's using Mastodon to connect to the Fediverse, the client redirects her to her Mastodon settings page, where she finds a Retribute Suggestions Feed URL.
She copies and pastes this URL into the client and clicks submit. Immediately after that, the client fetches the suggestions from the feed and starts to display a list of suggestions based on Alice's latest comments, follows, boosts and favorites on the network over the past month.
## Advanced use-case
After a few weeks, Alice wants to extend her support to other creators. She listens to a lot of podcasts on her phone, and wants to help make those podcasts sustainable for the time being.
Since her podcasting app tracks her activity and has built-in Retribute support, it's simple. In her podcasting app settings, she enables Retribute support. On a weekly basis, the podcast app will now export a Retribute Suggestions Feed to her phone's SD card.
Then, in her Retribute client, she selects the path to the Suggestions Feed exported by the podcasting app. Immediately afterwards, the client refreshes to include the suggestions from this new feed.
## Data format
A Retribute Suggestions Feed is a JSON document. A complete example document is shown below; the role of each part is then described in detail:
```javascript
{
  "version": "0.1",
  "platform": {
    "url": "https://alice.server.example",
    "name": "Mastodon - alice.server.example",
    "icon": "data:image/png;base64,iVBORw0KGgoAAQAAAAUCAMAAA"
  },
  "account": {
    "url": "https://alice.server.example/@alice",
    "name": "Alice in fedi",
    "avatar": "https://alice.server.example/@alice/avatar.png"
  },
  "suggestions": {
    • Does the fact that this property is an object and not an array mean that each week must have its own feed? We can't have a feed for another period of time? It could be useful to have the possibility to fetch all previous activities at once when connecting a new account (but it can maybe still be done by fetching all the previous weekly feeds?).

      • Author Owner

        Does the fact that this property is an object and not an array mean that each week must have its own feed?

        That was the initial design, but we could also imagine nesting multiple weekly buckets in the same document, like this:

        
        {
          "suggestions": [
            {"week": "2019-W24", "creators": []},
            {"week": "2019-W23", "creators": []}
          ]
        }

        Another option, which I'd tend to prefer, would be to add a property linking to the previous week document, if any. This way, clients could fetch the data they need, and the format itself would remain as simple as possible:

        {
          "version": "0.1",
          "previousSuggestions": "https://alice.server.example/retribute?token=hello&week=2019-W23",
          
        }

        What do you think?

      • My concern was mostly about the first time you connect this platform to your client, it should be able to get all your suggestions without making hundreds of requests to the platform… so maybe there could be two different endpoints? One for the first time connection only, with all previous weeks, and then one per week?

        Otherwise, I don't really like having suggestions as an array either, especially if it will contain only one item most of the time, but in my opinion previousSuggestions wouldn't really fix the issue.

      • Author Owner

        My concern was mostly about the first time you connect this platform to your client, it should be able to get all your suggestions without making hundreds of requests to the platform… so maybe there could be two different endpoints? One for the first time connection only, with all previous weeks, and then one per week?

        Is it really useful from a client/user perspective to fetch 100s of weeks of suggestions anyway? I'm genuinely asking, because my impression was that users would want suggestions based on their recent activity.

        If you add a suggestion feed to your Retribute client, you probably don't want suggestions regarding your activity from 2 years ago popping up?

    "week": "2019-W24",
    "activities": [
      {
        "id": "favorites",
        "name": "Favorites",
        "suggestedWeight": 1
      },
      {
        "id": "shares",
        "name": "Boosts",
        "suggestedWeight": 2
      },
      {
        "id": "replies",
        "name": "Replies",
        "suggestedWeight": 5
      }
    ],
    "creators": [
      {
        "ids": [
          "activitypub:https://fediverse.social/users/bob",
          "keybase:bob"
        ],
        "activitiesDetail": {
          "favorites": 3,
          "shares": 2,
          "replies": 9
        },
        "url": "https://fediverse.social/@bob",
        "avatar": "https://fediverse.social/@bob/avatar.jpg",
        "name": "I'm Bob"
      },
      {
        "ids": [
          "activitypub:https://fediverse.social/users/alyssa",
          "twitter:alyssa"
        ],
        "activitiesDetail": {
          "favorites": 5,
          "shares": 0,
          "replies": 25
        },
        "url": "https://fediverse.social/@alyssa",
        "avatar": "https://fediverse.social/@alyssa/avatar.jpg",
        "name": "Alyssa"
      }
    ]
  }
}
```
This document contains all the information a typical Retribute client needs to function and provide meaningful suggestions.
Now, let's dive into the details of the structure.
### `"version"`
```javascript
"version": "0.1",
```
This property indicates the version of the Retribute Suggestions Feed Format in use for the payload. We're using `0.1` in the draft, but this will be updated to `1.0` in the first official version, and incremented with future improvements.
### `"platform"`
```javascript
"platform": {
  "url": "https://alice.server.example",
  "name": "Mastodon - alice.server.example",
  "icon": "data:image/png;base64,iVBORw0KGgoAAQAAAAUCAMAAA"
},
```
The `platform` property refers to the content platform that generated the feed. It can be leveraged by clients to indicate the origin of the suggestions and for illustration purposes in the UI.
In this example, the `icon` property is a base64-encoded image embedded as a `data:` URI, typically an application logo, but it could also be an HTTP URL.
The `url` and `icon` properties are optional.
### `"account"` (optional)
```javascript
"account": {
  "url": "https://alice.server.example/@alice",
  "name": "Alice in fedi",
  "avatar": "https://alice.server.example/@alice/avatar.png"
},
```
The `account` property is similar to `platform` but refers to the contributor's account on the content platform, if any.
The `url` and `avatar` properties are optional.
### `"suggestions"`
Now, because the `suggestions` property holds the most important data, is bigger than the previous properties and less self-explanatory, we'll describe it in much more detail.
#### `"week"`
```javascript
"suggestions": {
  "week": "2019-W24",
  ...
}
```
The `week` property indicates the time range of the suggestions.
To ensure consistency and avoid client-side issues when aggregating suggestions from various content platforms, content platforms MUST use ISO week notation, and only generate feeds for weeks that are complete.
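As a rough illustration of the rule above, a client or platform could check a `week` value along these lines. This is only a sketch: the `YYYY-Www` pattern is inferred from the `2019-W24` example, the function names are hypothetical, and the notion of "today" (and therefore the time zone used to decide that a week is over) is an assumption the specification does not settle.

```python
import re
from datetime import date, timedelta

# Pattern inferred from the "2019-W24" example in the spec (assumption).
ISO_WEEK_RE = re.compile(r"^(\d{4})-W(\d{2})$")


def parse_iso_week(week: str) -> tuple[date, date]:
    """Return the (Monday, Sunday) dates covered by an ISO week string."""
    match = ISO_WEEK_RE.match(week)
    if match is None:
        raise ValueError(f"not an ISO week: {week!r}")
    year, number = int(match.group(1)), int(match.group(2))
    # date.fromisocalendar raises ValueError for week numbers that do not
    # exist in the given year.
    monday = date.fromisocalendar(year, number, 1)
    return monday, monday + timedelta(days=6)


def is_complete(week: str, today: date) -> bool:
    """Treat a week as complete once its Sunday is strictly in the past."""
    _, sunday = parse_iso_week(week)
    return sunday < today
```

With this sketch, `2019-W24` covers Monday 10 June to Sunday 16 June 2019, so a feed for that week would only be generated from 17 June onwards.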
#### `"activities"`
```javascript
"suggestions": {
  ...
  "activities": [
    {
      "id": "favorites",
      "name": "Favorites",
      "suggestedWeight": 1
    },
    {
      "id": "shares",
      "name": "Boosts",
      "suggestedWeight": 2
    },
    {
      "id": "replies",
      "name": "Replies",
      "suggestedWeight": 5
    }
  ],
  ...
}
```
The `activities` property describes the kinds of interactions and activities used to compute the list of suggestions. Each object in the `activities` array MUST contain at least an `id` property that is unique across all the declared `activities`.
The `name` property is used for display purposes in clients.
    • Internationalization should be taken into account here: how do clients tell other apps that they need this string in Portuguese, for example? (because I don't think clients can do the translation themselves, these strings are specific to each app).

      • Author Owner

        Nice catch, let's leave this open and come up with a solution

      • Author Owner

        So, the more I think about this, the harder it gets :D

        A basic implementation to support i18n could be to provide an object instead of the string for the name property, with keys being locales, and values being the translation for this locale:

        {
          "id": "replies",
          "name": {
            "en_US": "Replies",
            "fr_FR": "Réponses"
          },
          "suggestedWeight": 5,
        }

        But it means that Content Platforms are ultimately responsible for something that is only displayed in Retribute clients. And as soon as we want to support more complex i18n, it breaks (because of pluralization, for instance).

        Maybe we can solve this differently, and keep the Suggestion Feed format simple, as it is right now. If we defined a list of standard activities that clients should support, i18n wouldn't be a problem because clients could translate those for their target audiences.

        Standardizing the list of activities (while leaving some room for extensions) would also bring other benefits, such as more consistency between content platforms, the possibility to build upon this list to improve the clients UI, etc.

        What do you think @Ana?

      • I don't really like the idea of having a restricted set of activities, mostly because I don't really see how it could be made extensible without fragmentation (some clients may not be aware of certain activity types, because they are not standard, and thus they don't know how to translate them). So I think I18N should definitely be handled on the side of content platforms.

        Maybe the name could look like this:

        {
          "name": {
            "n < 2": "Réponse",
            "_": "Réponses",
          }
        }

        The keys would be boolean expressions, with only one variable, n (the number of items), and a default case that must be called _ (or something else, I wrote it this way because that's what I usually use for default cases). Also, you can notice that I only provided translations for one language, because I thought the content platform could use the Accept-Language header to give the correct set of translations directly.

        But maybe that's too much? Maybe we could do like gettext usually does, with name being an array, of which the first item would be the "singular" string, and the following ones the different plural forms? What do you think?

      • Author Owner

        I don't really like the idea of having a restricted set of activities

        Me neither to be honest. But it's probably an improvement to standardize some of them (the most common), since it will reduce fragmentation? Like we have "Audio" and "Video" in ActivityPub, for instance :)

        So I think I18N should definitely be handled on the side of content platforms.

        I tend to disagree here. IMHO, clients should be responsible for the I18N part, because:

        1. It's way simpler to handle this in a handful of clients than on literally every content platform out there (there will be way more content platforms than clients). Some content platforms may not even have any I18N mechanism in place, and be entirely monolingual.
        2. It means that every single suggestion feed will be bloated with I18N data, for potentially dozens of languages
        3. The Accept-Language header could help a bit, but it doesn't work for all the use cases (a podcast app generating a static suggestion feed file)
        4. The content to translate is small and displayed on client-side only.

        It could also lead to some weird client behaviour, e.g. if two content platforms use the same type but with different translations.

Because Retribute applies to a potentially very diverse range of applications and services, the standard doesn't define a strict set of allowed activities. Each content platform is therefore free to use any identifier and name it needs.
The `suggestedWeight` property is a positive integer that indicates the importance of the type of interaction compared to others in the list. This information is used by clients to build a final list of suggestions.
The higher the value, the higher creators involved in this type of interaction will rank in the final suggestions list. Content platforms SHOULD define those weights according to their own user behaviour and use cases.
Clients MAY use this property in their ranking algorithm, assuming a weight of `1` if the property is missing. If they do so, clients MUST also provide a way for users to customize the `suggestedWeight` according to their needs.
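The specification does not prescribe a ranking algorithm, so the following is only one possible sketch: a plain weighted sum of the interaction counts. The function name `rank_creators`, the `user_weights` parameter (standing in for the user customization required above) and the use of the first id as a label are all assumptions made for illustration.

```python
def rank_creators(suggestions, user_weights=None):
    """Return (first id, score) pairs sorted from highest to lowest score."""
    user_weights = user_weights or {}
    weights = {}
    for activity in suggestions["activities"]:
        # Priority: user override, then the platform's suggestedWeight,
        # then the default weight of 1 when the property is missing.
        default = activity.get("suggestedWeight", 1)
        weights[activity["id"]] = user_weights.get(activity["id"], default)

    ranked = []
    for creator in suggestions["creators"]:
        # Weighted sum of interaction counts (an assumed scoring rule).
        score = sum(
            weights.get(activity_id, 1) * count
            for activity_id, count in creator["activitiesDetail"].items()
        )
        ranked.append((creator["ids"][0], score))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Trimmed-down version of the example feed, keeping only the fields used here.
example = {
    "activities": [
        {"id": "favorites", "suggestedWeight": 1},
        {"id": "shares", "suggestedWeight": 2},
        {"id": "replies", "suggestedWeight": 5},
    ],
    "creators": [
        {"ids": ["keybase:bob"],
         "activitiesDetail": {"favorites": 3, "shares": 2, "replies": 9}},
        {"ids": ["twitter:alyssa"],
         "activitiesDetail": {"favorites": 5, "shares": 0, "replies": 25}},
    ],
}

print(rank_creators(example))
# [('twitter:alyssa', 130), ('keybase:bob', 52)]
print(rank_creators(example, user_weights={"replies": 0}))
# [('keybase:bob', 7), ('twitter:alyssa', 5)]
```

The second call shows why user overrides matter: a contributor who does not consider replies a meaningful signal ends up with a different ordering.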
#### `"creators"`
```javascript
"suggestions": {
  ...
  "creators": [
    {
      "ids": [
        "activitypub:https://fediverse.social/users/bob",
        "keybase:bob"
      ],
      "url": "https://fediverse.social/@bob",
      "avatar": "https://fediverse.social/@bob/avatar.jpg",
      "name": "I'm Bob",
      "activitiesDetail": {
        "favorites": 3,
        "shares": 2,
        "replies": 9
      }
    },
    {
      "ids": [
        "activitypub:https://fediverse.social/users/alyssa",
        "twitter:alyssa"
      ],
      "url": "https://fediverse.social/@alyssa",
      "avatar": "https://fediverse.social/@alyssa/avatar.jpg",
      "name": "Alyssa",
      "activitiesDetail": {
        "favorites": 5,
        "shares": 0,
        "replies": 25
      }
    }
  ]
  ...
}
```
The `creators` property contains a list of creator suggestions, compiled by the content platform from the `activities` described above that involve both the contributor and the creator.
**`"ids"`**
```javascript
"ids": [
"activitypub:https://fediverse.social/users/bob",
"keybase:bob"
],
```
`ids` is a list of identifiers that the Retribute client can use to gather data about the creator, and about their contribution platforms in particular. The structure of ids and allowed values are to be documented separately, but an id follows this structure: `<id_type>:<id>`
Therefore, `activitypub:https://fediverse.social/users/bob` indicates that the ActivityPub identifier of Bob is `https://fediverse.social/users/bob`, and `keybase:bob` that their Keybase identifier is `bob`.
Using this information, a Retribute client can crawl the corresponding profile and extract links to contribution platforms. Content platforms MUST include at least one id per creator suggestion.
The content platform MUST only include ids that are proven to belong to the creator. **Using unverified user input as an identifier is strictly discouraged as it can lead to contribution hijacking**. The verification process is left undocumented but can use existing methods such as OAuth, rel-me or any other method that can attest that the creator is responsible for the other identity.
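Since the allowed id types are to be documented separately, a client can only rely on the `<id_type>:<id>` shape for now. A minimal parsing sketch, with the hypothetical names `CreatorId` and `parse_creator_id`, could look like this. The one subtlety is that the id part may itself contain colons (as in a URL), so the split has to happen on the first colon only.

```python
from typing import NamedTuple


class CreatorId(NamedTuple):
    id_type: str
    value: str


def parse_creator_id(raw: str) -> CreatorId:
    """Split '<id_type>:<id>' into its two parts."""
    # partition() splits on the first colon only, which keeps URLs intact.
    id_type, separator, value = raw.partition(":")
    if not separator or not id_type or not value:
        raise ValueError(f"expected '<id_type>:<id>', got {raw!r}")
    return CreatorId(id_type, value)


print(parse_creator_id("activitypub:https://fediverse.social/users/bob"))
# CreatorId(id_type='activitypub', value='https://fediverse.social/users/bob')
```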
**`"activitiesDetail"`**
```javascript
"activitiesDetail": {
  "favorites": 3,
  "shares": 2,
  "replies": 9
}
```
The `activitiesDetail` property is an object. It MUST contain at least one key, and the only allowed keys are the ones used as `id` in the `suggestions.activities` objects. Allowed values are `0` or any positive integer or float.
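These constraints are simple enough to check mechanically. The sketch below assumes that the restriction to declared activity ids applies to the object's keys and that the numeric restriction applies to its values. The function name and the error messages are illustrative only.

```python
def validate_activities_detail(detail, activities):
    """Return a list of human-readable problems; an empty list means valid."""
    if not isinstance(detail, dict) or not detail:
        return ["activitiesDetail must be an object with at least one key"]
    declared = {activity["id"] for activity in activities}
    problems = []
    for key, value in detail.items():
        if key not in declared:
            problems.append(f"unknown activity id: {key!r}")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{key!r}: value must be a number")
        elif value < 0:
            problems.append(f"{key!r}: value must be 0 or positive")
    return problems
```

Returning a list of problems rather than raising on the first one lets a client report everything that is wrong with a feed in one pass.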
The property details the kinds of activities and interactions that occurred between the contributor and the creator (or their content). In this example, the feed represents Alice's activity, thus the `activitiesDetail` property in Bob's suggestions indicates that:
- She favorited Bob's posts 3 times
- She shared Bob's posts 2 times
- She replied to Bob 9 times
Combined with each activity's `suggestedWeight`, clients can use this data to power their ranking algorithm and let the user understand the rationale behind the suggestions.
**`"url"` (optional)**
```javascript
"url": "https://fediverse.social/@bob",
```
This property is used for information and display purposes by clients.
**`"avatar"` (optional)**
```javascript
"avatar": "https://fediverse.social/@bob/avatar.jpg",
```
This property is used for information and display purposes by clients.
**`"name"` (optional)**
```javascript
"name": "I'm Bob",
```
This property is used for information and display purposes by clients.
## Document retrieval
The JSON document described in the previous section can be retrieved by clients using different means.
Content platforms or software generating Suggestions Feeds MUST make the resulting JSON available using at least one of the following methods. Clients MUST support all the methods described below.
### HTTP GET
In this retrieval method, the content platform makes the document available at a given URL. The contributor copies and pastes the URL into their Retribute client, and the client then makes a weekly HTTP GET request to fetch the document from that URL.
Content platforms SHOULD use an Object Capability URL to expose the JSON document, to prevent unauthorized access.
In this scenario, the Suggestions Feed is similar to an RSS or Atom feed that is pulled on a regular basis by the client.
### Filesystem
In this retrieval method, the content platform or the software generating the feed can't make the feed available using HTTP. This is typically the case if the feed is generated by an offline app, like in our podcasting example.
The content platform generates the JSON document and exports it in a file at a stable path, that is readable by the client. The contributor can then point the client to this path, and the client can process the contents of the file on a regular basis.
This method requires that the Retribute client and content platform share the same filesystem.
### Copy-pasting
In this retrieval method, the content platform or the software generating the feed makes the JSON payload available to the user directly for copy-pasting (or via one of the methods above).
The contributor copies and pastes the JSON document directly into the Retribute client.
This method is impractical except for a very narrow set of use cases, like debugging or non-recurring activity.
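Because clients have to support all three methods, a single entry point that accepts a URL, a file path or pasted JSON may be convenient. The sketch below uses only the Python standard library. The function name `load_feed`, the way the three cases are told apart and the minimal "has a `suggestions` key" check are assumptions, not part of the specification.

```python
import json
from pathlib import Path
from urllib.request import urlopen


def load_feed(source: str, timeout: float = 10.0) -> dict:
    """Load a feed from an HTTP(S) URL, a file path or pasted JSON text."""
    text = source.strip()
    if text.startswith(("http://", "https://")):
        # HTTP GET: fetch the document from the (capability) URL.
        with urlopen(text, timeout=timeout) as response:
            raw = response.read().decode("utf-8")
    elif text.startswith("{"):
        # Copy-pasting: the source is the JSON document itself.
        raw = text
    else:
        # Filesystem: the source is a path readable by the client.
        raw = Path(text).read_text(encoding="utf-8")
    feed = json.loads(raw)
    if not isinstance(feed, dict) or "suggestions" not in feed:
        raise ValueError("not a Retribute Suggestions Feed document")
    return feed
```

A client would typically call `load_feed` once when the contributor adds the feed, then again on its weekly schedule for the URL and file cases.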