Shutting Down Nitter: A Free Twitter Alternative

The last public Nitter instances have gone dark. The Nitter project developed a free frontend for accessing X.com/Twitter without imposing JavaScript, analytics, trackers, or third-party services. On January 31, X stopped issuing the tokens Nitter used to access content on X.com, and on February 26 the last of the previously issued tokens expired, bringing Nitter to a complete stop.

After being purchased by Elon Musk, Twitter (now renamed X) began implementing technical and organizational measures aimed at aggressively monetizing a platform previously considered unprofitable. Among the changes: limits were placed on how much content each account can read per day (10,000 posts for holders of a paid “blue tick”, 1,000 for regular accounts, 500 for new regular ones); “developer” accounts with limits suitable for mass data extraction (scraping) were moved to paid tiers; and the delivery of content to users without accounts was stopped.


Publicly (2023-07-01), the changes were excused as “temporary emergency measures”, prompted by automated data loading by bots degrading the service for ordinary users. Before that (2023-04-19) there were accusations against Microsoft, claiming the company was illegally using Twitter data to train AI. Later (2023-11-17) the limits were justified by Musk's promised fight against bots.

Nitter was a software project that protected read-only Twitter users – those who post nothing and only read content – from surveillance, by providing an alternative site for viewing Twitter that required neither an account nor enabled JavaScript. Such software is effectively a scraper and intermediary which, instead of storing data in a database, forwards it to the end user (though some service data is cached in Redis).

Thus the Nitter software:

  1. was, technically, exactly the type of software that Twitter's management had declared an active fight against;
  2. was one of the few actively developed tools for accessing data posted on Twitter, which made it attractive as a module for scraping in the narrower sense of the word – collecting data while bypassing the official interfaces;
  3. saw its public instances become targets of scraping themselves, which led some instances to implement their own variant of a CAPTCHA (one additional POST request specific to a particular instance).

An analysis of workarounds for continuing to operate under the new conditions turned up RSS feeds and some endpoints on syndication.twitter.com that provided information to unregistered users in JSON format and were used for integration with other social networks. For some time Nitter received information through these interfaces, but then they too were closed. After that, a way was found to use “guest accounts” that had read privileges. One type of “guest account” was intended for use on Internet of Things devices with limited browsers.


But Nitter used a different type of “guest account” that used OAuth instead of cookies, was registered via an API, and was apparently intended for the Android app. This type of account has a limit of 500 API requests per 15 minutes, and its “registration” is tied to an IP address (one “guest account” can be registered from a given IP per 24 hours, but an already registered “account” can be used from other IP addresses).
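The 500-requests-per-15-minutes budget per token is exactly the kind of constraint a client has to account for. As a rough illustration (not Nitter's actual code), a client holding several guest tokens might rotate them with a sliding-window counter; everything below, including the token strings, is hypothetical:

```python
import time
from collections import deque

WINDOW = 15 * 60  # the 15-minute rate-limit window, in seconds
LIMIT = 500       # assumed request budget per token per window

class GuestTokenPool:
    """Rotates guest tokens so no token exceeds its per-window budget."""

    def __init__(self, tokens):
        # For each token, keep timestamps of its requests in the current window.
        self._usage = {t: deque() for t in tokens}

    def acquire(self):
        """Return a token with remaining budget, or None if all are exhausted."""
        now = time.monotonic()
        for token, stamps in self._usage.items():
            # Evict timestamps that have slid out of the 15-minute window.
            while stamps and now - stamps[0] > WINDOW:
                stamps.popleft()
            if len(stamps) < LIMIT:
                stamps.append(now)
                return token
        return None  # caller must wait for a window to free up

pool = GuestTokenPool(["guest-token-a", "guest-token-b"])
token = pool.acquire()  # use `token` to authorize the next API request
```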

Such “accounts” (access tokens) remained valid for 30 days. At that point, an adequate solution to the problem of mass-registering temporary accounts could have been crowdsourcing their registration among users, using something similar to Bibliogram's approach (a user script that takes the guest token from the user and transfers it to the public instance).

At the end of January, X stopped issuing such tokens. The removal of this last access method put an end to Nitter as a public, free, multi-user service, and the author declared Nitter dead.

Some instances shut down immediately after this; others modified the code to strictly ration the remaining tokens, using them primarily for fetching lists of tweets from accounts and returning error messages for everything else. On February 26, the last guest tokens expired, causing all public instances to stop functioning. However, the bug tracker still hosts discussions that touch on guest accounts in one way or another.

One radical solution to the problem could be replacing Twitter with an alternative decentralized service based on ActivityPub and IPFS, where the main identifier of each message is its IPFS CID. One can imagine the following multi-level structure (a data-model sketch follows the list):

  1. Data originally published to the federated service as the primary platform and mirrored to IPFS.
  2. Data published on Twitter by the users themselves, but mirrored by a browser extension to their accounts on the federated platform, and from there to IPFS.
  3. Data that users exported from Twitter themselves using the export function and then uploaded to the Fediverse + IPFS using a bulk upload function.
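A speculative sketch of what a record in such a system might look like, with the three ingestion paths above as its provenance; every name here is invented for illustration:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Origin(Enum):
    FEDIVERSE_NATIVE = 1  # published on the federated service first (path 1)
    BROWSER_MIRROR = 2    # mirrored live from Twitter by an extension (path 2)
    BULK_EXPORT = 3       # uploaded from a Twitter data export (path 3)

@dataclass(frozen=True)
class MirroredPost:
    ipfs_cid: str             # the primary, content-derived identifier
    activitypub_id: str       # URI of the post on the federated platform
    origin: Origin
    source_id: Optional[str]  # original centralized post ID, if mirrored
```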

These three paths, however, do not solve the problem of Twitter users who do not participate in the Twitter replacement program at all.

For each post identifier on each centralized platform, it may make sense to maintain a mapping to an IPFS CID. This mapping acts as a cache that lets you look up a post's decentralized identifier without knowing the text of the post itself, only its centralized identifier. When generating the IPFS URI (which can be done without actually publishing the content), the post text undergoes canonicalization: the data is placed in an HTML-based container with machine-readable metadata, Unicode-normalized, converted to UTF-8, runs of whitespace are replaced with single spaces, and all links to posts on this and other platforms covered by the same procedure are replaced with their IPFS URIs.
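A minimal Python sketch of such a canonicalization pipeline. The container markup and schema name are invented for illustration, and a real implementation would compute a proper IPFS CIDv1 (multicodec + multihash) over the canonical bytes; a bare SHA-256 digest stands in for it here:

```python
import hashlib
import re
import unicodedata
from typing import Callable

def canonicalize(text: str, replace_links: Callable[[str], str]) -> bytes:
    """Turn raw post text into its canonical UTF-8 byte form."""
    text = unicodedata.normalize("NFC", text)  # Unicode normalization
    text = re.sub(r"\s+", " ", text).strip()   # whitespace -> single spaces
    text = replace_links(text)                 # swap covered post links for IPFS URIs
    # Hypothetical HTML-based container with machine-readable metadata.
    container = f'<article data-schema="canonical-post-v1"><p>{text}</p></article>'
    return container.encode("utf-8")

def content_id(canonical: bytes) -> str:
    # Stand-in for deriving an IPFS CID from the canonical bytes.
    return hashlib.sha256(canonical).hexdigest()

cid = content_id(canonicalize("Hello\u00a0  world", replace_links=lambda t: t))
```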

Each platform has a machine-readable document describing its rules for canonicalizing posts, including the set of services whose links are replaced with IPFS URIs in posts on that platform. Each post is canonicalized according to the canonicalization rules of its network that were in force at the point in time to which the post itself is dated. During canonicalization, if a post contains a link to a post on one of the covered platforms, the implementation extracts the centralized identifier from the link and checks for its presence in trusted indexes.
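Since each post must be canonicalized under the rules in force on its own date, an implementation needs a versioned rule lookup. The sketch below assumes a simple effective-date table; the platform name and rule labels are placeholders:

```python
import bisect
from datetime import date

# Hypothetical rule history: (effective date, rule version) per platform.
RULE_HISTORY = {
    "twitter.com": [(date(2020, 1, 1), "rules-v1"),
                    (date(2023, 7, 1), "rules-v2")],
}

def rules_for(platform: str, posted_at: date) -> str:
    """Pick the rule version that was in force on the post's date."""
    history = RULE_HISTORY[platform]
    effective_dates = [d for d, _ in history]
    # Index of the last version effective on or before posted_at.
    i = bisect.bisect_right(effective_dates, posted_at) - 1
    if i < 0:
        raise ValueError("post predates the earliest known rule version")
    return history[i][1]

assert rules_for("twitter.com", date(2022, 5, 1)) == "rules-v1"
assert rules_for("twitter.com", date(2023, 7, 1)) == "rules-v2"
```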

If the identifier is present in an index, the implementation uses the decentralized identifier from the index. If it is absent, the implementation requests the post via the link, canonicalizes it, and generates an identifier that can then be placed in the indexes. The implementation is not obligated to publish the requested post to the decentralized network. An implementation may verify the validity of an identifier found in an index by replaying the process locally, and it is the index implementation's responsibility to verify correct generation of identifiers the same way.
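Putting the lookup and the fallback together, the resolution flow might look like the following sketch (reusing canonicalize and content_id from the earlier sketch; fetch and replace_links are caller-supplied stand-ins, not an existing API):

```python
from typing import Callable

def resolve(post_url: str,
            index: dict[str, str],
            fetch: Callable[[str], str],
            replace_links: Callable[[str], str]) -> str:
    """Index-first resolution of a centralized post link to a content ID."""
    if post_url in index:
        return index[post_url]  # trusted index hit
    text = fetch(post_url)      # otherwise, request the post via its link
    cid = content_id(canonicalize(text, replace_links))
    index[post_url] = cid       # candidate entry for the indexes
    return cid                  # publishing the content itself stays optional

def verify(post_url: str, claimed_cid: str,
           fetch: Callable[[str], str],
           replace_links: Callable[[str], str]) -> bool:
    """Replay canonicalization locally to validate an index entry."""
    return resolve(post_url, {}, fetch, replace_links) == claimed_cid
```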

This deterministic process allows immutable content links to be generated even for tweets whose authors do not yet participate in the Twitter replacement program. When some of them later upload their tweets to IPFS, the algorithm will generate identifiers identical to those already used in links to them, provided the index contains correct mappings and the content itself has not changed.

Thanks for reading.
