Introduction to the Evaluation of Subgraphs in the TheGraph Blockchain Ecosystem
This article is aimed at existing and prospective curators of TheGraph protocol. TheGraph is an indexing protocol for querying data on the Ethereum network. Thanks to the protocol, any developer can create and publish open APIs and thus make blockchain data available to users easily and cost-efficiently. These open APIs are called “subgraphs”. However, this raises some important questions:
Which subgraphs are worth being indexed and utilized by users?
And who can help to make the suitable subgraphs accessible?
Answering these questions is the job of curators. They use GRT tokens to signal which subgraphs they consider best suited to be indexed. A more detailed description of their roles and functions in the TheGraph network can be found in this blog post. As an incentive for their work, curators can earn money by signaling on subgraphs that become valuable to the network (for more information on the TheGraph economy, this blog might be helpful).
The steps needed to evaluate the quality of a subgraph are described further below. First, however, the relevant quality criteria are introduced.
Quality Criteria for the Evaluation of Subgraphs
The most important evaluation criteria for a subgraph in the TheGraph network are completeness, complexity and accuracy.
Completeness describes whether all data relevant to the intended use is covered. Is important data missing, or is it unavailable for certain times or objects? Is all raw data listed in the subgraph? Completeness can also mean that the subgraph contains additional information. The data of specific smart contracts is extracted from the Ethereum network based on a defined manifest and stored in the desired entities of the data model through a schema and its respective mapping. However, in addition to data that is retrieved directly from the blockchain, simply “copied” and brought into a readable form, data can also be aggregated, for example. Such aggregations allow the subgraph to provide insightful metrics and information.
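One of the completeness questions above, whether data is unavailable for certain times, can be checked mechanically once you have pulled some records out of a subgraph. The following is only a minimal sketch under assumptions: the function name `missing_days` and the sample dates are hypothetical, and it presumes the subgraph exposes one record per day.

```python
from datetime import date, timedelta


def missing_days(covered_days, start, end):
    """Return every day in [start, end] that has no record in covered_days."""
    covered = set(covered_days)
    total = (end - start).days + 1
    all_days = (start + timedelta(days=offset) for offset in range(total))
    return [day for day in all_days if day not in covered]


# Hypothetical days for which a subgraph returned a daily record.
days_with_data = [date(2020, 11, 1), date(2020, 11, 2), date(2020, 11, 4)]

gaps = missing_days(days_with_data, date(2020, 11, 1), date(2020, 11, 5))
for day in gaps:
    print("no data for", day.isoformat())
```

A gap found this way is not automatically an error, since there may simply have been no activity on that day, but it is a good place to start asking questions.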
The complexity of a subgraph depends largely on the smart contracts on which it is based. Some smart contracts contain only a few lines of Solidity code, while others consist of thousands of lines and rely on various additional smart contracts. To create a valuable subgraph, it is therefore essential to understand the Solidity code. But complexity can also arise in the mapping and schema from the provision of additional information, for example through aggregations or through combining several smart contracts in a single subgraph.
Accuracy is equally crucial for a valuable subgraph. If a subgraph provides data that is invalid or contains errors, it serves no purpose and can even cause damage through misinformation, or at least confusion. Errors can occur if incorrect values are assigned in the mapping or if wrong calculations and groupings are performed during aggregation.
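Aggregation errors of this kind can often be spotted by recomputing the aggregate yourself from the raw records and comparing it with what the subgraph reports. The sketch below illustrates the idea only: the field names `day` and `amount`, the helper names and all numbers are hypothetical and do not come from any real subgraph.

```python
from collections import defaultdict
from decimal import Decimal


def recompute_daily_volume(raw_swaps):
    """Sum the raw swap amounts per day, independently of the subgraph."""
    totals = defaultdict(Decimal)  # Decimal() starts every day at 0
    for swap in raw_swaps:
        totals[swap["day"]] += Decimal(swap["amount"])
    return dict(totals)


def find_mismatches(raw_swaps, reported_totals):
    """Return {day: (expected, reported)} for each day where the values differ."""
    expected = recompute_daily_volume(raw_swaps)
    mismatches = {}
    for day in sorted(set(expected) | set(reported_totals)):
        exp = expected.get(day, Decimal(0))
        rep = Decimal(reported_totals.get(day, "0"))
        if exp != rep:
            mismatches[day] = (exp, rep)
    return mismatches


# Hypothetical raw events and the daily totals an aggregated entity reports.
raw_swaps = [
    {"day": "2020-11-01", "amount": "10.5"},
    {"day": "2020-11-01", "amount": "4.5"},
    {"day": "2020-11-02", "amount": "7"},
]
reported = {"2020-11-01": "15.0", "2020-11-02": "8"}

for day, (exp, rep) in find_mismatches(raw_swaps, reported).items():
    print(f"{day}: expected {exp}, subgraph reports {rep}")
```

Amounts are handled as `Decimal` strings rather than floats so that rounding in the check itself does not produce false alarms.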
How to curate Subgraphs in TheGraph
As a curator, it is essential to ask yourself: “Which data could be of interest to the users of this subgraph?”, as well as “How are these data points to be interpreted?”.
The curators’ duty is to bring as much valuable data as possible into the TheGraph network and to organize it. They achieve this by signaling on valuable subgraphs and thus making them more visible. Further criteria for the evaluation are:
- Is important/relevant data missing?
- Is the data accurate?
- Which improvements must be made to the subgraph to make it better?
- Is the subgraph easy to comprehend?
- Can added value be generated by additional subgraphs?
For example, the Uniswap subgraph serves the purpose of visualizing historical data and allowing analysis. It is being used for exactly that purpose at info.uniswap.org. The subgraph is both complex and detailed. For other purposes around Uniswap, however, much of its data may be of little interest. In that case, one could instead create a slimmed-down subgraph that covers only the price data of the tokens. This would be simpler, easier to understand and less prone to errors.
To evaluate a subgraph, the following few steps are required:
- First of all, search for the subgraph you want to evaluate in the TheGraph Explorer.
- Subsequently, you can view the different entities in the subgraph’s playground and test simple queries. Oftentimes a few sample queries are already available. This allows you to get a feeling for the data, its potential and its limitations. Furthermore, you can find out which entities are available and which attributes they have.
- If there is a link to the GitHub repository, this is where you can get a deeper insight. Oftentimes a readme file is available, which provides more information about the subgraph (see, for example, the Async Art subgraph on GitHub).
- An important resource for understanding the functionality of a subgraph is the subgraph manifest. This file specifies which functions or events from which smart contracts (Ethereum addresses) should be included in the subgraph. In other words, it describes the raw data on the Ethereum blockchain that the subgraph is meant to obtain. Oftentimes the manifest tells you at first glance which events will be collected.
- The GraphQL schema is the description of the data model of the subgraph. Ultimately, the data should be available in the form indicated in the schema. This file is easy to comprehend and requires no prior programming knowledge. It describes the different entities and their respective attributes, so you can already get an idea of the data scope of the subgraph.
- For those familiar with TypeScript or programming in general, it is useful to have a look at the mapping (factory.ts, mapping.ts). The mapping describes how the raw data specified in the subgraph manifest is transformed into the form described in the schema. Here the data is assigned to the entities. The mapping can be complex, depending on the individual smart contract. It also defines aggregations, if there are any. A look into the mapping of a subgraph is worthwhile to get a deeper understanding.
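The schema step above lends itself to a quick overview script. The following is a rough sketch, not a real GraphQL parser: the schema text, the entity names and the helper `list_entities` are hypothetical, and the regular expressions only handle simple `type ... @entity { ... }` blocks without nested braces.

```python
import re

# Hypothetical schema in the style of a subgraph's GraphQL schema file.
SCHEMA = """
type Token @entity {
  id: ID!
  symbol: String!
  totalSupply: BigInt!
}

type Transfer @entity {
  id: ID!
  token: Token!
  amount: BigInt!
}
"""

# Matches "type Name ... @entity ... { body }" and captures name and body.
ENTITY_PATTERN = re.compile(r"\btype\s+(\w+)[^{]*@entity[^{]*\{([^}]*)\}")
# Matches "fieldName: FieldType" at the start of a line inside a body.
FIELD_PATTERN = re.compile(r"^\s*(\w+)\s*:\s*([^\s#]+)", re.MULTILINE)


def list_entities(schema_text):
    """Map each entity name to a {field name: field type} dictionary."""
    entities = {}
    for name, body in ENTITY_PATTERN.findall(schema_text):
        entities[name] = dict(FIELD_PATTERN.findall(body))
    return entities


for entity, fields in list_entities(SCHEMA).items():
    print(entity, "->", ", ".join(f"{k}: {v}" for k, v in fields.items()))
```

Such a listing makes it easy to compare the promised data scope of several subgraphs side by side before diving into their mappings.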
Conclusion regarding the Evaluation of Subgraphs in the TheGraph Blockchain Ecosystem
Following the steps mentioned above gives you a better understanding of how a subgraph works and what it covers. Curators who want to understand and evaluate a subgraph should at least have a look at its manifest and schema. However, it is always worthwhile to have a look at the mapping as well, because this is where possible bugs and errors in the data assignment can be revealed. Oftentimes the GitHub repositories of subgraphs are not stated in TheGraph Explorer. In such cases they must be found using a search engine. However, the subgraph manifest is always accessible via the ID of a subgraph (which is always displayed in the Explorer).
To access the manifest a link of the following format must be entered: https://ipfs.io/ipfs/SUBGRAPH-ID
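If you look up manifests for many subgraphs, building the link by hand gets tedious. Here is a minimal sketch that follows the link format given above; the helper name `manifest_url` and its basic input check are my own assumptions, and `SUBGRAPH-ID` remains a placeholder for the ID shown in the Explorer.

```python
IPFS_GATEWAY = "https://ipfs.io/ipfs/"


def manifest_url(subgraph_id):
    """Build the gateway link for a subgraph manifest from its ID."""
    cleaned = subgraph_id.strip()
    if not cleaned or "/" in cleaned:
        raise ValueError("subgraph ID must be a non-empty identifier without slashes")
    return IPFS_GATEWAY + cleaned


# "SUBGRAPH-ID" is the placeholder from the text, not a real ID.
print(manifest_url("SUBGRAPH-ID"))
```

Replace the placeholder with the ID copied from the Explorer and open the resulting link in a browser to read the manifest.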
In our particular example the link would be as follows:
I hope this short blog article helps some curators with their future evaluation of subgraphs. A sound evaluation of a subgraph requires a lot of time and effort. But it is worth taking that time: as a curator, you not only offer valuable support to the network, its users, indexers and subgraph developers, but can also earn a profit through quality work.
Originally published at https://www.anyblockanalytics.com on November 19, 2020.