If you can run the Gateway + 5 nodes for less than 400-500 USD a month, I suggest you go ahead with it.
I can't compete with that.
Difference can be read here - Radix Infrastructure: Node
We run separate full nodes because the load characteristics are different. A full node doesn't participate in consensus and is always catching up. Besides, a node used for Gateway purposes has a slightly different configuration - for example, the full node's state version history length.
We run all nodes with the settings below:
RADIXDLT_STATE_HASH_TREE_STATE_VERSION_HISTORY_LENGTH: '60000'
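For reference, in a Docker Compose setup that setting is just an environment variable on the node container. A minimal fragment might look like this (service and image names are placeholders, not the actual deployment):

```yaml
# Illustrative fragment only -- service and image names are placeholders.
services:
  fullnode:
    image: radixdlt/babylon-node   # actual image/tag depends on your deployment
    environment:
      RADIXDLT_STATE_HASH_TREE_STATE_VERSION_HISTORY_LENGTH: '60000'
```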
Thank you for clarifying.
I'm guessing those two extra nodes are the ones you're using to feed the DataAggregator.
This seems like one of those container variable names, no?
What does it map to if you're using systemd, or running on Windows with just a .bat file and a config file?
Is it not the default? Must it be explicitly set?
What are you using for this setting on the Data Aggregator?
GATEWAY_DATA_AGGREGATOR_CLEANUP_TRANSACTION_RETENTION_DAYS=?
As I understand it, this is the dominant setting for reducing the database footprint.
Here in the default.config.envsubst file there is a mapping of all Docker env vars to config file parameters.
We don't seem to specify GATEWAY_DATA_AGGREGATOR_CLEANUP_TRANSACTION_RETENTION_DAYS anywhere.
@Pawel_XRD can probably answer better what that parameter is used for.
I guess I can also give better context for my question. I am looking for a way to make the Data Aggregator load only the latest 90 days of transaction history from the full nodes, leading to a leaner database where database management consists of regularly executing instructions to remove older data and free up that allocation - keeping a "steady" three-month rolling ledger in the database. It could drastically reduce the infrastructure needed for running a Gateway.
There is currently no support in the Gateway for reducing the number of transactions it stores. There are two settings that can help you reduce database size:
DataAggregator__Storage__StoreReceiptStateUpdates
DataAggregator__Storage__StoreTransactionReceiptEvents
They were added in v1.8.2:
New configuration options
DataAggregator__Storage__StoreTransactionReceiptEvents and DataAggregator__Storage__StoreReceiptStateUpdates for the data aggregator, to configure whether a transaction's receipt events and receipt state updates should be stored in the database. It is meant to be used by gateway runners who want to reduce their database size. Keep in mind that when disabled, the corresponding properties will be missing from responses of both the /stream/transactions and the /transaction/committed-details endpoints. You can save significant space by using StoryOnlyForUserTransactionsAndEpochChanges and only excluding round change transactions, which aren't typically read from the /stream/transactions endpoint.
Possible values:
StoreForAllTransactions (default) - will store data for all transactions.
StoryOnlyForUserTransactionsAndEpochChanges - will store data for user transactions and transactions that resulted in an epoch change.
StoreOnlyForUserTransactions - will store data only for user transactions.
DoNotStore - will not store any data.
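For example, an illustrative .env fragment for a containerized Data Aggregator might look like this (the double-underscore keys map to nested .NET configuration sections, which is the usual convention for such keys):

```
# Illustrative fragment -- reduces stored receipt data to user transactions
# and epoch changes, per the v1.8.2 options above.
DataAggregator__Storage__StoreTransactionReceiptEvents=StoryOnlyForUserTransactionsAndEpochChanges
DataAggregator__Storage__StoreReceiptStateUpdates=StoryOnlyForUserTransactionsAndEpochChanges
```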
Regarding GATEWAY_DATA_AGGREGATOR_CLEANUP_TRANSACTION_RETENTION_DAYS: I'm not sure where you found that configuration key, but it doesn't exist.
The concept of truncating data in the database is not trivial, as you cannot simply store only the last 90 days of state version data. We store each modification per entity, so if an entity has not been modified for a long time, its current state may exist only in an older state version that would fall within the range you might want to truncate.
However, there are some potential improvements that could be made. For example, if you do not need information about older transactions returned from the following endpoints:
/stream/transactions
/transaction/committed-details
/transaction/status
/transaction/subintent-status
it might be possible to truncate data from the ledger_transactions table.
We have not implemented this approach because the gateway we hosted and supported was intended to be general-purpose.
Also, it's critical to note that when hosting a gateway you must always process all state versions from 1 through the current one; you can't start processing from the middle of the stream.
In other words, even if you implement data truncation and are fine with the above endpoints not returning data for older transactions, you would still need to process state versions starting from version 1 up to the current state version.
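The per-entity point above can be illustrated with a small sketch (not Gateway code; table layout and figures are invented for illustration): the current state of a rarely modified entity may live only in an old state version, so truncating by state version range silently loses it.

```python
# Illustrative sketch only -- shows why truncating by state version loses data.
# Each row is (state_version, entity, new_state); modifications are stored per entity.
history = [
    (1, "pool_A", {"balance": 100}),        # pool_A last touched at version 1
    (2, "account_B", {"balance": 50}),
    (900_000, "account_B", {"balance": 75}),  # account_B updated recently
]

def current_state(entity, rows):
    """Current state = the row with the highest state version for that entity."""
    matching = [r for r in rows if r[1] == entity]
    return max(matching, key=lambda r: r[0])[2] if matching else None

# Naive truncation: keep only "recent" state versions.
truncated = [r for r in history if r[0] > 500_000]

print(current_state("pool_A", history))    # old but still-current state
print(current_state("pool_A", truncated))  # gone -- its only row was truncated
```

account_B survives the truncation because it was modified recently, but pool_A's current state is lost entirely, which is why a simple "last 90 days" cutoff doesn't work for state data.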
Thank you Pawel!
Must have been Mr Gemini that over-extrapolated some internal defaults from the DataAggregator source code: "GATEWAY_DATA_AGGREGATOR_CLEANUP_TRANSACTION_RETENTION_DAYS: This was introduced to prevent the massive PostgreSQL 'bloat' that occurred during the early days of the Babylon mainnet. It targets the transactions, transaction_status, and events tables."
Nevermind.
From what I read, we should stay away from trying this at home.
It mainly leaves us with this path: if we want to reduce the burden on Stokenet infrastructure and gateways, we should reset the testnet more often, and "industrialize" the reset process so that the impact is minimal. I am pretty sure that builders will adapt to the new normal quite fast if we reset every 3, 6, or 12 months. The decision could be handled as two consultations:
Reset or not?
In the case of a reset, how often?
With your experience handling the gateway: what will the effect on disk writes be if we reduce the size to almost nothing and give the system more memory? I assume it will benefit the indexes and reads, but will writes stay the same?
Proposal for Stokenet:
The current Stokenet hosted by the Foundation runs at an estimated cost of 2000+ USD per month. By reducing the number of validators, removing the blue-green setup, and lowering gateway scalability, we can reduce the cost to less than 30% of the current level. The result will be an infrastructure with lower Quality of Service for gateway services, while the nodes and validators will remain as secure as before.
As part of the following proposal, I aim to further optimize the storage layer by organizing yearly testnet resets. Given the number of moving parts at present, I believe the safest approach is to take over the current testnet without additional changes beyond reducing the gateway resilience setup, and then plan for the reset later this year.
I am not a long-time Radix infrastructure provider, but I bring 30 years of solid IT experience. I believe that handling the Stokenet infrastructure is a great way to both learn and contribute.
I welcome feedback and iteration with the community and RAC/DAO.
- Daffy.xrd
Awesome work Daffy. I would immediately vote "for" this proposal.
I'm not extremely knowledgeable on the subject. But - to put it bluntly - this is cheap enough that if it works I'd be delighted, and if for some reason it fails the amount of money spent is not that substantial.
Thanks for your efforts!
Hello, great proposal. Why does this proposal need external commodity validators? Aren't 2 validators enough for approving txs? The testnet reset could also be every 6 months, if that helps reduce cost.
-
Because one of the commodity validators is Timan, which is deemed trusted, and we would benefit from having more than a single entity running everything. Edit - I was under the impression that the recommended minimum is at least 3 validators. It can run perfectly on 2, but it would be better with 3 or more.
-
I know. But right now the process and implications of a full reset are a bit blurry, at least for me. Until we get better control of the reset process I would stick to 12 months. Right now the biggest effect on cost comes from lowering the ledger (today 300 GB → in 12 months max 400 GB) down to (10 GB → 100 GB), and the Gateway DB (today 1.15 TB → in 12 months max 1.6 TB) down to (40 GB → 400 GB), as that fits better with the standard included disk sizes provided by the hosting vendors. Lowering it even further will not have the same effect on cost.
Even with 12 months of history, backups and restores will be pretty fast for everyone. And a single gateway instance should benefit from being able to cache a larger percentage of the DB in memory, reducing the need for separate read/write instances. (To be discussed - I am assuming here and would love some input.)
That said, we may end up with more frequent resets. I just don't want to over-promise at this stage.
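As a back-of-the-envelope illustration of the caching point (all figures are assumptions, e.g. a hypothetical 64 GB instance; the DB sizes are only the rough before/after magnitudes discussed in this thread):

```python
# Back-of-the-envelope only; all figures are assumptions, not measurements.
ram_gb = 64            # hypothetical gateway instance memory
db_today_gb = 1150     # assumed current Gateway DB size (~1.15 TB)
db_reduced_gb = 400    # assumed upper bound after the reduction

# Fraction of the database that fits in memory, a rough proxy for how much
# of the working set page cache / shared buffers could hold:
print(f"cacheable today:   {ram_gb / db_today_gb:.1%}")
print(f"cacheable reduced: {ram_gb / db_reduced_gb:.1%}")
```

Roughly a 3x improvement in the cacheable fraction under these assumptions, which is the intuition behind expecting reads and indexes to benefit while write volume stays driven by ledger throughput.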