Given the holiday break, the next major Bi-Weekly Sprint update will be for the week ending Jan 17, 2025. Here is what was done up to Jan 3, 2025.
- Resolved mainnet issues with nodes getting stopped.
- We figured out why the nodes were being stopped -- it was an AWS automatic maintenance process.
- Developing a process to be notified ahead of time when these maintenance cycles are planned, so we can update the nodes ourselves and prevent future downtime.
- Whitelisted the dev IP addresses used when running the infra-code scripts. These runs were previously triggering the DDoS attack prevention daemons, which incorrectly blocked access.
- Fixed full-storage issues for the mainnet nodes.
- Updated the new-node infrastructure code so it can bring up new nodes in mainnet.
- Reset Big Dipper nodes to display mainnet data. All BD nodes were down, but they are all back up now.
- Updated the storage location for node logs to prevent an out-of-space issue that took two+ years to manifest (a testament to our network's terrific uptime).
- Updated infrastructure to use EventBridge and SNS to handle AWS scheduled stop events.
- Updated the Dockerfile for building a chain without ignite-cli.
- Brought back up several nodes that AWS had taken down.
- Developed a process to pre-emptively stop and restart AWS nodes. Typically, due to the nature of AWS storage, you lose all the data on SSDs when you stop a node. Our process allows us to retain this data. Stopping and restarting a node is vital to accommodate unavoidable AWS maintenance cycles in a graceful way.
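
For readers curious how the EventBridge and SNS piece might fit together, here is a minimal sketch, not our actual infrastructure code. It assumes the scheduled stop notices arrive as AWS Health events on EventBridge; the event type codes listed, the function names, and the message layout are all illustrative assumptions.

```python
import json

# Assumed AWS Health event type codes for scheduled EC2 maintenance.
# Adjust this list to the events your account actually receives.
SCHEDULED_EVENT_CODES = [
    "AWS_EC2_INSTANCE_STOP_SCHEDULED",
    "AWS_EC2_INSTANCE_RETIREMENT_SCHEDULED",
    "AWS_EC2_INSTANCE_REBOOT_MAINTENANCE_SCHEDULED",
]


def build_event_pattern():
    """Return an EventBridge event pattern matching scheduled EC2 changes."""
    return {
        "source": ["aws.health"],
        "detail-type": ["AWS Health Event"],
        "detail": {
            "service": ["EC2"],
            "eventTypeCategory": ["scheduledChange"],
            "eventTypeCode": SCHEDULED_EVENT_CODES,
        },
    }


def summarize_event(event):
    """Turn a matched event into a subject and message body for SNS.

    Hypothetical helper: a Lambda target could pass the result to
    boto3's sns.publish(TopicArn=..., Subject=..., Message=...).
    """
    detail = event.get("detail", {})
    instances = [
        entity["entityValue"]
        for entity in detail.get("affectedEntities", [])
        if "entityValue" in entity
    ]
    body = {
        "event": detail.get("eventTypeCode", "unknown"),
        "starts": detail.get("startTime", "unknown"),
        "instances": instances,
    }
    return {
        "subject": "Scheduled AWS maintenance: " + body["event"],
        "message": json.dumps(body, indent=2),
    }
```

The pattern would be attached to an EventBridge rule, and the summary gives the team the affected instance IDs and start time, so the nodes can be stopped and restarted on our own schedule before AWS does it.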