EPIC: Fast Clean Shutdowns and Testing Design

# Overview

This epic is coming from the long standing tension around "flaky tests".  Ultimately the reason tests are "flaky" is either that the tests are not designed properly or there are underlying problems with the code. This epic aims to address these concerns. 

## Test Designs

### Resource Management
One fundamental flaw in how our tests are being designed is that they are not properly handling the shutdown of nodes, and there are some tests that just miss shutting nodes down at all. 

Currently our tests that do shutdown nodes follow this general pattern:

```go
func TestExample(t *testing.T) {
  node := createNode()

  // some testing
  require.NoError(err)

  node.Stop()
}
```

The flaw in this design is that if the `require.NoError` fails, the `node.Stop()` is never called. This then negatively impacts other tests and potentially causes "flakiness" in those tests because the resources continue to run. This impact could be mitigated by using the fail fast flag, but there is value in seeing the entire test suite run to see all the potential impacts of new changes. 

Here is an example that anyone can run to verify:
https://gist.github.com/MSevey/a73f165f3bb640ba5339baafcd04e30b

So the better approach is to use `defer` statements to close nodes or to use `asserts` to allow the tests to continue running. 

```go
func TestDeferExample(t *testing.T) {
  node := createNode()
  defer func() {
    node.Stop()
  }()

  // some testing
  require.NoError(err)
}
```

```go
func TestAssertExample(t *testing.T) {
  node := createNode()

  // some testing
  assert.NoError(err)

  node.Stop()
}
```

In both of these examples the node will be stopped, either because of the `defer` or because the `assert` failure doesn't stop the test. In general, `defer` is better, but there are certain scenarios where we what to stop nodes in a specific way. 

### Subtests

Currently we are not defining subtests properly we is causing additional context confusion. 

This is an example of how we are currently writing subtests:
```go
func TestFraudProofService(t *testing.T) {
	testSingleAggregatorSingleFullNodeFraudProofGossip(t)
	testSingleAggregatorTwoFullNodeFraudProofSync(t)
}
```
This is not actually subtests, but a test that has refactored 2 test cases into functions.

This is how it should be structure to actually create subtests:
```go
func TestFraudProofService(t *testing.T) {
	t.Run("SingleAggregatorSingleFullNodeFraudProofGossip", func(t *testing.T) {
		testSingleAggregatorSingleFullNodeFraudProofGossip(t)
	})
        // or
	t.Run("SingleAggregatorTwoFullNodeFraudProofSync", testSingleAggregatorTwoFullNodeFraudProofSync)
}
```

## Fast Clean Shutdowns

Even with improved testing architecture, there are still improvements needed in how we are shutting down resources quickly and cleanly.  The easiest way to see this in action is to have the logger write to a file instead of stdout and ensure the file is closed when the test completes. What you will see is errors and panics from background threads trying to write to the closed file. 


```[tasklist]
### Tasks
- [ ] https://github.com/rollkit/rollkit/issues/1079
- [x] https://github.com/rollkit/rollkit/issues/1072
- [ ] Fast Clean Shutdown Cleanup for Background Threads
- [ ] https://github.com/rollkit/rollkit/issues/1070
```
 

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

EPIC: Fast Clean Shutdowns and Testing Design #1069

Overview

Test Designs

Resource Management

Subtests

Fast Clean Shutdowns

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

EPIC: Fast Clean Shutdowns and Testing Design #1069

Description

Overview

Test Designs

Resource Management

Subtests

Fast Clean Shutdowns

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions