[Discuss] Remote Storage File Format

### 1. Blob Store Directory Structure
```
|__index-uuid
            |__shard
                  |__segments
                       |__metadata
                            |__<file-prefix>_metadata_<file-gen>_<version>
                       |__data
                            |__segments_<N>__<file-gen>
                            |__<N>.si__<file-gen>
                            |__<N>.cfe__<file-gen>
                            |__<N>.cfs__<file-gen>
                  |__translogs
                       |__metadata
                            |__<file-prefix>_metadata_<file-gen>_<version>
                       |__data
                            |__primary-term
                                |__translog-<file-gen>.tlog (with checkpoint blob metadata)
```
1. **file-gen** : Monotonically increasing file generation. After every relocation/recovery, this starts from the last checkpoint.
2. **version** : The version of the metadata file. Although it is also captured in the file contents, having it in the file name might help if we decide to switch the metadata to another format such as Avro.
3. **file-prefix** : The prefix that helps with faster searches for use cases like fetching the latest metadata or the files at a particular point in time. The S3 LIST API is guaranteed to return results in [UTF-8 binary sort order](https://docs.aws.amazon.com/AmazonS3/latest/userguide/ListingKeysUsingAPIs.html). Azure sorts LIST results in [alphabetical order](https://learn.microsoft.com/en-us/rest/api/storageservices/list-blobs), and GCP supports [lexicographical sort order](https://cloud.google.com/storage/docs/listing-objects) as well. Based on this, file names can use a format that factors in the timestamp, as below. Since our access pattern is heavily skewed toward fetching the most recently created files first, it is imperative that we coerce the sort order using `Long.MAX - term/timestamp`, referred to as inverted sort. The alternative is blob versioning, but that gives less control over retaining files based on timestamps and over searching by timestamp.
   ```
    <inverted_primary_term>_<inverted_generation>_<inverted_timestamp>_<file_gen> : [Preferred]
   ```
   Benefits:
   - Fetch latest metadata files in constant time
   - Fetch data at a particular timestamp using binary search

   Other alternatives:
   - `<inverted_timestamp>_<file_gen>`
   - `<inverted_timestamp>_<inverted_primary_term>_<inverted_generation>_<file_gen>`

4. **Blob metadata**: Where the corresponding file metadata is smaller than 1 KB, it is more efficient to attach it as blob metadata (e.g. translog.ckp), which significantly reduces the number of PUT calls.
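
To make the inverted sort concrete, here is a minimal sketch of how the preferred prefix could be built. The helper names (`invert`, `metadata_prefix`) are hypothetical. The zero-padding to 19 digits is an assumption added here, not something stated above: without a fixed width, a lexicographic listing would not match numeric order. `sorted()` stands in for the order a LIST call would return.

```python
LONG_MAX = 2**63 - 1        # same value as Java's Long.MAX_VALUE
WIDTH = len(str(LONG_MAX))  # 19 digits; fixed width is an assumption of this sketch


def invert(value: int) -> str:
    """Return Long.MAX - value, zero-padded so string order matches numeric order."""
    if not 0 <= value <= LONG_MAX:
        raise ValueError("value must fit in a non-negative signed 64-bit long")
    return str(LONG_MAX - value).zfill(WIDTH)


def metadata_prefix(primary_term: int, generation: int, timestamp_ms: int, file_gen: int) -> str:
    """Build <inverted_primary_term>_<inverted_generation>_<inverted_timestamp>_<file_gen>."""
    return "_".join(
        [invert(primary_term), invert(generation), invert(timestamp_ms), str(file_gen)]
    )


# Illustrative values only; keys are created out of order on purpose.
keys = [
    metadata_prefix(2, 157, 1_700_000_000_000, 1),
    metadata_prefix(3, 158, 1_700_000_060_000, 2),
    metadata_prefix(3, 160, 1_700_000_180_000, 4),
    metadata_prefix(3, 159, 1_700_000_120_000, 3),
]

listing = sorted(keys)  # stand-in for the lexicographic order of a LIST response
latest = listing[0]     # newest term/generation sorts first
```

With this naming, the latest metadata file is simply the first entry of a listing, so a LIST limited to a single key would be enough to find it.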

### 2. Access Patterns
#### Translogs
1. High append-only writes
2. Full recoveries during failovers from the latest file
3. Point-in-time restores
4. Garbage collect unreferenced files
5. Version upgrades using raw translogs

#### Segments
1. New segment writes to remote store
2. New segment downloads from remote store
3. Full recovery during peer recovery
4. Delta recovery during failover
5. Point-in-time restores
6. Garbage collect unreferenced files
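
As an illustration of the point-in-time restore pattern, the sketch below binary-searches a newest-first listing for the newest metadata key at or before a given timestamp. It assumes keys of the form `<inverted_primary_term>_<inverted_generation>_<inverted_timestamp>_<file_gen>` with zero-padded fields, and that primary term, generation and timestamp only ever increase together, so the listing is also ordered by timestamp. `make_key` and `find_at_or_before` are hypothetical names.

```python
from typing import List, Optional

LONG_MAX = 2**63 - 1  # same value as Java's Long.MAX_VALUE


def make_key(term: int, gen: int, ts: int, file_gen: int) -> str:
    """Hypothetical key builder: three inverted, zero-padded fields plus file_gen."""
    inverted = [str(LONG_MAX - v).zfill(19) for v in (term, gen, ts)]
    return "_".join(inverted + [str(file_gen)])


def inverted_timestamp(key: str) -> int:
    """Extract the third field (the inverted timestamp) from a key."""
    return int(key.split("_")[2])


def find_at_or_before(sorted_keys: List[str], timestamp_ms: int) -> Optional[str]:
    """Return the newest key whose timestamp is <= timestamp_ms, or None."""
    target = LONG_MAX - timestamp_ms
    lo, hi = 0, len(sorted_keys)
    while lo < hi:
        mid = (lo + hi) // 2
        # Inverted timestamps ascend along the listing (newest first).
        if inverted_timestamp(sorted_keys[mid]) < target:
            lo = mid + 1  # entry at mid is newer than the restore point
        else:
            hi = mid
    return sorted_keys[lo] if lo < len(sorted_keys) else None


# Illustrative listing with small timestamps for readability.
listing = sorted(
    [
        make_key(2, 157, 1000, 1),
        make_key(3, 158, 2000, 2),
        make_key(3, 159, 3000, 3),
        make_key(3, 160, 4000, 4),
    ]
)
restore_point = find_at_or_before(listing, 3500)  # picks the key written at ts=3000
```

Against a real blob store the listing would be paged rather than held in memory, so this only shows the comparison logic, not the I/O.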



### 3. Metadata File Formats
#### Translog
```
{
    "CURRENT_VERSION": 1,
    "METADATA_CODEC": "md",
    "primaryTerm": 3,
    "generation": 160,
    "minTranslogGeneration": 157,
    "generationToPrimaryTermMapper": {
        "160": 3,
        "159": 3,
        "158": 3,
        "157": 2
    },
    "checksum": "c7h5rwdgs423fsdae570s$%dk9",
    "contentLength": 10
}
```
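
One way this metadata could drive garbage collection of unreferenced translog files is sketched below. It assumes blobs under `translogs/data` are laid out as `<primary-term>/translog-<generation>.tlog` and that the file generation in the blob name equals the translog generation; both are assumptions read off the directory structure, and `referenced_blobs` / `unreferenced_blobs` are hypothetical helpers.

```python
import json

# Trimmed copy of the example metadata, with JSON-valid string keys.
metadata = json.loads(
    """
{
    "primaryTerm": 3,
    "generation": 160,
    "minTranslogGeneration": 157,
    "generationToPrimaryTermMapper": {"160": 3, "159": 3, "158": 3, "157": 2}
}
"""
)


def referenced_blobs(md: dict) -> set:
    """Blob paths (relative to translogs/data) that this metadata file still needs."""
    return {
        f"{term}/translog-{gen}.tlog"
        for gen, term in md["generationToPrimaryTermMapper"].items()
        if md["minTranslogGeneration"] <= int(gen) <= md["generation"]
    }


def unreferenced_blobs(listed: set, md: dict) -> set:
    """Blobs present in the store that the metadata does not reference."""
    return listed - referenced_blobs(md)


# Illustrative listing of the data directory.
listed = {
    "2/translog-156.tlog",
    "2/translog-157.tlog",
    "3/translog-158.tlog",
    "3/translog-159.tlog",
    "3/translog-160.tlog",
}
stale = unreferenced_blobs(listed, metadata)  # only generation 156 is stale
```

A real collector would need to take the union of references across every metadata file that is still retained (for example, for point-in-time restores) before deleting anything; this sketch looks at a single metadata file only.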

#### Segments
```
{
    "CURRENT_VERSION": 1,
    "METADATA_CODEC": "md",
    "generation": 160,
    "metadata": {
        "_0.si": {
            "originalFilename": "_0.si",
            "uploadedFilename": "_0.si__<primary_term>",
            "checksum": "238765",
            "length": 1234,
            "writtenBy": "9.6.0"
        },
        "_1.cfs": {
            "originalFilename": "_1.cfs",
            "uploadedFilename": "_1.cfs__<primary_term>",
            "checksum": "199345",
            "length": 5678,
            "writtenBy": "9.7.0"
        }
    },
    "segmentInfosBytes": "<byte array>",
    "checksum": "c7h5rwdgs423fsdae570s$%dk9",
    "contentLength": 10
}
```
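
The per-file `checksum` and `length` entries are what would make a delta recovery possible. The sketch below assumes the recovering node knows the checksum and length of each local segment file, and downloads only files that are missing or differ. `files_to_download` is a hypothetical helper, and the `__3` suffix is an illustrative value for the `<primary_term>` placeholder.

```python
# The "metadata" map from the example above, trimmed to the fields used here.
remote_metadata = {
    "_0.si": {"uploadedFilename": "_0.si__3", "checksum": "238765", "length": 1234},
    "_1.cfs": {"uploadedFilename": "_1.cfs__3", "checksum": "199345", "length": 5678},
}

# Local state as original filename -> (checksum, length); "_1.cfs" is stale locally.
local_files = {
    "_0.si": ("238765", 1234),
    "_1.cfs": ("000000", 5678),
}


def files_to_download(remote_md: dict, local: dict) -> dict:
    """Map original filename -> uploaded blob name for files missing or different locally."""
    needed = {}
    for name, entry in remote_md.items():
        if local.get(name) != (entry["checksum"], entry["length"]):
            needed[name] = entry["uploadedFilename"]
    return needed


delta = files_to_download(remote_metadata, local_files)  # only "_1.cfs" needs a download
```

A full recovery would be the same call with an empty `local_files`, which selects every file listed in the metadata.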
