[Allocator Application] <Haven DP>< Open Dataset> PR #179 #183

martapiekarska opened this issue Oct 3, 2024 · 8 comments
Labels: Allocator Application, Diligence Audit in Process, Manual Pathway

Comments

@martapiekarska
Collaborator

Allocator Application

Application Number

reccwctsGFwwGNptw

Organization Name

Haven DP

Organization On-chain Identity

f1mhd5x2d3gybsplhi2apqluxk6ircldusszf4miy

Allocator Pathway Name

Open Dataset

Github PR Number

179

Region of Operation

Europe,North America,South America,Oceania,Greater China Region,Asia minus GCR,Africa

GitHub ID

@haven-allocator

On-chain address

I have a multisig I want to provide now

Type of Allocator

Similar to existing allocator pathways

Filecoin Community Agreement

As a member of the Filecoin Community, I acknowledge that I must adhere to the Community Code of Conduct, as well as other End User License Agreements for accessing various tools and services, such as GitHub and Slack. Additionally, I will adhere to all local & regional laws & regulations that may relate to my role as a business partner, organization, notary, allocator, or other operating entity.
Acknowledge

Type of Allocator and RFA List

Manual - Novel pathway

Allocator Description

Our allocator pathway will:
  • Leverage Haven's current business model, which is incentivized to gain the support of our FIL initial pledge investors (who are interested in investing with a focus on the long-term development of the Filecoin Network), by partnering with vetted SP entities (third-party KYB, known node locations disclosed upfront) that store copies of quality datasets bringing “value to the network”

  • Enable Filecoin Network economic growth by ensuring upfront that multiple entities are vetted and involved in each data onboarding project and that data is distributed geographically in 2+ dispersed locations and accessible for anyone.

  • Require that data diligence processes are in place to be able to validate all information about the source data and preparation upfront and perform proper checks on data validity and accessibility throughout onboarding. 

  • Upfront source data checks would include:

    • The public, open dataset must be unique (meaning not already onboarded in a retrievable state on Filecoin 5+ times)
    • The dataset must be online - the allocator must be able to browse to and observe any part of that dataset at their discretion. The dataset must be accessible for any web viewer without gating on credentials.
  • **Upfront Data Preparation validation** would include:

    • Partnering with experienced Filecoin data preparer(s) who will be compensated for their time and effort in allocator projects and who can provide:
      • Clear documentation on how the data is prepared / transformed between the source data set and the data stored in Filecoin
      • Deals / Pieces stored to Filecoin should be full - there should not be significant 'slack' or 'padding' in stored data
      • Individual semantic items of the data should be able to be individually referenced / accessed.
      • When a piece is downloaded from the stored dataset, there must be a way to identify what part of the original dataset that piece represents and confirm that it is indeed valid data from the dataset. (this could be e.g. provided via a log file indicating the mapping of offsets / files into stored deals)
  • Validity and Accessibility of Stored Data

    • Retrieval rates (%) for the stored data must be observable via the Spark tool (and possibly other community-approved tools) and through manual retrieval attempts by the pathway operators
    • The data must be accessible (for retrieval) from at least two distinct geographic locations
    • The data must be retrievable for free for anyone that wants to access it
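The storage requirements above (multiple vetted entities, two or more dispersed locations, free retrieval) can be expressed as a simple pre-allocation check. The following is only a sketch: the `Replica` record, its field names, and the thresholds as function parameters are hypothetical and not part of any Haven or Fil+ tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """One stored copy of a dataset (all field names are hypothetical)."""
    provider_id: str      # storage provider (miner) ID
    region: str           # node location disclosed by the SP
    free_retrieval: bool  # True if anyone can retrieve at no cost


def distribution_problems(replicas, min_providers=2, min_regions=2):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    providers = {r.provider_id for r in replicas}
    regions = {r.region for r in replicas}
    if len(providers) < min_providers:
        problems.append(
            f"{len(providers)} distinct provider(s), need {min_providers}+")
    if len(regions) < min_regions:
        problems.append(
            f"{len(regions)} distinct region(s), need {min_regions}+")
    paid = sorted({r.provider_id for r in replicas if not r.free_retrieval})
    if paid:
        problems.append("retrieval is not free from: " + ", ".join(paid))
    return problems
```

An empty result would mean the plan satisfies the stated minimums; it says nothing about whether the data is actually retrievable, which still needs Spark and manual checks.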

Contributions to Ecosystem

Onboard >10PiBs of Data,Data Stewardship: Curate and provide high-quality datasets to be stored on the Filecoin network, enhancing the overall value and utility of the network.,Build better data onboarding pathway

Monetization and Fee structure

Other,SP fees,Block rewards, pools. Currently, our business processes involve splitting all storage deal block rewards proportionally between FIL Initial Pledge Investors and Storage Providers. Haven takes a portion of the block rewards as a fee for their work in investor/SP diligence and FIL allocation services. This fee will remain in place for SPs using our pledge funding service.

Additionally, as part of this allocator, fees toward services will need to be accounted for including:

  • Coordination of quality dataset acquisition and preparation (Data Preparer)
  • Coordination of SPs and distribution of DataCap per guidelines
  • Confirmation to the adherence of data accessibility (retrievals) requirement
  • Administration of the allocator (diligence, allocation mgmt)

These fees could take the form of an adjusted proportional split of the deal block rewards for all parties mentioned above and/or could involve payment by Storage Providers for these additional services.

Target Clients

Open/Public,Nonprofit organizations,Individuals

Client Diligence Check

3rd party Know your customer (KYC) service,3rd party Know your business (KYB) service,Manual verification

Description of client diligence

The “client” in our open allocator pathway is a shared applicant role across all parties involved in an onboarding project, including FIL pledge investors, the Filecoin Data Preparer, and all SP entities storing a copy of the dataset, who are coming together to onboard a quality, value-add dataset.

All of these parties will be vetted upfront, prior to project start, via a third party KYC/KYB/AML service and also a Filecoin wallet check service. 

Any entity is eligible to join a Haven allocator project as an investor, data preparer or storage provider as long as they meet and pass the above criteria and agree with the project structure and fees.

Type of data

Public, open, and retrievable

Description of Data Diligence

We will validate all information about the source dataset links provided in an application.

  • The public, open dataset must be unique (meaning not already onboarded to Filecoin 5+ times)
  • The dataset must be online - the allocator must be able to browse to and observe any part of that dataset at their discretion. The dataset must be accessible for any web viewer without gating on credentials.

The data preparer will be required to provide:

  • Clear documentation on how the data is prepared / transformed between the source data set and the data stored in Filecoin
  • Individual semantic items of the data that are able to be individually referenced / accessed.
  • When a piece is downloaded from the stored dataset, there must be a way to identify what part of the original dataset that piece represents and confirm that it is indeed valid data from the dataset. (this could be e.g. provided via a log file indicating the mapping of offsets / files into stored deals)
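As one illustration of the "log file indicating the mapping of offsets / files into stored deals" mentioned above, a data preparer could publish a manifest like the one sketched here. This is a minimal sketch under assumptions: the `ManifestEntry` fields and helper names are hypothetical, and it assumes the file bytes have already been extracted from the retrieved piece.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ManifestEntry:
    """Maps part of a stored piece back to the source dataset (hypothetical)."""
    piece_cid: str      # identifier of the piece stored on Filecoin
    source_path: str    # file path within the original dataset
    source_offset: int  # byte offset of this range within the source file
    length: int         # number of bytes covered by this entry
    sha256: str         # hex digest of those source bytes


def entries_for_piece(manifest, piece_cid):
    """List every source range that a given piece is documented to contain."""
    return [e for e in manifest if e.piece_cid == piece_cid]


def matches_source(entry, retrieved: bytes) -> bool:
    """Check retrieved bytes against the length and digest in the manifest."""
    return (len(retrieved) == entry.length
            and hashlib.sha256(retrieved).hexdigest() == entry.sha256)
```

With such a manifest, a reviewer sampling a piece can look up which source files it should contain and confirm that the downloaded bytes match the original dataset.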

Allocator operators will sample after each allocation via:

  • Spark tool retrieval checks (or any other community approved retrieval check) - looking for 50%+ and improving monthly averages
  • Manual retrieval attempts to compare download pieces to original dataset
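The retrieval target above ("50%+ and improving monthly averages") leaves some room for interpretation. The sketch below assumes it means the most recent monthly average is at least 50% and no month is lower than the one before it; the function name and input shape are hypothetical, and the rates would come from Spark or another community-approved tool.

```python
def meets_retrieval_target(monthly_rates, floor=0.50):
    """Evaluate oldest-first monthly average retrieval success rates (0.0-1.0).

    Assumed interpretation: the latest month must reach the floor, and the
    series must never decline from one month to the next.
    """
    if not monthly_rates:
        return False  # no measurements yet, so nothing can be confirmed
    latest_ok = monthly_rates[-1] >= floor
    improving = all(later >= earlier
                    for earlier, later in zip(monthly_rates, monthly_rates[1:]))
    return latest_ok and improving
```

A stricter or looser reading (for example, tolerating a single dip) would only change the `improving` expression.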

Data Preparation

Client-provided,IPFS Kubo,Go-CAR,Singularity,RIBS,Other existing ecosystem tooling

Replicas required, verified by CID checker

2+

Distribution required

Equal distribution of deals across regions

Number of Storage Providers required

2+

Retrieval Requirements

Public data highly retrievable over Spark.

Allocation Tranche Schedule Type

Manual or other allocation schedule.

Allocations will be given as a percentage of the total amount requested. Initial allocations will be 10-15% of total requested, up to 1PiB max. Subsequent allocations will be

Will you use FIDL tooling, such as allocator.tech and other bots?

Yes, all available tools

GitHub Bookkeeping Repo Link

https://github.com/haven-allocator/open-dataset

Success metrics

Amount of data onboarded, daily & aggregate,Retrievability of data,Speed of allocations (TTD)

Timeline to begin allocating to clients

1 week from RKH approval

Funnel: Expected DataCap usage over 12 months

100-200PiB

Risk mitigation strategies

All parties involved in a data onboarding project on our allocator will agree to the terms of the onboarding plan upfront. Haven requires signed contracts for all parties. This includes the Data Preparer, who will complete the documentation and preparation of CIDs. It also includes the SPs, who agree to a plan for depositing their portion of FIL toward the pledge, storing cold copies and hot copies, and following the geographical distribution plan. All SP entities involved go through KYC/KYB upfront, disclose the location of their nodes, and agree to the terms of the storage plan. Any party not meeting contract requirements or found abusing their role will risk delaying or losing their FIL block reward payout per contract.

Dispute Resolutions

A dispute within the context of DataCap allocation refers to any contention or disagreement arising between parties involved in the storage and retrieval of data on the Filecoin network. This could involve discrepancies over DataCap distribution, disagreements on data compliance with stated parameters, or conflicts over the execution of storage deals. Disputes may be internal, involving yourself and your client, or external, where you will need to defend your decisions against another active notary or the Fil+ Governance Team.
For disputes between our allocator and client, hereby termed appeal(s), we will source the appeals using our own Open Data Allocator Appeals Form (https://docs.google.com/forms/d/e/1FAIpQLSeJ8joRhF9NGWXTZYR_NSRZ5j1CpTg4SG3tMuZPkZW4KZafsQ/viewform) for all our clients where they can submit an appeal and someone on the team will address it with a 5 day SLA. We would like to respect the privacy of the client and do not plan to host a public resolution process. For disputes raised by community members/non-clients about our allocation approach and strategy, we will comply with the public dispute tracker that is being built by the Filecoin Foundation Governance Team and will commit to an SLA of 10 days.

Compliance Audit Check

After each allocation, we will manually review the applicant's on-chain deal-making activity to confirm compliance, relying on the Allocator Compliance and CID compliance reports to identify non-compliant deal making and distribution. Separately, we will use the Spark tool for retrieval checks and complete data sampling checks; that information will be used to drive action on future allocations.
At any point, if a party is caught providing fake or misleading information about themselves or their SP partners, we will close any open applications, record the GitHub user IDs and miner IDs involved, and block them from future participation in the allocator.

Compliance Report content presented for audit

Client Diligence: KYC/KYB report on clients,Data Compliance: Data Samples,Compliance: CID report,Success metric: onchain report of data onboarded,Data Compliance: Manual report.

Connections to Filecoin Ecosystem

Previous allocator,Big data contributor,Event sponsor

Slack ID

@kz

@martapiekarska martapiekarska added the Proposal Modifications to improve the operating of Allocator processes. label Oct 3, 2024
@haven-allocator

An answer was cutoff regarding allocation tranche schedule:

Allocations will be given as a percentage of the total amount requested. Initial allocations will be 10-15% of total requested, up to 1PiB max. Subsequent allocations will be proportional up to 4 allocations. Example: 10%, 20%, 30%, 40%
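To make the arithmetic of the example schedule concrete, here is a small sketch. It assumes the 10% / 20% / 30% / 40% split from the example and applies the 1PiB cap to the first tranche only. The application does not say what happens to DataCap withheld by the cap, so this sketch simply leaves it unallocated; the function and constant names are hypothetical.

```python
PIB = 2**50  # bytes in one pebibyte


def tranche_schedule(total_bytes, percents=(10, 20, 30, 40),
                     first_cap_bytes=PIB):
    """Return tranche sizes in bytes for a requested DataCap total.

    Assumption: only the first tranche is capped, and any amount withheld
    by the cap is not redistributed to later tranches.
    """
    if sum(percents) != 100:
        raise ValueError("tranche percentages must sum to 100")
    # Integer division keeps sizes in whole bytes (may round down slightly).
    amounts = [total_bytes * p // 100 for p in percents]
    amounts[0] = min(amounts[0], first_cap_bytes)
    return amounts
```

For a 5PiB request this yields 0.5, 1, 1.5, and 2PiB. For a 20PiB request the first tranche is capped at 1PiB (instead of 2PiB) and the remaining tranches are 4, 6, and 8PiB.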

@haven-allocator

As an additional note about Haven Digital Partners' current contributions to the Filecoin ecosystem: in the past five months, Haven has delegated 3M FIL to seven different SP entities, enabling projects to seal 75PiBs (QAP) of data.

@Kevin-FF-USA Kevin-FF-USA self-assigned this Oct 8, 2024
@Kevin-FF-USA Kevin-FF-USA added Allocator Application Application received from an Organization to receive DataCap and function as an Allocator and removed Proposal Modifications to improve the operating of Allocator processes. labels Oct 8, 2024
@Kevin-FF-USA
Collaborator

Hi @haven-allocator,

One of the scoring mechanisms for pathways is their ability to onboard quality data to the network. Given your existing clients, here is a proposal to help establish your ability to serve within the ecosystem as an Allocator performing MANUAL diligence.

Proposal
Bring one of your clients into the ecosystem through an Existing Allocator Pathway. Establish that you have real clients and can maintain the diligence standards of this application. Demonstrate to the community your ability and the value of onboarding this new Manual Pathway.

Steps

  1. Work with any existing Allocator to create an application on behalf of your client. FIDL runs an enterprise Allocator if you were looking for a pathway with existing support in place to help with questions.

  2. Once the data is onboarded, reply to this application with the following:
    1. Client ID
    2. Links to the DataCap Application
    3. Spark retrieval %
    4. Verification that the data reached the targeted number of SPs
    5. What the data type was

Onboarding
Once the ability to onboard clients through the application process has been verified, this application will receive a KYC check and begin onboarding as an Allocator to onboard clients directly.

For questions or support

  • There is a live Filecoin Plus Program call every two weeks. Calendar link is here. Or as always please tag us in issue for comment or update.

@Kevin-FF-USA Kevin-FF-USA added the Awaiting Response from Allocator If there is a question that was raised in the issue that requires comment before moving forward. label Oct 8, 2024
@Kevin-FF-USA
Collaborator

Hi @haven-allocator

Realize you have two open applications, so just wanted to be sure to send this friendly check in to both.
Did you have any questions about our recommendations for improving the details listed above for the viability of this pathway's application (or bringing a client through an existing pathway)?

@haven-allocator

No questions at this time @Kevin-FF-USA - we will work to onboard through an active allocator for now and return to this after. Thank you.

@Kevin-FF-USA Kevin-FF-USA added Diligence Audit in Process Governance team is reviewing the DataCap distributions and verifying the deals were within standards and removed Awaiting Response from Allocator If there is a question that was raised in the issue that requires comment before moving forward. labels Oct 24, 2024
@Kevin-FF-USA
Collaborator

Hi @haven-allocator,

Thanks for submitting this application for refresh.
Wanted to send you a friendly update - as this works its way through the system, you should see a comment from Galen on behalf of the Governance team this week. If you have any questions or need support until then, please let us know.

Warmly,
-Kevin

@Kevin-FF-USA
Collaborator

Hi @haven-allocator,

Wanted to keep you updated on the MANUAL application process.

📝 Manual Pathways Note:
We are updating the Manual application process to align with the new Meta Allocator standards. You should soon see the release of the updated application form (IT WILL BE MADE AVAILABLE ON GOVERNANCE CALLS AND POSTED IN FIL-PLUS SLACK), which will provide new, clear guidance on how Manual applications will proceed going forward.

As a reminder, the prioritization for review and approval of new pathways is:

  1. New RFA through EPMA
  2. Existing high-performing pathway
  3. Recent applicant who has successfully partnered with existing allocators (‘shepherding’ clients)
  4. Entirely new applicant with manual pathway
  5. Previously removed pathway reapplying

Come to the next Governance Meeting for updates and questions.
#342

Looking forward to great things! 🚀

@Kevin-FF-USA
Collaborator

#183
