Add config requirements and environment variables (and fix broken links!) #238


Merged: 6 commits, Jun 26, 2025
8 changes: 4 additions & 4 deletions credit-scorer/docs/guides/cloud_deployment.md
@@ -1,6 +1,6 @@
## ☁️ Cloud Deployment

-CreditScorer supports storing artifacts remotely and executing pipelines on cloud infrastructure. For this example, we'll use AWS, but you can use any cloud provider you want. You can also refer to the [AWS Integration Guide](https://docs.zenml.io/how-to/popular-integrations/aws-guide) for detailed instructions.
+CreditScorer supports storing artifacts remotely and executing pipelines on cloud infrastructure. For this example, we'll use AWS, but you can use any cloud provider you want. You can also refer to the [AWS Integration Guide](https://docs.zenml.io/stacks/popular-stacks/aws-guide) for detailed instructions.

### AWS Setup

@@ -75,6 +75,6 @@ Similar setup processes can be followed for other cloud providers:

For detailed configuration options for these providers, refer to the ZenML documentation:

-- [GCP Integration Guide](https://docs.zenml.io/how-to/popular-integrations/gcp-guide)
-- [Azure Integration Guide](https://docs.zenml.io/how-to/popular-integrations/azure-guide)
-- [Kubernetes Integration Guide](https://docs.zenml.io/how-to/popular-integrations/kubernetes)
+- [GCP Integration Guide](https://docs.zenml.io/stacks/popular-stacks/gcp-guide)
+- [Azure Integration Guide](https://docs.zenml.io/stacks/popular-stacks/azure-guide)
+- [Kubernetes Integration Guide](https://docs.zenml.io/stacks/popular-stacks/kubernetes)
@@ -45,7 +45,7 @@ def deployment_deploy() -> Annotated[
In this example, the step can be configured to use different input data.
See the documentation for more information:

-https://docs.zenml.io/how-to/pipeline-development/use-configuration-files
+https://docs.zenml.io/concepts/steps_and_pipelines/yaml_configuration

Args:
dataset_inf: The inference dataset.
2 changes: 1 addition & 1 deletion databricks-production-qa-demo/steps/etl/data_loader.py
@@ -44,7 +44,7 @@ def data_loader(
In this example, the step can be configured with number of rows and logic
to drop target column or not. See the documentation for more information:

-https://docs.zenml.io/how-to/build-pipelines/use-pipeline-step-parameters
+https://docs.zenml.io/concepts/steps_and_pipelines#pipeline-parameterization

Args:
is_inference: If `True` subset will be returned and target column
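The parameterization the `data_loader` docstring describes can be pictured with a plain-Python stand-in. The rows and field names below are hypothetical; the real step loads the actual dataset but honors the same two knobs (row count, inference flag):

```python
# Hypothetical stand-in for the step above: a row-count parameter plus a
# flag that drops the target column when preparing inference data.
def data_loader(n_rows: int, is_inference: bool = False) -> list[dict]:
    rows = [{"feature": i, "target": i % 2} for i in range(n_rows)]
    if is_inference:
        for row in rows:
            row.pop("target")  # inference data carries no labels
    return rows

print(data_loader(n_rows=2, is_inference=True))
# [{'feature': 0}, {'feature': 1}]
```

Both parameters can then be set per run from a pipeline YAML config instead of being hardcoded.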
@@ -52,7 +52,7 @@ def train_data_preprocessor(
columns and normalize numerical columns. See the documentation for more
information:

-https://docs.zenml.io/how-to/build-pipelines/use-pipeline-step-parameters
+https://docs.zenml.io/concepts/steps_and_pipelines#pipeline-parameterization

Args:
dataset_trn: The train dataset.
@@ -41,7 +41,7 @@ def train_data_splitter(
In this example, the step can be configured to use different test
set sizes. See the documentation for more information:

-https://docs.zenml.io/how-to/pipeline-development/use-configuration-files
+https://docs.zenml.io/concepts/steps_and_pipelines/yaml_configuration

Args:
dataset: Dataset read from source.
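The configurable test-set size mentioned in the `train_data_splitter` docstring reduces to a split like the following. List slicing stands in for the real DataFrame split, which is an assumption made purely for illustration:

```python
def train_data_splitter(rows: list, test_size: float = 0.2) -> tuple[list, list]:
    """Split rows into train/test partitions; `test_size` comes from config."""
    cut = int(len(rows) * (1 - test_size))
    return rows[:cut], rows[cut:]

# A run configured with a 25% test set:
train, test = train_data_splitter(list(range(100)), test_size=0.25)
print(len(train), len(test))  # 75 25
```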
@@ -50,7 +50,7 @@ def hp_tuning_single_search(
to use different input datasets and also have a flag to fall back to default
model architecture. See the documentation for more information:

-https://docs.zenml.io/how-to/pipeline-development/use-configuration-files
+https://docs.zenml.io/concepts/steps_and_pipelines/yaml_configuration

Args:
model_package: The package containing the model to use for hyperparameter tuning.
@@ -43,7 +43,7 @@ def inference_predict(
In this example, the step can be configured to use different input data.
See the documentation for more information:

-https://docs.zenml.io/how-to/pipeline-development/use-configuration-files
+https://docs.zenml.io/concepts/steps_and_pipelines/yaml_configuration

Args:
dataset_inf: The inference dataset.
@@ -44,7 +44,7 @@ def compute_performance_metrics_on_current_data(
and target environment stage for promotion.
See the documentation for more information:

-https://docs.zenml.io/how-to/pipeline-development/use-configuration-files
+https://docs.zenml.io/concepts/steps_and_pipelines/yaml_configuration

Args:
dataset_tst: The test dataset.
@@ -46,7 +46,7 @@ def promote_with_metric_compare(
and target environment stage for promotion.
See the documentation for more information:

-https://docs.zenml.io/how-to/pipeline-development/use-configuration-files
+https://docs.zenml.io/concepts/steps_and_pipelines/yaml_configuration

Args:
latest_metric: Recently trained model metric results.
@@ -72,7 +72,7 @@ def model_trainer(
hyperparameters to the model constructor. See the documentation for more
information:

-https://docs.zenml.io/how-to/pipeline-development/use-configuration-files
+https://docs.zenml.io/concepts/steps_and_pipelines/yaml_configuration

Args:
dataset_trn: The preprocessed train dataset.
16 changes: 16 additions & 0 deletions deep_research/README.md
@@ -632,6 +632,22 @@ This pipeline can integrate with:
- **Alerting Systems**: Schedule research on key topics and receive regular reports
- **Other ZenML Pipelines**: Chain with downstream analysis or processing

## ☁️ Cloud Orchestrator Configuration

When running the pipeline with a cloud orchestrator (like Kubernetes, AWS SageMaker, etc.), the configuration files automatically use environment variable substitution to pick up your API keys from the environment.

The configuration files in `configs/` use environment variable substitution like:
```yaml
settings:
  docker:
    environment:
      OPENROUTER_API_KEY: ${OPENROUTER_API_KEY}
      TAVILY_API_KEY: ${TAVILY_API_KEY}
      # ... other keys
```

Simply ensure your environment variables are set in your orchestrator environment, and the pipeline will automatically pick them up. For security, consider using your cloud provider's secret management services (AWS Secrets Manager, Azure Key Vault, etc.) to inject these environment variables into your orchestrator runtime.
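Before launching a run, it can help to fail fast when a key is missing rather than partway through the pipeline. A small sketch (the key names come from the configs above; trim or extend the list for the providers you actually use):

```python
import os

# Keys the configs in `configs/` reference; adjust to your setup.
REQUIRED_KEYS = ["OPENROUTER_API_KEY", "TAVILY_API_KEY", "EXA_API_KEY"]

def missing_keys(required: list[str] = REQUIRED_KEYS) -> list[str]:
    """Return names of required environment variables that are unset or empty."""
    return [key for key in required if not os.environ.get(key)]

if __name__ == "__main__":
    if missing := missing_keys():
        raise SystemExit(f"Set these variables before running: {missing}")
```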

## 📄 License

This project is licensed under the Apache License 2.0.
45 changes: 28 additions & 17 deletions deep_research/configs/balanced_research.yaml
@@ -23,57 +23,68 @@ langfuse_project_name: "deep-research"
# Research parameters for balanced research
parameters:
  query: "Default research query"

steps:
  initial_query_decomposition_step:
    parameters:
      llm_model: "sambanova/DeepSeek-R1-Distill-Llama-70B"
      max_sub_questions: 10 # Balanced number of sub-questions

  process_sub_question_step:
    parameters:
      llm_model_search: "sambanova/Meta-Llama-3.3-70B-Instruct"
      llm_model_synthesis: "sambanova/DeepSeek-R1-Distill-Llama-70B"
      cap_search_length: 20000 # Standard cap for search length

  cross_viewpoint_analysis_step:
    parameters:
      llm_model: "sambanova/DeepSeek-R1-Distill-Llama-70B"
      viewpoint_categories: [
          "scientific",
          "political",
          "economic",
          "social",
          "ethical",
          "historical",
        ] # Standard viewpoints

  generate_reflection_step:
    parameters:
      llm_model: "sambanova/DeepSeek-R1-Distill-Llama-70B"

  get_research_approval_step:
    parameters:
      timeout: 3600 # 1 hour timeout
      max_queries: 2 # Moderate additional queries

  execute_approved_searches_step:
    parameters:
      llm_model: "sambanova/Meta-Llama-3.3-70B-Instruct"
      cap_search_length: 20000

  pydantic_final_report_step:
    parameters:
      llm_model: "sambanova/DeepSeek-R1-Distill-Llama-70B"

# Environment settings
settings:
  docker:
    requirements:
      - openai>=1.0.0
      - tavily-python>=0.2.8
      - exa-py>=1.0.0
      - PyYAML>=6.0
      - click>=8.0.0
      - pydantic>=2.0.0
      - typing_extensions>=4.0.0
      - requests
      - anthropic>=0.52.2
      - litellm==1.69.1
      - langfuse==2.60.8
    environment:
      OPENROUTER_API_KEY: ${OPENROUTER_API_KEY}
      TAVILY_API_KEY: ${TAVILY_API_KEY}
      EXA_API_KEY: ${EXA_API_KEY}
      ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY}
      LANGFUSE_PUBLIC_KEY: ${LANGFUSE_PUBLIC_KEY}
      LANGFUSE_SECRET_KEY: ${LANGFUSE_SECRET_KEY}
      LANGFUSE_HOST: ${LANGFUSE_HOST}
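The `${VAR}` references in the `environment` block above resolve against the process environment at build time. A minimal stand-alone sketch of that substitution mechanism (an illustration only, not ZenML's actual implementation):

```python
import os

def substitute_env_vars(raw_config: str) -> str:
    """Expand ${VAR} references using the current process environment."""
    return os.path.expandvars(raw_config)

# Assumed dummy value for demonstration; real keys come from your shell
# or your cloud provider's secret manager.
os.environ["TAVILY_API_KEY"] = "tvly-example-key"
snippet = "environment:\n  TAVILY_API_KEY: ${TAVILY_API_KEY}"
print(substitute_env_vars(snippet))
# environment:
#   TAVILY_API_KEY: tvly-example-key
```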
16 changes: 14 additions & 2 deletions deep_research/configs/deep_research.yaml
@@ -73,9 +73,21 @@ steps:
settings:
  docker:
    requirements:
      - openai>=1.0.0
      - tavily-python>=0.2.8
      - exa-py>=1.0.0
      - PyYAML>=6.0
      - click>=8.0.0
      - pydantic>=2.0.0
      - typing_extensions>=4.0.0
      - requests
      - anthropic>=0.52.2
      - litellm==1.69.1
      - langfuse==2.60.8
    environment:
      OPENROUTER_API_KEY: ${OPENROUTER_API_KEY}
      TAVILY_API_KEY: ${TAVILY_API_KEY}
      EXA_API_KEY: ${EXA_API_KEY}
      ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY}
      LANGFUSE_PUBLIC_KEY: ${LANGFUSE_PUBLIC_KEY}
      LANGFUSE_SECRET_KEY: ${LANGFUSE_SECRET_KEY}
      LANGFUSE_HOST: ${LANGFUSE_HOST}
14 changes: 13 additions & 1 deletion deep_research/configs/enhanced_research.yaml
@@ -63,9 +63,21 @@ steps:
settings:
  docker:
    requirements:
      - openai>=1.0.0
      - tavily-python>=0.2.8
      - exa-py>=1.0.0
      - PyYAML>=6.0
      - click>=8.0.0
      - pydantic>=2.0.0
      - typing_extensions>=4.0.0
      - requests
      - anthropic>=0.52.2
      - litellm==1.69.1
      - langfuse==2.60.8
    environment:
      OPENROUTER_API_KEY: ${OPENROUTER_API_KEY}
      TAVILY_API_KEY: ${TAVILY_API_KEY}
      EXA_API_KEY: ${EXA_API_KEY}
      ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY}
      LANGFUSE_PUBLIC_KEY: ${LANGFUSE_PUBLIC_KEY}
      LANGFUSE_SECRET_KEY: ${LANGFUSE_SECRET_KEY}
      LANGFUSE_HOST: ${LANGFUSE_HOST}
16 changes: 14 additions & 2 deletions deep_research/configs/enhanced_research_with_approval.yaml
@@ -69,9 +69,21 @@ steps:
settings:
  docker:
    requirements:
      - openai>=1.0.0
      - tavily-python>=0.2.8
      - exa-py>=1.0.0
      - PyYAML>=6.0
      - click>=8.0.0
      - pydantic>=2.0.0
      - typing_extensions>=4.0.0
      - requests
      - anthropic>=0.52.2
      - litellm==1.69.1
      - langfuse==2.60.8
    environment:
      OPENROUTER_API_KEY: ${OPENROUTER_API_KEY}
      TAVILY_API_KEY: ${TAVILY_API_KEY}
      EXA_API_KEY: ${EXA_API_KEY}
      ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY}
      LANGFUSE_PUBLIC_KEY: ${LANGFUSE_PUBLIC_KEY}
      LANGFUSE_SECRET_KEY: ${LANGFUSE_SECRET_KEY}
      LANGFUSE_HOST: ${LANGFUSE_HOST}
14 changes: 13 additions & 1 deletion deep_research/configs/quick_research.yaml
@@ -51,9 +51,21 @@ steps:
settings:
  docker:
    requirements:
      - openai>=1.0.0
      - tavily-python>=0.2.8
      - exa-py>=1.0.0
      - PyYAML>=6.0
      - click>=8.0.0
      - pydantic>=2.0.0
      - typing_extensions>=4.0.0
      - requests
      - anthropic>=0.52.2
      - litellm==1.69.1
      - langfuse==2.60.8
    environment:
      OPENROUTER_API_KEY: ${OPENROUTER_API_KEY}
      TAVILY_API_KEY: ${TAVILY_API_KEY}
      EXA_API_KEY: ${EXA_API_KEY}
      ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY}
      LANGFUSE_PUBLIC_KEY: ${LANGFUSE_PUBLIC_KEY}
      LANGFUSE_SECRET_KEY: ${LANGFUSE_SECRET_KEY}
      LANGFUSE_HOST: ${LANGFUSE_HOST}
14 changes: 13 additions & 1 deletion deep_research/configs/rapid_research.yaml
@@ -51,9 +51,21 @@ steps:
settings:
  docker:
    requirements:
      - openai>=1.0.0
      - tavily-python>=0.2.8
      - exa-py>=1.0.0
      - PyYAML>=6.0
      - click>=8.0.0
      - pydantic>=2.0.0
      - typing_extensions>=4.0.0
      - requests
      - anthropic>=0.52.2
      - litellm==1.69.1
      - langfuse==2.60.8
    environment:
      OPENROUTER_API_KEY: ${OPENROUTER_API_KEY}
      TAVILY_API_KEY: ${TAVILY_API_KEY}
      EXA_API_KEY: ${EXA_API_KEY}
      ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY}
      LANGFUSE_PUBLIC_KEY: ${LANGFUSE_PUBLIC_KEY}
      LANGFUSE_SECRET_KEY: ${LANGFUSE_SECRET_KEY}
      LANGFUSE_HOST: ${LANGFUSE_HOST}
2 changes: 1 addition & 1 deletion end-to-end-computer-vision/README.md
@@ -69,7 +69,7 @@ zenml login <INSERT_ZENML_URL_HERE>
We will use GCP in the commands listed below, but it will work for other cloud
providers.

-1) Follow our guide to set up your credentials for GCP [here](https://docs.zenml.io/how-to/auth-management/gcp-service-connector)
+1) Follow our guide to set up your credentials for GCP [here](https://docs.zenml.io/stacks/service-connectors/connector-types/gcp-service-connector)
2) Set up a bucket in GCP to persist your training data
3) Set up a bucket to use as artifact store within ZenML
Learn how to set up a GCP artifact store stack component within ZenML
7 changes: 4 additions & 3 deletions eurorate-predictor/README.md
@@ -92,8 +92,7 @@ output "zenml_stack_name" {
}
```
To learn more about the terraform script, read the
-[ZenML documentation](https://docs.zenml.io/how-to/stack-deployment/deploy-a-cloud-stack-with-terraform) or
+[ZenML documentation](https://docs.zenml.io/stacks/deployment/deploy-a-cloud-stack-with-terraform) or
see the [Terraform registry](https://registry.terraform.io/modules/zenml-io/zenml-stack).
@@ -163,4 +162,6 @@ For detailed documentation on using ZenML to build your own MLOps pipelines, ple

## 🔄 Continuous Improvement

EuroRate Predictor is designed for continuous improvement of your interest rate forecasts. As new ECB data becomes available, simply re-run the pipelines to generate updated predictions.
@@ -48,7 +48,7 @@ def promote_metric_compare_promoter(
In this example, the step can be configured to use different input data.
See the documentation for more information:

-https://docs.zenml.io/how-to/pipeline-development/use-configuration-files
+https://docs.zenml.io/concepts/steps_and_pipelines/yaml_configuration

Args:
latest_metrics: Recently trained model metrics results.
@@ -46,7 +46,7 @@ def tokenizer_loader(
For more information on how to configure steps in a pipeline, refer to the
following documentation:

-https://docs.zenml.io/how-to/pipeline-development/use-configuration-files
+https://docs.zenml.io/concepts/steps_and_pipelines/yaml_configuration

Args:
lower_case: A boolean value indicating whether to convert the input text to