Skip to content

Add a Command to Sync External OWASP Repositories #1656

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

rudransh-shrivastava
Copy link
Contributor

@rudransh-shrivastava rudransh-shrivastava commented Jun 23, 2025

Resolves #1538

Added a command that syncs external OWASP GitHub organization's repositories.

Example Change:

  • Project Juice Shop now has 12 repositories linked instead of 3.

Notes/Questions:

  • The command uses the Repository model to get the parent project, however, the current existing implementation only returns one project. The issue stated that if more than one project then we must log and skip. However, I don't know how to know if there are multiple projects linked.
  • What flags should the command have?
    since we are going to fetch a lot of organization's repositories, adding an offset may complicate things or cause confusion. Adding a flag for repository also doesn't make sense to me. However, we can add a flag for organization. Then only fetch that organization.

Copy link
Contributor

coderabbitai bot commented Jun 23, 2025

Summary by CodeRabbit

  • New Features

    • Added the ability to update and synchronize external OWASP-related GitHub organizations and their repositories.
    • Introduced new ways to query and access OWASP-related organizations and their associated projects.
  • Bug Fixes

    • Improved handling and error reporting for invalid GitHub credentials and missing organizations during synchronization.
  • Tests

    • Added comprehensive tests to ensure correct synchronization and error handling for OWASP-related GitHub organizations.
  • Chores

    • Updated database schema to enhance relationships between organizations and repositories.
      """

Summary by CodeRabbit

  • New Features

    • Added the ability to fetch and update external OWASP-related GitHub repositories.
    • Introduced new ways to query and manage OWASP-related organizations and their associated projects.
  • Bug Fixes

    • Improved database relationships for organizations and repositories to support enhanced querying and data integrity.
  • Tests

    • Added comprehensive tests for the new management command to ensure correct behavior and error handling.

Walkthrough

This change introduces a new Django management command to synchronize repositories from external OWASP-related GitHub organizations. It adds supporting model logic, a Makefile target for automation, a database migration, and comprehensive unit tests. The update ensures repositories from these organizations are fetched and linked to projects, with error handling and logging.

Changes

Files / Grouped Files Change Summary
backend/Makefile, backend/apps/github/Makefile Added Makefile targets to automate syncing of external OWASP-related organizations.
backend/apps/github/management/commands/github_update_owasp_related_organizations.py New management command to sync repositories from OWASP-related external organizations.
backend/apps/github/models/managers/organization.py, backend/apps/github/models/organization.py Added manager and property for querying/filtering OWASP-related organizations and their projects.
backend/apps/github/models/repository.py, backend/apps/github/migrations/0031_alter_repository_organization.py Added/updated related_name="repositories" for organization ForeignKey; migration for schema change.
backend/tests/apps/github/management/commands/github_update_owasp_related_organizations_test.py New unit tests for the external organizations sync management command.

Assessment against linked issues

Objective Addressed Explanation
Implement external organizations repositories sync using github.Organization model, excluding unrelated orgs (#1538)
Apply repository sync logic similar to existing OWASP/project sync commands (#1538)
Link new repositories to projects using organization repositories; skip and log error if org has >1 project (#1538)

Assessment against linked issues: Out-of-scope changes

No out-of-scope changes were found. All functional code changes directly support the objectives described in the linked issue.
"""


📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a1517df and 5342e38.

📒 Files selected for processing (1)
  • backend/tests/apps/github/management/commands/github_update_owasp_related_organizations_test.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/tests/apps/github/management/commands/github_update_owasp_related_organizations_test.py
✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate Unit Tests
  • Create PR with Unit Tests
  • Post Copyable Unit Tests in Comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai auto-generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@rudransh-shrivastava rudransh-shrivastava changed the title feat: add command to sync external OWASP repositories Add a Command to Sync External OWASP Repositories Jun 23, 2025
@rudransh-shrivastava
Copy link
Contributor Author

I am yet to add tests for the command.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0f66b66 and 95fed67.

📒 Files selected for processing (2)
  • backend/Makefile (2 hunks)
  • backend/apps/github/management/commands/github_update_external_repositories.py (1 hunks)
🧰 Additional context used
🪛 Pylint (3.3.7)
backend/apps/github/management/commands/github_update_external_repositories.py

[refactor] 24-24: Too many local variables (20/15)

(R0914)


[refactor] 19-19: Too few public methods (1/2)

(R0903)

🔇 Additional comments (4)
backend/Makefile (1)

61-64: LGTM! Consistent integration of the new command.

The new Makefile target follows the established pattern and is properly integrated into the data update workflow.

backend/apps/github/management/commands/github_update_external_repositories.py (3)

1-17: LGTM! Clean imports and setup.

The imports are well-organized and appropriate for the command's functionality.


40-43: LGTM! Appropriate organization filtering logic.

The query correctly filters for OWASP-related organizations while excluding the main OWASP organization.


77-81: Address missing multiple projects detection logic.

According to the PR objectives, the command should detect and handle cases where a repository is linked to multiple projects. The current implementation doesn't handle this scenario.

Based on the PR objectives, this logic needs to be implemented. Can you clarify how to detect multiple linked projects using the Repository model?

#!/bin/bash
# Search for Repository model and related project relationships
ast-grep --pattern 'class Repository {
  $$$
}'

# Search for methods that might return project relationships
rg -A 5 "def.*project" --type py

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (7)
backend/tests/apps/github/management/commands/github_update_external_repositories_test.py (7)

12-21: Fix formatting and approve fixture design.

The fixtures are well-designed for testing isolation. However, there are formatting issues that need to be addressed.

Apply this diff to fix the formatting issues:

+

 @pytest.fixture
 def command():
     return Command()

+
 @pytest.fixture
 def mock_gh_repository():

23-30: Fix parametrize syntax and formatting.

The test parametrization should use a tuple instead of a string for the first argument.

Apply this diff to fix the parametrize syntax and formatting:

+

 @pytest.mark.parametrize(
-    "num_orgs, num_repos_per_org, expected_sync_calls, expected_bulk_save_calls",
+    ("num_orgs", "num_repos_per_org", "expected_sync_calls", "expected_bulk_save_calls"),
     [
         (1, [2], 2, 1),    # 1 org with 2 repos
         (2, [1, 3], 4, 2),  # 2 orgs with 1 and 3 repos
         (2, [0, 2], 2, 1),  # 1 org with repos, 1 without
     ]
 )

74-74: Fix unused lambda argument.

The lambda function has an unused parameter which should be replaced with an underscore.

Apply this diff:

-    mock_sync_repository.side_effect = lambda gh_repo: (orgs[0], mock.Mock(project=mock_project))
+    mock_sync_repository.side_effect = lambda _: (orgs[0], mock.Mock(project=mock_project))

81-84: Fix unused loop variable.

The loop variable i is not used within the loop body. Use an underscore to indicate it's intentionally unused.

Apply this diff:

     assert mock_gh.get_organization.call_count == num_orgs
-    for i, org in enumerate(orgs):
+    for _, org in enumerate(orgs):
         mock_gh.get_organization.assert_any_call(org.login)

89-96: Fix parametrize syntax and formatting.

Same issues as the previous test function need to be addressed.

Apply this diff:

+

 @pytest.mark.parametrize(
-    "num_repos, expected_project_count",
+    ("num_repos", "expected_project_count"),
     [
         (1, 1),  # 1 repo with  1 project
         (3, 3),  # 3 repos with 3 projects
         (2, 0),  # 2 repos without 0 projects
     ]
 )

98-122: Approve test logic and add missing newline.

The test logic is well-structured and covers different scenarios appropriately. The mocking strategy is sound and the assertions are comprehensive.

Add a newline at the end of the file:

     assert len(projects) == expected_project_count
     assert mock_sync_repository.call_count == num_repos
+

1-122: Excellent test coverage with minor formatting improvements needed.

This test file provides comprehensive coverage for the new github_update_external_repositories command. The parametrized tests efficiently cover multiple scenarios and the mocking strategy properly isolates the command logic from external dependencies.

Consider adding tests for error scenarios (e.g., GitHub API failures, authentication issues) to make the test suite even more robust.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7c69f95 and 2e224ff.

📒 Files selected for processing (2)
  • backend/Makefile (2 hunks)
  • backend/tests/apps/github/management/commands/github_update_external_repositories_test.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/Makefile
🧰 Additional context used
🪛 Ruff (0.11.9)
backend/tests/apps/github/management/commands/github_update_external_repositories_test.py

24-24: Wrong type passed to first argument of pytest.mark.parametrize; expected tuple

Use a tuple for the first argument

(PT006)


74-74: Unused lambda argument: gh_repo

(ARG005)


82-82: Loop control variable i not used within loop body

(B007)


90-90: Wrong type passed to first argument of pytest.mark.parametrize; expected tuple

Use a tuple for the first argument

(PT006)


122-122: No newline at end of file

Add trailing newline

(W292)

🪛 Flake8 (7.2.0)
backend/tests/apps/github/management/commands/github_update_external_repositories_test.py

[error] 12-12: expected 2 blank lines, found 1

(E302)


[error] 16-16: expected 2 blank lines, found 1

(E302)


[error] 23-23: expected 2 blank lines, found 1

(E302)


[error] 89-89: expected 2 blank lines, found 1

(E302)

🪛 Pylint (3.3.7)
backend/tests/apps/github/management/commands/github_update_external_repositories_test.py

[refactor] 36-36: Too many arguments (10/5)

(R0913)


[refactor] 36-36: Too many positional arguments (10/5)

(R0917)


[refactor] 36-36: Too many local variables (19/15)

(R0914)

🔇 Additional comments (1)
backend/tests/apps/github/management/commands/github_update_external_repositories_test.py (1)

1-10: Imports look good and comprehensive.

The imports properly include all necessary components for testing the Django management command with appropriate mocking.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
backend/tests/apps/github/management/commands/github_update_external_repositories_test.py (1)

97-104: Clarify test parameter documentation for better understanding

The comment on line 102 ("2 repos without 0 projects") is confusing and doesn't clearly explain the test scenario.

     @pytest.mark.parametrize(
         ("num_repos", "expected_project_count"),
         [
-            (1, 1),  # 1 repo with  1 project
-            (3, 3),  # 3 repos with 3 projects
-            (2, 0),  # 2 repos without 0 projects
+            (1, 1),  # 1 repo that successfully syncs to 1 project
+            (3, 3),  # 3 repos that successfully sync to 3 projects
+            (2, 0),  # 2 repos that fail to sync (return None projects)
         ],
     )
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2e224ff and 7e71903.

📒 Files selected for processing (1)
  • backend/tests/apps/github/management/commands/github_update_external_repositories_test.py (1 hunks)
🧰 Additional context used
🪛 Pylint (3.3.7)
backend/tests/apps/github/management/commands/github_update_external_repositories_test.py

[refactor] 43-43: Too many arguments (10/5)

(R0913)


[refactor] 43-43: Too many positional arguments (10/5)

(R0917)


[refactor] 43-43: Too many local variables (19/15)

(R0914)

🔇 Additional comments (4)
backend/tests/apps/github/management/commands/github_update_external_repositories_test.py (4)

1-10: LGTM: Clean and appropriate imports

The imports are well-organized and include all necessary components for testing the management command with proper mocking support.


13-24: LGTM: Well-structured test fixtures

The fixtures provide clean test setup with appropriate mock objects that represent the expected structure of the command and GitHub repository objects.


125-127: LGTM: Comprehensive test assertions

The test properly validates both the project count and the number of synchronization calls, ensuring the method behaves correctly across different scenarios.


118-122: ```shell
#!/bin/bash
set -e

echo "===== External Repositories Command (class Command) ====="
rg -n "class Command" -A 200 --glob "backend/apps/github/management/commands/github_update_external_repositories.py"


</details>

</blockquote></details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
backend/tests/apps/github/management/commands/github_update_external_repositories_test.py (1)

76-83: Function complexity remains high despite improvements.

While the test logic is sound and comprehensive, the function still has 6 parameters which exceeds the recommended limit. The previous suggestion to refactor using helper methods or a test class structure would still improve maintainability.

🧹 Nitpick comments (1)
backend/tests/apps/github/management/commands/github_update_external_repositories_test.py (1)

39-39: Consider creating unique repository mocks for better test realism.

Using the same mock_gh_repository reference for all repositories in an organization may not accurately simulate real scenarios where repositories have different properties.

-        gh_repos.__iter__.return_value = [mock_gh_repository] * num_repos_per_org[i]
+        # Create unique mock repositories for each repo
+        repo_mocks = []
+        for j in range(num_repos_per_org[i]):
+            repo_mock = mock.Mock()
+            repo_mock.name = f"repo-{i}-{j}"
+            repo_mock.html_url = f"https://github.com/{org.login}/repo-{i}-{j}"
+            repo_mocks.append(repo_mock)
+        gh_repos.__iter__.return_value = repo_mocks
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7e71903 and 6723f39.

📒 Files selected for processing (1)
  • backend/tests/apps/github/management/commands/github_update_external_repositories_test.py (1 hunks)
🧰 Additional context used
🪛 Pylint (3.3.7)
backend/tests/apps/github/management/commands/github_update_external_repositories_test.py

[refactor] 76-76: Too many arguments (6/5)

(R0913)


[refactor] 76-76: Too many positional arguments (6/5)

(R0917)

🔇 Additional comments (6)
backend/tests/apps/github/management/commands/github_update_external_repositories_test.py (6)

1-11: Clean and well-organized imports.

The imports are appropriately structured and include all necessary modules for comprehensive testing.


14-24: Well-designed fixtures following pytest best practices.

Both fixtures provide clean abstractions that will be reusable across tests.


46-51: Excellent use of NamedTuple for test scenarios.

This approach significantly improves test readability compared to raw tuples in parametrized tests.


98-102: Good improvement to mock behavior accuracy.

The sync_side_effect function properly addresses the previous review concern by providing more realistic mock behavior that varies based on the repository being processed.


107-115: Comprehensive and well-structured test assertions.

The assertions effectively verify all critical aspects of the command execution, from setup to core functionality.


118-147: Well-designed focused test with clean structure.

This test effectively isolates and validates the sync_organization_repositories method with clear parametrization and appropriate assertions.

@rudransh-shrivastava rudransh-shrivastava marked this pull request as ready for review June 25, 2025 16:43
Copy link
Collaborator

@arkid15r arkid15r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good PR in terms of structure and overall logic. I have a couple questions I couldn't figure out (haven't dealt w/ this part of OWASP Nest for a while):

@arkid15r
Copy link
Collaborator

@rudransh-shrivastava please also resolve coderabbit and sonar comments/issues.

@rudransh-shrivastava
Copy link
Contributor Author

@arkid15r about a coderabbit comment, I'm not sure about the flags I should add to this command. Should I add a --organization flag?
I will resolve all other comments and issues.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
backend/tests/apps/github/management/commands/github_update_external_repositories_test.py (2)

127-128: Consider asserting on print output or documenting suppression intent.

The print patch is applied but not used in assertions. If this is intentional for output suppression, consider adding a comment to clarify the intent.

-        with mock.patch("builtins.print"):
+        with mock.patch("builtins.print"):  # Suppress command output during testing
             self.command.handle()

138-167: Consider adding error scenario test cases.

The current tests cover successful synchronization scenarios but don't test error conditions like GitHub API failures or network timeouts.

Consider adding test cases for error scenarios:

@pytest.mark.parametrize(
    ("side_effect", "expected_project_count"),
    [
        (Exception("GitHub API error"), 0),  # API failure
        (ConnectionError("Network timeout"), 0),  # Network error
    ],
)
@mock.patch("apps.github.management.commands.github_update_external_repositories.sync_repository")
def test_sync_organization_repositories_error_handling(
    mock_sync_repository, command, mock_gh_repository, side_effect, expected_project_count
):
    """Test repository synchronization error handling."""
    # Test setup and assertions for error scenarios
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6723f39 and 3bc2b2d.

📒 Files selected for processing (4)
  • backend/Makefile (1 hunks)
  • backend/apps/github/Makefile (1 hunks)
  • backend/apps/github/management/commands/github_update_external_repositories.py (1 hunks)
  • backend/tests/apps/github/management/commands/github_update_external_repositories_test.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • backend/apps/github/Makefile
🚧 Files skipped from review as they are similar to previous changes (2)
  • backend/Makefile
  • backend/apps/github/management/commands/github_update_external_repositories.py
🔇 Additional comments (5)
backend/tests/apps/github/management/commands/github_update_external_repositories_test.py (5)

1-11: LGTM: Clean imports and proper command testing structure.

The imports are well-organized and correctly reference the command module and its components.


13-24: LGTM: Well-defined fixtures with appropriate mocking.

The fixtures are properly structured and provide good test isolation. The mock GitHub repository fixture includes the essential attributes needed for testing.


26-49: LGTM: Comprehensive test data setup with proper mock specifications.

The helper function effectively creates realistic test scenarios with properly configured mocks. Good use of spec=Organization for type safety.


51-56: LGTM: Clean data structure for parameterized testing.

The Scenario NamedTuple provides a clear and maintainable way to define test parameters.


85-107: LGTM: Comprehensive test scenarios with good coverage.

The parameterized test scenarios effectively cover different combinations of organizations and repositories, including edge cases like organizations with zero repositories.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3bc2b2d and 0f1f9ca.

📒 Files selected for processing (1)
  • backend/tests/apps/github/management/commands/github_update_external_repositories_test.py (1 hunks)
🔇 Additional comments (8)
backend/tests/apps/github/management/commands/github_update_external_repositories_test.py (8)

1-11: LGTM - Clean imports and proper module structure.

The imports are well-organized with appropriate typing, mocking, and test dependencies. The import of specific classes from the command module ensures proper testing isolation.


14-25: LGTM - Well-structured test fixtures.

Both fixtures are appropriately scoped and provide clean, reusable test objects with sensible default values.


27-50: LGTM - Comprehensive helper function for test setup.

The setup_organizations function effectively creates mock organizations and GitHub objects with the required attributes and relationships. The parametrized approach allows testing various scenarios efficiently.


52-57: LGTM - Clear data structure for test parametrization.

The Scenario NamedTuple provides a clean way to organize test parameters with descriptive field names that make the test cases self-documenting.


59-85: LGTM - Comprehensive test setup with proper mocking.

The autouse setup fixture properly isolates the command under test by mocking all external dependencies including GitHub API, database operations, and environment variables.


86-138: LGTM - Thorough parametrized testing of main command functionality.

The test scenarios cover important edge cases including single/multiple organizations, varying repository counts, and empty organizations. The assertions verify both call counts and specific interactions with mocked dependencies.


140-170: LGTM - Focused testing of repository synchronization logic.

This test effectively isolates and validates the repository synchronization functionality with clear parametrization covering successful and failed sync scenarios.


172-225: LGTM - Comprehensive error handling test with minor fix needed.

The test effectively covers different error scenarios including recoverable errors (generic Exception, ConnectionError) and terminal errors (BadCredentialsException). The logic correctly expects immediate termination for credential errors while continuing processing for other errors. However, there's a duplicate assignment issue on lines 207-208 that needs correction.

@rudransh-shrivastava rudransh-shrivastava force-pushed the feature/sync-external-repositories branch from 0f1f9ca to f6c7cdd Compare June 28, 2025 10:29
@rudransh-shrivastava
Copy link
Contributor Author

ready for review.

@arkid15r
Copy link
Collaborator

@arkid15r about a coderabbit comment, I'm not sure about the flags I should add to this command. Should I add a --organization flag? I will resolve all other comments and issues.

Yes, having this one would be useful

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
backend/apps/github/management/commands/github_update_external_repositories.py (1)

60-60: Use appropriate logging level for non-exception cases.

This is not an exception context, so logger.error would be more appropriate than logger.exception.

Apply this diff:

-                logger.exception("Organization '%s' not found in the database", organization)
+                logger.error("Organization '%s' not found in the database", organization)
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f6c7cdd and 983f21e.

📒 Files selected for processing (2)
  • backend/apps/github/management/commands/github_update_external_repositories.py (1 hunks)
  • backend/tests/apps/github/management/commands/github_update_external_repositories_test.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
backend/apps/github/management/commands/github_update_external_repositories.py (1)
backend/apps/github/common.py (1)
  • sync_repository (25-229)
🔇 Additional comments (11)
backend/apps/github/management/commands/github_update_external_repositories.py (4)

23-35: LGTM! The argument parsing is well implemented.

The --organization flag is properly configured with appropriate help text and optional nature, addressing the PR objectives to allow filtering by specific organization.


45-50: LGTM! Proper environment variable validation.

The GitHub token validation correctly checks for existence and provides clear error messaging, which addresses the past review concerns about token handling.


72-84: LGTM! Excellent exception handling.

The exception handling properly catches BadCredentialsException specifically and falls back to generic exception handling, which addresses the past review concerns about exception specificity.


88-104: LGTM! Clean repository synchronization with proper error handling.

The method correctly calls sync_repository without attempting to bulk save the results, which aligns with the reviewer feedback about not creating/updating projects. The exception handling follows the same pattern as the main method.

backend/tests/apps/github/management/commands/github_update_external_repositories_test.py (7)

13-24: LGTM! Clean and appropriate fixtures.

The fixtures are well-structured, providing the necessary command instance and mock GitHub repository for testing.


26-48: LGTM! Well-designed helper function.

The setup_organizations function effectively creates realistic mock data for organizations and repositories, properly configuring the mock objects to simulate GitHub API responses.


51-55: LGTM! Good use of NamedTuple for test scenarios.

Using Scenario as a NamedTuple improves test readability and maintainability for the parameterized tests.


58-77: LGTM! Comprehensive test setup with proper mocking.

The autouse fixture properly mocks all external dependencies including the GitHub client, sync function, and database queries, which allows for isolated unit testing.


79-117: LGTM! Thorough parameterized testing.

The test covers multiple scenarios effectively, including organizations with varying numbers of repositories and edge cases like organizations with zero repositories. The assertions properly verify the expected behavior.


119-144: LGTM! Specific organization testing is thorough.

This test properly validates the --organization flag functionality, ensuring the command correctly filters to a specific organization and processes its repositories.


147-196: LGTM! Comprehensive error handling testing.

The parameterized error handling test covers various exception scenarios including generic errors, network errors, and credential errors. The test properly validates that the command continues processing after recoverable errors and handles credential errors appropriately.

@rudransh-shrivastava
Copy link
Contributor Author

@arkid15r Hello, I have added an organization flag, removed the bulk save stuff. Thank you very much for your reviews, they are really helpful.

I'm still unsure about this part of the issue, Use existing organization repositories linked projects for linking the new repositories to projects. If an organization has more than 1 project (get via Repository model) log an error and skip repository project setting. Please update the code as required, if there's anything more I can do, let me know!

Thank you!

@arkid15r arkid15r force-pushed the feature/sync-external-repositories branch from c5d050e to 2c3eb9f Compare June 29, 2025 20:17
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 43e1153 and 2c3eb9f.

📒 Files selected for processing (9)
  • backend/Makefile (1 hunks)
  • backend/apps/github/Makefile (1 hunks)
  • backend/apps/github/management/commands/github_update_external_repositories.py (1 hunks)
  • backend/apps/github/management/commands/github_update_owasp_related_organizations.py (1 hunks)
  • backend/apps/github/migrations/0031_alter_repository_organization.py (1 hunks)
  • backend/apps/github/models/managers/organization.py (1 hunks)
  • backend/apps/github/models/organization.py (4 hunks)
  • backend/apps/github/models/repository.py (1 hunks)
  • backend/tests/apps/github/management/commands/github_update_owasp_related_repositories_test.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • backend/apps/github/models/repository.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • backend/apps/github/Makefile
  • backend/Makefile
  • backend/apps/github/management/commands/github_update_external_repositories.py
🧰 Additional context used
🪛 Pylint (3.3.7)
backend/apps/github/models/managers/organization.py

[refactor] 8-8: Too few public methods (1/2)

(R0903)

backend/apps/github/migrations/0031_alter_repository_organization.py

[refactor] 7-7: Too few public methods (0/2)

(R0903)

backend/apps/github/management/commands/github_update_owasp_related_organizations.py

[refactor] 37-37: Too many local variables (16/15)

(R0914)

🔇 Additional comments (7)
backend/apps/github/migrations/0031_alter_repository_organization.py (1)

7-25: LGTM!

The migration correctly adds the related_name="repositories" to the organization ForeignKey field, enabling reverse querying from Organization instances to their repositories. The migration structure follows Django conventions.

backend/apps/github/management/commands/github_update_owasp_related_organizations.py (1)

23-36: LGTM!

The argument parsing is well-implemented with clear help text and appropriate parameter configuration.

backend/tests/apps/github/management/commands/github_update_owasp_related_repositories_test.py (1)

59-147: LGTM!

The test structure is well-designed with comprehensive scenarios, proper mocking, and clear assertions. The parameterized tests effectively cover different combinations of organizations and repositories.

backend/apps/github/models/organization.py (4)

3-3: LGTM!

The type hinting improvements with from __future__ import annotations and TYPE_CHECKING block follow modern Python best practices and help avoid circular import issues.

Also applies to: 6-6, 16-18


29-31: LGTM!

The manager setup is correctly implemented, maintaining the default objects manager while adding the specialized related_organizations manager.


52-62: LGTM!

The related_projects property is well-implemented using dynamic model import to avoid circular dependencies. The logic correctly filters projects based on repository relationships and uses distinct() to prevent duplicates.


94-94: LGTM!

The return type annotation improvement from string literal to direct class reference is a good enhancement that leverages the postponed evaluation of annotations.

Comment on lines +11 to +12
def get_queryset(self):
"""Get open milestones."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix the incorrect docstring.

The docstring mentions "Get open milestones" but this method filters OWASP-related organizations. Update the docstring to accurately describe the method's purpose.

-    def get_queryset(self):
-        """Get open milestones."""
+    def get_queryset(self):
+        """Get OWASP-related organizations excluding the main OWASP organization."""
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def get_queryset(self):
"""Get open milestones."""
def get_queryset(self):
"""Get OWASP-related organizations excluding the main OWASP organization."""
🤖 Prompt for AI Agents
In backend/apps/github/models/managers/organization.py around lines 11 to 12,
the docstring for the get_queryset method incorrectly states "Get open
milestones" while the method actually filters OWASP-related organizations.
Update the docstring to accurately describe that this method returns a queryset
filtered for OWASP-related organizations.

)
return

organizations = Organization.related_organizations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix manager access - missing .all() call.

The related_organizations is a manager, not a queryset. You need to call .all() to get the queryset.

-        organizations = Organization.related_organizations
+        organizations = Organization.related_organizations.all()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
organizations = Organization.related_organizations
- organizations = Organization.related_organizations
+ organizations = Organization.related_organizations.all()
🤖 Prompt for AI Agents
In
backend/apps/github/management/commands/github_update_owasp_related_organizations.py
at line 55, the code assigns the manager `Organization.related_organizations`
directly to `organizations` without calling `.all()`. To fix this, append
`.all()` to `Organization.related_organizations` to retrieve the queryset
instead of the manager.

Comment on lines +47 to +54
try:
gh = github.Github(os.getenv("GITHUB_TOKEN"), per_page=GITHUB_ITEMS_PER_PAGE)
except BadCredentialsException:
logger.warning(
"Invalid GitHub token. Please create and update .env file with a valid token."
)
return

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Improve GitHub client initialization error handling.

The BadCredentialsException is only raised when making API calls, not during client initialization. The current try-catch block won't catch authentication errors properly.

-        try:
-            gh = github.Github(os.getenv("GITHUB_TOKEN"), per_page=GITHUB_ITEMS_PER_PAGE)
-        except BadCredentialsException:
-            logger.warning(
-                "Invalid GitHub token. Please create and update .env file with a valid token."
-            )
-            return
+        github_token = os.getenv("GITHUB_TOKEN")
+        if not github_token:
+            logger.error("GitHub token not found. Please create and update .env file with a valid token.")
+            return
+        
+        gh = github.Github(github_token, per_page=GITHUB_ITEMS_PER_PAGE)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
try:
gh = github.Github(os.getenv("GITHUB_TOKEN"), per_page=GITHUB_ITEMS_PER_PAGE)
except BadCredentialsException:
logger.warning(
"Invalid GitHub token. Please create and update .env file with a valid token."
)
return
github_token = os.getenv("GITHUB_TOKEN")
if not github_token:
logger.error("GitHub token not found. Please create and update .env file with a valid token.")
return
gh = github.Github(github_token, per_page=GITHUB_ITEMS_PER_PAGE)
🤖 Prompt for AI Agents
In
backend/apps/github/management/commands/github_update_owasp_related_organizations.py
around lines 47 to 54, the try-except block incorrectly expects
BadCredentialsException during GitHub client initialization, but this exception
occurs during API calls. Remove the try-except around client creation and
instead add error handling for BadCredentialsException around actual API calls
to properly catch authentication errors.

Comment on lines +91 to +96
try:
_, repository = sync_repository(gh_repository)
organization.related_projects.first().repositories.add(repository)
except Exception:
logger.exception("Error syncing repository %s", repository_url)
continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Improve exception handling specificity.

The broad exception handling can mask important errors. Consider handling specific exceptions differently and adding more context to the error logging.

                try:
                    _, repository = sync_repository(gh_repository)
                    organization.related_projects.first().repositories.add(repository)
-                except Exception:
-                    logger.exception("Error syncing repository %s", repository_url)
-                    continue
+                except BadCredentialsException:
+                    logger.error("Invalid GitHub credentials. Please check your token.")
+                    return  # Exit entirely on auth failure
+                except Exception as e:
+                    logger.exception("Error syncing repository %s: %s", repository_url, str(e))
+                    continue
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
try:
_, repository = sync_repository(gh_repository)
organization.related_projects.first().repositories.add(repository)
except Exception:
logger.exception("Error syncing repository %s", repository_url)
continue
try:
_, repository = sync_repository(gh_repository)
organization.related_projects.first().repositories.add(repository)
except BadCredentialsException:
logger.error("Invalid GitHub credentials. Please check your token.")
return # Exit entirely on auth failure
except Exception as e:
logger.exception("Error syncing repository %s: %s", repository_url, str(e))
continue
🤖 Prompt for AI Agents
In
backend/apps/github/management/commands/github_update_owasp_related_organizations.py
around lines 91 to 96, the current exception handling uses a broad except
Exception clause which can obscure specific errors. Refactor the try-except
block to catch more specific exceptions that sync_repository might raise, and
add detailed context to the logger.exception call to improve error traceability.
Avoid using a bare except and ensure each exception type is handled or logged
appropriately.

Comment on lines 178 to 200
def test_sync_organization_repositories_error_handling(
mock_sync_repository, command, mock_gh_repository, side_effects
):
"""Test repository synchronization error handling."""
mock_organization = mock.Mock(spec=Organization)
mock_organization.login = "TestOrg"

mock_repositories = mock.MagicMock()
mock_repositories.totalCount = 2 # Try to sync 2 repositories
mock_repositories.__iter__.return_value = [mock_gh_repository] * 2

mock_sync_repository.side_effect = side_effects

with mock.patch(BUILTINS_PRINT), mock.patch("logging.getLogger") as mock_logger:
logger = mock.Mock()
mock_logger.return_value = logger
command.sync_organization_repositories(mock_organization, mock_repositories)

if isinstance(side_effects[0], BadCredentialsException):
logger.warning.assert_called_with(
"Invalid GitHub token. Please update .env file with a valid token."
)
assert mock_sync_repository.call_count == 2 # All cases try both repos
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix test method call - sync_organization_repositories doesn't exist.

The test calls command.sync_organization_repositories() but this method doesn't exist in the actual command. The command processes repositories within the handle method.

Consider refactoring the command to extract repository synchronization into a separate method, or update the test to properly test the handle method with error scenarios:

-        command.sync_organization_repositories(mock_organization, mock_repositories)
+        # Mock the organization queryset to return our test organization
+        with mock.patch(
+            "apps.github.management.commands.github_update_owasp_related_organizations.Organization.related_organizations.all"
+        ) as mock_orgs:
+            mock_orgs.return_value = [mock_organization]
+            mock_organization.related_projects.count.return_value = 1
+            mock_organization.related_projects.first.return_value.repositories.add = mock.Mock()
+            
+            command.handle(organization=None)

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In
backend/tests/apps/github/management/commands/github_update_owasp_related_repositories_test.py
around lines 178 to 200, the test calls a non-existent method
sync_organization_repositories on the command object. To fix this, either
refactor the command class to extract the repository synchronization logic into
a separate method that can be called directly in the test, or modify the test to
invoke the existing handle method and simulate error scenarios through its
inputs, ensuring the test properly triggers and verifies error handling during
repository synchronization.

@arkid15r
Copy link
Collaborator

@arkid15r Hello, I have added an organization flag, removed the bulk save stuff. Thank you very much for your reviews, they are really helpful.

I'm still unsure about this part of the issue, Use existing organization repositories linked projects for linking the new repositories to projects. If an organization has more than 1 project (get via Repository model) log an error and skip repository project setting. Please update the code as required, if there's anything more I can do, let me know!

Thank you!

Thank you, I implemented the required logic -- could you adjust the tests when you get a chance?

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
backend/tests/apps/github/management/commands/github_update_owasp_related_organizations_test.py (3)

20-26: Remove unused fixture.

The mock_gh_repository fixture is defined but never used in any of the tests.

Apply this diff to remove the unused fixture:

-@pytest.fixture
-def mock_gh_repository():
-    repo = mock.Mock()
-    repo.name = "test-repo"
-    repo.html_url = "https://github.com/TestOrg/test-repo"
-    return repo
-
-

163-164: Consider verifying print output instead of just suppressing it.

All tests patch builtins.print but don't verify what gets printed. If the command uses print for user output, consider capturing and asserting on the output.

Example of capturing and verifying print output:

-        with mock.patch(BUILTINS_PRINT):
-            self.command.handle(organization=None)
+        with mock.patch(BUILTINS_PRINT) as mock_print:
+            self.command.handle(organization=None)
+        
+        # Verify expected output
+        mock_print.assert_any_call("Processing TestOrg1...")

Also applies to: 180-181, 193-194, 204-205, 217-218, 238-239


245-245: Add newline at end of file.

Add a newline character at the end of the file to follow best practices.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2c3eb9f and a1517df.

📒 Files selected for processing (1)
  • backend/tests/apps/github/management/commands/github_update_owasp_related_organizations_test.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
backend/tests/apps/github/management/commands/github_update_owasp_related_organizations_test.py (1)
backend/apps/github/models/organization.py (1)
  • related_projects (53-61)
🔇 Additional comments (1)
backend/tests/apps/github/management/commands/github_update_owasp_related_organizations_test.py (1)

100-244: Excellent test coverage!

The test suite comprehensively covers various scenarios including:

  • Multiple organizations with different repository counts
  • Specific organization filtering
  • Error handling for invalid tokens and missing organizations
  • Exception handling during repository synchronization

The use of parameterized tests and well-structured mocks makes the tests maintainable and easy to understand.

Copy link

sonarqubecloud bot commented Jul 1, 2025

@rudransh-shrivastava
Copy link
Contributor Author

@arkid15r I've adjusted the tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement external organizations sync
2 participants