Beyond Unit Tests: Implementing BDD and Penetration Testing in GBIM
🇺🇸 United States, May 11, 2026

Originally published by Dev.to

Claim Summary

This claim demonstrates that the GBIM project has implemented advanced testing across four key areas:

| Area | Tool/Approach | Primary Evidence |
| --- | --- | --- |
| Security Testing | GitLab SAST, Dependency Scanning, Secret Detection, OWASP ZAP baseline | Successful FE security pipeline; availability of BE secret detection, BDD abuse paths, and ZAP baseline jobs. |
| Penetration Testing | Abuse-case testing based on OWASP Top 10 | 8 abuse path scenarios covering login, registration, activation tokens, and admin submissions. |
| BDD | Behave Django and Playwright BDD | BE: 3 features, 12 scenarios, 57 steps passed. FE: 3 Playwright BDD tests discovered with a successful pipeline. |
| Stress Testing | Locust | Scenarios for registration bursts, authenticated activities, QA, and admin dashboards prepared as manual headless runs with HTML/CSV output. |

The strongest claims lie in BDD, penetration testing, and the frontend security pipeline, all of which show "green" execution outputs. For stress testing, the evidence focuses on Locust's manual run readiness and realistic user scenarios rather than pipeline gates, as results are heavily influenced by staging capacity, seed data, and email side effects.

MR Links and Commits

Merge Requests

Key Backend Commits

  • 64046e3c: Baseline setup for security templates, BDD, and load-test documentation.
  • 8b75a9ed: Backend BDD hardening with real DRF endpoints and penetration test scenarios.
  • f9f62adb: Added behave and behave-django to requirements.txt for environment consistency.
  • dd64fdf1: Finalized CI policy by removing k6 pipeline gates in favor of manual Locust runs.

Key Frontend Commits

  • fff3cb1f: Baseline setup for security templates and Playwright BDD scaffolding.
  • dd625101: Hardened BDD with API-backed flows and CI e2e setup.
  • 1e5beadb: Enabled MR workflow with security analyzers and auto-run BDD discovery.

Execution Evidence

Backend Local Verification

| Check | Result |
| --- | --- |
| Django System Check | System check identified no issues |
| Behave BDD | 3 features, 12 scenarios, 57 steps passed; 0 failed |
| Regression Subset | 65 tests passed (Auth and Admin views) |
| Locust Package | locust 2.43.4 verified in virtual environment |
| Staging Baseline | POST /api/auth/register/ returned HTTP 202 in 5.13s |

Frontend Verification

The frontend pipeline 279116 shows successful completion for e2e-bdd, semgrep-sast, secret_detection, and sonarqube.

Demonstrating Advanced Testing Tools

Security Testing

We integrated GitLab security templates to automate scanning for vulnerabilities. The frontend pipeline confirms that semgrep-sast, gemnasium-dependency_scanning, and secret_detection run to completion. The backend supplements this with OWASP ZAP baseline jobs and abuse-case BDD tests.

Penetration Testing

Instead of a manual checklist, we implemented penetration testing as functional abuse-case tests mapped to the OWASP Top 10 2021:

  • Broken Access Control (A01): Non-admins receive a 403 Forbidden when accessing admin endpoints.
  • Insecure Design (A04): Registration attempts are throttled with a 429 once the rate limit is reached.
  • Identification and Authentication Failures (A07): Brute-force protection triggers a lockout (429) on the login endpoint.
  • Software and Data Integrity Failures (A08): Mass assignment attempts (e.g., trying to register as a superuser) are rejected with a 400 error.
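
The abuse cases above can be sketched as a table-driven check. This is a minimal illustration, not the project's Behave suite: `FakeApi`, its lockout threshold, its field names, and the 401 returned for a failed login are all hypothetical stand-ins for the real DRF endpoints.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


class FakeApi:
    """Hypothetical in-memory stand-in for the backend (not the GBIM code)."""

    LOGIN_LIMIT = 3  # assumed lockout threshold
    ALLOWED_REGISTER_FIELDS = {"email", "password"}  # assumed serializer fields

    def __init__(self) -> None:
        self.failed_logins: Dict[str, int] = {}

    def admin_list(self, role: str) -> int:
        # A01: only admins may read admin endpoints.
        return 200 if role == "admin" else 403

    def login(self, email: str, password: str) -> int:
        # A07: refuse further attempts once the failure limit is reached.
        if self.failed_logins.get(email, 0) >= self.LOGIN_LIMIT:
            return 429
        if password != "correct-password":
            self.failed_logins[email] = self.failed_logins.get(email, 0) + 1
            return 401
        return 200

    def register(self, payload: Dict[str, object]) -> int:
        # A08: reject unknown fields such as is_superuser instead of ignoring them.
        if set(payload) - self.ALLOWED_REGISTER_FIELDS:
            return 400
        return 202


@dataclass(frozen=True)
class AbuseCase:
    owasp_id: str
    description: str
    attempt: Callable[[], int]  # performs the abusive request, returns the status
    expected_status: int


def brute_force(api: FakeApi, email: str, attempts: int) -> int:
    """Send repeated wrong passwords and return the status of the last attempt."""
    status = 0
    for _ in range(attempts):
        status = api.login(email, "wrong-password")
    return status


def run_abuse_cases(cases: List[AbuseCase]) -> List[Tuple[str, int, int]]:
    """Return (owasp_id, expected, actual) for every case that did not match."""
    failures = []
    for case in cases:
        actual = case.attempt()
        if actual != case.expected_status:
            failures.append((case.owasp_id, case.expected_status, actual))
    return failures


api = FakeApi()
CASES = [
    AbuseCase("A01", "non-admin reads admin list",
              lambda: api.admin_list("student"), 403),
    AbuseCase("A07", "fourth wrong password hits the lockout",
              lambda: brute_force(api, "user@example.com", 4), 429),
    AbuseCase("A08", "registration tries to set is_superuser",
              lambda: api.register({"email": "user@example.com",
                                    "password": "x", "is_superuser": True}), 400),
]
failures = run_abuse_cases(CASES)
```

Keeping the expected status next to the OWASP identifier makes each scenario read as a statement of the defence being verified, which is the same idea the BDD abuse-path scenarios express in Gherkin.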

BDD (Behavior-Driven Development)

We use Gherkin syntax to create executable specifications.

  • Backend: Uses behave-django to test registration and admin flows.
  • Frontend: Uses Playwright BDD to ensure UI interactions reflect real-world user stories while communicating with the actual backend API.
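
To show how a Gherkin line becomes executable, here is a toy step runner in plain Python. It is not how behave-django or Playwright BDD are implemented; the `step` decorator, `run_scenario`, and the scenario text are illustrative assumptions, and the `When` step sets a hard-coded 202 where a real suite would call the API.

```python
import re
from types import SimpleNamespace

STEPS = []  # registry of (compiled pattern, step function) pairs


def step(pattern):
    """Register a function as the implementation of a Gherkin step."""
    def decorator(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return decorator


@step(r'a visitor with email "(?P<email>[^"]+)"')
def given_visitor(context, email):
    context.email = email


@step(r"the visitor submits the registration form")
def when_register(context):
    # Stand-in for a real request to the registration endpoint.
    context.status = 202


@step(r"the response status is (?P<code>\d+)")
def then_status(context, code):
    assert context.status == int(code)


def run_scenario(text):
    """Run each Gherkin line against the registry; return the number of steps run."""
    context = SimpleNamespace()  # shared state, similar in spirit to a BDD context
    executed = 0
    for line in text.strip().splitlines():
        # Drop the leading keyword (Given/When/Then/And) and match the rest.
        _keyword, _, body = line.strip().partition(" ")
        for pattern, func in STEPS:
            match = pattern.fullmatch(body)
            if match:
                func(context, **match.groupdict())
                executed += 1
                break
        else:
            raise LookupError(f"no step matches: {body}")
    return executed


SCENARIO = """
Given a visitor with email "new@example.com"
When the visitor submits the registration form
Then the response status is 202
"""

steps_run = run_scenario(SCENARIO)
```

The point of the sketch is the shape: the feature text stays readable for reviewers, while each line is bound to a function that performs the action or the assertion.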

Stress Testing

Locust scripts model realistic user behavior for headless execution:

  • RegistrationBurstUser: Simulates high-volume registration at the start of a semester.
  • AdminDashboardUser: Profiles the latency of admin listing and status updates.
  • Note: We consciously avoid using stress tests as pipeline gates because staging environments and rate limits make results non-deterministic.

Measurable Project Benefits

| Issue Before Advanced Testing | Improvement | Concrete Data |
| --- | --- | --- |
| BDD was only a scaffold without real endpoint logic | Steps now call actual DRF APIs and assert DB/token states | 12 scenarios passed |
| Registration was vulnerable to email enumeration | Duplicate emails now return generic responses | Scenario returns 202 |
| Stress scripts used outdated API contracts | Updated payloads to match current serializers | Zero noise from malformed requests |
| Non-admin access lacked regression guards | BDD validates 403 responses for unauthorized roles | Admin endpoints secured |
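
The email enumeration fix in the table above can be illustrated with a small sketch. It assumes an in-memory store and a hypothetical `RegistrationService`; the message text and the outbox are invented for illustration, and only the 202 status comes from the results reported here.

```python
from typing import Dict, List, Set, Tuple

GENERIC_DETAIL = "If the address is eligible, an activation email will be sent."


class RegistrationService:
    """Hypothetical service that answers identically for new and known emails."""

    def __init__(self) -> None:
        self.registered: Set[str] = set()
        self.outbox: List[str] = []  # addresses that were sent an activation link

    def register(self, email: str) -> Tuple[int, Dict[str, str]]:
        normalized = email.strip().lower()
        if normalized not in self.registered:
            # Only a genuinely new address creates an account and gets a link.
            self.registered.add(normalized)
            self.outbox.append(normalized)
        # The caller sees the same status and body either way, so the response
        # cannot be used to probe which addresses already exist.
        return 202, {"detail": GENERIC_DETAIL}


service = RegistrationService()
first = service.register("student@example.com")
duplicate = service.register("Student@Example.com ")
```

Here `first` and `duplicate` are equal, while `service.outbox` holds a single entry: the difference between the two requests exists only on the server side.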

Literature and Best Practices

  • OWASP Top 10: Guided our penetration testing scenarios for access control and authentication.
  • Martin Fowler’s Test Pyramid: We maintain a solid base of unit tests while adding targeted BDD for critical user journeys.
  • Cucumber/Gherkin BDD: Provided a bridge between technical implementation and readable business logic.
  • GitLab DevSecOps: Integrated security scanning directly into the MR workflow to catch vulnerabilities early.

Critique and Quality Improvements

Critique 1: Testing lacked executable specifications for critical flows

Previously, registration and admin approval relied on scattered unit tests. While functional, they didn't provide a readable end-to-end journey for reviewers.
Improvement: BDD feature files now serve as the source of truth for how the system should behave, verified by 57 successful steps in the backend.

Critique 2: Security tests were not explicitly modeled as abuse cases

We had basic auth tests but no unified suite mapping OWASP risks to expected responses.
Improvement: We created a specific penetration test report and BDD scenarios that explicitly test for token tampering, expiry, and brute force protection.
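
As a rough sketch of what token tampering and expiry checks exercise, the following uses an HMAC-signed token built from the standard library. The token layout, `SECRET`, and `MAX_AGE_SECONDS` are assumptions made for illustration; the project's actual activation token scheme is not described here and may differ.

```python
import hashlib
import hmac

SECRET = b"demo-secret"  # placeholder key, for illustration only
MAX_AGE_SECONDS = 3600   # assumed token lifetime


def _sign(payload: str) -> str:
    return hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()


def make_token(user_id: int, issued_at: int) -> str:
    """Build a token of the form '<user_id>:<issued_at>:<signature>'."""
    payload = f"{user_id}:{issued_at}"
    return f"{payload}:{_sign(payload)}"


def check_token(token: str, now: int) -> str:
    """Return 'valid', 'expired', or 'tampered' for the given token."""
    try:
        user_id, issued_at, signature = token.split(":")
        issued = int(issued_at)
    except ValueError:
        return "tampered"  # wrong number of parts or non-numeric timestamp
    expected = _sign(f"{user_id}:{issued_at}")
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return "tampered"
    if now - issued > MAX_AGE_SECONDS:
        return "expired"
    return "valid"


token = make_token(7, issued_at=1000)
forged = token.replace("7:", "8:", 1)  # attacker swaps the user id
```

An abuse-case scenario would then assert that `forged` and an over-age token are both refused, and that only the untouched, in-date token activates the account.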

Critique 3: Stress testing metrics were previously misleading

Initial stress scripts used incorrect payloads and flagged 429 (Rate Limit) as a system failure.
Improvement: Scripts now use the actual API contract. We also distinguish between expected protection (a 429 response) and unexpected failure, ensuring our metrics reflect real system health.
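
The distinction described above can be sketched as a small classifier over response status codes. The category names and the choice to count every other 4xx/5xx as a failure are assumptions for illustration, not the project's Locust scripts.

```python
from collections import Counter
from typing import Dict, Iterable, Union

EXPECTED_PROTECTION = {429}  # throttling or lockout doing its job


def classify(status: int) -> str:
    """Bucket one HTTP status for load-test reporting."""
    if 200 <= status < 400:
        return "success"
    if status in EXPECTED_PROTECTION:
        return "expected_protection"
    # With payloads that match the API contract, any other error is a real signal.
    return "failure"


def summarize(statuses: Iterable[int]) -> Dict[str, Union[int, float]]:
    """Count each bucket and compute a failure rate that excludes 429s."""
    counts = Counter(classify(s) for s in statuses)
    total = sum(counts.values())
    return {
        "success": counts["success"],
        "expected_protection": counts["expected_protection"],
        "failure": counts["failure"],
        "failure_rate": counts["failure"] / total if total else 0.0,
    }


summary = summarize([202, 202, 429, 429, 500])
```

In Locust itself, a similar effect is usually achieved by issuing the request with `catch_response=True` and calling `response.success()` when a 429 is the expected outcome, so throttled requests do not inflate the failure statistics.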

Conclusion

This claim meets the advanced testing criteria because:

  1. Tools are integrated and utilized: Behave, Playwright, GitLab SAST, and Locust are active parts of the development cycle.
  2. Logic over automation: We treat a 429 response as a security filter working as intended, not as a failure of the system.
  3. Measurable impact: 12 BDD scenarios and 65 regression tests guard against critical business logic failures.
  4. Continuous Improvement: We identified weaknesses in our previous testing strategy and addressed them with repeatable security scans and realistic load profiles.
