What Happens During a Web Application Penetration Test? A Step-by-Step Walkthrough
Manual web application penetration testing is a security assessment in which certified testers attempt to exploit vulnerabilities in a web application using the same techniques a real attacker would use, without relying solely on automated scanners. Unlike vulnerability scans, a manual test verifies exploitability, chains multiple weaknesses into attack paths, and assesses business logic flaws that automated tools cannot detect. The result is a confirmed, prioritised list of vulnerabilities with proof-of-concept evidence and remediation guidance.
If you have signed off on a Statement of Work and are now wondering what actually happens between kickoff and final report delivery, this walkthrough covers the full engagement lifecycle. For each phase, it describes the techniques used, the tooling decisions involved, and the types of findings the phase is designed to surface.
The Seven Phases of a Manual Web Application Penetration Test
A structured engagement runs through these phases in sequence, though reconnaissance and enumeration often continue in parallel with active testing as new attack surface is discovered.
- Scoping and pre-engagement
- Reconnaissance and OSINT
- Application mapping and enumeration
- Authentication, authorisation, and session testing
- Business logic and functional testing
- Exploitation and attack chaining
- Reporting and remediation validation
Phase 1: Scoping and Pre-Engagement
Before any testing begins, the engagement is defined in writing. This phase establishes the test scope (domains, environments, IP ranges), the rules of engagement (testing windows, rate limits, out-of-scope systems), and the testing methodology (black box, grey box, or white box).
Grey box is the default for most SaaS clients. Testers receive standard user credentials and basic application documentation, which lets them skip the early guessing stages and spend their time on deeper, more meaningful attack paths. Black box is appropriate when the client wants to simulate an external attacker with zero prior knowledge. White box includes source code, architecture diagrams, and admin credentials, and is most appropriate for pre-launch security reviews.
A vague scoping document costs you coverage. If the staging environment is not explicitly included, the tester will not test it. If a third-party integration is not listed, it is out of scope. Before signing off, make sure the target list includes every subdomain, API endpoint group, and environment that matters.
See how IVASTA structures its scoping process for web application engagements.
Phase 2: Reconnaissance and OSINT
Reconnaissance maps everything publicly discoverable about the target before the tester sends a single authenticated request. This phase surfaces forgotten assets, exposed credentials, and technology fingerprints that shape the rest of the engagement.
Common reconnaissance activities include:
- Subdomain enumeration: passive sources (Certificate Transparency logs, SecurityTrails, Shodan) combined with active brute-forcing using wordlists tuned to the client's industry.
- Technology fingerprinting: identifying the application server, framework version, CDN, WAF presence, and JavaScript libraries via response headers, HTML comments, and file paths.
- OSINT for credentials: checking breach databases and public GitHub repositories for leaked API keys, credentials, or environment files belonging to the target organisation.
- Historical content analysis: Wayback Machine scraping to find endpoints that were removed from navigation but not from the server.
Reconnaissance findings directly influence which attack vectors get prioritised. An application running an outdated version of a framework with a known deserialization vulnerability moves to the top of the queue.
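As a rough illustration of the passive side of subdomain enumeration, the sketch below pulls unique hostnames out of Certificate Transparency search results. It assumes records shaped like the JSON that crt.sh returns, where a name_value field holds one or more newline-separated names. The function name, the sample records, and the example.com target are made up for illustration, and no network request is made.

```python
def hostnames_from_ct_records(records, root_domain):
    """Return sorted unique hostnames under root_domain found in CT records."""
    root = root_domain.lower().strip(".")
    found = set()
    for record in records:
        # A single certificate can list several names, one per line.
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower()
            # Wildcard entries such as *.example.com describe a pattern, not a host.
            if name.startswith("*."):
                name = name[2:]
            # Keep only names that belong to the in-scope root domain.
            if name == root or name.endswith("." + root):
                found.add(name)
    return sorted(found)


# Hypothetical sample data standing in for a CT log search response.
sample_records = [
    {"name_value": "www.example.com\napi.example.com"},
    {"name_value": "*.staging.example.com"},
    {"name_value": "API.example.com"},
    {"name_value": "example.org"},
]

print(hostnames_from_ct_records(sample_records, "example.com"))
# ['api.example.com', 'staging.example.com', 'www.example.com']
```

Filtering against the root domain matters in practice: names outside the agreed scope should not end up on the target list.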
Phase 3: Application Mapping and Enumeration
Mapping builds a complete picture of the application's attack surface before exploitation begins. Testers spider the application, capture all traffic through an intercepting proxy (Burp Suite is standard), and catalogue every endpoint, parameter, file upload handler, and API route.
For API-heavy SaaS applications, this phase includes:
- Importing any available OpenAPI or Swagger specifications and comparing documented routes against what the server actually responds to.
- Identifying unauthenticated endpoints that should require authentication.
- Mapping parameter types across GET and POST requests, including hidden fields, JSON body parameters, and GraphQL query structures.
- Identifying role-based access differences by comparing responses from accounts at different privilege levels.
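The comparison of documented routes against observed behaviour can be sketched as a set difference. This is a minimal sketch: it assumes the specification has already been parsed into a dictionary and that paths seen in proxy traffic have been normalised to the same templated form the specification uses. The function names and sample data are hypothetical.

```python
HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options", "trace"}


def documented_routes(openapi_spec):
    """Collect (METHOD, path) pairs from an OpenAPI document's `paths` object."""
    routes = set()
    for path, operations in openapi_spec.get("paths", {}).items():
        for method in operations:
            # Path items can also hold non-operation keys such as `parameters`.
            if method.lower() in HTTP_METHODS:
                routes.add((method.upper(), path))
    return routes


def compare_routes(openapi_spec, observed):
    """Split routes into undocumented (traffic only) and not observed (spec only)."""
    documented = documented_routes(openapi_spec)
    observed = set(observed)
    return {
        "undocumented": sorted(observed - documented),
        "not_observed": sorted(documented - observed),
    }


# Hypothetical spec fragment and proxy history.
spec = {
    "paths": {
        "/users/{id}": {"get": {}, "parameters": []},
        "/invoices": {"get": {}, "post": {}},
    }
}
proxy_history = [
    ("GET", "/users/{id}"),
    ("GET", "/invoices"),
    ("DELETE", "/admin/users/{id}"),
]

result = compare_routes(spec, proxy_history)
print(result["undocumented"])  # routes the server answers that the spec omits
print(result["not_observed"])  # documented routes not yet exercised
```

Undocumented routes are usually the more interesting half of the result, since they tend to receive less review than the published API.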
Phase 4: Authentication, Authorisation, and Session Testing
This phase targets the mechanisms that are supposed to control who can do what inside the application. It is where the most critical findings in web application testing tend to surface, particularly in multi-tenant SaaS products.
Authentication testing
Testers evaluate the login mechanism for weak lockout policies, username enumeration via differing error messages or response times, insecure password reset flows, and multi-factor authentication bypass techniques. If OAuth or SAML is in use, the implementation is tested for common misconfigurations, including lax validation of the redirect_uri parameter that permits open redirects, and missing or predictable state values.
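To make the username enumeration check concrete, the sketch below compares failed-login responses for an account known to exist against one known not to exist. It assumes the responses have already been collected (for example, from proxy history); the LoginResponse type, the function name, the 50 ms threshold, and the sample values are all illustrative assumptions rather than a standard.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class LoginResponse:
    status: int
    body: str
    elapsed_ms: float


def enumeration_signals(valid_user, invalid_user, timing_threshold_ms=50.0):
    """Compare failed-login responses for an existing and a non-existent username.

    Each argument is a list of LoginResponse samples collected with a wrong
    password. Returns the names of the signals that distinguish the two cases.
    """
    signals = []
    if {r.status for r in valid_user} != {r.status for r in invalid_user}:
        signals.append("status_code")
    if {r.body for r in valid_user} != {r.body for r in invalid_user}:
        signals.append("error_message")
    # Medians damp network jitter; the threshold is an arbitrary starting point.
    gap = abs(
        median(r.elapsed_ms for r in valid_user)
        - median(r.elapsed_ms for r in invalid_user)
    )
    if gap > timing_threshold_ms:
        signals.append("response_time")
    return signals


# Hypothetical samples: same status code, different message and timing.
existing = [
    LoginResponse(401, "Incorrect password", 182.0),
    LoginResponse(401, "Incorrect password", 175.0),
    LoginResponse(401, "Incorrect password", 190.0),
]
missing = [
    LoginResponse(401, "Unknown user", 41.0),
    LoginResponse(401, "Unknown user", 38.0),
    LoginResponse(401, "Unknown user", 45.0),
]

print(enumeration_signals(existing, missing))
# ['error_message', 'response_time']
```

An empty result does not prove the login is safe; it only means these three signals did not separate the two cases in the samples collected.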
Authorisation and access control testing
Broken Object Level Authorisation (BOLA) is the most common critical finding in API-heavy applications. A tester checks whether a low-privilege user can access or modify another user's resources by manipulating object identifiers in API calls.
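A typical BOLA test sequence looks like this: the tester authenticates as one user, captures a request that references an object that user owns (for example, an invoice ID), swaps in an identifier belonging to a second test account, and replays the request with the first user's session. If the server returns or modifies the second account's resource, the access control check is missing or broken.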
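In code, the core of that check can be sketched roughly as follows. The example runs against a toy in-memory API rather than a real target: the fetch callable, the invoice IDs, and the user names are hypothetical, and a real test would issue HTTP requests through the tester's own tooling.

```python
def check_bola(fetch, attacker_session, victim_object_ids):
    """Request objects owned by another user with the attacker's session.

    `fetch(session, object_id)` is a hypothetical callable returning an HTTP
    status code. Any 2xx response for a victim-owned object is flagged.
    """
    exposed = []
    for object_id in victim_object_ids:
        status = fetch(attacker_session, object_id)
        if 200 <= status < 300:
            exposed.append(object_id)
    return exposed


# Toy in-memory API standing in for the target; owners keyed by invoice ID.
INVOICE_OWNERS = {1001: "alice", 1002: "bob", 1003: "bob"}


def vulnerable_fetch(session, invoice_id):
    # Only checks that the caller is logged in and the invoice exists.
    if session is None:
        return 401
    return 200 if invoice_id in INVOICE_OWNERS else 404


def fixed_fetch(session, invoice_id):
    # Also checks that the invoice belongs to the caller.
    if session is None:
        return 401
    if INVOICE_OWNERS.get(invoice_id) != session:
        return 404
    return 200


bobs_invoices = [1002, 1003]
print(check_bola(vulnerable_fetch, "alice", bobs_invoices))  # [1002, 1003]
print(check_bola(fixed_fetch, "alice", bobs_invoices))       # []
```

The fixed handler returns 404 rather than 403 for objects the caller does not own, which avoids confirming that the identifier exists. That is one common design choice, not the only valid one.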
Broken Function Level Authorisation (BFLA) is tested separately: can a standard user call admin-only API functions by directly accessing the endpoint, bypassing the UI that hides those controls?
Phase 5: Business Logic and Functional Testing
Automated scanners fail almost entirely at business logic testing. This phase requires a tester to understand what the application is supposed to do and then probe the gaps between that intent and the actual implementation.
Examples of business logic flaws found in real SaaS engagements include:
- Negative quantity manipulation in e-commerce flows that credit the attacker's account rather than debiting it.
- Price parameter tampering where the client sends the unit price to the server and the server trusts it without verification.
- Multi-step workflow bypass where a user can skip step two of a three-step verification process by jumping directly to step three.
- Race conditions in account top-up or coupon redemption endpoints that allow the same resource to be consumed twice by sending concurrent requests.
These vulnerabilities do not show up in a scanner report. They require a tester who has read the product documentation, created legitimate accounts, and understands the financial or operational impact of successful exploitation.
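Two of the flaws listed above, price parameter tampering and negative quantity manipulation, can be illustrated with a small sketch. It contrasts a handler that trusts client-supplied values with one that validates them server-side. The catalogue, SKU names, and function names are hypothetical and not taken from any real engagement.

```python
from decimal import Decimal

# Hypothetical server-side catalogue; the client never supplies prices.
CATALOGUE = {"sku-basic": Decimal("20.00"), "sku-pro": Decimal("50.00")}


def total_trusting_client(items):
    """Vulnerable: uses the client's unit price and accepts any quantity."""
    return sum(Decimal(i["unit_price"]) * i["quantity"] for i in items)


def total_validated(items):
    """Looks prices up server-side and rejects non-positive quantities."""
    total = Decimal("0")
    for item in items:
        quantity = item["quantity"]
        if not isinstance(quantity, int) or quantity < 1:
            raise ValueError(f"invalid quantity: {quantity!r}")
        if item["sku"] not in CATALOGUE:
            raise ValueError(f"unknown sku: {item['sku']!r}")
        # Any unit_price sent by the client is ignored.
        total += CATALOGUE[item["sku"]] * quantity
    return total


# Request bodies a tester might submit after intercepting a checkout call.
tampered_price = [{"sku": "sku-pro", "unit_price": "0.01", "quantity": 1}]
negative_quantity = [{"sku": "sku-pro", "unit_price": "50.00", "quantity": -2}]

print(total_trusting_client(tampered_price))     # 0.01
print(total_trusting_client(negative_quantity))  # -100.00
print(total_validated(tampered_price))           # 50.00
```

Passing negative_quantity to total_validated raises ValueError, which is the behaviour a tester hopes to see from the real application.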
Phase 6: Exploitation and Attack Chaining
Active exploitation confirms that a vulnerability is genuinely exploitable, not just theoretically present. This is the distinction that separates a manual penetration test from a vulnerability scan. Testers attempt to achieve proof-of-concept exploitation for every confirmed vulnerability, stopping at the agreed scope boundary.
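Attack chaining is where manual testing provides compounding value. Individual vulnerabilities that appear low severity in isolation can become critical when combined. For example, username enumeration paired with a weak lockout policy makes targeted password guessing practical, and lax redirect_uri validation in an OAuth flow can let an attacker capture an authorisation code and take over an account.
Each chain represents a realistic attacker path. Reporting the vulnerabilities individually would understate the actual risk. Chaining them demonstrates real-world business impact.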
Phase 7: Reporting and Remediation Validation
A penetration test is only as useful as its report. The deliverable from an IVASTA engagement includes a full technical report with an executive summary, a risk-rated findings register, and proof-of-concept evidence for every confirmed vulnerability.
How findings are structured
Each finding contains: a CVSS score with the scoring rationale, a description of the vulnerability class, step-by-step reproduction instructions, a screenshot or terminal output confirming exploitation, the business impact of successful exploitation, and specific remediation guidance written for your development team.
Findings are rated on a five-tier scale: Critical, High, Medium, Low, and Informational. A Critical finding means an unauthenticated attacker can achieve account takeover, data exfiltration, or remote code execution with a single exploit. Informational findings are documented for completeness but carry no immediate remediation obligation.
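The article does not say how a CVSS score translates into one of the five tiers, so the following is only a sketch of one plausible mapping. It uses the qualitative severity bands published with CVSS v3.x and assumes that a score of 0.0 (the CVSS "None" rating) corresponds to Informational. A testing firm may adjust ratings for business context, so treat the function as illustrative.

```python
def severity_tier(cvss_score):
    """Map a CVSS v3.x base score (0.0 to 10.0) to a five-tier rating."""
    if not 0.0 <= cvss_score <= 10.0:
        raise ValueError("CVSS base scores range from 0.0 to 10.0")
    if cvss_score >= 9.0:
        return "Critical"
    if cvss_score >= 7.0:
        return "High"
    if cvss_score >= 4.0:
        return "Medium"
    if cvss_score >= 0.1:
        return "Low"
    # Assumption: a score of 0.0 (CVSS "None") is reported as Informational.
    return "Informational"


for score in (9.8, 7.5, 5.3, 3.1, 0.0):
    print(score, severity_tier(score))
```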
Remediation validation
After developers address findings, a remediation validation (also called a retest) confirms that each fix works as intended and has not introduced a regression. IVASTA includes one round of remediation validation within the engagement scope for all Critical and High severity findings.
Why Manual Testing Finds What Scanners Miss
Automated vulnerability scanners are good at pattern matching against known signatures. They reliably identify unpatched software versions, basic SQL injection in unsanitised GET parameters, and common misconfigurations. They cannot reason about application behaviour.
IVASTA's testers hold OSCP certifications and conduct every engagement manually. Automated tooling is used for enumeration and traffic capture, not to determine exploitability. A scanner that reports a reflected XSS as a finding has not confirmed the XSS is exploitable through the WAF, or that it can reach a logged-in admin session. A tester has.
Request a Scoping Call
If you are preparing for a SOC 2 Type II audit, a customer security review, or a pre-launch security assessment, a web application penetration test is the most direct way to find and fix vulnerabilities before an attacker does. IVASTA will scope your engagement within 24 hours and deliver a proposal within 48 hours. Request a scoping call at IVASTA Security and a senior tester will reach out directly.


